This quickstart takes about 5 minutes and guides you through:
✅ Install bigdata-client package 
✅ Authenticate to bigdata.com 
✅ Create two sample files to upload 
✅ Upload private files 
✅ Query bigdata.com 
Private & Secure: No LLM training on your data
 
Install bigdata-client package
Follow the Prerequisites instructions
to set up the required environment.
Authenticate to bigdata.com
Because you already set your credentials in the environment
during the Prerequisites step,
the Bigdata constructor reads them automatically.
from bigdata_client import Bigdata
bigdata = Bigdata()
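If the constructor fails, a quick check of the environment can help narrow down the cause. The sketch below assumes the credentials live in variables named BIGDATA_USERNAME and BIGDATA_PASSWORD; those names are an assumption here, so confirm them against the Prerequisites page. The helper missing_credentials is hypothetical and not part of bigdata_client.

```python
import os

# Assumed variable names; confirm them against the Prerequisites page.
REQUIRED_VARS = ("BIGDATA_USERNAME", "BIGDATA_PASSWORD")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}")
    else:
        print("Credentials found; Bigdata() should be able to read them.")
```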
Create two sample files to upload
Create the following two sample files in your local directory.
File name data_science_research-2020-06.txt:
RavenPack Data Science researchers recommend the following stocks in June 2020
Microsoft (NASDAQ: MSFT): Microsoft has been heavily investing in AI and cloud computing, which are key growth areas for the company.
Datadog (NASDAQ: DDOG): Datadog is a leading provider of monitoring and analytics solutions for cloud-based applications.
Oracle (NYSE: ORCL): Oracle is a major player in the enterprise software and cloud computing market.
File name soup_recipes-2020-06.txt:
We recommend making chicken noodle soup with homemade chicken stock
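If you prefer to create the two files from Python rather than by hand, a minimal sketch could look like the following. It uses only the standard library; the helper name write_sample_files is hypothetical, and the file contents are copied from the text above.

```python
from pathlib import Path

# File names and contents taken from the two samples described above.
SAMPLE_FILES = {
    "data_science_research-2020-06.txt": (
        "RavenPack Data Science researchers recommend the following stocks in June 2020\n"
        "Microsoft (NASDAQ: MSFT): Microsoft has been heavily investing in AI and cloud "
        "computing, which are key growth areas for the company.\n"
        "Datadog (NASDAQ: DDOG): Datadog is a leading provider of monitoring and analytics "
        "solutions for cloud-based applications.\n"
        "Oracle (NYSE: ORCL): Oracle is a major player in the enterprise software and "
        "cloud computing market.\n"
    ),
    "soup_recipes-2020-06.txt": (
        "We recommend making chicken noodle soup with homemade chicken stock\n"
    ),
}


def write_sample_files(directory="."):
    """Write both sample files into `directory` and return their paths."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, text in SAMPLE_FILES.items():
        path = target / name
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_sample_files() with no argument writes both files to the current directory, which matches the relative paths used in the upload step.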
Upload private files
We will upload the two files and use the provider_date_utc parameter to
tell Bigdata when they were created. This sets each document's
published date, which lets Bigdata assign a more accurate reporting
date to detected events.
file = bigdata.uploads.upload_from_disk("./data_science_research-2020-06.txt",
                                    provider_document_id='my_document_id',
                                    provider_date_utc='2020-06-10 12:00:00',
                                    primary_entity='RavenPack',
                                    skip_metadata=True)
# Check the file's processing status
file.reload_status()
print(f"File processing status: {file.status}")
# Wait for completion
file.wait_for_completion(timeout=60)
print(f"File processing status: {file.status}")
File processing status: PENDING
File processing status: COMPLETED
file.add_tags(["Data Science Research"])
print(f"File tags: {file.tags}")
File tags: ['Data Science Research']
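The provider_date_utc value in the upload above is a 'YYYY-MM-DD HH:MM:SS' string. When the creation time is held as a Python datetime, possibly in another time zone, it can be converted first. This is a minimal sketch; the helper name to_provider_date_utc is hypothetical, and treating naive datetimes as already being in UTC is an assumption of the sketch.

```python
from datetime import datetime, timedelta, timezone


def to_provider_date_utc(moment):
    """Format a datetime as 'YYYY-MM-DD HH:MM:SS' in UTC.

    Naive datetimes are assumed to already be in UTC (assumption of this sketch).
    """
    if moment.tzinfo is not None:
        moment = moment.astimezone(timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# A timestamp recorded at 14:00 in a UTC+2 zone is 12:00 in UTC.
local = datetime(2020, 6, 10, 14, 0, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_provider_date_utc(local))  # 2020-06-10 12:00:00
```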
file = bigdata.uploads.upload_from_disk("./soup_recipes-2020-06.txt",
                                    provider_document_id='my_document_id',
                                    provider_date_utc='2020-06-10 12:00:00',
                                    primary_entity='RavenPack',
                                    skip_metadata=True)
# Check the file's processing status
file.reload_status()
print(f"File processing status: {file.status}")
# Wait for completion
file.wait_for_completion(timeout=60)
print(f"File processing status: {file.status}")
File processing status: PENDING
File processing status: COMPLETED
Add the tag Cooking recipes to the file:
file.add_tags(["Cooking recipes"])
print(f"File tags: {file.tags}")
File tags: ['Cooking recipes']
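The two uploads above repeat the same steps: upload, wait for processing, then tag. A minimal sketch of a helper that wraps them is shown below. The name upload_and_tag is hypothetical; it relies only on the calls already used in this section (upload_from_disk, wait_for_completion and add_tags) and forwards any extra keyword arguments unchanged.

```python
def upload_and_tag(bigdata, path, tags, provider_date_utc, timeout=60, **upload_kwargs):
    """Upload one file, wait for processing, then attach tags.

    `bigdata` is expected to be a Bigdata() instance. Extra keyword
    arguments (for example skip_metadata=True) are passed through to
    upload_from_disk as given.
    """
    file = bigdata.uploads.upload_from_disk(
        path, provider_date_utc=provider_date_utc, **upload_kwargs
    )
    # Block until processing finishes or the timeout (in seconds) elapses.
    file.wait_for_completion(timeout=timeout)
    file.add_tags(list(tags))
    return file
```

With a helper like this, each upload becomes a single call such as upload_and_tag(bigdata, "./soup_recipes-2020-06.txt", ["Cooking recipes"], "2020-06-10 12:00:00", skip_metadata=True).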
Query bigdata.com
Let’s run a Similarity search for the text "recommend stock" over the
month of June 2020:
from bigdata_client.query import Similarity
from bigdata_client.daterange import AbsoluteDateRange
from bigdata_client.models.search import DocumentType
# Similarity search
query = Similarity("recommend stock")
# Full month of June 2020, from the first second to the last
in_june_2020 = AbsoluteDateRange("2020-06-01T00:00:00", "2020-06-30T23:59:59")
# Create a bigdata search
search = bigdata.search.new(query, date_range=in_june_2020, scope=DocumentType.ALL)
# Retrieve content of four documents
documents = search.run(4)
for doc in documents:
    print(f"\nDocument headline: {doc.headline}")
Document headline: Nifty outlook and stock recommendations by CapitalVia: Buy RBL Bank, ONGC
Document headline: Forget the Naysayers: 3 Top Retail Stocks You Should Own
Document headline: Here's How to Invest Like Warren Buffett
Document headline: 2 Tech Stocks to Buy Right Now
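A range that covers a whole calendar month runs from the first second of its first day to the last second of its last day. The sketch below builds those two boundaries as ISO strings in the same format passed to AbsoluteDateRange above. The helper name month_bounds is hypothetical, and it uses only the standard library.

```python
import calendar
from datetime import datetime


def month_bounds(year, month):
    """Return ISO strings for the first and last second of a calendar month."""
    # monthrange returns (weekday of day 1, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    start = datetime(year, month, 1, 0, 0, 0)
    end = datetime(year, month, last_day, 23, 59, 59)
    return start.isoformat(), end.isoformat()


print(month_bounds(2020, 6))  # ('2020-06-01T00:00:00', '2020-06-30T23:59:59')
```

The two strings can then be passed as the start and end arguments of AbsoluteDateRange.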
# Narrow down the date range to 2 seconds
two_secs_in_june_2020 = AbsoluteDateRange("2020-06-10T11:59:59", "2020-06-10T12:00:01")
# Create a bigdata search
search = bigdata.search.new(query, date_range=two_secs_in_june_2020, scope=DocumentType.ALL)
# Retrieve content of four documents
documents = search.run(4)
for doc in documents:
    print(f"\nDocument headline: {doc.headline}")
Document headline: soup_recipes-2020-06.txt
Document headline: Deutsche Post AG: Investor Meeting
Document headline: Ford Motor Co.: Deutsche Bank Global Auto Industry Conference
Document headline: data_science_research-2020-06.txt
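The two-second window above is centered on the provider_date_utc value given at upload time, which is why both private files appear in the results. A minimal sketch of deriving such a window from that timestamp follows; the helper name window_around is hypothetical.

```python
from datetime import datetime, timedelta


def window_around(timestamp, seconds=1):
    """Return ISO start/end strings `seconds` before and after a timestamp.

    `timestamp` uses the 'YYYY-MM-DD HH:MM:SS' format given to provider_date_utc.
    """
    center = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    delta = timedelta(seconds=seconds)
    return (center - delta).isoformat(), (center + delta).isoformat()


print(window_around("2020-06-10 12:00:00"))
# ('2020-06-10T11:59:59', '2020-06-10T12:00:01')
```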
To search only your private files, set the scope to DocumentType.FILES.
# Create a bigdata search with scope "FILES"
search = bigdata.search.new(query, date_range=in_june_2020, scope=DocumentType.FILES)
# Retrieve content of four documents
documents = search.run(4)
# Read all retrieved documents and print some details
for doc in documents:
    print(f"\nDocument headline: {doc.headline}")
    for chunk in doc.chunks:
        print(f"  Chunk text: {chunk.text}")
Document headline: soup_recipes-2020-06.txt
Chunk text: We recommend making chicken noodle soup with homemade chicken stock
Document headline: data_science_research-2020-06.txt
Chunk text: (Sample file for testing purpose) RavenPack Data Science researches recommend the following stocks in June 2020 Microsoft (NASDAQ: MSFT): Microsoft has been heavily investing in AI and cloud computing, which are key growth areas for the company.
Now filter the results by the file tag Data Science Research:
from bigdata_client.query import Similarity, FileTag
# Similarity search restricted to files tagged "Data Science Research"
query = Similarity("recommend stock") & FileTag("Data Science Research")
# Create a bigdata search
search = bigdata.search.new(query, date_range=in_june_2020, scope=DocumentType.FILES)
# Retrieve content of four documents
documents = search.run(4)
# Read all retrieved documents and print some details
for doc in documents:
    print(f"\nDocument headline: {doc.headline}")
    for chunk in doc.chunks:
        print(f"  Chunk text: {chunk.text}")
Document headline: data_science_research-2020-06.txt
Chunk text: (Sample file for testing purpose) RavenPack Data Science researches recommend the following stocks in June 2020 Microsoft (NASDAQ: MSFT): Microsoft has been heavily investing in AI and cloud computing, which are key growth areas for the company.
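The result-printing loop appears several times in this section. A small helper can collect the same report as a list of lines, which is easier to reuse or log. This is a sketch only: format_results is a hypothetical name, and the stand-in objects below mimic just the attributes used above (headline, chunks and text) rather than real search results.

```python
from types import SimpleNamespace


def format_results(documents):
    """Return report lines: each headline followed by its chunk texts."""
    lines = []
    for doc in documents:
        lines.append(f"Document headline: {doc.headline}")
        for chunk in doc.chunks:
            lines.append(f"  Chunk text: {chunk.text}")
    return lines


# Stand-in objects with the same attributes as the documents printed above.
sample = [
    SimpleNamespace(
        headline="soup_recipes-2020-06.txt",
        chunks=[
            SimpleNamespace(
                text="We recommend making chicken noodle soup with homemade chicken stock"
            )
        ],
    )
]
print("\n".join(format_results(sample)))
```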
Summary
Congratulations! 🎉 You have successfully uploaded private files and
retrieved insights from them among millions of other documents.
The following pages cover uploading private files, managing tags,
and searching with the tag query filter:
- Upload your own content: describes all supported parameters and
methods to manage private files.
- Batch file upload: contains a script to help your organization
quickly upload all private files.
- FileTag: describes the FileTag query filter.
- Query operators: describes the supported query operators:
&, |, ~, All and Any.