You can upload your own files, such as PDFs, TXT documents, and other text formats, to Bigdata.com. Once uploaded, they are enriched (content extraction, structuring, and annotation) and indexed automatically, which makes them available to the Search and Research Agent endpoints. The script below uploads multiple files to Bigdata using the REST API: it reads a list of file paths, uploads each file (POST → PUT to presigned URL → poll until enrichment completes), and writes the results to a CSV.
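The per-file flow (POST, then PUT to the presigned URL, then poll until enrichment completes) can be sketched roughly as follows. This is an illustrative outline and not the script itself. The HTTP calls are passed in as callables (`create_upload`, `put_bytes`, `get_status`, all hypothetical names) because the exact endpoints and response fields are not given here. The status strings `"completed"` and `"failed"`, and the `"timeout"` fallback, are also assumptions.

```python
import time
from typing import Callable, Tuple


def upload_file(
    path: str,
    create_upload: Callable[[str], Tuple[str, str]],
    put_bytes: Callable[[str, bytes], None],
    get_status: Callable[[str], str],
    poll_interval_sec: float = 10.0,
    max_polls: int = 60,
) -> Tuple[str, str]:
    """Upload one file and wait for enrichment; returns (file_id, status)."""
    with open(path, "rb") as fh:
        data = fh.read()

    # Step 1 (POST): register the upload; assumed to return an id and a presigned URL.
    file_id, presigned_url = create_upload(path)

    # Step 2 (PUT): send the raw bytes to the presigned URL.
    put_bytes(presigned_url, data)

    # Step 3 (poll): check the status until a terminal state is reached.
    # "completed" and "failed" are assumed names for the terminal states.
    for _ in range(max_polls):
        status = get_status(file_id)
        if status in ("completed", "failed"):
            return file_id, status
        time.sleep(poll_interval_sec)

    # Give up after max_polls checks rather than waiting forever.
    return file_id, "timeout"
```

Injecting the three callables keeps the control flow testable without network access; in a real script each one would wrap an authenticated HTTP request.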
Setup
- Create a virtual environment (recommended)
- Install dependencies
- Configure environment
Copy `.env` to a new file if needed, then edit `.env` and set your Bigdata API key. The script loads variables from `.env` in the script directory. You can also set `BIGDATA_API_KEY` (and optionally `BIGDATA_API_BASE_URL`) in your shell. For general environment setup, see Prerequisites.
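As a rough illustration of this configuration step, the sketch below reads `KEY=VALUE` pairs from a `.env` file and lets values already set in the shell take precedence. It uses only the standard library; how the actual script parses `.env` is not stated here, and the helper names `load_env_file` and `get_api_key` are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(env_path, environ=None):
    """Copy KEY=VALUE pairs from a .env file into environ without overwriting."""
    environ = os.environ if environ is None else environ
    path = Path(env_path)
    if not path.is_file():
        return environ
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault keeps any value that was already exported in the shell.
        environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))
    return environ


def get_api_key(environ=None):
    """Return BIGDATA_API_KEY or fail early with a clear message."""
    environ = os.environ if environ is None else environ
    key = environ.get("BIGDATA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("BIGDATA_API_KEY is not set; add it to .env or export it")
    return key
```

Failing early on a missing key avoids starting a batch that would be rejected on the first request.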
Usage
Run the script with these parameters:

| Parameter | Description |
|---|---|
| workdir | Directory that contains your files and where the log and result CSV will be written. |
| upload_txt_filename | Name of a text file inside workdir that lists files to upload (one path per line; paths are relative to workdir unless absolute). |
| max_concurrency | Number of files to upload in parallel (e.g. 5). |
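To illustrate what `max_concurrency` controls, here is a minimal sketch that runs a per-file upload function over a bounded thread pool. Whether the actual script uses threads, asyncio, or something else is not stated, so treat the thread pool as an assumption. `upload_all` and `upload_one` are hypothetical names, and `upload_one` is assumed to return a `(file_id, upload_status)` pair.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(paths, upload_one, max_concurrency=5):
    """Run upload_one over paths with at most max_concurrency in flight.

    Returns (file_id, upload_status, file_path) rows in input order.
    """
    rows = []
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # Submit everything up front; the pool caps how many run at once.
        futures = [(p, pool.submit(upload_one, p)) for p in paths]
        for path, future in futures:
            try:
                file_id, status = future.result()
            except Exception as exc:
                # One failing file should not abort the rest of the batch.
                file_id, status = "", f"error: {exc}"
            rows.append((file_id, status, str(path)))
    return rows
```

Collecting results in submission order keeps the output rows aligned with the input list even though uploads finish in arbitrary order.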
Create a text file (e.g. `file_list.txt`) in `workdir` with one filename or path per line.
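The path rule for the file list (relative entries resolve against `workdir`, absolute entries are used as given) could be implemented along these lines. `read_file_list` is a hypothetical helper name, and skipping blank lines is an assumption about how the list is parsed.

```python
from pathlib import Path


def read_file_list(workdir, upload_txt_filename):
    """Return the paths listed in workdir/upload_txt_filename, one per line."""
    base = Path(workdir)
    list_file = base / upload_txt_filename
    paths = []
    for raw in list_file.read_text(encoding="utf-8").splitlines():
        entry = raw.strip()
        if not entry:
            continue  # assumption: blank lines are ignored
        candidate = Path(entry)
        # Absolute paths are kept; relative ones are anchored at workdir.
        paths.append(candidate if candidate.is_absolute() else base / candidate)
    return paths
```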
Example: run the script
From the `batch_file_upload` directory, with `BIGDATA_API_KEY` set in `.env`, run the script. It will:
- Write a log file in `workdir` (e.g. `bigdata_processing_20260312_120000.log`).
- Write a result CSV in `workdir` (e.g. `uploaded_file_ids_20260312_120000.csv`) with columns: `file_id`, `upload_status`, `file_path`.
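A result CSV with these three columns and a timestamped name could be written roughly as shown below. The `uploaded_file_ids_YYYYMMDD_HHMMSS.csv` pattern is inferred from the example file name above, and `write_results_csv` is a hypothetical helper, not necessarily how the script does it.

```python
import csv
from datetime import datetime
from pathlib import Path


def write_results_csv(workdir, rows, now=None):
    """Write (file_id, upload_status, file_path) rows to a timestamped CSV."""
    # Timestamp format inferred from the example name: YYYYMMDD_HHMMSS.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    out_path = Path(workdir) / f"uploaded_file_ids_{stamp}.csv"
    # newline="" lets the csv module control line endings on every platform.
    with out_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file_id", "upload_status", "file_path"])
        writer.writerows(rows)
    return out_path
```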
Environment variables
| Variable | Required | Default | Description |
|---|---|---|---|
| BIGDATA_API_KEY | Yes | none | Your Bigdata API key. |
| BIGDATA_API_BASE_URL | No | https://api.bigdata.com | API base URL. |
| BIGDATA_RATE_LIMIT_PER_MINUTE | No | 500 | Max requests per minute (should match your WAF). |
| BIGDATA_RATE_LIMIT_SAFETY_MARGIN | No | 20 | Margin under the limit (actual cap = limit − margin). |
| BIGDATA_POLL_INTERVAL_SEC | No | 10 | Seconds between status polls while waiting for completion. |
| BIGDATA_UPLOAD_MAX_RETRIES | No | 5 | Max retries per file on 429/5xx. |
These variables are read from `.env` in the script folder; you can override them in the shell.
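To make the defaults and the rate-limit arithmetic concrete, the sketch below loads the optional variables with the defaults from the table and computes the effective cap as limit minus margin. The `Settings` class, the `is_retryable_status` helper, and the floor of one request per minute are illustrative assumptions and are not taken from the script.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_base_url: str
    rate_limit_per_minute: int
    rate_limit_safety_margin: int
    poll_interval_sec: float
    upload_max_retries: int

    @classmethod
    def from_env(cls, environ=None):
        env = os.environ if environ is None else environ
        # Defaults mirror the table above.
        return cls(
            api_base_url=env.get("BIGDATA_API_BASE_URL", "https://api.bigdata.com"),
            rate_limit_per_minute=int(env.get("BIGDATA_RATE_LIMIT_PER_MINUTE", "500")),
            rate_limit_safety_margin=int(env.get("BIGDATA_RATE_LIMIT_SAFETY_MARGIN", "20")),
            poll_interval_sec=float(env.get("BIGDATA_POLL_INTERVAL_SEC", "10")),
            upload_max_retries=int(env.get("BIGDATA_UPLOAD_MAX_RETRIES", "5")),
        )

    @property
    def effective_requests_per_minute(self):
        # actual cap = limit - margin; the floor of 1 is an added safeguard.
        return max(1, self.rate_limit_per_minute - self.rate_limit_safety_margin)


def is_retryable_status(status_code):
    """Retry only on 429 (rate limited) and 5xx (server error) responses."""
    return status_code == 429 or 500 <= status_code <= 599
```

With the defaults, the effective cap works out to 500 − 20 = 480 requests per minute.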