File analytics download
The following script lets you download the analytics of previously
uploaded files using parallel threads. If you still need to upload your
files, follow the how-to guide
threading_upload_files
If your browser displays the Python text instead of downloading it, press Ctrl+S after the file opens.
Script parameters
workdir
: Absolute path to the work directory. For instance: /home/user/workdir_batch_01
output_dir
: Absolute path of the directory where all analytic files are downloaded. For instance: /home/user/workdir_batch_01/analytics_files
uploaded_file_ids_csv_filename
: Filename of the previously generated CSV containing the IDs of the uploaded files. For instance: uploaded_file_ids_20241026_002611.csv
max_concurrency
: The number of concurrent threads to use
max_download_timeout
: Timeout in seconds that the script waits for each file if it has not been processed yet.
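To illustrate how max_concurrency and max_download_timeout could interact, here is a minimal sketch. It is not the script itself: `fetch` is a hypothetical callable standing in for whatever request the real script makes, and `poll_interval` is an assumed extra setting. The sketch polls each file until its analytics are ready or the timeout expires, using a bounded thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def wait_for_analytics(fetch, file_id, max_download_timeout, poll_interval=1.0):
    """Poll fetch(file_id) until it returns data or the timeout expires.

    `fetch` is a hypothetical callable: it returns None while the file is
    still being processed and the analytics payload once it is ready.
    """
    deadline = time.monotonic() + max_download_timeout
    while True:
        payload = fetch(file_id)
        if payload is not None:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"analytics for {file_id} not ready in time")
        time.sleep(poll_interval)


def download_all(fetch, file_ids, max_concurrency, max_download_timeout,
                 poll_interval=1.0):
    """Return {file_id: status} using at most max_concurrency threads."""

    def worker(file_id):
        try:
            wait_for_analytics(fetch, file_id, max_download_timeout,
                               poll_interval)
            return file_id, "DOWNLOAD_DONE"
        except Exception:
            # Any failure, including a timeout, is recorded as an error.
            return file_id, "DOWNLOAD_ERROR"

    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return dict(pool.map(worker, file_ids))
```

A larger max_concurrency shortens the total run time only as long as the server and the network can keep up. A larger max_download_timeout gives slow files more time before they are marked as failed.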
How to run the script
- (If not yet done) Follow the Prerequisites instructions to set up the required environment
- Ensure that the CSV file
uploaded_file_ids_YYYYMMDD_HHMMSS.csv
, containing the IDs of the previously uploaded files, is in the work directory /home/user/workdir_batch_01
- Create a new directory to store all analytic files that we plan to
download, for instance
/home/user/workdir_batch_01/analytics_files
- Finally, you can run the script
The script will download and store the analytic files in the
output_dir
folder. The analytic files will have the following format:
<original_base_filename>_<original_file_extension>_analytics.json
. For instance: file_01_abc_analytics.json
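The naming rule above can be sketched as a small helper. The function name `analytics_filename` is hypothetical, and the sketch assumes that the extension is used without its leading dot, which matches the example (an original file named file_01.abc). Files without an extension are not handled here.

```python
from pathlib import Path


def analytics_filename(original_path):
    """Build the analytics filename for an uploaded file (assumed rule)."""
    path = Path(original_path)
    extension = path.suffix.lstrip(".")  # ".abc" -> "abc"
    return f"{path.stem}_{extension}_analytics.json"
```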
The script will also generate an output CSV file
download_result__%Y%m%d_%H%M%S.csv
with the following values:
file_id
: File identifier that we can use in future requests to download or delete files
download_status
: Status of the download. It can be DOWNLOAD_DONE
or DOWNLOAD_ERROR
original_absolute_file_path
: The absolute path of the uploaded file
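As an illustration, a result file with these three values could be written as follows. This is a sketch under assumptions: the helper name `write_download_result` is hypothetical, the header row and column order are assumed, and the filename pattern is copied verbatim from the text above.

```python
import csv
from datetime import datetime
from pathlib import Path

# Assumed column order; the real script may differ.
FIELDNAMES = ["file_id", "download_status", "original_absolute_file_path"]


def write_download_result(workdir, rows, now=None):
    """Write one CSV row per file and return the path of the new file."""
    now = now or datetime.now()
    path = Path(workdir) / now.strftime("download_result__%Y%m%d_%H%M%S.csv")
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
    return path
```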
Example of the file download_result_20241026_003611.csv
If the file contains any DOWNLOAD_ERROR
entries, you can run the script again,
this time passing download_result_20241026_003611.csv
as the value of the parameter
uploaded_file_ids_csv_filename
. The script will then try to download
all file IDs with the status DOWNLOAD_ERROR.
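The retry selection can be pictured with the short sketch below. It assumes the result CSV has a header row with the column names listed earlier. The helper name `failed_file_ids` is hypothetical and is not part of the script.

```python
import csv


def failed_file_ids(csv_path):
    """Return the file IDs whose download_status is DOWNLOAD_ERROR."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [
            row["file_id"]
            for row in csv.DictReader(handle)
            if row["download_status"] == "DOWNLOAD_ERROR"
        ]
```

Under this sketch, rows marked DOWNLOAD_DONE are skipped, so a rerun only repeats the work that failed.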