Why It Matters

In an era of increasing trade tensions and evolving geopolitical landscapes, companies face unprecedented uncertainty around import tariffs and trade barriers. Understanding corporate exposure to tariff risks across global supply chains is critical for investment decisions, risk management, and strategic planning, but manual tracking of tariff impacts across multiple companies and markets is time-intensive and often incomplete.

What It Does

This workflow combines the RiskAnalyzer class and GenerateReport class from the bigdata-research-tools package to systematically analyze corporate exposure to US import tariff risks. Designed for portfolio managers, risk analysts, and trade compliance professionals, it transforms scattered information from news, filings, and earnings calls into a detailed research report covering risk intelligence and mitigation strategies.

How It Works

The workflow integrates hybrid semantic search, AI-powered risk taxonomies, and multi-source content analysis to deliver:
  • Automated Risk Taxonomy Creation using RiskAnalyzer to generate hierarchical risk categories specific to tariff impacts
  • Cross-Source Intelligence Gathering searching news articles, SEC filings, and earnings transcripts for relevant discussions
  • AI-Powered Risk Classification categorizing content into specific risk scenarios with Media Attention, Financial Impact, and Uncertainty metrics
  • Corporate Response Extraction identifying and summarizing company mitigation plans from official communications
  • Dual Report Generation producing both executive summary and detailed analysis formats in professional HTML reports

A Real-World Use Case

This cookbook demonstrates the complete end-to-end workflow through analyzing how US import tariffs impact major American companies. You’ll see how the system transforms scattered tariff discussions across news, SEC filings, and earnings transcripts into structured risk assessments, complete with corporate response strategies and quantified exposure metrics for investment and risk management decisions. Ready to get started? Let’s dive in!

Prerequisites

To run the Specialized Report Tariffs workflow, you can choose between two options:
  • 💻 GitHub cookbook
    • Use this if you prefer working locally or in a custom environment.
    • Follow the setup and execution instructions in the README.md.
    • API keys are required:
      • Option 1: Follow the key setup process described in the README.md
      • Option 2: Refer to this guide: How to initialise environment variables
        • ❗ When using this method, you must manually add the OpenAI API key:
          # OpenAI credentials
          OPENAI_API_KEY = "<YOUR_OPENAI_API_KEY>"
          
  • 🐳 Docker Installation
    • Use this if you prefer a containerized deployment.
    • Provides an alternative setup method that simplifies environment configuration for those who prefer Docker-based solutions.
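Whichever option you choose, the workflow needs the OpenAI key at run time. The following is a minimal sketch of a fail-fast check, assuming the key is exposed as the OPENAI_API_KEY environment variable. The helper name require_env is hypothetical and is not part of the cookbook or of any library used here.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; see the Prerequisites section."
        )
    return value


# Usage in the notebook (assumes the variable was exported beforehand):
# OPENAI_API_KEY = require_env("OPENAI_API_KEY")
```

Failing early like this surfaces a missing key before any search or LLM call is made.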

Setup and Imports

Below is the Python code required for setting up our environment and importing necessary libraries.
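import os
from datetime import datetime

from IPython.display import display, HTML
import numpy as np
import pandas as pd
import plotly.io as pio

from bigdata_client import Bigdata
from bigdata_client.models.search import DocumentType
from bigdata_research_tools.workflows.risk_analyzer import RiskAnalyzer
from src.report_generator import GenerateReport, theme_tree_to_dict

# Read the OpenAI key (assumes it is already set in the environment, see Prerequisites)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Create the Bigdata client (assumes Bigdata credentials are set as environment variables)
bigdata = Bigdata()

# Define output file paths for our report
output_dir = "output"
os.makedirs(output_dir, exist_ok=True)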

Defining the Analysis Parameters

Core Parameters

  • Main Theme (main_theme): The central risk scenario to analyze across companies
  • Focus (focus): Expert perspective for generating targeted risk taxonomies
  • Company Universe (my_watchlist_id): The set of companies to analyze from your watchlist section
  • Model Selection (llm_model): The AI model used for risk classification and summarization
  • Frequency (freq): The frequency of the date ranges to search over. Defaults to 3M. Supported values:
    • Y: Yearly intervals.
    • M: Monthly intervals.
    • W: Weekly intervals.
    • D: Daily intervals.
  • Time Period (start_date and end_date): The date range for the analysis
  • Document Limits (document_limit_news, document_limit_filings): The maximum number of documents to return per query to Bigdata API for each category of documents
  • Batch Size (batch_size): The number of entities to include in a single batched query
  • Rerank Threshold (rerank_threshold): Setting this value enables the cross-encoder, which reranks the results and keeps those whose relevance is above the percentile you specify (0.7 being the 70th percentile). Leave it as None to disable reranking. More information can be found in the re-ranker documentation.
  • News Fallback Control (news_search_fallback): If True, when no response is found in transcripts/filings, the system uses News as fallback and tags those responses with [From News]. If False, missing responses are shown as “No evidence of discussions found in Transcripts/Filings.” Default: True.
# ===== Customizable Parameters =====

# Company Universe (from Watchlist)
my_watchlist_id = 'fa589e57-c9e0-444d-801d-18c92d65389f' # Magnificent 7
watchlist = bigdata.watchlists.get(my_watchlist_id)
companies = bigdata.knowledge_graph.get_entities(watchlist.items)

# Main Analysis Theme
main_theme = 'US Import Tariffs Corporate Risk Impact Analysis'
focus = "Provide a detailed taxonomy of risks describing how new American import tariffs will impact worldwide companies, their operations and strategy."

# LLM Model Configuration
llm_model = "openai::gpt-4o-mini"

# Time Range Configuration
start_date = "2025-02-01"
end_date = "2025-06-13"
freq = 'M'  # Monthly search frequency

# Enable/Disable Reranker 
rerank_threshold = None

# Document Retrieval Limits
document_limit_news = 10
document_limit_filings = 5
batch_size = 1

# Toggle fallback to News for company responses
response_from_news = True
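To make the freq parameter concrete, here is a minimal sketch of how a date range could be split into calendar-aligned search intervals. It is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, weeks are assumed to run Monday to Sunday, and multiples such as 3M are not handled.

```python
from datetime import date, timedelta


def _next_boundary(day: date, freq: str) -> date:
    """Return the first day of the interval that follows the one containing `day`."""
    if freq == "D":
        return day + timedelta(days=1)
    if freq == "W":
        return day + timedelta(days=7 - day.weekday())  # the following Monday
    if freq == "M":
        if day.month == 12:
            return date(day.year + 1, 1, 1)
        return date(day.year, day.month + 1, 1)
    if freq == "Y":
        return date(day.year + 1, 1, 1)
    raise ValueError(f"Unsupported frequency: {freq!r}")


def split_date_range(start_date: str, end_date: str, freq: str = "M"):
    """Split [start_date, end_date] into consecutive intervals of the given frequency."""
    current = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    intervals = []
    while current <= end:
        nxt = _next_boundary(current, freq)
        last_day = min(nxt - timedelta(days=1), end)  # clip the final interval
        intervals.append((current.isoformat(), last_day.isoformat()))
        current = nxt
    return intervals


for interval in split_date_range("2025-02-01", "2025-06-13", freq="M"):
    print(interval)
```

With the dates configured above and freq = 'M', this sketch yields five intervals, the last one running from 2025-06-01 to 2025-06-13.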

Risk Analysis Phase

The first phase uses the RiskAnalyzer class to establish the foundation for our tariff impact analysis. This phase includes three critical steps that prepare the data for the report generation phase.

Initialize RiskAnalyzer

The RiskAnalyzer handles the initial risk discovery and taxonomy creation with automated taxonomy generation, semantic content retrieval, and intelligent content labeling.
# Initialize RiskAnalyzer with tariff-specific parameters
analyzer = RiskAnalyzer(
    llm_model=llm_model,
    main_theme=main_theme,
    companies=companies,
    start_date=start_date,
    end_date=end_date,
    document_type=DocumentType.NEWS,
    sources=None,
    rerank_threshold=rerank_threshold,
    focus=focus,
)
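The rerank_threshold argument is described in the parameter list as a percentile cutoff on relevance. The sketch below illustrates only that cutoff, using a nearest-rank percentile over made-up scores. It does not reproduce the cross-encoder itself, and the score field and the filter_by_percentile function are hypothetical names chosen for this example.

```python
import math


def filter_by_percentile(results, threshold):
    """Keep results whose score is at or above the `threshold` percentile.

    `threshold` is a fraction between 0 and 1 (0.7 means the 70th percentile);
    None disables filtering, mirroring rerank_threshold = None.
    """
    if threshold is None or not results:
        return list(results)
    scores = sorted(r["score"] for r in results)
    rank = max(1, math.ceil(threshold * len(scores)))  # nearest-rank percentile
    cutoff = scores[rank - 1]
    return [r for r in results if r["score"] >= cutoff]


# Illustrative relevance scores, not real API output
results = [
    {"text": "chunk a", "score": 0.2},
    {"text": "chunk b", "score": 0.4},
    {"text": "chunk c", "score": 0.6},
    {"text": "chunk d", "score": 0.8},
    {"text": "chunk e", "score": 0.9},
]

kept = filter_by_percentile(results, 0.7)
print([r["text"] for r in kept])
```

A higher threshold trades recall for precision: fewer chunks reach the labeling step, but those that do are more likely to be relevant.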

Generate Risk Taxonomy

Create a comprehensive taxonomy that breaks down tariff risks into specific, analyzable categories such as supply chain disruption, pricing impacts, and market access challenges.
# Generate the comprehensive risk taxonomy
risk_tree, risk_summaries, terminal_labels = analyzer.create_taxonomy()
themes_tree_dict = {risk_tree.label: theme_tree_to_dict(risk_tree)}

# Visualize the taxonomy structure
risk_tree.visualize()
Figure: Tariffs Risk Taxonomy Tree
The taxonomy tree shows how tariff risks branch into specific sub-scenarios. Each terminal node represents a distinct risk category that will be used to classify and analyze news content.
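The relationship between the tree, its terminal nodes, and a nested dictionary can be sketched with a small stand-in structure. RiskNode, terminal_labels, and tree_to_dict below are hypothetical helpers written for illustration; the actual tree class and the output of theme_tree_to_dict may differ, and the sub-scenario labels are invented examples.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RiskNode:
    """Hypothetical stand-in for a node of the generated risk tree."""
    label: str
    children: List["RiskNode"] = field(default_factory=list)


def terminal_labels(node: RiskNode) -> List[str]:
    """Collect the labels of all leaf nodes, left to right."""
    if not node.children:
        return [node.label]
    labels: List[str] = []
    for child in node.children:
        labels.extend(terminal_labels(child))
    return labels


def tree_to_dict(node: RiskNode) -> Dict[str, dict]:
    """Convert the subtree below `node` into nested dictionaries keyed by label."""
    return {child.label: tree_to_dict(child) for child in node.children}


# Illustrative taxonomy; the real one is generated by the LLM
tree = RiskNode("US Import Tariffs", [
    RiskNode("Supply Chain Disruption", [
        RiskNode("Supplier Relocation Costs"),
        RiskNode("Input Shortages"),
    ]),
    RiskNode("Pricing Impacts"),
    RiskNode("Market Access Challenges"),
])

print(terminal_labels(tree))
print({tree.label: tree_to_dict(tree)})
```

Only the leaves are used as classification targets, which is why intermediate nodes such as "Supply Chain Disruption" do not appear in the terminal label list.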

Retrieve Relevant Content

Search news articles using the generated taxonomy to find discussions about tariff impacts across our company universe.
# Search for tariff-related content in news articles
df_sentences_semantic = analyzer.retrieve_results(
    sentences=risk_summaries,
    freq=freq,
    document_limit=document_limit_news,
    batch_size=batch_size
)
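The batch_size argument controls how many entities share a single query. A minimal sketch of that grouping follows, assuming simple consecutive chunks; the batched helper is hypothetical and the library's internal batching may differ.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# The Magnificent 7 watchlist, written out as plain names for illustration
companies_example = [
    "Apple", "Microsoft", "Alphabet", "Amazon", "Meta", "Nvidia", "Tesla",
]

for batch in batched(companies_example, batch_size=3):
    print(batch)
```

With batch_size = 1, as configured above, every company gets its own queries, which increases the number of requests but keeps each company's results separate.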

Labeling

Use AI to analyze each news excerpt and categorize it into the appropriate risk scenarios. This creates structured data from unstructured news content.
# Classify news content into risk categories
df_labeled, _ = analyzer.label_search_results(
    df_sentences=df_sentences_semantic,
    terminal_labels=terminal_labels,
    risk_tree=risk_tree,
    additional_prompt_fields=['entity_sector','entity_industry', 'headline']
)
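Once excerpts carry a risk label, they can be aggregated like any other tabular data. The sketch below counts labeled excerpts per company and risk category using plain dictionaries. The field names entity_name and label, and the rows themselves, are assumptions made for illustration and may not match the columns of df_labeled.

```python
from collections import Counter

# Made-up labeled excerpts standing in for rows of the labeled DataFrame
labeled_rows = [
    {"entity_name": "Company A", "label": "Supply Chain Disruption"},
    {"entity_name": "Company A", "label": "Supply Chain Disruption"},
    {"entity_name": "Company A", "label": "Pricing Impacts"},
    {"entity_name": "Company B", "label": "Market Access Challenges"},
]


def count_by_company_and_label(rows):
    """Count excerpts for each (company, risk label) pair."""
    return Counter((row["entity_name"], row["label"]) for row in rows)


counts = count_by_company_and_label(labeled_rows)
for (company, label), n in counts.most_common():
    print(f"{company} | {label} | {n}")
```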

Report Generation Phase

The second phase uses GenerateReport and transforms the classified risk data into comprehensive reports with corporate mitigation strategies.

Initialize GenerateReport

The GenerateReport class will create sector-wide risk summaries, generate company-specific risk scores and summaries, extract mitigation plans from SEC filings and earnings transcripts, and produce professional HTML reports with customizable ranking criteria.
# Initialize the report generator with our analysis parameters
report_generator = GenerateReport(
        watchlist_id=my_watchlist_id,
        main_theme=main_theme,
        focus=focus,
        llm_model='gpt-4o-mini',
        api_key=OPENAI_API_KEY,
        start_date=start_date,
        end_date=end_date,
        search_frequency=freq,
        document_limit_news=document_limit_news,
        document_limit_filings=document_limit_filings,
        bigdata=bigdata,
        batch_size=batch_size,
        themes_tree_dict=themes_tree_dict
)

Generate Comprehensive Report

Execute the complete report generation workflow including:
  1. Sector-Level Summarization: Create thematic summaries across risk categories
  2. Company-Level Analysis: Generate risk scores for Media Attention, Financial Impact, and Uncertainty
  3. Mitigation Strategy Extraction: Search filings and transcripts for corporate response plans (with optional News fallback controlled by news_search_fallback)
  4. Data Integration: Combine all sources into structured report datasets
# Generate the risk report data
report = report_generator.generate_report(
    df_labeled=df_labeled,
    news_search_fallback=response_from_news,  # Enable/disable the News fallback
    import_from_path=None,
    export_to_path=output_dir,
)
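The News fallback behaviour described in the parameter list can be summarized as a small decision function. This is a sketch of the documented behaviour only, not the code inside GenerateReport; the function name and its arguments are hypothetical.

```python
from typing import Optional

NO_EVIDENCE = "No evidence of discussions found in Transcripts/Filings."


def resolve_company_response(
    filings_response: Optional[str],
    news_response: Optional[str],
    news_search_fallback: bool = True,
) -> str:
    """Pick the text shown as a company's response for one risk.

    Transcripts/filings take priority. News is used only when the fallback
    is enabled, and such responses are tagged with [From News].
    """
    if filings_response:
        return filings_response
    if news_search_fallback and news_response:
        return f"[From News] {news_response}"
    return NO_EVIDENCE


print(resolve_company_response(None, "Shifting production to Vietnam."))
print(resolve_company_response(None, "Shifting production to Vietnam.", news_search_fallback=False))
```

Disabling the fallback keeps the mitigation section restricted to official company communications, at the cost of more empty entries.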

Final Output

Transform the analysis results into professional, customizable reports. The system provides two distinct presentation styles, each optimized for different use cases and audiences.

Report Customization Options

Both report formats allow customization through multiple ranking criteria.
Sector-Wide Analysis:
  • Identifies the most significant tariff risks across all companies
  • Ranks themes by media attention and document frequency
  • Provides executive summaries for each risk category
Company-Specific Analysis:
  • Most Reported Issue: Highest media coverage and attention
  • Biggest Risk: Greatest potential financial impact
  • Most Uncertain Issue: Highest uncertainty scores and ambiguity
Each company analysis includes extracted mitigation plans from official corporate communications, providing actionable intelligence for investment and risk management decisions.
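The three company-level criteria amount to picking, for each company, the risk with the highest value on one score dimension. A minimal sketch follows, assuming one row per risk theme with numeric scores; the key names and the score values are made up for illustration and are not the report's actual schema.

```python
# Maps each ranking criterion to the (assumed) score it ranks by
CRITERIA = {
    "Most Reported Issue": "media_attention",
    "Biggest Risk": "financial_impact",
    "Most Uncertain Issue": "uncertainty",
}

# Illustrative scores for a single company
company_rows = [
    {"theme": "Supply Chain Disruption", "media_attention": 5, "financial_impact": 3, "uncertainty": 2},
    {"theme": "Pricing Impacts", "media_attention": 3, "financial_impact": 4, "uncertainty": 1},
    {"theme": "Market Access Challenges", "media_attention": 1, "financial_impact": 2, "uncertainty": 5},
]


def summarize_company(rows):
    """Return the top-scoring theme for each ranking criterion."""
    return {
        criterion: max(rows, key=lambda row: row[score_key])["theme"]
        for criterion, score_key in CRITERIA.items()
    }


summary = summarize_company(company_rows)
for criterion, theme in summary.items():
    print(f"{criterion}: {theme}")
```

Because each criterion ranks on a different score, the same company can surface three different themes, which is what makes the three views complementary.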

Report Format 1: Executive Summary Style

This format prioritizes clarity and executive readability, focusing on the top risks per company across three key dimensions. Ideal for senior management briefings and board presentations.
from src.html_report import generate_html_report, prepare_data_report_0

# Extract report data for processing
df_by_theme = report.report_by_theme
df_by_company_with_responses = report.report_by_company

# Prepare data with executive summary formatting
top_by_theme, top_by_company = prepare_data_report_0(df_by_theme, df_by_company_with_responses)

# Generate executive-style HTML report
html_content = generate_html_report(top_by_theme, top_by_company, 'US Import Tariffs: Corporate Risk Impact Analysis')

# Save the executive report
report_filename = f'{output_dir}/tariffs_executive_report.html'
with open(report_filename, 'w', encoding='utf-8') as file:
    file.write(html_content)

Figure: Executive Report Preview (pages 1 and 2)

Report Format 2: Detailed Analysis Version

This format provides comprehensive risk analysis with extended company coverage and detailed risk breakdowns. Designed for analysts, portfolio managers, and risk management teams requiring in-depth insights.
from src.html_report import generate_html_report_v1, prepare_data_report_1

# Prepare data with detailed analysis formatting
top_by_theme, top_by_company = prepare_data_report_1(df_by_theme, df_by_company_with_responses)

# Generate detailed analysis HTML report
html_content_detailed = generate_html_report_v1(top_by_theme, top_by_company, 'US Import Tariffs: Comprehensive Risk Analysis')

# Save the detailed report
detailed_filename = f'{output_dir}/tariffs_detailed_analysis.html'
with open(detailed_filename, 'w', encoding='utf-8') as file:
    file.write(html_content_detailed)

Figure: Executive Report Preview (pages 1 and 2)

Export Results for Further Analysis

The generated data can be exported for integration with existing risk management systems, portfolio optimization tools, or compliance reporting workflows.
# Optional: Export structured data for external analysis
try:
    # Export the core datasets
    df_by_theme.to_csv(f'{output_dir}/tariffs_risks_by_theme.csv', index=False)
    df_by_company_with_responses.to_csv(f'{output_dir}/tariffs_risks_by_company.csv', index=False)
    
    print("✅ Data exported successfully:")
    print(f"   - Thematic analysis: {output_dir}/tariffs_risks_by_theme.csv")
    print(f"   - Company analysis: {output_dir}/tariffs_risks_by_company.csv")
    print(f"   - Executive report: {output_dir}/tariffs_executive_report.html")
    print(f"   - Detailed analysis: {output_dir}/tariffs_detailed_analysis.html")
    
except Exception as e:
    print(f"Warning: Export failed - {e}")

Conclusion

This workflow demonstrates a comprehensive approach to tariff risk analysis that combines the analytical power of RiskAnalyzer with the report generation capabilities of GenerateReport. Through the integrated analysis of tariff risks and corporate mitigation strategies, you can:
  1. Identify Risk-Exposed Companies - Discover which companies face the greatest exposure to tariff impacts across different risk categories including supply chain disruption, pricing pressures, and market access challenges
  2. Assess Corporate Preparedness - Evaluate how well companies are positioned to handle tariff risks through their documented mitigation strategies and response plans
  3. Quantify Multi-Dimensional Risk - Use Media Attention, Financial Impact, and Uncertainty scores to compare risk levels across companies and risk categories
  4. Generate Investment Intelligence - Create professional reports that inform portfolio decisions, risk management strategies, and trade policy impact assessments
From conducting due diligence on trade policy exposure to building thematic investment strategies focused on geopolitical risks or assessing portfolio-wide vulnerability to trade disruptions, this integrated workflow automates the research process while maintaining the depth and nuance required for professional analysis. The dual-output format ensures both executive communication and detailed analytical needs are met in a single automated workflow.