Entity
A filter to match an entity by its “EntityID”. Use the methods
provided in Knowledge Graph to identify entities, topics, or sources of
interest, then use the obtained IDs to build queries.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Entity
bigdata = Bigdata()
# Entity IDs:
MICROSOFT = "228D42"
APPLE = "D8442A"
query = Entity(MICROSOFT) | Entity(APPLE)
search = bigdata.search.new(query)
for document in search.limit_documents(10):
    print(document)
# or
documents = search.run(10)
for document in documents:
    print(document)
If you don’t know the EntityID, you can use the autosuggest feature to
find it, and use the returned entity to build the query:
from bigdata_client import Bigdata
bigdata = Bigdata()
microsoft = bigdata.knowledge_graph.autosuggest("Microsoft")[0]
apple = bigdata.knowledge_graph.autosuggest("Apple")[0]
query = microsoft | apple
search = bigdata.search.new(query)
documents = search.run(10)
for document in documents:
    print(document)
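Indexing the autosuggest result with [0] assumes that at least one suggestion came back. The sketch below shows one way to turn an empty result into a readable error. The helper name first_match is hypothetical and not part of bigdata_client, and plain lists stand in for the autosuggest results so the snippet runs on its own.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def first_match(results: Sequence[T], search_term: str) -> T:
    """Return the first suggestion, or raise a descriptive error if there is none."""
    if not results:
        raise LookupError(f"No suggestions returned for {search_term!r}")
    return results[0]


# Plain lists stand in for the autosuggest results here.
print(first_match(["Microsoft Corp.", "Microsoft Azure"], "Microsoft"))
```

With the client, the same helper could wrap the call shown above, for example first_match(bigdata.knowledge_graph.autosuggest("Microsoft"), "Microsoft").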
Check out the page Find companies for more information on how to find companies’ EntityIDs.
The search object is of type Search, and the individual items returned
by the search are instances of Document. See Document and the page
Search results for the available attributes and methods. Also, see
reference_entities for further details on each specific entity type.
Watchlist
If you want to retrieve insights about any of the entities in a Watchlist, you can add all the entities to the query with an Any operator.
from bigdata_client import Bigdata
from bigdata_client.query import Any
bigdata = Bigdata()
MY_WATCHLIST_ID = "c2356958-48f6-4380-bb1f-c588656fb2c0"
watchlist = bigdata.watchlists.get(MY_WATCHLIST_ID)
companies = bigdata.knowledge_graph.get_entities(watchlist.items)
query = Any(companies)
search = bigdata.search.new(query)
documents = search.run(2)
for doc in documents:
    print(doc)
Topic
A filter to match content containing macroeconomic, geopolitical, and
business events. Just like in the cases before, you can use the TopicID
if it’s known:
from bigdata_client import Bigdata
from bigdata_client.query import Topic
bigdata = Bigdata()
query = (
    Topic("business,labor-issues,executive-appointment,,")
    | Topic("business,labor-issues,executive-resignation,,")
    | Topic("business,labor-issues,executive-retirement,,")
)
search = bigdata.search.new(query)
documents = search.run(10)
for document in documents:
    print(document)
Or use the autosuggest feature to find the Topic object:
from bigdata_client import Bigdata
from bigdata_client.query import Any
bigdata = Bigdata()
topics = ["executive appointment", "executive resignation", "executive retirement"]
topic_list = [bigdata.knowledge_graph.find_topics(topic)[0] for topic in topics]
query = Any(topic_list)
search = bigdata.search.new(query)
documents = search.run(10)
for document in documents:
    print(document)
Source
Bigdata’s ecosystem comprises key high-quality content sources,
including web content, premium news, press wires, call transcripts, and
regulatory filings. Filter your search results by adding the target source
to your query.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Source, Entity
bigdata = Bigdata()
MICROSOFT = "228D42"
ABC_NEWS = "E54C73"
query = (
    Entity(MICROSOFT)
    & Source(ABC_NEWS)
)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
Keyword
Search for content matching a specific keyword. Stemming is applied to
the keyword, so the search also matches related word forms. For example,
searching for “resignation” also matches results containing the word
“resignations”.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Keyword
bigdata = Bigdata()
# Search for matches of Announcements that mention 2024 but not 2023
query = (
    Keyword("Announcement")
    & Keyword("2024")
    & (~Keyword("2023"))
)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
Similarity
Search for sentences after transforming them into embeddings.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Similarity
bigdata = Bigdata()
query = (
    Similarity("South Korea elections")
)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
- The OR operator (|) is not supported for Similarity. If you want to
  search for multiple sentences, you must use AND (&) to combine them.
- Querying by a Watchlist and Similarity is not supported. We advise
  creating a query per Entity ID and Similarity filter.
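The advice to create one query per Entity ID and Similarity filter can be sketched as a small loop. The function queries_per_entity and its make_query parameter are hypothetical names introduced for illustration. The stand-in factory only records each pair so that the snippet runs without the client.

```python
from typing import Callable, Iterable, List, TypeVar

Q = TypeVar("Q")


def queries_per_entity(
    entity_ids: Iterable[str],
    sentence: str,
    make_query: Callable[[str, str], Q],
) -> List[Q]:
    """Build one query per entity ID, each paired with the same sentence."""
    return [make_query(entity_id, sentence) for entity_id in entity_ids]


# Stand-in factory that only records the pair. With the client it could be:
#   lambda eid, s: Entity(eid) & Similarity(s)
pairs = queries_per_entity(
    ["228D42", "D8442A"],
    "South Korea elections",
    lambda eid, s: (eid, s),
)
print(pairs)
```

Each resulting query would then be passed to bigdata.search.new separately, and the results merged by the caller.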
SentimentRange
With Sentiment Ranges you can filter document chunks by specifying a
sentiment score range between -1.00 and +1.00. This score reflects the
sentiment of each chunk based on the language used in every sentence. A
score closer to -1.00 indicates negative sentiment, while a score closer
to +1.00 indicates positive sentiment.
from bigdata_client import Bigdata
from bigdata_client.query import Entity, SentimentRange
bigdata = Bigdata()
MICROSOFT = "228D42"
APPLE = "D8442A"
positive_peak_microsoft = Entity(MICROSOFT) & SentimentRange([0.8,1])
negative_peak_apple = Entity(APPLE) & SentimentRange([-1,-0.8])
query = positive_peak_microsoft | negative_peak_apple
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
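Because the score range is bounded by -1.00 and +1.00, it can help to check the bounds before building the filter. The helper below is a minimal sketch. The name sentiment_bounds is hypothetical, and whether the client performs its own validation is not stated here.

```python
from typing import List


def sentiment_bounds(low: float, high: float) -> List[float]:
    """Return [low, high] after checking both lie in [-1.0, 1.0] and low <= high."""
    if not (-1.0 <= low <= 1.0 and -1.0 <= high <= 1.0):
        raise ValueError(f"Sentiment bounds must lie within [-1.0, 1.0]: {low}, {high}")
    if low > high:
        raise ValueError(f"Lower bound {low} exceeds upper bound {high}")
    return [low, high]


print(sentiment_bounds(0.8, 1.0))
```

The returned list has the same shape as the argument used above, so it could be passed on as SentimentRange(sentiment_bounds(0.8, 1.0)).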
Document
By providing a document ID, you can retrieve all the chunks within that
document, or all the chunks that meet the criteria of your query
statements.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Entity, Document
bigdata = Bigdata()
MICROSOFT = "228D42"
query = Entity(MICROSOFT) & Document("0B4EE52A6A611A8326D7EA3E8DC075E3","9C67269CD8747E33DDEE94554A13E6EC")
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
TranscriptTypes
At this point, you’re already familiar with the various components of a
query and how to filter by specific types of content. Now, let’s delve
into how to perform queries that allow you to discover transcripts with
greater precision:
TranscriptTypes: This filter enables querying by the document type of
the transcript. A DocumentChunk will be defined by a single document
type at a time, with the possible values being:
- ANALYST_INVESTOR_SHAREHOLDER_MEETING: Analyst, Investor and Shareholder meeting.
- CONFERENCE_CALL: General Conference Call. (Coming soon)
- GENERAL_PRESENTATION: General Presentation.
- EARNINGS_CALL: Earnings Call.
- EARNINGS_RELEASE: Earnings Release. (Coming soon)
- GUIDANCE_CALL: Guidance Call.
- SALES_REVENUE_CALL: Sales and Revenue Call.
- SALES_REVENUE_RELEASE: Sales and Revenue Release. (Coming soon)
- SPECIAL_SITUATION_MA: Special Situation, M&A and Other.
- SHAREHOLDERS_MEETING: Shareholders Meeting. (Coming soon)
- MANAGEMENT_PLAN_ANNOUNCEMENT: Management Plan Announcement. (Coming soon)
- INVESTOR_CONFERENCE_CALL: Investor Conference Call. (Coming soon)
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Entity, TranscriptTypes
bigdata = Bigdata()
MICROSOFT = "228D42"
query = Entity(MICROSOFT) & TranscriptTypes.EARNINGS_CALL
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
SectionMetadata: This filter allows querying for segments inside
transcript documents. A DocumentChunk will be defined by one or more
sections, always within its hierarchical structure:
- QA: Question and answer section. This section can be decomposed into:
  - QUESTION: A question asked to a speaker during the session.
  - ANSWER: An answer from a speaker of the event.
- MANAGEMENT_DISCUSSION: Management Discussion section.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Entity, TranscriptTypes, SectionMetadata
bigdata = Bigdata()
MICROSOFT = "228D42"
query = Entity(MICROSOFT) & TranscriptTypes.EARNINGS_CALL & SectionMetadata.MANAGEMENT_DISCUSSION
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
FilingTypes
You can also query a specific Filing type:
FilingTypes: This filter enables querying by a filing type. A
DocumentChunk will be defined by a single document type at a time, with
the possible values being:
- SEC_10_K: Annual report filing regarding a company’s financial
  performance submitted to the Securities and Exchange Commission (SEC).
- SEC_10_Q: Quarterly report filing regarding a company’s financial
  performance submitted to the SEC.
- SEC_8_K: Report filed whenever a significant corporate event takes
  place that triggers a disclosure submitted to the SEC.
- SEC_20_F: Annual report filing for non-U.S. and non-Canadian
  companies that have securities trading in the U.S.
- SEC_S_1: Filing needed to register the securities of companies that
  wish to go public in the U.S.
- SEC_S_3: Filing utilized when a company wishes to raise capital.
- SEC_6_K: Report of foreign private issuer pursuant to rules 13a-16
  and 15d-16.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Entity, FilingTypes
bigdata = Bigdata()
MICROSOFT = "228D42"
query = Entity(MICROSOFT) & FilingTypes.SEC_10_K
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
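The filing values listed above follow a regular naming pattern: the SEC form name with hyphens replaced by underscores and an SEC_ prefix. The sketch below uses that pattern to translate a form name such as "10-K" into the corresponding member name. The helper filing_member_name and the set KNOWN_FILING_MEMBERS are hypothetical and only cover the values listed in this section.

```python
# Member names taken from the list of FilingTypes values in this section.
KNOWN_FILING_MEMBERS = {
    "SEC_10_K", "SEC_10_Q", "SEC_8_K", "SEC_20_F", "SEC_S_1", "SEC_S_3", "SEC_6_K",
}


def filing_member_name(form: str) -> str:
    """Translate an SEC form name such as '10-K' into a member name such as 'SEC_10_K'."""
    name = "SEC_" + form.strip().upper().replace("-", "_")
    if name not in KNOWN_FILING_MEMBERS:
        raise ValueError(f"Unsupported filing form: {form!r}")
    return name


print(filing_member_name("10-K"))
```

Assuming the values are reachable as attributes, as in FilingTypes.SEC_10_K above, the filter itself could then be looked up with getattr(FilingTypes, filing_member_name("10-K")).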
Reporting details
When querying TranscriptTypes or FilingTypes, you can also filter by
reporting details like:
- FiscalYear: Integer representing the annual reporting period.
- FiscalQuarter: Integer representing the fiscal quarter covered.
- ReportingEntity: This field allows searching by the reporting company.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Entity, TranscriptTypes, SectionMetadata, FiscalYear, FiscalQuarter, ReportingEntity
bigdata = Bigdata()
MICROSOFT = "228D42"
query = (
    Entity(MICROSOFT)
    & TranscriptTypes.EARNINGS_CALL
    & SectionMetadata.MANAGEMENT_DISCUSSION
    & FiscalYear(2024) & FiscalQuarter(2)  # filter by fiscal quarter 2, 2024
    # & FiscalQuarter(2)  # filter by fiscal quarter, any year
    # & FiscalYear(2024)  # filter by fiscal year only
    & ReportingEntity(MICROSOFT)  # Reported by the company itself
)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
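When the same query needs to be repeated over several consecutive reporting periods, the (year, quarter) pairs can be generated up front. The following is a minimal sketch that only does the counting. The helper previous_quarters is a hypothetical name, and it assumes fiscal quarters are numbered 1 to 4.

```python
from typing import List, Tuple


def previous_quarters(year: int, quarter: int, count: int) -> List[Tuple[int, int]]:
    """List `count` fiscal periods as (year, quarter), newest first, starting at the given one."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"Quarter must be 1-4, got {quarter}")
    periods = []
    for _ in range(count):
        periods.append((year, quarter))
        quarter -= 1
        if quarter == 0:
            # Step back into the fourth quarter of the previous fiscal year.
            year, quarter = year - 1, 4
    return periods


print(previous_quarters(2024, 2, 4))
# prints [(2024, 2), (2024, 1), (2023, 4), (2023, 3)]
```

Each pair could then be combined into a query as FiscalYear(year) & FiscalQuarter(quarter), following the example above.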
FileTag
You can also add a tag to your query to filter by private documents that
include that tag.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import Entity, FileTag
bigdata = Bigdata()
MICROSOFT = "228D42"
query = (
    Entity(MICROSOFT)
    & FileTag("tag_1", "tag_2")
)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
Query operators
Bigdata also lets you express complex queries concisely by combining
different query filters with the & (AND), | (OR), and ~ (NOT)
operators. For example:
from bigdata_client import Bigdata
from bigdata_client.daterange import RollingDateRange
from bigdata_client.models.search import DocumentType
from bigdata_client.query import Entity, Keyword, Topic, Similarity
bigdata = Bigdata()
TESLA = "DD3BB1"
APPLE = "D8442A"
GOOGLE = "D8C3A1"
tech_companies = Entity(TESLA) | Entity(APPLE) | Entity(GOOGLE)
keywords = Similarity("executive appointment") | Keyword("CEO resignation")
topics = (
    Topic("business,labor-issues,executive-appointment,,")
    | Topic("business,labor-issues,executive-resignation,,")
    | Topic("business,labor-issues,executive-retirement,,")
)
query = tech_companies & (keywords | topics)
search = bigdata.search.new(query)
for result in search.limit_documents(10):
    print(result)
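To build intuition for how &, | and ~ can compose filters, here is a toy model of operator overloading that builds an expression tree. It is not the library's implementation. The classes Expr, Term, and Op are hypothetical and exist only to show how Python groups the operators.

```python
from dataclasses import dataclass
from typing import Tuple


class Expr:
    """Base class: the operators build larger expressions instead of evaluating anything."""

    def __and__(self, other: "Expr") -> "Expr":
        return Op("AND", (self, other))

    def __or__(self, other: "Expr") -> "Expr":
        return Op("OR", (self, other))

    def __invert__(self) -> "Expr":
        return Op("NOT", (self,))


@dataclass(frozen=True)
class Term(Expr):
    """A leaf filter, e.g. an entity ID or a keyword."""

    kind: str
    value: str

    def __str__(self) -> str:
        return f"{self.kind}({self.value})"


@dataclass(frozen=True)
class Op(Expr):
    """An inner node joining one or two sub-expressions."""

    name: str
    operands: Tuple[Expr, ...]

    def __str__(self) -> str:
        if self.name == "NOT":
            return f"NOT {self.operands[0]}"
        return "(" + f" {self.name} ".join(str(o) for o in self.operands) + ")"


query = Term("Entity", "DD3BB1") & (Term("Keyword", "CEO") | ~Term("Keyword", "interim"))
print(query)
# prints (Entity(DD3BB1) AND (Keyword(CEO) OR NOT Keyword(interim)))
```

Python evaluates ~ before &, and & before |, so an OR group that should be combined with AND needs explicit parentheses, as in tech_companies & (keywords | topics) above.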
This should be sufficient for most use cases, but sometimes the query is
built from an external list of entities, keywords, topics, etc. For
example, given a list of entity IDs, you could do:
from bigdata_client import Bigdata
from bigdata_client.query import Entity
bigdata = Bigdata()
entity_ids = read_entity_ids_from_file() # Just for explanation purposes
entities = [Entity(eid) for eid in entity_ids]
query = None
for entity in entities:
    if query is None:
        query = entity
    else:
        query = query | entity
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
This is a bit cumbersome, so we provide two helper functions to make
this easier: All and Any. The first combines a list of entities,
keywords, topics, etc. with the AND operator, and the second combines
them with the OR operator. With the help of Any, the previous example
can be rewritten as:
from bigdata_client import Bigdata
from bigdata_client.query import Entity, Any
bigdata = Bigdata()
entity_ids = read_entity_ids_from_file() # Just for explanation purposes
entities = [Entity(eid) for eid in entity_ids]
query = Any(entities)
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
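Combining a list with the OR operator is a fold, and the standard library can express it with functools.reduce. The sketch below shows the idea on Python sets, which also support |, so it runs without the client. The helper or_together is a hypothetical name, and this is not a description of how Any is implemented.

```python
from functools import reduce
from operator import or_
from typing import Sequence, TypeVar

T = TypeVar("T")


def or_together(items: Sequence[T]) -> T:
    """Fold a non-empty sequence with `|`, one pair at a time."""
    if not items:
        raise ValueError("Cannot build a query from an empty list")
    return reduce(or_, items)


# Sets support `|` too, so they serve as stand-ins for query filters.
combined = or_together([{"228D42"}, {"D8442A"}, {"DD3BB1"}])
print(sorted(combined))
```

The explicit check matters because a loop that starts from None silently leaves the query as None when the input list is empty, whereas this version fails early with a clear message.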
Document Version
Search by Document Version.
Example:
from bigdata_client import Bigdata
from bigdata_client.query import DocumentVersion
bigdata = Bigdata()
VERSION = "RAW"
query = DocumentVersion(VERSION)
# Search for DocumentVersion
search = bigdata.search.new(query)
documents = search.run(2)
print(documents)
See the DocumentVersion class for further details.