50 billion new facts Added to the KG in the last month
by Kris Negulescu50 billion new facts were added to the Diffbot Knowledge Graph in the last month, including 30M new organizations and 600M articles.
50 billion new facts were added to the Diffbot Knowledge Graph in the last month, including 30M new organizations and 600M articles.
Use this optional param to return the extracted data as llm-ready markdown, i.e. '&mode=llm', including interactive elements. Try it here: Website example or Github example.
KG DATA CHANGE NOTIFICATION - Organization.naceClassification
We will be updating Organization.naceClassification to NACE Rev. 2.1 in build v437 of the Diffbot Knowledge Graph, targeted to go live in about two weeks. Please read on for more details.
Ordinarily, we take extraordinary measures to avoid breaking changes in the Diffbot Knowledge Graph ontologies. However, in some cases, there is no benefit in retaining a prior version of the data, so we replace an existing attribute with a new data format. The Organization.naceClassification field is one such case. The current version of the NACE codes in the KG lacks level, isPrimary, and ancestor codes. And, some of the codes are no longer valid in the latest NACE Rev. 2.1 version.
In Rev 2.1 of the NACE codes:
For a comparison of the existing code format versus the new Rev 2.1 format, see below.
Organization.naceClassificationVolkswagen's current NACE classification in the KG appears as the following
[
{
"code": "2910",
"isPrimary": false,
"name": "Manufacture of motor vehicles"
},
{
"code": "7022",
"isPrimary": false,
"name": "Business and other management consultancy activities"
},
{
"code": "7021",
"isPrimary": false,
"name": "Public relations and communication activities"
}
]Issues with this data:
When the updates deploy, Volkswagen's Organization.naceClassification NACE codes will look like this:
[
{
"code": "29.10",
"level": 4,
"isPrimary": true,
"name": "Manufacture of motor vehicles",
"version": "Rev 2.1"
},
{
"code": "29.1",
"level": 3,
"isPrimary": true,
"name": "Manufacture of motor vehicles",
"version": "Rev 2.1"
},
{
"code": "29",
"level": 2,
"isPrimary": true,
"name": "Manufacture of motor vehicles, trailers and semi-trailers",
"version": "Rev 2.1"
},
{
"code": "C",
"level": 1,
"isPrimary": true,
"name": "MANUFACTURING",
"version": "Rev 2.1"
},
{
"code": "28.11",
"level": 4,
"isPrimary": false,
"name": "Manufacture of engines and turbines, except aircraft, vehicle and cycle engines",
"version": "Rev 2.1"
},
{
"code": "28.1",
"level": 3,
"isPrimary": false,
"name": "Manufacture of general-purpose machinery",
"version": "Rev 2.1"
},
{
"code": "28",
"level": 2,
"isPrimary": false,
"name": "Manufacture of machinery and equipment n.e.c.",
"version": "Rev 2.1"
}
]diffbot-small-xl is finally live on https://lmarena.ai/! Check out: https://lmarena.ai/leaderboard/search. LMArena has open-sourced the largest repository of organic human preferences on generative models in the world. These datasets are free and open to access.
Try Diffbot LLM by dropping our hosted API URL into an existing OpenAI SDK project, or on the web at diffy.chat.
Khuyen Tran and Bex Tuychiev also wrote up a great getting started guide you can follow.
You can now label and disable tokens from the Dashboard. And, be sure to use a child token in production environments. They're much easier to replace if compromised.
You can now find us on Postman! We're starting with Extract API, and moving quickly to get the rest of our APIs on Postman as well.
Postman is an API testing platform that eliminates the need to manually write cURL. The API testing UI is quite similar to what we have in the docs, with even more features to setup your environment, testing scripts, and more.
Note that our primary documentation platform will continue to live on docs.diffbot.com. Postman is an extension of our docs presence to make it easier for Postman users to test Diffbot APIs on their preferred platform.
Fork and watch our Diffbot API collection on Postman!
Investment Transactions are now searchable on LeadGraph! This makes it possible to:
⁃ Stay on top of recent funding rounds
⁃ Find investors that have invested in companies with particular industries, keywords, company size, etc.
⁃ See funding insights for investors, industries, funding rounds, and more....
We now offer additional troubleshooting tools in the Extract UI: a DOM view of the rendered page for all data types and an HTML button for article extractions.
You can now use the DQL search API to get company reports for Organizations in your database, at scale, including 10-Ks, 10-Qs, 8-K, etc. To date, we have exported ~3M SEC EDGAR reports . And, we have started to download reports from Forbes Global 2000 company websites as well with ~400K reports downloaded so far to date . This data is still a work in progress so review outputs carefully, e.g. we are working to improve report titles extracted from PDFs.
The report types we support include Current Reports, Quarterly Reports, Annual Reports, and more . Please let us know if there are reports you'd like us to add to the graph.