Known issues
OpenAlex is still very new, so you'll probably run into some bugs as you look through the data. This page documents the ones we currently know about.
Please report any other issues you find by emailing us at [email protected]

Some strings not yet matched to entities

We've got a lot of strings floating around for venues and institutions that haven't actually been linked to a Venue or Institution entity in the database. These show up as objects missing an id property, in these fields:
These ID-less objects are tricky because they can't do most of the things a regular entity can. Right now, they're just a suitcase for a name. They are inherited from MAG, and we plan to fix them. Over the next month or so, we'll be processing all these stub entities, clustering them together, and minting tens of millions of new entities from them.
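Until then, you may want to spot these stubs in records you've already downloaded. Here's a minimal sketch that walks a record and reports every nested object that has a name but no id. It assumes stubs look like objects with a `display_name` and a missing or null `id`. The record below is hypothetical, and its field names (`host_venue`, `authorships`, `institutions`) are only for illustration.

```python
from typing import Any, Iterator


def find_stub_objects(node: Any, path: str = "") -> Iterator[tuple[str, str]]:
    """Yield (path, display_name) for every nested dict that has a name but no id."""
    if isinstance(node, dict):
        # Assumption: a stub is a dict with a "display_name" but a missing or null "id".
        if node.get("display_name") and not node.get("id"):
            yield path, node["display_name"]
        for key, value in node.items():
            yield from find_stub_objects(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_stub_objects(item, f"{path}[{index}]")


# Hypothetical record, trimmed to the parts that matter here.
work = {
    "id": "https://openalex.org/W0000000000",
    "display_name": "An example work",
    "host_venue": {"id": None, "display_name": "Some Unmatched Journal"},
    "authorships": [
        {"institutions": [{"display_name": "Some Unmatched University"}]},
    ],
}

stubs = list(find_stub_objects(work))
for stub_path, name in stubs:
    print(f"{stub_path}: {name}")
```

Because the walker doesn't hard-code any field names, it should keep working if stubs turn up in places you didn't expect.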

MAG format snapshot has a few duplicate rows and escaping issues

We're continuing to improve our processes to make sure the data in the MAG format is clean and easy to pull into a relational database. The current release still has a few issues, but we'll try to fix these by the next release.
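In the meantime, one way to guard against the duplicate rows is to filter them out before loading. This is a rough sketch, not part of our tooling: it assumes each row is a single line of tab-separated text and that duplicates are exact, character-for-character repeats. The sample rows are made up. It does nothing about the escaping issues.

```python
from typing import Iterable, Iterator


def drop_duplicate_rows(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct line once, keeping the first occurrence and the original order."""
    seen: set[str] = set()
    for line in lines:
        row = line.rstrip("\r\n")  # ignore line-ending differences when comparing
        if row in seen:
            continue
        seen.add(row)
        yield row


# Made-up sample rows; the third is an exact repeat of the first.
rows = [
    "1\tFirst title\n",
    "2\tSecond title\n",
    "1\tFirst title\n",
]
unique_rows = list(drop_duplicate_rows(rows))
```

This keeps every distinct row in memory, so for the larger tables you'd likely be better off loading into a staging table and deduplicating inside the database.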

Questionable dates

Some dates, notably publication dates, come from external sources like publishers and are included in OpenAlex as-is. Dates in the future can be especially suspect.
https://openalex.org/W4205467938 has a publication date of 2023-01-31, for example (if you're reading this after February 2023, that date used to be in the future). This date came from publisher-submitted Crossref metadata for this article. Looking at https://dl.acm.org/doi/10.1145/3485132, this does seem to be part of an ACM issue-in-progress with a print publication date of 2023-01-31.
https://openalex.org/W4200151376 has a publication date in 2029. This also comes from the publisher's Crossref metadata, but it's less plausible that the journal has an issue planned that far in advance. On https://doi.org/10.12960/tsh.2020.0006, we see an accepted date of 2019-12-21 and a publication date of 2029-01-31, suggesting that the latter is a typo and the publication_date is wrong.
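If you want to flag dates like these in your own analysis, something along these lines might help. It's a sketch under a couple of assumptions: dates are ISO `YYYY-MM-DD` strings, and anything more than a year ahead of your reference date deserves a second look. The one-year cutoff and the function name are ours for illustration, not a rule OpenAlex applies.

```python
from datetime import date


def classify_publication_date(
    publication_date: str, today: date, max_days_ahead: int = 365
) -> str:
    """Label a date as past, plausibly in the future, or suspiciously far ahead."""
    days_ahead = (date.fromisoformat(publication_date) - today).days
    if days_ahead <= 0:
        return "past_or_today"
    if days_ahead <= max_days_ahead:
        # For example, an issue-in-progress with a scheduled print date.
        return "future_plausible"
    # Assumption: anything further out is more likely a typo than a real plan.
    return "future_suspect"


# Hypothetical reference date, chosen so both examples above are in the future.
reference = date(2022, 2, 1)
acm_label = classify_publication_date("2023-01-31", reference)
typo_label = classify_publication_date("2029-01-31", reference)
```

With that reference date, the ACM example lands in `future_plausible` and the 2029 example in `future_suspect`, which matches the reasoning above.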