OpenAlex CLI

The OpenAlex CLI is our official command-line tool for downloading data from OpenAlex. It's the easiest way to build a local corpus of work metadata and full-text content (PDFs, TEI XML) for text mining, machine learning, or research.

pip install openalex-official

Work in progress. The CLI currently focuses on work metadata and content downloads. We're adding more features like CSV export and queries for other entity types. Follow development on GitHub.

Quick examples

Download metadata for works on a topic:

openalex download \
  --api-key YOUR_KEY \
  --output ./frogs \
  --filter "topics.id:T10325"

This saves a JSON file for each work with the complete metadata from OpenAlex.

Download metadata + PDFs:

openalex download \
  --api-key YOUR_KEY \
  --output ./frogs \
  --filter "topics.id:T10325" \
  --content pdf

Download metadata + PDFs + TEI XML:

openalex download \
  --api-key YOUR_KEY \
  --output ./frogs \
  --filter "topics.id:T10325" \
  --content pdf,xml

Download by DOI:

Pipe in a list of work IDs:

Output format

By default, metadata is saved as JSON files alongside any content:

Why use the CLI?

Building a robust bulk downloader is harder than it looks. The CLI handles:

  • Metadata by default: Every work gets a complete JSON file

  • Parallel downloads: Up to 200 concurrent connections

  • Automatic checkpointing: Resume interrupted downloads without re-downloading

  • Adaptive rate limiting: Adjusts to API conditions automatically

  • DOI resolution: Auto-detects DOIs and converts them to OpenAlex IDs

  • Progress tracking: Real-time stats in your terminal

At full speed, you can download thousands of works per hour.
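To make the DOI resolution item above more concrete, here is a minimal Python sketch of how an identifier could be classified as a DOI or an OpenAlex work ID and mapped to an API URL. It is not the CLI's actual implementation, which this page does not show. The function name `work_api_url` and both regular expressions are hypothetical, and the `/works/doi:...` URL form is assumed from the public OpenAlex API's support for DOI lookups.

```python
import re

# Hypothetical patterns for illustration; the CLI's real detection logic may differ.
# Matches bare DOIs, "doi:" prefixed DOIs, and doi.org URLs.
DOI_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?(10\.\d{4,9}/\S+)$", re.IGNORECASE
)
# Matches OpenAlex work IDs such as W2741809807, with or without the URL prefix.
OPENALEX_ID_PATTERN = re.compile(
    r"^(?:https?://openalex\.org/)?(W\d+)$", re.IGNORECASE
)


def work_api_url(identifier: str) -> str:
    """Return an OpenAlex API URL for a work given a DOI or an OpenAlex ID."""
    identifier = identifier.strip()

    match = OPENALEX_ID_PATTERN.match(identifier)
    if match:
        # OpenAlex IDs are looked up directly, normalized to upper case.
        return f"https://api.openalex.org/works/{match.group(1).upper()}"

    match = DOI_PATTERN.match(identifier)
    if match:
        # Assumption: the API accepts a "doi:" namespace prefix for DOI lookups.
        return f"https://api.openalex.org/works/doi:{match.group(1)}"

    raise ValueError(f"Not a recognizable DOI or OpenAlex work ID: {identifier!r}")


if __name__ == "__main__":
    for raw in ["W2741809807", "https://doi.org/10.1234/example", "doi:10.1234/example"]:
        print(raw, "->", work_api_url(raw))
```

Checking for an OpenAlex ID before a DOI keeps the common case cheap and avoids ambiguity, since a work ID can never start with `10.`.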

Credit costs

  • Metadata downloads are free: The singleton API doesn't cost credits

  • Content downloads cost 100 credits each: PDFs and TEI XML

With a free API key (100K credits/day), you can download unlimited metadata and about 1,000 content files per day.
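The arithmetic behind that estimate can be sketched in a few lines of Python, which may help when planning a larger corpus. The constants come from the figures above. The function names and the `files_per_work` parameter are hypothetical, and the sketch assumes that each content file (a PDF or a TEI XML file) is billed separately, which this page does not state explicitly.

```python
DAILY_CREDITS = 100_000          # free API key allowance, per the figures above
CREDITS_PER_CONTENT_FILE = 100   # cost of one PDF or TEI XML download


def content_files_per_day(daily_credits: int = DAILY_CREDITS,
                          cost_per_file: int = CREDITS_PER_CONTENT_FILE) -> int:
    """How many content files one day's credits can pay for."""
    return daily_credits // cost_per_file


def days_needed(n_works: int, files_per_work: int = 1,
                daily_credits: int = DAILY_CREDITS,
                cost_per_file: int = CREDITS_PER_CONTENT_FILE) -> int:
    """Days required to download content for n_works.

    Assumption: every file counts separately, so requesting both PDF and
    XML for a work means files_per_work=2.
    """
    total_files = n_works * files_per_work
    per_day = content_files_per_day(daily_credits, cost_per_file)
    return -(-total_files // per_day)  # ceiling division


if __name__ == "__main__":
    print(content_files_per_day())             # files per day on a free key
    print(days_needed(2500))                   # PDFs only
    print(days_needed(2500, files_per_work=2)) # PDFs and TEI XML
```

Metadata is left out of the calculation because metadata downloads do not consume credits.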

Need more content? Contact us about enterprise credit packs for large-scale projects.

Full documentation

For all options and advanced usage, see the GitHub README.

What's next?

We're actively developing the CLI. Planned features include:

  • CSV/JSON export of search results

  • More entity types beyond works

Have a feature request? Open an issue.
