We’ve recently reached an important milestone for the research nexus: the works in our metadata corpus are now connected with over 2 billion citation links! This is a great opportunity to share a dedicated dataset and discuss why these are important for science.
Reference metadata is a lifeline for discoverability. Scholars use citations to critique and build on existing research, and they acknowledge the contributions of others through references. Our members can then deposit those references as part of their metadata with Crossref, and we use them to link the cited and citing objects. The result is a complex thematic network that interested researchers can explore. Many research discovery tools use the linked reference metadata in Crossref to support searches for related content.
The citation links are derived from bibliographic references in the metadata of one work that include the DOIs of the materials it cites (scholarly works, data, code, etc.). It is always best if members deposit these relationships in full. In a recent post, we shared that nearly half of these links are asserted by our members through metadata deposits, and the other half are created thanks to our automated matching. This form of metadata enrichment happens when members include some information about a reference but not the DOI of the cited work, and that information is enough for us to find and add the DOI automatically. The enrichment makes the data more useful for the community.
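As a rough illustration of the split between member-asserted and matched links, the sketch below tallies the references in a single work record by how each DOI link was established. It assumes a record shaped like the JSON the Crossref REST API returns for a work, where each entry in `reference` may carry a `DOI` and a `doi-asserted-by` value of `publisher` or `crossref`. The function name and the sample record are invented for the example.

```python
from collections import Counter


def tally_reference_links(work):
    """Count a work's references by how their DOI link was established."""
    counts = Counter()
    for ref in work.get("reference", []):
        if "DOI" not in ref:
            # No DOI deposited and none found by matching.
            counts["unmatched"] += 1
        else:
            # "publisher": deposited by the member; "crossref": added by matching.
            counts[ref.get("doi-asserted-by", "unknown")] += 1
    return dict(counts)


# Hypothetical record, invented for illustration only.
sample_work = {
    "DOI": "10.5555/example",
    "reference": [
        {"key": "ref1", "DOI": "10.5555/aaa", "doi-asserted-by": "publisher"},
        {"key": "ref2", "DOI": "10.5555/bbb", "doi-asserted-by": "crossref"},
        {"key": "ref3", "unstructured": "A reference with no DOI found."},
    ],
}

print(tally_reference_links(sample_work))
```

The same tally could be run over many records to approximate the proportions described above, although the exact figures depend on which records are sampled.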
The most important impact of citation links is the increased discoverability of connected works. Reference metadata is an important tool for improving the visibility and readership of our members’ content. These links are also the foundation of our Cited-by service, which enables participating members to display citation counts for the works they have published on their landing pages.
The chart below shows the cumulative count of citations over time, by the created date of the citing DOI’s record. The count includes records linked by DOI, either through member-submitted metadata or through matching by Crossref, as well as records that are unmatched. Unmatched records can include those that we were unable to match with the information we have, but also those that truly have no DOI to link to. The full dataset of all 2 billion citation links between Crossref DOIs is available now as a (somewhat hefty) download.
Cumulative count of references deposited to Crossref by created date of citing DOI
The push for open citation data has unfolded over the last few decades, making more and more of these relationships public. Notably, the growth in citation links reflects not just the output of new scholarship, but also a sustained effort to extend coverage of the historical scholarly record. We can see evidence of this playing out over time by looking at our historical data: periodic snapshots of Crossref’s metadata going back to 2019. By comparing successive snapshots and examining the publication dates of citing and cited works, we can classify each newly appearing citation as either a new paper citation or a retrospective one. A new citation is one where the citing work was published since the previous snapshot, representing real growth in the scholarly record. A retrospective citation is one where both papers already existed but the link between them had not yet been captured by Crossref; these represent indexing catch-up rather than new publishing activity.
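The classification rule described above can be sketched in a few lines. This is a minimal illustration rather than the code behind the analysis: the function name, the use of full dates, and the `"other"` fallback for links that fit neither definition are all assumptions made for the example.

```python
from datetime import date


def classify_citation(citing_published, cited_published, previous_snapshot):
    """Label a citation link that first appears in the current snapshot."""
    if citing_published > previous_snapshot:
        # The citing work was published since the previous snapshot.
        return "new"
    if cited_published <= previous_snapshot:
        # Both works already existed; only the link is newly captured.
        return "retrospective"
    # Assumed fallback: the text defines only the two categories above.
    return "other"


previous_snapshot = date(2023, 1, 1)  # hypothetical snapshot date

print(classify_citation(date(2023, 6, 1), date(2020, 3, 1), previous_snapshot))
print(classify_citation(date(2015, 5, 1), date(2010, 2, 1), previous_snapshot))
```

With these made-up dates, the first link would be labelled as a new paper citation and the second as a retrospective one.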
The chart below shows the cumulative count of citations added in each category since 2019. In the early years of our data, retrospective backfill was the dominant source: the blue line climbs steeply from 2019 to 2021 as a large volume of previously uncaptured historical citation relationships entered the corpus. Over time, however, that rate of backfilling has levelled off. New paper citations, meanwhile, have grown steadily throughout the period, and by 2025 they surpassed the cumulative retrospective total. The open citation ecosystem continues recovering historical links, but the citation network’s growth is now increasingly driven by the natural momentum of scholarly publishing itself.
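To make the crossover concrete, here is a small sketch of how one might find the first snapshot at which cumulative new paper citations exceed cumulative retrospective ones. The function name is hypothetical and the per-snapshot counts are toy numbers chosen only to show the shape of the calculation; they are not Crossref figures.

```python
from itertools import accumulate


def first_crossover(labels, new_added, retro_added):
    """Return the first label where cumulative new exceeds cumulative retrospective."""
    cumulative_new = accumulate(new_added)
    cumulative_retro = accumulate(retro_added)
    for label, new_total, retro_total in zip(labels, cumulative_new, cumulative_retro):
        if new_total > retro_total:
            return label
    return None  # no crossover within the period covered


# Toy inputs, invented for illustration only.
labels = ["snapshot 1", "snapshot 2", "snapshot 3", "snapshot 4"]
new_added = [1, 2, 3, 4]    # links added per snapshot from newly published works
retro_added = [5, 2, 1, 0]  # backfilled links added per snapshot

print(first_crossover(labels, new_added, retro_added))
```

In this toy series the retrospective total leads early and flattens, so the crossover only appears at the last snapshot, which mirrors the pattern in the chart.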
Cumulative citations added to Crossref by type, 2019–2026. Retrospective citations (blue) represent links to and from works that existed before the previous snapshot; new paper citations (green) come from works published since the last snapshot.
Combined with other metadata for more context, reference metadata supports bibliographic and meta-research on different aspects of the scholarly process, and can support judgements about research integrity and conflicts of interest.
Stereotypically, when talking about references, we think of links to published works (whether preprints, journal articles, or books). However, all types of records in Crossref can be cited. Thanks to the changes in our latest schema, members can now signal the type of content being referenced. And with our new Data citations endpoint, the community can explore links from Crossref-registered records to research data specifically, including citation links to works within Crossref as well as within DataCite’s corpus.
Close to half of all records registered with Crossref still have no reference information, or not enough, to make such connections. We invite members to our regular Metadata health-check webinars, which support them in improving the completeness of their records for increased transparency and visibility.