Open Access Licences and Open Metadata in Transformative Agreements

Author: Sophia Dörner

Published: May 15, 2026

DOI: 10.59350/nv30b-3yg69

Abstract
I analyse Crossref metadata for around 250,000 journal articles published between 2019 and 2025. The analysis reveals that contractual language alone is insufficient to ensure implementation: coverage of specific metadata elements varied considerably across publishers and agreements, and explicit provisions do not consistently translate into higher coverage. A notable near-absence of ROR ID coverage across most agreements highlights gaps that undermine institutional attribution in research assessment and bibliometric analyses, while CC BY licence adoption is highest where agreements mandate it as the sole permissible licence.

Introduction

Over the past decade, transformative agreements have been instrumental in driving the transition to open access by redirecting former subscription costs to fund open access publishing for affiliated authors of the negotiating institutions (de Jonge et al. 2025; Dér 2025). Open publishing agreements have gained serious traction, with the ESAC registry listing 1,644 such agreements as of April 29, 2026, including 177 concluded with German institutions and consortia (ESAC Initiative, n.d.).

In Dörner (2026), I conducted a content analysis of 13 transformative agreements between scientific publishers and German consortia, with a focus on contractual provisions related to open metadata and data-analytical research services. With respect to open metadata provisions, 12 of the 13 agreements mandated metadata submission to Crossref. However, only four agreements specified which metadata fields were to be delivered to Crossref, and even these lists were non-exhaustive. The fields explicitly referenced included, inter alia, author information, ORCID identifiers, affiliations, and open access licence information.

This blog post extends the prior content analysis by assessing the implementation of contractual provisions through a data-driven approach, focusing on two aspects: (i) CC BY licence adoption and (ii) open metadata coverage including ORCID and ROR IDs, text and data mining (TDM) information, funding information, funder DOIs, and open abstracts. I build upon prior work by the Hybrid Open Access Dashboard (HOAD) and the Sesame Open Science (SOS) Crossref truth table. HOAD is an openly available data analytics tool that currently tracks, amongst other things, the use of Creative Commons (CC) licences and the coverage of open metadata in Crossref for open access articles in hybrid journals. The metadata captured by HOAD include TDM information, ORCID coverage, funding information, and the coverage of open abstracts and citations (Achterberg and Jahn 2023). However, some dimensions are missing from HOAD, such as ROR coverage and information for articles in fully open access journals. The SOS Crossref truth table is a processed table based on Crossref data indicating presence and counts of several metadata elements for each record in Crossref. Compared to HOAD, the truth table provides detailed information on affiliations and ROR coverage, but does not include TDM and CC licence information. Furthermore, its scope is broader, meaning that articles made available through transformative agreements are not pre-identified.

Data and Method

This analysis reproduces and adapts the data analytics workflow underlying HOAD, as described in Jahn (2025b). To this end, data retrieval started with obtaining journal and institutional information from preserved transformative agreement data from the cOAlition S Journal Checker Tool (JCT; see footnote 1). I subsequently filtered the data based on the ESAC IDs of the selected 13 transformative agreements (see Table 1) to obtain information about journal portfolios and participating institutions pertaining specifically to the respective agreements. Journal information was further enriched with linking ISSNs (ISSN-Ls), as provided by the ISSN International Centre in the February 2026 version, based on ISSN matching so that each journal can be referenced uniquely.
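The journal preparation step can be illustrated with a minimal Python sketch. It is not the code used for this analysis, which relies on SQL and R. The function name `journals_for_agreements`, the record layout, the ESAC ID `example2024ta`, and all ISSNs are hypothetical and serve only to show how filtering by ESAC ID and grouping ISSN variants under one ISSN-L might work.

```python
def journals_for_agreements(jct_rows, issn_to_issn_l, esac_ids):
    """Group the ISSNs of selected agreements under their linking ISSN.

    jct_rows       -- dicts with 'esac_id' and 'issn' (assumed layout)
    issn_to_issn_l -- lookup table from ISSN to ISSN-L (assumed layout)
    esac_ids       -- set of ESAC IDs to keep
    """
    journals = {}
    for row in jct_rows:
        if row["esac_id"] not in esac_ids:
            continue  # agreement is not part of the selection
        issn = row["issn"].strip().upper()
        issn_l = issn_to_issn_l.get(issn)
        if issn_l is None:
            continue  # no linking ISSN known for this ISSN
        journals.setdefault((row["esac_id"], issn_l), set()).add(issn)
    return journals


# Hypothetical input: print and electronic ISSN of the same journal,
# plus one journal from an agreement outside the selection.
jct_rows = [
    {"esac_id": "example2024ta", "issn": "1234-5678"},
    {"esac_id": "example2024ta", "issn": "8765-4321"},
    {"esac_id": "other2020ta", "issn": "1111-2222"},
]
issn_to_issn_l = {
    "1234-5678": "1234-5678",
    "8765-4321": "1234-5678",
    "1111-2222": "1111-2222",
}

selected = journals_for_agreements(jct_rows, issn_to_issn_l, {"example2024ta"})
```

Keying the result by ESAC ID and ISSN-L means that a journal registered with several ISSNs appears only once per agreement.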

Using the compiled JCT data, I retrieved article-level metadata from the Crossref April 2026 database snapshot and enriched it with first-author affiliation data from the OpenAlex Walden April 2026 database snapshot. All data was accessed via the SUB Göttingen Open Scholarly Data Warehouse. It should be noted that affiliation metadata coverage in OpenAlex has been reported to have declined substantially following the transition to Walden, partly due to publishers not sharing affiliation metadata via Crossref and partly due to technical issues within OpenAlex itself (Jahn 2025a), which may affect the completeness of first-author affiliation matching in this analysis.

The article-level data was restricted to the publication years 2019–2025, based on the issued date, to account for varying transformative agreement terms and full-year data availability (see Table 1). Non-scholarly journal content, such as tables of contents, was excluded via paratext recognition. Enrichment with first-author affiliation data was done by matching JCT participating institution information with OpenAlex affiliation data using ROR IDs, with institutional names as a fallback option. Furthermore, the SOS Crossref truth table data from the January 2026 snapshot was matched using DOIs.
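The affiliation matching with a name fallback can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the query used in the analysis: the function names, the dictionary keys, and the ROR IDs and institution names are all hypothetical.

```python
def normalise(value):
    """Lower-case and trim a string; return None for missing values."""
    return value.strip().lower() if value else None


def is_participating(institution, participants):
    """Return True if an institution matches a participating institution.

    The ROR ID is compared first; the institution name is compared as a
    fallback when the ROR IDs do not match or are missing.
    """
    ror = normalise(institution.get("ror"))
    name = normalise(institution.get("display_name"))
    for participant in participants:
        if ror and ror == normalise(participant.get("ror_id")):
            return True
        if name and name == normalise(participant.get("inst_name")):
            return True
    return False


# Hypothetical participating institutions; the second has no ROR ID.
participants = [
    {"ror_id": "https://ror.org/00example1", "inst_name": "Example University"},
    {"ror_id": None, "inst_name": "Example Research Institute"},
]

matched_by_ror = is_participating(
    {"ror": "HTTPS://ROR.ORG/00EXAMPLE1", "display_name": "Some other spelling"},
    participants,
)
matched_by_name = is_participating(
    {"ror": None, "display_name": " example research institute "},
    participants,
)
```

Name matching of this kind only catches exact matches after normalisation, so spelling variants of an institution name would still be missed.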

Table 1: Transformative agreements concluded with German consortia as analysed in Dörner (2026)
| Publisher | Consortium Lead | Term | ESAC ID |
|---|---|---|---|
| Elsevier | DEAL | 01.09.2023–31.12.2028 | els2023deal |
| Hogrefe | SUB Göttingen | 01.01.2021–31.12.2023 | hogrefe2021gac |
| Hogrefe | SUB Göttingen | 01.01.2024–31.12.2026 | hogrefe2024gac |
| Optica | TIB | 01.01.2023–31.12.2026 | opg2023tib |
| Royal Society of Chemistry | TIB | 01.01.2024–31.12.2027 | rsc2024tib |
| Springer Nature | DEAL | 01.01.2020–31.12.2023 | sn2020deal |
| Springer Nature | DEAL | 01.01.2024–31.12.2028 | sn2024deal |
| Springer Nature | MPDL | 01.01.2021–31.12.2024 | sn2021gac |
| Trans Tech Publications | TIB | 01.01.2024–31.12.2026 | ttp2024tib |
| Walter de Gruyter | SUB Göttingen | 01.01.2022–31.12.2022 | degruy2022gac |
| Walter de Gruyter | SUB Göttingen | 01.01.2023–31.12.2024 | degruy2023gac |
| Wiley | DEAL | 01.01.2019–31.12.2023 | wiley2019deal |
| Wiley | DEAL | 01.01.2024–31.12.2028 | wiley2024deal |

My workflow extends the current HOAD workflows in the following respects: (1) journal coverage includes both hybrid and fully open access journals rather than hybrid journals only, (2) article data is restricted to the 13 transformative agreements presented in Table 1, and (3) the data set is additionally enriched with data from the SOS Crossref truth table.

To obtain information about metadata coverage, I computed counts of authors, affiliations, funders, and their respective persistent identifiers, as well as indicators for the presence of key metadata fields such as TDM support and open abstracts. Affiliation and ROR ID counts were computed as distinct values per article. This means that if multiple authors from the same institution appear on an article, the affiliation and its designated ROR ID are each counted only once. The number of affiliations was determined by first checking for ROR IDs and using name strings as a fallback option when affiliations had no ROR IDs assigned. I identified open access articles by the presence of a CC licence in the version of record metadata. I therefore did not retrieve licence URLs for author accepted manuscript or TDM article versions, as identified by Crossref’s content_version field; licences with an unspecified content_version were retained. Consequently, articles without a CC licence were classified as non-open access, regardless of publisher policies or other access indicators.
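The distinct counting of affiliations and ROR IDs per article can be illustrated with a small Python sketch. It assumes a nested author record loosely modelled on Crossref metadata (authors with an `affiliation` list, each carrying a `name` and a list of `id` entries); the function name and the example values are hypothetical, and the sketch is not the production query.

```python
def count_affiliations(authors):
    """Return (distinct affiliations, distinct ROR IDs) for one article.

    An affiliation is keyed by its identifier when one is present and by
    its name string otherwise, so repeated institutions count once.
    """
    affiliation_keys = set()
    ror_ids = set()
    for author in authors:
        for affiliation in author.get("affiliation") or []:
            identifiers = affiliation.get("id") or []
            if identifiers:
                for identifier in identifiers:
                    affiliation_keys.add(identifier["id"])
                    if (identifier.get("id_type") or "").lower() == "ror":
                        ror_ids.add(identifier["id"])
            elif affiliation.get("name"):
                # Fallback: no identifier, so use the name string.
                affiliation_keys.add(affiliation["name"])
    return len(affiliation_keys), len(ror_ids)


# Hypothetical article: two authors share one institution with a ROR ID,
# a third author has an affiliation given only as a name string.
shared = {
    "name": "Example University",
    "id": [{"id": "https://ror.org/00example1", "id_type": "ROR"}],
}
authors = [
    {"affiliation": [shared]},
    {"affiliation": [shared]},
    {"affiliation": [{"name": "Example Research Institute", "id": []}]},
]

num_affiliations, num_rors = count_affiliations(authors)
```

In this example the article has two distinct affiliations, one of which carries a ROR ID, although three author-affiliation pairs are listed.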

The following SQL code shows the query used to retrieve the article-level data set, compute different metrics and join the SOS Crossref truth table.

SQL code
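-- Crossref articles in journals covered by the 13 agreements, limited to each agreement's publication years
WITH filtered_publications AS (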
  SELECT DISTINCT
    UPPER(TRIM(cr.doi)) AS doi,
    jct.issn_l,
    jct.esac_id AS jn_esac_id,
    cr.publisher AS cr_publisher,
    cr.container_title AS cr_journal_title,
    EXTRACT(YEAR FROM cr.issued) AS cr_year,
    CASE WHEN cr.abstract IS NOT NULL THEN 1 END AS has_abstract,
    cr.license,
    cr.author,
    cr.link,
    cr.funder
  FROM (
      SELECT SPLIT(issn, ",") AS issn,
        doi,
        publisher,
        container_title,
        issued,
        license,
        abstract,
        author,
        link,
        funder
      FROM `subugoe-collaborative.cr_instant.snapshot`
      WHERE NOT REGEXP_CONTAINS(
          title,
          '(?i)^Author Index$|^Back Cover|^Contents$|^Contents:|^Corrigendum|^Cover Image|^Cover Picture|^Editorial Board|^Front Cover|^Frontispiece|^Inside Back Cover|^Inside Cover|^Inside Front Cover|^Issue Information|^List of contents|^Masthead|^Title page|^Correction$|^Corrections to|^Corrections$|^Withdrawn|^Frontmatter'
        )
        AND (
          NOT REGEXP_CONTAINS(page, '^S')
          OR page IS NULL
        )
        AND (
          NOT REGEXP_CONTAINS(issue, '^S')
          OR issue IS NULL
        )
        AND EXTRACT(YEAR FROM issued) BETWEEN 2019 AND 2025
  ) AS cr
  CROSS JOIN UNNEST(cr.issn) AS issn
  INNER JOIN `subugoe-collaborative.resources.oad_jct_jn` jct
    ON issn = jct.issn
  WHERE (
      (jct.esac_id = 'wiley2019deal' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2019 AND 2023) OR
      (jct.esac_id = 'wiley2024deal' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2024 AND 2025) OR
      (jct.esac_id = 'sn2021gac' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2021 AND 2023) OR
      (jct.esac_id = 'sn2020deal' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2020 AND 2023) OR
      (jct.esac_id = 'sn2024deal' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2024 AND 2025) OR
      (jct.esac_id = 'els2023deal' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2023 AND 2025) OR
      (jct.esac_id = 'degruy2022gac' AND EXTRACT(YEAR FROM cr.issued) = 2022) OR
      (jct.esac_id = 'degruy2023gac' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2023 AND 2024) OR
      (jct.esac_id = 'hogrefe2021gac' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2021 AND 2023) OR
      (jct.esac_id = 'hogrefe2024gac' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2024 AND 2025) OR
      (jct.esac_id = 'rsc2024tib' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2024 AND 2025) OR
      (jct.esac_id = 'opg2023tib' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2023 AND 2025) OR
      (jct.esac_id = 'ttp2024tib' AND EXTRACT(YEAR FROM cr.issued) BETWEEN 2024 AND 2025)
  )
),
-- DOIs whose first author is affiliated with a participating German institution (OpenAlex)
first_author_data AS (
  SELECT
    UPPER(TRIM(w.doi)) AS doi,
    a.countries
  FROM `subugoe-collaborative.openalex_walden.works` w
  CROSS JOIN UNNEST(authorships) AS a
  WHERE a.author_position = 'first'
    AND 'DE' IN (
      SELECT UPPER(TRIM(country_code))
      FROM UNNEST(a.countries) AS country_code
    )
    AND EXISTS (
      SELECT 1
      FROM UNNEST(a.institutions) AS i
      JOIN `subugoe-collaborative.resources.oad_jct_inst` ji
        ON LOWER(i.ror) = LOWER(ji.ror_id)
        OR LOWER(i.display_name) = LOWER(ji.inst_name)
      WHERE i.display_name IS NOT NULL
        AND ji.inst_name IS NOT NULL
    )
),
-- Per-article metadata counts and indicators, joined with the SOS Crossref truth table
oad_ta_md AS (
  SELECT
    fp.doi,
    fp.cr_year,
    fp.issn_l,
    fp.cr_journal_title,
    fp.jn_esac_id,
    fp.cr_publisher,
    fp.has_abstract,
    ARRAY_LENGTH(fp.author) AS num_authors,
    (SELECT COUNT(1) FROM UNNEST(fp.author) AS authors WHERE authors.orcid IS NOT NULL) AS num_orcids,
    COUNT(DISTINCT CASE WHEN md_2.id IS NOT NULL THEN md_2.id ELSE md_1.name END) AS num_affiliations,
    COUNT(DISTINCT CASE WHEN LOWER(md_2.id_type) = 'ror' THEN md_2.id END) AS num_rors,
    ARRAY_LENGTH(fp.funder) AS num_funders,
    (SELECT COUNT(1) FROM UNNEST(fp.funder) AS funders, UNNEST(funders.id) WHERE LOWER(id_type) = "doi") AS num_funder_dois,
    MAX(CASE WHEN md_3.content_version IN ('vor', 'unspecified') THEN md_3.url END) AS license_url,
    CASE
      WHEN EXISTS (SELECT 1 FROM UNNEST(fp.link) WHERE intended_application = "text-mining") THEN 1
      ELSE 0
    END AS has_tdm,
    tt.doi AS tt_doi,
    tt.count_authors,
    tt.has_authors_id_orcid,
    tt.count_authors_id_orcid,
    tt.has_affiliations,
    tt.count_affiliations,
    tt.has_affiliations_id_ror,
    tt.count_affiliations_id_ror,
    tt.has_abstract AS tt_has_abstract,
    tt.has_funders,
    tt.count_funders,
    tt.has_funders_id_doi,
    tt.count_funders_id_doi
  FROM filtered_publications fp
  INNER JOIN first_author_data fa
    ON fp.doi = fa.doi
  LEFT JOIN UNNEST(fp.author) AS md_0
  LEFT JOIN UNNEST(md_0.affiliation) AS md_1
  LEFT JOIN UNNEST(md_1.id) AS md_2
  LEFT JOIN UNNEST(fp.license) AS md_3
  LEFT JOIN `sos-datasources.truthtables.crossref_truthtable_20260131` tt
    ON fp.doi = tt.doi
  GROUP BY
    fp.doi,
    tt.doi,
    fp.cr_year,
    fp.issn_l,
    fp.cr_journal_title,
    fp.jn_esac_id,
    fp.cr_publisher,
    fp.has_abstract,
    fp.author,
    fp.funder,
    fp.license,
    fp.link,
    tt.count_authors,
    tt.has_authors_id_orcid,
    tt.count_authors_id_orcid,
    tt.has_affiliations,
    tt.count_affiliations,
    tt.has_affiliations_id_ror,
    tt.count_affiliations_id_ror,
    tt.has_abstract,
    tt.has_funders,
    tt.count_funders,
    tt.has_funders_id_doi,
    tt.count_funders_id_doi
)

SELECT * FROM oad_ta_md

Data cleaning and preparation involved several standardisation procedures: (1) harmonisation of ISSN-Ls for journals assigned to multiple ISSN-Ls, (2) consolidation of ESAC IDs for ISSN-Ls assigned to multiple ESAC IDs, in particular those resulting from publisher changes after the termination of the agreement, and (3) correction of errors in licence URLs. Journal-level standardisations were based on journal information from EZB and ZDB.

The resulting data set used for the analysis comprises 246,499 articles that were published between 2019 and 2025 and enabled by one of the 13 transformative agreements of interest.

I used the SOS Crossref truth table data to validate my approach and found strong alignment for all coverage metrics (ORCIDs, funding information, funder DOIs, open abstracts) except ROR coverage. Specifically, the ROR coverage values from the SOS dataset were higher than those I computed, in some instances considerably so. A more detailed analysis revealed that these differences are attributable to 355 articles for which the number of ROR IDs counted in the SOS dataset exceeds the number of affiliations. Inspecting a sample of 20 DOIs from this set via the Crossref API showed that when a ROR ID was present for a given affiliation, no additional name string (Crossref field name) was included in the metadata. The SQL query used to create the SOS dataset and calculate the metrics shows that the number of affiliations is determined from the name string in the Crossref affiliation name field. This differs from the approach applied here, which checks for ROR IDs first and uses name strings only as a fallback. If publishers do not provide a name string for each affiliation in addition to the respective ROR ID, the SOS counting method therefore reports more ROR IDs than affiliations for these articles.
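A toy example may help to see how the two counting approaches diverge. The Python sketch below is a simplified illustration, not the SOS query or the query used here: both functions, the flattened record layout, and the example values are assumptions made for this purpose.

```python
def name_based_counts(affiliations):
    """Count affiliations by name string only, and ROR IDs separately."""
    names = {a["name"] for a in affiliations if a.get("name")}
    rors = {a["ror"] for a in affiliations if a.get("ror")}
    return len(names), len(rors)


def ror_first_counts(affiliations):
    """Count affiliations by ROR ID, falling back to the name string."""
    keys = {a.get("ror") or a.get("name") for a in affiliations}
    keys.discard(None)  # entries with neither ROR ID nor name
    rors = {a["ror"] for a in affiliations if a.get("ror")}
    return len(keys), len(rors)


# Hypothetical article: two affiliations delivered with a ROR ID but no
# name string, and one affiliation delivered as a name string only.
article_affiliations = [
    {"name": None, "ror": "https://ror.org/00example1"},
    {"name": None, "ror": "https://ror.org/00example2"},
    {"name": "Example Research Institute", "ror": None},
]

by_name = name_based_counts(article_affiliations)
by_ror_first = ror_first_counts(article_affiliations)
```

With name-based counting the article appears to have one affiliation but two ROR IDs, whereas the ROR-first approach yields three affiliations, two of which have a ROR ID.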

Results

The contractual provisions analysed in Dörner (2026), with underlying data provided in Dörner (2025), form the baseline for the analysis presented here. Briefly, the agreements with de Gruyter, Optica, and Trans Tech Publications mandated CC BY licensing; 12 of the 13 agreements committed to metadata delivery to Crossref, with several explicitly referencing ORCIDs, ROR IDs, and funding information as metadata fields available to the publishers; and 12 agreements contained TDM provisions.

Creative Commons Licences

Table 2 provides an overview of the article volume per agreement and publication year, aligned with the respective agreement terms, detailing the overall share of open access articles, the proportion published under a CC BY licence, and trends of CC BY shares across publication years.

Table 2: Article volume, open access share, CC BY adoption, and trends of CC BY shares by transformative agreement and publication year.

For the agreements with Optica and Trans Tech Publications, the proportion of open access articles and the proportion with a CC BY licence are identical, indicating full implementation of the CC provision. An overall slight increase in CC BY share can be observed for Optica, while Trans Tech Publications shows a decline. For de Gruyter, the CC BY share falls short of the overall open access share — the underlying data suggest this is due to some articles having been published under CC BY-NC-ND — though a general upward trend in CC BY adoption is visible.

For the agreements with Elsevier, Wiley, and the Royal Society of Chemistry, where CC BY was specified only as a default or preferred option rather than the sole permissible licence, the open access share consistently exceeds the CC BY share across all agreement terms, despite an overall increase in both. The Hogrefe agreements allow authors to choose from several CC licences without preference for CC BY, which is reflected in a lower CC BY share relative to the overall open access share, and a downward trend in CC BY for the 2024 agreement. For the Springer Nature agreements, the 2021 MPDL agreement shows identical open access and CC BY shares throughout its term, while the 2020 DEAL agreement shows a slightly higher overall open access share, with an upward trend in CC BY for both.
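As a rough illustration of how open access and CC BY shares per agreement and year might be derived from licence URLs, consider the following Python sketch. It assumes that both shares are calculated relative to all articles of an agreement in a given year, and that a CC licence is recognised by a URL under creativecommons.org/licenses/. The function name, the ESAC ID `example2024ta`, and the records are hypothetical.

```python
from collections import defaultdict


def licence_shares(articles):
    """Return {(esac_id, year): (n_articles, oa_share, cc_by_share)}."""
    totals = defaultdict(lambda: [0, 0, 0])  # articles, OA, CC BY
    for article in articles:
        key = (article["esac_id"], article["year"])
        url = (article.get("license_url") or "").lower()
        is_oa = "creativecommons.org/licenses/" in url
        # '/licenses/by/' matches CC BY but not CC BY-NC, CC BY-NC-ND etc.
        is_cc_by = is_oa and "/licenses/by/" in url
        totals[key][0] += 1
        totals[key][1] += is_oa
        totals[key][2] += is_cc_by
    return {k: (n, oa / n, by / n) for k, (n, oa, by) in totals.items()}


# Hypothetical records: two CC BY, one CC BY-NC-ND, one without CC licence.
articles = [
    {"esac_id": "example2024ta", "year": 2024,
     "license_url": "https://creativecommons.org/licenses/by/4.0/"},
    {"esac_id": "example2024ta", "year": 2024,
     "license_url": "https://creativecommons.org/licenses/by-nc-nd/4.0/"},
    {"esac_id": "example2024ta", "year": 2024, "license_url": None},
    {"esac_id": "example2024ta", "year": 2024,
     "license_url": "https://creativecommons.org/licenses/by/4.0/"},
]

shares = licence_shares(articles)
```

When the open access share and the CC BY share coincide, every open access article in that group carries a CC BY licence; a gap between the two, as in this example, points to other CC licence variants.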

Crossref Metadata Coverage

Crossref is critical because many publishers use this DOI registration agency to share metadata openly. In addition to bibliographic data, efforts from initiatives such as the Initiative for Open Citations and the Initiative for Open Abstracts have led to an increase in the open availability of references and abstracts in Crossref (Van Eck and Waltman 2025).

Table 3 (see footnote 2) provides an overview of the proportion of openly accessible metadata in Crossref for articles covered by the 13 transformative agreements, focusing on TDM support, authors’ ORCIDs, author affiliations’ ROR IDs, general funding information, funders’ DOIs, and openly accessible abstracts.

Table 3: Proportion of publicly available Crossref metadata per transformative agreement and publication year.
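The text does not spell out the denominators behind each proportion, so the following Python sketch shows only one possible reading: coverage as the share of articles with at least one value of a given kind, optionally restricted to articles that have a related element at all (for example, funder DOIs among articles with funding information). The function name and the example records are hypothetical; the field names follow the column aliases of the SQL query.

```python
def coverage(articles, count_field, base_field=None):
    """Share of articles with at least one value in `count_field`.

    If `base_field` is given, only articles with at least one value in
    that field form the denominator. Returns None for an empty base.
    Missing counts (None) are treated as zero.
    """
    base = [
        a for a in articles
        if base_field is None or (a.get(base_field) or 0) > 0
    ]
    if not base:
        return None
    hits = sum(1 for a in base if (a.get(count_field) or 0) > 0)
    return hits / len(base)


# Hypothetical per-article counts.
articles = [
    {"num_orcids": 2, "num_funders": 1, "num_funder_dois": 1},
    {"num_orcids": 0, "num_funders": 0, "num_funder_dois": 0},
    {"num_orcids": 1, "num_funders": None, "num_funder_dois": 0},
    {"num_orcids": 0, "num_funders": 2, "num_funder_dois": 0},
]

orcid_coverage = coverage(articles, "num_orcids")
funding_coverage = coverage(articles, "num_funders")
funder_doi_coverage = coverage(articles, "num_funder_dois", base_field="num_funders")
```

Treating missing counts as zero matters here because an array length computed over a missing array is itself missing rather than zero in BigQuery.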

TDM support

For the agreements with Elsevier, de Gruyter, and the 2021 MPDL Springer Nature agreement, TDM information has been persistently submitted to Crossref throughout the entire terms, consistent with contractual permissions for TDM for private use or research purposes. By contrast, no TDM information was delivered for the agreements with Hogrefe, the Royal Society of Chemistry, Trans Tech Publications, or Optica, despite non-commercial TDM being permissible under the first three and no TDM clause having been identified for Optica. For the Springer Nature DEAL agreements, TDM support is high throughout, reaching 100% in 2022. For Wiley, a strong decline is visible from 2023 onwards — corroborated by HOAD and the Crossref Participation Report for Wiley — though HOAD data for 2026 shows a recovery to over 90%, suggesting the decline may reflect a temporary technical issue rather than a strategic change.

ORCID coverage

ORCID coverage ranges from 10.9% (Hogrefe for 2021) to 62.5% (Royal Society of Chemistry for 2025). Notably, the Royal Society of Chemistry agreement — which does not explicitly reference ORCIDs for Crossref delivery — shows the highest coverage. Despite most other agreements mentioning ORCIDs for author identification or metadata delivery, coverage remains mostly below 50%.

ROR coverage

Substantial disparities are visible across publishers. Optica is the only agreement with noticeable and consistent ROR coverage. Wiley shows minimal ROR coverage from 2022 onwards, and all other agreements show 0% — including the agreements with the Royal Society of Chemistry and Trans Tech Publications, which explicitly reference ROR IDs for institutional identification or reporting.

Funding information and funder DOIs

No funding-related metadata was submitted to Crossref for the Hogrefe or Trans Tech Publications agreements, despite both containing funding information provisions. The Trans Tech Publications agreement, however, specified that funding information would only be recorded upon request by the authors. The Royal Society of Chemistry agreement — the only one without a funding information provision — nevertheless achieves among the highest funding information coverage, with DOIs recorded for all funders. De Gruyter shows low funding information coverage (maximum 7.4%) but with around 70% funder DOI coverage. Elsevier, which only committed to examining the feasibility of including funding information, shows coverage rising to 60% by 2025 with near-complete funder DOI coverage. The 2021 MPDL Springer Nature agreement shows declining funding information and funder DOI coverage, while the DEAL agreements show comparably higher funding information coverage but also a downward trend in funder DOI coverage. Wiley coverage of funding information oscillates around 50%, while funder DOI coverage is consistently above 90% and trending upward. Optica — the only agreement with an explicit contractual guarantee for funding information — shows the highest funding information coverage, though with a slight downward trend in funder DOI coverage.

Open abstracts

Almost no abstracts were delivered to Crossref for articles under the Elsevier agreement. Coverage is highest for Optica and Trans Tech Publications, followed by the Royal Society of Chemistry and Wiley, each exceeding 90%. None of the 13 agreements contained an explicit contractual provision for abstract delivery to Crossref.

Discussion

The analysis reveals substantial discrepancies between the contractual provisions of transformative agreements and their actual implementation in Crossref metadata, highlighting a gap in the operationalisation of open metadata commitments.

Creative Commons Licences

The CC licence analysis demonstrates that contractual specificity directly correlates with implementation compliance. Agreements that explicitly mandated CC BY licensing (Optica, Trans Tech Publications, and de Gruyter) generally achieved higher compliance rates, though de Gruyter’s implementation was imperfect due to residual CC BY-NC-ND usage. In contrast, agreements where CC BY was merely presented as a default or preferred option (Elsevier, Wiley, Royal Society of Chemistry) consistently showed lower CC BY adoption rates despite increasing overall open access volumes. This finding suggests that without explicit contractual mandates requiring CC BY as the sole permissible licence, publishers maintain flexibility to offer alternative licences, potentially undermining the transformative goals of these agreements.

Crossref Metadata Coverage

The near-absence of ROR coverage across most agreements is particularly striking. Even agreements that explicitly reference ROR IDs in the agreement text show 0% coverage, and virtually no ROR IDs were submitted for articles under the DEAL agreements — which account for the largest share of publication volume. This is problematic because ROR IDs provide persistent, unambiguous institutional identification essential for research assessment and funding attribution. Without them, bibliometric analyses must rely on error-prone string matching, introducing noise and increasing the manual curation burden. The methodological comparison with the SOS Crossref truth table further highlights the difficulty of measuring ROR coverage consistently when publishers submit ROR IDs without corresponding institutional name strings. The importance of ROR adoption is further underscored by Crossref’s plans to retire the Funder Registry in favour of ROR (Portenoy 2026), making consistent ROR submission increasingly central to the Crossref metadata ecosystem.

In line with de Jonge and Kramer (2026), the ORCID coverage findings suggest that publisher workflow integration and editorial systems might play a more decisive role in ORCID adoption than contractual language alone. For instance, the Royal Society of Chemistry agreement, which does not explicitly reference ORCIDs for Crossref delivery, nevertheless achieves the highest coverage.

The funding metadata findings illustrate a broader pattern: the presence of contractual provisions is no guarantee of implementation. The Trans Tech Publications agreement committed to recording funding information but only upon author request, and the complete absence of such metadata suggests either that authors rarely invoked this option, that the publisher lacked the necessary infrastructure, or that the provision was too vague to operationalise. Conversely, the Royal Society of Chemistry — without any funding metadata provision — achieved among the highest coverage, again pointing to publisher infrastructure and internal priorities as the more decisive factors. The declining funder DOI coverage for both Springer Nature agreement types, despite generally reasonable funding information coverage, further illustrates that even partial implementation can be inconsistent over time.

The TDM findings similarly show that contractual permissions do not reliably translate into metadata delivery. The temporary decline in Wiley’s TDM metadata after 2022, despite contractual permissions, likely reflects a technical issue given the subsequent recovery visible in HOAD’s 2026 data, but it underscores the need for ongoing monitoring rather than reliance on contractual commitments alone.

Taken together, these findings demonstrate that vague or permissive contractual language on metadata tends to produce inconsistent outcomes. The cases of the Royal Society of Chemistry (high ORCID and funding coverage without explicit provisions) and Trans Tech Publications (zero funding coverage despite an explicit provision) illustrate the disconnect between contractual commitment and operational reality. Future agreements should specify exact metadata elements, required values, and implementation timelines, with measurable compliance indicators to enable systematic monitoring.

Outlook

One of the most compelling findings is the near-total absence of ROR ID coverage across most agreements. This underscores a clear opportunity for HOAD to expand its analytical scope by incorporating ROR coverage as a standard metric. Such an extension would be technically straightforward and would enable monitoring of ROR coverage across a wider set of agreements over time.

While this analysis is focused on 13 agreements with German consortia, the methodological approach could serve as a template for broader investigation. A distributed effort, connected to the Joint Task Force on Negotiating Openness of Publication Metadata between OA2020 (now OA Forward) and the Barcelona Declaration, has been started to analyse all openly available transformative agreements in the ESAC Registry of Open Publishing Agreements. The approach presented here could contribute to that initiative, providing empirical evidence to inform the task force’s work on negotiating openness of publication metadata across publishers and nations.

Code and Data Availability

R code and processed data files for this data analysis are available on GitHub: https://github.com/subugoe/scholcomm_analytics/tree/main/posts/ta_coverage_analysis

Data tables used for data retrieval are publicly available on Google BigQuery, as provided by the SUB Göttingen: https://subugoe.github.io/scholcomm_analytics/data.html.

References

Achterberg, Inke, and Najko Jahn. 2023. “Introducing the Hybrid Open Access Dashboard (HOAD).” August 17. https://www.coalition-s.org/blog/introducing-the-hybrid-open-access-dashboard-hoad/.
de Jonge, Hans, and Bianca Kramer. 2026. “Manuscript Submission Systems and Metadata Completeness in Crossref: Patterns and Associations.” PLOS One 21 (3): e0345417. https://doi.org/10.1371/journal.pone.0345417.
de Jonge, Hans, Bianca Kramer, and Jeroen Sondervan. 2025. Tracking Transformative Agreements Through Open Metadata: Method and Validation Using Dutch Research Council NWO Funded Papers. https://doi.org/10.31222/osf.io/tz6be_v4.
Dér, Ádám. 2025. “What Gets Missed in the Discourse on Transformative Agreements.” Katina Magazine, ahead of print. https://doi.org/10.1146/katina-20250212-1.
Dörner, Sophia. 2025. Datensatz Zu: Offene Metadaten Und Datenanalytische Forschungsservices in Der Open-Access-Transformation. Eine Analyse Zu Regelungen in Open-Access-Transformationsverträgen Deutscher Einrichtungen Und Ihrer Konsortien. Zenodo. https://doi.org/10.5281/ZENODO.17513172.
Dörner, Sophia. 2026. “Offene Metadaten und datenanalytische Forschungsservices in der Open-Access-Transformation: Eine Analyse zu Regelungen in Open-Access-Transformationsverträgen deutscher Einrichtungen und ihrer Konsortien.” Bibliothek Forschung Und Praxis, ahead of print. https://doi.org/10.1515/bfp-2025-0035.
ESAC Initiative. n.d. ESAC Registry of Open Publishing Agreements. Accessed April 30, 2026. https://esac-initiative.org/about/transformative-agreements/agreement-registry/.
Jahn, Najko. 2025a. Decreasing Affiliation Metadata Coverage in OpenAlex. December. https://doi.org/10.59350/z3c5x-bfk63.
Jahn, Najko. 2025b. “How Open Are Hybrid Journals Included in Transformative Agreements?” Quantitative Science Studies 6: 242–62. https://doi.org/10.1162/qss_a_00348.
Portenoy, Jason. 2026. Matching funders in scholarly metadata: linking names to ROR IDs. April. https://doi.org/10.64000/d3f5t-g5017.
Van Eck, Nees Jan, and Ludo Waltman. 2025. Crossref Metadata Statistics. Zenodo. https://doi.org/10.5281/ZENODO.14931176.

Footnotes

  1. https://github.com/njahn82/jct_data

  2. Table 3 reuses and extends the interactive visualization framework from the Hybrid Open Access Dashboard, including its reactable-based layout and filtering system. It adds ROR ID coverage as a new metric, which was not included in HOAD’s original scope, and uses a modified colour palette. Code adaptations were based on HOAD’s open-source implementation on GitHub.

Citation

BibTeX citation:
@article{dörner2026,
  author = {Dörner, Sophia},
  title = {Open {Access} {Licences} and {Open} {Metadata} in
    {Transformative} {Agreements}},
  journal = {Scholarly Communication Analytics},
  date = {2026-05-15},
  url = {https://subugoe.github.io/scholcomm_analytics/posts/ta_coverage_analysis/main.html},
  doi = {10.59350/nv30b-3yg69},
  langid = {en},
  abstract = {I analyse Crossref metadata for around 250,000 journal
    articles published between 2019 and 2025. The analysis reveals that
    contractual language alone is insufficient to ensure implementation:
    coverage of specific metadata elements varied considerably across
    publishers and agreements, and explicit provisions do not
    consistently translate into higher coverage. A notable near-absence
    of ROR ID coverage across most agreements highlights gaps that
    undermine institutional attribution in research assessment and
    bibliometric analyses, while CC BY licence adoption is highest where
    agreements mandate it as the sole permissible licence.}
}
For attribution, please cite this work as:
Dörner, Sophia. 2026. “Open Access Licences and Open Metadata in Transformative Agreements.” Scholarly Communication Analytics, accepted, May 15. https://doi.org/10.59350/nv30b-3yg69.