
Doc Maps Overview

A summary description of the proposed Doc Maps project.

Published on Sep 01, 2020

Defining Doc Maps: making object-level evaluation metadata machine-readable, interoperable, and extensible

The first phase of the project will involve identifying different potential models and providers of object-level editorial events relevant to the biology community (including multiple models currently in development), developing the Doc Maps specification to meet the needs of those efforts, working with key stakeholders to produce technical guides for implementing Doc Maps in their current or planned systems/frameworks, and laying out a technical roadmap for a future aggregation service and browser extension.

Phase 1 also includes an optional stretch goal, in which we would develop a working proof of concept of the aggregation service and extension using a variety of existing data sources. Building a working model of a publish-review-curate (PRC) ecosystem would improve the project outcome by starting to test the value proposition of object-level editorial data against real users and by giving the project's Technical Committee (TC) a source of real-world user feedback, usage data, and event data to inform the development of the Doc Maps specification.

In a potential second phase, the project could be extended to work directly with providers to implement Doc Maps in their systems, and to build an aggregation service and browser extension that surfaces editorial metadata for a given manuscript in a consistent format no matter the source of the metadata, deposit location, or model used to represent it. If the Phase 1 stretch goal were to be completed, the data collected and lessons learned from the proof of concept would make Phase 2 easier and less expensive to implement.

Ultimately, our project will ensure that the object-level editorial metadata models in development are compatible with a broad range of possible futures for scholarly publishing, rather than locking in the current system.


Editorial practices (i.e., the processes, checks, and transformations that journals and publishing platforms apply to manuscripts, such as peer review, ethics checks, and certification such as journal acceptance) are highly heterogeneous, and will become even more so as scholarly publishing is disrupted by innovations, the open science movement, and the removal of barriers to entry. Multiple initiatives to develop models describing peer review practices have emerged, including Transpose, Peer Review Transparency, Review Maps, and an STM Association working group.

These models are a positive development, but they are often narrowly focused on the needs of their creators, and as such do not fully accommodate the needs of readers, funders, and the scholarly publishing ecosystem as a whole. In particular, these efforts do not focus on representing editorial practices in ways that can be reliably aggregated, surfaced, and queried. Moreover, these efforts are often limited to traditional peer review processes, and do not capture the full range of editorial practices and events needed to accommodate a Publish-Review-Curate world where reviews can be conducted by multiple parties. To support this world, the community needs a machine-readable, interoperable, and extensible framework for representing and surfacing object-level review/editorial events.


We have identified three key requirements for representations of editorial processes for creating a healthy ecosystem:

  • Extensibility: taxonomies should be capable of representing a wide range of editorial process events, ranging from a simple assertion that a review occurred to a complete history of editorial comments on a document to a standalone review submitted by an independent reviewer.

  • Machine-readability: taxonomies should be represented in a format (e.g., XML) that can be interpreted computationally and translated into visual representations.

  • Interoperability: a single service should be able to interpret multiple taxonomies against the same criteria and arrive at the same interpretations.
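As a minimal sketch of what these requirements imply in practice, the Python below models an editorial event as a small record with an open-ended `details` field, then round-trips it through a machine-readable format. This overview does not define the Doc Maps specification, so the class name `EditorialEvent`, all field names, and the example values are hypothetical, and JSON is used here only for brevity in place of the XML mentioned above.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any


@dataclass
class EditorialEvent:
    """Hypothetical record of one editorial event on a document."""

    document_id: str  # e.g. a DOI; the identifier scheme is an assumption
    event_type: str   # e.g. "review", "ethics-check", "acceptance"
    actor: str        # who carried out the event
    # Open-ended details keep the record extensible: a bare assertion
    # leaves this empty, while a full review history can fill it in.
    details: dict[str, Any] = field(default_factory=dict)


def to_json(events: list[EditorialEvent]) -> str:
    """Serialize events to JSON so other services can parse them."""
    return json.dumps([asdict(e) for e in events], sort_keys=True)


def from_json(payload: str) -> list[EditorialEvent]:
    """Rebuild events from the JSON produced by to_json."""
    return [EditorialEvent(**item) for item in json.loads(payload)]


# A simple assertion that a review occurred.
simple = EditorialEvent("10.1234/example", "review", "Example Journal")

# A richer, standalone review from an independent reviewer.
detailed = EditorialEvent(
    "10.1234/example",
    "review",
    "Independent Reviewer",
    details={"comments": ["Methods are sound."], "standalone": True},
)

round_trip = from_json(to_json([simple, detailed]))
```

Both the bare assertion and the detailed review share one shape, so a consuming service can read either without knowing which provider produced it.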

We will assemble a Technical Committee (TC) composed of the parties interested in modelling object-level editorial processes (STM Association, eLife, Knowledge Futures Group, COAR, etc.) to create and adopt Doc Maps, a common framework for representing object-level editorial processes that meets the above requirements. This committee will also bring in other parties who may be willing to surface metadata as part of their infrastructure (e.g., indexing services, preprint servers, and Crossref), as well as entities such as independent review services that may be willing to create metadata as part of their editorial process.

The scope for Phase 1 of the project will include a specification for representing editorial events as Doc Maps, technical guides for implementing Doc Maps aimed at the specific needs of publishers and aggregators in biology, and a roadmap for the development of an aggregation service that can query for, verify, and visualize the Doc Maps on a manuscript from a broad range of sources, likely in the form of a search engine and companion browser extension.

In a stretch goal for Phase 1, we will build a proof of concept of this service to test the value proposition of object-level editorial event data against real users and provide the TC with real-world feedback and data to inform their discussions.

In a future Phase 2, we will construct the service itself (building upon the Phase 1 stretch goal, if pursued). Ultimately, this work will make it possible to display the editorial processes that have been undertaken on a manuscript, regardless of who conducts the editorial process, which model is used to represent it, and where the manuscript is displayed.

An example of a publishing ecosystem with Doc Maps. Phase 1 of the project focuses on the underlying infrastructure for A, and as a stretch goal, a proof of concept of B and C on top of Crossref data. A potential Phase 2 would help providers implement A, and fully build out B and C, with a stretch goal of building out a proof of concept of D.

How this fits into the PRC (publish-review-curate) model

In a world where content is published first, then reviewed and curated, many different models for conducting this review will flourish. Reviewers will need to be able to publish their reviews in a way that can be discovered and surfaced by publishers and aggregators. Publishers and aggregators will need ways to discover, normalize, and display reviews from multiple sources and taxonomies in consistent ways that readers can rapidly understand and contextualize.
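The normalization step described above could look roughly like the following Python sketch, in which an aggregator translates source-specific labels into a shared vocabulary and groups the results by manuscript. Everything here is an assumption made for illustration: the source names, the labels, the shared terms, and the functions `normalize` and `aggregate` are hypothetical and do not come from any existing taxonomy or from the Doc Maps specification.

```python
# Hypothetical mappings from two source taxonomies to shared terms.
TAXONOMY_MAPS = {
    "journal-a": {"peer_reviewed": "review", "accepted": "certification"},
    "review-service-b": {"report-posted": "review", "ethics-ok": "ethics-check"},
}


def normalize(source: str, records: list[dict]) -> list[dict]:
    """Translate source-specific event labels into the shared vocabulary."""
    mapping = TAXONOMY_MAPS[source]
    normalized = []
    for record in records:
        label = record["label"]
        if label not in mapping:
            continue  # Unknown labels are skipped rather than guessed at.
        normalized.append(
            {
                "source": source,
                "manuscript": record["manuscript"],
                "event": mapping[label],
            }
        )
    return normalized


def aggregate(feeds: dict[str, list[dict]]) -> dict[str, list[dict]]:
    """Group normalized events from every source by manuscript."""
    by_manuscript: dict[str, list[dict]] = {}
    for source, records in feeds.items():
        for event in normalize(source, records):
            by_manuscript.setdefault(event["manuscript"], []).append(event)
    return by_manuscript


feeds = {
    "journal-a": [{"manuscript": "ms-1", "label": "peer_reviewed"}],
    "review-service-b": [
        {"manuscript": "ms-1", "label": "report-posted"},
        {"manuscript": "ms-1", "label": "unmapped-label"},
    ],
}
events = aggregate(feeds)
```

In this sketch both sources end up reporting a "review" event for the same manuscript, which is the kind of consistent display that readers would need; a real service would also have to handle discovery and verification, which are out of scope here.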

Comparison to existing initiatives

This project builds on existing initiatives led by the partners. Peer Review Transparency Standards (KFG) is a proposal for simplified common language (and accompanying icons) to describe the peer review status of individual articles and books. However, more flexibility is needed to capture variation in practices. This led to the early-stage Review Maps proposal (KFG), a framework for exposing and sharing human- and machine-readable review histories, and the early-stage Review Stacks proposal (KFG), a high-level architecture for a distributed review ecosystem. Doc Maps will be a generalized version of Review Maps that will allow an ecosystem like the one proposed in Review Stacks to develop. Doc Maps will specify how to represent a broad range of editorial processes independent of the model used to define them; it will provide technical guides for implementing those requirements in a number of common formats such as JATS, Activity Streams, and HTML metadata. Additional related projects can be seen in appendix 1.

This project will build upon the work of the STM Association Peer Review Taxonomy project, a working group that aims to deliver definitions of journal-organized peer review for publishers to later implement. The working group aims to open its taxonomy for public feedback in summer 2020 and to complete integration of feedback and dissemination by the end of the year. However, accommodating peer review of objects other than journal articles (e.g., preprints or datasets) or review by third parties, as well as a common standard for machine readability of the taxonomy, is out of scope for the STM Association project. To ensure that the projects are compatible with one another, we are working on a letter of collaboration with the STM Association that specifies reciprocal participation of project leaders in the working group and technical committee.

Future directions

If Phase 1 goes well, it could be quickly followed by a second phase focused on implementation of the specification and development of an aggregation service and browser extension. If the proof of concept stretch goal for Phase 1 is funded, Phase 2 would build on those efforts even more quickly because a developer would already have been hired, much of the infrastructure built, and initial product validation and user testing completed.
