DocMaps Framework Overview

Work-in-progress documentation for DocMaps, a framework for describing the processes used to create a document in a machine-readable way.

Published on Jan 11, 2021

Summary

DocMaps is a framework for describing the processes used to create a document in a machine-readable way. The framework is designed to be interpreted in multiple formats depending on the use-case of the consumer of the DocMap. It can extend to capture any amount of contextual data about any kind of document — from a minimum assertion that a process took place, to a detailed history of every edit to a document.

We intend DocMaps to be general enough to capture the process behind any kind of document, but our initial focus is capturing evaluative processes surrounding research documents — primarily peer review processes conducted by journals or by services reviewing preprints. From June 2020 to February 2021, we worked with a technical committee of technologists, publishers, preprint server operators, aggregators, and advocates to develop the first use-cases for the framework.

Since then, an informal working group from Knowledge Futures, Cold Spring Harbor Laboratory, and eLife’s Sciety has been working on a pilot implementation that delivers DocMaps from two providers, Sciety and EMBO’s Early Evidence Base, to two consumers, CSHL’s bioRxiv and medRxiv preprint servers.

Core Concepts

What DocMaps are

  • DocMaps are immutable assertions that describe the processes used to create a document (e.g. the peer review process used to evaluate a scientific article, or the process used to create an overlay review article).

  • DocMaps are asserted by the publisher of the document (e.g. PREReview).

  • DocMaps can describe a single event or a complex, multi-event process.

  • DocMaps may contain links to other documents, including other DocMaps.

What DocMaps aren’t

  • DocMaps are not representations of the documents whose contexts they describe (though they could embed representations or link to representations of those documents).

  • DocMaps are not representations of a document’s current state (though aggregators can use them to produce views of these states).

  • DocMaps are not notifications that an event took place (though they can be used as the payload for such notifications).

  • DocMaps are not assertions of the subjective result of any given process (though they may contain the DocMap provider’s subjective result, which DocMap consumers can choose to use).

Provider vs. Consumer

DocMaps are asserted by a provider and interpreted by a consumer. It is up to the provider to supply the information about processes that it believes is meaningful for understanding how, and in what context, the document was created. It is up to the consumer to decide how to interpret the provided DocMap, so consumers with different perspectives can interpret the same DocMap according to their own needs and purposes. One consumer may decide that a provider’s DocMap qualifies as peer review for their purposes. Another consumer may decide that the same DocMap is only an evaluation, and must be combined with other DocMaps to qualify as a review. A third consumer may decide to ignore the provider’s DocMap entirely. All approaches are valid.
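
One way to picture this split is as a set of consumer-side policies applied to the same provider document. The sketch below is illustrative only: the three policy functions, their names, and the trusted-publisher rule are hypothetical and not part of the DocMaps convention. Only the field names (publisher, steps, actions, outputs, type) follow the JSON convention used in this document.

```python
from typing import Optional


def output_types(docmap: dict) -> set:
    """Collect the `type` of every output across all steps and actions."""
    found = set()
    for step in docmap.get("steps", {}).values():
        for action in step.get("actions", []):
            for output in action.get("outputs", []):
                if "type" in output:
                    found.add(output["type"])
    return found


def lenient_consumer(docmap: dict) -> Optional[str]:
    # Hypothetical policy: any review-article output counts as peer review.
    return "peer-review" if "review-article" in output_types(docmap) else None


def cautious_consumer(docmap: dict) -> Optional[str]:
    # Hypothetical policy: a single Docmap is only ever treated as an evaluation.
    return "evaluation" if output_types(docmap) else None


# Hypothetical allowlist; the URL is a placeholder, not a real provider.
TRUSTED_PUBLISHERS = {"https://example.org/trusted-review-service"}


def selective_consumer(docmap: dict) -> Optional[str]:
    # Hypothetical policy: ignore Docmaps from publishers outside the allowlist.
    publisher_id = docmap.get("publisher", {}).get("id")
    if publisher_id not in TRUSTED_PUBLISHERS:
        return None
    return lenient_consumer(docmap)


docmap = {
    "type": "docmap",
    "publisher": {"id": "https://ncrc.jhsph.edu/"},
    "steps": {"_:b0": {"actions": [{"outputs": [{"type": "review-article"}]}]}},
}

verdicts = {
    "lenient": lenient_consumer(docmap),
    "cautious": cautious_consumer(docmap),
    "selective": selective_consumer(docmap),
}
```

All three verdicts are derived from the same provider assertion; none of them changes the Docmap itself.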

DocMaps as Convention

At its core, the DocMaps framework is a set of agreed-upon conventions for aligning editorial processes against the Publishing Status Ontology (PSO) and the Publishing Workflow Ontology (PWO), and for expressing these events in a domain-specific language that can be easily interpreted by machines and humans alike. When needed, we use other SPAR Ontologies to describe objects and relationships that fall outside the domains of PWO and PSO, particularly the FRBR-aligned Bibliographic Ontology (FaBiO).

By aligning against well-defined publishing ontologies, DocMaps makes use of community-endorsed, tested, and extensible models rather than trying to invent a new model, which would risk being too specific and less likely to succeed. This alignment also allows developers who are familiar with RDF and other Semantic Web technologies to model editorial events as graphs and serialize them in many formats.

By expressing this alignment using domain-specific language conventions, DocMaps makes it easy for users to reason about editorial events without needing to understand the underlying ontologies or Semantic Web structures. These conventions also allow developers who are less familiar with Semantic Web technologies to create and consume DocMaps in simple formats like JSON using their existing technology stacks.

Convention Summary

A Docmap is a pwo:Workflow that describes a series of pwo:Steps, which are executed in (taskex:isExecutedIn) actions. Each action contains assertions (pso:resultsInAcquiring) that assert that a particular action resulted in (pso:isStatusHeldBy) an item (frbr:Work) acquiring (pso:withStatus) a pso:publishingStatus such as “peer-reviewed”.

In addition to asserting status changes, each action describes the inputs (pwo:needs) and outputs (pwo:produces) of the action in the form of fabio:expressions. Each action also describes participants (pro:isDocumentContextFor), which are actors (pro:isHeldBy) of a defined type (usually foaf:Person) holding pro:Roles such as peer-reviewer or editor.

How It Works

By following the convention, providers can express Docmaps in plain JSON. Enhancing that plain JSON with a JSON-LD frame and context allows it to be interpreted as JSON-LD, so RDF consumers can reason about it.

JSON Example

This example models this Review Commons evaluation process.

{
    "id": "https://sciety.org/docmaps/v1/articles/10.1101/2020.04.05.20054403.docmap.json",
    "type": "docmap",
    "created": "2020-05-01T00:00:00.000Z",
    "updated": "2020-05-01T00:00:00.000Z",
    "publisher": {
        "id": "https://ncrc.jhsph.edu/",
        "name": "NCRC",
        "logo": "https://sciety.org/static/groups/ncrc--62f9b0d0-8d43-4766-a52a-ce02af61bc6a.jpg",
        "homepage": "https://ncrc.jhsph.edu/",
        "account": {
            "id": "https://sciety.org/groups/62f9b0d0-8d43-4766-a52a-ce02af61bc6a",
            "service": "https://sciety.org"
        }
    },
    "first-step": "_:b0",
    "steps": {
        "_:b0": {
            "assertions": [],
            "inputs": [
                {
                    "doi": "10.1101/2020.04.05.20054403",
                    "url": "https://doi.org/10.1101/2020.04.05.20054403",
                    "published": "2020-04-10T00:00:00.000Z"
                }
            ],
            "actions": [
                {
                    "participants": [
                        {
                            "actor": {
                                "name": "anonymous",
                                "type": "person"
                            },
                            "role": "peer-reviewer"
                        }
                    ],
                    "outputs": [
                        {
                            "type": "review-article",
                            "published": "2020-05-01T00:00:00.000Z",
                            "content": [
                                {
                                    "type": "web-page",
                                    "url": "https://ncrc.jhsph.edu/research/projected-early-spread-of-covid-19-in-africa/"
                                },
                                {
                                    "type": "web-page",
                                    "url": "https://sciety.org/articles/activity/10.1101/2020.04.05.20054403#ncrc:90c56170-fa65-4d4c-82a2-dceefeb603fe"
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    }
}
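
Because the plain JSON form needs no RDF tooling, a consumer can read it with ordinary dictionary access. The following is a minimal sketch, assuming a trimmed copy of the example above; the function name summarize_outputs and the shape of the returned rows are hypothetical choices made for illustration.

```python
def summarize_outputs(docmap: dict) -> list:
    """Return one row per output: the evaluated DOIs, participant roles,
    output type, and the URLs where the output's content can be found."""
    rows = []
    for step in docmap.get("steps", {}).values():
        dois = [item["doi"] for item in step.get("inputs", []) if "doi" in item]
        for action in step.get("actions", []):
            roles = [p.get("role") for p in action.get("participants", [])]
            for output in action.get("outputs", []):
                urls = [c["url"] for c in output.get("content", []) if "url" in c]
                rows.append({
                    "inputs": dois,
                    "roles": roles,
                    "type": output.get("type"),
                    "urls": urls,
                })
    return rows


# Trimmed copy of the example Docmap (publisher and dates omitted for brevity).
docmap = {
    "type": "docmap",
    "first-step": "_:b0",
    "steps": {
        "_:b0": {
            "assertions": [],
            "inputs": [{"doi": "10.1101/2020.04.05.20054403"}],
            "actions": [{
                "participants": [{
                    "actor": {"name": "anonymous", "type": "person"},
                    "role": "peer-reviewer",
                }],
                "outputs": [{
                    "type": "review-article",
                    "content": [{
                        "type": "web-page",
                        "url": "https://ncrc.jhsph.edu/research/projected-early-spread-of-covid-19-in-africa/",
                    }],
                }],
            }],
        }
    },
}

summary = summarize_outputs(docmap)
```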

JSON-LD Frame

By adding a JSON-LD frame and context, the document can be interpreted as JSON-LD.

{
    "@context": {
        "@version": 1.1,
        "dcterms": "http://purl.org/dc/terms/",
        "atom": "http://www.w3.org/2005/Atom",
        "foaf": "http://xmlns.com/foaf/0.1/",
        "cnt": "http://www.w3.org/2011/content#",
        "fabio": "http://purl.org/spar/fabio/",
        "frbr": "http://purl.org/vocab/frbr/core#",
        "pso": "http://purl.org/spar/pso/",
        "pwo": "http://purl.org/spar/pwo/",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "prism": "http://prismstandard.org/namespaces/basic/2.0/",
        "pro": "http://purl.org/spar/pro/",
        "prov": "https://www.w3.org/TR/prov-o/",
        "taskex": "http://www.ontologydesignpatterns.org/cp/owl/taskexecution.owl#",
        "ti": "http://www.ontologydesignpatterns.org/cp/owl/timeinterval.owl#",
        "id": "@id",
        "type": "@type",
        "docmap": "pwo:Workflow",
        "created": {
            "@id": "dcterms:created",
            "@type": "xsd:date"
        },
        "updated": {
            "@id": "atom:updated",
            "@type": "xsd:date"
        },
        "description": "dcterms:description",
        "title": {
            "@id": "dcterms:title",
            "@container": "@language"
        },
        "doi": "prism:doi",
        "creator": {
            "@id": "dcterms:creator",
            "@container": "@set",
            "@type": "@id"
        },
        "published": {
            "@id": "prism:publicationDate",
            "@type": "xsd:date"
        },
        "name": "foaf:name",
        "person": "foaf:Person",
        "publisher": {
            "@id": "dcterms:publisher",
            "@type": "foaf:Organization"
        },
        "logo": {
            "@id": "foaf:logo",
            "@type": "xsd:anyURI"
        },
        "homepage": {
            "@id": "foaf:homepage",
            "@type": "xsd:anyURI"
        },
        "account": {
            "@id": "foaf:OnlineAccount",
            "@type": "xsd:anyURI"
        },
        "service": {
            "@id": "foaf:accountServiceHomepage",
            "@type": "xsd:anyURI"
        },
        "provider": {
            "@id": "foaf:Organization",
            "@type": "@id"
        },
        "process": {
            "@id": "prov:process",
            "@type": "xsd:string"
        },
        "inputs": {
            "@id": "pwo:needs",
            "@container": "@set",
            "@type": "@id"
        },
        "outputs": {
            "@id": "pwo:produces",
            "@container": "@set",
            "@type": "@id"
        },
        "assertions": {
            "@id": "pso:resultsInAcquiring",
            "@container": "@set",
            "@type": "@id",
            "@context": {
                "item": {
                    "@id": "pso:isStatusHeldBy",
                    "@type": "@id"
                },
                "status": {
                    "@id": "pso:withStatus",
                    "@type": "@vocab",
                    "@context": {
                        "@vocab": "http://purl.org/spar/pso/"
                    }
                }
            }
        },
        "steps": {
            "@id": "pwo:hasStep",
            "@container": [
                "@id"
            ]
        },
        "first-step": {
            "@id": "pwo:hasFirstStep",
            "@type": "@id"
        },
        "next-step": {
            "@id": "pwo:hasNextStep",
            "@type": "@id"
        },
        "previous-step": {
            "@id": "pwo:hasPreviousStep",
            "@type": "@id"
        },
        "content": {
            "@id": "fabio:hasManifestation",
            "@type": "@id",
            "@container": "@set"
        },
        "url": {
            "@id": "fabio:hasURL",
            "@type": "xsd:anyURI"
        },
        "review": "fabio:ProductReview",
        "web-page": "fabio:WebPage",
        "participants": {
            "@id": "pro:isDocumentContextFor",
            "@container": "@set",
            "@type": "@id"
        },
        "role": {
            "@id": "pro:withRole",
            "@type": "@vocab",
            "@context": {
                "@vocab": "http://purl.org/spar/pro/"
            }
        },
        "actor": {
            "@id": "pro:isHeldBy",
            "@type": "@id"
        },
        "email": "fabio:Email",
        "file": {
            "@id": "fabio:DigitalManifestation",
            "@context": {
                "text": "cnt:chars"
            }
        },
        "letter": "fabio:Letter",
        "manuscript": "fabio:Manuscript",
        "format": {
            "@id": "dcterms:format",
            "@type": "@vocab",
            "@context": {
                "@vocab": "https://w3id.org/spar/mediatype/"
            }
        },
        "includes": "frbr:part",
        "actions": {
            "@id": "taskex:isExecutedIn",
            "@container": "@set",
            "@type": "@id"
        },
        "happened": {
            "@id": "pwo:happened",
            "@type": "@id"
        },
        "at-date": {
            "@id": "ti:hasIntervalDate",
            "@type": "xsd:date"
        },
        "realization-of": {
            "@id": "frbr:realizationOf",
            "@type": "@id"
        },
        "author-response": "fabio:Reply",
        "evaluation-summary": "fabio:ExecutiveSummary",
        "decision-letter": "fabio:Letter"
    },
    "id": "https://sciety.org/docmaps/v1/articles/10.1101/2020.04.05.20054403.docmap.json",
    "type": "docmap",
    "created": "2020-05-01T00:00:00.000Z",
    "updated": "2020-05-01T00:00:00.000Z",
    "publisher": {
        "id": "https://ncrc.jhsph.edu/",
        "name": "NCRC",
        "logo": "https://sciety.org/static/groups/ncrc--62f9b0d0-8d43-4766-a52a-ce02af61bc6a.jpg",
        "homepage": "https://ncrc.jhsph.edu/",
        "account": {
            "id": "https://sciety.org/groups/62f9b0d0-8d43-4766-a52a-ce02af61bc6a",
            "service": "https://sciety.org"
        }
    },
    "first-step": "_:b0",
    "steps": {
        "_:b0": {
            "assertions": [],
            "inputs": [
                {
                    "doi": "10.1101/2020.04.05.20054403",
                    "url": "https://doi.org/10.1101/2020.04.05.20054403",
                    "published": "2020-04-10T00:00:00.000Z"
                }
            ],
            "actions": [
                {
                    "participants": [
                        {
                            "actor": {
                                "name": "anonymous",
                                "type": "person"
                            },
                            "role": "peer-reviewer"
                        }
                    ],
                    "outputs": [
                        {
                            "type": "review-article",
                            "published": "2020-05-01T00:00:00.000Z",
                            "content": [
                                {
                                    "type": "web-page",
                                    "url": "https://ncrc.jhsph.edu/research/projected-early-spread-of-covid-19-in-africa/"
                                },
                                {
                                    "type": "web-page",
                                    "url": "https://sciety.org/articles/activity/10.1101/2020.04.05.20054403#ncrc:90c56170-fa65-4d4c-82a2-dceefeb603fe"
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    }
}
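
The step from plain JSON to JSON-LD amounts to attaching the context to the same document. The sketch below shows that merge using only the Python standard library, with a heavily trimmed context. The helper names with_context and resolve_term are hypothetical, and resolve_term handles only simple aliases and prefix:suffix terms; a real consumer would hand the merged document to a full JSON-LD processor rather than resolve terms by hand.

```python
# Trimmed subset of the context above, enough to illustrate term resolution.
CONTEXT = {
    "pwo": "http://purl.org/spar/pwo/",
    "dcterms": "http://purl.org/dc/terms/",
    "id": "@id",
    "type": "@type",
    "docmap": "pwo:Workflow",
    "first-step": {"@id": "pwo:hasFirstStep", "@type": "@id"},
}


def with_context(plain: dict, context: dict) -> dict:
    """Return a copy of the plain JSON Docmap with an @context attached."""
    return {"@context": context, **plain}


def resolve_term(term: str, context: dict) -> str:
    """Resolve a convention term to the IRI it is mapped to.

    Simplified for illustration: follows one alias, then expands a
    prefix:suffix compact IRI if the prefix is defined in the context.
    """
    definition = context.get(term, term)
    if isinstance(definition, dict):
        definition = definition.get("@id", term)
    prefix, sep, suffix = definition.partition(":")
    base = context.get(prefix)
    if sep and isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix
    return definition


plain = {"type": "docmap", "first-step": "_:b0", "steps": {}}
linked = with_context(plain, CONTEXT)
workflow_iri = resolve_term("docmap", CONTEXT)
first_step_iri = resolve_term("first-step", CONTEXT)
```

The plain document is left untouched, so a provider can publish either form from the same source data.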

Usage Guidance

Work and Status Types

Defining work and status types for different scholarly outputs is one of the more challenging aspects of this effort. The PSO and FaBiO ontologies, while comprehensive, were designed for a world without much transparent article evaluation. We are working with the ASAPBio Peer Review Taxonomy group to define terms, and will propose adding them to SPAR ontologies once that group has completed its work. In the meantime, we are using the following convention to describe different outputs of evaluation processes.

Description | Crossref Name | Sciety Name | Docmap Term | Docmap Alias
--- | --- | --- | --- | ---
An evaluation that results in a recommendation | major-revision, minor-revision, accept | evaluation | pso:peerReviewed | TBD. For now, use the PSO term.
An evaluation that does not result in a recommendation | reject, reject-with-resubmit | evaluation | pso:reviewed | TBD. For now, use the PSO term.
Text of an individual evaluator’s evaluation | referee-report | review | fabio:ProductReview | review
An author’s response to evaluations | author-comment | author response | fabio:Reply | author-response
A summary of an entire round of evaluations | aggregate | evaluation summary | fabio:ExecutiveSummary | evaluation-summary
A letter sent to authors describing the results of the evaluation | editor-report | decision letter | fabio:Letter | decision-letter
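
For providers converting existing Crossref-style metadata, the convention above can be held as plain lookup tables. This is a sketch only: the dictionary and function names are hypothetical, and the entries simply transcribe the rows above, which are themselves provisional.

```python
# Crossref review type -> Docmap alias (rows with a defined alias only).
CROSSREF_TO_ALIAS = {
    "referee-report": "review",
    "author-comment": "author-response",
    "aggregate": "evaluation-summary",
    "editor-report": "decision-letter",
}

# Docmap alias -> ontology term the alias stands for.
ALIAS_TO_TERM = {
    "review": "fabio:ProductReview",
    "author-response": "fabio:Reply",
    "evaluation-summary": "fabio:ExecutiveSummary",
    "decision-letter": "fabio:Letter",
}

# Crossref recommendation -> PSO status term (no alias defined yet).
RECOMMENDATION_TO_STATUS = {
    "major-revision": "pso:peerReviewed",
    "minor-revision": "pso:peerReviewed",
    "accept": "pso:peerReviewed",
    "reject": "pso:reviewed",
    "reject-with-resubmit": "pso:reviewed",
}


def docmap_output_type(crossref_name: str) -> str:
    """Translate a Crossref review type into the Docmap alias for an output."""
    try:
        return CROSSREF_TO_ALIAS[crossref_name]
    except KeyError:
        raise ValueError(f"no Docmap alias defined for {crossref_name!r}") from None


alias = docmap_output_type("referee-report")
term = ALIAS_TO_TERM[alias]
status = RECOMMENDATION_TO_STATUS["minor-revision"]
```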

Referencing Works

It is best practice to provide dereferenceable IDs to inputs and outputs, but we expect that Docmap providers will commonly provide inline FaBiO object parts, such as:

{
 "type": "review-article",
 "published": "2021-04-23",
 "content": [
  {
   "type": "web-page",
   "url": "https://ncrc.jhsph.edu/research/evidence-for-increased-breakthrough-rates-of-sars-cov-2-variants-of-concern-in-bnt162b2-mrna-vaccinated-individuals/"
  },
  {
   "type": "web-page",
   "url": "https://sciety.org/articles/activity/10.1101/2020.11.09.374330#ncrc:c0e4f483-eb58-4c13-b475-66c3d86fb430"
  }
 ]
}

Referencing People

People are commonly referenced both as actors within steps and as creators of outputs. In both cases, the best practice is to reference people using their ORCID iD, which is a dereferenceable IRI, in the id field, and to provide a givenName and familyName for consumers who are unable to dereference it, like so:

{
   "id": "https://orcid.org/0000-0003-3440-0259",
   "type": "person",
   "givenName": "Narayanan",
   "familyName": "Shivakumar"
}

In the absence of an ORCID, the given and family names can be provided without an ID. In these cases, it is best practice to add a blank node identifier (e.g. _:p0) to the first reference of the person so that they can be referenced later if mentioned more than once.
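
Both cases can be handled by one small helper on the provider side. The sketch below is an assumption about how a provider might do this, not part of the convention: the class PersonRegistry is hypothetical, people without an ORCID are matched by name only (so two different people with the same name would collide), and emitting a bare id for repeat mentions is one possible reading of "referenced later". The name "Ada Example" is a placeholder.

```python
import itertools
from typing import Optional


class PersonRegistry:
    """Builds person references, assigning blank node ids when no ORCID exists."""

    def __init__(self) -> None:
        self._blank_ids = {}
        self._counter = itertools.count()

    def reference(self, given_name: str, family_name: str,
                  orcid: Optional[str] = None) -> dict:
        person = {
            "type": "person",
            "givenName": given_name,
            "familyName": family_name,
        }
        if orcid is not None:
            # Preferred form: the ORCID iD as a dereferenceable IRI.
            return {"id": f"https://orcid.org/{orcid}", **person}
        key = (given_name, family_name)
        if key in self._blank_ids:
            # Later mention: point back at the blank node assigned earlier.
            return {"id": self._blank_ids[key]}
        blank_id = f"_:p{next(self._counter)}"
        self._blank_ids[key] = blank_id
        return {"id": blank_id, **person}


registry = PersonRegistry()
with_orcid = registry.reference("Narayanan", "Shivakumar", "0000-0003-3440-0259")
first_mention = registry.reference("Ada", "Example")
second_mention = registry.reference("Ada", "Example")
```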

Ordering Steps

JSON-LD arrays are unordered by default. Thus, steps should be ordered by assigning each step a blank node identifier (e.g. _:b0) and referencing them using first-step (pwo:hasFirstStep) at the top of the Docmap, and next-step (pwo:hasNextStep) and previous-step (pwo:hasPreviousStep) in each step, like so:

{
 "type": "docmap",
 "first-step": "_:b0",
 ...
 "steps": {
  "_:b0": {
   "next-step": "_:b1",
   ...
  },
  "_:b1": {
   "previous-step": "_:b0",
   "next-step": "_:b2",
   ...
  },
  "_:b2": {
   "previous-step": "_:b1",
   ...
  }
 }
}
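
On the consumer side, the order is recovered by starting at first-step and following next-step links rather than relying on the order of keys in the document. A minimal sketch, assuming steps are keyed by their blank node identifiers as in the JSON example earlier; the function name ordered_steps and its error handling are illustrative choices.

```python
def ordered_steps(docmap: dict) -> list:
    """Return (step_id, step) pairs in workflow order.

    Follows first-step and next-step links, and raises ValueError if a
    link points at a missing step or the links form a cycle.
    """
    steps = docmap.get("steps", {})
    ordered = []
    seen = set()
    step_id = docmap.get("first-step")
    while step_id is not None:
        if step_id in seen:
            raise ValueError(f"cycle detected at step {step_id!r}")
        if step_id not in steps:
            raise ValueError(f"step {step_id!r} is referenced but not defined")
        seen.add(step_id)
        ordered.append((step_id, steps[step_id]))
        step_id = steps[step_id].get("next-step")
    return ordered


# Keys are deliberately out of order to show that only the links matter.
docmap = {
    "type": "docmap",
    "first-step": "_:b0",
    "steps": {
        "_:b2": {"previous-step": "_:b1"},
        "_:b0": {"next-step": "_:b1"},
        "_:b1": {"previous-step": "_:b0", "next-step": "_:b2"},
    },
}

step_ids = [step_id for step_id, _ in ordered_steps(docmap)]
```

A step that is never linked from first-step or a next-step is simply not visited, so a consumer may also want to compare the result against the full set of step keys.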

Grouping Steps

We have deliberately refrained from specifying how steps should be grouped. In general, you should try to be as transparent as possible, using as many steps as needed to accurately describe the process. In practice, however, we expect that many people will not have granular access to data, and will describe processes in a single, or just a few, steps.
