
DocMaps Framework Overview

Work-in-progress documentation for DocMaps, a framework for describing the processes used to create a document in a machine-readable way.

Published on Jan 11, 2021

You're viewing an older Release (#1) of this Pub.

  • This Release (#1) was created on Jan 11, 2021.
  • The latest Release (#7) was created on Aug 19, 2021.

Summary

DocMaps is a framework for capturing valuable context about the processes used to create documents, in a machine-readable way that can be expressed and interpreted in multiple formats depending on the use-case of the reader of the DocMap. The framework is designed to capture any amount of contextual data about a document — from a minimum assertion that a process took place, to a detailed history of every edit to a document.

We intend DocMaps to be general enough to capture the process behind any document, but our initial focus is capturing evaluative processes surrounding research documents — primarily peer review processes conducted by journals or by services reviewing preprints. We are working with a technical committee of technologists, publishers, preprint server operators, aggregators, and advocates to develop the first use-cases for the framework.

Core Concepts

DocMaps

In addition to being the name of the framework, a DocMap is the name of the object used to capture information about a document. At its most basic, a DocMap consists of just a few pieces of information:

{
createdOn: timestamp // The time the DocMap was created 
contentType: string // The type of content (book, chapter, review)
content: uri | optional // A link to the content the DocMap refers to
provider: uri | optional // A link identifying the provider of the DocMap (e.g. the journal)
}

Using just these four fields, a publisher could make a simple assertion that an article exists.

{
createdOn: 1999-07-07T00:00:00Z
contentType: "article"
content: https://doi.org/10.1109/5.771073
provider: https://ieee.org
}
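
As a rough illustration, the minimal DocMap above could be modelled in Python as shown here. This is a sketch, not part of the DocMaps framework: the `DocMap` class name, the `to_json` helper, the choice of JSON as an output format, and the use of ISO 8601 strings for timestamps are all assumptions made for this example.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class DocMap:
    """Minimal DocMap: the four base fields from the pseudocode above."""
    createdOn: str                  # assumed ISO 8601; the draft only says "timestamp"
    contentType: str                # e.g. "article", "review", "version"
    content: Optional[str] = None   # optional URI of the content
    provider: Optional[str] = None  # optional URI identifying the provider

    def to_json(self) -> str:
        # Drop unset optional fields so the output carries only what was asserted.
        data = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(data, indent=2)


article = DocMap(
    createdOn="1999-07-07T00:00:00Z",
    contentType="article",
    content="https://doi.org/10.1109/5.771073",
    provider="https://ieee.org",
)
print(article.to_json())
```

Leaving out `content` and `provider` still yields a valid object under this sketch, which mirrors the "minimum assertion" idea described in the Summary.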

Of course, this assertion leaves much to be desired. Most DocMaps will contain more information, e.g. the date an article was published; which version of an article the DocMap refers to; etc.

DocMaps accomplishes this by building and maintaining schemas for different Content Types.

Content Types

Content types extend the basic DocMaps structure to add information needed to describe a specific kind of content.

In the article example above, the publisher can provide more information about the article by looking at the schema for a DocMaps Article (see below) and providing as many of the fields as they feel necessary. For example:

{
createdOn: 1999-07-07T00:00:00Z
contentType: "article"
content: https://doi.org/10.1109/5.771073
provider: https://ieee.org
contributors: [{name: "N. Paskin", type: "author"}]
title: "Toward unique identifiers"
}
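
One way a consumer might check a DocMap against a content type is to compare its keys with the fields that type allows. The sketch below does this for the Article type only, using the field names from the draft Article schema in this document. The helper name `unknown_fields` and the idea of reporting unexpected keys are assumptions for illustration; the framework does not prescribe a validation method.

```python
# Fields shared by every DocMap, per the basic structure described earlier.
BASE_FIELDS = {"contentType", "content", "createdOn", "provider"}

# Extra fields per content type, taken from the draft Article schema.
# Other content types would be registered the same way (hypothetical registry).
CONTENT_TYPE_FIELDS = {
    "article": {"title", "abstract", "contributors", "datePublished", "versions"},
}


def unknown_fields(docmap: dict) -> set:
    """Return the keys that neither the base structure nor the content type defines."""
    extra = CONTENT_TYPE_FIELDS.get(docmap.get("contentType"), set())
    return set(docmap) - (BASE_FIELDS | extra)


article = {
    "createdOn": "1999-07-07T00:00:00Z",
    "contentType": "article",
    "content": "https://doi.org/10.1109/5.771073",
    "provider": "https://ieee.org",
    "contributors": [{"name": "N. Paskin", "type": "author"}],
    "title": "Toward unique identifiers",
}

print(unknown_fields(article))                     # set()
print(unknown_fields({**article, "isbn": "n/a"}))  # {'isbn'}
```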

Although the initial set of content types will be limited, we imagine working with relevant communities to develop many different types that support their use cases. For example: reviews; versions; translations; and discussions.

For now, we’ve defined three different content types with the input of the Technical Committee and the DocMaps core team: Articles, Reviews, and Versions. For more details, see the draft content type schemas below.

Contexts

The real fun starts when you create relationships between DocMaps to describe a series of events. These relationships are called Contexts, and are defined by a context key that describes the type of relationship and the direction of the relationship. Like content types, we imagine that many different types of Contexts will eventually be available, such as:

// context key: [list of DocMaps]
reviews: [DocMap]
isReviewOf: [DocMap]
versions: [DocMap]
isVersionOf: [DocMap]
translations: [DocMap]
isTranslationOf: [DocMap]
discussions: [DocMap]
isDiscussionOf: [DocMap]
...

Using Contexts and Content Types, you can describe almost any kind of editorial process. For example, let’s imagine that the publisher wanted to let you know that two versions of the article exist. They could add Version Contexts to the Article to show that they have published two versions of the article:

{
createdOn: 1999-07-07T00:00:00Z
contentType: "article"
content: https://doi.org/10.1109/5.771073
provider: https://ieee.org
contributors: [{name: "N. Paskin", type: "author"}]
title: "Toward unique identifiers"
versions: [
   {
    createdOn: 1999-07-07T00:00:00Z
    contentType: "version"
    content: https://doi.org/10.1109/5.771073v1
   }
   {
    createdOn: 1999-07-08T00:00:00Z
    contentType: "version"
    content: https://doi.org/10.1109/5.771073v2
   }
]
}

But the flexibility of DocMaps means they don’t have to publish all this information at once. They could also publish a DocMap describing just the second version of the paper by creating a Version DocMap with an isVersionOf context containing an Article DocMap.

{
createdOn: 1999-07-08T00:00:00Z
contentType: "version"
content: https://doi.org/10.1109/5.771073v2
provider: https://ieee.org
isVersionOf: [
   {
    createdOn: 1999-07-07T00:00:00Z
    contentType: "article"
    content: https://doi.org/10.1109/5.771073
   }
]
}
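
The two representations carry the same relationship in opposite directions, so one can be derived from the other. The Python sketch below turns a nested child (for example, one entry of `versions`) into a standalone DocMap that points back to its parent through the inverse context key. The `INVERSE` table uses only the context keys listed in this document; the `invert` function, its behaviour of copying the parent's `provider` onto the child, and the decision to strip nested contexts from the parent summary are assumptions for illustration.

```python
# Context keys named in this document, paired with their inverse direction.
INVERSE = {
    "reviews": "isReviewOf",
    "versions": "isVersionOf",
    "translations": "isTranslationOf",
    "discussions": "isDiscussionOf",
}
CONTEXT_KEYS = set(INVERSE) | set(INVERSE.values())


def invert(parent: dict, context_key: str, index: int) -> dict:
    """Return parent[context_key][index] as a DocMap that references the parent."""
    child = dict(parent[context_key][index])
    # Summarise the parent without its own contexts to avoid a circular structure.
    parent_summary = {k: v for k, v in parent.items() if k not in CONTEXT_KEYS}
    # Assumption: a nested child without a provider inherits the parent's provider.
    if "provider" in parent and "provider" not in child:
        child["provider"] = parent["provider"]
    child[INVERSE[context_key]] = [parent_summary]
    return child


article = {
    "createdOn": "1999-07-07T00:00:00Z",
    "contentType": "article",
    "content": "https://doi.org/10.1109/5.771073",
    "provider": "https://ieee.org",
    "versions": [
        {"createdOn": "1999-07-07T00:00:00Z", "contentType": "version",
         "content": "https://doi.org/10.1109/5.771073v1"},
        {"createdOn": "1999-07-08T00:00:00Z", "contentType": "version",
         "content": "https://doi.org/10.1109/5.771073v2"},
    ],
}

second_version = invert(article, "versions", 1)
print(second_version["isVersionOf"][0]["content"])
```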

Because contexts can flexibly describe events in many different ways, we expect that conventions will arise relatively organically between providers and consumers of DocMaps for describing different editorial processes at different levels of complexity.

As you’ll see in the use cases below, we’ve used the Versions context to provide information about both published article versions and review revision rounds. If a provider didn’t need or want to report revision rounds, or if there was only one round, they could omit the Versions context. In all of these cases, the consumer of the DocMap can decide how much of the context they want to use depending on how they plan to use the DocMap.

Initial Use Cases

The Technical Committee specified two first use cases for DocMaps and prioritized the data elements needed to capture relevant data for those cases. Below, we describe example DocMaps for these cases and explain how the elements requested by the Technical Committee can be derived from the DocMaps.

1. A publisher captures context about a review of an article published in their journal

In this example, a journal is describing a double-anonymized peer review of an article with two rounds of revisions. They do this by nesting a Review context within an Article DocMap. They then nest two Version contexts within the Review context to describe the two rounds of feedback.

{
contentType: "article"
content: https://doi.org/article/123
createdOn: 2020-01-01T00:00:00Z
provider: https://myjournal.org
title: "An article about something!"
contributors: [
    {
     name: "Liz Jones"
     id: https://orcid.org/0002-0002
     role: "author"
    }
    {
     name: "Eric Mays"
     id: https://orcid.org/0005-0001
     role: "data visualization"
    }
]
datePublished: 2020-01-01T00:00:00Z
versions: [
   {
     contentType: "version"
     content: https://doi.org/article/123v1
     date_submitted: 2019-12-20T00:00:00Z
     date_online: 2020-08-15T00:00:00Z
     ethics_statements: "This was conducted ethically."
     competing_interests: "There were no conflicts of interest."
   }
]
reviews: [
  {
   contentType: "review"
   createdOn: 2020-06-01T00:00:00Z
   provider: https://myjournal.org
   decision_date: 2020-07-20T00:00:00Z
   decision: "accept with revisions"
   contributors: [
    {
     name: "John Doe"
     affiliation: "Wassamatta U"
     role: "editor"
    }
    {
     id: 12345
     role: "reviewer"
    }
    {
     id: 23456
     role: "reviewer"
    }
   ]
   identity_transparency: "double-anonymized"
   reviewer_interacts_with: ["editor"]
   review_information_published: ["editor-identities"]
   versions: [
    {
     contentType: "version"
     createdOn: 2020-06-15T00:00:00Z
     contributors: [
      {
       id: 12345
       role: "reviewer"
      }
      {
       id: 23456
       role: "reviewer"
      }
     ]
    }
    {
     contentType: "version"
     createdOn: 2020-07-10T00:00:00Z
     contributors: [
      {
       id: 12345
       role: "reviewer"
      }
     ]
    }
   ]
  }
 ]
}

Data Elements

| Name | Value | DocMaps Location |
| --- | --- | --- |
| Submission date | 2019-12-20 | article -> versions -> date_submitted |
| Date of decision | 2020-07-20 | article -> reviews -> decision_date |
| Date that paper is available online | 2020-08-15 | article -> versions -> date_online |
| Publication of report | none | article -> reviews -> (no content field provided) |
| Decision / Score | accept with revisions | article -> reviews -> decision |
| Number of reviewers (1st round) | 2 | article -> reviews -> first version -> count of contributors with role “reviewer” |
| Publication of response | none | article -> reviews -> (no author_response provided) |
| Review info published | editor identities | article -> reviews -> review_information_published |
| Who was asked | 12345, 23456 | article -> reviews -> contributors with roles “reviewer” or “invited_reviewer” |
| Do editors ask authors to suggest reviewers | no | article -> reviews -> contributors (no author_suggested on any contributor) |
| Name of handling editor | John Doe | article -> reviews -> contributors (editor) |
| Type of model | double anonymized | article -> reviews -> identity_transparency |
| Opt-in for peer review e.g. double blind opt-in | no | article -> reviews -> identity_transparency (opt-in not provided from STM schema) |
| Number of revisions | 2 | article -> reviews -> versions |
| How many revisions particular reviewer participated in | 12345: 2; 23456: 1 | article -> reviews -> versions -> contributors |
| Reviewer interacts with | editor | article -> reviews -> reviewer_interacts_with |
| Post publication commenting | none | article -> reviews -> (no post_publication_commenting provided) |
| Object type | article | article -> contentType |
| Ethics | This was conducted ethically. | article -> versions -> ethics_statements |
| Competing interests | There were no conflicts of interest. | article -> versions -> competing_interests |
| Data availability statements | none | article -> versions -> (no data_availability_statements provided) |
| Authorship contributions | Liz Jones, author; Eric Mays, data visualization | article -> contributors |
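
To make the "DocMaps Location" paths concrete, here is a small Python sketch that derives three of the data elements from an abridged copy of the review in the example. The dictionary layout assumes the pseudocode has been parsed into plain Python objects; the helper name `reviewer_ids` is hypothetical and not part of DocMaps.

```python
from collections import Counter

# Abridged copy of the review nested in the example article DocMap.
review = {
    "contentType": "review",
    "decision": "accept with revisions",
    "versions": [
        {"contentType": "version", "createdOn": "2020-06-15T00:00:00Z",
         "contributors": [{"id": 12345, "role": "reviewer"},
                          {"id": 23456, "role": "reviewer"}]},
        {"contentType": "version", "createdOn": "2020-07-10T00:00:00Z",
         "contributors": [{"id": 12345, "role": "reviewer"}]},
    ],
}


def reviewer_ids(version: dict) -> list:
    """Ids of contributors with role "reviewer" in one revision round."""
    return [c["id"] for c in version.get("contributors", [])
            if c.get("role") == "reviewer"]


# article -> reviews -> versions
number_of_revisions = len(review["versions"])
# article -> reviews -> first version -> count of contributors with role "reviewer"
first_round_reviewers = len(reviewer_ids(review["versions"][0]))
# article -> reviews -> versions -> contributors
rounds_per_reviewer = Counter(
    rid for version in review["versions"] for rid in reviewer_ids(version)
)

print(number_of_revisions, first_round_reviewers, dict(rounds_per_reviewer))
```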

2. An independent review service notifies a preprint server about a review of an article on their platform

In this example, a review service is describing a fully transparent review of a preprint article with links to the review report and author response. They do this by including a content field for the review object and filling out the author response and STM Association Taxonomy metadata to describe the process of the review.

{
contentType: "review"
content: https://doi.org/review/123
createdOn: 2020-01-01T00:00:00Z
provider: https://myreviewservice.org
decision_date: 2020-07-20T00:00:00Z
decision: "accept"
contributors: [
    {
     name: "Tricia McMillan"
     affiliation: "Maximegalon University"
     role: "editor"
     id: https://orcid.org/0000-0000
     author_suggested: false
    }
    {
     name: "Zaphod Beeblebrox"
     affiliation: "Betelgeuse State College"
     role: "reviewer"
     id: https://orcid.org/0001-0001
     author_suggested: true
    }
    {
     name: "Arthur Dent"
     affiliation: "BBC"
     role: "reviewer"
     id: https://orcid.org/0002-0002
    }
    {
     name: "Ford Prefect"
     affiliation: "Pan Galactic Gargle Blaster Society"
     role: "invited_reviewer"
     id: https://orcid.org/0002-0002
    }
]
author_response: https://doi.org/response/123
identity_transparency: ["all-identities-visible", "opt-in"]
reviewer_interacts_with: ["editors", "reviewers", "authors"]
review_information_published: ["reviewer-identities", "editor-identities", "review-reports-author-opt-in"]
versions: [
   {
    contentType: "version"
    date_submitted: 2020-06-15T00:00:00Z
   }
]
isReviewOf: [
    {
     contentType: "article"
     content: https://doi.org/preprint/123
    }
]
}

| Name | Value | DocMaps Location |
| --- | --- | --- |
| Submission date | 2020-06-15 | review -> versions -> date_submitted |
| Date of decision | 2020-07-20 | review -> decision_date |
| Date that paper is available online | n/a for this example; data can be obtained from the preprint server or Crossref | |
| Publication of report | Yes (https://doi.org/review/123) | review -> content |
| Decision / Score | accept | review -> decision |
| Number of reviewers (1st round) | 2 | review -> count of contributors with role “reviewer” |
| Publication of response | Yes (https://doi.org/response/123) | review -> author_response |
| Review info published | reviewer-identities, editor-identities, review-reports-author-opt-in | review -> review_information_published |
| Who was asked | Zaphod Beeblebrox, Arthur Dent, Ford Prefect | review -> contributors with roles “reviewer” or “invited_reviewer” |
| Do editors ask authors to suggest reviewers | yes | review -> contributors -> author_suggested: true (Zaphod Beeblebrox) |
| Name of handling editor | Tricia McMillan | review -> contributors (editor) |
| Type of model | fully transparent | review -> identity_transparency |
| Opt-in for peer review e.g. double blind opt-in | yes | review -> identity_transparency |
| Number of revisions | 1 | review -> versions |
| How many revisions particular reviewer participated in | Arthur Dent: 1; Tricia McMillan: 1; Zaphod Beeblebrox: 1; Ford Prefect: 0 | implied by convention: review -> versions specifies no reviewers, so the contributors specified at the review -> contributors level apply |
| Reviewer interacts with | editors, reviewers, authors | review -> reviewer_interacts_with |
| Post publication commenting | none | review -> (no post_publication_commenting provided) |
| Object type | review | review -> contentType |
| Ethics | n/a for this example; data can be obtained from the preprint server or Crossref | |
| Competing interests | n/a for this example; data can be obtained from the preprint server or Crossref | |
| Data availability statements | n/a for this example; data can be obtained from the preprint server or Crossref | |
| Authorship contributions | Tricia McMillan, editor; Arthur Dent, reviewer; Zaphod Beeblebrox, reviewer | review -> contributors |
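
The contributor-based elements in this table can be read off the `contributors` list with simple filters. The Python sketch below uses an abridged copy of that list (names, roles, and `author_suggested` flags only). The function names and the `ASKED_ROLES` constant are hypothetical; they illustrate one possible reading of the table, not a defined DocMaps API.

```python
# Abridged copy of review -> contributors from the example above.
contributors = [
    {"name": "Tricia McMillan", "role": "editor", "author_suggested": False},
    {"name": "Zaphod Beeblebrox", "role": "reviewer", "author_suggested": True},
    {"name": "Arthur Dent", "role": "reviewer"},
    {"name": "Ford Prefect", "role": "invited_reviewer"},
]

# Roles that count as "asked", following the table's wording.
ASKED_ROLES = {"reviewer", "invited_reviewer"}


def who_was_asked(people: list) -> list:
    """Names of everyone who reviewed or was invited to review."""
    return [p["name"] for p in people if p.get("role") in ASKED_ROLES]


def authors_suggested_reviewers(people: list) -> bool:
    """True if any contributor is flagged author_suggested: true."""
    return any(p.get("author_suggested") for p in people)


def handling_editors(people: list) -> list:
    """Names of contributors with role "editor"."""
    return [p["name"] for p in people if p.get("role") == "editor"]


print(who_was_asked(contributors))
print(authors_suggested_reviewers(contributors))
print(handling_editors(contributors))
```

A missing `author_suggested` key is treated the same as `false` here, which matches how the first use case reports "no" when no contributor carries the flag.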

Draft Content Type Schemas

These are initial drafts of content type schemas, and will be somewhat incomplete by design. They are intended to capture the information necessary to make the two initial use cases listed above possible.

Article

{

// Common DocMap fields
contentType: "article"
content: uri // A canonical link or DOI to the article, if it exists
createdOn: timestamp
provider: uri

// Article specific metadata
title: string
abstract: string
contributors: [Contributor]
datePublished: date

// Contexts
versions: [Version]

}

Review

{

// Common DocMap fields
contentType: "review"
content: uri // A canonical link or DOI to the report publication, if it exists
createdOn: timestamp
provider: uri

// Review specific metadata fields
decision_date: timestamp
decision: string
contributors: [Contributor] // Includes reviewers, editors, asked reviewers
author_response: uri
identity_transparency: string // STM Assoc taxonomy
reviewer_interacts_with: [string] // STM Assoc taxonomy
review_information_published: [string] // STM Assoc taxonomy
post_publication_commenting: [string] // STM Assoc taxonomy

// Contexts
isReviewOf: [Article] // A list of DocMaps of reviewed objects

}

Version

{

// Common DocMap fields
contentType: "version"
content: uri // A canonical link or DOI to the version of the referenced object, if it exists
createdOn: timestamp
provider: uri

// Version specific metadata
date_submitted: timestamp
date_online: timestamp
ethics_statements: string
competing_interests: string
data_availability: string
contributors: [Contributor]

}
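
For readers who prefer typed code to pseudocode, the three draft schemas could be transcribed into Python `TypedDict` classes roughly as follows. This is only one possible rendering, under several assumptions: timestamps, dates, and URIs are represented as strings, every field is treated as optional, and the `Contributor` shape is inferred from the examples because the drafts do not define it. The `DocMapBase` class name is also hypothetical.

```python
from typing import List, TypedDict, Union


class Contributor(TypedDict, total=False):
    # Inferred from the examples; the drafts do not define this type.
    name: str
    id: Union[str, int]   # ORCID URI in one example, numeric id in the other
    role: str
    affiliation: str
    author_suggested: bool


class DocMapBase(TypedDict, total=False):
    # Common DocMap fields
    contentType: str
    content: str
    createdOn: str
    provider: str


class Version(DocMapBase, total=False):
    date_submitted: str
    date_online: str
    ethics_statements: str
    competing_interests: str
    data_availability: str
    contributors: List[Contributor]


class Article(DocMapBase, total=False):
    title: str
    abstract: str
    contributors: List[Contributor]
    datePublished: str
    versions: List[Version]          # context


class Review(DocMapBase, total=False):
    decision_date: str
    decision: str
    contributors: List[Contributor]
    author_response: str
    identity_transparency: str
    reviewer_interacts_with: List[str]
    review_information_published: List[str]
    post_publication_commenting: List[str]
    isReviewOf: List[Article]        # context


review: Review = {
    "contentType": "review",
    "decision": "accept",
    "isReviewOf": [{"contentType": "article",
                    "content": "https://doi.org/preprint/123"}],
}
print(review["isReviewOf"][0]["content"])
```

At runtime these classes are ordinary dictionaries, so the types only help static checkers; they do not enforce the schema.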

Implementation

Once we discuss and come to agreement on the initial schemas, we’ll turn to implementation. We’re using pseudocode in the examples above because we want DocMaps to support multiple formats, since publishers’ technological needs and capabilities are so varied.
