# Walk-through of the NavigaDoc format

## Introduction to the NavigaDoc document format

NavigaDoc is a JSON version of our existing NewsML document format.

*...and some other XML formats we're replacing*

The NavigaDoc format is less verbose and JSON/GraphQL/struct-friendly. The idea is to be able to use the same format all the way from authoring tools to frontend applications.

The JSON format stays close enough to the way data and content are modelled in our NewsML format to make bi-directional conversion feasible. The Content Creation API (CCA) converts between NavigaDoc JSON and NewsML automatically, on the fly. All writes to the OC in question have to go through this intermediate API, which handles the conversion.

## Relying on conversion JSON<->XML

We're still storing documents as NewsML XML in the OC content repository. To ensure a reliable 1:1 conversion, we put some constraints on how NewsML is used in the Writer, and some constraints on what can be used in NavigaDoc. More on that once we are more familiar with the format.

The supported document types, i.e. the document types that can be converted to and from NewsML XML, are:

* Article: `x-im/article`
* Article template: `x-im/article-template`
* Image: `x-im/image`
* Imagelink: `x-im/imagelink`
* PDF: `x-im/pdf`
* Concept: the actual type depends on which kind of concept it is:
  * Author: `x-im/author`
  * Category: `x-im/category`
  * Channel: `x-im/channel`
  * Content profile: `x-im/content-profile`
  * Event: `x-im/event`
  * Organisation: `x-im/organisation`
  * Person: `x-im/person`
  * Place: `x-im/place`
  * Section: `x-im/section`
  * Story: `x-im/story`
  * Topic: `x-im/topic`
* List: `x-im/list`
* Package: `x-im/package`
* Planning: `x-im/newscoverage`
* Assignment: `x-im/assignment`

The type corresponds to the `$.type` attribute of the NavigaDoc document (see below).

## The attributes of a NavigaDoc document

Top level document structure:

```javascript
{
  "uuid": "1d02738f-7c99-42ba-a6da-3d1b97261523",
  "title": "Proin eget dignissim ipsum",
  "status": "withheld",
  "provider": "acme",
  "modified": "2015-07-01T14:11:20Z",
  "created": "2015-07-01T14:00:02Z",
  "type": "x-im/article",
  "uri": "im://article/1d02738f-7c99-42ba-a6da-3d1b97261523",
  "url": "http://example.org/articles/2015-",
  "published": "2015-07-01T14:27:00+02:00",
  "unpublished": "2015-10-05T15:14:13+02:00",
  "language": "en-GB",
  "links": [{...}],
  "meta": [{...}],
  "content": [{...}],
  "properties": [{...}]
}
```

## The attributes of a document

* `uuid`: ID of the document
* `uri`: a URI that identifies the document
* `url`: the location of the document (if any)
* `title`: title of the document (not headline)
* `status`: workflow status of the document, "draft" etc.
* `provider`: the provider of the document, e.g. "TT" or "NavigaPhotos"
* `created`: when the document was created
* `type`: the type of document, e.g. "x-im/article"
* `modified`: last modified timestamp
* `published`: when the document was, or will be, published
* `unpublished`: when the document was, or will be, unpublished
* `language`: the language code for the document: "en-GB", "fi", "sv" etc.
* `path`: the path on which the document can be exposed when consumed through a website
* `products`: a list of product names, replaced by channel-links but preserved for legacy support
* `content`: a list of content blocks
* `meta`: a list of metadata blocks that describe the document
* `links`: a list of link blocks that describe the document's relationships
* `properties`: a list of properties, primarily used when converting from XML
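
The attribute list above maps naturally onto Go structs. The following is a minimal sketch, not an official client: the type names `Document` and `Block` and the helper `parseDocument` are hypothetical, only a subset of attributes is included, and `data` is assumed to carry string values as in the examples on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Block is a reduced, hypothetical mapping of a NavigaDoc block.
// Data is string-to-string because the examples on this page carry
// every data value as a JSON string; that is an assumption.
type Block struct {
	ID    string            `json:"id,omitempty"`
	UUID  string            `json:"uuid,omitempty"`
	URI   string            `json:"uri,omitempty"`
	Type  string            `json:"type,omitempty"`
	Title string            `json:"title,omitempty"`
	Rel   string            `json:"rel,omitempty"`
	Data  map[string]string `json:"data,omitempty"`
	Links []Block           `json:"links,omitempty"`
}

// Document mirrors a subset of the top-level attributes listed above.
type Document struct {
	UUID        string     `json:"uuid"`
	URI         string     `json:"uri"`
	Type        string     `json:"type"`
	Title       string     `json:"title"`
	Status      string     `json:"status"`
	Language    string     `json:"language,omitempty"`
	Created     *time.Time `json:"created,omitempty"`
	Modified    *time.Time `json:"modified,omitempty"`
	Published   *time.Time `json:"published,omitempty"`
	Unpublished *time.Time `json:"unpublished,omitempty"`
	Links       []Block    `json:"links,omitempty"`
	Meta        []Block    `json:"meta,omitempty"`
	Content     []Block    `json:"content,omitempty"`
}

// parseDocument decodes NavigaDoc JSON into the sketch types above.
func parseDocument(raw []byte) (Document, error) {
	var doc Document
	err := json.Unmarshal(raw, &doc)
	return doc, err
}

func main() {
	raw := []byte(`{
	  "uuid": "1d02738f-7c99-42ba-a6da-3d1b97261523",
	  "type": "x-im/article",
	  "status": "withheld",
	  "published": "2015-07-01T14:27:00+02:00",
	  "links": [{"rel": "subject", "type": "x-im/topic", "title": "Volvo"}]
	}`)
	doc, err := parseDocument(raw)
	if err != nil {
		panic(err)
	}
	// Timestamps are RFC 3339 strings, which time.Time decodes directly.
	fmt.Println(doc.Type, doc.Status, doc.Published.UTC().Format(time.RFC3339), len(doc.Links))
}
```

Pointer timestamps keep unset values such as a missing `unpublished` distinguishable from a zero time.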

{% hint style="info" %}
**Note:** When a document is stored in Open Content, there is a "core" property named `source`, which has the value "cca" if the document was saved using CCA. The `source` property is not supported in the document itself, only as a property in Open Content.
{% endhint %}

## UUID and URI

Articles created by the Writer have a random UUIDv4 and a URI that contains the UUID: `im://article/1d02738f-7c99-42ba-a6da-3d1b97261523`

Documents from external systems should construct a URI that represents the ID of the document in the external system.

If you have a system called Robot that produces an article with the ID 1234-8754, you could construct a URI like `robot://article/1234-8754` and generate a v5 UUID from it.

## Generating a v5 UUID

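A UUIDv5 is derived by SHA-1 hashing a name (here the document URI) within a namespace (here the predefined URL namespace). The same URI therefore always yields the same UUID.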

In a shell you would do it like this:

```bash
$ uuidgen --namespace @url --sha1 --name robot://article/1234-8754 
bda1a573-e7ab-5076-adbf-aa3ff9ba8106
```

In Go, you would do it like this:

```go
package main

import (
	"fmt"

	uuid "github.com/satori/go.uuid"
)

func main() {
	// The URI is the name; uuid.NamespaceURL is the predefined URL namespace.
	uri := "robot://article/1234-8754"
	uuidV5 := uuid.NewV5(uuid.NamespaceURL, uri)
	fmt.Println(uuidV5.String())
	// Output: bda1a573-e7ab-5076-adbf-aa3ff9ba8106
}
```
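
If adding a UUID dependency is not an option, the same derivation can be sketched with the standard library alone. This is an illustration of the v5 algorithm (SHA-1 over namespace bytes plus name, then version and variant bits), not the code any Naviga service uses; the function name `uuidV5` is hypothetical.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// namespaceURL is the predefined URL namespace,
// 6ba7b811-9dad-11d1-80b4-00c04fd430c8.
var namespaceURL = [16]byte{
	0x6b, 0xa7, 0xb8, 0x11, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
}

// uuidV5 hashes namespace+name with SHA-1 and formats the first
// 16 bytes as a UUID with the version and variant bits set.
func uuidV5(namespace [16]byte, name string) string {
	h := sha1.New()
	h.Write(namespace[:])
	h.Write([]byte(name))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant

	s := hex.EncodeToString(u[:])
	return s[0:8] + "-" + s[8:12] + "-" + s[12:16] + "-" + s[16:20] + "-" + s[20:32]
}

func main() {
	fmt.Println(uuidV5(namespaceURL, "robot://article/1234-8754"))
}
```

Because the result depends only on the namespace and the URI, an external system can recompute the UUID at any time instead of storing it.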

## Status and timestamps

A document has one of the following statuses:

* draft: a working copy
* done: work is done, needs to be approved by e.g. an editor
* withheld: scheduled for publishing
* usable: published
* canceled: the article has been published, but was then unpublished

The document status works together with the `published` and `unpublished` timestamps.

For "withheld" documents, the `published` timestamp is when they will be published.

For "usable" documents, `published` is the time they were published, and if `unpublished` is set, they will become `canceled` at that time.

## Building blocks

The document has three primary sets of blocks that describe it:

* links
* meta
* content

These blocks are also recursive and can in turn contain links, properties (metadata equivalent) and content.

## Block attributes

* `id` is the block ID
* `uuid` is used when a block references another document.
* `uri` is used to reference another entity (document or otherwise)
* `url` is a browseable URL for the block.
* `type` is a mime-ish type for the block
* `title` is the title/headline of the block
* `data` holds key-value data
* `rel` is the relationship the block has to its parent
* `name` is a name for the block. An alternative to "rel" when relationship is a term that doesn't fit

* `value` is a value for the block. Useful when we want to store a primitive value
* `contentType` is used to describe the content type of the block/linked entity if it differs from the type of the block

And then we have the nested blocks:

* `links` is a set of link blocks
* `properties` is a set of properties for the block, much like `meta` is for the document
* `content` is used to nest content under a block
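
Because blocks nest through `links`, `properties` and `content`, consumers usually need a recursive traversal. A minimal sketch, with a reduced hypothetical `Block` type and hypothetical helpers `walk` and `referencedUUIDs`, that collects every document reference in a block tree:

```go
package main

import "fmt"

// Block is a reduced, hypothetical block type with the three nested sets.
type Block struct {
	Type       string
	Rel        string
	UUID       string
	Links      []Block
	Properties []Block
	Content    []Block
}

// walk calls visit for b and then for every nested block, depth first.
func walk(b Block, depth int, visit func(b Block, depth int)) {
	visit(b, depth)
	for _, group := range [][]Block{b.Links, b.Properties, b.Content} {
		for _, child := range group {
			walk(child, depth+1, visit)
		}
	}
}

// referencedUUIDs returns the uuid of every block in the tree that has one,
// i.e. every block that references another document.
func referencedUUIDs(root Block) []string {
	var out []string
	walk(root, 0, func(b Block, _ int) {
		if b.UUID != "" {
			out = append(out, b.UUID)
		}
	})
	return out
}

func main() {
	author := Block{
		Type: "x-im/author", Rel: "author",
		UUID: "bad4314c-7e33-11e5-8bcf-feff819cdc9f",
		Links: []Block{
			{Type: "x-im/image", Rel: "avatar", UUID: "9c188460-c500-11e5-9912-ba0be0483c18"},
		},
	}
	for _, id := range referencedUUIDs(author) {
		fmt.Println(id)
	}
}
```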

## Modelling data with blocks

Block nesting and the key-value structure under `data` are intended to be used responsibly.

Model your data with nesting, but don't go multi-level without considering complexity costs.

## Modelling data with blocks - a video block

Try to keep data keys generic, and don't do things like this:

```javascript
{
  "type": "sanoma/video-type",
  "title": "Onnellinen lokki",
  "uri": "videprovider://video/1234-5678",
  "data": {
    "bylineName": "Hugo Wetterberg",
    "bylineImage": "https://example.com/hugo.png",
    "bylineLink": "https://example.com/photographer/hugo",
    "coverImage": "https://example.com/seagull.png"
  }
}
```

Use nesting instead:

```javascript
{
  "type": "sanoma/video-type",
  "title": "Onnellinen lokki",
  "uri": "videprovider://video/1234-5678",
  "links": [
    {
      "rel": "author",
      "title": "Hugo Wetterberg",
      "url": "https://example.com/photographer/hugo",
      "links": [
          { "rel": "avatar", "url": "https://example.com/hugo.png" }
      ]
    },
    {
      "rel": "cover-image", "url": "https://example.com/seagull.png"
    }
  ]
}
```

The link model scales better with feature requests like "we need to credit multiple authors", or "we need to provide the dimensions of the cover image". With flat `data` keys, these innocent requests could result in:

```javascript
"data": {
  "bylineName": "Hugo Wetterberg",
  "bylineImage": "https://example.com/hugo.png",
  "bylineLink": "https://example.com/photographer/hugo",
  "bylineTwoName": "Kristofer Pasanen",
  "bylineTwoImage": "https://example.com/kristofer.png",
  "bylineTwoLink": "https://example.com/photographer/kristofer",
  "coverImage": "https://example.com/seagull.png",
  "coverImageWidth": "1920",
  "coverImageHeight": "1080"
}
```

Instead of semantic structure, we are left with arbitrary fields.
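
On the consuming side, the link model means one loop handles any number of authors. The sketch below assumes the nested video block shape shown earlier; the `Author` type and the helpers `byRel` and `authors` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Block is a reduced, hypothetical block type for this example.
type Block struct {
	Rel   string  `json:"rel,omitempty"`
	Title string  `json:"title,omitempty"`
	URL   string  `json:"url,omitempty"`
	Links []Block `json:"links,omitempty"`
}

// Author is what a frontend might want to render for a byline.
type Author struct {
	Name, Profile, Avatar string
}

// byRel returns the links that have the given relationship.
func byRel(links []Block, rel string) []Block {
	var out []Block
	for _, l := range links {
		if l.Rel == rel {
			out = append(out, l)
		}
	}
	return out
}

// authors extracts every author link, with its nested avatar if present.
func authors(b Block) []Author {
	var out []Author
	for _, l := range byRel(b.Links, "author") {
		a := Author{Name: l.Title, Profile: l.URL}
		if av := byRel(l.Links, "avatar"); len(av) > 0 {
			a.Avatar = av[0].URL
		}
		out = append(out, a)
	}
	return out
}

func main() {
	raw := []byte(`{"title": "Onnellinen lokki", "links": [
	  {"rel": "author", "title": "Hugo Wetterberg", "url": "https://example.com/photographer/hugo",
	   "links": [{"rel": "avatar", "url": "https://example.com/hugo.png"}]},
	  {"rel": "author", "title": "Kristofer Pasanen", "url": "https://example.com/photographer/kristofer"},
	  {"rel": "cover-image", "url": "https://example.com/seagull.png"}
	]}`)
	var video Block
	if err := json.Unmarshal(raw, &video); err != nil {
		panic(err)
	}
	for _, a := range authors(video) {
		fmt.Printf("%s (%s) avatar=%q\n", a.Name, a.Profile, a.Avatar)
	}
}
```

Adding a second author required no new keys and no change to the consumer.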

## The data block

The data block allows arbitrary keys and values, with two caveats:

* some keys are expected to contain certain values:
  * "geometry" is a [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry) string
  * "width", "height", "x", "y", "score" etc. are expected to be numbers
  * when used with "text", "format" refers to the format of the text
* the type of the block must act as a contract between producers and consumers of content
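
In the examples on this page, numeric values such as "width" and "height" appear as JSON strings. Assuming that holds for your documents, a consumer has to parse them before doing arithmetic. A minimal sketch with a hypothetical helper `number`:

```go
package main

import (
	"fmt"
	"strconv"
)

// number reads a numeric value from a block's data map.
// It assumes values are stored as strings (an assumption based on the
// examples on this page) and reports false for missing or
// non-numeric values.
func number(data map[string]string, key string) (float64, bool) {
	raw, ok := data[key]
	if !ok {
		return 0, false
	}
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

func main() {
	data := map[string]string{"width": "1600", "height": "900"}
	w, okW := number(data, "width")
	h, okH := number(data, "height")
	if okW && okH && h != 0 {
		fmt.Printf("aspect ratio: %.2f\n", w/h)
	}
}
```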

## Document links

Document links associate the document with concepts, external resources, and other documents.

The order of links doesn't have any semantic meaning.

## Document links - subjects

A document may have the same kind of relationship to documents of different types, or different relationships to documents of the same type. Here, three `subject` links point to a category, a topic, and a place:

```javascript
{
  "title": "Dalarna",
  "rel": "subject", "type": "x-im/category",
  "uuid": "03d22994-91e4-11e5-8994-feff819cdc9f"
},
{
  "title": "Volvo",
  "rel": "subject", "type": "x-im/topic",
  "uuid": "b201e042-555b-11e5-885d-feff819cdc9f"
},
{
  "title": "Alvesta",
  "rel": "subject", "type": "x-im/place",
  "uuid": "bce38dda-555b-11e5-885d-feff819cdc9f",
  "data": { "geometry": "POINT(14.55600 56.89921)" }
}
```

## Document links - authors

```javascript
{
  "title": "Jane Doe", "rel": "author", "type": "x-im/author",
  "uuid": "bad4314c-7e33-11e5-8bcf-feff819cdc9f",
  "uri": "im://user/58456",
  "links": [
    {
      "rel":"avatar",
      "type":"x-im/image",
      "uuid":"9c188460-c500-11e5-9912-ba0be0483c18",
      "uri":"im://image/janedoe.jpeg"
    }
  ]
}
```

## Metadata blocks

Metadata blocks carry information about the document.

Metadata blocks are commonly associated with a Writer plugin.

Here we see the product of the news value plugin, where somebody is feeding an algorithm (or analytics) information about the editorial valuation of an article.

```javascript
{
  "type" : "x-im/newsvalue",
  "data" : {
    "duration" : "86400",
    "description" : "1D",
    "score" : "4"
  }
}
```

## Metadata blocks - teaser

```javascript
{
  "type": "x-im/teaser",
  "title": "The squid comes for you",
  "data": {
    "title": "The squid comes for you",
    "text": "10 facts about the mecha-squids that terrorised cowboys during the gold rush.",
    "subject": "A mecha squid racing to catch a gunslinger on horseback"
  },
  "links": [
    {
      "rel": "image",
      "type": "x-im/image",
      "uri": "im://image/WEH99iJHOXu6ssz7h8Ne7kFLmqs.png",
      "uuid": "f7d9d837-5048-54ee-b961-064dfc8467ca",
      "data": {
        "width": "1600",
        "height": "900"
      }
    }
  ]
}
```

## Content blocks

Content blocks describe the content that typically gets rendered when we display a document.

Examples of a headline and a paragraph:

```javascript
{
  "id": "d0dbf67d385e",
  "type": "x-im/header",
  "data": {
    "format": "html",
    "text": "Lorem ipsum dolor sit"
  }
},
{
  "id": "fafbedf02da1", "type": "x-im/paragraph",
  "data": {
    "format": "html",
    "text": "Mauris eleifend, <a href=\"http://google.com\">Bacon</a> orci nec volutpat."
  }
}
```

## Content block - Image

```javascript
{
  "type" : "x-im/image",
  "uuid" : "1b34f847-fb4c-59e2-a648-42fe168061d2",
  "id" : "MTE2LDQxLDE3MywxMDU",
  "links" : [
    {
      "type" : "x-im/image",
      "links" : [
        {
          "title" : "Kristofer Pasanen",
          "rel" : "author"
        }
      ],
      "data" : {
        "height" : "1668",
        "width" : "2500",
        "text" : "Sed libero metus, iaculis sit amet dolor."
      },
      "uri" : "im://image/ZcrcVwEZyI29HDnmykq1te8M5-s.jpeg",
      "uuid" : "1b34f847-fb4c-59e2-a648-42fe168061d2",
      "rel" : "self"
    }
  ]
}
```

