Images

This page describes how images uploaded via the Digital Writer (or Binary Service Lite) are handled, stored, and accessed. It is general documentation; details of how to set up and configure the Digital Writer and its services for image handling are not covered here.

MIME types supported

As-is, the Digital Writer supports the MIME types image/jpeg, image/gif, and image/png. The maximum size of an uploaded image is configurable per MIME type.

Storage

The original image is stored in an S3 bucket that is not publicly accessible. In addition, a smaller version of the image, a preview, is created and stored in the same S3 bucket. The Digital Writer can be configured to also create a thumb derivative of the image in the bucket.

When the image is published, a copy of the original image with all metadata removed is uploaded to a separate, public S3 bucket (used for front-end rendering).

For every image uploaded, a metadata document is stored in Open Content. If the Digital Writer is configured to store the original image in Open Content, the image is stored as the primary file (in addition to being stored in an S3 bucket). Otherwise, the primary file points to the metadata document. In addition, a preview and a thumb of the image are normally also uploaded to Open Content.

Identification

The image is identified by a calculated (hashed) and unique file name (i.e. not the image's original name), e.g. “0GUpOh076jJ9hgbe47zH-gZ0sx4.jpg”.

The corresponding metadata document in Open Content is given a UUID (UUID v5) that is created from the URI given to the image file (x-im/image/{filename}), e.g. “9db2f917-d397-484a-8837-519f067ed4f2”. Because the UUID can be calculated from the image’s file name, the Digital Writer can avoid uploading duplicates of the same image by checking whether the image already exists in Open Content.
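The calculation can be sketched in Python as follows. The namespace the Digital Writer uses is not stated here, so `uuid.NAMESPACE_URL` is only a stand-in, and the helper name `image_uuid` is hypothetical. The sketch illustrates that a name-based (v5) UUID is deterministic; it will not necessarily reproduce the UUIDs actually found in Open Content.

```python
import uuid

# Assumption: the namespace actually used by the Digital Writer is not
# documented here; NAMESPACE_URL is a placeholder for illustration only.
ASSUMED_NAMESPACE = uuid.NAMESPACE_URL


def image_uuid(filename: str, namespace: uuid.UUID = ASSUMED_NAMESPACE) -> uuid.UUID:
    """Derive a name-based (v5) UUID from the URI built from the file name."""
    uri = f"x-im/image/{filename}"
    return uuid.uuid5(namespace, uri)


first = image_uuid("0GUpOh076jJ9hgbe47zH-gZ0sx4.jpg")
second = image_uuid("0GUpOh076jJ9hgbe47zH-gZ0sx4.jpg")
print(first == second)  # the same file name always yields the same UUID
print(first.version)    # 5
```

Because the result depends only on the namespace and the file name, a duplicate check needs no lookup table: compute the UUID and ask Open Content whether an object with that UUID exists.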

Metadata

An uploaded image is always paired with a metadata document in Infomaker NewsML format in Open Content. The Digital Writer extracts a subset of the metadata embedded in the image binary, as listed below.

| Labels | Fields | Mapped metadata fields |
|---|---|---|
| Description | ImageDescription, Caption-Abstract | xmp-dc:description, iptc:caption-abstract |
| Creator | Creator, By-line | xmp-dc:creator, iptc:by-line |
| Credit | Credit | xmp-photoshop:credit, iptc:credit |
| Instructions | Instructions, SpecialInstructions | xmp-photoshop:instructions, iptc:specialinstructions |
| Object name (original file name) | ObjectName, Title | xmp-dc:title, iptc:objectname |
| Photo date | DateCreated | xmp-photoshop:datecreated, iptc:datecreated |
| Source (not used in Writer as-is) | Credit | xmp-photoshop:source, iptc:source |

In the generated metadata document for the image, the file name is specified by the <fileName> element. The UUID is specified by the guid attribute of the <newsItem> element.

When an image is referenced (used) in an article, the file name is the last component of the uri attribute of the <link rel="self" type="x-im/image" …> element. The UUID is found in the uuid attribute of the <object type="x-im/image" …> element. For example:

<object id="dcc7c5fcf709" type="x-im/image" uuid="f845d7b8-40cb-545a-8069-36e21ff00908">
    <links>
        <link rel="self" type="x-im/image"
            uri="im://image/znX8U1C123JLDjlksdfgb40_jIka.jpeg">
        </link>
    </links>
</object>
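As an illustration, the UUID and file name could be read from such a fragment with Python's standard XML parser. This is a minimal sketch: the helper name `image_reference` is hypothetical, and it assumes the fragment carries no XML namespaces, which may not hold for a complete article document.

```python
import xml.etree.ElementTree as ET

ARTICLE_FRAGMENT = """
<object id="dcc7c5fcf709" type="x-im/image" uuid="f845d7b8-40cb-545a-8069-36e21ff00908">
    <links>
        <link rel="self" type="x-im/image"
            uri="im://image/znX8U1C123JLDjlksdfgb40_jIka.jpeg">
        </link>
    </links>
</object>
"""


def image_reference(object_xml: str) -> tuple[str, str]:
    """Return (uuid, file name) for an x-im/image object element."""
    obj = ET.fromstring(object_xml.strip())
    for link in obj.iter("link"):
        if link.get("rel") == "self" and link.get("type") == "x-im/image":
            # The file name is the last path component of the uri attribute.
            filename = link.attrib["uri"].rsplit("/", 1)[-1]
            return obj.attrib["uuid"], filename
    raise ValueError("no self link of type x-im/image found")


print(image_reference(ARTICLE_FRAGMENT))
```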

Add metadata when uploading an image

Metadata can be added to the image and its corresponding news item by supplying it as a query parameter on the URL of the image to upload (or on the file upload URL used). The query parameter name must be metadata, and its value must be a URI-encoded JSON string in the format shown below.

{
    "ImageDescription": "Hello World",
    "By-line": "John Doe"
}

The JSON above is then URI-encoded and passed as the query parameter, for example:

http://test.com?metadata=%7B%22ImageDescription%22%3A%22Hello%20World%22%2C%22By-line%22%3A%22John%20Doe%22%7D

Note that, as-is, the only metadata fields supported are those listed in the table above. Also note that for the Digital Writer to consume the metadata, the parameter must use the names from the "Fields" column, e.g. ImageDescription.
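The encoding step can be sketched in Python using only the standard library. The helper name `with_metadata` is hypothetical, and the set of accepted names is simply copied from the "Fields" column of the metadata table in this document; whether the Digital Writer rejects or silently ignores other names is not stated here.

```python
import json
from urllib.parse import quote

# "Fields" names taken from the metadata table in this document.
SUPPORTED_FIELDS = {
    "ImageDescription", "Caption-Abstract", "Creator", "By-line", "Credit",
    "Instructions", "SpecialInstructions", "ObjectName", "Title", "DateCreated",
}


def with_metadata(upload_url: str, metadata: dict[str, str]) -> str:
    """Append a URI-encoded JSON 'metadata' query parameter to upload_url."""
    unsupported = sorted(set(metadata) - SUPPORTED_FIELDS)
    if unsupported:
        raise ValueError(f"unsupported metadata fields: {unsupported}")
    # Compact separators keep the encoded value short.
    payload = json.dumps(metadata, separators=(",", ":"))
    separator = "&" if "?" in upload_url else "?"
    return f"{upload_url}{separator}metadata={quote(payload, safe='')}"


print(with_metadata(
    "http://test.com",
    {"ImageDescription": "Hello World", "By-line": "John Doe"},
))
# http://test.com?metadata=%7B%22ImageDescription%22%3A%22Hello%20World%22%2C%22By-line%22%3A%22John%20Doe%22%7D
```

Passing `safe=''` to `quote` ensures that every reserved character, including spaces and quotation marks, is percent-encoded.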

Access

There are several ways to access images that have been uploaded by the Digital Writer. Which one to use depends on your use case.

Image services

The Digital Writer uses either imgix or imengine to render images. If the chosen service is not available, it falls back to a signed URL to the preview version of the image on S3.

Note that to use imgix to access an image that has not yet been published, you need to create a “signed imgix URL” using the access token used in imgix for your Digital Writer. The signed URL must be created programmatically; several client libraries are available at https://docs.imgix.com/libraries.

In order to access a published image using imgix, you would use the file name as an identifier in addition to the imgix specifics. E.g. {protocol}://{imgixhostname}/1k5EJVdLeZKqTpKbl-ArSF-bna0.png.

When using imengine to access an image you would instead use the image’s UUID. E.g. {protocol}://{imenginehostname}/imengine/image.php?uuid=133059c9-70ba-5806-a843-318c1a00d433
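The two URL patterns above can be expressed as small Python helpers. This is only a sketch: the function names and the example hostnames are placeholders, and any imgix rendering parameters or URL signing are left out.

```python
from urllib.parse import quote, urlencode


def imgix_url(hostname: str, filename: str, protocol: str = "https") -> str:
    """URL for a published image via imgix, identified by file name."""
    return f"{protocol}://{hostname}/{quote(filename)}"


def imengine_url(hostname: str, image_uuid: str, protocol: str = "https") -> str:
    """URL for an image via imengine, identified by UUID."""
    query = urlencode({"uuid": image_uuid})
    return f"{protocol}://{hostname}/imengine/image.php?{query}"


# Hostnames below are placeholders, not real endpoints.
print(imgix_url("example.imgix.net", "1k5EJVdLeZKqTpKbl-ArSF-bna0.png"))
print(imengine_url("imengine.example.com", "133059c9-70ba-5806-a843-318c1a00d433"))
```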

Binary Service Lite

If you have Binary Service Lite configured for your Digital Writer, you can use its API to fetch a URL to the image. Binary Service Lite can also be used to upload images in the same way as the Digital Writer does.

S3

Published images are accessible using the S3 URL to the image: https://s3-{region}.amazonaws.com/{bucket}/{filename}

The “{region}” part of the URL is the AWS region in which the bucket resides. The image resides in the “root” of the bucket.

To access images that have not been published (the original image), you need to use a signed URL to the S3 bucket and object. The original image resides in the “root” of the S3 bucket. The preview has the prefix “preview/” and the thumb (if it exists) has the prefix “thumb/”.
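The layout described above can be summarised in a short Python sketch. The helper names, region, and bucket are hypothetical, and the sketch assumes that the preview and thumb keys are the prefix followed by the same file name as the original, which this page implies but does not state explicitly.

```python
def published_url(region: str, bucket: str, filename: str) -> str:
    """Public URL of a published image in the root of the public bucket."""
    return f"https://s3-{region}.amazonaws.com/{bucket}/{filename}"


def private_object_keys(filename: str) -> dict[str, str]:
    """Object keys in the non-public bucket (assumed: prefix + same file name)."""
    return {
        "original": filename,
        "preview": f"preview/{filename}",
        "thumb": f"thumb/{filename}",  # only present if thumbs are enabled
    }


# Region and bucket names are placeholders.
print(published_url("eu-west-1", "my-public-bucket", "0GUpOh076jJ9hgbe47zH-gZ0sx4.jpg"))
print(private_object_keys("0GUpOh076jJ9hgbe47zH-gZ0sx4.jpg")["preview"])
```

The keys returned by `private_object_keys` are what you would pass, together with the bucket name, to whichever AWS SDK you use to generate the signed URL.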

Open Content

By authenticating yourself, you can use the Open Content API to access the images using the UUID of the image.

Preview

{protocol}://{opencontenthostname}:{port}/opencontent/objects/{uuid}/files/preview

Thumb

{protocol}://{opencontenthostname}:{port}/opencontent/objects/{uuid}/files/thumb

Original (if stored in Open Content)

{protocol}://{opencontenthostname}:{port}/opencontent/objects/{uuid}
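The three Open Content URL patterns can be built with a single helper, sketched below in Python. The function name, hostname, and port are placeholders, and authentication is deliberately left out because the mechanism is not described on this page.

```python
from typing import Optional


def open_content_url(
    hostname: str,
    port: int,
    image_uuid: str,
    variant: Optional[str] = None,
    protocol: str = "https",
) -> str:
    """Build the Open Content URL for the original (variant=None), preview, or thumb."""
    base = f"{protocol}://{hostname}:{port}/opencontent/objects/{image_uuid}"
    if variant is None:
        return base  # original, if stored in Open Content
    if variant not in ("preview", "thumb"):
        raise ValueError(f"unknown variant: {variant!r}")
    return f"{base}/files/{variant}"


# Hostname and port are placeholders.
uuid_value = "133059c9-70ba-5806-a843-318c1a00d433"
print(open_content_url("oc.example.com", 8443, uuid_value, "preview"))
print(open_content_url("oc.example.com", 8443, uuid_value))
```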

Proxy Cluster

If you have a Proxy Cluster (locked down by IP) configured for your Open Content, you can use it to access your images. The URL to use is described below. The “protocol” is HTTP or HTTPS (configurable). The “oc-alias” identifies the Open Content used by the proxy. It follows the naming convention “{customer-id}-{oc-type}”, e.g. “im-editorial”.

Production environment

{protocol}://{oc-alias}.proxy.infomaker.io