# Add & Update dataset content

With this endpoint, new content can be added to a dataset or existing content updated, individually or as a batch. Content is sent as an array of objects, each containing the unique ID for the content, the URL from which it can be retrieved and optional metadata that can be used later in the Vision API for content filtering.

Why both Add & Update?

This endpoint is used for both adding and updating (replacing) dataset content. Traditionally these are handled by separate POST and PUT methods respectively; however, we chose to simplify batch operations by using only POST. This avoids requiring a check for the existence of content every time before sending updates.

Therefore, because this endpoint supports batches, you can make a single request to add and/or update 1,000 units (some new, some existing), instead of making multiple requests that first check whether each item exists before deciding which endpoint to use.
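As a rough illustration of the single-request approach, the Python sketch below builds one request body that mixes items that may already exist with items that may be new. The helper name `build_upsert_body` and the `sku-*` IDs are hypothetical; only the `{"content": [...]}` shape comes from this page.

```python
import json


def build_upsert_body(items):
    """Serialise content items into one body for the add/update endpoint.

    No existence check is made: new and existing IDs travel together.
    """
    return json.dumps({"content": items})


# "sku-1" might already be in the dataset and "sku-2" might be new;
# both are sent in the same POST body.
body = build_upsert_body([
    {"id": "sku-1", "url": "https://example.com/sku-1.jpg", "metadata": {"price": 19.99}},
    {"id": "sku-2", "url": "https://example.com/sku-2.jpg"},
])
```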

Checking content in a dataset

To check the content in a dataset, use the list dataset content endpoint.

To check which content has been successfully processed and is now live, use the Vision API endpoint get content details (this can be called even from the client side).

The maximum content body length is 1MB. There is no limit on the number of content items per request.

The ID must be a unique string within the dataset.
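The two constraints above can be checked before sending. The following Python sketch splits a large list of items into batches that each fit within the body limit and reports repeated IDs. It assumes that "1MB" can be read conservatively as 1,000,000 bytes and that the body is sent exactly as `json.dumps` produces it; the function names are hypothetical.

```python
import json

MAX_BODY_BYTES = 1_000_000  # assumption: "1MB" read conservatively as 10**6 bytes


def body_size(items):
    """Size in bytes of the UTF-8 encoded JSON body for these items."""
    return len(json.dumps({"content": items}).encode("utf-8"))


def find_duplicate_ids(items):
    """Return the IDs that appear more than once, in first-seen order."""
    seen, duplicates = set(), []
    for item in items:
        if item["id"] in seen and item["id"] not in duplicates:
            duplicates.append(item["id"])
        seen.add(item["id"])
    return duplicates


def split_into_batches(items, max_bytes=MAX_BODY_BYTES):
    """Yield lists of items whose encoded body stays within max_bytes.

    The batch is re-serialised on every step, which is simple but quadratic;
    a production version would track sizes incrementally.
    """
    batch = []
    for item in items:
        if body_size([item]) > max_bytes:
            raise ValueError(f"item {item['id']!r} alone exceeds the body limit")
        if batch and body_size(batch + [item]) > max_bytes:
            yield batch
            batch = []
        batch.append(item)
    if batch:
        yield batch
```

Each yielded batch can then be sent as its own request.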


# Content Guide

# Which content to add

For recommendations to be shown for content, it must be in the dataset. Once you remove content, it will no longer be possible to query it for recommendations.

A common example of this is out-of-stock products. Rather than removing such a product from the dataset, give it a metadata field that identifies whether it is in stock (for example a boolean field instock).

Hiding out-of-stock products from recommendations then becomes as simple as querying q=instock:true, which ensures out-of-stock products are not returned in results.
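A minimal Python sketch of this pattern follows. It assumes the boolean field is called `instock`, as in the example above; the helper `mark_stock` and the `sku-1` item are hypothetical, and only the `q=instock:true` filter comes from this page.

```python
from urllib.parse import urlencode


def mark_stock(item, in_stock):
    """Return a copy of item with the boolean metadata field `instock` set."""
    metadata = dict(item.get("metadata", {}))
    metadata["instock"] = in_stock
    return {**item, "metadata": metadata}


# Keep the product in the dataset, but flag it as out of stock.
sold_out = mark_stock({"id": "sku-1", "url": "https://example.com/sku-1.jpg"}, False)

# Query-string fragment for a Vision API request that keeps only in-stock content.
# safe=":" leaves the colon unescaped so the filter stays readable.
query = urlencode({"q": "instock:true"}, safe=":")
```

Re-sending `sold_out` through this endpoint updates the existing item rather than removing it.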

# Unique Content and placeholder image URLs

A dataset can have multiple content items with different IDs but the same image URL. This allows for the case where some content doesn't yet have a permanent image, so a placeholder (default / generic) image is used instead. Having the same image multiple times in a dataset will increase the chance of seeing the image multiple times when making a Vision API request.

# Image Resolution

Image resolution should be as close as possible to 256 pixels per edge, with a maximum of 1000 pixels. All ratios are accepted, so there is no need to add padding, but consider cropping so that the product / object is clearly visible. For datasets with fashion-related content, you may get better results if the minimum image size is 300 pixels per edge.
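These guidelines can be turned into a simple pre-upload check. The Python sketch below works on plain width and height values (it does not read image files) and treats the 1000 pixel maximum and the 300 pixel fashion minimum as per-edge limits, which is an interpretation of the text above. The function name is hypothetical.

```python
MAX_EDGE = 1000          # maximum pixels per edge, from the guide above
FASHION_MIN_EDGE = 300   # suggested minimum per edge for fashion content


def resolution_issues(width, height, fashion=False):
    """Return a list of human-readable problems; an empty list means none found."""
    issues = []
    if max(width, height) > MAX_EDGE:
        issues.append(f"longest edge {max(width, height)}px exceeds {MAX_EDGE}px")
    if fashion and min(width, height) < FASHION_MIN_EDGE:
        issues.append(f"shortest edge {min(width, height)}px is below {FASHION_MIN_EDGE}px")
    return issues
```

The check covers dimensions only; cropping and background quality still need a visual review.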

# Image guide: Cropping & Homogeneity

| Resolution | Result | Reason |
|---|---|---|
| 447 x 300 pixels | Good | Correct minimum dimensions for content matter (Fashion - see above); clear subject |
| 866 x 300 pixels | Good | Correct minimum height and under maximum with ratio; good cropping to show product |
| 907 x 300 pixels | Good | Correct minimum height and under maximum with ratio; good cropping to show product; simple & homogeneous backgrounds to other images in dataset |
| 427 x 424 pixels | Bad | Correct image dimensions; poor cropping |
| 426 x 422 pixels | Bad | Actual image is less than minimum resolution; padding added |
| 410 x 143 pixels | Bad | Height less than minimum dimension; good cropping to show product; poor quality image |
| 648 x 300 pixels | Bad | Correct image resolution; poor cropping showing multiple products, making identifying subject difficult |

# Metadata

The optional metadata field is a JSON object that accepts key values of the following types: string, number, boolean and simple arrays with values of type string or number. Although we don't check that the same keys have the same type for all content in a dataset, enforcing that on your side could avoid filtering issues when using the Vision API.
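Since type consistency is not enforced by the API, a client-side check can help. The Python sketch below validates values against the accepted types listed above and reports keys whose type differs between items. All function names are hypothetical, and arrays mixing strings and numbers are accepted here, which is an assumption.

```python
def is_valid_value(value):
    """True for a string, number, boolean, or a flat list of strings or numbers."""
    if isinstance(value, (str, bool, int, float)):
        return True
    if isinstance(value, list):
        # bool is a subclass of int in Python, so exclude it explicitly inside arrays.
        return all(
            isinstance(v, (str, int, float)) and not isinstance(v, bool) for v in value
        )
    return False


def type_name(value):
    """Map a Python value to the JSON type names used in this guide."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return type(value).__name__


def inconsistent_keys(items):
    """Return the metadata keys that appear with more than one type across items."""
    types_by_key = {}
    for item in items:
        for key, value in item.get("metadata", {}).items():
            types_by_key.setdefault(key, set()).add(type_name(value))
    return sorted(key for key, types in types_by_key.items() if len(types) > 1)
```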

# Deciding metadata to add

We recommend only adding metadata that is useful for your customers or filtering:

  • Is it useful for my customer? price and quantity are both good examples of fields a customer might query by or wish to see. Many internal fields, like timestamps and other fields added by CMS or e-commerce platforms, are not.
  • Do I plan to filter by it? Add fields you use to determine whether to return a product, such as vendor_ranking, which you may filter or order by even though you may never display it to a customer.

# Preparing your metadata

Once you've determined which metadata fields should be included, you can begin to prepare your metadata. Please consider the different uses of your content and familiarize yourself with Vision API's queries and filters. Spending time preparing your metadata for querying now will save you time later.

It's best to follow these steps when deciding what to do with metadata:

  • Can the metadata be simplified, replaced or combined?
    • See Variants below for other reasons to combine fields.
    • Example: A rule requires several fields together (quickShip:true, available:true and published:true) before content is returned in results. Consider adding a single boolean field displayInResults and removing any fields that are no longer important (in this example available and published).
    • Example: There are many values in tags (an array field) but only a few values are important for querying. Either add a new field for each important tag, or remove unnecessary tags and enable filtering on that list field.
  • Any field that no longer serves a purpose, likely due to being combined, can be removed.
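The first example in the steps above can be sketched in Python as follows. The field names come from that example; the helper name `simplify_flags` is hypothetical, and treating a missing flag as false is an assumption.

```python
def simplify_flags(metadata):
    """Replace the three visibility flags with one `displayInResults` boolean.

    `available` and `published` are dropped afterwards, as suggested above;
    `quickShip` is kept.
    """
    combined = all(
        metadata.get(flag) is True for flag in ("quickShip", "available", "published")
    )
    simplified = {
        key: value
        for key, value in metadata.items()
        if key not in ("available", "published")
    }
    simplified["displayInResults"] = combined
    return simplified
```

A query can then filter on the single field instead of joining three conditions.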

Reserved fields

text is a reserved field, used by Search By Text functionality.

# Automatic metadata

Upon processing an image, the Visii API adds image_width, image_height and image_url to metadata for use by your UI. Request these values with the fields parameter in the Vision API.

# Variants

Many products have options for a customer to choose before purchasing. These often differ in metadata, such as price, size and color.

There are two common scenarios that require a different method of adding content to the dataset:

# Scenario: An image per variant

You have an image that is visually different per variant. For example: you have content with 3 variations in color and 3 images that represent these colors.

  • add a unique id and url
  • add the field parent to make it possible to identify this is a variation with a shared parent

Removing variants from results

When querying the Vision API you can use id=<id>&q=parent:!<parent> to exclude from the results all variants that have <parent> as a parent.

{
  "content": [
    {
      "id": "EX1-W",
      "url": "https://example.com/my-content-1-white.jpg",
      "metadata": {
        "available": true,
        "color": "white",
        "parent": "EX1",
        "price": 99.99
      }
    },
    {
      "id": "EX1-B",
      "url": "https://example.com/my-content-1-blue.jpg",
      "metadata": {
        "available": true,
        "color": "blue",
        "parent": "EX1",
        "price": 99.99
      }
    },
    {
      "id": "EX1-R",
      "url": "https://example.com/my-content-1-red.jpg",
      "metadata": {
        "available": true,
        "color": "red",
        "parent": "EX1",
        "price": 99.99
      }
    }
  ]
}
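A payload like the one above can be generated from a list of variants. The Python sketch below builds one content item per color with a shared parent field, plus the query string described under "Removing variants from results". The function names and the ID scheme (parent ID plus a suffix) are assumptions for illustration.

```python
from urllib.parse import urlencode


def variant_items(parent_id, variants, shared_metadata):
    """Build one content item per (suffix, color, url) variant with a shared parent."""
    return [
        {
            "id": f"{parent_id}-{suffix}",
            "url": url,
            "metadata": {**shared_metadata, "color": color, "parent": parent_id},
        }
        for suffix, color, url in variants
    ]


def exclude_siblings_query(content_id, parent_id):
    """Query string for results similar to content_id that omit its own variants."""
    # safe=":!" keeps the filter syntax readable instead of percent-encoding it.
    return urlencode({"id": content_id, "q": f"parent:!{parent_id}"}, safe=":!")


items = variant_items(
    "EX1",
    [
        ("W", "white", "https://example.com/my-content-1-white.jpg"),
        ("B", "blue", "https://example.com/my-content-1-blue.jpg"),
        ("R", "red", "https://example.com/my-content-1-red.jpg"),
    ],
    {"available": True, "price": 99.99},
)
```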

# Scenario: A single image for all variants but different metadata

If the difference is not represented visually between variants (or you only have a single image), you can use metadata fields to represent information about the variants.

For example, you have 3 variants with different sizes & prices, but only 1 image to represent them - such as Child & Adult sizes.

# Preparing your metadata

Modify the metadata you will use for querying: fields used for filtering should be added, combined or removed so that they support all variant values.

  • Add a field: when dealing with numbers that need to be queried by a range, which differ per variant, you need a new field maxPrice. For example, price is usually queried q=price:20:50, however with multiple variant prices, you would pair it with a new field maxPrice. This allows variants to be returned within the range: q=price:20:,maxPrice:20:50.
  • Combine fields: when filtering by an important variant option, such as size, combine all variants in an array field sizes. You can also combine information you'd like to display, such as individual prices.
  • Remove a field: when a per-variant filter, such as available, is already captured by the first two steps.

# Examples

Taking these example variants - note their metadata:

[
  {
    "id": "EX1-C",
    "url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
    "metadata": {
      "available": true,
      "size": "child",
      "price": 25.99
    }
  },
  {
    "id": "EX1-T",
    "url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
    "metadata": {
      "available": false,
      "size": "teen",
      "price": 39.99
    }
  },
  {
    "id": "EX1-A",
    "url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
    "metadata": {
      "available": true,
      "size": "adult",
      "price": 49.99
    }
  }
]

Would become:

  • Added: maxPrice: the most expensive variant price.
  • Combined: sizes: contains the options of the variants that are currently available.
  • Removed: available: the presence of an option in sizes now indicates its availability.

{
  "content": [
    {
      "id": "EX1",
      "url": "https://example.com/my-content-1-generic.jpg",
      "metadata": {
        "maxPrice": 49.99, // the most expensive variant price
        "price": 25.99, // the lowest variant price
        "sizes": ["child","adult"], // used for filtering: q=sizes:child
        "prices": [25.99,49.99] // for display purposes
      }
    }
  ]
}

When the size teen becomes available, add it to sizes. A query to the API for q=sizes:teen will now include this product.

{
  "content": [
    {
      "id": "EX1",
      "url": "https://example.com/my-content-1-generic.jpg",
      "metadata": {
        "maxPrice": 49.99, // the most expensive variant price
        "price": 25.99, // the lowest variant price
        "sizes": ["child","teen","adult"], // used for filtering: q=sizes:teen
        "prices": [25.99,39.99,49.99] // for display purposes
      }
    }
  ]
}
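The merge shown in these examples can be automated. The Python sketch below collapses per-size variants into a single content item, letting only available variants contribute to sizes, prices, price and maxPrice, which mirrors the examples above. The function name and the simplified variant shape (size, price, available) are assumptions.

```python
def merge_variants(parent_id, url, variants):
    """Collapse per-size variants into one content item.

    Only variants with available == True are represented in the result.
    """
    available = [variant for variant in variants if variant["available"]]
    if not available:
        raise ValueError("at least one variant must be available")
    prices = sorted(variant["price"] for variant in available)
    return {
        "id": parent_id,
        "url": url,
        "metadata": {
            "price": prices[0],       # the lowest variant price
            "maxPrice": prices[-1],   # the most expensive variant price
            "sizes": [variant["size"] for variant in available],
            "prices": prices,         # for display purposes
        },
    }
```

When a variant becomes available again, re-running the merge and re-sending the item through this endpoint replaces the stored metadata.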

# Input

Endpoint

POST /orgs/:organisation/datasets/:dataset/content

| Field | Type | Required | Value | Description |
|---|---|---|---|---|
| organisation | String | yes | | The organisation name |
| dataset | String | yes | | The dataset name |

# Body

| Field | Type | Required | Value | Description |
|---|---|---|---|---|
| content | Array | yes | | The content items (an array of objects) to add or update in the dataset |
| content.$.id | String | yes | | The unique ID for the content. Maximum length is 255 characters. |
| content.$.url | String | yes | | An RFC 1738 compliant URL from which the image for the content can be retrieved. Supported image formats are jpeg, jpg and png. |
| content.$.metadata | Object | no | | Content metadata information (category, price, quantity, availability, etc.) |

# Example

{
  "content": [
    {
      "id": "my-id-1",
      "url": "https://example.com/my-content-1.jpg"
    },
    {
      "id": "my-id-2",
      "url": "https://example.com/my-content-2.jpg",
      "metadata": {
        "category": "art",
        "quantity": 10,
        "price": 99.99,
        "available": true
      }
    },
    {
      "id": "my-id-3",
      "url": "https://example.com/my-content-3.jpg",
      "metadata": {
        "category": "drawings",
        "available": false
      }
    }
  ]
}

# Request

curl -X POST \
     -H "Authorization: token my-org-api-token" \
     -H "Content-Type: application/json" \
     -H "Accept: application/vnd.visii.v2+json" \
     -d '{"content":[{"id":"123","url": "http://example.com/image1.jpg"},{"id":"345","url":"http://example.com/image2.jpg"}]}' \
     "https://api.visii.com/orgs/my-org/datasets/my-dataset/content"

# Response

| Field | Type | Value | Description |
|---|---|---|---|
| status | String | | The status of the response |

HTTP/1.1 202 Accepted
{
  "status": "accepted"
}
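For reference, the curl request and the response above can be reproduced from Python using only the standard library. This is a sketch rather than an official client: the function names are hypothetical, while the host, path, headers and expected 202 response are taken from the examples on this page.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://api.visii.com"  # host taken from the curl example


def build_request(org, dataset, token, items):
    """Build (but do not send) the POST request for the add/update endpoint."""
    path = f"/orgs/{quote(org, safe='')}/datasets/{quote(dataset, safe='')}/content"
    return urllib.request.Request(
        url=API_BASE + path,
        data=json.dumps({"content": items}).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
            "Accept": "application/vnd.visii.v2+json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return (HTTP status code, decoded JSON body)."""
    with urllib.request.urlopen(request) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


request = build_request(
    "my-org",
    "my-dataset",
    "my-org-api-token",
    [
        {"id": "123", "url": "http://example.com/image1.jpg"},
        {"id": "345", "url": "http://example.com/image2.jpg"},
    ],
)
# status, payload = send(request)  # expected: 202 and {"status": "accepted"}
```

The 202 status indicates that the content has been accepted for processing, so it may not be live immediately.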
Last Updated: 9/11/2020, 3:06:21 PM