# Add & Update dataset content

Use this endpoint to add new content to a dataset or update existing content, individually or as a batch. Content is sent as an array of objects, each containing the unique ID for the content, the URL from which its image can be retrieved, and optional metadata that can be used later in the Vision API for content filtering.

A dataset requires at least 200 items before it can be requested by the Vision API.

A note on downloading your images

We do not store/cache your images on our servers.

Every item added or updated in a dataset is processed: the image is downloaded from your server, processed and then deleted.
Every dataset is independent, so images are downloaded on each add or update (provided the image URL has changed), regardless of whether they are already in another dataset.

Checking content in a dataset

To check the content in a dataset, you can use list dataset content.

To check which content has been successfully processed and is now live, use the Vision API endpoint get content details (this also works from the client side).

The maximum content body length is 5MB. There is no limit on the number of content items per request.

The ID must be a unique string within the dataset.
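
The limits above can be checked before a request is sent. The following is a minimal sketch, not part of the API: the function name `validate_payload` is hypothetical, "5MB" is assumed to mean 5 * 1024 * 1024 bytes, and treating a repeated ID inside a single request as a problem is an assumption made here to catch accidental duplicates. The 100-character ID limit is taken from the Body table further down this page.

```python
import json

MAX_BODY_BYTES = 5 * 1024 * 1024  # assumption: "5MB" is interpreted as 5 MiB
MAX_ID_LENGTH = 100               # from the Body table on this page


def validate_payload(items):
    """Return a list of human-readable problems found in a list of content items."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        item_id = item.get("id")
        if not isinstance(item_id, str) or not item_id:
            problems.append(f"item {index}: id must be a non-empty string")
            continue
        if len(item_id) > MAX_ID_LENGTH:
            problems.append(f"item {index}: id is longer than {MAX_ID_LENGTH} characters")
        if item_id in seen_ids:
            problems.append(f"item {index}: duplicate id {item_id!r} in this request")
        seen_ids.add(item_id)

    # Measure the body exactly as it would be serialised for the request.
    body = json.dumps({"content": items}).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        problems.append(f"body is {len(body)} bytes, over the {MAX_BODY_BYTES} byte limit")
    return problems


if __name__ == "__main__":
    sample = [
        {"id": "my-id-1", "url": "https://example.com/my-content-1.jpg"},
        {"id": "my-id-1", "url": "https://example.com/my-content-2.jpg"},
    ]
    for problem in validate_payload(sample):
        print(problem)
```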


# Content Guide

# Which content (products) to include in a dataset

The recommendations are produced only for items in the dataset.

In order to get recommendations for content it must be in the dataset. Once you remove a product from the dataset, it will no longer be possible to query recommendations for that product.

# Out-of-stock products

We recommend keeping out-of-stock products in the dataset, rather than removing them when they are no longer available (as required by Google Merchant feeds). It is often the case that your customer is referred to the product page by affiliate links or other lists (such as Pinterest), so maintaining out-of-stock products can ensure they are served in-stock & relevant recommendations.

To serve in-stock recommendations for out-of-stock products, each product should carry a metadata field that identifies whether it is in stock (for example, a boolean field instock).

Recommending alternatives for an out-of-stock product then becomes as simple as querying q=instock:true, which ensures out-of-stock products are not returned in the results.
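
Keeping the instock flag current is a small metadata update per product. As a rough sketch, a body that only touches that one field could be built as below and sent with the PUT method described later on this page. The helper name `stock_update_body` and the sample IDs are illustrative only; the field name instock follows the example above.

```python
import json


def stock_update_body(stock_by_id):
    """Build a partial-update body that only sets the `instock` metadata field.

    `stock_by_id` maps a content ID to True (in stock) or False (out of stock).
    """
    return {
        "content": [
            {"id": content_id, "metadata": {"instock": bool(in_stock)}}
            for content_id, in_stock in stock_by_id.items()
        ]
    }


body = stock_update_body({"EX1-W": False, "EX1-B": True})

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```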

# Unique Content and placeholder image URLs

A dataset can have multiple content items with different IDs but the same image URL. This supports the case where some content doesn't yet have a permanent image, so a placeholder (default / generic) image is used instead. Be aware that having the same image multiple times in a dataset increases the chance of seeing that image multiple times in the results of a Vision API request.

# Image Resolution

Image resolution should be as close as possible to 256 pixels per edge, with a maximum of 1000 pixels. All aspect ratios are accepted, so there is no need to add padding, but consider cropping so that the product / object is clearly visible. For datasets with fashion-related content, you may get better results if the minimum image size is 300 pixels per edge.
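
These thresholds can be checked before content is submitted. The sketch below is only one reading of the guidance: it assumes the 1000 pixel maximum applies to the longer edge and the 300 pixel fashion minimum to the shorter edge, and the function name `check_resolution` is hypothetical. Width and height are passed in as plain numbers, so any image library can supply them.

```python
MAX_EDGE = 1000          # maximum pixels per edge, from the guidance above
FASHION_MIN_EDGE = 300   # suggested minimum for fashion-related datasets


def check_resolution(width, height, fashion=False):
    """Return a list of warnings for an image of the given pixel dimensions."""
    warnings = []
    if max(width, height) > MAX_EDGE:
        warnings.append(f"longer edge is {max(width, height)}px, above the {MAX_EDGE}px maximum")
    if fashion and min(width, height) < FASHION_MIN_EDGE:
        warnings.append(f"shorter edge is {min(width, height)}px, below the {FASHION_MIN_EDGE}px fashion minimum")
    return warnings


if __name__ == "__main__":
    for size in [(447, 300), (410, 143), (1200, 800)]:
        print(size, check_resolution(*size, fashion=True))
```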

# Image guide: Cropping & Homogeneity

| Resolution | Result | Reason |
| --- | --- | --- |
| 447 x 300 pixels | Good | Correct minimum dimensions for content matter (Fashion - see above); clear subject |
| 866 x 300 pixels | Good | Correct minimum height and under maximum with ratio; good cropping to show product |
| 907 x 300 pixels | Good | Correct minimum height and under maximum with ratio; good cropping to show product; simple & homogeneous backgrounds to other images in dataset |
| 427 x 424 pixels | Bad | Correct image dimensions; poor cropping |
| 426 x 422 pixels | Bad | Actual image is less than minimum resolution; padding added |
| 410 x 143 pixels | Bad | Height less than minimum dimension; good cropping to show product; poor quality image |
| 648 x 300 pixels | Bad | Correct image resolution; poor cropping showing multiple products, making identifying subject difficult |

# Metadata

The optional metadata field is a JSON object that accepts key values of the following types: string, number, boolean and simple arrays with values of type string or number. Although we don't check that the same keys have the same type for all content in a dataset, enforcing that on your side could avoid filtering issues when using the Vision API.
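
Since key types are not enforced across a dataset, a check on your side can catch problems before upload. The following is a minimal sketch of such a check, assuming the accepted types listed above; the function names are hypothetical and nothing here is part of the API.

```python
def is_supported_value(value):
    """True for string, number, boolean, or a flat list of strings/numbers."""
    if isinstance(value, (str, bool, int, float)):
        return True
    if isinstance(value, list):
        # Simple arrays only: every element must be a string or a number (not a boolean).
        return all(
            isinstance(v, (str, int, float)) and not isinstance(v, bool) for v in value
        )
    return False


def type_name(value):
    """Name the metadata type of a supported value."""
    if isinstance(value, bool):  # checked first because bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array"


def find_metadata_problems(items):
    """Report unsupported values and keys whose type differs between items."""
    problems = []
    first_seen = {}  # metadata key -> type name of its first occurrence
    for item in items:
        for key, value in item.get("metadata", {}).items():
            if not is_supported_value(value):
                problems.append(f"{item['id']}: {key} has unsupported value {value!r}")
                continue
            kind = type_name(value)
            expected = first_seen.setdefault(key, kind)
            if kind != expected:
                problems.append(f"{item['id']}: {key} is {kind}, but {expected} elsewhere")
    return problems


if __name__ == "__main__":
    sample = [
        {"id": "a", "metadata": {"price": 9.99, "tags": ["art", "print"]}},
        {"id": "b", "metadata": {"price": "9.99"}},
    ]
    print(find_metadata_problems(sample))
```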

# Deciding metadata to add

We recommend only adding metadata that is useful for your customer, filtering, or recommendation engine. The four questions below will help you decide what to include in the dataset:

# Is it useful for my customers?

The fields price and quantity are both good examples of fields a customer might be interested in and want to search by. Many internal fields, such as timestamps and other fields added by a CMS or e-commerce platform, are not relevant to the customer.

# Do I plan to filter by it?

Add fields you use to determine whether to return a product. These fields may hold internal information about products that decides whether they belong in the recommendations. Examples are:

  • vendor_name - may be used to restrict recommendations to the same vendor. Even though you may never display this internal information to a customer, it can help you to craft recommendations.
  • instock - as described under out-of-stock products above, it is used when you wish to recommend alternatives that are currently available in the inventory.

# Does it achieve business needs?

Add fields based on your business insights or goals. These fields may hold internal information derived from your business operations. Examples are:

  • vendor_ranking - may be used both for filtering and for ordering recommendations. Even though you may never display this internal ranking to a customer, it can serve your business case.
  • product_location - could be used to determine which products to show a customer based on their location, to minimise shipping costs and taxes.

# Is it useful to describe a product?

Add fields that describe the product. These fields might hold internal information on products as well as information you display to the customer. They can be used for all the previous use cases, such as filtering and business relevance, but most importantly they are used by the recommendation engine, which relies on the metadata you provide to learn about product characteristics. The quality and richness of this information affect the quality of recommendations. Although we check your data before using it in our recommendation system, poorly curated data often cannot be fixed without your input, and is then discarded rather than used to inform better recommendations.

Examples are:

  • Description: textual information on an item
  • Colour: e.g. blue or hex number
  • Material: e.g. wood, metal
  • Descriptive tags: hand-made, vintage, classic, modern
  • Product details: corner sofa, acrylic/oil/watercolour painting
  • Category: side tables, t-shirts

# Preparing your metadata

Once you've determined which metadata fields should be included, you can begin to prepare your metadata. Please consider the different uses of your content and familiarize yourself with Vision API's queries and filters. Spending time preparing your metadata for querying now will save you time later.

It's best to follow these steps when deciding what to do with metadata:

  • Can the metadata be simplified, replaced or combined?
    • See Variants below for other reasons to combine fields.
    • Example: A rule requires several fields to be true together (quickShip:true, available:true and published:true) for an item to be returned in results. Consider adding a single boolean field displayInResults and removing any fields that are no longer important (in this example available and published).
    • Example: There are many values in tags (an array field) but only a few values are important for querying. Either add a new field for each important tag, or remove unnecessary tags and enable filtering on that list field.
  • Any field that no longer serves a purpose, likely because it has been combined, can be removed.
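
The first example above can be sketched as a small transformation applied to each item's metadata before upload. The helper name `simplify_flags` and its defaults are illustrative only; the field names are those used in the example.

```python
def simplify_flags(
    metadata,
    flags=("quickShip", "available", "published"),
    drop=("available", "published"),
):
    """Collapse several boolean flags into a single displayInResults field.

    displayInResults is True only when every flag in `flags` is True.
    Fields listed in `drop` are removed from the returned copy.
    """
    result = dict(metadata)  # work on a copy so the input is left untouched
    result["displayInResults"] = all(metadata.get(flag) is True for flag in flags)
    for name in drop:
        result.pop(name, None)
    return result


if __name__ == "__main__":
    before = {"quickShip": True, "available": True, "published": False, "price": 10}
    print(simplify_flags(before))
```

A query can then filter on the one combined field instead of three separate ones.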

Reserved fields

text is a reserved field, used by Search By Text functionality.

# Automatic metadata

Upon processing an image, the Visii API adds image_width, image_height and image_url to metadata for use by your UI. Request these values with the fields parameter in the Vision API.

# Variants

Many products have options for a customer to choose before purchasing. These often differ in metadata, such as price, size and color.

There are two common scenarios that require a different method of adding content to the dataset:

# Scenario: An image per variant

You have an image that is visually different per variant. For example, you have content with 3 variations in color and 3 images that represent these colors. For each variant:

  • add a unique id and url
  • add the field parent to make it possible to identify this is a variation with a shared parent

Removing variants from results

When querying the Vision API you can use id=<id>&q=parent:!<parent> to exclude from the results all variants that have <parent> as a parent.

{
  "content": [
    {
      "id": "EX1-W",
      "url": "https://example.com/my-content-1-white.jpg",
      "metadata": {
        "available": true,
        "color": "white",
        "parent": "EX1",
        "price": 99.99
      }
    },
    {
      "id": "EX1-B",
      "url": "https://example.com/my-content-1-blue.jpg",
      "metadata": {
        "available": true,
        "color": "blue",
        "parent": "EX1",
        "price": 99.99
      }
    },
    {
      "id": "EX1-R",
      "url": "https://example.com/my-content-1-red.jpg",
      "metadata": {
        "available": true,
        "color": "red",
        "parent": "EX1",
        "price": 99.99
      }
    }
  ]
}
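
For the variants above, the exclusion query could be assembled as in the sketch below. It only builds the query string in the id=<id>&q=parent:!<parent> form shown on this page; the function name is hypothetical, and joining several filters with a comma follows the q=price:20:,maxPrice:20:50 pattern used later in this guide. Leaving `:`, `!` and `,` unescaped is a readability choice made here, not a documented requirement.

```python
from urllib.parse import urlencode


def similar_without_variants(content_id, parent, extra_filters=()):
    """Build a query string that excludes every variant sharing `parent`."""
    filters = [f"parent:!{parent}", *extra_filters]
    # safe=":!," keeps the filter syntax readable instead of percent-encoding it.
    return urlencode({"id": content_id, "q": ",".join(filters)}, safe=":!,")


if __name__ == "__main__":
    print(similar_without_variants("EX1-W", "EX1"))
    print(similar_without_variants("EX1-W", "EX1", ["available:true"]))
```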

# Scenario: A single image for all variants but different metadata

If the difference is not represented visually between variants (or you only have a single image), you can use metadata fields to represent information about the variants.

For example, you have 3 variants with different sizes & prices, but only 1 image to represent them - such as Child & Adult sizes.

# Preparing your metadata

Modify the metadata you will use for querying: fields used for filtering should be added, combined or removed so that they cover all variant values.

  • Add a field: when a number that is queried by range differs per variant, you need a new field such as maxPrice. For example, price is usually queried as q=price:20:50; with multiple variant prices, you would pair it with a new field maxPrice. This allows variants to be returned within the range: q=price:20:,maxPrice:20:50.
  • Combine fields: when filtering by an important variant option, such as size, combine all variants in an array field sizes. You can also combine information you'd like to display, such as individual prices.
  • Remove a field: when a per-variant filter, such as available, is already captured by the added or combined fields.
# Examples

Taking these example variants - note their metadata:

[
  {
    "id": "EX1-C",
    "url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
    "metadata": {
      "available": true,
      "size": "child",
      "price": 25.99
    }
  },
  {
    "id": "EX1-T",
    "url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
    "metadata": {
      "available": false,
      "size": "teen",
      "price": 39.99
    }
  },
  {
    "id": "EX1-A",
    "url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
    "metadata": {
      "available": true,
      "size": "adult",
      "price": 49.99
    }
  }
]

Would become:

  • Added: maxPrice: the most expensive variant price
  • Combined: sizes: containing the variant options that are currently available.
  • Removed: available: as the presence of a size in sizes now indicates that it is available.
{
  "content": [
    {
      "id": "EX1",
      "url": "https://example.com/my-content-1-generic.jpg",
      "metadata": {
        "maxPrice": 49.99, // the most expensive variant price
        "price": 25.99, // the lowest variant price
        "sizes": ["child","adult"], // used for filtering: q=sizes:child
        "prices": [25.99,49.99] // for display purposes
      }
    }
  ]
}

When the size teen becomes available, add it to sizes (and its price to prices). A query to the API for q=sizes:teen will now include this product.

{
  "content": [
    {
      "id": "EX1",
      "url": "https://example.com/my-content-1-generic.jpg",
      "metadata": {
        "maxPrice": 49.99, // the most expensive variant price
        "price": 25.99, // the lowest variant price
        "sizes": ["child","teen","adult"], // used for filtering: q=sizes:teen
        "prices": [25.99,39.99,49.99] // for display purposes
      }
    }
  ]
 }
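
The add, combine and remove steps above can be expressed as one merge function. This is a sketch under the assumptions of the example (every variant has size, price and available in its metadata, and unavailable variants are left out of the combined fields); the function name `merge_variants` is hypothetical.

```python
def merge_variants(parent_id, url, variants):
    """Merge per-variant items that share one image into a single content item."""
    available = [v["metadata"] for v in variants if v["metadata"].get("available")]
    if not available:
        raise ValueError("no available variants to merge")
    prices = [m["price"] for m in available]  # kept in the same order as sizes
    return {
        "id": parent_id,
        "url": url,
        "metadata": {
            "maxPrice": max(prices),                   # added: most expensive variant
            "price": min(prices),                      # lowest variant price
            "sizes": [m["size"] for m in available],   # combined: used for filtering
            "prices": prices,                          # combined: for display purposes
        },
    }


if __name__ == "__main__":
    image = "https://example.com/my-content-1-generic.jpg"
    variants = [
        {"id": "EX1-C", "url": image, "metadata": {"available": True, "size": "child", "price": 25.99}},
        {"id": "EX1-T", "url": image, "metadata": {"available": False, "size": "teen", "price": 39.99}},
        {"id": "EX1-A", "url": image, "metadata": {"available": True, "size": "adult", "price": 49.99}},
    ]
    print(merge_variants("EX1", image, variants))
```

Re-running the merge whenever a variant's availability changes keeps sizes and prices in step with each other.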

# Private Metadata

Sometimes items have additional attributes that you don't want to make public. They are useful when using Visii's suite of products, but the Vision API does not need to return them in its responses. This can happen for:

  • Individually curated scores / ratings
  • Longer text descriptions that are useful to improve text search results but don't need to be retrieved for display purposes
  • Internal identifiers / tags / classifications that are useful when using Visii's Marketing or Catalogue products

This information can be passed using a privateMetadata attribute for each item, having the same format and constraints as the metadata attribute.

Access to private metadata is restricted

Private metadata fields cannot be retrieved when using the fields parameter in Vision API requests.

# POST Example

{
  "content": [
    {
      "id": "my-id-1",
      "url": "https://example.com/my-content-1.jpg",
      "privateMetadata": {
        "curation_score": 9.5,
        "marketing_tags": ["campaign1", "campaign2"]
      }
    },
    {
      "id": "my-id-2",
      "url": "https://example.com/my-content-2.jpg",
      "metadata": {
        "category": "art",
        "quantity": 10,
        "price": 99.99,
        "available": true
      },
      "privateMetadata": {
        "curation_score": 7.5,
        "all_time_revenue": 10400,
        "third_party_identifier": "abcd"
      }
    }
  ]
}

# Input

# Add new items

To add new content, use the POST request method. For each content item, the id and url parameters are required. The metadata field is optional, but if it is present it must contain every field you want to keep: when an item already exists, it is updated and its previous metadata is overwritten.

# Partially update items

To partially update existing items, use the PUT request method. The url is optional; when present, it replaces the previous value. The metadata field is also optional; when present, each field it contains is added or overrides the existing value.

When using PUT is useful

For smaller updates, using PUT reduces the request body size and allows more updates in a single request (given the 5MB body limit).
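
When an update does not fit in one request, the items can be split into several bodies that each stay under the limit. The sketch below is one way to do that, assuming "5MB" means 5 * 1024 * 1024 bytes and that bodies are serialised with Python's default `json.dumps` settings; the function name `batch_content` is hypothetical.

```python
import json

MAX_BODY_BYTES = 5 * 1024 * 1024  # assumption: "5MB" is interpreted as 5 MiB


def batch_content(items, max_bytes=MAX_BODY_BYTES):
    """Yield lists of items whose {"content": [...]} body fits within max_bytes."""
    wrapper_size = len(json.dumps({"content": []}).encode("utf-8"))
    batch, size = [], wrapper_size
    for item in items:
        # +2 covers the ", " separator; a slight overestimate for the first item.
        item_size = len(json.dumps(item).encode("utf-8")) + 2
        if wrapper_size + item_size > max_bytes:
            raise ValueError(f"item {item.get('id')!r} alone exceeds the body limit")
        if batch and size + item_size > max_bytes:
            yield batch
            batch, size = [], wrapper_size
        batch.append(item)
        size += item_size
    if batch:
        yield batch


if __name__ == "__main__":
    updates = [{"id": f"item-{n}", "metadata": {"quantity": n}} for n in range(10)]
    for body_items in batch_content(updates, max_bytes=120):
        print(len(json.dumps({"content": body_items})), [i["id"] for i in body_items])
```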

Endpoint

POST /orgs/:organisation/datasets/:dataset/content

| Field | Type | Required | Value | Description |
| --- | --- | --- | --- | --- |
| organisation | String | yes | | The organisation name |
| dataset | String | yes | | The dataset name |

# Body

| Field | Type | Required | Value | Description |
| --- | --- | --- | --- | --- |
| content | Object | yes | | The content to add or update in the dataset |
| content.$.id | String | yes | | The unique ID for the content. Maximum length is 100 characters. |
| content.$.url | String | yes | | An RFC 1738 compliant URL from which the image for the content can be retrieved. Supported image formats are jpeg, jpg and png. |
| content.$.metadata | Object | no | | Content metadata information (category, price, quantity, availability, etc.) |
| content.$.privateMetadata | Object | no | | Private metadata information |

# POST Example

{
  "content": [
    {
      "id": "my-id-1",
      "url": "https://example.com/my-content-1.jpg"
    },
    {
      "id": "my-id-2",
      "url": "https://example.com/my-content-2.jpg",
      "metadata": {
        "category": "art",
        "quantity": 10,
        "price": 99.99,
        "available": true
      }
    },
    {
      "id": "my-id-3",
      "url": "https://example.com/my-content-3.jpg",
      "metadata": {
        "category": "drawings",
        "available": false
      }
    }
  ]
}

# PUT Example

{
  "content": [
    {
      "id": "my-id-1",
      "url": "https://example.com/my-content-1-new.jpg"
    },
    {
      "id": "my-id-2",
      "metadata": {
        "quantity": 9
      }
    },
    {
      "id": "my-id-3",
      "url": "https://example.com/my-content-3-new.jpg",
      "metadata": {
        "available": true
      }
    }
  ]
}

# Request

curl -X POST \
     -H "Authorization: token my-org-api-token" \
     -H "Content-Type: application/json" \
     -H "Accept: application/vnd.visii.v2+json" \
     -d '{"content":[{"id":"123","url": "http://example.com/image1.jpg"},{"id":"345","url":"http://example.com/image2.jpg"}]}' \
     "https://api.visii.com/orgs/my-org/datasets/my-dataset/content"
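
For reference, the same request can be prepared with Python's standard library. This is a sketch rather than an official client: the helper name `build_request` is hypothetical, the URL and headers are copied from the curl command above, and using the same path for PUT is an assumption based on this page describing both methods under one endpoint. The request is only built here; the commented lines show how it could be sent.

```python
import json
import urllib.request


def build_request(org, dataset, token, items, method="POST"):
    """Prepare an add/update request for the given organisation and dataset."""
    url = f"https://api.visii.com/orgs/{org}/datasets/{dataset}/content"
    body = json.dumps({"content": items}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
            "Accept": "application/vnd.visii.v2+json",
        },
    )


request = build_request(
    "my-org",
    "my-dataset",
    "my-org-api-token",
    [
        {"id": "123", "url": "http://example.com/image1.jpg"},
        {"id": "345", "url": "http://example.com/image2.jpg"},
    ],
)

# To send the request (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     print(response.status, json.load(response))
```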

# Response

| Field | Type | Value | Description |
| --- | --- | --- | --- |
| status | String | | The status of the response |

HTTP/1.1 202 Accepted
{
  "status": "accepted"
}
Last Updated: 11/28/2023, 7:18:44 PM