# Add & Update dataset content
New content can be added to a dataset, or existing content updated, individually or as a batch with this endpoint. Content is sent as an array of objects, each of which contains the unique ID for the content, the URL from which it can be retrieved, and optional metadata that can be used later in the Vision API for content filtering.
A dataset requires at least 200 items before it can be requested by the Vision API.
A note on downloading your images
We do not store/cache your images on our servers.
Every item added or updated in a dataset will be processed, meaning that the image will be downloaded from your server each time, processed and then deleted.
Each dataset is independent, so images are downloaded on every add or update (so long as the image URL has changed), regardless of whether they are already in another dataset.
Checking content in a dataset
To check the content in a dataset, you can use list dataset content.
To check which content has been successfully processed and is now live, use the Vision API endpoint get content details (it can be called even from the client side).
The maximum content body length is 5MB. There is no limit on the number of content items per request.
The ID must be a unique string within the dataset.
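The limits above can be checked on your side before sending a request. The following is a minimal sketch, not part of the API: the helper names `validate_ids`, `body_size` and `batch_items` are hypothetical, and reading "5MB" as 5 * 1024 * 1024 bytes is an assumption.

```python
import json

MAX_BODY_BYTES = 5 * 1024 * 1024  # assumption: "5MB" read as 5 * 1024 * 1024 bytes
MAX_ID_LENGTH = 100               # maximum ID length listed in the Body table


def validate_ids(items):
    """Raise ValueError if an ID is missing, not a string, too long, or repeated."""
    seen = set()
    for item in items:
        item_id = item.get("id")
        if not isinstance(item_id, str) or not item_id:
            raise ValueError(f"ID must be a non-empty string: {item_id!r}")
        if len(item_id) > MAX_ID_LENGTH:
            raise ValueError(f"ID exceeds {MAX_ID_LENGTH} characters: {item_id!r}")
        if item_id in seen:
            raise ValueError(f"Duplicate ID in batch: {item_id!r}")
        seen.add(item_id)


def body_size(items):
    """Size in bytes of the JSON request body that would carry these items."""
    return len(json.dumps({"content": items}).encode("utf-8"))


def batch_items(items, max_bytes=MAX_BODY_BYTES):
    """Split items into batches whose serialised body stays within max_bytes."""
    batches, current = [], []
    for item in items:
        if body_size([item]) > max_bytes:
            raise ValueError(f"Item {item.get('id')!r} alone exceeds the body limit")
        # Start a new batch when adding this item would push the body over the limit.
        if current and body_size(current + [item]) > max_bytes:
            batches.append(current)
            current = []
        current.append(item)
    if current:
        batches.append(current)
    return batches
```

Each batch returned by `batch_items` can then be sent as the `content` array of a separate request. The sketch re-serialises the batch for every item, which is simple but slow for very large catalogues.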
# Content Guide
# Which content (products) to include in a dataset
The recommendations are produced only for items in the dataset.
In order to get recommendations for content it must be in the dataset. Once you remove a product from the dataset, it will no longer be possible to query recommendations for that product.
# Out-of-stock products
We recommend keeping out-of-stock products in the dataset, rather than removing them when they are no longer available (as required by Google Merchant feeds). It is often the case that your customer is referred to the product page by affiliate links or other lists (such as Pinterest), so maintaining out-of-stock products can ensure they are served in-stock & relevant recommendations.
To serve recommendations for out-of-stock products, each product should have a metadata field that identifies whether it is in stock (for example, a boolean field `instock`).
Excluding out-of-stock products from the recommendations then becomes as simple as querying `q=instock:true`.
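As an illustration, the sketch below flags an item's stock status in its metadata and builds the matching query string. The helper names are hypothetical, and the `id` query parameter is taken from the variants query shown later in this page; the Vision API base URL is deliberately left out.

```python
from urllib.parse import urlencode


def with_stock_flag(item, in_stock):
    """Return a copy of a content item with a boolean `instock` metadata field."""
    metadata = dict(item.get("metadata", {}))
    metadata["instock"] = bool(in_stock)
    return {**item, "metadata": metadata}


def in_stock_query(content_id):
    """Query string asking for recommendations that are currently in stock."""
    # safe=":" keeps the filter readable instead of percent-encoding the colon.
    return urlencode({"id": content_id, "q": "instock:true"}, safe=":")


sold_out = with_stock_flag(
    {"id": "sofa-1", "url": "https://example.com/sofa-1.jpg"}, in_stock=False
)
print(sold_out["metadata"])      # {'instock': False}
print(in_stock_query("sofa-1"))  # id=sofa-1&q=instock:true
```

The out-of-stock item stays in the dataset so it can still be queried, while the filter keeps it out of the results shown for other products.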
# Unique Content and placeholder image URLs
A dataset can have multiple content items with different IDs but the same image URL. This allows for the use case where some content doesn't yet have a permanent image, so a placeholder (default / generic) image is used instead. Having the same image multiple times in a dataset will increase the chance of seeing the image multiple times when making a Vision API request.
# Image Resolution
Image resolution should be as close as possible to 256 pixels per edge, with a maximum of 1000 pixels. All aspect ratios are accepted, so there is no need for padding, but consider cropping so that the product or object is clearly visible. For datasets with fashion-related content, you may get better results if the minimum image size is 300 pixels per edge.
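If your source images are larger than the stated maximum, the target dimensions can be computed before resizing. This is a minimal sketch of the arithmetic only; `fit_within` is a hypothetical helper, and the resize itself would be done with whatever imaging tool you already use.

```python
MAX_EDGE = 1000  # maximum pixels per edge stated above


def fit_within(width, height, max_edge=MAX_EDGE):
    """Scale (width, height) down, keeping the aspect ratio, so no edge exceeds max_edge."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height  # already within the limit, leave unchanged
    new_width = max(1, round(width * max_edge / longest))
    new_height = max(1, round(height * max_edge / longest))
    return new_width, new_height


print(fit_within(2000, 1000))  # (1000, 500)
print(fit_within(447, 300))    # (447, 300)
```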
# Image guide: Cropping & Homogeneity
Resolution | Result |
---|---|
447 x 300 pixels | Good |
866 x 300 pixels | Good |
907 x 300 pixels | Good |
427 x 424 pixels | Bad |
426 x 422 pixels | Bad |
410 x 143 pixels | Bad |
648 x 300 pixels | Bad |
# Metadata
The optional `metadata` field is a JSON object whose values may be of the following types: string, number, boolean, and simple arrays of strings or numbers. Although we don't check that a given key has the same type across all content in a dataset, enforcing that on your side can avoid filtering issues when using the Vision API.
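Since type consistency is not enforced by the API, a check on your side might look like the sketch below. The function names are hypothetical, and the rules encode only the types listed above (booleans are not accepted inside arrays because the text allows only strings or numbers there).

```python
def is_valid_value(value):
    """True for a string, number, boolean, or a flat list of strings/numbers."""
    if isinstance(value, (str, int, float)):  # bool is a subclass of int in Python
        return True
    if isinstance(value, list):
        return all(
            isinstance(v, (str, int, float)) and not isinstance(v, bool)
            for v in value
        )
    return False


def type_name(value):
    """Name of the JSON type a valid metadata value belongs to."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array"


def find_type_conflicts(items):
    """Return {key: set of type names} for metadata keys used with more than one type."""
    seen = {}
    for item in items:
        for key, value in item.get("metadata", {}).items():
            if not is_valid_value(value):
                raise ValueError(f"Unsupported value for {key!r} in item {item.get('id')!r}")
            seen.setdefault(key, set()).add(type_name(value))
    return {key: types for key, types in seen.items() if len(types) > 1}
```

Running `find_type_conflicts` over a batch before sending it reports keys such as a `price` that is a number on one item and a string on another.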
# Deciding metadata to add
We recommend only adding metadata that is useful for your customer, filtering, or recommendation engine. The following four questions will help you decide what to include in the dataset:
# Is it useful for my customers?
The fields `price` and `quantity` are both good examples of fields a customer might be interested in and search by. Many internal fields, like timestamps and other fields added by CMS or e-commerce platforms, are not relevant to the customer.
# Do I plan to filter by it?
Add fields you use to determine whether to return a product. These may be internal details about products that decide whether they should be included in the recommendations. Examples are:
- `vendor_name` - may be used to filter recommendations to only those from the same vendor. Even though you may never display this internal information to a customer, it can help you to craft recommendations.
- `instock` - as already described under out-of-stock products, it is used when you wish to recommend alternatives that are currently available in the inventory.
# Does it achieve business needs?
Add fields based on your business insights or goals. These fields may be internal information derived from your business operations. Examples are:
- `vendor_ranking` - may be used both for filtering and for ordering recommendations. Even though you may never display this internal ranking to a customer, it can serve your business case.
- `product_location` - could be used to determine which products to show a customer based on their location, to minimise shipping costs and taxes.
# Is it useful to describe a product?
Add fields that describe the product. These fields might be internal information about products as well as information you display to the customer. They can be used for all of the previous use cases, such as filtering and business relevance, but most importantly they are used in the recommendation engine. The metadata you provide informs the recommendation engine about product characteristics, and the quality and richness of that information affects the quality of recommendations. Although we check your data before using it in our recommendation system, poorly curated data often cannot be fixed without your input, in which case it is discarded rather than used to inform recommendations.
Examples are:
- Description: textual information on an item
- Colour: e.g. blue or hex number
- Material: e.g. wood, metal
- Descriptive tags: hand-made, vintage, classic, modern
- Product details: corner sofa, acrylic/oil/watercolour painting
- Category: side tables, t-shirts
# Preparing your metadata
Once you've determined which metadata fields should be included, you can begin to prepare your metadata. Please consider the different uses of your content and familiarize yourself with Vision API's queries and filters. Spending time preparing your metadata for querying now will save you time later.
It's best to follow these steps when deciding what to do with metadata:
- Can the metadata be simplified, replaced or combined?
  - See Variants below for other reasons to combine fields.
  - Example: A rule joins multiple fields together, requiring `quickShip:true`, `available:true` and `published:true` for an item to be returned in results. Consider adding a boolean field `displayInResults` and removing any fields that aren't otherwise important (in this example `available` and `published`).
  - Example: There are many values in `tags` (an array field) but only a few values are important for querying. Either add a new field for each important tag, or remove unnecessary `tags` values and enable filtering on that list field.
- Any field that no longer serves a purpose, likely due to being combined, can be removed.
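The first example in the list above could be applied when preparing items, roughly as follows. This is a sketch under the assumption that the rule is "all three flags are true"; `simplify_metadata` is a hypothetical helper, not something the API provides.

```python
RULE_FIELDS = ("quickShip", "available", "published")  # fields joined by the rule
DROP_FIELDS = ("available", "published")               # no longer needed once combined


def simplify_metadata(metadata):
    """Fold the three-flag rule into one boolean and drop the redundant flags."""
    simplified = dict(metadata)
    # The item is shown only when every flag in the rule is exactly True.
    simplified["displayInResults"] = all(
        metadata.get(field) is True for field in RULE_FIELDS
    )
    for field in DROP_FIELDS:
        simplified.pop(field, None)
    return simplified


print(simplify_metadata(
    {"quickShip": True, "available": True, "published": False, "price": 10}
))
# {'quickShip': True, 'price': 10, 'displayInResults': False}
```

A single `displayInResults` filter then replaces the three-part rule at query time.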
Reserved fields
`text` is a reserved field, used by the Search By Text functionality.
# Automatic metadata
Upon processing an image, the Visii API adds `image_width`, `image_height` and `image_url` to `metadata` for use by your UI. Request these values with the `fields` parameter in the Vision API.
# Variants
Many products have options for a customer to choose before purchasing. These often differ in metadata, such as price, size and color.
There are two common scenarios, each requiring a different method of adding content to the dataset:
# Scenario: An image per variant
You have an image that is visually different per variant. For example: you have content with 3 color variations and 3 images that represent these colors. For each variant:
- add a unique `id` and `url`
- add the field `parent` to make it possible to identify that this is a variation with a shared parent
Removing variants from results
When querying the Vision API you can use `id=<id>&q=parent:!<parent>` to exclude from the results all variants that have `<parent>` as a `parent`.
{
"content": [
{
"id": "EX1-W",
"url": "https://example.com/my-content-1-white.jpg",
"metadata": {
"available": true,
"color": "white",
"parent": "EX1",
"price": 99.99
}
},
{
"id": "EX1-B",
"url": "https://example.com/my-content-1-blue.jpg",
"metadata": {
"available": true,
"color": "blue",
"parent": "EX1",
"price": 99.99
}
},
{
"id": "EX1-R",
"url": "https://example.com/my-content-1-red.jpg",
"metadata": {
"available": true,
"color": "red",
"parent": "EX1",
"price": 99.99
}
}
]
}
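A payload like the one above can be generated from a parent product and its variants. The sketch below is illustrative only: `variant_items` and `exclude_siblings_query` are hypothetical helpers, and the ID scheme (parent ID plus a suffix) simply mirrors the example.

```python
from urllib.parse import urlencode


def variant_items(parent_id, shared_metadata, variants):
    """Build one content item per visually distinct variant.

    `variants` maps an ID suffix to a (color, image URL) pair.
    """
    items = []
    for suffix, (color, url) in variants.items():
        metadata = dict(shared_metadata, color=color, parent=parent_id)
        items.append({"id": f"{parent_id}-{suffix}", "url": url, "metadata": metadata})
    return items


def exclude_siblings_query(item):
    """Query string that excludes every variant sharing this item's parent."""
    params = {"id": item["id"], "q": f"parent:!{item['metadata']['parent']}"}
    # safe=":!" keeps the filter syntax readable instead of percent-encoding it.
    return urlencode(params, safe=":!")


items = variant_items(
    "EX1",
    {"available": True, "price": 99.99},
    {"W": ("white", "https://example.com/my-content-1-white.jpg")},
)
print(exclude_siblings_query(items[0]))  # id=EX1-W&q=parent:!EX1
```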
# Scenario: A single image for all variants but different metadata
If the difference is not represented visually between variants (or you only have a single image), you can use metadata fields to represent information about the variants.
For example, you have 3 variants with different sizes & prices, but only 1 image to represent them - such as Child & Adult sizes.
# Preparing your metadata
Modify the metadata you will use for querying: fields used for filtering should be added, combined or removed so that they support all variant values.
- Add a field: when a number that needs to be queried by range differs per variant, you need a new field such as `maxPrice`. For example, `price` is usually queried with `q=price:20:50`; with multiple variant prices, you would pair it with the new field `maxPrice`. This allows variants to be returned within the range: `q=price:20:,maxPrice:20:50`.
- Combine fields: when filtering by an important variant option, such as size, combine all variants in an array field `sizes`. You can also combine information you'd like to display, such as individual prices.
- Remove a field: when a per-variant filter, such as `available`, is already captured by the two changes above.
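The paired range query can be assembled with a small helper. This sketch infers the filter grammar (`field:min:max`, an empty bound meaning open-ended, filters joined by commas) only from the two query strings quoted above, so treat it as an assumption rather than a specification; the function names are hypothetical.

```python
def range_filter(field, minimum=None, maximum=None):
    """Format one range filter; a missing bound is left empty, as in `price:20:`."""
    low = "" if minimum is None else str(minimum)
    high = "" if maximum is None else str(maximum)
    return f"{field}:{low}:{high}"


def variant_price_query(minimum, maximum):
    """Pair `price` (lowest variant price) with `maxPrice` (highest variant price)."""
    filters = [
        range_filter("price", minimum),               # lowest price is at least `minimum`
        range_filter("maxPrice", minimum, maximum),   # highest price lies within the range
    ]
    return "q=" + ",".join(filters)


print(range_filter("price", 20, 50))  # price:20:50
print(variant_price_query(20, 50))    # q=price:20:,maxPrice:20:50
```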
# Examples
Taking these example variants - note their metadata:
[
{
"id": "EX1-C",
"url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
"metadata": {
"available": true,
"size": "child",
"price": 29.99
}
},
{
"id": "EX1-T",
"url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
"metadata": {
"available": false,
"size": "teen",
"price": 39.99
}
},
{
"id": "EX1-A",
"url": "https://example.com/my-content-1-generic.jpg", // note only 1 image for all variants
"metadata": {
"available": true,
"color": "adult",
"price": 49.99
}
}
]
Would become:
- Added: `maxPrice`: the most expensive variant price.
- Combined: `sizes`: containing the variant options that are available.
- Removed: `available`: as `sizes` now conveys availability through the presence of each size.
{
"content": [
{
"id": "EX1",
"url": "https://example.com/my-content-1-generic.jpg",
"metadata": {
"maxPrice": 49.99, // the most expensive variant price
"price": 29.99, // the lowest variant price
"sizes": ["child","adult"], // used for filtering: q=sizes:child
"prices": [29.99,49.99] // for display purposes
}
}
]
}
When the size `teen` becomes available, add it to `sizes`. A query to the API for `q=sizes:teen` will now include this product.
{
"content": [
{
"id": "EX1",
"url": "https://example.com/my-content-1-generic.jpg",
"metadata": {
"maxPrice": 49.99, // the most expensive variant price
"price": 29.99, // the lowest variant price
"sizes": ["child","teen","adult"], // used for filtering: q=sizes:teen
"prices": [29.99,39.99,49.99] // for display purposes
}
}
]
}
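The transformation from per-variant records to a single content item can be sketched as follows. `collapse_variants` is a hypothetical helper; it takes the child, teen and adult variants listed earlier (priced 29.99, 39.99 and 49.99) and leaves out any variant that is not available, which is how availability ends up expressed by presence in `sizes`.

```python
def collapse_variants(parent_id, url, variants):
    """Merge per-size variants sharing one image into a single content item."""
    available = [v for v in variants if v["available"]]
    if not available:
        raise ValueError("No available variants to describe")
    prices = [v["price"] for v in available]
    return {
        "id": parent_id,
        "url": url,
        "metadata": {
            "maxPrice": max(prices),                  # the most expensive variant price
            "price": min(prices),                     # the lowest variant price
            "sizes": [v["size"] for v in available],  # used for filtering
            "prices": prices,                         # for display purposes
        },
    }


variants = [
    {"size": "child", "price": 29.99, "available": True},
    {"size": "teen", "price": 39.99, "available": False},
    {"size": "adult", "price": 49.99, "available": True},
]
item = collapse_variants("EX1", "https://example.com/my-content-1-generic.jpg", variants)
print(item["metadata"]["sizes"])  # ['child', 'adult']
```

Re-running the helper after `teen` becomes available adds it to `sizes` and `prices` without any other change.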
# Private Metadata
Sometimes items have additional attributes that you don't want to make public: they are useful when using Visii's suite of products, but the Vision API does not need to return them in its responses. This can happen for:
- Individually curated scores / ratings
- Longer text descriptions that are useful to improve text search results but don't need to be retrieved for display purposes
- Internal identifiers / tags / classifications that are useful when using Visii's Marketing or Catalogue products
This information can be passed using a `privateMetadata` attribute for each item, which has the same format and constraints as the `metadata` attribute.
Access to private metadata is restricted
Private metadata fields cannot be retrieved when using the `fields` parameter in Vision API requests.
# POST Example
{
"content": [
{
"id": "my-id-1",
"url": "https://example.com/my-content-1.jpg",
"privateMetadata": {
"curation_score": 9.5,
"marketing_tags": ["campaign1", "campaign2"]
}
},
{
"id": "my-id-2",
"url": "https://example.com/my-content-2.jpg",
"metadata": {
"category": "art",
"quantity": 10,
"price": 99.99,
"available": true
},
"privateMetadata": {
"curation_score": 7.5,
"all_time_revenue": 10400,
"third_party_identifier": "abcd"
}
}
]
}
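When product records come from a single internal source, the public and private attributes can be separated while building each item. The sketch below is one possible approach; `build_item` and the `PRIVATE_KEYS` set are hypothetical, with the key names borrowed from the example above.

```python
PRIVATE_KEYS = {
    "curation_score",
    "marketing_tags",
    "all_time_revenue",
    "third_party_identifier",
}


def build_item(item_id, url, attributes, private_keys=PRIVATE_KEYS):
    """Split attributes into `metadata` and `privateMetadata` for one content item."""
    public = {k: v for k, v in attributes.items() if k not in private_keys}
    private = {k: v for k, v in attributes.items() if k in private_keys}
    item = {"id": item_id, "url": url}
    # Both attributes are optional, so empty objects are simply omitted.
    if public:
        item["metadata"] = public
    if private:
        item["privateMetadata"] = private
    return item
```

Keeping the list of private keys in one place makes it harder to leak an internal field into `metadata` by accident.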
# Input
# Add new items
To add new content use the `POST` request method. For each content item the `id` and `url` parameters are required. The `metadata` field is optional, but if it is present it must contain every field you want to keep: when an item already exists it is updated, and its previous metadata is overridden.
# Partially update items
To partially update metadata for existing items use the `PUT` request method. The `url` is optional, but when present it replaces the previous value. The `metadata` field is optional, but when present each field it contains is added or overrides the existing value.
When PUT is useful
For smaller updates, using `PUT` reduces the request body size and allows more updates in a single request (considering the 5MB body limit).
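The difference between the two methods can be modelled locally with a plain dictionary standing in for the dataset. This is only an illustration of the semantics described above, not the service's implementation; in particular, what happens to existing metadata when a `POST` omits `metadata` is not stated here, and the sketch assumes it is left untouched.

```python
import copy


def apply_post(store, item):
    """POST: add the item, or replace its URL and whole metadata object."""
    record = store.setdefault(item["id"], {"url": item["url"], "metadata": {}})
    record["url"] = item["url"]
    if "metadata" in item:
        # Previous metadata is overridden, so unsent keys are lost.
        record["metadata"] = copy.deepcopy(item["metadata"])
    # Assumption: a POST without `metadata` leaves existing metadata as it was.


def apply_put(store, item):
    """PUT: change only the values that were sent; the item must already exist."""
    record = store[item["id"]]
    if "url" in item:
        record["url"] = item["url"]
    record["metadata"].update(copy.deepcopy(item.get("metadata", {})))


store = {}
apply_post(store, {"id": "my-id-2", "url": "https://example.com/my-content-2.jpg",
                   "metadata": {"category": "art", "quantity": 10}})
apply_put(store, {"id": "my-id-2", "metadata": {"quantity": 9}})
print(store["my-id-2"]["metadata"])  # {'category': 'art', 'quantity': 9}
```

Sending the same `{"quantity": 9}` with `POST` instead would leave the item with only the `quantity` key.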
Endpoint
POST /orgs/:organisation/datasets/:dataset/content
Field | Type | Required | Value Description |
---|---|---|---|
organisation | String | yes | The organisation name |
dataset | String | yes | The dataset name |
# Body
Field | Type | Required | Value Description |
---|---|---|---|
content | Array | yes | The array of content items to add or update in the dataset |
content.$.id | String | yes | The unique ID for the content. Maximum length is 100 characters. |
content.$.url | String | yes | An RFC 1738 compliant URL from which the image for the content can be retrieved. Supported image formats are `jpeg`, `jpg` and `png`. |
content.$.metadata | Object | no | Content metadata information (category, price, quantity, availability, etc.) |
content.$.privateMetadata | Object | no | Private metadata information |
# POST Example
{
"content": [
{
"id": "my-id-1",
"url": "https://example.com/my-content-1.jpg"
},
{
"id": "my-id-2",
"url": "https://example.com/my-content-2.jpg",
"metadata": {
"category": "art",
"quantity": 10,
"price": 99.99,
"available": true
}
},
{
"id": "my-id-3",
"url": "https://example.com/my-content-3.jpg",
"metadata": {
"category": "drawings",
"available": false
}
}
]
}
# PUT Example
{
"content": [
{
"id": "my-id-1",
"url": "https://example.com/my-content-1-new.jpg"
},
{
"id": "my-id-2",
"metadata": {
"quantity": 9
}
},
{
"id": "my-id-3",
"url": "https://example.com/my-content-3-new.jpg",
"metadata": {
"available": true
}
}
]
}
# Request
curl -X POST \
-H "Authorization: token my-org-api-token" \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.visii.v2+json" \
-d '{"content":[{"id":"123","url": "http://example.com/image1.jpg"},{"id":"345","url":"http://example.com/image2.jpg"}]}' \
"https://api.visii.com/orgs/my-org/datasets/my-dataset/content"
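The same request can be prepared from Python using only the standard library. This is a sketch that mirrors the curl command above (URL, headers and body shape are copied from it); `build_content_request` is a hypothetical helper, and the request is built but not sent.

```python
import json
import urllib.request

API_ROOT = "https://api.visii.com"  # host taken from the curl example


def build_content_request(org, dataset, items, token, method="POST"):
    """Prepare an add (POST) or partial update (PUT) request for dataset content."""
    body = json.dumps({"content": items}).encode("utf-8")
    return urllib.request.Request(
        f"{API_ROOT}/orgs/{org}/datasets/{dataset}/content",
        data=body,
        method=method,
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
            "Accept": "application/vnd.visii.v2+json",
        },
    )


request = build_content_request(
    "my-org",
    "my-dataset",
    [{"id": "123", "url": "http://example.com/image1.jpg"}],
    "my-org-api-token",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; according to the Response section, a successful call is answered with `202 Accepted`.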
# Response
Field | Type | Value Description |
---|---|---|
status | String | The status of the response |
HTTP/1.1 202 Accepted
{
"status": "accepted"
}