From zero to one

Revolutionize how users find your products by transforming images into searchable assets. A picture is worth a thousand words; now you can automatically harness that power.

Conversational search

Search that understands you and scales with you. No need for manual categorisation or keyword management. Empower your users to find what they want by describing it in plain English.

Real-time results

Optimized to be lightning fast and bespoke to your company. Have 1M+ product images? Visual Search can find the most relevant result for a user's search in under one second.

Powered by AI

State-of-the-art computer vision, NLP, and optimization models deployed in efficient pipelines to give your users the best experience.

How it works

Deep learning is used to understand your images and generate semantically meaningful representations. When a user searches, we rank how relevant your images are to their query and return the results. Learn about the underlying technologies in this blog post.

Learn more
Demo

Try it for yourself!

We've applied Visual Search to a dataset of 25,000 nature-themed images. Simply describe what you want to see in the image and hit search! The top six most relevant images in our dataset will then be displayed.



[Demo results grid: six example images with photographer credits. vinomamba24 (relevance 0.96), wolfgang_hasselmann (0.95), lisaac16 (0.95), alex_rainer (0.93), alexgeerts (0.93), taypaigey (0.92).]

Frequently Asked Questions

Discover how Visual Search works and explore the ways your company can benefit.

Visual Search has four primary stages. For a practical demonstration, check out this YouTube video.

1. Understanding images
This is the most crucial stage: it requires an automated process that takes an image and generates a textual representation of what that image is about.

To do so, we leverage a state-of-the-art deep learning architecture trained on 20M publicly available image-text pairs. We then efficiently batch-generate bespoke solutions for a company by applying this model to their dataset of photos.

At the end of this stage, a company has their images and text representing what is in each image.
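As an illustrative sketch of this stage (the production captioning model is not named above, so the publicly available BLIP model stands in for it, and the file path is hypothetical):

```python
# Sketch only: BLIP is a stand-in for the unnamed production captioning model.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def caption_image(path: str) -> str:
    """Generate a short textual description of the image at `path`."""
    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)

print(caption_image("product_photo.jpg"))  # hypothetical path
```

Running this over a company's full photo dataset, ideally in batches on a GPU, yields the image-text pairs described above.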

2. Storing understanding
The textual descriptions generated in stage 1 must then be stored efficiently without degrading their semantic richness.

To do so, we leverage a second state-of-the-art model trained on 1B text pairs with a self-supervised objective of minimizing a contrastive loss. That sentence is quite technical, but it can be read casually as a model trained to place text pairs with similar meanings close together and dissimilar ones far apart. This is exactly what we want: a way to represent the image such that if a user searches for something similar, we can accurately identify and return it.

We then store the numerical representations of the images in a data structure that facilitates fast search and retrieval on machines with low RAM and compute resources. This enables real-time search, but it requires solving a challenging optimization problem; to do so we leverage current state-of-the-art solutions.

At the end of this stage, a company has their images and a numerical representation capturing what is in each image, contained within an efficient and lightweight data structure.
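As a minimal sketch of this stage, assuming a contrastively trained sentence-embedding model (all-MiniLM-L6-v2 is a public stand-in; the production model is not named) and an approximate nearest-neighbour index such as hnswlib for the lightweight data structure:

```python
import numpy as np
import hnswlib
from sentence_transformers import SentenceTransformer

# Stand-in embedding model trained with a contrastive objective.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

captions = [
    "a golden retriever running on a beach",
    "a snow-capped mountain at sunrise",
    "a blue dress with flowers on a mannequin",
]
# Normalised embeddings make cosine similarity a simple dot product.
embeddings = embedder.encode(captions, normalize_embeddings=True)

# HNSW index: fast approximate search with a small memory footprint.
index = hnswlib.Index(space="cosine", dim=embeddings.shape[1])
index.init_index(max_elements=len(captions), ef_construction=200, M=16)
index.add_items(embeddings, np.arange(len(captions)))
index.save_index("image_captions.hnsw")  # hypothetical path
```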

3. Facilitating search
This stage involves transforming a user's search query into a meaningful numeric representation that we can use to find the most relevant results. This conversion removes any need for a company to manually categorise their images or maintain keywords and filters.

This task is very similar to what was required in stage 2. In fact, we use the same model to generate a meaningful representation of a user's search query.

At the end of this stage, a company has their images, the data structure storing the understanding, and a numeric representation of a user's search query.
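Continuing the sketch, the same (assumed) embedding model maps the raw query string into the same vector space as the stored image representations:

```python
# No keywords, filters, or manual categories: the plain-English query
# is embedded with the same model used for the image descriptions.
query = "a dress with a floral pattern"
query_vec = embedder.encode([query], normalize_embeddings=True)
```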

4. Generating results
The numeric representation of the search query is now compared to the numeric representation of our understanding of the images. We define a relevance metric that scores how similar the search query is to each image. There are several possible choices; the metric we employ for Visual Search is scaled so that the highest possible relevance is 1.00, while 0.00 means the image is not at all relevant to the query.

Because of the choices we made in the earlier stages, we can calculate this score in real time for all our images and return the most relevant ones.

At the end of this stage, Visual Search is complete and we have the most relevant images for a user's query.
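The exact relevance metric is not disclosed; one common choice consistent with the 0.00 to 1.00 range described above is cosine similarity clipped at zero, which the sketch below uses:

```python
# hnswlib's "cosine" space returns distance = 1 - cosine similarity,
# so relevance = 1 - distance; clipping keeps scores within [0, 1].
labels, distances = index.knn_query(query_vec, k=3)
for idx, dist in zip(labels[0], distances[0]):
    relevance = max(0.0, 1.0 - dist)  # 1.00 = perfect match, 0.00 = unrelated
    print(f"{captions[idx]!r}  relevance={relevance:.2f}")
```

Because the index search is approximate and the vectors are small, this lookup stays fast even across millions of images.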
Visual Search is a game-changer for businesses in any industry looking to provide a more seamless and intuitive search experience for their customers. By allowing users to search for products by simply describing what they want to see, Visual Search reduces friction and increases the likelihood of conversions. Here are just a few examples of how Visual Search can benefit companies:

E-commerce
Traditional search methods require users to navigate a maze of filters, such as style, color, and size. These can be overwhelming and result in lower conversion rates. Plus, companies have to work hard to maintain these filters, adding to their technical overhead. Visual Search streamlines the search process by allowing customers to simply describe what they are looking for, such as "a blue dress with flowers." This improves the overall user experience and can increase the likelihood of conversions.

Accommodation and Rental
Finding the perfect holiday property, new home, or rental can be a daunting task. Visual Search makes it easy by allowing users to simply describe their ideal property, for example "a modern apartment with wooden floors and an ocean view." This enables businesses to provide more personalized recommendations that better match their customers' preferences. It also reduces the workload for listing agents and homeowners by automating the search process and ensuring that their properties are shown to the right audience.

Supplement to existing search
Visual Search can be seamlessly integrated into existing search strategies to provide a more comprehensive and accurate search experience. Visual Search returns a relevance rating across all your images; this can then be combined with your other data points and business logic to take your current search system to the next level.
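As a hypothetical illustration of that combination (the weights, signal names, and demotion rule below are assumptions for the sketch, not Visual Search's actual business logic):

```python
def combined_score(relevance: float, popularity: float, in_stock: bool,
                   w_rel: float = 0.7, w_pop: float = 0.3) -> float:
    """Blend a Visual Search relevance score with hypothetical business signals.

    `popularity` is assumed pre-scaled to [0, 1]; out-of-stock items
    are demoted rather than hidden entirely.
    """
    score = w_rel * relevance + w_pop * popularity
    return score if in_stock else 0.5 * score

# Rank products by the blended score instead of relevance alone.
products = [
    {"name": "floral midi dress", "relevance": 0.94, "popularity": 0.40, "in_stock": True},
    {"name": "blue maxi dress", "relevance": 0.88, "popularity": 0.90, "in_stock": False},
]
products.sort(key=lambda p: combined_score(p["relevance"], p["popularity"], p["in_stock"]),
              reverse=True)
```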
Our solution is designed to provide good results for a very broad range of images. This is a consequence of the training approach of the first-stage model, which leveraged 20M image-text pairs from a variety of domains. Should your use case cover a more bespoke image category not included in the original training dataset, further training is possible to develop a customised model for your company.

Out-of-the-box search improvements can be gained by including the relevance scores returned by Visual Search in a broader search algorithm that accounts for additional relevant data.

Looking for further guidance?
Contact info@aragoai.com and we can provide advice.
To give a strong indicator of the generalisability of this solution, our demo leverages 25,000 publicly available images released by Unsplash. The dataset is biased towards nature-themed imagery, so do not expect high relevance scores for searches unrelated to this domain.

Remember that Visual Search returns the most relevant images in a specific dataset for a given query, so queries unrelated to the dataset will return images with low relevance. This is actually a very desirable property: if a user searches for something your company doesn't have, then instead of returning nothing, Visual Search is designed to return the closest thing you do have.
The demo dataset is biased towards nature-themed images, so high-relevance results should not be expected for search queries outside this domain.

The model powering this Visual Search demo is deployed on on-demand cloud infrastructure, so the demo may be significantly delayed when a cold boot of the server is required. Cold boots occur when the Visual Search model has not been used for a while. If you have been waiting more than a minute for your results, try again in another minute. Once the server is live, a typical execution of Visual Search has been benchmarked at ~100ms!

Reliance on cold boots can be removed in commercial deployments; reach out to info@aragoai.com if you're interested in a solution with constant uptime.

Let's discuss transforming your search

The technologies used to power this demo can be customised for a variety of industries. If you want to discuss how they could be applied to your company, then ...