It's amazing that, these days, the best way to look for new clothes online is to click a few checkboxes and then scroll through endless images. Why can't you search for a "green patterned dress with a scoop neckline" and actually see one? Glisten is a new startup that enables just that, using computer vision to understand and list the key attributes of the products in each photo.
Now, you may think this already exists. In a way it does – but not a way that's helpful. Co-founder Sarah Wooders ran into the problem while working on a fashion search project of her own as a student at MIT.
“I was shopping online, looking for a v-neck crop shirt, and only two things came up. But when I scrolled through, there were about 20,” she said. “I realized things were tagged in very inconsistent ways – and if the data is that messy where consumers see it, it's probably even worse on the backend.”
As it turns out, computer vision systems have been trained to identify the features of all kinds of images quite effectively, from dog breeds to facial expressions. When it comes to fashion and other relatively complex products, they do much the same thing: look at the picture and produce a list of attributes, each with a corresponding confidence level.
For a given image, such a system would produce a kind of tag list, something like this:
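A flat tag list of this kind can be sketched in a few lines of Python; the labels and confidence values below are invented for illustration, not output from any real model.

```python
# Hypothetical output of a generic vision tagger: a flat list of
# (label, confidence) pairs with no notion of which labels are
# colors, which are styles and which are cuts.
flat_tags = [
    ("shirt", 0.97),
    ("maroon", 0.94),
    ("sleeve", 0.92),
    ("crop", 0.71),
    ("casual", 0.63),
]

# About all you can do with it directly is rank labels by confidence.
ranked = [label for label, score in
          sorted(flat_tags, key=lambda t: t[1], reverse=True)]
```

Note that asking this structure "what color is the shirt?" is impossible without extra bookkeeping about which tags are colors in the first place.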
As you can imagine, that's actually pretty useful. But it also leaves something to be desired. The system doesn't really understand what "maroon" and "sleeve" actually mean, except that they are present in this picture. If you asked the system what color the shirt is, it would be stumped unless you manually sorted through the list and said: these two things are colors, these are styles, these are variations on styles, and so on.
That's not hard to do for one picture, but a clothing retailer may have thousands of products with a dozen pictures each, and new ones arriving every week. Do you want to be the intern assigned to copy and paste tags into sorted fields? No, and neither does anyone else. That is the problem Glisten solves, by making its computer vision engine much more context-aware and its output much more useful.
Here is the same image as Glisten's system might process it:
"Our API response will actually be, the neckline is this, the color is this, the pattern is this," said Wooders.
This type of structured data is much easier to insert into a database and to query with confidence. Users (not necessarily consumers, as Wooders explained later) can mix and match criteria, knowing that when they say "long sleeves," the system has actually looked at the sleeves of the garment and determined that they are long.
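As a concrete sketch, structured output like this can be represented and queried as plain key-value data; the field names and values below are hypothetical and do not reflect Glisten's actual API.

```python
# Hypothetical structured attributes for one garment, in the spirit of
# "the neckline is this, the color is this, the pattern is this."
product = {
    "neckline": "scoop",
    "color": "maroon",
    "pattern": "solid",
    "sleeve_length": "long",
}

def matches(item, **criteria):
    """True if the item satisfies every field=value criterion exactly."""
    return all(item.get(field) == value for field, value in criteria.items())

# "Long sleeves" now checks one well-defined field, rather than hoping
# the right words happen to appear somewhere in a flat tag list.
has_long_sleeves = matches(product, sleeve_length="long")
```

With fields like these, mixing and matching criteria is just a conjunction of exact checks, e.g. `matches(product, color="maroon", pattern="solid")`.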
The system was trained on a growing library of approximately 11 million product images and descriptions, which it analyzes using natural language processing to work out what refers to what. This provides important contextual information that prevents the model from treating "formal" as a color or "sweet" as an occasion. But if you suspect it isn't as easy as simply plugging in the data and letting the network figure it out, you're right.
Here's a kind of idealized version of what it looks like:
"There is a lot of ambiguity in fashion, and that is definitely a problem," Wooders admitted, though far from an insurmountable one. "When we provide the output to our customers, we give each attribute a confidence rating. So if it is not clear whether something is a crew neckline or a scoop neckline, the algorithm, if it is working correctly, will give weight to both. If it is not certain, there will be a lower confidence rating. Our models are trained on how people label things, so you get an average of what people think."
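The per-attribute confidence ratings Wooders describes might be consumed along these lines; the attribute names, candidate labels, scores and threshold below are all hypothetical.

```python
# Hypothetical response where the model hedges between two plausible
# necklines but is confident about the color.
attributes = {
    "neckline": [("scoop", 0.48), ("crew", 0.44)],
    "color": [("maroon", 0.95)],
}

def confident_value(attrs, name, threshold=0.8):
    """Return the top candidate for an attribute, or None when the
    model is not confident enough to commit to a single answer."""
    label, score = max(attrs[name], key=lambda c: c[1])
    return label if score >= threshold else None
```

A caller might surface the high-confidence color directly while flagging the ambiguous neckline for human review instead of guessing.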
The model was originally geared towards fashion and clothing in general, but with the right training data it can be applied to many other categories as well – the same algorithms could find the defining characteristics of cars, beauty products and so on. Here is what it might look like for a shampoo bottle – instead of sleeve, cut and occasion, you would have volume, hair type and paraben content.
Although shoppers are likely to see the benefits of Glisten's technology in time, the company has found that its customers are actually a couple of steps removed from the point of sale.
"Over time, we realized the right customer is the one who feels the pain of having unreliable product data," said Wooders. “These are mainly technology companies that work with retailers. Our first customer was a price optimization company, another was a digital marketing company. These were pretty far outside our expectations.”
It makes sense if you think about it. The more you know about the products, the more data you have to correlate with consumer behavior, trends and so on. Knowing that summer dresses are coming back is good, but knowing that blue and green floral patterns with 3/4 sleeves are coming back is better.
The competition consists mainly of in-house tagging teams (doing the manual review work that, as established above, none of us wants to do) and generic computer vision algorithms, which don't produce the kind of structured data that Glisten does.
Ahead of its demo day at Y Combinator next week, the company is already seeing five figures of monthly recurring revenue, with its sales process so far limited to reaching out to people they thought would find the product useful. "There has been a crazy amount of sales in the past few weeks," said Wooders.
Glisten may soon be powering quite a few product search engines online, though ideally you won't even notice it – with a little luck, you'll simply find what you're looking for that much more easily.
(Alice Deng was originally quoted throughout this article, though it was Wooders all along – an error in my notes. The article has also been updated to better reflect that the system is applicable to products beyond fashion.)