Amazon Camera Search
The State of the Product in 2016
Camera search is a feature in the Amazon Shopping app that enables customers to use the camera on their mobile devices to find things around them on Amazon.
It works great on products with some form of packaging (e.g. a bag of chips, hand soap, books). But for products that don't have distinct graphic patterns (e.g. mugs, pens, headphones), customers are often taken to a sub-category page where they have to narrow the scope further and fall back to text search to find what they want. This is not what customers expect, and it is a very different experience from pointing the camera at packaged goods or barcodes.
New UX Enhancing the New Model
Our computer vision team built a new model to improve the specificity of the results returned. However, instead of seeing results right away, customers still had to pick one of the presented search terms in order to proceed.
The goal was to show customers results immediately. So we decided to break each search term into single words and surface additional words from the model that were previously considered low-probability. We turned all of them into selectable tags and directly presented results for the most confident set of tags. Customers now see results immediately, with the freedom to change the search term simply by selecting different tags.
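The tag flow above can be sketched roughly as follows. This is an illustrative reconstruction, not Amazon's actual implementation: the function names, score threshold, and sample predictions are all assumptions for the sake of the example.

```python
def build_tags(predictions, min_score=0.15):
    """Split multi-word search terms into single-word tags, keeping
    lower-confidence words that would previously have been discarded.
    `predictions` is a list of (search term, confidence) pairs."""
    tags = {}
    for term, score in predictions:
        for word in term.lower().split():
            # A word inherits the best score of any term it appears in.
            tags[word] = max(tags.get(word, 0.0), score)
    # Keep any word above a low bar so customers can opt in to it as a tag.
    return sorted(((w, s) for w, s in tags.items() if s >= min_score),
                  key=lambda ws: ws[1], reverse=True)

def default_query(tags, top_n=3):
    """Pre-select the most confident tags to fetch results immediately."""
    return " ".join(word for word, _ in tags[:top_n])

# Hypothetical model output for a photo of a mug:
predictions = [("ceramic coffee mug", 0.62), ("white cup", 0.34), ("bowl", 0.18)]
tags = build_tags(predictions)
print(default_query(tags))  # ceramic coffee mug
```

Deselecting "ceramic" or selecting "white" would simply re-run the search with the new tag set, which is the "freedom to change the search term" described above.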
Other Design Efforts to Improve the Experience
Accumulation: we created a gap of a few seconds between receiving the first set of results and presenting results to customers, so that there is time to accumulate multiple sets of results and pick the best one. Watching the accumulation happen and the results becoming more accurate helps us build trust with our customers.
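One simple way to "pick the best one" from several accumulated passes is majority voting on the top-ranked product. This is a minimal sketch under that assumption; the real selection logic is not described in detail here.

```python
from collections import Counter

def pick_best(result_sets):
    """Each recognition pass yields a ranked list of product ids.
    The product that ranks first most often across the accumulation
    window wins; return the most recent pass that agrees with it."""
    votes = Counter(results[0] for results in result_sets if results)
    winner, _ = votes.most_common(1)[0]
    for results in reversed(result_sets):
        if results and results[0] == winner:
            return results

# Three passes during the accumulation window (illustrative ids):
passes = [["mug-a", "mug-b"], ["mug-b", "mug-a"], ["mug-b", "cup-c"]]
print(pick_best(passes))  # ['mug-b', 'cup-c']
```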
Delay the start of the recognition: the first few seconds after the camera starts are spent orienting the phone and pointing it at the subject. Triggering the recognition service too early would lead to unwanted results.
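A gate like the one described could combine a fixed warm-up delay with a check that the phone is being held still. The parameter names, thresholds, and the motion signal below are all hypothetical, purely to illustrate the idea.

```python
def should_start_recognition(elapsed_s, recent_motion,
                             warmup_s=1.5, motion_threshold=0.2):
    """Gate the recognition service: skip the first moments after the
    camera opens, then wait until recent device motion settles.
    `recent_motion` is a list of motion magnitudes (hypothetical units)."""
    if elapsed_s < warmup_s:
        return False  # customer is still aiming the camera
    return max(recent_motion, default=0.0) < motion_threshold

print(should_start_recognition(0.5, [0.05]))       # False: camera just opened
print(should_start_recognition(2.0, [0.5, 0.1]))   # False: phone still moving
print(should_start_recognition(2.0, [0.05, 0.1]))  # True
```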
Align the presentation of results across confidence levels: sometimes we can show an exact match, sometimes we can recommend sufficiently similar products, sometimes we can only make suggestions based on the recognized categories, and sometimes we recognize the product but don't sell it. We want to make sure we design for all of these scenarios.
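The four scenarios above amount to a small dispatch on the recognition outcome. This sketch assumes a `match` record with a confidence `score` and an `in_catalog` flag; the field names and the 0.9 threshold are illustrative, not the product's actual values.

```python
def presentation_for(match):
    """Map a recognition outcome to one of the four result presentations:
    exact match, similar products, category suggestions, or
    recognized-but-not-sold."""
    if match is None:
        return "category_suggestions"   # only a category was recognized
    if not match["in_catalog"]:
        return "recognized_not_sold"    # we know the product but don't sell it
    if match["score"] >= 0.9:
        return "exact_match"
    return "similar_products"

print(presentation_for({"in_catalog": True, "score": 0.95}))  # exact_match
print(presentation_for({"in_catalog": True, "score": 0.5}))   # similar_products
print(presentation_for(None))                                 # category_suggestions
```

Keeping the presentations in one place like this is what makes it possible to design each scenario deliberately rather than falling through to a generic results page.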
Guidance Messages: When we detect a scene or a cluster, we can guide customers to point at one single object at a time.
On-boarding: a lot of customers also don't know which products work best with Camera Search, so they either use it only on barcodes (conservative) or point it at rooms, pets, and other things that are not buyable products (adventurous). So I designed and filmed a short video for the camera permission screen demonstrating the various use cases for Camera Search.
We observed fewer drop-offs during and after scanning; more customers came back to use the feature, and they used it more often.
However, a lot of customers skipped the on-boarding very quickly, and the drop-off rate there increased as well. So we later changed our strategy from “show and tell” to “try and learn”: instead of showing a video to first-time customers, we show the UI directly, with instructions that don't block them from using the feature.
As the product kept growing, we gradually expanded its functionality. Andrea Alam then led another round of design focused on the modalities before scanning happens.