The future of visual search: smart glasses, AR and retail o…
What might search be like in 20 years’ time?
This was the question I was called upon to contemplate for a recent Google Performance Firestarters event, ‘Twenty years on: The present and future of search’.
Together with three other speakers, I was invited to give a presentation about the possibilities for the future of search, and what they might mean for performance marketing.
Over the next two decades, I believe that we’ll see search develop in two key directions. First of all, we’ll see voice search take off in a meaningful way – something that, despite the considerable hype surrounding voice search, we haven’t yet seen. And secondly, we’ll see the rise of visual search, alongside the development of technologies like smart glasses and augmented reality.
In this follow-up, I’ll expound on why I think that visual search is the other half of the future of search, how it links in with innovations like augmented reality, and what the possibilities are for its monetisation.
My predictions for the future of visual search are somewhat more speculative than my predictions for voice search, as the technology is still fairly young and in the early stages of development. However, we’re already seeing big indicators as to where it could be headed.
Before we begin: what do I mean when I talk about “visual search”? Visual search is usually considered something distinct from image search, though there is an overlap between the two technologies.
Image search is a type of search designed to return images, and is carried out using search engines like TinEye (the first image search engine to use image identification technology) or Google Images. The input can be text – a keyword search – or an image (also known as reverse image search).
Visual search, by contrast, is typically used to refer to a method of searching the physical world using a smartphone or other type of camera. The search engine will then use image recognition technology to identify what it’s being shown, and surface visually similar results either from around the web, or from within a specific website or app.
At the moment, visual search engines typically need to focus on a single object in order to identify it, but they are increasingly able to identify the component parts of a wider ‘landscape’ – a capability that will be important if visual search is to develop into a tool we can use all the time to search the world around us.
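Under the hood, surfacing “visually similar results” usually comes down to nearest-neighbour search over image feature vectors (embeddings) produced by an image recognition model. As a rough, hypothetical sketch – the catalogue names and hand-written vectors below are made up for illustration; in a real system the vectors would come from a trained model:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (closer to 1.0 = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def visually_similar(query_vector, catalogue, top_n=3):
    """Rank catalogue items by visual similarity to the query image's feature vector."""
    scored = [(name, cosine_similarity(query_vector, vec))
              for name, vec in catalogue.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]

# Toy catalogue of product images, each reduced to a (made-up) feature vector.
catalogue = {
    "blue armchair":    [0.9, 0.1, 0.3],
    "blue sofa":        [0.8, 0.2, 0.4],
    "red coffee table": [0.1, 0.9, 0.2],
}

# A shopper's photo of a blue chair, reduced to a feature vector by the same model.
query = [0.85, 0.15, 0.35]
print(visually_similar(query, catalogue, top_n=2))
```

The same ranking step works whether the results come from across the web or from a single retailer’s catalogue; what changes between systems is the model that produces the vectors and the scale of the index being searched.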
Who are the leaders in this space? A short history of visual search technology
In voice search, few would dispute that the current major players are Google and Amazon, as both companies have cornered the smart speaker market and invested heavily in improving their offerings.
While other companies are certainly present, for the moment, Google and Amazon have stolen a march on their competitors by having the best (and most popular) versions of the current generation of devices.
In visual search, the current technological leaders are Pinterest and Google, again by virtue of having entered the space early and invested a great deal into developing their capabilities.
Google had perhaps the very earliest visual search app, Google Goggles, which was initially developed for Android smartphones in 2009, before being launched for iOS in 2010. Goggles was designed to be able to search for information about pictures of objects taken in the real world, and was also capable of carrying out product searches using an item’s barcode.
However, Goggles’ technology was very basic, and the field was wide open for Pinterest to pull ahead in 2016 as it started developing more effective image recognition and visual search capabilities, including the ability to isolate individual objects within a larger image and search for visually similar items.
In 2017, Pinterest debuted Pinterest Lens, advertising it as a “visual discovery tool”, in keeping with Pinterest’s focus on the role of inspiration in the product journey. In my Firestarters talk, I highlighted this as a key difference between visual search and voice search: voice searches tend to have a known destination (the answer to a question, or a repeat purchase from a known brand), whereas visual searches are open to inspiration and influence.
With visual search, the searcher knows more or less what they want, but isn’t necessarily expecting to find the exact same object, providing a lot of opportunities for well-placed brands to get their product in front of the consumer. I’ll come back to this later when we talk about monetising visual search.
A few months after Pinterest launched Pinterest Lens, Google showed off its own direct equivalent at Google I/O 2017: Google Lens. Google Lens has much more impressive capabilities than Google Goggles (though the technology is far from perfect yet), and in August it officially replaced Goggles, which was formally discontinued.
Other companies competing in this space include Microsoft, which launched visual search for the Bing iOS and Android apps in the US last June, and Amazon, which recently entered into a partnership with Snapchat to create a visual search tool that would return product results from Amazon.
However, as I’ve mentioned, it’s early days yet – and the landscape could still shift dramatically, as the voice landscape did in 2014 when Amazon released the Amazon Echo. Right now, visual search is still a fairly unknown technology: the search and marketing community has been talking about it for a couple of years, but few people are using it in their day-to-day lives.
I believe that it will take the launch of a device with dedicated visual search capabilities to bring visual search into the mainstream, just as the launch of smart speakers began to make voice interfaces more mainstream.
The same device, if it lives up to its potential, will be able to remove the friction currently surrounding visual search – the need to download a dedicated app, open it each time you want to search, and point your smartphone at your subject – making it natural and intuitive. I’m referring to smart glasses, also known as AR glasses, and I think that they are the future of visual search.
The advent of smart glasses
Smart glasses, or augmented reality glasses, have been associated with some early and expensive flops, the most infamous of which is of course Google Glass, whose prototype was discontinued just a year after it became available to the public.
Less roundly mocked, but no more commercially successful, are Snap’s Spectacles, which are designed for recording video clips to be shared on Snapchat. In 2017, Snap wrote off $40 million worth of unsold inventory and unused parts due to lacklustre demand.
And most recently, in February Intel debuted a stylish-looking pair of smart glasses, known as Vaunt, which many predicted could be a game-changer for a category whose devices had, until then, never looked like anything you might choose to wear.
But Intel’s entry was the most short-lived of all, pronounced dead in the water just two months later.
Smart glasses are a tricky concept to get right: they need to combine functionality with comfort, an attractive design, and seamless controls before people will feel at ease using them in their day-to-day lives.
However, the potential for this technology is huge, and major tech players including Apple and Facebook are still committed to launching their own devices. In August, Reuters broke the news that Apple had acquired a start-up which specialises in making lenses for AR glasses, and the company has since been spotted hiring augmented reality engineers.
Why do I believe smart glasses will take off? In my Google Firestarters presentation, I made the point that human beings are inherently visual creatures. Even our voice devices now have screens – the Amazon Echo Show and Google Home Hub. Browsing the web is a highly visual experience, and it’s difficult to replicate the same quality of experience without visuals.
Advocates of voice-first interfaces often make the point that smartphones encourage unnatural behaviours, ‘sucking’ our attention downwards and towards screens and making us antisocial. However, voice on its own can only do so much – which is why I see smart glasses as a natural counterpart to voice interfaces and voice search.
Voice interfaces lend themselves extremely well to interacting with smart home appliances, as well as situations in which the user’s hands are occupied, such as while cooking, driving, doing DIY, and so on. But for activities which typically require a screen, such as shopping, browsing the internet or reading messages, smart glasses are a more natural fit.
The two aren’t mutually exclusive, either – most makers of smart glasses envision them responding to voice commands, as well as inputs like gestures.
As of right now, I believe that smart glasses are a few years away from even being commercially viable, much less mainstream. But I also believe that when they are, the technology will come with visual search built in, enabling us to seamlessly search the world around us – and opening up new possibilities for how we use the internet.
Visual search and augmented reality
I’ve mentioned that I believe the rise of visual search will be linked to augmented reality. More than that, I think that in the future visual search will be the norm for augmented reality. Even now, visual search is occasionally referred to as “augmented reality search”.
“Augmented reality” as a phrase is broadly used to mean any kind of digital output, visuals or information that appears superimposed onto the world around us. This could be by means of a smartphone screen, a pair of smart glasses, or even a pair of contact lenses. It differs from virtual reality in that with VR, the entirety of the world you see and interact with is digital.
Because visual search uses the world around us to search the internet, it’s frequently grouped in with augmented reality. Here’s the difference, as I see it: visual search refers to any type of search that uses the physical world as an input, but the search results are confined to your device, and will direct you to a regular website or app.
Augmented reality search is when the input and output of a search augment the world around you, by superimposing information or visuals onto your surroundings.
At the moment, they’re separate concepts, in that we can have visual search that doesn’t necessarily involve augmented reality. However, in a future where smart glasses are commonplace, the digital and physical worlds would be brought that much closer together, and visual search and AR search would become one and the same.
How will visual search be monetised?
With all of that said, what possibilities for monetisation and advertising does visual search present?
Visual search has immense potential for monetisation. In fact, I’d go so far as to say that it’s hard to imagine a type of search that’s more commercially friendly. This is because visual search is the most intuitive way to search for things that you’d like to buy – or things you’d like to buy to go with things you already own.
With “regular” text-based search as well as voice search, queries are more likely to be informational, with only a minority of searches carrying purchase intent. Visual search, by contrast, while it can be used to find out information about an object, plant or animal, is much more likely to be used for product searches – or for product inspiration.
Pinterest has already moved to capitalise on this potential, introducing visual recognition technology to its ad targeting in 2017. This capability means that businesses that advertise with Pinterest will see their products surfaced as Related Pins or Instant Ideas when a user searches for a visually similar item.
It’s easy to imagine this capability expanding from surfacing visually similar products to surfacing visually complementary products – so that instead of seeing dining tables similar to the one you’ve searched for, you’re presented with a matching set of chairs.
Augmented reality opens up a whole new world of possibilities for visual search. Companies like IKEA have already developed tools that use augmented reality to place furniture within your home, so you can visualise it before you buy. Add visual search into the equation, and you might be able to ‘search’ your living room to find the perfect coffee table to match your sofa, or a pair of curtains to match your rug; visualise them within the space; and then buy them directly.
Add-on products and recommendations would fit naturally into this process: a visual search might detect a space on your shelving unit, and suggest an ornament to fit the space. With accurate visual search technology, the suggestions have the potential to be helpful without being intrusive. Machine learning could be used to learn the consumer’s tastes over time.
All of this is a way off into the future, but retailers are already eyeing up the opportunities presented by visual search. US retail giant Target has partnered with Pinterest to integrate Pinterest Lens’ search capability into its app, and UK online fashion and cosmetics retailer ASOS made its Style Match visual search tool available to all Android and iOS devices last March.
And the consumer appetite for visual search in ecommerce definitely exists. A study by visual search specialist ViSenze this year found that 62% of Millennial and Generation Z shoppers would like to use visual search to help them find and identify the products that inspired them before buying.
How should marketers be preparing for this future? As I think we’re several years away from visual search becoming the norm, and the technology that will usher it in hasn’t arrived yet, there’s a limit to how much anyone can do at this stage.
But by all means, experiment with visual search and keep a close eye on developments, and you’ll be in a strong position to move in on this space when the time comes.