The enterprise search field has gone through a major consolidation over the past year. At a high level, the industry seems to be moving from “search engines” to “search experience engines,” also known by the popular but vaguely defined term “insight engines.”
This consolidation follows a period, not too many years ago, when the enterprise search space shifted from commercial vendors to open source and open-source-based products. We also saw serious consolidation in the commercial market at that time. Autonomy acquired Verity and was in turn acquired by Hewlett-Packard. Autonomy is now part of Micro Focus, the folks who have sold COBOL products going back to the early MS-DOS days. Fulcrum Technologies was acquired by Hummingbird, which was later acquired by OpenText. Oracle acquired Endeca. IBM acquired Vivisimo and rebranded the product “IBM Watson Explorer” to capitalize on the Watson name with minimal product changes. Dassault Systèmes acquired Exalead and expanded it into a public search platform to compete with Google. And now even Google has ended its experiment with enterprise search by ending support for the very popular Google Search Appliance, shifting the technology towards cloud services, which may not be acceptable to the wide range of enterprises where data privacy and security are critical.
Some of the consolidation happened because the market was too crowded with platforms that were far too similar. But I believe at least some of it was due to improvements in free open source tools like the Apache Lucene and Solr projects, which offer substantially similar functionality at zero licensing cost (more on that later).
As the Grateful Dead say, “What a long strange trip it’s been.”
The benefit, and the drawback, of Lucene and Solr is that they are single-purpose tools. This is a common issue in the world of open source. While the Apache Software Foundation has many supporting tools, such as the Nutch crawler and the Tika “filter pack,” which extend the capabilities of both Lucene and Solr, implementing a pure open source search solution takes significant effort.
It only made sense that companies would form to help organizations implement Lucene and Solr, and that the same companies would benefit by providing support, services and products that extend the capabilities of the open source technology.
A few examples? Solr is the technology under Lucidworks Fusion. And Lucene, the kernel on which Solr is based, drives Elasticsearch and the enterprise search tool Swiftype, which is now part of Elastic, the company behind Elasticsearch.
For years, those of us who work with search all the time have stressed that getting relevance right is a key factor in search success. As mentioned earlier, vendors are integrating machine learning (ML), artificial intelligence (AI) and big data tools with search in the hopes of increasing this relevance. Lucidworks includes Spark and other open source tools, as do Attivio, Coveo and others. But will ML save search?
Previously, most machine learning environments required a significant amount of data to begin making relevant recommendations. I recently asked Lucidworks CEO Will Hayes how organizations with smaller content repositories and relatively low query volumes can take advantage of machine learning, which I typically see as needing very large sample sets to deliver accurate results.
His answer was one I hadn’t considered, but it makes sense. He said the relatively small, homogeneous vocabularies within even the largest organizations, combined with the more distinctive types of queries (when compared with public-facing internet sites), make it possible for machine learning tools to be effective, even without the query volume and dataset size that machine learning on huge sites like Amazon benefits from.
If your search platform doesn’t already offer machine learning, you may be able to integrate it on your own using Apache Spark or even Apache Mahout, but I’d recommend starting by simply managing your search platform more aggressively.
Until recently, every search vendor would stress the value of its underlying code, the secret sauce that built the search indices and delivered relevant results. But an amazing trend is now emerging. An increasing number of search vendors are shifting their focus away from proprietary search kernels and over to the interface experience, both for administrators and for end users. Several vendors, Coveo being the first that comes to mind, will optionally license their enterprise search applications on top of the Elasticsearch index. Lucidworks, a company that employs a large number of open source committers, bases its commercial Fusion product on Solr.
And even those companies that do retain their proprietary products are often wrapping a series of open source Apache tools under and around their commercial product.
At the same time, enterprise search platforms such as Attivio, Coveo and others have repositioned their products as “search experience engines” or “insight engines,” perhaps reasoning that if your customers are unhappy with search, why not try a new marketing pitch? Unfortunately, the root cause of poor search usually isn’t a flaw in the product, but rather a lack of ongoing search management.
So as enterprise search morphs, via the magic of marketing, into experience engines and insight engines, the various search vendors are beginning to base their products on what is essentially the same base code: Apache Lucene. This isn’t true (yet) for Microsoft SharePoint, but as you look across the landscape of commercial search vendors, the underlying kernel is increasingly the same.
Once that happens, is there really a need for a half dozen or more companies marketing the same basic product? 2018 may just be the year the thinning of the enterprise search pack starts.
Miles Kehoe is founder of New Idea Engineering, a vendor-neutral consultancy focused on enterprise search, analytics and big data. In addition to New Idea Engineering, he has also worked at search vendors Verity, Fulcrum Technologies, and most recently at Lucidworks.