The amount of data we collect is only growing with each passing day, and there’s no sign of it slowing anytime soon. To make any use of all this information, we need a better way to search it. The bigger a haystack, the more difficult it is to find a needle.
Tech giants like Microsoft and IBM and startups like Lucidworks are improving search engines with different methods, such as supporting search across more data sources, improving data tuning, and incorporating artificial intelligence (AI) and AI-powered natural language understanding (NLU) tools. However, a generalized, unsupervised tool – an automated solution that works across all domains and use cases – has been beyond everyone’s grasp. As we recently heard from a search engine expert, “No one is [searching with AI-powered NLU] in an unsupervised workflow.”
At Luminoso, we’ve developed the most powerful unsupervised NLU in the world. So when we saw organizations looking to augment their search capabilities, we began to explore how we could integrate our leading NLU for a better, more humanlike search experience. With the help of a few customers, we tested the performance of our tech on a handful of different search processes.
The results were exciting, but explaining them, and how AI-powered NLU achieves these results, requires a short discussion on how search engines process natural language.
How traditional search engines work – and why you deserve better
When you’re looking for something via a search engine, the engine starts with the query, or the words you’ve entered into the search bar. Traditional search engines work by parsing the query, that is, dividing it into independently searchable words. The engine then looks for these words in the database – also known as the primary documents – which has been similarly parsed into words and indexed. Based on how many exact matches it finds between the query and the primary documents, the search engine will prioritize the results and display them.
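To make the parse, index, and match steps concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not how any particular search product is built: the function names (`tokenize`, `build_index`, `search`) and the three sample maintenance notes are invented for this example, and ranking is reduced to counting distinct exact word matches.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into independently searchable words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many distinct query words they match exactly."""
    scores = defaultdict(int)
    for word in set(tokenize(query)):
        for doc_id in index.get(word, ()):
            scores[doc_id] += 1
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical primary documents, made up for illustration.
documents = {
    "note-1": "Engine loses power under load, replaced the turbo hose",
    "note-2": "Slow acceleration and a whining noise, belt was slipping",
    "note-3": "Customer reports no power and a whine at startup",
}

index = build_index(documents)
results = search(index, "wheeee sound and no power")
print(results)
```

In this toy data, "note-2" describes slow acceleration and a whining noise, yet it matches the query only on the word "and", because none of its meaningful words appear in the query verbatim.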
Given this information, let’s consider what a traditional search engine doesn’t take into consideration. Without a robust understanding of natural language, the engine can’t find anything it hasn’t explicitly been told to look for, meaning if there’s a misspelling, a tangential relationship, or some new or domain-specific word that hasn’t been indexed, your results will be incomplete.
Searching without AI-powered NLU
Let’s start with a relatable example. Suppose your SUV recently started making a funny noise and accelerating more slowly than is typical. You take it to the shop and tell the mechanic that it “suddenly started making a ‘wheeee’ sound and doesn’t have the power it used to”. Unsure of the cause, your mechanic searches for your description of the problem in their database of past problems and repairs.
However, without any augmentation, your mechanic will likely be frustrated by the search results they see. For example, what you described to the mechanic as “power” can also be described as “acceleration” … and by many other synonymous words. Not to mention, there are dozens of ways to spell an onomatopoeic word like “wheeee”, but unless someone has manually pre-programmed the search engine to connect synonyms, related words, and alternate spellings … the search engine won’t find them.
Additionally, the most likely fix for your car will be found in a separate forum that mechanics use to crowdsource car problems and solutions. In the past, these related documents haven’t been accessible. Search engines are starting to address this issue by building data connectors and ingesting this intelligence into the search function. However, this actually increases the complexity of the language used in the search process, since crowdsourced data from other mechanics will have countless ways to express the same concepts.
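The manual pre-programming described above can be pictured as a hand-maintained synonym table. The sketch below is hypothetical: the `SYNONYMS` table, the helper names, and the sample note are invented for illustration. It shows that a query is only rescued when someone has already listed its exact words and spellings.

```python
import re

# Hand-maintained table: every synonym and spelling must be entered by a person.
SYNONYMS = {
    "power": {"acceleration", "pickup"},
    "wheeee": {"whine", "whining", "squeal"},
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_query(query, synonyms=SYNONYMS):
    """Return the query words plus any synonyms listed for them."""
    expanded = set()
    for word in tokenize(query):
        expanded.add(word)
        expanded |= synonyms.get(word, set())
    return expanded


def matches(query, document):
    """Return the expanded query words that appear verbatim in the document."""
    return expand_query(query) & set(tokenize(document))


note = "Slow acceleration and a whining noise, belt was slipping"

listed = matches("wheeee sound, less power", note)    # spellings in the table
unlisted = matches("wheee sound, less oomph", note)   # one "e" fewer, new slang
print(sorted(listed), sorted(unlisted))
```

The first query finds the note through "acceleration" and "whining". The second describes the same problem but matches nothing, because "wheee" and "oomph" were never added to the table.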
The limitations of current search engines
What does this mean for users reliant on current solutions? Without the use of AI-powered NLU, a search engine will miss as much as 80% of the relevant solutions to your query.
Unstructured language is among the most dynamic and complex types of data. When speaking, humans express the same concepts, objects, and topics using a variety of word choices. When listening, our brains automatically associate the words we hear with related concepts and objects – an innate, common-sense understanding we don’t need to constantly retrain. For example, “broken” and “not working” are synonymous, and a “spark plug” and “ignition” are related. For humans, this is default knowledge, but computers have no understanding of basic facts about the world unless we train them.
Current search engine providers can only surface relevant, nuanced results by manually amending indexing – steps that are iterative and must be constantly monitored to stay up-to-date – or through the construction and maintenance of a static model with thousands of specific rules connecting synonyms and related terms in every language and across every use case.
Using AI-powered NLU to enhance search
AI-powered NLU aims to eliminate the need for constantly training systems across all datasets by providing the understanding and common sense necessary to recognize synonymous, conceptual, and contextual relationships.
At Luminoso, our AI-powered NLU uses unsupervised machine learning to automatically understand synonymous and conceptual relationships in context. This means our solutions are generalized, surfacing domain-specific results in all 15 languages we natively support, across all use cases.
In our car repair example, this means that a search engine augmented by Luminoso would automatically know that when you’re talking about a car, “broken” is synonymous with “busted” and “not working”, but not “humiliated” or “lazy”, and a “transmission” controls your gears but isn’t the communication of a message or a disease.
The results of Search Enhancement with Luminoso
When we explored expanded search capabilities with our customers, those who deployed Search Enhancement with Luminoso showed a range of significantly improved results:
- Holding other variables constant and expanding only primary and related documents with Luminoso produced up to five times more “top matches”;
- When we also expanded the query with Luminoso, the search engine returned many more related results, significantly improving the search quality in most situations;
- Search results on both large and small datasets improved considerably;
- The weighting of search results improved without making any other modifications to the search engines;
- The benefit from expanding both the primary documents and any related documents increased even when the primary documents were less descriptive;
- Traditional search engines perform relatively poorly when applied to datasets using kanji or Chinese characters, but expanding these datasets with Luminoso yielded much better search results; and,
- Luminoso’s fluency with misspellings, slang, and domain-specific acronyms further enhanced performance.
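The difference between expanding documents and expanding the query can be sketched in a few lines of Python. This is only an illustration of the general idea and not Luminoso's implementation or API: the `RELATED` lookup is a tiny hand-written stand-in for what an NLU model would supply, and every name and sample string here is an assumption made for the example.

```python
import re

# Hypothetical stand-in for an NLU model: a tiny hand-written lookup.
RELATED = {
    "whining": {"whine", "wheeee"},
    "acceleration": {"power"},
    "wheee": {"wheeee", "whine"},
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def related_terms(word):
    """Return terms related to a word (empty set if none are known)."""
    return RELATED.get(word, set())


def expand(text):
    """Return the words in the text plus their related terms."""
    words = tokenize(text)
    for word in list(words):
        words |= related_terms(word)
    return words


def overlap(query_terms, doc_terms):
    """Count the terms shared by the query and the document."""
    return len(query_terms & doc_terms)


doc = "Slow acceleration and a whining noise"
query = "wheee sound, less power"

baseline = overlap(tokenize(query), tokenize(doc))   # no expansion
docs_expanded = overlap(tokenize(query), expand(doc))  # documents only
both_expanded = overlap(expand(query), expand(doc))    # documents and query
print(baseline, docs_expanded, both_expanded)
```

With this toy lookup, the unexpanded query shares no words with the note, expanding the note alone yields one shared term, and expanding both sides yields three.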
What are the use cases for Search Enhancement?
The better question may be, Where aren’t search engines used these days? But to name a few examples of how our customers are using AI-powered NLU to augment search:
- Finding the best way to repair equipment in millions of global maintenance notes, accumulated over years
- Trouble ticket resolutions for customer service organizations
- Retail in-store and online product searches
- e-commerce product searches
- FAQ-matching for employees
How important is the NLU-expanded search to the bottom line?
To answer this question, I interviewed a leading US-based retailer that conducted A/B testing of its search engine with and without Search Enhancement. The retailer identified three primary benefits:
- Dynamic results: Because of the unsupervised AI Luminoso uses, the retailer avoided tagging and training static models, relying instead on the tech to discover the best search results, dynamically, saving enormous amounts of time;
- Higher quality results: The search engine surfaced important but not obvious product features, and their synonyms, that are helpful to customers’ buying decisions;
- Supports customer-to-customer recommendations: Through the humanlike search of product reviews, customers could search for product recommendations directly from customer reviews, allowing the retailer to remove itself from the recommendation process when so desired – and reap the network benefits.
These search improvements tap into the full value of product reviews from other customers and increase the conversion from search-to-sale. But what about fake or computer-generated reviews? Will these undermine customer trust? It turns out that our unsupervised AI mitigates this risk by surfacing statistically impossible concentrations of concepts inherent in fake reviews, allowing the retailer to protect the authenticity of customer-provided product reviews.
For this retailer – and many others like them – Search Enhancement with Luminoso can improve their bottom line by increasing sales and reducing costs.
Search like a human
Many of you will remember the Tappet Brothers (also known as Click and Clack), who for 37 years diagnosed car problems and suggested resolutions during their radio call-in show. Their extraordinary knowledge of every car model and its quirks enabled the team to give accurate advice after only a minute or two of discussion with the car owner – all while making us laugh along the way.
What made the Tappet Brothers impressive was their encyclopedic car knowledge, not their ability to search for this knowledge in their memory banks. Understanding concepts in context is an unremarkable human process available to all of us. Search engines, in contrast, start with an encyclopedic amount of data, which is an unremarkable product of our digital age. Understanding concepts in context for a search engine, however, was a previously insurmountable problem. Until now.
Experience the power of natural language understanding for search. Get started with Search Enhancement in Luminoso today.