Why should Olympic athletes have all the fun when it comes to setting records and pushing the boundaries of what’s possible? Here at Luminoso, we’re constantly researching new ways to help our clients more quickly, easily, and accurately understand their customers. With that, we are proud and excited to unveil the next generation of the underlying technology in our text analytics software.
So what’s the big deal?
This technology upgrade offers two key benefits: even greater accuracy in interpreting the nuances of how people speak, and an improved ability to understand uncommon words.
Specifically, Luminoso’s software can now automatically identify synonyms, misspellings, and acronyms with more precision than it could before – though we should add that our researchers had already set the bar quite high! For example, with most text analytics systems, analysts must anticipate that there are many synonyms for “employees” and that words like “knowledgeable” and “courteous” are often misspelled, and must program the software to take this into account. Luminoso’s software understands this automatically and with a higher degree of accuracy.
In addition, our software now automatically interprets the meanings of rare words at a higher rate … to be specific, 16% higher than the previous industry record. As a result, Luminoso is now even more suitable for highly technical, industry-specific, or acronym- and slang-heavy datasets – and you can spend that much less time deciphering incomprehensible tweets with Urban Dictionary.
Wait… how did you do this?
… great question. It took a lot of late nights, some intricate coding, and unreasonable amounts of caffeine. But long story short: we improved the way that computers, and our software in particular, interpret and understand the world.
If that’s not enough of an explanation for you, then let’s get (a little) technical!
Luminoso’s proprietary methodology relies in part on ConceptNet, an open-source knowledge base developed in the MIT Media Lab that helps computers understand language in the same way that humans do.
When people communicate, we bring to the table our basic knowledge of how the world works. For example, we know that the sun is hot, water is wet, and that most people spend way too many hours of their day watching cat videos. (Guilty as charged.) We don’t need to explain these things when we’re communicating with another person because we assume that they also understand. However, computers don’t know these basic facts unless we teach them – and this is exactly what ConceptNet does.
But why is this important? How does teaching computers about the world improve my text analytics results?
ConceptNet is important to our methodology because it increases accuracy and prevents computers from drawing incorrect or illogical conclusions about text that a human with common sense would never draw. This reduces the need to have a person devote precious hours to checking and verifying text analytics outputs, as they had to in the past.
ConceptNet has been delivering great results that are on the cutting edge of text analytics methodologies, but we wanted to see if we could go even further. And we did!
We integrated ConceptNet with similar systems that have been developed across the world – specifically GloVe, word2vec, and PPDB. The goal was to take the best capabilities of each system and unite them in order to create a system that performed better than any of the individual parts. After exhaustive testing, benchmarking, and yet more caffeine, we were delighted to find that we succeeded!
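For the curious, here is a rough idea of what "uniting" these kinds of systems can look like. This post doesn't spell out the exact method, so treat the following as a minimal sketch under our own assumptions rather than Luminoso's actual implementation: it averages word vectors from two hypothetical embedding sources, then nudges each word toward its neighbors in a small knowledge graph (a retrofitting-style step). The toy vectors, the single "employee"/"staff" link, and the function names `merge_sources` and `retrofit` are all invented for illustration.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def merge_sources(source_a, source_b):
    """Average unit vectors word by word (assumes both sources share a dimension)."""
    merged = {}
    for word in set(source_a) | set(source_b):
        vecs = [normalize(s[word]) for s in (source_a, source_b) if word in s]
        merged[word] = normalize([sum(col) / len(vecs) for col in zip(*vecs)])
    return merged

def retrofit(vectors, edges, iterations=10):
    """Pull each word halfway toward the mean of its knowledge-graph neighbors."""
    neighbors = {word: set() for word in vectors}
    for a, b in edges:
        if a in vectors and b in vectors:
            neighbors[a].add(b)
            neighbors[b].add(a)
    current = {word: list(vec) for word, vec in vectors.items()}
    for _ in range(iterations):
        for word, nbrs in neighbors.items():
            if not nbrs:
                continue  # words with no graph links keep their merged vector
            dim = len(vectors[word])
            mean_nbr = [sum(current[n][i] for n in nbrs) / len(nbrs) for i in range(dim)]
            current[word] = [(o + m) / 2 for o, m in zip(vectors[word], mean_nbr)]
    return current

# Toy data (invented): two embedding sources and one "these are related" link.
source_a = {"employee": [1.0, 0.0, 0.0], "staff": [0.0, 1.0, 0.0], "invoice": [0.0, 0.0, 1.0]}
source_b = {"employee": [0.9, 0.1, 0.0], "staff": [0.1, 0.9, 0.0], "invoice": [0.0, 0.1, 0.9]}
knowledge_edges = [("employee", "staff")]

merged = merge_sources(source_a, source_b)
combined = retrofit(merged, knowledge_edges)

before = cosine(merged["employee"], merged["staff"])
after = cosine(combined["employee"], combined["staff"])
print(f"employee/staff similarity before: {before:.2f}, after: {after:.2f}")
```

In this toy setup, the knowledge-graph link pulls "employee" and "staff" closer together than the raw vectors alone would place them, while "invoice", which has no link, stays where it was. That is the general intuition behind combining several sources: each one fills in what the others miss.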
You can read the full results of the research by Luminoso co-founder and chief science officer Robyn Speer and awesome intern Joshua Chin here, and another blog post on the topic here.
When can I start taking advantage of this upgrade?
Immediately! This upgrade has been integrated into Luminoso’s products and is in place to help you understand what your customers are truly saying.