Luminoso wins multiple awards at SemEval 2017

February 2017 was an exciting month for us! Among other things, we came in first place in nearly every category we participated in at SemEval 2017, an annual international workshop on semantic evaluation. If those last two words have you scratching your head, read on. (If you and semantic vectors are old friends, check out our Chief Science Officer Robyn Speer’s more detailed post on the topic here.)

One of the first questions we’re asked when we talk about Luminoso’s software is “So do I have to provide the training data set?” or “What do you use to train the system?” We’re usually met with baffled expressions when we truthfully say, “We don’t; we use word embeddings.”

Many people are familiar with traditional methodologies for analyzing text-based data. You create a long list of keywords that might appear in the data, and then a computer hunts through the text and returns documents containing those keywords. Alternatively, you can train an algorithm to return documents matching certain criteria, but you need an initial dataset to train your algorithm, and the algorithm won’t know what to do with data that doesn’t closely match the training dataset.
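To make the keyword approach concrete, here is a minimal sketch in Python. The function name, the sample documents, and the keyword list are all made up for illustration; this is not how any particular product works.

```python
def keyword_search(documents, keywords):
    """Return the documents that contain at least one of the keywords."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for doc in documents:
        # Crude tokenization: split on whitespace and strip basic punctuation.
        words = {w.strip(".,!?").lower() for w in doc.split()}
        if words & wanted:
            hits.append(doc)
    return hits


# Hypothetical sample data.
docs = [
    "The espresso was cold.",
    "Great coffee, slow service.",
    "My invoice was wrong.",
]
matches = keyword_search(docs, ["coffee"])
```

Here `matches` contains only the second document. The espresso complaint is missed because the word "coffee" never appears in it, which is exactly the kind of gap a keyword list leaves.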

Introducing word embeddings

Over the past few years, a new approach called “word embeddings” has been developed to more quickly and accurately analyze text-based data. It’s gotten popular enough that people have started making memes about it, which is saying something.

Rather than matching a new dataset to a predefined list of keywords (the equivalent of doing a Command-F search in a Word doc), word embeddings turn language into mathematical vectors. Vectors that are similar to each other represent words or phrases that are similar or highly related to each other in a dataset.
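As a rough illustration of what "similar vectors" means, the sketch below compares toy vectors using cosine similarity, a common way to measure how closely two vectors point in the same direction. The three-dimensional vectors and the word choices are invented for this example; real embeddings have hundreds of dimensions and are learned from data, and we are not claiming this is the exact measure any specific system uses.

```python
import math

# Toy 3-dimensional vectors, made up for illustration only.
TOY_VECTORS = {
    "coffee": [0.9, 0.1, 0.2],
    "espresso": [0.8, 0.2, 0.1],
    "invoice": [0.1, 0.9, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def relatedness(word1, word2, vectors=TOY_VECTORS):
    """Look up two words and return the similarity of their vectors."""
    return cosine_similarity(vectors[word1], vectors[word2])


close = relatedness("coffee", "espresso")
far = relatedness("coffee", "invoice")
```

With these toy numbers, `close` comes out much higher than `far`, so "coffee" and "espresso" count as related even though the two words share no letters in common.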

These word embeddings aren’t created in a vacuum; there are a number of different sources for creating these vectors. The best-known are Google’s word2vec and Facebook’s fastText, but there are a number of others, including MIT’s ConceptNet, Stanford’s GloVe, and Luminoso’s proprietary ensemble of multiple systems.

How Luminoso uses word embeddings to analyze your data

Luminoso’s software starts out with its proprietary ensemble of systems that provide background knowledge about how the world works. In other words, our software has a good idea of what words mean before it sees a single sentence of your data.

Our software then reads through your data and refines its understanding of what words and phrases mean based on how they’re used in your data, allowing it to accurately understand jargon, common misspellings, and domain-specific meanings of words. This makes it easier to quickly understand data sets with lots of industry-, company-, or brand-specific terms.

SemEval 2017: Pitting Luminoso against everyone else

SemEval is a long-running evaluation of computational semantic systems, including word embeddings like the ones Luminoso uses. It does an important job of counteracting publication bias: most organizations will only publish the results of evaluations where their system performed well, and omit findings where it didn’t.

The evaluation organized by SemEval asks many groups to compete head-to-head on an evaluation they haven’t seen yet (think of a test with no practice questions or study guide), with results released all at the same time. When SemEval results come out, you can see a fair comparison of everyone’s approach, with positive and negative results.

This year’s evaluation was a typical word-relatedness task. You get a list of pairs of words, and your system has to assess how related they are, which is a useful thing to know in applications our clients ask for, such as text classification, search, and topic detection. The score is how well your system’s responses correlate with the responses that people give. Systems were evaluated on how well they rated pairs of words in a single language (comparing a German word to another German word) or across two languages (comparing a German word to a Farsi word).
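Here is a small sketch of what "correlating with the responses that people give" can look like. It uses Spearman rank correlation, which is our assumption for the sake of illustration; the exact scoring formula is not spelled out in this post. The human ratings and system scores are invented numbers.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical ratings for four word pairs (higher = more related).
human_ratings = [3.8, 0.5, 2.9, 1.2]
system_scores = [0.91, 0.12, 0.70, 0.35]
score = spearman(human_ratings, system_scores)
```

In this made-up case the system puts the four pairs in the same order as the human raters, so the correlation is at its maximum of 1. A system that ranked the pairs in the opposite order would score -1.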

Drumroll, please…

We are thrilled to announce that we outperformed all other evaluated systems in every category we participated in, save one (in which we came in third… still not too shabby!).

In the first task we were asked to complete, identifying the relatedness of words in a single language, Luminoso came in first place in English, German, Spanish, and Italian. We came in third in Farsi. The first- and second-place winners in that category were Farsi-only systems.

The second task was identifying word relatedness for word pairs where each word is in a different language. Luminoso won in every category SemEval offered, including English-Spanish, English-Farsi, German-Spanish, and Spanish-Italian, amongst others.

You can find the full list of who participated in SemEval 2017, along with our exact scores, here. If you have other questions about how Luminoso uses word embeddings, or the pros and cons of word embeddings compared to other systems, drop us a line; we’d love to hear from you!
