How Taxonomies Solve Your Data Discovery Problems (Part 2)

Expert.ai Team - 16 November 2022

This is Part 2 of a 3-part blog series on how to utilize taxonomies to leverage the value of unstructured language data in your organization. 

Whether it’s your employees or your customers, people need to find the right information when they need it. In fact, data discoverability is a high-stakes activity: content that is difficult to find can lead users to have less confidence in the relevance and usefulness of the content itself, defeating the purpose of your content in the first place. The bottom line is this: making your data easy to access is an enterprise imperative.

So, how can you make sure your users are able to find the precise information that they need? 

Enter the taxonomy. 

What Is a Taxonomy?  

In a business context, we can think of a taxonomy as a content management tool that allows users to easily find information on your website, your intranet or any other digital repository.

From a more technical perspective, a taxonomy is a system for classifying information into a hierarchical structure that matters to your business. It’s written as domain categories that are organized from broad to specific or from specific to generic, and it ties terms together so that we understand the logical connections and associations between them. Take COVID-19 for example: it’s a respiratory illness, and it belongs to the broader category of medical conditions. It’s also related to SARS-CoV-2, the virus that causes it. In the design of a taxonomy, these relationships are very important.
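
To make the idea of a hierarchy concrete, here is a minimal Python sketch of a taxonomy as a tree of categories. The Category class, the path_to helper and the category names are illustrative assumptions based on the COVID-19 example above; they are not part of any expert.ai product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a taxonomy: a name, child categories and related terms."""
    name: str
    children: list[Category] = field(default_factory=list)
    related: list[str] = field(default_factory=list)

    def add(self, child: Category) -> Category:
        """Attach a more specific category under this one and return it."""
        self.children.append(child)
        return child


def path_to(root: Category, target: str, trail: tuple[str, ...] = ()) -> tuple[str, ...] | None:
    """Return the broad-to-specific path from the root to the target category."""
    trail = trail + (root.name,)
    if root.name == target:
        return trail
    for child in root.children:
        found = path_to(child, target, trail)
        if found is not None:
            return found
    return None


# Hypothetical mini-taxonomy: broad category first, specific concept last.
medical = Category("Medical conditions")
respiratory = medical.add(Category("Respiratory illnesses"))
covid = respiratory.add(Category("COVID-19", related=["SARS-CoV-2"]))

print(" > ".join(path_to(medical, "COVID-19")))
# prints: Medical conditions > Respiratory illnesses > COVID-19
```

Keeping parent-child links separate from the "related" list mirrors the two kinds of relationships described above: where a concept sits in the hierarchy, and which other terms it is associated with.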

It’s useful to think of a taxonomy not as a fixed structure but as a living organism that evolves with your business. Your taxonomy can grow vertically with new categories, or horizontally, when new terms become associated with an existing term. Think of the words “cloud” and “tablet” and how their meanings have been extended to new domains over the past decade.   

The most important thing to keep in mind is that your taxonomy is organization-centric because it structures the knowledge for the domains that matter for your business.     

How Taxonomies Work with Your Data   

So, let’s look at how taxonomies solve the data discovery problems that we previously identified.   

First, they help you deal with ambiguity. A key goal for any classification system is to minimize ambiguity. To put it another way, it’s about minimizing the number of buckets that a given result can be put into, preferably to the point where there are no overlaps.

By its very nature, language is ambiguous. For instance, take the example of “gas” in the automotive domain in American English. The term “gas” is used both for the accelerator, as in “press the gas pedal,” and for the fuel that runs the car, as in “fill the gas tank.” This kind of ambiguity exists in different ways across all languages.

At expert.ai, our natural language understanding platform is designed to understand the context of the concept to link it to the correct product definition or categorical taxonomy.    

Next, taxonomies provide embedded domain knowledge that allows you to designate the meaning of certain terms. 

Thanks to this embedded domain knowledge, we can designate meaning: the “gas” in “my Jaguar eats gas” is linked to “gasoline,” while the “gas” in “step on the gas” is linked to “accelerator.” In the world of taxonomy design and implementation, this ability to accurately link concepts together is absolutely critical.
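
As a rough illustration of context-based disambiguation, the sketch below picks a sense for “gas” by counting cue words in the surrounding sentence. This is a deliberately simple keyword heuristic, not how the expert.ai platform works; the SENSE_CUES table and the disambiguate_gas function are hypothetical names made up for this example.

```python
from __future__ import annotations

import re

# Hypothetical context cues for each sense of "gas" in the automotive domain.
SENSE_CUES = {
    "gasoline": {"tank", "fill", "eats", "fuel", "pump", "gallon"},
    "accelerator": {"pedal", "step", "press", "foot", "floor"},
}


def disambiguate_gas(sentence: str) -> str | None:
    """Pick the sense of "gas" whose cue words overlap most with the sentence.

    Returns None when "gas" is absent or no cue word gives any evidence.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    if "gas" not in words:
        return None
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate_gas("My Jaguar eats gas"))  # prints: gasoline
print(disambiguate_gas("Step on the gas"))     # prints: accelerator
```

A real system would rely on much richer linguistic context than a handful of cue words, but the principle is the same: the surrounding words decide which taxonomy concept the term is linked to.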

Taxonomies support data discovery by bringing consistency to how your content is tagged and connected. In other words, taxonomies ensure that concepts are grouped under the proper category. For example, “salt” is a chemical compound, and it’s also known by a variety of other names (sodium chloride, NaCl, table salt, etc.) that subject matter experts may use during experiments.


When applied to all of the content in your organization, a taxonomy helps ensure that the many variants of the same concept are placed under the proper category.
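
One simple way to picture consistent tagging is a lookup that maps every known variant of a term to a single canonical concept and its category. The sketch below uses the salt example; the SYNONYMS and CATEGORY_OF tables and the tag function are illustrative assumptions, not an actual product API.

```python
from __future__ import annotations

# Hypothetical synonym table: every surface form maps to one canonical concept.
SYNONYMS = {
    "salt": "sodium chloride",
    "table salt": "sodium chloride",
    "nacl": "sodium chloride",
    "sodium chloride": "sodium chloride",
}

# Hypothetical mapping from canonical concept to its taxonomy category.
CATEGORY_OF = {"sodium chloride": "Chemical compounds"}


def tag(term: str) -> tuple[str, str] | None:
    """Return (canonical concept, category) for a term, or None if unknown."""
    key = " ".join(term.lower().split())  # normalize case and whitespace
    concept = SYNONYMS.get(key)
    if concept is None:
        return None
    return concept, CATEGORY_OF[concept]


print(tag("NaCl"))        # prints: ('sodium chloride', 'Chemical compounds')
print(tag("Table salt"))  # prints: ('sodium chloride', 'Chemical compounds')
```

Because every variant resolves to the same concept, two documents that use different names for the same thing end up with the same tag.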

Lastly, taxonomies can transform content into actionable intelligence. This is the result of being able to connect the dots, to link information and data together. When people can find relevant content more easily, they are also less likely to redo work that already exists.

The Benefits of Using Taxonomies  

When you’ve created an effective taxonomy, there are three main outcomes you can expect:   

  1. Users will be able to easily navigate your content. The tags you associate with your documents will allow users to find the documents they need and to navigate the content they contain. In addition, the taxonomy branches become facets that enable you to add elements like semantic search or related topic recommendations.   
  2. You can link content together. Applying your taxonomy to multiple content silos allows you to link different pockets of content together across document sources. This highlights the importance of consistency in tagging so that you are able to link different sources and databases together. This also allows you to repurpose your archives and give them a second life (and increase their value), thanks to the valuable insights they contain.
  3. Your content is made more valuable. Overall, a taxonomy makes your content more discoverable, supporting easier navigation, increased relevancy of search results and a personalized experience, thanks to content recommendations. 
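
The first two outcomes above can be sketched in a few lines of Python: an index from taxonomy tags to documents supports navigation by facet, and shared tags link documents across silos. The document IDs, sources and tags below are made up for illustration, and build_index and related are hypothetical helper names.

```python
from __future__ import annotations

from collections import defaultdict

# Hypothetical documents from three silos, already tagged with taxonomy concepts.
DOCUMENTS = [
    {"id": "wiki-12", "source": "intranet", "tags": {"gasoline", "fuel economy"}},
    {"id": "ticket-7", "source": "support", "tags": {"accelerator"}},
    {"id": "report-3", "source": "archive", "tags": {"gasoline"}},
]


def build_index(documents: list[dict]) -> dict[str, list[str]]:
    """Map each taxonomy tag to the IDs of the documents that carry it."""
    index: dict[str, list[str]] = defaultdict(list)
    for doc in documents:
        for concept in sorted(doc["tags"]):
            index[concept].append(doc["id"])
    return dict(index)


def related(doc_id: str, documents: list[dict]) -> list[str]:
    """Return IDs of other documents sharing at least one tag with doc_id."""
    tags = next(d["tags"] for d in documents if d["id"] == doc_id)
    return [d["id"] for d in documents if d["id"] != doc_id and d["tags"] & tags]


index = build_index(DOCUMENTS)
print(index["gasoline"])              # prints: ['wiki-12', 'report-3']
print(related("wiki-12", DOCUMENTS))  # prints: ['report-3']
```

Here an intranet page and an archived report are connected only because both carry the same "gasoline" tag, which is why consistent tagging across sources matters.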

Ready to create your own taxonomy? Our final post in this series has the design considerations and recommendations you need to build your own taxonomy: Building Your Taxonomy. Read Part 1 of the series here: The Data Discovery Challenge.