
From Black Box to Green Glass: The Responsible AI Imperative  

Expert.ai Team - 26 August 2022

Recently, Walt Mayo, CEO of expert.ai, led our weekly NLPStream, where he spoke about the elements required for responsible AI.

AI Hype?  

Hype has surrounded AI for over a generation, but AI for the sake of AI is just keeping up with the Joneses. When implementing any AI solution, make sure your technology and business teams have clear goals and committed partners who are mutually accountable for delivering on its promise.

Natural language is a distinct area within AI, and the natural language (NL) arena itself is highly fragmented, spanning many different functional areas.

The expert.ai Platform is based on the principle that no single natural language AI technique is a fit for every project. Analysts have asserted that combining different AI techniques solves a wider range of business problems.

Other platforms force a machine learning-only approach to natural language, which has proven over time to be difficult (and costly).

The Meaning Behind Responsible AI  

There is no single, ironclad definition of responsible AI. But as the technology matures, and as customers become more attentive to issues related to the environment, social responsibility, equity and privacy, there is a growing acceptance and understanding that AI must include the following elements:

  • Transparency (also expressed as explainability or accountability)  
  • Fairness (also expressed as unbiased/non-marginalizing)
  • Data integrity  
  • Low carbon footprint  

Intrinsic to expert.ai's language AI solution is the ability to address all of the elements above.

“As a business owner, responsible AI means if the technology is helping you to make a better decision, you should understand how it’s helping you.”  


Compute-Intensive Models Lack Transparency

Today, we’re seeing major investments by some of the world’s largest tech companies in large language models (LLMs).  

These models are trained on massive data sets (terabytes or even petabytes of text) with billions of parameters. They are often used to generate text for writing articles, engaging in conversation, question answering and text summarization.
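To make one of these use cases concrete, here is a minimal summarization sketch using the open-source Hugging Face transformers library; the library and default model are our illustration, not tools named in the talk.

```python
# A minimal summarization sketch using the Hugging Face "transformers"
# library (pip install transformers). The default model it downloads is a
# distilled BART variant -- small by LLM standards, but the interface is
# the same one used with far larger models.
from transformers import pipeline

summarizer = pipeline("summarization")

article = (
    "Large language models are trained on massive text corpora and have "
    "billions of parameters. They can draft articles, hold conversations, "
    "answer questions and summarize documents, but how they arrive at a "
    "given output is difficult to inspect."
)

result = summarizer(article, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```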

While these models have produced some astonishing results, the sheer scale and complexity of the data involved, as well as the technology approach itself (machine learning techniques based on statistics and pattern recognition), make these results almost uninterpretable.

In other words, we don’t know how the algorithm arrived at a result. This is what is meant by a black box.  

“…if somebody asks you how your technology produced a given outcome, you could reasonably explain it.”
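By contrast, a symbolic, rule-based approach can report exactly which rule produced a given outcome. The toy classifier below is our own illustration of that property, not expert.ai's actual technology:

```python
# A toy rule-based classifier that returns not just a label but the explicit
# rule that fired -- the traceability the quote describes. Purely
# illustrative; real symbolic NL systems are far richer than keyword rules.
RULES = [
    ("contains 'refund'", lambda t: "refund" in t, "complaint"),
    ("contains 'thank'",  lambda t: "thank" in t,  "praise"),
]

def classify(text: str):
    t = text.lower()
    for name, test, label in RULES:
        if test(t):
            return label, f"matched rule: {name}"  # fully explainable outcome
    return "neutral", "no rule matched (default)"

label, explanation = classify("Thank you for the quick reply!")
print(label, "--", explanation)  # praise -- matched rule: contains 'thank'
```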

Large Language Models: Data and Bias  

The very nature of these LLMs creates several areas of concern for companies who are considering integrating them into their products or applications. In addition to the lack of explainability and transparency in how these models operate, there is the problem of data integrity and, as a result, bias and a lack of fairness.  

How does this happen? To access the massive amounts of data that language models need, there's only one place to turn: the internet. LLMs are trained on publicly available text scraped from the web which, as we all know, is filled with toxic content. So, when these LLMs are used to generate natural language in some form, it's not surprising to see some of this toxic language regurgitated.
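One common mitigation is to filter scraped text before training. The sketch below is deliberately naive (real pipelines typically use trained toxicity classifiers rather than keyword blocklists):

```python
# A deliberately naive pre-training data filter: drop any scraped document
# containing a blocklisted term. A keyword blocklist is only a coarse first
# pass; production pipelines rely on learned toxicity classifiers.
BLOCKLIST = {"slur1", "slur2"}  # placeholder terms, not a real blocklist

def is_clean(document: str) -> bool:
    words = set(document.lower().split())
    return words.isdisjoint(BLOCKLIST)

scraped_corpus = [
    "A helpful article about gardening.",
    "A post containing slur1 and other toxic language.",
]

training_corpus = [doc for doc in scraped_corpus if is_clean(doc)]
print(len(training_corpus), "of", len(scraped_corpus), "documents kept")
```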

“It gets into this broader responsibility to ensure that your technology does things reliably that are useful, and it doesn’t do things that have meaningful unintended consequences.”  

Not only does the initial processing and crunching of so much data introduce bias; the model also requires enormous amounts of data and repeated training. In fact, each of these areas presents challenges for bias and contributes to the high carbon footprint that such models are known for (see the back-of-the-envelope sketch after this list):

  • Data: Large language models require massive amounts of data, and because we're not talking about homogeneous information, there is a great deal of complexity, which requires even more processing.
  • Processing: Specialized processors for deep learning (GPUs) are expensive. This is a boon to cloud computing providers, who can charge a premium for these services.
  • Training: Training is how an LLM learns from a sample dataset, and it can take weeks to train a model. Tweaking results to improve performance or avoid bias, or adding new data, requires additional training, consuming compute and energy resources each time.
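As a back-of-the-envelope illustration of why these three items add up, here is a sketch that estimates the energy and emissions of one training run; every constant in it (GPU count, hours, power draw, grid intensity) is an assumed placeholder, not a measured figure:

```python
# Back-of-the-envelope training-footprint estimate. Every constant below is
# an assumed placeholder for illustration, not a measurement.
NUM_GPUS       = 512      # assumed cluster size
HOURS          = 24 * 14  # assumed two-week training run
WATTS_PER_GPU  = 300      # assumed average draw per GPU
PUE            = 1.5      # datacenter overhead (cooling, networking, ...)
KG_CO2_PER_KWH = 0.4      # assumed grid carbon intensity

energy_kwh = NUM_GPUS * HOURS * WATTS_PER_GPU / 1000 * PUE
co2_kg     = energy_kwh * KG_CO2_PER_KWH

print(f"Energy: {energy_kwh:,.0f} kWh, CO2: {co2_kg:,.0f} kg")
# Re-training after adding or re-labeling data repeats this entire cost.
```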

“As you’re going back to try to obtain results…you’re repeating the cycle…in an inefficient way, because typically what you’re having to do is either add more data to it or label a lot more data.”  


The Carbon Footprint of Compute-Intensive Models

The other thing about these large models? All of the processing and training they require is extremely compute-intensive. An often-cited study found that training a single LLM emitted nearly 300,000 kg of carbon dioxide. This is the equivalent of 125 round-trip flights between New York and Beijing, and five times the lifetime emissions of the average American car.

This is a problem for obvious reasons, especially for companies that have pledged to be net zero by 2040 (or sooner), not to mention the European Union's 2030 Climate Target Plan, which aims to cut greenhouse gas emissions by at least 55% by 2030, and other moves toward climate-related disclosures.

“When we hear about the carbon footprint of AI, most often, it’s referring to the work of these complex, billion-parameter models…we don’t really know how they are working [back to transparency issues, and the cycle continues].”  

Fortunately, more energy-efficient approaches to AI don't require significant compromises on model quality. In many cases, they can reach a sufficient level of accuracy using far smaller data volumes and much less energy. An example of what can be achieved? Researchers at Accenture Labs found that training an AI model on 70% of the full dataset reduced its accuracy by less than 1% but cut energy consumption by a staggering 47%. And as the AI community grapples with its environmental impact, some conferences now ask paper submitters to include information on CO2 emissions.
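Here is a minimal sketch of the subsampling idea behind the Accenture result, using scikit-learn on synthetic data (the dataset, model and 70% split are our illustration; energy use roughly tracks the amount of data processed):

```python
# Sketch of data subsampling: train on 70% of the data and compare accuracy
# with training on all of it. Synthetic data and logistic regression are
# illustrative stand-ins; less data processed means less compute and energy.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

full = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Keep only 70% of the training data and train again.
X_sub, _, y_sub, _ = train_test_split(X_train, y_train,
                                      train_size=0.7, random_state=0)
sub = LogisticRegression(max_iter=1000).fit(X_sub, y_sub)

print(f"full data accuracy: {full.score(X_test, y_test):.3f}")
print(f"70% data accuracy:  {sub.score(X_test, y_test):.3f}")
```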
  

A Framework for Responsible AI  

For a business owner, responsible AI means that if the technology you're using is helping you make a better decision, you should understand how it's helping you. And if you're evaluating AI, you need to know how it will produce the useful results you seek.

As Walt said, “Think about what is the most thoughtful way to achieve your business objective and then from there, generally speaking, you’re likely to find something that is going to be more efficient and elegant. Often the two go together, a simple, elegant solution that works well.”  

“The kind of overarching framework we have is that we have accountable, efficient technology that reliably does useful things.”  

For more information, discover the expert.ai Platform.
