
STOP F*cking Up AI Projects (Avoid these 5 pitfalls)

A lot of folks say hindsight is 20/20, but personally, I love post-mortems. Seeing clearly how my team screwed up in the past teaches me how to avoid creating the same messes in the future. According to Forbes, 85% of all AI projects fail. Knowing many of the ways the project might go wrong and being able to spot them early is what makes Greg a UX for AI expert. And speaking of messes…

A Boiling Pot of Spaghetti

Imagine, if you will, a complex industrial process akin to a giant pot of spaghetti boiling on a stove. If you increase the temperature, the spaghetti will be done faster, and you can cook more pasta in 24 hours. However, turning up the burner also increases the risk of the pot boiling over, creating a sticky spaghetti mess all over the stove. A boil-over requires shutting down the cooking and deploying an expensive industrial cleaning process, ruining a fun day of boiling pasta and earning you an encounter with an angry boss.

One industrial supplier of these giant “industrial pasta pots” (who shall remain nameless) thought of a brilliant solution: they could use AI to predict when the pot was about to boil over. They invested over six months of work by my team of seven well-paid Data Science/Dev/UX professionals in trying to make it happen, but sadly, the entire project was a complete and utter failure.

Their loss is now your gain. 

I shall break down the reasons why this particular AI project failed into 5 principles. You can copy these down and tape them above your monitor so you remain aware of possible ways for the project to fail as you work on your own AI-driven projects:

1. Don’t try to replace an expert with AI

It did not take my team long to figure out that every “industrial pasta pot” (worth millions of dollars) was operated by a dedicated expert technician specifically trained in maintaining the right level of boil to achieve a good yield and avoid boiling over. If the technician saw the level of liquid rise rapidly, they would lower the heat, avoiding over-boil. After a short time, these technicians became experts in avoiding over-boiling on this specific pot installation. 

Our team theorized that our AI solution would replace these technicians, rendering them unemployed and saving the company money in the process. While this strategy sounds bullet-proof in theory, in reality, the idea was akin to trying to block bullets with a wet tissue.

To begin with, these technicians, despite being experts, were not especially highly paid, and our AI solution would, out of the gate, cost the customer much more than the technician did. The AI we were selling them was not trained on their specific pot installation. (In fact, it was not trained at all, since we could not get the data for ML training; see point 3 below.) So there was no possibility of the AI performing as well as the technician, while in actuality it cost more.

TL/DR: Any time your AI solution tries to replace an existing installed expert process operator, take care! This is a huge red flag, and the likelihood of your project failing goes way up. If your AI solution costs more than the installed expert, don’t just walk away. Run.  

Here’s a detailed guide on how to pick the right AI use case for your AI project: https://www.uxforai.com/p/how-to-pick-an-ai-use-case 

2. Don’t forget to review Cost vs. Benefit (Value Matrix)

Before engaging in an AI-building exercise, take the time to understand the cost/benefit profile of your use case. Every AI action is essentially a prediction with some probability of being right or wrong, so every prediction carries a certain cost and benefit. Our project team failed to quantify the cost/benefit of this project before trying to develop the AI solution.

Don’t make the same mistake.

While avoiding over-boiling was relatively easy (just lower the heat), the cost impact of an over-boiling event was very high. In fact, the cost of just a single over-boiling of the pot was several times higher than the yearly salary of the expert operator. Thus, to justify the cost of the installation, the AI solution needed to be ridiculously accurate at avoiding over-boiling, because a single false negative (failing to catch an over-boil) would wipe out all of the profits from a full year of successful prevention. That is a cost asymmetry of roughly 1000:1 against the AI: it takes 1,000 correct true-negative calls to pay for a single false negative.
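The 1000:1 asymmetry can be sketched as a tiny value-matrix calculation. The dollar amounts below are purely hypothetical (only the ratio comes from this story), so treat this as an illustration of the reasoning, not the project’s actual numbers:

```python
# Hypothetical value-matrix sketch of a 1000:1 cost asymmetry.
# BENEFIT_PER_CORRECT is an assumed dollar value; only the ratio is real.
BENEFIT_PER_CORRECT = 100            # value of each correct "no boil-over" call
COST_PER_MISS = 1000 * BENEFIT_PER_CORRECT  # one missed boil-over costs 1000x

def net_value(n_correct, n_missed):
    """Net value of a batch of AI predictions under the skewed value matrix."""
    return n_correct * BENEFIT_PER_CORRECT - n_missed * COST_PER_MISS

# A full year of correct calls is exactly wiped out by one false negative:
print(net_value(1000, 0))  # 100000
print(net_value(1000, 1))  # 0
print(net_value(999, 1))   # -100 (net loss)
```

Run the numbers like this for your own use case before writing a line of model code: if one wrong guess erases a year of right ones, the required accuracy is probably unattainable.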

This skewed cost/benefit profile made it very hard to convince customers to replace their proven, installed, and trained solution (a low-cost, full-time expert pot operator) with an expensive, unproven, and untrained AI solution.

As an additional disincentive to the adoption of AI in this case, our company refused to cover the cost of over-boiling caused by a faulty AI guess. This made the whole thing a complete non-starter. 

TL/DR: If the potential cost of a wrong AI guess far exceeds the benefit of a correct AI guess, walk away. If the cost of a bad AI guess is catastrophic, run. 

Here’s a detailed walk-through on how to do your own complete cost-benefit analysis with a Value Matrix: https://www.uxforai.com/p/ai-accuracy-bullsht-heres-ux-must-part-1 

3. Make sure you have the ML training data 

While our company made pots, it did not use the pots; only our customers did. This made collecting data challenging from the start. The high cost of each pot meant that only a few thousand pots were installed worldwide—not enough to automatically collect generalized ML data. 

What made things even worse was that every pot installation was a little different: different pipes, different heat sources, slight variations in atmospheric temperature, pressure, humidity, rate of flow, fans, and the like made each installation bespoke. (In the same way that my boiling a pot of pasta on my own stove tells you nothing about the conditions of boiling pasta on your stove, even if we use the same pot!) The AI model from installation A could not be used in installation B. This meant that every pot required its own custom AI system.

TL/DR: If you do not have the data to train your AI/ML or have no easy, cheap way to obtain the data, walk away. If your solution requires a custom AI model for each installation, run.

Here’s a detailed discussion about how to spot the bias in your ML training data: https://www.uxforai.com/p/transforming-ai-bias-into-augmented-intelligence 

4. Make sure your AI model is answering the right question

While the lack of data alone should have killed the project, the question we were modeling with AI sealed its demise.

The human operator was tasked with answering the question: how high can I make my temperature before the risk of boiling over is too great?

In contrast, the AI model my team was building was trying to answer a different question: given the measurement of temperature and liquid level, how long do we have until the next boil-over event?

Now you can see the problem: the operator’s question was aimed at increasing the customer’s profit because, as you recall, more heat meant more cooked pasta at the end of the day.

In contrast, AI was trying to answer a question that was related to operations but not necessarily directly aimed at increasing profits. It was, however, a convenient question for our model to answer, and so my employer decided that was good enough. 

It wasn’t.

My team was akin to a man in that famous joke about looking for keys:

It’s the middle of the night, and a man is on his hands and knees underneath a streetlight, looking for something.  A passerby stops to help: 
Passerby: “What did you lose?”
Man: “My keys.” 
Passerby: “Where did you lose them?” 
Man: “Over there in the bushes.”
Passerby: “Then why are you looking for them here?”
Man: “Because here, under the streetlight, I can see what I’m doing!”

TL/DR: If your AI model is trying to answer a question not directly related to maximizing profits but instead is answering a data science question, walk away. If your team insists on looking under a streetlight only because it’s the only place they can see what they are doing, run.

Here’s a detailed write-up about modeling inputs and outputs with a Digital Twin so you can aim your AI to answer the right question that your customer actually cares about: https://www.uxforai.com/p/digital-twin 

5. Make sure you can conduct user research

What completely made the project a non-starter was that each of these 1000+ plants where my company’s pots were installed was in a remote area, not readily accessible for user research. As a result, our team made all kinds of assumptions, most of them wrong, that could have been cleared up within an hour of seeing the situation for ourselves.

For example, we were told by a subject-matter expert (SME) that the only sensors and controls available to AI for modeling were: 

  1. Liquid level sensor

  2. Temperature sensor

So, using those two sensors, our AI was trying to answer the question: given the measurement of temperature and liquid level, how long do we have until the next boil-over event?

Anyone who has ever boiled a covered pot of pasta can tell you that over-boiling is not a gradual event—it is fast, explosive, and very messy. The best way to avoid over-boiling the pasta is to look at the surface of the liquid, not at the liquid-level measurement.

After many months of toiling at the problem and failing, we discovered that the human operator had an additional sensor: they could look through a small glass window onto the boiling surface of the pot and visually ascertain how the boiling was performing. 

Sure enough, the company’s SME “knew” this, but he thought it was not important to us because, of course, the visual of the boiling surface could not be easily instrumented with a sensor. The other two sensors (liquid level and temperature) produced convenient numbers that you could measure and feed into the AI. The visual of the boiling surface was messy—in other words, it was exactly the kind of input human experts are good at interpreting, and machines are, well, terrible at.

TL/DR: If you do not have a well-run research program that will help you connect directly with your customers, walk away. If you cannot conduct even a single in-person, on-site interview with your target customers, run.

More about conducting research for AI: https://www.uxforai.com/p/ai-and-ux-research 

In Conclusion

Hindsight is 20/20 — looking backward allows you to see clearly all the ways your team screwed up. While it’s uncomfortable, this learning is essential if we aim to improve and avoid the same mistakes in the future. I hope that by reading about our mistakes here, you can avoid making some of your own.

The most important strategic takeaway is that, in most cases, the more your AI solution replaces the expert, the higher the potential cost of a false positive/false negative. However, the more your AI augments your expert and acts as a “blind-spot indicator,” the lower the impact of the potential screw-up. Augmenting the human allows the AI to continuously learn from the environment and get better over time. That is why I counsel my teams and my students to think of AI not as “Artificial Intelligence” but as “Augmented Intelligence.” Thinking this way will improve your batting average considerably, up from the measly 15% of AI projects that are actually successful.

On September 9th, Daria and I will be teaching a fabulous full-day workshop aimed at making your next UX for AI project a success. It’s full of hands-on exercises to select your use case, build your Digital Twin, set up your own Value Matrix, and try out user testing with a lean paper prototype. We guarantee that attending this workshop will improve your batting average and vastly increase your chances for success in your next AI-driven design project.

Get your ticket here: https://strat.events/usa/ 

Cheers,

Greg Nudelman and Daria Kempka (Contributing Editor)

P.S. Our full-day hands-on UX for AI workshop at UXSTRAT on September 9th will sell out like our previous workshops at UXSTRAT 2023, UX Copenhagen, Rosenfeld Media online, and UXLx in Lisbon. To secure access to the practical techniques you will need to succeed in your next AI-driven project, get your ticket now: https://strat.events/usa/

See you there!
