AI projects are not like building houses
How I "unstucked" a critical AI workstream. Part 4 of a series on getting value out of AI in the enterprise
After all the thinking and discovery work discussed in parts 1, 2 and 3 of this series, you’ll now learn how I helped fix the execution of this AI workstream.
This is the 4th and final part of a series on finding and executing on the most valuable AI use case in an F100 company:
Part 1: How I Found the Most Valuable AI Use Case in an F100 Company...with no industry experience
Part 2: (Case Study) Process Reengineering w/ AI
Part 3: Creating an AI Roadmap: An Algorithmic Approach
Part 4 is about execution. If you don’t get this part right, none of the other stuff will matter. Your roadmap will remain a beautiful dream.
AI projects are among the hardest in technology to do well because so much can go wrong. While I would never write this on my resume, most of the projects I’ve worked on never went anywhere. I dare you to find a data scientist for whom this isn’t true!
In this letter, you will learn:
why AI projects are so difficult
how our project went wrong
how I convinced the team to change course using a very simple “sales” process
AI projects are not like building houses
The biggest mistake most teams make when executing AI projects is treating them like software projects. In some ways, building software is like building a house. You get a spec and know you can execute it. You may not know exactly what it’ll be worth when it’s done. Unexpected costs can hit you: maybe you find a giant rock while laying the foundation that has to be removed or built around. But, whatever obstacles you encounter, you at least know it’s possible to build this house.
AI projects are like trying to build a house with a brand-new kind of wall material. Maybe the walls are really thin AND bulletproof. Maybe they change color at the push of a button, like a chameleon. It’s not just engineering, architecture and interior design. It’s research. Research has non-deterministic timelines.
No matter how many times a methodology has been used, all AI projects start as research projects, because every project has its own data and little idiosyncrasies.
In research, you don’t know if something will work until you try it. Getting people to pay for research is exceedingly difficult, because, in the end, it might not work.
Think of it like this:
How much money do you have in your bank account? Now imagine there’s something you really want, but it “might not work” and it costs 90% of your life savings. Would you do it?
This is what we’re asking people to do when we ask them to fund AI projects. It’s a lot easier to make this call when you have a bunch of excess cash lying around.
Most budgets do not have such excess. So AI becomes a nice-to-have, and teams prioritize deterministic improvements.
Why we implicitly deprioritized critical AI work
This was exactly the case with the team I was working with. There were 2 hands-on resources split among 3 primary categories of work:
prod support (putting out fires...so many fires)
rules engine improvements (which usually led to 1-2% improvements in the KPI)
AI experimentation (which could lead to 10, 20 or even 30% improvements in the KPI)
The above order was their implicit prioritization scheme. Prod support came first. Rules engine second. AI third.
Actually, it was more like “AI never.” Very little AI experimentation happened for months. I found this mysterious. It seemed so obvious to me that the AI work was most important for the long-term, but it wasn’t getting done.
After a couple months of this, I realized what was happening. The team wanted to be “agile,” which means they wanted to show iterative progress. The rules engine allowed them to show iterative progress within the allotted time frame - a 10-week PI (“planning interval,” composed of five 2-week sprints).
So the rules engine got prioritized. And with regular production fires and very few people with the background to solve them, there was no slack left for AI experimentation.
It was a beautiful trap. No one looks bad for dropping everything to solve production issues. But the marginal improvements from the rules engine work would not last forever. The team’s behavior was not as nonsensical as it first appeared to me. Indeed, they were acting perfectly rationally, locally optimizing against their implicit incentives:
avoiding blame
showing “some progress”
This is the “not-lose” trap, something I’ve written about here before. The team confused not losing (avoiding blame, showing some progress) with winning (hitting the ambitious KPI goal by the deadline).
To move forward, I had to convince the team that not-losing was, in fact, losing, and show them how few paths there were to winning.
Exercise
Before you read what I did, think about what you’d do in this situation and why you’d do it. Consider leaving your answer as a comment.
As you compare your answer to mine, your brain will register a “prediction error,” which has been shown to improve learning, memory and recall. (This is why pretests are a thing in school.)
As you read on, consider how your answer might be constrained by your level and position in the company. The lower your level, the more sophisticated your solution must be, and the more energy it will probably take.
What a CDO might do by edict in minutes might take a senior manager hours of stakeholder alignment and weeks of waiting to get on a key executive’s calendar.
I’d love to read your solution. Maybe you’ll come up with something better than what I did!
So here’s what I did. It’s the last step in my process of executing on the most valuable AI use case in an enterprise.
Step 5 - Use small wins to educate and accelerate
My main goal was to convince the team to prioritize the AI experiments that would lead to the biggest potential KPI improvements. We would need more hands-on resources (people), ideally parallelized across the 3 workstreams of prod, rules engine, and AI. This would require the 2 hands-on folks to educate new people, leading to a temporary drop in productivity.
All of this would require the team to go against their short-term instincts.
Ultimately, the prioritization was more important than adding more people. It’s much more common to throw people and money at a problem without a clear idea of what really matters and get chaos. The risk here was adding resources and just having them focus on prod issues and the rules engine.
I wanted the team to follow this prioritization scheme:
prod support
AI experimentation
rules engine
In the end, I succeeded. In fact, other workstreams came about because of this reprioritization, including a fairly involved data cleaning effort.
I didn’t do it all myself, of course. Notably, the technology leader in that line-of-business did the hard work of getting more funding, which allowed us to parallelize across the 3 areas.
But first, I had to convince them and others that this move was needed.
So how did I do it?
My argument boiled down to one sentence:
“There’s no other way.”
As in, “there’s no way to reach our KPI goal except through AI.”
Specifically, AI experiments targeted at breaking through our biggest bottleneck, which was clear from the process diagram. And, because AI experiments are iterative and we wouldn’t know what results we’d get until we tried, we needed to give the experiments enough time. Meaning, we needed to start immediately!
Presented like this, it became quite clear that there was only 1 right choice: prioritize the AI work.
Question assumptions
In meetings, I also questioned whether we really needed to show definitive improvements to the KPIs every 2 - 10 weeks. The team realized that they just assumed senior leaders wanted to see these incremental improvements. It turns out they didn’t care that much, as long as we had a clear plan to reach our goal and were executing on it!
Show progress by reducing your indicator lag
The KPI improvement numbers were pulled from the production system. They were highly lagging indicators of progress, not ideal when building AI systems that require more guardrails. Instead, I emphasized that the results of our experiments would allow us to claim much larger “potential” benefits to the KPIs, which could be translated directly from our experiments’ performance metrics (accuracy, precision, etc.).
While this was not a “leading” indicator, it was better than needing to deploy in production to claim any credit. In this way, we could not only share actual improvements in the system, but “banked” ones as well.
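To make the translation concrete, here is a minimal sketch of how an offline experiment metric might be turned into a “banked” KPI estimate. Every name and number below is illustrative - the baseline rate, the share of cases the model would handle, and the production haircut are all assumptions - and the real mapping depends entirely on how your KPI is computed.

# A minimal sketch of the "banked benefit" translation.
# All names and numbers are hypothetical; the real mapping depends on
# how your KPI is defined and how much volume the model would touch.

def estimated_kpi_lift(baseline_rate: float,
                       experiment_rate: float,
                       affected_share: float,
                       production_haircut: float = 0.8) -> float:
    """Translate an offline experiment metric into a potential KPI improvement.

    baseline_rate:      how often the current rules-engine process gets it right
    experiment_rate:    the same metric measured in the offline AI experiment
    affected_share:     fraction of total volume the model would actually handle
    production_haircut: discount, because things rarely work as well in production
    """
    raw_lift = (experiment_rate - baseline_rate) * affected_share
    return raw_lift * production_haircut

# "Our accuracy was 92% vs. a 78% baseline, on the 60% of cases the model
# would handle" -> a potential KPI improvement we can report as "banked".
print(f"{estimated_kpi_lift(0.78, 0.92, 0.60):.1%} potential KPI improvement")

The haircut factor exists for the same reason we hedged verbally: offline results almost never carry over one-for-one into production.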
AI change management is very simple
This process is analogous to sales. I learn what the customer really wants (KPI improvements) and what’s stopping them (under-resourcing, bad prioritization). I position my product or service (prioritize AI) as the natural and obvious solution to their problem (“it’s the only way”).
In this “sales” process, the part that took the longest was wrapping people’s minds around the AI lifecycle. It was especially challenging to explain that “you don’t know what your results will be until you try it.”
Why?
There were real, 7-figure hiring decisions on the line. If we succeeded, we’d need to hire fewer new people to scale the business. Hiring and training take time, and the business didn’t want to be caught understaffed because they waited in vain for our experiments.
Because of this, they REALLY WANTED definitive promises - and we couldn’t make any. In a sense, they didn’t “want” to believe that AI experiments take a while and that you can’t promise specific gains. Motivated reasoning is a tough nut to crack.
Building comfort with uncertainty
But we cracked it. Over time, the teams became comfortable with the uncertainty. The solution was repeating myself across different contexts, adding good people to the workstream, parallelizing everything that could be parallelized, and estimating improvements before we saw them in the production systems so the business could make more informed decisions.
Then, we communicated progress directly and transparently, translating our results into business terms. “In this experiment, our accuracy was X%. If it works this well in production, this would mean Y% improvement in the KPI.” (We also hedged our bets by emphasizing that things never work as well in production as you think.)
When things went wrong, we proactively shared concrete plans to overcome them. By prioritizing impact over comfort and concentrating our resources on problems that matter, we created real momentum.
That’s all for this week. I’d love to hear what you think and how you might have handled things differently.
If you think someone else would find this valuable, sharing it with them would mean the world to me!

