Now, here’s an interesting quote from Prof. Ethan Mollick:
AI is terrible software. Or rather, while Large Language Models like ChatGPT are obviously amazing achievements of software engineering, they don’t act like software should… We want our software to yield the same outcomes every time … [LLMs] will absolutely do different things every time…. We also want to know what our software does, and how it does it, and why it does it … LLMs are literally inexplicable.
What tasks are AI best at? Intensely human ones. They do a good job with writing, with analysis, with coding, and with chatting. They make impressive marketers and consultants. They can improve productivity on writing tasks by over 30%(*) and programming tasks by over 50%, by acting as partners to which we outsource the worst work… AI is [a] human resource problem… because it is best to think of AI as people.
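Mollick's nondeterminism point is easy to see in code. Here is a minimal sketch, assuming the OpenAI Python client (openai>=1.0) with an API key in the environment; the model name and prompt are arbitrary stand-ins:

```python
# Sketch: the same prompt, sent twice, usually comes back worded
# differently -- and even at temperature=0 the output is not
# guaranteed to be bit-identical across calls.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,  # ordinary sampling; higher means more variation
    )
    return resp.choices[0].message.content

a = ask("Summarize the plot of Hamlet in one sentence.")
b = ask("Summarize the plot of Hamlet in one sentence.")
print(a == b)  # almost always False: same input, different output
```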
On the question of AI decision accountability (potentially very important for B2B contracts involving AI services), Rohit Krishnan takes this much further in “How do you govern something that's part software and part people?”
Here are some points from Krishnan’s article:
"AI software is more like an employee than a product"
The article points out the tension between treating AI as software and treating it as an employee; which perspective you adopt shapes how you approach accountability.
"Creating a framework"
In a B2B context, understanding how AI models are evaluated is crucial. The article introduces a comprehensive list of evaluations, likened to the methods used to assess both software and human employees. Sharing these evaluations between seller and buyer can increase trust and transparency; a sketch of what such a shared evaluation might look like follows below.
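A minimal sketch in Python, under the assumption that the model is exposed as a plain prompt-to-text callable; the test cases, the substring grading rule, and the names (`CASES`, `run_eval`) are illustrative, not from the article:

```python
# Sketch of a shared evaluation: a fixed test set and a grading rule
# that both seller and buyer can run against the same model.
from typing import Callable

# Hypothetical test cases; real ones would come from the buyer's domain.
CASES = [
    {"prompt": "What is 2 + 2?", "must_contain": "4"},
    {"prompt": "What is the capital of France?", "must_contain": "Paris"},
]

def run_eval(model: Callable[[str], str]) -> float:
    """Return the pass rate of `model` over the shared test set."""
    passed = sum(
        case["must_contain"].lower() in model(case["prompt"]).lower()
        for case in CASES
    )
    return passed / len(CASES)

# Usage: both parties run the same script and should see the same score
# (up to the nondeterminism discussed above).
# score = run_eval(ask)   # `ask` from the earlier sketch
# print(f"pass rate: {score:.0%}")
```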
"Software behaves badly"
Like both software and employees, AI is prone to errors. Those errors may stem from the training data, from the system's current capabilities, or from how it is integrated into larger work processes. It is essential for both parties in a commercial contract to understand and agree on acceptable error rates and potential areas of concern.
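One way to make “acceptable error rates” concrete is to write the ceiling into the contract and test against it statistically, so that a small audit sample cannot pass by luck. A minimal sketch using a Wilson score interval (my choice of method, not the article's); the 5% ceiling and the sample figures are hypothetical:

```python
# Sketch: check an observed error rate against a contractual ceiling,
# using the upper bound of a Wilson score interval rather than the
# raw proportion, so small samples don't "pass" by chance.
import math

def wilson_upper(errors: int, n: int, z: float = 1.96) -> float:
    """Upper bound of the ~95% Wilson score interval for an error rate."""
    if n == 0:
        return 1.0
    p = errors / n
    denom = 1 + z**2 / n
    center = p + z**2 / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center + margin) / denom

# Hypothetical figures: 8 errors in 400 sampled outputs, 5% ceiling.
MAX_ERROR_RATE = 0.05
upper = wilson_upper(errors=8, n=400)
print(f"upper bound: {upper:.3f}")  # ~0.039
print("within SLA" if upper <= MAX_ERROR_RATE else "breach, or too little data")
```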
Adapting to Future Challenges
AI's landscape is continuously evolving. The article notes that while current models might be robust, future iterations can present new challenges or opportunities. (“GPT-4 doesn't make the mistakes GPT-3.5 made”.) This evolution must be considered in long-term contracts.
Mitigating Risks with a Framework
Given the unpredictability and the unique nature of AI, the article suggests a framework to gauge what is known and distinguish assumptions from concerns. Such a framework could be an asset in B2B contracts, ensuring both parties have a shared understanding of potential risks and mitigation strategies.
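To make that distinction tangible, such a register could be kept as plain, diffable data that the contract references directly. A minimal sketch; the categories, entries, and owners are illustrative assumptions, not taken from Krishnan's article:

```python
# Sketch: a shared risk register distinguishing what is known, what is
# assumed, and what remains an open concern.
from dataclasses import dataclass
from typing import Literal

@dataclass
class RiskItem:
    claim: str
    status: Literal["known", "assumption", "concern"]
    owner: str  # which party is accountable for tracking it

REGISTER = [
    RiskItem("Pass rate >= 95% on the shared eval set", "known", "seller"),
    RiskItem("Eval set is representative of production traffic", "assumption", "buyer"),
    RiskItem("Behavior under a future model upgrade", "concern", "both"),
]

for item in REGISTER:
    print(f"[{item.status:10}] {item.claim} (owner: {item.owner})")
```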
* Harold: Writing this piece gained about a 30% productivity boost from GPT-4.