AI: Explainable Enough
“They look really juicy,” she said. I was sitting in a small room with a faint chemical smell, doing one of my first customer interviews. There is a sweet spot between going too deep into the details and simply asserting a conclusion. Good AI has to be just explainable enough to satisfy the user without overwhelming them with information. Luckily, I wasn’t new to the problem.
Nuthatcher atop Persimmons (ca. 1910) by Ohara Koson. Original from The Clark Art Institute. Digitally enhanced by rawpixel.
Coming from a microscopy and bio background with a strong inclination towards image analysis, I had picked up deep learning as a way to be lazy in the lab. Why bother figuring out features of interest when you can have a computer do it for you? That was my angle. The issue was that in 2015 no biologist would accept any kind of deep learning analysis, and definitely not if you couldn’t explain the details.
What the domain expert user doesn’t want:
– An explanation of how a convolutional neural network works. Confidence scores, loss, and AUC are all meaningless to a biologist, and to a doctor.
What the domain expert desires:
– Help at the lowest level of detail that they care about.
– An AI that identifies features A, B, and C, and conveys that when you see A, B, & C together, it is likely to be disease X.
Most users don’t care how deep learning really works. So, if you start giving them details like the IoU score of the object detection bounding box, or whether it was YOLO or R-CNN that you used, their eyes will glaze over and you will never get a customer. Draw a bounding box, heat map, or outline with the predicted label, and stop there. It’s also bad to go to the other extreme. If the AI just states a diagnosis for the whole image, it might be right, but the user does not get to participate in the process. Not to mention that regulatory risk goes way up.
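To make “stop there” concrete, here is a minimal sketch. The function name and the detection format are my own placeholders; the point is that whatever detector produced the results (YOLO, R-CNN, or anything else), only the outline and the predicted label ever reach the user.

```python
# Sketch: overlay only the predicted label and an outline; keep scores,
# IoU, and architecture details internal. Names here are illustrative.
from PIL import Image, ImageDraw

def render_prediction(image, detections):
    """detections: iterable of (label, (x0, y0, x1, y1)) pairs from any model."""
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    for label, (x0, y0, x1, y1) in detections:
        draw.rectangle((x0, y0, x1, y1), outline="red", width=3)
        draw.text((x0, max(y0 - 12, 0)), label, fill="red")
    return annotated

# Example:
# render_prediction(Image.open("slide.png"),
#                   [("feature A", (40, 60, 180, 200))]).show()
```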
This applies beyond images; consider LLMs. No one with any expertise likes a black box. Today, why do LLMs generate code instead of directly doing the thing that the programmer is asking them to do? It’s because the programmer wants to ensure that the code “works”, and they have the expertise to figure out if and when it goes wrong. It’s the same reason that vibe coding is great for prototyping but not for production, and why frequent readers can spot AI patterns, ahem, easily. So in a Betty Crocker cake mix kind of way, let the user add the egg.
Building explainable-enough AI takes immense effort. It is actually easier to train an AI to diagnose the whole image, or to dump every technical detail, than to hit the level in between. Generating high-quality training data at that just-right level is very difficult and expensive. However, do it right and the effort pays off. The outcome is an AI-human causal prediction machine, where the causes, i.e. the mid-level features, inform the user and build confidence towards the final outcome. The deep learning part is still a black box, but the user doesn’t mind because you aid their thinking.
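Here is one way that pairing can look, as a sketch rather than a recipe: the black-box model only surfaces the mid-level features, and a small, fully visible rule set (the kind of “A, B, & C suggests X” knowledge the expert already holds) turns them into a suggestion the user can check or override. The feature names, rules, and function names below are all hypothetical.

```python
# Sketch of the AI-human pattern: the opaque model reports mid-level features;
# the mapping from features to a suggested diagnosis stays simple and visible,
# so the user can verify it against their own reading of the image.
from dataclasses import dataclass

@dataclass
class Finding:
    feature: str    # e.g. "A"; in practice a named feature the expert recognizes
    box: tuple      # (x0, y0, x1, y1) region shown to the user

# Hypothetical rules the domain expert already trusts.
RULES = {
    frozenset({"A", "B", "C"}): "likely disease X",
    frozenset({"A", "B"}): "possible disease X; review the region for feature C",
}

def suggest(findings):
    """Return the most specific suggestion supported by the detected features."""
    present = frozenset(f.feature for f in findings)
    best = max((r for r in RULES if r <= present), key=len, default=None)
    return RULES[best] if best else "no conclusive pattern; defer to the user"

# Example:
# suggest([Finding("A", (10, 10, 50, 50)), Finding("B", (60, 80, 120, 140))])
# -> "possible disease X; review the region for feature C"
```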
I’m excited by some new developments like REX, which sort of retrofit causality onto standard deep learning models. With improvements in performance, user preferences for detail may change, but I suspect the need for AI to be explainable enough will remain. Perhaps we will even have custom labels like ‘juicy’.