I’m biased against AI. It’s not out of ignorance. I studied AI in college. I worked in speech recognition research during and after college. I have a patent in speech recognition using hidden Markov models. I’ve designed a VLSI chip to do parallel processing for neural networks. This isn’t to brag (ok, maybe a little); it’s to say my opinions aren’t based on ignorance, or at least not complete ignorance.
My experience in AI taught me that I never wanted to work in AI again.
People have a weird psychology about AI. One of my favorite stories is about an AT&T study where they had a speech recognition system that could understand digits at something like 99.6% accuracy, and it turns out that’s more accurate than a phone’s buttons. When you push a button on a phone (at least on an old phone that used DTMF, dual-tone multi-frequency signaling), the phone would translate the digit on the button into two frequencies that it would play at the same time (there’s a quick sketch of this encoding below). That sound was then encoded and transmitted digitally (or sent as analog over really old lines) to some receiver system that would decode and interpret the signal back into a digit. It turns out that this isn’t perfect. There’s noise in the system, and sometimes the receiver would “hear” the wrong digit. But when people pushed a button and got a wrong number, they assumed it was their fault. You know your fingers aren’t 100% accurate. People are clumsy. So they’re willing to take the blame when something goes wrong. They assume the button itself is 100% accurate, so it must have been them.

I’ve done this myself, of course, and usually assumed it was my fault. And most of the time it probably was. I hit the wrong button, or hit two buttons at the same time, or didn’t press long enough. Whatever. But I’ve also hit the redial button to replay the exact same digits, and had it work one time and fail another. The point is, psychologically, people tolerate less-than-100% accuracy from DTMF buttons, and it’s fine.

But voice - that’s a whole different thing. Again, AT&T’s digit recognizer was more accurate than the buttons, but people always thought their voice was perfect, and if the system didn’t recognize it, it must be the system that was wrong. Now, this is quite stupid. Objectively, people do all sorts of bad things with their voice. They mumble, speak softly, even literally say the wrong digits, but they don’t believe any of that justifies a system getting them wrong. Even between humans, if I say something and you hear it wrong, it must have been you, because my voice is infallible.
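If you’ve never seen DTMF up close, here’s a minimal Python sketch of that encoding. The digit-to-frequency table is the standard DTMF assignment; the sample rate and tone duration are just illustrative choices:

```python
import numpy as np

# Standard DTMF assignments: each key plays one low-group and one
# high-group frequency (in Hz) simultaneously. (The rarely used
# A-D keys add a fourth high-group column at 1633 Hz.)
DTMF = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477),
}

def dtmf_tone(key, duration=0.2, rate=8000):
    """Synthesize the two-tone signal for a single keypress."""
    low, high = DTMF[key]
    t = np.arange(int(duration * rate)) / rate
    return np.sin(2 * np.pi * low * t) + np.sin(2 * np.pi * high * t)

# "5" is the sum of a 770 Hz and a 1336 Hz sine wave. A receiver
# decodes by detecting which two frequencies are present; with enough
# line noise it can detect the wrong pair - a misheard digit.
signal = dtmf_tone("5")
```

That detection step is where the noise sneaks in: the button press itself is deterministic, but what arrives at the other end isn’t.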
When we were training speech recognition systems, we’d start with training data recorded from real people speaking. But the problem with training data is that it’s not perfect. Before you train on a sample of spoken words, you need to know what those words actually were, so you’d have them transcribed by a human. I’d sometimes listen to the samples and conclude that whoever transcribed one had actually gotten it wrong, that the speaker was saying a slightly different word. And this makes sense, because people transcribing are under time pressure, maybe even paid per word rather than per hour, so they rush through and make their best guess. I, as a researcher, might be willing to listen to a sample 5 times to make sure I really knew what was said. But sometimes I’d listen to a sample 15 times and still not have a clue what the person was saying. It was, by definition, unintelligible. People say things that aren’t even words, yet fully expect another person, or a speech recognition system, to understand them anyway. So it’s really hard to work in an industry against that kind of expectation. It’s weirdly illogical.
But back to GenAI. Sure, it seems to be different. ChatGPT is amazing. It’s not perfect, but it exceeds expectations, so everyone’s happy. And maybe it really will be fundamentally different this time around.
Or maybe it’s just that GenAI is hovering around 80% accuracy right now, and everyone knows they need to check the results. They don’t tie GenAI into important systems without having humans in the loop. And maybe it’s only used in domains where 80% accuracy is good enough. But when it gets to 90%, will that change? Will people start accepting a marketing landing page that is 90% accurate? Maybe for some outreach campaigns. I mean, let’s face it, spammers are going to use it at that point for sure, without bothering to read the results. Heck, they probably already are. But legit marketers? No, they’ll have too much integrity for that, of course. But at 95%, does it change? 99%? What about 99.6%?
Somewhere around 99% accuracy, people will stop putting humans in the loop, stop using AI to augment their work, and start using it to do the work. It will stop being a co-pilot and become the pilot. (Although that’s quite unfair to actual co-pilots, who do, in fact, fly the plane too, but it’s the metaphor the industry seems to be going with at the moment.) And it’ll be justified because the accuracy will be higher than that of the majority of employees anyway. I mean, how many employers can really claim that all of their employees are above 99% accurate in everything they do? Put that way, it’s kind of silly. So it’ll be a net win, right!?
But whether it hits the trough of disillusionment or the uncanny valley, it’s going to freak people out. Suddenly, there’s going to be a backlash. Big companies will be called out for some tiny mistake in their ad copy because someone proves that it was generated by AI, and that it was wrong. On the other hand, other people will do research and show that the error rate of AI is actually lower than that of humans, so shouldn’t we all celebrate? But it won’t matter. A Tesla crashed, so let’s block all self-driving cars, even if the world would be better off if every car on the road were self-driving, because let’s face it, humans suck at driving. But if I’m going to die in a car wreck, I want it to be because of my mistake, dammit! How dare you crash far less often, but in different ways that I would never have crashed in. I’m infallible!
So will GenAI survive this backlash? Oh, probably. Heck, speech recognition systems didn’t go away. But it took 20 years to go from a demo that sounded an awful lot like talking to Siri or Alexa to actually having Siri and Alexa in mainstream consumer devices (and actually staying there rather than being rejected). Will it be another 20 years for GenAI to get past its own chasm? Who knows. Maybe it’ll be faster. This wave of AI does feel fundamentally different from the one 30 years ago. But I’d be willing to bet that a ton of the recent GenAI companies are going to flail and fail, and at some point it’s going to be miserable and frustrating to work in GenAI. And when it does mature, it might not come from any startup at all, but from something the likes of Apple and Google bring to market.