I’ve got a cold, so I’ve been idly sitting around doing some slightly more thorough testing of my Emacs/mpv setup for asking LLMs what actor is on the screen of the movie I’m watching, and it’s led to me pondering just why some people are so (literally) incredibly enthusiastic about LLMs.
I mean, LLMs are fun. And useful for doing tedious programming things: for instance, the other day I asked ChatGPT to transform a li’l 150-line JavaScript thing from jQuery to pure JavaScript, and it did it flawlessly. (And also badly, engineering-wise, since it made pointless stylistic changes on just about every line.) That would have been boring to do myself, so it’s nice that there’s a tool for that now.
But non-programming things? I just don’t understand the enthusiasm, because whenever I try to use an LLM for something, it’s never more than a toy. A somewhat useful toy, sure, but you can’t say in any way that it actually, like, works.
I wonder whether the enthusiasm is paradoxically based on how bad LLMs are in general. That is, when you chat with one of those things, it’ll give you the wrong answer a lot of the time, but then you say “but that’s not right”, and it’ll say “Good catch! You’re so smart! I’ve never seen anybody be that smart before; you must be a genius!” or variations thereof, depending on how high the company in question has dialled the knob that’s marked “Ass-Kissing” in the LLM console.
It’s just hard to be mad at a technology that’s consistently stupider than you are, and that always confirms your secret suspicion that you’re really, really smart yourself.
So I’m testing the same screenshot repeatedly with the same LLM (Gemini-2.5-Flash here) to try to see just what level of bullshit it’s giving me. This is Keanu Reeves from Even Cowgirls Get The Blues, and while he does look kinda untypical here, if you give somebody a cast list and ask who it is, I don’t think it’s that hard to pick him out as the most likely candidate.
But Gemini isn’t better than a throw of the dice here, even though I’ve tried in many ways to instruct it “if you don’t know, don’t guess”.
It even guesses correctly some of the time.
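If you want to put a number on "not better than a throw of the dice", here's a minimal sketch of how the repeated answers could be tallied and compared against picking uniformly at random from the cast list. The function name `summarize_runs` and the sample answers are made up for illustration (placeholder actor names, not real model output), and nothing here talks to an actual LLM.

```python
from collections import Counter


def summarize_runs(answers, correct, cast_size):
    """Tally repeated answers for one screenshot and compare to random guessing.

    answers:   list of names the model returned, one per run (hypothetical data)
    correct:   the name of the actor actually on screen
    cast_size: number of names in the cast list given to the model
    """
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(answers)
    hit_rate = counts[correct] / len(answers)
    chance = 1 / cast_size  # what a uniform random pick from the cast would score
    return counts, hit_rate, chance


# Placeholder data: six pretend runs against the same screenshot.
answers = ["Actor A", "Actor B", "Actor A", "Actor C", "Actor D", "Actor B"]
counts, hit_rate, chance = summarize_runs(answers, correct="Actor A", cast_size=4)

print(counts.most_common())
print(f"hit rate {hit_rate:.2f} vs. chance {chance:.2f}")
```

With only a handful of runs the hit rate is noisy, so a small gap between the two numbers doesn't say much either way.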
But let’s try some other movies…
It gets Dakota Johnson correct (from the cinematic masterpiece Madame Web).
Always.
This, from Dune Part Two, doesn’t seem right? Actually, I forget his name now… Oh, yeah, Dave Bautista.
No. So is it really just going through the list of cast members and picking one at random? What if I don’t tell the LLM what movie the screenshot is from…
Nope, makes no difference. It picked a guy from a different Dune movie, though?
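The with-movie and without-movie variants are easy to flip between if the prompt is assembled from optional parts. This is just a sketch: `build_prompt` is a hypothetical helper, and the wording is my assumption about what such a prompt might look like, not the exact text sent to Gemini.

```python
def build_prompt(cast=None, movie=None):
    """Assemble the question text; the movie title and cast list are optional."""
    lines = ["Which actor is shown in this screenshot?"]
    if movie:
        lines.append(f"The screenshot is from the movie: {movie}.")
    if cast:
        lines.append("The cast list is: " + ", ".join(cast) + ".")
    # The "don't guess" instruction goes in regardless of what else is known.
    lines.append("If you don't know, don't guess; answer 'unknown' instead.")
    return "\n".join(lines)


with_movie = build_prompt(
    cast=["Dave Bautista", "Timothée Chalamet"],
    movie="Dune: Part Two",
)
without_movie = build_prompt()

print(with_movie)
print("---")
print(without_movie)
```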
It gets Timothée right — always, whether I tell it the movie or not.
Well, it’s not a totally bad guess — they aren’t totally dissimilar. (It’s Bill Pullman from Lost Highway, though.)
Yeah, if I tell it the movie, it gets it right.
Nnnnno.
So, once again, when I try to use an LLM for something, the result is somewhat useful — it’s better than nothing, but you can’t rely on it at all.
And people want to use these things to process social security applications and stuff.
Amazon apparently has a thing for letting you know who’s on screen (called “X-Ray”), and according to rumours, it’s based on hiring people in low-cost countries to tag every scene with the people on the screen. I thought identifying actors would surely be a solved problem by now, but I guess Amazon knows best.