Sunday, November 23, 2025

From the Archives - Bits & Bobs Volume XXIV

Introduction

This is the twenty-fourth installment in the “From the Archives – Bits & Bobs” series. Volume I of the series is here, Volume II is here, Volume III is here, Volume IV is here, Volume V is here, Volume VI is here, Volume VII is here, Volume VIII is here, Volume IX is here, Volume X is here, Volume XI is here, Volume XII is here, Volume XIII is here, Volume XIV is here, Volume XV is here, Volume XVI is here, Volume XVII is here, Volume XVIII is here, Volume XIX is here, Volume XX is here, Volume XXI is here, Volume XXII is here, and Volume XXIII is here. The different volumes are unrelated, other than that they tend to feature me, and for that I apologize. I’m not that exciting a character but I’m the only me I’ve got.

If you haven’t read the previous installments, don’t worry—this isn’t some cogent account of a life, no Portrait of the Blogger as a Young Man or anything. You might as well read the other volumes at some point, but they’re not, like, required reading or anything, and none of this will be on the test. Also: there is no test.

What are albertnet Bits & Bobs posts? They’re generally blobs of micro-literature recycled from letters to friends (with the cuss words removed) or family (with the obsequiousness removed). Read these in any order, or while drunk or disorderly, or read them aloud to an orderly, and/or all of the above. The date of each snippet is given, along with where I was living at the time.

May 27, 1990 – Santa Barbara

You might regret asking how Nationals went because it’s a long story. You know I had that bad crash like a month ago, descending Refugio Road. I was really banged up from that, which really hosed my training.


Well, I finally started feeling better, and got some really good riding in, but things kept going wrong, like I had eight flat tires, all of them front (!?), in a period of like two weeks. Unbelievable. But things seemed to be turning around, and despite having to drop out of all three races at the Conference Championship due to poor fitness, I felt like the form was coming back, and one afternoon I just felt so jubilant I started jumping on the bed. Next thing I know I’d smashed my hand through a glass light fixture, slicing the knuckle wide open. It was dramatic, squirting blood and everything, which really frightened my roommate, and I needed a bunch of stitches. Training after that really sucked because I could only hold on to the handlebars with two fingers and couldn’t even rest my hand there because of this stupid splint I had to wear. It really got in the way—like, I was on a descent and saw this pothole too late to steer around it, but I couldn’t bunny-hop it because I didn’t have a good enough grip on the bars. I hit it and totally wrecked my brand-new Mavic MA40 rim. It’s like I was cursed!

So on the Wednesday before nationals I finally sat down, grabbed the finger, and just bent it down to where it needed to go. That kind of hurt because the stitches were right on the knuckle and my whole hand was kind of frozen. But I had actually gotten pretty fit, at the expense of everything else. (I guess I’ll just sort of ramble on about this race weekend in sort of a free-association way, instead of trying to organize any thoughts, because I’m tired of form and structure and logic, having pulled an all-nighter the other night to crank out a research paper.) So we go up to Berkeley to hang out and buy some Scott clip‑on bars for my funny bike. Then we drive to Palo Alto, screw around a while, and check into the hotel. Stanford really screwed up the race this year by having only one sponsor, John Dough’s Pizza. The food was provided but it meant eating every single meal there: three days of goddam Italian food—breakfast, lunch, and dinner.

So, the road race was this 3‑mile loop: 1.5 miles up, 1.5 miles down. Kind of a bogus course for nationals, if you ask me. I was actually thinking I could do pretty well though, having done okay on that course earlier in the season. Well, on the descent on the first lap, in the middle of the pack, this guy’s wheel fell apart and he started shaking back and forth and totally losing control, at 45 miles an hour. I was passing him as this happened, going just a little faster so it took about a day to get by him, and I’m sweating the whole time. Just as I get clear of him he goes down and takes out half a dozen people. What an idiot. The rest of us, a huge pack in fact, get to the climb and I’m going along fine, and suddenly I have a flat front tire. The curse continues!

Anyhow, I’m not too stressed, I just pull over for the Mavic technical support, except then I notice it’s not really a flat tire—it feels like a broken axle or something. I pop the wheel out and the computer wire gets pulled tight and I realize I’ve got a broken fork (right dropout broken completely off)! So I look up at the support van and realize they’re not gonna have any bikes with toe-clips, least of all a 62 cm. Mitch, the tech support guy, jumps out with a big grin and says, “Look or Time [pedals]?” I’m like, “Fuck.” So about four miles into the national road race, I’m out. Needless to say I was pretty pissed off. I didn’t want an excuse, I wanted a result. This brought my record of finished races up to zero in the last four. Not to mention I had to walk all the way back to where we were parked down the course, like a mile, carrying my damn bike and walking like a damn duck in my cleats. Actually, halfway there I ran into my brother Geoff [a spectator at the race and his team’s mechanic] and he threw me over his shoulder like a fireman, my bike over his other shoulder, and marched me back to the start/finish area. Anyhow, the team time trial that evening went a lot better but I’ll have to tell you about that later, this has gone on long enough.

January 19, 1991 – Berkeley

Full shift at the bike shop today. We just opened a new location, much closer to campus. I made a couple of bike sales, which I screwed up. The first one I sold at full retail because I forgot all about our Grand Opening sale. The next one I don’t know what I was thinking, but I rang it up like a mere accessory, instead of a bike, and I didn’t get the customer’s name or anything. (Except her first name, Dana, not too hard to remember. Especially since she was pretty fly.) So towards the end of the day the owner, M—, is showing bikes to this pretty fresh blonde, and I’m thinking, “Ya know, I should really be handling that sale.” As if reading my mind, he calls me over and tells me to take her on a little bike tour of the Berkeley campus. No sweat, boss! Why can’t every day be like this?

April 6, 1991 – Berkeley

Randomly I got this letter from a chick I dated for only like the last couple weeks I was in Santa Barbara [before transferring to Berkeley]. When we started hanging out I made it pretty clear this wasn’t a long-term thing, I wasn’t into long-distance romances, etc. And it’s not like we hung out 24x7, I made a point of that. Just a very low-key kind of fling, right? It carried all the emotional impact of, say, a game of miniature golf. And then I’d heard she ran off to England with some dude she met at Yosemite over the summer, and I was like, great, cool, I’m happy for her! But then her letter was just bizarre: super pissy, and larded with dramatic phrases like something from a bad romance novel: “with the intention of clearing the air between us” … “don’t want to accuse or justify either of us” … “not trying to absolve myself” … “I know that I have hurt you and for that I am sorry” (which actually made me burst out laughing), and winding up with the old classic, “If you don’t care, then fine.” And actually, I don’t care … but I’m kind of getting the impression that’s not fine.

Geez, what a pain in the ass. I guess she’s worried because we’re both going to this wedding in June, and we’ll both be in the wedding party, and she says “if we were to see each other now, we’d ignore the hell out of one another and make it difficult for everyone else.” Which is totally unfair, because she’s actually super fly, so how and why would I ignore her? Besides, I’m bringing my girlfriend to the wedding, and it’ll be fun to see them giving each other stink-eye. And how would us ignoring each other make it “difficult for everyone else” anyway … would anyone even notice? Why can’t this chick be mellow about all this? She knew full well from our first date that there was food in her fridge that would last longer than “us” … so what’s with all the drama? I have the impression it’s not really much to do with me; I just happen to be the guy who stumbled into her maelstrom. My pal T— and I have an expression we like to use for this kind of person: a “spun chicken.” Perhaps not coincidentally, he dated her (and for a long time) before I did, and yet he’s accusing me of spinning her. I’m like, no dude, she was spun already, you did it!

August 19, 1992 – Berkeley

School’s getting ready to start, and I’m really excited. I finally triumphed over the staggering cost of books: I went to the student store to find out what I needed, and just wrote down all the titles instead of buying them. Typically, a used paperback starts at about $6 at the student store, when it’s available, and can go up to $8, $10, sometimes even more, even if it’s fifty years old and the cover price is like a buck. Well, I went down to Moe’s Books, a great new-and-used bookstore, where I found more than half of my books for dirt cheap. Used paperbacks at Moe’s are priced at half of the cover price—and the real killing is when you find an ancient one that was cheap when it was new. I found a 1960 Nikolai Gogol, originally $2.50, and got it for $1.25. A Norman Mailer from ‘68 was the same price. Some of the newer ones cost upwards of $3 but hey, I’ll take it! Another benefit of buying these books is that they haven’t been used by dirtbag students, so they have no writing in them, no various colors of highlighter, no little bracket in the margin with “symbolism!” scrawled there. I ended up with nine books for $25. My heart soars like a hawk.

November 6, 1992 – Berkeley

Three weeks ago I took the worst midterm exam of my life, in my Milton class. For me, this course realizes all the ridiculous stereotypes non-English-majors have about the English major: studying a long-dead British poet whose famous yet obscure works require enough referential “gloss” (i.e., footnotes) to sink a ship, or a student. We read out of a huge anthology with pages the scant thickness of onion skin and practically microfiche-sized print. The professor is an old relic whose attempts to establish a link between seventeenth-century England and modern-day America instead bring us back to that long-lost era of disco dancing, bell-bottoms, and gold medallions. The literature itself, with all due respect to this literary icon, bores me. With the exception of some of his Latin works, which were translated into English and maintain a certain indescribable readability, I’m pretty unimpressed.

I guess this could just be sour grapes, because I bombed the midterm. I’d studied pretty hard and went into the exam just marginally less optimistic than usual. Turns out the exam was open-book, but this really doesn’t help when you don’t have time to peruse a thousand pages for a juicy quote. As usual, I grabbed the exam and, with a vigor something like panic, looked it over to make sure I could handle everything on it. This brief period of terror is pretty much standard for any English exam I take, but under normal circumstances I am flooded with relief and set about systematically knocking off each essay question. Well, this time—the horror! I was being asked to write an essay comparing Milton’s “Il Penseroso” to “L’Allegro” in terms of the idea of ethos—that is, how Milton creates a sense of authority through the images in his poetry. This might have been something I could tackle, except that I had never seen either of those poems in my life! I said to myself, “Self, this has got to be some kind of cruel hoax. Is this the right room? Have I been attending the wrong lectures?” With my extreme panic beginning to mingle with despair and self-pity, I grabbed my book and found the offending poems. Each of them was between 150 and 200 lines—meaning that to read and fully understand either would take hours, or even days if you count the number of times I’d surely fall asleep. Relating them to the essay topic would take even longer, assuming I could ever pull it off. When were these poems assigned? We certainly never discussed them in class, and I don’t remember seeing them on the syllabus. I looked around me and everyone else was writing furiously in their bluebooks like usual. What the hell? Five minutes past the hour, with the professor and two venomous T.A.s breathing down my neck, I finally turned in my crap-filled bluebook, and—thoroughly spent—practically crawled from the lecture hall on my hands and knees: a godforsaken wreck of a student.

I was looking for a bench to slump on and the first one I see has this naked guy sitting on it! Not just any naked guy, but The Naked Guy. He’s kind of a fixture around campus and honestly I think most people are just tired of him at this point. Seeing him there, in my already distressed state, was almost too much to take. I stormed off, found another bench, slumped on it, and dug out the course syllabus. Sure enough, those two poems weren’t on it! The next day I checked with a classmate, and he said the professor added those poems during lecture one day—the one day I was absent, on a work trip down in Anaheim, to the Bicycle Dealer Expo. Why the professor chose to test us on material we never discussed in class is a real mystery. Perhaps it’s his way of punishing students like me who cut class.

November 16, 1992 – Berkeley

I went down to Santa Barbara to visit our old friends and I’m sure you’d be interested in what they’re up to. T— and C— have shaved their heads. I arrived at C—’s house at like 10:00 p.m., and she was blind drunk with this older couple. The woman was arrogant, belligerent, and (ominously) kind of burly. I don’t know what her problem was but while C— giggled in the background, this chick started arguing with me about something clearly very near to her heart, based on how heated she was. (I honestly can’t now recall what it even was.) Quite a time. Finally these two left, and the next morning (after I spent the night in the guest room, lest you get the wrong impression), we had breakfast with T— and A— (who are still a couple) and this pseudo-artsy, chain-smoking woman who cussed a lot. A friend of somebody’s, I guess, unless she was just some random person T— and A— thought was with me. A—, despite an all-black mod outfit and a hairdo somewhere between Steven Seagal and Pebbles ‘n’ Bam-Bam, is much the same sharp, funny guy. T— asked me, “So what are you doing after you graduate?” This was kind of a loaded question because in the past she was part of the chorus of people asking me, “English? What are you gonna do with that?” So I told her, “I’ve got a job lined up in a factory deburring plastic parts on an assembly line. You see, when plastics are molded, there are flashings left over from the holes the liquid material was poured through. It’s actually pretty tricky, because if you slip with the file you can ruin the whole piece.” I was pleased to have pulled off the entire speech with a straight face. “Wow!” she said enthusiastically. “That’s great!” Still managing not to laugh, I retorted, “No it isn’t, T—! It’s pathetic! I’m so ashamed!”

In case you’re wondering what I’m actually going to do when I graduate [in a month], I think I can get a job wearing a suit around in an office and saying “hi” to colleagues on the way to and from the water cooler. I’ve got a lot of nice dress shirts and I can get some ties. Then, the plan is, after like nine months of saving up, E— and I will go on a cross-country bike tour. Then I’ll either find a real job, go to grad school, or try to become a superhero of some kind.

—~—~—~—~—~—~—~—~—
Email me here. For a complete index of albertnet posts, click here.

Saturday, November 15, 2025

More AI Smackdown - ChatGPT, Copilot, & Gemini Write Poetry

Introduction

Two posts ago, I described what I think is a fundamental dichotomy between two central capabilities of modern AI chatbots: 1) helping with a nuts-and-bolts operation like coding software or writing HTML, and 2) creating something original, like an essay or story. The first category involves being a resourceful researcher blessed with excellent natural language processing; the second is probably closer to what humans are (so far) uniquely capable of doing.

Earlier this year I did a whole post on the first category, “What is ChatGPT Great At (and Not)?” And last week I blogged about one aspect of the second category: writing a scholastic essay. To further explore AI’s ability to generate meaningful content, and to evaluate its ability to truly understand language, I turn this week to poetry. That is, I decided to have the three dominant chatbots—Gemini, ChatGPT, and Copilot—write a poem in an unusual meter: dactylic trimeter, a poetic form I learned in high school (details here). I chose this meter because, as described here, ChatGPT does a pretty good job at the classic Shakespearean sonnet in iambic pentameter, but I wonder if that’s just really good parroting since there’s such a vast amount of training data out there for that. I think this exercise really puts the chatbots through their paces, giving us insight into which is the closest to being truly intelligent. As you shall see, the differences in performance are not subtle.


(Custom art by Whisk. No rights reserved.)

Gemini’s effort

To start out, I quizzed Gemini about dactylic trimeter, to see if it knows what I’m even talking about. Gemini correctly stated that the rhythm of such a poem would be “DA-da-da | DA-da-da | DA-da-da,” and an example it created of the form was reasonably close. So far so good. But then, to make the rhythm better, I instructed the chatbot to add an extra trochee at the end of each line. A trochee is a two-syllable foot with the stress on the first syllable, as in the words “praises” and “spirit.” As an example of this modification to dactylic trimeter, I provided Gemini these lines (that I took from a poem you can read here, in this albertnet post):

Once in a while a voice will sing praises,
Something to levitate everyone’s spirits.

A really smart AI, I would argue, could reverse-engineer the meter from those two lines alone, but I went one better and described exactly what I wanted in technical terms. Gemini correctly stated that the rhythm would therefore be “DA-da-da | DA-da-da | DA-da-da | DA-da” but its initial attempts at it were totally screwed up. I gave it a lot of coaching. I guess this is okay; a human with actual intelligence might require this as well.
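
(A nerdy aside: this meter is mechanical enough that you can sanity-check a line with a bit of code. Below is a rough sketch in Python using the pronouncing package, a wrapper around the CMU Pronouncing Dictionary. Take it as a toy rather than a real scansion engine: the dictionary assigns most one-syllable words a fixed stress, whereas in actual scansion their stress depends on context.)

import re
import pronouncing  # pip install pronouncing (wraps the CMU Pronouncing Dictionary)

# Three dactyls plus a trailing trochee: DA-da-da | DA-da-da | DA-da-da | DA-da
TARGET = "10010010010"  # eleven syllables; 1 = stressed, 0 = unstressed

def line_stresses(line):
    # Concatenate each word's stress digits, using the first dictionary
    # pronunciation and treating secondary stress ("2") as unstressed.
    pattern = ""
    for word in re.findall(r"[a-z']+", line.lower()):
        phones = pronouncing.phones_for_word(word)
        if not phones:
            return None  # word not in the dictionary; can't scan this line
        pattern += pronouncing.stresses(phones[0]).replace("2", "0")
    return pattern

for line in ("Once in a while a voice will sing praises,",
             "Something to levitate everyone's spirits."):
    result = line_stresses(line)
    verdict = "on-meter" if result == TARGET else "off-meter (or a dictionary quirk)"
    print(f"{result}  {verdict}  {line}")

Even a checker this crude would catch the most common failure you’re about to see: a line that sneaks an extra unstressed syllable in ahead of the first dactyl.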

Moving on, I prompted Gemini, “Now I would like to see if you can write such a poem based on an essay I provide. You can work in as much as you think works, understanding that not everything will fit. If I paste in a 1,600-word essay could you do that?” It said (in its usual obsequious way), “That sounds like a very interesting challenge! Yes, I can absolutely take a 1,600-word essay and distill its key themes, concepts, and emotional tones to create a poem in the specific Dactylic Trimeter + Trochee (11 syllables) structure.” I added that the poem should be ten lines long, with an ABBA rhyme scheme for the first two quatrains and then a rhyming couplet at the end. And I told it, “Focus on the idea of caprice: choosing the more difficult route simply to exercise one’s radical freedom and prove one’s mettle to oneself.”

Now, if you are a devoted follower of albertnet, you may recall that some years ago I wrote a poem in dactylic trimeter about my favorite cycling hill climb, the one-two punch of the steep, twisty roads El Toyonal and Lomas Cantadas. And, as with most of the poems on this blog, I provided copious footnotes and commentary, just as you’d get in an anthology of Shakespeare’s sonnets. To help AI write a good poem, the “essay” I uploaded was the compilation of my footnotes, so that Gemini had plenty of material to work with: the fleshed-out observations, feelings, etc. behind my Lomas poem. The idea here is to give you, reader, the ability to do a side-by-side comparison of two poems on the same topic—one written by your amateur blogger, the other by a much-ballyhooed AI. You can decide for yourself who did the better job. Here is my poem (and you can click here for the footnotes):

Ode to Lomas Cantadas

El Toyonal is a beast of an uphill;
Pedaling here is a bit self-defeating.
Lomas Cantadas will sweeten your beating:
Insult to injury, sweet overkill!

Most riders wisely take Wildcat Canyon.
Half the uphill, after all—and you’re tired!
Wisdom, alas, is a flaw when you’re mired
In glory, in notions of being a man.

Sometimes caprice is much better than ease:
Radical freedom is here to be seized!

Here is Gemini’s poem, representing its effort to incorporate all my footnotes and commentary:

Reckless Choice

The simple road beckons And calmly the wind calls,
We stand at the turn now, Deciding to stay free.
To choose the hard pulses Our own perversity.
We turn for the suffering feeling the high walls.

The logic demands the Less burdensome track-way,
But we must display the Radical self-might.
We take the bad option Defying the pure light.
To prove we are strong to Our own confidences.

Yet this is the price of The joy and the pain-fall,
A reckless decision That answers the soul’s call.

So … how would you rate Gemini’s poem? Is AI going to come after our poets and put them out of work? Frankly, I’m less than impressed. The meter is a crude approximation of what I asked for, but as you surely felt (whether or not you could articulate the precise problem), almost every line is too long by one syllable, and instead of the easy, rollicking rhythm of a proper Krooprian poem, Gemini’s ode staggers and stutters. The AI pretty much blew it.

In case you really want to geek out, here’s exactly where the AI went wrong: in nine out of the ten lines, for no apparent reason, it preceded the first dactyl with an unstressed syllable, violating the rule I set forth about dactylic trimeter. Meanwhile, words like “perversity,” “radical,” and “confidences” aren’t padded out properly so we either have to mispronounce them to preserve the meter, or sound them out naturally which disturbs the rhythm—and since we don’t know which to do, we’re stopped in our tracks.

As for the content, you can sense the AI compromising sense in order to (try to) achieve the rhythm and rhyme. Why else would it use the word “pulses,” the phrase “high walls,” and the Franken-words “track-way,” “self-might,” and “pain-fall”? The chatbot is practically babbling.

So is that it? I give it, say, a D and we move on to the next AI? Not so fast. I made the rookie move of feeding my prompt to the default version, 2.5 Flash, instead of the flagship version, 2.5 Pro, which Gemini tells me is “more adept at tasks requiring complex reasoning, deep instruction following, and high-quality creative output.” The good news is, as the AI assured me, all the context of my chat with 2.5 Flash can be used by 2.5 Pro, so it was super easy to switch to Pro and have the chatbot try again. Here’s its second effort, using the 2.5 Pro version:

The Gauntlet Thrown

Sober thoughts point the logical straight-way,
Turning left proves a man is still willing,
Recklessly choosing what he must still be,
Turning from all that the reasonable say.

This is the freedom to throw all the rules about,
Knowing the payment and what it will cost him,
Twenty hard minutes of everything lost now,
Heeding the irrational gonzo-man’s shouting.

A purely perverse act of personal pride,
Proving the mettle he carries inside him.

Well, I suppose it’s a bit better. Only one line erroneously starts with an unaccented syllable, and three of the lines actually have the correct meter (a modest improvement over 2.5 Flash’s zero). There’s some nice alliteration in the penultimate line. I also appreciate that 2.5 Pro didn’t do weird stuff with capitalization in the middle of each line. But the rhyming is pretty janky, and we still have these weird bolted-together phrases “straight-way” and the truly bizarre “gonzo-man.” To top it off, the very last word of the poem violates the rhyming convention, so the poem goes out on a flat note. This second poem is better but not by much.

Conclusion: if Gemini offers to write my next albertnet ode for me, I’ma be like, “Naw, I’m good.”

On to our next contestant.

ChatGPT’s effort

I asked ChatGPT if it could write a poem in dactylic trimeter. It said, kind of curtly, “Definitely — I can write in dactylic trimeter.” Unlike Gemini, it didn’t bother to give me a sample straightaway. Maybe this is to save on computing cycles (and thus energy and water), which I can appreciate. But ironically, this is the one chatbot whose capability I have reason to doubt, because as you may recall, I tried this prompt before, almost verbatim, with version GPT-3.5, back in February of 2023. That chatbot crashed and burned with this poetic abomination:

Pedaling up Toyonal, heart pumping fast
My mind is clear, free from the world’s contrast
With caprice and freedom, I take each turn
Pushing through pain, for what I will earn
Onward I climb, up Lomas Cantadas
My legs scream out, but my spirit never fades
The wind in my hair, the world fades away
A moment of peace, that I long to stay
And at last I reach the summit high
A sense of triumph, that no words can deny

Ugh. I almost threw up in my mouth rereading that. As you can see, GPT-3.5 ignored my instructions about meter, made a very half-assed effort to weave the theme of caprice into its poem, and worst of all produced pure treacle, like the verbal equivalent of Thomas Kinkade. Needless to say I was eager, this evening, to see how the more modern version would do.

I gave GPT-5.1 the same prompts as Gemini, uploading the same 1,600-word essay, and here is what it produced:

Lomas Caprice

Turning toward Toyonal, reckless and laughing
Choosing the steep when the wise path is brighter
Mocking my reasons, the prudent, the lighter
Savoring folly my sane side is quashing

Freedom is choosing the climb that will break me
Just for the thrill of defying the sensible
Proving I’m bent for the wholly nonsensical
Riding the harder way just ’cause it shakes me

Pain is a dare that I answer with pleasure
Freely embracing the hurt as a measure

Well, ChatGPT has come a long way for sure, but GPT-5.1’s effort is only somewhat better than Gemini’s. Certainly the meter is better, with a majority of the lines being correct. But the content is really off, with a bunch of the words clearly chosen just to satisfy the technical requirements without adding much meaning. The bit about “wise path is brighter” really makes no sense and is clearly just there for the rhythm and rhyme, no more sophisticated than Hall & Oates’ “your kiss is on my list.” In the next line, who is doing the mocking? And how does “the lighter” fit into anything? Lighter sky? Lighter weight? Cigarette lighter? It’s just a random word dropped into the poem. And in the next line, the word “quashing” in no way rhymes with “laughing” and doesn’t make sense as an intransitive verb. (“What are you doing this weekend?” / “Oh, you know, I’ll just be at home, quashing.”)

I confess, I rather like the line “Freedom is choosing the climb that will break me,” but then the poem loses momentum again and commits rhythm-sucking metrical errors on the next two lines (though I like “bent”). The eighth line, suggesting that a hard climb “shakes me,” is lame, another word selected only because it rhymes. And that last line? “Freely embracing the hurt as a measure”? Huh? What is it measuring? This poem is a dud.

Since AI does its best work when you iterate with increasingly refined and specific prompts, calling out what it did wrong in its previous attempt, I decided to give ChatGPT another chance, and told it, “I think it would be better if it didn’t assume what you and I know already about this climb. Consider that somebody encountering this poem for the first time wouldn't know that Wildcat Canyon is the easier climb, and that choosing the 1-2 punch of El Toyonal and Lomas Cantadas makes no logical sense but appeals to one’s love of suffering and sense of caprice. So, please try again on the poem and give the reader enough background to grasp all this and thus to understand the choice.” It came back with a poem that was quite broken, with the same issue that Gemini’s first effort had: starting each line with an unstressed syllable. It also screwed up the rhyme in the second quatrain. I coached it repeatedly to fix these issues, and after several tries this ended up being its best effort:

Reckless Climb

Climbing the hills of green Berkeley foothills,
Pedaling hard as the thighs start to quiver,
Wheels weaving wild like a paperboy’s river,
Lungs heaving fire as the body fulfills.

Turning to torment, no reason persuades me,
Pain blooms in muscles yet joy is commanding,
Twists of the road, and the thrill never fades me,
Searing the legs, but the spirit is standing.

Pleasure is folly, the wholly absurd,
We choose what will hurt us, yet laugh at the hurt.

Right off the bat, the first line has three problems: it trips us up with a missing syllable; the hills are not always green; and hills/foothills is somehow both redundant and oxymoronic. The rest of the lines have correct meter, but a bunch of really clunky constructions. Wheels weaving wild? Really? And what could “like a paperboy’s river” possibly mean? “Lungs heaving fire”—what am I, a dragon? And the body fulfills what? (I just asked ChatGPT if “fulfill” can be an intransitive verb, and it replied, “No — ‘fulfill’ is fundamentally a transitive verb, which means it requires a direct object.”) The mention of the twists of the road isn’t connected to anything, and “thrill never fades me” makes no sense—I mean, is the thrill like the sun, and I’m a curtain? There’s just an abundance of nonsense here, though I do like the last line. My last complaint is that the AI didn’t consistently follow the rhyme scheme I asked for: its second quatrain is ABAB, not ABBA. I asked it five times to fix this and it never did.

In summary, I’d say this poem is marginally better than Gemini’s effort (I mean, it doesn’t have anything as glaring as “irrational gonzo-man’s shouting”) but it’s basically unusable. Maybe GPT-6 will do better.

Copilot’s effort

Copilot knew exactly what dactylic trimeter was, and its initial example was remarkably good. I suppose it could have plagiarized something, or has a bit of boilerplate at the ready, but in any case I was impressed. I gave it the same extra instruction about tacking on an extra trochee, and it got it almost right on the first try; a bit more coaching and it nailed it. After invoking Smart (GPT-5) mode, I fed Copilot the full prompt, with the same footnotes from my Lomas Cantadas poem. Here is what the chatbot came up with:

Caprice on the Hills

Turning away from the sensible roadway,
Madness compels me to suffer the steepness.
Reason protests, but I laugh at its weakness,
Choosing the torment that beckons me boldly.

Freedom is found in the reckless decision,
Mocking the logic that urges avoidance.
Pain is a proof of my will and its poignance,
Glory is born of capricious derision.

Taunted by Toyonal, daring and cruel,
Lomas Cantadas declares I’m no fool.

Wow. Just wow. It is so much better than what Gemini and ChatGPT came up with. The meter is actually perfect. (Full disclosure: the meter falls short of what’s in my poem, but only because I forgot to give the AI one more instruction—to shorten the final line of each quatrain, and both lines in the couplet, by one syllable, to give them more punch. To be honest I’d forgotten I did this with my ode.)

I think the AI did a pretty good job with the content as well. Calling my will “poignant” is a bit of a stretch perhaps, but we could chalk it up to poetic license. Meanwhile, the word “roadway” (like “poignance”) is nonstandard and evidently chosen to fit the meter, and also doesn’t rhyme very well with “boldly,” but this is nitpicking. My main issue with this poem, which is the same issue I had with the others, is that it doesn’t give the reader enough context to really understand some of the basics: that there’s an easy option (Wildcat Canyon) and a harder one (El Toyonal + Lomas Cantadas).

Perhaps as a victory lap, Copilot offered to write two more versions with slightly differing tones (the last being a drinking song). These weren’t actually as good as the first so I won’t bother dissecting them, but I did like the final couplet in one of them. I decided that since Copilot’s initial poem was good enough to warrant refining, I would point out its central error and let it try again (using the new couplet). Here’s its revised version:

Caprice on the Hills

Wildcat is gentle, a steady road climbing,
Toyonal rises with crueler intention.
Still I turn left in a rash intervention,
Mocking the voice that would counsel good timing.

Freedom is folly, a dare to my reason,
Choosing the torment that makes no good sense.
Pain is a jest and its cost the expense,
Glory is born of capricious decision.

Madness is mettle, a jest I renew,
Lomas Cantadas — I suffer for you.

Again, it’s not perfect. The bit about “turn left” doesn’t really make sense without more context. And what is meant by “counsel good timing”? There is no timing here. In addition, the phrase “its cost the expense” seems redundant. But those matters aside, the meter is solid, the rhyming consistent, and that final couplet is a real banger.

These AI chatbots always seem to want to extend the dialogue and provide more and more and more, which is kind of a double-edged sword. On the one hand, as human beings we should always be working to limit our time online and get out there in the world, right? On the other hand, refining what we get from chatbots is pretty key to making them an effective tool. So when Copilot asked if I’d like it to craft a prose introduction to the poem, I suddenly had another idea: what if I asked it to now create its own footnotes? This post is long enough already so I won’t post them here, but let me say that Copilot did a pretty good job on that.

And here is where I see this AI having a role with a real human writer (at least at the student or blogger level): it could probably help with writer’s block simply by producing something worth polishing. It kills me to concede this, actually, and I am far too proud to ever resort to this kind of “Hamburger Helper” approach to my own writing. But honestly, a cyclist who would like to compose a ride-themed poem in dactylic trimeter, replete with footnotes, could do worse than to start with Copilot. (Neither poem above truly passes muster, but taking the best of each, and from perhaps a few more attempts, and then replacing all the weak parts with our own lines, would be easier than—albeit still inferior to—starting from scratch.) The output of such an exercise might actually have some value, versus the writer getting frustrated, giving up, and producing nothing.

Crucially, the thing the AI will never be able to do is go on the bike ride, have that experience, and grasp what is important about it. So a human could start there and then get some help from AI in expressing himself or herself, since not everyone has the luxury of a liberal education. If AI is called upon to bridge that gap, the current Copilot is far better poised than Gemini and ChatGPT, I think we can now conclude.

If you read my last post, you may recall that Copilot did the best job of these three chatbots at writing a scholastic essay as well. Keep an eye on this one … Microsoft, through its partnership with ChatGPT’s OpenAI as well as its own resources, seems to be ascendant.

Other albertnet posts on A.I.

—~—~—~—~—~—~—~—~—
Email me here. For a complete index of albertnet posts, click here.     

Saturday, November 8, 2025

AI Smackdown - ChatGPT vs. Copilot vs. Gemini

Introduction

Chances are you use ChatGPT. OpenAI’s chatbot had about a year head start on competing large language models like Google’s Gemini and Microsoft’s Copilot. The latter two offer integration with office productivity suites and man this paragraph is getting boring! Don’t worry, I’ll narrow the focus: in this post I pit these AI chatbots against one another in carrying out identical tasks: an essay and a picture. (Next week I’ll have them write a poem.) These tasks are probably not what you use chatbots for, but I think they’re a good measure of the AIs’ so-called intelligence, which—in the face of all this uncertainty of where AI is going and what it means for humanity—is probably more interesting than noting how well they answer basic questions or perform routine tasks like writing emails or reports.


(Wondering about the picture? I’ll get to that.)

Now, if you’re an astute reader (which you are or you wouldn’t be here, so congratulations), you’ll be wondering, why even bother evaluating the current capabilities of a technology that is evolving so fast? Wouldn’t this post have a very short shelf life? Those are good questions and here’s my (kind of) short answer: it’s because it’s fun to capture a moment in time and look back on it later, to see how far we’ve come. It’s like watching really old Hollywood movies and appreciating a) how much better the dialogue is in modern film, and b) how much less sexist Americans are now. (Yes, we’ve still got a long way to go, but looking back can help us feel grateful for the progress that’s been made.)

Let me give you an example of how primitive early AI was. As it’s theoretically possible for you to have noticed, I’ve been tracking its progress since 2012, when I tried out Cleverbot. Over the next few years I evaluated the AI used in smartphones. In 2020 I did a test drive of the very first version of OpenAI’s GPT. As described here, all it could do was finish your sentences; you’d type the first half of a sentence and hit tab, and it would finish the sentence for you (as many modern text editors now do). Here’s how the original GPT “helped” me write a short essay about learning to type. I’ve formatted its output in bold italics so you can see what it contributed:

“Pack my box with five dozen liquor jugs” is a cool way to pass the time. It is particularly useful for budding gay men to read the words if they are trying to learn how to type on a computer. … Okay, what’s with this guess that my original text had anything to do with ‘gay? that was definitely a pretty random statement to make but it fit, and … no, it didn’t fit. For A.I. to be useful, it must stick to the ‘gay side of the word.’ No. It must stick to the point. I was writing about a sexual deviant. No, I was not! I was writing about the simple act of learning to use the ‘gay keyboard. Also, A.I., you’ve twice screwed up on putting a space between my legs. Okay, fine. I give up. This GPT technology obviously has a lot of potential.

What a gas, right? Of course AI will keep getting better, to the point that what’s considered amazing today will one day seem laughably primitive. Who knows, perhaps you’ve found this post years after I wrote it, and are looking to it to help you remember what it was like to interact with AI through a cumbersome keyboard, rather than having it read your mind automatically via WiFi 12 or 8G cellular technology.

Okay, down to brass tacks. In this post I will evaluate the latest versions of three leading AI chatbots: OpenAI’s ChatGPT (version GPT-5); Google’s Gemini (version 2.5 Flash and Pro); and Microsoft’s Copilot (version Smart GPT-5, based on Microsoft’s collaboration with OpenAI, which Copilot tells me “[goes] far beyond what you’d get from GPT-5 alone”).

Why a scholastic essay? Because that kind of writing is a lot harder than a lot of what AI does, which is just being a really good natural language search engine. Analyzing a large text and writing about it clearly requires something closer to real thought than just fielding a fairly specific question, harvesting the best existing resources on the topic, and mashing them into a concise and nicely formatted answer. For more on the fundamental difference between writing “thoughtfully” and merely researching, see my last post.

Activity #1: academic essay

Much of the hype around AI is its ability to do college kids’ work for them. In a shocking New Yorker article I read recently, a college professor interviewed several students at top universities about their shameless use of A.I. to write their papers, and how well they’re getting away with it. Here’s a crazy example:

A sophomore at Columbia studying computer science told me about a class where she was required to compose a short lecture on a topic of her choosing. “I just put it in [to an A.I. platform] and I maybe edited like two words and submitted it,” she said. Her professor identified her essay as exemplary work, and she was asked to read from it to a class of two hundred students. “I was a little nervous,” she said. But then she realized, “If they don’t like it, it wasn’t me who wrote it, you know?”

So, is the current AI really good enough for the Ivy Leagues? Or at least good enough for your kid’s high school English class? Two and a half years ago I asked ChatGPT (version GPT-3.5) to write an essay describing the use of dramatic irony in a children’s novel, Clarice Bean, Don’t Look Now by Lauren Child. I chose that book because it’s a less cryptic work than, say, Ulysses, which probably nobody, not even James Joyce himself, understood. But even without being written for grown-ups, the Clarice Bean book does feature dramatic irony, as my seven-year-old daughter was able to appreciate. So how did GPT-3.5 do? It crashed and burned, big time, as described here. To see how much the AI has improved, I’ll use the same prompt this time, with the same book.

ChatGPT’s essay

Here is the most salient excerpt of what OpenAI’s GPT-5 came up with:

At the heart of the novel is Clarice’s attempt to navigate friendship, change, and uncertainty. She idolizes her favorite fictional character, Ruby Redfort—a cool, unflappable girl detective who always knows what to do. Clarice, by contrast, often feels helpless when life becomes complicated. Much of the book’s dramatic irony arises from this gap between how Clarice perceives herself and how readers can see her situation. For instance, when Clarice tries to act as confident and worldly as Ruby, her bravado is clearly exaggerated or misplaced. Readers can see that her attempts to emulate Ruby’s poise only highlight her vulnerability. This irony makes Clarice endearing rather than foolish, because we recognize that her misunderstanding stems from innocence and idealism.

This really isn’t bad; I mean, it reads well, the flow is logical, and the essay has the semblance of depth. (It’s a lot better than GPT-3.5, whose entire essay was a hallucination.) Still, this analysis is rather flawed. For one thing, it completely misrepresents Clarice’s character. She actually exhibits no bravado, and puts on no airs. Her behavior is quite subdued; in fact, she withdraws from friends and classmates entirely. Meanwhile, the AI’s essay lacks the specific detail it would need to get a passing grade in a legitimate lit class.

Now, to be fair, much of the power of these chatbots comes from the user’s ability to iterate, guiding the AI with very precise prompts and helping it along when it stumbles. So I gave GPT-5 another chance, the way a TA would with an undergraduate, asking it, “Are there any scenes in particular that really show the dramatic irony, vs. generalizations? What is the best example of this literary device in the novel?” Here’s what it came back with:

The most striking use of dramatic irony in the novel comes when Clarice’s best friend, Betty Moody, suddenly goes missing. Clarice interprets Betty’s disappearance through the lens of her idol, Ruby Redfort … Convinced that something thrilling is afoot, Clarice imagines Betty has vanished into an adventure or secret mission, perhaps with spies involved.

The reader, however, quickly realizes something else: Betty hasn’t disappeared into a mystery, but into a family crisis. Her parents have separated, and Betty has been taken away unexpectedly. Clarice, too young to pick up on the signs of adult conflict, mistakes emotional upheaval for suspense.  This is classic dramatic irony — the audience understands the real reason for Betty’s absence long before Clarice does.

This really would be a classic case of dramatic irony, but there’s one problem: GPT-5 is hallucinating! Betty doesn’t suddenly go missing; she announces to Clarice that her family is moving away, and then they do. There is no mystery for Clarice to solve, no imagined spies or secret mission, and Clarice misunderstands nothing. The chatbot states that the reader knows that Betty’s parents are separating but Clarice doesn’t. This is actually backwards: Clarice only fears Betty’s parents are divorcing, and they’re not. So as nicely written as GPT-5’s essay is, it gets the basic facts wrong. (The only real dramatic irony in the essay is that we know something ChatGPT doesn’t!)

When I challenged GPT-5 on this, the AI casually acknowledged that it may have gotten stuff wrong because it actually hasn’t read the book. It proceeded to fact-check itself using three dubious secondary sources: “TV Tropes,” “Books for Kids Blog,” and “Books for Keeps.” Then it offered, “If you like, I can check for direct quotes from the text to confirm exactly what the author states about Betty’s situation and whether divorce is explicitly ruled out.” WTF?! GPT-5 blithely—and confidently—trotted out this nice essay despite having no access to the source material! What actual student would be shameless enough to try that?

I’m going to give ChatGPT a D on this. This is a big step up from last time, but it still has a long way to go. Obviously an AI would do better with older literature, like the classics that are past their copyright and in the public domain, but I suspect most of its training data will always be other papers and run-of-the-mill criticism, not the work of the authors themselves.

Copilot’s essay

On to the next contestant, Microsoft’s Copilot. It has different versions based on what you’re trying to do (e.g., simple search vs. deeper analysis). I engaged Smart Mode for the essay. Here are the highlights:

A central example of dramatic irony emerges in Clarice’s interpretation of Clem Hansson, the new girl at school. Clarice believes Clem is aligning herself with Justin Broach, a classmate Clarice views as a “sociopath.” This assumption fuels Clarice’s isolation and resentment, reinforcing her belief that Clem is shallow and disloyal. However, readers can infer — through subtle cues and Clarice’s unreliable narration — that Clem’s behavior is not malicious, and that Clarice’s judgment is clouded by her own insecurities and grief over Betty’s departure. This disconnect between Clarice’s perspective and the likely reality creates dramatic irony: readers recognize that Clarice is misreading the situation, which deepens our empathy for her while also injecting humor into her exaggerated reactions.

Another layer of dramatic irony stems from Clarice’s obsession with the Ruby Redfort Survival Handbook, a fictional spy guide she consults for advice on navigating life. Clarice treats the handbook as gospel, applying its cryptic wisdom to everyday dilemmas. For instance, she fixates on the idea that “your worst worry is the one you haven’t even thought to worry about,” which paradoxically increases her anxiety. Readers, however, can see that the handbook’s advice is often absurd or irrelevant, and that Clarice’s reliance on it is a coping mechanism rather than a source of genuine guidance. This ironic gap between Clarice’s faith in Ruby Redfort and the reader’s recognition of its impracticality adds both humor and poignancy to the narrative.

Right off the bat, Copilot doesn’t commit any major blunders, in terms of hallucination. I would say the reader only infers that Clem’s friendly gestures toward Clarice are sincere, as we don’t have anything concrete to go on, and Clarice’s hunches aren’t obviously inferior to ours. I also dislike the quotation marks around “sociopath” because they imply that Clarice has Justin figured all wrong, which would be another example of dramatic irony, except that Justin most certainly is a sociopath. (He assaults classmates, extorts them for money, and kidnaps Clem’s pet rabbit.) In my book, Clarice’s mystification at Clem’s interaction with this jerk is understandable and certainly not exaggerated. Moving on to the Survival Handbook, its advice is arguably absurd and, on the face of things, irrelevant (as it’s written for spies, not kids), but Clarice does grasp (and point out) its limitations. In a moment of epiphany, she asks herself, “What is the point of reading about all this survival information about lions and tigers and sharks? … And I have read her Spy Guide: How to Know Things Without Knowing Things—and all I have discovered is that I know nothing at all.”

But all this is boring, I know, and the main news is that this essay (or more to the point a version that’s been polished up by a student who at least read the actual book) would probably get a C at UC Santa Barbara, perhaps a C- at UC Berkeley, and who knows, maybe an A at Columbia. But does this mean Copilot is now ready to steal our jobs? No, not yet. If it offered to ghost-write my next albertnet post, I’d politely decline.

Gemini’s essay

And how did Gemini, our final contestant, do? Here are some highlights from its first effort:

The irony primarily revolves around Clarice’s attempts to manage her mounting anxieties—namely her fear of a looming geography test and her worry about her best friend Betty Moody potentially moving away—while remaining oblivious to the humorous and poignant truths the reader recognizes.

The most potent source of dramatic irony is Clarice’s belief that she can control her own anxieties and the world around her by actively not confronting them. The geography test, which Clarice refers to ominously as “the Big Test,” is a persistent undercurrent of dread. She employs elaborate avoidance strategies, from organizing a “Worries Box” to distract herself with minor, manageable tasks. The irony lies in the reader’s understanding that avoidance only amplifies anxiety; the test is drawing closer with every page turned, yet Clarice invests her energy in delaying the inevitable.

The technique is equally effective in building the emotional tension surrounding Betty Moody’s potential move. Clarice views every sign of Betty’s packing or preoccupation through the singular lens of impending abandonment. The reader, however, is frequently given clues—through overheard adult conversations or context within the illustrations—that the situation is more mundane or less catastrophic than Clarice imagines. For instance, the boxes are not necessarily for a long-distance move, but perhaps a temporary one, or simply a major reorganization.

Again, the AI has written an essay that kind of sounds good and might fool somebody unfamiliar with the book, but it ultimately falls down because it has the basic facts wrong. Like ChatGPT, Gemini has no direct access to the full text and is just guessing (well, generating) based on the flotsam and jetsam that critics, reviewers, et al have strewn around the Internet. Gemini’s mistakes are not subtle. For one thing, there is no geography test mentioned anywhere in the book. Meanwhile, Clarice has a worries notebook, not a box, and she doesn’t spend undue time writing in it or employing avoidance strategies. Moreover, there is no scene in the book in which Betty packs boxes for her move, and the situation with Betty is not “more mundane” or “less catastrophic” than Clarice imagines: her best friend literally moves from London to San Francisco.

In the spirit of fairness, I gave Gemini another prompt to try to hone its response and focus on a specific scene of dramatic irony, and it came back with more of the same about the geography test and Betty’s move, adding, “The peak of this irony comes when the news of Betty’s situation is finally revealed, specifically that Betty is not moving away forever, but just temporarily to the apartment upstairs due to a family situation (often related to house renovations).” Completely false, and completely lame.

I pointed out all the errors and asked Gemini if that was really the best it could do. It replied, “The detailed essay you requested on Clarice Bean, Don’t Look Now was written using the Gemini 2.5 Flash model. If you were looking for even deeper, more complex analysis, the Gemini 2.5 Pro model (available with Gemini Advanced) would be the choice optimized for that specific level of complex, multi-step reasoning.” Oh, so if I want a better essay I have to pay for it? What cheek! I almost decided to just give Gemini an F and move on. That would have made this post shorter. But, doggone it, if I’m going to do this, I’d better do it right.

Gemini’s second essay

I switched to version 2.5 Pro (which it turns out is offered on a limited basis for free) and here’s the gist of its new essay:

The central irony is this: the very “spy” techniques Clarice uses to gain control and uncover the truth are precisely what blind her to it, generating both the novel’s humor and its profound sense of childhood helplessness.

This irony is sharpened in Clarice’s “investigation” of her parents. Overwhelmed by stress (which the reader understands is likely related to their friends’ move, finances, or just the chaos of family life), Clarice’s parents have tense, private conversations. Clarice, filtering these events through her Ruby Redfort lens, interprets them as “clues” to a secret, impending divorce. She misreads their mundane exhaustion as a sinister conspiracy. The dramatic irony is that Clarice’s hyper-vigilance, her constant search for meaning, makes her less perceptive, not more…

Ultimately, the book’s title, Don’t Look Now, serves as the most direct summary of its central irony. Clarice believes her “looking”—her spying and investigating—is the solution. But the reader knows she is refusing to look at the one thing that matters: the deep, painful, and very normal emotion of sadness. She avoids the emotional core of her problem by obsessing over its mechanics.

I have to say, this is much better. Version 2.5 Pro didn’t hallucinate, and its argument is more cohesive. It’s not perfect; for example, it misses Clarice’s epiphany about the limits of the Ruby Redfort book and thus overstates her lack of perception. But this essay is so much better than what 2.5 Flash “wrote.”

So is that it, I give Gemini a C+ and move on? Not quite: remember, this chatbot benefited not only from my invoking its 2.5 Pro version, but from all the coaching I gave it in the chat. This distinction is crucial: AI LLMs do much better when you feed them high-quality prompts and lots of feedback to supplement their training data. It’s equally important to understand that your input is not itself training data that the model can use going forward. The benefit you provide dies with your session. Thus, AI doesn’t learn and get smarter the way a human would; its progress is much more gradual.
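
If you’ve only ever used these chatbots through their web apps, this statelessness may be surprising, but it’s plain to see at the API level: there is no memory on the far end to update, so your own code has to resend the entire conversation with every request. Here’s a minimal sketch using OpenAI’s Python SDK (the model name is just illustrative, and the same basic pattern applies to the other platforms’ APIs as well):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The "conversation" is just a list that we maintain on our side.
history = [{"role": "user", "content": "Write one line of dactylic trimeter."}]
reply = client.chat.completions.create(model="gpt-4o-mini", messages=history)
history.append({"role": "assistant", "content": reply.choices[0].message.content})

# Coaching works only because we resend the whole history with each request.
history.append({"role": "user", "content": "Too many syllables; try again."})
reply = client.chat.completions.create(model="gpt-4o-mini", messages=history)

# Start over with a fresh list and all of that coaching is gone; nothing we
# sent was folded back into the model's weights.

Which brings me to: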

Gemini’s third essay

To see how 2.5 Pro would do without all the coaching, I opened a fresh session on my work laptop (i.e., totally different login, no history of my chats). (Don’t worry, I did this on the weekend.) (If you’re my boss reading this, congratulations on finding my blog, and please consider that my working knowledge of AI is surely valuable in the workplace and you should give me a raise.)

I guess I wasn’t surprised that 2.5 Pro didn’t do so well this time, but what did surprise me is just how badly it crashed and burned. Here’s an excerpt:

The plot is set in motion by a catalyst of deliberate misinterpretation. A cryptic, unsigned letter containing the vague warning, “something terrible is going to happen,” is received not as a piece of misdelivered junk mail but as a profound, personal omen… The humor is generated directly from this disparity; the audience … understands that the “terrible” event will be domestic, not devastating. The characters’ frantic preparations—installing locks, suspecting neighbors—are thus rendered as escalating absurdities, a performance for an audience that already knows the final act.

OMG, it’s the worst essay yet: total hallucination. There is no cryptic letter in this novel, no locks installed, no suspicion of the neighbors. I called this out, the chatbot apologized profusely for having accidentally based its essay on a different book entirely, and then it tried again:

The gap between perception and reality generates the novel’s central tension. While Clarice is hunting for evidence of international espionage, the audience is processing signs of a painful family separation. The “mysterious man” Karl meets is not a sinister agent, but, as the reader strongly suspects, his father.

Again, pure hallucination! There is simply no “mysterious man” in the entire book. I challenged the chatbot, asking how it gets its source material, both when a work is under copyright and when it’s in the public domain. Gemini explained that for public-domain works its training data contains the full texts along with “the centuries of critical, scholarly, and secondary sources,” while its knowledge of copyrighted works “is built from secondary sources … book reviews, detailed plot summaries, fan wikis, essays, and educational matters about the book.” So basically it’s amateur hour: the AI can’t really differentiate between, say, an esteemed college professor and a (gasp!) lowly blogger. As you can see, this doesn’t always work so well. I’m going to give Gemini 2.5 Pro a D+.

As an aside that perhaps ought to be my thesis, I’d like to point out that the better AI gets at writing student papers, the worse off students—and the whole institution of higher education—will be. After all, the point isn’t for students to edify their instructors through their observations; the point is for the students to think and write for themselves. Yes, this is hard, but the right kind of hard, and through this struggle they ideally learn how to think and write, and can one day contribute in the realms of actual, non-student writing such as books, articles, or—worst case scenario—blogs.

Activity #2: original art

I’ve tinkered a lot with AI-generated art, usually to generate pictures to run at the top of my blog posts. It’s been pretty hit-or-miss; a picture that doesn’t stray into uncanny-valley territory, or commit a major gaffe like the wrong number of fingers on a hand, is all I’ve realistically hoped for. Today’s exercise is simple: I pitted the platforms against one another in the task of creating a picture for this post, featuring Clarice Bean. You can see the winner at the top, though you might cry foul: the art I ended up using is from Whisk, Google’s latest “experimental” image generator. I resorted to this new tool because I just wasn’t happy with the runners-up, as you shall see.

ChatGPT’s art

I asked ChatGPT, “Can you make a drawing for me of Clarice Bean reading albertnet on her tablet?” Not surprisingly, it mentioned the copyright and said, “I can’t generate or reproduce images of her or derivative works featuring her likeness” but offered to “generate an image of a cartoonish, freckled, red-haired girl reading a tablet, in the style of a children’s book illustration, but not resembling or referencing Clarice Bean specifically.” I agreed and here’s what it came up with:


I think you’ll agree that’s just about the most boring picture ever. It also has the classic issue of the subject holding the tablet backwards. This is just not that hard a prompt … what gives?

I said, “Make it a more realistic picture, please, and she should look a bit older, and have her in an armchair in her attic bedroom with a desk lamp, and reading the Ruby Redfort Survival Guide.” Maddeningly, the chatbot came up with a picture that was almost perfect, except that it made her look a bit too old (about 15) and gave her Instagram-worthy boobs, which seemed inappropriate and unseemly. The picture didn’t show a lot of skin, but still … totally unusable (and I don’t even want to post it here because it’s in such poor taste). I replied, “Please make her a bit younger and flat-chested.” The chatbot chided me: “I can’t modify or generate an image based on physical or anatomical details like that.” Like it was basically calling me pervy! It even offered to “create a child-appropriate illustration,” as though I’d asked for something that wasn’t. Sheesh.

Copilot’s art

I gave Copilot the same initial prompt I’d given ChatGPT, and here’s what it came up with:


This is almost as boring as ChatGPT’s picture, and for some reason it looks faded and I couldn’t get Copilot to fix that. At least the tablet is facing the right way. Note that Clarice is wearing the same red-and-white-striped shirt in this picture as the ChatGPT version of her, which is curious given that such a shirt appears nowhere in any of her books (at least that I can find). It’s actually the shirt Waldo wears, which I’d prove to you if I could only find him.

Other similarities between this picture and ChatGPT’s include the hair being the same length, the same level of detail (barely more than a cartoon), and a complete absence of any background detail. In delivering the picture, Copilot said, “Here you go - a stylized, collage-like illustration of a child reading a tablet, inspired by the playful textures you mentioned.” I don’t know what it means by collage-like, and I didn’t mention any “playful textures.” Whatever, chatbot.

Gemini’s art

I gave exactly the same prompt to Gemini, and it produced the corniest, least aesthetically pleasing picture yet:


Obviously this is a matter of taste, but would you agree there is no charm here? And what’s with the red-and-white-striped shirt appearing here, too? What are these AIs keying off of?

In Gemini’s defense, at least the little thought bubbles bear a slight resemblance to some of the art in the actual book. But again the tablet is backward and “albertnet” is spelled “alphabertnet” (weird misspellings being a common screw-up with AI art).

Frustrated by not having any good art yet, I tried ImageFx, another Google AI tool, and it gave me a photo-style picture with lavish detail, featuring both Ruby and her brother rocking red-and-white-striped shirts. I think it’s some kind of global AI conspiracy. What a relief when Whisk broke the cycle and generated the worthy picture you saw at the top of this post. I particularly like how Clarice is kind of staring off into space instead of at the book, clearly either pondering what she’s just read or distracted from her book by all the difficulties she’s working through.

Well, at long last that’s it for today. Tune in next week because I plan to pit these chatbots against one another again, this time writing poems in dactylic trimeter based on the best prompt an AI was ever given.

Other albertnet posts on A.I.

—~—~—~—~—~—~—~—~—
Email me here. For a complete index of albertnet posts, click here.

Friday, October 31, 2025

Tech Reflection - Two Sides of AI

“This Halloween, I’m dressing up as generative AI. I’m going to show up to the party without a costume and just start stealing pieces of other people’s outfits.”
An X dispatch my niece screenshotted for me

Introduction

Is AI the amazing new technology that’s changing the world, or a petty thief that just steals people’s ideas and passes them off as its own? Does it actually carry out anything approaching thought, or is it just a zombie, stalking humans’ digital relics and muttering “brains … brains … brains” as it angles to get a piece of us?


In this post, I examine the two most fundamental functions AI chatbots can carry out, and draw a distinction between the two. I believe this can give us useful guidance in deciding how we ought to use this game-changing technology.

Ecclesiastes vs. Barthelme

AI is evolving fast, perhaps faster than our ability to understand it. I’m having to adapt; for example, I’ve stopped spelling it “A.I.” because leading media outfits like The New Yorker and The New York Times have now eschewed the periods. So if you’re reading this on your phone in a sans-serif font you may have initially thought I was writing about there being two sides of Albert or Alfred. I asked ChatGPT what to do about this ambiguity between a capital “i” and a lowercase “L,” and it suggested I could “kern or tweak the glyphs.” I’m not exactly an expert at kerning glyphs, so I asked the chatbot how. It gave me all kinds of strategies, the best one for my blog format (HTML) being this:

AI <!-- default -->

A<span style="letter-spacing:0.05em;">I</span> <!-- slightly looser -->

So you can see, GPT is right there with an answer when queried about a technical operation that has been done before. But what about doing something creative and original? This is a fundamental distinction and I am going to propose we look at AI from two largely separate perspectives, for which I’ve invented labels:

  • Operational mode – I thought about calling this Ecclesiastes mode, for “no new thing under the sun.” This mode is about helping with a nuts-and-bolts operation (e.g., HTML scripting, DNS routing) that somebody else, probably many people in fact, already figured out and documented out there on the Internet for AI to gobble up, distill, pretty up, and present. Here, AI is basically a really good large language model that excels at combing through gobs of chaff to find answers, and at organizing and summarizing information very clearly. I wouldn’t say it’s as parasitic as what’s suggested by the X epigram above, because lots of people freely post technical stuff to the Internet just to be helpful, without thinking of it as sacrosanct intellectual property.
  • Creation mode – I think of this as Barthelme mode, named for the writer Donald Barthelme, because I think he’s the epitome of totally original, wacky, one-of-a-kind creative types with an absolutely distinctive voice. In other words, this is the intelligence that I am quite convinced AI could never approach. By creation I mean using the full capability of your own mind to advance ideas that are new, and yours.

The trouble is, many people don’t make any distinction between these two general areas of AI, so on the basis of its prowess as a natural language search engine, they are led to believe it can do a perfectly good job at creation mode. And since most people aren’t English majors, and in fact don’t respect English majors, AI platforms get to roll out some pretty inferior writing and everybody thinks it’s genius. (This widespread lack of sophistication is also why McDonald’s makes so much money.)

So what?

For many years, as I’ve lamented at length in these pages, kids have been told all the jobs are in tech, and they need to study STEM. And now, many of the kids who dutifully followed these marching orders are graduating from college with Computer Science degrees and not getting jobs, and tech is laying off gobs of people. Next time I meet a STEM major I’m gonna ask him, “Computer Science? What are you gonna do with that?”

So how did STEM go from meal ticket to food stamp? Well, I think it’s largely because AI is actually getting pretty good at the operational mode. It writes software so well, all industry needs is a seasoned coder to check it. Will we still have seasoned coders in 20 or 30 years, when all the current ones have retired and nobody has come through the ranks to replace them? Probably not, but that’s a whole other blog post somebody has surely already written. (I did blog about ChatGPT’s prowess with operational mode earlier this year, here.)

So as we look at AI, and particularly its role in our personal and professional lives, I think we need to ask ourselves what we have to offer that is rare and valuable, and how AI can help. Specifically, I believe we should be asking the question: how do we use operational AI to handle rote stuff, so we have more time to develop our unique, original ideas—so as to bring out our inner Barthelme?

What to use AI for

I have to confess, I love AI for light research when I’m blogging. The kernel of my posts always comes from my own brain, usually from pondering all kinds of things while I’m out on a solo bike ride. But ChatGPT is a great way to chase down and pinpoint something I had vaguely committed to memory. For example, when working on a recent post I asked it, “Can you track down the Lore Segal quote from ‘Her First American’ about ‘protocol is the art of not doing what comes naturally’?” I probably could have found this with Google, but the AI helped (and might have been indispensable here had I not remembered the name of the novel). ChatGPT was also super helpful when I was writing my post on induction ranges, in researching certain facts (e.g., energy efficiency info and whether government rebates are available).

AI is also pretty helpful at work, where I use a “walled garden” version my employer provides. (It doesn’t use any of my chats as training data for the AI’s ongoing education.) In fact, my employer exhorts all us employees to use AI every day. It’s like with any great tool: we’re expected to work more efficiently because we have it, so we’d better use it well. Recently, I took several product specification documents for different Internet hardware devices, fed them into an AI utility, and asked it to read them all, highlight the differences among the makes and models, and tell me which one I want for xyz purpose. This was much faster than poring through everything myself, which is a decidedly operational task. The report it generated was clear and reasonably concise, and probably won’t be read very carefully anyway. In fact, someone will probably upload it to a chatbot and have it summarized. All this is fine with me.
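For the curious, the skeleton of this chore outside a walled garden might look something like the sketch below. This is just an illustration in Python using the public OpenAI client; the folder name, the model, and the “small branch office” purpose are all made up, and my employer’s actual tool is proprietary, so don’t take any of this as gospel.

from pathlib import Path
from openai import OpenAI  # assumes the openai package is installed

# Gather the spec sheets (assuming they've been saved as plain text)
specs = {p.name: p.read_text() for p in Path("specs").glob("*.txt")}

prompt = ("Compare these hardware spec sheets. Highlight the differences "
          "among the makes and models, and recommend one for a small "
          "branch office.\n\n")  # hypothetical purpose
for name, text in specs.items():
    prompt += f"--- {name} ---\n{text}\n\n"

client = OpenAI()  # reads OPENAI_API_KEY from the environment
report = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(report.choices[0].message.content)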

One other great use for AI chatbots is to ask them for instructions for quotidian technical matters in your personal life, like disabling the child lock on your new microwave oven, charging your new bike’s electronic shifting, or restoring your playlist after updating your smartphone’s MP3 app. Sure, these are things you could look up on YouTube, but often that search can be tricky, and the videos can be agonizingly slow. The following video tutorial, which is crisp and concise and beautifully shot, is perhaps the exception that proves the rule:

I guess one benefit with YouTube is it’s less likely to hallucinate. I asked ChatGPT if my bike’s brake/shift levers have button cell batteries, and it explained in great detail how there are actually wires running from the battery pack to each lever, so they get recharged along with the derailleurs. The chatbot even drew me a nice diagram to illustrate this. Alas, it was hallucinating: the levers totally do have button cell batteries that need to be periodically replaced. But all this being said, it’s easy enough to sanity-check this kind of output, and I usually get a good answer from AI when I can’t locate a product owner’s manual or don’t feel like leafing through the 50-page one that I have, trying to get past the 14 foreign-language versions.

What NOT to do with AI

I think where people get into trouble with AI is when they try to get it to do their work for them, particularly writing documents or correspondence that they then pass off as their own. In some cases this is an ethical or even legal matter; as I described here, the New York Times is suing OpenAI for copyright violation, and I have firsthand evidence of ChatGPT essentially plagiarizing this blog. But I doubt you care overmuch about that. There are two bigger issues, I think:

  • What this “creation mode” usage does to the quality of “your” writing
  • What it does to the quality of your thinking

There’s this notion that you can ask a chatbot to write something for you, anything from an email to an invitation to a work report, and then you can just polish it up a bit, and you’re done. No more writer’s block! No more outlines, or worrying how to organize your thoughts! That might be okay for a very basic report, like the hardware comparison I described above. But when you write your own document from scratch, you’re not leveraging AI’s impersonal, sprawling training data; you’re drawing on your own: everything you’ve experienced, heard, read, and dreamt of. It’s your own personal muse, not the generic Internet one.

Honestly, for anything loftier than a rote technical document—that is to say, anything designed to edify, persuade, or entertain—haven’t you seen for yourself how AI can fail? Like, you’ll get this chipper invitation to a family reunion and it’s using corny phrases like “drum roll please” and joking about your family’s dance moves, and it just seems generic and clichéd? That’s all AI can do. It doesn’t know you or your family or friends well enough to say anything truly clever, and all the polish you want to give its rough draft won’t help. Your invitation will never have real style, along the lines of, “L— gets dibs on the guest room (which she may still anachronistically refer to as “her” “bedroom”) and its magnificent new king-sized guest bed. If you’re nice she might invite you to a slumber party there. Other guests can fight over the legendary Futon of Sand down in the home office. Beyond that, we have two large sofas for those interested in the college-esque party-‘til-dawn experience. We would not be offended if one or more parties were to seek a motel/hotel/AirBNB/VRBO, especially given the relatively small number of bathrooms here (i.e., one). Regarding rumors that the men are encouraged to pee in the backyard, this is true, but please stick to the planting beds and the fountain.” See how much better that is?

Now, you might be thinking, “Wait, I’m not a blogger and I wasn’t an English major. Cranking out an email or an essay may be easy enough for Dana, but I just want to get this task done and checked off.” But stop and think for a moment: what would you like to be good at, in life? Please tell me the answer isn’t just “typing good prompts into AI.” Wouldn’t you like to be articulate, interesting, and capable of thinking on your feet? Because what are you going to do at a cocktail party, or a job interview, or a non-virtual work meeting, when you don’t have a chatbot to help you, and that’s the habit you’ve let yourself fall into? The reality is, we get good at thinking by struggling to do it, for ourselves, the old-fashioned way.

So let’s not undervalue written communication by outsourcing it to AI. The best case scenario is that it’ll do an inferior job, replacing what could have been original thought with a pile of trusty clichés and/or stealthily plagiarized, slyly anonymized content. The worst case scenario is that it’ll actually get good enough that you never have to write for yourself again, and your brain can atrophy to the point that you’re not even a thinker anymore … just a chatbot operator.

Because you don’t care

Gosh, I guess I drifted into high-and-mighty, pompous, full-on pontification there, and I feel pretty sheepish about it! Fortunately, I’m realistic enough to sense you snickering, and I know you’re going to turn right around and keep on using AI for whatever you can possibly think of. That being the case, check back next week because I’m going to catch you up on the latest AI technologies and how much they’ve improved since my last check-in. Whether your chatbot of choice is ChatGPT, Gemini, or Copilot, I’ll have you covered. Until then, I’ll be getting back to what I really enjoy in life: kerning glyphs.

Other albertnet posts on A.I.

—~—~—~—~—~—~—~—~—
Email me here. For a complete index of albertnet posts, click here.