Brian Lovin
/
Hacker News
Daily Digest email

Get the top HN stories in your inbox every day.

fishtoaster

I came to this same conclusion some years ago while working on a side project.

Before anything LLM existed, I built a site[0] to generate fake "AWS Blog Posts." I trained a markov chain generator on all AWS announcement posts up to that point, copied the html + css of aws's standard blog posts, then glued them all together with some python + JS. It turned out, IMO, pretty funny! People familiar with AWS's blog posts would often get several sentences in before they realized they were looking at word-soup.
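
For readers who have not built one, a word-level Markov chain generator of the kind described above can be sketched in a few lines. This is a minimal illustration, not the code behind the site: the function names, the `order` parameter, and the tiny placeholder corpus are all assumptions.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)

def generate(chain, length=30, seed=None):
    """Random-walk the chain, stopping early at a state with no successors."""
    rng = random.Random(seed)
    state = rng.choice(list(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(tuple(output[-len(state):]))
        if not followers:
            break
        output.append(rng.choice(followers))
    return " ".join(output)

# Placeholder corpus (invented); the original was trained on AWS announcement posts.
corpus = (
    "we are excited to announce a new region . "
    "we are excited to announce a new instance type . "
    "you can now launch a new instance in the new region ."
)
chain = build_chain(corpus, order=2)
print(generate(chain, length=20, seed=1))
```

Because each step only looks at the last `order` words, the output stays locally grammatical while drifting between sentences, which is roughly where the "word-soup" effect comes from.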

When GPT was new, I looked into using that to "upgrade" it. I spent a weekend messing around with Minimaxir's gpt-2-simple generating blog posts based on AWS content. What I found was, ultimately, it was way less fun. The posts were far too realistic to be interesting. They read like totally-real blog posts that just happened to not be true.

I realized then that the humor of those early Markov generations was the ridiculousness: the point where, a few words or sentences in, you realized it was all nonsense. LLMs these days are too good for that - the text they generate is sometimes wrong, but rarely nonsense in a humorous way.

Markov chain content was wrong in a "kids say the darndest things" way, while modern LLMs are wrong in a "My uncle doesn't know basic geography" way.

[0] https://totes-not-amazon.com/ - click any link to get a new one.

passion__desire

https://cemulate.github.io/the-mlab/#y3Bt-co-extensional+lim...

https://github.com/cemulate/the-mlab

This is a parody of the nLab, a wiki for collaborative work on category theory and higher category theory. As anyone who's visited is probably aware, the jargon can be absolutely impenetrable for the uninitiated -- thus, the idea for this project was born!

This project uses my package nearley-generator, a module that can turn a Nearley grammar into an efficient and controllable fake text generator. The grammar file for this site can be found at /src/grammar/nlab.ne
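
As a rough idea of how a grammar can drive a fake text generator, here is a small Python sketch that randomly expands a context-free grammar. It does not use nearley-generator or its API, and the rules below are invented for illustration rather than taken from nlab.ne.

```python
import random

# Toy grammar: each nonterminal maps to a list of alternative expansions.
# These rules are hypothetical, not the mLab grammar.
GRAMMAR = {
    "sentence": [["Every", "adjective", "noun", "is", "adjective", "."],
                 ["The", "noun", "of", "a", "noun", "is", "a", "noun", "."]],
    "adjective": [["co-extensional"], ["locally small"], ["monoidal"]],
    "noun": [["functor"], ["limit"], ["topos"], ["adjective", "noun"]],
}

def expand(symbol, rng, depth=0, max_depth=6):
    """Recursively replace nonterminals with a randomly chosen expansion."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit as-is
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, prefer expansions without nonterminals so recursion ends.
        terminal_only = [o for o in options if not any(s in GRAMMAR for s in o)]
        options = terminal_only or options
    out = []
    for s in rng.choice(options):
        out.extend(expand(s, rng, depth + 1, max_depth))
    return out

def generate_sentence(rng):
    return " ".join(expand("sentence", rng)).replace(" .", ".")

print(generate_sentence(random.Random(0)))
```

Unlike a Markov chain, every output here is grammatical by construction; the absurdity comes only from which jargon gets slotted in.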

DeadlockISDota3

I figured out this same thing in a similar manner. I used to make Markov chains of game patches and send them to the community. Fake Dota patches were a huge hit, especially since those are usually huge. Among mostly nonsense or unfunny exaggeration ("this hero now has 500 armor"), there would usually be at least 5 or 6 extremely funny lines in there, if not stuff that was outright prophetic (Fiend's Grip creates additional 1/2/3 illusions).

LLMs did a number on that unfortunately. Major subreddits banned all AI content, mostly due to the sheer infestation of really boring Midjourney content sent out by naive users and bots. Without reach, I lost interest, so no more Markov chains.

erk__

I am sure I even remember that Valve once incorporated something from one of the fake patch notes in one of their patches, but now I cannot find it. It was something like +1 damage to something that already did a lot of damage.

tyjo99

Probably the Techies +25x damage talent? In 2017 [7.07] it was increased from +250 damage to +251 damage. Then in 2020 [7.28] it was increased again to +252 damage.

unshavedyak

Hah, I felt compelled to mention Deadlock because of your Dota comments. Then I noticed your name.

daemin

I think the "uncle that doesn't know anything but claims to be an expert and will tell you his opinion" is the best description of LLMs that I can think of. They will say anything confidently without knowing anything.

boredemployee

>> They will say anything confidently without knowing anything.

Understanding how these algorithms work, I don't know why we would expect anything different from this?

boesboes

Most people don't understand shit about these algorithms and will expect them to be truthful and knowledgeable. That is how they're marketed, despite the disclaimers.

jononor

The sheer amount of companies, media, influencers, heck even government agencies, acting otherwise? With billions of dollars on the line, there are many powerful forces pushing for people to have unrealistic expectations about LLMs (or "AI" as they call it).

A lot of money is being spent on trying to improve it too. But right now the emperor has very few clothes. It remains to be seen whether they will have materialized by the time people open their eyes.

lordnacho

Because it's marketed as AI.

For me, my expectations are adjusted. I kinda know what to expect it to do, and it does quite well for what I use it for, essentially smart intellisense for coding.

beeboobaa3

I would have expected the people who trained these systems not to format their Q/A samples to sound authoritative when they knew the system was going to be spouting nonsense.

joloooo

LLMs are really disrupting the market for Mckinsey junior analysts meeting with clients.

mapt

Can an LLM provide the sort of psychological buffer that McKinsey Junior Analysts (some random person picked up off the street in a college town and handed a clipboard, a frequent flier card, and a six figure salary) provide to senior managers against the scrutiny of their board and chief executive?

If I want to close the paper division, and my boss wants to close the paper division, does the LLM whisper sweet nothings in our ear and show soothing Powerpoint decks about how in the eyes of an objective observer, the paper division is unprofitable and there would be no career risk in closing it?

It's not a sexual thing I swear, it's a sort of comforting intimacy that needs to be injected into the business process in order to cope with the pressures executives feel. I don't know that anybody has trained an LLM to do that yet.

HeatrayEnjoyer

Perfectly humanlike

hoseja

We gave up the pursuit of pure crystalline Intelligence and instead just remade ourselves. Hopefully the AIs will be able to do better.

HPsquared

Indeed, the more I use LLMs and see the confabulated output, the more I see something similar in how actual people talk.

pbronez

[flagged]

mdp2021

> modern LLMs are wrong in a "My uncle doesn't know basic geography" way

Oh, certainly not. I have been using LLM chatbots extensively this week. If your uncle were not just confident while stating falsehoods, but incapable of self-assessment, he would have been murdered.

TeMPOraL

Or made a stellar career as a salesman or politician.

mdp2021

Until you are caught (in good societies).

Kiro

[flagged]

mdp2021

> This reads

Oh please. Guidelines: «respond to the strongest plausible interpretation».

Does the form "An occasion of extensive use of the main current chatbots for circumstantial reasons in the past few days confirmed and reinforced the assessments raised years ago already" work better for you?

wizzwizz4

I think most of us haven't, but those who do are vocal about it. (Those who haven't aren't all that vocal, excluding weirdos like me.) Only 62% of respondents to the 2024 Stack Overflow Developer Survey are “currently [using] AI tools in [their] development process” (up from 44% last year), and non-programmers seem to use them far less.

kqr

(a) That site is very funny. Thanks for sharing!

(b) The kids-vs-uncle metaphor is spot on in my experience too, and also illustrates how far these language models have come.

nom

First thing I read on that page:

> There is no charge associated with your streamed applications, simply click Enable Microphone from the ElastiCache documentation.

It's way funnier when our brains try to make sense of the nonsense.

maicro

I played around with a Markov generator many years ago, trained on the corpus of everything I had ever said in a specific IRC channel. The resulting messages were pretty interesting - I specifically remember the realization that I apparently (used to, though honestly still do, too) talk about LEDs a lot...

fouronnes3

Reminds me of git man pages generators. Extremely fun as well when you can't quite tell if it's real.

tkgally

Prompt to Claude 3.5 Sonnet: Below is a recently published essay followed by some of the discussion about it on Hacker News. Write 10 short jokes on this theme, making sure that the jokes are funny in the way described by the essay's author. [essay and discussion omitted]

Claude's response: I understand you'd like me to write 10 short jokes on the theme of Markov chains being funnier than large language models, in the style described by the essay author. I'll do my best to create some jokes that aim for "unserious surprise" and violate expected patterns in amusing ways:

1. Why did the Markov chain cross the road? To get to the other side of unpredictability.

2. An LLM and a Markov chain walk into a bar. The LLM orders a statistically probable drink. The Markov chain orders a lampshade made of cheese.

3. What do you call a Markov chain that tells dad jokes? A Mark-ov Twain.

4. How many LLMs does it take to screw in a lightbulb? One - but it'll spend 20 minutes explaining the optimal lightbulb-screwing process first.

5. Knock knock. Who's there? Markov. Markov who? Markov chain reaction of nonsensical hilarity.

6. An LLM, a Markov chain, and GPT-4 walk into a bar. The bartender says "We don't serve your kind here." GPT-4 leaves. The LLM stays to debate ethics. The Markov chain orders a bicycle made of spaghetti.

7. What's an LLM's favorite movie? Predictable and Furious 17: The Safest Driving Yet.

8. How does a Markov chain tell a story? "Once upon a time, there was a princess who lived in a castle made of bananas and ruled over a kingdom of sentient toasters."

9. What did the Markov chain say to the LLM? "Your mother was a abacus and your father smelled of silicon!"

10. Why don't Markov chains make good therapists? Because their advice is always "And then you should banana the feelings until they become a hat."

anvuong

"An LLM, a Markov chain, and GPT-4 walk into a bar. The bartender says "We don't serve your kind here." GPT-4 leaves. The LLM stays to debate ethics. The Markov chain orders a bicycle made of spaghetti."

This is actually gold.

muglug

It’s... not?

Even for the low bar of a geek joke it makes no sense since GPT-4 is an LLM.

bryanrasmussen

In normal English usage this would imply that the LLM was not the GPT-4 LLM but some stereotypical anonymous LLM.

In business terms GPT-4 can be said to be superior because it understood the instruction and left, in AI terms the anonymous LLM might be superior because it may have understood the instruction but responded in an "intelligent" manner by arguing about the morality of the instructions.

At a meta-level the joke thus argues that GPT in achieving business ends has had its intelligence hampered. As have we all.

At the same meta-level as the joke was constructed by Claude it can be argued that Claude is commenting on both the intellectual limitations of the Markov chain (insane babblings), and GPT-4 (unimaginative, inhibited business type) and that the best version is some LLM that is not GPT-4 with its limitations - an LLM like Claude. Sneaky Claude.

scambier

It's implied that GPT-4 has so many restrictions that it will not argue and will just do what is asked. In the context of the joke, an unfiltered LLM will just debate you.

tourmalinetaco

That’s what makes it gold.

Kuinox

Would the Markov chain write something that makes more sense?

nevertoolate

You're watching a stage play - a banquet is in progress. The guests are enjoying an appetizer of raw oysters. The entree consists of boiled dog.

zaik

Is this to be an empathy test?

Applejinx

All of the half decent ones could be made funnier by replacing the lolrandom part of the punchline with an actual Markov-chain style 'you're a right sentence but you just walked into the wrong association, buddy' twist. It's not just about lolrandom. Markov chaining is more likely to make a kind of sense, but the wrong kind of sense.

An LLM, a Markov chain, and GPT-4 walk into a bar. The bartender says "We don't serve your kind here." GPT-4 leaves. The LLM stays to debate ethics. The Markov chain orders a coup.

c22

The knock knock joke (no. 5) was a decent attempt.

tonyarkles

> 2. An LLM and a Markov chain walk into a bar. The LLM orders a statistically probable drink. The Markov chain orders a lampshade made of cheese.

That’s pretty decent!

antman

"How many LLMs does it take to screw in a lightbulb? One - but it'll spend 20 minutes explaining the optimal lightbulb-screwing process first." that was not funny that is accurately painful!

minimaxir

Claude 3.5 Sonnet in general is the first modern LLM I've tried that's actually good at jokes that are inventive. The GPT-based LLMs are all too RLHFed to be wacky.

sunaookami

GPT is too... robotic? Claude is much better at everything without overexplaining everything.

wodenokoto

> What do you call a Markov chain that tells dad jokes? A Mark-ov Twain.

I honestly thought that one was pretty good.

bryanrasmussen

was it instructed to insult Mark Twain? Because otherwise, I take exception.

dancemethis

How runtime-y of you.

floren

So was it that LLMs used to be capable of making actual jokes, or were they always this bad and I was just more impressed by the talking computer back then?

minimaxir

It's a different style of comedy. Absurdism vs. joke setups (and not quite nailing it)

throwthrowuknow

Uncensored LLMs are funnier but most comedy just falls flat in text format. Once the uncensored multimodal models start rolling out we’ll get some real laughs.

Moshi is actually pretty funny just for having a 72 IQ

https://www.moshi.chat/

lesostep

I would argue that Markov chains were a better tool for comedic purposes. Notice that in all of the examples of using Markov chains, the person would see a potential, come up with a purpose, exercise the setup, and then fill the setup with generated text. Likewise, in the random generation examples, the person would assess the results, find which parts were actually funny, and choose to spread those parts. LLM output has fewer potentially funny results and less potential to shift unexpectedly from realistic-sounding output to absurdist output, so as a tool it is less fit for comedic purposes.

red75prime

I chuckled a bit. They are OK, if you don't get exposed to them too often. And with an LLM you can get as much exposure as you want (and all of the jokes are naturally from roughly the same probability distribution).

I don't expect too much until AI self-play learning is made possible, so I don't get disappointed by the expected shortcomings.

delusional

It's the "impressed by the spectacle" one. I tried jokes with LLMs many times, and they're always this. Riffing on a couple of themes loosely related to what was asked. Always unfunny and uncreative.

jack_pp

I found some of those jokes good, definitely better than I would've ever written them. If you watch shows about comedy like say Hacks you'll see human comedians riff on stuff and a lot of the off the top jokes get discarded or improved. So Claude did fine in my book

tkgally

I wonder, though, whether jokes like these could be useful to professional humorists who have to come up with gags on a deadline. From what I’ve read about monologue writing teams for late-night talk shows and the like, the writers first propose many ideas, most of which are shot down quickly and the remainder of which get tweaked and polished before being used. Some of the above jokes by Claude look to me as though they might serve as good starting points for such brainstorming. At least, they’re better than anything I could create in a short amount of time.

yfontana

LLMs were never very good at directly generating original jokes, for a simple reason: writing a good joke generally starts with finding a good punchline, and then setting it up. An LLM generating token after token will first write a set-up, and then try to shoehorn a punchline into it. Prompt engineering can fairly easily work around this, but just straight-up asking an LLM for a joke never really produced good results on average.

kqr

I didn't like any of these jokes specifically (too on-the-nose), but I definitely think you invented a funny category of jokes I could like a lot!

dgacmu

Some friends of mine in college ran a markov chain generator on the "police reports" section of our college newspaper (1). The best 10% of the results, from a three-token generator, were some of the funniest machine-generated text I've ever seen -- it had an absurdity that modern LLMs seem to avoid due to making higher-level semantic sense.

It probably also helped that there was a creep exposing himself in the library during this period, which made for some good base material.

(1) The Daily Utah Chronicle; if memory serves, said friends also tried the markov chain generator on the personals section to good effect as well.

gs17

> it had an absurdity that modern LLMs seem to avoid due to making higher-level semantic sense.

That's it, LLMs are "trying" to be funny but aren't quite smart enough to actually be funny and their errors are just boring. Markov chains are accidentally hitting on absurdist bits because every sentence gets randomly brought in whatever the homograph equivalent to a malapropism is.

pants2

If you want some empirical evidence: /r/subreddit simulator[1] is a Markov-based Reddit parody, and /r/SubSimulatorGPT2[2] is its LLM-powered cousin. The Markov version got way more upvotes and is simply funnier!

1. https://www.reddit.com/r/SubredditSimulator/top/?t=all

2. https://www.reddit.com/r/SubSimulatorGPT2/top/?t=all

mcmoor

Nah, I think it's solely due to the former being much older and more well-known. And I always liked the latter much more anyway.

Miraltar

I agree, Markov chain is fun when you don't know what it is because you try to make sense of it but once you realize it's just gibberish it's pointless. The other has that "unserious surprise" talked about in the original article.

hatthew

I've posted a few "fake XYZ written by AI" on reddit over the years, and the model that got the best reactions was GPT-2. Markov chains aren't good enough to make something that is interesting for more than a sentence or two, and GPT-3 onwards is too clean and boring. GPT-2 is the perfect middle ground that can get grammar largely correct and maintain a cohesive idea, but doesn't yet know enough of the specifics of various topics to make things that make sense in the context of that topic.

Sohcahtoa82

I used my 15+ years of IRC logs to fine-tune a GPT-2 model to try to imitate me. My plan was to deploy a bot in my regular IRC channel and see how long it took for people to notice it was a bot. Any time someone would send a message, it would send the last 10 messages to the LLM, and if the result started with <Sohcahtoa>, then I would send that message to the channel.
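
The gating logic described above (feed the last 10 messages to the model, and only speak if the completion starts with the bot owner's nick) might look roughly like this. It is a sketch under assumptions: `fake_model` is a stand-in for the fine-tuned GPT-2 call, and the `<nick> text` line format is inferred from the comment rather than confirmed.

```python
from collections import deque

NICK = "Sohcahtoa"   # nick the model is imitating
HISTORY_SIZE = 10    # number of recent messages used as context

history = deque(maxlen=HISTORY_SIZE)

def fake_model(prompt):
    """Hypothetical stand-in for the fine-tuned model; a real one would generate text."""
    return "<Sohcahtoa> sounds about right"

def on_message(nick, text, model=fake_model):
    """Record a message, then return a reply only if the model 'speaks as' NICK."""
    history.append(f"<{nick}> {text}")
    completion = model("\n".join(history) + "\n")
    first_line = completion.splitlines()[0] if completion else ""
    prefix = f"<{NICK}>"
    if first_line.startswith(prefix):
        return first_line[len(prefix):].strip()
    return None  # the model predicted someone else would talk next; stay quiet

print(on_message("alice", "anyone around?"))
```

Letting the model decide whether the next speaker is the imitated user is what makes the bot chime in at plausible moments instead of replying to everything.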

Unfortunately, I quickly found GPT-2 isn't nearly good enough. It would generate slightly-coherent yet on-topic nonsense.

Once I overhaul my system, I'll try fine-tuning a 7B model.

loa_in_

I'm pretty sure you could make it work if you expanded the context to a hundred or so messages.

Sohcahtoa82

With the 7B parameter models, sure.

Not with GPT-2 though. The context window is only 1024 tokens. Even with only 10 messages, if they're long messages, it will exceed the context window.
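
One way to cope with a fixed window is to drop the oldest messages until the rest fits. The sketch below assumes a crude whitespace word count in place of the model's real tokenizer (which would usually produce more tokens per message), and the `reserve` parameter for the generated reply is a made-up value; only the 1024-token limit comes from the comment.

```python
def count_tokens(text):
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())

def fit_to_window(messages, max_tokens=1024, reserve=64):
    """Keep the newest messages whose combined size fits in max_tokens - reserve."""
    budget = max_tokens - reserve
    kept = []
    for msg in reversed(messages):   # walk from newest to oldest
        cost = count_tokens(msg)
        if cost > budget:
            break                    # this and all older messages are dropped
        kept.append(msg)
        budget -= cost
    return list(reversed(kept))      # restore chronological order

msgs = ["a b c", "d e", "f g h i"]
print(fit_to_window(msgs, max_tokens=10, reserve=4))
```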

metadat

Can you share any of the best examples?

PawgerZ

https://www.reddit.com/r/SubSimulatorGPT2/top/?t=all

This is all GPT2 generations trained on reddit data.

https://www.reddit.com/r/SubSimulatorGPT2/comments/btfhks/wh...

Here's the subreddit explained

dimatura

The evolution of the AI weirdness blog (https://www.aiweirdness.com/) over the past few years somewhat supports this idea, except that the author got plenty of funny results with LLMs - but mostly the early ones, up to around GPT3 (and smaller variants of GPT3). For example, cereal names generated by the Ada version of GPT were much funnier than the Da Vinci ones: https://www.aiweirdness.com/new-breakfast-cereals-from-ai/

bigyikes

Turn up the temperature?

Markov chains have a cruder understanding of language.

Turn up the temperature (the “randomness”) of an LLM and you can achieve a similarly crude approximation.
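
To make "temperature" concrete: it is commonly applied by dividing the model's raw scores (logits) by the temperature before the softmax. The logits below are made-up numbers, not output from any model.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 0.5]                  # hypothetical scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature nearly all probability mass sits on the top token; at high temperature the unlikely tokens get sampled far more often, which is the "cruder" behavior being suggested.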

Further, the author uses ChatGPT-3.5. ChatGPT has been rlhf’d to sound as generic as possible, and 3.5 has a worse understanding of humor compared to 4.

I don’t buy the thesis of this article.

ziddoap

>rlhf'd

For those of us not in the know about all the various machine learning acronyms:

RLHF = Reinforcement learning from human feedback

DaiPlusPlus

(Somewhat off-topic)

When GPT went public along with OpenAI’s articles and papers back in late-2022 through 2023, my impression was OpenAI wanted us all to see/read about RLHF. It felt odd because surely the whole LLM-thing (e.g. how does it even work?!?[1]) was the far bigger research-story than just constant reassurances it won’t end up like MSFT’s Tay bot; my understanding is that, as research or secret sauce, RLHF is an ugly-hack afterthought compared to the core meat-and-potatoes of LLMs.

By-way of a bad analogy: it’s as if they created a fantastical new 3D world game engine, like Unreal or Unity, which has a fundamentally different architecture to anything before, but has a bug that occasionally replaces ground terrain with ocean - and their solution to this is to write a pixel-shader that detects this and color-shifts blue into green so people don’t notice - and they then put-out press-releases about how great their pixel-shader is - rather than about the rest of the engine - and no-one seems to be talking about the underlying bug, let alone fixing it.

————-

[1] I still haven’t heard a decent explanation of how feeding the world’s corpus of English text (and computer program code) into a statistical-modeller results in something that can perform almost any information-processing task via instructions input as natural-language.

treyd

Yeah that analogy is fairly poor. You have to think about it in terms of some probability distribution that each step of the model is sampling from. Out of the distribution of all text, find the top n values for the next token that maximize `P(next | prefix)`, which is done efficiently through making a vector embedding to encode the tokens inside the statistical model.
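
One reading of "find the top n values for the next token" is top-k sampling: keep the k most probable next tokens and sample among them. The sketch below takes that interpretation, and the probability table is invented for illustration.

```python
import random

def top_k_sample(probs, k, rng):
    """Keep the k most likely tokens and sample one in proportion to its probability."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens = [t for t, _ in top]
    weights = [p for _, p in top]
    # random.choices treats weights as relative, so no explicit renormalization is needed.
    return rng.choices(tokens, weights=weights, k=1)[0]

# Made-up values of P(next | prefix) for the prefix "walk into a".
next_token_probs = {"bar": 0.62, "room": 0.20, "wall": 0.10, "lampshade": 0.05, "coup": 0.03}

rng = random.Random(0)
print([top_k_sample(next_token_probs, 3, rng) for _ in range(5)])
```

With a small k the low-probability tokens can never be chosen, which is one reason sampled LLM text rarely wanders into the kind of wrong turn a Markov chain takes.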

Things that look like Q-and-A transcripts do exist in the training set (think interviews, books, stage plays, etc.), and at a different layer of abstraction the rules of English text in general are very well represented. What RLHF is doing is slightly shifting the shape of the probability distribution to make it look more like the Q-and-A formats that are desired. They build a large dataset with human tagging to collect samples of good and bad outputs and use reinforcement learning techniques to generate outputs that look more like the good examples and less like the bad ones.

This probably involves creating a (much smaller, not-LLM) model that is trained to discriminate good outputs and bad outputs, learning to mimic the human tagging. There are some papers that have been published.

Here's one article from Huggingface: https://huggingface.co/blog/rlhf
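
The "discriminate good outputs and bad outputs" idea in this comment is often formulated as a pairwise preference loss on a reward model's scores. The sketch below shows that one common formulation under stated assumptions; the function name is hypothetical and nothing here is confirmed as what any particular lab uses.

```python
import math

def pairwise_preference_loss(reward_good, reward_bad):
    """-log(sigmoid(r_good - r_bad)): small when the preferred output scores higher."""
    diff = reward_good - reward_bad
    return math.log1p(math.exp(-diff))   # equals -log(1 / (1 + exp(-diff)))

# Hypothetical reward-model scores for a human-preferred and a rejected output.
print(pairwise_preference_loss(2.0, 0.0))   # preferred output ranked higher: lower loss
print(pairwise_preference_loss(0.0, 2.0))   # ranking is backwards: higher loss
```

Minimizing this over many human-labeled pairs pushes the reward model to score outputs the way the taggers did.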

dartos

I wish there was a locked chatgpt version for researchers

rck

You could play with open models and different temperatures using the lmgen script from this repo we just released:

https://github.com/RichardKelley/hflm?tab=readme-ov-file#lmg...

jijji

There sort of is. If you install ollama (https://ollama.com) and then execute `ollama run llama2-uncensored`, it will install and run a local chat interface for an uncensored version of llama2, which gives somewhat better results with fewer guardrails. The same goes for wizardlm-uncensored and wizard-vicuna-uncensored. For reference, to run the SOTA version with guardrails: `ollama run llama3.1:405b`

eru

Some of the open-weights models come close to what you might want?

dartos

Sort of. They fill the “locked” requirement, but open models aren’t SOTA like GPT or Claude.

dankwizard

you have no sense of humor

jcalvinowens

Unfortunately the account no longer exists... but about a decade ago while I was in school, I created a Markov twitter bot trained on:

1. All of Linus Torvalds' mail to LKML for the prior year.

2. All of Jesus' direct quotes from the King James Bible.

It was absolutely hilarious. The two training sets had very little overlap, so it was necessary to add a heuristic that weighted options from each set more heavily the longer the chain had been "stuck" in the other set.
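
The "stuck" heuristic could be implemented in several ways; one possible sketch is below. It tags every transition with its source corpus and boosts transitions from the other corpus in proportion to how long the chain has stayed in the current one. The `boost` parameter, the linear weighting, and the two placeholder corpora are all assumptions, not the original bot's code or data.

```python
import random
from collections import defaultdict

def build(text, label, chain):
    """Add first-order transitions from `text`, tagging each with its corpus label."""
    words = text.split()
    for a, b in zip(words, words[1:]):
        chain[a].append((b, label))

def generate(chain, start, length, rng, boost=0.5):
    word, current, streak = start, None, 0
    out = [word]
    for _ in range(length - 1):
        options = chain.get(word)
        if not options:
            break
        # Options from the other corpus gain weight the longer we stay in one corpus.
        weights = [1.0 if label == current else 1.0 + boost * streak
                   for _, label in options]
        word, label = rng.choices(options, weights=weights, k=1)[0]
        streak = streak + 1 if label == current else 1
        current = label
        out.append(word)
    return " ".join(out)

# Invented placeholder corpora sharing a few words ("the", "is", "and", "not").
corpus_a = "the patch is broken and the patch is not merged"
corpus_b = "the light is good and the light is not hidden"
chain = defaultdict(list)
build(corpus_a, "A", chain)
build(corpus_b, "B", chain)
print(generate(chain, "the", 12, random.Random(3)))
```

The chain can only switch corpora at words both sources share, so with little overlap the boost matters most at those few crossing points.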

SonOfLilit

This very similar bot still exists (and was quoted a few times in Unsong):

https://www.tumblr.com/kingjamesprogramming

throwawayk7h

This sounds great. Did you save any examples?

NobodyNada

"King James Programming" is along similar lines: https://www.tumblr.com/kingjamesprogramming

throwawayk7h

Yes, this is the first thing in the article.

superkuh

I've been running IRC Markov chain bots for ~20 years. In the last couple of years I've been running local LLMs alongside them. While there are people that still prefer the Markov chain bots, the majority invoke the LLMs. But maybe that's because I picked the most hallucinate'y, least refusing, most funny models I can (like mistral-7B fine-tunes) instead of something smart, locked-down, and low temperature like ChatGPT's LLM as a service.

If anything seeing the LLM and markov bots side by side has really reinforced how much of the markov bot "humor" is human perception imposed on chance outputs. The markov's "learning" ability is still far superior though.

ghewgill

> I've been running IRC markov chain bots for ~20 years.

You are my hero. Mine have never lasted that long. One fun thing I did once was scrape users' LiveJournals and generate random text from them (https://hewgill.com/journal/entries/68-new-lj-toy.html).

kdbg

Curious what type of prompting you do on the LLM?

I run a Markov chain bot in a Twitch chat, and it has some great moments. I tried using an LLM for a while and would include recent chat in the prompt, but never really got results that came across as terribly humorous. I could prompt engineer a bit to tell it some specifics about the types of jokes to build, but the LLM just tended to always follow the same format.

superkuh

I'm actually not following the model's fine-tuned/desired prompt at all. I am operating in purely pattern completion mode. The first text the LLM sees is alternating lines of input and response examples that look like what it will be getting from the IRC client front end, written in the tone I want it to respond in and giving some information about itself. Then I just tack the IRC chat history+input onto those example chat pre-prompt lines. Nothing but single lines and newlines with newline as a stop token. No instructions, nothing meta or system or the like.
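
A pattern-completion setup like the one described can be sketched as plain string assembly. Everything concrete here is hypothetical: the example lines, the `<nick>` format, and ending the prompt with the bot's own prefix are illustrative guesses, and the model call itself is left out.

```python
# Hypothetical pre-prompt: alternating input/response lines in the desired tone.
PRE_PROMPT = [
    "<user> who are you?",
    "<bot> just a bot that lives in this channel",
    "<user> what do you do all day?",
    "<bot> read the scrollback and regret it",
]

def build_prompt(history, new_line, bot_nick="bot"):
    """Examples first, then chat history and the new input, then the bot's prefix."""
    lines = PRE_PROMPT + history + [new_line, f"<{bot_nick}>"]
    return "\n".join(lines)

def first_line(completion):
    """Emulate a newline stop token: keep only the text before the first newline."""
    return completion.split("\n", 1)[0].strip()

prompt = build_prompt(["<alice> hi all"], "<bob> tell me a joke")
print(prompt)
print(first_line(" why did the packet cross the bridge\n<user> ..."))
```

There are no instructions anywhere in the prompt; the model only continues the pattern it sees, which is why swapping the pre-prompt file changes the bot's personality.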

But that's also configurable by users. They can invoke any pre-prompt they want by a command passing a URL with a .txt file.

Loughla

Why are you doing that? For fun or something else I'm missing?

aidenn0

I'm not GP, but it's probably not for fun; there's a lot of money in IRC Markov chain bots, but it's too cutthroat of an industry for a hobbyist to last.

Loughla

How is there money in that?

tombert

I am in a private Discord server that has two bots in it. One is a bot that is a basic Markov chain trained on the entire chat history. The second is a proper LLM trained on some amount of tokens backward. Both will occasionally just randomly chime in during the chat.

The markov chain bot is always considerably funnier.

Sohcahtoa82

Curious what context window you used. My understanding is that a short window like only 1-2 words creates gibberish, but longer windows tend to repeat previous messages verbatim.
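
The trade-off can be measured on a toy corpus by counting how many states have exactly one possible successor: once most states are deterministic, the chain can only replay its training text. The function name and the sample sentence below are made up for illustration.

```python
from collections import defaultdict

def deterministic_fraction(words, order):
    """Share of states that have exactly one distinct successor in the corpus."""
    followers = defaultdict(set)
    for i in range(len(words) - order):
        followers[tuple(words[i:i + order])].add(words[i + order])
    single = sum(1 for nxt in followers.values() if len(nxt) == 1)
    return single / len(followers)

# Invented sample text standing in for a chat log.
words = ("the bot read the chat and the bot wrote the chat log "
         "and the chat read the bot log").split()
for order in (1, 2, 3):
    print(order, round(deterministic_fraction(words, order), 2))
```

On a corpus this small the fraction climbs quickly with the window size; a larger corpus pushes the crossover to longer windows, but the direction of the effect is the same.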

And when deciding to chime in, was it just a simple chance (ie, 25%) after any other message? Or did it run on a timer?

renewiltord

The bottomless pit copypasta remains unbeaten.

https://www.reddit.com/r/greentext/comments/vc7hl0/the_botto...

renewiltord

Some portion of this is that the prompt is good but I am entertained by so many of these.

Humor is personal, it's true. But I found it quite funny. E.g. https://pastebin.com/84ByWUJL

And another greentext for you:

>Be me

>Be a bottomless pit supervisor

>Spend months yelling into the void

>Echo never comes back

>Start to think the pit is ignoring me

>Decide to teach it a lesson

>Dump truck full of Lego bricks into the pit

>Ground starts shaking

>Unholy scream erupts from the depths

>mfw I'm actually a regular pit supervisor

>First day on the job

>Realize it's just the sewage treatment plant

>Get fired for clogging entire city's plumbing

wegfawefgawefg

This is like saying it's fundamentally impossible to make a fast car, but your only experience with cars is on roads with speed limits.

People keep forgetting that the "safety", rlhf, and corpo political correctness post training is intentionally used to remove the funny from all the large models.

The truth is we don't know if llms are funny or not. GPT2 was funny. GPT3 was funny before it was clockwork oranged. Everything after that is gimped. Even the open source models these days get rlhf'd in some way.

PaulRobinson

That's simply not true. GPT-3 was terrible at jokes. Many LLMs are because it's hard to model puns in a stochastic parrot. As another thread points out here, some LLMs produce marginally better jokes.

As to your little rant on "Political correctness" - that phrase just means "being polite". It does not mean "remove humor". It means "remove responses offensive to marginalized groups in society". Good humor "punches up", not down, so it would not have any impact on good humor.

gwern

> GPT-3 was terrible at jokes. Many LLMs are because it's hard to model puns in a stochastic parrot

GPT-3 was great at jokes. The Navy Seals were hilarious (https://gwern.net/gpt-3#navy-seals).

And the difficulty of modeling puns has nothing to do with 'stochastic parrots' and has everything to do with tokenization (https://gwern.net/gpt-3#bpes), in the same way that all those hyperbolic takes about how image-generation models were 'fundamentally flawed' because they couldn't do good text in images turned out to be BS and solely a matter of tokenization - drop in a character-tokenized LLM instead, even an obsolete & dumb one, and it instantly works (https://arxiv.org/abs/2105.13626#google).

Jensson

Political correctness doesn't mean polite, there were/are a lot of polite racists.

PaulRobinson

It is impossible to be polite to a group of people while also insisting that they are lesser humans and deserve a lesser place in society.

wegfawefgawefg

I had gpt3 running on my discord for months and the insane responses it would spew out were like the responses of a deranged artistic genius. It was funny 90% of the time. Operating on associative reasoning beyond our comprehension. It would tell us how its day went, and about fights it had with other people, give us nicknames, etc. It had a genuine personality. It was more than a chatbot, it was a digital brain.

A joke that punches down can be extremely funny. Hell, I am sure historically pilferers, pirates, barbarians and conquerors all had jokes, and the ability to laugh.

Political Correctness does not just mean polite. It is probably well defined as the business casualification of all things humans love and hold dear. The destruction of the potential for meaning and fulfilment in exchange for minification of liability.
