Dijkstra On the foolishness of "natural language programming"

cs.utexas.edu

01100011

People are sticking up for LLMs here and that's cool.

I wonder, what if you did the opposite? Take a project of moderate complexity and convert it from code back to natural language using your favorite LLM. Does it provide you with a reasonable description of the behavior and requirements encoded in the source code, without losing so much detail that you could no longer recreate the program? Do you find the resulting natural language description is easier to reason about?

I think there's a reason most of the vibe-coded applications we see people demonstrate are rather simple. There is a level of complexity and precision that is hard to manage. Sure, you can define it in plain English, but is the resulting description extensible, understandable, or more descriptive than a precise language? I think there is a reason why legalese is not plain English, and it goes beyond mere gatekeeping.

drpixie

> Do you find the resulting natural language description is easier to reason about?

An example from a different field - aviation weather forecasts and notices are published in a strongly abbreviated and codified form. For example, the weather at Sydney Australia now is:

  METAR YSSY 031000Z 08005KT CAVOK 22/13 Q1012 RMK RF00.0/000.0

It's almost universal that new pilots ask "why isn't this in words?". And, indeed, most flight planning apps will convert the code to prose.

But professional pilots (and ATC, etc) universally prefer the coded format. It is compact (one line instead of a whole paragraph), the format is well defined (I know exactly where to look for the one piece I need), and it's unambiguous.

Same for maths and coding - once you reach a certain level of expertise, the complexity and redundancy of natural language is a greater cost than benefit. This seems to apply to all fields of expertise.

WillAdams

Reading up on the history of mathematics really makes that clear as shown in

https://www.goodreads.com/book/show/1098132.Thomas_Harriot_s...

(ob. discl., I did the typesetting for that)

It shows at least one lengthy and quite wordy example of how an equation would have been stated, then contrasts it in the "new" symbolic representation (this was one of the first major works to make use of Robert Recorde's development of the equals sign).

tim333

Although if you look at most maths textbooks or papers there's a fair bit of English waffle per equation. I guess both have their place.

diputsmonro

An interesting perspective on this is that language is just another tool on the job. Like any other tool, you use the kind of language that is most applicable and efficient. When you need to describe or understand weather conditions quickly and unambiguously, you use METAR. Sure, you could use English or another natural language, but it's like using a multitool instead of a chef knife. It'll work in a pinch, but a tool designed to solve your specific problem will work much better.

Not to slight multitools or natural languages, of course - there is tremendous value in a tool that can basically do everything. Natural languages have the difficult job of describing the entire world (or, the experience of existing in the world as a human), which is pretty awesome.

And different natural languages give you different perspectives on the world, e.g., Japanese describes the world from the perspective of a Japanese person, with dedicated words for Japanese traditions that don't exist in other cultures. You could roughly translate "kabuki" into English as "Japanese play", but you lose a lot of what makes kabuki "kabuki", as opposed to "noh". You can use lots of English words to describe exactly what kabuki is, but if you're going to be talking about it a lot, operating solely in English is going to become burdensome, and it's better to borrow the Japanese word "kabuki".

All languages are domain specific languages!

corimaith

I would caution, and point out, that the strong Sapir-Whorf hypothesis is debunked. Language may influence your understanding, but it's not deterministic; lacking a word just means it takes more words to explain a concept in that language.

thaumasiotes

> You can use lots of English words to describe exactly what kabuki is, but if you're going to be talking about it a lot, operating solely in English is going to become burdensome, and it's better to borrow the Japanese word "kabuki".

This is incorrect. Using the word "kabuki" has no advantage over using some other three-syllable word. In both cases you'll be operating solely in English. You could use the (existing!) word "trampoline" and that would be just as efficient. The odds of someone confusing the concepts are low.

Borrowing the Japanese word into English might be easier to learn, if the people talking are already familiar with Japanese, but in the general case it doesn't even have that advantage.

Consider that our name for the Yangtze River is unrelated to the Chinese name of that river. Does that impair our understanding, or use, of the concept?

shit_game

> Same for maths and coding - once you reach a certain level of expertise, the complexity and redundancy of natural language is a greater cost than benefit. This seems to apply to all fields of expertise.

In addition to these points, there is ambiguity. A formal specification of communication can avoid ambiguity by being absolute and precise regardless of who is speaking and who is interpreting. Natural languages are riddled with inconsistencies, colloquialisms, and imprecisions that can lead to misinterpretations by even the most fluent of speakers, simply because natural languages are human languages: different people learn these languages differently and ascribe different meanings or interpretations to different wordings, which are inconsistent because of the cultural backgrounds of those involved and the lack of a strict formal specification.

smcin

Sure, but much ambiguity is trivially handled with a minimum amount of context. "Tomorrow I'm flying from Austin to Atlanta and I need to return the rental". (Is the rental (presumably a car) to be returned to Austin or Atlanta? Almost always Austin, absent some unusual arrangement. And presumably to the Austin airport rental depot, unless context says it was another location. And presumably before the flight, with enough time to transfer and check in.)

(You meant inherent ambiguity in actual words, though.)

staplers

Extending this further, "natural language" changes within populations over time where words or phrases carry different meaning given context. The words "cancel" or "woke" were fairly banal a decade ago. Whereas they can be deeply charged now.

All this to say "natural language"'s best function is interpersonal interaction not defining systems. I imagine most systems thinkers will understand this. Any codified system is essentially its own language.

sim7c00

you guys are not wrong. explain any semi-complex program, you will instantly resort to diagrams, tables, flow charts etc. etc.

of course, you can get your LLM to be a bit evil in its replies, to truly help you, rather than spoon-feed you an unhealthy diet.

i forbid my LLM to send me code and tell it to be harsh to me if i ask stupid things. stupid as in, lazy questions. send me the link to the manual/specs with an RTFM or something i can digest and better my understanding. send links not mazes of words.

now i can feel myself grow again as a programmer.

as you said. you need to build expertise, not try to find ways around it.

with that expertise you can find _better_ ways. but for this, firstly, you need the expertise.

azernik

If you don't mind sharing - what's the specific prompt you use to get this to happen, and which LLM do you use it with?

thaumasiotes

You can see the same phenomenon playing a roguelike game.

They traditionally have ASCII graphics, and you can easily determine what an enemy is by looking at its ASCII representation.

For many decades now graphical tilesets have been available for people who hate the idea of ASCII graphics. But they have to fit in the same space, and it turns out that it's very difficult to tell what those tiny graphics represent. It isn't difficult at all to identify an ASCII character rendered in one of 16 (?) colors.

drob518

Exactly. Within a given field, there is always a shorthand for things, understood only by those in the field. Nobody describes things in natural language because why would you?

steveBK123

And to this point - the English language has far more ambiguity than most programming languages.

eszed

I'm told by my friends who've studied it that Attic Greek - you know, what Plato spoke - is superb for philosophical reasoning, because all of its cases and declensions allow for a high degree of specificity.

I know Sapir-Whorf is, shall we say, over-determined - but that had to have helped that kind of reasoning to develop as and when and how it did.

Sammi

What do I need to google in order to learn about this format?

fluidcruft

I'm not so sure it's about precision rather than working memory. My presumption is people struggle to understand sufficiently large prose versions for the same reason an LLM would struggle working with larger prose versions: people have limited working memory. The time needed to reload info from prose is significant. People reading large text works will start highlighting and taking notes and inventing shorthand forms in their notes. Compact forms and abstractions help reduce demands for working memory and information search. So I'm not sure it's about language precision.

layer8

Another important difference is reproducibility. With the same program code, you are getting the same program. With the same natural-language specification, you will presumably get a different thing each time you run it through the "interpreter". There is a middle ground, in the sense that a program has implementation details that aren't externally observable. Still, making the observable behavior 100% deterministic by mere natural-language description doesn't seem a realistic prospect.

card_zero

So is more compact better? Does K&R's *d++ = *s++; get a pass now?

alankarmisra

I would guard against "arguing from the extremes". I would think "on average" compact is more helpful. There are definitely situations where compactness can lead to obfuscation but where the line is depends on the literacy and astuteness of the reader in the specific subject as already pointed out by another comment. There are ways to be obtuse even in the other direction where written prose can be made sufficiently complicated to describe even the simplest things.

fluidcruft

That's probably analogous to reading levels. So it would depend on the reading level of the intended audience. I haven't used C in almost a decade and I would have to refresh/confirm the precise orders of operations there. I do at least know that I need to refresh, and after I look it up it should be fine until I forget it again. For people fluent in the language, it's unlikely to be a big deal.

Conceivably, if there were an equivalent of "8th grade reading level" for C that forbade pointer arithmetic on the left hand side of an assignment (for example), it could be reformatted by an LLM fairly easily. Some for loop expressions would probably be significantly less elegant, though. But that seems better than converting it to English.

That might actually make a clever tooltip sort of thing--highlight a snippet of code and ask for a dumbed-down version in a popup or even an English translation to explain it. Would save me hitting the reference.

APL is another example of dense languages that (some) people like to work in. I personally have never had the time to learn it though.

anonzzzies

Arthur Whitney writes compact code in C (and in k of course); most things fit on one A4 which is actually very nice to me as an older person. I cannot remember as much as I could (although i'm still ok) and just seeing everything I need to know for a full program on 1 page is very nice vs searching through a billion files, jump to them, read, jump back and actually mostly forgotten the 1000 steps between (I know, this refers to a typical overarchitected codebase I have to work on, but I see many of those unfortunately).

layer8

When I first read the K&R book, that syntax made perfect sense. They build up to it over a few chapters, if I remember correctly.

What has changed is that nowadays most developers aren't doing low-level programming anymore, where the building blocks of that expression (or the expression itself) would be common idioms.

pton_xd

I think the parent poster is incorrect; it is about precision, not about being compact. There is exactly one interpretation for how to parse and execute a computer program. The opposite is true of natural language.

kmoser

Nothing wrong with that as long as the expected behavior is formally described (even if that behavior is indeterminate or undefined) and easy to look up. In fact, that's a great use for LLMs: to explain what code is doing (not just writing the code for you).

wizzwizz4

That's confusing because of order of operations. But

  while ( *(d++) = *(s++) );

is fairly obvious, so I think it gets a pass.

fluoridation

No, but *++d = *++s; does.
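
For readers who, like fluidcruft above, would have to look up the order of operations, the two variants in this subthread can be modelled with explicit indices. This is a sketch in Python rather than C, chosen only because Python has no pointer arithmetic and so forces every step to be spelled out. The function names are invented for the illustration, and lists of integers ending in `0` stand in for NUL-terminated C strings.

```python
def strcpy_post_increment(src, dst):
    """Model of C's `while (*d++ = *s++);` on 0-terminated lists of ints."""
    s = d = 0
    while True:
        value = src[s]   # *s: read at the current source position
        dst[d] = value   # *d = ...: write at the current destination position
        s += 1           # s++: advance only after the read
        d += 1           # d++: advance only after the write
        if value == 0:   # the assigned value doubles as the loop condition
            return d     # index one past the copied terminator


def assign_pre_increment(src, dst, s, d):
    """Model of a single `*++d = *++s;` starting from indices s and d."""
    s += 1               # ++s: advance first
    d += 1               # ++d: advance first
    dst[d] = src[s]      # then copy at the new positions
    return s, d
```

The post-increment loop copies the element at the current position, terminator included, and only then moves on. The pre-increment form skips the element both indices start on, so it suits loops that begin one position before the data.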

eightysixfour

Language can carry tremendous amounts of context. For example:

> I want a modern navigation app for driving which lets me select intersections that I never want to be routed through.

That sentence is low complexity but encodes a massive amount of information. You are probably thinking of a million implementation details that you need to get from that sentence to an actual working app but the opportunity is there, the possibility is there, that that is enough information to get to a working application that solves my need.

And just as importantly, if that is enough to get it built, then “can I get that in cornflower blue instead” is easy and the user can iterate from there.

fourside

You call it context or information but I call it assumptions. There are a ton of assumptions in that sentence that an LLM will need to make in order to take that and turn it into a v1. I’m not sure what resulting app you’d get but if you did get a useful starting point, I’d wager the fact that you chose a variation of an existing type of app helped a lot. That is useful, but I’m not sure this is universally useful.

eightysixfour

> There are a ton of assumptions in that sentence that an LLM will need to make in order to take that and turn it into a v1.

I think you need to think of the LLM less like a developer and more like an entire development shop. The first step is working with the user to define their goals, then to repeat it back to them in some format, then to turn it into code, and to iterate during the work with feedback. My last product development conversation with Claude included it drawing svgs of the interface and asking me if that is what I meant.

This is much like how other professional services providers don’t need you to bring them exact specs, they take your needs and translate it to specifications that producers can use - working with an architect, a product designer, etc. They assume things and then confirm them - sometimes on paper and in words, sometimes by showing you prototypes, sometimes by just building the thing.

The near to mid future of work for software engineers is in two areas in my mind:

1. Doing things no one has done before. The hard stuff. That’s a small percentage of most code, a large percentage of value generated.

2. Building systems and constraints that these automated development tools work within.

stouset

Dingdingding

Since none of those assumptions are specified, you have no idea which of them will inexplicably change during a bugfix. You wanted that in cornflower blue instead, but now none of your settings are persisted in the backend. So you tell it to persist the backend, but now the UI is completely different. So you specify the UI more precisely, and now the backend data format is incompatible.

By the time you specify all the bits you care about, maybe you start to think about a more concise way to specify all these requirements…

acka

This is why we have system prompts (or prompt libraries if you cannot easily modify the system prompt). They can be used to store common assumptions related to your workflow.

In this example, setting the system prompt to something like "You are an experienced Android app developer specialising in apps for phone form factor devices" (replacing Android with iOS if needed) would get you a long way.

anonzzzies

But it doesn't 'carry context'; it's just vague and impossible to implement what you have in mind. And that's the problem: you assume people live in your reality, I assume mine, LLMs have some kind of mix between us, and we will get 3 very different apps, none of which will be useful, from that line alone. I'd like that line to be expanded with enough context to have an idea what you actually need to have built, and I am quite sure pseudocode (or actual code) will be much shorter than a rambling English description you can come up with; most of which (unless it's logic language) will have enough unambiguous context to implement.

So sure, natural language is great for spitballing ideas, but after that it's just guessing what you actually want to get done.

Affric

Sure, but we build (leaky) abstractions, and this even happens in legal texts.

Asking an llm to build a graphical app in assembly from an ISA and a driver for the display would give you nothing.

But with a mountain of abstractions then it can probably do it.

This is not to defend LLMs; it's more to say that, by providing the right abstractions (reusable components), I do think it will get you a lot closer.

fsloth

Been doing toy examples of non-trivial complexity. Architecting the code so context is obvious and there are clear breadcrumbs everywhere is the key. And the LLM can do most of this. Prototype -> refactor/cleanup -> more features -> refactor/cleanup, add architectural notes.

If you know what a well architected piece of code is supposed to look like, and you proceed in steps, the LLM gets quite far as long as you are handholding it. So this is usable for non-trivial _familiar_ code where typing it all would be slower than prompting the LLM. Maintaining LLM context is the key here imo, and stopping it when you see weird stuff. So it requires you to act as the senior partner PR-ing everything.

cdkmoose

This begs the question, how many of the newer generation of developers/engineers "know what a well architected piece of code is supposed to look like"?

sciencesama

Llm frameworks !!

jimmydddd

> I think there is a reason why legalese is not plain English

This is true. Part of the precision of legalese is that the meanings of some terms have already been more precisely defined by the courts.

xwiz

This opens an interesting possibility for a purely symbol-based legal code. This would probably improve clarity when it came to legal phrases that overlap common English, and you could avoid ambiguity when it came to language constructs, like in this case[1], where some drivers were losing overtime pay because of a comma in the overtime law.

[1] https://cases.justia.com/federal/appellate-courts/ca1/16-190...

dongkyun

Yeah, my theory on this has always been that a lot of programming efficiency gains have been the ability to unambiguously define behavior, which mostly comes from drastically restricting the possible states and inputs a program can achieve.

The states and inputs that lawyers have to deal with tend to be much more vague and imprecise (which is expected if you're dealing with human behavior and not text or some other encodeable input), and so they have to rely on inherently ambiguous phrases like "reasonable" and "without undue delay."

_ea1k

I've thought about this quite a bit. I think a tool like that would be really useful. I can imagine asking questions like "I think this big codebase exposes a rest interface for receiving some sort of credit check object. Can you find it and show me a sequence diagram for how it is implemented?"

The challenge is that the codebase is likely much larger than what would fit into a single context window. IMO, the LLM really needs to be taught to consume the project incrementally and build up a sort of "mental model" of it to really make this useful. I suspect that a combination of tool usage and RL could produce an incredibly useful tool for this.

soulofmischief

What you're describing is decontextualization. A sufficiently powerful transformer would theoretically be able to recontextualize a sufficiently descriptive natural language specification. Likewise, the same or an equivalently powerful transformer should be able to fully capture the logic of a complicated program. We just don't have sufficient transformers yet.

I don't see why a complete description of the program's design philosophy as well as complete descriptions of each system and module and interface wouldn't be enough. We already produce code according to project specification and logically fill in the gaps by using context.

izabera

> sufficiently descriptive natural language specification

https://www.commitstrip.com/en/2016/08/25/a-very-comprehensi...

intelVISA

sounds like it would pair well with a suitably smart compiler

soulofmischief

No, the key difference is that an engineer becomes more product-oriented, and the technicalities of the implementation are deprioritized.

It is a different paradigm, in the same way that a high-level language like JavaScript handles a lot of low-level stuff for me.

scribu

“Fill in the gaps by using context” is the hard part.

You can’t pre-bake the context into an LLM because it doesn’t exist yet. It gets created through the endless back-and-forth between programmers, designers, users etc.

soulofmischief

But the end result should be a fully-specced design document. That might theoretically be recoverable from a complete program given a sufficiently powerful transformer.

1vuio0pswjnm7

"Sure, you can define it in plain english, but is the resulting description extensible, understandable, or more descriptive than a precise language? I think there is a reason why legalese is not plain English, and it goes beyond mere gatekeeping."

Is this suggesting the reason for legalese is to make documents more "extensible, understandable or descriptive" than if written in plain English?

What is this reason that the parent thinks legalese is used that "goes beyond gatekeeping"?

Plain English can be every bit as precise as legalese.

It is also unclear that legalese exists for the purpose of gatekeeping. For example, it may be an artifact that survives based on familiarity and laziness.

Law students are taught to write in plain English.

https://www.law.columbia.edu/sites/default/files/2021-07/pla...

In some situations, e.g., drafting SEC filings, use of plain English is required by law.

https://www.law.cornell.edu/cfr/text/17/240.13a-20

feoren

> Plain English can be every bit as precise as legalese.

If you attempt to make "plain English" as precise as legalese, you will get something that is basically legalese.

Legalese does also have some variables, like "Party", "Client", etc. This allows for both precision -- repeating the variable name instead of using pronouns or re-identifying who you're talking about -- and also for reusability: you can copy/paste standard language into a document that defines "Client" differently, similar to a subroutine.
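
The variable-and-subroutine analogy can be made literal with a small sketch. Everything here is hypothetical: the clause text is invented, and `CONFIDENTIALITY_CLAUSE` and `instantiate` are illustrative names, with Python's standard `string.Template` standing in for a contract's definitions section.

```python
from string import Template

# A hypothetical boilerplate clause. "$client" and "$provider" play the role
# of the defined terms ("Client", "Party") mentioned above.
CONFIDENTIALITY_CLAUSE = Template(
    "$client shall not disclose $provider's confidential information "
    "without $provider's prior written consent."
)


def instantiate(clause, definitions):
    """Bind defined terms to concrete parties, like calling a subroutine.

    Template.substitute raises KeyError when a term is left undefined,
    which mirrors a contract that uses a term it never defines.
    """
    return clause.substitute(definitions)
```

The same clause can be pasted into any document and bound to different parties, and repeating `$provider` instead of writing "their" removes any doubt about whose consent is meant.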

haolez

This reminded me of this old quote from Hal Abelson:

"Underlying our approach to this subject is our conviction that "computer science" is not a science and that its significance has little to do with computers. The computer revolution is a revolution in the way we think and in the way we express what we think. The essence of this change is the emergence of what might best be called procedural epistemology—the study of the structure of knowledge from an imperative point of view, as opposed to the more declarative point of view taken by classical mathematical subjects. Mathematics provides a framework for dealing precisely with notions of "what is". Computation provides a framework for dealing precisely with notions of "how to"."

light_triad

This is key: computation is about making things happen. Coding with an LLM adds a level of abstraction but the need for precision and correctness of the "things that happen" doesn't go away. No matter how many cool demos and "coding is dead" pronouncements because AI - and the demos are very cool - the bulk of the work moves to the pre- and post-processing and evals with AI. To the extent that it makes programming more accessible it's a good thing, but can't really replace it.

rootnod3

Yeah, but in the end, when it comes down to it, one would have to specify the exact details, especially for very intricate systems. And the more you abstract and specialize that language for the LLM, the more you end up taking a very long, roundabout way to basically re-inventing code.

Cheer2171

Well that sure isn't what they teach in computer science programs anymore

pyrale

Hal Abelson, casually enraging functional programming CS people around the world.

l0new0lf-G

Finally someone put it this way! Natural language has embedded limitations that stem from our own mental limitations: the human mind sometimes thinks in terms that are too abstract or too specific, and misses important details or generalizations.

As a programmer, I know first hand that the problems or even absurdities of some assignments only become apparent after one has begun to implement them as code, i.e. as strict symbolisms.

Not to mention that it often takes more time to explain something accurately in natural language than it takes to just write the algorithm as code.

chilldsgn

Yes! I have a certain personality preference for abstractions and tend to understand things in an abstract manner which is extremely difficult for me to articulate in natural language.

roccomathijn

The man has been dead for 23 years

indigoabstract

He's the Immortal Dutchman.

pyrale

I wonder how long we have to wait before we can pitch a machine that presses fruit juice packs into a glass to gullible VCs again.

moralestapia

This is one of the best comments I've read on this site in a long while.

A single, crude, statement of fact slaying the work of a million typewriter monkeys spewing out random characters thinking they're actually writing the Shakespeare novel, lmao.

MattSayar

We need realistic expectations for the limitations of LLMs as they work today. Philosophically, natural language is imperfect at communicating ideas between people, which is its primary purpose! How often do you rewrite sentences, or say "actually what I meant was...", or rephrase your emails before pressing Send? We are humans and we rarely get things perfect on the first try.

And now we're converting this imperfect form of communication (natural language) into a language for machines (code), which notoriously do exactly what you say, not what you intend.

NLP is massively, and I mean massively, beneficial to get you started on the right path to writing an app/script/etc. But at the end of the day it may be necessary to refactor things here and there. The nice thing is you don't have to be a code ninja to get value out of LLMs, but it's still helpful and sometimes necessary.

Someone

/s: that’s because we haven’t gone far enough. People use natural language to generate computer programs. Instead, they should directly run prompts.

“You are the graphics system, an entity that manages what is on the screen. You can receive requests from all programs to create and destroy “windows”, and further requests to draw text, lines, circles, etc. in a window created earlier. Items can be of any colour.

You also should send mouse click information to whoever created the window in which the user clicked the mouse.

There is one special program, the window manager, that can tell you what windows are displayed where on any of the monitors attached to the system”

and

“you are a tic-tac-toe program. There is a graphics system, an entity that manages what is on the screen. You can command it to create and destroy “windows”, and to draw text, lines, circles, etc. in a window created earlier. Items can be of any colour.

The graphics you draw should show a tic-tac-toe game, where users take turns by clicking the mouse. If a user wins the game, it should…

Add ads to the game, unless the user has a pay-per-click subscription”

That should be sufficient to get a game running…

To save it, you’d need another prompt:

”you are a file system, an entity that persists data to disk…”

You also will want

”you are a multi-tasking OS. You give multiple LLMs the idea that they have full control over a system’s CPU and memory. You…”

I look forward to seeing this next year in early April.

slt2021

all these prompts are currently implemented under the hood by generating and running python code

sotix

> Machine code, with its absence of almost any form of redundancy, was soon identified as a needlessly risky interface between man and machine. Partly in response to this recognition so-called "high-level programming languages" were developed, and, as time went by, we learned to a certain extent how to enhance the protection against silly mistakes. It was a significant improvement that now many a silly mistake did result in an error message instead of in an erroneous answer.

I feel that we’ve collectively jumped into programming with LLMs too quickly. I really liked how Rust has iterated on pointing out “silly mistakes” and made it much more clear what the fix should be. That’s a much more favorable development for me as a developer. I still have the context and understanding of the code I work on while the compiler points out obvious errors and their fixes. Using an LLM feels like a game of semi-intelligent guessing on the other hand. Rust’s compiler is the master teaching the apprentice. LLMs are the confident graduate correcting the master. I greatly prefer Rust’s approach and would like to see it evolved further if possible.

astrobe_

Rust (and others) has type inference, LLMs have so-called "reasoning". They fake understanding, a lie that will sooner or later have consequences.

llsf

Natural language is a poor medium for communicating rules and orders. The current state of affairs in the US is a prime example.

We are still debating what some laws and amendments mean. The meanings of words change over time, historical context gets lost, etc.

I would love natural language to operate machines, but I have been programming since the mid-'80s, and the stubbornness of computer languages (from BASIC to Go) strikes a good balance and puts enough responsibility on the emitter to precisely express what he wants the machine to do.

misja111

I somewhat disagree with this. In real life, say in some company, the inception of an idea for a new feature is made in the head of some business person. This person will not speak any formal language. So however you turn it, some translation from natural language to machine language will have to be done to implement the feature.

Typically the first step, translation from natural to formal language, will be done by business analysts and programmers. But why not try to let computers help along the way?

kmacdough

Computers can and should help along the way, but Dijkstra's argument is that a) much of the challenge of human ideas is discovered in the act of converting from natural to formal language and b) that this act, in and of itself, is what trains our formal logical selves.

So he's contesting not only the idea that programs should be specified in natural language, but also the idea that removing our need to understand the formal language would increase our ability to build complex systems.

It's worth noting that much of the "translation" is not translation, but fixing the logical ambiguities, inconsistencies and improper assumptions. Much of it can happen in natural language, if we take Dijkstra seriously, precisely because there are programmers at the table who have spent their lives formalizing.

There are other professions which require significant formal thinking, such as math. But even there, the conversion of old proofs into computer proofs has led us to discover holes and gaps in many well-accepted proofs. Not that much has been overturned, but we still don't have a complete computer-checked proof of Fermat's Last Theorem [1].

[1] https://xenaproject.wordpress.com/2024/12/11/fermats-last-th...

nelgaard

But even real translation is bad.

There have been some efforts to make computer languages with local (non-English) keywords. Most have fortunately already failed horribly.

But it still exists, e.g. in spreadsheet formulas.

In some cases even number formatting (decimal separators) is affected.

delusional

I don't think you're fully comprehending Dijkstra's argument. He's not saying not to use tools to help with translation, he is saying that not thinking in terms of formal symbols HURTS THINKING. Your ideas are worse if you don't think in formal systems, if you don't treat your thoughts as formal things.

In your example, he has no opinion on how to translate the idea of a "business person" because in his view the ideas of the "business person" are already shallow and bad because they don't follow a formalism. They are not worth translating.

robertlagrant

If that's correct, then it's very falsifiable. If a businessperson says "there's a gap in the market - let's build X" they will be following a formalism at their level of detail. They see the market, the interactions between existing products and customers, and where things might be going.

Just because they can't spell it out to the nth degree doesn't matter. Their formalism is "this is what the market would like".

Having an LLM then tease out details - "what should happen in this case" would actually be pretty useful.

delusional

You're either not really thinking through what you're saying, or you're being disingenuous because you want to promote AI.

A formalism isn't "person says Y". It's about adhering to a structure, to a form of reasoning. Mathematical formalism is about adhering to the structure of mathematics, and making whatever argument you desire to make in the formal structure of formulas and equations.

Saying "A palindrome is a word that reads the same backwards as it does forwards" is not a formal definition. Saying "Let r(x) be the function that, when given a string x, returns the reversed string; x is then a palindrome iff x = r(x)" is (sans the formal definition of the function r).
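To make the contrast concrete, here is a minimal Python sketch of the second definition. The name `r` comes from the sentence above; `is_palindrome` is a name made up for illustration, and reversing by code point with no case or whitespace normalization is an assumption that the prose version leaves open.

```python
def r(x: str) -> str:
    """Return the string x reversed, code point by code point."""
    return x[::-1]


def is_palindrome(x: str) -> bool:
    """x is a palindrome iff x = r(x)."""
    return x == r(x)


print(is_palindrome("racecar"))  # True
print(is_palindrome("Racecar"))  # False: this definition is case-sensitive
```

Note how writing it down forces decisions (case, the empty string) that the natural language sentence never had to make.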

Formalism is about reducing the set of axioms (the base assumptions of your formal system) to the minimal set that is required to build all other (provable) arguments. It's not vague hand waving about what some market wants, it's naturally extrapolating from a small set of axioms, and being rigorous if you ever add new ones.

If your hypothetical "business person" ever says "it was decided" then they are not speaking a formal language, because formalism does not have deciders.

nightfly

The first step isn't from natural language to formal language. It's from the idea in your head into natural language. Getting that step right in a way that a computer could hope to turn into a useful thing is hard.

jasode

>It's from the idea in your head into natural language. Getting that step right in a way that a computer could hope to turn into a useful thing is hard.

The "inside the head" conversion step would be more relevant in the reply to the gp if the hypothetical AI computer would be hooked up directly to brain implants like neuralink, functional MRI scans, etc to translate brain activity to natural language or programming language code.

But today, human developers who are paid to code for business people are not translating brain implant output signals. (E.g. Javascript programmers are not translating raw electrical waveforms[1] into React code.)

Instead, they translate from "natural language" specifications of businesspeople to computer code. This layer of translation is more tractable for future AI computers even though natural language is more fuzzy and ambiguous. The language ambiguity in business requirements is unavoidable but it still hasn't stopped developers from somehow converting it into concrete non-ambiguous code.

[1] https://www.technologyreview.com/2020/02/06/844908/a-new-imp...

falcor84

Without descending fully into epistemology, I tend to think that there is no proper "idea" in your head before it's phrased in language - the act of initially describing something in natural language *is* the act of generating it.

antonvs

Research on LLMs suggests that's probably not the case. See the work on reasoning in latent space, and on shared concepts between languages being represented independently of the individual language.

Of course one might argue that even if LLMs are capable of ideation and conceptualisation without natural language, that doesn't mean humans are.

But the fact that up to 50% of people have no inner monologue seems to refute that.

yencabulator

Are you saying it's impossible to program without first formulating a natural language sentence? That sounds dubious at the very least.

James_K

Because then you don't know what the computer's doing. The whole point of this article was that there is value in the process of writing your ideas out formally. If you "let computers help you along the way", you'll run straight into the issue of needing an increasingly formal natural language to get sufficiently good results from the machine.

nottorp

Say, doesn't each business - each activity - have its own formal language?

Not as formalized as programming languages, but it's there.

Try to define any process, you end up with something trending towards formalized even if you don't realize it.

skydhash

That's pretty much the whole basis of Domain Driven Design. The core message is to get to a Ubiquitous Language which is the formalization of the business jargon (pretty much a glossary). From which the code can then be derived naturally.

nottorp

> From which the code can then be derived naturally.

Disagree with "naturally". Unless you want to end up on accidentally quadratic. Or on accidentally exponential if there's such a list.
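A small Python sketch of what that can look like. The glossary rule ("a list holds each item only once") and both function names are made up for illustration, and the second version assumes the items are hashable.

```python
def unique_quadratic(items):
    """The 'natural' transcription of the rule: keep an item if not seen yet."""
    seen = []
    for item in items:
        if item not in seen:  # linear scan of a list, so O(n^2) overall
            seen.append(item)
    return seen


def unique_linear(items):
    """Same result, but membership is checked against a set (assumes hashable items)."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # average O(1) lookup
            seen.add(item)
            result.append(item)
    return result


print(unique_quadratic([3, 1, 3, 2, 1]))  # [3, 1, 2]
print(unique_linear([3, 1, 3, 2, 1]))     # [3, 1, 2]
```

Both satisfy the glossary rule; only one of them survives a large input, and nothing in the business wording tells you which.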

chilldsgn

I like your take.

The only issue I have with trusting a computer to do so much is that it doesn't necessarily have the long term vision or intuition some humans might have for the direction of the software product. There's so much nuance to the connection between a business need and getting it into software, or maybe I am overthinking it :D

jedimastert

> It was a significant improvement that now many a silly mistake did result in an error message instead of in an erroneous answer. (And even this improvement wasn't universally appreciated: some people found error messages they couldn't ignore more annoying than wrong results, and, when judging the relative merits of programming languages, some still seem to equate "the ease of programming" with the ease of making undetected mistakes.)

If I didn't know who wrote this it would seem like a jab directly at people who dislike Rust.

still_grokking

Rust? Since when is Rust the pinnacle of static type safety?

Having worked for some time with a language that can express even stronger invariants in types than Rust (Scala), I no longer see that property as a clear win regardless of circumstances. I no longer think "stronger types == better, no matter what".

You have a price to pay for "not being allowed to do mistakes": Explorative work becomes quite difficult if the type system is really rigid. Fast iteration may become impossible. (Small changes may require re-architecting half your program, just to make the type system happy again![1])

It's a trade-off. Like with everything else. For a robust end product it's a good thing. For fast experimentation it's a hindrance.

[1] Someone described that issue quite well in the context of Rust and game development here: https://loglog.games/blog/leaving-rust-gamedev/

But it's not exclusive to Rust, nor game dev.

bob1029

> You have a price to pay for "not being allowed to do mistakes": Explorative work becomes quite difficult

This is a huge deal for me.

At the beginning of most "what if...?" exercises, I am just trying to get raw tuples of information in and out of some top-level-program logic furnace for the first few [hundred] iterations. I'll likely resort to boxing and extremely long argument lists until what I was aiming for actually takes hold.

I no longer have an urge to define OOP type hierarchies when the underlying domain model is still a vague cloud in my head. When unguided, these abstractions feel like playing Minecraft or Factorio.

mjburgess

I can't remember if I came up with this analogy or not, but programming in Rust is like trying to shape a piece of clay just as it's being baked.

card_zero

> Explorative work becomes quite difficult if the type system is really rigid

Or to put it another way, the ease of programming is correlated with the ease of making undetected mistakes.

still_grokking

I'm not sure you tried to understand what I've depicted.

As long as you don't know what the end result should look like there are no "mistakes".

The whole point of explorative work is to find out how to approach something in the first place.

It's usually impossible to come up with the final result at first try!

Once you actually know how to do something in general, tools which help to avoid all undetected mistakes in the implementation of the chosen approach are really indispensable. But before this general approach is figured out, too much rigidity is not helpful but instead a hindrance.

To understand this better read the linked article. It explains the problem very well over a few paragraphs.

pwdisswordfishz

I would have thought of people who unironically liked fractal-of-bad-design-era PHP and wat-talk JavaScript.

I guess some kinds of foolishness are just timeless.

mjburgess

As a person who dislikes Rust, the problem is the error messages when there's no error -- quite a different problem. The Rust type system is not an accurate model of RAM, the CPU or, indeed, any device.

He's here talking about interpreted languages.

He's also one of those mathematicians who are now called computer scientists whose 'algorithms' are simple restatements of mathematics and require no devices. A person actively hostile, in temperament, to the embarrassing activity of programming an actual computer.

indigoabstract

Using natural language to specify and build an application is not unlike having a game design document before you actually start prototyping your game. But once you have implemented the bulk of what you wanted, the implementation becomes the reference and you usually end up throwing away the GDD since it's now out of sync with the actual game.

Insisting that for every change one should go read the GDD, implement the feature and then sync back the GDD is cumbersome and doesn't work well in practice. I've never seen that happen.

But if there ever comes a time when some AI/LLM can code the next version of Linux or Windows from scratch based on some series of prompts, then all bets are off. Right now it's clearly not there yet, if ever.

octacat

Natural language is pretty good for describing the technical requirements of a complex system, though. I.e. not the current code implementation, but why the current implementation was selected over other possible implementations. Not what the code does, but what it is expected to do. Basically, most of the missing parts that live in Jira instead of your repo. It is also good at enabling better refactoring, when your whole system is described by outside rules that can be enforced on the whole codebase. We just use programming languages because they are easier to use in an automated/computer context (and were the only way, to be honest, before all the LLM stuff). Though, while that gives us non-ambiguity on the local scale, it stops working on the global scale the first moment a person goes and copy-pastes part of the code. Are you sure that part follows all the high-level restrictions we should follow and is a correct program? It is a program that would run once compiled, but the definition of "run" is pretty loose. In C++, a program that corrupts all the memory is also runnable.

hamstergene

Reminds me of another recurring idea of replacing code with flowcharts. I first saw that idea coming from some unknown Soviet professor from the 80s, and then again and again from different people from different countries in different contexts. Every time it is sold as a total breakthrough in simplicity and also every time it proves to be a bloat of complexity and a productivity killer instead.

Or weak typing. How many languages thought that simplifying strings and integers and other types into "scalar", and making any operation between any operands meaningful, would simplify the language? Yet every single one ended up becoming a total mess instead.

Or constraint-based UI layout. Looks so simple, so intuitive on simple examples, yet totally fails to scale to even a dozen basic controls. Yet the idea keeps reappearing from time to time.

Or an attempt at dependency management by making some form of symlink to another repository, e.g. git submodules, or CMake's FetchContent/ExternalProject? Yeah, good luck scaling that.

Maybe software engineering should have some sort of "Hall of Ideas That Definitely Don't Work", so that young people entering the field could save their time on implementing one more incarnation of an already known not good idea.

Folcon

> Maybe software engineering should have some sort of "Hall of Ideas That Definitely Don't Work", so that young people entering the field could save their time on implementing one more incarnation of an already known not good idea.

I'm deeply curious to know how you could easily and definitively work out what is and is not an idea that "Definitely Don't Work"

Mathematics and Computer Science seem to be littered with unworkable ideas that have made a comeback when someone figured out how to make them work.

antonvs

Well, "Hall of Ideas That Are So Difficult To Make Work Well That They May Not In Fact Be Much Use" doesn't roll off the tongue as smoothly.

What this Hall could contain, for each idea, is a list of reasons why the idea has failed in the past. That would at least give future Quixotes something to measure their efforts by.

Folcon

Ok, so better documentation about what was tried, why, how it failed so as to make obvious if it's viable to try again or not.

I can get behind that :)...

Animats

Constraint-based layout works, but you need a serious constraint engine, such as the one in the sketch editors of Autodesk Inventor or Fusion 360, along with a GUI to talk to it. Those systems can solve hard geometry problems involving curves, because you need that when designing parts.

Flowchart-based programming scales badly. Blender's game engine (abandoned) and Unreal Engine's "blueprints" (used only for simple cases) are examples.

d1sxeyes

Not sure if you’re talking about DRAKON here, but I love it for documentation of process flows.

It doesn’t really get complicated, but you can very quickly end up with drawings with very high square footage.

As a tool for planning, it’s not ideal, because “big-picture” is hard to see. As a user following a DRAKON chart though, it’s very, very simple and usable.

Link for the uninitiated: https://en.m.wikipedia.org/wiki/DRAKON

hmhhashem

For young engineers, it is a good thing to spend time implementing what you call "bad ideas". In the worst case, they learn from their mistakes and gain valuable insight into the pitfalls of such ideas. In the best case, they achieve a technological breakthrough by finding a way to make such an idea work.

Of course, it's best that such learning happens before one has a mandate to derail the whole project.

oytis

FWIW, neural networks would be in that pool until relatively recently.

antonvs

If we change "definitely don't work" to "have the following so far insurmountable challenges", it addresses cases like this. Hardware scaling has been known to be a limitation for neural networks for a long time - Minsky and Papert touched on this in Perceptrons in 1969.

The Hall would then end up containing a spectrum ranging from useless ideas to hard problems. Distinguishing between the two based on documented challenges would likely be possible in many cases.

octacat

Most popular dependency management systems literally link to a git commit SHA (or tag); see the lock file that npm/rebar/other tools give you. Just in a recursive way.

hamstergene

They do way more than that. For example they won't allow you to have Foo-1 that depends on Qux-1 and Bar-1 that depends on Qux-2 where Qux-1 and Qux-2 are incompatible and can't be mixed within the same static library or assembly. But they may allow it if Qux is mixed in as a static-private dependency inside dynamic Foo and Bar and the dependency manager is aware of that.

A naive submodule approach would fail at link time or runtime due to an attempt to mix incompatible files in the same build run. Or, in some build systems, simply due to duplicate symbols.

That "just in a recursive way" addition hides a lot of important design decisions that separate having a dependency manager vs. not having any.
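A rough Python sketch of the kind of check meant here, using the Foo/Bar/Qux example from above. The `REQUIREMENTS` table, the `App` root and `find_conflicts` are all hypothetical, and a real dependency manager does far more (version ranges, linkage visibility, backtracking).

```python
from collections import defaultdict

# Hypothetical graph: package -> list of (dependency, exact version required).
REQUIREMENTS = {
    "App": [("Foo", "1"), ("Bar", "1")],
    "Foo": [("Qux", "1")],
    "Bar": [("Qux", "2")],
    "Qux": [],
}


def find_conflicts(requirements, root):
    """Walk the graph from root; report packages requested at more than one version."""
    wanted = defaultdict(set)  # dependency name -> set of versions requested
    visited = set()
    stack = [root]
    while stack:
        package = stack.pop()
        if package in visited:
            continue
        visited.add(package)
        for dep, version in requirements.get(package, []):
            wanted[dep].add(version)
            stack.append(dep)
    return {name: sorted(vs) for name, vs in wanted.items() if len(vs) > 1}


print(find_conflicts(REQUIREMENTS, "App"))  # {'Qux': ['1', '2']}
```

With plain submodules nothing runs this check for you; the conflict only shows up later, at link time or runtime.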

octacat

They do way less than that. They just form a final list of locks and download that at build time. Of course you also have to "recursively" go through your whole dep tree and add submodules for each of the subdependencies (I recommend adding them in the main repo). Then you will have to waste an infinite amount of time setting include dirs or something. If you have two libs that each require a specific version of a shared lib, no dep manager will help you. Using submodules is a questionable practice though. Useful for simple stuff, like 10 deps in total in the final project.

cubefox

> Or weak typing. How many languages thought that simplifying strings and integers and other types into "scalar", and making any operation between any operands meaningful, would simplify the language? Yet every single one ended up becoming a total mess instead.

Yet JavaScript and Python are the most widely used programming languages [1]. Which suggests your analysis is mistaken here.

[1] https://www.statista.com/statistics/793628/worldwide-develop...

throw1111221

Python went through a massive effort to add support for type annotations due to user demand.

Similarly, there's great demand for a typed layer on top of Javascript:

- Macromedia: (2000) ActionScript

- Google: (2006) GWT [Compiling Java to JS], and (2011) Dart

- Microsoft: (2012) Typescript

teddyh

You’re talking about static typing, the opposite of which is dynamic typing. User hamstergene is talking about weak vs. strong typing, which is another thing entirely. Python has always been strongly typed, while JavaScript is weakly typed. Many early languages with dynamic types also experimented with weak typing, but this is now, as hamstergene points out, considered a bad idea, and virtually all modern languages, including Python, are strongly typed.
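A short Python sketch of the two axes. The function names `add` and `add_ints` are made up for illustration.

```python
def add(a, b):
    """No annotations: types are only checked when the operation runs (dynamic typing)."""
    return a + b


print(add(1, 2))      # 3
print(add("1", "2"))  # 12 (string concatenation)

try:
    add(1, "2")       # strong typing: no implicit int/str conversion
except TypeError as exc:
    print("TypeError:", exc)


def add_ints(a: int, b: int) -> int:
    """Annotations are hints for an external static checker; Python does not enforce them at runtime."""
    return a + b


print(add_ints("1", "2"))  # 12: runs fine, only a static checker would complain
```

In JavaScript the mixed case `1 + "2"` evaluates to the string `"12"` instead of raising, which is the weak typing being discussed.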

teddyh

JavaScript is indeed weakly typed, and is widely lampooned and criticized for it¹². But Python has strong typing, and has always had it.

(Both JavaScript and Python have dynamic typing; Python’s type declarations are a form of optional static type checking.)

Do not confuse these concepts.

1. <https://www.destroyallsoftware.com/talks/wat>

2. <https://eqeq.js.org/>

cubefox

Ah, weak typing, a.k.a. implicit type conversions.

piokoch

This is a recurring topic indeed. I remember it being a hot topic at least twice: first when ALM tools were introduced (e.g. Borland ALM suite - https://www.qast.com/eng/product/develop/borland/index.htm), and next when the BPML language became popular - processes were described by the "marketing" and the software was, you know, generated automatically.

All this went out of fashion, leaving some good stuff that was built at that time (remaining 95% was crap).

Today's "vibe coding" ends when ChatGPT and the like want to call a method that does not exist on some object (but existed on 1000s of other objects the LLM was trained on, so it should work here). Again, we will be left with the good parts, the rest will be forgotten and we will move to the next big thing.

weeeee2

Forth, PostScript and Assembly are the "natural" programming languages from the perspective of how what you express maps to the environment in which the code executes.

The question is "natural" to whom, the humans or the computers?

AI does not make human language natural to computers. Left to their own devices, AIs would invent languages that are natural with respect to their deep learning architectures, which is their environment.

There is always going to be an impedance mismatch across species (humans and AIs) and we can't hide it by forcing the AIs to default to human language.
