Brian Lovin
/
Hacker News
Daily Digest email

Get the top HN stories in your inbox every day.

X6S1x6Okd1st

He started learning lean4 with the help of GPT4 just at the start of the month:

https://mathstodon.xyz/@tao/111208692505811257

Many of his mastodon posts this month have been about his learning progress.

Certainly an interesting case study of how LLMs can accelerate the work of even the most successful people

mensetmanusman

I have found that good communicators that don’t code can quickly make functional automation.

Interestingly, LLMs may end up contributing to more inequality if only the highly skilled can leverage them effectively.

anonylizard

LLMs CURRENTLY favor the more curious and open individuals (High on openness in big 5 scale). Half of the population is not open, does not want to try new things, unless it leads to very direct and proven benefit.

However, over time, the overwhelming benefits of using LLMs will be well understood, and these ladder climbers will absolutely master LLMs, no matter their intelligence. People can become experts at taking exams despite how boring and soul sucking that can be, let alone using something way funner and useful like LLMs.

marshray

My friend had never written anything more than an Excel formula a few months ago and now he's using GPT-4 to write very nontrivial Python applications and automate large parts of his job.

I (having 30 years experience as a professional Software Developer^TM) am begging him to teach me his techniques.

Now that you mention it, I met him and we became friends in large part due to his communications abilities.

an_aparallel

5 months ago - a friend wrote me a python script and sent it to me...i couldn't get it to work. Used phind.com to explain what to do...it worked out my windows environment variables needed to be changed, and told me how to structure a folder schema to place the src script...mindblowing stuff. And i have kept using it - in turn, the same friend told me to write a similar script in python myself...it has been amazing to forego stackoverflow/google to get answers. When the GPT-4 model kicks in, my questions are answered with beautiful clarity. Whether or not this technology "hallucinates", without a doubt it's empowered me to learn to program. If it's helped me out of 50% of my ruts - i count that as overwhelmingly positive!

strikelaserclaw

gpt4 is amazing, i rarely use google as a starting point for my programming related queries these days.

kaba0

Google has actively become catastrophically bad, to the point that it ignores the one special keyterm I'm deliberately searching for in favor of more general dumb results. It's literally unusable - I used to be able to find that very specific stackoverflow answer I read years ago, or some barely-read blogpost, and now I'm happy if it even finds the website when I search for its domain name.

GPT would be good as a search engine, but who would want a search engine that stopped indexing a few years back? Also, it is not good at ultra niche topics, which would be the whole point of a search engine.

justinclift

Yeah. DuckDuckGo also seems similarly bad these days too. :(

RealityVoid

Huh, I can't seem to get in the groove of using it, maybe I'm old or something, but it annoys me all the subtle ways it's wrong and I feel I have a much better grasp if I think through it myself supported by Google.

popularonion

I get a lot of mileage out of ChatGPT just treating it like an intern who turns around work instantly. You don't expect interns to write perfect code, but they can save you a ton of time if you set them loose on the right problems.

For any relatively simple task I can say "Write a Python script to do X" and it will almost always spit out working code, even if it has subtle mistakes. Fixing mistakes is fine and part of the process. I don't have to read StackOverflow posts saying "Do you really want to do X?", or sift through documentation that follows the author's approach of how they want to introduce the material but doesn't directly address my question.

danenania

Especially now that the training cutoff is being moved up to April 2023. Questions requiring more recent results were the main ones I’ve been going back to google for.

shusaku

Part of this is that ad infested AI generated blog spam is flooding Google! But it’s also my go to. I also really liked GPT to bring me up to speed on a libraries I’ve never used.

colinhb

Agree in part but also I think Terry is such an outlier (though also generous and humble) that it’s hard to extrapolate from this example to a more general case

lulznews

They’re an easy 100x for the elite. Top engineers are now 10000x’ers.

voxl

The only benefit Terrence Tao has received is in correcting syntax. He has mentioned that he has gotten (in my opinion better) benefits from talking to existing Lean4 specialists.

rowanG077

I don't think you realize what you are saying here. I agree that it is a large boost, but 100x is just too ridiculous to take seriously. Do you really believe an engineer can now finish in 20 hours what would have taken them a year before?

ghshephard

As someone who uses ChatGPT4 pretty much nonstop all day (I have a monitor dedicated to it) - it's almost certainly a 10-20x in fields I have no knowledge of (writing in languages I've never used, dealing with OS/Kernel/Network constructs that are unfamiliar to me) - I can knock off in an hour what might have taken me a day or two previously - but I don't think I've ever had a task where I could say that I completed in 1 hour what would have taken 100 hours - though I would love to hear of counterclaims from people who have been able to do so - definitely not saying impossible, just that I haven't had that experience yet.

kaba0

I would argue that it only helps in very beginner questions as it is beyond useless on anything more complex.

Like if the answer is not a leetcode/CRUD code sample you can easily find on the internet either way, it can’t do anything useful. Like, my problems mostly constitute things like reasoning about the correctness of a given change in the context of the whole codebase, its threading model, etc. It will at best regurgitate some bullshit like “race conditions can be this and that” here, not helping at all.

tux3

For people looking for an easy introduction to Lean4, the Natural Number Game is great: https://adam.math.hhu.de/#/g/hhu-adam/NNG4

And if you just want to read without playing a game: https://lean-lang.org/theorem_proving_in_lean4/introduction....

ohduran

Noob here, how is Lean4 different from TLA+ or Alloy? Is that even a reasonable comparison?

Edit: Wrote Allow the first time, good stuff.

tux3

I'd say TLA+ is designed more for software people trying to design systems, write down specs, and reason about how the thing behaves dynamically.

Lean is used mostly for writing down math proofs, and a lot less for software (although by the Curry–Howard correspondence, math proofs and programs have an equivalence, so the line is a little blurry). Lean has "mathlib", which is like a standard library of formally verified math that people can contribute to and use in new proofs.

A big multi-year effort to start formalizing the proof of Fermat's Last Theorem in Lean 4 was approved recently: https://www.ma.imperial.ac.uk/~buzzard/xena/pdfs/AITP_2022_F...

kmill

Leo de Moura wants Lean 4 to be used for software verification too.

A cool thing about Lean 4 is that it's also a programming language, using the same syntax as for proofs, making it easy to consider proving correctness properties of programs you write. Most of Lean 4 and its tactics are written in Lean 4 (though at this point almost none of this code has any associated proofs).
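As a small, hedged illustration of that point (my own toy example, names not from the thread): a single Lean 4 file can hold an ordinary program and a machine-checked proof about it, written in the same syntax.

```lean
-- An ordinary program...
def double (n : Nat) : Nat := n + n

-- ...and a proof about it, checked by the same compiler.
-- `omega` is Lean's built-in linear-arithmetic tactic.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```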

ohduran

Thank you so much for your answer

undefined

[deleted]

krsrhe

Lean is for verifying proofs, not writing them. It helps you find mistakes, but doesn’t help you understand or express ideas in a human readable way.

kaba0

I am absolutely not an expert at either of them, but I believe TLA+ is mostly concerned with all the possible orderings of the "code" that models your program, verifying the absence of concurrency bugs.

Do correct me if I'm wrong, but Lean is more about total functions/FP code; I'm not sure how well it handles threading, if at all. It might be more related to the correctness of the actual implementation of a serial algorithm.

atomicnature

A few years back, I was trying to find out how to reduce mistakes in the programs I write.

I got introduced to Lamport's TLA+ for creating formal specifications, thinking of program behaviors in state machines. TLA+ taught me about abstraction in a clear manner.

Then I also discovered the book series "software foundations", which uses the Coq proof assistant to build formally correct software. The exercises in this book are little games and I found them quite enjoyable to work through.

https://softwarefoundations.cis.upenn.edu/

Genbox

Code correctness is a lost art. The requirement to think in abstractions is what scares a lot of devs away from it. Higher-abstraction approaches (formal specs) focus on a dedicated language to describe code, whereas lower abstractions (code contracts) basically replace validation logic with a better model.

C# once had Code Contracts[1]; a simple yet powerful way to make formal specifications. The contracts were checked at compile time using the Z3 SMT solver[2]. It was unfortunately deprecated after a few years[3], and once removed from the .NET Runtime it was declared dead.

The closest thing C# now has is probably Dafny[4], while the C# language designers are still trying to figure out how to implement contracts directly in the language[5].

[1] https://www.microsoft.com/en-us/research/project/code-contra...

[2] https://github.com/Z3Prover/z3

[3] https://github.com/microsoft/CodeContracts

[4] https://github.com/dafny-lang/dafny

[5] https://github.com/dotnet/csharplang/issues/105

mjburgess

The issue is that modern software fails because it's part of a complex system with many moving parts, rather than because it is inherently complex at the source level.

Certain sorts of algorithmically complex development (games, cars, medical hardware, etc.) would benefit from a 'closed-world verification' -- but that's not most software, and they have alternatives.

'Code correctness', including unit testing, ends up being a big misdirection here. What you need is comprehensive end-to-end tests, and instrumentation to identify where failures occur in that end-to-end.

The effort to source-level-check source-level-code is largely a huge waste of time and creates an illusion of reliability which rarely exists.

throw0101c

> The issue is that modern software fails because it's part of a complex system with many moving parts, rather than because it is inherently complex at the source level.

A reminder of Gall's law:

> A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.[8]

* https://en.wikipedia.org/wiki/John_Gall_(author)

mrkeen

Strong disagree.

> The issue is that modern software fails because it's part of a complex system with many moving parts, rather than because it is inherently complex at the source level.

The choice to run a system as many different moving parts is a decision taken by the team in order to avoid failure.

> Certain sorts of algorithmically complex development -- but that's not most software

It's all software.

> 'Code correctness', including unit testing, ends up being a big misdirection here. What you need is comprehensive end-to-end tests, and instrumentation to identify where failures occur in that end-to-end.

No and no. I have comprehensive end-to-end tests. They take forever, don't fit into RAM (for some services I need to run them on my home PC because my work laptop only has 16GB), and most importantly: they show that the code is not correct. Now I have to change incorrect code to correct code (while not breaking any public interfaces). I wish my predecessors had not put incorrect code into the live system.

jerf

"Code correctness is a lost art."

No, it really isn't. It's doing better now than it ever has... and I mean all the negative implications of that. It was not an art that was ever "found" in the sense you mean.

throwalean

Note that the Z3 SMT solver was written by Leonardo de Moura, who also is the lead dev of Lean 4. Not a coincidence (-;

Lean 4 seems to be used in production at AWS: https://github.com/cedar-policy/cedar-spec/pull/138

pjmlp

Or using F* and then generate F# code, https://www.fstar-lang.org/

kccqzy

Formal verification is often not the best tool for ensuring code correctness from an ROI perspective. Things like unit tests (including property based tests) and ensuring 100% code coverage often achieve adequate results with less effort.

jlouis

The key point is that most software doesn't need correctness to the level formal verification provides. There's a subset of software, however, for which there's no substitute for a formal verification process.

pfdietz

If you can specify your program (or a part of it) well enough to prove it correct, you can also use that specification for property-based testing.
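As a minimal stdlib-only sketch of that idea (my own example, not from the thread): the specification of a sorting routine — output is ordered and a permutation of the input — can be checked directly against random inputs, exactly as you would state it for a proof.

```python
import random

def my_sort(xs):
    # The implementation under test (here just a stand-in).
    return sorted(xs)

def satisfies_spec(xs, ys):
    """The correctness spec: ys is ordered and a permutation of xs."""
    ordered = all(ys[i] <= ys[i + 1] for i in range(len(ys) - 1))
    permutation = sorted(xs) == sorted(ys)
    return ordered and permutation

# Property-based test: the same spec you'd prove, checked on random inputs.
random.seed(0)
for _ in range(1000):
    xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
    assert satisfies_spec(xs, my_sort(xs))
print("1000 random cases passed")
```

Libraries like Hypothesis automate the input generation and shrinking, but the spec itself is the same artifact either way.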

gmadsen

Ada is gaining popularity in safety critical systems

gilcot

Ada/SPARK is already popular in safety critical systems

aeonik

Have you looked into Idris2 at all? While looking into these theorem provers, it always felt like they had an impedance mismatch with normal programming.

Idris2 aims to be a general-purpose language that also has a more advanced type system for theorem proving.

https://github.com/idris-lang/Idris2

atomicnature

That's interesting; no I wasn't aware of it. Will check it out sometime, thanks.

samvher

I had the same positive experience with Software Foundations.

There is another book somewhat derived from it (if I understand correctly) using Agda instead of Coq: https://plfa.github.io/

I haven't had the chance to go through it yet, but it's on my list - I think Agda (and as mentioned by another commenter, Idris) is likely to feel more like a programming language than Coq.

undefined

[deleted]

user3939382

I’m really excited about dependent types. I’m expecting we won’t get them for a while though. Dependent Haskell is progressing, but apparently it’s hard to retrofit. Idris’ own creator has said he expects it to be a model for other languages; I don’t think it will ever have mainstream adoption. Coq, Agda, and F* aren’t really designed to be general purpose.

Although the implementation for the compiler is complex, and the syntax can get complex and verbose, to me my requirement is simple: I want to encode everything about input and output that I know.

Right now in mainstream languages I often know more about my arguments or output than the type system will allow me to specify.

valyagolev

I totally share your excitement about dependent types, but it seems that, unlike the type systems we're used to, theorems about the dependent types are much harder to prove, which makes them not very comfortable to use for the whole program.

If only there was some kind of a gradual, perhaps typescript-like approach to adding arbitrary type-level value-limiting information in random places without having to have everything proven everywhere...

skulk

Every non-dependent typing relation is also a dependently typed relation so I think things are already the way you want, unless you have a certain example in mind.

valyagolev

Sure, in the same sense that every "untyped" program is already typed with some kind of a universal type, but what's the point?

What I want is to be able to specify and have (when possible) automatically proven arbitrary assertions anywhere in the code without necessarily making sure that every possible presupposition is proven from the ground up. Just like in Typescript, I can add a type at any point where there was only "any", and have a small portion of the program typed without typing the whole thing

mettamage

Eli5 dependent types?

Chatgpt:

Dependent types are a concept in computer science and programming languages where the type of a variable can depend on the value of another variable. In simpler terms, imagine you have a list of numbers and you also have information about how long that list is. With dependent types, you could create a type for the list that explicitly includes its length, ensuring at compile time that operations respect this length. This makes it possible to catch certain kinds of errors before the code even runs.

For example, in a language with dependent types, you could specify a function to take a list of length 3 and no other sizes. If you tried to pass a list of length 4, the program wouldn't compile, preventing this kind of mistake early on.

It's a bit like having an extra layer of safety checks, where the types are more expressive and can encode more intricate relationships between variables.

howenterprisey

I never care much for chatgpt answers. I don't know why people post them on here.

In the first sentence, "another" is wrong because you don't need two variables, you just need one. Final paragraph's wrong for the same reason.

The example given is poor given that I can write [i8; 3] or int[3] in Rust or C and those very much do not have "dependent types" in the popular sense (Rust's const generics excepted). To be fair, those examples are technically dependent types, but it would be better to give an example that's impossible in those languages, such as "array with length at most 3" or "even integer".

Finally, to nitpick, "a bit like" is unneeded hedging.

Stack Overflow did a much better job: https://stackoverflow.com/questions/9338709/what-is-dependen.... Wikipedia's article is pretty bad and maybe I'll get around to fixing it at some point.

ChadNauseam

I like ChatGPT’s answer better than yours. Rust’s const generics and C++ aren’t dependent types in any sense. And saying “the type of one variable can depend on the value of another” is vague but a good intuition pump.

Since I guess we’re doing dependent types explanations I’ll give it a go. Dependent types extend a type system with two new type thingies: 2-tuples where the type of the second element depends on the value of the first, and functions where the type of the function’s return value depends on the value that was passed to it.
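A hedged Lean 4 sketch of those two "type thingies" (toy examples of my own; `Fin n` is Lean's type of natural numbers below n):

```lean
-- Dependent pair (Σ-type): the second component's type mentions
-- the value of the first component.
def witness : (n : Nat) × Fin n := ⟨5, 3⟩

-- Dependent function (Π-type): the return type mentions the
-- argument's value, so every call site gets a different type.
def zeroBelow : (n : Nat) → Fin (n + 1) := fun _ => 0
```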

mettamage

Because I had the question and I figured this was quicker. I didn’t know what dependent types were.

So now you know why I do it. Also, I believe this is my first time doing it. I might be wrong.

Is it better to ask and wait for an answer instead?

There is nothing in the guidelines on HN about it. I don’t know what’s reasonable and I haven’t seen strong cultural norms from HN yet. I at least labeled the text as coming from chatgpt so it wouldn’t be mistaken for my own.

xeonmc

ChatGPT = Cunningham bait

bjornasm

That is really interesting. Wish there was a better name for it, as I feel like "dependent types" isn't that descriptive. However, their benefit seems really obvious: saying that something is an array, or even an array of integers, is barely half the picture if in reality it is an array of length 5 of even integers, to borrow from your example. I guess you would try to approximate it in other languages by creating objects?

icen

Another classic use case is that you can have expressions in the types of your function, for example this won't compile:

    stringOrInt : (x : Bool) -> Int -> (if x then String else Int)
    stringOrInt True x = show x
    stringOrInt False x = x + 1

    1 + stringOrInt True 37 -- this will error, because it knows you've not returned an Int
The other example that you can do in dependently typed languages, but is too involved to write out here, is a type-safe printf, where the format string produces the types for the other arguments.

turndown

Its simplest explanation is that dependent types are types whose exact type uses some kind of parameterized value in the definition. So a dependently typed array would use a parameterized length value as part of its definition (this is usually the easiest example to understand). So an array of length 4 (of whatever element type) is a different type from an array of length 5, etc.
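To make the length-parameterized array concrete, here is a hedged Lean 4 sketch; `Vec` is the standard textbook construction, not something from this thread:

```lean
-- A list whose type records its length as a value.
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons : α → Vec α n → Vec α (n + 1)

-- Accepts only vectors of length exactly 3; passing any other
-- length is a compile-time type error, not a runtime failure.
def sum3 : Vec Nat 3 → Nat
  | .cons a (.cons b (.cons c .nil)) => a + b + c
```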

omneity

That one of the brightest minds of our generation is able to increase his bandwidth with the combination of LLMs and automated proofs makes me super bullish on this tech combo in the future!

It starts with bug-fixing, then supports verification, until it starts propelling new discoveries and pushing the envelope.

We need a term for when a dynamic like Moore's Law "infects" a field that had no such compounding properties before.

EDIT:

There's additional context that Terence Tao is using Copilot to help him learn Lean. As shared by adbachman: https://mathstodon.xyz/@tao/111271244206606941

Could Terence have done it without Copilot? Sure, but like many of us he might not have initiated it due to the friction of adopting a new tool. I think LLM tech has great potential for this kind of "bicycle for the mind" scenario.

aylmao

Lean 4 is a programming language and theorem prover, and has nothing to do with LLMs as far as I know though.

adbachman

the missing context from the previous comment is that Tao used GitHub Copilot to help with learning Lean.

He's been writing about it as he goes, most recently: https://mathstodon.xyz/@tao/111271244206606941

omneity

Thanks for adding the clarification. I thought this was more common knowledge. I will update my comment accordingly.

passion__desire

Not to belabor the point: LLMs are compilers for English (natural) language.

syngrog66

I found a "bug" in one of Terence Tao's math blog posts too, years ago. I told him about it, he fixed it, and thanked me.

I didnt make the front page of Hacker News, of course. lol

syngrog66

If anyone's interested, I went into more detail about it over in a post on my blog:

https://news.ycombinator.com/item?id=38040982

Dudester230602

I was worried that Lean4 was yet another LLM, but it's actually some hard and reliable stuff.

jamesrom

Anything that helps Terence Tao find bugs in his papers is hard and reliable in my books.

ykonstant

The Lean 4 community is quite optimistic about combining LLMs and theorem provers for proof assistance and formalization.

woolion

"Terry Tao finds ChatGPT very helpful to prove new theorems" would actually be bigger news than this one, IMO.

throwalean

"Terry Tao finds ChatGPT very helpful to formally verify his new theorems" seems to be a true statement. See some of his other recent mathstodon toots.

woolion

The point is that LLMs and similar tools tend to be very good at automating the trifle but not very useful at what would be considered really "interesting" work.

So while your point is somewhat true [0], as he mentions that these tools could become good enough to do the formal verification part, it's precisely not the interesting part. See [1] and [2]; in particular some things that are very easy to do in real maths can be very challenging in an automated theorem prover, quoting from [2]:

>In the analyst's dialect of Mathematical English, this is a one-line proof, namely "by the standard limiting argument". Unpacking this into a formal proof required me to go through a fair bit of the Mathlib documentation [...]

It's impressive to be able to do such mathematics in Lean/Coq at all, but it is very tedious mechanical work [3].

>It was more tedious than I expected, with each line of proof taking about an hour to formalize

So I think that rather proves the point of what LLMs are currently good for, and what tools can help for really difficult tasks, rather than invalidate it.

[0] https://mathstodon.xyz/@tao/111305365372766606

[1] https://mathstodon.xyz/@tao/111305336701455719

[2] https://mathstodon.xyz/@tao/111259986983504485

[3] https://mathstodon.xyz/@tao/111305336701455719

OscarCunningham

He is finding Copilot useful to help with the formalisation: https://mathstodon.xyz/@tao/111271244206606941.

cubefox

I wonder whether we could combine formal proof checkers (like the Lean proof checker) with language models that generate synthetic conjecture-proof pairs in a formal language like Lean.

The Lean proof checker could be used to automatically verify whether the synthetic proofs written by the language model are correct. This information could be used to provide an RL reward signal applied to the original language model, which would result in it writing better proofs. (Or we train a new model using the correct synthetic proofs of the previous round as training data.) And then the process repeats. So the model would self-train using its synthetic training data, without further human intervention.

We could even make this process more adversarial. First we split the generator language model into two: One which generates conjectures, and one which tries to prove/disprove them in Lean. Then add a predictor model which tries to predict whether a synthetic proof is verified by the Lean proof checker. The lower the predicted probability that the proof will be correct, the more reward gets the proof-generator model if it did indeed provide a correct proof.

Finally, we add another model which tries to predict the reward the proof-generator model will get for a given synthetic conjecture. Then the conjecture-generator model is rewarded for conjectures that are predicted to yield a high reward in the proof-generator model. So conjectures that are neither too hard nor too easy for the proof-generator model.

So we would expect that the whole system would progressively create harder and harder synthetic proofs, which in turn allows for better and better self-training of the proof-generator.

It seems this could in principle scale to superhuman ability in generating proofs. The process would be somewhat similar to GANs or to self-play in AlphaGo Zero.

I think the hard part is the initial bootstrapping, getting the whole process off the ground, because the initial training of the generator models has to be done with human-provided training data (Lean proofs). But once the synthetic proofs are good enough, the system would train itself automatically.
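As a toy illustration of the loop's shape (entirely my own stand-ins: arithmetic witnesses in place of Lean proofs, random guessing in place of a language model), the generate / attempt / verify / keep cycle looks like this:

```python
import random

random.seed(1)

def generate_conjecture():
    """Stand-in conjecture generator: 'find x with a*x + b == c'."""
    a, x = random.randint(1, 9), random.randint(0, 9)
    b = random.randint(0, 9)
    return (a, b, a * x + b)  # solvable by construction

def propose_proof(conjecture):
    """Stand-in 'prover': guesses a witness (a real system would be a model)."""
    return random.randint(0, 9)

def verify(conjecture, x):
    """Stand-in for the Lean checker: mechanically checks the 'proof'."""
    a, b, c = conjecture
    return a * x + b == c

# The outer loop: generate, attempt, verify, and keep only verified
# pairs as synthetic training data for the next round of training.
training_data = []
for _ in range(2000):
    conj = generate_conjecture()
    proof = propose_proof(conj)
    if verify(conj, proof):          # the reward signal / data filter
        training_data.append((conj, proof))

print(f"verified {len(training_data)} of 2000 attempts")
```

The essential property the sketch shows is that the verifier is mechanical and trusted, so the loop can run without human labels once bootstrapped.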

ykonstant

This workflow is an explicit goal of the Lean 4 devs. The official Zulip chat has a channel dedicated to the interface between the two:

https://leanprover.zulipchat.com/#streams/219941/Machine%20L...

cubefox

That would be interesting, though this link requires an account.

pfdietz

It would also be interesting to train these models on the entire math literature. One would want to use this to verify the math literature, translating the proofs there to checkable forms. Eventually, we would get a model that knows every theorem and every correct proof in the math literature.

aSanchezStern

Sounds like you've stumbled into the wonderful world of machine-learning guided proof synthesis! While I don't think the full system you're describing has been built yet, many similar systems and pieces have. In terms of the initial phase of supervised learning on existing proofs to prove new ones, there's TacticToe (https://arxiv.org/abs/1804.00596), Tactician (https://arxiv.org/pdf/2008.00120.pdf), CoqGym/ASTactic (https://arxiv.org/abs/1905.09381), Proverbot9001 (https://arxiv.org/abs/1907.07794), and Diva (https://dl.acm.org/doi/10.1145/3510003.3510138#sec-terms), among others. Most of these have some sort of language model within them, but if you're specifically looking for the LLM's that have been big recently, there's GPT-f (https://arxiv.org/abs/2009.03393), Baldur (https://arxiv.org/abs/2303.04910), and COPRA (https://arxiv.org/abs/2310.04353), though currently these models don't seem as effective as the specialized non-LLM tools. In terms of using reinforcement learning to learn beyond human written proofs, there's TacticZero (https://openreview.net/forum?id=edmYVRkYZv), this paper from OpenAI (https://arxiv.org/pdf/2202.01344.pdf), rlCoP (https://arxiv.org/abs/1805.07563), the HOList line of work (https://arxiv.org/pdf/1905.10006.pdf), and HyperTree Proof Search (https://arxiv.org/abs/2205.11491), as well as some in progress work I'm working on with a team at the University of Massachusetts.

cubefox

Wow, that's a great overview. While most of it goes over my head, it is interesting that quite a lot of research has been done in this direction.

The system I outlined above was inspired by this intriguing paper:

https://arxiv.org/abs/2207.14502

It is superficially about a different topic, namely generating and solving "programming puzzles", basically a type of unit test, which are then verified by the Python interpreter. This system seems quite analogous to one where the programming puzzles are replaced by conjectures in a language like Lean, the proposed puzzle solutions are replaced by proposed proofs, and the Python interpreter is replaced by the proof assistant.

I just discovered that there seems to be at least one guy working on applying a similar system to formal mathematics: https://www.cs.cmu.edu/~jlaurent/pdf/reports/thesis-proposal... He actually cites the paper above.
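In that paper's setting, a "puzzle" is just a predicate the interpreter can run; a toy sketch of the verify step (my own example, not taken from the paper):

```python
def puzzle(x):
    """A programming puzzle: find an x that makes this return True."""
    return isinstance(x, list) and len(x) == 3 and sum(x) == 0 and x[0] > 1

def verify(candidate):
    """The Python interpreter plays the role of the proof checker."""
    try:
        return puzzle(candidate) is True
    except Exception:
        return False

# A proposed solution, i.e. what the generator model would emit:
solution = [2, -1, -1]
assert verify(solution)
assert not verify([1, 2, 3])
```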

staunton

There are people doing this already. For Lean, an example is https://morph.so/blog/the-personal-ai-proof-engineer/

cubefox

If I understand this correctly, they want to use an AI model for "autoformalization", which sounds like "using a language model to translate natural language proofs into Lean proofs". Which is cool, but much less ambitious than the self-training system I described above. I guess AI technology isn't yet far enough to make such a proposal work.

staunton

Given a billion dollars or so, I guess one might get pretty far. The roadblock might be more that nobody sees a lot of market potential.

pama

Here is earlier context for how Tao used LLM tools, including GPT-4, to help him in this journey.

https://mathstodon.xyz/@tao/111233986893287137

ocfnash

You can even follow his progress on GitHub here:

https://github.com/teorth/symmetric_project/

deterministic

Lean4 is brilliant. Worth digging into as a programmer. Coq, Lean4, Agda etc. made my brain explode in a good way, making me a better software developer.

gridentio

I'm kind of interested in how useful Lean4 is as a programming language, and if it's easy to prove things about a program written in Lean. I should probably look into that when I have a minute.

ykonstant

Regarding usefulness: Lean is very nice to program in, if you care about pure functional languages; its FFI allows you to incorporate fast C routines very easily if pure Lean is not performant enough or lacks features. However, in some domains, Lean is within a decimal order of magnitude of (not hand-optimized) C; some benchmarks I hand-made recently impressed me.

Regarding proving things about programs, no, it is not easy, and the developers do not seem to consider it a core goal of Lean.

staunton

> the developers do not seem to consider it a core goal of Lean

I guess it depends on who you ask. The original devs of Lean wanted to do "everything" (because that's how you start projects, I guess). Since then it has attracted a lot of mathematicians (especially those working on Mathlib, a library that aspires to formalize "all known math") who are happy to have "algorithm objects" and prove things about them without being able to actually run an algorithm on any input.

This goes together with mostly embracing classical logic (which breaks the original and most powerful version of Curry-Howard, which allowed you to extract programs from proofs). However, in practical situations, algorithms extracted in this way tend to be too slow to be useful, so maybe that's not actually a downside for programming purposes.

Finally, Lean4 "compiles" to C-code, so at least it is (or can reasonably easily be made) portable. People have been trying to use it for real applications, like the AWS stuff others have linked in this thread.

kmill

The core Lean 4 developers do want proving properties about programs to be easy. In the short term maybe priorities have been elsewhere due to limited resources, but that doesn't mean they do not consider this to be a core goal. My understanding is that there are still some research-level problems here that need to be worked out. (Proving things about do notation is currently a real pain for example.)

tgv

Does anyone understand what the proof is about? An improved bound on the mean of geometric series? And why is it only a problem for n=3, k=2, and not for all k=n-1?

pa7x1

From a very quick glance at page 6 of https://arxiv.org/pdf/2310.05328.pdf

You can see he is treating the case of k=1,2 with that formula and uses induction to extend it to 1 ≤ 𝑘 ≤ 𝑛 − 2.

For k = n - 1 he uses a different bound defined in equation 2.2. So he bypasses the issue.
