Calculus House

Alex Pokras

Objective: AI that writes 100% verifiably correct code

Alex outshipped nearly everyone in his YC batch, launching several AI SaaS products at his last startup, from macOS AI agents that know all of the user’s context to a vibe marketing tool for other founders. He kept asking himself: if AI is so good at coding, why are humans still shipping “one-size-fits-all” SaaS? Why isn’t AI creating a new bespoke program on your computer every time you need to accomplish a new goal?

The problem right now appears to be reliability. If an AI-generated program is 99.9% accurate but bricks your computer 0.1% of the time, that’s still not good enough. Today, tests are used to get as close as possible to a 100% correct program. However, tests will never cover every use case: they’re only as good as the test writer’s imagination.
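To make the point about tests concrete, here is a minimal sketch in Python. The helper `days_in_february` and its bug are hypothetical, invented purely for illustration: every test the author thought of passes, yet an input nobody imagined still gets the wrong answer.

```python
def days_in_february(year: int) -> int:
    """Hypothetical helper with a deliberate bug: it ignores the century rule
    (years divisible by 100 are leap years only if also divisible by 400)."""
    return 29 if year % 4 == 0 else 28


# The cases the test writer thought of all pass.
assert days_in_february(2023) == 28
assert days_in_february(2024) == 29
assert days_in_february(2000) == 29

# An input nobody tested: 1900 was not a leap year, so the right answer is 28,
# but the buggy helper returns 29.
print(days_in_february(1900))
```

The suite is green, and the program is still wrong for 1900. No amount of passing tests rules out the next untested input.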

Enter formal verification: a set of mathematical techniques that allow us to reason about a program from first principles with just a few rules of Boolean algebra and induction. It is not widely used because it requires a particular set of skills and long, cumbersome proofs in an obscure functional language… none of which is an issue for LLMs.
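As a rough illustration of what such a proof looks like, here is a small sketch in Lean 4. The function `total` and the theorem `total_append` are hypothetical examples, not part of Alex's system: the proof uses induction on the list to show the property holds for every possible input, rather than for a handful of tested ones.

```lean
-- Hypothetical example: a tiny program that sums a list of natural numbers.
def total : List Nat → Nat
  | [] => 0
  | x :: xs => x + total xs

-- A machine-checked property: summing a concatenation equals adding the sums.
-- Proved by induction on `xs`, so it covers all lists, not just tested ones.
theorem total_append (xs ys : List Nat) :
    total (xs ++ ys) = total xs + total ys := by
  induction xs with
  | nil => simp [total]
  | cons x xs ih => simp [total, ih, Nat.add_assoc]
```

If the proof checker accepts the theorem, the property is guaranteed for all inputs, which is the kind of assurance tests alone cannot give.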

Alex joined Calculus because he believes the right community is crucial to making a disproportionate impact on unsolved problems. Having run the leading AI student initiative in the EU during his college years, he understands that like no one else. Doing research at MIT convinced him that academia doesn’t have the incentive to execute at the right pace. That’s why he’s choosing a startup as the vehicle to bring this vision to reality.

By September, Alex will have shipped an AI system that automatically proves various properties of programs, most notably their correctness with respect to a specification and the absence of undefined behavior such as overflows. He’ll use it on a real-world codebase and point out bugs until there are none left.
