More

johnbender · 2025-05-10T17:07:47 1746896867

> Besides, gotta start somewhere. It's probably a PoC, for a platform that will eventually handle all sorts of things.

I agree and I think we should give folks leeway to make progress but this seems to be the qualifier for nearly every GenAI demo I’ve seen

johnbender · 2025-01-11T00:48:48 1736556528

There’s decades of research in this vein fwiw, usually referred to as symbolic execution and it’s descendants like concolic execution.

johnbender · 2025-01-11T00:47:01 1736556421

A minor point. This is more akin to testing because you’re only checking your formulae against a subset of system traces.

Formal methods connotes comprehensive evidence about system behavior. In the case of TLA and similar system that’s a state machine and not the real system but the output for the state machine is a proof that your LTL/CTL/TLA properties hold for all behaviors of the system (traces or trees of traces).

charleshn · 2025-01-11T02:08:07 1736561287

Definitely, I'm playing fast and loose with "lightweight formal method here", thanks for making it clear.

I was mentioning it in the same context as e.g. the Amazon paper on lightweight formal methods [0] where they use property-based testing to test that the implementation conforms to the model specification.

In a similar spirit, linearizability checkers like Porcupine [1] are a nice mix of formalism and usability.

I really like those because they are incredibly powerful, don't require a model and verify the actual implementation - obviously as you mention they are not exhaustive.

[0] https://assets.amazon.science/77/5e/4a7c238f4ce890efdc325df8...

[1] https://github.com/anishathalye/porcupine

johnbender · on Nov 17, 2023

If you view a file system as running concurrently with another instance of itself where it could be preempted at any time indefinitely and where your algorithm for ensuring your crash protocol invariant must be lock free (two threads helps here) then you get a good sense for the complexity of the problem.

dan-robertson · on Nov 17, 2023

… and filesystems themselves are multithreaded, and your writes can get reordered quite dramatically.

johnbender · on Sept 29, 2022

Fwiw we have at least some reason to hope in this general context that between clever systems work and tightening theoretical bounds via additional assumptions and clever reasoning we might get to practical implementations for some applications.

As (maybe weak) evidence the progress on practical implementations of PCPs/SNARGs

https://dl.acm.org/doi/pdf/10.1145/2641562

rhindi · on Sept 29, 2022

It’s much much faster now, and performance is improving 10x every couple of year. With the current trend, FHE will be applicable to 80% of usecases by 2025

johnbender · on Sept 23, 2022

Learning whether the button should change colors is different from actually changing the colors.

I agree that the learning part is interesting but at google I suspect those are two different jobs.

johnbender · on Sept 8, 2022

As an aside Madoff Industries did employ people who earnestly worked outside the Ponzi scheme. I know because I met a few of them when I worked on a database architecture audit for them around 2008.

bri3d · on Sept 8, 2022

"Database architecture audit for Madoff Industries" is a pretty good conversation starter as a resume item!

This is a good point, and something that occurred to me after I posted - there's also a type of scam company where honest employees are hired to do something real, but the executive staff are skimming inordinate compensation for themselves or hiding a more nefarious project underneath, like a Ponzi scheme, simple money laundering, or self-dealing / kickback style schemes.

I suppose it's possible that Boom could be one of these, and I don't really have enough data to say whether it is or not.

My personal take on Boom is that it's your typical amateur, marketing-savvy CEO with a lofty dream who managed to collect funding for a project they can't execute on. Whether this kind of thing is a net "good" or not, I'm not sure, but I really don't think it's a "scam" in a mal-intent sort of way.

If I were a VC, I wouldn't invest in an aerospace founder with no aerospace background, but plenty chose to, and their funds are at least being redistributed towards employees and suppliers. The whole idea of Silicon Valley style "innovation" VC is supposedly that their moonshot upsides can patch over their moonshot downsides, no?

johnbender · on May 28, 2022

To follow on this sentiment at a slight tangent, I am happy for the enthusiastic attempts from all quarters but folks seem to misunderstand that incremental progress in academia is often due to the problems being very hard. Formal verification in the presence of weak memory models is an ongoing problem that only recently has seemed tractable thanks to the hard work of the folks at MPI etc over more than a decade of publications and iterating on logics like Iris

johnbender · on May 28, 2022

I think this sharply discounts the value of the step before the proof which is writing the specification in a formal language/logic. This often surfaces these misunderstandings before a proof is even necessary.

That doesn’t guarantee that the spec will be the right one (as you say) but it’s astounding how often just writing it down formally improves the implementation through rigorous thought.

pydry · on May 28, 2022

I've only done a bit of formal verification but I'd estimate that writing that spec was 7-10x harder than writing the actual program and was more complicated than the code.

In the end I had lower confidence that the spec lacked bugs than the program. This was after expending a huge amount of effort on a pretty tiny program.

I dont think this was a tooling thing. I think it spoke to the fundamental limits of formal verification. I think it'll always remain kinda niche.

danielvf · on May 29, 2022

I've used formal verification in anger, as well as working with a vendor to formaly verify some critical code for us. This rings very true. It is extraordinarily difficult to write correct rules. It ends up being the same problem as wishing against an evil genie.

Most rules that you come up with at first end up having a class of obvious exceptions in the real world, which the verifier finds, and then even more unobvious exceptions, and soon your logic around the exceptions to the rules become at least as complicated as the code you are attempting to verify. And in this any wrong assumptions that falsely allow bad behavior are not caught or flagged because they pass.

Even giving perfect proving software, it's still a far harder challenge to write rules than to write code. And current software is still far from perfect - you are likely to spend a lot of your rules time fighting with your prover.

johnbender · on May 28, 2022

I think this depends on the spec language and the target system. I’ve never encountered a spec more complicated than the program as the goal is always abstraction but I don’t mean to discount your experiences and complexity is affected by tooling familiarity and quality.

Separately the spec can often have its own properties which can be verified as a means to interrogate its correctness. For example state machines as spec, temporal logic properties and model checking where the state machine is the abstraction for a concrete system. Worth noting that proving state machines are an abstraction of a concrete system is a going research concern though.

lelanthran · on May 29, 2022

> I think this sharply discounts the value of the step before the proof which is writing the specification in a formal language/logic.

If that was at all achievable, we'd have a compiler that took the "specification in a formal language/logic" and emitted native code/applications/programs. We'd then call the "specification in a formal language/logic" a programming language.

Sure, there are a lot of formal languages for specifying logic with checkers that ensure no bugs in the input specification exist, but AFAIK none of them are useful enough to emit programs.

Needing a human to translate one formal language (the formal spec) into another formal language is pointless and useless, because then the human may as well just translate human language specification into formal language.

csande17 · on May 29, 2022

If you're writing a sorting algorithm or a hash table implementation or something, then the spec is meaningfully different from the code. The spec says "the output array is sorted", the program describes some particular strategy for sorting it, and then you use the proof tools to make sure that the strategy actually works to sort the array in all cases.

But for things like UI code, I too am having trouble imagining a spec that is concrete enough to be useful for formal verification and does not have some trivial 1:1 correspondence to the implementation. (If anyone knows of an example, I'd really be interested in seeing it!)

lelanthran · on May 29, 2022

> If you're writing a sorting algorithm or a hash table implementation or something, then the spec is meaningfully different from the code. The spec says "the output array is sorted", the program describes some particular strategy for sorting it, and then you use the proof tools to make sure that the strategy actually works to sort the array in all cases.

I dunno. Thinking more deeply about the specification for a sorting algorithm, it makes sense that the specification includes the O(n) runtime (or memory usage, or both), or else it's an informal specification.

If the spec is really nailed down formally then the specification language really would be the implementation language too.

nottorp · on May 29, 2022

> That doesn’t guarantee that the spec will be the right one (as you say) but it’s astounding how often just writing it down formally improves the implementation through rigorous thought.

But that doesn't have much to do with formal methods. You can achieve the same effect grabbing a colleague and explaining your spec to them, it will trigger the same rigorous thought because you want them to understand you.

johnbender · on May 22, 2022

> In general you can't prove a program will or won't terminate.

As a point of clarity for folks who come to this (the commenter clearly knows this) one can’t _automatically_ prove for an arbitrary program whether or not it terminates. It’s definitely possible to prove this manually per program and most dependently typed languages will do the work for you as a matter of sound over approximation as long as one of the arguments is shrinking in an obvious way.