Highs and Lows of Language-Parametric Formal Verification -

Formal verification tools need to understand the programming language they target. For some of the tools, the understanding is represented implicitly in their code, while other are explicitly linked to a separate representation of the language. A tool in the latter category can also be created by semi-automatically deriving it from the language representation, as demonstrated by K-framework. In that case, the language is encoded explicitly in a formal semantics, the verification tool is built separately in a way that allows plugging in an arbitrary formal semantics, and then the two are combined. In this post I explore some advantages and limitations of this “language-parametric” approach.

How language-parametric verification tools work

Deductive verification tools in general

All formal verification tools worth their name are based on some rigorous, formal, mathematical theory. For deductive verification tools, this theory is often called “logic”, or specifically “program logic” (to amplify that it is used for reasoning about programs). We have program logics since the end of 1960’s, when Tony Hoare introduced what later became known as Hoare logic. In Hoare logic, one writes triples of the form {P} S {Q}, with the meaning that after executing the statement S from any state satisfying the logical formula P, if the execution terminates, then in the resulting state Q holds.

In 1990’s, a number of new program logic emerged - separation logics. The main idea - triples - was the same as in Hoare logics, but separation logics allowed people to formally reason about challenging programs manipulating pointer-based data structures, and do so easily, in a way a programmer informally reasons about their code. Separation logic was a big little revolution - big enough for a paper titled “The Next 700 Separation Logics” to come out. And besides Hoare and Separation logics, there are also variants of dynamic modal logic being used for formal verification.

Many deductive verification tools (such as Frama-C for C and Key for Java) are based on a logic like that. These tools automate out the boring parts of formal verification and leave to the user only the interesting parts, like coming up with loop invariants.

Language-parametric deductive verification tools

Language-parametric deductive verification tools, as implemented in the K-framework, are also based on a logic: reachability logic (or its another variant). Reachability logic can express similar properties as Hoare logic (that is, partial correctness properties); however, its main advantage is that it works with any programming language that one gives it. This means that there is a theorem saying that no matter what language one gives to the logic, reasoning done using the logic will always be sound for that language. This is in contrast with Hoare and separation logics that are sound only for the one particular language they have been designed for.

Tools like K-framework then add some level of automation on top of reachability logic, similarly to the Hoare- or separation logic-based tools.

Advantages of language-parametricity

Testability of language semantics

Using a semantic framework, one can derive more tools than just verifiers. If one is able to derive an interpreter, then testing the semantics is easy. One can test the interpreter against other interpreters and compilers for the same langage, thus establishing the correctness of, or at least reducing the number of bugs in, the language semantics.

Ease of engineering

Separation of language definition from tooling can be exploited in a team work. The development effort for a verification tool can be split between people with knowledge of formal verification techniques, and people with knowledge of the particular programming language. The language for describing a language semantics can serve as a well-defined interface.

Amortization of costs

Development costs get amortized as the number of tools and number of languages grows. Tools can easily adapt as languages evolve.

Limitations

However, in the current state of the art, as implemented in the K-framework backed by reachability logic, there are many limitations. Some limitations are only a matter of implementation, while to resolve the other ones, one would need to develop better logical foundations.

Limitations of the logic

Reachability logic generalizes Hoare logic. That means it can be used for reasoning about partial correctness of programs, but not about other properties. For example, there is a class of so-called hyperproperties, called “k-safety hyperproperties”, that can be used for specifying security policies on programs. Some extensions to Hoare logic can handle these, but Reachability logic does not support these. Only recently I have created a logic capable of reasoning about k-safety hyperproperties in a language-parametric way, but it hasn’t been implemented in any tool, as it is not entirely clear how to automate reasoning in such logic.

Similarly, reasoning about pointer aliasing is difficult in reachability logic. Perhaps someone could create a logic mixing reachability logic with separation logic?

Limitations of the implementation

K-framework is quite slow, and it is not obvious to me how to make it faster. One approach might be partial specialization of semantics with programs before performing reasoning about the program, but this has not been explored properly yet.

Another issue of K-framework is that it is complex and it is not formally verified itself. This has been addressed recently by the work on proof object generation (for interpreters as well as for deductive verifiers), where a tool generates a formal proof for each task that it performs. However, this is hardly a substitute for full formal verification, since one cannot know in advance whether the tool will work correctly for his particular task. And the proof generation and checking itself is not based on any sound logic, since the languages used for describing language semantics in K-framework themselves do not have any formal semantics.

Conclusion

Language-parametric verification tools, formal semantic frameworks, and the underlying formal logics are useful tools adressing some of the limitations of the state-of-the-art program verification frameworks. However, more theoretical work is needed to make these competetive in other aspects of program verification, such as reasoning about pointers. Also, more practical work is needed to create language-parametric tools that are performant and provably correct.