Separation-logic Style Local Reasoning for Free?


For a long time, deductive software verification faced two challenges: (1) the presence of pointers in computer programs and (2) the variety of programming languages one may want to support. The former was addressed by the family of separation logics, the latter by a family of operational-semantics-parametric reachability logics. However, I have not seen anyone reasoning effectively about pointer-manipulating programs in reachability logic, and similarly, adapting a separation logic (e.g., Iris) to a new language requires a lot of expert manual work. Moreover, it seems to me that the communities around separation logic and reachability logic are mostly unaware of (or non-interacting with) each other's work. But what if we could have the best of the two worlds? Could we have separation-logic style local reasoning for free, without repeatedly proving boring adequacy theorems?

First, a disclaimer: I am not an expert in separation logic. I am just learning and sketching some ideas for future work.

Two problems and two solutions

Some software is easy to reason about. Techniques for deductive reasoning about computer programs have existed since the late 1960s, when Floyd and Hoare published their influential papers. However, reasoning about pointer-based data structures was traditionally considered hard. The problem with pointers is that they can alias: the same memory cell or part of a data structure can be accessed through two or more pointers that are unrelated at first sight. Separation logics, introduced in the late 1990s and early 2000s, were designed to address this problem.

Another problem with deductive program verification is that deductive verifiers have to be built anew for every language. (Some may consider this problem solved thanks to various intermediate languages, like Boogie or LLVM IR, but these solutions only shift the problem of building a sound verifier for a language L to building a sound verifier for a language L’ and proving the correctness of a translation from L into L’.) At least to some extent, this has been addressed by the family of reachability logics and the community around the K framework.

However, the two solutions to the two problems are incompatible.

Or at least, I have never seen anyone using both.

A first look

The main ingredient of separation logic is its framing rule.

    { P } C { Q }
---------------------
{ P * R } C { Q * R }

The rule expresses some form of monotonicity. It says that if the command C successfully transforms resources (data in the heap, etc.) P into Q, then we can execute the command with additional resources R (the frame), and everything will still work as before, only that we will end up with the additional R at the end.
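To make this concrete, here is a minimal sketch in Python, with heaps modelled as dictionaries and a single hypothetical heap-update command standing in for C. The names (`store`, the example addresses) are mine, not from any of the tools discussed here.

```python
# Hypothetical sketch: heaps as Python dicts, C = "[addr] := val".

def store(heap, addr, val):
    # Precondition P: addr is allocated (addr |-> some old value).
    assert addr in heap
    updated = dict(heap)
    updated[addr] = val
    return updated

# { x |-> 1 } [x] := 2 { x |-> 2 } ...
small = store({"x": 1}, "x", 2)
# ... framed with R = (y |-> 3): the command behaves exactly as before,
# and the frame survives untouched in the postcondition Q * R.
framed = store({"x": 1, "y": 3}, "x", 2)
assert small == {"x": 2}
assert framed == {"x": 2, "y": 3}
```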

Of course, this rule is unsound for many potential programs. For example, if a program counts the size of the memory, or if it requires some variable to be absent, we may get wildly different results when we execute the program in a larger memory or with more resources. Since reachability logic is parametric in a rewriting system representing the programming language’s semantics, and the rewriting system can query any part of the state in various ways, we have a problem.

One way of making sure the framing rule holds in a rewriting system would be to ensure it holds for every rule in the system. That would restrict the operations individual rules can perform. For example, a semantic rule that finds an unused memory location, e.g., for the purposes of memory allocation, would break framing. Therefore, one would have to limit the kinds of queries that rules (their side conditions and right-hand sides) can ask about the values available in a valuation, to ensure monotonicity when adding resources (heap, etc.).

Consider a rule for reading the value of a variable from memory:

< X ~> Rest, Mem > => < Mem[X] ~> Rest, Mem > if X in_keys(Mem)

This rule is good: we do not change its effect when extending memory.
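A quick way to see this is to model the rule in plain Python, with the memory as a dict and the side condition as a membership test. (This encoding is my own illustration, not how K or Minuska actually represent rules.)

```python
# Sketch of the lookup rule: side condition "X in_keys(Mem)" is monotone
# under memory extension, and the looked-up value does not change.

def lookup_applicable(x, mem):
    return x in mem            # side condition: X in_keys(Mem)

def lookup(x, mem):
    assert lookup_applicable(x, mem)
    return mem[x]              # the rewrite replaces X by Mem[X]

mem = {"x": 1}
extended = {**mem, "y": 2}     # mem extended with a disjoint frame

assert lookup_applicable("x", mem)
assert lookup_applicable("x", extended)                  # still applicable
assert lookup("x", mem) == lookup("x", extended) == 1    # same result
```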

Negations are bad

However, a similar rule that reads some default value if a variable is not defined would be bad:

< X ~> Rest, Mem > => < 42 ~> Rest, Mem > if not (X in_keys(Mem))

The applicability of this rule is not preserved by memory extension, despite the fact that it uses essentially the same functions. The only difference is the negation, which could be considered a function, but could also be considered a hard-wired logical connective. (As far as I know, K has two negations: one is a function from the underlying model, and the other is the logical, matching-logic negation. In Minuska, there is only the optional negation from the underlying model.)
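In the same dict-based sketch as before (again my own illustrative encoding), the failure is immediate:

```python
# Sketch of the default-value rule: the only change from the lookup rule
# is the negated side condition, and framing already fails.

def default_applicable(x, mem):
    return x not in mem        # side condition: not (X in_keys(Mem))

mem = {}
extended = {"x": 7}            # mem extended with a frame that defines x

assert default_applicable("x", mem)           # the rule fires on mem...
assert not default_applicable("x", extended)  # ...but the frame disables it
```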

Equality is bad

Ok, so negations are problematic. But constructors combined with equality are also problematic. Consider the rule

<foo, Mem> => <42, Mem> if Mem == {[(x,1),(y,2)]}

(where {[(x,1),(y,2)]} somehow constructs the memory mapping only x to 1 and y to 2). The applicability of this rule is not preserved by memory extension. To avoid such issues, we need to disallow either direct construction of memories or equality. The former is problematic, because the language could still capture some old value of a memory and compare it with a new one. So, we have to disable equality.
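The same dict-based sketch shows the failure mode: an equality side condition against a concretely constructed memory is falsified by any extension.

```python
# Sketch: side condition "Mem == {[(x,1),(y,2)]}" as a dict equality.

def eq_applicable(mem):
    return mem == {"x": 1, "y": 2}

mem = {"x": 1, "y": 2}
assert eq_applicable(mem)                    # fires on the exact memory
assert not eq_applicable({**mem, "z": 3})    # any extension disables it
```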

A technical problem with Minuska is that rule side conditions are lists of equalities between built-in values.

Could we have two kinds of values, where one kind would be “normal” (visible) and the other not comparable by equality (hidden)? This seems to be related to behavioral specifications, hidden algebras, etc.

Determinism is bad

How could we model a memory allocation primitive? The standard way it is done in K is that there is a hidden counter somewhere in the state, and every time we need to allocate new memory, we increase the counter and use its old value as the address. That way, one never reuses addresses that were allocated (or freed) previously. Of course, this breaks framing, since this is a non-local operation: extending the heap breaks the intended invariant that all used memory addresses are lower than the counter.

The other obvious way is using the lowest free address. That is non-local, too, and extending the memory results in a different address being returned to the user.
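Here is the lowest-free-address strategy in the same sketch style; the function name and encoding are mine. Extending the heap with a frame changes the returned address, so the action is not local. (The counter-based allocator fails differently: a frame may contain an address at or above the counter, breaking its intended invariant rather than changing the result.)

```python
# Sketch: deterministic "lowest free address" allocation.

def alloc_lowest_free(heap):
    addr = 0
    while addr in heap:
        addr += 1
    return addr

assert alloc_lowest_free({0: "v"}) == 1            # without the frame
assert alloc_lowest_free({0: "v", 1: "w"}) == 2    # with the frame: different
```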

So it looks like the problem is that we are trying to be too exact, too deterministic. We could specify the allocation primitive to return some nondeterministic but unused address. In that case, extending the memory implies shrinking the space of possible new addresses. That corresponds to having fewer allowed behaviors in the conclusion of the framing rule, which is just fine, since it only means more properties would hold.
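One way to picture this (a sketch of my own, over a finite address universe for illustration) is to model the nondeterministic allocator by the *set* of addresses it may return: extending the heap only shrinks that set, which is exactly the direction the framing rule tolerates.

```python
# Sketch: nondeterministic allocation as the set of possible results.

def may_alloc(heap, universe):
    # the allocator may return any address not already used
    return {a for a in universe if a not in heap}

universe = set(range(8))
small = {0: "v"}
extended = {0: "v", 1: "w"}    # small extended with a frame

# Fewer possible behaviors after extension, never new ones:
assert may_alloc(extended, universe) <= may_alloc(small, universe)
```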

Both K and Minuska are deterministic.

However, some time ago I was thinking of extending Minuska with capabilities for interactions, using interaction trees (ITrees). I could go one step further and use choice trees (CTrees) to extend Minuska with both interactions and nondeterminism.

Nondeterminism is bad

But how do we generate an interpreter for a nondeterministic language semantics? Especially if the nondeterminism is restricted: our semantic rule for allocation would ask the runtime for any integer different from the keys of the map representing the heap. The nondeterminism handler needs to know how the value will be used, and thus needs to be coupled with the static model providing the heap.

Ok, suppose then that there is some built-in function in the static model that takes a heap and returns a new heap with a new cell in it, paired with a handle into the heap, where the handle is a nondeterministic value. What happens if this function is called multiple times during the evaluation of a rule’s right-hand side?
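A very speculative sketch of such a built-in (all names are hypothetical, and the `pick` argument stands in for the nondeterminism handler, resolved deterministically here just for illustration): if the heap is threaded through both calls, the two handles are necessarily distinct.

```python
# Speculative sketch: a built-in that extends the heap with one fresh cell
# and returns the new heap paired with the chosen address ("handle").

def alloc(heap, pick):
    addr = pick(set(heap))       # any address not in the old heap
    assert addr not in heap
    new_heap = dict(heap)
    new_heap[addr] = None
    return new_heap, addr

# Two calls within one right-hand-side evaluation, threading the heap:
fresh = lambda used: max(used, default=-1) + 1
h1, a1 = alloc({0: "v"}, fresh)
h2, a2 = alloc(h1, fresh)
assert a1 != a2 and a1 in h2 and a2 in h2
```

Whether the semantics framework would actually thread the heap this way, or evaluate both calls against the same old heap (and then possibly collide), is exactly the open question.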

Relevant literature

The paper Local Action and Abstract Separation Logic introduces a precise notion of “local action”. The idea of a nondeterministic allocation function also comes from there (page 5, top right). The paper Local Actions for a Curry-style Operational Semantics contains the idea of having two semantics, one abstract and one concrete, related by some kind of refinement. The idea of a “nondeterminism handler” from the previous paragraph may be related, and perhaps could be used to generate the refined semantics from the abstract one.

Conclusion

I think this is an interesting problem, and I would like to have a deeper look into it. Having a collaborator with a deeper knowledge of separation logics would be helpful, though.