There is one aspect of the Minuska framework which provides a language designer a great flexibility. Namely, a programming language configuration is a tree whose leaves hold values from an arbitrary domain (typically integers, lists, etc); it is up to the language designer to choose a suitable domain. This makes writing language-parametric tools in Minuska even harder. In this post I explore some arguments for supporting domain-parametricity anyway, discuss the associated problems, and sketch out a possible solution.

Advantages and problems of a fixed domain

When a semantics framework fixes the domain of values, language designers are forced to use that single domain. This is the case of K-framework. For some languages, the fixed domain may be an overkill: when a language needs to work only with strings (for variable names), integers (for values), and maps from strings to integers (for stores), having a support for maps from arbitrary terms (over arbitrary values) to other terms (over arbitrary values) can be percieved as an unnecessary level of indirection. What is worse, it might happen that some language needs to use some values outside of the officially supported domain. In the K-framework, such situation can be addressed by axiomatizing an extension to the supported domain of values, but that opens another can of worms: who makes sure that the axioms are consistent?

On the other hand, implementors of a language-parametric tool (e.g., of a symbolic execution engine) might be happier if they are tasked with supporting only one, although complex, domain of interest.

Problems of domain-parametricity

Allowing language designers to choose their domain of values puts some extra pressure on them. Suppose their programming language manipulates pairs of integers. Shall they use only a domain over integers and implement pairs themselves outside of the domain, or should pairs of integers be in the domain? Should pairs of other things be also in the domain?

And in the case of symbolic execution: if the implementers of the engine does not know what values can be used in the language, how can they implement simplification of path conditions? The generic symbolic execution engine can only automate reasoning about configurations modulo the domain of values. If the language designer chooses a trivial set of domain values, the language definition will become too low-level and hard to reason about - and simplification of path conditions will require writing uninteresting-but-complex invariants. For example, one can encode natural numbers as trees (basically, a list with trivial data in the leaf, where the length of the list is the number to be represented), but then non-trivial operations on natural numbers (e.g., addition) have to be implemented using recursive rewriting rules of the language definition, and suddenly, symbolic execution loops even for loop-free programs.

On the other hamd, if lots of computation is happening inside the domain of values, the symbolic execution engine is almost useless. And tailoring a domain of values to one’s language at hand is also some work, especially if the domain is complex.

Solution: compositionality of domains

Ideally, the language designer would be able to build their own domain of values based on pre-existing components and combinators. The components would come with some symbolic reasoning support, and the combinators would preserve this support. For example, we would have a domain of bools, a domain of ints, a domain of lists-of-stuff, a domain of maps-from-stuff-to-stuff, and so on. One complication would be in dependencies: some operations on lists may return booleans, some operations on integers may also return booleans, and we do not want to have two different versions of booleans in one domain.

Another complication would be in semantics transformers. Suppose we want to trasform a semantics to perform, e.g., an information flow analysis. A part of this transformation would be in transforming the rewriting rules, but part of it might want to transform the domain of values. If the transformation is completely value-agnostic, then fine. But what if we have some specific way we want to transform just integers? Then the transformation needs to somehow handle the domain combinators… but let me leave that for a future post.

Conclusion

This issue has been in my head for quite a some time. I do not have much time to attact it directly; however, I would like to model a subset of Rust in Minuska, and specifically, want to have the Tree Borrows aliasing model as a domain of values. The overall goal would be to have an extensible symbolic execution engine for a fragment of Rust; for the extensibility, domain combinators will come handy. So maybe I will revisit these ideas while working on the Rust project.