On the Value of Formal Semantics


Recently, a friend of mine asked me: “Do you think that all programming languages should have a formal semantics?” I do not want to give a simple yes/no answer to that question. Instead, I would like to think about the value formal semantics provides and compare it with the alternatives programming languages have.

The State of the Art

Some popular programming languages, such as Python or TypeScript, do not have any explicit semantics, not even an informal one. However, of the 15 most popular languages according to the StackOverflow 2023 Survey, 11 have some kind of official specification covering both syntax and semantics, and the specification for one language (Rust) is under development. Of these, 8 languages have their syntax defined formally, using some form of a grammar.

| Language | Specification | Syntax |
|---|---|---|
| JavaScript | ECMA-262 | grammar |
| HTML/CSS | HTML standard, CSS | informal |
| Python | none | |
| SQL | ISO/IEC 9075-1:2023 | ? |
| TypeScript | obsolete spec from 2016 | |
| Bash/Shell (all shells) | IEEE 1003.1 (POSIX) | grammar |
| Java | The Java® Language Specification | grammar |
| C# | C# standard specification | grammar |
| C++ | ISO/IEC 14882:2020 | grammar |
| C | ISO/IEC 9899:2018 | grammar |
| PHP | none | |
| PowerShell | Windows PowerShell Language Specification 3.0 | grammar |
| Go | The Go Programming Language Specification | grammar |
| Rust | in progress | |
| Kotlin | Kotlin language specification | grammar |

The idea of defining a language’s syntax by means of a formal grammar has been around since at least the 1950s, and it is commonly taught at universities, so it is not surprising that programming language designers use it. What I find more surprising is that the semantics of these popular languages is defined only informally, given that techniques for modeling semantics formally have been known since at least the 1960s. One explanation of this difference might be that for syntax, there exists a large variety of tools that can generate executable parsers from formal grammars, so language designers get an immediate benefit from defining grammars formally, while there are not that many tools that can take advantage of a formal semantics. However, one could counter that formal grammars in language specifications are not always used for generating parsers - for example, the Clang C and C++ compiler employs a hand-written parser.
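To make the contrast with semantics concrete later on, here is a toy illustration of what a formal grammar provides (my own example, not taken from any of the specifications above); the syntax tree is written in Coq, a proof assistant that reappears later in this post:

```coq
(* A BNF-style grammar fragment for a toy expression language:

     expr ::= expr "+" num | num

   A parser generator can turn such rules into an executable parser
   whose output is a syntax tree, i.e., a value of a type like: *)
Inductive expr : Type :=
  | Num  : nat -> expr                (* num           *)
  | Plus : expr -> expr -> expr.      (* expr "+" num  *)

(* With the left-recursive rule above, "1 + 2 + 3" parses to: *)
Definition example : expr := Plus (Plus (Num 1) (Num 2)) (Num 3).
```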

The benefits of specifications

The kinds of benefits gained by formulating a language specification depend on the kind of specification.

The case for informal semantics

When a programmer writes a piece of code in a programming language, he does so with some intent. The piece of code will eventually run inside a computer, consuming the computer’s inputs and having effects on its outputs. This may take the form of an interaction with a user, of network communication, or of a modification of data stored internally or externally. The programmer has some idea of what these interactions should be, and tries to write the piece of code such that the interactions are as intended.

There are multiple strategies for achieving a match between the programmer’s intent and the produced code. One such strategy consists of arbitrary modifications to the code, coupled with checks that decide whether the code behaves as intended. Needless to say, this strategy is not very effective. To improve on it, the programmer may use his knowledge of how individual constructs of the programming language influence the outcome. However, for this knowledge to be useful for a larger variety of programs, and for larger programs, simply remembering the outcomes of a large body of programs is not enough: the knowledge has to be consolidated into a coherent, compact, and unified whole. I would call this whole a mental model of the programming language.

We may think of a mental model of a programming language as its implicit informal semantics. While useful (and necessary), an implicit semantics has severe drawbacks in comparison to an explicit, usually written, informal semantics. First, while both implicit and explicit semantics can be incorrect or incomplete, it is much harder to detect these problems in an implicit semantics. Second, implicit understanding is very hard to share with other people.

An informal (but written) specification of a programming language makes the implicit understanding explicit. With such a specification, people can start reasoning (and arguing) about their code. Moreover, a specification can also serve as a social contract that guides the behavior of individual developers. Do I dislike what the interpreter is doing with my code? Either the interpreter is behaving according to the specification, and then I should fix my code, or the interpreter is buggy - and I can file a bug report (and hopefully, somebody will eventually fix the interpreter).

To get a feel for what such an informal specification of a language looks like, here is an excerpt from the (final draft of the) C17 standard, describing simple assignment:

[Figure: An excerpt from (a draft of) ISO/IEC 9899:2017 defining C. Quoting the key sentence from memory, so the exact wording may differ: “the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand”.]

It is pretty clear what this says, right? However, someone might start reasoning:

  1. “Replaces”? What does that mean, precisely? Does it mean that there is always something to be replaced?
  2. “The value stored in the object”? Does this mean that there is always a value? One, single, well-defined value? But in Section 3 of the specification, a value is the “precise meaning of the contents of an object when interpreted as having a specific type”. So we have to interpret objects through the lens of types to get their values? This sounds like a contradiction.
  3. Also, is there some total function named “stored in” that I can always find by searching the pdf for the phrase “stored in”? And has someone made sure that this function is never referred to by another phrase?
  4. Thinking about it, the authors probably wanted to say that it is the result of the conversion that replaces the stored value, rather than the original value before the conversion.

I swear I didn’t prepare these questions in advance, nor have I tried to find something really tricky. I just picked what I thought to be a simple effectful construct in the language. I used to work with the C++ language specification pretty intensively, but that experience didn’t help me much in understanding this simple sentence from the C standard.

Natural language is inherently ambiguous: what one person considers a clear description of a behavior might be interpreted differently by another. Specifications written in a natural language can easily be incomplete or inconsistent, without anyone noticing early enough. This is where formal semantics can help.

Formal semantics

By “formal semantics of a programming language” I mean a mechanism that takes a given syntactically valid source code of a program, typically in the form of a syntax tree, and assigns it a well-defined mathematical object. Among the three types of artifacts people create when doing mathematics - definitions, conjectures, and proofs 1 - formal semantics corresponds to definitions. The object assigned to a program by the formal semantics can have many forms: it can be a function computing outputs from inputs, or a transition system, or a relation between expressions and their values, or a predicate transformer… For example, the language Standard ML has its execution semantics defined in an operational style. Here is a rule that defines the meaning of SML’s assignment operator:

[Figure: An excerpt from The Definition of Standard ML]
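Since the rule is reproduced above only as an image, here is my rough transcription of it (matching the reading I describe in the side note below; the exact notation in the Definition may differ):

$$
\frac{s, E \vdash exp \Rightarrow \;:=\;,\ s' \qquad s', E \vdash atexp \Rightarrow \{1 \mapsto a,\ 2 \mapsto v\},\ s''}{s, E \vdash exp\ atexp \Rightarrow \{\}\ \text{in Val},\ s'' + \{a \mapsto v\}}
$$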

Pretty cryptic, right? To understand what it means, we have to know what all the symbols (the arrow, the vdash, the plus) stand for. This would be easier if we had access to the LaTeX source of the document, since searching for mathematics in a pdf file is not that comfortable. Contrast this with the informal semantics, where at first sight one could think he understands, and only later finds out that there are unresolved questions. Or maybe not.

A side note: I have never worked with SML, this is my first time skimming through the document, and still I can guess what the snippet means even without searching through the document. On my first reading, it says that “if the expression exp evaluates to the assignment operator while updating the state s into s’, and atexp evaluates to this mapping containing ‘a’ and ‘v’” (this is weird, I have never seen anything like that before), “then ’exp atexp’ evaluates to the empty set while updating the store such that ‘a’ has the value ‘v’ (while also applying the effects of the evaluation of ’exp’ and ‘atexp’ on the store)”. (I do not know what the notation “in Val” means.) This is likely because I have some general knowledge about the semantics of programming languages. That knowledge is quite easy to acquire: there are books, university courses, research papers, Wikipedia articles, etc. However, I am not aware of any such materials on how to interpret informal language semantics.

Once we have the definition of a programming language’s semantics, we can start creating conjectures. Some conjectures might correspond to an execution of a concrete program with concrete arguments: “this program, when executed with argument 5, will terminate with value 120”. One can also formulate richer conjectures: “this program is commutative in x and y”, “these two programs have the same input-output behavior”, “any program that satisfies this static well-typedness property is safe from runtime type errors”, or “any program that uses the interface of this class only in this way is safe from resource/memory errors of this kind”.

When we formulate these conjectures in the mathematical framework in which we work (typically set theory or type theory), they have a precise meaning, free of ambiguities. We can also prove or refute such conjectures, and when we do, we know exactly what it is that holds (or does not hold) and why. That is much more than what can be done with only an informal semantics.
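For instance, writing $[\![P]\!]$ for the input-output function that the semantics assigns to a program $P$ (a notation I am introducing just for this example), the conjecture “this program is commutative in x and y” becomes an ordinary mathematical statement:

$$\forall x,\, y.\ [\![P]\!](x, y) = [\![P]\!](y, x)$$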

But what about errors in the mathematics, such as invalid steps in proofs or plain old typos in definitions and conjectures?

Mechanized formal semantics

We can let a computer check the syntax of our conjectures as well as the well-formedness of the steps in our proofs. To do so, we can use an interactive theorem prover such as HOL4 or Coq, in which we encode the definitions, conjectures, and proofs that we would otherwise just do on paper (or in LaTeX). What we gain is confidence that all the mathematics is very likely correct. Moreover, since the language definition would have the form of code, we could use many available tools to quickly orient ourselves in the specification: from grep, through some form of documentation generator and cross-reference index, to the advanced tools that the particular interactive theorem prover offers (such as Coq’s “Search” command).
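As a minimal sketch of this workflow (my own toy example, not a fragment of any real language formalization), here is a complete Coq snippet with a definition, a conjecture, and a machine-checked proof:

```coq
(* A toy language, its semantics, and a machine-checked conjecture. *)
Require Import Arith.

Inductive expr : Type :=
  | Num  : nat -> expr
  | Plus : expr -> expr -> expr.

(* The semantics: a function assigning a number to every program. *)
Fixpoint eval (e : expr) : nat :=
  match e with
  | Num n      => n
  | Plus e1 e2 => eval e1 + eval e2
  end.

(* A conjecture of the kind discussed above: these two program
   shapes have the same input-output behavior. *)
Theorem plus_comm_prog : forall e1 e2 : expr,
  eval (Plus e1 e2) = eval (Plus e2 e1).
Proof.
  intros e1 e2. simpl. apply Nat.add_comm.
Qed.

(* `Search expr.` now lists everything proved about `expr`. *)
```

Coq rejects ill-formed definitions and invalid proof steps, so a typo in `eval` or a bogus step in the proof would be caught immediately. What can be better?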

Semantic frameworks

Typically, we treat a mechanized formal semantics as instructions for the background language. For example, the formalization of CompCert’s Clight contains instructions for Coq to define the types expr, statement, state, and step, and functions on them. Together, these objects implement the mechanism that assigns a mathematical object to a program. This is a natural thing to do when doing a deep embedding in Coq; however, while with this style one can reason about the embedded language, one cannot easily reason about the embedding itself.
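To make the “instructions” style concrete, here is a minimal sketch in that spirit (a drastically simplified imitation of the structure, not actual CompCert code):

```coq
(* Deep embedding as instructions: the inductive definitions tell
   Coq to construct these types once and for all. *)
Inductive statement : Type :=
  | Skip : statement
  | Seq  : statement -> statement -> statement.

(* A small-step relation on statements. *)
Inductive step : statement -> statement -> Prop :=
  | step_seq_done : forall s2,
      step (Seq Skip s2) s2
  | step_seq_left : forall s1 s1' s2,
      step s1 s1' ->
      step (Seq s1 s2) (Seq s1' s2).

(* We can now prove theorems ABOUT embedded programs, but no Coq
   program can take `statement` or `step` apart and, say, add a
   constructor or change a parameter's type. *)
```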

What I mean is that one cannot programmatically inspect the types and extract useful information from them, or transform them, e.g., by programmatically adding new constructors or modifying the types of their parameters. This would be useful for various kinds of abstract interpretation, where we could replace, say, integers with some abstract domain such as intervals. It has been observed by various authors that abstract and concrete interpreters have a similar structure; however, I am not aware of any tool or paper that would be able to automatically generate abstract interpreters directly from a language’s semantics. I suspect this is because of how we encode the semantics.
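For illustration (again my own sketch, not taken from any paper), here is how closely a concrete interpreter and an interval-based abstract interpreter can mirror each other; only the value domain and the operations change:

```coq
(* A shared syntax. *)
Inductive expr : Type :=
  | Num  : nat -> expr
  | Plus : expr -> expr -> expr.

(* Concrete semantics over natural numbers. *)
Fixpoint eval (e : expr) : nat :=
  match e with
  | Num n      => n
  | Plus e1 e2 => eval e1 + eval e2
  end.

(* Abstract domain: intervals [lo, hi], with an abstract addition. *)
Record interval : Type := Itv { lo : nat; hi : nat }.
Definition itv_plus (a b : interval) : interval :=
  Itv (lo a + lo b) (hi a + hi b).

(* Abstract semantics: structurally identical to `eval`. *)
Fixpoint aeval (e : expr) : interval :=
  match e with
  | Num n      => Itv n n
  | Plus e1 e2 => itv_plus (aeval e1) (aeval e2)
  end.
```

A tool that could inspect `eval` as data might derive `aeval` mechanically; with `eval` encoded as instructions, the abstract version has to be written and maintained by hand.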

The other approach is to treat the mechanized formal semantics as data in the background language. With this approach, we can always take the data and let it be interpreted by some generic machinery that stays the same independently of the concrete programming language whose semantics we are encoding. Therefore, everything that is achievable with the “instructions” treatment of formal semantics is also achievable with the “data” treatment. Moreover, we could have multiple machineries like that: for example, one machinery would interpret the data as a small-step formal semantics of the described language; another would yield a concrete interpreter of the same language; and yet another could generate symbolic and abstract interpreters.
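As a minimal sketch of the “data” treatment (my own, far simpler than what real frameworks do): the semantics is an ordinary value - a list of rewrite rules - and each piece of generic machinery is just one more way of interpreting that value:

```coq
Require Import List PeanoNat.
Import ListNotations.

(* A language's semantics as DATA: rewrite rules over
   configurations, here modeled as lists of numeric tokens. *)
Record rule : Type := Rule { lhs : list nat; rhs : list nat }.
Definition semantics := list rule.

(* Generic machinery #1: read the data as a small-step RELATION. *)
Definition rel_step (sem : semantics) (c c' : list nat) : Prop :=
  exists r, In r sem /\ c = lhs r /\ c' = rhs r.

(* Generic machinery #2: read the SAME data as an executable
   interpreter (apply the first rule whose lhs matches). *)
Fixpoint exec_step (sem : semantics) (c : list nat)
    : option (list nat) :=
  match sem with
  | []        => None
  | r :: rest => if list_eq_dec Nat.eq_dec (lhs r) c
                 then Some (rhs r)
                 else exec_step rest c
  end.
```

Because the rules are ordinary data, a third machinery - say, a symbolic executor - could be added later without touching any language definition.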

Tools that treat semantics as data can be called “semantic frameworks”. There already exist some tools like that, such as PLT-Redex and K. Both of these tools have achieved some success: K can be used for concrete and symbolic execution and for deductive verification, while PLT-Redex can be used for concrete execution and bug finding. However, both of these tools suffer from a serious problem: the meta-languages in which the semantics-as-data are described do not have a formal semantics themselves. That means that one cannot reason about the correctness of the generated tools (interpreters etc.).

So practically speaking, the categories of “mechanized formal semantics” and “semantic framework” are separate: to this day, I am not aware of any implementation of a semantic framework that would be truly formal. However, I tentatively believe that it would be best for programming languages if they were specified in semantic frameworks that were truly formal. This would yield both theoretical and practical benefits while amortizing the cost of developing such frameworks across the individual languages.

Final thoughts

The cost of developing a mechanized formal semantics would typically be higher than the cost of developing a pen-and-paper formal semantics, which in turn would typically be higher than the cost of developing an informal specification, which also does not come for free. Additional costs are associated with subsequent language evolution, if the specification is to be kept in sync with other artifacts (such as tools and documentation). However, the cost of developing a specification should be compared with the cost of not having one.

Personally, I find it frustrating that programming languages are usually not precisely defined, since it makes precise reasoning about programs theoretically impossible. Our knowledge about software can then be gained only by conducting experiments (testing), and that is often subtly unreliable in the presence of concurrency and changes in the underlying software infrastructure. We could do better. Will we?


  1. Theorems, which occur more often than plain conjectures, can be thought of as pairs of a conjecture and its proof. ↩︎