Multi-language Verification With UML Sequence Diagrams -

Reasoning about software is hard. Reasoning about software written in multiple languages and communicating in a complex environment is even harder. The latter is exactly the topic of inquiry in Yu Zhang’s Unifying Compositional Verification and Certified Compilation with a Three-Dimensional Refinement Algebra. Despite the complicated name of the paper, it turns out that one can use UML sequence diagrams. Well, almost.

Modeling a map application with UML Sequence Diagrams

Consider a map application that allows users to search for transport routes using multiple transport providers. Such app can interact with a user and transport providers in various ways; an example interaction is shown in the illustration below. An example sequence diagram The map application may be implemented in many different ways: for example, the list of connections found so far may be stored in memory, in a file, or in a database; the application may also be designed to remember only the cheapest connection so far, if memory consumption is an issue. But when talking about the correctness of such a map application, we may actually not care about these implementation details and consider only the application’s externally observable behavior.

We could represent this behavior as a collection of all possible sequence diagrams in which the application takes place. That, however, would have two big problems: (1) we may not know in advance what other actors there may be in the world; and (2) we would have the information about other actors, which would be irrelevant. Instead, let us represent the application’s externally obserable behavior as the collection of fragments of these sequence diagrams, such that each fragment shows only the lifeline of map application and its interractions - without the lifelines of other actors - as in the following illustration. An example sequence diagram

A little detail is that not all collection of fragments are meaningful. In the paper, the authors put two restrictions on the collections: (1) prefix-closedness, and (2) internal determinism (as modelled by a coherence relation in the paper).

In the subsequent text, let’s have a textual notation for those fragments. A fragment that simply returns an answer \( a \) for a question \( q \) without communicating with other actors would be written as \( q ↣ a \). A fragment that answers a question by first asking three other questions would be written as \( q ↣ (q_1 ⇝ a_1) ↣ (q_2 ⇝ a_2) ↣ (q_3 ⇝ a_3) ↣ a \). A fragment that answers first question immediatelly, for second questions asks one question, and for third question asks two questions, would be written as \( q ↣ a ↣ q' ↣ (q'_1 ⇝ a'_1) ↣ a' ↣ q'' ↣ (q''_1 ⇝ a''_1) ↣ (q''_2 ⇝ a''_2) ↣ a'' \). Notice how in the fragment, messages sent by the environment are interleaved with messages sent by the actor.

Modeling a Map Application, More Formally

In the referenced paper, fragments are called traces , and (reasonable) collections of traces are called strategies. To model such fragments/traces and collections/strategies, the authors use the notion of effect signature. An effect signature is a collection \( Q \) of questions together with a collection \( ar(q) \) of possible answers for each question \( q \in Q \). An effect signature is typically written as the set \( \{ (q : A) \mid q \in Q \land A = ar(q) \} \)

Each fragment/trace has associated two effect signatures, \( E_{\textit{in}} \) and \( E_{\textit{out}} \), and shows how concretely an actor answers questions of the \( E_{\textit{in}} \) kind (coming from some other actor), while possibly asking questions and receiving answers from yet another actor(s). In our example, the map application handles questions from the user, and is allowed to ask questions to a bus and a train service provider. A strategy \( \sigma \) is then some collection of fragments/traces with the same associated effect signatures; we would say that \( \sigma \) is of type \( E_{\textit{out}} ↠ E_{\textit{in}} \), written \( \sigma : E_{\textit{out}} ↠ E_{\textit{in}} \), whenever its fragments/traces answers \( E_{\textit{in}} \) using \( E_{\textit{out}}\).

Continuing our example, we have an effect signature

\[ ServiceProviderAPI = \{ (getConnections(a,b,d) : List(ID \times Price) ) \mid a,b \in City \land d \in Date \} \]

where each question/answer \( getConnections(a,b,d) = [(i_1,p_i),...,(i_n, p_n)] \) is interpreted as “On day \( d \), for travel from \( a \) to \( b \), we have connections \( i_1,...,i_n \), each \( i_j \) for the price \( p_j \)” (where all the \( i_j \)s are some global/nation-wide identifiers of connections).

Another effect signature would be

\[ MapAPI = \{ getCheapest(a,b,d) : ID \times Price \mid a,b \in City \land d \in Date \} \, . \]

Then, using two services providers

\[ trainProvider,busProvider : ∅ ↠ ServiceProviderAPI \, , \]

we would have a strategy

\[ mapApp : ServiceProviderAPI \oplus ServiceProviderAPI ↠ MapAPI \]

implementing the map application.

But what is the \( \oplus \) sign there?

Fragments Can Be Composed

Given two effect signatures \(E_1,E_2\), the expression \( E_1 \oplus E_2 \) denotes a new, combined effect signature:

\[ E_1 \oplus E_2 = \{ ((1, q) : A) \mid (q : A) \in E_1 \} \cup \{ ((2, q) : A) \mid (q : A) \in E_2 \} \, . \]

A fragment asking questions in this combined effect signature can basically ask questions from the first or the other signature; a fragment answering questions from the combined signature has to be able to answer questions from both signatures.

Also, one can easily implement \( E_1 \) using \( E_1 \oplus E_2 \), by simply wrapping the questions and returning the answers, as in the trace \( q ↣ ((1,q) ⇝ a) ↣ a ↣ q' ↣ ((1,q') ⇝ a') ↣ a' ↣ ... \). The result of this approach is the strategy \( π_i : E_1 \oplus E_2 ↠ E_i \), for \( i \in \{ 1, 2\} \). The authors come up with more strategy combinators, like

\[ \langle \_ , \_ \rangle : (E ↠ F_1) → (E ↠ F_2) → (E ↠ F_1 ⊕ F_2) \]

for flat composition, and

\[ \odot : (F ↠ G) → (E ↠ F) → (E ↠ G) \]

for layered composition, and

\[ \_^* : (E ↠ F) → (E ↠ F) \]

for iteration.

Concluding remarks

The paper also offers tools for reasoning of refinement between strategies, and more. The authors use an example of a multi-lingual program combining C and assembly code, communicating via shell pipe. Unfortunately, they do not make use of the UML Sequence Diagrams analogy, which made the paper harder to read, at least for me.