Multi-language Verification With UML Sequence Diagrams


Reasoning about software is hard. Reasoning about software written in multiple languages and communicating in a complex environment is even harder. The latter is exactly the topic of inquiry in Yu Zhang’s Unifying Compositional Verification and Certified Compilation with a Three-Dimensional Refinement Algebra. Despite the complicated name of the paper, it turns out that one can use UML sequence diagrams. Well, almost.

Modeling a map application with UML Sequence Diagrams

Consider a map application that allows users to search for transport routes using multiple transport providers. Such app can interact with a user and transport providers in various ways; an example interaction is shown in the illustration below. An example sequence diagram The map application may be implemented in many different ways: for example, the list of connections found so far may be stored in memory, in a file, or in a database; the application may also be designed to remember only the cheapest connection so far, if memory consumption is an issue. But when talking about the correctness of such a map application, we may actually not care about these implementation details and consider only the application’s externally observable behavior.

We could represent this behavior as a collection of all possible sequence diagrams in which the application takes place. That, however, would have two big problems: (1) we may not know in advance what other actors there may be in the world; and (2) we would have the information about other actors, which would be irrelevant. Instead, let us represent the application’s externally obserable behavior as the collection of fragments of these sequence diagrams, such that each fragment shows only the lifeline of map application and its interractions - without the lifelines of other actors - as in the following illustration. An example sequence diagram

A little detail is that not all collection of fragments are meaningful. In the paper, the authors put two restrictions on the collections: (1) prefix-closedness, and (2) internal determinism (as modelled by a coherence relation in the paper).

In the subsequent text, let’s have a textual notation for those fragments. A fragment that simply returns an answer a a for a question q q without communicating with other actors would be written as qa q ↣ a . A fragment that answers a question by first asking three other questions would be written as q(q1a1)(q2a2)(q3a3)a q ↣ (q_1 ⇝ a_1) ↣ (q_2 ⇝ a_2) ↣ (q_3 ⇝ a_3) ↣ a . A fragment that answers first question immediatelly, for second questions asks one question, and for third question asks two questions, would be written as qaq(q1a1)aq(q1a1)(q2a2)a q ↣ a ↣ q' ↣ (q'_1 ⇝ a'_1) ↣ a' ↣ q'' ↣ (q''_1 ⇝ a''_1) ↣ (q''_2 ⇝ a''_2) ↣ a'' . Notice how in the fragment, messages sent by the environment are interleaved with messages sent by the actor.

Modeling a Map Application, More Formally

In the referenced paper, fragments are called traces , and (reasonable) collections of traces are called strategies. To model such fragments/traces and collections/strategies, the authors use the notion of effect signature. An effect signature is a collection Q Q of questions together with a collection ar(q) ar(q) of possible answers for each question qQ q \in Q . An effect signature is typically written as the set {(q:A)qQA=ar(q)} \{ (q : A) \mid q \in Q \land A = ar(q) \}

Each fragment/trace has associated two effect signatures, Ein E_{\textit{in}} and Eout E_{\textit{out}} , and shows how concretely an actor answers questions of the Ein E_{\textit{in}} kind (coming from some other actor), while possibly asking questions and receiving answers from yet another actor(s). In our example, the map application handles questions from the user, and is allowed to ask questions to a bus and a train service provider. A strategy σ \sigma is then some collection of fragments/traces with the same associated effect signatures; we would say that σ \sigma is of type EoutEin E_{\textit{out}} ↠ E_{\textit{in}} , written σ:EoutEin \sigma : E_{\textit{out}} ↠ E_{\textit{in}} , whenever its fragments/traces answers Ein E_{\textit{in}} using Eout E_{\textit{out}}.

Continuing our example, we have an effect signature

ServiceProviderAPI={(getConnections(a,b,d):List(ID×Price))a,bCitydDate} ServiceProviderAPI = \{ (getConnections(a,b,d) : List(ID \times Price) ) \mid a,b \in City \land d \in Date \}

where each question/answer getConnections(a,b,d)=[(i1,pi),...,(in,pn)] getConnections(a,b,d) = [(i_1,p_i),...,(i_n, p_n)] is interpreted as “On day d d , for travel from a a to b b , we have connections i1,...,in i_1,...,i_n , each ij i_j for the price pj p_j ” (where all the ij i_j s are some global/nation-wide identifiers of connections).

Another effect signature would be

MapAPI={getCheapest(a,b,d):ID×Pricea,bCitydDate}. MapAPI = \{ getCheapest(a,b,d) : ID \times Price \mid a,b \in City \land d \in Date \} \, .

Then, using two services providers

trainProvider,busProvider:ServiceProviderAPI, trainProvider,busProvider : ∅ ↠ ServiceProviderAPI \, ,

we would have a strategy

mapApp:ServiceProviderAPIServiceProviderAPIMapAPI mapApp : ServiceProviderAPI \oplus ServiceProviderAPI ↠ MapAPI

implementing the map application.

But what is the \oplus sign there?

Fragments Can Be Composed

Given two effect signatures E1,E2E_1,E_2, the expression E1E2 E_1 \oplus E_2 denotes a new, combined effect signature:

E1E2={((1,q):A)(q:A)E1}{((2,q):A)(q:A)E2}. E_1 \oplus E_2 = \{ ((1, q) : A) \mid (q : A) \in E_1 \} \cup \{ ((2, q) : A) \mid (q : A) \in E_2 \} \, .

A fragment asking questions in this combined effect signature can basically ask questions from the first or the other signature; a fragment answering questions from the combined signature has to be able to answer questions from both signatures.

Also, one can easily implement E1 E_1 using E1E2 E_1 \oplus E_2 , by simply wrapping the questions and returning the answers, as in the trace q((1,q)a)aq((1,q)a)a... q ↣ ((1,q) ⇝ a) ↣ a ↣ q' ↣ ((1,q') ⇝ a') ↣ a' ↣ ... . The result of this approach is the strategy πi:E1E2Ei π_i : E_1 \oplus E_2 ↠ E_i , for i{1,2} i \in \{ 1, 2\} . The authors come up with more strategy combinators, like

_,_:(EF1)(EF2)(EF1F2) \langle \_ , \_ \rangle : (E ↠ F_1) → (E ↠ F_2) → (E ↠ F_1 ⊕ F_2)

for flat composition, and

:(FG)(EF)(EG) \odot : (F ↠ G) → (E ↠ F) → (E ↠ G)

for layered composition, and

_:(EF)(EF) \_^* : (E ↠ F) → (E ↠ F)

for iteration.

Concluding remarks

The paper also offers tools for reasoning of refinement between strategies, and more. The authors use an example of a multi-lingual program combining C and assembly code, communicating via shell pipe. Unfortunately, they do not make use of the UML Sequence Diagrams analogy, which made the paper harder to read, at least for me.