Formal Semantics of Statecharts
As shown in the precedent chapter, Statecharts as defined by Harel implies a quite involved semantics. Soon after its invention, many researchers dived into the question of defining a formal semantics and it turned out to be harder than excepted. As a consequence, many Statecharts variants appearead since then . However, none of them formalizes the Statecharts semantics as implemented in the industrial tool STATEMATE, till the work of Mikk et al.. They introduce in  the official semantics as supported in STATEMATE by closely following .
The STATEMATE semantics defines a set of finite state labels, , and the kind of states, T Y P E , AN D j OR j BASIC. From these sets, it defines the mathematical structure StateTree, which consists of the following parts:
• root, a special state that it is intended to contain the actual Statechart.
• init, an initial state.
• : ! P( ), the state hierarchy partial function that maps a state to its contained substates.
• : ! T Y P E, the map kind partial function.
Above definitions provide enough information to formalize the state hierarchy and to introduce some correctness conditions. For instance, Definition 3.1 is an example of a correct hierarchy of states with respect to the root state as defined in .
Definition 3.1. The root state is of kind OR and no state may contain it [ dom( ) n ran( ) = frootg ^ (root) = OR.
A Structured Network of Statecharts
The UML class abstractions enclose Statecharts and connect them in a well-defined and struc-tured manner. Therefore, we refer to the model as a structured network of Statecharts.
The class meta-model allows the designer to create models that can describe many possible configurations. The class allows the programmer to create a well-defined context to the state machine in order to communicate with the external world (including other classes) through its associations.
The UML Statechart is an extension of the Harel’s Statechart. Its structure is presented in a model based way in Figure 3.5. As we can observe, state kinds are abstracted into a common concept, Vertex. A Vertex may be a basic state if it does not contain any region, an OR state if it contains only one region and an AND state if it contains more than one region. States containing at least one region are called composite. From Vertex, the state inherits incoming and outgoing transitions that complete the HSM structure. Special states such as initial, final, history and join/fork, are considered Pseudostates.
Similar to Harel’s syntax of transitions, the transition is composed by a trigger, a constraint (or condition) and an opaque representation of an action called Behavior. Under this opaque type, the designer is free to choose his or her preferred language and it is frequently referred to as the Action Language of the state machine.
In , the OMG introduces an informal semantics of Statecharts where they specify the event processing algorithm for state machines, called the Run-To-Completion (RTC) step. The RTC informally describes the actions to perform in order to determine the set of enabled transitions to fire and how the compound transition is executed afterwards.
Contrary to traditional Statecharts, the state machine handles the arrival of events one at a time and it is not based on broadcast communications. That is, it exposes a message oriented, First In-First Out (FIFO)-like communication. Moreover, there is no difference between internal or external events, they are all considered external.
Following almost the same policies as Harel for the computation of enabled compound transi-tions, the actions of each fireable transition are executed sequentially — and not in-parallel as before. In UML Statecharts, the transition with higher priority is the more nested one (inside-out). At a given state configuration, the RTC step specifies the following actions to perform:
1. From the transition source state, all substates are exited and their corresponding exit ac-tions are executed. Note that there is no internal events generated here.
2. The chain of state exits continues until the first region that contains, directly or indirectly, both the main source and main target states is reached. The region that contains both the main source and main target states is called their least common ancestor, or lca, and it constitutes the scope of the transition.
3. The target configuration of states is entered and theirs corresponding actions are executed starting with the outermost one.
A complete formal specification of UML is a challenging work. The OMG undertakes it on a subset of UML called foundational UML (fUML) . However, their semantics does not cover the formalization of state machines, neither its communications. They map models to a particular “surface” language, such as Java, in order to concretize theirs semantics.
We found multiple formalizations of UML state machines in the literature [1, 22–25]. The most accomplished work to the best of our knowledge is due to Liu et al.. He introduces its own abstract syntax for HSM closely following the UML state machine specification. For instance, the state is a tuple of the following form: s = (r;btdef; en; ex; do; en;cex;cex;ccr;b sm; bt)
• rbis the set of regions.
• tdef, en, ex and do are the set of deferred events, the entry, exit and do behaviors.
• enc and exc are the set of entry point and exit point pseudostates.
• crb is the set of connection point references. sm is the owner state machine.
• btis the set of internal transitions.
The region is the cartesian product of vertex and transitions r , (v;bbt).
Compilation of UML Statecharts
The Unified Modeling Language (UML) language is uniquely designed as a modeling/specifi-cation language and it is not tied to any particular compilation method or optimizing flow. UML favors model-based design at all levels of the development process. Furthermore, the standard left many semantic variation points open to specific implementations . Most Model-Driven Development (MDD) frameworks based on or inspired from UML, such as Rhapsody  or GASPARD , support model-driven code generation and use object-oriented design patterns to produce executable models.
MDD tools propose different back-ends to produce executable code from UML models. Ex-ecutable implementations of UML Statecharts are frequently based on one of three different methods: state design pattern , the State Table Transition (STT)  and the nested switch case statements .
The most used technique for generating code from Statecharts relies on the state design pat-tern  shown in Figure 4.1. Basically, each state is translated into a class of the host language that handles a certain number of events depending on its outgoing transitions. If an event ar-rives, the context calls the virtual method handle(), which returns the next state according to the encoded transition triggered by such event.
The STT approach creates a table describing the relation between states and events. The boost C++ Statechart library  hides behind a heavy use of C++ metaprogramming a transition table to map efficiently states to transitions.
Finally, the Nested Switch Case method consists of, as its name implies, a set of nested switch case statements where the top ones filter the current state of the state machine and further nesting levels apply the final action following the received event.
The optimizations are frequently done by the backend-end compiler (C/C++, Java), which does not necessary know the original Statechart structure, and hence may miss interesting optimiza-tions . On this direction, we reviewed different approaches for model transformations and optimizations either before code generation or all along the optimizing and code generation process.
Certain research works handle optimization issues via Model-to-Model (M2M) transformations. There are a good variety of dedicated tools, most of them based on the Eclipse Framework, for M2M transformations such as ATL , Epsilon Transformation Language (ETL) , Query/View/Transformation (QVT)  and many others. The concept of M2M is to spec-ify the language transformations based on the language meta-model, within a meta-modeling framework. Figure 4.2 sketches the approach. Given a meta-model, we encode model trans-formations using a particular Domain-Specific Language (DSL) by directly referencing meta-model elements. The rules are then instantiated and applied at the model level.
In , Schattkowsky and Muller propose a set of rewriting rules in a M2M setting for a subset of the UML Statecharts, called Executable State Machines. We pointed out in Chapter 3 that UML Statecharts are independent of the underlying action language. Therefore, their proposed transformations are only related to the state machine structure, such as move entry/exit activities to input/output transitions, resolve conflicting triggers along the hierarchy, among others. For instance, Figure 4.3 shows the formal specification of a transformation that, given any composite state (ParentState), moves its incoming transitions (ParentT) to the state pointed by the initial pseudostate (FirstState). If not formally shown, the set of rules allow the authors to flatten the hierarchical state machine for direct execution. Even though the set of rewritting rules are generic, they lack of a concrete action language and cannot reason about the actions to be performed.
Synchronous Statecharts Compilation
In contrast to UML, synchronous Statecharts are strongly guided by a concrete language im-plementation (and its semantics as well). The code generation of synchronous languages is a well-known subject in the research community since the LUSTRE and ESTEREL programming languages .
SCCharts , presented in Chapter 3, is a safe imperative extension to synchronous languages with a Statechart-like visual representation. The compilation of SCCharts is decoupled into different intermediate representations:
• Extended SCChart: the front-end language.
• Normalized SCChart: the extended SCChart is reduced through a series of model-to-model transformations to primitive transitions only, which facilitates the mapping to the following representation.
• SC Graph: It is a directed graph that captures control dependencies in the form of a graph of basic blocks as well as data dependencies between parallel statements.
We show in Figure 4.6 the SCChart representations. For instance, note that the normalized one split transitions with more than one sequential statement into multiple states (transition Init ! W aitAB). As explained above, the SC Graph decouples the normalized SCChart into basic blocks or “scheduling units”. The green arrow denotes a concurrent data dependency. In order to guarantee a deterministic behavior, writes are scheduled before reads, which split g7 into two different scheduling units. The SC Graph relies on the SCL language introduced in Chapter 3 and it is from this language that the authors perform the software and hardware synthesis. SCL and its respective dataflow view make hardware synthesis almost straighforward. However, they need extra software support to generate the “tick” function that computes each signal following the chosen scheduling. Indeed, there is another representation slightly different from SCL called SCLP , which consists of a subset of the sequential C core.
The action language separates arithmetic operations into three basic actions: (1) start compu-tation, (2) receive the result and (3) store it. Point (1) represents a send primitive, (2) denotes a receive and (3) an update or assignment. Whereas the former two points are intrinsic to the state machine semantics, the latter is the additional most basic action to be defined. Figure 5.6 shows the structure of the proposed action language. We define the action, or behavior in UML terms, as a sequential composition of updates and sending primitives, separated by the token ‘:’. Receive and guard specifications are defined at the beginning of the transition followed by the keyword on and enclosed in square brackets, respectively.
As pointed out in Chapter 2, the data parallel approach is a promising technique to make effi-cient use of parallel hardware. We will show that its more elaborated extension called Nested-Data Parallelism can be represented in a model-driven manner. The implementation of such paradigm relies on a new data structure called parallel array in the context of Haskell Data-Parallel (HDP) . Similar to hHOEi2 objects, parallel arrays are composable to allow the expression of hierarchical or nested data parallelism.
The data parallel approach is about performing the same operation on a set of grouped elements, which in our context it is called broadcasting. We take advantage of multi-valued associations to define data parallel operations using index domains, defined between braces. Based on the model example of Figure 5.5, Listing 5.5 shows the luminance computation of all pixels in a data parallel fashion.
The above example introduces three new concepts: indices, index domains and indexed mes-sages. The definition is composed by a list of indexes and a set of constraints. The domain is equivalent to the closed range constraint 0 i pixels:len 1. The sending expression inside an index domain instantiates a set of indexed messages where each index value is taken from its enclosing domain.
Semantics of Actions
Let us first introduce some useful notations.
• We note sys[r7!o] the new system that maps reference r to object o and keeps everything else unchanged.
• We use to range over the hHOEi2 syntax sets defined in Appendix A.1.
• List concatenation
• The function update is noted as f0 = [f j r0 7!o], which is equivalent to build a new function f0(r) = if r0 = r then o else f(r).
• We also define the conditional function update f0 = [f j r0 7!o]p where p 2 B is a condition such that f0 = if p then [f j r0 7!o] else f
• We note projections with subscripts, e.g. Asys is the mapping of system sys 2 S.
• The binding extension operator B : ! ! is defined as B 0 = v:if v 2= dom( ) then 0(v) else (v)
The implementation of v 2= dom( 0) is not shown here. For completeness, we model partial functions with functor Partial(a) = a + Unde ned.2
• We will refer to an instance of a given syntax set S corresponding to the hHOEi2 grammar as S. For instance, state is an instance of hHOEi2.
State Machine Evaluation
We start by considering the non-indexed configuration model. A configuration indicates the current active states down the hierarchy of the Hierarchical State Machine (HSM). Therefore, it must model simple, composite and final conditions. The hierarchical state machine configura-tion is a functor K defined as follows: K(a) = a + a [K(a)] + End (6.3).
with injections for simple s : a ! K(a), composite c : a ! [K(a)] ! K(a) configurations and end = End. For instance, if we take configurations over state identifiers K(String) then one possible configuration of the state machine shown in Figure 6.3 is: k = c(’A’; [ s(’S1’); s(’T’)]) (6.4).
Configuration (6.4) indicates that the state machine is at composite state A where its left region is at state S1 and its right one at state T.
We define the transition semantics of configurations as a relation on the lifted set hstatei hstatei K = K(hstatei) S + S.
which we note . Using the transition relation on (6.2), we introduce the set of evaluationrules shown in Figure 6.4 where trni is the i-th transition out of s 2 hstatei and the next state configuration function n : hsmi ! hidi ! K(hstatei).
We defined two kind of inference rules and for each one of them we have two possibilities corresponding to the transition type to be evaluated. Rules KsExt and KsFinal handle simple configurations with external and final transitions, respectively. If applicable, KsExt evaluates transition trni and computes next configuration. KsFinal perform the same action on final tran-sitions and evaluates to end. Rules KcExt and KcFinal follows the same pattern for composite configurations. The key observation is that they evaluate the transition iff all its composite con-figurations have terminated ( end). Finally, KcPar evaluates parallel regions in case KcExt and KcFinal are not applicable.
Table of contents :
List of Figures
List of Tables
1.3 Objectives and Contributions
1.4 Thesis Organization
I State of the Art
2 Hierarchical Data Representations for Parallel Applications
2.1 The Data-Parallel Paradigm
2.1.1 The Functional Style
2.1.2 HTA: Hierarchically Tiled Arrays
2.2 Hierarchical Tasks
2.2.1 The Sequoia Approach
2.2.2 HiDP: Hierarchical Data Parallel Language
2.3 Low-level Trends
2.4 Model-Driven Representations
3 Semantics of Hierarchical State Machines
3.1 Harel’s Statecharts
3.2 Semantics of Statecharts
3.2.1 The STATEMATE Semantics
3.2.2 Formal Semantics of Statecharts
3.3 The Unified Modeling Language
3.3.1 A Structured Network of Statecharts
3.3.2 Informal Semantics
3.3.3 Formal Semantics
3.4 Synchrounous Statecharts
4 Code Optimization and Generation of High-Level Models
4.1 Compilation of UML Statecharts
4.1.1 Model-to-Model Optimizations
4.1.2 Optimizing Beyond the Back-end Language
4.2 Domain Specific Languages Supporting Model Based Design
4.3 Synchronous Statecharts Compilation
5 hHOEi2 Language
5.2 Hierarchical State Machines
5.3 Modeling Arithmetics
5.4 Parallel Actions
5.6 Object Creation
5.7 Indexed Regions
5.9 Modeling Applications
6 hHOEi2 Formal Semantics
6.3 Semantics of Actions
6.3.1 Sequential and Parallel Composition
6.3.4 Indexed Actions
6.4 Semantics of Transitions
6.5 State Machine Evaluation
6.6 Indexed Configurations
6.7.1 Applyon Semantics
7 Intermediate Representation
7.4 Hierarchical State Machine
7.4.1 Parallel Statements
7.5 Translating hHOEi2
7.5.1 Compilation of States and its Transitions
7.5.2 Propagation of Index Domains
7.5.3 Defining Initiators
7.5.4 “all” Condition
7.5.5 Indexed Regions
8 The Compiler: Analysis, Optimization and Code Generation
8.1.1 Instantaneous Reaction
8.2 The Optimizing Compiler
8.3.1 DefUse Chain: Structured Analysis of Reachable Definitions
8.3.2 Message Dependency Analyzer
8.3.3 Transaction Selector
8.3.4 Scalar Constants
8.4.1 Index Sets
8.4.2 Dead Code Elimination
8.4.3 Dead Associations Elimination
8.4.4 Rewriting Broadcasts
8.4.5 Basic Block Fusion
8.4.6 Loop Fusion
8.4.8 Folding of Operational Transitions
8.4.9 Dead Wait Rewriter
8.4.10 Unboxing Types
9 Experimental Results
9.1 Code Generation Strategy
9.1.2 Code Generation
9.2 Exercising the Optimizing Flow
9.3 Metrics and Results
9.4 Application Development: The hHOEi2 Approach
9.5 Towards GPGPU Code Generation
9.5.1 Extending the Optimizing Chain
9.5.2 OpenCL Code Generation
10 Conclusions and Perspectives
10.1 Main Results and Contributions
10.2 A Look Back to the Thesis Motivations
10.3 Future Research Directions
A.2 Intermediate Representation