A semantics for ini-module activation
We describe the semantics of an object-creation expression by showing how the ini-modules defined in the hierarchy of a class are activated, focusing on how activation depends on the parameters submitted to the object-creation expression. This semantics is an adaptation of that of Magda. Note that here we abstract away from the semantics of the instructions and expressions contained in the body of an ini-module, i.e., from the semantics of the instructions I1 and I2 and of the expressions expr in the instruction next_ini and in new.
Figure 2.1 shows the definition of a function IModules. This function starts from the class C and goes up the hierarchy class by class, collecting the ini-modules. We assume that there exists a total order among the ini-modules in a given class (note in the figure that IM is indeed a sequence and not a set). We justify these design choices in Section 2.5.1 and we discuss how to impose a total order among the ini-modules of a class in Chapter 3.
Figure 2.2 shows the definition of a function RModules. This function, when applied to a class C, collects all the required ini-modules defined through the hierarchy of C. With RModules(mod, i) we denote the result of RModules obtained by restricting the sequence mod to its first i ini-modules.
Figure 2.3 shows the definition of a function activatedIniModules. Given a class C and a set of parameters p, this function computes the sequence of ini-modules that will be executed (we will define later a concept of activable ini-module).
Given an object-creation expression new C[id := expr], we want to describe its semantics by showing how the ini-module activation process takes place. Which ini-modules get activated depends on (1) the parameters supplied in the object-creation expression, and (2) the ini-modules defined in the hierarchy of class C. The semantics of the given object-creation expression is thus described by the call activatedIniModules(C, id).
Definition 6 (Parameter map). We call parameter map a map < p, v > where p is a parameter set and v is the set of the respective values. This map is initially populated from the parameters supplied in an object-creation expression new C[id := expr]: the keys are the id and the values are obtained by evaluating the corresponding expr. Hereinafter we will sometimes refer only to the parameter set p, when not interested in the values.
The function activated is defined in Figure 2.4 through a recursive function activated0 that has one more parameter for the current step i, which represents the index, in the sequence mod, of the current ini-module to be considered for activation. This function is based on five rules: each execution step considers the current ini-module mod[i] in the sequence, and at each computation step one of the rules is applied based on whether the current ini-module is activable or not.
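The hierarchy walk performed by IModules and RModules can be pictured with a minimal Python sketch. The names `IniModule`, `ClassDef`, `i_modules` and `r_modules` are hypothetical, and two points are assumptions of the sketch rather than statements of the model: the ini-modules of a subclass are placed before those of its superclass, and the ini-module (coordX, coordY)() is marked as required only for illustration. The definitions in Figures 2.1 and 2.2 remain the reference.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class IniModule:
    """Hypothetical record: a signature plus a 'required' flag."""
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    required: bool = False


@dataclass
class ClassDef:
    """Hypothetical class description with a totally ordered module list."""
    name: str
    ini_modules: List[IniModule]
    superclass: Optional["ClassDef"] = None


def i_modules(cls: ClassDef) -> List[IniModule]:
    """Start from cls and go up the hierarchy, collecting ini-modules."""
    collected: List[IniModule] = []
    current: Optional[ClassDef] = cls
    while current is not None:
        collected.extend(current.ini_modules)  # assumed: subclass first
        current = current.superclass
    return collected


def r_modules(mods: List[IniModule], i: Optional[int] = None) -> List[IniModule]:
    """Required ini-modules of mods, optionally restricted to the first i."""
    prefix = mods if i is None else mods[:i]
    return [m for m in prefix if m.required]


rect = ClassDef("Rectangle2D", [
    IniModule(("point",), ("coordX", "coordY")),
    IniModule(("coordX", "coordY"), (), required=True),  # flag is illustrative
])
colored = ClassDef("ColoredRectangle2D", [
    IniModule(("red", "green", "blue"), ()),
], superclass=rect)

mods = i_modules(colored)            # sequence for ColoredRectangle2D
required_prefix = r_modules(mods, 2)  # required modules among the first two
```

Keeping the result as a list rather than a set mirrors the remark that the collected ini-modules form a sequence.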
Checks on well-formedness of ini-modules
We recall that the parameter map initially contains the parameters supplied to new; it is then extended with the output parameters of each activated ini-module and reduced by the input parameters of the activated ini-modules (see Section 2.3). We also highlight that all the checks discussed in this section are correctness checks, and they are treated as such.
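As a rough illustration of how the parameter map evolves, the following Python sketch applies one activation step at a time. The function name `update_parameter_map` and the concrete values are hypothetical; the actual algorithm is the one given in Section 2.3.

```python
from typing import Dict, Iterable


def update_parameter_map(param_map: Dict[str, object],
                         inputs: Iterable[str],
                         produced: Dict[str, object]) -> Dict[str, object]:
    """Return the map after one activation: drop inputs, add outputs."""
    consumed = set(inputs)
    remaining = {k: v for k, v in param_map.items() if k not in consumed}
    remaining.update(produced)
    return remaining


# Map populated from new Rectangle2D[point := ...] (value is illustrative).
params = {"point": (3, 4)}

# Activating (point)(coordX, coordY) consumes point and produces coordinates.
params = update_parameter_map(params, ["point"], {"coordX": 3, "coordY": 4})
after_first = dict(params)

# Activating (coordX, coordY)() consumes both and produces nothing.
params = update_parameter_map(params, ["coordX", "coordY"], {})
```

After the second step the map is empty, which is the situation the activation process expects once the last ini-module has been considered.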
Check 1 (All output parameters must be assigned). In Magda, all output parameters of an ini-module must be assigned in its body, in order to be ready to be consumed by other ini-modules. In Pharo, this check is conducted at runtime.
Check 2 (Output parameters must correspond to input parameters). In Magda, each output parameter of an ini-module must be an input parameter of exactly one ini-module that is considered for activation afterwards. Requiring that it be an input of at least one ini-module avoids the production of oversupplied parameters. Requiring that it be an input of at most one ini-module is enforced by imposing that each parameter name is unique, with the aim of giving each parameter an unambiguous semantics.
In Pharo, we relaxed this constraint:
• Regarding the at least direction, we decided to postpone its verification from the creation of a new ini-module to the execution of an object-creation expression. This means that we tolerate the definition of an ini-module producing output parameters that are not taken as input by any ini-module considered for activation afterwards, but we raise an error when such an ini-module is activated. Note that the activation algorithm fails when the parameter map is still not empty after trying to activate the last ini-module of the sequence (see lines 14-27 of Algorithm 3). An ini-module violating the at least direction of the constraint therefore naturally leads to an error when activated, so its verification is embedded in our algorithm. In the future, a sanity check could be implemented to verify in advance that a set of ini-modules is consistent with respect to the at least direction of the constraint.
• Regarding the at most direction, we discarded it because we believe that it is the developer's responsibility to decide whether a certain parameter name may have more than one meaning, by appearing as an input parameter of more than one ini-module. However, we discussed in Section 2.2 that in such cases a parameter renaming preserving both the activation semantics and the ordering of the ini-modules is always possible. Using the same parameter name with more than one meaning should therefore be considered a bad programming practice. A quality check could be implemented in the future to verify whether the at most direction of the constraint is satisfied.
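The deferred verification of the at least direction can be sketched in Python as follows. This is not Algorithm 3: the function `run_activation`, the error class and the parameter name `label` are hypothetical, and the activation test used inside the loop is a simplified reading of Checks 4 and 5.

```python
from typing import List, Tuple

Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]  # (inputs, outputs)


class LeftoverParametersError(Exception):
    """Hypothetical error: parameters remain after the last ini-module."""


def run_activation(modules: List[Signature], supplied: List[str]) -> List[Signature]:
    """Try each ini-module in order; fail if parameters are left over."""
    params = set(supplied)
    activated = []
    for inputs, outputs in modules:
        fully_applied = set(inputs) <= params      # all inputs available
        oversupplies = bool(set(outputs) & params)  # an output already present
        if fully_applied and not oversupplies:
            params = (params - set(inputs)) | set(outputs)
            activated.append((inputs, outputs))
    if params:  # map still not empty after the last ini-module
        raise LeftoverParametersError(sorted(params))
    return activated


# 'label' is produced but never consumed: tolerated at definition time.
modules = [
    (("point",), ("coordX", "coordY", "label")),
    (("coordX", "coordY"), ()),
]

no_activation = run_activation(modules, [])  # faulty module never activated
```

Calling `run_activation(modules, ["point"])` activates the faulty ini-module and ends with `label` still in the map, so the sketch raises `LeftoverParametersError`.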
Check 3 (Ini-module signature uniqueness). Constraint 1 states that it must be forbidden to have two ini-modules with the same signature, i.e., two ini-modules with the same input and output parameters, in the same class. In Magda, since each parameter name is unique, this constraint is automatically satisfied. In Pharo, an error is raised at runtime if the class designer tries to define a new ini-module with the same signature as another one already defined in the same class.
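A per-class registry that rejects duplicate signatures could look like the following Python sketch. `IniModuleRegistry` and `DuplicateSignatureError` are hypothetical names, and comparing parameter lists as sets (so that their order does not matter) is an assumption of the sketch.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Signature = Tuple[FrozenSet[str], FrozenSet[str]]  # (inputs, outputs)


class DuplicateSignatureError(Exception):
    """Hypothetical error raised when a signature is defined twice."""


class IniModuleRegistry:
    """Hypothetical store of the ini-module signatures of one class."""

    def __init__(self) -> None:
        self.signatures: Set[Signature] = set()

    def define(self, inputs: Iterable[str], outputs: Iterable[str]) -> Signature:
        # Assumption: signatures are compared as sets of parameter names.
        signature = (frozenset(inputs), frozenset(outputs))
        if signature in self.signatures:
            raise DuplicateSignatureError(signature)
        self.signatures.add(signature)
        return signature


registry = IniModuleRegistry()
registry.define(["point"], ["coordX", "coordY"])
# Same outputs but different inputs: a distinct signature, so it is accepted.
registry.define([], ["coordX", "coordY"])
```

A third call such as `registry.define(["point"], ["coordY", "coordX"])` would raise `DuplicateSignatureError` in this sketch.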
Check 4 (On fully applied ini-modules). In Magda, given an object-creation expression, an ini-module to be activated must have all its input parameters matched with a subset of the parameter map. In Pharo this constraint is retained and checked at runtime: any ini-module with an input parameter not in the current parameter map is ignored, i.e., not activated (see Definition 7). In both Magda and Pharo a non-fully-applied ini-module is never activated. The main difference is that in Magda this is checked statically, i.e., the developer must remove this anomaly from the code, while in Pharo it is tolerated. From the point of view of a dynamically-typed language this is reasonable, given the inherently prototypical nature of such languages, as the developer might, sooner or later, add other initialization choices that make that ini-module fully applied and thus activated.
Check 5 (On oversupplied parameters). In Magda, given an object-creation expression, an ini-module can be activated only if none of its output parameters is already in the parameter map. This prevents having more than one way to compute a certain output parameter, and thus avoids ambiguity.
In Pharo this constraint is retained and checked at runtime: any ini-module that would add a produced parameter already present in the parameter map is ignored, i.e., not activated (see Definition 7). In both Magda and Pharo an ini-module that outputs oversupplied parameters is never activated. The main difference is that in Magda the developer must remove this anomaly from the code, while in Pharo it is tolerated. In fact, in Pharo, the first-found ini-module computing an output parameter is activated, preventing the next ones computing the same output parameter from being activated (unless the parameter is consumed and re-added to the parameter map in between). This way the developer has the responsibility to put the modules in the right order to get the desired effect. Regarding this design choice, an observation similar to the one made for Check 4 applies.
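Checks 4 and 5 together suggest a simple runtime predicate, sketched below in Python. The function name `is_activable` is hypothetical, and testing the outputs against the map as it stands before the inputs are consumed is an assumption; Definition 7 remains the reference.

```python
from typing import Iterable, Set


def is_activable(inputs: Iterable[str], outputs: Iterable[str],
                 params: Set[str]) -> bool:
    """Runtime reading of Checks 4 and 5 for one ini-module."""
    fully_applied = set(inputs) <= params            # Check 4
    no_oversupply = set(outputs).isdisjoint(params)  # Check 5
    return fully_applied and no_oversupply


point_module = (("point",), ("coordX", "coordY"))
default_module = ((), ("coordX", "coordY"))

# new Rectangle2D[point := ...]: the first-found producer is activable.
params = {"point"}
first_ok = is_activable(*point_module, params)

# After its activation, point is consumed and the coordinates are in the map,
# so the default producer of the same outputs is ignored.
params = (params - set(point_module[0])) | set(point_module[1])
default_blocked = not is_activable(*default_module, params)

# With no supplied parameters, only the default producer is activable.
only_default = (not is_activable(*point_module, set())
                and is_activable(*default_module, set()))
```

The example shows why the order of the sequence matters: whichever producer of coordX and coordY is met first while activable prevents the later ones from being activated.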
Check 6 (On required and optional modules). In Magda, a static check ensures that all the ini-modules declared in mixins as required are activated by the supplied set of input parameters. Moreover, in FJMIP each class must declare one and only one required ini-module, initializing all the fields of the class (this condition is verified by a static check). This ensures that, at the end of the initialization process, no field remains uninitialized.
In Pharo a property as strict as the one guaranteed by FJMIP is not desirable, as lazy initialization (see Section 1.1.3) is a powerful and omnipresent initialization mechanism. However, required ini-modules give the class designer a mechanism to enforce a complete initialization of the state of the class, when needed. In Pharo, a runtime error is raised when a required ini-module is not activated. In addition to the checks originally provided by Magda, in Pharo we also added the following check.
Check 7 (No field access in ini-module bodies). While in Magda it is possible to access field values in ini-module bodies, in Pharo we decided to prohibit field access because it may introduce implicit dependencies between ini-modules, which can cause problems during software maintenance. In fact, the approach of ini-modules differs greatly from that of traditional constructors. A constructor is responsible for fully initializing a newly created instance of a class, while an ini-module represents only a piece of the whole initialization protocol. As a consequence, we think that the way all these pieces combine should be made as explicit as possible, to facilitate code maintenance. Hence, we decided to force all the dependencies between ini-modules to be explicit in their signatures. However, this point needs further investigation by experimenting with Pharo's code bases.
Class invariants
A class invariant is an assertion φ describing a property which must hold for all instances of a class. Class invariants must be established by the initialization and maintained between calls to public methods. The class invariant constrains the state stored in the object. As stated by Meyer, object creation may be seen as the operation that ensures that all instances of a class start their lives in a correct mode, one in which the invariant is satisfied. However, in spite of its name, an invariant does not need to be satisfied at all times: at some intermediate stages, the invariant may not hold. Concerning object creation, the important thing is that the invariant is established at the end of the evaluation of an object-creation instruction. Let us explore more thoroughly what this means. Usually, an object-oriented language initializes the fields of an object based on their types (for example, the default value for a boolean field may be false). Meyer calls creation procedure each procedure of a class aimed at initializing newly created instances after the new instance has been created with all its fields initialized with their default values. Moreover, following design by contract, a creation procedure may require a precondition to ensure that a new instance satisfying the invariant is successfully created. Putting all the pieces together, Meyer says that every creation procedure of a class C, when applied to arguments satisfying its precondition in a state where the fields have their default values, must yield a state satisfying φ.
Class invariants are inherited, that is, the invariants of all the ancestors of a class apply to the class itself. The class invariant for a class consists of any invariant assertions imposed locally on that class, logically "and-ed" with all the invariant clauses inherited from the ancestors of the class. From the point of view of the initialization protocol, this means that a subclass may strengthen the invariant inherited from its ancestors, but it cannot weaken it. This follows from the Liskov substitution principle: let φ(x) be a property provable about objects x of type T. Then φ(y) should be true for objects y of type S where S is a subtype of T.
Let us now consider ini-modules. When we say that ini-modules are modular, we mean that they differ from traditional constructors because each ini-module implements only a portion of the initialization of an object, whereas a constructor performs a full initialization of an object. In other words, a traditional constructor is in a 1:1 relation with an object-creation expression, and it is a creation procedure in Meyer's sense. Ini-modules, instead, are in a 1:n relation with an object-creation expression, and the creation procedure in Meyer's sense coincides with the function CreateInstance which we presented in Section 2.3. As a consequence, with ini-modules we have only one creation procedure, the function CreateInstance, whose actual behaviour is determined by the ini-modules. Therefore, following Meyer, we want to design the ini-modules for a class C in such a way that the function CreateInstance, when applied to arguments satisfying its precondition (ProperParams) in a state where the fields have their default values, yields a state satisfying φ (see Section 2.3).
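The "and-ing" of inherited invariant clauses can be illustrated with a small Python sketch. The classes `Rectangle` and `Square`, the method name `local_invariant` and the helper `holds_invariant` are all hypothetical and serve only to show the conjunction along the hierarchy.

```python
def holds_invariant(obj) -> bool:
    """Conjunction of the local invariants of every class in the hierarchy."""
    for klass in type(obj).__mro__:
        # Look only at the clause declared locally in each class.
        local = klass.__dict__.get("local_invariant")
        if local is not None and not local(obj):
            return False
    return True


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def local_invariant(self):
        return self.width >= 0 and self.height >= 0


class Square(Rectangle):
    def local_invariant(self):
        # Strengthens the inherited invariant; it does not replace it.
        return self.width == self.height


valid_square = holds_invariant(Square(2, 2))
negative_square = holds_invariant(Square(-1, -1))  # inherited clause fails
uneven_square = holds_invariant(Square(2, 3))      # local clause fails
plain_rectangle = holds_invariant(Rectangle(2, 3))
```

Because every clause along the hierarchy is checked, a subclass can only add conditions: `Square(-1, -1)` satisfies its own clause but is still rejected by the clause inherited from `Rectangle`.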
Specifying a partial ordering and inferring a linear extension
Specifying a total ordering of the ini-modules of a class can be an unnecessary and tedious task for two reasons:
1. Some order relations between ini-modules recur programmatically. Look for example at Figure 3.1: the ini-modules (angle, rad)(coordX, coordY), (point)(coordX, coordY) and ()(coordX, coordY) must all precede the ini-module (coordX, coordY)(). In the same way, in Figure 3.2 the ini-module (c, m, yc, k)(red, green, blue) must precede the ini-module (red, green, blue)(). This is because an ini-module outputting parameters that are input parameters of another ini-module should usually precede it. Another recurring order relation is that between the ini-module ()(coordX, coordY) and the ini-modules (angle, rad)(coordX, coordY) and (point)(coordX, coordY), which must both precede it. This is because ini-modules producing default values (usually with an empty set of input parameters) should come after the other ini-modules producing the same output parameters.
2. For some pairs of ini-modules, their relative ordering is irrelevant. Consider for example the ini-modules (aWidth)() and (aHeight)(): initializing first the width and then the height, or vice versa, produces the same result.
To take into account the previous two points, we introduce two kinds of order constraints. Each order constraint states that an ini-module must be considered for activation before or after another one.
Default order constraints. We define a pair of default order constraints between two ini-modules A and B that apply by default when the signatures of the ini-modules exhibit particular patterns. This spares the class designer from having to specify order relations that recur programmatically between ini-modules. In particular, we introduce two rules, Rule 1 having priority over Rule 2 (i.e., Rule 2 is tried only when Rule 1 does not apply):
• Rule 1: if A outputs at least one parameter that is consumed by B then A is tried first;
• Rule 2: if A takes as input parameters a strict subset of B’s input parameters, and A and B share at least one output parameter, then B is tried first.
Rule 1 captures the order relation between two ini-modules where the second consumes at least one parameter that is output by the first. Rule 2 establishes that, when two ini-modules share at least one output parameter, the one requiring more input data must be tried first, as it is probably more precise and increases the chance that all parameters will be consumed before the end of the activation process.
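The two default rules can be written down as a small Python function, given here as a sketch. The names `sig` and `default_first` are hypothetical. The case in which Rule 1 would apply in both directions is not covered by the text above, so the sketch arbitrarily returns no constraint for it.

```python
from typing import FrozenSet, Iterable, Optional, Tuple

Signature = Tuple[FrozenSet[str], FrozenSet[str]]  # (inputs, outputs)


def sig(inputs: Iterable[str], outputs: Iterable[str]) -> Signature:
    return (frozenset(inputs), frozenset(outputs))


def default_first(a: Signature, b: Signature) -> Optional[Signature]:
    """Return the ini-module to try first, or None if no default rule applies."""
    a_in, a_out = a
    b_in, b_out = b
    # Rule 1: a producer is tried before a consumer of its outputs.
    a_feeds_b = bool(a_out & b_in)
    b_feeds_a = bool(b_out & a_in)
    if a_feeds_b and b_feeds_a:
        return None  # mutual dependency: not covered by the text (assumption)
    if a_feeds_b:
        return a
    if b_feeds_a:
        return b
    # Rule 2: with a shared output, the module with strictly more inputs first.
    if a_out & b_out:
        if a_in < b_in:  # strict subset
            return b
        if b_in < a_in:
            return a
    return None


polar = sig(["angle", "rad"], ["coordX", "coordY"])
default = sig([], ["coordX", "coordY"])
store = sig(["coordX", "coordY"], [])
width = sig(["aWidth"], [])
height = sig(["aHeight"], [])

rule1 = default_first(store, polar)      # polar feeds store
rule2 = default_first(default, polar)    # polar needs more inputs
unordered = default_first(width, height)  # no default constraint
```

Returning `None` for (aWidth)() and (aHeight)() reflects the observation that the relative order of some pairs is irrelevant.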
In the rectangle example shown in Figures 2.5 and 2.6, the following default constraints apply:
• according to Rule 1, ini-modules Rectangle2D>>(angle, rad)(coordX, coordY), Rectangle2D>>(point)(coordX, coordY) and Rectangle2D>>()(coordX, coordY) must be considered for activation before ini-module Rectangle2D>>(coordX, coordY)();
• according to Rule 2, ini-modules Rectangle2D>>(angle, rad)(coordX, coordY) and Rectangle2D>>(point)(coordX, coordY) must be considered for activation before ini-module Rectangle2D>>()(coordX, coordY);
• according to Rule 1, ini-module ColoredRectangle2D>>(c, m, yc, k)(red, green, blue) must be considered for activation before ini-module ColoredRectangle2D>>(red, green, blue)().
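Once such constraints are known, a total order compatible with them (a linear extension) can be obtained by topological sorting. The Python sketch below uses the standard library `graphlib.TopologicalSorter` as a stand-in; it is not the algorithm of Section 3.2.2, and it encodes only the Rectangle2D constraints listed above as plain strings.

```python
from graphlib import TopologicalSorter

# (before, after) pairs taken from the Rectangle2D constraints listed above.
before_after = [
    ("(angle, rad)(coordX, coordY)", "(coordX, coordY)()"),
    ("(point)(coordX, coordY)", "(coordX, coordY)()"),
    ("()(coordX, coordY)", "(coordX, coordY)()"),
    ("(angle, rad)(coordX, coordY)", "()(coordX, coordY)"),
    ("(point)(coordX, coordY)", "()(coordX, coordY)"),
]

sorter = TopologicalSorter()
for before, after in before_after:
    # add(node, *predecessors): 'before' must come earlier than 'after'.
    sorter.add(after, before)

# One linear extension of the partial order; a cycle would raise CycleError.
order = list(sorter.static_order())
```

In any order produced this way, ()(coordX, coordY) comes after both explicit producers and (coordX, coordY)() comes last, while the relative position of (angle, rad)(coordX, coordY) and (point)(coordX, coordY) is left to the sorter, since no constraint relates them.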
Table of contents:
1 Limits of Pharo’s initialization approach
1.1 Initializing objects in Pharo and related problems
1.1.1 The initialize method
1.1.2 Instance-creation methods
1.1.3 Lazy initialization
1.2 The number of constructors explosion
1.2.1 Multiple initialization options
1.2.2 Optional initialization
1.2.3 Problems with subclassing
2 The ini-module model
2.1 What is an ini-module and some definitions
2.2 A semantics for ini-module activation
2.3 The algorithm
2.4 Note on I2 instructions
2.5 Discussion on the model
2.5.1 Design choices
2.5.2 Checks on well-formedness of ini-modules
2.5.3 Class invariants
3 The ordering algorithm
3.1 Directly specifying a total ordering
3.2 Specifying a partial ordering and inferring a linear extension
3.2.1 Formal definition of the partial order
3.2.2 Calculating a linear extension of the partial order
4 Pharo implementation
4.1 Ini-modules in Pharo
4.1.1 Extending the initialization protocol in subclasses
4.1.2 How to specify explicit constraints
4.1.3 Graphical user interface for ini-modules
4.2 Ini-modules as methods
4.2.1 Drawbacks of the discussed approach
5.1 Solving the initialization problems of Pharo
5.2 Solving the number of constructors explosion
5.2.1 Multiple initialization options
5.2.2 Optional initialization
5.2.3 Problems with subclassing
6 Related work
6.1 Common Lisp / CLOS
6.2 Constructors and methods with named and optional parameters
6.3 Other approaches to initialization
6.4 Other works related to initialization
7 Future work
7.1 Extending our analysis to other dynamic languages
7.2 Maximising ini-modules reuse
7.4 Going beyond the limitations of the prototype