Reverse Engineering and modularity detection in computer programs

Get Complete Project Material File(s) Now! »

Antimodularity and models of explanation

After having defined this new type of computational emergence, which is due to the amount of computational hardness which can be manifested in certain cases by modularity detection algorithms, I try to draw some possible consequences of antimodular emergence on the possibility of scientifically explaining systems aﬀected by it. I examine two well-known models of scientific explanation: the functional-mechanistic, and the deductive-nomological (DN, henceforth). I then evaluate a more debated model of explanation, computational explanation, and another type of explanation which has been object of recent scrutiny, the mathematical-topological type of explanation, a form of explanation adequate to explain certain features of complex dynamical systems. I conclude that antimodular emergence aﬀects the feasibility of the first two types of explanation, as well as computational explanation, albeit diﬀerently, and leaves unaﬀected the possibility of topological explanation, even constituting an occasion for this kind of explanation.

Antimodularity and functional or mechanistic explanations

I claim that antimodularity negatively aﬀects mechanistic explanation, a fundamental form of explanation in biological sciences. A brief detour is in order here to describe what this form of explanation amounts to.
The term mechanistic explanation usually refers nowadays in philosophy to a relatively recent model of scientific explanation, put forth since the ’90s by several groups of philosophers of biology and of cognitive science working rather independently, the most prominent exponents of the two main lines of inquiry in this field being William Bechtel and his collaborators on one hand, and Carl Craver and his colleagues on the other31. Leaving for the moment aside the subtle diﬀerences between these two main conceptions of mechanistic explanation32, I base here on Bechtel & Abrahamsen (2005), which is a standard text in the topic. Bechtel and Abrahamsen (BA henceforth) give a definition of mechanism as:
A mechanism is a structure performing a function33 in virtue of its component parts, com-ponent operations, and their organization. The orchestrated functioning of the mechanism is responsible for one or more phenomena34.
The definition above defines a mechanism as what I have along this chapter called a complex system, that is, a system composed of interacting parts. The point to stress here is that there is a functional view involved: the global function, which represents the explanandum, is explained by describing the organization and interactions of the parts which, by means of their dynamical “orchestrated” functioning, produce the phenomenon. What is needed, according to BA, to explain a given phenomenon is then to first identify the parts and operations involved in its production. To this aim, the system as a whole must be subject to two operations, which BA call structural decomposition and functional decomposition: the first yields the set of elementary parts of the system, while the second, which in real-world science is often conducted separately from the first, identifies component operations. A third, desirable operation is localization, which consists in linking parts with the operations they perform. This way, a mechanistic explanation is given, according to BA. This low-level kind of explanation is not always the most desirable, and, as BA highlight, it is important that a hierarchy of mechanisms be considered, and that explanation be multilevel. According to BA, a mechanism may also involve multiple levels of organization, being often part of a higher-level, larger part of a larger mechanism: circumstances external to a given mechanism can be seen as larger overarching mechanisms, while components of a mechanism can be seen as mechanisms themselves, recursively composed of subparts.

Antimodularity and the deductive-nomological model

In the classic deductive-nomological (DN ) view of explanation, stemming from the seminal work of Carl G. Hempel and Paul Oppenheim36, explanation is seen as a logical deduction of the ex-planandum from the explanans, and what counts is validity and soundness of the deduction, with scarce attention directed to the intelligibility of the explanation: such a concern about under-standability of the explanation would have been considered, in the historical post-neopositivistic milieu of the time, an inappropriate trespassing of philosophy of science into the territory of pragmatical, or worse, psychological aspects of scientific explanation. In such a view, all that matters for an explanation is that it is a correct deduction. Explanation is seen in this model as depending on the possibility of prediction of the phenomenon by means of a scientific law. The explanation itself amounts to the description of the logical derivation of the explanandum from a group of premises constituted by a scientific law and a set of clauses representing initial conditions of the system to explain.
For what concerns deductive-nomological explanations, I show that, antimodularity entailing weak emergence in complex enough systems, DN explanation for a complex antimodular system could not be recurred to, because, if it could, that would mean that the system is predictable, and this is negated by the definition of weak emergence itself. To clarify: it is excluded by the definition of a weakly emergent phenomenon that it can be predicted by means of a law which, given the initial state, determines in which state the system is going to be at any given time, and that this law has a mathematical expression which can be analytically solved. As said, this is excluded by the very definition of weak emergence, which basically states that a weakly emergent phenomenon (in a discrete dynamical system) is one that cannot be predicted, and that it can be reached only by performing the step-by-step microsimulation at the system’s lowest level. Given that my notion of antimodularity, under the circumstances expressed in section 1.4.1, entails weak emergence, it turns out that a complex enough antimodular system cannot be predicted by an analytically solvable expression. So, no DN explanation of a complex antimodular system could be based on such an analytically solvable law.
Anyway, if we take into consideration a specific class of systems, namely cellular automata (CAs), then a weakly emergent process generated by a CA can in a way be explained by producing a possibly very long list of steps of its evolution, a list which can be seen as a list of deductive steps inside a formal logical system, in which the premises are constituted by the CA’s initial configuration and the CA-rule, which gets repeatedly applied first to the initial configuration and then to the intermediate configuration obtained at each deductive step. Given that every CA-rule is, by the Curtis–Hedlund–Lyndon theorem37, local and equally valid in any point of the CA’s lattice, the form of a CA-rule can in this regard be assimilated to the form of a physical law, which, as a law, holds universally. Accordingly, by this analogy, the production of this list of consecutive states of the CA could in a way be assimilated to a long DN explanation, which must consist of a logical deduction of the explanandum starting from given initial conditions and a law. Even in this case, human comprehension would be hindered by the potential length of the list, but, according to the theoretical position of post-neopositivistic advocates of the DN model of explanation, understanding is an inessential feature of explanations, and it is not required for a good DN explanation. So, in a way, weak emergence and, consequently, antimodularity, does not hinder DN explanation, at least in the case of CAs and other systems whose dynamics follow a universal rule. From this considerations are excluded classes diﬀerent from CAs, for example boolean networks in general, whose dynamics can follow locally changing rules which Hempel & Oppenheim (1948). are not universal. In these cases, the rule to be employed would be the global update rule, which, being nonlocal, is usually much more complex than a CA-rule, and, as a consequence, the list of deductions constituting the DN-style explanation of such systems would be even more unintelligible.

Antimodularity and topological explanations

I now consider consequences of antimodularity on the possibility of explaining a complex sys-tem by means of what Philippe Huneman calls topological explanation. Huneman describes topological explanation as “a kind of explanation that abstracts away from causal relations and interactions in a system, in order to pick up some sort of ‘topological’ properties of that system and draw from those properties mathematical consequences that explain the features of the sys-tem they target.”38. Inspired from mathematical topology, the topological properties of a system are the properties concerning in a way its “shape” which are invariant under possible contin-uous deformations of the system. These structural properties must not belong to a material system, but can be parts of an abstract, mathematical space. In my terminology, I would say that these topological properties do not concern a system, but a description of the system. Now, topological explanation consists in explaining features of the system by appealing not to causal events between its parts, like mechanical explanation would do, but by pointing to some topo-logical features of the system’s representation in this abstract space. This kind of explanation is not mechanistic, because it does not specify the particular interactions between parts which give rise to the dynamical phenomenon: the explanation is specifically based on a mathematical, topological feature, which does all the explanatory work.
Topological explanation could also be based on the presence of modular structure: This can happen for example when a topological explanation of the robustness of a network’s dynamics to local perturbations is given by mentioning that the network has a community structure39: this modular structure ensures that perturbations remain local or channeled, without spreading indis-criminately at the same speed on the whole network. On the contrary, intrinsic antimodularity could produce unbounded spread of perturbations across the network. All in all, not only it seems that antimodular emergence does not hamper topological explanation, but it actually turns out that intrinsic antimodularity or its absence could indeed allow for certain topological explanations.

High-level modularity as a condition for programming and scientific research

At least for a computer programmer, the claim by Robert Cummins, which we will encounter in section 9.2, that functional analysis has an explanatory capacity, is nothing new: a programmer, at least implicitly, is continually developing partial explanations of how the program under con-struction works by means of the organized execution of its instructions, and, at a higher level, of its subroutines or even higher modules. The very act itself of programming starts from the specification of the whole program (e.g, being “a word processor”), and development proceeds by analyzing, in a Cummins-like sense, this global function, which is the specification, into smaller subfunctions which together make up the implementation. In turn, each subfunction gets de-composed, if possible, in simpler subfunctions, and so on. The actual writing of the program, the act of writing the sequences of instructions composing each subroutine, usually bouncing up and down across hierarchical levels, analyzing global functions into smaller subroutines, imple-menting them and going back to decompose other high-level functions, or of composing them starting from the simplest subroutines and going up in the hierarchical levels, is almost unfeasible without a former, at least implicit, explanation, on the part of the programmer herself, of the whole system in hierarchical terms. This can be seen as a form of functional explanation the way Cummins understands it. It seems, then, that functional hierarchical explanation is necessary for computer programming, in computer science.
In empirical science, functional, multilevel explanation is probably essential not only after a theory or a model of a phenomenon has been devised, that is when explaining an already known phenomenon, but also in the making of a theory. Interventionistic accounts of causation like that of James Woodward51 see a causal relationship as holding between two entities when an hypothetical variation of the state of one entity, purposefully induced by an experimenter, that is, an intervention, systematically produces a variation in the state of the other entity: when this hypothetical circumstance holds, we can say that the two entities are causally related. A mechanistic account of explanation requires during scientific research the advancement and re-fining of the mechanical description of a mechanism by progressively discovering all the causal relations holding between its parts. To do this, the experimenter proceeds by intervening on each part separately, and seeing if some consequent variation occurs on other parts. But, to properly identify causal links, intervention on the state of a part requires a temporary, at least virtual, disruption of the structure of causal links going from other parts of the mechanism towards the part on which we are intervening: interventions on mechanisms require that the mechanism is temporarily modified by eliminating some of the connections between its parts. Woodward claims that the set of equations representing correctly a causal system must be modular, because oth-erwise, given that detection of causal relationships requires intervention on a part of the system by temporarily disrupting only the causal influence which bears on that certain part, were the system completely not modular, this precise disconnection of one causal path would disrupt not only the part of the equation interested by the intervention, but also other parts of the system. So, dynamical modularity is always present in a mechanism, at least at the lowest level, that of the preferred description.
But, the point is, if we want to redescribe a system mechanistically at higher-levels, we could certainly construe relations between high-level parts as prima facie “high level” causal relations. In that case, in order to proceed by intervention Woodward-style, modularity is needed also in the equations representing the system’s dynamics at these higher levels. All considered, this condition holds if the update function’s52 structure is hierarchically modular, and this in turn represents the fact that the system is functionally, and probably also, dynamically, and, quite likely, structurally, hierarchically modular.
The same utility of the presence and of the possible detection of modular structure shows up in the phase of discovery of complex networks, especially in cases in which the discovery of links between nodes requires a complex experimental work, like in gene regulatory networks and other networks of biological interest. Certain recently proposed algorithmic methods, like that in Clauset, Moore, & Newman (2008)53, could be of great aid in this kind of task because, based on detected hierarchical modularity in the already discovered partial network, they can probabilis-tically produce, with a good chance of success, a prediction about where in the network missing links should probably show up with further observation, thereby fruitfully guiding subsequent experimentation. Thus, it seems that hierarchical modularity is important or even essential in the phase of scientific research and experimental discovery, besides being almost essential, as we have seen in the former sections, for explanation of an already studied phenomenon.

READ The word problem on strict categories

A metaphysical attempt: Modularity as ontology? Constrained antirealism

As we have seen, an important and necessary feature of a module is its robustness: a module is something which must endure a range of perturbations, and on some timescale it must persist a certain amount of time. Another defining trait of modules is the fact that they enjoy some amount of independence from the rest of the system and from other modules, and this is due to an at least partial isolation of the module, to its possessing some form of recognizable boundary. These are properties that modules share with entities, or with objects, that is, with what can conceivably be considered the basic ontological components of the world. I would suggest that this is not a mere coincidence. I think it is plausible to say that our perceptive system operates a modularity detection on the raw data which impinges on our bodies, in order to yield a description of the world in terms of entities or objects: objects that we perceive are the modules produced by this process of module detection. It is as well plausible, I think, to consider the limits of this modularity detection operated by each organism, in the light of the limits we have seen as aﬀecting modularity detection algorithms. Of course, there are some major diﬀerences here: first, the perceptive systems are not serial algorithmic computations, like those implemented on standard computers, and could accordingly be exempted from exhibiting the same computational complexity. However, if we take for granted a form of physical determinism (at least macroscop-ical) and the finiteness of perceptual biological systems, the computational equivalence of those perceptual processes with some algorithm should be quite guaranteed: perceptual systems can be seen as computational systems. We must then consider that algorithms in certain complexity classes are inherently intractable, regardless on the computational architecture on which they are implemented: for example, even if highly parallel architectures are used, like those of neural networks, EXPTIME problems cannot be successfully tackled, because their required time for completion grows exponentially, while the amount of parallelism can grow at most linearly54, and the exponential function grows incomparably faster than any polynomial one. A further objection can nevertheless be raised: perceptual systems do not need to discover a good modu-lar structure in an initially unstructured bunch of stimuli, because they are already fine-tuned to detect a more or less well-known modular structure in a world which is more or less stable: an organism is adapted to detect some kinds of objects around it, and its perceptual system is already structured and biased toward detection of those kinds of modules, that is, those kinds of objects in the world. This fine-tuning has been, in a classic darwinistic view, produced by natural selection. Such a biased modularity detection process could well be much less computaOr, possibly, quadratically, or even cubically, if we imagine some futuristic three-dimensional computational “growing cube”. tionally complex than the raw process of finding an initially unknown good modular structure in a big set of unstructured data, which requires the performing of a NP-complete task, the task of optimizing the modularity measure Q55. This is, at least, quite likely: perceptual processes are probably not so computationally hard. There are, however, two rejoinders. The first is that perceptual processes are not less constrained by computational complexity, at least indirectly, than modularity optimization is constrained: perceptual processes as they are now have been achieved through natural selection (at least on darwinistic accounts), and it is this process, the process of natural selection, which has borne the burden of trying to optimize the modularity measure Q on an initially perceptually unstructured world. NP-completeness is in some cases so hard that not even natural selection can be deemed having had enough time to perform an exhaustive search among the phenotypic space of possible perceptual systems, so it can be argued that the actual perceptual systems coming out of evolution have indirectly been aﬀected by these same computational constraints, due to the excessive computational complexity of the task of the search for the best modular hierarchical description, even for this search during the course of evolution. The second answer to the former objection stems from this last consideration: quite possibly because of their lower computational complexity, perceptual processes are less accurate than algorithmic Q optimization, and this is another way to say that optimal modularity detec-tion on data coming from the empirical world is not what perceptual systems yield: perception is indirectly constrained by the computational hardness of modularity detection, in the sense that it has been rendered more or less unreliable by this computational complexity.

Table of contents :

I Introduction
1 Modularity, Antimodularity, Explanation: an Introductory Tour
1.1 Modularity
1.1.1 Modularity in complex systems
1.1.2 Modularity, decomposability, and economy of description
1.1.3 Modules as repeated similar high-level parts
1.1.4 Structural and dynamical modularity
1.1.5 Near-decomposability and Aggregability
1.1.6 Modularity in discrete dynamical systems
1.1.7 Modularity in computational systems
1.1.8 Hierarchical modularity, levels, robustness and validity
1.1.9 Modularity and explanation
1.1.10 Some example applications of modularity in real-life scientific research .
1.2 Algorithmic detection of modularity
1.2.1 Modularity detection In networks and computational complexity
1.2.2 Modularity detection In discrete dynamical systems and computational systems
1.2.3 Some Applications of modularity detection in real-life researches
1.3 Computational hardness
1.4 Antimodularity
1.4.1 Antimodular emergence
1.4.2 Antimodularity and models of explanation
1.4.3 Antimodularity and functional or mechanistic explanations
1.4.4 Antimodularity and the deductive-nomological model
1.4.5 Antimodularity and topological explanations
1.4.6 Explanation and prediction
1.4.7 Computation and computational explanation
1.4.8 Antimodularity, cellular automata and computational explanations
1.4.9 High-level modularity as a condition for programming and scientific research
1.4.10 Explanatory emergence
1.4.11 Is it likely to encounter antimodular systems in science?
1.5 Some additional reflections on modularity, metaphysics, computing, history of science
1.5.1 A metaphysical attempt: Modularity as ontology? Constrained antirealism
1.5.2 Computational methods in scientific research: a possible historical turning point?
II Modularity
2 A first look at modularity
2.1 An informal definition of modularity
2.2 Early concepts related to modularity
2.2.1 Aggregation in dynamical systems
2.2.1.1 Approximate aggregation
2.2.1.2 Aggregation is computationally hard
2.2.2 Decomposability
2.2.3 Simon-Ando near-decomposability
2.2.4 Timescales and decomposition in nearly-decomposable systems
2.3 Hierarchical modularity
2.4 Generic near-decomposability
2.5 Modularity is relative to the choice of a metric
2.6 Summary of the chapter and outlook
3 Modularity and networks
3.1 Networks and network science
3.1.1 Random and regular networks
3.1.2 Small-world networks
3.1.3 Scale-free networks
3.2 Modularity in networks
3.2.1 Community and hierarchical structure detection
3.2.1.1 The Modularity measure Q
3.2.1.2 Reliability of the detected modular structure and computational hardness of Q optimization
3.2.1.3 Modularity detection in weighted networks
3.2.1.4 The problem of overlapping communities
3.2.1.5 community structure and scale-free networks
3.2.2 Network motifs and network themes
3.2.3 Network roles
3.2.4 Functional typology of hubs: party hubs and date hubs
3.2.4.1 Timescale decoupling and dynamical methods for community detection
3.2.5 Coarse-graining of networks with community structure or recurring modules
3.2.6 Modularity, small-world networks and information processing
3.2.7 Differences between modularity in networks and general modularity .
3.3 Limitations of algorithmic detection of modularity in networks
3.3.1 Time complexity of community structure and hierarchy detection .
3.3.1.1 Accuracy of community structure detection
3.3.1.2 Trade-off between accuracy and speed in community structure detection
3.3.2 Time complexity of network motifs detection
3.3.3 Time complexity of network roles detection
3.4 Summary of the survey on modularity detection algorithms
4 Modularity of computer programs
4.1 Computer programming
4.1.1 Computer programs
4.1.2 The Von neumann architecture
4.1.3 What a program is
4.1.4 Programming languages
4.1.4.1 Low-level languages
4.1.4.2 High-level languages
4.1.4.3 Syntax and semantics of programming languages
4.1.4.4 Program semantics and flow charts
4.1.5 Program specification and program implementation
4.1.5.1 Specification, abstraction and naming
4.1.5.2 Kinds of specification
4.1.5.2.1 Kind B: bare specification
4.1.5.2.2 Kind A: the aggregate kind
4.1.5.2.3 Kind M: the modular kind
4.1.5.2.4 Kind C: the kind by convention
4.1.6 A common definition of computer program
4.2 Program modularity
4.2.1 Subroutines
4.2.2 Structured programming
4.2.3 Object-oriented programming
4.2.4 Program modularity, coupling and cohesion
4.3 Reverse Engineering and modularity detection in computer programs
4.3.1 Reverse-engineering of program specifications in modular programs
4.3.1.1 Specification mining
4.3.2 Program modularity favors program development
4.3.3 Inherently antimodular programs
4.3.4 Modularization of computer programs by program slicing
5 Modularity in discrete dynamical systems
5.1 Discrete Dynamical Systems
5.1.1 Modular/digital and DDSs
5.1.2 A general definition of DDS
5.2 Cellular automata
5.2.1 Stephen Wolfram’s classification of CAs
5.2.2 Process modularity in CAs
5.2.3 Self-organization in CAs
5.2.4 Higher-level modularity in CAs
6 Thinking about modularity
6.1 Modularity and its properties: summing up
6.2 Structural and dynamical modularity
6.2.1 Structure and process
6.3 Modularity is relative
6.4 Forms of functional modularity
6.5 Modularity of the dynamical model and prediction
6.6 Hierarchical levels of descriptions
6.6.1 Abstractions
6.6.2 Preferred languages
6.6.3 Abstraction, aggregation and multiple realizability
6.6.4 Transformation of languages by abstraction
6.6.5 Descriptions and simulations
6.6.6 Languages and levels of description
6.6.7 Redescriptions
6.6.8 Validity of a redescription
6.6.9 Preferred descriptions
6.6.10 Modular redescriptions, aggregated redescriptions, explanatory redescriptions, robustness and validity
6.6.11 High-level modularity and macrodescriptions
6.6.12 Macro level and Micro level
6.6.13 Levels and the specification/implementation relation
6.6.14 A meta-consideration on levels of description
6.7 Temporal decoupling of hierarchical levels
6.8 Modularity, economy of description, explanation
6.9 High-level modularity conditions experimental research and computer programming
6.10 Summary
7 Some issues about modularity in biology
7.1 Evolution and modularity
7.1.1 Evolution of modularity in Herbert Simon’s view
7.1.2 Modularity as emergent self-organization in complex systems and the role of natural selection
7.1.3 Evolution and modularity of the genotype-phenotype map
7.1.4 Modularity as due to natural selection
7.2 A modular functional view of biological systems
7.3 A computational view of biological processes
III Models of explanation
8 The deductive-nomological model of explanation
8.1 Known problems of the DN model
9 Functions and functional explanation
9.1 Functions
9.2 Functional analysis
9.3 Functional explanation of computational systems
10 Mechanistic explanation
11 The new mechanistic school
11.1 Machamer, Darden and Craver’s account of mechanistic explanation
11.1.1 Mechanisms and functions
11.1.2 Activities, causes and laws
11.1.3 Diagrams
11.1.4 The working cycle of a mechanism
11.1.5 Hierarchies and Bottoming Out
11.1.6 Mechanism schemata, mechanism sketches, explanation, and scientific theories
11.1.7 Intelligibility and multi-level mechanistic explanation
11.2 Becthel and Abrahamsen’s view of mechanistic explanation
11.2.1 Main differences between BA and MDC accounts
11.2.2 BA’s definition of mechanism
11.2.3 Hierarchical organization of mechanisms
11.2.4 Diagrams and simulation in mechanistic explanation
11.2.5 Discovering mechanisms: decomposition and localization
11.2.6 Testing mechanistic explanations
11.2.7 Generalizing without laws
11.3 Functional analysis and mechanistic explanation
12 Philippe Huneman’s topological kind of explanation
IV Antimodularity
13 The notion of antimodularity
13.1 Problems with the detection of modularity
13.2 A definition of antimodularity
13.3 Antimodular emergence
14 Consequences of antimodularity on explanation
14.1 Antimodularity and functional or mechanistic explanations
14.2 Antimodularity and the deductive-nomological model
14.2.1 Antimodularity and weak emergence hamper DN explanation
14.3 Antimodularity and topological explanations
14.4 Explanation and prediction
14.5 Antimodularity and computational explanation
14.5.1 Computation and computational explanation
14.5.2 Antimodularity, cellular automata and computational explanations
14.6 Explanatory emergence
15 Are there antimodular systems in science?
17 Computer science basics
17.1 General notions
17.2 Automata theory
17.2.1 Finite automata
17.2.2 Nondeterministic finite automata
17.2.3 Probabilistic finite automata
17.2.4 Pushdown automata
17.2.5 Turing machines
17.2.6 The halting problem and the Entscheidungsproblem
17.2.7 Nondeterministic Turing machines
17.2.8 Linear bounded automata
17.2.9 The Chomsky hierarchy
17.2.10Grammars
17.2.10.1 Context-free grammars
17.2.10.2 Relationships between the expressive power of grammars
17.3 The Church-Turing thesis
17.4 Computational complexity
17.4.1 Time complexity
17.4.1.1 The TIME complexity classes
17.4.1.2 The EXPTIME complexity class
17.4.1.3 P, NP and complexity classes
17.4.1.3.1 The class P
17.4.1.3.2 The class NP and the P = NP problem
17.4.1.3.3 NP-completeness
17.4.1.3.4 NP-hardness
17.4.2 Space complexity
17.4.2.1 The SPACE complexity classes
17.4.2.2 The class PSPACE
17.4.2.3 The EXPSPACE complexity class
17.4.2.4 PSPACE-completeness
17.4.3 Relationships between space and time complexity classes and open problems
17.4.3.1 Existence of intractability
References