THE DEFINITION OF THE APPROACH TO EXTRACT DATA-DRIVEN INVARIANT MODELLING CONSTRUCTS

THE DIGITAL TWIN PARADIGM: STATE OF THE ART OF RESEARCH AND TECHNICAL APPROACHES

The basic idea behind the Digital Twin is a high-fidelity virtual model of a physical entity, whose purpose is to replicate and simulate the states and behaviours of that entity (Grieves 2014), (S. Jain, Shao, and Shin 2017), (Bao et al. 2018), (Tao, Sui, et al. 2018), (Zhuang, Liu, and Xiong 2018).
The concept of using “twins” dates back to NASA’s Apollo program, where two identical space vehicles were built so that the conditions of the space vehicle could be mirrored during the mission. Professor Grieves at the University of Michigan first put forward the concept of ‘Digital Twin’ in Product Lifecycle Management (PLM) courses in 2003. At present, the concept of “digital twin” refers to a holistic digital engineering view of the real system, which may be applied from product design and development through to production planning, production engineering, production and associated services (Product Lifecycle Management).
The Digital Twin is, in fact, defined as a new paradigm in simulation and modelling (Rosen et al. 2015) because it extends the use of simulation modelling to all phases of the product life cycle (Garetti, Rosa, and Terzi 2012), (Rodič 2017), (Glaessgen and Stargel 2012), (Ríos et al. 2015), (G. N. Schroeder et al. 2016), (H. Zhang et al. 2017), (Schleich et al. 2017), (Söderberg et al. 2017).
In the context of the Industry of the Future, virtualization is the design principle that enables the replication of the physical system into its “digital twin” throughout the entire value chain, by merging data into virtual models (Hermann, Pentek, and Otto 2016). A closed loop needs to be realized (Alam and El Saddik 2017), (Graessler and Poehler 2017), (Yun, Park, and Kim 2017), (Autiosalo 2018), (Nikolakis et al. 2018), (Leng et al. 2018) between the physical and virtual worlds through a real-time data connection (Ciavotta et al. 2017), (Schluse, Atorf, and Rossmann 2017), (Stark, Kind, and Neumeyer 2017) by collecting and analysing data (Weber et al. 2017), (H. Lee and Kim 2018), (Haag and Anderl 2018) to respond to changes over time. Virtualization, in fact, is heavily dependent upon the real-time data management capability of the system (Moreno et al. 2017), (Dassisti, Panetto, et al. 2017).
The digital twin has also been defined as a Virtual Knowledge System (E. Tuegel 2012), (J. Lee et al. 2013), (Negri, Fumagalli, and Macchi 2017), (Asimov et al. 2018), (W. Luo et al. 2018), (Z. Liu, Meyendorf, and Mrad 2018), given that a DT aims to achieve transparency (J. Lee et al. 2013) and to overcome the information asymmetry among technology, process and people (Padovano et al. 2018).
The Observe–Orient–Decide–Act cycle (the O-O-D-A loop) was defined by Colonel John Boyd, a military strategist of the United States Air Force (Osinga 2007). The second O, orient, is the most important part of the O-O-D-A loop since it refers to the repository of our genetic heritage, cultural tradition, and previous experiences. In the concept of the digital twin, the orientation is represented by the amount of data and the fidelity of the virtual model. It shapes the way the digital twin observes, decides and acts.
Different tools and technologies are available for developing high-fidelity virtual models (Schleich et al. 2017). They use different techniques, such as simulation and emulation, including distinct functionalities (McGregor 2002).
The simulation capabilities of a DT are provided by a design of its environment that approximates, off-line, the behaviour of the real system in order to represent how the system reacts (Law, Kelton, and Kelton 2000). This can be thought of as a “static feature” of the DT. Emulation, on the other hand, refers to the capability of a DT to be synchronous with the real system, so that it behaves almost identically to the physical system (Ayani, Ganebäck, and Ng 2018). Accordingly, this can be thought of as a “dynamic feature” of the DT. An emulation model operates in a hardware-in-the-loop configuration to perform the same work as the physical system. It provides a closer replication than the simulation model (C. G. Lee and Park 2014).
From the simulation point of view, the digital twin represents a new wave in modelling and simulation (Rosen et al. 2015). Simulation tools such as Plant Simulation from SIEMENS© (Vachálek et al. 2017) are an example of how a DT can help in understanding what may happen in given situations of the real world.
From the emulation point of view, the digital twin duplicates and imitates the physical system in the virtual world. Emulation tools, such as Simumatik3D® (Ayani, Ganebäck, and Ng 2018), can thus help to proactively understand what has to be done to react to modifications in the real world.
Basically, both the static and the dynamic features are a what-if analysis, but at different scales in the time domain.

THE DIGITAL TWIN IN THE PRODUCTION PHASE

Digital twins in the production phase aim at real-time monitoring and optimization and at predicting the future state of the physical twin, thus preventing downtime and failures (J. Lee et al. 2013). (Leng et al. 2018) present a Digital Twin for a manufacturing cyber-physical system (MCPS). (Ding et al. 2019) introduce a DT-based Cyber-Physical Production System (DT-CPPS). The MCPS is used for controlling shop-floor manufacturing, while the DT-CPPS is used for improving the flexibility, controllability and efficiency of shop-floor manufacturing.
A digital twin for production control and optimization can analyse the online data collected from the physical line to search for the optimal solution for the physical line (Sun et al. 2017) or for complex product assembly shop-floors (Zhuang, Liu, and Xiong 2018). It can autonomously evaluate production in real time (Vachálek et al. 2017) and autonomously optimize the resource allocation (H. Zhang, Zhang, and Yan 2018), (Rosen et al. 2015).
A Digital Twin reference model for rotating machinery fault diagnosis was developed in (J. Wang et al. 2018), defining the requirements for constructing the Digital Twin model. A digital twin for hydraulic supports (Xie et al. 2019) is built to simulate the actual hydraulic supports and to support diagnosis and degradation analysis. The digital twin also finds application in CNC machine tools (W. Luo et al. 2018) and in smart injection processes (Liau, Lee, and Ryu 2018) to control the behaviour of the physical system in real time. Based on these findings, the DT can optimize the production process and raise it to a higher level of effectiveness and flexibility.

THE DIGITAL TWIN IN THE SERVICE PHASE

The service phase refers to the phases after sale, including product utilization and maintenance (Tao, Cheng, et al. 2018). In the service phase, Digital Twins can provide value-added services with the support of physical simulation and data-driven intelligence (Qi et al. 2018), (J. Wang et al. 2018). Prognostics and health management (PHM) is crucial in the life-cycle monitoring of a product. PHM is an engineering process of preventing failures and predicting reliability and remaining useful lifetime (RUL) (Sutharssan et al. 2015). The digital twin (DT) improves the accuracy and efficiency of PHM (Tao, Zhang, et al. 2018), and it is also utilized for improving energy efficiency (M. Zhang, Zuo, and Tao 2018). (Abramovici, Göbel, and Savarino 2017) introduce a cloud-based Smart Product platform for the reconfiguration of Smart Products during the use phase, using the concept of virtual product twins and the Internet of Things. The conceptual approach is demonstrated with a prototype that considers a model environment for smart cars, which are temporarily reconfigured during their use phase.
The digital twin has also been developed for the recovery of waste electrical and electronic equipment, to support manufacturing/remanufacturing operations (X. V. Wang and Wang 2018).

DIGITAL TWIN FUNCTIONS (Why DT is adopted?)

DTs may evolve over time or change their features according to an ideal life cycle (Macchi et al. 2018). The following features/services have been attributed to the DT so far, based on the life-cycle stages referred to previously:
1. Accelerating the product development speed: the digital twin can be used for designing products, testing them in real-time situations, and anticipating how the customer or the end user will use them and how the design will complement the product’s environment. One of the biggest issues in the field of design is that it is hard to find customers who would appreciate design as much as the designers themselves do (Söderberg et al. 2017). Data from the real machine is loaded into the digital model to enable simulation and testing of ideas even before actual manufacturing starts. The digital twin can be used to plan and reconfigure the system in response to external changes.
2. Identifying customers’ needs: performance, customer usage and preferences are reflected in the twin and then fed into the product development process to increase customer satisfaction and market share. Future products can be developed based upon the behaviour of existing products in the real world (Tao, Sui, et al. 2018).
3. Performance optimization and validation: the digital twin helps to determine the optimal set of parameters and actions that maximize some of the key performance measures, and to provide forecasts for long-term planning (Vachálek et al. 2017). The digital twin can analyse performance data collected over time and under different conditions (Alcácer and Cruz-Machado 2019), reducing unplanned machine downtime and the amount of ‘scrap’ produced in each production line, and minimizing costly production quality faults.
4. Remote commissioning and diagnostics: the digital twin enables remote visibility and diagnostics of the operations of interconnected systems such as manufacturing systems. This allows virtual monitoring and validation of the current status of production systems (e.g. energy monitoring and fault monitoring) (Qi et al. 2018). In addition, by capturing how the user uses the product, the digital twin can upgrade personalized product functions (Cheng et al. 2020).

DIGITAL TWIN ARCHITECTURES (How to realize and to develop a DT?)

A general and standard architecture of a digital twin was first built by Grieves (Grieves 2014). (Stark, Kind, and Neumeyer 2017) characterize the DT as (1) a unique instance of the universal Digital Master model of an asset, (2) its individual Digital Shadow and (3) an intelligent linkage (algorithm, simulation model, correlation, etc.) of the two elements above (Kritzinger et al. 2018). An extended five-layer DT is proposed by (Tao, Zhang, et al. 2018) and is composed of: (1) Physical entity (PE); (2) Virtual entity (VE); (3) Services (Ss) for PE and VE; (4) Data (DD); (5) Connection (CN) among PE, VE, Ss and DD. Compared to Grieves’s architecture, the data and services layers were added.
The five-layer DT architecture developed by (Ponomarev et al. 2017) presents: (1) cyber-physical layer; (2) primary processing/store data layer; (3) distributed computing and storage layer; (4) models and algorithms layer; (5) visualisation and user interfaces layer. This kind of architecture highlights data storage and the distributed computing and management system as critical parts of the digital twin. An extended six-layer DT is presented by (Redelinghuys, Basson, and Kruger 2019). The layers are: (1) physical devices; (2) local controllers; (3) local data repositories; (4) IoT gateway; (5) cloud-based information repositories; (6) emulation and simulation. This structure is more focused on the transmission of the data flow from the physical system (Layers 1 and 2) to the cloud (Layer 5).
Although there are various understandings of DT architectures among researchers and industrial practitioners, models, data and connections always play the most important roles in the DT. From the computational perspective, the key functionality of the digital twin is the combination of physics-based models and data-driven models to emulate assets accurately (Kaur, Mishra, and Maheshwari 2020). The IoT system delivers real-time data for the synchronization of the virtual twin with its corresponding physical twin, with the capabilities of geometric assurance, anomaly detection, prediction, prescription and optimization.
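As an illustration only, the five-layer architecture of (Tao, Zhang, et al. 2018) can be sketched as a small data structure. Everything in the sketch is an assumption made for the example: the class names, the `synchronize` and `run_service` methods, the `press-01` asset, the `temperature_c` reading and the 80 °C threshold do not come from the cited works.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PhysicalEntity:
    """PE: the real asset; here it only exposes its latest sensor readings."""
    name: str
    sensors: Dict[str, float] = field(default_factory=dict)


@dataclass
class VirtualEntity:
    """VE: the virtual model; here it simply mirrors the last known state."""
    state: Dict[str, float] = field(default_factory=dict)


@dataclass
class DigitalTwin:
    """Ties PE, VE, data (DD), services (Ss) and the connection (CN) together."""
    physical: PhysicalEntity
    virtual: VirtualEntity = field(default_factory=VirtualEntity)
    data: List[Dict[str, float]] = field(default_factory=list)  # DD
    services: Dict[str, Callable[[Dict[str, float]], object]] = field(
        default_factory=dict
    )  # Ss

    def synchronize(self) -> None:
        """CN: copy the physical readings into the data store and the VE."""
        snapshot = dict(self.physical.sensors)
        self.data.append(snapshot)
        self.virtual.state = snapshot

    def run_service(self, name: str) -> object:
        """Ss: evaluate a named service on the current virtual state."""
        return self.services[name](self.virtual.state)


# Hypothetical asset and service, invented for the example.
press = PhysicalEntity("press-01", {"temperature_c": 71.5})
twin = DigitalTwin(
    physical=press,
    services={"overheated": lambda state: state["temperature_c"] > 80.0},
)

twin.synchronize()
first_check = twin.run_service("overheated")   # False: 71.5 is below 80.0

press.sensors["temperature_c"] = 85.0          # the physical twin changes
twin.synchronize()                             # the virtual twin follows
second_check = twin.run_service("overheated")  # True: 85.0 is above 80.0
print(first_check, second_check)
```

The sketch keeps the connection as an explicit method call; in a real DT the synchronization would be driven by the IoT data stream rather than invoked by hand.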

SUPERVISED DATA-DRIVEN APPROACHES

Supervised learning is the task of inferring a function from labelled training data. It aims to build a model that can correctly predict the output of an unseen instance by observing a set of labelled instances (Al-Sahaf et al. 2019). Supervised data mining attempts to explain the behaviour of the target as a function of a set of independent attributes or predictors (J. Zhang, Williams, and Wang 2018).
The principal supervised mining techniques are classification and regression (Solomatine and Ostfeld 2008).
• Classification (Ahmed 2004) technique aims to predict target categories or classes. It relies on a training dataset containing objects (input). Each object is evaluated based on a set of predictors. The output is a model (classifier) that assigns a specific label (category, or target class) to each object based on the predictors (Gorunescu 2011). Classification algorithms find relationships between the values of the predictors and the label of the target (Mitra, Pal, and Mitra 2002).
• Regression (Ngai, Xiu, and Chau 2009) technique aims to predict a numeric value. It relies on a training dataset containing objects (input). Each object is evaluated based on a set of predictors. The output is a model (regressor) that assigns a specific numeric value to each object based on the predictors. In the model build (training) process, a regression algorithm estimates the value of the target as a function of the predictors.
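The difference between the two techniques can be made concrete with a minimal sketch. It assumes two deliberately simple algorithms that the text does not prescribe: a one-nearest-neighbour classifier and an ordinary least-squares line. The function names and the toy data are invented for the example.

```python
from typing import Sequence, Tuple


def nearest_neighbour_label(
    train: Sequence[Tuple[Sequence[float], str]], query: Sequence[float]
) -> str:
    """Classification: return the label of the closest labelled object."""
    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(train, key=lambda item: squared_distance(item[0], query))
    return label


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Regression: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy labelled data: two predictors per object, one categorical target.
train = [
    ((1.0, 1.0), "ok"),
    ((1.2, 0.8), "ok"),
    ((4.0, 4.2), "faulty"),
    ((4.5, 3.9), "faulty"),
]
predicted_label = nearest_neighbour_label(train, (4.1, 4.0))
print(predicted_label)  # faulty

# Toy numeric target that follows y = 2x + 1 exactly.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(slope, intercept)  # 2.0 1.0
```

In both cases the labelled training set is the only source of knowledge, which is why the quality of that data determines the quality of the predictions.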
The major strength of the supervised approaches is that they do not require a high level of domain knowledge (Y. Liu et al. 2019). The major weaknesses are that massive amounts of data are needed to train a reliable model, and that the results and the performance depend on the quality of the training data (Zhao et al. 2019).

UNSUPERVISED DATA-DRIVEN APPROACHES

Unsupervised learning (Sutharssan et al. 2015), instead, discovers an internal representation from input data only. The principal unsupervised mining techniques are clustering, association rule mining, sequential pattern discovery and anomaly detection, which are explained below.
• Clustering (Hansen and Jaumard 1997) technique aims to split a set of data into subsets (clusters) with similar characteristics. Clustering is used to find groups of items that are similar (Yin et al. 2011). Accordingly, the clustering algorithms can be classified into:
• Partitioning clustering (Fahad et al. 2014): given a set of n objects, a partitioning method constructs k un-nested partitions of the data, where each partition represents a cluster and k ≤ n. It splits the data into k groups such that each group must contain at least one object.
• Hierarchical clustering (Grira, Crucianu, and Boujemaa 2004): aims to obtain a hierarchy of nested clusters, called a dendrogram. It shows how the data are grouped and which clusters are related to each other. The hierarchy of nested clusters can be agglomerative (bottom-up) or divisive (top-down). Agglomerative clustering starts with one object per cluster and recursively merges clusters until a single cluster remains. Divisive clustering starts with all objects in one cluster and recursively splits it into different clusters.
• Density-clustering (Grira, Crucianu, and Boujemaa 2004): considers that clusters are dense sets of data items separated by less dense regions.
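Partitioning clustering can be illustrated with a minimal sketch of k-means (Lloyd's algorithm), chosen here as one common partitioning method; the text does not name a specific algorithm. The function name, the fixed initial centroids and the toy points are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def k_means(
    points: Sequence[Point], centroids: List[Point], iterations: int = 10
) -> Tuple[List[int], List[Point]]:
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    assignment: List[int] = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(
                range(len(centroids)),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids: List[Point] = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return assignment, centroids


# Toy data: two visually separated groups of three points each (k = 2).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centres = k_means(points, centroids=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The result depends on the initial centroids, and this sketch does not enforce the requirement that every group contain at least one object; practical implementations usually restart from several initializations.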
• Association (Agrawal, Imieliński, and Swami 1993) technique aims to detect the probability of the co-occurrence of items in a collection. The relationships between co-occurring items are expressed as association rules.
The generation of knowledge through association rules is promoted by Formal Concept Analysis (FCA) (Poelmans et al. 2010), because it is particularly suited to data exploration (Valtchev, Missaoui, and Godin 2004). FCA is a mathematical theory oriented at applications in knowledge representation (Wille 2002). It provides tools to group the data and to discover formal patterns by representing the data as a hierarchy of formal concepts organised in a partially ordered set called a lattice (Ganter, Stumme, and Wille 2005).
Indeed, a major part of the data is stored in relational databases. While most existing data mining approaches detect patterns in a single data table, multi-relational data mining (MRDM) approaches detect patterns that involve multiple tables (relations) from a relational database (Džeroski 2003). Relational Concept Analysis (RCA) is a multi-relational data mining approach. RCA makes it possible to extract multi-relational association rules, where each rule consists of several relations between entities.
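To make the notion of a formal concept tangible, the following sketch enumerates all formal concepts of a small single-table context by brute force. It is only an illustration: the context (three machines and three observed events) is invented, the function names are hypothetical, and the exhaustive enumeration over attribute subsets is exponential, so it is not how dedicated FCA tools operate on real data. RCA is not covered by the sketch.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

Context = Dict[str, Set[str]]  # object -> set of attributes it has
Concept = Tuple[FrozenSet[str], FrozenSet[str]]  # (extent, intent)


def extent(context: Context, attributes: FrozenSet[str]) -> FrozenSet[str]:
    """Objects that have every attribute in `attributes`."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def intent(
    context: Context, objects: FrozenSet[str], all_attributes: FrozenSet[str]
) -> FrozenSet[str]:
    """Attributes shared by every object in `objects`."""
    shared = set(all_attributes)
    for o in objects:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts(context: Context) -> Set[Concept]:
    """Close every attribute subset; each closure is one formal concept."""
    all_attributes = frozenset().union(*context.values())
    concepts: Set[Concept] = set()
    for size in range(len(all_attributes) + 1):
        for subset in combinations(sorted(all_attributes), size):
            objs = extent(context, frozenset(subset))
            concepts.add((objs, intent(context, objs, all_attributes)))
    return concepts


# Hypothetical context: which events were observed on which machine.
context = {
    "m1": {"high_temp", "vibration"},
    "m2": {"high_temp"},
    "m3": {"high_temp", "vibration", "alarm"},
}
concepts = formal_concepts(context)
for objs, attrs in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(objs), sorted(attrs))
```

In this toy context the concept with extent {m1, m3} has intent {high_temp, vibration}: every machine showing vibration also shows high temperature, which is the kind of rule ("vibration implies high_temp") that can be read off the lattice.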
• Sequential pattern discovery (Lin and Lee 2003) technique aims to identify associations or patterns over time. The goal is to model the states of the process that generates the sequence, to capture trends and to report deviations over time.
• Anomaly detection (outlier/change/deviation detection) (Buczak and Guven 2015) technique aims to identify cases that are unusual within data that is seemingly homogeneous.
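As a minimal illustration of anomaly detection, the sketch below flags readings whose z-score exceeds a threshold. The z-score rule is only one simple choice among many and is an assumption of this example, as are the function name, the threshold of 2.0 and the temperature readings.

```python
from statistics import mean, stdev
from typing import List, Sequence


def z_score_outliers(values: Sequence[float], threshold: float = 2.0) -> List[int]:
    """Return the indices of values lying more than `threshold` sample
    standard deviations away from the sample mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical: nothing can be unusual
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings: seemingly homogeneous except for one value.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0, 20.1]
outliers = z_score_outliers(readings)
print(outliers)  # [6]
```

Because a single extreme value inflates both the mean and the standard deviation, the threshold has to be chosen with the sample size in mind; rules based on the median are a common, more robust alternative.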

Performing the exergy analysis of each subsystem to compute the exergy loss $Ex_{loss}$

The exergy approach evaluates the quality of energy usage and identifies and quantifies the inefficiency of the process through the measurement of the exergy loss. The exergetic analysis is performed by evaluating the exergy loss for each subsystem. The exergy approach consists of a model of the process to be controlled, whose inputs are the exergy flow of materials ($Ex_{M,in}$), the exergy flow of work ($Ex_{W,in}$) and the exergy flow of heat ($Ex_{Q,in}$), and whose outputs are the exergy flow of materials ($Ex_{M,out}$), the exergy flow of work ($Ex_{W,out}$) and the exergy flow of heat ($Ex_{Q,out}$). The difference between the exergy values in input and the exergy values in output represents the exergy loss ($Ex_{loss}$).
The equation of the exergetic balance is as follows (Moran, Shapiro, and Moran 2008):

$$Ex_{M,in} + Ex_{W,in} + Ex_{Q,in} = Ex_{M,out} + Ex_{W,out} + Ex_{Q,out} + Ex_{loss} \qquad (1)$$

The optimization criterion in the exergetic analysis is to minimize the term $Ex_{loss}$, since the exergy loss is proportional to the generated entropy, which is responsible for the less-than-theoretical efficiency of the system. In Industry of the Future implementations, this criterion may make it possible to select the critical systems, as well as the critical parameters on which the term $Ex_{loss}$ and the exergetic efficiency depend (Dassisti, Semeraro, and Chimenti 2019). The equation adopted to compute $Ex_{loss}$ is (Gutowski, Dahmus, and Thiriez 2006):

$$Ex_{loss} = Ex_{M,in} + Ex_{W,in} + Ex_{Q,in} - Ex_{M,out} - Ex_{W,out} - Ex_{Q,out} \qquad (2)$$
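The computation of the exergy loss per subsystem, and the selection of the subsystem with the largest loss, can be sketched as follows. The sketch applies the exergy-loss equation above term by term; the subsystem names, the numeric values and the choice of kW as unit are invented for the example and are not data from the case study.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ExergyFlows:
    """Exergy flows of materials, work and heat (assumed unit: kW)."""
    material: float
    work: float
    heat: float

    def total(self) -> float:
        return self.material + self.work + self.heat


def exergy_loss(inlet: ExergyFlows, outlet: ExergyFlows) -> float:
    """Exergy loss: sum of the inlet flows minus sum of the outlet flows."""
    return inlet.total() - outlet.total()


# Hypothetical subsystems, each described by (inlet flows, outlet flows).
subsystems: Dict[str, Tuple[ExergyFlows, ExergyFlows]] = {
    "heater": (ExergyFlows(2.0, 10.0, 0.0), ExergyFlows(2.5, 0.0, 4.0)),
    "press": (ExergyFlows(2.5, 6.0, 0.0), ExergyFlows(3.0, 0.0, 1.5)),
}

losses = {name: exergy_loss(i, o) for name, (i, o) in subsystems.items()}
critical = max(losses, key=losses.get)  # subsystem with the largest loss
print(losses)    # {'heater': 5.5, 'press': 4.0}
print(critical)  # heater
```

Ranking the subsystems by their loss is one straightforward way to apply the minimization criterion: the subsystem with the largest $Ex_{loss}$ is the first candidate for closer analysis.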

Table of contents :

CHAPTER 1 – DIGITAL TWIN: A SYSTEMATIC LITERATURE REVIEW
INTRODUCTION
1.1 DIGITAL TRANSFORMATION IN SMEs
1.2 THE DIGITAL TWIN PARADIGM: STATE OF THE ART OF RESEARCH AND TECHNICAL APPROACHES
1.2.1 CONTEXT AND APPLICATION (Where to use a DT?)
1.2.2 THE DIGITAL TWIN ALONG THE PRODUCT-LIFE CYCLE (When a DT is developed?)
1.2.2.1 THE DIGITAL TWIN IN THE DESIGN PHASE
1.2.2.2 THE DIGITAL TWIN IN THE PRODUCTION PHASE
1.2.2.3 THE DIGITAL TWIN IN THE SERVICE PHASE
1.2.3 DIGITAL TWIN FUNCTIONS (Why DT is adopted?)
1.2.4 DIGITAL TWIN ARCHITECTURES (How to realize and to develop a DT?)
1.2.5 DT COMPONENTS (What a DT is made of?)
1.2.5.1 THE PHYSICAL LAYER
1.2.5.2 THE NETWORK LAYER
1.2.5.3 THE COMPUTING LAYER
1.2.5.3.1 DATA DRIVEN MODELS AND ANALYTICS
1.2.5.3.2 PHYSIC-BASED MODELS
1.2.5.3.3 THE APPLICATION AND INTERFACE LAYER
1.3 RESEARCH GAPS FOR THE DIGITAL TWIN
1.4 SYNTHESIS AND THESIS OBJECTIVE
CHAPTER 2 – FORMALIZATION OF DATA-DRIVEN INVARIANT MODELLING CONSTRUCTS: CONTRIBUTION OF THE THESIS
INTRODUCTION
2.1 STATE OF ART OF MODELLING APPROACHES
2.1.1 MODEL-BASED APPROACHES
2.1.2 DATA-DRIVEN APPROACHES
2.1.2.1 SUPERVISED DATA-DRIVEN APPROACHES
2.1.2.2 UNSUPERVISED DATA-DRIVEN APPROACHES
2.1.3 HYBRID APPROACHES
2.1.4 DESIGN PATTERN: OBJECT-ORIENTED PROGRAMMING
2.2 CONTRIBUTION POSITIONING
CHAPTER 3 – THE DEFINITION OF THE APPROACH TO EXTRACT DATA-DRIVEN INVARIANT MODELLING CONSTRUCTS
INTRODUCTION
3.1 DEFINITION OF THE APPROACH
3.1.1 DEFINITION OF THE SYSTEM
3.1.2 SYSTEM MODEL
3.1.3 DATA SELECTION
3.1.4 KNOWLEDGE DISCOVERY
3.1.5 KNOWLEDGE EXTRACTION
3.1.6 KNOWLEDGE FORMALIZATION: WEB LIBRARY OF DATA-DRIVEN CONSTRUCTS
3.1.7 DESIGN CRITERIA FOR DIGITAL TWIN
3.1.8 KNOWLEDGE RE-USE
CHAPTER 4 – CASE STUDY: MASTER ITALY DIGITAL TWIN
INTRODUCTION
4.1 THE PRESENTATION OF THE CASE STUDY: CONTEXT ANALYSIS
4.2 APPLICATION OF THE APPROACH TO THE INDUSTRIAL CASE TO DETECT DATA-DRIVEN INVARIANT MODELLING CONSTRUCTS
4.2.1 DEFINITION OF THE SYSTEM
4.2.2 SYSTEM MODEL
4.2.3 DATA SELECTION
4.2.4 KNOWLEDGE DISCOVERY
4.2.5 KNOWLEDGE EXTRACTION
4.2.6 KNOWLEDGE FORMALIZATION: WEB LIBRARY OF DATA-DRIVEN CONSTRUCTS
4.2.6.1 MOULD FILLING PATTERN (P1)
4.2.6.2 MACHINE RESTART PATTERN (P2)
4.2.6.3 CLAMPING SYSTEM PATTERN (P3) APPLIED TO MECHANICAL FAILURE
4.2.7 DESIGN CRITERIA FOR DIGITAL TWIN
4.2.8 KNOWLEDGE RE-USE
CONCLUSION
INTRODUCTION
1. SUMMARY OF THE THESIS
2. PERSPECTIVES FOR FUTURE RESEARCH WORK
ANNEXE – CASE STUDY
LIST OF FIGURES
LIST OF TABLES
BIBLIOGRAPHY