Cluster and experiment improvements and problems

Infrastructure and tools

With the growing adoption of cloud technologies [5], the same principles that drive continuous delivery of software can be applied to the infrastructure. This means that whole production environments encompassing web servers, databases, and load balancers can be automatically created from scratch.
This brings the benefits of repeatability, reliability, and predictability not only to software but to the underlying infrastructure as well. Nevertheless, these kinds of infrastructures also involve challenges, similar to those of software delivery. The infrastructures’ changeable nature can, if mismanaged, lead to unmanageable and unrecoverable states, resulting in considerable costs in time and resources through unnecessary work in recovering and managing the infrastructure [6].
The common problems and anti-patterns often identified in dynamic infrastructures are:
• Configuration drift: the configuration of servers drifts apart over time, which results in unmanaged server states.
• Snowflake servers: servers which have drifted so far from their original configuration that they are no longer replicable.
• Fragile infrastructure: infrastructure which, if disrupted (e.g., by the failure of a snowflake server), requires significant time and effort to fix.
To combat this and put infrastructure under configuration management, one requirement stood out: the ability to completely version-control infrastructure. This resulted in Infrastructure as Code, a process which describes all parts of an environment through definition files. A provisioning tool reads the definition file(s) and automatically and reliably produces complete production-ready environments.
In Infrastructure as Code: Managing Servers in the Cloud by Kief Morris, the distinctive feature of effective teams and tools is their ability to handle changes and new requirements [6]. High effectiveness stems from the ability to break down requirements into small, low-risk, low-impact pieces, which can then be deployed through a fast delivery pipeline.
Today there is a wide variety of provisioning and configuration tools available on the market. Examples include Puppet, SaltStack, Ansible, Terraform, and many more. The line where provisioning stops and configuration starts is not always clear, nor is it always clear whether a tool is a provisioning tool or a configuration tool. Often a tool covers both, or parts of the other. However, in essence, all tools have four common traits and goals:
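As a minimal sketch of how a definition makes drift visible, the following Python compares a desired state against the state actually found on a server. The setting names, values, and the `find_drift` helper are hypothetical and chosen only for illustration; real provisioning tools use their own definition formats and far richer state models.

```python
from typing import Dict, Optional, Tuple

# Hypothetical desired state, as it might be read from a version-controlled
# definition file. Keys and values are illustrative assumptions only.
DESIRED: Dict[str, str] = {
    "hostname_prefix": "node",
    "container_runtime": "containerd",
    "timezone": "UTC",
}


def find_drift(
    desired: Dict[str, str], actual: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return every setting whose actual value differs from the definition.

    Each drifting key maps to a (desired, actual) pair. A setting that is
    missing on one side appears as None on that side.
    """
    drift = {}
    for key in sorted(set(desired) | set(actual)):
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift
```

Calling `find_drift(DESIRED, actual)` with a server state that matches the definition returns an empty dictionary; any manual change on the server, or any setting not covered by the definition, shows up as an entry that can be reviewed or corrected.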
• Source controlled infrastructure for providing audit trails and rollbacks.
• Testable infrastructure, written code can be unit, functional, and integration tested.
• Self-documented infrastructure, through written configuration files with readable syntax.
• Improved collaboration between developers and operators.
An essential part of the cloud-native environment is the container orchestration tool, the facilitator of portability, scalability, and redundancy. In today’s cloud-native environment, Kubernetes is by a wide margin the most popular choice of container orchestration technology, spanning the most dominant cloud providers [7]. Kubernetes is therefore the choice of cluster technology for this report and investigation, as no alternative yet has a comparable adoption rate.

DevOps

DevOps is not a clearly defined term but can be described as a collection of processes and practices that try to make the development and operation of software as cohesive an experience as possible, tying the two (Dev and Ops) together to reduce friction in the passing of software from release candidate to running in a production environment. Importantly, DevOps is not a tool or technology but a set of disciplines regarding documentation, quality assurance, project management, user experience, and, lastly, configuration management.
The DevOps Handbook, written by some of the most influential people behind DevOps (Gene Kim, Jez Humble, Patrick Debois, and John Willis), divides the processes into the fulfillment of three so-called flows [8]:
• The Principle of Flow, fast flow from idea/code to production.
• The Principle of Feedback, fast information flow from production to code.
• The Principle of Continuous Improvement, building organizational feedback loops to keep things moving.
The principles of Flow and Feedback go hand in hand with Continuous methodologies and configuration management, where configuration management can be described as the embodiment of the two principles in their real-world application. Figure 2.1 depicts a simplified view of the DevOps process, which relates to the investigation’s scope. It shows the circular process which drives a project forward, supported by the ideas of continuous thinking, version control, monitoring, and more.
Collaborative development using version control builds and tests the application or service to be. The application is continuously integrated into the main branch of development, and artifact management allows changes to be reverted easily. Continuous Deployment is possible from working main branch releases to the cluster, which runs the applications in their portable container runtime. Metrics can be derived from the cluster and its containers and, if required, alerted on. Metrics form the usage analytics, which tie the process together, allowing the next iteration of the application to build upon a solid foundation.

Single-board computer clusters

With the widespread adoption of DevOps and Continuous practices, together with the growing adoption of cloud-native technologies [5], development and operation on production-like clusters as early as possible in the delivery pipeline would be in accordance with, and beneficial to, the principles in question. Minimal changes throughout the development, test, and production environments are essential to keep a delivery process from code check-in to deployment as lean, error-free, and efficient as possible [3].
Being able to replicate production-like container clusters at a small scale while maintaining a local hardware deployment provides developers and operations with a fully controlled cluster environment. This could potentially be a great benefit to the delivery process but could also provide developers and operations with educational opportunities. As the cloud environment is so dominated by the processes and methodologies mentioned above, it is important that they are also applicable to single-board computer clusters. This would allow similar educational and demonstrative opportunities for the methodologies as are available for the technologies.
All methodologies and technologies presented in this chapter rely on parts of the continuous and encompassing process, which starts at infrastructure and ends in software running in production. Provisioning, configuration, and monitoring are all part of this overarching process, so even though the investigation is limited in scope and mainly focuses on configuration management, it still supports the more encompassing methodologies of DevOps and Continuous Delivery.
The investigation of provisioning, configuration, and monitoring of SBC clusters will involve a number of challenges. The main challenge is deciding which requirements need to be fulfilled to achieve a successful result when investigating, for example, provisioning. Does provisioning only involve a working cluster environment, or should idempotent execution also be a requirement?
While there are several good practices and requirements for configuration management and Continuous methodologies, there is no single checklist or definition for complete fulfillment. Several examples of real-world applications of single-board computer clusters and their use cases are presented in the study Commodity single board computer clusters and their applications [1]. Examples from the study include educational clusters constructed by the University of Southampton and the University of Edinburgh. Further, the Free University of Bolzano has taken single-board computer clusters a step further by deploying a 300-node cluster for education, research, and deployment of clusters for developing countries [10]. Single-board computers have, as presented above, a broad range of applicable use cases. For the report’s investigation, the focus will lie on problems and goals applicable and usable in most, if not all, use cases. Therefore, by studying the requirements and good practices presented in the literature, more specifically Continuous Delivery [3] and Infrastructure as Code: Managing Servers in the Cloud [6], several goals, motivations, and criteria have been constructed for each investigation objective.
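To illustrate what idempotent execution means in practice, here is a small Python sketch of a single idempotent step. The `ensure_line` helper and the file it manages are hypothetical; the point is only that the step describes a desired end state, so running it a second time changes nothing.

```python
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed and False if it was already
    in the desired state, so repeated runs are safe.
    """
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already converged, nothing to do
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True
```

The first call on a fresh file returns `True` and writes the line; every later call with the same arguments returns `False` and leaves the file untouched. A provisioning run built from steps like this can be repeated without side effects, which is one possible way to phrase the idempotency requirement as a testable criterion.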

Reliability and Validity

The reliability and validity of the case study’s result can, to a certain extent, be promised, as reproducible artifacts are one of the essential parts of both provisioning and configuration. An identical hardware configuration, together with the correct version or minor version changes of software and APIs, should provide reasonable grounds for validation of the implementation. Moreover, some implementation validity is gained from the machine-readable definition files, as they can be parsed by linters and used for the actual provisioning and configuration of the implementation itself.
What could affect the validity of the results is the natural configuration drift that is inevitable as time passes and software is updated. The current versions of tools and dependencies are only guaranteed to be relevant for a short while. Although the results may act as useful guidelines for future versions of the various software involved, again, nothing can be guaranteed. If the provisioning, configuration, and monitoring of the cluster are deemed working with current software versions, it is unlikely that future versions will regress in functionality. For example, the possibility of provisioning will most likely remain valid in the future, while the actual definition of said provisioning is destined to change.
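The idea that machine-readable definition files can be checked before they are used can be sketched as follows. This is not the linter used in the study; the JSON layout, the `nodes` list, and the required keys are assumptions made purely for the example.

```python
import json
from typing import List

# Hypothetical schema: every node entry must carry these keys.
REQUIRED_KEYS = ("name", "address", "role")


def lint_definition(text: str) -> List[str]:
    """Return a list of problems found in a JSON cluster definition.

    An empty list means the definition passed all checks.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    nodes = data.get("nodes") if isinstance(data, dict) else None
    if not isinstance(nodes, list) or not nodes:
        return ["definition must contain a non-empty 'nodes' list"]

    problems = []
    for index, node in enumerate(nodes):
        if not isinstance(node, dict):
            problems.append(f"nodes[{index}]: expected an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in node:
                problems.append(f"nodes[{index}]: missing '{key}'")
    return problems
```

Because the same file that passes the check is also the input to provisioning, a clean lint result gives some evidence that the documented configuration and the deployed configuration are the same thing.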

Connection to the single-board computers as remote nodes

This criterion is verified by automated remote access from the provisioning tool to the hosts. Connection to the hosts as remote nodes is related to the previous criterion. Instead of pinging the hosts, access to them with root privileges is now requested, allowing for actual configuration of the machines. No problems are encountered during verification of the criterion.
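A check of this kind could be sketched in Python as below. It is an illustration only and not the mechanism of the provisioning tool used in the experiment: it assumes an OpenSSH client, key-based authentication, and direct login as `root`, and the function names are hypothetical. The command runner is passed in as a parameter so the logic can be exercised without real hosts.

```python
import subprocess
from typing import Callable, List, Sequence


def root_check_command(host: str, user: str = "root") -> List[str]:
    """Build an ssh command that prints the numeric user id on the host.

    BatchMode=yes makes ssh fail instead of prompting for a password, which
    is what an automated check needs. ConnectTimeout bounds the wait.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=5",
        f"{user}@{host}",
        "id", "-u",
    ]


def run_command(command: Sequence[str]) -> subprocess.CompletedProcess:
    """Default runner: execute the command and capture its output as text."""
    return subprocess.run(command, capture_output=True, text=True)


def has_root_access(
    host: str,
    run: Callable[[Sequence[str]], subprocess.CompletedProcess] = run_command,
) -> bool:
    """Return True if the host is reachable and the session runs as uid 0."""
    result = run(root_check_command(host))
    return result.returncode == 0 and result.stdout.strip() == "0"
```

Looping `has_root_access` over the cluster's host names gives a pass or fail per node. A host that answers a ping but rejects the login or runs the session as an unprivileged user fails this check, which is what distinguishes this criterion from the previous one.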

Table of contents:

1 Introduction
1.1 Background
1.2 Related work
1.3 Problem formulation
1.4 Objectives
1.5 Scope/Limitation
1.6 Target group
1.7 Outline
2 Continuous methodologies
2.1 Processes
2.2 Infrastructure and tools
2.3 DevOps
2.4 Single-board computer clusters
3 Method
3.1 Reliability and Validity
4 Experiment
4.1 Design
4.2 Characteristics and limitations
4.3 Evaluation scenarios
4.4 Preparation
4.5 Implementation
5 Result analysis
5.1 Provisioning
5.2 Configuration
5.3 Monitoring
5.4 Findings
6 Discussion
6.1 Cluster and experiment improvements and problems
6.2 Demonstrative and educational opportunities
6.3 Findings in relations to related work
7 Conclusion and Future work
8 References