The experiments were performed at a company called Briteback. Its main product is a communication platform for larger companies, realized as an application for several different platforms. Part of this application is used in the experiments, so it is important to explain it in sufficient detail.
Briteback has built its main application using microservices, meaning that the application consists of many different types of services that run independently. One of these is the image service, which downloads, crops and scales images that are later displayed in the application. This service is the target of the auto-scaling experiment in this thesis. All of Briteback's services are packaged using Docker, a tool for creating and running containers, and are deployed on a hosting provider called DigitalOcean.
This study will be used by Briteback as groundwork for investigating an auto-scaling solution for the production environment of the company.
This chapter explains different concepts needed to understand how the method was formed and realized.
Software can follow different types of high-level architectural designs. Within web development it is common to talk about monolithic and microservice architectures. They are commonly contrasted with each other, since each is largely defined by how it differs from the other. The National Institute of Standards and Technology (NIST) defines microservices in the following way:
Microservice: A microservice is a basic element that results from the architectural decomposition of an application’s components into loosely coupled patterns consisting of self contained services that communicate with each other using a standard communications protocol and a set of well defined API’s, independent of any vendor, product or technology.
NIST defines a microservice as a part of a bigger application. The individual service has to be able to work, mostly, on its own. It needs to be connected with other microservices that together form the application, as visible to the end user.
The definition given by NIST fits well with the domain. The individual servers are of varying sizes and each is characterized by its available interface (API) as well as the human-defined “aim” of the server. The definition also fits well with other research in the area, such as the definitions given by Dragoni et al.
Monolithic applications, as opposed to microservices, are applications in which there is only one server managing the application. Even though “monolithic” usually has negative connotations from a programming perspective, it is important to understand that a monolithic application is not, by definition, worse than a microservice architecture. Instead it has different benefits, drawbacks and challenges compared to a microservice architecture.
Research comparing monolithic and microservice applications suggests that microservices cost less when deployed in the cloud. Villamizar et al. compared the same application written with a monolithic and with a microservice architecture and found that infrastructure costs could be reduced by up to 77% when deploying through AWS Lambda. Dragoni et al. also argue that a microservice architecture helps with many modern software engineering practices. Services with well-defined boundaries help programmers find bugs faster. Several smaller services are well suited to containerization, which improves portability and security. Scaling several smaller services individually allows the scaling to be fine-tuned and can save costs, whereas a monolith has to increase the capacity of every component at once.
A conceptual image of what a microservice architecture looks like can be seen in Figure 3.1. The figure depicts a monolith architecture and a microservice architecture. In the monolith all software components communicate with each other and access the same database. In the microservice architecture each software component has its own (smaller) database and can run fully independently. However, the components need to communicate with other microservices over the network using a standard protocol, such as HTTP.
Figure 3.1: Microservices are built around single components that provide a standalone service, while monoliths are centralized systems.
ECMAScript and Node
Virtualization is a concept used in computing that means abstracting away from the underlying architecture. Virtualization is used to sandbox different environments. A common use-case is to run a Virtual Machine (VM) with a different Operating System (OS) than that of the original machine. This allows for running a Linux machine inside of a Windows machine and vice versa. We use the word “host” to refer to the actual machine running the virtualization environment and “guest” to refer to the VM, and say that the guest machine(s) run on the host. Intuitively, a virtualized environment should run slower than code running directly on the host. While this is mostly correct, there are virtualization tools that run with very little overhead.
Apart from virtual machines there is also a technique called containerization, which uses “containers” instead of virtual machines. A key difference is that a VM runs on virtual hardware with its own kernel, while a container shares the kernel of the host OS and only has limited access to the host. Containers run faster than VMs and also start faster. There is also research suggesting that the performance difference between running code in a container and running it directly on the host is negligible.
Virtualization has properties that are useful to software applications. In this section we mention some of the benefits and describe them briefly.
Portability: An application defined inside of a virtualized environment can run anywhere the virtualized environment can run. This property becomes very useful when an application is hosted in the cloud, on hardware provided by someone else.
Security: Applications deployed inside a virtualized environment typically cannot access the host. Because of the sandboxed environment an attacker typically has to compromise the application and then escape the (restricted) virtualized environment to compromise the host.
Reliability: With virtual machines and containerization the running environment can be treated as software, instead of being part of the hardware. A failure is then only a software failure and not a hardware failure. This makes restarting the virtualized environment a valid fallback tactic for crashes, which is much harder to realize when actual hardware fails.
Docker is used to create software containers. It uses a scripting language to define what is called an image. The image defines what should exist in a container, whether it should extend a previous image and how it should be configured. From this image the Command Line Interface (CLI) can spawn containers, and once they are spawned the CLI can access the containers to change the running environment or run programs in it. Docker also has a repository of shared images, called Docker Hub, that can be used and extended, similar to how open source communities share code on GitHub. Figure 3.2 shows a conceptual image of how running Docker containers on a single computer works. An example of an image definition is found in Listing 3.1.
Docker can also run in a mode called swarm mode, or Docker Swarm. It allows for managing a set of containers each providing its own service, as well as running duplicates of services for redundancy. Built into Docker Swarm are features commonly used when deploying applications, such as load balancing and automatic failover. Docker Swarm acts as a deployment agent, starting and stopping containers when it is told to. It decides where containers will run, when they will start, and when to restart containers that have failed or crashed. It uses the same API as the Docker CLI does, meaning that starting new containers works in the same way as when a human starts a container through the Docker CLI, as depicted in Figure 3.2.
Listing 3.1: An example “Dockerfile” that creates an application image extended from the Ubuntu image, puts the application in /app and exposes port 80 (declaring that the container listens on port 80; the port still has to be published when the container is started to be reachable through the host). Lastly, when the image is run as a container, it will run run-script.sh with “start” as the first command line argument.
For this thesis we use Docker to build an application image. Generally, an image would be built for each microservice. Images can also be less specific, for example a web server or a database. The setup, which utilizes swarm mode, is defined through a Docker Compose File. This is a file following the YAML data standard and it defines a number of services run by Docker Swarm. Each service is based on an image, which in this case corresponds to the application images. Each service also has a number of properties that can be specified through the Compose File, such as environment variables, container replicas, volumes, constraints and networks. An example of a Docker Compose File can be seen in Listing 3.2.
Listing 3.2: An example Docker Compose File (version 3) that creates a single service called “rethink”. The Docker image used is called “rethinkdb” and the ports 29015, 28015 and 8080 are used by the image. Additionally, /root/rethinkdb on the host is mapped to /data in the container.
Docker allows Compose Files to define volumes, which are directories mounted inside the container. A container can save data to the host by saving it in the given volume. It also works in the reverse direction: the host can save data to a volume in order for the container to read it. Volumes are a common way of giving a container access to company secrets, such as SSH keys or authentication tokens. Images are stored in the cloud, so secrets should not be stored inside the image, as anyone with the image can run the container.
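As an illustration of the secret handling described above, the following minimal Python sketch reads a secret from a file in a mounted directory instead of baking it into the image. The function name `read_secret`, the file name `api_token` and the token value are hypothetical, and a temporary directory stands in for the volume that the host would mount into the container.

```python
import tempfile
from pathlib import Path


def read_secret(name: str, secrets_dir: Path) -> str:
    """Return the secret stored in <secrets_dir>/<name>, stripped of whitespace."""
    secret_path = secrets_dir / name
    if not secret_path.is_file():
        raise FileNotFoundError(f"secret {name!r} is not mounted in {secrets_dir}")
    return secret_path.read_text(encoding="utf-8").strip()


# Host side (simulated): place a token in the directory that would be
# mounted into the container as a volume. All names here are hypothetical.
with tempfile.TemporaryDirectory() as mounted_dir:
    Path(mounted_dir, "api_token").write_text("s3cr3t-token\n", encoding="utf-8")

    # Container side: the application only knows the mount path.
    token = read_secret("api_token", Path(mounted_dir))
```

Because the secret only exists in the mounted directory, the image itself can be shared without exposing it.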
When working in swarm mode it is also possible to define constraints in a Compose File. Constraints restrict where a service is allowed to run. For example, running a database on the weakest of four nodes may hamper application performance if the database is a bottleneck. Or perhaps the application is CPU intensive and it is important to run the CPU intensive tasks on the node with the most CPUs. This can be achieved with constraints by specifying that a service needs a minimum amount of CPU or memory, or by specifying the node that it should run on. Labels can be added manually to nodes through the Docker CLI, and constraints can specify the labels a node must have for the service to run on it.
Compose Files allow for custom networks to be defined as well. These are networks that join services together. For example, an application may have a database which every other service needs to be able to connect to, but it also has a trusted store, in which sensitive information is kept, that can only be accessed by a few of the other services. In that case it would be possible to create two different networks, one with the database and one with the trusted store. In the Compose File you would then specify that the services with permission to access the trusted store are part of the same network as the trusted store. This scenario is depicted in Figure 3.3.
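The idea of constraint-based placement can be sketched in a few lines of Python. This is not Docker Swarm's scheduler, only an illustration of filtering nodes by resource minimums and labels; the classes `Node` and `Constraint`, the node names and the label `role=db` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpus: int
    memory_gb: int
    labels: dict[str, str] = field(default_factory=dict)


@dataclass
class Constraint:
    min_cpus: int = 0
    min_memory_gb: int = 0
    required_labels: dict[str, str] = field(default_factory=dict)


def eligible_nodes(nodes: list[Node], constraint: Constraint) -> list[Node]:
    """Keep only the nodes that satisfy every part of the constraint."""
    return [
        node
        for node in nodes
        if node.cpus >= constraint.min_cpus
        and node.memory_gb >= constraint.min_memory_gb
        and all(node.labels.get(k) == v for k, v in constraint.required_labels.items())
    ]


# Hypothetical three-node swarm; only node-2 carries the "role=db" label.
nodes = [
    Node("node-1", cpus=1, memory_gb=2),
    Node("node-2", cpus=4, memory_gb=8, labels={"role": "db"}),
    Node("node-3", cpus=8, memory_gb=16),
]

db_nodes = eligible_nodes(nodes, Constraint(min_cpus=2, required_labels={"role": "db"}))
cpu_nodes = eligible_nodes(nodes, Constraint(min_cpus=4))
```

Here the database constraint leaves only `node-2`, while the CPU-only constraint leaves `node-2` and `node-3`.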
Figure 3.3: An example with Docker Compose service definition. A is a network that connects all four services to a common database. B is a network that connects a subset of services to the trusted store.
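The two-network scenario can be modelled as sets of members, where two services can reach each other only if they share at least one network. The sketch below is a simplification for illustration; the service names and the choice of which services belong to network B are hypothetical.

```python
def can_communicate(networks: dict[str, set[str]], a: str, b: str) -> bool:
    """Two services can talk only if at least one network contains both."""
    return any(a in members and b in members for members in networks.values())


# Network A joins all four services with the database; network B joins a
# subset of the services with the trusted store (membership is assumed).
networks = {
    "A": {"service-1", "service-2", "service-3", "service-4", "database"},
    "B": {"service-1", "service-2", "trusted-store"},
}

allowed = can_communicate(networks, "service-1", "trusted-store")
blocked = can_communicate(networks, "service-3", "trusted-store")
```

With this membership, `service-1` can reach the trusted store while `service-3` can only reach the database and the other services.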
In order to run the configurations in a Compose File you can deploy it to a Docker Swarm. The swarm will then try to deploy all the containers in a way that satisfies the constraints and requirements of the Compose File. The file can be updated and redeployed, and the swarm will try to update the running services accordingly. It is also possible to manually update parameters defined in the Compose File, for example scaling the number of container replicas for a single service.
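Deploying and redeploying can be thought of as reconciling a desired state with the running state. The following Python sketch illustrates that idea under the assumption that state is just a replica count per service; it is not how Docker Swarm is implemented, and the service name “legacy” is hypothetical.

```python
def reconcile(desired: dict[str, int], running: dict[str, int]) -> list[tuple[str, str, int]]:
    """Return (action, service, count) steps that move `running` towards `desired`."""
    actions = []
    for service in sorted(set(desired) | set(running)):
        diff = desired.get(service, 0) - running.get(service, 0)
        if diff > 0:
            actions.append(("start", service, diff))
        elif diff < 0:
            actions.append(("stop", service, -diff))
    return actions


# The redeployed file asks for three image replicas and no longer
# mentions the (hypothetical) "legacy" service.
desired = {"image": 3, "rethink": 1}
running = {"image": 1, "rethink": 1, "legacy": 2}

steps = reconcile(desired, running)
```

In this example two more `image` containers are started, both `legacy` containers are stopped, and `rethink` is left untouched.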
Figure 3.4: Horizontal and vertical elasticity. Vertical means scaling individual capabilities of computing resources and horizontal means scaling the number of computing resources.
There are many definitions of elasticity. Here we will list key elements of elasticity and compare the literature in the area.
Horizontal elasticity: Increase or decrease the number of computing resources.
Vertical elasticity: Increase or decrease the capacity of available computing resources.
The definitions of horizontal and vertical elasticity are taken from Al-Dhuraibi et al. Horizontal and vertical elasticity are orthogonal concepts, and cloud providers can provide both at the same time. It is also possible to impose restrictions on how the scaling works, for example by restricting vertical elasticity to only work between computing resources of the same pricing. Even though the elasticity is constrained in this way, it is still vertical elasticity. The concept of vertical and horizontal elasticity is shown in Figure 3.4.
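The orthogonality of the two kinds of elasticity can be made concrete with a small Python sketch. The `Cluster` model, which measures capacity only in CPUs per instance, and the starting size are assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cluster:
    instances: int           # number of computing resources
    cpus_per_instance: int   # capacity of each computing resource

    @property
    def total_cpus(self) -> int:
        return self.instances * self.cpus_per_instance


def scale_horizontally(cluster: Cluster, delta: int) -> Cluster:
    """Change the number of instances, keeping at least one."""
    return replace(cluster, instances=max(1, cluster.instances + delta))


def scale_vertically(cluster: Cluster, delta: int) -> Cluster:
    """Change the capacity of every instance, keeping at least one CPU."""
    return replace(cluster, cpus_per_instance=max(1, cluster.cpus_per_instance + delta))


base = Cluster(instances=2, cpus_per_instance=2)
wider = scale_horizontally(base, 2)    # more instances, same size each
taller = scale_vertically(base, 2)     # same instances, more CPUs each
both = scale_vertically(wider, 2)      # the two dimensions combine freely
```

Both `wider` and `taller` reach the same total capacity by different routes, and applying one kind of scaling does not restrict the other.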
Over-provisioning: auto-scaling that has resulted in having a higher supply of processing power than demand.
Under-provisioning: auto-scaling that has resulted in having a higher demand than available processing power.
These two definitions are given by Al-Dhuraibi et al. They are not the only definitions of over-provisioning and under-provisioning. Ai et al. define three states of the system instead of two: apart from the two above, they define the Just-in-Need state. This state is supposed to capture when the application runs at optimal scale, that is, when the supply closely matches the demand. However, an issue with the definitions of Ai et al. is how the states are derived from the number of requests to the system and the number of VMs the system has available. The Just-in-Need state is defined as a ⋅ i < j ≤ b ⋅ i, where i is the number of requests, j is the number of available VMs, and a and b are constants such that a < b. In the paper they specifically use the values 1 and 3 for a and b, respectively. These values are not justified, but it is mentioned that they will need to be modified depending on the cloud platform and context.
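The Just-in-Need condition can be written down directly as a small Python function. Only the inequality a ⋅ i < j ≤ b ⋅ i and the values a = 1, b = 3 come from the text; how the remaining cases map to under- and over-provisioning (j ≤ a ⋅ i and j > b ⋅ i, respectively) is an assumption of this sketch, as are the function and parameter names.

```python
def provisioning_state(requests: int, vms: int, a: float = 1, b: float = 3) -> str:
    """Classify a system state from the request count i and the VM count j."""
    if not a < b:
        raise ValueError("a must be smaller than b")
    if a * requests < vms <= b * requests:
        return "just-in-need"
    # Assumed mapping of the remaining cases (not stated in the text above).
    if vms <= a * requests:
        return "under-provisioning"
    return "over-provisioning"


states = [provisioning_state(requests=10, vms=j) for j in (10, 20, 30, 31)]
```

With 10 requests, 10 VMs fall on the lower bound and count as under-provisioned, 20 and 30 VMs are Just-in-Need, and 31 VMs are over-provisioned, which shows how strongly the outcome depends on the chosen constants.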
Scalability: The ability of the system to sustain increasing workloads by increasing resources.
Automation: The degree to which the system is able to scale without interaction from something that is not part of the system.
Optimization: The degree of optimization for the application run by the system.
The above definitions are used by Al-Dhuraibi et al. to define and summarize elasticity. Their paper defines elasticity as the combination of Scalability, Automation and Optimization. For elasticity you need auto-scaling, or you cannot handle an ever increasing workload, but the optimization of the application itself is also important. If the application is not built to scale then it is hard to achieve an elastic application with only auto-scaling. Even though there are limits to elastic provisioning with inelastic or unoptimised programs, there is research suggesting non-elastic programs can become elastic with software tools for elastic configuration during runtime.