Dynamical Modeling of Cloud Applications for Runtime Performance Management
Author
Summary, in English
In this thesis, performance models of cloud applications are studied. In particular, we focus on modeling using queueing theory and on the fluid model for approximating the often intractable dynamics of the queue lengths. First, existing results on how the fluid model can be obtained from the mean-field approximation of a closed queueing network are simplified and extended to allow for mixed networks. The queues are allowed to follow the processor sharing or delay disciplines, and can have multiple classes with phase-type service times. An improvement to this fluid model is then presented to increase accuracy when the system size, i.e., number of servers, initial population, and arrival rate, is small. Furthermore, a closed-form approximation of the response time CDF is presented. The methods are tested in a series of simulation experiments and shown to be accurate.
This mean-field fluid model is then used to derive a general fluid model for microservices with interservice delays. The model is shown to be completely extractable at runtime in a distributed fashion. It is further evaluated on a simple microservice application and found to accurately predict important performance metrics in most cases. Furthermore, a method is devised to reduce the cost of a running application by tuning load balancing parameters between replicas. The method is built on gradient stepping by applying automatic differentiation to the fluid model. This allows for arbitrarily defined cost functions and constraints, most notably including different response time percentiles. The method is tested on a simple application distributed over multiple computing clusters and is shown to reduce costs while adhering to percentile constraints.
Finally, modeling of request cloning is studied using the novel concept of synchronized service. This allows certain forms of cloning over servers, each modeled with a single queue, to be equivalently expressed as one single queue. The concept is very general regarding the involved queueing discipline and distributions, but in exchange introduces new, less realistic assumptions. How the equivalent queue model is affected by relaxing these assumptions is studied for the processor sharing discipline, and an extension to enable modeling of speculative execution is made. In a simulation campaign, it is shown that these relaxations have only a minor effect in certain cases.
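As a rough illustration of what a fluid model of a queue looks like, the sketch below integrates the textbook fluid approximation of a single multi-server queue, dx/dt = λ − μ·min(x, k), with a forward-Euler scheme. This is not the thesis's mean-field model (which covers networks, multiple classes, and phase-type service times); the function name, parameter names, and the numerical values are assumptions chosen only for the example.

```python
def simulate_fluid_queue(arrival_rate, service_rate, servers,
                         x0=0.0, dt=0.01, t_end=50.0):
    """Forward-Euler integration of dx/dt = arrival_rate - service_rate * min(x, servers).

    x is the (real-valued) fluid queue length; all names here are hypothetical.
    Returns the list of queue lengths at each time step, starting with x0.
    """
    steps = int(round(t_end / dt))
    x = x0
    trajectory = [x]
    for _ in range(steps):
        # Work is served at rate service_rate per busy server, at most `servers` busy.
        dx = arrival_rate - service_rate * min(x, servers)
        # The fluid level cannot become negative.
        x = max(0.0, x + dt * dx)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # Assumed example values: 3 requests/s arriving, 4 servers at 1 request/s each.
    traj = simulate_fluid_queue(arrival_rate=3.0, service_rate=1.0, servers=4)
    print(f"final fluid queue length: {traj[-1]:.3f}")
```

When the arrival rate is below the total service capacity, the trajectory settles at arrival_rate / service_rate, which is the stationary point of the differential equation above.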
Department/s
Publishing year
2022-10-25
Language
English
Full text
Document type
Doctoral thesis
Publisher
Department of Automatic Control, Lund Institute of Technology, Lund University
Subject
- Control Engineering
Keywords
- Cloud computing
- Performance modeling
- Queueing theory
- Processor sharing
- Mean-field approximation
- Fluid model
- Microservices
- Load balancing
- Request cloning
Status
Published
Project
- Event-Based Information Fusion for the Self-Adaptive Cloud
Supervisor
ISBN/ISSN/Other
- ISBN: 978-91-8039-393-5
- ISBN: 978-91-8039-394-2
Defence date
18 November 2022
Defence time
10:15
Defence location
Lecture hall C, building KC4, Naturvetarvägen 18, Lund. Zoom: https://lu-se.zoom.us/j/65465034738
Opponent
- Giuliano Casale (Reader)