-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathQoS.tex
59 lines (49 loc) · 5.27 KB
/
QoS.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
In this section various methods for maintaining a specific \acf{QoS} level per tenant will be discussed.
The main challenges here are the metrics to use for measuring the current resource usage per tenant and how to separate tenants in such a way that the behaviour of one tenant will not affect the other tenants~\cite{krebs2013metrics}.
In many cases tenants in a multi-tenant application will have some form of \ac{SLA} with the provider of the application.
Often these SLAs contain certain \ac{QoS} constraints the provider will have to meet, such as response time or minimum uptime.
The tenant on the other hand will have to adhere to a \ac{FUP} or have certain fixed quotas for resource usage.
In a multi-tenant application the available resources should be fairly distributed among tenants.
Krebs~\cite{krebs2013metrics} defines a system as fair when the following criteria are met:
\begin{enumerate}
\item Tenants who do not exceed their quota should not be affected by tenants who do.
\item Tenants who do exceed their quota should experience performance degradation.
\item Tenants with a higher quota should experience better performance compared to tenants with a lower quota.
\end{enumerate}
In the list 'exceeding their quota' can mean exceeding an explicitly defined usage limit or, in the case of a fair use policy, unfair use of the system.
The primary mechanism for ensuring that these fairness criteria are met is finding methods for separating the performance of different tenants.
Several techniques for achieving this will be discussed in the next section.
\subsection{Tenant Separation Techniques}
In multi-tenant applications the primary technique proposed for maintaining \ac{QoS} for multiple tenants is to limit the amount of requests a tenant can execute using an entry point, or load balancer, that is aware of the resource consumption per tenant.
Walraven et al.~\cite{walraven2012towards}, Krebs et al.~\cite{krebs2013metrics} and Lin et al.~\cite{lin2009feedback} all propose methods that include request limiting in some form.
The method proposed by Walraven et al.~\cite{walraven2012towards} uses a pluggable middleware framework for performance isolation.
In this middleware reside a tenant aware profiler, a tenant categorizer and a scheduler.
The profiler gathers monitoring data from the rest of the system and the scheduler.
The data from the profiler and data concerning the \acp{SLA} of the tenants is used by the categorizer to put the tenants in one of three groups: passive, normal, or aggressive.
Finally the scheduler is responsible for the actual performance isolation.
The scheduler contains multiple request queues and requests are assigned to queues based on their category.
The system of queues allows, for instance, to process requests from aggressive tenants less frequently.
Their prototype was effective in isolating a single aggressive tenant from his normal co-tenants.
Krebs~\cite{krebs2013metrics} evaluates several tenant separation methods and finds that a two-queue system with a quota monitor provides very good tenant isolation.
In this evaluations the system first experiences normal load, in a second run a single disruptive tenant is added to test tenant isolation.
In this system requests from tenants that exceed their quota get placed in a blacklist queue and other requests in a whitelist queue.
Only when the whitelist is empty blacklisted requests will be serviced.
The concept of using multiple queues to categorize and prioritize tenant requests is very similar to the method proposed by Walraven et al.~\cite{walraven2012towards}.
Lin~\cite{lin2009feedback} proposes a system that not only limits number of requests requests per tenant but also dynamically assigns resources to tenants based on the admission rate.
What sets his method apart is that it uses a feedback loop to reassign resources and increase or decrease the admission rates to correct for errors in the initial estimations.
This controller is tested using three tenants that have different \acp{SLA}.
The controller manages to maintain the \ac{QoS} levels defined in the tenant specific \acp{SLA} under various load conditions.
\subsection{Research Agenda for \ac{QoS}}\label{sec:qos_agenda}
In this section several approaches for separating tenants and maintaining good \ac{QoS} have been discussed.
The approaches proposed by Walraven~\cite{walraven2012towards} and Krebs~\cite{krebs2013metrics} both showed promising results in experiments using a single disruptive tenant.
Lin's method was effective at maintaining the different \ac{QoS} levels per tenant.
Important topics for future research will be:
\begin{itemize}
\item \textbf{Multiple disruptive tenants}.
Multiple disruptive tenants is a case that has not been tested yet.
However this is something that should be handled for proper tenant separation.
Therefore the approaches of presented in this section must be tested with multiple disruptive tenants and improved if needed.
\item \textbf{Metrics for tenant load}.
The metrics used by Krebs and Walraven are request response time and response count.
However, as also discussed in the scalability section, there are more metrics that may be useful for measuring tenant separation and maintaining a good \ac{QoS} level. Such metrics must be identified and must have their usefulness assessed.
\end{itemize}