[haizea-commit] r522 - trunk/doc/manual

haizea-commit at mailman.cs.uchicago.edu haizea-commit at mailman.cs.uchicago.edu
Fri Sep 26 13:22:54 CDT 2008


Author: borja
Date: 2008-09-26 13:22:50 -0500 (Fri, 26 Sep 2008)
New Revision: 522

Modified:
   trunk/doc/manual/intro.tex
   trunk/doc/manual/leases.tex
   trunk/doc/manual/quickstart.tex
   trunk/doc/manual/simulation.tex
   trunk/doc/manual/title.tex
   trunk/doc/manual/whatis.tex
Log:
Wrote missing sections in manual, almost ready for TP1.2

Modified: trunk/doc/manual/intro.tex
===================================================================
--- trunk/doc/manual/intro.tex	2008-09-25 18:31:05 UTC (rev 521)
+++ trunk/doc/manual/intro.tex	2008-09-26 18:22:50 UTC (rev 522)
@@ -1,3 +1,17 @@
-\section*{What is Haizea?}
+
+Bla bla bla
+
 \section*{How to read this manual}
-\section*{Document conventions}
\ No newline at end of file
+\section*{Document conventions}
+
+\begin{shellverbatim}
+echo 'This shows something you have to type into your console'
+\end{shellverbatim}
+
+\begin{wideshellverbatim}
+This shows the contents of a file
+\end{wideshellverbatim}
+
+\begin{warning}
+This is a warning. It indicates that you should proceed with caution.
+\end{warning}
\ No newline at end of file

Modified: trunk/doc/manual/leases.tex
===================================================================
--- trunk/doc/manual/leases.tex	2008-09-25 18:31:05 UTC (rev 521)
+++ trunk/doc/manual/leases.tex	2008-09-26 18:22:50 UTC (rev 522)
@@ -1,20 +1,19 @@
-
 Let's say you need computational resources...
 
 Maybe you're a scientist who needs to run some simulations. You have specific hardware requirements, but you're not particularly picky about when the simulations run, and probably won't even notice if they are interrupted at some point, as long as they finish running (correctly) at some point, maybe before a given deadline. A job scheduler, and a compute cluster, would probably be a good fit for you.
 
-You could also be a software developer who wants to test his or her code on a pristine machine, which you would only need for a relatively short period of time. Plus, every time you use this machine, you'd like it to start up with the exact same pristine software environment. Oh, and you want your machine now. As in right now. One option could be to install a virtual machine (VM) manager (such as Xen, VMWare, etc.) to start up these pristine machines as VMs on your own machine. Even better, you could go to a cloud (like Amazon EC2 or the Science Clouds) and have those VMs appear automagically somewhere else, so you don't have to worry about setting up the VM manager or having a machine powerful enough to run those VMs.
+You could also be a software developer who wants to test his or her code on a pristine machine, which you would only need for a relatively short period of time. Plus, every time you use this machine, you'd like it to start up with the exact same pristine software environment. Oh, and you want your machine now. As in \emph{right now}. One option could be to install a virtual machine (VM) manager (such as Xen, VMWare, etc.) to start up these pristine machines as VMs on your own machine. Even better, you could go to a cloud (like Amazon EC2, \url{http://www.amazon.com/ec2/}, or the Science Clouds, \url{http://workspace.globus.org/clouds/}) and have those VMs appear automagically somewhere else, so you don't have to worry about setting up the VM manager or having a machine powerful enough to run those VMs.
 
 Or perhaps you're a run-of-the-mill geek who wants his or her own web/mail/DNS/etc server. This server will presumably be running for months or even years with high availability: your server has to be running all the time, with no interruptions. There's a whole slew of hosting providers who can give you a dedicated server or a virtual private server. The latter are typically managed with VM-based datacenter managers.
 
 As you can see, there are a lot of resource provisioning scenarios in nature. However, the solutions that have emerged tend to be specific to a particular scenario, to the exclusion of other ones. For example, while job-based systems are exceptionally good at managing complex batch workloads, they're not too good at provisioning resources at specific times (some job-based systems do offer advance reservations, but they have well-known utilization problems) or at giving users unfettered access to provisioned resources (forcing them, instead, to interact with the resources through the job abstraction).
 
-A lease is a general resource provisioning abstraction that could be used to satisfy a variety of use cases, such as the ones described above. In our work, we've defined a lease as "a negotiated and renegotiable agreement between a resource provider and a resource consumer, where the former agrees to make a set of resource available to the latter, based on a set of lease terms presented by the resource consumer". In our view, the lease terms must include the following dimensions:
+A lease is a general resource provisioning abstraction that could be used to satisfy a variety of use cases, such as the ones described above. In our work, we've defined a lease as a ``negotiated and renegotiable agreement between a resource provider and a resource consumer, where the former agrees to make a set of resources available to the latter, based on a set of lease terms presented by the resource consumer''. In our view, the lease terms must include the following dimensions:
 
 \begin{description}
  \item[Hardware] The hardware resources (CPU, memory, etc.) required by the resource consumer.
  \item[Software] The software environment that must be installed in those resources.
- \item[Availability] The period during which the hardware and software resources must be available. It is important to note that the availability period can be specified in a variety of ways, like "just get this to me as soon as you can", "I need this from 2pm to 4pm on Mondays, Wednesdays, and Fridays (and, if I don't get exactly this, I will be a very unhappy resource consumer)", or even "I need four hours sometime before 5pm tomorrow, although if you get the resources to me right now I'll settle for just two hours". A lease-based system must be able to efficiently combine all these different types of availability.
+ \item[Availability] The period during which the hardware and software resources must be available. It is important to note that the availability period can be specified in a variety of ways, like ``just get this to me as soon as you can'', ``I need this from 2pm to 4pm on Mondays, Wednesdays, and Fridays (and, if I don't get exactly this, I will be a very unhappy resource consumer)'', or even ``I need four hours sometime before 5pm tomorrow, although if you get the resources to me right now I'll settle for just two hours''. A lease-based system must be able to efficiently combine all these different types of availability.
 \end{description}
 
 Furthermore, if you don't get any of these dimensions, then you're being shortchanged by your resource lessor. For example, Amazon EC2 is very good at providing exactly the software environment you want, and reasonably good at providing the hardware you want (although you're limited to a few hardware configurations), but not so good at supporting a variety of availability periods.
@@ -27,15 +26,20 @@
 \item[Immediate leases] Resources must be provisioned right now, or not at all.
 \end{description}
 
-Although there are many systems (particularly job-based systems) that support these two types availability, Haizea differs in that it efficiently schedules heterogeneous workloads (combining best-effort and AR leases), overcoming the utilization problems typically resulting from using ARs. Haizea does this by using virtual machines to implement leases. Virtual machines also enable Haizea to provide exactly the hardware and software requested by the user. Additionally, Haizea also manages the overhead of preparing a lease, to make sure that any deployment operations (such as transferring a VM disk image) are taken care of before the start of a lease, instead of being deducted from the lessee's allocation.
+Although there are many systems (particularly job-based systems) that support the first two types of availability, Haizea differs in that it efficiently schedules heterogeneous workloads (combining best-effort and AR leases), overcoming the utilization problems that tend to occur when using ARs. Haizea does this by using virtual machines to implement leases. Virtual machines also enable Haizea to provide exactly the hardware and software requested by the user. Additionally, Haizea manages the overhead of preparing a lease, to make sure that any deployment operations (such as transferring a VM disk image) are taken care of before the start of a lease, instead of being deducted from the lessee's allocation.
 
 In the future, Haizea will support additional lease types, such as urgent leases, periodic leases, deadline-driven leases, etc.
 
+\section{Supported types of leases}
 
-Haizea supports a variety of resource leases. There's a more detailed description of what a lease is in the What is Haizea? page, and this page just describes the supported types of leases. Throughout this page, let's assume you have a 4-node cluster, and that you want to lease parts of that cluster over time. We'll represent the four nodes over time like this:
+To better illustrate the types of leases supported in Haizea, let's assume you have a 4-node cluster, and that you want to lease parts of that cluster over time. We'll represent the four nodes over time like this:
 
-\section{"Advance Reservation" lease}
+\begin{center}
+\includegraphics{images/quickstart_leasegraph1.png}
+\end{center}
 
+\subsection{``Advance Reservation'' lease}
+
 An advance reservation, or AR, lease is a lease that must begin and end at very specific times. For example, the following lease starts at 1pm and ends at 2pm:
 
 \begin{center}
@@ -44,7 +48,7 @@
 
 Haizea can schedule this type of lease, which is particularly useful when you need resources at a specific time (for example, to coincide with a lecture, an experiment, etc.)
 
-\section{Preemptible best-effort lease}
+\subsection{Preemptible best-effort lease}
 
 Sometimes, you know you need resources, but you don't need them at a specific time. In fact, you're perfectly content to wait until there are enough resources available for your lease:
 
@@ -53,7 +57,7 @@
 \end{center}
 
 
-When you request a best-effort lease, your request gets placed in a queue, which is processed in a first-come-first-serve basis (the queue uses backfilling algorithms to improve resource management). The downside of this type of lease, of course, is that you may have to wait a while until resources are allocated to your lease:
+When you request a best-effort lease, your request gets placed in a queue, which is processed on a first-come-first-served basis (the queue uses backfilling algorithms to improve resource utilization). The downside of this type of lease, of course, is that you may have to wait a while until resources are allocated to your lease:
 
 \begin{center}
 \includegraphics{images/lease_be2.png}
@@ -65,13 +69,13 @@
 \includegraphics{images/lease_be3.png}
 \end{center}
 
-Preemptible best-effort leases are good for running batch jobs, or any non-interactive work. The Haizea paper Combining Batch Execution and Leasing Using Virtual Machines showed how using the suspend/resume capability of virtual machines allowed AR and best-effort leases to be scheduled together efficiently, overcoming the utilization problems typically associated with ARs.
+Preemptible best-effort leases are good for running batch jobs, or any non-interactive work. The Haizea paper ``Combining Batch Execution and Leasing Using Virtual Machines'' showed how using the suspend/resume capability of virtual machines allowed AR and best-effort leases to be scheduled together efficiently, overcoming the utilization problems typically associated with ARs.
 
-\section{Non-preemptible best-effort lease}
+\subsection{Non-preemptible best-effort lease}
 
 But what if you're willing to wait for your resources to become available, but don't want them to be preempted? (e.g., if you want to use them interactively). Well, it's as simple as requesting a non-preemptible best-effort lease. Once your request makes it through the queue, and your lease is allocated resources, no one is taking them away.
 
-\section{Immediate lease}
+\subsection{Immediate lease}
 
 In some cases, you may need resources now. As in \emph{right now}:
 
@@ -80,15 +84,15 @@
 \end{center}
 
 
-Furthermore, if you can't get them right now, you're just not interested in anything else the resource provider has to offer. You're not going to request resources in the future, and you're certainly not going to be put on a queue. This is essentially the type of lease that many cloud systems offer (although the definition of "right now" varies wildly). Take into account that an immediate lease may still take a while to setup (VM image deployment, etc.). This type of lease in Haizea may evolve in the future into an "urgent lease", where "right now" really does mean "right now".
+Furthermore, if you can't get them right now, you're just not interested in anything else the resource provider has to offer. You're not going to request resources in the future, and you're certainly not going to be put on a queue. This is essentially the type of lease that many cloud systems offer (although the definition of ``right now'' varies wildly). Take into account that an immediate lease may still take a while to set up (VM image deployment, etc.). This type of lease in Haizea may evolve in the future into an ``urgent lease'', where ``right now'' really does mean ``right now''.
 
-\section{Coming soon}
+\subsection{Coming soon\ldots}
 
-The following types of leases are coming soon to a lease manager near you:
+In the future, Haizea will support more types of leases, such as best-effort leases with deadlines and leases requiring a non-trivial negotiation before the lease is accepted.
 
-\subsection{Best-effort with deadlines}
+\subsubsection{Best-effort with deadlines}
 
-In some cases, when you say "best effort", you really mean "best effort, but be reasonable". Sure, you're willing to wait for your resources, but you may need them before a deadline.
+In some cases, when you say ``best effort'', you really mean ``best effort, but be reasonable''. Sure, you're willing to wait for your resources, but you may need them before a deadline.
 
 \begin{center}
 \includegraphics{images/lease_deadline.png}
@@ -97,7 +101,7 @@
 
 For example, let's say you want a 16-node cluster sometime today to run a test program. You're not particularly picky about when you get the cluster, as long as it happens today and you're given sufficient warning of when your lease will be available. In the future, you will be able to tell Haizea that you have a deadline, and Haizea will either get the resources to you by then, or tell you that the deadline is simply unfeasible.
 
-\subsection{Negotiated leases}
+\subsubsection{Negotiated leases}
 
 If you've ever entered into any sort of non-computational lease agreement, you know that agreeing on the lease terms rarely involves the lessor instantly being on the same page as you. Rather, it involves a fair amount of haggling. Besides, if your computational needs are flexible, so should your lease manager (c'mon, are you sure you mean "exactly at 2pm"? maybe you meant to say "at some point this afternoon"?). In the future, you will be able to negotiate your leases with Haizea:
 

Modified: trunk/doc/manual/quickstart.tex
===================================================================
--- trunk/doc/manual/quickstart.tex	2008-09-25 18:31:05 UTC (rev 521)
+++ trunk/doc/manual/quickstart.tex	2008-09-26 18:22:50 UTC (rev 522)
@@ -2,9 +2,9 @@
 
 \section{The \texttt{haizea} command}
 
-The main command in the Haizea system is, unsurprisingly, the \texttt{haizea} command. Running this command starts up the Haizea lease manager, which is then ready to receive and schedule lease requests. As described in Chapter~\ref{chap:whatis}, Haizea can run in one of three modes: Simulation mode with simulated time, Simulation mode with real time, and OpenNebula mode. In this chapter we will focus on the simulation modes, starting with the ``simulated time'' variety. Both simulation modes, and the OpenNebula mode, will be described in more detail in the next chapters.
+The main command in the Haizea system is, unsurprisingly, the \texttt{haizea} command. Running this command starts up the Haizea lease manager, which is then ready to receive and schedule lease requests. As described in Chapter~\ref{chap:whatis}, Haizea can run in one of three modes: unattended simulation mode, interactive simulation mode, and OpenNebula mode. In this chapter we will focus on the simulation modes, starting with the ``unattended'' variety. Both simulation modes, and the OpenNebula mode, will be described in more detail in the next chapters.
 
-When running Haizea in simulation mode with simulated time, the inputs to Haizea are going to be the following:
+When running Haizea in unattended simulation mode, the inputs to Haizea are going to be the following:
 
 \begin{description}
  \item [The Haizea configuration file:] A text file containing all the options

Modified: trunk/doc/manual/simulation.tex
===================================================================
--- trunk/doc/manual/simulation.tex	2008-09-25 18:31:05 UTC (rev 521)
+++ trunk/doc/manual/simulation.tex	2008-09-26 18:22:50 UTC (rev 522)
@@ -1,13 +1,258 @@
+This chapter describes how to run Haizea in simulation mode. Since the Quickstart Guide (Chapter~\ref{chap:quickstart}) already provides a tutorial-like introduction to running simulations, this chapter is meant mostly as a reference guide, and covers the main simulation and scheduling options. However, it does not cover \emph{all} possible options in the configuration file (a description of all options and their valid values can be found in Appendix~\ref{app:conffile}). It also refers to scheduling algorithms that are not currently explained in the manual (they are described in some of the Haizea scientific publications, but these might be hard to swallow). Future versions of the Haizea manual will include a description of the main scheduling algorithms used, to better orient your choice of scheduling options. Finally, this chapter also covers how to run multiple unattended simulations.
 
+\section{Unattended simulations}
+
+Running Haizea as an unattended simulation requires setting the following options in the configuration file:
+
+\begin{wideshellverbatim}
+[general]
+...
+mode: simulated
+...
+
+[simulation]
+...
+clock: simulated
+...
+\end{wideshellverbatim}
+
+Additionally, the starting time of the simulation must be specified, along with a stopping condition:
+
+\begin{wideshellverbatim}
+[simulation]
+...
+starttime: 2006-11-25 13:00:00
+stop-when: all-leases-done | 
+           besteffort-submitted |
+           besteffort-done
+...
+\end{wideshellverbatim}
+
+
 \section{Interactive simulations}
 
-\section{Unattended simulations}
+To run Haizea as an interactive simulation, the following options must be set in the configuration file:
 
+\begin{wideshellverbatim}
+[general]
+...
+mode: simulated
+...
+
+[simulation]
+...
+clock: real
+...
+\end{wideshellverbatim}
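
The two simulation modes differ only in the \texttt{clock} option. As a minimal sketch of that distinction (this is not Haizea's own configuration loader; the function name \texttt{simulation\_kind} and the sample text are illustrative assumptions), the standard Python \texttt{configparser} module can read a file in this format and report which kind of simulation it describes:

```python
import configparser

# Hypothetical sample configuration, written out in full (no "..." placeholders).
SAMPLE = """
[general]
mode: simulated

[simulation]
clock: simulated
starttime: 2006-11-25 13:00:00
stop-when: all-leases-done
"""


def simulation_kind(text):
    """Classify a configuration as an unattended or interactive simulation."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if parser.get("general", "mode") != "simulated":
        return "not a simulation"
    clock = parser.get("simulation", "clock")
    # clock: simulated -> unattended; clock: real -> interactive
    return {"simulated": "unattended", "real": "interactive"}.get(clock, "unknown")


print(simulation_kind(SAMPLE))
```

Changing \texttt{clock: simulated} to \texttt{clock: real} in the sample makes the same function report an interactive simulation.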
+
+
+\section{Specifying the simulated physical resources}
+
+The simulated physical resources are specified using the \texttt{nodes} and \texttt{resources} options in the \texttt{[simulation]} section:
+
+\begin{wideshellverbatim}
+[simulation]
+...
+nodes: 4
+resources: CPU,1;Mem,1024;Net (in),100;Net (out),100;Disk,20000
+...
+\end{wideshellverbatim}
+
+Haizea currently only allows homogeneous resources to be specified. In other words, Haizea will manage a number of simulated physical machines, all with the same resources. The \texttt{resources} option specifies the per-node resources using a semicolon-delimited list. Each entry in the list contains a pair: a resource name and its maximum capacity. Haizea currently recognizes the following:
+
+\begin{itemize}
+\item \texttt{CPU}: Number of processors per node.
+\item \texttt{Mem}: Memory (in MB)
+\item \texttt{Net (in)}: Inbound network bandwidth (in Mbps) 
+\item \texttt{Net (out)}: Outbound network bandwidth (in Mbps) 
+\item \texttt{Disk}: Disk space in MB (not counting space for disk image cache).
+\end{itemize}
+
+These five resources must always be specified, since Haizea depends on them for fundamental resource reservations (running VMs, suspension of VMs, etc.) which involve these five types of resources. Additional resource types can be specified, but Haizea's scheduling code would have to be modified for them to be taken into account when scheduling leases. In the future, it will be possible to specify additional resources in the simulated nodes and in the lease requests with less effort.
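
To make the format of the \texttt{resources} option concrete, here is a small sketch of how such a string could be split into name/capacity pairs and scaled by the number of nodes. The helper \texttt{parse\_resources} is a hypothetical illustration, not a function from the Haizea code base:

```python
def parse_resources(spec):
    """Split 'name,capacity;name,capacity;...' into a dict of integers."""
    resources = {}
    for entry in spec.split(";"):
        # Split on the last comma, so the name itself may contain spaces
        # and parentheses, as in "Net (in)".
        name, capacity = entry.rsplit(",", 1)
        resources[name.strip()] = int(capacity)
    return resources


spec = "CPU,1;Mem,1024;Net (in),100;Net (out),100;Disk,20000"
per_node = parse_resources(spec)

# All nodes are homogeneous, so the cluster total is a simple multiple.
nodes = 4
total = {name: capacity * nodes for name, capacity in per_node.items()}

print(per_node)
print(total)
```

With the values shown above, the four simulated nodes together offer 4 CPUs and 4096 MB of memory.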
+
 \section{Scheduling options}
 
+The scheduling options control how leases are assigned to resources.
+
+\subsection{Backfilling algorithms}
+
+\begin{warning}
+NOTE: This section assumes that you are familiar with backfilling algorithms. We will try to include a brief, didactic, explanation of backfilling algorithms in future versions of the manual.
+\end{warning}
+
+Haizea supports both aggressive and conservative backfilling:
+
+\begin{wideshellverbatim}
+[scheduling]
+...
+backfilling: off | aggressive | conservative
+...
+\end{wideshellverbatim}
+
+An exact number of allowed future reservations can also be specified:
+
+\begin{wideshellverbatim}
+[scheduling]
+...
+backfilling: intermediate
+backfilling-reservations: 4
+...
+\end{wideshellverbatim}
+
+
+\subsection{Lease suspension and migration}
+
+Lease suspension can be allowed for all leases, only for 1-node leases (``serial'' leases), or not allowed at all. Additionally, Haizea can schedule suspensions and resumptions to be locally or globally exclusive:
+
+\begin{wideshellverbatim}
+[scheduling]
+...
+suspension: none | serial-only | all
+...
+\end{wideshellverbatim}
+
+When a VM is suspended, its memory is dumped to a file on disk; when it
+is resumed, that file is read back. To correctly estimate the time required to suspend
+a lease with multiple VMs, Haizea makes sure that no two 
+suspensions/resumptions happen at the same time (e.g., if eight
+memory files were being saved at the same time to disk, the disk's
+performance would be reduced in a way that is not as easy to estimate
+as if only one file were being saved at a time).
+            
+Depending on whether the files are being saved to/read from a global
+or local filesystem, this exclusion can be either global or local:
+
+\begin{wideshellverbatim}
+[scheduling]
+...
+suspendresume-exclusion: local | global
+...
+\end{wideshellverbatim}
+
+When allocating time for suspending or resuming a single virtual machine with $M$ MB of memory, and given a rate $R$ MB/s of read/write disk throughput, Haizea will estimate the suspension/resumption time to be $\frac{M}{R}$ seconds. The \texttt{suspendresume-rate} option is used to specify $R$:
+
+\begin{wideshellverbatim}
+[simulation]
+...
+suspendresume-rate: 32
+...
+\end{wideshellverbatim}
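
The estimate above is simple enough to check by hand. The sketch below computes $\frac{M}{R}$ for one VM and then extends it to a multi-VM lease. The multi-VM part rests on an assumption about what the two exclusion settings imply (with \texttt{global}, all suspensions are serialized; with \texttt{local}, only those on the same physical node are), and the function names are illustrative rather than taken from Haizea:

```python
def suspend_time(memory_mb, rate_mb_per_s):
    """Estimated time (seconds) to suspend or resume one VM: M / R."""
    return memory_mb / rate_mb_per_s


def lease_suspend_time(vms_per_node, memory_mb, rate_mb_per_s, exclusion):
    """Estimated time to suspend a whole lease.

    vms_per_node lists how many of the lease's VMs run on each physical node.
    Assumption: 'global' serializes every suspension, while 'local' only
    serializes suspensions that share a node, so nodes proceed in parallel.
    """
    per_vm = suspend_time(memory_mb, rate_mb_per_s)
    if exclusion == "global":
        return per_vm * sum(vms_per_node)
    if exclusion == "local":
        return per_vm * max(vms_per_node)
    raise ValueError("exclusion must be 'local' or 'global'")


print(suspend_time(1024, 32))
print(lease_suspend_time([2, 2], 1024, 32, "global"))
print(lease_suspend_time([2, 2], 1024, 32, "local"))
```

A VM with 1024 MB of memory and a rate of 32 MB/s therefore takes 32 seconds to suspend; under the stated assumption, a four-VM lease spread over two nodes takes 128 seconds with global exclusion and 64 seconds with local exclusion.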
+
+Lease migration can be allowed or not allowed. When allowed, we can specify whether a migration will involve transferring only the memory image of a VM (i.e., the file containing the contents of the VM when it was suspended), or will require transferring both the memory image and the disk image:
+
+\begin{wideshellverbatim}
+[scheduling]
+...
+migration: True | False
+what-to-migrate: nothing | mem | mem+disk
+...
+\end{wideshellverbatim}
+
+Setting \texttt{what-to-migrate} to \texttt{nothing} means that migration \emph{is} allowed, but does not involve transferring any files from one node to another.
+
+
+\subsection{Lease preparation scheduling}
+
+Before a lease can start, it may require some preparation, such as transferring a disk image from a repository to the physical node where a VM will be running. When no preparation is necessary (e.g., assuming that all required disk images are predeployed on the physical nodes), the \texttt{lease-preparation} option must be set to \texttt{unmanaged}:
+
+\begin{wideshellverbatim}
+[general]
+...
+lease-preparation: unmanaged
+...
+\end{wideshellverbatim}
+
+When disk images are located in a disk image repository, Haizea can schedule the file transfers from the repository to the physical nodes to make sure that images arrive on time (when a lease has to start at a specific time) and to minimize the number of transfers (by reusing images on the physical nodes). To do this, the \texttt{lease-preparation} option must be set to \texttt{imagetransfer}, the network bandwidth of the image repository must be specified (in Mbits per second), and several options must be specified in the \texttt{[deploy-imagetransfer]} section:
+
+\begin{wideshellverbatim}
+[general]
+...
+lease-preparation: imagetransfer
+...
+
+[simulation]
+...
+imagetransfer-bandwidth: 100
+...
+
+[deploy-imagetransfer]
+...
+\# Image transfer scheduling options
+...
+\end{wideshellverbatim}
+
+\subsubsection{Transfer mechanisms}
+
+The transfer mechanism specifies how the images will be transferred from the repository to the physical nodes. Haizea currently only supports a multicast transfer mechanism:
+
+\begin{wideshellverbatim}
+[deploy-imagetransfer]
+...
+transfer-mechanism: multicast
+...
+\end{wideshellverbatim}
+
+This mechanism assumes that it is possible to multicast the same image from the repository node to more than one physical node at the same time.
+
+\subsubsection{Avoiding redundant transfers}
+
+Haizea can take steps to
+detect and avoid redundant transfers (e.g., if two leases are
+scheduled on the same node, and they both require the same disk
+image, don't transfer the image twice; allow one to ``piggyback''
+on the other). There is generally no reason to disable this behavior (i.e., to set \texttt{avoid-redundant-transfers} to \texttt{False}).
+
+\begin{wideshellverbatim}
+[deploy-imagetransfer]
+...
+avoid-redundant-transfers: True | False
+...
+\end{wideshellverbatim}
+
+
+\subsubsection{Disk image reuse}
+
+Haizea can create disk image caches on the physical nodes with the goal of reusing frequent disk images and reducing the number of transfers: 
+
+\begin{wideshellverbatim}
+[deploy-imagetransfer]
+...
+diskimage-reuse: image-caches
+diskimage-cache-size: 20000
+...
+\end{wideshellverbatim}
+
+
+\subsection{The scheduling threshold}
+
+To avoid thrashing, Haizea will not schedule a lease unless all overheads
+can be correctly scheduled (which includes image transfers, suspensions, etc.).
+However, this can still result in situations where a lease is prepared,
+and then immediately suspended because of a blocking lease in the future.
+The scheduling threshold factor can be used to specify that a lease must
+not be scheduled unless it is guaranteed to run for a minimum amount of
+time (the rationale behind this is that you ideally don't want leases
+to be scheduled if they're not going to be active for at least as much time
+as was spent in overheads).
+            
+The default value is 1, meaning that the lease will be active for at least
+as much time $t$ as was spent on overheads (e.g., if preparing the lease requires
+60 seconds, and we know that it will have to be suspended, requiring 30 seconds,
+Haizea won't schedule the lease unless it can run for at least 90 seconds).
+In other words, a scheduling threshold factor of $F$ requires a minimum duration of
+$F\cdot t$. A value of 0 could lead to thrashing, since Haizea could end up with
+situations where a lease starts and immediately gets suspended.   
+
+\begin{wideshellverbatim}
+[scheduling]
+...
+scheduling-threshold-factor: 1
+...
+\end{wideshellverbatim}
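
The threshold rule reduces to a single comparison: a lease is only worth scheduling if it can run for at least $F\cdot t$, where $t$ is the total overhead. The following sketch illustrates that arithmetic; the function names are hypothetical, and this is not the code Haizea's scheduler actually runs:

```python
def minimum_runtime(factor, overheads):
    """Minimum run time F * t, where t is the sum of all overheads (seconds)."""
    return factor * sum(overheads)


def can_schedule(available_runtime, factor, overheads):
    """True if the lease would run long enough to justify its overheads."""
    return available_runtime >= minimum_runtime(factor, overheads)


# 60 s of preparation plus 30 s of suspension, with the default factor of 1.
overheads = [60, 30]
print(minimum_runtime(1, overheads))
print(can_schedule(80, 1, overheads))
print(can_schedule(120, 1, overheads))
```

With a factor of 0 the minimum run time is always zero, so every lease passes the check, which is exactly the situation that can lead to thrashing.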
+
 \section{Running multiple unattended simulations}
 \label{sec:multiplesim}
-Haizea's regular configuration file (the one that is provided to the haizea command) allows for, at most, one tracefile to be used. However, when running simulations, it is often necessary to run through multiple tracefiles in a variety of configurations to compare the results of each tracefile/configuration combination. The "multi-configuration" file allows you to easily do just this. It is similar to the regular configuration file (all the options are the same), but it allows you to specify multiple tracefiles and multiple configuration profiles.
+Haizea's configuration file allows for, at most, one tracefile to be used. However, when running simulations, it is often necessary to run through multiple tracefiles in a variety of configurations to compare the results of each tracefile/configuration combination. The ``multi-configuration'' file allows you to easily do just this. It is similar to the regular configuration file (all the options are the same), but it allows you to specify multiple tracefiles and multiple configuration profiles.
 
 The multi-configuration file must contain a section called "\texttt{multi}" where you must specify the following:
 
@@ -17,7 +262,7 @@
 \item The directory where Haizea should store all the information it collects during the simulation (scheduling metrics, utilization information, etc.)
 \end{itemize}
 
-The [multi] section should look like this:
+The \texttt{[multi]} section should look like this:
 
 \begin{wideshellverbatim}
 [multi]
@@ -28,7 +273,7 @@
 basedatadir: Directory where raw data will be saved
 \end{wideshellverbatim}
 
-Next, for each section you would ordinarily include in a regular configuration file, you can include common options (shared by all profiles) and profile-specific options. For example, assuming you want to specify options in the general and simulation sections, and you want to create two profiles called nobackfilling and withbackfilling, you would have to create the following sections:
+Next, for each section you would ordinarily include in a regular configuration file, you can include common options (shared by all profiles) and profile-specific options. For example, assuming you want to specify options in the \texttt{general} and \texttt{simulation} sections, and you want to create two profiles called \texttt{nobackfilling} and \texttt{withbackfilling}, you would have to create the following sections:
 
 \begin{wideshellverbatim}
 [common:general]
@@ -50,13 +295,13 @@
 ...
 \end{wideshellverbatim}
 
-An example multi-configuration file is provided in /usr/share/haizea/etc/sample-multi.conf. Using this file, or once you've created your own, you can use the haizea-generate-configs to create the individual configuration files (one for every combination of tracefile, injected tracefile, and profile):
+An example multi-configuration file is provided in \texttt{/usr/share/haizea/etc/sample-multi.conf}. Using this file, or once you've created your own, you can use the \texttt{haizea-generate-configs} command to create the individual configuration files (one for every combination of tracefile, injected tracefile, and profile):
 
 \begin{wideshellverbatim}
 haizea-generate-configs -c config -d dir
 \end{wideshellverbatim}
 
-The -c parameter is used to specify the multi-config file, and the -d parameter is used to specify where the configuration files should be created. Since running each configuration individually would be cumbersome, you can also use the haizea-generate-script command to generate a script that will run through all the generated configuration files. This command requires Mako Templates for Python, so make sure you install Mako before using haizea-generate-scripts. Haizea currently includes two script templates: one to generate a BASH script that will call haizea with each individual configuration file, and one to generate a basic Condor submission script. For example, to generate the BASH script, you would run the command like this:
+The \texttt{-c} parameter specifies the multi-config file, and the \texttt{-d} parameter specifies the directory where the configuration files should be created. Since running each configuration individually would be cumbersome, you can also use the \texttt{haizea-generate-scripts} command to generate a script that will run through all the generated configuration files. This command requires Mako Templates for Python, so make sure you install Mako before using \texttt{haizea-generate-scripts}. Haizea currently includes two script templates: one to generate a BASH script that will call \texttt{haizea} with each individual configuration file, and one to generate a basic Condor submission script. For example, to generate the BASH script, you would run the command like this:
 
 \begin{wideshellverbatim}
 haizea-generate-scripts -c config -d dir -t /usr/share/haizea/etc/run.sh.template
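
Reviewer note on the multi-configuration text above: the "one file for every combination of tracefile, injected tracefile, and profile" rule, with common options shared by all profiles, can be pictured with the sketch below. It is not the code behind haizea-generate-configs; the function names, the option names (loglevel, nodes, backfilling), and the tracefile names are assumptions made up for the illustration.

```python
from itertools import product


def merge_options(common, profile):
    """Overlay profile-specific options on the common ones, section by section."""
    merged = {section: dict(options) for section, options in common.items()}
    for section, options in profile.items():
        merged.setdefault(section, {}).update(options)
    return merged


def expand(tracefiles, injections, common, profiles):
    """Yield one (tracefile, injection, profile name, options) tuple per combination."""
    for tracefile, injection, name in product(tracefiles, injections, sorted(profiles)):
        yield tracefile, injection, name, merge_options(common, profiles[name])


# Hypothetical option names and values, chosen only for illustration.
common = {"general": {"loglevel": "INFO"}, "simulation": {"nodes": "4"}}
profiles = {
    "nobackfilling": {"general": {"backfilling": "off"}},
    "withbackfilling": {"general": {"backfilling": "aggressive"}},
}
configs = list(expand(["a.lwf", "b.lwf"], ["none"], common, profiles))
```

Two tracefiles, one injected tracefile, and two profiles give four configurations here; each one carries the shared options plus whatever its profile adds or overrides.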

Modified: trunk/doc/manual/title.tex
===================================================================
--- trunk/doc/manual/title.tex	2008-09-25 18:31:05 UTC (rev 521)
+++ trunk/doc/manual/title.tex	2008-09-26 18:22:50 UTC (rev 522)
@@ -7,7 +7,7 @@
 % Title
 \HRule \\[0.4cm]
 \includegraphics[width=0.6\textwidth]{images/haizea.png}\\[1cm]
-\textsc{ \huge The Haizea Manual}\\{\large Technology Preview 1.2}\\[0.4cm]
+\textsc{ \huge The Haizea Manual}\\{\large Technology Preview 1.2}\\{\large 9/29/08}\\[0.4cm]
  
 \HRule \\[1.5cm]
 \url{http://haizea.cs.uchicago.edu/}

Modified: trunk/doc/manual/whatis.tex
===================================================================
--- trunk/doc/manual/whatis.tex	2008-09-25 18:31:05 UTC (rev 521)
+++ trunk/doc/manual/whatis.tex	2008-09-26 18:22:50 UTC (rev 522)
@@ -2,46 +2,56 @@
 
 \begin{description}
 \item[Haizea is a resource manager] (or, depending on who you ask, a "resource scheduler"): Haizea is a software component that can manage a set of computers (typically a cluster), allowing users to request exclusive use of those resources described in a variety of terms, such as "I need 10 nodes, each with 1 GB of memory, right now" or "I need 4 nodes, each with 2 CPUs and 2GB of memory, from 2pm to 4pm tomorrow".
-\item[Haizea uses leases] The fundamental resource provisioning abstraction in Haizea is the lease. Intuitively, a lease is some form of contract where one party agrees to provide a set of resources (an apartment, a car, etc.) to another party. When a user wants to request computational resources from Haizea, it does so in the form of a lease. When applied to computational resources, the lease abstraction is a powerful and general construct with a lot of nuances. See below for a more detailed definition of leases or read about the types of leases supported by Haizea.
-\item[Haizea is VM-based] We hold that the best way of implementing resource leases is using virtual machines (VMs). Therefore, Haizea's scheduling algorithms are geared towards managing virtual machines, factoring in all the extra operations (and overhead) involved in managing VMs. The Globus Virtual Workspaces group, where Haizea was originally developed, has an extensive list of publications that argue how using virtual machines for resource leasing is A Good Thing (and also Not A Trivial Thing).
+\item[Haizea uses leases] The fundamental resource provisioning abstraction in Haizea is the lease. Intuitively, a lease is some form of contract where one party agrees to provide a set of resources (an apartment, a car, etc.) to another party. When a user wants to request computational resources from Haizea, it does so in the form of a lease. When applied to computational resources, the lease abstraction is a powerful and general construct with a lot of nuances. See Chapter~\ref{chap:leases} for a more detailed discussion of leases.
+\item[Haizea is VM-based] We hold that the best way of implementing resource leases is using virtual machines (VMs). Therefore, Haizea's scheduling algorithms are geared towards managing virtual machines, factoring in all the extra operations (and overhead) involved in managing VMs. The Globus Virtual Workspaces group, where Haizea was originally developed, has an extensive list of publications that argue how using virtual machines for resource leasing is \textsf{A Good Thing} (and also \textsf{Not A Trivial Thing}).
 \item[Haizea is open source] Haizea is published under the Apache License 2.0, a BSD-like OSI-compatible license.
 \end{description}
 
 \section{What can you do with Haizea?}
 
+Haizea is, primarily, a VM resource management component that can take lease requests and make scheduling decisions, but doesn't actually know anything about how to enact those decisions. For example, Haizea may determine at what times a set of VMs representing a lease must start and stop, but it doesn't actually know how to instruct a virtual machine manager (such as Xen, KVM, etc.) to do these actions. Haizea can, however, delegate these enactment actions to an external component using a simple API. Haizea can currently interface with the OpenNebula (\url{http://www.opennebula.org/}) virtual infrastructure manager to enact its scheduling decisions. Haizea can also simulate enactment actions, which makes it useful for doing scheduling research involving leases or VMs (in fact, the Haizea simulator has been used in a couple of papers).
+
+So, Haizea can be used in three modes: OpenNebula mode, unattended simulation mode, and interactive simulation mode.
+
+\subsection{OpenNebula mode}
+
 \begin{center}
-\includegraphics{images/what_haizea_does.png}
+\includegraphics{images/mode_opennebula.png}
 \end{center}
 
+Haizea can be used as a drop-in replacement for OpenNebula's scheduling daemon. OpenNebula is a virtual infrastructure manager that enables the dynamic deployment and re-allocation of virtual machines on a pool of physical resources. OpenNebula and Haizea complement each other, since OpenNebula provides all the enactment muscle (OpenNebula can manage Xen and KVM virtual machines on a cluster, with VMWare support to follow shortly) while Haizea provides all the scheduling brains.
 
-You can use Haizea one of two ways. Haizea can be used as a standalone component or as a scheduling backend for a virtual infrastructure manager, such as OpenNebula. So, if you're...
+\subsection{Unattended simulation mode}
 
-\begin{description}
-\item[Using Haizea with OpenNebula] Haizea can be used as a drop-in replacement for OpenNebula's scheduling daemon. OpenNebula is a virtual infrastructure manager that enables the dynamic deployment and re-allocation of virtual machines on a pool of physical resources. OpenNebula and Haizea complement each other, since OpenNebula provides all the enactment muscle (OpenNebula can manage Xen and KVM virtual machines on a cluster, with VMWare support to follow shortly) while Haizea provides all the scheduling brains. The document "Using OpenNebula and Haizea to manage VMs on a cluster" provides more details on how to use OpenNebula 1.0 and Haizea together.
-\item[Using Haizea on its own] In this case, you actually can't do all that much :-) Haizea is, primarily, a VM resource management component that can take lease requests and make scheduling decisions, but doesn't actually know anything about how to enact those decisions. For example, Haizea may determine at what times a set of VMs representing a lease must start and stop, but it doesn't actually know how to instruct a virtual machine manager to do these actions. Haizea can, however, simulate those enactment actions so, on its own, Haizea might be useful if you're doing scheduling research involving leases or VMs (in fact, the Haizea simulator has been used in a couple of papers).
-\end{description}
+\begin{center}
+\includegraphics{images/mode_unattended_simulation.png}
+\end{center}
 
-You can find a couple (more specific) details about what you can do with Haizea in our list of features.
+In this mode, Haizea takes a list of lease requests (specified in a \emph{tracefile}) and a configuration file specifying simulation and scheduling options (such as the characteristics of the hardware to simulate), and processes them in ``simulated time''. In other words, the goal of this mode is to obtain the final schedule for a set of leases, without having to wait for all those leases to complete in real time (this makes this mode particularly useful to find out what effect a certain scheduling option could have over a period of weeks or months). In fact, the final result of an unattended simulation is a datafile with raw scheduling data and metrics which can be used to generate reports and graphs.
 
-\begin{description}
- \item[Simulation mode, simulated time:] In this mode, Haizea works with a simulated set of hardware resources (which we'll be able to specify in a configuration file). When a lease is scheduled, all enactment commands (``start VM'', ``stop VM'', etc.) for that lease are simulated. Additionally, when time is simulated, Haizea will just ``fast forward'' through all the lease requests it receives. For example, suppose you've requested a lease that requires 30
- \item[Simulation mode, real time:]  The ``real time'' mode simply means that time will pass
- \item[OpenNebula mode:] 
-\end{description}
+\subsection{Interactive simulation mode}
 
+\begin{center}
+\includegraphics{images/mode_interactive_simulation.png}
+\end{center}
 
-\section{Leasing as a fundamental abstraction}
+In this mode, enactment actions are simulated, but Haizea runs in ``real time''. This means that, instead of having to provide a list of lease requests beforehand, you can use Haizea's command-line interface to request leases interactively and query the status of Haizea's schedule (e.g., to find out the state of a lease you've requested). Obviously, this mode is not useful if you want to simulate weeks or months of requests, but it is handy if you want to experiment with leases and track the schedule in a more user-friendly way (since the datafile produced by the unattended simulation is mostly meant for consumption by other programs, e.g., to generate graphs and reports).
 
-% Include very short spiel
+\section{Haizea architecture}
 
-\section{Features}
+\begin{center}
+\includegraphics{images/architecture.png}
+\end{center}
 
-Foobar
+Haizea is divided into the following three layers:
 
-\section{Haizea is extensible}
+\begin{description}
+\item[The request frontend] This is where lease requests arrive. Haizea can currently accept requests from OpenNebula or read them from a tracefile (in SWF format or using the Haizea-specific LWF format). A command-line interface is in the works.
+\item[The scheduling core] This is where the lease requests are processed and scheduled, resulting in enactment actions happening at specific points in time (e.g., ``Start VM for lease X in node Y at time T'').
+\item[The enactment modules] These take care of the ``dirty work'' of carrying out the enactment actions generated by the scheduler. Haizea can currently send enactment actions to OpenNebula, resulting in Haizea being able to manage Xen and KVM clusters (VMWare support coming soon), or to a simulated cluster.
+\end{description}
+The Haizea architecture keeps these three layers completely decoupled, which means that adding support for an additional enactment backend only requires writing an enactment module for that backend. The API for enactment modules is still not fully defined, and integration with OpenNebula is currently driving this effort. However, if you'd be interested in using Haizea in another system, please do let us know. We'd be very interested in hearing what your requirements are for the frontend and enactment APIs.
 
-The Haizea architecture has been designed so it can be used as a scheduling component that can be plugged into other systems by implementing a set of extra modules in Haizea using a well-defined API (i.e., without having to modify the core of Haizea). Right now, Haizea depends on OpenNebula to perform enactment actions on real systems (and, in turn, OpenNebula can use Haizea as its scheduling backend), but Haizea could be made to work with other systems with relative ease. See the Haizea architecture page for more details.
-
 \section{Haizea is still a technology preview}
 
 Haizea started out as research software (and it is still largely meant for research purposes). Although Haizea works and we're getting to the point where it can be used in production systems, it is still a technology preview, so please use it with caution. In particular, although we've produced some documentation and polished up the code, there is still a fair amount of documentation that has to be produced. If you have any trouble using Haizea, or understanding any part of the source code, please don't hesitate to ask for help.
\ No newline at end of file
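
Reviewer note on the architecture section above: the decoupling of frontend, core, and enactment modules, together with the "simulated time" idea from the unattended simulation mode, can be sketched roughly as follows. This is only an illustration under stated assumptions; the class and function names are hypothetical, it is not Haizea's enactment API (which the text says is not yet fully defined), and the "core" here makes no real scheduling decisions: it simply orders start and stop actions by time.

```python
import heapq


class RecordingEnactment:
    """Stand-in enactment module: records actions instead of driving real VMs."""

    def __init__(self):
        self.actions = []

    def enact(self, time, action, lease, node):
        self.actions.append((time, action, lease, node))


class Core:
    """Toy core: turns each request into timed start/stop actions."""

    def __init__(self, enactment):
        self.enactment = enactment
        self.events = []  # min-heap ordered by simulated time

    def submit(self, lease, node, start, duration):
        heapq.heappush(self.events, (start, "start", lease, node))
        heapq.heappush(self.events, (start + duration, "stop", lease, node))

    def run(self):
        # Simulated time: jump from event to event without waiting.
        while self.events:
            self.enactment.enact(*heapq.heappop(self.events))


def tracefile_frontend(core, requests):
    """Stand-in request frontend: feeds a fixed list of requests to the core."""
    for request in requests:
        core.submit(*request)


backend = RecordingEnactment()
core = Core(backend)
tracefile_frontend(core, [("lease-2", "node-1", 30, 10), ("lease-1", "node-1", 0, 20)])
core.run()
```

Swapping RecordingEnactment for another object with the same enact method changes the backend without touching the core or the frontend, which is the property the three-layer description relies on.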


