[Colloquium] Bresnahan/MS Presentation/Dec. 13, 2006
Margaret Jaffey
margaret at cs.uchicago.edu
Mon Nov 27 10:48:53 CST 2006
This is an announcement of John Bresnahan's MS Presentation.
---------------
Date: Wednesday, December 13, 2006
Time: 10:00 a.m.
Place: Ryerson 276
M.S. Candidate: John Bresnahan
M.S. Paper Title: An Architecture for Dynamic Allocation of Compute
Cluster Bandwidth
Abstract:
Modern high-performance computers are often implemented as clusters.
A cluster comprises several machines, each representing a
compute node. Local resource managers are used to grant access to
these nodes. Once access is granted to a node, a user has exclusive
access to the entire node including its local filesystem and network
interface. In addition to serving as computational clusters, these
sites are often well connected to a network: for example, sites
participating in the US TeraGrid system have between ten and forty
gigabits per second of available bandwidth and each computational
node in a TeraGrid cluster typically has a one gigabit per second
network interface. However, since it is a compute cluster, jobs are
often computationally bound, in which case most of this bandwidth is
unused.
To stage data sets in and out, or to publish computational results,
sites provide file transfer services. Achieving peak transfer speeds
requires a significant number of nodes. Packet switching at top
speeds puts a load on a CPU that prohibits co-locating user compute
jobs and site transfer processes on the same node. Thus,
administrators partition their resources into compute nodes and
transfer nodes. Because transfer requirements fluctuate and idle
resources are undesirable, it is difficult to determine this
partition statically. Ideally, the partitioning could be adjusted
dynamically to suit immediate needs.
This paper studies this dynamic resource partitioning problem. We
propose an architecture that allows transfer nodes to be acquired
from the computational queue when needed, and returned when no longer
required. The site administrator can set a policy governing how long
a transfer node may sit idle before being returned, and when to
request additional nodes. We measure the costs associated
with a prototype implementation of this architecture and study the
impact of different policies on achieved performance.
Advisor: Prof. Ian Foster
A draft copy of John Bresnahan's MS Paper is available in Ry 161A.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Margaret P. Jaffey margaret at cs.uchicago.edu
Department of Computer Science
Student Support Rep (Ry 161A) (773) 702-6011
The University of Chicago http://www.cs.uchicago.edu
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=