Gigabit Testbeds Final Report

4.5.1 QoS

Several testbed efforts addressed the QoS problem for high speed networks. This work included the design and implementation of a new suite of protocols for handling real-time traffic, together with experiments using it; a detailed investigation of the processing required to support sophisticated QoS queuing algorithms in gigabit networks; an investigation of how to provide guaranteed response times to applications using general operating systems; and an exploration of optimal dynamic packet scheduling for QoS.

Tenet Real-Time Protocol Suite

Researchers in the Blanca testbed at Berkeley investigated the problem of providing end-to-end real-time traffic support in a general mixed-traffic internetworking environment. Their approach consisted of developing a suite of real-time traffic protocols to provide IP-level channel setup and data forwarding and end-to-end transport, with the protocols intended to operate in conjunction with the use of TCP/IP for non-realtime traffic.

The suite, called Tenet, consisted of four protocols: the real-time channel administration protocol (RCAP) for channel setup across a set of internetwork routers; the real-time internet protocol (RTIP) for IP-level forwarding; the real-time message transport protocol (RMTP) for host-to-host transfers over RTIP; and the continuous media transport protocol (CMTP), used over RTIP for periodic traffic.

The protocols included admission control, priority scheduling, and distributed rate control, and were intended to provide mathematically provable performance guarantees based on user-specified traffic parameters. Extensive simulations were used to establish the scheme's performance, with prototype software then developed for experiments in real networks. The Berkeley-LBL portion of the Blanca testbed, an 800 Mbps HIPPI network, was used for the high speed experimentation.

Because the Tenet suite was designed for operation over packet-switched networks and HIPPI is a circuit-switched technology, the RCAP setup protocol had to be substantially redesigned. (The original Blanca testbed plan called for an ATM network at Berkeley-LBL, but this was changed due to delays in equipment development.) The HIPPI version of RCAP made use of the time division reservation system discussed in section 0, allowing RCAP to dynamically reserve time slots within each physical HIPPI link for the real-time traffic.

Three experiments were carried out over the HIPPI network using SGI and Sun workstations: a single flow for reference purposes, followed by two and then three simultaneous flows. Throughput and time variance data were collected for each case, with the results generally showing stable, well-behaved operation and only short transients when a stream was turned on or off.

While these experiments demonstrated good bandwidth-sharing properties for competing real-time flows, the HIPPI time division scheme prevented simultaneous transmission of asynchronous non-realtime traffic. To obtain data in the latter context, experiments were also carried out using a local 100 Mbps FDDI network and a wide area T1 network connecting Berkeley, UC Santa Barbara, UCLA, and UC San Diego. These experiments successfully mixed real-time video traffic with varying background traffic loads, with the video streams essentially unaffected by the other traffic.

Weighted Fair Queueing for ATM

To establish the feasibility of using sophisticated QoS algorithms for ATM cell streams at gigabit speeds, MIT and Bellcore collaborated on an investigation in the Aurora testbed. As part of their Sunshine switch effort, Bellcore prototyped a second-generation output port controller which could be used both with the switch and as a standalone device for QoS and other experiments. Known as OPC-V2, it provided two 155 Mbps SONET OC-3c data inputs and one OC-3c data output, with a second output for switch feedback signaling. An Intel i960CA 33 MHz processor controlled the hardware which directly handled cell flows, allowing different queuing strategies to be applied to control traffic flows as a function of ATM VC identifiers and other parameters. Two hardware sort modules were provided for fast sequencing of ATM cells.

To establish the gigabit-rate processing capabilities needed for QoS schemes under consideration by the IETF and ATM Forum, MIT first investigated ways to simplify weighted fair queuing (WFQ) approaches while retaining their key properties. The result was an approximation to hierarchical weighted fair queuing which could be implemented as a single-level queue structure, allowing it to be executed on the OPC-V2.
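The idea of a single-level weighted fair queue can be illustrated with a short sketch. This is not MIT's algorithm, whose details are not given here; it is a generic finish-tag scheduler in the self-clocked style, in which the virtual time is approximated by the tag of the cell most recently sent. The class name `FairQueue`, its methods, and the example weights are hypothetical.

```python
import heapq


class FairQueue:
    """Single-level fair queue driven by per-flow virtual finish tags.

    Assumption: virtual time is approximated by the finish tag of the
    cell most recently dequeued (a self-clocked simplification).
    """

    def __init__(self, weights):
        self.weights = dict(weights)               # flow id -> relative share
        self.last_finish = {f: 0.0 for f in self.weights}
        self.virtual_time = 0.0
        self.heap = []                             # (finish tag, arrival seq, flow id)
        self.seq = 0                               # tie-breaker: arrival order

    def enqueue(self, flow, length=1.0):
        """Tag an arriving cell; fixed-size ATM cells use length 1."""
        start = max(self.virtual_time, self.last_finish[flow])
        finish = start + length / self.weights[flow]
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, (finish, self.seq, flow))
        self.seq += 1

    def dequeue(self):
        """Send the cell with the smallest finish tag, or None if idle."""
        if not self.heap:
            return None
        finish, _, flow = heapq.heappop(self.heap)
        self.virtual_time = finish
        return flow


if __name__ == "__main__":
    q = FairQueue({"a": 2, "b": 1})                # flow "a" gets twice the share
    for _ in range(6):
        q.enqueue("a")
    for _ in range(6):
        q.enqueue("b")
    print([q.dequeue() for _ in range(6)])
```

With both flows backlogged, the flow with weight 2 is served about twice as often as the flow with weight 1. The per-cell work is one tag computation plus a sorted insert and removal, which is the kind of operation the hardware sort modules of the OPC-V2 were meant to accelerate.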

A determination of the number of instructions required by the resulting algorithm revealed that it needed slightly more processing power than was available from the i960 to maintain the full output rate of 155 Mbps -- the i960/33 could execute approximately 80 instructions in one ATM cell time, whereas the algorithm required 88 instructions per cell. Allowing also for the additional background processing required for full operation, MIT estimated that a factor of 2 increase in processing speed would satisfy the 155 Mbps rate of the OPC-V2 output port.

Extrapolating from these results, a factor of 8 increase relative to the i960/33 would be needed for a 622 Mbps output rate, or roughly an order of magnitude increase for gigabit-rate operation of the algorithm. Since the i960/33 was a circa 1990 processor, processor cost/performance trends make it seem likely that gigabit operation of a WFQ-class algorithm will be economically achievable in the 1996-1998 timeframe.
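The arithmetic behind these estimates can be reproduced in a few lines. The sketch below uses only the figures quoted above (80 instructions available per cell time, 88 required, and a factor of 2 at 155 Mbps) and assumes, as the extrapolation implies, that the required processing speed scales linearly with the output rate. The function names are illustrative.

```python
# Figures quoted in the text.
BUDGET_PER_CELL = 80   # instructions the i960/33 executes in one cell time at 155 Mbps
NEEDED_PER_CELL = 88   # instructions the simplified WFQ algorithm needs per cell
SPEEDUP_AT_155 = 2     # MIT's estimate once background processing is included


def shortfall_ratio(needed=NEEDED_PER_CELL, budget=BUDGET_PER_CELL):
    """Demand relative to capacity for the scheduling algorithm alone."""
    return needed / budget


def speedup_for_rate(rate_mbps, base_rate_mbps=155, base_speedup=SPEEDUP_AT_155):
    """Assumption: required speed grows linearly as the cell time shrinks."""
    return base_speedup * rate_mbps / base_rate_mbps


if __name__ == "__main__":
    print(f"algorithm alone: {shortfall_ratio():.2f}x the available instructions")
    for rate in (155, 622, 1000):
        print(f"{rate} Mbps: about {speedup_for_rate(rate):.1f}x the i960/33")
```

Under the linear-scaling assumption, 622 Mbps comes out at about 8 times the i960/33 and 1 Gbps at about 13 times, consistent with the "factor of 8" and "roughly an order of magnitude" figures.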

Extending QoS to Applications

Researchers at Penn in the Aurora testbed investigated issues in providing service guarantees to real-time applications using general operating system environments and high speed networks. A QoS service kernel was designed to provide scheduling-based operating system guarantees to meet stringent response time requirements such as those required for remote robotics control. This was generalized to a logical framework which established the relationships and requirements between application-specified QoS parameters, operating system policies and mechanisms, and network QoS services.

To bring these results together for experimentation, a QoS broker was implemented for use in controlling a remote robotic arm over an ATM network. The broker negotiates between applications and underlying networks in an attempt to best satisfy the desired performance, coordinating resources over the end-to-end path of the application. Conceptually, the broker is positioned as middleware in each endpoint machine, interfacing to applications, the operating system, and network I/O.

Testbed experimentation made use of initial broker software implementations on an IBM RS/6000 workstation running the AIX Unix operating system at each endpoint, with real-time robotic control computers interfaced to the RS/6000s through bus-to-bus connections. The researchers' principal finding was that considerable additional work is needed on general operating systems such as AIX in order to provide the guarantees needed for robotics and other tightly constrained applications.

Dynamic Packet Scheduling

A dynamic packet scheduling approach to providing service guarantees in wide area network switches and routers was investigated by Wisconsin researchers in the Blanca testbed, using a combination of analysis, simulations and experiments. Their work focused on identifying optimal schedulable regions and associated scheduling algorithms for guaranteed and predictive service classes, assuming token bucket traffic descriptors. By performing dynamic scheduling to reallocate delay among competing traffic while still satisfying delay targets, they expected to satisfy a larger domain of service requirements than would otherwise be the case.
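For readers unfamiliar with token bucket traffic descriptors, the following minimal sketch shows how such a descriptor bounds a flow: tokens accumulate at a fixed rate up to a maximum depth, and a packet conforms only if enough tokens are available. The class name `TokenBucket` and its parameters are illustrative and are not taken from the Wisconsin software.

```python
class TokenBucket:
    """Token bucket descriptor: sustained rate plus a bounded burst."""

    def __init__(self, rate, depth):
        self.rate = rate        # tokens (for example bytes) added per second
        self.depth = depth      # maximum number of tokens, i.e. the burst size
        self.tokens = depth     # the bucket starts full
        self.last = 0.0         # time of the previous update, in seconds

    def conforms(self, now, size):
        """Return True and consume tokens if a packet of `size` fits at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


if __name__ == "__main__":
    tb = TokenBucket(rate=100, depth=200)
    print(tb.conforms(0.0, 200))   # a full burst is allowed at the start
    print(tb.conforms(0.0, 1))     # the bucket is now empty
    print(tb.conforms(1.0, 100))   # one second later, 100 tokens have accumulated
```

A scheduler that knows each flow's rate and depth can bound the traffic it must handle over any interval, which is what makes the analysis of schedulable regions tractable.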

Experiments were carried out using Blanca testbed facilities between Wisconsin's Madison campus and UIUC in Illinois. The configuration included three Xunet ATM switches and an SGI workstation used as a router at each endpoint, with special software implemented to provide controllable traffic generation, performance measurements, and scheduling algorithms. Two traffic flows were used, a "source" flow and a background flow, with both sent from Madison to UIUC. Five scheduling strategies were compared: static priority, round robin, and dynamic scheduling with three different ratios of source to background traffic. For each strategy, experiments were run with different source traffic distributions and link utilizations, with dynamic scheduling outperforming the other approaches in nearly every case [3].
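The two baseline strategies can be sketched for the two-flow case. The example below is a simplification that serves a fixed backlog with no new arrivals, and the function names are hypothetical. The dynamic scheduling algorithm itself is not specified in this section and is therefore not shown.

```python
from collections import deque


def static_priority(source, background):
    """Always serve the source queue first; background only when source is empty."""
    source, background = deque(source), deque(background)
    order = []
    while source or background:
        order.append(source.popleft() if source else background.popleft())
    return order


def round_robin(source, background):
    """Alternate between the two queues, skipping whichever one is empty."""
    queues = [deque(source), deque(background)]
    order, turn = [], 0
    while any(queues):
        if queues[turn]:
            order.append(queues[turn].popleft())
        turn = 1 - turn
    return order


if __name__ == "__main__":
    src, bg = ["s1", "s2"], ["b1", "b2", "b3"]
    print(static_priority(src, bg))
    print(round_robin(src, bg))
```

Static priority minimizes delay for the source flow at the expense of the background flow, while round robin splits service evenly regardless of delay targets. Dynamic scheduling, as described above, aims to reallocate delay between the flows while still meeting each flow's target.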