4003-541-70/4005-741-70
Data Communications and Networks II
Module 5. Transport Algorithms -- Lecture Notes
Prof. Alan Kaminsky -- Spring Quarter 2006
Rochester Institute of Technology -- Department of Computer Science
Transport Service Design Issues
- Setup options
- Connection-oriented
- Connectionless
- Delivery options
- Reliable delivery
- Unreliable delivery
- Content options
- Stream (unbounded size)
- Messages (unbounded size)
- Messages (bounded size)
- Endpoint identification
- Application programming interface
- Segmentation
- Sequence numbering options
- Give each byte its own sequence number
- Give each segment its own sequence number
- Segment identification
- A segment must be uniquely correlated to the connection it is part of
- Problem: Delayed delivery of segments for previous connections
- Corrupted segments: detection, response
TCP Design
RMTP Design
Connection Setup and Teardown
- TCP: Three-way handshake
- Synchronize each side with the other
- Each side informs other side of initial sequence number and MSS
- RMTP: Three-way handshake
- Package edu.rit.ron.rmtp
- Package edu.rit.ron.rmtp.test
- Demonstrations
Data Transfer
- Outgoing messages
- From output stream write() calls to outgoing data segments
- Sequentialization of outgoing messages
- Incoming messages
- Detection and reporting of incoming messages
- From incoming data segments to input stream read() calls
- Package edu.rit.ron.rmtp
- Demonstrations
Flow Control, Part 1
- Flow control
- Prevents the sender from sending data faster than the receiver can process it
- Stop-and-wait flow control
- Do not send the next data segment until you have received the acknowledgment for the previous data segment
- Algorithm
- Each side of connection keeps track of:
- Next-outgoing-sequence-number, initially 0
- Last-outgoing-acknowledgment-number, initially 0
- Next-expected-incoming-sequence-number, initially 0
- To send a data segment:
- Wait until last-outgoing-acknowledgment-number == next-outgoing-sequence-number
- Send data segment with next-outgoing-sequence-number
- Increment next-outgoing-sequence-number
- When a data segment is received:
- If data segment's sequence number == next-expected-incoming-sequence-number, then process payload and increment next-expected-incoming-sequence-number, otherwise ignore payload
- Send acknowledgement for next-expected-incoming-sequence-number
- Last-outgoing-acknowledgment-number = data segment's acknowledgement number
- Package edu.rit.ron.rmtp
- Demonstrations
Lost Packet Recovery
- Timeouts
- Start a timer whenever a data segment is sent
- Stop the timer when the acknowledgment is received
- (The version below uses a fixed timeout of 5 seconds)
- Retransmissions
- If the timer times out, retransmit the data segment
- If the data segment has been sent too many times, abandon the connection
- Package edu.rit.ron.rmtp
- Demonstrations
Adaptive Timeouts
- RFC 2988, "Computing TCP's Retransmission Timer" (November 2000)
- Based on measuring the round trip time (RTT)
- Time from transmission of data segment to reception of acknowledgment segment
- Only measured for the first transmission, not for subsequent retransmissions
- Variables
- RTO -- retransmission timeout
- SRTT -- smoothed RTT
- RTTVAR -- RTT variance
- R -- RTT measurement
- G -- Clock granularity
- Calculation of RTO
- R = RTT measurement
- If this is the first RTT measurement:
- Otherwise:
- RTTVAR = 3/4 * RTTVAR + 1/4 * abs (SRTT - R)
- SRTT = 7/8 * SRTT + 1/8 * R
- RTO = SRTT + max (G, 4 * RTTVAR)
- (For the Internet, RFC 2988 requires a minimum value of 1 second for RTO, but that is not implemented in the RIT Overlay Network)
- Timeout value
- If RTO has been calculated, use RTO, otherwise use 3 seconds
- For each retransmission, double the previous timeout
- Package edu.rit.ron.rmtp
- Demonstrations
Flow Control, Part 2
- A stop-and-wait protocol achieves flow control, but is inefficient on high-latency connections
- Alternative: Window-based flow control
- Window-based flow control in TCP
- TCP header includes a 16-bit window size field
- This tells the number of bytes available in the receive buffer of the host that sent the segment
- Sender can send as many bytes as the receiver's window size
- Then the sender must stop and wait until the receiver advertises a new window size
- A window size of 0 is legal and means, "Don't send any data now" -- the receiver closed the window
- When buffer space becomes available, the receiver sends a segment with a nonzero window size -- the receiver opened the window
- When the receiver's window is closed, the sender periodically times out and sends a probe segment; the receiver responds with an ack segment stating the window size
- Improving TCP performance
- Goal: Sender should send data in big chunks, not little chunks
- Goal: Receiver should ask for data in big chunks, not little chunks
- Goal: Receiver should piggyback acks, not send separate small ack segments
- Sending data in little chunks, e.g. Telnet or SSH
- Nagle's algorithm (1984) -- RFC 896, "Congestion Control in IP/TCP Internetworks" (January 1984)
- The first time, send a little chunk
- Accumulate little chunks until the ack for the first chunk returns, then send the accumulated chunks as one big chunk
- Can also send the accumulated chunks before that if they would fill half the receiver's window
- Can also send the accumulated chunks before that if they would fill the MSS
- Receiving data in little chunks, e.g. a byte-at-a-time application
- Receiver will oscillate window size between 0 and 1, sender will send one-byte segments -- silly window syndrome
- Clark's algorithm -- RFC 813, "Window and Acknowledgment Strategy in TCP" (July 1982)
- Receiver cannot advertise a new window size until the receive buffer can hold the MSS or the receive buffer is half empty, whichever is smaller
- Piggybacking acks
- If there is data traffic from the sender back to the receiver, the acks can piggyback on the traffic immediately
- If there is no data traffic from the sender back to the receiver, wait a short time to see if some data traffic shows up
- If not, then send a separate ack segment
- Reduces the overhead due to separate ack segments
- Increases the RTT
Congestion Control
- Each sending host's Transport Layer sends data at as high as rate as possible while losing as few packets as possible
- Alternative 1: Explicit rate feedback
- Each router (Network Layer) tells the sending hosts (Transport Layer) how fast to send data
- Hosts send data no faster than that rate
- Asynchronous Transfer Mode Available Bit Rate (ATM ABR) service works this way
- Alternative 2: Explicit congestion feedback
- Each router (Network Layer) tells the sending hosts (Transport Layer) whether the router is congested or not
- Hosts reduce their sending rate until the routers report they're not congested
- ATM can also work this way
- Alternative 3: No feedback
- The routers (Network Layer) don't tell the sending hosts (Transport Layer) anything
- Hosts must deduce congestion based on network behavior
- Hosts reduce their sending rate until it appears there is no congestion
- The Internet (TCP/IP) works this way
TCP Congestion Control
- Deducing congestion: Packet loss events
- Triple acks
- Timeouts
- TCP assumes that if a packet is lost, the reason is congestion, so slow down the sender
- This is usually a good assumption on sessile networks
- This is usually a bad assumption on wireless networks
- Controlling sending rate: Congestion window
- Works exactly the same as the flow control receive window
- Sender must stop and wait after sending min (receive window, congestion window) bytes
- For the rest of this section, assume receive window > congestion window, so congestion control (not flow control) determines the sending rate
- Determining congestion window size: Congestion control algorithm
- "TCP Reno" algorithm, first implemented in the Reno release of BSD Unix (1990), most widely used today
- Threshold = infinity
- Window = 1 * MSS
- Send (Window) number of bytes
- Wait for success (normal ack) or failure (triple ack, timeout)
- If a triple ack occurs:
- Threshold = Window / 2 -- "Multiplicative decrease"
- Window = Threshold
- Else if a timeout occurs:
- Threshold = Window / 2 -- "Multiplicative decrease"
- Window = 1 * MSS
- Else if Window < Threshold:
- Window = 2 * Window -- "Slow start"
- Else:
- Window = Window + 1 * MSS -- "Additive increase"
- Go to Step 3
- "TCP Tahoe" algorithm, predecessor of TCP Reno, very similar to TCP Reno except both a triple ack and a timeout initiate a slow start
- Threshold = infinity
- Window = 1 * MSS
- Send (Window) number of bytes
- Wait for success (normal ack) or failure (triple ack, timeout)
- If a triple ack or a timeout occurs:
- Threshold = Window / 2 -- "Multiplicative decrease"
- Window = 1 * MSS
- Else if Window < Threshold:
- Window = 2 * Window -- "Slow start"
- Else:
- Window = Window + 1 * MSS -- "Additive increase"
- Go to Step 3
- Steady-state average throughput of a TCP connection transferring a large quantity of data
- Throughput of a TCP connection transferring a small quantity of data, like a little web page or image
|
Data Communications and Networks II
|
|
4003-541-70/4005-741-70
|
|
Spring Quarter 2006
|
|
Course Page
|
|
Alan Kaminsky
|
|
•
|
|
Department of Computer Science
|
|
•
|
|
Rochester Institute of Technology
|
|
•
|
|
4486 +
2220 =
6706
|
|
Home Page
|
Copyright © 2006 Alan Kaminsky.
All rights reserved.
Last updated 22-May-2006.
Please send comments to ark@cs.rit.edu.
|