By: Anant Shah, Research Scientist, Edgecast, and Kenji Noguchi, Senior Software Development Engineer, Edgecast
The Transmission Control Protocol (TCP) congestion control algorithm (CCA) governs how much data should be sent between clients and servers to maximize the utilization of available bandwidth and avoid congestion. Since its inception, other CCAs have been developed, such as Bottleneck Bandwidth and Round-trip propagation time (TCP BBR), and CUBIC. While TCP BBR and CUBIC aim to achieve congestion avoidance, understanding their effectiveness has been an ongoing mission for our engineering and research teams.
TCP BBR aims to achieve higher throughput by using packet delay, rather than packet loss, as its congestion indicator. However, our previous research reported that BBR does not perform well in all cases. Specifically, our evaluation found little to no throughput benefit for small files (<100KB). Moreover, we observed that BBR performed worse than CUBIC for flows with low round-trip time (RTT) and low retransmits. Finally, the BBR benefits were only seen for client-facing traffic, while back-office connections (low RTT and negligible retransmits) performed better with CUBIC.
Edgecast is a global multi-tenant CDN delivering web traffic for many large (VOD and live) video streaming customers. Given that congestion control tunings using BBR for one customer can adversely affect another customer’s performance, and a blanket enablement might result in degradation of performance in some scenarios, we implemented a mechanism to detect cases where BBR provides improved performance and can dynamically enable it for all CDN customers. The result is a new, dynamic congestion control tuning feature that we’ve made available for all of our customers.
Perhaps the most important input to such a dynamic system is the data that powers it. Our dynamic congestion control tuning mechanism sits on top of our large-scale socket data collection tool, xTCP, which exports TCP socket performance data from all the edge caches. Specifically, it extracts information from the Linux kernel's tcp_info structure via netlink and streams it via Kafka into a ClickHouse cluster. Having this socket performance data at scale allows us to monitor the performance of the connections to our cache servers at very high granularity. xTCP has proven to be a powerful tool for many CDN optimizations. For example, we recently tuned our IPv6 initial congestion window and monitored the performance gains using xTCP.
xTCP is similar to Google Measurement Lab’s (M-Lab) tcp-info tool, with significant differences coming from the optimizations needed to manage the large number of sockets seen by our edge caches (compared to M-Lab servers) and the ability to export the data in protobuf format. Stay tuned: we plan to open source xTCP soon.
In the following figure, we show the overview of our system. xTCP data is collected at scale from all our edge caches and streamed into Kafka. It is then stored in a ClickHouse cluster, which powers our network data analytics, including the BBR controller that detects the underperforming prefixes at each edge PoP.
While we want to maintain the dynamic nature of our workflow, we also need to make sure we select consistently under-performing prefixes at each edge point of presence (PoP) to avoid flip-flopping between CUBIC and BBR over short durations. And, as previously noted, we selectively activate BBR for requests where the file size is greater than 100KB. A fine-tuned CUBIC flow performs better for small files.
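As a rough illustration of the selection rule described above, the following Python sketch picks a congestion control algorithm per request. It is a minimal sketch under stated assumptions, not our production logic: the helper name `choose_cca`, the constant `MIN_BBR_FILE_SIZE`, and the example prefixes (taken from documentation address ranges) are all hypothetical.

```python
import ipaddress

# Assumption: "greater than 100KB" is interpreted as more than 100 * 1024 bytes.
MIN_BBR_FILE_SIZE = 100 * 1024


def choose_cca(client_ip, file_size_bytes, bbr_prefixes):
    """Return "bbr" or "cubic" for a single request (hypothetical helper).

    bbr_prefixes is the list of client prefixes the controller has
    selected as consistently underperforming at this PoP.
    """
    # Small files stay on CUBIC regardless of the client prefix.
    if file_size_bytes <= MIN_BBR_FILE_SIZE:
        return "cubic"
    addr = ipaddress.ip_address(client_ip)
    for prefix in bbr_prefixes:
        # Compare only within the same address family (IPv4 vs. IPv6).
        if addr.version == prefix.version and addr in prefix:
            return "bbr"
    return "cubic"


# Example prefixes from the reserved documentation ranges (illustrative only).
bbr_prefixes = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]
```

On Linux, one way to apply such a choice to an individual socket is the `TCP_CONGESTION` socket option. That is mentioned only as a general possibility, not as a description of how our edge caches apply the generated configuration.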
The BBR controller uses two metrics to assess the health of every observed client prefix:
The algorithm then detects the prefixes that have performed consistently worse over the past few hours. This detection runs every 5 minutes. While the total number of prefixes selected per edge PoP could be in the hundreds, we observed that prefix performance remains relatively consistent. The same prefixes are regularly selected, and new additions in each round (as shown in the following figure from the Chicago PoP) are very few.
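One way to avoid flip-flopping is to require that a prefix appear among the worst performers in most of the recent detection rounds before it is selected. The sketch below illustrates that idea only. The class name, the window of 36 rounds (three hours at a 5-minute cadence), and the 80% threshold are assumptions made for illustration, and the per-round ranking of prefixes is taken as given.

```python
from collections import Counter, deque


class ConsistentPrefixSelector:
    """Select prefixes that are repeatedly among the worst performers.

    Hypothetical sketch: window_rounds and min_fraction are illustrative
    defaults, not values from the production controller.
    """

    def __init__(self, window_rounds=36, min_fraction=0.8):
        # Keep only the most recent window_rounds detection rounds.
        self.rounds = deque(maxlen=window_rounds)
        self.min_fraction = min_fraction

    def add_round(self, worst_prefixes):
        """Record the worst-performing prefixes seen in one detection round."""
        self.rounds.append(frozenset(worst_prefixes))

    def selected(self):
        """Return prefixes present in at least min_fraction of the window."""
        if len(self.rounds) < self.rounds.maxlen:
            return set()  # not enough history to judge consistency yet
        counts = Counter(p for r in self.rounds for p in r)
        needed = self.min_fraction * len(self.rounds)
        return {p for p, c in counts.items() if c >= needed}


# Usage with a short window so the behavior is easy to follow.
selector = ConsistentPrefixSelector(window_rounds=4, min_fraction=0.75)
for worst in [{"a", "b"}, {"a"}, {"a", "b"}, {"a", "c"}]:
    selector.add_round(worst)
stable = selector.selected()  # only "a" appears in at least 3 of the 4 rounds
```

A prefix that shows up in a single bad round ("c" above) is ignored, which keeps the selected set stable from one round to the next.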
New prefixes, if any, are selected to enable BBR, and a configuration is generated, which is passed through a validation step and pushed out to our edge caches globally.
We are happy to report that enabling BBR across our edge worldwide has shown considerable performance improvements. A key metric we track from the xTCP socket data is the delivery rate reported in TCP_INFO. Since we dynamically enable BBR for the most underperforming prefixes, we expect our lower percentile (worst case) delivery rate to improve.
The following figure shows the improvement in the 5th and 10th percentile delivery rate at our Los Angeles PoP as soon as the BBR change was enabled.
Similarly, in the following figure, we show considerable improvement (~2x) in the lower percentile delivery rate for a large residential ISP in the U.S. as soon as we dynamically enabled BBR at all of our North American PoPs.
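For readers who want to reproduce this kind of comparison on their own data, the sketch below computes lower-percentile delivery rates from a list of samples. It uses the nearest-rank definition of a percentile, which is an assumption made for simplicity, and the function name and sample values are illustrative rather than measured.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Illustrative delivery-rate samples in Mbps (synthetic, not measured data).
samples = list(range(1, 101))
p5 = percentile(samples, 5)
p10 = percentile(samples, 10)
```

Comparing `p5` and `p10` before and after a change focuses on the worst-served connections, which is where a selective BBR rollout is expected to help.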
The delivery rate extracted from tcp-info provides a good estimate of performance seen by the client. However, the most accurate indicator of performance is the throughput seen in the HTTP access logs for the client connection, i.e., goodput.
We measure the goodput from an edge cache server. As shown in the following figure, the change resulted in increased goodput. Overall, the 10th percentile goodput increased by 12%.
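To make the goodput metric concrete, here is a minimal sketch that derives it from one access-log record and expresses a change as a percentage. The function names and the assumption that each record carries the response size in bytes and the transfer time in seconds are hypothetical, and the numbers in the usage lines are made up.

```python
def goodput_mbps(body_bytes, transfer_seconds):
    """Goodput of one response in megabits per second."""
    if transfer_seconds <= 0:
        raise ValueError("transfer time must be positive")
    return body_bytes * 8 / transfer_seconds / 1e6


def relative_change_pct(before, after):
    """Percentage change from before to after."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100


# Illustrative values only: a 1 MB response delivered in 2 seconds.
example = goodput_mbps(1_000_000, 2.0)
change = relative_change_pct(50.0, 56.0)
```

Unlike the kernel-reported delivery rate, this figure counts only the payload bytes recorded in the log for the client connection.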
Special thanks to the BBR development team at Google for their amazing work on BBRv1 and their continued effort on BBRv2. We look forward to BBRv2 and will continue to push relevant changes to our platform in the near future. Kudos to Sergio Ruiz, Joseph Korkames, Joe Lahoud, Juan Bran, Daniel Lockhart, Ben Lovett, Colin Rasor, Mohnish Lad, Muhammad Bashir, Zach Jones, and Dave Andrews at Edgecast for supporting this change during development, testing, and rollout. The Edgecast engineering team would especially like to thank Dave Seddon for his contributions to the development of the xTCP tool that powered much of the analysis.
With dynamic congestion control tuning, Edgecast customers now automatically gain performance improvements for their underperforming clients and improve their bottom-line performance, resulting in faster web delivery and fewer rebuffers for video streaming.
Contact us to learn more about how Edgecast can deliver better performance for your web content or application.