Sunday, October 17, 2010

BGP is well known for its faults. To me, it is interesting that the protocol is so different from TCP/IP, a protocol pair that has managed to scale so well as the Internet has grown. Using the deficiencies of BGP described in the paper "HLP: A Next Generation Inter-domain Routing Protocol", here are several ways in which the two protocol families differ.

- Scalability. TCP has scaled well as the number of users on the Internet has increased because it keeps little to no state inside the network that is of interest to any other connection. TCP has had a difficult time scaling as available bandwidth and connection latency have both increased, but sender-only additions have been made to address this issue. BGP, on the other hand, doesn't scale as well. Each BGP router must maintain state that grows linearly with the size of the Internet, and the number of globally visible updates needed to keep that state synchronized also grows linearly with the size of the Internet.

- Convergence. TCP converges in response to one signal only: packet drops. Even though packet drops are tied to a single piece of intermediate state, router queue size, the drop events themselves are local in scope, which greatly simplifies the process of converging on a transmission rate. BGP, on the other hand, as mentioned above, maintains a significant amount of globally shared state, all of which must settle before the protocol has converged. Furthermore, sender-side additions to TCP have greatly improved its ability to converge rapidly to a healthy transmission rate, whereas no such improvements have been made to BGP.

- Fault isolation. Because of its initial use for military communication, the TCP/IP stack was designed to keep operating in the face of local failures by storing connection state at the endpoints. BGP, on the other hand, frequently requires global state updates to handle local faults.
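To make the convergence point a bit more concrete, here is a minimal sketch of additive-increase/multiplicative-decrease, the basic response classic TCP makes to a drop. The function name `aimd`, the fixed `capacity` threshold standing in for a full router queue, and the constants are all my own assumptions for illustration; slow start, timeouts, and competing flows are left out.

```python
def aimd(rounds, capacity, cwnd=1.0, increase=1.0, decrease=0.5):
    """Toy AIMD loop: the sender reacts only to its own drop signal.

    `capacity` is a hypothetical stand-in for the point at which a
    router queue overflows; real TCP never sees this value directly.
    """
    history = []
    for _ in range(rounds):
        dropped = cwnd > capacity  # assumed model of a packet drop
        if dropped:
            cwnd = max(1.0, cwnd * decrease)  # multiplicative decrease
        else:
            cwnd += increase  # additive increase, one unit per round trip
        history.append(cwnd)
    return history


trace = aimd(rounds=50, capacity=20.0)
```

The only input to the loop is whether this sender's own packets were dropped. Nothing about other connections or other routers has to be learned or synchronized, which is roughly why the window can settle into its sawtooth around the available capacity without any global coordination.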

Overall, the primary difference between the two protocol families is that one keeps all of its state on the ends and the other keeps complex state at the core of the Internet. While this analysis may not lead to improvements in BGP or inter-domain protocols, it is interesting to compare the two types of protocols to help understand the origins of BGP's current flaws.
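The state-in-the-core point can be sketched with a toy model as well. The code below is not BGP: it assumes every domain simply prefers the shortest path (real BGP applies policy), and the five-domain chain topology and the helper names `path_vector_tables` and `routes_using_link` are hypothetical. It only shows that each node ends up holding one route per destination, and that a single link can appear in routes stored all over the network.

```python
from collections import deque


def path_vector_tables(graph):
    """For every node, store a full path to every reachable node (BFS).

    Shortest-path selection is an assumption made for this sketch.
    """
    tables = {}
    for src in graph:
        paths = {src: [src]}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in paths:
                    paths[nbr] = paths[node] + [nbr]
                    queue.append(nbr)
        tables[src] = paths
    return tables


def routes_using_link(tables, a, b):
    """Count stored routes, across all nodes, that traverse link a-b."""
    count = 0
    for paths in tables.values():
        for path in paths.values():
            if any({x, y} == {a, b} for x, y in zip(path, path[1:])):
                count += 1
    return count


# Hypothetical chain of five domains: A - B - C - D - E
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
tables = path_vector_tables(graph)
table_sizes = {node: len(paths) for node, paths in tables.items()}
affected = routes_using_link(tables, "C", "D")
```

In this toy chain every node's table has one entry per domain, so per-node state grows with the size of the network, and a failure of the single C-D link would invalidate routes held at every node. A TCP endpoint, by contrast, only holds state for its own connections.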
