NB-Cache: Non-Blocking In-Network Caching for High-Speed Content Routers
Unlike an IP router's stateless forwarding model, a content router has a sophisticated data plane consisting of a three-stage pipeline. A pipeline runs only as fast as its slowest stage, yet this simple but fundamental fact is often overlooked by the research community. Many works optimize a single pipeline stage, e.g., the forwarding information base (FIB), without checking whether that stage is actually the bottleneck. To avoid such “blind optimization”, we start by building a model and a prototype, which together identify the content store (CS) as the bottleneck. Instead of tuning CS performance, we rethink the content router architecture and propose a novel traffic bypass mechanism called “NB-Cache”. In a content router, all traffic must pass through the CS as the first pipeline stage, while only part of the traffic enters the subsequent two stages, so a congested CS blocks everything behind it. When the CS is congested, NB-Cache lets traffic bypass the overloaded local CS and forwards it to lightly loaded upstream routers. This design does increase the packet travel distance, but it also relieves congestion through better network-wide load balancing. Our evaluation shows that NB-Cache reduces the round-trip time by 70.10% and improves the end-to-end throughput by 130.48% compared with the original design.
Specifically, NB-Cache includes four techniques: a Bloom filter as the first-stage bypass; active queue management as the second-stage bypass; non-blocking I/O for immediate control return; and router-assisted congestion control featuring lazy congestion notification (i.e., NB-CC). NB-Cache does not change the router's external interfaces, so it remains compatible with classic content routers.
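The two bypass stages can be pictured as a small admission check in front of the CS. The following is only a minimal sketch under stated assumptions, not the code in this repository: it assumes the first stage bypasses requests whose content is definitely not cached, and the second stage bypasses requests when the CS queue has reached a fixed length limit. The names CsAdmission, Action, maybe_cached and queue_limit are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Possible outcomes for an Interest arriving in front of the content store (CS).
enum class Action { BypassNotCached, BypassCongested, EnqueueForCs };

// Hypothetical two-stage admission check (illustrative names and policy).
struct CsAdmission {
    // Stage 1: membership test, e.g. a Bloom filter query (assumed behavior).
    std::function<bool(const std::string&)> maybe_cached;
    // Stage 2: simple queue-length threshold standing in for active queue management.
    std::size_t queue_limit;

    Action decide(const std::string& name, std::size_t queue_len) const {
        assert(static_cast<bool>(maybe_cached));  // a membership test must be set
        if (!maybe_cached(name)) return Action::BypassNotCached;   // definitely a miss
        if (queue_len >= queue_limit) return Action::BypassCongested;  // CS overloaded
        return Action::EnqueueForCs;  // worth waiting for a CS lookup
    }
};
```

In both bypass cases the Interest would skip the local CS and continue toward the upstream routers; only the third outcome enters the CS queue.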
The folder contention contains code that shows content store performance under I/O contention. We emulate a single router with Data packets and Interest packets arriving from two different directions, i.e., the ingress and the egress. con_data_pkt.txt and con_interest_pkt.txt contain pre-defined requests. bloom_filter.hpp implements the Bloom filter used for the first-stage bypass. shmqueue.h and shmqueue.cpp implement the interprocess communication queue. contention.cpp emulates the performance of a single router under I/O contention.
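For readers unfamiliar with Bloom filters, here is a minimal sketch of the data structure. It is not the contents of bloom_filter.hpp; the class name, the parameters and the double-hashing scheme built on std::hash are assumptions chosen for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Illustrative Bloom filter over content names (not the repository's implementation).
class BloomFilter {
public:
    BloomFilter(std::size_t num_bits, std::size_t num_hashes)
        : bits_(num_bits, false), num_hashes_(num_hashes) {
        assert(num_bits > 0 && num_hashes > 0);
    }

    // Record a content name by setting one bit per hash function.
    void insert(const std::string& name) {
        for (std::size_t i = 0; i < num_hashes_; ++i) bits_[index(name, i)] = true;
    }

    // false: the name was never inserted. true: it may have been (false positives possible).
    bool maybe_contains(const std::string& name) const {
        for (std::size_t i = 0; i < num_hashes_; ++i)
            if (!bits_[index(name, i)]) return false;
        return true;
    }

private:
    // Double hashing: derive the i-th bit position from two base hashes.
    std::size_t index(const std::string& name, std::size_t i) const {
        std::size_t h1 = std::hash<std::string>{}(name);
        std::size_t h2 = std::hash<std::string>{}(name + "#salt") | 1;  // keep step non-zero
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t num_hashes_;
};
```

Because a negative answer is always correct, a request that fails the membership test can skip the CS without risking a missed cache hit; a false positive only costs one unnecessary CS lookup.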
In the folder nbcache-emulation, we emulate a simple network consisting of four routers, one end host that sends Interest packets, and multiple links between the routers and the host. interest_pkt.txt contains 20000 pre-defined Interest requests. There are five cases. caseA.cpp uses the classic content router architecture, without the Bloom filter, active queue management, or non-blocking I/O. caseA+ecn.cpp adds an ECN-based congestion control scheme to caseA. caseB.cpp uses an NB-Cache-enabled router. caseB+ecn.cpp adds ECN-based congestion control to caseB. caseB+nbcc.cpp instead adds NB-CC (congestion control for non-blocking content caching) as the transport-layer companion of NB-Cache.
The folder libeio contains libeio, which we use for non-blocking I/O. eio.c is used when emulating the simple network (nbcache-emulation), while eio.c_contention is used when emulating single-router performance under I/O contention (contention).
To begin with, we need to compile the libeio code:
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
Note: eio.c is used for nbcache-emulation. For contention, replace eio.c with eio.c_contention, which modifies the original eio.c to add a readers-writer lock for handling contention between Interest and Data packets.
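To illustrate what a readers-writer lock offers in this setting, here is a minimal sketch using POSIX pthread_rwlock_t. It is not the modification made in eio.c_contention: the LockedStore class and the assumption that Interest lookups take the shared lock while Data insertions take the exclusive lock are illustrative only.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <unordered_map>

// Illustrative content store guarded by a readers-writer lock (hypothetical class).
class LockedStore {
public:
    LockedStore() {
        int rc = pthread_rwlock_init(&lock_, nullptr);
        assert(rc == 0);
        (void)rc;  // silence unused warning when NDEBUG is defined
    }
    ~LockedStore() { pthread_rwlock_destroy(&lock_); }
    LockedStore(const LockedStore&) = delete;
    LockedStore& operator=(const LockedStore&) = delete;

    // Data path (assumed writer): exclusive access while the entry is stored.
    void put(const std::string& name, const std::string& payload) {
        pthread_rwlock_wrlock(&lock_);
        store_[name] = payload;
        pthread_rwlock_unlock(&lock_);
    }

    // Interest path (assumed reader): many lookups may hold the lock at once.
    bool get(const std::string& name, std::string& out) {
        pthread_rwlock_rdlock(&lock_);
        auto it = store_.find(name);
        bool hit = (it != store_.end());
        if (hit) out = it->second;
        pthread_rwlock_unlock(&lock_);
        return hit;
    }

private:
    pthread_rwlock_t lock_;
    std::unordered_map<std::string, std::string> store_;
};
```

With this arrangement, concurrent lookups do not block one another, and only an insertion serializes access. Like the other programs in this repository, the sketch is built with -lpthread.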
Then, compile the contention code:
$ g++ contention.cpp shmqueue.cpp -o contention -lpthread -leio
Or, compile the nbcache-emulation code:
$ g++ caseA.cpp shmqueue.cpp -o caseA -lpthread
$ g++ caseA+ecn.cpp shmqueue.cpp -o caseAecn -lpthread
$ g++ caseB.cpp shmqueue.cpp -o caseB -lpthread -leio
$ g++ caseB+ecn.cpp shmqueue.cpp -o caseBecn -lpthread -leio
$ g++ caseB+nbcc.cpp shmqueue.cpp -o caseBnbcc -lpthread -leio