\section{CCP Implementation}
\label{s:datapath}
We implement a user-space CCP agent in Rust, called Portus\footnote{\url{github.com/ccp-project/portus}}, which provides functionality common to independent congestion control algorithm implementations, including a compiler for the datapath language and a serialization library for IPC.
CCP congestion control algorithms are hence implemented in Rust; we additionally expose bindings in Python. The remainder of this section discusses datapath support for CCP.
\subsection{Datapath Requirements}
\label{sec:implementation-basics}
% this text was in the sigcomm submission, I think it makes things flow better. (dr)
A CCP-compatible datapath must accurately enforce the congestion
control algorithm specified by the user-space CCP module.
%From the datapath’s perspective, it is
%almost as if the developer were writing their algorithm directly
%into the datapath, while algorithms themselves are free from considering datapath-specific implementation issues.
Once a datapath implements
support for CCP, it automatically enables all CCP algorithms.
An implementation of the CCP datapath must perform the following functions:
\begin{itemize}
\item The datapath should communicate with a \userspace CCP agent using an IPC
mechanism. The datapath multiplexes \ct{report}s from multiple connections
onto the single persistent IPC connection to the slow path. It must also
serialize every message it sends and deserialize every message it receives.
\item The datapath should safely execute the user-provided domain-specific
program upon the arrival of every acknowledgment and upon every timeout. Datapath
programs (\Sec{sec:ccp}) may include simple computations to summarize
per-packet congestion signals (Table~\ref{tab:api}) and enforce congestion
windows and rates.
\end{itemize}
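The first requirement can be illustrated with a minimal sketch in Python. The wire format below (a message type, a total length, and a flow identifier, followed by 64-bit fields) is an assumption made for illustration and is not the actual CCP message format; the names \texttt{serialize\_report} and \texttt{deserialize\_stream} are likewise hypothetical.
\begin{verbatim}
```python
import struct

# Hypothetical wire format (an assumption, not the real CCP format):
# 2-byte message type, 2-byte total length, 4-byte flow id,
# followed by zero or more unsigned 64-bit fields.
HEADER = struct.Struct("<HHI")
MSG_REPORT = 1


def serialize_report(flow_id, fields):
    """Pack one flow's report as a header plus unsigned 64-bit fields."""
    body = struct.pack("<%dQ" % len(fields), *fields)
    return HEADER.pack(MSG_REPORT, HEADER.size + len(body), flow_id) + body


def deserialize_stream(buf):
    """Split a byte stream carrying many flows' reports into messages."""
    msgs = []
    offset = 0
    while offset < len(buf):
        msg_type, length, flow_id = HEADER.unpack_from(buf, offset)
        n_fields = (length - HEADER.size) // 8
        fields = struct.unpack_from("<%dQ" % n_fields, buf,
                                    offset + HEADER.size)
        msgs.append((msg_type, flow_id, list(fields)))
        offset += length
    return msgs


# Reports from two connections share a single IPC byte stream.
stream = serialize_report(7, [1460, 20000]) + serialize_report(9, [2920])
messages = deserialize_stream(stream)
```
\end{verbatim}
Carrying the flow identifier in every header is what lets a single persistent connection serve many flows: the receiver needs no per-flow channel to attribute each report.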
\subsection{Safe Execution of Datapath Programs}
\label{s:datapath:fold}
Datapaths are responsible for safely executing the program sent from the user-space CCP module. While CCP compiles the instructions and checks for mundane errors (e.g., use of undefined variables) before installation, it is the datapath's responsibility to interpret the instructions safely. For example, datapaths should prevent divide-by-zero errors when calculating user-defined variables and guarantee that programs cannot overwrite the congestion primitives. However, algorithms are allowed to set arbitrary congestion windows or rates, in the same way that any application can congest the network using UDP sockets.
Thankfully, this task is straightforward because datapath programs are limited in functionality:
programs may not enter loops, perform floating-point operations, define functions or data structures, allocate memory, or use pointers. Rather, programs are strictly a way to express arithmetic computations over a limited set of primitives, define when and how to set congestion windows and pacing rates, and report measurements.
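As a rough illustration of these two safety checks, the following Python sketch evaluates integer-only arithmetic over a small environment. The tuple-based expression form, the policy of returning zero on a zero divisor, and the names \texttt{evaluate} and \texttt{assign} are assumptions made for this example; they do not describe how \texttt{libccp} represents or executes programs.
\begin{verbatim}
```python
# Hypothetical program form (an assumption for illustration): an expression
# is an int literal, a variable name, or a tuple (op, lhs, rhs).
PRIMITIVES = frozenset({"Ack.bytes_acked", "Flow.rtt_sample_us"})


def evaluate(expr, env):
    """Evaluate an integer-only arithmetic expression without faulting."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, lhs, rhs = expr
    a, b = evaluate(lhs, env), evaluate(rhs, env)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    if op == "/":
        # Assumed policy: a zero divisor yields 0 instead of raising.
        return a // b if b != 0 else 0
    raise ValueError("unsupported operator: %r" % (op,))


def assign(name, expr, env):
    """Bind a user-defined variable; congestion primitives are read-only."""
    if name in PRIMITIVES:
        raise PermissionError("cannot overwrite primitive %s" % name)
    env[name] = evaluate(expr, env)


# The RTT sample is zero here, so the division must not fault.
env = {"Ack.bytes_acked": 3000, "Flow.rtt_sample_us": 0}
assign("rate", ("/", "Ack.bytes_acked", "Flow.rtt_sample_us"), env)
```
\end{verbatim}
Because the expression form has no loops, calls, or memory operations, every evaluation terminates after visiting each node once, which is what makes these checks sufficient in this simplified setting.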
\subsection{\texttt{libccp}: CCP's Datapath Component}
\label{s:datapath:libccp}
We have implemented a library, \ct{libccp}\footnote{\url{github.com/ccp-project/libccp}}, that provides a reference
implementation of CCP's datapath component, in order to simplify CCP datapath development.
\ct{libccp} provides a lightweight execution loop for
datapath programs, along with message serialization.
While we considered using eBPF~\cite{ebpf} or TCP BPF~\cite{tcpbpf}
as the execution loop, including our own makes \ct{libccp} portable to datapaths outside the Linux kernel; the execution loop runs the same code in all three datapaths we implemented.
To use \ct{libccp}, the datapath must provide callbacks that: (1) set the window and rate, (2) provide a notion of time, and (3) send an IPC message to CCP. Upon reading a message from CCP, the datapath calls \ct{ccp\_recv\_msg()}, which automatically demultiplexes the message to the correct flow. After updating congestion signals, the datapath can call \ct{ccp\_invoke()} to run the datapath program, which may update variable calculations, set windows or rates,
and send report summaries to CCP. It is the responsibility of the datapath to ensure that it correctly computes and provides the congestion signals in Table~\ref{tab:api}.
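The division of labor described above can be modeled with a short Python sketch. This is a simplified stand-in and not the \texttt{libccp} API: the class \texttt{DatapathModel}, its method names, and the representation of a datapath program as a Python callable are all hypothetical.
\begin{verbatim}
```python
class DatapathModel:
    """Hypothetical model of a datapath that registers three callbacks."""

    def __init__(self, set_cwnd, set_rate, now, send_msg):
        self.set_cwnd = set_cwnd  # (1) enforce the congestion window
        self.set_rate = set_rate  # (1) enforce the pacing rate
        self.now = now            # (2) provide a notion of time
        self.send_msg = send_msg  # (3) send an IPC message to the agent
        self.programs = {}        # flow id -> installed program

    def recv_msg(self, flow_id, program):
        """Model demultiplexing: route an agent message to its flow."""
        self.programs[flow_id] = program

    def invoke(self, flow_id, signals):
        """Run the flow's program on freshly updated congestion signals."""
        cwnd, rate, report = self.programs[flow_id](signals)
        self.set_cwnd(flow_id, cwnd)
        self.set_rate(flow_id, rate)
        if report is not None:
            self.send_msg(flow_id, self.now(), report)


def additive_increase(signals):
    """Toy program: grow the window by the bytes acked and report them."""
    cwnd = signals["cwnd"] + signals["bytes_acked"]
    return cwnd, 0, {"acked": signals["bytes_acked"]}


windows, rates, sent = {}, {}, []
dp = DatapathModel(
    set_cwnd=windows.__setitem__,
    set_rate=rates.__setitem__,
    now=lambda: 42,
    send_msg=lambda flow_id, t, report: sent.append((flow_id, t, report)),
)
dp.recv_msg(1, additive_increase)
dp.invoke(1, {"cwnd": 14600, "bytes_acked": 1460})
```
\end{verbatim}
The point of the sketch is that the interpreter touches the datapath only through the callbacks, so the same execution loop can be reused wherever those callbacks can be supplied.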
The more signals a datapath can measure, the more algorithms it can support. For example, CCP can only support DCTCP~\cite{DCTCP} or ABC~\cite{abc} on datapaths that provide ECN support; CCP will not run an algorithm on a datapath that lacks the primitives the algorithm requires.
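One way to picture this check is as a set comparison between the primitives an algorithm needs and those a datapath provides. The Python sketch below is illustrative only: the function names and the particular sets of required and provided primitives are assumptions, not a description of how CCP performs the check.
\begin{verbatim}
```python
def missing_primitives(required, provided):
    """Return the required primitives that the datapath does not provide."""
    return sorted(set(required) - set(provided))


def can_run(required, provided):
    """An algorithm is runnable only if no required primitive is missing."""
    return not missing_primitives(required, provided)


# Hypothetical requirement and capability sets, for illustration only.
ECN_ALGORITHM_NEEDS = {"Ack.bytes_acked", "Ack.ecn_bytes"}
DATAPATH_WITH_ECN = {"Ack.bytes_acked", "Ack.ecn_bytes",
                     "Flow.rtt_sample_us"}
DATAPATH_WITHOUT_ECN = {"Ack.bytes_acked", "Flow.rtt_sample_us"}
```
\end{verbatim}
Under these assumed sets, the ECN-based algorithm would be accepted on the first datapath and rejected on the second, with \texttt{Ack.ecn\_bytes} reported as the missing primitive.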
\subsection{Datapath Implementation}
\label{s:datapath:software_datapaths}
We use \ct{libccp} to implement CCP support in three software datapaths: the Linux kernel\footnote{Our kernel module is built on Linux 4.14: \url{github.com/ccp-project/ccp-kernel}}; mTCP, a DPDK-based datapath; and Google's QUIC.
For both the Linux kernel and QUIC datapaths, we leveraged their respective pluggable congestion control interfaces, which provide callbacks upon packet acknowledgements and timeouts; the \ct{libccp} program interpreter is invoked from these callbacks.
The kernel module implements the communication channel to CCP using either Netlink sockets or a custom
character device, while mTCP and QUIC use Unix domain sockets.
We additionally modified the QUIC source code to support multiplexing CCP flows on one persistent IPC connection and to expose the function callbacks required by the \ct{libccp} API.
Unlike QUIC and the Linux kernel, mTCP only implements Reno and does not explicitly expose a congestion control interface for new algorithms.
To achieve behavior consistent with the other datapaths, we also added SACK and packet pacing to mTCP, which previously lacked both.
%
%mTCP's lack of support for SACK caused us problems: when it observes a triple duplicate ACK,
%it jumps back and retransmits all packets starting from the lost packet (\ie Go-Back-N), even in the case of isolated loss.
%As a result, when initially testing our implementation of Cubic, each time the buffer was filled, CCP
%would cut the window, but when the cumulative ACK finally advanced, mTCP would immediately transmit a congestion window’s worth of packets in a burst (most of which had already been successfully received and SACKed), causing more congestion. \ma{I agree with Eddie that we can cut this.}
Defining the congestion signal primitives, the IPC channel, and the window and rate enforcement mechanisms is the only datapath-specific work needed to support CCP.
As an example, Table~\ref{tab:api:kernel} details the mapping of kernel variables to CCP primitives.
Most of these definitions are straightforward; the CCP API merely requires datapaths to \textit{expose} variables they are already measuring.
All other necessary functionality, most notably interpreting and running the datapath programs, is shared amongst software datapaths via \texttt{libccp} (\S\ref{s:datapath:libccp}).
\begin{table}
\centering
\footnotesize
\begin{tabular}{p{0.35\columnwidth}p{0.5\columnwidth}}
\textbf{Signal} & \textbf{Definition} \\
\hline
\texttt{Ack.bytes\_acked}, \texttt{Ack.packets\_acked} & \texttt{Delta(tcp\_sock.bytes\_acked)} \\
\texttt{Ack.bytes\_misordered}, \texttt{Ack.packets\_misordered} & \texttt{Delta(tcp\_sock.sacked\_out)} \\
\texttt{Ack.ecn\_bytes}, \texttt{Ack.ecn\_packets} & \texttt{in\_ack\_event: CA\_ACK\_ECE} \\
\texttt{Ack.lost\_pkts\_sample} & \texttt{rate\_sample.losses} \\
\texttt{Ack.now} & \texttt{getnstimeofday()}\\
\texttt{Flow.was\_timeout} & \texttt{set\_state: TCP\_CA\_Loss} \\
\texttt{Flow.rtt\_sample\_us} & \texttt{rate\_sample.rtt\_us} \\
\texttt{Flow.rate\_outgoing} & \texttt{rate\_sample.delivered / Delta(tcp\_sock.first\_tx\_mstamp)} \\
\texttt{Flow.rate\_incoming} & \texttt{rate\_sample.delivered / Delta(tcp\_sock.tcp\_mstamp)} \\
\texttt{Flow.bytes\_in\_flight}, \texttt{Flow.packets\_in\_flight} & \texttt{tcp\_packets\_in\_flight(tcp\_sock)} \\
\end{tabular}
%\vspace{0.075in}
\caption{Definition of CCP primitives in terms of the \texttt{tcp\_sock} and \texttt{rate\_sample} structures, for the Linux kernel datapath.}\label{tab:api:kernel}
\end{table}
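Several entries in Table~\ref{tab:api:kernel} are written as \texttt{Delta(...)}, meaning the change in a cumulative kernel counter since the previous sample. A minimal Python sketch of that idea follows; the class name \texttt{Delta} mirrors the table's notation, but the code is an illustration with made-up sample values, not the kernel module's implementation.
\begin{verbatim}
```python
class Delta:
    """Report the change in a cumulative counter since the last sample."""

    def __init__(self, initial=0):
        self.last = initial

    def update(self, current):
        """Record the new counter value and return the difference."""
        change = current - self.last
        self.last = current
        return change


# Made-up cumulative values of a byte counter observed on three ACKs.
bytes_acked = Delta()
per_ack = [bytes_acked.update(total) for total in (1460, 4380, 4380)]
```
\end{verbatim}
Exposing a per-ACK difference rather than the running total matches the observation above that the datapath only needs to expose quantities it already measures.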