This repo contains our simulator and prototype code of BitSense. Specifically,
-
simulator/
contains a trace-driven simulator we built to integrate various sketching algorithms with BitSense. -
testbed/
contains BitSense data plane prototype written in$\mathrm{P4}_{16}$ (see here) that can be compiled and loaded onto Tofino switches.
We are willing to hear your feedback. Please let us know if you'd like to contribute to BitSense applications. If you have further questions and suggestions, don't hesitate to contact us at dr_abc@pku.edu.cn or huangqun@pku.edu.cn.
Simulator programs include the following directories.
bin/
contains executable files after compilationbuild/
contains intermediate objects generated by CMake and Makeconfig/
contains configuration files for BitSense and the integrated sketch algorithmsdata/
contains a truncated trace parsed from CAIDAsrc/
contains BitSense source codetest/
contains BitSense testing codethird_party/
contains third-party code as git submodules
We require the following dependencies to run BitSense simulator programs on Linux or Mac.
Dependency | Installation (on Linux) | Installation (on Mac) |
---|---|---|
Cmake (>=3.20) | sudo apt-get install cmake -y |
brew install cmake |
boost (>=1.75) | sudo apt-get install libboost-all-dev -y |
brew install boost |
libpcap (>=1.9) | sudo apt-get install libpcap-dev -y |
brew install libpcap |
PcapPlusPlus (>=21.05) | Method 1. Build from source using its default configuration and installation directory Method 2. brew install pcapplusplus |
brew install pcapplusplus |
Simulator programs also make use of three third-party libraries, namely eigen, fmt, and tomlplusplus. They are maintained as git submodules. Hence, don't forget to clone them with git submodule update --init
.
Besides, make sure your C++ compiler supports C++17 and the python interpreter version is at least 3.7 to enable essential library features.
The following shell script builds the simulator. As long as the dependencies are correctly installed, the script should run successfully.
mkdir -p simulator/build && cd simulator/build
cmake ..
make
To ensure each simulator module is built successfully, you can run ctest
in the simulator/build
directory.
You should see the following line in the testing output.
100% tests passed, 0 tests failed out of 8
After executing the script, you should see a number of executable files in the bin/
directory.
Each of them implements a sketch framework indicated by its name. Note that a file named as BS_XXX
implements the BitSense version of the sketch XXX
. For example, CM
implements the original Count-Min sketch, while BS_CM
implements the BitSense version of Count-Min sketch. The following table lists the command of running sketch frameworks in the simulator/bin
directory.
Command | Meaning | Command | Meaning |
---|---|---|---|
./CM |
Run Count-Min sketch | ./BS_CM |
Run BitSense + Count-Min sketch |
./CS |
Run Count sketch | ./BS_CS |
Run BitSense + Count sketch |
./ES |
Run Elastic sketch | ./BS_ES |
Run BitSense + Elastic sketch |
./FR |
Run Flow Radar | ./BS_FR |
Run BitSense + Flow Radar |
./NS |
Run NitroSketch | ./BS_NS |
Run BitSense + NitroSketch |
./NZE |
Run Nearly-zero-error sketch | ./BS_NZE |
Run BitSense + Nearly-zero-error sketch |
./PR |
Run PR sketch | ./BS_PR |
Run BitSense + PR sketch |
./UM |
Run UnivMon | ./BS_UM |
Run BitSense + UnivMon |
The above command will read the default runtime configuration (which is in simulator/config/sketch_config.toml
). If you want to specify a new configuration, run with ./XXX -c [path_to_config_file]
instead.
You may encounter an error FATAL| Failed to open record file ../data/records.bin. @data.h:769
at the first run. This is because the default data stream is simulator/data/records.bin
, but the file nonexists. We provide a sample stream down-sampled and parsed from a truncated CAIDA trace (much smaller than the trace used in the paper). Its compressed version is provided as records.bin.zip
in simulator/data
. Please do not forget to decompress this file.
Each sketch framework needs a number of configuration parameters to run, e.g., height, width, and input data stream. The default config file (i.e., simulator/config/sketch_config.toml
) has already specified a sample configuration for each sketch.
Note that the configuration of a sketch named XXX
is listed in the table named [XXX] in the config file. Table [XXX] contains four subtables, each holding a number of parameters as follows.
Subtable | Description | Example | Meaning of the Example |
---|---|---|---|
[XXX.para] | Sketch-specific configuration | [CM.para] depth = 3 width = 10000 |
Configure a Count-Min sketch with 3 rows and 10,000 counters in each row |
[XXX.data] | Input-related configuration | [CM.data] data = "../data/records.bin" |
Read the input data stream from the file ../data/records.bin |
[XXX.test] | Output-related configuration | [CM.test] query = ["ARE", "AAE"] |
Output the values of two metrics, namely ARE and AAE , after querying per-flow size |
[XXX.bs] | BitSense-related configuration | [CM.bs] cnt_no_ratio = 0.1435 width_cnt = [8, 16] no_hash = [3] |
Set the BitSense compression ratio to 0.1435, the width of counters on Layer 0 (and 1) as 8 bits (and 16 bits), and the number of hash function as 3 |
BitSense uses a more succinct representation of streaming packets than PCAP files as input, which only stores the flowkey, captured timestamp, and packet length of each packet and strips off payloads and unrelated metadata.
This customized representation has an extension name as the .bin
file.
To obtain such a representation, we provide a parser that turns .pcap
files into .bin
files with user-defined rules in simulation/src/pcap_parser
.
After successfully building the simulator, it is compiled into an executable called parser
in the simulator/bin
directory.
In fact, the default data stream in simulator/data
is pre-parsed from a CAIDA trace using this parser
.
You are welcome to use parser
to generate new input files from your own PCAP files. An example usage of parser
is as follows, where the parser config should follow the -c
option, and -v
prints verbose message (in this case, file summary).
> ./parser -c ../config/parser.toml -v
INFO| Loading config from ../config/parser.toml... @utils.cpp:62
VERBOSE| Config loaded. @utils.cpp:76
File summary:
File name: ../data/data-1000K.pcap
File size: 1542089364 bytes
Link layer type: Raw IP (12)
Finished. Printed 25146019 packets (998958 flows)
Now we introduce how to specify user-defined rules to parser
. You may skip this part if you simply want to run the simulator up, since there is already a usable input stream in this repo.
All tunable settings of the parser are in the config file parser.toml
under simulator/config
, where you can find detailed captions of each setting. Here we only provide an overview in the following table.
Function | Associated Setting(s) | Valid Value Types |
---|---|---|
Stream truncation | flow_count & packet_count |
Integer |
Packet filtration | filter |
Classic BPF expression |
Input / Output | input & output |
File path |
Output mode | mode |
bin , pcap , and more (See below) |
Record Format | format |
See below |
Output mode controls the intended file type. bin
is for BitSense to read at runtime. pcap
produces another PCAP file (e.g., with some unwanted packets removed).
txt
is for human read only, and should NOT be specified on large PCAP files to avoid massive outputs. null
indicates no output.
Record format defines the byte-level layout of each record representing a captured packet. It specifies which fields are in a record, their lengths, and order. An example record format runs as follows.
format = [["flowkey", "padding", "timestamp", "length", "padding"],
[13, 3, 8, 2, 6]]
It says that each record in the output has
Field Name | Valid Length | Further Constraints |
---|---|---|
flowkey | 4, 8, 13 | Specify exactly once |
timestamp | 1, 2, 4, 8 | In microseconds. Specify at most once |
length | 1, 2, 4, 8 | Specify at most once |
padding | > 0 | None |
Make sure the record format you specified to parser
when generating the .bin
file is the same as the format
you specified to BitSense in the [XXX.data]
subtable of each sketch framework.
The provided config files of the parser
and all the sketches use the example record format above.
We show the output format of BitSense simulator programs in this part using CM
and BS_CM
.
All other sketch frameworks have a similar output format. To display more statistics, you can add them to the [XXX.test]
subtable in the config file.
To facilitate reproducing the results, we make some sketch configurations available (in simulator/config/PerFlowMeas
) and provide two handy shell scripts (PerFlowMeas.sh
& PerFlowMeasHelper.sh
in simulator/bin
) to run experiments. Below we will introduce the details.
These configurations and scripts run Exp #1, i.e., per-flow measurement, in the paper.
Recall that in Exp #1, we quantify how BitSense improves measurement accuracy using two metrics, namely flow ratio (FR) and average relative error (ARE) given different memory budgets.
We test six frameworks, namely ES, FR, NS, NZE, PR, and UM.
Each framework has a raw implementation (i.e., without BitSense) and an optimized version (i.e., with BitSense), so there are records.bin
that is down-sampled and parsed from a truncated CAIDA trace.
This input contains
To run any one of the simulation/bin
, run PerFlowMeas.sh
with the specified parameters as follows.
> ./PerFlowMeas.sh -h
OVERVIEW: Run per-flow measurement
USAGE: ./PerFlowMeas.sh -m mem-size -s sketch-name [-v|-h]
OPTIONS:
m Specify the memory budget (0.125m|0.25m|0.375m|0.5m|0.625m|0.75m|0.875m|1m)
s Specify the sketch framework (ES|FR|NS|NZE|PR|UM|BS_ES|BS_FR|BS_NS|BS_NZE|BS_PR|BS_UM)
v Verbose mode
h Print this help message
As the helper message suggests, you must specify one of the eight memory budgets after the -m
option and one of the twelve implementations after the -s
option.
The script will read the corresponding configuration from simulator/config/PerFlowMeas
and display measurement results. An example is as follows, where an ES instance allocated 256.114 kB of memory achieves an ARE of 0.92 and an FR of 8.85%.
> ./PerFlowMeas.sh -s ES -m 0.25m
Mem: 256.114kB, ARE: 0.923303, FR: 8.85333%
If you want a detailed look at the configuration and the original output of the simulator, use the -v
option. Below is an example.
> ./PerFlowMeas.sh -s ES -m 0.25m -v
Reading sketch config from "../config/PerFlowMeas/Raw/ES/0.25m.toml":
[ES] # Elastic Sketch
[ES.para]
num_buckets = 5
num_per_bucket = 1
l_depth = 2
l_width = 32748
[ES.data]
cnt_method = "InPacket"
data = "../data/records.bin"
format = [["flowkey", "padding", "timestamp", "length", "padding"], [13, 3, 8, 2, 6]]
[ES.test]
update = []
query = ["PODF", "ARE"]
query_podf = 0.001
======
... # Simulator output, omitted here
Furthermore, PerFlowMeasHelper.sh
helps you run one of the six framework under all possible memory budgets. Its usage is detailed as follows.
> ./PerFlowMeasHelper.sh -h
OVERVIEW: Helper of per-flow measurement
Run sketch on all memory settings
USAGE: ./PerFlowMeasHelper.sh -o output-file -s sketch-name [-p proc-number] [-h]
OPTIONS:
s Specify the sketch framework (ES|FR|NS|NZE|PR|UM|BS_ES|BS_FR|BS_NS|BS_NZE|BS_PR|BS_UM)
p Specify the number of processes to run in parallel (default = 1)
o Specify the output file
h Print this help message
Specifically, you must specify an output file to store results after -o
and a sketch framework after -s
. You can control the maximum number of parallel processes using the -p
option, whose default value is 1. An example usage is as follows.
> ./PerFlowMeasHelper.sh -s BS_ES -o result-BS_ES.txt -p 2
> cat result-BS_ES.txt
BS_ES
Mem: 126.189kB, ARE: 0.341465, FR: 35.7667%
Mem: 254.166kB, ARE: 0.101142, FR: 70.9833%
Mem: 384.181kB, ARE: 0.0311635, FR: 85.06%
Mem: 512.021kB, ARE: 0.0105264, FR: 93.7167%
Mem: 767.891kB, ARE: 0.0027149, FR: 98.12%
Mem: 636.357kB, ARE: 0.00497757, FR: 96.7233%
Mem: 1.00516MB, ARE: 0.00151194, FR: 99.4433%
Mem: 896.025kB, ARE: 0.00108287, FR: 99.1733%
Given the sheer number of different configurations, data streams, and setup methods across different experiments, it is a tedious and tough job to give every experiment such handy scripts. Exp #1 is one exemplar and important experiment which demonstrates BitSense's ability, and we have released its full data. If you would like further information on the setup and configuration of the rest experiments, please contact the author via email mentioned at the head of this file and describe the intended purpose faithfully. Thank you for your understanding :)
We provide a simplified version of BitSense data plane P4 programs, which includes the following directories.
countmin/
contains the prototype programs of BitSense with Count-Min sketches (CM).sketchlearn/
contains the prototype programs of BitSense with SketchLearn (SL).
Similar to the simulation code and experimental setup above, please contact the author for full implementation. Please do not forget to state the intended purpose in the email. Thank you for your understanding :)
Make sure your Barefoot SDE version is at least 9.9.0.
We provide a simplified version of data plane programs that can be compiled and loaded onto a single Tofino switch. Since BitSense is a switch-local optimization, it does not matter what topology is in use, and the simplest topology, i.e., two hosts connected by a switch, suffices as long as the hosts can emit and receive packets in multiple flows. The control plane program has the same decoding logic as the simulator, and we omit it.
For the two prototype sketches (CM and SL), the original sketch algorithms are in the file cm.p4
and sl.p4
, respectively, while the BitSense version of the sketches are in the file bs_cm.p4
and bs_sl.p4
, respectively.
The function of these programs mainly include parsing the TCP/IP header, measuring traffic statistics with sketches, and running the basic forwarding function.
You can compile the aforementioned P4 files into target-specific code using the SDE tools. When the program is deployed, you need to configure the port and download FIB (Forwarding Information Base) entries according to your local topology. You may follow a Barefoot SDE guidebook on the detailed commands.