diff --git a/README.md b/README.md
index 7724fce..40f20e2 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@ This repo contains our simulator and prototype code of BitSense. Specifically,
 - `simulator/` contains a trace-driven simulator we built to integrate various sketching algorithms with BitSense.
-- `testbed/` contains BitSense data plane prototype written in [P4$_{16}$](https://p4.org/p4-spec/docs/P4-16-v1.0.0-spec.html) that can be compiled and loaded onto Tofino switches.
+- `testbed/` contains BitSense data plane prototype written in $\mathrm{P4}_{16}$ (see [here](https://p4.org/p4-spec/docs/P4-16-v1.0.0-spec.html)) that can be compiled and loaded onto Tofino switches.
 We are willing to hear your feedback. Please let us know if you'd like to contribute to BitSense applications. If you have further questions and suggestions, don't hesitate to contact us at dr_abc@pku.edu.cn or huangqun@pku.edu.cn.
@@ -12,7 +12,7 @@ We are willing to hear your feedback. Please let us know if you'd like to contri
 ### Files
 Simulator programs include the following directories.
 - `bin/` contains executable files after compilation
-- `build/` contains intermediate objects generated by CMake and make
+- `build/` contains intermediate objects generated by CMake and Make
 - `config/` contains configuration files for BitSense and the integrated sketch algorithms
 - `data/` contains a truncated trace parsed from CAIDA
 - `src/` contains BitSense source code
@@ -21,7 +21,7 @@ Simulator programs include the following directories.
 ### Install Dependencies
-We require the following dependencies to run BitSense simulator programs on Linux and Mac.
+We require the following dependencies to run BitSense simulator programs on Linux or Mac.
 | Dependency | Installation (on Linux) | Installation (on Mac) |
 |---|---|---|
@@ -60,7 +60,7 @@ Each of them implements a sketch framework indicated by its name. Note that a fi
 The above command will read the default runtime configuration (which is in `simulator/config/sketch_config.toml`). If you want to specify a new configuration, run with `./XXX -c [path_to_config_file]` instead.
-You may encounter an error `FATAL| Failed to open record file ../data/records.bin. @data.h:769` at the first run. This is because the default data stream is `../data/records.bin`, but the file nonexists. We provide a sample stream down-sampled and parsed from a truncated CAIDA trace (much smaller than the trace used in the paper). You can download this file from the [Google drive](https://drive.google.com/file/d/1o7YQdNVhQyAAe_naWXBGB7mOYQVmNF5T/view?usp=sharing) (or the [PKU disk](https://disk.pku.edu.cn:443/link/4BF2174500E4481C298BB1E9793CE85F) as a backup) and unzip it to `simluation/data`. Now any sketch framework should run smoothly.
+You may encounter an error `FATAL| Failed to open record file ../data/records.bin. @data.h:769` at the first run. This is because the default data stream is `simulator/data/records.bin`, but the file nonexists. We provide a sample stream down-sampled and parsed from a truncated CAIDA trace (much smaller than the trace used in the paper). Its compressed version is provided as `records.bin.zip` in `simulator/data`. Please do not forget to decompress this file.
 ### Sketch Configuration
 Each sketch framework needs a number of configuration parameters to run, e.g., height, width, and input data stream. The default config file (i.e., `simulator/config/sketch_config.toml`) has already specified a sample configuration for each sketch.
@@ -80,8 +80,20 @@ This customized representation has an extension name as the `.bin` file.
 To obtain such a representation, we provide a parser that turns `.pcap` files into `.bin` files with user-defined rules in `simulation/src/pcap_parser`. After successfully building the simulator, it is compiled into an executable called `parser` in the `simulator/bin` directory. In fact, the default data stream in `simulator/data` is pre-parsed from a CAIDA trace using this `parser`.
+You are welcome to use `parser` to generate new input files from your own PCAP files. An example usage of `parser` is as follows, where the parser config should follow the `-c` option, and `-v` prints verbose message (in this case, file summary).
+```console
+> ./parser -c ../config/parser.toml -v
+ INFO| Loading config from ../config/parser.toml... @utils.cpp:62
+VERBOSE| Config loaded. @utils.cpp:76
+File summary:
+ File name: ../data/data-1000K.pcap
+ File size: 1542089364 bytes
+ Link layer type: Raw IP (12)
+
+Finished. Printed 25146019 packets (998958 flows)
+```
-Now we introduce how to specify user-defined rules in `parser`. You may skip this part if you simply want to run the simulator up.
+Now we introduce how to specify user-defined rules to `parser`. You may skip this part if you simply want to run the simulator up, since there is already a usable input stream in this repo.
 All tunable settings of the parser are in the config file `parser.toml` under `simulator/config`, where you can find detailed captions of each setting. Here we only provide an overview in the following table.
 | Function | Associated Setting(s) | Valid Value Types |
@@ -117,7 +129,7 @@ The provided config files of the `parser` and all the sketches use the example r
 ### Result Screenshots
 We show the output format of BitSense simulator programs in this part using `CM` and `BS_CM`.
-All other sketch frameworks have a similar output format. To display more statistics, you can add them to the config file.
+All other sketch frameworks have a similar output format. To display more statistics, you can add them to the `[XXX.test]` subtable in the config file.
 ![CM output](img/CM_output.png)
@@ -133,7 +145,7 @@ We test six frameworks, namely ES, FR, NS, NZE, PR, and UM.
 Each framework has a raw implementation (i.e., without BitSense) and an optimized version (i.e., with BitSense), so there are $6\times2=12$ implementations in total. For each implementation, we evaluate these metrics on eight memory budgets ranging from 0.125 MB to 1 MB, so there are $12\times8=96$ sketch instances to run. The only difference from the Exp \#1 in the paper lies in the input stream, where here we use the default `records.bin` that is down-sampled and parsed from a truncated CAIDA trace.
-This input contains $1\times10^5$ distinct flows and $1.09\times10^6$ packets, pretty close to the number of flows and packets in *one epoch* of the original CAIDA trace. Hence, the reproduced results should be close to those in the paper, although there can be some nuance.
+This input contains $1\times10^5$ distinct flows and $1.09\times10^6$ packets, pretty close to the number of flows and packets in *one epoch* of the original multi-epoch CAIDA trace. Hence, the reproduced results should be close to those in the paper, although there can be some nuance.
 To run any one of the $96$ sketch instances, goto `simulation/bin`, run `PerFlowMeas.sh` with the specified parameters as follows.