Skip to content

Latest commit

 

History

History
49 lines (40 loc) · 7.49 KB

README.md

File metadata and controls

49 lines (40 loc) · 7.49 KB

FTSEmu

A multi-threaded distributed system emulator. It has been developed as a platform for testing and analyzing a novel Fault-Tolerant, Distributed Termination Detection algorithm based on the algorithm by Shmuel Safra.

An implementation of the fault-tolerant algorithm has been applied to a fault-tolerant version of the Chandy-Misra routing algorithm as well as a fault-tolerant version of the Afek-Kutten-Yung self-stabilizing spanning tree algorithm. The implementation can be found here.

More:
[1] Georgios Karlos, Wan Fokkink, Per Fuchs - Fault-Tolerant Termination Detection with Safra's Algorithm
[2] Georgios Karlos - Emulation of a Fault-Tolerant Variant of Safra's Termination Detection Algoritm
[3] Edsger W. Dijkstra - Shmuel Safra's Version of Termination Detection

Usage

  • Build
$ ant jar
  • Run
$ java -jar TDS-0.1.jar [OPTION]...  

Options

Option Description Possible Parameters Default
-h Print help - off
-v Verbose output - off
-ver Algorithm version 1 (OFSS), 2 (IFSS), 3 (FTS), 0 (All) 0
-n Number of nodes [2, 1024] 4
-c Number of crashing nodes -c : random
-c low high: crash nodes in 'low' to 'high' interval
0
-l Activity level 1, 2 1
-w Max wait time for a run (ms) >0 100000
-csv Write performance metrics to a csv file - off
-f Clean (flush) the csv (if one exists with a name corresponding to this run's params) - off
-dist Choose a probability distribution by which random activity is determined -(1) : Uniform
-(2) : Gaussian
-strategy Choose a strategy by which activity is happening (random under chose distribution), when a node becomes active -(1) : Compute-Send - Perform some computation and then send some messages (uniform 0-3, gaussian mean=1 sd=1)
-(2) : N-Activities - Choose an N under given distribution. For each of the n activities randomly choose if computation of message
1
-batype Basic Algorithm type -(1) : Centralized (Node 0 is initially active)
-(2) : Decentralized-Even (Nodes with even id's are initially active)
-(3) : Decentralized-Random (A random number of nodes, uniformly chosen, is initially active)
2
-ci Interval between 2 consecutive node crashes -(1) : Uniform in [200, 3000] ms
-(2) : Gaussian with mean 1000ms and sd 200ms
1
-anl Average network latency Value in [20, 100] (Message delays are random with Gaussian distribution with mean -avl and standard deviation of 10 ms 60

Verbose Output

This is a typical output when running with option -v and default -ver

simulator screenshot

Issues

[1] There is currently a bug introduced after the migration. When all versions run at the same time, occasionally, one will hang with no activity taking place for that version.