Document how to simulate a power loss failure #667
Labels
component/raftstore
raftstore
kind/documentation
Improvements or additions to documentation
priority/medium
P3
In a power loss failure, all written data that has not been fsynced to persistent storage is lost. The chaos testing framework should be able to test whether MatrixCube behaves correctly in the presence of such failures.
As discussed offline, a power loss failure can be simulated as follows. First, cut network communication to isolate the local node from the outside world, so that it can no longer affect any other node. Next, set the ignore-fsync flag of the vfs by calling fs.SetIgnoreSyncs(true), so that subsequent fsync() calls no longer persist anything to the underlying storage device. A random wait (e.g. a sleep) can then be inserted to let some un-fsynced writes accumulate. After cube is stopped, call fs.ResetToSyncedState() to discard all written content that was never fsynced, then call fs.SetIgnoreSyncs(false) to reset the ignore-fsync flag.
Provide demo code to show how this works.
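As a starting point, the sequence described above could be sketched roughly as below. This is only an illustration under stated assumptions: toyFS is a hypothetical single-file stand-in for the real vfs (only the method names SetIgnoreSyncs and ResetToSyncedState come from this issue), and the isolate, wait and stop hooks are placeholders for cutting the network, accumulating un-fsynced writes, and stopping cube.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// toyFS is a hypothetical stand-in for the vfs, NOT the real implementation.
// It models one file: synced is what reached the disk, pending is what was
// written but never fsynced.
type toyFS struct {
	synced      []byte
	pending     []byte
	ignoreSyncs bool
}

// Write buffers data without persisting it.
func (f *toyFS) Write(p []byte) { f.pending = append(f.pending, p...) }

// Sync persists pending writes unless syncs are currently ignored.
func (f *toyFS) Sync() {
	if f.ignoreSyncs {
		return
	}
	f.synced = append(f.synced, f.pending...)
	f.pending = nil
}

// SetIgnoreSyncs turns Sync into a no-op while ignore is true.
func (f *toyFS) SetIgnoreSyncs(ignore bool) { f.ignoreSyncs = ignore }

// ResetToSyncedState discards everything that was never fsynced.
func (f *toyFS) ResetToSyncedState() { f.pending = nil }

// Contents returns what a reader of the file currently sees.
func (f *toyFS) Contents() string { return string(f.synced) + string(f.pending) }

// simulatePowerLoss runs the steps from the issue in order. All three hooks
// are assumptions of this sketch:
//   isolate: cut network communication for the local node
//   wait:    let un-fsynced writes accumulate (e.g. a random sleep)
//   stop:    stop cube on the local node
func simulatePowerLoss(fs *toyFS, isolate, wait, stop func()) {
	isolate()
	fs.SetIgnoreSyncs(true)
	wait()
	stop()
	fs.ResetToSyncedState()
	fs.SetIgnoreSyncs(false)
}

func main() {
	fs := &toyFS{}
	fs.Write([]byte("a"))
	fs.Sync() // "a" is durable

	simulatePowerLoss(fs,
		func() { fmt.Println("network cut") },
		func() {
			// Random short wait, then a write whose fsync is ignored.
			time.Sleep(time.Duration(rand.Intn(10)) * time.Millisecond)
			fs.Write([]byte("b"))
			fs.Sync()
			fmt.Println("before power loss:", fs.Contents())
		},
		func() { fmt.Println("cube stopped") },
	)
	fmt.Println("after power loss:", fs.Contents())
}
```

In this toy run the write of "b" is visible before the simulated power loss and gone afterwards, while the previously fsynced "a" survives. A real demo would replace toyFS with the vfs instance handed to cube and replace the hooks with the chaos framework's own network-partition and node-stop operations.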