Document how to simulate a power loss failure #667

Open
lni opened this issue Jan 14, 2022 · 1 comment
Assignees
Labels
component/raftstore raftstore kind/documentation Improvements or additions to documentation priority/medium P3
Milestone

Comments

@lni
Contributor

lni commented Jan 14, 2022

In a power loss failure, all written data that has not been fsynced to persistent storage is lost. The chaos testing framework should be able to test whether MatrixCube behaves correctly in the presence of such power loss failures.

As discussed offline, such a power loss failure can be simulated as follows. First, cut the network communication. This isolates the local node from the outside world, so it can no longer affect any other node. Next, set the ignore fsync flag of the vfs by calling fs.SetIgnoreSyncs(true), so that subsequent fsync() operations no longer persist anything to the underlying storage device. Then wait (i.e. sleep) for a random amount of time to let some un-fsynced writes accumulate. After cube is stopped, call fs.ResetToSyncedState() to discard all written content that was not fsynced, and then call fs.SetIgnoreSyncs(false) to reset the ignore fsync flag.

Provide demo code to show how this works.
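
As a starting point, the sequence described above could be sketched roughly as below. This is only a toy model, not MatrixCube or vfs code: strictMemFS, memFile, Write, Sync, Read and simulatePowerLoss are hypothetical names made up for illustration, and the isolate, writeMore and stop callbacks stand in for cutting the network, the random wait and stopping cube. Only the method names SetIgnoreSyncs and ResetToSyncedState are taken from this issue; their behaviour here is an assumption based on the description above.

```go
package main

import "fmt"

// memFile is a toy file that tracks the latest written bytes and the bytes
// that were durable as of the last honored sync.
type memFile struct {
	data   []byte // everything written so far
	synced []byte // contents as of the last honored Sync
}

// strictMemFS is a hypothetical stand-in for the vfs mentioned in the issue.
// It only models the two calls discussed there.
type strictMemFS struct {
	files       map[string]*memFile
	ignoreSyncs bool
}

func newStrictMemFS() *strictMemFS {
	return &strictMemFS{files: make(map[string]*memFile)}
}

// Write appends p to the named file, creating it if needed.
func (fs *strictMemFS) Write(name string, p []byte) {
	f, ok := fs.files[name]
	if !ok {
		f = &memFile{}
		fs.files[name] = f
	}
	f.data = append(f.data, p...)
}

// Sync marks the current contents as durable, unless syncs are being ignored.
func (fs *strictMemFS) Sync(name string) {
	if fs.ignoreSyncs {
		return
	}
	if f, ok := fs.files[name]; ok {
		f.synced = append([]byte(nil), f.data...)
	}
}

// SetIgnoreSyncs turns all later Sync calls into no-ops while ignore is true.
func (fs *strictMemFS) SetIgnoreSyncs(ignore bool) { fs.ignoreSyncs = ignore }

// ResetToSyncedState drops every write that was not covered by an honored Sync.
func (fs *strictMemFS) ResetToSyncedState() {
	for _, f := range fs.files {
		f.data = append([]byte(nil), f.synced...)
	}
}

// Read returns the current contents of the named file.
func (fs *strictMemFS) Read(name string) string {
	if f, ok := fs.files[name]; ok {
		return string(f.data)
	}
	return ""
}

// simulatePowerLoss follows the order of steps given in the issue.
func simulatePowerLoss(fs *strictMemFS, isolate, writeMore, stop func()) {
	isolate()                // 1. cut network communication
	fs.SetIgnoreSyncs(true)  // 2. fsync no longer persists anything
	writeMore()              // 3. let un-fsynced writes accumulate
	stop()                   // 4. stop cube
	fs.ResetToSyncedState()  // 5. discard everything not fsynced
	fs.SetIgnoreSyncs(false) // 6. restore normal fsync behaviour
}

func main() {
	fs := newStrictMemFS()
	fs.Write("wal", []byte("a"))
	fs.Sync("wal") // "a" is durable

	simulatePowerLoss(fs,
		func() {},
		func() {
			fs.Write("wal", []byte("b"))
			fs.Sync("wal") // ignored, so "b" is never durable
		},
		func() {},
	)
	fmt.Println(fs.Read("wal")) // prints "a"
}
```

In a real test the callbacks would be replaced by the chaos framework's own network partition and shutdown hooks, and the toy filesystem by the vfs instance that cube was started with.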

@lni lni added component/raftstore raftstore kind/documentation Improvements or additions to documentation priority/medium P3 labels Jan 14, 2022
@lni lni added this to the v0.3.0 milestone Jan 14, 2022
@lni lni assigned reusee and lni Jan 14, 2022
@lni
Contributor Author

lni commented Jan 14, 2022

TestKVDataStorageRestartWithNotSyncedDataLost in storage/kv/kv_data_storage_test.go can be used as an example.

@zhangxu19830126 zhangxu19830126 modified the milestones: v0.3.0, backlog Feb 25, 2022
3 participants