Difference between freezing a process group manually with cgroups and using criu --freeze-cgroup? #2572

Open
yummypeng opened this issue Jan 24, 2025 · 3 comments

@yummypeng
Contributor

Hey friends, may I ask what the difference is between manually using the freezer cgroup to freeze a process group and using criu --freeze-cgroup? I tried the former and found that criu gets stuck when running.

@avagin
Member

avagin commented Jan 24, 2025

Where does CRIU get stuck? Could you attach the verbose CRIU logs?

@yummypeng
Contributor Author

yummypeng commented Jan 26, 2025

@avagin
I started an Nginx container using crictl, and its cgroup path is /sys/fs/cgroup/k8s.io/e3d4d8ca15eeae19e5e0bbf380bc0fb0a2d501967461dfcf7908bafc5814440f/ (I am using cgroup v2). If I first run echo 1 > /sys/fs/cgroup/k8s.io/e3d4d8ca15eeae19e5e0bbf380bc0fb0a2d501967461dfcf7908bafc5814440f/cgroup.freeze and then dump Nginx using criu, criu gets stuck. The complete log is attached:

dump.log

However, if I use criu dump --freeze-cgroup /sys/fs/cgroup/k8s.io/e3d4d8ca15eeae19e5e0bbf380bc0fb0a2d501967461dfcf7908bafc5814440f/, criu appears to proceed normally. From the code, the implementation of --freeze-cgroup looks similar to echo 1 > /sys/fs/cgroup/k8s.io/e3d4d8ca15eeae19e5e0bbf380bc0fb0a2d501967461dfcf7908bafc5814440f/cgroup.freeze, so I am curious why running this command manually before criu dump causes criu to hang.
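
For anyone trying to reproduce the manual case, here is a minimal sketch of the freeze step with a check that the kernel actually reports the cgroup as frozen before criu is started. It assumes a cgroup v2 directory that exposes cgroup.freeze and cgroup.events; the function names are hypothetical helpers, not part of CRIU, and this does not explain the hang by itself.

```shell
#!/bin/sh
# Hypothetical helpers for a cgroup v2 directory passed as the first argument.

# Ask the kernel to freeze every task in the cgroup.
freeze_cgroup() {
    echo 1 > "$1/cgroup.freeze"
}

# Print the value of the "frozen" key from cgroup.events (0 or 1).
frozen_state() {
    awk '$1 == "frozen" { print $2 }' "$1/cgroup.events"
}

# Undo the freeze so the tasks can run again.
thaw_cgroup() {
    echo 0 > "$1/cgroup.freeze"
}
```

Usage would be something like freeze_cgroup /sys/fs/cgroup/k8s.io/e3d4d8ca15eeae19e5e0bbf380bc0fb0a2d501967461dfcf7908bafc5814440f followed by frozen_state on the same path (as root), then running criu dump only once it prints 1. Recording that value alongside the log makes it clear which state the tasks were in when the dump started.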

@adrianreber
Member

Please provide the information from the template and describe what you are trying to do.

You didn't mention initially that you are using Kubernetes with containerd. Some container engines (not all of them) pause the container before taking a checkpoint. This is usually done through the cgroup.

You mention you are using cgroup v2, but CRIU says it detected a cgroup v1 system.
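
One quick way to cross-check which cgroup setup the host is running is to look at the filesystem type mounted at the cgroup root. The sketch below assumes GNU stat and the default mount point /sys/fs/cgroup; the function names are hypothetical, and the mapping only covers the common cases (cgroup2fs for a unified v2 hierarchy, tmpfs for a v1 or hybrid layout).

```shell
#!/bin/sh
# Hypothetical helpers; not part of CRIU.

# Print the filesystem type at the cgroup root (default: /sys/fs/cgroup).
cgroup_root_fstype() {
    stat -fc %T "${1:-/sys/fs/cgroup}"
}

# Map a filesystem type name to a cgroup layout.
cgroup_mode_for_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;            # unified hierarchy
        tmpfs)     echo "v1-or-hybrid" ;;  # per-controller mounts under a tmpfs
        *)         echo "unknown" ;;
    esac
}
```

Running cgroup_mode_for_fstype "$(cgroup_root_fstype)" on the node and including the result in the report would help reconcile the cgroup v2 statement with what CRIU detected.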

Why are you trying to freeze the cgroup before checkpointing the process? Just out of curiosity?
