Files written with zero bytes #172
Comments
What is the setup? For the empty files, do they have content when read directly from the filer? |
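If useful, one way to answer the second question is to read a suspect file straight over the filer's HTTP interface, bypassing the mount. This is only a sketch: the namespace, service name and file path below are assumptions for a default helm install.

```sh
# Forward the filer's HTTP port (8888 by default) to localhost.
# Namespace and service name are assumptions; adjust to the actual release.
kubectl -n seaweedfs port-forward svc/seaweedfs-filer 8888:8888 &

# List the directory backing the volume as JSON (path is a placeholder; the
# actual location depends on how the csi driver lays out volumes on the filer).
curl -H "Accept: application/json" "http://localhost:8888/path/to/volume/?pretty=y"

# Fetch one of the zero-byte files and check its size as seen by the filer.
curl -sS -o /tmp/mono.m3u8 "http://localhost:8888/path/to/volume/mono.m3u8"
wc -c /tmp/mono.m3u8
```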
Using the default deploy/helm/seaweedfs-csi-driver/values.yaml with the only change:
The seaweedfs helm values are fairly stock outside of the following changes (literal git diffs):
It had been working for a solid week, though we did have one complaint that may have been related. Today at peak operation, with lots of reads and writes, we saw a catastrophic failure that progressively worsened over the space of an hour, from 1/10 to 9/10 writes ending up as 0-byte files, before we switched back to our previous nfs solution. The previous nfs is just one pod in the same node group operating as an nfs server, and it seems to be handling the load and pressure fine. My gut feeling is that replication couldn't keep up with the changes, or that writes were being rejected for some reason? But I'm pretty new to seaweedfs. I believe ffmpeg is constantly writing to a mono.m3u8.tmp file and then renaming it to mono.m3u8, replacing the file - snippet of the filer log here:
As for the .ts files, they are new files created with sequential numbering: 001.ts, 002.ts, etc.
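For reference, the write pattern described above (a playlist written to mono.m3u8.tmp and renamed over mono.m3u8, plus sequentially numbered .ts segments with old segments deleted) is what ffmpeg's HLS muxer produces. The actual command isn't shown in this thread; a rough sketch of an invocation that yields this pattern, with placeholder input and paths, would be:

```sh
#!/bin/sh
# Hypothetical sketch, not the reporter's actual start.sh.
# delete_segments removes old .ts segments as new ones are written;
# temp_file writes data to a .tmp file first and renames it into place when complete.
ffmpeg -re -i "$INPUT_URL" \
  -c copy \
  -f hls \
  -hls_time 2 \
  -hls_list_size 5 \
  -hls_flags delete_segments+temp_file \
  -hls_segment_filename /data/%03d.ts \
  /data/mono.m3u8
```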
Sorry, I'm not quite sure what you mean. The infrastructure is all up and running, but production is switched back to the previous nfs, so I can check any logs etc.; I just may need a bit of guidance if possible :) My next step was going to be looking at replacing the filer index with a distributed filer. |
This could be a metadata synchronization problem. Try to use one csi driver and one filer to see whether this can be reproduced.
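If it helps to act on this, a rough sketch of narrowing the setup down follows; the resource names and namespace are assumptions that depend on the helm release, and since the csi node plugin is a DaemonSet, the simplest way to exercise a single driver instance is to pin the test pods to one node:

```sh
# Names and namespace below are assumptions; adjust to the actual release.
# 1. Run a single filer.
kubectl -n seaweedfs scale statefulset seaweedfs-filer --replicas=1

# 2. Pin the test workload to one node so only that node's csi driver mount is used.
#    (Add a nodeSelector such as kubernetes.io/hostname: <node-name> to the test Deployment.)
kubectl get nodes -o wide
```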
|
I've been able to replicate this consistently with 1 csi-driver and 1 filer, as well as with multiple csi-drivers and filers. There are 4 volume replicas and 3 master replicas. Creating a deployment with ~80 pods running ffmpeg with hls output to /data is enough. I even saw the problem with 40 pods, just not as frequently. Dockerfile:
start.sh
After a couple of minutes running:
Running this same scenario on https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner, the only time I've ever seen a zero-byte file was a "mono.m3u8.tmp" file, which I must have caught at the exact second the file was created. |
How many |
Sorry for any confusion; we're using the default of 1 controller replica in the seaweedfs-csi-driver helm values file. We're running 40+ test pods that write files via ffmpeg, using the Dockerfile supplied above, in a simple deployment across ~8 nodes. pvc.yaml:
Deployment.yaml:
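The original pvc.yaml and Deployment.yaml didn't survive into this thread; a minimal hypothetical equivalent is sketched below. The storage class name, sizes, replica count and image are assumptions, not the reporter's actual manifests.

```sh
# Hypothetical reproduction manifests; all names, sizes and the storage class are assumptions.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ffmpeg-data
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: seaweedfs-storage   # assumed default class from the csi-driver chart
  resources:
    requests:
      storage: 100Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ffmpeg-hls-test
spec:
  replicas: 40
  selector:
    matchLabels:
      app: ffmpeg-hls-test
  template:
    metadata:
      labels:
        app: ffmpeg-hls-test
    spec:
      containers:
        - name: ffmpeg
          image: ffmpeg-hls-test:latest   # assumed image built from the Dockerfile above
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: ffmpeg-data
EOF
```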
|
Still the same question. |
Sorry, I'm not sure how to check this? |
Basically, how many csi driver programs are running?
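For anyone following along, a rough way to answer that is to list the driver pods; the label selector and namespace used below are assumptions that depend on how the chart was installed:

```sh
# Quick and label-agnostic: list anything csi-related across namespaces.
kubectl get pods -A -o wide | grep -i csi

# Or, if the chart's labels match (this selector is an assumption):
kubectl get pods -A -l app.kubernetes.io/name=seaweedfs-csi-driver -o wide
```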
Hi,
I have seaweedfs-csi-driver configured with seaweedfs on kubernetes, with default seaweedfs values apart from replicas and volumes: 3 masters with 001 replication configured, 3 filers, and 4 volume replicas. We've also increased the CSI driver controller to 3 replicas to avoid a SPOF. We have 8 application pods running a total of ~100 ffmpeg processes streaming live hls content. For each process, a new .ts file is written with the stream data every 2 seconds, and a master.m3u8 file is updated every 2 seconds. For every new .ts data file that is written, one is deleted (a constant stream of changing data).
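As background on the 001 setting: SeaweedFS replication strings are three digits counting extra copies on other data centers, other racks, and other servers in the same rack, so 001 keeps one extra copy on a different server within the same rack. A sketch of where this is configured (the helm key name is an assumption; the underlying flag is the master's -defaultReplication):

```sh
# The replication policy maps to the master's -defaultReplication flag, e.g.:
weed master -defaultReplication=001
# In the seaweedfs helm chart this is typically exposed through a values key
# such as master.defaultReplication (exact key name is an assumption).
```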
When accessing the master.m3u8 file we're finding that it is randomly empty at 0 bytes, and some of the .ts files are also 0 bytes while others are fine. For example:
Sometimes every file is zero bytes, sometimes none are, and sometimes only some files are. It's not consistent. I'm also finding that some larger one-off writes of entire mp4 files at ~1-5GB end up empty.
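A quick way to quantify the symptom from inside any pod that has the volume mounted is to count zero-byte files directly; this sketch assumes the mount path is /data:

```sh
# Count zero-byte files vs. all files under the mount (path is an assumption).
find /data -type f -size 0 | wc -l
find /data -type f | wc -l

# List the zero-byte files with timestamps, to compare against filer logs.
find /data -type f -size 0 -exec ls -l {} +
```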
Log files on all pods look normal, with no visible errors that I could see.
We're migrating over from nfs-ganesha-server-and-external-provisioner due to it being a SPOF; the previous solution worked without issue. The only change is using seaweedfs instead.
We tried doubling the filer replicas, and even decreasing down to 1, to no avail.
I'm wondering if it could have something to do with concurrentWriters default of 32?
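For what it's worth, concurrentWriters is an option of weed mount (the csi driver mounts volumes through SeaweedFS's FUSE mount) with a default of 32. How, or whether, the csi driver exposes it I haven't checked, but it can be tested against a manual mount; the filer address and directories below are placeholders:

```sh
# Manual mount outside the csi driver, to see whether concurrentWriters matters.
# Filer address and directories are placeholders.
weed mount -filer=seaweedfs-filer:8888 -filer.path=/ -dir=/mnt/swfs -concurrentWriters=128
```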
Any thoughts as to where to look to solve this?