Remotery gets confused by fine-grain multi-threaded apps. #126
Comments
Have you tried increasing the size of the message queue? |
Actually, it's worth mentioning that Minitrace uses a mutex for pushing events which in cases of high contention won't give you accurate traces. There's every chance that threads queued for access to the mutex will give up their time-slice to the OS. |
Thanks for the suggestion, I tried doubling Then I tried doubling Anything else I should try? |
After quadrupling the queue size, quadrupling messages-per-update and halving sleep-between-updates, the situation improves again, but data is still missing. I can see this because I know there should be 100 render_scanlines calls for each render_image, and the profile shows fewer than that, with gaps. How does Remotery handle the case of multiple threads trying to record samples at the same time? |
It should handle contention fine. I think what's happening is a combination of two things because you're sending data so fast:
Looking at the times, you have a good stress test for Remotery. I only wish I could get access to your code so that I can reproduce the scenario myself. |
Bear in mind the only way to not lose data unconditionally is to block the thread issuing the sample and I don't want Remotery to do that, ever. So it'll be a case of finding the weak point and adding more memory or processor time to get those events out fast. |
I think it's possible to avoid losing data without blocking threads. If every thread writes to its own queue, instead of a shared queue, there would be no race conditions, and you would not need a mutex either. A data aggregator could then be the only reader of the queues. I think I'll attempt a proof of concept implementation of this. |
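To make the per-thread queue idea concrete, here is a minimal sketch of a single-producer, single-consumer ring buffer using C11 atomics. It is not Remotery's implementation: `Sample`, `ThreadQueue`, `QUEUE_CAPACITY` and the `queue_*` functions are all hypothetical names chosen for illustration. Each worker thread would own one queue and the aggregator would poll all of them. Note that a full queue still forces a choice between blocking and dropping; this sketch drops the sample and counts it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sample record; not Remotery's actual message layout. */
typedef struct {
    uint64_t start_us;
    uint64_t length_us;
    uint32_t name_hash;
} Sample;

#define QUEUE_CAPACITY 1024u
static_assert((QUEUE_CAPACITY & (QUEUE_CAPACITY - 1u)) == 0u,
              "QUEUE_CAPACITY must be a power of two");

/* One queue per worker thread. Only the owning thread advances `head`;
   only the aggregator thread advances `tail`. */
typedef struct {
    Sample           slots[QUEUE_CAPACITY];
    _Atomic uint32_t head;    /* next slot to write (producer side) */
    _Atomic uint32_t tail;    /* next slot to read (consumer side)  */
    _Atomic uint32_t dropped; /* samples rejected because the queue was full */
} ThreadQueue;

void queue_init(ThreadQueue* q)
{
    atomic_init(&q->head, 0u);
    atomic_init(&q->tail, 0u);
    atomic_init(&q->dropped, 0u);
}

/* Producer side. Never blocks: when the queue is full the sample is
   rejected and the drop counter is incremented. */
bool queue_push(ThreadQueue* q, const Sample* s)
{
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == QUEUE_CAPACITY) {
        atomic_fetch_add_explicit(&q->dropped, 1u, memory_order_relaxed);
        return false;
    }
    q->slots[head & (QUEUE_CAPACITY - 1u)] = *s;
    /* Release makes the slot contents visible before the new head. */
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side (the aggregator). Returns false when the queue is empty. */
bool queue_pop(ThreadQueue* q, Sample* out)
{
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail == head) {
        return false;
    }
    *out = q->slots[tail & (QUEUE_CAPACITY - 1u)];
    /* Release hands the slot back to the producer after the copy. */
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;
}
```

With this layout there is no mutex and no contention between workers, but the `dropped` counter will still climb whenever a worker outruns the aggregator.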
It still reduces to the same problem: if you are writing to the queue faster than data is being pulled from it, you will have to block. The only solution being to allocate more memory for the queue to decrease the chance of that happening. We still don't know the source of the problem. It could be that there's a failure point in sending the data across the network that isn't being reported. |
So what I'm saying is that to be sure, a breakpoint here would show stuff being discarded https://github.com/Celtoys/Remotery/blob/master/lib/Remotery.c#L4039 |
I think I might add a global Error Object that increments atomics based on how many times an error occurred. Using an error queue to report errors in an error queue doesn't sound dependable :) |
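A global error object of that kind could look roughly like the sketch below. The error kinds and function names are hypothetical and not taken from Remotery. Each failure site bumps a counter with a relaxed atomic add, and a reporting thread periodically reads and resets the counters.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical error kinds, for illustration only. */
typedef enum {
    ERR_QUEUE_FULL,
    ERR_SEND_FAILED,
    ERR_COUNT
} ErrorKind;

typedef struct {
    _Atomic uint32_t counts[ERR_COUNT];
} ErrorCounters;

/* Static storage is zero-initialised, which is a valid state for atomics. */
ErrorCounters g_errors;

/* Safe to call from any thread; never blocks or allocates. */
void error_record(ErrorKind kind)
{
    atomic_fetch_add_explicit(&g_errors.counts[kind], 1u, memory_order_relaxed);
}

/* Returns the number of errors seen since the last call and resets it. */
uint32_t error_take(ErrorKind kind)
{
    return atomic_exchange_explicit(&g_errors.counts[kind], 0u, memory_order_relaxed);
}
```

Because recording an error is a single atomic increment, it cannot itself fail in the way a full error queue could.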
Thinking about it more, this is a real problem for reporting events at a frequency greater than the server code is capable of sending them over the network. It doesn't matter how much memory you allocate, eventually it will catch up to the reader. |
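As a rough way to put numbers on this, assume a constant event rate $r_w$ written by the application, a constant rate $r_s$ the server can send, and a queue holding $C$ events (all three symbols are introduced here for illustration). If $r_w > r_s$, the queue fills after

$$
T = \frac{C}{r_w - r_s}
$$

so doubling $C$ only doubles the time before the first drop. For example, with hypothetical figures of $r_w = 2{,}000{,}000$ events/s, $r_s = 1{,}500{,}000$ events/s and $C = 1{,}000{,}000$ events, the queue overflows after $T = 2$ seconds.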
Thanks. Sure, you can always overflow the queue. I would rather have large gaps in the trace where the server choked on traffic, than have events randomly disappear without warning. For instance in game-dev, you simply scroll to that part where there is a full frame's worth of data. So maybe clear the queue, and then write a single event that signals the overflow of the queue, so you are alerted to it in the visualizer? |
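The overflow-marker suggestion can be sketched as follows. This is an illustration under assumed names (`Event`, `TraceBuffer`, `trace_push`), not code from Remotery, and it is single-threaded for clarity; a real version would need the same synchronisation as the sample queue. When the buffer is full it discards what is buffered and leaves one `EVENT_OVERFLOW` marker that records how many events were lost.

```c
#include <stdint.h>

#define TRACE_CAPACITY 8u /* tiny on purpose; must be at least 2 */

typedef enum { EVENT_SAMPLE, EVENT_OVERFLOW } EventType;

typedef struct {
    EventType type;
    uint64_t  time_us;
    uint32_t  payload; /* sample id, or lost-event count for EVENT_OVERFLOW */
} Event;

typedef struct {
    Event    events[TRACE_CAPACITY];
    uint32_t count;
} TraceBuffer;

/* Appends an event. On overflow, everything buffered is replaced by a
   single marker carrying the total number of events lost so far. */
void trace_push(TraceBuffer* b, Event e)
{
    if (b->count == TRACE_CAPACITY) {
        uint32_t lost = 0u;
        for (uint32_t i = 0u; i < b->count; ++i) {
            /* An earlier marker already stands for several lost events. */
            lost += (b->events[i].type == EVENT_OVERFLOW)
                        ? b->events[i].payload
                        : 1u;
        }
        b->events[0] = (Event){ EVENT_OVERFLOW, e.time_us, lost };
        b->count = 1u;
    }
    b->events[b->count++] = e;
}
```

A viewer receiving the marker could draw a visible gap annotated with the lost-event count, rather than showing a trace that silently lacks samples.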
The viewer definitely needs a visualisation that tells you when data has been dropped but I'm not happy with dropping entire frames because of what in most cases is an odd occurrence. I think what I'll do is a collection of things:
|
Sounds good! I did end up writing my own profiler, btw. |
I have pitted Remotery and Minitrace against each other.
On single-threaded apps, the results are the same.
For a multi-threaded run, however, I see that Remotery's measurements deviated from Minitrace's.
It looks like Remotery can miss events.
Race condition between worker threads trying to record a Remotery event?
Screenshots of the Remotery/Minitrace runs:
http://thelittleengineerthatcould.blogspot.ca/2017/10/pitting-profilers-against-each-other.html