fix(Player/SpellQueue): bandaid crashfix #21103
cc @walkline
From this crashlog, we can see that the spell cast request has invalid values. The more interesting question is how the values became invalid. I saw some other recent crashlogs (one and two) which make me think that something bad is happening, such as use of a dangling pointer (using a deleted player) or a race condition (e.g., two parallel updates of the same map or instance).
Agree with walkline. It makes no real sense to have an invalid spell here. Probably a deeper issue. @sogladev also, why clear the queue instead of ignoring the invalid entry? (Simple question, I don't remember all the details of your system.)
If the front of the std::deque contains invalid data, then the queue's behavior is undefined, so it's better to clear the entire thing than to assume the next element is valid. I also suspect that when the std::deque contains invalid data, the pointer to the next element is invalid too, pointing to undefined memory. Now, I'm not sure what happens if .clear() is called 😆
Suddenly this crash became quite frequent. Do we think this will help ease up the crashes?
It can't make it crash more often...
But it wouldn’t solve the issue either, it would crash on .clear().
From the vague research I did earlier in the week, it shouldn't even be possible for spellInfo to be null in the container to begin with (under normal circumstances). Once I'm done with my work week I'll try to spend a little more time on it if possible.
Yes, it didn't help: https://gist.github.com/Nyeriah/bbcd980360e462731503bb23f1c79c4c
I spent a little time on it tonight, and while there are a couple of issues, I saw nothing that would lead to a crash. I'd be curious to try disabling the spell queue (it's a config option) and see whether similar crashes still occur. That would narrow it down to either the spell queue specifically, or the spell queue container just being "unlucky" and getting corrupted by something else.
So for the latest crashlog Nyeriah provided, this bandaid PR actually caught the error and then crashed on the clear. We could probably use another "safer" way to clear.
Not possible. If an element in the container, or even the player object itself, points to invalid memory, all paths will inevitably lead to a segfault once that memory is accessed during deallocation or use. The only solution is to fix the underlying issue; it's just a matter of narrowing down where it is or what causes it. That is significantly more difficult to do from a crashlog alone, without being able to inspect a memory dump in gdb.
Changes Proposed:
This PR proposes changes to:
if there's an invalid spell in the queue, clear the queue and return
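As a rough illustration of the guard described above, here is a minimal, self-contained sketch. It is not the actual AzerothCore code: `SpellInfo`, `QueuedSpell`, `LookupSpell`, and `ProcessFront` are hypothetical stand-ins, and the spell IDs are arbitrary sample values.

```cpp
#include <cstdint>
#include <deque>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; not the AzerothCore types.
struct SpellInfo { std::uint32_t id; };
struct QueuedSpell { std::uint32_t spellId; };

// Hypothetical spell store lookup: returns nullptr for unknown spell IDs.
const SpellInfo* LookupSpell(std::uint32_t spellId)
{
    static const std::unordered_map<std::uint32_t, SpellInfo> store = {
        {133, {133}},
        {116, {116}},
    };
    auto it = store.find(spellId);
    return it != store.end() ? &it->second : nullptr;
}

// Handles the request at the front of the queue.
// Returns true if the front request was valid and consumed.
// If the front request is invalid, the whole queue is cleared and
// false is returned, mirroring "clear the queue and return".
bool ProcessFront(std::deque<QueuedSpell>& queue)
{
    if (queue.empty())
        return false;

    const SpellInfo* info = LookupSpell(queue.front().spellId);
    if (!info)
    {
        // Invalid spell at the front: drop everything that is queued.
        queue.clear();
        return false;
    }

    // A real implementation would attempt the cast here.
    queue.pop_front();
    return true;
}
```

Note that a guard like this only helps while the queue object itself is intact. If the container or its owner has already been freed or corrupted, as the discussion suspects, the call to `clear()` is itself undefined behavior and may crash.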
Issues Addressed:
SOURCE:
The changes have been validated through:
Tests Performed:
This PR has been:
How to Test the Changes:
Known Issues and TODO List:
How to Test AzerothCore PRs
When a PR is ready to be tested, it will be marked as [WAITING TO BE TESTED].
You can help by testing PRs and writing your feedback here on the PR's page on GitHub. Follow the instructions here:
http://www.azerothcore.org/wiki/How-to-test-a-PR
REMEMBER: when testing a PR that changes something generic (i.e. a part of code that handles more than one specific thing), the tester should not only check that the PR does its job (e.g. fixing spell XXX) but especially check that the PR does not cause any regression (i.e. introducing new bugs).
For example: if a PR fixes spell X by changing a part of code that handles spells X, Y, and Z, we should not only test X, but we should test Y and Z as well.