-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed the native crash happening on Android. #116
base: main
Are you sure you want to change the base?
Conversation
* The issue on the Android side was a `null pointer dereference`, meaning that when accessing the getEntryByPath method, the archive object had already been disposed of, which caused the crash. * We are correctly handling and disposing of the previous archive and web views for the previously set archive. However, web views load content on their own thread. When we change the ZIM file, we dispose of the previous archive and stop the ongoing web view processes. In certain cases, the web view internally tries to get content via the getEntryByPath method because it takes some time to cancel all the previous calls on the background thread. As a result, the web view attempts to call the method on a disposed archive, which causes the native crash. * To fix this, I refactored the getPtr method to return nullptr if the archive has already been disposed of. I also handled this in the utils.h class by checking if the object is null (meaning it has already been disposed). If so, an error is thrown (which should be handled by the caller, in this case, the Kiwix app), instead of crashing the program. * Handling this scenario on the Android side is not feasible since the WebView operates on a separate thread over which we have no control. However, addressing this issue on the java-libkiwix side is a better approach. If an object is already disposed, it should throw an error that can be caught by the caller, instead of crashing the entire program. Handling such errors is the responsibility of the user of `java-libkiwix` (e.g., Kiwix Android). We are already handling all errors thrown by `java-libkiwix` in the Kiwix Android app. * Currently, I have created the `SAFE_THIS` macro to gather feedback on this approach. Changing the existing `THIS` macro to incorporate the new behavior would require updating all the places where it is being used. To validate this fix before making those changes, I implemented the new function separately. I explicitly disposed of the archive and attempted to access its content via the getEntryByPath method. Now, instead of crashing the entire program, it correctly throws an error, which is handled by Kiwix.
@kelson42, @mgautierfr Can you please review the PR and validate the idea? then I will refactore the remaining code. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #116 +/- ##
=========================================
Coverage 93.23% 93.23%
Complexity 237 237
=========================================
Files 47 47
Lines 325 325
Branches 3 3
=========================================
Hits 303 303
Misses 19 19
Partials 3 3 ☔ View full report in Codecov by Sentry. |
({ \ | ||
auto ptr = getPtr<NATIVE_TYPE>(env, thisObj); \ | ||
if (!ptr) { \ | ||
throwException(env, "java/lang/IllegalStateException", "The native object has already been disposed."); \ | ||
return nullptr; \ | ||
} \ | ||
ptr; \ | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see how this could work in the pre-C++-14 language standard that I am more or less familiar with. SAFE_THIS
has to be an expression but it doesn't seem to expand to code that can be considered an expression according to the language specification. I tried to quickly skim through https://en.cppreference.com/w/cpp/language/expressions but couldn't find anything that allows to consider this code valid in modern c++.
In other words, I am puzzled how this code compiles (in the context of the change in lib/src/main/cpp/libzim/archive.cpp
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@veloman-yunkan This code is made by @mgautierfr, he can better explain this, and i am not very much familiar with C++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that @veloman-yunkan understand how it works :)
#define FOO BAR
make every instance of FOO
literally replaced by BAR
.
So something like THIS->getEntryByPath(...)
is replaced by getPtr<NATIVE_TYPE>(env, thisObj)->getEntryByPath(...)
. And it compiles because getPtr
is a function returning a pointer.
With your define of SAFE_THIS
we have SAFE_THIS->getEntryByPath(...)
replaced by:
({
auto ptr = getPtr<NATIVE_TYPE>(env, thisObj);
if (!ptr) {
throwException(env, "java/lang/IllegalStateException", "The native object has already been disposed.");
return nullptr;
}
ptr;
})->getEntryByPath(...)
And, as @veloman-yunkan, I don't know how this code can be compiled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With your define of SAFE_THIS we have SAFE_THIS->getEntryByPath(...) replaced by:
@mgautierfr SAFE_THIS
also uses the getPtr<NATIVE_TYPE>(env, thisObj)
to get the pointer, and if this method returns the nullptr
it throws the error which will be caught in the archive.cpp
.
METHOD(jobject, getEntryByPath__Ljava_lang_String_2, jstring path) {
return BUILD_WRAPPER("org/kiwix/libzim/Entry", SAFE_THIS->getEntryByPath(TO_C(path)));
} CATCH_EXCEPTION(nullptr)
Also, in the archive.cpp SAFE_THIS
returns the archive.
I think that's why this code was compiled without any error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are correctly handling and disposing of the previous archive and web views for the previously set archive.
I would say that you are not correctly disposing the previous archive if you are disposing it while something is still using it.
However, I understand it can be difficult to do such memory management in a java world where everything is expected to be managed by GC.
I have a clean solution for this problem and it is pr #25. The PR remove the need to call dispose on libzim object (and even remove this method) and make resource automatically clean up when GC destroy the object.
@mgautierfr You are right, and I mentioned this thing in the PR description please see point 4(We have done all possible things to remove loading of webView at the Android level).
Could we complete that PR so we get rid of this error? |
See issue kiwix/kiwix-android#3956
null pointer dereference
, meaning that when accessing the getEntryByPath method, the archive object had already been disposed of, which caused the crash.java-libkiwix
(e.g., Kiwix Android). We are already handling all errors thrown byjava-libkiwix
in the Kiwix Android app.SAFE_THIS
macro to gather feedback on this approach. Changing the existingTHIS
macro to incorporate the new behavior would require updating all the places where it is being used. To validate this fix before making those changes, I implemented the new function separately. I explicitly disposed of the archive and attempted to access its content via the getEntryByPath method. Now, instead of crashing the entire program, it correctly throws an error, which is handled by Kiwix.The issue sometimes occurs on the CI see https://github.com/kiwix/kiwix-android/actions/runs/12809338048/job/35713951704?pr=4177#step:8:9560