Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #5093 : Crash on api below 24 #5211

Closed

Conversation

Shaik-Sirajuddin
Copy link

@Shaik-Sirajuddin Shaik-Sirajuddin commented Oct 28, 2023

Explanation

Essential Checklist

  • The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
  • Any changes to scripts/assets files have their rationale included in the PR explanation.
  • The PR follows the style guide.
  • The PR does not contain any unnecessary code changes from Android Studio (reference).
  • The PR is made from a branch that's not called "develop" and is up-to-date with "develop".
  • The PR is assigned to the appropriate reviewers (reference).

Reference Videos :
api_23_crash_before.webm
api_23_crash_after_fix.webm

@Shaik-Sirajuddin Shaik-Sirajuddin requested a review from a team as a code owner October 28, 2023 07:52
@github-actions
Copy link

Hi! @Shaik-Sirajuddin Welcome to Oppia! Could you please follow the instructions here and sign the CLA Sheet to get started? You'll need to do this before we can accept your PR. Thanks!

@adhiamboperes
Copy link
Collaborator

Please comment here once you have signed the CLA

@Shaik-Sirajuddin
Copy link
Author

Please comment here once you have signed the CLA

Successfully Signed CLA

@Shaik-Sirajuddin
Copy link
Author

@adhiamboperes could you check this pull request.

Copy link
Collaborator

@adhiamboperes adhiamboperes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Shaik-Sirajuddin. I have left a suggestion, but overall looks good.

Please edit the PR title per the checklist instructions, and when done assign beck to me. Please see wiki section.

@Shaik-Sirajuddin Shaik-Sirajuddin changed the title Fix crash on api 23 Fix #5093 : Crash on api below 24 Nov 1, 2023
@Shaik-Sirajuddin
Copy link
Author

Thanks @Shaik-Sirajuddin. I have left a suggestion, but overall looks good.

Please edit the PR title per the checklist instructions, and when done assign beck to me. Please see wiki section.
Thanks @adhiamboperes
Implemented Changes as requested , Could you take another look.

Copy link
Collaborator

@adhiamboperes adhiamboperes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @Shaik-Sirajuddin

Copy link

oppiabot bot commented Nov 1, 2023

Assigning @BenHenning for code owner reviews. Thanks!

Copy link
Member

@BenHenning BenHenning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Shaik-Sirajuddin! I had one comment on a potentially simpler way to do this.

prodImpl.loadTextFileFromLocalAssets(assetName)
} as? String ?: error("Asset doesn't exist: $assetName")
return if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
trackedAssets.computeIfAbsent(assetName) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here & elsewhere: could we use Kotlin's https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/get-or-put.html instead? That would avoid the SDK-specific API that we're using here (and avoid needing to have two different implementations).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion ,
I think its enough to use getOrPut combined with synchronized function of kotlin to avoid unnecessary computation , as the main use of computeIfAbsent seemingly is to avoid unnecessary computation.

I think we can replace if condition by using code in the below format :

return memoizedLocales[localeContext] ?: synchronized(memoizedLocales) {
      memoizedLocales.getOrPut(localeContext) {
        val chooser = profileChooserSelector.findBestChooser(localeContext)
        val primaryLocaleSource = LocaleSource.createFromPrimary(localeContext)
        val fallbackLocaleSource = LocaleSource.createFromFallback(localeContext)
        val proposal = chooser.findBestProposal(primaryLocaleSource, fallbackLocaleSource)
        proposal.computedLocale
      }
    }

instead of writing like :

return if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
      memoizedLocales.computeIfAbsent(localeContext) {
        val chooser = profileChooserSelector.findBestChooser(localeContext)
        val primaryLocaleSource = LocaleSource.createFromPrimary(localeContext)
        val fallbackLocaleSource = LocaleSource.createFromFallback(localeContext)
        val proposal = chooser.findBestProposal(primaryLocaleSource, fallbackLocaleSource)
        return@computeIfAbsent proposal.computedLocale
      }
    } else {
      // Note : Using get/PutIfAbsent For API Level below 24 as computeIfAbsent is introduced in API Level 24
      val locale = memoizedLocales[localeContext] ?: synchronized(memoizedLocales) {
        val chooser = profileChooserSelector.findBestChooser(localeContext)
        val primaryLocaleSource = LocaleSource.createFromPrimary(localeContext)
        val fallbackLocaleSource = LocaleSource.createFromFallback(localeContext)
        val proposal = chooser.findBestProposal(primaryLocaleSource, fallbackLocaleSource)
        memoizedLocales.putIfAbsent(localeContext, proposal.computedLocale)
        proposal.computedLocale
      }
      return locale
    }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed that getOrPut doesn't allow null value to be returned by defaultValue function ,
so in places where it is possible to receive null after computing for the key , we can use ?.let scope and putIfAbsent to account for NullPointerException.
Have implemented it accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I want to try and avoid the synchronized bit if we can since it's pretty highly discouraged in Kotlin (see https://kt.academy/article/ek-synchronization) since it can cause problems with coroutines (see https://blog.danlew.net/2020/01/28/coroutines-and-java-synchronization-dont-mix/).

That being said, getOrPut does not guarantee atomicity, unlike does computeIfAbsent for ConcurrentHashMap. This may actually be a case where a slight change in implementation is needed to better leverage a lockless approach via coroutines. I'll follow up in a second comment with suggested versions of this & the next class to try and better illustrate what I'm thinking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what I'm thinking for fixing the issue with locale creation: https://gist.github.com/BenHenning/deb77757e89e1974066a6e76d1496f8b/revisions. There will be some peripheral changes needed in other files, too.

For FakeAssetRepository, we ideally would update it in the same way (i.e. update AssetRepository to returned Deferreds instead, which would also let us get rid of the ReentrantLock AssetRepositoryImpl is currently using). However, that's a change that will affect a lot more in the codebase, and doesn't seem worth it for the immediate benefit of fixing the crash. Could we maybe instead copy the ReentrantLock pattern used in the production implementation (https://github.com/oppia/oppia-android/blob/develop/utility/src/main/java/org/oppia/android/util/caching/AssetRepositoryImpl.kt)? It has the same problem as synchronized inherently, though it offers better consistency with other locks used in the codebase and offers a variety of benefits over the built-in JVM lock (see https://stackoverflow.com/a/11821900).

@Shaik-Sirajuddin
Copy link
Author

Shaik-Sirajuddin commented Nov 6, 2023

Thanks @Shaik-Sirajuddin! I had one comment on a potentially simpler way to do this.

Made modifications as suggested.
Kindly could you have a look at it @BenHenning

@Shaik-Sirajuddin
Copy link
Author

Shaik-Sirajuddin commented Nov 7, 2023

@Shaik-Sirajuddin, when done making changes please assign a reviewer by mentioning "@xxxx, PTAL".

For this PR, please check this link https://github.com/oppia/oppia-android/actions/runs/6753291993/job/18387213803?pr=5211 to see which of your tests are failing, and confirm that they are passing locally. Please refer to wiki page on how to run tests locally.

@adhiamboperes PTAL

Copy link

oppiabot bot commented Nov 7, 2023

Unassigning @Shaik-Sirajuddin since a re-review was requested. @Shaik-Sirajuddin, please make sure you have addressed all review comments. Thanks!

@Shaik-Sirajuddin
Copy link
Author

Made some changes
@adhiamboperes PTAL

@adhiamboperes
Copy link
Collaborator

Made some changes @adhiamboperes PTAL

Reassigning since the tests pass now. @BenHenning, PTAL.

@adhiamboperes adhiamboperes removed their assignment Nov 10, 2023
Copy link
Member

@BenHenning BenHenning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Shaik-Sirajuddin! Apologies for the delayed response--last week was really busy for me.

I've followed up on the ongoing comment thread, plus left another. PTAL.

prodImpl.loadTextFileFromLocalAssets(assetName)
} as? String ?: error("Asset doesn't exist: $assetName")
return if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
trackedAssets.computeIfAbsent(assetName) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I want to try and avoid the synchronized bit if we can since it's pretty highly discouraged in Kotlin (see https://kt.academy/article/ek-synchronization) since it can cause problems with coroutines (see https://blog.danlew.net/2020/01/28/coroutines-and-java-synchronization-dont-mix/).

That being said, getOrPut does not guarantee atomicity, unlike does computeIfAbsent for ConcurrentHashMap. This may actually be a case where a slight change in implementation is needed to better leverage a lockless approach via coroutines. I'll follow up in a second comment with suggested versions of this & the next class to try and better illustrate what I'm thinking.

prodImpl.loadTextFileFromLocalAssets(assetName)
} as? String ?: error("Asset doesn't exist: $assetName")
return if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
trackedAssets.computeIfAbsent(assetName) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what I'm thinking for fixing the issue with locale creation: https://gist.github.com/BenHenning/deb77757e89e1974066a6e76d1496f8b/revisions. There will be some peripheral changes needed in other files, too.

For FakeAssetRepository, we ideally would update it in the same way (i.e. update AssetRepository to returned Deferreds instead, which would also let us get rid of the ReentrantLock AssetRepositoryImpl is currently using). However, that's a change that will affect a lot more in the codebase, and doesn't seem worth it for the immediate benefit of fixing the crash. Could we maybe instead copy the ReentrantLock pattern used in the production implementation (https://github.com/oppia/oppia-android/blob/develop/utility/src/main/java/org/oppia/android/util/caching/AssetRepositoryImpl.kt)? It has the same problem as synchronized inherently, though it offers better consistency with other locks used in the codebase and offers a variety of benefits over the built-in JVM lock (see https://stackoverflow.com/a/11821900).

@@ -55,13 +55,26 @@ class AndroidLocaleFactory @Inject constructor(
*/
fun createAndroidLocale(localeContext: OppiaLocaleContext): Locale {
// Note: computeIfAbsent is used here instead of getOrPut to ensure atomicity across multiple
// threads calling into this create function.
return memoizedLocales.computeIfAbsent(localeContext) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the problems that this has caused in earlier Android API versions, perhaps we should outright prohibit it to ensure this can't happen again.

I suggest adding a static regex pattern + test to check for this. See: https://github.com/oppia/oppia-android/wiki/Static-Analysis-Checks#generic-regex-pattern-matching-against-file-contents.

Copy link
Author

@Shaik-Sirajuddin Shaik-Sirajuddin Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked the suggested approach : (https://gist.github.com/BenHenning/deb77757e89e1974066a6e76d1496f8b/revisions)

I have conducted a test on kotlin playground with suggested approach using this docs : (https://kotlinlang.org/docs/shared-mutable-state-and-concurrency.html)

But synchronization doesn't seem to be achieved . You could see the test here : (https://pl.kotl.in/cgtBwrYAb)
I think alternatively we can consider using mutex , Mutex Test : (https://pl.kotl.in/Q9BFy_zak)
Mutex runs faster than blockingDispatcher from the tests.
We may also need to consider what the docs have to say about using Mutex :

The locking in this example is fine-grained, so it pays the price

What do you suggest ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Shaik-Sirajuddin.

I think we would need to run the checks in a carefully isolated environment (vs. a cloud environment) in order to be confident in the results, but it does make sense why a fine-grained confinement strategy like blocking dispatcher would have poorer performance over a mutex (especially a cooperative one like Kotlin's Mutex--I actually wasn't aware of that one).

I generally lean toward maintaining the existing pattern for properly solving this (which is blocking dispatcher as we use that everywhere else). I'm actually not opposed to using Mutex, but one concern that comes up is how easy it is to deadlock when using managed critical sections (though this problem isn't completely fixed by blocking dispatcher; it's possible to live-lock that in the right circumstances). Generally the best solution is to use a message queue (like Kotlin's actor which we use in highly complex state machines in the app).

I'm open to either Mutex or blocking dispatcher here since both will play well with coroutines.

@adhiamboperes
Copy link
Collaborator

@Shaik-Sirajuddin, following up in case you need clarification on the latest review comments.

@Shaik-Sirajuddin
Copy link
Author

@Shaik-Sirajuddin, following up in case you need clarification on the latest review comments.

Apologies for delay. Facing some issues with PC
I will take a look soon.

@Shaik-Sirajuddin
Copy link
Author

@BenHenning PTAL

@oppiabot oppiabot bot assigned BenHenning and unassigned Shaik-Sirajuddin Nov 20, 2023
Copy link

oppiabot bot commented Nov 20, 2023

Unassigning @Shaik-Sirajuddin since a re-review was requested. @Shaik-Sirajuddin, please make sure you have addressed all review comments. Thanks!

Copy link

oppiabot bot commented Nov 27, 2023

Hi @Shaik-Sirajuddin, I'm going to mark this PR as stale because it hasn't had any updates for 7 days. If no further activity occurs within 7 days, it will be automatically closed so that others can take up the issue.
If you are still working on this PR, please make a follow-up commit within 3 days (and submit it for review, if applicable). Please also let us know if you are stuck so we can help you!

@oppiabot oppiabot bot added stale Corresponds to items that haven't seen a recent update and may be automatically closed. and removed stale Corresponds to items that haven't seen a recent update and may be automatically closed. labels Nov 27, 2023
Copy link

oppiabot bot commented Dec 6, 2023

Hi @Shaik-Sirajuddin, I'm going to mark this PR as stale because it hasn't had any updates for 7 days. If no further activity occurs within 7 days, it will be automatically closed so that others can take up the issue.
If you are still working on this PR, please make a follow-up commit within 3 days (and submit it for review, if applicable). Please also let us know if you are stuck so we can help you!

@oppiabot oppiabot bot added the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label Dec 6, 2023
@oppiabot oppiabot bot closed this Dec 13, 2023
adhiamboperes added a commit that referenced this pull request Mar 1, 2024
## Explanation
Fixes #5093

This replaces #5211.

The core crash has come from using Java 8's
[Map.computeIfAbsent](https://docs.oracle.com/javase/8/docs/api/java/util/Map.html#computeIfAbsent-K-java.util.function.Function-)
on devices running SDK <24. This function isn't supported on these SDK
versions because:
1. Java 8 isn't fully supported until SDK 24
(https://stackoverflow.com/a/35934691).
2. While the build toolchain "desugars" Java 8 syntax and some APIs to
provide support for Java 8 before SDK 24, not all functionality is
present. While
https://developer.android.com/studio/write/java8-support-table suggests
that ``computeIfAbsent`` _should_ be present, it's not clear why it
isn't (it could be an out-of-date desugarer, or some other issue).

Only two uses for ``computeIfAbsent`` are present:
- ``AndroidLocaleFactory``: the main cause of the crash.
- ``FakeAssetRepository``: only used in tests, so ignored in this change
(since it would require changing the production API for
``AssetRepository`` to fix).

A new regex pattern check was added to ensure this method can't
inadvertently be used in the future since it won't work on lower API
versions.

This PR mainly focuses on fixing ``AndroidLocaleFactory``. 

Need to review the code & finish it, and document the changes. #5211
explored different methods that we could take to keep synchronization
when updating the factory to move away from ``computeIfAbsent``, but all
of them required introducing locking mechanisms which could cause
deadlocking. I also considered in this PR removing the memoization
altogether, but this doesn't seem ideal either since the creation of
profiles is non-trivial and locales are frequently fetched by nearly all
core lesson data providers throughout the app. I landed on a solution
that leveraged the app's blocking dispatcher and made profile creation
asynchronous. This has some specific implications:
- Technically a "live lock" is still possible if the blocking dispatcher
is starved. A better solution would be to use an actor-like pattern and
funnel changes through the background dispatcher, but that's out of
scope for this change and represents a problem with all uses of the
blocking dispatcher.
- Creations of ``DisplayLocaleImpl`` get more complicated since creation
of the ``Locale`` is now asynchronous. This was adjusted by passing in
the display locale's ``Locale`` object rather than having it compute it,
and callsites often already are operating within a coroutine context
which makes the asynchronous aspect of ``Locale`` creation a bit
simpler.
- Two cases where asynchronous creation cannot be used are the edge
cases of initial app startup failing to fetching & create the locale in
a timely manner, and initializing locale for all activities prior to the
current locale being fetched. In both cases, the factory was updated to
create a non-memoized ``Locale`` object specifically for these purposes.
This method should only be used when necessary since it avoids the
performance benefits of the memoized version.
- #5046 might be addressed since one plausible cause for that observed
issue is a root locale being used, or a region-only locale (both of
which should now be handled per this PR).

During development, there were two other changes that were made to ease
development:
- ``AndroidLocaleProfile`` was refactored to use a sealed class to
improve its typing, and to better facilitate the introduction of a root
profile (which was added for better system compatibility in cases when
the default app locale isn't supported on the system). Note that the new
root profile only applies to app strings since it can actually have
correct default behavior (whereas for content strings & audio voiceovers
it can't actually be used effectively). The profile class was also
updated to compute its own ``Locale`` (which is a simplification since
before there were multiple ways to create a "correct" ``Locale``), and
have an application-injectable factory.
- ``DataProviders.transformAsync`` was updated to handle caught
exceptions in the same way as ``DataProviders.transform`` which helped
while debugging the crash, and seems like it would have downstream
benefits in the future. The method's tests were correspondingly updated.

The changes here led to some testing changes:
-
``testCreateDisplayLocaleImpl_defaultInstance_hasDefaultInstanceContext``
was removed since using the default context isn't valid in most cases
(see below point).
-
``testReconstituteDisplayLocale_defaultContext_returnsDisplayLocaleForContext``
was renamed & a new test added to better represent that the default
context is invalid except for app strings where it can represent root
locale. For non-app cases, an exception should be thrown for default.
This is also why
``testCreateLocale_appStrings_allIncompat_invalidLangType_throwsException``
was updated to instead verify that the root locale is returned.
- In ``TranslationControllerTest``,
``testGetSystemLanguageLocale_rootLocale_returnsLocaleWithBlankContext``
and
``testGetAppLanguageLocale_uninitialized_returnsLocaleWithSystemLanguage``
were updated to correctly indicate that there is no specified language
for the root locale cases. To ensure coverage for valid IETF BCP-47 &
Android resource language IDs, a new test was added:
``testGetAppLanguageLocale_ptBrDefLocale_returnsLocaleWithIetfAndAndroidResourcesLangIds``.

Finally, please note that this is fixing a release blocking issue for
the upcoming 0.13 beta release of the app.

## Essential Checklist
- [x] The PR title and explanation each start with "Fix #bugnum: " (If
this PR fixes part of an issue, prefix the title with "Fix part of
#bugnum: ...".)
- [x] Any changes to
[scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets)
files have their rationale included in the PR explanation.
- [x] The PR follows the [style
guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide).
- [x] The PR does not contain any unnecessary code changes from Android
Studio
([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)).
- [x] The PR is made from a branch that's **not** called "develop" and
is up-to-date with "develop".
- [x] The PR is **assigned** to the appropriate reviewers
([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)).

## For UI-specific PRs only
This is mainly an infrastructure change. The only user behavior impacted
is that the app no longer crashes on startup on SDK versions below 24.

Attempting to open the app on an SDK 23 emulator on develop
(4a07d8d):


![open_app_without_fix_smaller](https://github.com/oppia/oppia-android/assets/12983742/458b2baf-4157-4a3f-9809-27368a9c3e04)

Attempting to open the app with the fixes from this branch:

![open_app_with_fix_smaller](https://github.com/oppia/oppia-android/assets/12983742/b9ce13d9-ecfd-4205-b87f-6fbf42dfd995)

---------

Co-authored-by: Adhiambo Peres <59600948+adhiamboperes@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Corresponds to items that haven't seen a recent update and may be automatically closed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]:Oppia-Android app not opening on the Android 6.0.1 Marshmello
3 participants