Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Posix sigaction mechanism works for XNU kernel, no need to handle mach exception at all #4725

Open
lionfore-hao opened this issue Jan 20, 2025 · 9 comments

Comments

@lionfore-hao
Copy link

Platform

iOS, macOS, iPadOS, tvOS, watchOS, visionOS

Environment

Develop, Production, TestFlight

Installed

Swift Package Manager

Version

all versions

Xcode Version

all versions

Did it work on previous versions?

No response

Steps to Reproduce

Sentry processes both MACH exceptions and POSIX signals for the XNU bases systems, and restores the saved previous handler callback once sentry processed an exception in its MACH exception handler, this behavior will make other POSIX signal based handlers not work, please see the following logics:

  1. sentry installs its MACH exception & POSIX signal handlers, and saves the previous relative handlers;
  2. after this, another component/SDK of the same application installs another POSIX signal handler;
  3. once an exception occurs, sentry processes it in its MACH exception handler(because this happens before POSIX signal), and restore the saved previous handler;
  4. then, all those components/SDKs installing their POSIX signals handler after sentry would lose the exception handling chance;

But I really do not know why sentry would like to handle MACH exceptions, in the fact, POSIX signals handler works well, not need to handle MACH exceptions at all.

We can talk about this with more detail, and I can be reached via haolianfu@agora.io, thanks in advance!

B.R.,
Lionfore

Expected Result

Remove all the MACH exception handling relative codes, and only left POSIX signals handling codes, this is the best solution.

Actual Result

Got replied in place or via haolianfu@agora.io

Are you willing to submit a PR?

No response

@philipphofmann
Copy link
Member

Thanks for opening this issue @lionfore-hao. If I understood it correctly, the problem this causes is:

this behavior will make other POSIX signal based handlers not work

If yes, please elaborate on why this is a problem for you. What use case do you have? This information will help us to prioritize this issue.

@philipphofmann philipphofmann moved this from Needs Discussion to Needs More Information in Mobile & Cross Platform SDK Jan 20, 2025
@lionfore-hao
Copy link
Author

Thanks for the quick reply! Yes, you are right, the core problem is that using MACH exception handling mechanism in Sentry makes other POSIX signal based handlers not work, and I have described the problem logic in my previous message.

We have another component which uses POSIX signal based handlers integrated together with Sentry, so the above mentioned behavior of Sentry makes the other component not work at all.

Thanks again!

@philipphofmann
Copy link
Member

@supervacuus, you're more experienced with signal handlers and such. Do you think the proposed change makes sense?

@lionfore-hao
Copy link
Author

Yes, I think remove MACH exception based codes, and only left the POSIX signal based codes in Sentry project is OK, this can work well, because XNU kernel supports POSIX signal mechanism completely.

@supervacuus
Copy link

Exposing an option that disables mach exception port handling for users only interested in POSIX signals could be a valuable feature. For instance, users who write cross-platform applications might already have handlers implemented. I cannot say how many signals you cover in sentry-cocoa and whether you only provide them for specific errors that can only be caught via POSIX signals.

Of course, arbitrary interactions between app- or app-dependency-installed handlers with sentry handlers won't be possible. For instance, in the Native SDK, only the inproc backend guarantees that previous handlers will be called; the crash handlers of breakpad and crashpad might terminate applications early or prevent further snapshotting. But if your POSIX signal handlers invoke the previously installed handlers and users install theirs before SDK initialization, that could make sense.

However, I wouldn't remove mach exception handling entirely since POSIX signals are only a compatibility layer on top. Mach exceptions reveal much more platform-specific detail, which many developers will require. I would guess that most Apple targeting developers will be more used to the native exception reporting rather than receiving POSIX signals 😸.

@philprime
Copy link
Contributor

For reference, these are the crash monitors for mach and signal exceptions:

  • SentryCrashMonitor_MachException.c:

    static void *
    handleExceptions(void *const userData)
    {
    MachExceptionMessage exceptionMessage = { { 0 } };
    MachReplyMessage replyMessage = { { 0 } };
    char *eventID = g_primaryEventID;
    const char *threadName = (const char *)userData;
    pthread_setname_np(threadName);
    if (strcmp(threadName, kThreadSecondary) == 0) {
    SENTRY_ASYNC_SAFE_LOG_DEBUG("This is the secondary thread. Suspending.");

  • SentryCrashMonitor_Signal.ca:

    static void
    handleSignal(int sigNum, siginfo_t *signalInfo, void *userContext)
    {
    SENTRY_ASYNC_SAFE_LOG_DEBUG("Trapped signal %d", sigNum);
    if (g_isEnabled) {
    thread_act_array_t threads = NULL;
    mach_msg_type_number_t numThreads = 0;
    sentrycrashmc_suspendEnvironment(&threads, &numThreads);
    sentrycrashcm_notifyFatalExceptionCaptured(false);
    SENTRY_ASYNC_SAFE_LOG_DEBUG("Filling out context.");

@lionfore-hao
Copy link
Author

will

Thank you so much for the detailed reply! But in my opinion, as long as the crash reporter grasps each thread's stack, it is enough for developer to analyze the error, and SIGSEGV/SIGILL/SIGBUS/SIGFPE/SIGABRT can tell us what happened. I know the MACH exception port could provide more information, but seems these informations are not so useful for the developer.

I am sad to hear that you guys would not remove the MACH exception port mechanism, because this would make many other crash reporter implementations not work at all, such as Google's breakpad.

At last, if you do not want to remove the MACH exception port relative logic in Sentry, then could you please do one of the following changes:

  1. Not restore the saved POSIX signal handlers when handled the exception in MACH exception port handler, and make the signal be delivered again to the installed signal handler, this is another possible solution for integrating with other crash reporter systems;
  2. Remove the POSIX signal handling codes, because I think MACH exception handling and POSIX signal handling mechanisms are redundant functions, except the signal SIGABRT, or you can just only install the SIGABRT signal handler, remove SIGSEGV/SIGILL/SIGBUS/SIGFPE signals handling codes, I think this is also OK for peaceful coexistence with other crash reporter systems;

We really need POSIX signal mechanisms work well when our customers integrated Sentry SDK, thank you in advance!

Regards,
Lionfore

@kahest kahest moved this from Needs More Information to Needs Investigation in Mobile & Cross Platform SDK Jan 22, 2025
@philipphofmann
Copy link
Member

  1. Not restore the saved POSIX signal handlers when handled the exception in MACH exception port handler, and make the signal be delivered again to the installed signal handler, this is another possible solution for integrating with other crash reporter systems;

That sounds like a valuable solution. @supervacuus, do you agree?

@lionfore-hao
Copy link
Author

Yes, do not restore the POSIX signal handlers at the MACH exception port handler, and the POSIX signal handler will be triggered later, you can restore the saved previous signal handler in Sentry's signal handler, then all the signal handler chain will work well now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Needs Investigation
Development

No branches or pull requests

5 participants