Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: "C panic" ABI specifier #2753

Closed
wants to merge 18 commits into from
Closed

WIP: "C panic" ABI specifier #2753

wants to merge 18 commits into from

Conversation

BatmanAoD
Copy link
Member

@BatmanAoD BatmanAoD commented Aug 29, 2019

Update 28 October 2019: The effort to provide this language feature has been spun into a working group. See the announcement RFC.


Some preliminary conversation and drafting took place in rust-lang/rust#63909.

There are some remaining "TODO" sections.

Rendered

text/0000-simple_unwind_annotation.md Outdated Show resolved Hide resolved

By default, Rust assumes that an external function imported with extern "C"
cannot unwind, and Rust will abort if a panic would propagate out of a Rust
function exported with extern "C" function. If you specify the #[unwind]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function exported with extern "C" function. If you specify the #[unwind]
function with a non-"Rust" ABI ("`extern`") specification. If you specify the #[unwind(allowed)]

This verbiage could still be improved

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

TODO
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TODO
The `#[unwind(allowed)]` attribute permits functions with non-Rust ABIs (e.g. `extern "C" fn`) to unwind rather than terminating the process.

(The current verbiage is bad. Please suggest improvements.)

I don't think we've yet reached consensus on whether to include the (allowed). Is it possible for annotations to exist in multiple forms, one that has an argument and one that doesn't? If not, I think we should use (allowed) for forwards compatibility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for annotations to exist in multiple forms, one that has an argument and one that doesn't?

Yes.

By default, Rust assumes that an external function imported with extern "C"
cannot unwind, and Rust will abort if a panic would propagate out of a Rust
function exported with extern "C" function. If you specify the #[unwind]
attribute on an extern "C" function, Rust will instead allow an unwind (such as
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
attribute on an extern "C" function, Rust will instead allow an unwind (such as
attribute on a function with a non-"Rust" ABI, Rust will instead allow an unwind (such as

To be clarified: the annotation can be applied to either function definitions (extern "C" fn...) or function declarations-without-definitions (extern "C" { fn ...; I'm not sure if there's a better term for these declarations).

Without this annotation, what is the behavior supposed to be? I'm pretty sure that for maximum soundness, definitions should always abort and declarations should be assumed not to abort. Should declarations without #[unwind(allowed)] be given automatic "wrapper" functions that would catch unwinding and trigger an abort, or should they simply propagate the exception as normal?

text/0000-simple_unwind_annotation.md Show resolved Hide resolved
text/0000-simple_unwind_annotation.md Outdated Show resolved Hide resolved
text/0000-simple_unwind_annotation.md Outdated Show resolved Hide resolved
text/0000-simple_unwind_annotation.md Outdated Show resolved Hide resolved
@acfoltzer
Copy link

Thank you all for pushing this forward! Getting this capability is very important to the safety of our Lucet WebAssembly runtime.

One case where an attribute might not suffice is for function pointers. For example:

fn foo(bar_ptr: usize) {
    unsafe {
        let bar = std::mem::transmute::<usize, extern "C" fn()>(bar_ptr);
        bar();
    }
}

In the case of Lucet, where we are grabbing function pointers from a dynamically-loaded shared object, we need to be able to make sure these are not nounwind as well.

@Centril
Copy link
Contributor

Centril commented Aug 29, 2019

One case where an attribute might not suffice is for function pointers. For example:

In case we go the attribute route, my RFC #2602 would provide the syntactic support allowing us to write type UnwindCFnPtr = #[unwind(allow)] extern "C" fn();. The semantic meaning would need to be provided through a different RFC but it is at least syntactically feasible.

@BatmanAoD
Copy link
Member Author

One case where an attribute might not suffice is for function pointers. For example:

In case we go the attribute route, my RFC #2602 would provide the syntactic support allowing us to write type UnwindCFnPtr = #[unwind(allow)] extern "C" fn();. The semantic meaning would need to be provided through a different RFC but it is at least syntactically feasible.

Eesh. That makes me strongly favor deprecating extern and replacing it with something better (which would include the unwind ABI) in the next edition.

Co-Authored-By: Ralf Jung <post@ralfj.de>
@Centril
Copy link
Contributor

Centril commented Aug 29, 2019

Eesh. That makes me strongly favor deprecating extern and replacing it with something better (which would include the unwind ABI) in the next edition.

Do you feel that #[unwind(allow)] extern "C" fn() is somehow more objectionable as compared to extern "C" unwind "allow" or whatnot? The advantage to this approach is that the syntactic support is reusable when we / users need to add their own attributes (even just using #[cfg(...)]).

Personally, I want to limit the additional spec and compiler complexity and still satisfy the common use cases. As such, currently, I'm not positively inclined towards using editions to change things here (well you might not need an edition anyways; it might be backwards compatible to just add something to all editions (e.g. abi "C" fn()) but even so that's a complication).

@BatmanAoD
Copy link
Member Author

Let me clarify first that I'm not trying to suggest any breaking changes in this RFC! I just think the extern keyword is confusing enough that it needs to be revisited and hopefully deprecated eventually.

Do you feel that #[unwind(allow)] extern "C" fn() is somehow more objectionable as compared to extern "C" unwind "allow" or whatnot? The advantage to this approach is that the syntactic support is reusable when we / users need to add their own attributes (even just using #[cfg(...)]).

My reaction is mostly "wow, how is that annotation becoming part of the type system?" Whereas with an unwind keyword, it would be clearer that indeed unwind "X" fn is a different type from unwind "Y" fn. So, yes, I do find it more objectionable, though I don't have a strong argument for why annotations should rarely (if ever) be part of the type system (except as a form of sugar or code-generation, where a long-form way of expressing the concept also exists). I just personally find it surprising.

Personally, I want to limit the additional spec and compiler complexity and still satisfy the common use cases. As such, currently, I'm not positively inclined towards using editions to change things here (well you might not need an edition anyways; it might be backwards compatible to just add something to all editions (e.g. abi "C" fn()) but even so that's a complication).

The only change I'm suggesting for a new edition are to introduce abi and/or unwind as keywords and deprecate extern. I would expect the two ways of declaring ABIs to be otherwise identical.

@Centril
Copy link
Contributor

Centril commented Aug 29, 2019

My reaction is mostly "wow, how is that annotation becoming part of the type system?" Whereas with an unwind keyword, it would be clearer that indeed unwind "X" fn is a different type from unwind "Y" fn. So, yes, I do find it more objectionable, though I don't have a strong argument for why annotations should rarely (if ever) be part of the type system (except as a form of sugar or code-generation, where a long-form way of expressing the concept also exists). I just personally find it surprising.

If you find #[unwind(...)] on a function pointer objectionable then we should likewise make a change in this RFC. We either have #[unwind(...)] extern "abi" fn foo() and #[unwind(...)] extern "abi" fn() or we have extern "abi" unwind "..." fn foo() and extern "abi" unwind "..." fn() but not a mix.

I'd like to note that we do have type-system-affecting attributes already, e.g. repr(packed). More broadly, we have plenty of semantic attributes (more reprs, ..) Of course, as attributes are not allowed on type expressions there's no current expectation that they would affect types, but I don't see why that cannot change. The upside is that attributes scale better syntactically for adhoc concepts.

@BatmanAoD
Copy link
Member Author

we do have type-system-affecting attributes already, e.g. repr(packed)

I knew this would be the response to my comment about not liking attributes affecting the type system. 😞 But I'm not sure I understand how repr is actually involved in the type system; it doesn't affect anything like type inference or type-checking, does it?

#[unwind(allowed)] will affect the type of function pointers, because it won't be possible to assign a function pointer with the annotation to one with the abort annotation.

@Centril
Copy link
Contributor

Centril commented Aug 29, 2019

it doesn't affect anything like type inference or type-checking, does it?

It does.

#![forbid(safe_packed_borrows)]
// ^-- until the soundness hole goes away and this becomes on by default.

#[repr(packed)]
struct Foo {
    bar: u8,
    baz: u16,
}

fn stuff() {
    let packed = Foo { bar: 0, baz: 0 };
    &packed.baz;
    //~^ ERROR borrow of packed field is unsafe and requires unsafe function or block
}

@BatmanAoD
Copy link
Member Author

BatmanAoD commented Aug 29, 2019

That...also surprises me. How can packed structs be used safely?

Edit: that was very much a LMGTFY question; sorry. I read some of the RFC that made it unsafe. I really would have expected the alignment info to be kept separate from the type-system info, but I see now why that would be much more complicated and probably not worthwhile.

@CAD97
Copy link

CAD97 commented Aug 30, 2019

Note that while extern "ABI" unwind "ABI" would require parser changes, it would not require an edition change to make unwind a keyword; a positional keyword works fine here.

The position in the non_exhaustive is also towards attributes not effecting "real code semantics"; that's why it's up for getting dedicated syntax rather than the currently implemented attribute.

Longer term, I think a syntax that specifies call/unwind ABI inline and not in an attribute is desirable. Short term, however, we really want to minimally have #[unwind(allowed)] to "just" control the emition of LLVM nounwind. It can then potentially be reimplemented as a (deprecated) proc-macro when dedicated syntax and more detailed semantics are designed and implemented.

@RalfJung
Copy link
Member

For function pointers, another simpler alternative is to just not add nounwind when calling a function pointer. If that turns out to be a performance issue, we can still talk about richer function types.

@jan-hudec
Copy link

@acfoltzer,

One case where an attribute might not suffice is for function pointers. [...]
In the case of Lucet, where we are grabbing function pointers from a dynamically-loaded shared object, we need to be able to make sure these are not nounwind as well.

As I understand it, the point of this RFC is to only affect the function generation. So function pointers shall all be considered unwind and no nounwind function pointer shall be introduced. It can be done at later time, preferably by refining the ABI definitions.

This is conservative approach that can be implemented quickly. Discussion about adding the nounwind flag to the ABI shall be postponed to #2699 or other later RFC.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

TODO

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TODO
The goal is to enable the current use-cases for unwinding panics through foreign code while providing safer behaviour towards bindings not expecting that, with minimal changes to the language.
We should try not to prevent the RFC 2699 or other later RFC from resolving the above disadvantages on top of this one.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

TODO
Copy link

@jan-hudec jan-hudec Aug 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TODO
- Since this simple approach does not allow bindings to specify whether they expect callbacks they accept to unwind, we may want to restrict `#[unwind]` to `unsafe` functions. This restriction can be lifted when unwind is made part of the type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this already addressed by your point above that "function pointers shall all be considered unwind and no nounwind function pointer shall be introduced"? I think that statement applies to all FnOnce types, doesn't it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

....actually, this should probably be mentioned in the "alternatives" section, now that the primary proposal recommends treating all function pointers as #[unwind(allowed)].

@gnzlbg What do you think of this suggestion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BatmanAoD If we switch to extern "C panic" fn then this issue goes away, because that will also let us handle function pointers easily.

BatmanAoD and others added 3 commits August 30, 2019 14:42
Drawbacks

Co-Authored-By: Jan Hudec <bulb@ucw.cz>
Co-Authored-By: Jan Hudec <bulb@ucw.cz>
Co-Authored-By: Jan Hudec <bulb@ucw.cz>
@acfoltzer
Copy link

If @gnzlbg's understanding is shared by the lang team, then I suspect this RFC will indeed be non-viable and will need to be closed, and I would advocate that we stabilize the abort-on-FFI-logic as soon as possible. Users relying on unwind-through-FFI will need to wait for a more comprehensive solution (e.g. #2699).

If this turns out to be the case, perhaps we can explore an even weaker variation of this RFC that would drop the emphasis on Rust panics and simply specify the new ABI string in terms of not emitting nounwind.

I would very much like to avoid a situation where we are stuck between either taking on all the downsides to the wrapper approach that I described upthread, running a nightly compiler in production, or even building a custom rustc if nounwind gets added to extern "C" function pointers as proposed by rust-lang/rust#64090.

@jan-hudec
Copy link

jan-hudec commented Sep 19, 2019

@gnzlbg,

If some RFC were to:

* add an ABI that follows the C calling convention and uses the platform unwinding mechanism just like `C+-fexceptions`,

* on the Rust side, require that function definitions with that calling convention translate Rust panics to that ABI, and function calls to that calling convention translate the unwinding mechanism to Rust panic

[...]
AFAICT this is what #2699 does (amongst other things, like adding a -C panic=native feature), and precisely what this RFC is not intended to do. That is, it is an explicit goal of this RFC not to do that because it would be "too complex", so instead the goal of this RFC is to add a new C ABI that unwinds with a Rust panic, which has all the fundamental issues that had been raised above.

Actually, it's precisely what this RFC should and must do.

If we now say that extern "C panic" functions raise panic using a mechanism compatible with C++ ABI of the platform and calls to extern "C panic" expect panic in this form, then it is

  • currently easy to do—and exactly the same as if we say it simply unwinds the Rust panic—because the mechanism used for panics now is compatible with C++ ABI (on all platforms where both are defined) and
  • still future-proof, because the compiler can start emitting appropriate trampolines (i.e. any hard work is postponed until the situation actually becomes complicated).

Then in #2699 we can define how to extend it to handle and/or convert C++ (and structural) exceptions too. I think it should be an extension (i.e. not yet another ABI string), so maybe the more general name would be OK here.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 20, 2019

This seems to exactly describe the situation that would arise if a subsequent RFC specified a new, "C panic"-incompatible unwinding mechanism for Rust. Then we would consider whether such a change has a "small" impact:

I don't think so. That paragraph is specifically for changes resulting from specifying unspecified behavior. For example, we say that unwinding through FFI is UB, and that paragraph technically allow us to change it to say that this is now defined to abort (I say technically, because in practice, that paragraph has not allowed this).

However, changing the implementation of the Rust unwinding ABI in an ABI breaking way does not specify any behavior. It just changes from one unspecified behavior to another.

Also, I don't think it was the intent of that paragraph to allow adding new features to the language with the intent of them breaking later (I don't recall that use case showing up in the RFC discussion).

My impression based on this discussion is that the impact of breaking "C panic" would extend to a handful of people who have participated in this conversation.

I have no reason to believe that breaking code that uses "C panic" will be possible in practice because experience shows it isn't. We changed the behavior of unwinding through FFI from "undefined" to "abort" two years ago, where less crates were relying on it, and such a breaking change, which is explicitly allowed per that RFC, was not possible.

I have therefore no reasons to believe that such a change will be possible for "C panic" in practice independently of whatever any RFC says. Such a prognosis does not match with the experience that we have already acquired on this issue.

Why would this standard not also apply to "C panic"?

An ABI is a contract between the caller and the callee. extern "C" works because such a contract exists (an ABI spec), and both the caller and the callee can implement it correctly, and adhere to it, and that results in a stable language feature.

The "C panic" proposal leaves the Rust ABI unspecified such that no contract exists, allowing one part to change the contract any time. This can be made to work at one particular point in time, but it does not result in a stable language feature, and code using it can break any time.

That makes sense, and for the purposes of #2699 raises the unfortunate possibility that different unwinding mechanisms, once specified, could also be considered different platforms.

Only if those are guaranteed to have a certain ABI. The ABI of the rust unwinding mechanism is unspecified and it is illegal to use it through FFI, so if we discover a better way to do that, we can just ship it to users silently just like we ship new layout optimizations for repr(Rust) types.

Users that don't want this could globally opt-in to -C panic=native, or a target triple, or locally to something like "C+native" just like they can opt-in to repr(C). In the same way that using repr(C) does not limit what repr(Rust) can do, I think that solving this problem should not limit what Rust can do either. Limiting this is useful when you need it, and if you need it everywhere, I'm all in for allowing people to do so, but people that don't need it, should be able to automatically do better if possible.


@jan-hudec

Actually, it's precisely what this RFC should and must do.

If we now say that extern "C panic" functions raise panic using a mechanism compatible with C++ ABI of the platform and calls to extern "C panic" expect panic in this form,

I think this is one of the best of all the options explored, and that is essentially what the "C+unwind" /#[unwind(native)] part of #2699 was. I'm not sure if this RFC would be the best place to make that change, because the goal here was to let a panic escape as is (without the C++ compatible part), and this other proposal is slightly different (so the discussion about the two proposals in one might be confusing; don't know), but I'm all in for such a feature.

@acfoltzer
Copy link

I have no reason to believe that breaking code that uses "C panic" will be possible in practice because experience shows it isn't. We changed the behavior of unwinding through FFI from "undefined" to "abort" two years ago, where less crates were relying on it, and such a breaking change, which is explicitly allowed per that RFC, was not possible.

I have therefore no reasons to believe that such a change will be possible for "C panic" in practice independently of whatever any RFC says. Such a prognosis does not match with the experience that we have already acquired on this issue.

In the undefined -> abort case, we had a situation where the folks initiating the change did not know about the potential breakage, much less who specifically would be affected. The change did not offer a workaround for folks whose code would break. There was no subsequent RFC already being drafted to refine the changed behavior

By contrast, we now have a very good idea that people depend on this behavior (there's a whole RFC, after all), and who would be affected by breaking "C panic". We have a path forward to a more comprehensive solution in #2699 that the stakeholders are eager to adopt. Folks working on a hypothetical future RFC changing Rust's unwinding mechanism would know who to reach out to, and what concerns to account for.

Even though these cases both involve the same technical issue, the circumstances around them are completely different. Assuming they would end up in the same place is far too pessimistic to the point of feeling hostile towards would-be users of this RFC.


Regarding your interpretation of RFC: Language Semver, it's clear that at this point we disagree. I'll just second @BatmanAoD's request for representatives of the lang team to resolve the disagreement.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 21, 2019

Folks working on a hypothetical future RFC changing Rust's unwinding mechanism would know who to reach out to, and what concerns to account for.

The Rust ABI is unspecified. We change it relatively often, and those changes do not need an RFC. I don't think we ever even had an FCP for them. That's what "we make no guarantees about the Rust ABI" means.

In the undefined -> abort case, we had a situation where the folks initiating the change did not know about the potential breakage, much less who specifically would be affected. The change did not offer a workaround for folks whose code would break.

The second time the change was made (a year ago) a workaround was introduced with it (#[unwind(allowed)] has been on nightly for a year) - AFAICT nobody is using it.

Even though these cases both involve the same technical issue, the circumstances around them are completely different. Assuming they would end up in the same place is far too pessimistic to the point of feeling hostile towards would-be users of this RFC.

What's extremely user hostile is telling users that a feature is ""stable"" while actually reserving the right to break code using it any time. Just because you can tolerate such breakage does not mean that all other users will. Preserving user workflows isn't hostility, its user friendliness. Nobody likes to get called friday night because their stable Rust toolchain stopped compiling some correct Rust code.

@BatmanAoD
Copy link
Member Author

The Rust ABI is unspecified. We change it relatively often, and those changes do not need an RFC. I don't think we ever even had an FCP for them. That's what "we make no guarantees about the Rust ABI" means.

How does this impact dylib linking?

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 22, 2019

How does this impact dylib linking?

For the "Rust" ABI, if the dylibs are compiled with different Rust toolchains, we provide no guarantees. Iff the dylib links and the ABI changed, then that's UB of the form "calling a function with the wrong ABI", but since name mangling can change, the dylib might not even link.

So for dylibs, either one recompiles everything on every toolchain upgrade, or one uses an ABI that's guaranteed not to change in the dylib public API like, e.g., extern "C". There was recently a proposal in internals to improve the situation.

@BatmanAoD
Copy link
Member Author

So for dylibs, either one recompiles everything on every toolchain upgrade, or one uses an ABI that's guaranteed not to change in the dylib public API like, e.g., extern "C".

Is it actually guaranteed that the extern "C" ABI cannot ever change? That seems odd to me, since my understanding was that platform ABIs can change, since C the language does not actually specify an ABI.

But even if extern "C" itself is guaranteed not to change, why does that imply that extern "C panic" could not ever change, since there exist some ABIs that do change?

@rpjohnst
Copy link

I believe @gnzlbg's framing is that extern "C" doesn't change, because if a platform ever did change its ABI, it would become a new target triple, enabling a single version of rustc to target both the old and new versions.

@comex
Copy link

comex commented Sep 22, 2019

Indeed. Upgrading rustc should not generally require you to upgrade your OS. rustc might eventually drop support for obsolete OS versions, but that's a separate and explicit step, usually taken long after support for newer OS versions is added. Changing the ABI of an existing target means dropping support for the old OS version at the same time as adding support for the new one, which is way too soon.

Also, most non-embedded OSes don't make straight-out ABI-breaking changes, since they value being able to run old binaries. Some BSD variants are an exception, e.g.:

At OpenBSD, we don't care about breaking API/ABI between releases. Once a release is done, the API/ABI is stable, but there is no guarantee that it will be compatible with the next release.

(source)

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 23, 2019

How to deal with platforms that do ABI breaking changes is a complex topic, e.g., RFC: target-extension, libc#7570: How to deal with breaking changes on platform ?, Rust Internals: Pre-RFC: target extension (dealing with breaking changes at OS level). All platforms (Linux, Windows, *BSDs) do ABI breaking changes relatively often, but not all ABI breaking changes are equally severe, and each platform uses a different bar. Some don't need any handling, some can be handled without users knowing, some require new target triples, and some require new target triples, two versions of every system library, being very careful what you link, etc.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Oct 1, 2019

We discussed this RFC in the lang-team meeting last Thursday.

We generally agreed on a few things:

General direction

We like the proposal of creating a distinct ABI for "foreign functions that may unwind". It seems to be a good solution that enables forward progress on inserting "abort on unwind" boundaries while still giving people who do want unwinding a migration path. It does not however provide a complete solution: obviously this RFC leaves a number of details unspecified. For example, what sort of unwinding mechanism is permitted across these function boundaries? What happens if unwinding actually occurs?

Semver and how it relates to this RFC

Much of the discussion on this thread centers around just what is being guaranteed as stable code here, along with a dash of what constitutes a SemVer breaking change. We felt that it would be ok to declare that certain details are not yet specified, and that those details may change across compiler revisions without it being considered a breaking change -- but that we should be very clear about what those details are.

Moreover, there was concern about "de facto stabilization" -- basically, being "locked in" and unable to change details because of the amount of code relying on them. The main way we plan to avoid this is by continuing to make progress on the remaining issues. The primary driver of de facto stabilization would be letting the problem continue to sit for years, thus giving time for implicit dependencies to develop. If we are able to continue making progress and actually enable a sufficient number of use cases, then we have less to fear (though we do not commit to resolving all the uncertainty; our goal is only to support specific key use cases).

Proposed actions

To that end, we discussed doing the following:

  • in terms of this "minimal first step" RFC:
    • declare what behaviors are not yet specified
    • identify and document (in detail) the use cases that we are aiming to support
  • to address the longer term, create a lang team "working group" (see naming nit below), with acfoltzer and BatmanAoD as shepherds
    • goal is to help us make progress on resolving some of those details
    • first goal should probably be completing the work on this RFC (given above) and establishing next milestones

Speaking personally, I hope we can use this "working group" as a lang team experiment in handling more complex design initiatives like this. That is, I'd like to create a repository where we can collect design history, and where we can try to focus the conversation in more narrow ways. (Speaking as someone who spent several hours re-reading the comments on just this RFC -- most of which were out of date! -- I can tell you it is very hard to get up-to-speed on this topic based on the existing RFC threads. But I think you already knew that.) One of the key things I think we should start with is clarifying which details we aim to resolve and in what order.

OK, also speaking personally, I'm hoping to move away from the term 'working group', as I think it engenders confusion between this sort of working group and (say) the domain working groups such as the embedded working group. I think perhaps the term "project" is more appropriate (as the embedded team has proposed). We'll see.

C panic vs C unwind

This is obviously not just a naming issue. Specifically, "C panic" is meant to signal "a function that unwinds with the Rust unwinding mechanism" and "C unwind" with "a funtion that unwinds with the 'native' unwinding mechanism, as given by the current target" (today: DWARF on linux, SEH on msvc).

Of those present at the meeting, I believe the consensus was that we would prefer C unwind semantics. The reasoning is that it is clear that we will want some way to declare "native unwinding mechanism" eventually, and so we might as well start with that, especially because the native mechanism and the Rust mechanism are presently the same, so we're not introducing any kind of overhead in doing so. (We could always add a "C panic" ABI later, in other words, if we felt the need.) One other point is that we didn't know of any particular use case for "C panic", whereas "C unwind" has obvious use cases.

(Note that the behavior for "C unwind" would be defined by the target, and hence subject to change should target definition or compilation options change, but of course any such changes to existing targets would have to be executed very carefully. This would be analogous to adjusting our defaults to require more modern chipsets and other such changes.)

However, we also felt that this was a largely moot distinction. Obviously the two are the same now. Whatever we say today, if we decide in the future to alter the Rust unwinding and native unwinding mechanisms, then we could decide then what the meaning of "C panic" will be. (That said, as @gnzlbg and others have argued, obviously the "most compatible" thing would be for it to continue with the meaning it has today, which is effectively the native unwinding mechanism, since by definition that is what all existing code will be expecting.)

(It's also worth pointing out that @joshtriplett wasn't present at the meeting, though they've approved this message.)

On terminology

One final point. We observed that there are a lot of subtle distinctions at play here, for example between "undefined behavior" and "not yet specified" or "target dependent" behavior (and perhaps further between things like "Rust UB", "LLVM UB", and "Spec-only UB"). In terms of this RFC, it makes sense to try and carefully define all terms used. But thinking a bit more broadly it would also be good perhaps to try and document a consistent terminology that we can use across many discussions (and also to explicitly define relationships and differences between this terminology and that used in other language specifications).

@RalfJung
Copy link
Member

So this RFC goals of:

  • being minimal while combining two currently-incompatible ABIs,
  • leaving the door open for breakage while sticking to the Rust stability guarantees,

are incompatible with themselves.

I am not convinced of that. Sure, even with a "minimal RFC" code relying on "C+unwind" to match the native unwind ABI still relies on implementation-specific behavior and can thus break any time -- but that's still way better than UB! Let's take one step at a time?

stable correct working Rust code cannot break without a major Rust semver bump.

Code relying on the unwind mechanism will be as (in)correct as code relying on repr(Rust) struct layout. So we are free to break it any time. But for any specific Rust version it can still "happen to work" and there is no UB or contract violation if you fix the toolchain (and are willing to rely on undocumented unspecified behavior).

(I think that's also the gist of @nikomatsakis' message.)

C panic vs C unwind

Talking just in terms of naming, as in "panic" vs "unwind", note that there is precedence in catch_unwind. The reason is that a panic might unwind or might might abort, and if it aborts it doesn't get caught and raises no ABI concerns. So we already use "unwind" as a term to refer specifically to Rust unwinding.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 12, 2019

Sure, even with a "minimal RFC" code relying on "C+unwind" to match the native unwind ABI still relies on implementation-specific behavior and can thus break any time

The next iteration of this RFC kind of assumes that this isn't the case, and nobody has raised this issue yet, so could you explain how that could break?

Code relying on the unwind mechanism will be as (in)correct as code relying on repr(Rust) struct layout. So we are free to break it any time.

Since we haven't been able to break existing code that relies on this for landing soundness fixes, what makes you believe that we will be able to break code using this feature for landing performance improvements ?

@RalfJung
Copy link
Member

The next iteration of this RFC kind of assumes that this isn't the case, and nobody has raised this issue yet, so could you explain how that could break?

I do not understand the question. I think I basically just paraphrased what @nikomatsakis wrote, quoting: "We felt that it would be ok to declare that certain details are not yet specified, and that those details may change across compiler revisions without it being considered a breaking change"

Since we haven't been able to break existing code that relies on this for landing soundness fixes, what makes you believe that we will be able to break code using this feature for landing performance improvements ?

"Haven't been able" is putting it wrong IMO. When considering the trade-off between breaking that code and not breaking it, right now the "not breaking" seems like the better option. That might change as new evidence surfaces. And certainly whatever the trade-off is for having abort-on-panic shims has little to do with whatever the trade-off is when someone finds a new amazing unwind ABI that we'd like to use for panics instead of the native one.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 12, 2019

I do not understand the question. I think I basically just paraphrased what @nikomatsakis wrote, quoting: "We felt that it would be ok to declare that certain details are not yet specified, and that those details may change across compiler revisions without it being considered a breaking change"

By "details that are not yet specified" I understood that they mean that such an initial RFC might say things like "If a native unwind unwinds through a function frame containing types with destructors, the behavior is undefined" and whatever the implementation does when that happens might change between toolchain versions, but that's ok because we are talking about undefined behavior.

That's different from claiming that "code relying on "C+unwind" to match the native unwind ABI still relies on implementation-specific behavior and can thus break any time", since that sounds to me that it would be impossible to write code that uses a minimal "C+unwind" correctly to solve any kind of problem reliably. E.g. code using a minimal "C+unwind" feature that does not support unwinding across stack frames with types with destructors can just be carefully written to avoid such types.

And certainly whatever the trade-off is for having abort-on-panic shims has little to do with whatever the trade-off is when someone finds a new amazing unwind ABI that we'd like to use for panics instead of the native one.

Agreed that the trade-offs are very different. My opinion of the current trade off situation is that breaking code to land a critical language fix is a better reason for breaking code (independently of whether it is worth breaking that code) than doing so for landing a performance improvement. I don't really see us ever breaking any working code for landing any performance improvements.

@RalfJung
Copy link
Member

By "details that are not yet specified" I understood that they mean that such an initial RFC might say things like "If a native unwind unwinds through a function frame containing types with destructors, the behavior is undefined" and whatever the implementation does when that happens might change between toolchain versions, but that's ok because we are talking about undefined behavior.

I think Niko was pretty clear that this is not about making things UB.

That's different from claiming that "code relying on "C+unwind" to match the native unwind ABI still relies on implementation-specific behavior and can thus break any time", since that sounds to me that it would be impossible to write code that uses a minimal "C+unwind" correctly to solve any kind of problem reliably.

Your notion of "impossible" seems to be such that relying on implementation-specific (unstable) behavior is impossible. That's very black-and-white. There is a huge gray area between "stable and well-defined" and "UB", a position that you seem to ignore (based on the fact that I have to repeat myself right now). There totally is a world where we say that C+unwind supports Rust unwinding, that we are free to change that unwinding strategy any time, but that it matches native unwinding right now. There's at least 6 weeks notice between landing something in nightly and things hitting stable, so people have time to adjust their stuff if we end up actually changing it.

Maybe that's not reliably enough for you, but it's certainly more reliable than UB; so people that are willing to risk UB, I expect, will be fine with this level of reliability. I feel like you are making the perfect the enemy of the better-than-right-now.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 12, 2019

I think Niko was pretty clear that this is not about making things UB.

And that removes undefined behavior from the language.

Your notion of "impossible" seems to be such that relying on implementation-specific (unstable) behavior is impossible.

That isn't my notion of "impossible".

@gThorondorsen
Copy link

@RalfJung There totally is a world where we say that C+unwind supports Rust unwinding, that we are free to change that unwinding strategy any time, but that it matches native unwinding right now.

But in such a world, I cannot call C+unwind a stable feature. The point of this feature is to interface with foreign code, and to do that correctly it is essential to know the unwinding strategy. If that strategy can change, then any code using the feature is automatically broken anytime that happen.

In particular, one of the use cases discussed in this thread was a JIT compiler. If the unwinding strategy can change, then the code generator needs to change accordingly depending on the compiler's version. I cannot see how such a situation is compatible with Rust's stability guarantee.

That's why I think that if we ever want to have this feature stabilised we need to guarantee that, for a given target, the unwinding strategy of extern "C unwind" fns will not change. It can be implementation-defined, but it has to be stable across rustc releases.

@RalfJung
Copy link
Member

IMO if that is the standard, we should not block abort-on-panic shims on a solution. That standard requires a huge level of commitment. That makes the time window so big that it is important to avoid UB in the mean time.

The compromise was "we'll get in something simple quickly and only then re-enable the shims". If there is enough opposition to a simple solution to make that impossible, we should just re-enable the shims now.

@Centril
Copy link
Contributor

Centril commented Oct 13, 2019

That's why I think that if we ever want to have this feature stabilised we need to guarantee that, for a given target, the unwinding strategy of extern "C unwind" fns will not change. It can be implementation-defined, but it has to be stable across rustc releases.

We only have one actual Rust compiler. Therefore, I think "implementation-defined" vs. "stable" would be a distinction without much practical difference in this case. I do think that rustc / t-compiler / t-lang should be free to change the unwinding strategy including for existing important tier-1 targets if there are benefits we want (perf, improvements in C/C++, ...).

@kornelski
Copy link
Contributor

The mere possibility of changing the unwinding strategy, in some unspecified, but possibly totally incompatible way, has been mentioned many times and is paralyzing all discussions about unwinding. It's impossible to argue with the hypothetical future (and on hypothetical platforms).

  1. Is there something wrong with current unwinding strategy that would be a motivation to change it in the first place?
  2. Is there another strategy that seems viable for Rust, and would be better than the current one?
  3. And if there is, is it certainly incompatible with FFI?

I'm not aware of any plans to change unwinding. And I'm struggling to imagine why and how Rust could make a change to unwinding so massive and disruptive, that it couldn't support unwinding over C any more. Unwinding over C stack frames is really simple.

@rpjohnst
Copy link

@kornelski

It's not so hypothetical as it may sound. C++ is currently considering a new unwinding strategy similar to (but more optimized at the ABI level than) returning Result objects. And even C is considering ABI support to match.

The reasoning is that current unwinding strategies are so unpredictable and costly when they are used that many codebases simply forbid unwinding, and the Result-style version would be more widely usable. However, this is obviously an FFI-incompatible change of the sort being discussed here.

For more information: https://wg21.link/p0709r2

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 14, 2019 via email

@Ixrec
Copy link
Contributor

Ixrec commented Oct 14, 2019

As much as I like the C++ "static exceptions" proposal, I don't think it's actually relevant here. It's not proposing a new runtime unwinding mechanism or implementation strategy. Rather, it's proposing a new semantics for try/catch syntax that doesn't involve unwinding at all.

If anything, the fact that "current unwinding strategies are so unpredictable and costly when they are used that many codebases simply forbid unwinding", combined with the fact that C++ is seriously considering such invasive changes to dramatically reduce the usage of unwinding in practice, to me indicates that the trend in unwinding is not to implement it better but rather to minimize its usage.

So at least for now, I strongly agree with referring to a potentially desirable future unwinding implementation as "largely hypothetical", and that we need to find a less paralyzing way of discussing this. Personally, I don't see the problem with just saying that this scenario is so unlikely that, if it ever does come to pass, it's fine to use "heavy" migration mechanisms like a new edition or an explicit extern "C super-panic" opt-in or whatever.

@RalfJung
Copy link
Member

RalfJung commented Oct 14, 2019

That
standard level of stability that you appear to be against is the same
standard that we follow for extern „C“.

I am aware that for "normal" calls, we have much better support than what we currently seem to be willing to provide for unwinding. That seems okay though, unwinding is much more tricky / less engraved in stone than the "normal" call ABI (see the other comments re: C++ changing what they do).

Also I never said I am against proper stability. I just observe that we / the lang team do not seem ready to make the strong kind of commitment which providing that kind of stability requires. Of course the best solution would be to have something fully stable, but that just seems enough out of reach that I strongly feel it is unreasonable to block landing the abort-on-panic shim on that.

If you are really proposing lowering the standard for “C unwind”

Again, the discussion here is about a stop-gap that just needs to be "good enough" for users that currently rely on UB. My proposal (not really mine, it's the original intent of this RFC as I interpret it) raises the standard compared to the status quo. Surpassing UB in terms of reliability should really not be hard. But then each time we try that, some people make the perfect the enemy of the good, stalling progress. :(


@Ixrec

As much as I like the C++ "static exceptions" proposal, I don't think it's actually relevant here. It's not proposing a new runtime unwinding mechanism or implementation strategy. Rather, it's proposing a new semantics for try/catch syntax that doesn't involve unwinding at all.

So maybe here's a different angle for this: maybe we can say that the C+unwind ABI is "C calling convention plus whatever the platform-native unwinding mechanism is", and we guarantee that much. What we do not guarantee is that propagating panics up the stack will always be implemented using platform-native unwinding. The default panic strategy might change some day from panic=unwind to panic=exception (for the new C++ exception mechanism). If that happens, when a Rust panic leaves a C+unwind function, it cannot be propagated "normally" -- we might at that point either add an abort-on-panic shim (and offer a new C+exception ABI or so to match the new way that panics are propagated upwards), or we could translate it to platform-native unwinding at that point (and then would need a matching translation catching the unwind and turning it into exception-style panic propagation again on every Rust call site of a C+unwind function).

At that point catch_unwind would become a really bad name for a function that catches propagating panics, but... well that's on whoever picked that name I guess. ;)

So that would avoid the ABI suddenly changing under the feet of people using C+unwind. But it would also not guarantee that it will forever be the case that that code properly propagates panics, hopefully avoiding lock-in.

Now I am not an FFI author myself (I am in this because I want to fix existing crates relying on UB), so... does this make sense? (And maybe I am just paraphrasing what @Ixrec was saying, not sure.)

@kornelski
Copy link
Contributor

The proposed C++ change is an addition which with an opt-in repurposes some C++ syntax locally, but doesn't change the implementation of the existing unwinding mechanism. If C++ did it, it wouldn't affect Rust.

If Rust decided to replace existing mechanism with the new C++ approach (which is basically Result), then it could similarly still keep the existing unwinding mechanism around "C unwind" functions.

Is there anything stopping Rust from converting any future unwinding mechanism to an old one (or even setjmp/longjmp) on boundaries of "C unwind" FFI?

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 14, 2019

So maybe here's a different angle for this: maybe we can say that the C+unwind ABI is "C calling convention plus whatever the platform-native unwinding mechanism is", and we guarantee that much. What we do not guarantee is that propagating panics up the stack will always be implemented using platform-native unwinding.

That's not a different angle. That's exactly how "C unwind" is currently being proposed. I don't know to what other proposal you are referring to in your other comments (apparently not this one), but I'm glad to see that you have arrived at this conclusion.

The default panic strategy might change some day from panic=unwind to panic=exception (for the new C++ exception mechanism).

I don't think the default panic strategy (e.g. from unwind to something else, like abort, or something that doesn't exists yet) can change. What can be changed under this proposal is the implementation and the ABI of the default panic=unwind strategy.

If that happens, when a Rust panic leaves a C+unwind function, it cannot be propagated "normally" -- we might at that point either add an abort-on-panic shim (and offer a new C+exception ABI or so to match the new way that panics are propagated upwards), or we could translate it to platform-native unwinding at that point (and then would need a matching translation catching the unwind and turning it into exception-style panic propagation again on every Rust call site of a C+unwind function).

Adding an abort-on-panic-shim would break all code using panics and the "C unwind" feature, so we would need to add some other kind of shim that translates the panic to whatever ABI "C unwind" implements.

At that point catch_unwind would become a really bad name for a function that catches propagating panics, but... well that's on whoever picked that name I guess. ;)

Under this proposal Rust code can only be unwinded using the Rust unwind ABI (at the "C unwind" ABI boundary the native mechanism must be "translated" to the Rust one), so catch_unwind is still an accurate name under it.

does this make sense?

Yes. The only main realization you are missing is that, were the Rust implementation of the panic=unwind feature change in such a way that a translation is required at the "C unwind" boundary, we could, at that point, introduce a -C panic-unwind-impl=native/rust compiler flags that allows users and/or targets to select the actual implementation, eliminating such translation cost for those for which this is a problem, by potentially trading off some Rust panicking performance. As @rpjohnst mentions, unwinding is a trade-off, and current implementations make code fast if no panics happens, and very slow if panics do happen. If your application panics a lot for "reason"s, having such a-C panic-unwind-impl option that allows you to pick a different unwind implementation with different trade-offs seems reasonable to me.

@nikomatsakis
Copy link
Contributor

Dear world,

I'm writing with a quick update. As I mentioned in my update comment, the plan of record here is as follows:

  • create a "C unwind" ABI whose details are largely to be determined to start
    • the important point here is that "C unwind" provides something that existing users can migrate to where panics are "expected", even if the current details are not yet specified
    • this in turn enables us to modify extern "C" so as to abort on unwinding
    • this does not in and of itself enable any new use cases on stable; it simply allows existing, working code to continue working in an unstable state
    • like any feature, this ABI would initially be available only on nightly though it would be expected to be stabilized
  • create a project group ('working group') that can iteratively specify the many details at play here
    • once we feel we've settled enough details to make a "useful" guarantee, that would be either a separate RFC or some sort of FCP
    • these guarantees take the form of specifying details of the "C unwind" ABI for particular platforms, or of specifying use patterns that ought to work
    • you can think of this as a more evolved version of the "eRFC" concept

In fact, we've already created a repository for tracking the progress of this group. Right now that's a personal repository of mine, but it will move to rust-lang once the group is "official". The details there are still a "WIP" but hopefully it gives you some sense of the roadmap and the steps we plan to take. We're also chatting in #wg-ffi-unwind over on Zulip.

For the time being, I'm going to go ahead and close this RFC because this discussion thread has gotten long and is quite distracting. Our intention is to open a fresh RFC that declares the above steps. To be clear, I expect that RFC to move fairly quickly through the process -- I think we've had plenty of discussion on the topic and the tradeoffs involved in different strategies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.