pod/perlguts pod/perlhacktips - various updates and new content #22881

Open: wants to merge 1 commit into base: blead


bulk88 (Contributor) commented Jan 1, 2025

See commit.


  • This set of changes does not require a perldelta entry.

Comment on lines +123 to +148
For 5.21.6 and up, the way to add a new interpreter global variable while
hacking on the interpreter, without recompiling XS, is to rename, repurpose, or
turn into a union an existing variable from F<intrpvar.h>, without changing its
size, alignment, or offset.
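
The union trick described in this hunk can be sketched in plain C. This is only an illustration under stated assumptions: `struct interp_before`, `struct interp_after`, and the member names are hypothetical stand-ins, not the real contents of intrpvar.h. The static assertions show one way to verify at compile time that size, alignment, and offsets did not move.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout; stands in for the interpreter struct. */
struct interp_before {
    void    *first;
    int32_t  old_counter;   /* the existing variable being repurposed */
    void    *last;
};

/* Hypothetical "after" layout: the same slot is now a union, so the old
 * name and a new experimental name share the same storage. */
struct interp_after {
    void *first;
    union {
        int32_t old_counter;
        int32_t new_flags;  /* new use of the old slot */
    } slot;
    void *last;
};

/* Compile-time checks: nothing visible to already-compiled code moved. */
static_assert(sizeof(struct interp_before) == sizeof(struct interp_after),
              "total size changed");
static_assert(_Alignof(struct interp_before) == _Alignof(struct interp_after),
              "alignment changed");
static_assert(offsetof(struct interp_before, old_counter)
                  == offsetof(struct interp_after, slot),
              "offset of the repurposed slot changed");
static_assert(offsetof(struct interp_before, last)
                  == offsetof(struct interp_after, last),
              "offset of a later member changed");

/* Run-time version of the same checks, convenient for a test harness. */
static int layout_unchanged(void)
{
    return sizeof(struct interp_before) == sizeof(struct interp_after)
        && offsetof(struct interp_before, old_counter)
               == offsetof(struct interp_after, slot)
        && offsetof(struct interp_before, last)
               == offsetof(struct interp_after, last);
}
```

If any assertion fires, the change is no longer layout-compatible and code compiled against the old layout would have to be rebuilt.
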
Contributor

Preferably we'd fix this bug in the XS handshake, so that maint can add new members to the end of the interpreter structure.

bulk88 (Contributor Author) commented Jan 11, 2025

We would need two numbers/constants then: the size of a fictional "ABI" my_perl struct that is frozen-ish, and the size of the runtime my_perl struct, which has its normal CC-computed length. But if the == check changes to >=, it won't detect changes to interp fields like:

https://github.com/Perl/perl5/blob/blead/intrpvar.h#L277

https://github.com/Perl/perl5/blob/blead/intrpvar.h#L520 (32b time_t vs 64b , windows mingw gcc hot mess 10-15 years ago)

https://github.com/Perl/perl5/blob/3a16cb5fb8114b7773a9c488c78684bffd787b62/intrpvar.h#L749C28-L749C37

That last one is a big multi-member struct; the spec says the type has UB contents and UB size (the OS vendor's choice):

https://codebrowser.dev/glibc/glibc/locale/bits/types/__locale_t.h.html#__locale_struct

An example of a pre-XS-handshake SEGV: a compiler upgrade made glibc.so and the standard C .h files on disk disagree, which SEGVed the person's newer XS modules, but not his interp or his old XS .so binaries:

https://bugzilla.redhat.com/show_bug.cgi?id=1064271#c34
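
The trade-off between the == and >= checks described above can be sketched as follows. This is a simplified illustration, not the real XS handshake code: the three structs and both `handshake_*` functions are hypothetical, and the check is reduced to a comparison of struct sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout the XS module was compiled against. */
struct interp_compiled { int32_t timestamp; int32_t depth; };

/* Runtime layout with a member appended at the end (the harmless case). */
struct interp_appended { int32_t timestamp; int32_t depth; int32_t extra; };

/* Runtime layout where an early member got wider (e.g. a 32-bit vs 64-bit
 * time type), which shifts every later member. */
struct interp_widened  { int64_t timestamp; int32_t depth; };

/* Strict check: any size difference is rejected. */
static int handshake_exact(size_t compiled, size_t runtime)
{
    return compiled == runtime;
}

/* Relaxed check: the runtime struct may be larger than the compiled one. */
static int handshake_at_least(size_t compiled, size_t runtime)
{
    return runtime >= compiled;
}
```

With these definitions the relaxed check accepts `interp_appended`, which is the goal, but it also accepts `interp_widened`, where `depth` no longer sits at the offset the module expects. The strict check rejects both.
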

Also, it might be time for the XS handshake, and your new reverse version of it (I can crash the sample if perl.dll and the host exe don't agree on the my_perl struct), to pass a better file name than `__FILE__`, or to allow a trappable fatal error.

A very, very tiny API would be needed to throw a "portable" die to the root interp from an accidental 2nd interp/2nd libperl, or from the BOOT: XSUB of a broken 3rd-party XS .so/.dll.

void arg2->cantAttach(void);

setjmp buffers and libc's longjmp() internals have an above-average risk of SEGVing. Remember Win32/64's eternal problem of multiple CRT DLLs in one process (5.41.8 has it right now: ws2_32.dll links against msvcrt.dll instead of linking against ntdll.dll for libc basics). So either a tiny API like the one above, or just rename XS___Module__bootstrap to XS___Module__bootstrap_v4108_A1F0 and move the handshake key into the symbol table of the .so/.dll.

My wishlist has an idea to remove the one-function/one-symbol WinOS .dll export table from XS completely. Just have DynaLoader call DllMain with DWORD fdwReason *(U32)"PERL".

Debugging an XS handshake failure is harder than average: either you write code to print the offsets of all fields and text-diff that, or you generate a .i file, format it, cut out the interp struct, and text-diff it. For me it is most often an Inline::C/EU::MM -DFLAGS/CCFLAGS problem. Even more sinister, the last time I did the text diff both interp structs had the same line count! It was a U16 vs U64 problem, and both aligned (with a hole) to 64 bits, so even the XS handshake didn't catch it. The interp structs were the same total size after standard C integer type alignment.

I will change that area of text to say the correct/intended design is adding members to the bottom in maint, but currently it's not possible.
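
The U16 vs U64 case and the "print the offsets of all fields and text-diff them" approach can be sketched like this. Both structs are hypothetical stand-ins for the interp struct as seen by two builds with different flags. The equal total size assumes a typical 64-bit ABI where `uint64_t` is 8-byte aligned; other ABIs may pad differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: the "same" struct as seen by two builds that disagree
 * about the width of the flags member. */
struct interp_build_a { void *stack; uint16_t flags; uint64_t counter; };
struct interp_build_b { void *stack; uint64_t flags; uint64_t counter; };

struct field_info { const char *name; size_t offset; size_t size; };

/* Record name, offset and size of one member. */
#define FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *)0)->member) }

static const struct field_info layout_a[] = {
    FIELD(struct interp_build_a, stack),
    FIELD(struct interp_build_a, flags),
    FIELD(struct interp_build_a, counter),
};
static const struct field_info layout_b[] = {
    FIELD(struct interp_build_b, stack),
    FIELD(struct interp_build_b, flags),
    FIELD(struct interp_build_b, counter),
};
#define N_FIELDS (sizeof(layout_a) / sizeof(layout_a[0]))

/* Size-only comparison: blind to padding holes that absorb a change. */
static int sizes_match(void)
{
    return sizeof(struct interp_build_a) == sizeof(struct interp_build_b);
}

/* Field-by-field comparison of offset and size. */
static int layouts_match(void)
{
    for (size_t i = 0; i < N_FIELDS; i++) {
        if (layout_a[i].offset != layout_b[i].offset
            || layout_a[i].size != layout_b[i].size)
            return 0;
    }
    return 1;
}

/* One line per field, so the output of two builds can be text-diffed. */
static void print_layout(FILE *out, const struct field_info *f, size_t n)
{
    for (size_t i = 0; i < n; i++)
        fprintf(out, "%-8s offset=%zu size=%zu\n",
                f[i].name, f[i].offset, f[i].size);
}
```

Calling `print_layout(stdout, layout_a, N_FIELDS)` in each build and diffing the two outputs would show the `flags` line differing even where the total sizes agree.
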

bulk88 (Contributor Author) commented Jan 16, 2025

Repushed. Added some sentences saying that expanding the end of the interp struct in a maint release should be brought back one day, if someone can write better code than the current implementation that is equally resistant against self-built perls in /home, .rc mistakes, copy-pasted binaries, incompatible CC upgrades, and really bad Makefile.PLs on CPAN that don't work outside of their author's box.

Added some stories about extreme platforms perl runs on, since IT is more diverse than "Ubuntu on x64", and advice to think about future tech changes long after you wrote the code.

Comment on lines -2946 to +2973
compatibility at the expense of performance. (Passing an arg is
cheaper than grabbing it from thread-local storage.)
compatibility at the expense of performance. Passing an arg is
much cheaper and faster than grabbing it from the OS's thread-local
storage API with function calls.

But consider this: if there is a choice between C<Perl_croak> and
C<Perl_croak_nocontext>, which one do you pick? Which one is
more efficient? Is it even possible to make the C<if(assert_failed)> test true
and enter the conditional branch with C<Perl_croak>?

Maybe only from a test file. Maybe not. Your C<Perl_croak> branch is probably
unreachable until you add a new bug. So the performance of
C<Perl_croak_nocontext> compared to C<Perl_croak> doesn't matter. The C<dTHX;>
call inside the slower C<Perl_croak_nocontext> will never execute in anyone's
normal control flow. If the error branch never executes, optimize what does
execute. By removing the C<aTHX> arg, you save 4-12 bytes of space and 1-3 CPU
assembly ops on a cold branch, by pushing one less variable onto the C stack
inside the call expression invoking C<Perl_croak_nocontext> instead of
C<Perl_croak>. The CPU has less to jump over now.

The rationale that C<Perl_croak_nocontext> is better than C<Perl_croak> applies
only to C<Perl_croak>, and nowhere else except for the deprecated
C<Perl_die_nocontext>/C<Perl_die> pair and, as a third case, C<Perl_warn>.
C<Perl_warn> is debatable.

It doesn't apply to C<Perl_form> C<Perl_mess> or keyword
C<Perl_op_die(OP * op)>, which could be normal control flow.
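
The trade-off in this hunk can be sketched outside of Perl. This is a toy model, not the real API: `struct sketch_interp` and every `sketch_*` function are hypothetical, and unlike the real C<Perl_croak> the sketch merely records the message and returns. The point is only that the "nocontext" variant moves the context lookup out of the caller and into the cold error path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the interpreter context (my_perl). */
struct sketch_interp { const char *last_error; };

/* One "current interpreter" pointer per thread (C11 thread-local). */
static _Thread_local struct sketch_interp *current_interp;

static void sketch_set_context(struct sketch_interp *interp)
{
    current_interp = interp;
}

/* Context-passing variant: every caller must supply the context. */
static void sketch_croak(struct sketch_interp *interp, const char *msg)
{
    interp->last_error = msg;
}

/* No-context variant: the lookup (the role dTHX plays) happens here,
 * so it only costs anything if the error path is actually taken. */
static void sketch_croak_nocontext(const char *msg)
{
    struct sketch_interp *interp = current_interp;
    sketch_croak(interp, msg);
}

/* A caller whose error branch is cold: no context argument is prepared
 * at the call site. */
static int sketch_checked_divide(int a, int b, int *out)
{
    if (b == 0) {
        sketch_croak_nocontext("division by zero");
        return -1;
    }
    *out = a / b;
    return 0;
}
```

On the normal path `sketch_checked_divide` never touches thread-local storage; the lookup happens only when the error branch runs.
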
Contributor

I do think this part could be briefer.

Maybe just:

For calls to C<Perl_croak>, which are typically on a cold code path, it's often better to call the C<Perl_croak_nocontext> version of the function to save passing that extra argument.

I don't think this is so important that we need this much extra text talking about it, it's worth mentioning, but not worth spending too much text on.

As to pushing on the stack, most Perl_croak() calls don't have a lot of arguments. They're going to be pushed onto the stack for i386, but x86_64 passes at least 4 integer/pointer arguments in registers (4 on Windows, 6 everywhere else AFAIK), and other architectures even more (8 for riscv and arm), so I don't think it's worth mentioning.

Comment on lines +75 to +87
The interpreter currently does not use any atomic intrinsic functions offered
by a C compiler. Instead, Perl's thread-safe serialization is done with an
internal API with names like C<MUTEX_INIT()> and C<MUTEX_LOCK()>.

Historically, atomic operations didn't exist on most CPU archs that Perl uses.
If they existed, atomic APIs were always OS and vendor specific, and never
portable.

As of 5.35.5, perl dropped support for a strict C89 compiler and moved to
a minimum requirement of C89+some C99. See L</C99>. C11 standardized some
atomics for the first time in the optionally implemented C<stdatomic.h>.
Patches are welcome to add a portable atomic API, with fallbacks to
C<MUTEX_LOCK()>.
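
As a rough sketch of what "a portable atomic API, with fallbacks" to a mutex could look like, the following uses C11 `stdatomic.h` when the compiler offers it and a POSIX mutex otherwise. Everything named `sketch_*` is hypothetical; nothing like this exists in the Perl source, and a real patch would presumably fall back to C<MUTEX_LOCK()> rather than calling pthreads directly.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
    && !defined(__STDC_NO_ATOMICS__)
/* C11 atomics are available: use them directly. */
#  include <stdatomic.h>

typedef struct { atomic_size_t value; } sketch_counter;

static void sketch_counter_init(sketch_counter *c)
{
    atomic_init(&c->value, 0);
}
static size_t sketch_counter_inc(sketch_counter *c)
{
    /* atomic_fetch_add returns the old value; return the new one. */
    return atomic_fetch_add(&c->value, 1) + 1;
}
static size_t sketch_counter_get(sketch_counter *c)
{
    return atomic_load(&c->value);
}

#else
/* Fallback: serialize every access with a mutex. */
#  include <pthread.h>

typedef struct { pthread_mutex_t lock; size_t value; } sketch_counter;

static void sketch_counter_init(sketch_counter *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->value = 0;
}
static size_t sketch_counter_inc(sketch_counter *c)
{
    size_t now;
    pthread_mutex_lock(&c->lock);
    now = ++c->value;
    pthread_mutex_unlock(&c->lock);
    return now;
}
static size_t sketch_counter_get(sketch_counter *c)
{
    size_t now;
    pthread_mutex_lock(&c->lock);
    now = c->value;
    pthread_mutex_unlock(&c->lock);
    return now;
}
#endif
```

Callers see the same three functions either way, which is the usual reason for hiding such a choice behind macros or a thin wrapper.
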
Contributor

This discussion of atomics doesn't really belong in COMMON PROBLEMS.

I'm not sure this discussion of atomics belongs here at all - nothing uses it. We might optionally use atomics in the future, but it's pretty much vapourware at this point. (also: C++ builds, and C++ header compatibility, yay)

If we do start using atomics it might be worth mentioning, but any use is likely to be behind macros anyway.

Comment on lines +89 to +97
The right way to introduce a new C global variable will usually be to add
it as a new interpreter variable. See F<intrpvar.h>. Since 5.10.0, adding,
removing, or changing the size of any interpreter variable is not supported
and is undefined behavior. Recompiling XS modules is required.

There are some loopholes to this policy if you are writing unstable
experiments. These loopholes can never be used in stable code, whether in the
interpreter or in XS modules. The loopholes may temporarily work, just long
enough to finish the experiment. Remember, failure to get a C<SEGV>, or
Contributor

This (all the additional interpreter variable text) is really much too verbose.

A lot of the text is about the incompatibility when adding interpreter variables, but in most cases someone hacking on perl itself is working on blead - the incompatibility when hacking on perl is only an issue when backporting changes to maint.
