Skip to content
This repository has been archived by the owner on Dec 9, 2018. It is now read-only.

Add basic thread #110

Merged
merged 12 commits into from
Apr 22, 2017
Merged

Add basic thread #110

merged 12 commits into from
Apr 22, 2017

Conversation

tbu-
Copy link
Collaborator

@tbu- tbu- commented Feb 18, 2017

It supports spawn and join so far.

Only works on x86_64 so far.

CC #13.

@tbu-
Copy link
Collaborator Author

tbu- commented Feb 18, 2017

Will add the other architectures soon (unfortunately, the syscall_clone (__steed_clone) function needs to be written for each target separately). Does that look good so far? It should run and work on x86_64.

@anatol
Copy link
Contributor

anatol commented Mar 23, 2017

It seems it does not include implementation for a number of Thread functions like yield_now current sleep_ms etc..

@tbu-
Copy link
Collaborator Author

tbu- commented Mar 23, 2017

Yes. I tried to get the bare-minimum of thread support running, namely: spawning a thread. I didn't manage to on all platforms. I currently do not work on it, so feel free to pick it up if you want.

(If you do, I'd suggest you to also only try thread spawning first, do the rest later. It's hard enough. ^^ Inline assembly for architectures you have never heard before.)

@anatol
Copy link
Contributor

anatol commented Mar 23, 2017

@japaric does this PR look good? Could you please review/merge it before starting work on other thread pieces?

@tbu-
Copy link
Collaborator Author

tbu- commented Mar 24, 2017

This still needs support for the rest of the architectures before it can go forward. From looking at my commit list, at least sparc and PowerPC64 are missing.

@homunkulus
Copy link
Collaborator

☔ The latest upstream changes (presumably f1624e5) made this pull request unmergeable. Please resolve the merge conflicts.

@tbu- tbu- force-pushed the pr_thread_basic branch from 43fc651 to b943119 Compare April 14, 2017 22:52
@tbu-
Copy link
Collaborator Author

tbu- commented Apr 14, 2017

@homunkulus try

@homunkulus
Copy link
Collaborator

⌛ Trying commit b943119 with merge b943119...

Copy link
Owner

@japaric japaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, sorry for taking so long to review this.

Second, thanks @tbu- for tackling this. This is tricky stuff!

The implementation looks reasonable to me. It's a good starting point. I mainly have nits about creating issues about what still needs to be implemented.

Cargo.toml Outdated
@@ -6,7 +6,7 @@ name = "std"
version = "0.1.0"

[dependencies]
sc = "0.1.5"
sc = { git = "git://github.com/japaric/syscall.rs" }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be switched back to the latest release?

// Save `fn_` and `arg` on the child stack.
stp x0,x3,[x1,#-16]!

mov x8,#220 // CLONE
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall if it's possible or not but could NR_CLONE from the sc crate be used here (and on all the other instances) instead hardcoding the value? (I know the value is very unlikely to change; this is just for clarity and to avoid duplication)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to do that. (This value won't change, it would break a lot of programs and Linux is commited to backward-compatiblity on the syscall layer.)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that's fine.

#[naked]
#[no_mangle]
unsafe extern "C" fn __steed_clone() {
// Syscall number is passed in x8, syscall arguments in x0, x1, x2, x3, x4.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love these comments ❤️ ❤️ ❤️ . Great work, @tbu-. 👍

Some of the MUSL implementations say very little about what's going on.

}

pub unsafe fn set_thread_pointer(thread_data: *mut ()) {
let mut user_desc = linux::user_desc {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a note mentioning that the meaning of flags is explained in the asm block above?

impl ThreadId {
// Generate a new unique thread ID.
fn new() -> ThreadId {
/*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you create an issue and add a TODO about this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/libc/mod.rs Outdated
}
*/

pub unsafe fn pthread_join(pthread: pthread_t, retval: *mut *mut c_void)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation looks different from the MUSL one, where did it come from? (I checked src/thread/pthread_join.c). Regardless, it looks reasonable to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a simplified copy of __timedwait_cp from src/thread/__timedwait.c, which is called in pthread_join by MUSL.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment mentioning that file and that this is a simplified version?

pub struct Handler(());

impl Handler {
pub unsafe fn new() -> Handler {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you create an issue and add a TODO about this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -15,6 +15,10 @@ pub fn exit(code: i32) -> ! {
unsafe { linux::exit_group(code) }
}

pub fn page_size() -> usize {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you create an issue and add a TODO about this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/libc/mod.rs Outdated
}
*/

pub unsafe fn pthread_create(pthread: *mut pthread_t,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation looks different from the MUSL one, where did it come from? (I checked src/thread/pthread_create.c) Perhaps it is a simplified / minimal version of it. In any case, it looks reasonable to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a simplified version of the MUSL one, pthread_create in src/thread/pthread_create.c.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment mentioning that file and that this is a simplified version, and the rationale behind omitting some parts (like the copy_tls stuff)?

}

/*
impl Drop for Thread {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you create an issue and add a TODO about this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tbu-
Copy link
Collaborator Author

tbu- commented Apr 15, 2017

@homunkulus try

@homunkulus
Copy link
Collaborator

⌛ Trying commit 3663209 with merge 3663209...

@homunkulus
Copy link
Collaborator

💔 Test failed - status-travis

@tbu-
Copy link
Collaborator Author

tbu- commented Apr 15, 2017

It only failed on tier-2 platforms.

@homunkulus
Copy link
Collaborator

☔ The latest upstream changes (presumably #126) made this pull request unmergeable. Please resolve the merge conflicts.

@japaric
Copy link
Owner

japaric commented Apr 15, 2017

It only failed on tier-2 platforms.

Please open an issue about the targets that are missing thread support.

Ideally, we should cfg the thread code on the platforms where it's unimplemented to keep steed compiling and hello would working but panic when the thread API is used. But, if that tedious, feel free to just leave things as they are and put these targets in the allow_failures list

@tbu-
Copy link
Collaborator Author

tbu- commented Apr 15, 2017

Please open an issue about the targets that are missing thread support.

#13 (comment)

@tbu- tbu- force-pushed the pr_thread_basic branch from 3663209 to 3cfc3cc Compare April 15, 2017 14:40
@tbu-
Copy link
Collaborator Author

tbu- commented Apr 15, 2017

@homunkulus try

@homunkulus
Copy link
Collaborator

⌛ Trying commit 52f2697 with merge 52f2697...

@homunkulus
Copy link
Collaborator

💔 Test failed - status-travis

@tbu-
Copy link
Collaborator Author

tbu- commented Apr 15, 2017

@japaric It seems that the parameters to __steed_clone are optimized out when compiling in release mode. Do you have an idea on how to change that? It might be that LLVM looks at the actual symbol and sees that it doesn't declare any parameters.

@tbu-
Copy link
Collaborator Author

tbu- commented Apr 15, 2017

If I put it into a different crate and disable LTO, then it works, but if either LTO is enabled or __steed_clone is in the same crate, the optimization happens and the code doesn't work in release mode...

Not sure what to do about this.

@tbu-
Copy link
Collaborator Author

tbu- commented Apr 15, 2017

It sounds like I'm invoking UB somewhere.

@japaric
Copy link
Owner

japaric commented Apr 15, 2017

Hmmm, the problem could the extern "C" / symbol approach since in LTO there's only one object file. Perhaps you could try something like this:

fn syscall_clone(
    fn_: extern "C" fn(*mut c_void) -> *mut c_void,
    child_stack: *mut c_void,
    flags: c_ulong,
    arg: *mut c_void,
    ptid: *mut pid_t,
    newtls: *mut c_void,
    ctid: *mut pid_t,
) -> pid_t {
    unsafe {
        __steed_clone();
    }
}

the return type is going to be a problem there. Perhaps change the signature of __steed_clone to fn() -> pid_t and put an intrinsics::unreachable() at the end of __steed_clone implementation.

@tbu-
Copy link
Collaborator Author

tbu- commented Apr 15, 2017

Changing the signature of __steed_clone didn't help, the parameters still seem to be optimized out. Maybe I need to annotate the asm! macro somehow.

@japaric
Copy link
Owner

japaric commented Apr 15, 2017

the parameters still seem to be optimized out.

Maybe try something like test::black_box?

tbu- added 2 commits April 19, 2017 20:55
It supports `spawn` and `join` so far.

Only works on x86_64 so far.

CC japaric#13.
@tbu- tbu- force-pushed the pr_thread_basic branch from 52f2697 to de3632e Compare April 21, 2017 13:47
@tbu-
Copy link
Collaborator Author

tbu- commented Apr 21, 2017

I've read "This Week in Rust" and found that they just added the thing I need: global_asm! lets me write functions in pure assembly, without any added stuff by the compiler. This should fix the issues in release build.

@homunkulus try

@homunkulus
Copy link
Collaborator

⌛ Trying commit de3632e with merge de3632e...

@homunkulus
Copy link
Collaborator

💔 Test failed - status-travis

@japaric
Copy link
Owner

japaric commented Apr 21, 2017

Oh, nice. global_asm! landed 👍.

ld.lld: error: std.cgu-0.rs:(.text+0x0): duplicate symbol '__steed_clone'

My hunch tells me that adding a [cfg(not(test)] to the symbol may fix this.

This allows us to write the whole function in assembly without hacks.
@tbu- tbu- force-pushed the pr_thread_basic branch from de3632e to 835f2c2 Compare April 21, 2017 14:17
@tbu-
Copy link
Collaborator Author

tbu- commented Apr 21, 2017

Yep, also thought so.

@homunkulus try

@homunkulus
Copy link
Collaborator

⌛ Trying commit 835f2c2 with merge 835f2c2...

@homunkulus
Copy link
Collaborator

☀️ Test successful - status-travis
State: approved= try=True

@tbu-
Copy link
Collaborator Author

tbu- commented Apr 21, 2017

\o/

@@ -34,7 +32,10 @@ unsafe extern "C" fn __steed_clone() {
//
// We save `fn_` and `arg` on the child stack.

asm!("
global_asm!("
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could even commit actual .s files, if you're interested, and do global_asm!(include_str!("./asm_file.s"));

// where CLONE and EXIT are the respective syscalls. Because it mangles
// with the stack (in two processes at once), it needs to be written in the
// target's assembly.
#[link_name = "__steed_clone"]
Copy link

@mrhota mrhota Apr 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if you can achieve what @japaric suggested above with NR_CLONE by doing this:

#[no_mangle]
extern "C" fn __steed_clone(nr: usize // <-- or whatever correct type is for syscall numbers
                            fn_: extern "C" fn(*mut c_void) -> *mut c_void,
                            child_stack: *mut c_void,
                            flags: c_ulong,
                            arg: *mut c_void,
                            ptid: *mut pid_t,
                            newtls: *mut c_void,
                            ctid: *mut pid_t) -> pid_t;

#[inline] // or whatever
fn syscall_clone(fn_: extern "C" fn(*mut c_void) -> *mut c_void,
                 child_stack: *mut c_void,
                 flags: c_ulong,
                 arg: *mut c_void,
                 ptid: *mut pid_t,
                 newtls: *mut c_void,
                 ctid: *mut pid_t) -> pid_t {
    // assuming the first arg below is correct...
    __steed_clone(sc::nr::CLONE, fn_, child_stack, flags, arg, ptid, newtls, ctid)
}

Now, in the assembly, can't you retrieve the syscall number from rax or whatever the correct register is?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be possible, but not zero-cost. :) It would actually pass that parameter at runtime in this case.

@japaric japaric merged commit 38c9086 into japaric:master Apr 22, 2017
@japaric
Copy link
Owner

japaric commented Apr 22, 2017

Excellent work, @tbu-! 💯

@anatol
Copy link
Contributor

anatol commented May 5, 2017

thread::current is needed for reentrant mutexes. @tbu- do you plan to implement thread::current?

@tbu-
Copy link
Collaborator Author

tbu- commented May 5, 2017

Yes, I do. However, I'm also implementing mutexes, I hope we're not duplicating work.

@anatol
Copy link
Contributor

anatol commented May 5, 2017

anatol@2e05d1a

@tbu-
Copy link
Collaborator Author

tbu- commented May 5, 2017

tbu-@b1a8b91 :/

How did you do it? I copied the stuff from parking_lot.

@anatol
Copy link
Contributor

anatol commented May 5, 2017

I implemented rwlock/mutex using atomics, somewhat similar to musl implementation. https://github.com/anatol/steed/blob/2e05d1ac3ef94fe91d0bfb1e2b6eccf281c6f419/src/sys/linux/mutex.rs . Reentrant mutexes fail to compile as thread::current is missing. Also I need to review memory ordering - currently I use Ordering::Relaxing and it works on x86_64 but my gut feeling says it might not be enough for architectures with weak memory models.

I never used parking_lot but their implementation looks too complicated to me.

@japaric
Copy link
Owner

japaric commented May 6, 2017

I'd personally prefer to use parking-lot's implementation. It's been out there (and been tested) for a while and have quite a few advantages over libc's, like a const-fn constructors and smaller memory usage.

@anatol
Copy link
Contributor

anatol commented May 6, 2017

I guess it it matter of priorities but it makes sense to avoid complexity and start with a simple textbook implementation for synchronization. The simple atomics based solution is widely known and used in many places, including musl.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants