-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial sketch of fiber groups #14
base: master
Are you sure you want to change the base?
Conversation
In particular, I'm not sure about the |
BTW, is there a reason to keep the fiber pool, now that we have our own malloc? |
It appears to work in my small test ( https://github.com/Microsoft/pxt-common-packages/blob/dbg2/libs/core/hf2.cpp#L226 ), but I given how tricky these things are I would appreciate careful review. |
@finneyj could you take a look at this when you have a moment? Thanks! |
* Resume all fibers with the specific group id. | ||
*/ | ||
int fiber_resume_group(uint16_t group_id); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Can we just add a few doxygen comments in here for @param and @return? Just because our online docs scrape these to form docs like these:
https://lancaster-university.github.io/microbit-docs/
These aren't online for codal yet, but I'd like to do it here too.
move_fiber(f, &g->runQueue); | ||
return; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Assume this is to cover cases where blocked fibers are woken up, but their fiber group has been paused in the meantime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. This is when fibers wait for events - when paused, this fiber should end up in group run queue. This doesn't happen for sleeping fibers.
} | ||
|
||
__enable_irq(); | ||
|
||
// Ensure this fiber is in suitable state for reuse. | ||
f->flags = 0; | ||
f->group_id = 0; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we're assuming all fibers are "system" fibers unless otherwise stated. I think this is good - no different to current semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's right
@@ -251,8 +310,7 @@ void scheduler_tick(DeviceEvent evt) | |||
if (evt.timestamp >= f->context) | |||
{ | |||
// Wakey wakey! | |||
dequeue_fiber(f); | |||
queue_fiber(f,&runQueue); | |||
move_fiber(f, &runQueue); | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also be a queue_fiber_for_run() call? In case the sleeping fiber's group has been paused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sleeping fibers are moved to sleep queue of the paused group upon pause (mainly, so that we can modify their wake up time when unpausing). So this should never be a paused fiber.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool.
Fiber *n; | ||
for (Fiber *f = queue; f; f = n) { | ||
n = f->next; | ||
if (group_id == 0 || f->group_id == group_id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mmoskal - why the exception case for group_id == 0 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zero means to move everything, but maybe I don't strictly need it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, I see. that's fine. My bad. I misinterpreted this as (f->group_id == 0). I think this is fine - quite consistent with messageBus etc. when zero maps to ANY_EVENT etc. etc.
|
||
move_fibers(runQueue, &g->runQueue, group_id); | ||
move_fibers(sleepQueue, &g->sleepQueue, group_id); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about the waitQueue? This is where fibers blocked waiting for an event will be...
Wonders if we should leave fiber that are already blocked on the sleepQueue and waitQueue and then migrate them across to the paued queue(s) if/when wake up... I think you have code to support this code path in queue_fiber_for_run() ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you just answered this one above...
uint32_t delta = (uint32_t)(system_timer_current_time() - g->timestamp); | ||
for (Fiber *f = g->sleepQueue; f; f = f->next) { | ||
f->context += delta; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, ok. So sleep() operations are offset by a pause() operation. Essentially time stands still for paused fibers... how relativistic!
I guess the reasoning here is that this is good because it preserves relative time between all the fibers in the group that was paused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note we agreed this in initial discussions, so all good here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I think this is what user expects. when the program is paused, all user threads are paused and when we resume them the fiber_sleeps() return in the right order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, agree.
} | ||
|
||
return currentFiber; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good. Certainly great that you've factored this out into an internal function.
Fork-On-Block (FOB) is really an optimization for event handlers. Essentially new fiber contexts are lazily created when needed, which can reduce churn and RAM overhead for the common case (where event handlers are non-blocking functions).
This code path will only ever be triggered if you call pause() on a fiber that is a member of the fiber group being paused whilst it's inside an event handler... I think this is ok in principle, as it should look just like an event handler that blocks waiting for anything else, but this would be the test case to validate. I guess for you it would boil down to "can you breakpoint an event handler?"
…otta-dependency yotta: Explicitly point to master branch for dependencies
@finneyj @jamesadevine could you take a look and see if this seems to make any sense? Background, from our previous emails:
From: "Finney, Joe" j.finney@lancaster.ac.uk
Date: Tuesday, February 21, 2017 at 01:00
[snip]
General approach with scheduler sound good. I think I’d just suggest generalising a little. Instead of a boolean use/system mode, we could have a thread group id (just an int) This might be more flexible in the long run, as a notion of thread groups and rough priorities could be a good building block. Your fiber_user_pause() and fiber_user_resume() operation could turn into fiber_pause_group(int id) and fiber_resume_group() – at which point any thread groups with a given value are paused. We could have default SYSTEM and USER groups to get the behaviour you describe, but also have scope for the future.
Fiddling the scheduler wait_queue/sleep_queue to ensure dormant fibers don’t wake up prematurely should be no problem. Just need to rewrite the timing information held in the queue.
[snip]
From: Michal Moskal (Brickred Systems LLC) [mailto:v-mimosk@microsoft.com]
Sent: 19 February 2017 04:57
[snip]
In any event, when the program hits a breakpoint, I want all user code paused. In particular, if user did fiber_sleep(2000) and already 500ms passed, and then they hit a breakpoint somewhere, and finally hit resume, I want the remaining 1500ms to pass before the fiber_sleep(2000) returns. Now, there is a question what is user code. For example, on micro:bit the display driver is not user code – we want it to run when the program is stopped, otherwise it’s going to be very confusing. OTOH, the scrollString() thread would I think be considered user code. I guess most of the time, the non-user code runs on some interrupts, but it’s possible it would also run on a fiber – for example bulk transmission over MSD.
So I propose the following API (I’m not attached to the names):
void fiber_set_user_code(boolean);
void fiber_user_pause();
void fiber_user_resume();
The first one should set a flag on the fiber, to mark it as user-code. All fibers start as system, but can be marked user with this call. The second and third stop and start all user fibers. When the scheduler is user-paused, then no user code fibers run. If fiber_set_user_code(true) is called while the scheduler is user-paused, the call blocks. I would call this function upon entry to any TypeScript code. I could also expose fiber_set_user_code(false) in TS for people who know what they are doing. Similarly, fiber_user_pause() from a user fiber would block. It would be typically call on exit from an exception/interrupt handler (this is done via messing up with stuff on stack, so it’s no longer in interrupt context).
User-pausing scheduler should store current time. Resuming, should add the delta between now and stored time to wake up time of all user fibers on the sleep queue.