Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STCOR-868 backport idle-session timeout and fixed-length-session timeout #1507

Merged
merged 17 commits into from
Jul 26, 2024

Conversation

zburke
Copy link
Member

@zburke zburke commented Jul 25, 2024

Q shipped with a very minimal refresh-token-rotation implementation. In environments with long AT expirations, this doesn't pose much of a problem, but when sessions last less than, say, eight hours, the shortcomings become painfully apparent.

Here's a summary:

  • STCOR-776 idle session timeout
  • STCOR-787 use stripes-config::config.tenantOptions for all tenant-related data
  • STCOR-864 do not render ModuleContainer until discovery is complete [bug fix]
  • STCOR-865 consolidate /logout API calls into /logout and /logout-timeout UI routes
  • STCOR-866 start the RTR cycle when restoring an existing session [bug fix]
  • STCOR-862 fixed-length session timeout

zburke and others added 15 commits July 25, 2024 08:50
…inactivity (#1463)

Track activity (e.g. clicks, keypresses, etc), and after a period of
inactivity show a "Keep working?" modal with a countdown timer; when the
countdown reaches zero, end the session.

* Separate the RTR cycle from regular API requests. Previously, we would
  inspect the AT on each request to make sure it was valid, and fire RTR
  if the AT was getting old. This meant we had to inspect requests (to see
  if the AT was valid) and responses (to see if a 4xx failure was due to
  RTR or the API itself). Since RTR and regular API requests are now
  separate, we can assume the AT in an API request is valid and likewise
  that any error response is related to that API. Much simpler.
* Show a "Keep working?" modal after a period of inactivity (default: 1
  hour) with a countdown (default: 1 minute), and terminate the session if
  the countdown reaches 0 without the user clicking the modal's CTA.
  "Activity" defaults to `keydown` and `mousedown` events. Previously, an
  abandoned session with an expired RT showed no indication that the RT
  had expired until the user performed an API request, which would cause
  an immediate logout.
* Provide a `/logout` route, making it possible to logout by directly
  accessing that URL.
* When a session is terminated due to inactivity, redirect to
  `/logout-timeout`, a static route with a message explaining what
  happened.

Activity across tabs/windows happens via BroadcastChannel and storage
requests, both of which emit events _only_ to the channels where they
were not fired, i.e. you won't receive BroadcastChannel event on the
object where it was posted, nor a storage event in the window/tab where
the value changed.

* Confirming the "Keep working?" modal in any window keeps all windows alive.
* Activity in any window keeps all windows alive.
* Logging out in any window logs out in all windows.
* The session timeout, keep-working timeout, and the events that
  constitute "activity" have default values but may be overridden via
  `stripes.config.js`:
```
config: {
  //...
  useSecureTokens: true,
  rtr: {
    // how long before an idle session is killed? default: 60m.
    // this value must be shorter than the RT's TTL.
    // must be a string parseable by ms, e.g. 60s, 10m, 1h
    idleSessionTTL: '10m',

    // how long to show the "warning, session is idle" modal? default: 1m.
    // this value must be shorter than the idleSessionTTL.
    // must be a string parseable by ms, e.g. 60s, 10m, 1h
    idleModalTTL: '30s',

    // which events constitute "activity" that prolongs a session?
    // default: keydown, mousedown
    activityEvents: ['keydown', 'mousedown', 'wheel', 'touchstart', 'scroll'],
  }
}
```
* Turn on the logging channels `rtr` and `rtrv` (verbose)

See https://developer.mozilla.org/en-US/docs/Web/API/Broadcast_Channel_API

Refs STCOR-776. Replaces PR #1431, which implemented something similar
but with RTR still attached to regular API requests. It ... sort of
worked, but not really, and was reverted in PR #1433.

(cherry picked from commit 39d1fc9)
Whoops, missed this in #1463

Refs STCOR-776

(cherry picked from commit be7f076)
* Always populate `stripes.config.rtr.activityEvents`; if it isn't
  defined at build-time via `stripes.config.js`, populate it with values
  from `RTR_ACTIVITY_EVENTS`. Previously, it was omitted if not defined
  in `stripes.config.js`, causing the KeepWorkingModal callback to
  dispatch an empty event, dismissing the modal but failing to actually
  prolong the session since no "activity" event was fired. Whoops.
  Related, this means we must supply `activityEvents` on `stripes` in
  tests of `<SessionEventContainer>` where it is rendered directly,
  without `<Root>` as a parent to call `configureRtr()`.
* Populate `stripes.config.rtr.idleSessionTTL` and
  `stripes.config.rtr.idleModalTTL` from their corresponding constants
  instead of hard-coding magic strings in multiple places like an idiot.
* Minor test clean up. We don't need to reassign `console` methods when
  we can just run jest with `--silent`.

Refs STCOR-776

(cherry picked from commit 99b8948)
Not all authentication-related responses are created equal. If the
response contains AT expiration data, i.e. it is a response from
`/bl-users/login-with-expiry`, use that data to configure RTR. If the
response does not contain such data, i.e. it is a response from
`/bl-users/_self`, pull AT expiration data from the session and use
that. If we still don't have any AT expiration data, cross your fingers
and try rotating in 10 seconds.

Previously, we did not ever check the session, so RTR always fired 10
seconds into a resumed, which is exactly the situation faced in e2e
tests.

This doesn't precisely handle the issue faced by e2e tests that stall
when the rtr requests fires, but leveraging the session data should
push the rtr request far enough into the future that most e2e tests
will now avoid it. We should still try to understand that problem and
solve it, but in the mean time this may enable us to work around it.

Refs STCOR-776

(cherry picked from commit e93a5af)
We attempted to rebase onto master just after STCOR-776 merged (#1463).
It didn't go smoothly but the results got pushed anyway, which made
clean up tricky too. I think the changes here resolve the conflicts.

One outstanding issue I am aware of is that the `/logout-timeout`
redirect does not work correctly. When the session terminates,
`<AuthnLogin>` redirects to keycloak no matter what. It's like the
routing switch statement is falling through instead of stopping with
`<LogoutTimeout>`. That's no good, but it's less no good than the
current tip-of-branch, which doesn't redirect ever, making it impossible
to authenticate.

(cherry picked from commit a9b860d)
The `/authn/logout` request requires the `X-Okapi-Tenant` header to
succeed.

(cherry picked from commit eeaa34a)
The previous commit re-enabled logout by correctly passing the
`x-okapi-tenant` header in the `/authn/logout` request. It turns out
that if you want read the tenant from the store in a test, you have to
mock the store in your test. WHO KNEW???

(cherry picked from commit 5bc64ce)
There are many small differences in how keycloak and okapi respond to
authentication related requests.
* permissions are structured differently in Okapi between `login` and
  `_self` requests and depending on whether `expandPermissions=true` is
  present on the request; keycloak always responds with a flattened
  list.
* token expiration data is nested in the login-response in Okapi but is
  a root-level element in the `/authn/token` response from keycloak.

STCOR-776, STCOR-846

(cherry picked from commit 2e162f6)
…nantOptions in stripes.config.js (#1487)

* Retrieve clientId and tenant values from config.tenantOptions before login

* Fix tenant gathering

* Remove isSingleTenant param which is redundant

* If user object not returned from local storage, then default user from /_self response

* Update CHANGELOG.md

* Revert PreLoginLanding which uses okapi values

* Remove space

* Rework flow to immediately set config to okapi for compatibility.

* Lint fix

* Fix unit test

(cherry picked from commit e738a2f)
* Ensure okapi is being read from store after pulling from tenantOptions in AuthLogin

(cherry picked from commit eed1ba5)
Stripes should render `<ModuleContainer>` either when discovery is
complete or when okapi isn't present at all, i.e. when
`stripes.config.js` doesn't even contain an `okapi` entry. What's most
amazing about this bug is not the bug, which is a relatively simple
typo, but that it didn't bite us for more than six years.

BTOG init never conducted discovery, but _did_ pass an okapi object
during application setup, which is another way of saying that our
application didn't have anything that relied on the presence of this
bug, but our test suite did. :|

Ignore the "new" AuthnLogin test file; those tests were previously
stashed in `RootWithIntl.test.js` for some reason and have just been
relocated.

Refs STCOR-864

(cherry picked from commit 6201292)
Two things happen when idle-session-timeout kicks in:
1. the redux store is updated to clear out the session
2. the URL is updated to `/logout-timeout`

It sounds simple, but it gets messy when `<RootWithIntl>` re-renders
when the store updates because that's where routes are defined.
Previously, with event-handlers separately calling `logout()` to update
the store and `history.push()` to update the URL, you could end up in an
unexpected situation such as being logged-out before the URL updated to
`/logout-timeout`, causing the default route-match handler to kick in
and redirect to the login screen.

The changes here consolidate calls to `logout()` into the components
bound to `/logout` (`<Logout>`) and `/logout-timeout`
(`<LogoutTimeout>`). Event handlers that previously did things like
```
return logout(...)         // update redux and other storage
  .then(history.push(...)) // update URL
```

are now limited to updating the URL. This means directly accessing the
routes `/logout` and `/logout-timeout` always terminates a session, and
the logic around logout is both simpler and better contained within
components whose purpose, by dint of their names, is blindingly clear.

The minor changes in `<MainNav>` are just clean-up work, removing cruft
that is no longer in use.

Refs STCOR-865

(cherry picked from commit 8daa267)
The RTR cycle is kicked off when processing the response from an
authentication-related request. `/users-keycloak/_self` was missing
from the list, which meant that RTR would never kick off when a new tab
was opened for an existing session.

Refs STCOR-866

(cherry picked from commit f93f21d)
RTR may be implemented such that each refresh extends the session by a
fixed interval, or the session-length may be fixed causing the RT TTL to
gradually shrink until the session ends and the user is forced to
re-authenticate. This PR implements handling for the latter scenario,
showing a non-interactive "this session will expire" banner before the
session expires and then redirecting to `/logout` to clear out session
data.

By default the warning is visible for one minute. It may be changed at
build-time by setting the `stripes.config.js` value
`config.rtr.fixedLengthSessionWarningTTL` to any value parseable by
`ms()`, e.g. `30s`, `1m`, `1h`.

Cache the current path in session storage prior to a timeout-logout,
allowing the user to return directly to that page when
re-authenticating.

The "interesting" bits are mostly in `FFetch` where, in addition to
scheduling AT rotation, there are two new `setTimer()` calls to dispatch
the FLS-warning and FLS-timeout events. Handlers for these are events
are located with other RTR event handlers in  `SessionEventContainer`.

There are corresponding reducer functions in `okapiActions`. Both it and
`okapiReducer` were refactored to use constants instead of strings for
their action-types. The refactor is otherwise insignificant.

Refs STCOR-862

(cherry picked from commit 8b5274e)
Some of the commits being cherry-picked on this branch relied on other
work that wasn't directly related to idle-session timeout or
fixed-length-session timeout, and hence weren't picked onto the branch
but were, in fact necessary, for this work to function correctly.

* Restore the `/oidc-landing` route
* Pass the correct arguments to `loadResources()`

Refs STCOR-868
Copy link

github-actions bot commented Jul 25, 2024

Jest Unit Test Statistics

240 tests  +39   240 ✔️ +39   51s ⏱️ +7s
  45 suites +  7       0 💤 ±  0 
    1 files   ±  0       0 ±  0 

Results for commit 8c5b31a. ± Comparison against base commit 3fef2ba.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Jul 25, 2024

BigTest Unit Test Statistics

    1 files  ±0      1 suites  ±0   10s ⏱️ ±0s
266 tests ±0  260 ✔️ ±0  6 💤 ±0  0 ±0 
269 runs  ±0  263 ✔️ ±0  6 💤 ±0  0 ±0 

Results for commit 8c5b31a. ± Comparison against base commit 3fef2ba.

♻️ This comment has been updated with latest results.

whoops; didn't mean to include this in the previous commit
@aidynoJ
Copy link
Contributor

aidynoJ commented Jul 26, 2024

LGTM, as I understand it is a set of PRs that was already reviewed.

Copy link
Member

@ryandberger ryandberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @aidynoJ and approve. There are a few lint issues to address (mainly unused variables), but not required for merge.

Copy link

@zburke zburke merged commit 3c10177 into keycloak-quesnelia Jul 26, 2024
6 checks passed
@zburke zburke deleted the STCOR-868 branch July 26, 2024 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants