Feed consistent permissions #1722

Merged
merged 3 commits into from
Dec 4, 2024

Conversation

dlpzx
Contributor

@dlpzx dlpzx commented Nov 26, 2024

Feature or Bugfix

  • Feature
  • Bugfix

Detail

The Feeds module is used in the frontend in several modules. Some restrict access to admins only and some don't. In this PR we unify the behavior. ONLY ADMINS CAN SEE THE FEED in the frontend.

  • Dashboards: accessible to any user -----> add isAdmin
  • Pipelines: accessible to any user -----> add isAdmin
  • Redshift_Datasets: accessible to admin users only
  • Redshift_Tables : accessible to admin users only
  • S3_Datasets: accessible to admin users only
  • Folders: accessible to admin users only
  • Tables: accessible to admin users only

Alongside the frontend changes, the backend should follow the same logic and restrict the API calls with permission checks. This PR does that: it introduces resource permission checks that depend on the Feed targetType, using the corresponding GET_X permission.
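
To illustrate the idea, here is a minimal sketch of a targetType-to-permission lookup. It is not the code in this PR: the target-type keys, the `granted` set standing in for the resource policy store, and the exception class are assumptions made for the example. Only the GET_X permission names come from the discussion in this PR.

```python
from typing import Dict, Set, Tuple

# Hypothetical target-type keys (assumption); the values are the GET_X
# permission names mentioned in this PR's review discussion.
FEED_PERMISSIONS: Dict[str, str] = {
    "Dataset": "GET_DATASET",
    "DatasetTable": "GET_DATASET_TABLE",
    "DatasetStorageLocation": "GET_DATASET_FOLDER",
    "RedshiftDataset": "GET_REDSHIFT_DATASET",
    "RedshiftDatasetTable": "GET_REDSHIFT_DATASET_TABLE",
    "Dashboard": "GET_DASHBOARD",
    "DataPipeline": "GET_PIPELINE",
}


class UnauthorizedFeedAccess(Exception):
    """Raised when the caller lacks the GET_X permission on the target."""


def permission_for(target_type: str) -> str:
    """Return the permission name required to read the feed of a target type."""
    try:
        return FEED_PERMISSIONS[target_type]
    except KeyError:
        raise ValueError(f"Unknown feed target type: {target_type}") from None


def check_feed_access(
    granted: Set[Tuple[str, str]], target_uri: str, target_type: str
) -> None:
    """Raise unless (target_uri, required permission) is in the granted set.

    `granted` is a stand-in for the real resource policy lookup (assumption).
    """
    required = permission_for(target_type)
    if (target_uri, required) not in granted:
        raise UnauthorizedFeedAccess(f"{required} is required on {target_uri}")
```

With this shape, a caller that holds only other permissions on the resource is rejected, because the lookup is keyed on the exact GET_X permission for the target type.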

  • Add security-focused tests for unauthorized cases
Screenshot 2024-11-26 at 14 49 56

Testing

  • UI shows chat button for admins (creators or admin team) - verified in Datasets and Dashboards
  • UI does not show chat button for non-admins - verified in Datasets and Dashboards
  • Deploy in AWS
  • Call getFeed, postFeedMessage with resource admin (with GET permissions) and get feed
    • Dataset
    • Table
    • Folder
    • Redshift Dataset
    • Redshift Table
    • Dashboard
  • Call getFeed, postFeedMessage with another team not the resource admin (with UPDATE permissions) and get unauthorized response:
    • Dataset
    • Table
    • Folder
    • Redshift Dataset
    • Redshift Table
    • Dashboard

Relates

Security

Please answer the questions below briefly where applicable, or write N/A. Based on
OWASP 10.

  • Does this PR introduce or modify any input fields or queries - this includes
    fetching data from storage outside the application (e.g. a database, an S3 bucket)?
    • Is the input sanitized?
    • What precautions are you taking before deserializing the data you consume?
    • Is injection prevented by parametrizing queries?
    • Have you ensured no eval or similar functions are used?
  • Does this PR introduce any functionality or component that requires authorization?
    • How have you ensured it respects the existing AuthN/AuthZ mechanisms?
    • Are you logging failed auth attempts?
  • Are you using or adding any cryptographic features?
    • Do you use standard, proven implementations?
    • Are the used keys controlled by the customer? Where are they stored?
  • Are you introducing any new policies/roles/users?
    • Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@dlpzx dlpzx force-pushed the feat/feed-consistent-permissions branch from 0940baf to c19810e Compare November 26, 2024 10:37
@dlpzx dlpzx force-pushed the feat/feed-consistent-permissions branch 2 times, most recently from 94741de to e4672cb Compare November 26, 2024 13:14
@dlpzx dlpzx force-pushed the feat/feed-consistent-permissions branch from e4672cb to 239bcc8 Compare November 26, 2024 13:51
@dlpzx dlpzx marked this pull request as ready for review November 26, 2024 13:51
@@ -41,6 +39,15 @@ def get_feed(
    targetUri: str = None,
    targetType: str = None,
) -> Feed:
    context = get_context()
    with context.db_engine.scoped_session() as session:
        ResourcePolicyService.check_user_resource_permission(
Contributor

nit: We already use this logic in a couple of places. Is it worth exposing it as a decorator?

Contributor Author

We use it for Stacks and for Feeds, no? Where else do we have a check that uses a function for the permission_name?
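
For reference, a decorator along the lines suggested here could look roughly like the sketch below. It is only an illustration under assumptions: `requires_resource_permission`, the injected `check` callable (a stand-in for `ResourcePolicyService.check_user_resource_permission`, whose full signature is not shown in this diff), the `GRANTED` set and the `GET_<TYPE>` naming lambda are all hypothetical. The wrapper only supports keyword arguments.

```python
import functools
from typing import Any, Callable, Set, Tuple, Union

PermissionName = Union[str, Callable[..., str]]


def requires_resource_permission(
    permission_name: PermissionName,
    resource_uri_arg: str,
    check: Callable[..., None],
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Run `check` before the wrapped function (hypothetical helper).

    `permission_name` is either a fixed string or a function that derives
    the name from the call's keyword arguments.
    """

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(**kwargs: Any) -> Any:
            name = permission_name(**kwargs) if callable(permission_name) else permission_name
            check(resource_uri=kwargs[resource_uri_arg], permission_name=name)
            return fn(**kwargs)

        return wrapper

    return decorator


# Stand-in for the real permission store (assumption).
GRANTED: Set[Tuple[str, str]] = {("ds-1", "GET_DATASET")}


def fake_check(resource_uri: str, permission_name: str) -> None:
    """Raise PermissionError unless the (uri, permission) pair was granted."""
    if (resource_uri, permission_name) not in GRANTED:
        raise PermissionError(f"{permission_name} is required on {resource_uri}")


@requires_resource_permission(
    # Hypothetical naming rule: "Dataset" -> "GET_DATASET".
    permission_name=lambda targetType, **_: f"GET_{targetType.upper()}",
    resource_uri_arg="targetUri",
    check=fake_check,
)
def get_feed(targetUri: str = None, targetType: str = None) -> dict:
    return {"targetUri": targetUri, "targetType": targetType}
```

Whether this is worth it probably depends on how many resolvers end up needing a computed permission name; with only two call sites the inline check may be easier to follow.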

Contributor

@noah-paige noah-paige left a comment

A call out that this is not restricted to only Admin/Creators on the backend for all target types:

  • For Pipelines GET_PIPELINE is only given to the pipeline creator, so the feed is now restricted to creators on both FE and BE

  • For Dashboards GET_DASHBOARD is given to Dashboard creators and to groups the dashboard has been shared with (but on FE only restricted to creator)

  • For S3 Dataset GET_DATASET is given to Dataset creator, steward and env Admin (but on FE only restricted to Creator or Steward)

  • For Redshift Dataset GET_REDSHIFT_DATASET is given to Dataset creator, steward and Env Admin (but on FE only restricted to Creator or Steward)

  • For S3 Tables GET_DATASET_TABLE is given to Dataset creator, steward and shared groups (but on FE only restricted to Creator or Steward)

  • For Redshift Tables GET_REDSHIFT_DATASET_TABLE is given to Dataset creator, steward (but on FE only restricted to Creator or Steward)

  • For S3 Folders GET_DATASET_FOLDER is given to Dataset creator, steward and shared groups (but on FE only restricted to Creator or Steward)

For the following should we make it consistent on server and client side:

  • Dashboards
  • S3 Dataset
  • Redshift Dataset
  • S3 Tables
  • S3 Folders

For Redshift Tables -- should GET_REDSHIFT_DATASET_TABLE be granted on share approval, with similar changes made to the FE for the feed?
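
The comparison above can be written down as data and checked mechanically. The sketch below is only an illustration: the role labels (`creator`, `steward`, `env_admin`, `shared_groups`) and the `FeedAccess` structure are made up for the example, while the audiences themselves are transcribed from the list in this comment.

```python
from typing import Dict, FrozenSet, NamedTuple


class FeedAccess(NamedTuple):
    permission: str
    backend: FrozenSet[str]   # who is granted the GET_X permission
    frontend: FrozenSet[str]  # who sees the feed in the UI


def roles(*names: str) -> FrozenSet[str]:
    return frozenset(names)


# Role labels are illustrative; audiences follow the review comment above.
FEED_ACCESS: Dict[str, FeedAccess] = {
    "Pipelines": FeedAccess("GET_PIPELINE", roles("creator"), roles("creator")),
    "Dashboards": FeedAccess(
        "GET_DASHBOARD", roles("creator", "shared_groups"), roles("creator")
    ),
    "S3 Dataset": FeedAccess(
        "GET_DATASET", roles("creator", "steward", "env_admin"), roles("creator", "steward")
    ),
    "Redshift Dataset": FeedAccess(
        "GET_REDSHIFT_DATASET",
        roles("creator", "steward", "env_admin"),
        roles("creator", "steward"),
    ),
    "S3 Tables": FeedAccess(
        "GET_DATASET_TABLE",
        roles("creator", "steward", "shared_groups"),
        roles("creator", "steward"),
    ),
    "Redshift Tables": FeedAccess(
        "GET_REDSHIFT_DATASET_TABLE", roles("creator", "steward"), roles("creator", "steward")
    ),
    "S3 Folders": FeedAccess(
        "GET_DATASET_FOLDER",
        roles("creator", "steward", "shared_groups"),
        roles("creator", "steward"),
    ),
}


def inconsistent_targets(access: Dict[str, FeedAccess]) -> Dict[str, FrozenSet[str]]:
    """Map each target whose BE audience differs from FE to the extra BE roles."""
    return {
        name: entry.backend - entry.frontend
        for name, entry in access.items()
        if entry.backend != entry.frontend
    }
```

Calling `inconsistent_targets(FEED_ACCESS)` returns the five targets listed above (Dashboards, S3 Dataset, Redshift Dataset, S3 Tables, S3 Folders), each with the roles that can call the API but do not see the feed in the UI.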

@dlpzx
Contributor Author

dlpzx commented Nov 29, 2024

I have created a couple of reports to analyze the different behaviors. I think the full consistency analysis of the permissions is out of the scope of this PR

@noah-paige
Contributor

@dlpzx can we track an issue/story to expose feed to other non admin groups - then I will go ahead and approve this PR

@dlpzx
Contributor Author

dlpzx commented Dec 3, 2024

I opened issue #1726, but I think I will first work on clarifying the GET_DATASET, GET_TABLE behavior

@dlpzx dlpzx merged commit 5438bdb into main Dec 4, 2024
9 checks passed
@dlpzx dlpzx mentioned this pull request Dec 4, 2024
dlpzx added a commit that referenced this pull request Dec 5, 2024
@dlpzx dlpzx mentioned this pull request Dec 9, 2024
dlpzx added a commit that referenced this pull request Jan 15, 2025
### Feature or Bugfix
- Security

### Detail

### 🔐 Security
* Update sanitization technique for terms filtering by @noah-paige in
#1692 and in
#1693
* Move access logging to a separate environment logging bucket by
@noah-paige in #1695
* Add explicit token duration config for both JWTs by @noah-paige in
#1698
* Disable GraphQL introspection if prod sizing by @noah-paige in
#1704
* Add snyk workflow on schedule by @noah-paige in
#1705,
#1708,
#1713,
#1745 and in
#1746
* Unify Logger Config for Tasks by @noah-paige in
#1709
* Updating overly permissive policies tagged by checkov for environment
role using least privilege principles by @mourya-33 in
#1632

Data.all permission model has been reviewed to ensure all Mutations and
Queries have proper permissions:
* Add MANAGE_SHARES permissions by @dlpzx in
#1702
* Add permission check - is tenant to update SSM parameters API by
@dlpzx in #1714
* Add GET_SHARE_OBJECT permissions to get data filters API by @dlpzx in
#1717
* Add permissions on list datasets for env group + cosmetic S3 Datasets
by @dlpzx in #1718
* Add GET_WORKSHEET permission in RUN_SQL_QUERY by @dlpzx in
#1716
* Add permissions to Quicksight monitoring service layer by @dlpzx in
#1715
* Add LIST_ENVIRONMENT_DATASETS permission for listing shared datasets
and cleanup unused code by @dlpzx in
#1719
* Add is_owner permissions to Glossary mutations + add new integration
tests by @dlpzx in #1721
* Refactor env permissions + modify getTrustAccount by @dlpzx in
#1712
* Add Feed consistent permissions by @dlpzx in
#1722
* Add Votes consistent permissions by @dlpzx in
#1724
* Consistent get_<DATA_ASSET> permissions - Dashboards by @dlpzx in
#1729


### 🧪 Test improvements
Integration tests are in sync with `main` without 2.7 planned features.
In this PR all core modules, optional modules and submodules are tested.
That includes: tenant-permissions, omics, mlstudio, votes, notifications
and backwards compatibility of s3 shares. By @SofiaSazonova, @noah-paige,
@petrkalos and @dlpzx


In addition, the following PR adds functional tests that ensure the
permission model of data.all is not corrupted.
* ⭐ Add resource permission checks by @petrkalos in
#1711


### Dependencies
* Update FastAPI by @petrkalos in #1577 
* update fastapi dependency by @noah-paige in
#1699
* Upgrade "cross-spawn" to "7.0.5" by @dlpzx in
#1701
* Bump python runtime to bump cdk klayers cryptography version by
@noah-paige in #1707


### Relates
- List above

---------

Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Co-authored-by: Mourya Darivemula <mouryacd@amazon.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: Petros Kalos <kalosp@amazon.com>
Co-authored-by: Sofia Sazonova <sofia-s@304.ru>
Co-authored-by: Sofia Sazonova <sazonova@amazon.co.uk>