403 Forbidden XSRF cookie does not match POST argument after updating to the latest helm chart version (3.3.7) #3422

Open
matanshk opened this issue May 23, 2024 · 22 comments
@matanshk

Bug description

We are using the z2jh Helm chart on our Kubernetes cluster and upgraded it from 3.1.0 to the latest version (3.3.7).
After the upgrade finished, we started getting this error in the UI:
"403 Forbidden, XSRF cookie does not match POST argument"
The behavior is inconsistent: some people on the team always hit the error, some hit it only sometimes, and some never do. It happens only in Chrome and Firefox; Safari works fine.
Clearing cookies and using incognito mode didn't help, and updating the browsers to their newest versions changed nothing.

Screenshot 2024-05-22 at 4 18 26 PM

We never saw this issue before the upgrade. I tried downgrading the Helm chart to the previous patch releases (3.3.6, 3.3.5, 3.3.4, 3.3.3) and still got the same 403 error. When I downgraded to 3.1.0 (our version before the upgrade), the issue disappeared.

The relevant log lines are in the Logs section below.

How to reproduce

Actually, we tried our best to find a way to reproduce the issue and to trigger it for the team members who are not affected, but without success :|
What I can say is that it happens at the authentication step. It doesn't matter whether you enter a correct or a wrong username and password; you get the 403 error either way.

Expected behaviour

To get a smooth authentication process without getting the 403 Forbidden error

Actual behaviour

We get the 403 error right after clicking the "Sign in" button.

Your personal set up

We are running on an LKE cluster with Debian 11 worker nodes.
We use the NGINX ingress controller with mTLS client-certificate authentication on the ingress (I disabled mTLS for testing and nothing changed), together with the dummy authenticator and a preconfigured password.
The issue started right after the upgrade from Helm chart version 3.1.0 to 3.3.7.

Configuration
singleuser:
  events: false
  networkPolicy:
    enabled: false
  storage:
    type: dynamic
    extraLabels: {}
    extraVolumes:
      - name: sparkmagic-config
        configMap:
          name: sparkmagic-config
    extraVolumeMounts:
      - name: sparkmagic-config
        mountPath: /opt/.sparkmagic/config.json
        subPath: config.json
    static:
      pvcName:
      subPath: "{username}"
    capacity: 10Gi
    homeMountPath: /home/jovyan
    dynamic:
      storageClass:
      pvcNameTemplate: claim-{username}{servername}
      volumeNameTemplate: volume-{username}{servername}
      storageAccessModes: [ReadWriteOnce]
  extraEnv:
    SPARKMAGIC_CONF_DIR: /opt/.sparkmagic/
    SPARKMAGIC_CONF_FILE: config.json

  image:
    name: <our_custom_jupytarlab_image>
    tag: <our_custom_jupytarlab_image_tag>
    pullPolicy: Always
    pullSecrets: [ "acr-docker-auth" ]
  startTimeout: 300
  cmd: "/opt/entrypoint.sh"

proxy:
  service:
    type: ClusterIP
  chp:
    networkPolicy:
      enabled: false


hub:
  existingSecret: jupyterhub-secret-conf
  networkPolicy:
    enabled: false
  config:
    Authenticator:
      admin_users:
        - user1
        - user2
        - user3
      allowed_users:
        - user4
        - user5
    JupyterHub:
      authenticator_class: dummy

  authenticatePrometheus: true

  extraEnv:
    - name: PROMETHEUS_TOKEN
      valueFrom:
        secretKeyRef:
          name: prometheus-service-token
          key: PROMETHEUS_TOKEN

  extraConfig: 
    prometheus-service.py: |
      # Add a service "prometheus-service" to scrape prometheus metrics
      c.JupyterHub.services = [
          {
              "name": "prometheus-service",
              "api_token": os.environ["PROMETHEUS_TOKEN"]
          },
      ]

      # Add a service role to scrape prometheus metrics
      c.JupyterHub.load_roles = [
          {
              "name": "service-metrics-role",
              "description": "access metrics",
              "scopes": [
                  "read:metrics",
              ],
              "services": [
                  "prometheus-service",
              ],
          }
      ]

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: letsencrypt-production
    nginx.ingress.kubernetes.io/auth-tls-error-page: "http://www.mysite.com/error-cert.html"
    nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
    nginx.ingress.kubernetes.io/auth-tls-secret: "jupyterhub/ca-secret"
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "2"
  ingressClassName:
  pathSuffix:
  pathType: Prefix

  hosts:
    - jupyterhub.example.host.net
  tls:
    - hosts:
        - jupyterhub.example.host.net
      secretName: jupyterhub-production-tls

Logs
[D 2024-05-22 10:37:37.991 JupyterHub _xsrf_utils:155] xsrf id mismatch b'None:K_exHeY0CyJABPsBIDe7n6UIv1_upqmXywnhbOr9FIQ=' != b'None:TC8vH45MqUauWHsXz0zEsrVDFQ-Hzg0Zv3mZzYFnjls='
[I 2024-05-22 10:37:37.992 JupyterHub _xsrf_utils:125] Setting new xsrf cookie for b'None:TC8vH45MqUauWHsXz0zEsrVDFQ-Hzg0Zv3mZzYFnjls=' {'path': '/hub/', 'max_age': 3600}
[W 2024-05-22 10:37:37.992 JupyterHub web:1873] 403 POST /hub/login?next=%2Fhub%2F (10.2.13.129): XSRF cookie does not match POST argument
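
A note on reading these lines: the id after "Setting new xsrf cookie for" equals the second id in the mismatch line, so the second id appears to be the one computed for the current request and the first the one carried by the submitted token. Below is a minimal Python sketch for pulling both ids out of a hub log so they can be compared across requests. The function names are made up for this example and are not part of JupyterHub.

```python
import base64
import re

# Matches the debug line logged on an XSRF id mismatch, as in the logs above.
MISMATCH_RE = re.compile(r"xsrf id mismatch b'([^']*)' != b'([^']*)'")


def parse_xsrf_mismatch(line):
    """Return (first_id, second_id) from a mismatch log line, or None."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def split_id(xsrf_id):
    """Split an id such as 'None:<digest>' at the first colon."""
    prefix, _, digest = xsrf_id.partition(":")
    return prefix, digest


def digest_size(xsrf_id):
    """Number of raw bytes behind the URL-safe base64 digest part."""
    _, digest = split_id(xsrf_id)
    return len(base64.urlsafe_b64decode(digest))


sample = (
    "[D 2024-05-22 10:37:37.991 JupyterHub _xsrf_utils:155] xsrf id mismatch "
    "b'None:K_exHeY0CyJABPsBIDe7n6UIv1_upqmXywnhbOr9FIQ=' != "
    "b'None:TC8vH45MqUauWHsXz0zEsrVDFQ-Hzg0Zv3mZzYFnjls='"
)
ids = parse_xsrf_mismatch(sample)
```

Both digests decode to 32 bytes, which would fit a SHA-256 hash, though the log alone does not say what goes into it.
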
@matanshk matanshk added the bug label May 23, 2024

welcome bot commented May 23, 2024

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@samyuh

samyuh commented May 24, 2024

Hello!
I'm also running into this problem. I have a custom LoadBalancer service pointing to the proxy, which is defined as ClusterIP:

proxy:
  service:
      type: ClusterIP

I tried to disable the XSRF check, but without success:

extraConfig:
    myConfigName: |
      c.ServerApp.disable_check_xsrf = True
      c.JupyterHub.disable_check_xsrf = True
      print("Disabled XSRF check", flush=True)

Am I doing something wrong to disable this xsrf check?

@matanshk
Author

matanshk commented May 24, 2024

@samyuh, if I'm not mistaken, the option to configure the XSRF cookie was removed when JupyterHub 4.0.0 was released.

@samyuh

samyuh commented May 24, 2024

Oh, thanks for the information.

By the way, I double-checked and we are in fact using version 3.3.7. I will try to downgrade later today and report back on whether the bug persists.

@samyuh

samyuh commented May 24, 2024

@matanshk after the downgrade to 3.1.0 we are able to log in.

@matanshk
Author

@samyuh I'm happy to hear that. This is exactly what happened to us.

@jdicesar

Hey guys I have been fighting this on my server build as well. The downgrade to 3.1.0 also worked for me. I am using Docker Swarm as opposed to Kubernetes. Did you guys have any issues getting the singleuser servers to run after the downgrade? I matched the version for that image to jupyterhub/singleuser:3.1.0.

@matanshk
Author

@jdicesar We had no issues with the single-user server after the downgrade. I should mention that we build our single-user server image on top of the Jupyter base image.

@Khoi16

Khoi16 commented Jun 13, 2024

Has anyone fixed this bug?

@Khoi16

Khoi16 commented Jun 13, 2024

I finally found that the error appears when the domain's DNS goes through a proxy server such as Cloudflare. Can anyone explain why? Thanks.

@Khoi16

Khoi16 commented Jun 13, 2024

Oh, I found the solution. Add proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; to the ingress annotations in your z2jh values:

ingress:
  enabled: true
  # annotations: {}
  annotations:
    nginx.org/websocket-services: proxy-public
    nginx.org/server-snippets: |
      server_name asdasdasdasdsaa_test;
      location / {
        proxy_pass http://localhost:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection upgrade;
        proxy_set_header Accept-Encoding gzip;
      }

@Khoi16

Khoi16 commented Jun 22, 2024 via email

@chainlink

Also hitting this issue in our jupyterhub install.

@ScOut3R

ScOut3R commented Sep 19, 2024

I encountered this issue running JupyterHub on GKE, with the Gateway API exposing the web UI. For version 4.x, the solution was to set X-Forwarded-Host on the HTTPRoute to the public-facing host.

@samyuh

samyuh commented Sep 21, 2024

I will try to work on this once I have some free time. Was someone able to reproduce this locally?

I could only reproduce this on our preprod servers, and I don't want to debug things there.

@derekelewis

I encountered this issue when using Z2JH on EKS and an ALB as the ingress. Enabling sticky sessions fixed it for me.

@Richard-Regan

Also hitting this issue in our jupyterhub install.

Hi Chainlink, did you ever solve this, as I am getting the same issue on a clean install.

@matanshk
Author

matanshk commented Nov 6, 2024

any update about this issue?

@matanshk
Author

Hi everyone, the issue was resolved by adding this annotation to the ingress:

    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header X-Real-IP $remote_addr;

@jrdnbradford

@matanshk what's your full set of ingress annotations? I added this and still got the error.

@hnykda

hnykda commented Dec 12, 2024

With Traefik in k8s on Azure with z2jh 4.0.0, I had to do:

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: jupyterhub-headers
  namespace: jupyterhub
spec:
  headers:
    customRequestHeaders:
      X-Real-IP: ""
      X-Forwarded-Proto: "https"
      X-Forwarded-For: ""

---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: jupyterhub
  namespace: jupyterhub
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`your-domain.com`)
      kind: Rule
      services:
        - name: proxy-public
          port: 80
      middlewares:
        - name: jupyterhub-headers
  tls:
    certResolver: letsencrypt-resolver

@ianalis

ianalis commented Dec 26, 2024

So it seems the solution is for the gateway proxy to add an X-Real-IP header. For those using haproxy like myself, the configuration will be something like this:

listen jupyterhub-gpu
        ...
        http-request add-header X-Forwarded-Proto https if { ssl_fc }
        http-request add-header X-Scheme https if { ssl_fc }
        http-request add-header X-Real-IP %[src]
        ...
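
One way to see why a stable client IP header could matter is the minimal Python sketch below. It assumes that the id for a not-yet-logged-in request is a hash of the client IP and the User-Agent. That assumption fits the 32-byte digests in the logs and the fixes reported in this thread, but it has not been checked against the JupyterHub source here. `anonymous_id` and `resolve_remote_ip` are hypothetical stand-ins, and apart from 10.2.13.129 (the private address in the 403 log line) the IP addresses are made up.

```python
import base64
import hashlib


def anonymous_id(remote_ip, user_agent):
    """Hypothetical stand-in: hash client IP and User-Agent into an id."""
    hasher = hashlib.sha256()
    hasher.update(remote_ip.encode("ascii"))
    hasher.update(user_agent.encode("utf8", "replace"))
    return base64.urlsafe_b64encode(hasher.digest()).decode("ascii")


def resolve_remote_ip(headers, peer_ip):
    """Pick the client IP the way a header-trusting server might (assumption)."""
    if headers.get("X-Real-IP"):
        return headers["X-Real-IP"]
    forwarded = headers.get("X-Forwarded-For")
    if forwarded:
        # Use the last hop that was appended to the list.
        return forwarded.split(",")[-1].strip()
    return peer_ip  # no headers: all we know is the address of the proxy


user_agent = "Mozilla/5.0"

# Same browser, but the GET and the POST arrive through two different proxy pods
# and no forwarding header is set: the server sees two different addresses.
get_id = anonymous_id(resolve_remote_ip({}, "10.2.13.129"), user_agent)
post_id = anonymous_id(resolve_remote_ip({}, "10.2.14.7"), user_agent)

# Same two hops, but each proxy sets X-Real-IP to the browser's address.
real_ip = {"X-Real-IP": "203.0.113.7"}
fixed_get_id = anonymous_id(resolve_remote_ip(real_ip, "10.2.13.129"), user_agent)
fixed_post_id = anonymous_id(resolve_remote_ip(real_ip, "10.2.14.7"), user_agent)

print("without header, ids match:", get_id == post_id)
print("with X-Real-IP, ids match:", fixed_get_id == fixed_post_id)
```

Under that assumption, the token issued with the login page is tied to one address and the form submission is checked against another. This would also explain why sticky sessions helped on the ALB setup and why only some team members were affected.
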
