Add support documentation and must-gather
Showing 7 changed files with 171 additions and 2 deletions.
@@ -0,0 +1,12 @@
==========================================================
Support
==========================================================

.. toctree::
   :titlesonly:
   :maxdepth: 1

   overview
   known-issues
   troubleshooting/index
   must-gather
File renamed without changes.
@@ -0,0 +1,101 @@
# Gathering data with must-gather

`must-gather` is a tool embedded in Scylla Operator that helps collect all the necessary information when something goes wrong.

The tool talks to the Kubernetes API, retrieves a predefined set of resources and saves them into a folder in your current directory.
By default, all collected Secrets are censored to avoid sending sensitive data.
That said, you can always review the archive before you attach it to an issue or your support request.

Because it needs to talk to the Kubernetes API, you must at the very least supply the `--kubeconfig` flag with a path to the kubeconfig file for your Kubernetes cluster, or set the `KUBECONFIG` environment variable.

## Running must-gather

There is more than one way to run `must-gather`.
Here are some examples of how you can run the tool.

### Prerequisites

All examples assume you have exported the `KUBECONFIG` environment variable and that it points to a kubeconfig file on your machine.
If not, you can run this command to export the common default location.
Please make sure the file exists.

```bash
export KUBECONFIG=~/.kube/config
ls -l "${KUBECONFIG}"
```

```note::
The arguments for your container tool may deviate slightly depending on the container runtime, whether you use SELinux, and similar factors.
As an example, the need for the `Z` option on volume mounts depends on whether you use SELinux and what context is applied on your file or directory.
If you get an error mentioning `Error: lsetxattr <path>: operation not supported`, try it without the `Z` option.
```

Let's also check whether your kubeconfig uses an [external authentication plugin](https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins).
You can determine that by running
```bash
kubectl config view --minify
```
and checking whether it uses an external exec plugin by looking for this pattern (containing the `exec` key)
```yaml
users:
- name: <user_name>
  user:
    exec:
```
If not, you can skip the rest of this section.
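
The check can also be scripted. The following is a minimal sketch, not something shipped with Scylla Operator: `kubeconfig_uses_exec_plugin` is a hypothetical helper name, and it assumes the YAML layout printed by `kubectl config view --minify`, where the `exec:` key sits on its own indented line.

```bash
# Hypothetical helper (an illustration, not part of Scylla Operator).
# Reads kubeconfig YAML on stdin and exits 0 when it finds a line that
# consists only of an indented `exec:` key, otherwise exits non-zero.
kubeconfig_uses_exec_plugin() {
  grep -Eq '^[[:space:]]+exec:[[:space:]]*$'
}
```

For example, `kubectl config view --minify | kubeconfig_uses_exec_plugin && echo "exec plugin in use"` prints the message only when the pattern is found. Because this is a plain text match, reviewing the output by eye remains the more reliable check.
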
If your kubeconfig depends on external binaries, you have to take a few extra steps because the external binary won't be available within our container to authenticate the requests.
Similarly to how Pods are run within Kubernetes, we'll create a dedicated ServiceAccount for must-gather and use it to run the tool.
(When you are done using it, feel free to remove the Kubernetes resources created for that purpose.)
```bash
# Create a dedicated namespace and ServiceAccount, and grant it cluster-admin.
kubectl create namespace must-gather
kubectl -n must-gather create serviceaccount must-gather
kubectl create clusterrolebinding must-gather --clusterrole=cluster-admin --serviceaccount=must-gather:must-gather
# Issue a token for the ServiceAccount that is valid for 1 hour.
export MUST_GATHER_TOKEN
MUST_GATHER_TOKEN=$( kubectl -n must-gather create token must-gather --duration=1h )
kubeconfig=$( mktemp )
# Create a copy of the existing kubeconfig and
# replace user authentication using yq, or by adjusting the fields manually.
kubectl config view --minify --raw -o yaml | yq -e '.users[0].user = {"token": env(MUST_GATHER_TOKEN)}' > "${kubeconfig}"
# Point the following commands at the token-based copy
# (KUBECONFIG is already exported, see the prerequisites).
KUBECONFIG="${kubeconfig}"
```
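
Once the data has been collected, the resources created for this purpose can be removed again. The following is a minimal sketch under stated assumptions: `cleanup_must_gather_access` is a hypothetical helper name, it relies on the `must-gather` names and the `kubeconfig` shell variable used in the commands on this page, and it expects the path of your original kubeconfig as its argument. The original kubeconfig is used on purpose, since the token-based copy loses its permissions as soon as the binding or the ServiceAccount is deleted.

```bash
# Hypothetical helper (an illustration, not part of Scylla Operator).
# Usage: cleanup_must_gather_access <path-to-original-kubeconfig>
cleanup_must_gather_access() {
  local original_kubeconfig="$1"
  # The ClusterRoleBinding is cluster-scoped, so it has to be deleted explicitly.
  kubectl --kubeconfig="${original_kubeconfig}" delete clusterrolebinding must-gather
  # Deleting the namespace also deletes the ServiceAccount created inside it.
  kubectl --kubeconfig="${original_kubeconfig}" delete namespace must-gather
  # Remove the temporary kubeconfig that embeds the token, if the variable is set.
  if [ -n "${kubeconfig:-}" ]; then
    rm -f -- "${kubeconfig}"
  fi
  unset MUST_GATHER_TOKEN
}
```

Afterwards, `KUBECONFIG` still points at the removed temporary file, so export it again with the path of your original kubeconfig.
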

```note::
If you don't have `yq` installed, you can get it at https://github.com/mikefarah/yq/#install or you can replace the user authentication settings manually.
```

### Podman
```bash
podman run -it --pull=always --rm -v="${KUBECONFIG}:/kubeconfig:ro,Z" -v="$( pwd ):/workspace:Z" --workdir=/workspace docker.io/scylladb/scylla-operator:latest must-gather --kubeconfig=/kubeconfig
```

### Docker
```bash
docker run -it --pull=always --rm -v="${KUBECONFIG}:/kubeconfig:ro" -v="$( pwd ):/workspace" --workdir=/workspace docker.io/scylladb/scylla-operator:latest must-gather --kubeconfig=/kubeconfig
```

## Limiting must-gather to a particular namespace

If you are running a large Kubernetes cluster with many ScyllaClusters, it may be useful to limit the collection of ScyllaClusters to a particular namespace.
Unless you hit scale issues, we advise against using this mode, because ScyllaClusters sometimes affect other collected resources, such as the manager, or together form a multi-datacenter cluster.

```bash
scylla-operator must-gather --namespace="<namespace_with_broken_scyllacluster>"
```

```note::
The `--namespace` flag affects only `ScyllaClusters`.
Other resources related to the operator installation or cluster state will still be collected from other namespaces.
```

### Collecting every resource in the cluster

By default, `must-gather` collects only a predefined subset of resources.
If the default set isn't enough to debug an issue, you can also request collecting every resource in the Kubernetes API.

```bash
scylla-operator must-gather --all-resources
```
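
The two commands above call the `scylla-operator` binary directly. When the tool is run through a container instead, the same flags can be appended to the container command. The following is a minimal sketch based on the Podman invocation from this page; `must_gather` is a hypothetical wrapper name, and the caveats about the `Z` volume option still apply.

```bash
# Hypothetical wrapper around the Podman command from this page.
# Any arguments are passed through to `must-gather` unchanged.
must_gather() {
  podman run -it --pull=always --rm \
    -v="${KUBECONFIG}:/kubeconfig:ro,Z" \
    -v="$( pwd ):/workspace:Z" \
    --workdir=/workspace \
    docker.io/scylladb/scylla-operator:latest \
    must-gather --kubeconfig=/kubeconfig "$@"
}
```

With this wrapper, `must_gather --namespace="<namespace_with_broken_scyllacluster>"` or `must_gather --all-resources` runs the containerized tool with the corresponding flag.
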
@@ -0,0 +1,14 @@
# Support overview

## Get support

ScyllaDB provides administrators with [paid support](https://www.scylladb.com/product/support/#enterprise-support), including support for Scylla Operator.

## Troubleshooting issues

To learn more about what to do when issues arise, visit our dedicated [troubleshooting section](troubleshooting/index).

## Gather data about your cluster

Scylla Operator contains an embedded tool called [must-gather](must-gather.md) that can collect the required information for requesting support or reporting issues.
Support requests and bug reports must include the must-gather archive to help us understand the issue.
@@ -0,0 +1,8 @@
==========================================================
Troubleshooting
==========================================================

.. toctree::
   :maxdepth: 2

   installation
@@ -0,0 +1,34 @@
# Troubleshooting installation issues

## Webhooks
Scylla Operator provides several custom API resources that use webhooks to function properly.

Unfortunately, users' clusters often have a modified SDN that doesn't extend to the control plane, so the Kubernetes apiserver is not able to reach the pods that serve the webhook traffic.
Another common cause is firewall rules that block the webhook traffic.

```note::
To be called a Kubernetes cluster, a cluster is required to pass the Kubernetes conformance test suite.
This suite includes tests that require the Kubernetes apiserver to be able to reach webhook services.
```

```note::
Before filing an issue, please make sure that webhook traffic in your cluster can reach your webhook services, independently of Scylla Operator resources.
```
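
One way to exercise the webhook path is a server-side dry run, which sends an object through admission without persisting it. The following is a rough sketch rather than an official check: `check_webhook_reachable` is a hypothetical helper name, the manifest is assumed to describe any resource that is guarded by an admission webhook in your cluster, and `failed calling webhook` is assumed to be the text the apiserver reports when it cannot reach a webhook backend.

```bash
# Hypothetical helper (an illustration, not part of Scylla Operator).
# Usage: check_webhook_reachable <manifest.yaml>
# Returns 0 when the dry run succeeds, 2 when the error output mentions
# a failed webhook call, and 1 for any other failure.
check_webhook_reachable() {
  local manifest="$1" output
  if output=$( kubectl apply --dry-run=server -f "${manifest}" 2>&1 ); then
    echo "server-side dry run succeeded"
    return 0
  fi
  printf '%s\n' "${output}" >&2
  if printf '%s\n' "${output}" | grep -q 'failed calling webhook'; then
    echo "the apiserver could not call a webhook; check your SDN and firewall rules" >&2
    return 2
  fi
  return 1
}
```

A successful dry run is only a hint: a webhook configured with `failurePolicy: Ignore` lets requests pass even when its backend is unreachable.
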

### EKS

#### Custom CNI
EKS currently breaks Kubernetes webhooks [when used with custom CNI networking](https://github.com/aws/containers-roadmap/issues/1215).

```note::
We advise you to avoid such setups and use a conformant Kubernetes cluster that supports webhooks.
```

There are some workarounds where you can reconfigure the webhook to use Ingress or hostNetwork instead, but they go beyond the standard configuration that we support and are not specific to Scylla Operator.

### GKE

#### Private clusters

If you use GKE private clusters, you need to manually configure the firewall to allow webhook traffic.
You can find more information on how to do that in the [GKE private clusters docs](https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#add_firewall_rules).