Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaning up Mesos from paasta readthedocs - PAASTA-18313 #3954

Merged
merged 4 commits into from
Oct 16, 2024

Conversation

EmanElsaban
Copy link
Contributor

@EmanElsaban EmanElsaban commented Sep 11, 2024

This PR cleansup Mesos from paasta readthedocs -> https://paasta.readthedocs.io/en/latest/index.html and update with Kubernetes instead

@EmanElsaban EmanElsaban force-pushed the u/emanelsabban/PAASTA-18313 branch from de419bd to a06818d Compare September 11, 2024 14:15
Copy link
Member

@nemacysts nemacysts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice - thank you for going through all these docs!

**Kubernetes Horizontal Pod Autoscaler (HPA)**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It's a Kubernetes feature that automatically scales the number of pods in a deployment based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be consistent with the other definitions:

Suggested change
It's a Kubernetes feature that automatically scales the number of pods in a deployment based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
A Kubernetes feature that automatically scales the number of pods in a deployment based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).

**instancename**
~~~~~~~~~~~~~~~~

Logical collection of Mesos tasks that comprise a Marathon app. service
name + instancename = Marathon app name. Examples: main, canary.
Logical collection of Kubernetes pods that comprise a Kubernetes Deployment. service
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe something like:

Suggested change
Logical collection of Kubernetes pods that comprise a Kubernetes Deployment. service
Logical collection of Kubernetes pods that comprise an application deployed on Kubernetes. service

would be simpler for folks that don't know what a pod is? we haven't really defined what a Deployment is yet in k8s in this glossary otherwise :p

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea, will add also what a deployment is to the glossary :p

docs/source/about/glossary.rst Show resolved Hide resolved

**namespace**
~~~~~~~~~~~~~

An haproxy/SmartStack concept grouping backends that listen on a
particular port. A namespace may route to many healthy Marathon
instances. By default, the namespace in which a Marathon job appears is
particular port. A namespace may route to many healthy paaSTA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
particular port. A namespace may route to many healthy paaSTA
particular port. A namespace may route to many healthy PaaSTA

particular port. A namespace may route to many healthy Marathon
instances. By default, the namespace in which a Marathon job appears is
particular port. A namespace may route to many healthy paaSTA
instances. By default, the namespace in which a Kubernetes deployment appears is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, not sure if we should use deployment or Deployment here - or abstract this out a bit and just call this a "Kubernetes application" or "an application running on Kubernetes"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should just say "paasta instance" since also an application is not defined here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will define as well kubernetes namespace up in the glossary

launched the task may or may not decide to try and start the same task
elsewhere.
Docker container, then the container will die. Kubernetes will attempt to reschedule the pod
to maintain the desired number of replicas specified in the Deployment. For each paasta instance, a deployment is created
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
to maintain the desired number of replicas specified in the Deployment. For each paasta instance, a deployment is created
to maintain the desired number of replicas specified in the Deployment. For each PaaSTA instance, a Deployment is created

Comment on lines 165 to 167
Kubernetes supports disk resource isolation through the use of Persistent Volumes (PVs), Persistent Volume Claims (PVCs), and
storage quotas. Paasta instances pods can claim a portion of storage through PVCs. This doesn't limit the disk space that the pod can use directly but
it allows for the allocation of storage resources to pods. Disk resource is also isolated through the use of namespaces - each namespace has a set quota for the total amount of storage that can be requested by the paasta service running in it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to change this a bit: we don't have namespace-level disk limits and we generally don't encourage using PVs since PaaSTA services are ideally stateless

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked by default we do have a limit on disk per namespace but only infra engineers can set it and not service owners?

(py38) emanelsabban@dev208-uswest1adevc:/etc/paasta$ kubectl-eks-infrastage describe ns/paastasvc-paasta-contract-monitor
INFO: You are using an Okta authenticated kubectl wrapper
Name:         paastasvc-paasta-contract-monitor
Labels:       kubernetes.io/metadata.name=paastasvc-paasta-contract-monitor
              kustomize.toolkit.fluxcd.io/name=common-paasta-contract-monitor
              kustomize.toolkit.fluxcd.io/namespace=flux-system
              kustomize.toolkit.fluxcd.io/prune=disabled
              name=paastasvc-paasta-contract-monitor
              paasta.yelp.com/managed=true
              yelp.com/owner=compute_infra_platform_experience
Annotations:  <none>
Status:       Active

No resource quota.

Resource Limits
 Type       Resource           Min  Max  Default Request  Default Limit  Max Limit/Request Ratio
 ----       --------           ---  ---  ---------------  -------------  -----------------------
 Container  ephemeral-storage  -    -    1Gi              1Gi            -
 Container  memory             -    -    1Gi              1Gi            -
 Container  cpu                -    -    1                1              -

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, you've discovered our cheeky way of adding default resource limits to pods that don't define any (and have not yet gotten autotune recommendations) - these aren't really intended as hard limits, but as a way to ensure that we don't accidentally launch paasta pods with no resource requests/limits. the values in these limit ranges are also overrideable: if you set disk: 2048 in soaconfigs, the ephemeral-storage limit from the namespace is ignored :)

what we could probably do here is mention that paasta will by default apply some resource limits if none are specified, but that those defaults can be overridden in soaconfigs?

May use any URL scheme supported by Mesos's `fetcher module. <http://mesos.apache.org/documentation/latest/fetcher/>`_

Example: ``"dockercfg_location": "http://somehost/somepath"``

* ``synapse_port``: The port that haproxy-synapse exposes its status on.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, this synapse config is no longer used - but again, that's for another PR :p

reaped by Marathon. By default the healthchecks occur every 10 seconds, and a service
must fail 30 times before that task is pruned and a new one is launched in its place.
This means a task had 5 minutes by default to properly respond to its healthchecks.
Note: Kubernetes support multiple container runtimes, including Docker (via "dockershim", which is deprecated and removed as of Kubernetes v1.24), containerd, and CRI-O.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Note: Kubernetes support multiple container runtimes, including Docker (via "dockershim", which is deprecated and removed as of Kubernetes v1.24), containerd, and CRI-O.
Note: Kubernetes supports multiple container runtimes, including Docker (via "dockershim", which is deprecated and removed as of Kubernetes v1.24), containerd, and CRI-O.

must fail 30 times before that task is pruned and a new one is launched in its place.
This means a task had 5 minutes by default to properly respond to its healthchecks.
Note: Kubernetes support multiple container runtimes, including Docker (via "dockershim", which is deprecated and removed as of Kubernetes v1.24), containerd, and CRI-O.
In Yelp, we use docker and are currently in the process of migrating to containerd.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In Yelp, we use docker and are currently in the process of migrating to containerd.
At Yelp, we use docker and are currently in the process of migrating to containerd.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that said, this is almost done so we can probably leave this out (and remove the bit about k8s supporting multiple container runtimes too)

@EmanElsaban EmanElsaban force-pushed the u/emanelsabban/PAASTA-18313 branch from 4346cda to 40793f8 Compare September 12, 2024 14:21
Copy link
Member

@nemacysts nemacysts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly casing fixes :)

packaging implementation.


Making new versions
-------------------
* Make a branch. WRITE TESTS FIRST (TDD)! Add features.

* Submit your branch for review. Include the "paasta" group. Communicate with
* Submit your branch for review. Include the "PaaSTA" group. Communicate with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, this is a reference to the paasta GH team, which is all lowercase

Suggested change
* Submit your branch for review. Include the "PaaSTA" group. Communicate with
* Submit your branch for review. Include the "paasta" group. Communicate with

@@ -8,7 +8,7 @@ What are deploy groups?
A deploy group is a group of PaaSTA instances that will be deployed together.
They provide a way to control how new versions of your service are deployed to production and other environments.
The ``kubernetes-[clustername].yaml``, ``tron-[clustername].yaml``, and ``adhoc-[clustername].yaml`` files should have a ``deploy_group`` field on each instance.
The ``paasta mark-for-deployment`` command (usually run by Jenkins) operates on deploy groups -- it tells PaaSTA that you want a deploy group to run a specific version of your service.
The ``paasta mark-for-Deployment`` command (usually run by Jenkins) operates on deploy groups -- it tells PaaSTA that you want a deploy group to run a specific version of your service.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like the find-replace was a little overzealous :p

Suggested change
The ``paasta mark-for-Deployment`` command (usually run by Jenkins) operates on deploy groups -- it tells PaaSTA that you want a deploy group to run a specific version of your service.
The ``paasta mark-for-deployment`` command (usually run by Jenkins) operates on deploy groups -- it tells PaaSTA that you want a deploy group to run a specific version of your service.

@@ -41,7 +41,7 @@ Deploy groups in kubernetes/tron yamls and deploy.yaml should match

In almost all cases, you want the list of deploy groups in ``deploy.yaml`` (the ``step`` entries under ``pipeline``, except for the special build/test steps) to match the set of ``deploy_group``s defined in your kubernetes.yaml / tron.yaml / adhoc.yaml.
If an instance has a ``deploy_group`` that is not defined in deploy.yaml, or your Jenkins pipeline has not run since you added the deploy.yaml entry, PaaSTA won't know what version of your container image this instance should run.
If a deploy group is specified as a ``step`` in deploy.yaml but is not referenced in any kubernetes/adhoc/tron.yaml, this deployment step will have no effect.
If a deploy group is specified as a ``step`` in deploy.yaml but is not referenced in any kubernetes/adhoc/tron.yaml, this Deployment step will have no effect.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If a deploy group is specified as a ``step`` in deploy.yaml but is not referenced in any kubernetes/adhoc/tron.yaml, this Deployment step will have no effect.
If a deploy group is specified as a ``step`` in deploy.yaml but is not referenced in any kubernetes/adhoc/tron.yaml, this deployment step will have no effect.

(same here)

This package must be installed anywhere the PaaSTA CLI and on the Mesos/Marathon
masters. If you are using SmartStack for service discovery, then the package must
be installed on the Mesos Slaves as well so they can query the local API.
This package must be installed anywhere the PaaSTA CLI and on the kube nodes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this typo already existed, so feel free to ignore this:

Suggested change
This package must be installed anywhere the PaaSTA CLI and on the kube nodes.
This package must be installed anywhere the PaaSTA CLI is needed and on the kube nodes.

@@ -76,7 +74,7 @@ Docker and a Docker Registry

PaaSTA uses `Docker <https://www.docker.com/>`_ to build and distribute code for each service. PaaSTA
assumes that a single registry is available and that the associated components
(Docker commands, unix users, mesos slaves, etc) have the correct credentials
(Docker commands, unix users, etc) have the correct credentials
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, on another read of this, we should mention that the Kubernetes nodes need to have the correct credentials (we carried over the same host-based methods of providing creds rather than using the k8s dockerpullsecret feature

Suggested change
(Docker commands, unix users, etc) have the correct credentials
(Docker commands, unix users, Kubernetes nodes, etc) have the correct credentials


* ``lifecycle``: A dictionary of additional options that adjust the termination phase of the `pod lifecycle <https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination>`_:
* ``lifecycle``: A dictionary of additional options that adjust the termination phase of the `Pod lifecycle <https://kubernetes.io/docs/concepts/workloads/Pods/Pod-lifecycle/#Pod-termination>`_:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and same here)

@@ -826,7 +576,7 @@ If a Tron **action** of a job is of executor type ``spark``, it MAY specify the

* ``spark_args``: Dictionary of spark configurations documented in
https://spark.apache.org/docs/latest/configuration.html. Note some configurations are non-
user-editable as they will be populated by paasta tools. See
user-editable as they will be populated by PaaSTA tools. See
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
user-editable as they will be populated by PaaSTA tools. See
user-editable as they will be populated by PaaSTA. See

Comment on lines 596 to 598
* ``net``: Specify which kind of
`networking mode <https://docs.docker.com/engine/reference/run/#network-settings>`_
instances of this service should be launched using. Defaults to ``'bridge'``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* ``net``: Specify which kind of
`networking mode <https://docs.docker.com/engine/reference/run/#network-settings>`_
instances of this service should be launched using. Defaults to ``'bridge'``.
* ``net``: Specify which kind of
`networking mode <https://docs.docker.com/engine/reference/run/#network-settings>`_
adhoc containers of this service should be launched using. Defaults to ``'bridge'``.

Comment on lines 600 to 610
* ``cmd``: The command that is executed. Can be used as an alternative to
args for containers without an `entrypoint
<https://docs.docker.com/reference/builder/#entrypoint>`_. [#note]_.

* ``args``
* ``args``: An array of docker args if you use the `"entrypoint"
<https://docs.docker.com/reference/builder/#entrypoint>`_ functionality. [#note]_.

* ``deploy_group``
* ``deploy_group``: A string identifying what deploy group this instance belongs
to. The ``step`` parameter in ``deploy.yaml`` references this value
to determine the order in which to build & deploy deploy groups. Defaults to
``clustername.instancename``. See the deploy group doc_ for more information.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, are these not in the kubernetes settings? imo, we should define these there and update the old See the marathon-[clustername].yaml_ section for details for each of these parameters. text to refer to the eks/kubernetes-[clustername].yaml files

Copy link
Contributor Author

@EmanElsaban EmanElsaban Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the marathon-clustername.yaml section since we dont have these files in soaconfigs anymore and added them under adhoc-clustername.yaml since they were mentioned there but not defined

@@ -890,7 +663,7 @@ Basic HTTP and TCP options
it will generate synapse discovery files on every host, but no listening
port will be allocated. This must be unique across all environments where
PaaSTA (or synapse) runs. At Yelp, we pick from the range [19000, 21000].
Feel free to pick the next available value -- paasta fsm will do this for
Feel free to pick the next available value -- PaaSTA fsm will do this for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Feel free to pick the next available value -- PaaSTA fsm will do this for
Feel free to pick the next available value -- ``paasta fsm`` will do this for

Copy link
Member

@nemacysts nemacysts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some small tweaks - should be able to merge after these, i think :)

Comment on lines 160 to 161
Kubernetes supports disk resource isolation through the use of storage quotas. Disk resource is isolated through the use of
namespaces - PaaSTA by default apply storage resource limit for the namespace if none is specified (Note: those limits can be overridden in soaconfigs).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Kubernetes supports disk resource isolation through the use of storage quotas. Disk resource is isolated through the use of
namespaces - PaaSTA by default apply storage resource limit for the namespace if none is specified (Note: those limits can be overridden in soaconfigs).
Kubernetes supports disk resource isolation through the use of storage quotas. Kubernetes will periodically poll for usage, so it is possible to temporarily exceed the configured limit. When Kubernetes sees that a container has exceeded it's limit, it will evict (i.e., kill) the offending Pod, thereby deleting the containers filesystem and reclaiming the used disk.
NOTE: this usage calculation takes into consideration node-level container logs (i.e., container logs for stdout/stderr stored on-host to power things like ``kubectl logs``) - if an application is particularly "chatty" with its output, the ``disk`` allocation in soa-configs will need to take this into account.```

@@ -66,7 +66,7 @@ Steps below outline running PaaSTA playground components with a debugger attache

.. sourcecode:: shell

(py38-linux) user@dev55-uswest1adevc:~/pg/paasta$ KUBECONFIG=./k8s_itests/kubeconfig kubectl get pods -n paasta
(py38-linux) user@dev55-uswest1adevc:~/pg/paasta$ KUBECONFIG=./k8s_itests/kubeconfig kubectl get Pods -n paasta
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(py38-linux) user@dev55-uswest1adevc:~/pg/paasta$ KUBECONFIG=./k8s_itests/kubeconfig kubectl get Pods -n paasta
(py38-linux) user@dev55-uswest1adevc:~/pg/paasta$ KUBECONFIG=./k8s_itests/kubeconfig kubectl get pods -n paasta

a more detailed read on how this works in practice, see the docs on `isolation <isolation.html>`_.
* ``cpus``: Number of CPUs an instance needs. Defaults to 1. CPUs in Kubernetes
are "shares" and represent a minimal amount of a CPU to share with a Pod
relative to the other Pods on a host.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be nice to leave in the For a more detailed read on how this works in practice... text here and in mem

Comment on lines 56 to 57
* ``disk``: Disk (in MB) an instance needs. Defaults to 1024 (1GB). Disk limits
may or may not be enforced, but services should set their ``disk`` setting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops, didn't notice that this says that disk limits may not be enforced - they always are in kubernetes pods launched from soaconfigs :)

``tron-[clustername].yaml``
--------------------------------

This file stores configuration for periodically scheduled jobs for execution on
`Tron <https://github.com/yelp/tron>`_.

The documentation here is for the PaaSTA-specific options. For all other
The documentation here is for the paasta-specific options. For all other
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this paasta was probably fine as PaaSTA :p

@@ -842,7 +598,7 @@ The yaml where adhoc instances are defined. Top-level keys are instance names.
Each instance MAY have:

* Anything in the `Common Settings`_.

.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this . a typo?

is specified. If both are specified, an exception is thrown with an
explanation of the problem, and the program terminates.

.. _doc: deploy_groups.html
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this link seems out of place, if i'm not mistaken (i think we can delete this since we refer to the kubernetes-* section above)

@EmanElsaban EmanElsaban requested a review from nemacysts October 7, 2024 21:14
@EmanElsaban EmanElsaban merged commit f9e83e6 into master Oct 16, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants