[occm] Multi region openstack cluster #2595

Open
wants to merge 1 commit into
base: master

Conversation

sergelogvinov
Contributor

@sergelogvinov sergelogvinov commented May 14, 2024

What this PR does / why we need it:

Adds multi-region support to the OpenStack CCM for deployments that share a single identity provider.
The OpenStack cluster includes a single Keystone service and multiple Nova, Cinder, and Neutron services grouped by region.

[Image: openstack-regional]

Which issue this PR fixes(if applicable):
fixes #1924

Special notes for reviewers:

CCM config changes:
The region parameter is still required, as before, and is used as the cluster's default region.
The regions parameter can be set multiple times; its values are merged with the region parameter, so the value of region does not have to appear in the list of regions.

[Global]
auth-url=https://auth.openstack.example.com/v3/
region=REGION1
# new param 'regions' can be specified multiple times
regions=REGION1
regions=REGION2
regions=REGION3
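
The merge rule described above can be sketched in Go as follows. This is a minimal illustration, not the code from the PR: the helper name `mergeRegions` and the choice to put the default region first are assumptions.

```go
package main

import "fmt"

// mergeRegions returns the default region followed by every other configured
// region, dropping duplicates and empty entries. The name and the ordering
// are assumptions made for this sketch.
func mergeRegions(defaultRegion string, regions []string) []string {
	merged := make([]string, 0, len(regions)+1)
	seen := make(map[string]bool, len(regions)+1)
	for _, r := range append([]string{defaultRegion}, regions...) {
		if r == "" || seen[r] {
			continue
		}
		seen[r] = true
		merged = append(merged, r)
	}
	return merged
}

func main() {
	// The default region is missing from the list but is still included.
	fmt.Println(mergeRegions("REGION1", []string{"REGION2", "REGION3", "REGION2"}))
	// prints: [REGION1 REGION2 REGION3]
}
```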

Optionally, the regions can be set in clouds.yaml instead. Only one auth service (Keystone) is supported.

clouds:
  kubernetes:
    auth:
      auth_url: https://auth.openstack.example.com/v3
    region_name: "REGION1"
    regions:
      - REGION1
      - REGION2
      - REGION3

During the initialization process, OCCM checks for the existence of providerID. If providerID does not exist, it defaults to using node.name, as it did previously. Additionally, if the node has the label topology.kubernetes.io/region, OCCM will prioritize using this region as the first one to check. This approach ensures that in the event of a region outage, OCCM can continue to function.

In addition, we can assist CCM in locating the node by providing kubelet parameters:

  • --provider-id=openstack:///$InstanceID - InstanceID exists in metadata
  • --provider-id=openstack://$REGION/$InstanceID - if you can define the region (by default meta server does not have this information)
  • --node-labels=topology.kubernetes.io/region=$REGION set preferred REGION in label, OCCM will then prioritize searching for the node in this specified region

The OCCM sets ProviderID:

Multi-region OCCM works with or without the environment variable OS_CCM_REGIONAL=true.

Release note:

[OCCM] Support multi-region clusters with a single Keystone deployment

@k8s-ci-robot k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label May 14, 2024
@k8s-ci-robot k8s-ci-robot requested review from anguslees and dulek May 14, 2024 14:42
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 14, 2024
@k8s-ci-robot
Contributor

Hi @sergelogvinov. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 14, 2024
@sergelogvinov
Contributor Author

@mdbooth can you take a look at this PR?
I probably need to add more configuration checks.

Thanks.

Contributor

@mdbooth mdbooth left a comment

I haven't taken a deep look into this, but I much prefer this in principle: it only tells cloud-provider things we know to be true. This makes me much more confident that this will continue to work correctly as cloud-provider evolves.

Would you still run multiple CCMs, or switch to a single active CCM?

Comment on lines 60 to 75
	for _, region := range os.regions {
		opt := os.epOpts
		opt.Region = region

		compute[region], err = client.NewComputeV2(os.provider, opt)
		if err != nil {
			klog.Errorf("unable to access compute v2 API : %v", err)
			return nil, false
		}

		network[region], err = client.NewNetworkV2(os.provider, opt)
		if err != nil {
			klog.Errorf("unable to access network v2 API : %v", err)
			return nil, false
		}
Contributor

@pierreprinetti how much work is performed when initialising a new service client? Is it local-only, or do we have to go back to keystone?

I might be inclined to initialise this lazily anyway, tbh.

Contributor

Similar thought: maybe defer initialising them until they are actually used?

Contributor Author

I ran into an issue with lazy initialization in Proxmox: a configured region may not exist, yet OCCM starts without errors during rollout, so the Kubernetes administrator will think the configuration is correct.

So we can check all regions here and crash if needed. What do you think?

Member

Apologies for the late response. Building a ProviderClient requires a Keystone roundtrip; building ServiceClients is cheap.
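
The trade-off discussed in this thread (lazy creation versus failing fast on a misconfigured region) can be sketched as a per-region client cache with an optional startup check. This is only an illustration under stated assumptions: `regionClients`, `get`, `validate` and `errNoEndpoint` are hypothetical names, and `newClient` stands in for a real constructor such as `client.NewComputeV2`, whose behaviour is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNoEndpoint is a placeholder error used by the example constructor.
var errNoEndpoint = errors.New("no endpoint found")

// regionClients caches one client of type C per region and builds each
// client on first use. newClient is a stand-in for a service client
// constructor; it is an assumption of this sketch.
type regionClients[C any] struct {
	mu        sync.Mutex
	clients   map[string]C
	newClient func(region string) (C, error)
}

func newRegionClients[C any](newClient func(region string) (C, error)) *regionClients[C] {
	return &regionClients[C]{clients: make(map[string]C), newClient: newClient}
}

// get returns the cached client for region, creating it if necessary.
func (rc *regionClients[C]) get(region string) (C, error) {
	rc.mu.Lock()
	defer rc.mu.Unlock()
	if c, ok := rc.clients[region]; ok {
		return c, nil
	}
	c, err := rc.newClient(region)
	if err != nil {
		var zero C
		return zero, fmt.Errorf("region %q: %w", region, err)
	}
	rc.clients[region] = c
	return c, nil
}

// validate builds a client for every configured region, so a region that
// does not exist is reported at startup instead of on first use.
func (rc *regionClients[C]) validate(regions []string) error {
	for _, region := range regions {
		if _, err := rc.get(region); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	known := map[string]bool{"REGION1": true, "REGION2": true}
	rc := newRegionClients(func(region string) (string, error) {
		if !known[region] {
			return "", errNoEndpoint
		}
		return "compute@" + region, nil
	})
	fmt.Println(rc.validate([]string{"REGION1", "REGION2"}))
	fmt.Println(rc.validate([]string{"REGION1", "REGION3"}))
}
```

Calling `validate` once at startup keeps the fail-fast behaviour the author asks for, while `get` still allows clients to be created on demand afterwards.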

Comment on lines 178 to 188
if node.Spec.ProviderID == "" {
return i.getInstanceByName(node)
}

instanceID, instanceRegion, err := instanceIDFromProviderID(node.Spec.ProviderID)
if err != nil {
return nil, err
return nil, "", err
}

if instanceRegion == "" {
return i.getInstanceByID(instanceID, node.Labels[v1.LabelTopologyRegion])
Contributor

I'd probably be a bit more explicit with where we're looking for stuff here. IIUC there are 2 possible places we can get a specific region from:

  • providerID
  • LabelTopologyRegion

Both may be unset because the node has not yet been adopted by the node-controller.

providerID may not contain a region either because it was set before we became multi-region, or because it was set by kubelet without a region and it's immutable.

But the end result is that either we know the region or we don't. If we know the region we should look only in that region. If we don't know the region we should look everywhere.

How about logic something like:

instanceID, instanceRegion, err := instanceIDFromProviderID(node.Spec.ProviderID)
..err omitted...

if instanceRegion == "" {
  instanceRegion = node.Labels[v1.LabelTopologyRegion]
}

var searchRegions []string
if instanceRegion != "" {
  if !slices.Contains(i.regions, instanceRegion) {
    return ...bad region error...
  }
  searchRegions = []string{instanceRegion}
} else {
  searchRegions = ..all the regions, preferred first...
}

for _, region := range searchRegions {
  mc := ...
  if instanceID != "" {
    getInstanceByID()
  } else {
    getInstanceByName()
  }
  mc.ObserveRequest()
}
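
The region-selection part of the logic above can be written as a small runnable Go function. This is a sketch of the reviewer's proposal only: the helper name `searchRegions` and its parameters are assumptions, and the actual instance lookups and metrics calls are left out.

```go
package main

import (
	"fmt"
	"slices"
)

// searchRegions decides where to look for an instance. If a region is known
// from the provider ID or, failing that, from the region label, only that
// region is searched; otherwise every configured region is searched.
// Hypothetical helper written for illustration.
func searchRegions(providerRegion, labelRegion string, configured []string) ([]string, error) {
	region := providerRegion
	if region == "" {
		region = labelRegion
	}
	if region == "" {
		// Region unknown: look everywhere.
		return slices.Clone(configured), nil
	}
	if !slices.Contains(configured, region) {
		return nil, fmt.Errorf("region %q is not configured", region)
	}
	return []string{region}, nil
}

func main() {
	configured := []string{"REGION1", "REGION2", "REGION3"}

	regions, err := searchRegions("", "REGION2", configured)
	fmt.Println(regions, err)

	regions, err = searchRegions("", "", configured)
	fmt.Println(regions, err)

	regions, err = searchRegions("REGION9", "", configured)
	fmt.Println(regions, err)
}
```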

Contributor Author

Thank you, this is a very good idea.
I've changed the implementation. One caveat: I cannot trust LabelTopologyRegion. Something can change it, and node-lifecycle would then remove the node (for instance on a reboot or upgrade event).

So I can use LabelTopologyRegion only as the preferred region, and check that region first.
Thanks.
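
Treating the label as a hint rather than as a fact amounts to reordering the search instead of narrowing it. A minimal Go sketch of that ordering, assuming a hypothetical helper `orderRegions` that is not taken from the PR:

```go
package main

import "fmt"

// orderRegions returns every configured region with the preferred one moved
// to the front. The preference is only a hint: if it is empty or not
// configured, the original order is returned and no region is dropped.
// Hypothetical helper written for illustration.
func orderRegions(preferred string, configured []string) []string {
	ordered := make([]string, 0, len(configured))
	for _, r := range configured {
		if r == preferred {
			ordered = append(ordered, r)
		}
	}
	for _, r := range configured {
		if r != preferred {
			ordered = append(ordered, r)
		}
	}
	return ordered
}

func main() {
	configured := []string{"REGION1", "REGION2", "REGION3"}
	fmt.Println(orderRegions("REGION2", configured)) // prints: [REGION2 REGION1 REGION3]
	fmt.Println(orderRegions("REGION9", configured)) // prints: [REGION1 REGION2 REGION3]
}
```

Because every region stays in the list, a stale or tampered label costs at most a few extra lookups and never makes the node look deleted.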

@MatthieuFin
Contributor

Hi @sergelogvinov
I have proposed an implementation of multi-cloud support for cinder-csi-plugin, which supports multiple OpenStack clusters, not only multiple regions. I haven't looked at the OCCM implementation yet, but would it be possible to adapt it to support multiple cloud definitions instead of only multiple regions?

@sergelogvinov
Contributor Author

> Hi @sergelogvinov, I have proposed an implementation of multi-cloud support for cinder-csi-plugin, which supports multiple OpenStack clusters, not only multiple regions. I haven't looked at the OCCM implementation yet, but would it be possible to adapt it to support multiple cloud definitions instead of only multiple regions?

Thank you for this PR, it is very interesting. Can we have a call/chat in slack #provider-openstack (Serge Logvinov)?

@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from aee43df to fa2bd50 Compare May 15, 2024 07:59
@jichenjc
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 15, 2024
@jichenjc
Contributor

/ok-to-test

@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from fa2bd50 to 46aebfb Compare May 15, 2024 09:32
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 2, 2024
@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from 46aebfb to 6efe3b4 Compare August 7, 2024 05:57
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 7, 2024
@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from 6efe3b4 to 2ea04e3 Compare August 7, 2024 06:54
@sergelogvinov
Contributor Author

I've rebased the PR. All tests pass, and I've tested it manually too.

Can you take a look please, @jichenjc @mdbooth?
It would be great to merge this change into the upcoming release.

Thanks.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 4, 2024
@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from 2ea04e3 to 6da4422 Compare September 18, 2024 08:22
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 18, 2024
@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from 6da4422 to 2f86fa7 Compare September 26, 2024 18:40
@sergelogvinov
Contributor Author

Is there anything else we can do here? @jichenjc @mdbooth @kayrus

We had a conversation about how to initialize the OpenStack clients:

	for _, region := range os.regions {
		opt := os.epOpts
		opt.Region = region

		compute[region], err = client.NewComputeV2(os.provider, opt)
		if err != nil {
			klog.Errorf("unable to access compute v2 API : %v", err)
			return nil, false
		}

		network[region], err = client.NewNetworkV2(os.provider, opt)
		if err != nil {
			klog.Errorf("unable to access network v2 API : %v", err)
			return nil, false
		}

It seems to be a similar process to the one we followed in cinder-csi-plugin.
I believe @MatthieuFin and I can introduce multi OpenStack authentication support after this PR.

    [Global]
    auth-url="https://auth.cloud.openstackcluster.region-default.local/v3"
    username="region-default-username"
    password="region-default-password"
    region="default"
    tenant-id="region-default-tenant-id"
    tenant-name="region-default-tenant-name"
    domain-name="Default"
    
    [Global "region-one"]
    auth-url="https://auth.cloud.openstackcluster.region-one.local/v3"
    username="region-one-username"
    password="region-one-password"
    region="one"
    tenant-id="region-one-tenant-id"
    tenant-name="region-one-tenant-name"
    domain-name="Default"

Thanks.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 22, 2024
@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from 2f86fa7 to 3c8e594 Compare November 30, 2024 20:37
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zetaab for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 30, 2024
@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch 2 times, most recently from f29e427 to d2e2b76 Compare December 1, 2024 08:54
@sergelogvinov
Contributor Author

/retest

@sergelogvinov
Contributor Author

/test openstack-cloud-controller-manager-e2e-test

@sergelogvinov
Contributor Author

Thank you @kayrus for refactoring instance.go (instance-v2).
I’ve moved all my changes to instance.go, and it is ready to go.

Whenever you have a moment, could you please take a look at this?

@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from d2e2b76 to db35d48 Compare December 11, 2024 09:20
@kayrus
Contributor

kayrus commented Dec 11, 2024

@sergelogvinov hm, I have doubts about this PR so far. Some raw thoughts:

  • region vs regions already causes confusion; I believe there should be a better way to introduce regions. Does your implementation support multiple Keystone URLs? Can we extract regions from the config file?
  • add more documentation
  • add/adjust documentation on how this correlates with cinder-csi and manila-csi plugins.
  • multicloud vs multiregion? do we need both, or does one imply another?
  • ideally I'd have a per-cloud/per-region struct member/interface that implements only a single cloud/region logic, but I'm afraid that the upstream https://github.com/kubernetes/cloud-provider doesn't support this. Maybe it makes sense to submit a feature request there? Let me know if you have questions on this particular point.

Sorry, I don't have a multicloud/multiregion setup, and the use case is not really clear to me.

@sergelogvinov
Contributor Author

Thank you @kayrus for reviewing my PR.

I completely agree with your point about hybrid/multi-cluster setups (bare metal + OpenStack + AWS + GCP, etc.). This should be implemented in https://github.com/kubernetes/cloud-provider first. I hope it will be added someday.

Our case is a bit different:

Imagine an OpenStack setup with only one Keystone endpoint but multiple regions for services like Nova, Neutron, Cinder, and Glance. Each region has one availability zone called "nova" (the default installation). I think this type of setup is easier to manage and upgrade. I've seen similar setups used by many cloud providers.

[Image: openstack-regional]

So, in this case, these are not fully separate OpenStack clusters. It looks more like one region with many availability zones, similar to how well-known cloud providers organize their systems.

That’s why this PR supports only one Keystone endpoint. Using this endpoint, we can get Nova/Neutron endpoints for each region.
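
As a rough illustration of why one Keystone endpoint is enough, the sketch below looks up per-region service URLs in a service catalog. The types `catalogEntry` and `endpoint` are simplified stand-ins loosely modelled on the catalog Keystone returns with a token, the helper `endpointsByRegion` is hypothetical, and the URLs are made up for the example.

```go
package main

import "fmt"

// endpoint and catalogEntry are simplified, hypothetical versions of the
// entries in a Keystone service catalog.
type endpoint struct {
	Region    string
	Interface string
	URL       string
}

type catalogEntry struct {
	Type      string
	Endpoints []endpoint
}

// endpointsByRegion returns a region -> URL map for one service type and
// interface (for example "compute" and "public").
func endpointsByRegion(catalog []catalogEntry, serviceType, iface string) map[string]string {
	urls := make(map[string]string)
	for _, entry := range catalog {
		if entry.Type != serviceType {
			continue
		}
		for _, ep := range entry.Endpoints {
			if ep.Interface == iface {
				urls[ep.Region] = ep.URL
			}
		}
	}
	return urls
}

func main() {
	// Example data only; the hostnames are invented.
	catalog := []catalogEntry{
		{Type: "compute", Endpoints: []endpoint{
			{Region: "REGION1", Interface: "public", URL: "https://nova.region1.example.com/v2.1"},
			{Region: "REGION2", Interface: "public", URL: "https://nova.region2.example.com/v2.1"},
		}},
		{Type: "network", Endpoints: []endpoint{
			{Region: "REGION1", Interface: "public", URL: "https://neutron.region1.example.com"},
		}},
	}
	fmt.Println(endpointsByRegion(catalog, "compute", "public"))
}
```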

This feature will be introduced as an alpha feature and can be enabled using a CLI flag. Other components like the load balancer, cinder-csi and manila-csi will also need updates. However, we need to focus on supporting the cloud-node and cloud-node-lifecycle controllers first.

PS. I think I forgot to check backward compatibility with non regional OpenStack setups. I will check this soon.
Thanks.

@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch 2 times, most recently from ccec72b to bfce454 Compare December 20, 2024 10:56
Currently, it supports only a single auth section.
Set the regions in the config as:

[Global]
region=REGION1
regions=REGION1
regions=REGION2
regions=REGION3
@sergelogvinov sergelogvinov force-pushed the multi-regionl-only-openstack branch from bfce454 to dcd162b Compare December 20, 2024 11:15
@sergelogvinov
Contributor Author

I’ve updated the documentation and verified the setup. It works well both with OS_CCM_REGIONAL=true enabled and without it.

Could you please let me know if the documentation is clear or if there’s anything that needs improvement?

Thank you! @kayrus

@pierreprinetti
Member

pierreprinetti commented Dec 22, 2024

Once everything else is clarified, it may be nice to have a release note

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Dec 23, 2024
@sergelogvinov
Contributor Author

Hi, @MatthieuFin Could you please take a look at this, since you know hybrid cloud?

@MatthieuFin
Contributor

MatthieuFin commented Jan 8, 2025

Hi,
I manage Kubernetes clusters whose nodes are distributed across different standalone OpenStack clusters (which means different Keystones), so I can't use this implementation directly on my "classical" Kubernetes clusters.

To handle the node lifecycle with OCCM, I deploy one OCCM DaemonSet per OpenStack cluster (with the environment variable OS_CCM_REGIONAL=true). For that I use the official Helm chart, with a simple hack that sets hostNetwork to false so that multiple pods can run on the same node.

I'll try to set aside some time to test your implementation and see how I could adapt it to handle different Keystones, but since this is not a blocking point for me, I can't promise anything.

The next improvement I need will probably be the ability to select which OpenStack cloud provides a LoadBalancer Service, probably based on annotations or on a class name. I haven't looked into this yet; currently all my LoadBalancer Services are provided by a single OpenStack cluster, with the LB feature enabled on the corresponding OCCM and disabled on the others.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[occm] Multi region support
7 participants