[Feat] Multi region support (Topology Aware Provisioning) (#280)
---------

Co-authored-by: Khaja Omer <komer@akamai.com>
komer3 and komer3 authored Oct 11, 2024
1 parent 84fe6cb commit 0c9e508
Showing 13 changed files with 471 additions and 73 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -18,6 +18,7 @@
- [Creating a PersistentVolumeClaim](docs/usage.md#creating-a-persistentvolumeclaim)
- [Encrypted Drives using LUKS](docs/encrypted-drives.md)
- [Adding Tags to Created Volumes](docs/volume-tags.md)
- [Topology-Aware Provisioning](docs/topology-aware-provisioning.md)
- [Development Setup](docs/development-setup.md)
- [Prerequisites](docs/development-setup.md#-prerequisites)
- [Setting Up the Local Development Environment](docs/development-setup.md#-setting-up-the-local-development-environment)
18 changes: 18 additions & 0 deletions deploy/kubernetes/base/csi-storageclass.yaml
@@ -16,3 +16,21 @@ metadata:
provisioner: linodebs.csi.linode.com
reclaimPolicy: Retain
allowVolumeExpansion: true
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: linode-block-storage-wait-for-consumer
  namespace: kube-system
provisioner: linodebs.csi.linode.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: linode-block-storage-wait-for-consumer-retain
  namespace: kube-system
provisioner: linodebs.csi.linode.com
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
1 change: 1 addition & 0 deletions deploy/kubernetes/base/ss-csi-linode-controller.yaml
@@ -43,6 +43,7 @@ spec:
- "--volume-name-prefix=pvc"
- "--volume-name-uuid-length=16"
- "--csi-address=$(ADDRESS)"
- "--feature-gates=Topology=true"
- "--v=2"
env:
- name: ADDRESS
88 changes: 88 additions & 0 deletions docs/topology-aware-provisioning.md
@@ -0,0 +1,88 @@
## 🌐 Topology-Aware Provisioning

This CSI driver supports topology-aware provisioning: a volume is created in the region of the node where the pod that consumes it is scheduled, rather than always in the region of the controller.

**Notes:**

1. **Volume Cloning**: Cloning only works within the same region, not across regions.
2. **Volume Migration**: Volumes cannot be moved across regions.
3. **Remote Provisioning**: Volume provisioning is supported in remote regions (nodes or clusters outside of the region where the controller server is deployed).

> [!IMPORTANT]
> Use release v0.8.6 or later to take advantage of the remote provisioning feature.

#### 📝 Example StorageClass and PVC

```yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linode-block-storage-wait-for-consumer
provisioner: linodebs.csi.linode.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-filesystem
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: linode-block-storage-wait-for-consumer
```

> **Important**: The `volumeBindingMode: WaitForFirstConsumer` setting is crucial for topology-aware provisioning. It delays volume binding and creation until a pod that uses the PVC is created. The provisioner can then take the pod's scheduling requirements and node assignment into account and create the volume in the region of the node the pod is scheduled on.

#### 🖥️ Example Pod

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: e2e-pod
spec:
  nodeSelector:
    topology.linode.com/region: us-ord
  tolerations:
    - key: "node-role.kubernetes.io/control-plane"
      operator: "Exists"
      effect: "NoSchedule"
  containers:
    - name: e2e-pod
      image: ubuntu
      command:
        - sleep
        - "1000000"
      volumeMounts:
        - mountPath: /data
          name: csi-volume
  volumes:
    - name: csi-volume
      persistentVolumeClaim:
        claimName: pvc-filesystem
```

This example demonstrates how to set up topology-aware provisioning using the Linode Block Storage CSI Driver. The StorageClass defines the provisioner and reclaim policy, while the PersistentVolumeClaim requests storage from this class. The Pod specification shows how to use the PVC and includes a node selector for region-specific deployment.

> [!IMPORTANT]
> To enable topology-aware provisioning, pass the following argument to the csi-provisioner sidecar:
> ```
> --feature-gates=Topology=true
> ```
> This enables the `Topology` feature gate of the csi-provisioner sidecar, which is required for topology-aware provisioning to function correctly.
>
> Note: This feature is enabled by default in release v0.8.6 and later.

#### Provisioning Process

1. The CO (Kubernetes) determines the required topology from the application's needs (the region where the pod is scheduled) and the cluster layout.
2. The external-provisioner gathers the topology requirements from the CO and includes a `TopologyRequirement` in the `CreateVolume` call.
3. The CSI driver creates a volume that satisfies the topology requirements.
4. The driver returns the actual topology of the created volume.

By leveraging topology-aware provisioning, CSI drivers ensure optimal volume placement within the infrastructure, improving performance, availability, and data locality.
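
Step 3 of the process above can be sketched as follows. This is a minimal illustration, not the driver's implementation: `Topology` and `TopologyRequirement` are simplified stand-ins for the CSI messages of the same names, and `pickRegion` together with its fallback behaviour is an assumption made for the example. The `topology.linode.com/region` key is the one used in the example Pod.

```go
package main

import "fmt"

// regionKey is the topology key used in the example Pod's nodeSelector.
const regionKey = "topology.linode.com/region"

// Topology is a simplified stand-in for the CSI Topology message.
type Topology struct {
	Segments map[string]string
}

// TopologyRequirement is a simplified stand-in for the CSI
// TopologyRequirement message passed in CreateVolume.
type TopologyRequirement struct {
	Requisite []Topology
	Preferred []Topology
}

// pickRegion (hypothetical helper) returns the first region found in the
// preferred topologies, then in the requisite ones. If the request carries
// no region, it returns the fallback, e.g. the controller's own region.
func pickRegion(req *TopologyRequirement, fallback string) string {
	if req == nil {
		return fallback
	}
	for _, list := range [][]Topology{req.Preferred, req.Requisite} {
		for _, t := range list {
			if region, ok := t.Segments[regionKey]; ok && region != "" {
				return region
			}
		}
	}
	return fallback
}

func main() {
	req := &TopologyRequirement{
		Preferred: []Topology{
			{Segments: map[string]string{regionKey: "us-ord"}},
		},
	}
	fmt.Println(pickRegion(req, "us-east")) // prints us-ord
	fmt.Println(pickRegion(nil, "us-east")) // prints us-east
}
```

Checking the preferred list before the requisite list reflects the idea that the preferred topologies are ordered hints, while the requisite list only bounds where the volume may be placed.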
1 change: 1 addition & 0 deletions helm-chart/csi-driver/templates/csi-linode-controller.yaml
@@ -23,6 +23,7 @@ spec:
- --volume-name-prefix=pvc
- --volume-name-uuid-length=16
- --csi-address=$(ADDRESS)
- --feature-gates=Topology=true
- --v=2
{{- if .Values.enable_metrics}}
- --metrics-address={{ .Values.csiProvisioner.metrics.address }}
@@ -0,0 +1,17 @@
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linode-block-storage-wait-for-consumer-retain
  namespace: {{ required ".Values.namespace required" .Values.namespace }}
{{- if eq .Values.defaultStorageClass "linode-block-storage-wait-for-consumer-retain" }}
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
{{- end }}
{{- if .Values.volumeTags }}
parameters:
  linodebs.csi.linode.com/volumeTags: {{ join "," .Values.volumeTags }}
{{- end}}
allowVolumeExpansion: true
provisioner: linodebs.csi.linode.com
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
@@ -0,0 +1,17 @@
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linode-block-storage-wait-for-consumer
  namespace: {{ required ".Values.namespace required" .Values.namespace }}
{{- if eq .Values.defaultStorageClass "linode-block-storage-wait-for-consumer" }}
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
{{- end }}
{{- if .Values.volumeTags }}
parameters:
  linodebs.csi.linode.com/volumeTags: {{ join "," .Values.volumeTags }}
{{- end}}
allowVolumeExpansion: true
provisioner: linodebs.csi.linode.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
27 changes: 15 additions & 12 deletions internal/driver/controllerserver.go
@@ -76,24 +76,27 @@ func (cs *ControllerServer) CreateVolume(ctx context.Context, req *csi.CreateVol
 		return &csi.CreateVolumeResponse{}, err
 	}

-	// Create volume context
-	volContext := cs.createVolumeContext(ctx, req)
+	contentSource := req.GetVolumeContentSource()
+	accessibilityRequirements := req.GetAccessibilityRequirements()

 	// Attempt to retrieve information about a source volume if the request includes a content source.
 	// This is important for scenarios where the volume is being cloned from an existing one.
-	sourceVolInfo, err := cs.getContentSourceVolume(ctx, req.GetVolumeContentSource())
+	sourceVolInfo, err := cs.getContentSourceVolume(ctx, contentSource, accessibilityRequirements)
 	if err != nil {
 		return &csi.CreateVolumeResponse{}, err
 	}

 	// Create the volume
-	vol, err := cs.createAndWaitForVolume(ctx, volName, sizeGB, req.GetParameters()[VolumeTags], sourceVolInfo)
+	vol, err := cs.createAndWaitForVolume(ctx, volName, sizeGB, req.GetParameters()[VolumeTags], sourceVolInfo, accessibilityRequirements)
 	if err != nil {
 		return &csi.CreateVolumeResponse{}, err
 	}

+	// Create volume context
+	volContext := cs.createVolumeContext(ctx, req, vol)
+
 	// Prepare and return response
-	resp := cs.prepareCreateVolumeResponse(ctx, vol, size, volContext, sourceVolInfo, req.GetVolumeContentSource())
+	resp := cs.prepareCreateVolumeResponse(ctx, vol, size, volContext, sourceVolInfo, contentSource)

 	log.V(2).Info("CreateVolume response", "response", resp)
 	return resp, nil
@@ -154,9 +157,15 @@ func (cs *ControllerServer) ControllerPublishVolume(ctx context.Context, req *cs
 		return resp, err
 	}

+	// Retrieve and validate the instance associated with the Linode ID
+	instance, err := cs.getInstance(ctx, linodeID)
+	if err != nil {
+		return resp, err
+	}
+
 	// Check if the volume exists and is valid.
 	// If the volume is already attached to the specified instance, it returns its device path.
-	devicePath, err := cs.getAndValidateVolume(ctx, volumeID, linodeID)
+	devicePath, err := cs.getAndValidateVolume(ctx, volumeID, instance, req.GetVolumeContext())
 	if err != nil {
 		return resp, err
 	}
@@ -169,12 +178,6 @@ func (cs *ControllerServer) ControllerPublishVolume(ctx context.Context, req *cs
 		}, nil
 	}

-	// Retrieve and validate the instance associated with the Linode ID
-	instance, err := cs.getInstance(ctx, linodeID)
-	if err != nil {
-		return resp, err
-	}
-
 	// Check if the instance can accommodate the volume attachment
 	if capErr := cs.checkAttachmentCapacity(ctx, instance); capErr != nil {
 		return resp, capErr