-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
InferencePool config proposal for API review #162
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This is intended to be post-v0.1, correct? |
api/v1alpha1/inferencepool_types.go
Outdated
} | ||
|
||
// ExtensionRef is a reference to the extension deployment. | ||
type ExtensionRef struct { | ||
// A selector for the pods that run the deployment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Need to support extensions possibly running outside the cluster
- Need to support "localhost" to run extensions as a sidecar
api/v1alpha1/inferencepool_types.go
Outdated
// A selector for the pods that run the deployment. | ||
// | ||
// +kubebuilder:validation:Required | ||
Selector map[string]string `json:"selector"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming sidecar here, is sidecar in the gateway pod, if it looks like a backendRef
, you could route back to the Gateway Service and have the sidecar listen on a different port, and other implementations like Envoy Gateway could use the custom Backend
Resource to backendRef
to route via UDS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, sidecar on the gateway pod, that's a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, so we should be good with a service reference here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ahg-g something that looks like https://github.com/kubernetes-sigs/gateway-api/blob/50cf145b0dd1849ebd03c5866a92797862f23efe/apis/v1/object_reference_types.go#L96 should solve the Service
case and also allow for other custom backend types
@ahg-g: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
||
// The port number on the pods running the extension. Defaults to 9002 if not set. | ||
// | ||
// +kubebuilder:default=9002 | ||
TargetPortNumber *int32 `json:"targetPortNumber"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include omitempty
if this is an optional field.
Thoughts on how to secure this connection? For example:
...
extensionRef:
targetPortNumber: 9002
protocol: TLS # Maybe default to HTTP with enum for TLS, UDS (e.g. sidecar extension)
tls: # Required when protocol is TLS
certificateRefs:
- kind: Secret
name: example-com
xref for additional details.
@@ -61,8 +61,100 @@ type InferencePoolSpec struct { | |||
// +kubebuilder:validation:Maximum=65535 | |||
// +kubebuilder:validation:Required | |||
TargetPortNumber int32 `json:"targetPortNumber"` | |||
|
|||
// EndpointPickerConfig selects and configures the endpoint picking algorithm to apply on the requests sent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the term "picker" differ from "selector"? I ask b/c users have established a mental model for Endpoint Selector. Unless "picker" has a different semantic meaning than "selector", I'd like to see the project standardize on "selector".
Since EndpointPickerConfig is not optional, this means every implementation must support the EPP extension. If that is the case, the reference EPP extension must be pluggable using a standardized interface, e.g. CNI, CNI, etc.
cc: @robscott
} | ||
|
||
// ExtensionConnection encapsulates options that configures the connection to the extension. | ||
type ExtensionConnection struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my above feedback, consider including targetPortNumber, protocol, tls, etc. in this struct since these concepts are related to connection
.
Proposal at https://docs.google.com/document/d/1RKVFAgerUoIMhClU9gRduXoF8Hq2tzzVOHdPvkQPKgs/edit?tab=t.0
/hold