---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1173
    inference.networking.k8s.io/bundle-version: v1.3.0
  creationTimestamp: "2026-03-18T16:55:09Z"
  generation: 1
  managedFields:
  - apiVersion: apiextensions.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:api-approved.kubernetes.io: {}
          f:inference.networking.k8s.io/bundle-version: {}
      f:spec:
        f:group: {}
        f:names:
          f:kind: {}
          f:listKind: {}
          f:plural: {}
          f:shortNames: {}
          f:singular: {}
        f:scope: {}
        f:versions: {}
    manager: kubectl
    operation: Apply
    time: "2026-03-18T16:55:09Z"
  - apiVersion: apiextensions.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:acceptedNames:
          f:kind: {}
          f:listKind: {}
          f:plural: {}
          f:shortNames: {}
          f:singular: {}
        f:conditions:
          k:{"type":"Established"}:
            .: {}
            f:lastTransitionTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
          k:{"type":"KubernetesAPIApprovalPolicyConformant"}:
            .: {}
            f:lastTransitionTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
          k:{"type":"NamesAccepted"}:
            .: {}
            f:lastTransitionTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
    manager: kube-apiserver
    operation: Update
    subresource: status
    time: "2026-03-18T16:55:09Z"
  name: inferencepools.inference.networking.k8s.io
  resourceVersion: "17037"
  uid: 6071b5fa-0221-4bbc-b1bd-c8eb5bb96fb7
spec:
  conversion:
    strategy: None
  group: inference.networking.k8s.io
  names:
    kind: InferencePool
    listKind: InferencePoolList
    plural: inferencepools
    shortNames:
    - infpool
    singular: inferencepool
  scope: Namespaced
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        description: |
          InferencePool is the Schema for the InferencePools API.
        properties:
          apiVersion:
            description: |-
              APIVersion defines the versioned schema of this representation of an object.
              Servers should convert recognized schemas to the latest internal value, and
              may reject unrecognized values.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
            type: string
          kind:
            description: |-
              Kind is a string value representing the REST resource this object represents.
              Servers may infer this from the endpoint the client submits requests to.
              Cannot be updated.
              In CamelCase.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
            type: string
          metadata:
            type: object
          spec:
            description: Spec defines the desired state of the InferencePool.
            properties:
              endpointPickerRef:
                description: |-
                  EndpointPickerRef is a reference to the Endpoint Picker extension and its
                  associated configuration.
                properties:
                  failureMode:
                    default: FailClose
                    description: |-
                      FailureMode configures how the parent handles the case when the Endpoint Picker extension
                      is non-responsive. When unspecified, defaults to "FailClose".
                    enum:
                    - FailOpen
                    - FailClose
                    type: string
                  group:
                    default: ""
                    description: |-
                      Group is the group of the referent API object. When unspecified, the default value
                      is "", representing the Core API group.
                    maxLength: 253
                    minLength: 0
                    pattern: ^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
                    type: string
                  kind:
                    default: Service
                    description: |-
                      Kind is the Kubernetes resource kind of the referent.
                      Required if the referent is ambiguous, e.g. service with multiple ports.
                      Defaults to "Service" when not specified.
                      ExternalName services can refer to CNAME DNS records that may live
                      outside of the cluster and as such are difficult to reason about in
                      terms of conformance. They also may not be safe to forward to (see
                      CVE-2021-25740 for more information). Implementations MUST NOT
                      support ExternalName Services.
                    maxLength: 63
                    minLength: 1
                    pattern: ^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$
                    type: string
                  name:
                    description: Name is the name of the referent API object.
                    maxLength: 253
                    minLength: 1
                    type: string
                  port:
                    description: |-
                      Port is the port of the Endpoint Picker extension service.
                      Port is required when the referent is a Kubernetes Service. In this
                      case, the port number is the service port number, not the target port.
                      For other resources, destination port might be derived from the referent
                      resource or this field.
                    properties:
                      number:
                        description: |-
                          Number defines the port number to access the selected model server Pods.
                          The number must be in the range 1 to 65535.
                        format: int32
                        maximum: 65535
                        minimum: 1
                        type: integer
                    required:
                    - number
                    type: object
                required:
                - name
                type: object
                x-kubernetes-validations:
                - message: port is required when kind is 'Service' or unspecified (defaults to 'Service')
                  rule: self.kind != 'Service' || has(self.port)
              selector:
                description: |-
                  Selector determines which Pods are members of this inference pool.
                  It matches Pods by their labels only within the same namespace; cross-namespace
                  selection is not supported.
                  The structure of this LabelSelector is intentionally simple to be compatible
                  with Kubernetes Service selectors, as some implementations may translate
                  this configuration into a Service resource.
                properties:
                  matchLabels:
                    additionalProperties:
                      description: |-
                        LabelValue is the value of a label. This is used for validation
                        of maps. This matches the Kubernetes label validation rules:
                        * must be 63 characters or less (can be empty),
                        * unless empty, must begin and end with an alphanumeric character ([a-z0-9A-Z]),
                        * could contain dashes (-), underscores (_), dots (.), and alphanumerics between.
                        Valid values include:
                        * MyValue
                        * my.name
                        * 123-my-value
                      maxLength: 63
                      minLength: 0
                      pattern: ^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$
                      type: string
                    description: |-
                      MatchLabels contains a set of required {key,value} pairs.
                      An object must match every label in this map to be selected.
                      The matching logic is an AND operation on all entries.
                    maxProperties: 64
                    minProperties: 1
                    type: object
                required:
                - matchLabels
                type: object
              targetPorts:
                description: |-
                  TargetPorts defines a list of ports that are exposed by this InferencePool.
                  Every port will be treated as a distinctive endpoint by EPP,
                  addressable as a 'podIP:portNumber' combination.
                items:
                  description: Port defines the network port that will be exposed by this InferencePool.
                  properties:
                    number:
                      description: |-
                        Number defines the port number to access the selected model server Pods.
                        The number must be in the range 1 to 65535.
                      format: int32
                      maximum: 65535
                      minimum: 1
                      type: integer
                  required:
                  - number
                  type: object
                maxItems: 8
                minItems: 1
                type: array
                x-kubernetes-list-type: atomic
                x-kubernetes-validations:
                - message: port number must be unique
                  rule: self.all(p1, self.exists_one(p2, p1.number==p2.number))
            required:
            - endpointPickerRef
            - selector
            - targetPorts
            type: object
          status:
            description: Status defines the observed state of the InferencePool.
            properties:
              parents:
                description: |-
                  Parents is a list of parent resources, typically Gateways, that are associated with
                  the InferencePool, and the status of the InferencePool with respect to each parent.
                  A controller that manages the InferencePool, must add an entry for each parent it manages
                  and remove the parent entry when the controller no longer considers the InferencePool to
                  be associated with that parent.
                  A maximum of 32 parents will be represented in this list. When the list is empty,
                  it indicates that the InferencePool is not associated with any parents.
                items:
                  description: ParentStatus defines the observed state of InferencePool from a Parent, i.e. Gateway.
                  properties:
                    conditions:
                      description: |-
                        Conditions is a list of status conditions that provide information about the observed
                        state of the InferencePool. This field is required to be set by the controller that
                        manages the InferencePool.
                        Supported condition types are:
                        * "Accepted"
                        * "ResolvedRefs"
                      items:
                        description: Condition contains details for one aspect of the current state of this API Resource.
                        properties:
                          lastTransitionTime:
                            description: |-
                              lastTransitionTime is the last time the condition transitioned from one status to another.
                              This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
                            format: date-time
                            type: string
                          message:
                            description: |-
                              message is a human readable message indicating details about the transition.
                              This may be an empty string.
                            maxLength: 32768
                            type: string
                          observedGeneration:
                            description: |-
                              observedGeneration represents the .metadata.generation that the condition was set based upon.
                              For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
                              with respect to the current state of the instance.
                            format: int64
                            minimum: 0
                            type: integer
                          reason:
                            description: |-
                              reason contains a programmatic identifier indicating the reason for the condition's last transition.
                              Producers of specific condition types may define expected values and meanings for this field,
                              and whether the values are considered a guaranteed API.
                              The value should be a CamelCase string.
                              This field may not be empty.
                            maxLength: 1024
                            minLength: 1
                            pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
                            type: string
                          status:
                            description: status of the condition, one of True, False, Unknown.
                            enum:
                            - "True"
                            - "False"
                            - Unknown
                            type: string
                          type:
                            description: type of condition in CamelCase or in foo.example.com/CamelCase.
                            maxLength: 316
                            pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
                            type: string
                        required:
                        - lastTransitionTime
                        - message
                        - reason
                        - status
                        - type
                        type: object
                      maxItems: 8
                      type: array
                      x-kubernetes-list-map-keys:
                      - type
                      x-kubernetes-list-type: map
                    controllerName:
                      description: |-
                        ControllerName is a domain/path string that indicates the name of the controller that
                        wrote this status. This corresponds with the GatewayClass controllerName field when the
                        parentRef references a Gateway kind.
                        Example: "example.net/gateway-controller".
                        The format of this field is DOMAIN "/" PATH, where DOMAIN and PATH are valid Kubernetes names:
                        https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
                        Controllers MAY populate this field when writing status. When populating this field, controllers
                        should ensure that entries to status populated with their ControllerName are cleaned up when they
                        are no longer necessary.
                      maxLength: 253
                      minLength: 1
                      pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*\/[A-Za-z0-9\/\-._~%!$&'()*+,;=:]+$
                      type: string
                    parentRef:
                      description: |-
                        ParentRef is used to identify the parent resource that this status
                        is associated with. It is used to match the InferencePool with the parent
                        resource, such as a Gateway.
                      properties:
                        group:
                          default: gateway.networking.k8s.io
                          description: |-
                            Group is the group of the referent API object. When unspecified, the referent is assumed
                            to be in the "gateway.networking.k8s.io" API group.
                          maxLength: 253
                          minLength: 0
                          pattern: ^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
                          type: string
                        kind:
                          default: Gateway
                          description: |-
                            Kind is the kind of the referent API object. When unspecified, the referent is assumed
                            to be a "Gateway" kind.
                          maxLength: 63
                          minLength: 1
                          pattern: ^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$
                          type: string
                        name:
                          description: Name is the name of the referent API object.
                          maxLength: 253
                          minLength: 1
                          type: string
                        namespace:
                          description: |-
                            Namespace is the namespace of the referenced object. When unspecified, the local
                            namespace is inferred.
                            Note that when a namespace different than the local namespace is specified,
                            a ReferenceGrant object is required in the referent namespace to allow that
                            namespace's owner to accept the reference.
                            See the ReferenceGrant documentation for details: https://gateway-api.sigs.k8s.io/api-types/referencegrant/
                          maxLength: 63
                          minLength: 1
                          pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$
                          type: string
                      required:
                      - name
                      type: object
                  required:
                  - parentRef
                  type: object
                maxItems: 32
                type: array
                x-kubernetes-list-type: atomic
            type: object
        required:
        - spec
        type: object
    served: true
    storage: true
    subresources:
      status: {}
status:
  acceptedNames:
    kind: InferencePool
    listKind: InferencePoolList
    plural: inferencepools
    shortNames:
    - infpool
    singular: inferencepool
  conditions:
  - lastTransitionTime: "2026-03-18T16:55:09Z"
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: "2026-03-18T16:55:09Z"
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  - lastTransitionTime: "2026-03-18T16:55:09Z"
    message: approved in https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1173
    reason: ApprovedAnnotation
    status: "True"
    type: KubernetesAPIApprovalPolicyConformant
  storedVersions:
  - v1
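---
# Illustrative sketch only; this document is not part of the original CRD dump.
# It shows what a minimal InferencePool might look like under the v1 schema
# above: the three required spec fields (endpointPickerRef, selector,
# targetPorts) and a port on endpointPickerRef, which the validation rule
# requires because kind defaults to "Service".
# All names, labels, and port numbers below are hypothetical placeholders,
# not values taken from the source.
apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
  name: example-pool              # hypothetical name
  namespace: default              # assumption; selector only matches Pods in this namespace
spec:
  selector:
    matchLabels:
      app: example-model-server   # hypothetical label; 1 to 64 entries allowed
  targetPorts:                    # 1 to 8 entries, port numbers must be unique
  - number: 8000                  # hypothetical model server port
  endpointPickerRef:
    name: example-epp             # hypothetical Service name for the Endpoint Picker
    port:
      number: 9002                # hypothetical Service port (not the target port)
    # group, kind, and failureMode are omitted here, so the schema defaults
    # apply: group "", kind "Service", failureMode "FailClose".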