---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: unapproved, experimental-only
    inference.networking.k8s.io/bundle-version: v1.3.0
  creationTimestamp: "2026-03-18T16:50:54Z"
  generation: 1
  managedFields:
  - apiVersion: apiextensions.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:api-approved.kubernetes.io: {}
          f:inference.networking.k8s.io/bundle-version: {}
      f:spec:
        f:group: {}
        f:names:
          f:kind: {}
          f:listKind: {}
          f:plural: {}
          f:shortNames: {}
          f:singular: {}
        f:scope: {}
        f:versions: {}
    manager: kubectl
    operation: Apply
    time: "2026-03-18T16:50:54Z"
  - apiVersion: apiextensions.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:acceptedNames:
          f:kind: {}
          f:listKind: {}
          f:plural: {}
          f:shortNames: {}
          f:singular: {}
        f:conditions:
          k:{"type":"Established"}:
            .: {}
            f:lastTransitionTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
          k:{"type":"NamesAccepted"}:
            .: {}
            f:lastTransitionTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
    manager: kube-apiserver
    operation: Update
    subresource: status
    time: "2026-03-18T16:50:54Z"
  name: inferencepools.inference.networking.x-k8s.io
  resourceVersion: "11957"
  uid: eeb73f1c-ee45-42ae-a7bc-2fdb2aec6f08
spec:
  conversion:
    strategy: None
  group: inference.networking.x-k8s.io
  names:
    kind: InferencePool
    listKind: InferencePoolList
    plural: inferencepools
    shortNames:
    - xinfpool
    singular: inferencepool
  scope: Namespaced
  versions:
  - name: v1alpha2
    schema:
      openAPIV3Schema:
        description: InferencePool is the Schema for the InferencePools API.
        properties:
          apiVersion:
            description: |-
              APIVersion defines the versioned schema of this representation of an object.
              Servers should convert recognized schemas to the latest internal value, and
              may reject unrecognized values.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
            type: string
          kind:
            description: |-
              Kind is a string value representing the REST resource this object represents.
              Servers may infer this from the endpoint the client submits requests to.
              Cannot be updated.
              In CamelCase.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
            type: string
          metadata:
            type: object
          spec:
            description: InferencePoolSpec defines the desired state of InferencePool
            properties:
              extensionRef:
                description: Extension configures an endpoint picker as an extension
                  service.
                properties:
                  failureMode:
                    default: FailClose
                    description: |-
                      Configures how the gateway handles the case when the extension is not responsive.
                      Defaults to FailClose.
                    enum:
                    - FailOpen
                    - FailClose
                    type: string
                  group:
                    default: ""
                    description: |-
                      Group is the group of the referent.
                      The default value is "", representing the Core API group.
                    maxLength: 253
                    pattern: ^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
                    type: string
                  kind:
                    default: Service
                    description: |-
                      Kind is the Kubernetes resource kind of the referent.

                      Defaults to "Service" when not specified.

                      ExternalName services can refer to CNAME DNS records that may live
                      outside of the cluster and as such are difficult to reason about in
                      terms of conformance. They also may not be safe to forward to (see
                      CVE-2021-25740 for more information). Implementations MUST NOT
                      support ExternalName Services.
                    maxLength: 63
                    minLength: 1
                    pattern: ^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$
                    type: string
                  name:
                    description: Name is the name of the referent.
                    maxLength: 253
                    minLength: 1
                    type: string
                  portNumber:
                    description: |-
                      The port number on the service running the extension. When unspecified,
                      implementations SHOULD infer a default value of 9002 when the Kind is
                      Service.
                    format: int32
                    maximum: 65535
                    minimum: 1
                    type: integer
                required:
                - name
                type: object
              selector:
                additionalProperties:
                  description: |-
                    LabelValue is the value of a label. This is used for validation
                    of maps. This matches the Kubernetes label validation rules:
                    * must be 63 characters or less (can be empty),
                    * unless empty, must begin and end with an alphanumeric character ([a-z0-9A-Z]),
                    * could contain dashes (-), underscores (_), dots (.), and alphanumerics between.

                    Valid values include:

                    * MyValue
                    * my.name
                    * 123-my-value
                  maxLength: 63
                  minLength: 0
                  pattern: ^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$
                  type: string
                description: |-
                  Selector defines a map of labels to watch model server Pods
                  that should be included in the InferencePool.
                  In some cases, implementations may translate this field to a Service selector, so this matches the simple
                  map used for Service selectors instead of the full Kubernetes LabelSelector type.
                  If specified, it will be applied to match the model server pods in the same namespace as the InferencePool.
                  Cross-namespace selectors are not supported.
                type: object
              targetPortNumber:
                description: |-
                  TargetPortNumber defines the port number to access the selected model server Pods.
                  The number must be in the range 1 to 65535.
                format: int32
                maximum: 65535
                minimum: 1
                type: integer
            required:
            - extensionRef
            - selector
            - targetPortNumber
            type: object
          status:
            default:
              parent:
              - conditions:
                - lastTransitionTime: "1970-01-01T00:00:00Z"
                  message: Waiting for controller
                  reason: Pending
                  status: Unknown
                  type: Accepted
                parentRef:
                  kind: Status
                  name: default
            description: Status defines the observed state of InferencePool.
            properties:
              parent:
                description: |-
                  Parents is a list of parent resources (usually Gateways) that are associated with
                  the InferencePool, and the status of the InferencePool with respect to each parent.

                  A maximum of 32 Gateways will be represented in this list.
                  When the list contains `kind: Status, name: default`, it indicates that the InferencePool
                  is not associated with any Gateway and a controller must perform the following:

                   - Remove the parent when setting the "Accepted" condition.
                   - Add the parent when the controller will no longer manage the InferencePool
                     and no other parents exist.
                items:
                  description: PoolStatus defines the observed state of InferencePool
                    from a Gateway.
                  properties:
                    conditions:
                      default:
                      - lastTransitionTime: "1970-01-01T00:00:00Z"
                        message: Waiting for controller
                        reason: Pending
                        status: Unknown
                        type: Accepted
                      description: |-
                        Conditions track the state of the InferencePool.

                        Known condition types are:

                        * "Accepted"
                        * "ResolvedRefs"
                      items:
                        description: Condition contains details for one aspect of
                          the current state of this API Resource.
                        properties:
                          lastTransitionTime:
                            description: |-
                              lastTransitionTime is the last time the condition transitioned from one status to another.
                              This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
                            format: date-time
                            type: string
                          message:
                            description: |-
                              message is a human readable message indicating details about the transition.
                              This may be an empty string.
                            maxLength: 32768
                            type: string
                          observedGeneration:
                            description: |-
                              observedGeneration represents the .metadata.generation that the condition was set based upon.
                              For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
                              with respect to the current state of the instance.
                            format: int64
                            minimum: 0
                            type: integer
                          reason:
                            description: |-
                              reason contains a programmatic identifier indicating the reason for the condition's last transition.
                              Producers of specific condition types may define expected values and meanings for this field,
                              and whether the values are considered a guaranteed API.
                              The value should be a CamelCase string.
                              This field may not be empty.
                            maxLength: 1024
                            minLength: 1
                            pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
                            type: string
                          status:
                            description: status of the condition, one of True, False,
                              Unknown.
                            enum:
                            - "True"
                            - "False"
                            - Unknown
                            type: string
                          type:
                            description: type of condition in CamelCase or in foo.example.com/CamelCase.
                            maxLength: 316
                            pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
                            type: string
                        required:
                        - lastTransitionTime
                        - message
                        - reason
                        - status
                        - type
                        type: object
                      maxItems: 8
                      type: array
                      x-kubernetes-list-map-keys:
                      - type
                      x-kubernetes-list-type: map
                    parentRef:
                      description: GatewayRef indicates the gateway that observed
                        the state of the InferencePool.
                      properties:
                        group:
                          default: gateway.networking.k8s.io
                          description: Group is the group of the referent.
                          maxLength: 253
                          pattern: ^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
                          type: string
                        kind:
                          default: Gateway
                          description: Kind is the kind of the referent. For example
                            "Gateway".
                          maxLength: 63
                          minLength: 1
                          pattern: ^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$
                          type: string
                        name:
                          description: Name is the name of the referent.
                          maxLength: 253
                          minLength: 1
                          type: string
                        namespace:
                          description: |-
                            Namespace is the namespace of the referent. If not present,
                            the namespace of the referent is assumed to be the same as
                            the namespace of the referring object.
                          maxLength: 63
                          minLength: 1
                          pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$
                          type: string
                      required:
                      - name
                      type: object
                  required:
                  - parentRef
                  type: object
                maxItems: 32
                type: array
            type: object
        type: object
    served: true
    storage: true
    subresources:
      status: {}
status:
  acceptedNames:
    kind: InferencePool
    listKind: InferencePoolList
    plural: inferencepools
    shortNames:
    - xinfpool
    singular: inferencepool
  conditions:
  - lastTransitionTime: "2026-03-18T16:50:54Z"
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: "2026-03-18T16:50:54Z"
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  storedVersions:
  - v1alpha2
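---
# Illustrative sketch only: a minimal InferencePool that should satisfy the
# v1alpha2 schema defined above. It is not part of the CRD itself.
# Assumptions (hypothetical, not taken from the original manifest): the
# resource name, the namespace, the label key/value in the selector, the
# extension Service name, and the target port 8000.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: example-pool            # hypothetical name
  namespace: default            # hypothetical namespace
spec:
  # Required. Simple label map; matches model server Pods in this namespace only.
  selector:
    app: example-model-server   # hypothetical label
  # Required. Must be in the range 1 to 65535.
  targetPortNumber: 8000        # hypothetical port
  # Required. Only `name` is mandatory inside extensionRef.
  extensionRef:
    name: example-endpoint-picker   # hypothetical Service name
    # The fields below repeat the schema defaults for clarity and can be omitted.
    group: ""                   # Core API group
    kind: Service               # ExternalName Services must not be supported
    portNumber: 9002            # value implementations SHOULD infer for Services
    failureMode: FailClose      # or FailOpen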