go: downloading github.com/kubeflow/trainer/v2 v2.1.0
go: downloading k8s.io/apimachinery v0.34.1
go: downloading sigs.k8s.io/kueue v0.6.2
go: downloading sigs.k8s.io/jobset v0.10.1
go: downloading github.com/onsi/gomega v1.38.2
go: downloading k8s.io/api v0.34.1
go: downloading github.com/matoous/go-nanoid/v2 v2.1.0
go: downloading github.com/prometheus/client_golang v1.23.2
go: downloading github.com/prometheus/common v0.67.2
go: downloading github.com/minio/minio-go/v7 v7.0.98
go: downloading github.com/kubeflow/training-operator v1.7.0
go: downloading github.com/openshift/api v0.0.0-20251124165233-999c45c0835a
go: downloading github.com/openshift/client-go v0.0.0-20251015124057-db0dee36e235
go: downloading github.com/openshift/kueue-operator v0.0.0-20251202204851-958c48004dad
go: downloading github.com/operator-framework/api v0.36.0
go: downloading github.com/operator-framework/operator-lifecycle-manager v0.38.0
go: downloading github.com/ray-project/kuberay/ray-operator v1.3.0
go: downloading k8s.io/client-go v0.34.1
go: downloading github.com/prometheus/client_model v0.6.2
go: downloading go.yaml.in/yaml/v2 v2.4.3
go: downloading google.golang.org/protobuf v1.36.10
go: downloading github.com/json-iterator/go v1.1.12
go: downloading github.com/go-ini/ini v1.67.0
go: downloading github.com/dustin/go-humanize v1.0.1
go: downloading github.com/google/uuid v1.6.0
go: downloading github.com/klauspost/compress v1.18.2
go: downloading github.com/klauspost/crc32 v1.3.0
go: downloading github.com/minio/crc64nvme v1.1.1
go: downloading github.com/minio/md5-simd v1.1.2
go: downloading go.yaml.in/yaml/v3 v3.0.4
go: downloading golang.org/x/net v0.48.0
go: downloading github.com/google/go-cmp v0.7.0
go: downloading k8s.io/kube-openapi v0.0.0-20250710124328-f3f2b991d03b
go: downloading sigs.k8s.io/controller-runtime v0.22.4
go: downloading github.com/gogo/protobuf v1.3.2
go: downloading k8s.io/klog/v2 v2.130.1
go: downloading sigs.k8s.io/randfill v1.0.0
go: downloading github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f
go: downloading golang.org/x/oauth2 v0.32.0
go: downloading sigs.k8s.io/yaml v1.6.0
go: downloading k8s.io/utils v0.0.0-20251002143259-bc988d571ff4
go: downloading gopkg.in/inf.v0 v0.9.1
go: downloading sigs.k8s.io/structured-merge-diff/v6 v6.3.0
go: downloading github.com/sirupsen/logrus v1.9.3
go: downloading golang.org/x/crypto v0.46.0
go: downloading github.com/rs/xid v1.6.0
go: downloading github.com/tinylib/msgp v1.6.1
go: downloading github.com/klauspost/cpuid/v2 v2.2.11
go: downloading golang.org/x/sys v0.39.0
go: downloading github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
go: downloading github.com/modern-go/reflect2 v1.0.3-0.20250322232337-35a7c28c31ee
go: downloading sigs.k8s.io/json v0.0.0-20241014173422-cfa47c3a1cc8
go: downloading volcano.sh/apis v1.13.1-0.20251028070205-46d20c0699e7
go: downloading github.com/jpillora/backoff v1.0.0
go: downloading github.com/go-logr/logr v1.4.3
go: downloading github.com/blang/semver/v4 v4.0.0
go: downloading github.com/emicklei/go-restful/v3 v3.12.2
go: downloading github.com/go-openapi/jsonreference v0.21.0
go: downloading github.com/go-openapi/swag v0.23.1
go: downloading github.com/google/gnostic-models v0.7.0
go: downloading github.com/philhofer/fwd v1.2.0
go: downloading github.com/fxamacker/cbor/v2 v2.9.0
go: downloading golang.org/x/text v0.32.0
go: downloading github.com/beorn7/perks v1.0.1
go: downloading github.com/cespare/xxhash/v2 v2.3.0
go: downloading github.com/prometheus/procfs v0.16.1
go: downloading github.com/go-openapi/jsonpointer v0.21.1
go: downloading github.com/mailru/easyjson v0.9.0
go: downloading github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822
go: downloading github.com/x448/float16 v0.8.4
go: downloading gopkg.in/yaml.v3 v3.0.1
go: downloading github.com/spf13/pflag v1.0.10
go: downloading golang.org/x/term v0.38.0
go: downloading golang.org/x/time v0.14.0
go: downloading gopkg.in/evanphx/json-patch.v4 v4.12.0
go: downloading github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc
go: downloading github.com/pkg/errors v0.9.1
go: downloading github.com/josharian/intern v1.0.0
Warning: Failed to get DSC: the server could not find the requested resource
Initial Kueue managementState:
=== RUN   TestDefaultClusterTrainingRuntimes
    cluster_training_runtimes_test.go:38: No running rhods-operator pod found in redhat-ods-operator namespace
--- FAIL: TestDefaultClusterTrainingRuntimes (0.01s)
=== RUN   TestDefaultTrainingHubRuntimesMatchDefaultClusterRuntimes
    cluster_training_runtimes_test.go:142: CTR "training-hub-cpu" matches DefaultClusterTrainingRuntime "torch-distributed-cpu"
    cluster_training_runtimes_test.go:142: CTR "torch-distributed-cuda130-torch291-py312" matches DefaultClusterTrainingRuntime "torch-distributed"
    cluster_training_runtimes_test.go:142: CTR "torch-distributed-cpu-torch291-py312" matches DefaultClusterTrainingRuntime "torch-distributed-cpu"
    cluster_training_runtimes_test.go:142: CTR "training-hub" matches DefaultClusterTrainingRuntime "torch-distributed"
    cluster_training_runtimes_test.go:142: CTR "training-hub-rocm" matches DefaultClusterTrainingRuntime "torch-distributed-rocm"
    cluster_training_runtimes_test.go:142: CTR "training-hub-th06-cuda130-torch291-py312" matches DefaultClusterTrainingRuntime "torch-distributed"
    cluster_training_runtimes_test.go:142: CTR "training-hub-th06-cpu-torch291-py312" matches DefaultClusterTrainingRuntime "torch-distributed-cpu"
    cluster_training_runtimes_test.go:142: CTR "training-hub-th06-rocm64-torch291-py312" matches DefaultClusterTrainingRuntime "torch-distributed-rocm"
    cluster_training_runtimes_test.go:142: CTR "torch-distributed-rocm64-torch291-py312" matches DefaultClusterTrainingRuntime "torch-distributed-rocm"
    cluster_training_runtimes_test.go:145: All CTRs match their DefaultClusterTrainingRuntime counterparts!
--- PASS: TestDefaultTrainingHubRuntimesMatchDefaultClusterRuntimes (0.01s)
=== RUN   TestRunTrainJobWithDefaultClusterTrainingRuntimes
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Sanity'
--- SKIP: TestRunTrainJobWithDefaultClusterTrainingRuntimes (0.00s)
=== RUN   TestJobSetWorkflow
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Sanity'
--- SKIP: TestJobSetWorkflow (0.00s)
=== RUN   TestFailedJobSetWorkflow
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Sanity'
--- SKIP: TestFailedJobSetWorkflow (0.00s)
=== RUN   TestKubeflowSdkSanity
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Sanity'
--- SKIP: TestKubeflowSdkSanity (0.00s)
=== RUN   TestKubeflowSdkKueueIntegration
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Sanity'
--- SKIP: TestKubeflowSdkKueueIntegration (0.00s)
=== RUN   TestSftTrainingHubSingleNodeSingleGPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestSftTrainingHubSingleNodeSingleGPU (0.00s)
=== RUN   TestOsftTrainingHubSingleNodeSingleGPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestOsftTrainingHubSingleNodeSingleGPU (0.00s)
=== RUN   TestLoraTrainingHubSingleNodeSingleGPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestLoraTrainingHubSingleNodeSingleGPU (0.00s)
=== RUN   TestOsftTrainingHubMultiNodeMultiGPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestOsftTrainingHubMultiNodeMultiGPU (0.00s)
=== RUN   TestLoraTrainingHubMultiNodeMultiGPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestLoraTrainingHubMultiNodeMultiGPU (0.00s)
=== RUN   TestSftTrainingHubMultiNodeMultiGPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestSftTrainingHubMultiNodeMultiGPU (0.00s)
=== RUN   TestRhaiTrainingProgressionCPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Tier1'
--- SKIP: TestRhaiTrainingProgressionCPU (0.00s)
=== RUN   TestRhaiJitCheckpointingCPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Tier1'
--- SKIP: TestRhaiJitCheckpointingCPU (0.00s)
=== RUN   TestRhaiFeaturesCPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Tier1'
--- SKIP: TestRhaiFeaturesCPU (0.00s)
=== RUN   TestRhaiTrainingProgressionCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiTrainingProgressionCuda (0.00s)
=== RUN   TestRhaiJitCheckpointingCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiJitCheckpointingCuda (0.00s)
=== RUN   TestRhaiFeaturesCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiFeaturesCuda (0.00s)
=== RUN   TestRhaiTrainingProgressionRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestRhaiTrainingProgressionRocm (0.00s)
=== RUN   TestRhaiJitCheckpointingRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestRhaiJitCheckpointingRocm (0.00s)
=== RUN   TestRhaiFeaturesRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestRhaiFeaturesRocm (0.00s)
=== RUN   TestRhaiTrainingProgressionMultiGpuCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiTrainingProgressionMultiGpuCuda (0.00s)
=== RUN   TestRhaiJitCheckpointingMultiGpuCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiJitCheckpointingMultiGpuCuda (0.00s)
=== RUN   TestRhaiFeaturesMultiGpuCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiFeaturesMultiGpuCuda (0.00s)
=== RUN   TestRhaiTrainingProgressionMultiGpuRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestRhaiTrainingProgressionMultiGpuRocm (0.00s)
=== RUN   TestRhaiJitCheckpointingMultiGpuRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestRhaiJitCheckpointingMultiGpuRocm (0.00s)
=== RUN   TestRhaiFeaturesMultiGpuRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestRhaiFeaturesMultiGpuRocm (0.00s)
=== RUN   TestTrainingFailureScenarios
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestTrainingFailureScenarios (0.00s)
=== RUN   TestTorchrunTrainingFailure
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestTorchrunTrainingFailure (0.00s)
=== RUN   TestRhaiS3CheckpointingCPU
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Tier1'
--- SKIP: TestRhaiS3CheckpointingCPU (0.00s)
=== RUN   TestRhaiS3FsdpFullStateCheckpointingCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiS3FsdpFullStateCheckpointingCuda (0.00s)
=== RUN   TestRhaiS3FsdpFullStateCheckpointingMultiProcessCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiS3FsdpFullStateCheckpointingMultiProcessCuda (0.00s)
=== RUN   TestRhaiS3FsdpSharedStateCheckpointingCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiS3FsdpSharedStateCheckpointingCuda (0.00s)
=== RUN   TestRhaiS3FsdpSharedStateCheckpointingMultiGpuCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiS3FsdpSharedStateCheckpointingMultiGpuCuda (0.00s)
=== RUN   TestRhaiS3DeepspeedStage0CheckpointingCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiS3DeepspeedStage0CheckpointingCuda (0.00s)
=== RUN   TestRhaiS3DeepspeedStage0CheckpointingMultiGpuCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestRhaiS3DeepspeedStage0CheckpointingMultiGpuCuda (0.00s)
=== RUN   TestPyTorchDDPMultiNodeMultiCPUWithTorchCuda28
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Tier1'
--- SKIP: TestPyTorchDDPMultiNodeMultiCPUWithTorchCuda28 (0.00s)
=== RUN   TestPyTorchDDPSingleNodeSingleGPUWithTorchCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestPyTorchDDPSingleNodeSingleGPUWithTorchCuda (0.00s)
=== RUN   TestPyTorchDDPSingleNodeMultiGPUWithTorchCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestPyTorchDDPSingleNodeMultiGPUWithTorchCuda (0.00s)
=== RUN   TestPyTorchDDPMultiNodeSingleGPUWithTorchCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestPyTorchDDPMultiNodeSingleGPUWithTorchCuda (0.00s)
=== RUN   TestPyTorchDDPMultiNodeMultiGPUWithTorchCuda
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-CUDA'
--- SKIP: TestPyTorchDDPMultiNodeMultiGPUWithTorchCuda (0.00s)
=== RUN   TestPyTorchDDPSingleNodeSingleGPUWithTorchRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestPyTorchDDPSingleNodeSingleGPUWithTorchRocm (0.00s)
=== RUN   TestPyTorchDDPSingleNodeMultiGPUWithTorchRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestPyTorchDDPSingleNodeMultiGPUWithTorchRocm (0.00s)
=== RUN   TestPyTorchDDPMultiNodeSingleGPUWithTorchRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestPyTorchDDPMultiNodeSingleGPUWithTorchRocm (0.00s)
=== RUN   TestPyTorchDDPMultiNodeMultiGPUWithTorchRocm
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'KFTO-ROCm'
--- SKIP: TestPyTorchDDPMultiNodeMultiGPUWithTorchRocm (0.00s)
=== RUN   TestKueueDefaultLocalQueueLabelInjection
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Sanity'
--- SKIP: TestKueueDefaultLocalQueueLabelInjection (0.00s)
=== RUN   TestKueueWorkloadPreemptionSuspendsTrainJob
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Sanity'
--- SKIP: TestKueueWorkloadPreemptionSuspendsTrainJob (0.00s)
=== RUN   TestKueueWorkloadInadmissibleWithNonExistentLocalQueue
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Sanity'
--- SKIP: TestKueueWorkloadInadmissibleWithNonExistentLocalQueue (0.00s)
=== RUN   TestSetupUpgradeTrainJob
    trainer_kueue_upgrade_training_test.go:57: Skip due to issue RHOAIENG-48867
--- SKIP: TestSetupUpgradeTrainJob (0.00s)
=== RUN   TestRunUpgradeTrainJob
    trainer_kueue_upgrade_training_test.go:125: Skip due to issue RHOAIENG-48867
--- SKIP: TestRunUpgradeTrainJob (0.00s)
=== RUN   TestSetupSpecificRuntimeUpgradeTrainJob
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Pre-Upgrade'
--- SKIP: TestSetupSpecificRuntimeUpgradeTrainJob (0.00s)
=== RUN   TestRunSpecificRuntimeUpgradeTrainJob
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Post-Upgrade'
--- SKIP: TestRunSpecificRuntimeUpgradeTrainJob (0.00s)
=== RUN   TestKubeflowTrainerSmoke
    trainer_smoke_test.go:28:
        Unexpected error:
            <*errors.StatusError | 0xc000625e00>:
            the server could not find the requested resource
            {
                ErrStatus: {
                    TypeMeta: {Kind: "", APIVersion: ""},
                    ListMeta: {
                        SelfLink: "",
                        ResourceVersion: "",
                        Continue: "",
                        RemainingItemCount: nil,
                    },
                    Status: "Failure",
                    Message: "the server could not find the requested resource",
                    Reason: "NotFound",
                    Details: {
                        Name: "",
                        Group: "",
                        Kind: "",
                        UID: "",
                        Causes: [
                            {
                                Type: "UnexpectedServerResponse",
                                Message: "404 page not found",
                                Field: "",
                            },
                        ],
                        RetryAfterSeconds: 0,
                    },
                    Code: 404,
                },
            }
        occurred
--- FAIL: TestKubeflowTrainerSmoke (0.00s)
=== RUN   TestSetupTrainingRuntime
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Pre-Upgrade'
--- SKIP: TestSetupTrainingRuntime (0.00s)
=== RUN   TestVerifyTrainingRuntime
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Post-Upgrade'
--- SKIP: TestVerifyTrainingRuntime (0.00s)
=== RUN   TestSetupSleepTrainJob
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Pre-Upgrade'
--- SKIP: TestSetupSleepTrainJob (0.00s)
=== RUN   TestVerifySleepTrainJob
    test_tag.go:37: Test tier 'Smoke' doesn't match expected tier 'Post-Upgrade'
--- SKIP: TestVerifySleepTrainJob (0.00s)
FAIL
TearDown: Setting kueue managementState to Removed in DataScienceCluster...
TearDown: Failed to set Kueue to Removed: TearDown: failed to set kueue to Removed: the server could not find the requested resource
ok github.com/opendatahub-io/distributed-workloads/tests/trainer 0.078s