Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add webhook framework #713

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ go.work.sum

.DS_Store
__pycache__
*.xml

# Python virtual environment directory
.venv
Expand Down
22 changes: 20 additions & 2 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -52,9 +52,22 @@ help: ## Display this help.

##@ Development

GINKGO_VERSION ?= $(shell go list -m -f '{{.Version}}' github.com/onsi/ginkgo/v2)
INTEGRATION_TARGET ?= ./test/integration/...

GINKGO = $(shell pwd)/bin/ginkgo
.PHONY: ginkgo
ginkgo: ## Download ginkgo locally if necessary.
test -s $(LOCALBIN)/ginkgo || \
GOBIN=$(LOCALBIN) go install github.com/onsi/ginkgo/v2/ginkgo@$(GINKGO_VERSION)

.PHONY: manifests
manifests: controller-gen ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects.
$(CONTROLLER_GEN) rbac:roleName=controller-manager-role crd:maxDescLen=0,generateEmbeddedObjectMeta=true webhook paths="./..." output:crd:artifacts:config=config/crd/bases
$(CONTROLLER_GEN) \
rbac:roleName=controller-manager-role output:rbac:artifacts:config=config/rbac/controller-manager \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the past, we maintain this manually. some internal users want to use the component separately. so we change the folder structure to component based. If we use config/rbac/controller-manager. other components like orchestration will land into same subfolder?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert since kubebuilder can't separate the roles right now, and maybe unable in a long time. I added the roles manually as well.

crd:maxDescLen=0,generateEmbeddedObjectMeta=true output:crd:artifacts:config=config/crd/bases \
webhook output:webhook:artifacts:config=config/webhook \
paths="./..."

.PHONY: generate
generate: controller-gen ## Generate code containing DeepCopy, DeepCopyInto, and DeepCopyObject method implementations.
Expand All @@ -81,6 +94,11 @@ vet: ## Run go vet against code.
test: manifests generate fmt vet envtest ## Run tests.
KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" go test $$(go list ./... | grep -v /e2e) -coverprofile cover.out

.PHONY: test-integration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great!

test-integration: manifests fmt vet envtest ginkgo ## Run integration tests.
KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" \
$(GINKGO) --junit-report=junit.xml --output-dir=$(ARTIFACTS) -v $(INTEGRATION_TARGET)

# Utilize Kind or modify the e2e tests to load the image locally, enabling compatibility with other vendors.
.PHONY: test-e2e # Run the e2e tests against a Kind k8s instance that is spun up.
test-e2e:
Expand Down Expand Up @@ -272,7 +290,7 @@ GOLANGCI_LINT = $(LOCALBIN)/golangci-lint-$(GOLANGCI_LINT_VERSION)

## Tool Versions
KUSTOMIZE_VERSION ?= v5.3.0
CONTROLLER_TOOLS_VERSION ?= v0.14.0
CONTROLLER_TOOLS_VERSION ?= v0.16.1
ENVTEST_VERSION ?= release-0.17
GOLANGCI_LINT_VERSION ?= v1.57.2

Expand Down
4 changes: 2 additions & 2 deletions api/orchestration/v1alpha1/kvcache_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,9 @@ type MetadataStorage struct {

type CacheSpec struct {
// Replicas is the number of kvcache pods to deploy
// +kubebuilder:validation:Required
// +kubebuilder:default:=3
Replicas int `json:"replicas,omitempty"`
// +optional
Replicas *int `json:"replicas,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leads to a compile error so I changed here together, a separate PR here #740


// represent the kvcache's image
// +kubebuilder:validation:Optional
Expand Down
5 changes: 5 additions & 0 deletions api/orchestration/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

41 changes: 34 additions & 7 deletions cmd/controllers/main.go
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*
Copyright 2024.
Copyright 2024 The Aibrix Team.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
Expand Down Expand Up @@ -32,6 +32,7 @@ import (
"github.com/vllm-project/aibrix/pkg/features"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

"github.com/vllm-project/aibrix/pkg/cert"
"k8s.io/apimachinery/pkg/runtime"
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
_ "k8s.io/client-go/plugin/pkg/client/auth"
Expand All @@ -48,6 +49,7 @@ import (
"github.com/vllm-project/aibrix/pkg/cache"
"github.com/vllm-project/aibrix/pkg/config"
"github.com/vllm-project/aibrix/pkg/controller"
apiwebhook "github.com/vllm-project/aibrix/pkg/webhook"
//+kubebuilder:scaffold:imports
)

Expand Down Expand Up @@ -233,16 +235,21 @@ func main() {
cache.NewCache(config, stopCh, nil)
}

// Initialize controllers
controller.Initialize()
certsReady := make(chan struct{})

// Kind controller registration is encapsulated inside the pkg/controller/controller.go
// So here we can use more clean registration flow and there's no need to change logics in future.
if err = controller.SetupWithManager(mgr, runtimeConfig); err != nil {
setupLog.Error(err, "unable to setup controller")
if err = cert.CertsManager(mgr, leaderElectionNamespace, certsReady); err != nil {
setupLog.Error(err, "unable to setup cert rotation")
os.Exit(1)
}

// Initialize controllers
controller.Initialize()

// Cert won't be ready until manager starts, so start a goroutine here which
// will block until the cert is ready before setting up the controllers.
// Controllers who register after manager starts will start directly.
go setupControllers(mgr, runtimeConfig, certsReady)

//+kubebuilder:scaffold:builder

if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
Expand All @@ -260,3 +267,23 @@ func main() {
os.Exit(1)
}
}

func setupControllers(mgr ctrl.Manager, runtimeConfig config.RuntimeConfig, certsReady chan struct{}) {
// The controllers won't work until the webhooks are operating,
// and the webhook won't work until the certs are all in places.
setupLog.Info("waiting for the cert generation to complete")
<-certsReady
setupLog.Info("certs ready")

// Kind controller registration is encapsulated inside the pkg/controller/controller.go
// So here we can use more clean registration flow and there's no need to change logics in future.
if err := controller.SetupWithManager(mgr, runtimeConfig); err != nil {
setupLog.Error(err, "unable to setup controller")
os.Exit(1)
}

if err := apiwebhook.SetupBackendRuntimeWebhook(mgr); err != nil {
setupLog.Error(err, "unable to create webhook", "webhook", "Model")
os.Exit(1)
}
}
9 changes: 8 additions & 1 deletion config/default/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,11 @@ resources:
- ../dependency/kuberay-operator
# [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in
# crd/kustomization.yaml
#- ../webhook
- ../webhook

# [INTERNALCERT]
- ../internalcert

# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER'. 'WEBHOOK' components are required.
#- ../certmanager
# [PROMETHEUS] To enable prometheus monitor, uncomment all sections with 'PROMETHEUS'.
Expand All @@ -54,6 +58,9 @@ patches:
- kind: ServiceAccount
name: aibrix-kuberay-operator-leader-election
namespace: aibrix-system
# [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in
# crd/kustomization.yaml
- path: manager_webhook_patch.yaml

images:
- name: controller
Expand Down
23 changes: 23 additions & 0 deletions config/default/manager_webhook_patch.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: controller-manager
namespace: system
spec:
template:
spec:
containers:
- name: manager
ports:
- containerPort: 9443
name: webhook-server
protocol: TCP
volumeMounts:
- mountPath: /tmp/k8s-webhook-server/serving-certs
name: cert
readOnly: true
volumes:
- name: cert
secret:
defaultMode: 420
secretName: webhook-server-cert
29 changes: 29 additions & 0 deletions config/default/webhookcainjection_patch.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# This patch add annotation to admission webhook config and
# CERTIFICATE_NAMESPACE and CERTIFICATE_NAME will be substituted by kustomize
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
labels:
app.kubernetes.io/name: mutatingwebhookconfiguration
app.kubernetes.io/instance: mutating-webhook-configuration
app.kubernetes.io/component: webhook
app.kubernetes.io/created-by: aibrix
app.kubernetes.io/part-of: aibrix
app.kubernetes.io/managed-by: kustomize
name: mutating-webhook-configuration
annotations:
cert-manager.io/inject-ca-from: CERTIFICATE_NAMESPACE/CERTIFICATE_NAME
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
labels:
app.kubernetes.io/name: validatingwebhookconfiguration
app.kubernetes.io/instance: validating-webhook-configuration
app.kubernetes.io/component: webhook
app.kubernetes.io/created-by: aibrix
app.kubernetes.io/part-of: aibrix
app.kubernetes.io/managed-by: kustomize
name: validating-webhook-configuration
annotations:
cert-manager.io/inject-ca-from: CERTIFICATE_NAMESPACE/CERTIFICATE_NAME
2 changes: 2 additions & 0 deletions config/internalcert/kustomization.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
resources:
- secret.yaml
5 changes: 5 additions & 0 deletions config/internalcert/secret.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
apiVersion: v1
kind: Secret
metadata:
name: webhook-server-cert
namespace: system
19 changes: 19 additions & 0 deletions config/rbac/controller-manager/role.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -289,3 +289,22 @@ rules:
- get
- patch
- update
- apiGroups:
- ""
resources:
- secrets
verbs:
- get
- list
- update
- watch
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
verbs:
- get
- list
- update
- watch
6 changes: 6 additions & 0 deletions config/webhook/kustomization.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
resources:
- manifests.yaml
- service.yaml

configurations:
- kustomizeconfig.yaml
22 changes: 22 additions & 0 deletions config/webhook/kustomizeconfig.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# the following config is for teaching kustomize where to look at when substituting nameReference.
# It requires kustomize v2.1.0 or newer to work properly.
nameReference:
- kind: Service
version: v1
fieldSpecs:
- kind: MutatingWebhookConfiguration
group: admissionregistration.k8s.io
path: webhooks/clientConfig/service/name
- kind: ValidatingWebhookConfiguration
group: admissionregistration.k8s.io
path: webhooks/clientConfig/service/name

namespace:
- kind: MutatingWebhookConfiguration
group: admissionregistration.k8s.io
path: webhooks/clientConfig/service/namespace
create: true
- kind: ValidatingWebhookConfiguration
group: admissionregistration.k8s.io
path: webhooks/clientConfig/service/namespace
create: true
52 changes: 52 additions & 0 deletions config/webhook/manifests.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
---
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
name: mutating-webhook-configuration
webhooks:
- admissionReviewVersions:
- v1
clientConfig:
service:
name: webhook-service
namespace: system
path: /mutate-model-aibrix-ai-v1alpha1-modeladapter
failurePolicy: Fail
name: mmodeladapter.kb.io
rules:
- apiGroups:
- model.aibrix.ai
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- modeladapters
sideEffects: None
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
name: validating-webhook-configuration
webhooks:
- admissionReviewVersions:
- v1
clientConfig:
service:
name: webhook-service
namespace: system
path: /validate-model-aibrix-ai-v1alpha1-modeladapter
failurePolicy: Fail
name: vmodeladapter.kb.io
rules:
- apiGroups:
- model.aibrix.ai
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- modeladapters
sideEffects: None
19 changes: 19 additions & 0 deletions config/webhook/service.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
apiVersion: v1
kind: Service
metadata:
labels:
app.kubernetes.io/name: service
app.kubernetes.io/instance: webhook-service
app.kubernetes.io/component: webhook
app.kubernetes.io/created-by: aibrix
app.kubernetes.io/part-of: aibrix
app.kubernetes.io/managed-by: kustomize
name: webhook-service
namespace: system
spec:
ports:
- port: 443
protocol: TCP
targetPort: 9443
selector:
control-plane: controller-manager
Loading
Loading