Hermetic tests are a fast way to verify functionality end to end without requiring an integration test environment. In addition to the basic test we have today, we should add the following:
Test framework improvements:
Run a fake k8s API server so we don't need to fake the reconcilers
Verify metrics (not added yet)
Test cases:
Test when model is not found in LLMService
Test when ModelServerPool is not found
Test when no backend pods are available
Test invalid request (e.g., doesn't contain "model")
Test backend server error: the client should receive an error with an appropriate error code
Verify traffic split
Test algorithm
A sheddable request succeeds when resources are available and is dropped when resources are constrained
Verify min KV cache algo without LoRA
Verify LoRA affinity algo for the "warm up" case (when no pod has loaded any LoRA yet). This requires sending multiple requests and verifying that later requests stick to the same backend pod.
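To make the "warm up" stickiness check concrete, here is a minimal sketch in Go. Everything in it is hypothetical and for illustration only: the `pod` type, the `pick` function, and the `stickyAfterWarmup` helper are not the project's scheduler or test framework. `pick` is a toy stand-in that prefers pods already holding the adapter and otherwise falls back to the lowest KV cache utilization. The real test would send requests through the hermetic server and inspect which backend was selected.

```go
package main

import "errors"

// pod is a hypothetical stand-in for a backend model server pod.
type pod struct {
	name     string
	kvCache  float64         // fraction of KV cache in use, 0.0 to 1.0
	adapters map[string]bool // LoRA adapters currently loaded on this pod
}

// pick is a toy scheduler (an assumption, not the project's algorithm).
// It prefers pods that already have the adapter loaded; among the
// candidates it chooses the lowest KV cache utilization. The chosen pod
// is then marked as having the adapter loaded.
func pick(pods []*pod, adapter string) (*pod, error) {
	if len(pods) == 0 {
		return nil, errors.New("no backend pods available")
	}
	var candidates []*pod
	for _, p := range pods {
		if p.adapters[adapter] { // reading a nil map is safe and yields false
			candidates = append(candidates, p)
		}
	}
	if len(candidates) == 0 {
		// Warm-up case: no pod has loaded this adapter yet.
		candidates = pods
	}
	best := candidates[0]
	for _, p := range candidates[1:] {
		if p.kvCache < best.kvCache {
			best = p
		}
	}
	if best.adapters == nil {
		best.adapters = make(map[string]bool)
	}
	best.adapters[adapter] = true
	return best, nil
}

// stickyAfterWarmup sends n requests for the same adapter and reports the
// pod chosen by the first request and whether every later request landed
// on that same pod.
func stickyAfterWarmup(pods []*pod, adapter string, n int) (string, bool, error) {
	first, err := pick(pods, adapter)
	if err != nil {
		return "", false, err
	}
	for i := 1; i < n; i++ {
		next, err := pick(pods, adapter)
		if err != nil {
			return "", false, err
		}
		if next.name != first.name {
			return first.name, false, nil
		}
	}
	return first.name, true, nil
}
```

A test built on this shape would start with pods that have no adapters loaded, call `stickyAfterWarmup` with several requests, and assert that the returned flag is true. The same fixture also covers the "no backend pods are available" case, since `pick` returns an error for an empty pod list.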
@danehans thanks! Please sync with @BenjaminBraunDev as he started with some initial work. There are a lot of tests to add so you can divide and conquer!
@liu-cong thanks for the heads-up. @BenjaminBraunDev can you provide a status update when you have a moment? Do you have a local branch or WIP PR that can be referenced to see how we can divide and conquer this issue?