Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add model API #299

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Add model API #299

wants to merge 7 commits into from

Conversation

varungup90
Copy link
Collaborator

@varungup90 varungup90 commented Oct 16, 2024

Address #302

@varungup90 varungup90 changed the title [WIP] Add model API Add model API Oct 16, 2024
@Jeffwan
Copy link
Collaborator

Jeffwan commented Oct 18, 2024

I will finish the code review by tomorrow. A little bit busy on some miscs

@Jeffwan Jeffwan self-requested a review October 18, 2024 00:23
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.

// ModelSpec defines the desired state of Model
type ModelSpec struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this is very thin layer. If we use PodTemplate, do we still have control on the sidecar and engine?

  1. should we inject initContainer or sidecar container inside the user defined template?
  2. What if user uses SGLang in pod template but specify the vLLM as the engine type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. If we have requirements then we should definitely inject our own init or sidecar containers.
  2. Good point, need to add checks to prevent such disprepancy.

@Jeffwan
Copy link
Collaborator

Jeffwan commented Oct 18, 2024

Overall code change looks good to me. However, we need some discussion on the model API abstraction. We should come up enough future features and take those insights into the consideration when we build this API

@varungup90
Copy link
Collaborator Author

Overall code change looks good to me. However, we need some discussion on the model API abstraction. We should come up enough future features and take those insights into the consideration when we build this API

Sure. I am thinking pretty much all features are associated with base model, such as reroute (to another deployment if current base model is unavailable), retry, or default behavior for rpm/tpm, routing algorithm can be defined for base model and user can override as required.

Right now with pod template, it is pretty thin layer with basic checks and can be extended as requirement evolves.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants