Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LLaVA OneVision model support #7693

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from
Draft

Add LLaVA OneVision model support #7693

wants to merge 5 commits into from

Conversation

RyanJDick
Copy link
Collaborator

Summary

This PR adds support for the LLaVA OneVision model type:

  • The recommended model is available under the "Starter Models" list.
  • The LLaVA OneVision VLLM invocation can be used for inference. It supports 0-3 input images along with an input prompt.

Example

image

Output:

The image is a digital illustration that depicts a surreal landscape with a prominent water tower in the foreground. The tower is tall and cylindrical, with a platform at the top that has a railing. It is surrounded by a grassy field with small white flowers. The sky is filled with various celestial bodies, including a large moon and several smaller moons, creating a dreamlike atmosphere. The clouds are fluffy and scattered across the sky, and the overall color palette is warm, with shades of orange, pink, and blue dominating the scene. The art style is reminiscent of a science fiction or fantasy genre, with a focus on imaginative and fantastical elements.

Related Issues / Discussions

N/A

Remaining Work

  • Add the new model type to the frontend so that it appears in the Models tab.
  • Add a model identifier input to the LLaVA OneVision VLLM. Or, only support a single model and raise if it's not installed with a reference to the starter model.

QA Instructions

  • Test model installation via starter model list
  • Test that installed LLaVA models appear in the model list.
  • Test inference with 0 images
  • Test inference with 1 image
  • Test inference with 2 images

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions github-actions bot added python PRs that change python files invocations PRs that change invocations backend PRs that change backend files labels Feb 26, 2025
@jazzhaiku jazzhaiku self-requested a review February 27, 2025 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend PRs that change backend files invocations PRs that change invocations python PRs that change python files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant