Incompatibility between Atomic Agents and Instructor when using Gemini models #56

Open
marcortola opened this issue Jan 8, 2025 · 2 comments

@marcortola (Contributor)

When using Instructor with Gemini models, there is an incompatibility between Atomic Agents and Instructor: Atomic Agents always includes the model parameter in the create method call, while Instructor explicitly raises an exception when this parameter is provided for Gemini models.

This incompatibility prevents users from using Atomic Agents with Instructor when working with Gemini models, limiting the functionality and integration possibilities.

Note that while we can still use Gemini like this:

import openai

client = openai.OpenAI(
    api_key=gemini_api_key,
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

It would be safer in the long term to use the recommended Instructor approach as stated in its docs: https://python.useinstructor.com/integrations/google/

import instructor
import google.generativeai as genai

client = instructor.from_gemini(
    client=genai.GenerativeModel(
        model_name="models/gemini-1.5-flash-latest",
    ),
    mode=instructor.Mode.GEMINI_JSON,
)
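
For completeness, a structured call with that client would look roughly like the following sketch (the UserInfo schema and prompt are illustrative, not taken from the docs). Note that no model argument is passed at call time, because the model is fixed when the client is created:

from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

# No model= here: the model was already bound in from_gemini().
user = client.chat.completions.create(
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
    response_model=UserInfo,
)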

Current Behavior

  1. Atomic Agents sets the model parameter in all cases:
response = self.client.chat.completions.create(
    messages=messages,
    model=self.model,
    response_model=response_model,
    temperature=self.temperature,
    max_tokens=self.max_tokens,
)
  2. Instructor raises an exception when the model parameter is provided with Gemini models (see the sketch after this list):
    https://github.com/instructor-ai/instructor/blob/887316c8267df621b985c736a2ebba504edf2399/instructor/process_response.py#L446
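
To make the failure mode concrete, here is a minimal sketch of what effectively happens (reusing the from_gemini client and UserInfo schema from above; the exact exception and message are those raised by the Instructor code linked above):

# Sketch: with a from_gemini client, this is essentially the call
# Atomic Agents ends up making, and it is the call Instructor rejects.
client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    model="models/gemini-1.5-flash-latest",  # providing model= here triggers the exception
    response_model=UserInfo,
)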

Proposed Solution

Modify the base agent implementation to conditionally include the model parameter based on the client mode, without introducing any BC break:

# (`Mode` here refers to instructor.Mode)
params = {
    'messages': messages,
    'response_model': response_model,
    'temperature': self.temperature,
    'max_tokens': self.max_tokens,
}

if self.client.mode != Mode.GEMINI_JSON:
    params['model'] = self.model

return self.client.chat.completions.create(**params)

Note: While this is the current behavior in Instructor, it might be worth considering a longer-term solution where Instructor handles the model parameter more gracefully (e.g., with a warning instead of an exception) for Gemini models.

I can prepare a PR if this approach is agreed upon.

@KennyVaneetvelde (Member) commented Jan 8, 2025

@marcortola I was looking into this, but then decided not to make any changes, since Google provides an OpenAI-compatible API nowadays, which means you can just use the OpenAI provider; see: https://github.com/BrainBlend-AI/atomic-agents/blob/main/atomic-examples/quickstart/quickstart/4_basic_chatbot_different_providers.py
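
For reference, a rough sketch of that approach (the BaseAgent/BaseAgentConfig names and import path follow the linked quickstart example; treat this as an approximation rather than the exact code there):

import instructor
import openai
from atomic_agents.agents.base_agent import BaseAgent, BaseAgentConfig

# Gemini exposed through its OpenAI-compatible endpoint, wrapped with from_openai.
client = instructor.from_openai(
    openai.OpenAI(
        api_key=gemini_api_key,
        base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    ),
    mode=instructor.Mode.JSON,
)

agent = BaseAgent(
    config=BaseAgentConfig(
        client=client,
        model="gemini-1.5-flash",  # passed straight through to chat.completions.create
    )
)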

From what I tested it seemed to do what you need it to do, and I didn't want to muddy up the configuration or make the examples more confusing at the time, especially since with Gemini you would need to pass the model when executing .run, and that just didn't feel quite right to me. Plus, using the OpenAI-compatible API/client seemed to work great for images as well, see this example...

That being said, if you make sure it works for images, your solution makes sense as a temporary fix, because other people will inevitably try to use from_gemini and end up either reading this issue or creating a ticket before realizing (or being told) that it can be done with from_openai. And either way, using from_openai to use Gemini is counterintuitive.

For a longer-term solution, I'm starting to think it might be worth abstracting the client creation to provide a better developer experience. That way, internally we could still just use the cleanest/easiest approach, which would be from_openai, while still giving the developer a more explicit way to configure things...

Thoughts? Ideas? Pros? Cons?

@marcortola (Contributor, Author)

Thanks for considering this issue and providing your perspective, @KennyVaneetvelde.

Regarding using the OpenAI adapter, while it does seem to work, I was hesitant to rely on it in production since Google has marked it as experimental. They also explicitly recommend calling the Gemini API directly if you aren't already using the OpenAI libraries. I haven't done any latency tests yet, but that was my main reasoning for not going with the from_openai approach initially.

I agree that passing the model to the run method feels weird and not quite right; it's not an ideal developer experience. I also think that adding vendor-specific code to the Atomic Agents core codebase with my proposed solution would violate the OCP (open/closed principle) and could introduce maintenance difficulties down the line as we support more vendors (or if Instructor introduces breaking changes). While it could work as a temporary workaround, it's not ideal for the long term.

Abstracting client creation is an interesting idea. While it solves a slightly different problem, I can see how it would enable us to implement new clients without changing the Atomic Agents user-facing code. The Atomic Agents client creation API could act as a stable contract. As an example, I'm currently using this rudimentary client abstraction in my projects:

import os
from functools import cache
import anthropic
import instructor
import openai
from instructor import Instructor

class ClientFactory:
    @classmethod
    @cache
    def create(cls, model_name: str, streaming: bool = False) -> Instructor:
        if model_name.startswith("gemini-"):
            return cls._create_gemini_client(streaming)
        elif model_name.startswith("gpt-"):
            return cls._create_openai_client(streaming)
        elif model_name.startswith("claude-"):
            return cls._create_anthropic_client(streaming)
            
        raise ValueError(f"Unsupported model: {model_name}")

    @classmethod
    def _create_gemini_client(cls, streaming: bool) -> Instructor:
        api_key = os.getenv("GEMINI_API_KEY")
        if not api_key:
            raise ValueError("GEMINI_API_KEY environment variable is not set")

        if streaming:
            client = openai.AsyncOpenAI(
                api_key=api_key,
                base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
            )
        else:
            client = openai.OpenAI(
                api_key=api_key,
                base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
            )
        return instructor.from_openai(client, mode=instructor.Mode.JSON)

    @classmethod
    def _create_openai_client(cls, streaming: bool) -> Instructor:
        api_key = os.getenv("OPENAI_API_KEY")
        if not api_key:
            raise ValueError("OPENAI_API_KEY environment variable is not set")

        if streaming:
            client = openai.AsyncOpenAI(api_key=api_key)
        else:
            client = openai.OpenAI(api_key=api_key)
        return instructor.from_openai(client)

    @classmethod
    def _create_anthropic_client(cls, streaming: bool) -> Instructor:
        api_key = os.getenv("ANTHROPIC_API_KEY")
        if not api_key:
            raise ValueError("ANTHROPIC_API_KEY environment variable is not set")

        if streaming:
            client = anthropic.AsyncAnthropic(api_key=api_key)
        else:
            client = anthropic.Anthropic(api_key=api_key)
        return instructor.from_anthropic(client)


# Usage example:
client = ClientFactory.create("gpt-4-turbo-preview")  # or "gemini-1.5-pro", "claude-3-opus"
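
And, roughly, how that client would then be handed to an agent (again borrowing the BaseAgent/BaseAgentConfig names from the quickstart example, so this is only a sketch):

# The factory hides the vendor-specific client construction;
# the agent configuration stays identical across providers.
agent = BaseAgent(
    config=BaseAgentConfig(
        client=ClientFactory.create("claude-3-opus"),
        model="claude-3-opus",
    )
)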

Perhaps we could:

  1. Keep using the OpenAI adapter short-term since it works.
  2. Add a warning when users try the native Gemini approach, pointing them to the OpenAI adapter, or implement the conditional model approach I mention in the issue as a temporary fix (a rough sketch of such a warning follows this list).
  3. Work on a proper client abstraction and usage as a longer-term solution.
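
As an illustration of option 2, the warning could be as small as something like this (only a sketch; where exactly it would live in the base agent is up for discussion):

import warnings
from instructor import Mode

# Hypothetical check inside the base agent, before calling create():
if getattr(self.client, "mode", None) == Mode.GEMINI_JSON:
    warnings.warn(
        "Clients created with instructor.from_gemini() are not supported yet; "
        "consider using Gemini through its OpenAI-compatible endpoint via "
        "instructor.from_openai() instead.",
        stacklevel=2,
    )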

Would love to hear your thoughts on this approach.
