feat(openai): Update with_structured_output default for OpenAI #28947

Merged 20 commits on Jan 9, 2025
293 changes: 292 additions & 1 deletion libs/partners/openai/langchain_openai/chat_models/azure.py
@@ -18,13 +18,15 @@
)

import openai
from langchain_core.language_models import LanguageModelInput
from langchain_core.language_models.chat_models import LangSmithParams
from langchain_core.messages import BaseMessage
from langchain_core.outputs import ChatResult
from langchain_core.runnables import Runnable
from langchain_core.utils import from_env, secret_from_env
from langchain_core.utils.pydantic import is_basemodel_subclass
from pydantic import BaseModel, Field, SecretStr, model_validator
-from typing_extensions import Self
+from typing_extensions import Literal, Self

from langchain_openai.chat_models.base import BaseChatOpenAI

@@ -739,3 +741,292 @@ def _create_chat_result(
)

return chat_result

    def with_structured_output(
        self,
        schema: Optional[_DictOrPydanticClass] = None,
        *,
        method: Literal["function_calling", "json_mode", "json_schema"] = "json_schema",
        include_raw: bool = False,
        strict: Optional[bool] = None,
        **kwargs: Any,
    ) -> Runnable[LanguageModelInput, _DictOrPydantic]:
        """Model wrapper that returns outputs formatted to match the given schema.

Args:
schema:
The output schema. Can be passed in as:

- a JSON Schema,
- a TypedDict class,
- a Pydantic class,
- or an OpenAI function/tool schema.

If ``schema`` is a Pydantic class then the model output will be a
Pydantic instance of that class, and the model-generated fields will be
validated by the Pydantic class. Otherwise the model output will be a
dict and will not be validated. See :meth:`langchain_core.utils.function_calling.convert_to_openai_tool`
for more on how to properly specify types and descriptions of
schema fields when specifying a Pydantic or TypedDict class.

method: The method for steering model generation, one of:

- "json_schema":
Uses OpenAI's Structured Output API:
https://platform.openai.com/docs/guides/structured-outputs
Supported for "gpt-4o-mini", "gpt-4o-2024-08-06", "o1", and later
models.
- "function_calling":
Uses OpenAI's tool-calling (formerly called function calling)
API: https://platform.openai.com/docs/guides/function-calling
- "json_mode":
Uses OpenAI's JSON mode. Note that when using JSON mode you must
include instructions in the model call for formatting the output
into the desired schema:
https://platform.openai.com/docs/guides/structured-outputs/json-mode

Learn more about the differences between the methods and which models
support which methods here:

- https://platform.openai.com/docs/guides/structured-outputs/structured-outputs-vs-json-mode
- https://platform.openai.com/docs/guides/structured-outputs/function-calling-vs-response-format

include_raw:
If False then only the parsed structured output is returned. If
an error occurs during model output parsing it will be raised. If True
then both the raw model response (a BaseMessage) and the parsed model
response will be returned. If an error occurs during output parsing it
will be caught and returned as well. The final output is always a dict
with keys "raw", "parsed", and "parsing_error".
strict:

- True:
Model output is guaranteed to exactly match the schema.
The input schema will also be validated according to
https://platform.openai.com/docs/guides/structured-outputs/supported-schemas
- False:
Input schema will not be validated and model output will not be
validated.
- None:
``strict`` argument will not be passed to the model.

Defaults to False if ``method`` is ``"json_schema"`` or
``"function_calling"``. Can only be non-null if ``method`` is
``"json_schema"`` or ``"function_calling"``.

kwargs: Additional keyword args aren't supported.

Returns:
A Runnable that takes the same inputs as a :class:`langchain_core.language_models.chat_models.BaseChatModel`.

| If ``include_raw`` is False and ``schema`` is a Pydantic class, Runnable outputs an instance of ``schema`` (i.e., a Pydantic object). Otherwise, if ``include_raw`` is False then Runnable outputs a dict.

| If ``include_raw`` is True, then Runnable outputs a dict with keys:

- "raw": BaseMessage
- "parsed": None if there was a parsing error, otherwise the type depends on the ``schema`` as described above.
- "parsing_error": Optional[BaseException]

.. versionchanged:: 0.1.20

Added support for TypedDict class ``schema``.

.. versionchanged:: 0.1.21

Support for ``strict`` argument added.
Support for ``method`` = "json_schema" added.

.. versionchanged:: 0.3.0

- ``method`` default changed from "function_calling" to "json_schema".
- ``strict`` defaults to True instead of False when ``method`` is
"function_calling".

.. dropdown:: Example: schema=Pydantic class, method="json_schema", include_raw=False, strict=True

Note, OpenAI has a number of restrictions on what types of schemas can be
provided if ``strict`` = True. When using Pydantic, our model cannot
specify any Field metadata (like min/max constraints) and fields cannot
have default values.

See all constraints here: https://platform.openai.com/docs/guides/structured-outputs/supported-schemas

.. code-block:: python

from typing import Optional

from langchain_openai import AzureChatOpenAI
from pydantic import BaseModel, Field


class AnswerWithJustification(BaseModel):
    '''An answer to the user question along with justification for the answer.'''

    answer: str
    justification: Optional[str] = Field(
        default=..., description="A justification for the answer."
    )


llm = AzureChatOpenAI(azure_deployment="...", model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

structured_llm.invoke(
    "What weighs more a pound of bricks or a pound of feathers"
)

# -> AnswerWithJustification(
#     answer='They weigh the same',
#     justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'
# )

.. dropdown:: Example: schema=Pydantic class, method="json_schema", include_raw=True

.. code-block:: python

from langchain_openai import AzureChatOpenAI
from pydantic import BaseModel


class AnswerWithJustification(BaseModel):
    '''An answer to the user question along with justification for the answer.'''

    answer: str
    justification: str


llm = AzureChatOpenAI(azure_deployment="...", model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(
    AnswerWithJustification, include_raw=True
)

structured_llm.invoke(
    "What weighs more a pound of bricks or a pound of feathers"
)
# -> {
#     'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Ao02pnFYXD6GN1yzc0uXPsvF', 'function': {'arguments': '{"answer":"They weigh the same.","justification":"Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ."}', 'name': 'AnswerWithJustification'}, 'type': 'function'}]}),
#     'parsed': AnswerWithJustification(answer='They weigh the same.', justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'),
#     'parsing_error': None
# }

.. dropdown:: Example: schema=TypedDict class, method="json_schema", include_raw=False

.. code-block:: python

from typing import Optional

from typing_extensions import Annotated, TypedDict

from langchain_openai import AzureChatOpenAI


class AnswerWithJustification(TypedDict):
    '''An answer to the user question along with justification for the answer.'''

    answer: str
    justification: Annotated[
        Optional[str], None, "A justification for the answer."
    ]


llm = AzureChatOpenAI(azure_deployment="...", model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

structured_llm.invoke(
    "What weighs more a pound of bricks or a pound of feathers"
)
# -> {
#     'answer': 'They weigh the same',
#     'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
# }

.. dropdown:: Example: schema=OpenAI function schema, method="json_schema", include_raw=False

.. code-block:: python

from langchain_openai import AzureChatOpenAI

oai_schema = {
    'name': 'AnswerWithJustification',
    'description': 'An answer to the user question along with justification for the answer.',
    'parameters': {
        'type': 'object',
        'properties': {
            'answer': {'type': 'string'},
            'justification': {'description': 'A justification for the answer.', 'type': 'string'}
        },
        'required': ['answer']
    }
}

llm = AzureChatOpenAI(
    azure_deployment="...",
    model="gpt-4o",
    temperature=0,
)
structured_llm = llm.with_structured_output(oai_schema)

structured_llm.invoke(
    "What weighs more a pound of bricks or a pound of feathers"
)
# -> {
#     'answer': 'They weigh the same',
#     'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
# }

.. dropdown:: Example: schema=Pydantic class, method="json_mode", include_raw=True

.. code-block::

from langchain_openai import AzureChatOpenAI
from pydantic import BaseModel

class AnswerWithJustification(BaseModel):
    answer: str
    justification: str

llm = AzureChatOpenAI(
    azure_deployment="...",
    model="gpt-4o",
    temperature=0,
)
structured_llm = llm.with_structured_output(
    AnswerWithJustification,
    method="json_mode",
    include_raw=True
)

structured_llm.invoke(
    "Answer the following question. "
    "Make sure to return a JSON blob with keys 'answer' and 'justification'.\\n\\n"
    "What's heavier a pound of bricks or a pound of feathers?"
)
# -> {
#     'raw': AIMessage(content='{\\n "answer": "They are both the same weight.",\\n "justification": "Both a pound of bricks and a pound of feathers weigh one pound. The difference lies in the volume and density of the materials, not the weight." \\n}'),
#     'parsed': AnswerWithJustification(answer='They are both the same weight.', justification='Both a pound of bricks and a pound of feathers weigh one pound. The difference lies in the volume and density of the materials, not the weight.'),
#     'parsing_error': None
# }

.. dropdown:: Example: schema=None, method="json_mode", include_raw=True

.. code-block::

structured_llm = llm.with_structured_output(method="json_mode", include_raw=True)

structured_llm.invoke(
    "Answer the following question. "
    "Make sure to return a JSON blob with keys 'answer' and 'justification'.\\n\\n"
    "What's heavier a pound of bricks or a pound of feathers?"
)
# -> {
#     'raw': AIMessage(content='{\\n "answer": "They are both the same weight.",\\n "justification": "Both a pound of bricks and a pound of feathers weigh one pound. The difference lies in the volume and density of the materials, not the weight." \\n}'),
#     'parsed': {
#         'answer': 'They are both the same weight.',
#         'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The difference lies in the volume and density of the materials, not the weight.'
#     },
#     'parsing_error': None
# }
""" # noqa: E501
        if method in ("json_schema", "function_calling") and strict is None:
            strict = False
        return super().with_structured_output(
            schema, method=method, include_raw=include_raw, strict=strict, **kwargs
        )
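
To illustrate the default handling in the override above, here is a minimal standalone sketch that reproduces only the `strict` resolution step. `resolve_strict` and the `Method` alias are hypothetical names introduced for illustration and are not part of `langchain_openai`; the sketch assumes only the three `method` values listed in the signature.

```python
from typing import Literal, Optional

# Hypothetical alias for the three method values accepted by the signature.
Method = Literal["function_calling", "json_mode", "json_schema"]


def resolve_strict(
    method: Method = "json_schema", strict: Optional[bool] = None
) -> Optional[bool]:
    """Hypothetical helper mirroring the override's `strict` defaulting.

    When the caller leaves `strict` unset (None) and the method is
    "json_schema" or "function_calling", the value becomes False.
    In every other case the caller's value is passed through unchanged.
    """
    if method in ("json_schema", "function_calling") and strict is None:
        return False
    return strict


if __name__ == "__main__":
    print(resolve_strict())                                 # False
    print(resolve_strict("function_calling"))               # False
    print(resolve_strict("json_mode"))                      # None
    print(resolve_strict("json_schema", strict=True))       # True
```

Under this reading, an explicit `strict=True` is always forwarded as given, and `"json_mode"` leaves `strict` as `None`, so it would not be sent to the model.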