
LLM qualitative evaluations and labeling #876

Open
Luca-Blight opened this issue Feb 19, 2025 · 2 comments
@Luca-Blight

Description

It would be nice to have a place in the platform for qualitative evaluations and labeling of LLM outputs. Another option would be to allow integration with a partner that provides this.

@samuelcolvin (Member)

Yup, we're working on this very thing, see pydantic/pydantic-ai#915 and the linked pull request.

@Luca-Blight (Author)

That's awesome to see!

One feature that could be interesting, particularly for evaluating online performance, is the ability to set up another model as the evaluator instead of a human.
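For context, a minimal sketch of that "LLM as judge" pattern, assuming nothing about pydantic-ai internals: `call_judge` and `fake_judge` are hypothetical stand-ins for any function that sends a prompt to the evaluator model and returns its text reply.

```python
# Sketch of an LLM-as-judge evaluator: a second model grades an answer
# instead of a human labeler. All names here are illustrative, not part
# of any pydantic-ai API.
import re
from typing import Callable

JUDGE_PROMPT = (
    "Rate the following answer to the question on a 1-5 scale.\n"
    "Reply with 'Score: <n>' and a one-line justification.\n\n"
    "Question: {question}\nAnswer: {answer}\n"
)


def evaluate_with_judge(
    question: str, answer: str, call_judge: Callable[[str], str]
) -> int:
    """Ask a judge model to grade an answer; parse its 1-5 score."""
    reply = call_judge(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"Score:\s*([1-5])", reply)
    if match is None:
        raise ValueError(f"Judge reply had no parseable score: {reply!r}")
    return int(match.group(1))


# Stubbed judge model for demonstration; a real setup would call an LLM API.
def fake_judge(prompt: str) -> str:
    return "Score: 4 -- mostly correct, minor omissions."


score = evaluate_with_judge("What is 2+2?", "4", fake_judge)
print(score)  # -> 4
```

The key design point is that the judge is just a callable, so the same evaluation loop works for human labelers, a hosted model, or a local one.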
