Hey folks! Our team just did a hands-on trial day yesterday to analyze the capabilities of pydantic-logfire, and the first thing I'd like to say:
the ease of use is amazing! 🚀
pydantic logfire might be one of those desperately needed tools that finally tame both the technical complexity of OpenTelemetry (digesting tons of documentation and concept pages, understanding / reverse-engineering the logs, metrics and traces SDKs) and its infrastructure complexity (running your own collector), making it more easily adoptable by a less technical audience as well.
Integrating pydantic logfire into one of our microservices has been a no-brainer and stupid simple 🚀
However, we ran into one thing that really puzzled us:
pydantic logfire does not support its own library (pydantic models) as log payloads and enforces log record bodies to be strings.
This appears to be a huge technical limitation. We are running a highly automated, distributed MLOps stack, and we have put decent effort into ensuring that all platform components (microservices running FastAPI, orchestration engines (Apache Airflow), processing jobs (Spark), training jobs, and many more) emit fully structured logs. We use pydantic to model our log events, which helps a lot to publish fully structured and schema-safe log events.
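In a nutshell, this is more or less what we do (a minimal, illustrative sketch that matches the example entry further below; the exact class and field names are simplified, not our production schema):
from pydantic import BaseModel

class ModelRef(BaseModel):
    name: str
    version: str

class InstanceSpec(BaseModel):
    type: str
    count: int

class EndpointCreated(BaseModel):
    # every field is validated before the event is emitted,
    # so downstream consumers can rely on the schema
    endpoint_name: str
    model: ModelRef
    instance: InstanceSpec

event = EndpointCreated(
    endpoint_name="my_ml_endpoint",
    model=ModelRef(name="foo", version="2"),
    instance=InstanceSpec(type="medium", count=2),
)
# event.model_dump() is exactly what we would like to end up as the otel log record body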
Due to the above-mentioned pain points, our logs are (alas) not OpenTelemetry-compatible yet, but this is roughly what the resulting log entry could look like in OTel:
{"timestamp": 174012134465303,"observed_timestamp": 174012134565816,"trace_id": "019523d43b42cf4394594759b699305d","span_id": "a9c6c9ec18b1472a","severity_text": "INFO","severity_number": 9,"body": {"endpoint_name": "my_ml_endpoint","model": {"name": "foo","version": "2"},"instance": {"type": "medium","count": 2}},"resource": {"service.namespace": "model-serving","service.name": "api","service.version": "3.0.1"},"attributes": {# attributes derived from the pydantic class object "log.record.type": "EndpointCreated","log.record.schema.major": "1","log.record.schema.minor": "3",# other standardized otel attributes}}
Fully structured, machine-parseable log entries (instead of human-readable, unstructured prose) enable a whole new world of automation.
Don't get me wrong: I do understand that a logging library somehow needs to support good old f-string logging, but having no support for fully structured log events feels somewhat strange, especially coming from a team that essentially created a powerful and enjoyable Python serde library in the first place 🤔
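To make the gap concrete, a simplified sketch (reusing the event object from the sketch above): the first call is how I understand the current API, a message template plus keyword attributes, with the body staying a plain string; the second call is purely hypothetical and does not exist in logfire today.
import logfire

logfire.configure()

# today: a string message template; the keyword arguments are recorded as
# (flattened) attributes, while the log record body remains a plain string
logfire.info(
    "endpoint {endpoint_name} created",
    endpoint_name=event.endpoint_name,
    model_name=event.model.name,
)

# what we would love to do instead (hypothetical, not a real logfire API):
# hand over the pydantic event itself and get a structured, queryable body
logfire.info(event)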
Has this been considered, but rejected?
If so, for what reasons? 🤔
Amongst lots of other things, think of use cases such as this one (creating log views that instantly show all ML models ever deployed, alongside other standardized metadata validated by pydantic):
SELECT
    trace_id
  , span_id
  , body ->> 'endpoint_name' AS endpoint_name
  , body -> 'model' ->> 'name' AS model_name
  , body -> 'model' ->> 'version' AS model_version
  , body -> 'instance' ->> 'type' AS instance_type
  , body -> 'instance' ->> 'count' AS instance_count
FROM
    records
Kind regards!