Update enduser domain and add enduser.authentication.id
#1456
base: main
brief: >
Describes information about the end user, which can be used as a subdomain of browser, client, or user domains.
attributes:
- id: enduser.id
It seems confusing to set `enduser.id = "QdH5CAWJgqVT4rOr0qtumf"` and `enduser.authentication.id = "lmolkova"`, based on https://github.com/open-telemetry/semantic-conventions/pull/1146/files#r1712997369 and https://github.com/open-telemetry/semantic-conventions/pull/1146/files#r1710187141
It would be clearer if we called this one `enduser.tracking.id` or `enduser.anonymous.id`, so that people would not put PII there.
Per earlier discussion, it seemed that `anonymous` was confusing to some.
I'm good with `enduser.anonymous_id` or `enduser.tracking_id`. Neither `tracking` nor `anonymous` is a namespace, so nesting here doesn't seem to follow the naming convention?
Do you mean that `enduser.tracking.id` would not follow the naming convention? Why?
I like enduser.tracking.id
Some examples: semantic-conventions/schemas/1.19.0, line 32 in d5d2b9d:
messaging.consumer_id: messaging.consumer.id
Nesting with `.` is used for a namespace; if it is not a namespace, use `_`.
@lmolkova @trisch-me Please review and fill in anything that I may have missed:
- `enduser.pseudo.id`
  - pros: it's clear that this id is not authenticated.
  - cons: it might lead to misinterpretation of this id: is it not a real id, a testing id, a temporary id?
- `enduser.tracking.id`
  - pros: it's clear that this id is used to track a particular user.
  - cons: `tracking` may raise privacy concerns, as it implies monitoring user behavior, which could lead to user distrust. It also lacks context about what exactly is being tracked (e.g. user actions, sessions, locations, etc.).
- `enduser.unauth.id`
  - pros: `unauth` is short.
  - cons: `unauth` is ambiguous, as it can mean `unauthenticated` or `unauthorized`. Additionally, abbreviations are not a good naming practice and lead to more confusion.
- `enduser.temp.id` or `enduser.transient.id`
  - pros: it suggests that this id is temporary and associated with a user who has not been authenticated.
  - cons: it lacks context about what the id is temporary for (e.g. session, authentication).
- `enduser.unauthenticated.id`
  - pros: it clearly indicates an unauthenticated user.
  - cons: it collides with `enduser.authentication.id`, which could be renamed to `enduser.authenticated.id`; would it be fine then?
- `enduser.anonymous.id`
  - pros: it's clear that this id is anonymous.
  - cons: it can be confusing and lacks context. As long as we have clear documentation, this should be ok?
I kind of like these three:
enduser.pseudo.id
enduser.transient.id
enduser.ephemeral.id
I attended the Client SIG this morning; they also preferred `enduser.pseudo.id` after going through this list.
What were the cons for `enduser.unauthenticated.id`?
We have already discussed and agreed to have `enduser.authentication.id`, with `authentication` as a sub-namespace under `enduser`. Anything that is not under `authentication` is either unauthenticated or unauthorized or anonymous or some random id.
enduser.authentication.id
…s/semantic-conventions into heya/add-enduser-namespace
model/enduser/registry.yaml
Outdated
Identifier of an anonymous end user who interacts with a system.
This identifier may be unique only through best-effort means and does not imply that the user is authenticated to the system.
meeting notes from semantic convention SIG meeting today:
- is this something that OpenTelemetry instrumentation is planning to populate, or is this only something that vendor-specific instrumentation is planning to populate?
- if it's something that OpenTelemetry instrumentation is planning to populate, what would the implementation look like, e.g. would this be stored in a persistent cookie? would it be stamped onto a specific event?
- there is a desire not to add attributes into the semconv repo when only a single vendor has expressed interest in them
- there is also a desire not to add attributes into the semconv repo without having any span/event/metric definitions that use them
the recommendation for next steps was to discuss this in the Client SIG
@heyams were those questions addressed during client sig meeting?
yes, we kind of covered it. cc @MSNev
can you post the answers here?
is this something that OpenTelemetry instrumentation is planning to populate
- Yes, this will be added by instrumentations either directly or indirectly (via a pseudo user manager -- like the Session Manager)
if it's something that OpenTelemetry instrumentation is planning to populate, what would the implementation look like, e.g. would this be stored in a persistent cookie? would it be stamped onto a specific event?
- It will need something like the SessionManager implementation to "manage" the lifecycle of this value. In some environments it would just create a simple random value at app start (like Android), while in a browser, which is stateless, the "user manager" would most likely use cookies (but it could also use session storage)
there is a desire not to add attributes into the semconv repo when only a single vendor has expressed interest in them
- This will not be single-vendor specific, but it is highly RUM specific
there is also a desire not to add attributes into the semconv repo without having any span/event/metric definitions that use them
- It should be available on both Spans and Logs. While it could be available for Metrics, its cardinality (because it's a random value) means it should not necessarily be used for metrics...
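For illustration only, here is a minimal sketch of the "user manager" lifecycle described in the answers above. Nothing in it is an OpenTelemetry API: `PseudoUserManager`, `STORAGE_KEY`, and the dict-like `storage` argument are hypothetical names, with `storage` standing in for a cookie jar or session storage. When no storage is passed, a fresh random value is created per manager, which approximates the "random value at app start" case.

```python
import uuid
from typing import Dict, MutableMapping, Optional

# Hypothetical storage key; not defined by any OpenTelemetry specification.
STORAGE_KEY = "enduser_pseudo_id"


class PseudoUserManager:
    """Hypothetical manager owning the lifecycle of a pseudo user id."""

    def __init__(self, storage: Optional[MutableMapping[str, str]] = None) -> None:
        # `storage` is an assumption standing in for cookies or session storage.
        # Without it, the id lives only in memory for this manager instance.
        self._storage: MutableMapping[str, str] = storage if storage is not None else {}

    def get_id(self) -> str:
        """Return the stored id, creating a random one on first use."""
        pseudo_id = self._storage.get(STORAGE_KEY)
        if pseudo_id is None:
            # Random value: unique only by best effort, and carries no PII.
            pseudo_id = uuid.uuid4().hex
            self._storage[STORAGE_KEY] = pseudo_id
        return pseudo_id

    def attributes(self) -> Dict[str, str]:
        """Attributes an instrumentation could stamp onto spans and logs."""
        return {"enduser.pseudo.id": self.get_id()}
```

A caller would share one storage object between managers to keep the id stable across page loads, for example `PseudoUserManager(cookie_store).attributes()`, where `cookie_store` is whatever persistent mapping the environment provides.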
Thanks! A couple of follow-up questions:
- if/when entity (or mutable resource) attributes are available, do you see `enduser.pseudo.id` as one of those?
- are there any specific span or event semantic conventions that we can add `enduser.pseudo.id` to? Or is `enduser.pseudo.id` hopefully just an entity (or mutable resource) attribute?
I believe we discussed these in the client SIG. I will wait for @MSNev's response.
- If/when mutable resource attributes are available (whether they are entities or resources with a shorter lifespan), then this (and session) could possibly be represented there.
are there any specific span or event semantic conventions that we can add `enduser.pseudo.id` to?
- No, not specifically, as this really is just additional context detail.
I believe having a document covering the `enduser` convention as a whole was a critical feedback point.
We're trying to avoid adding attributes without also adding a signal that leverages them.
I'm surprised we don't have any signals to populate these on. Wouldn't we add them on session events?
https://github.com/open-telemetry/semantic-conventions/blob/f7362c7066856ff8591ac461d4e3b31ad7af3a4b/docs/general/session.md
If we're going to just stamp them on all telemetry items, let's document it as an attribute group - this would be the place to describe any additional guidance. This doc has to be updated anyway.
I believe it should cover:
- who populates those attributes - would an HTTP instrumentation do it? Distro? End user apps?
- when are they populated? Whenever they are known? Is it opt-in? etc.
- how to populate them - where would I get this information from? There are some mappings in existing docs - are they still relevant?
- is user info propagated to other services? (currently docs say it should be, using baggage)
it's ok, it's not a required check, we're working on resolving the flakiness
attributes:
- id: enduser.id
type: string
deprecated: Replaced by `enduser.pseudo.id` attribute.
deprecated attributes should be moved from registry to the deprecated section
My understanding is that if the whole namespace is deprecated, then everything moves to the deprecated folder.
If only some of the attributes are deprecated, then simply use `deprecated:`. Please correct me if I'm wrong. An example will be helpful.
deprecated attributes should be moved to the deprecated folder. E.g. like here https://github.com/open-telemetry/semantic-conventions/blob/main/model/http/deprecated/registry-deprecated.yaml - you can find a bunch of other examples with https://github.com/search?q=repo%3Aopen-telemetry%2Fsemantic-conventions+%22deprecated%3A%22+language%3AYAML&type=code
- ref: enduser.authentication.id
requirement_level: required
note: >
The `enduser.authentication.id` attribute is intended to provide an unique identifier of an authenticated enduser.
The deprecated attributes `enduser.authentication.role` and `enduser.authentication.scope` are removed from the enduser registry.
I was thinking that `enduser.id` would be the normal (authenticated) user id, do I remember that correctly? Thanks
#1456 (comment)
We decided at the beginning of this discussion to use `enduser.authentication.id` for the authenticated user id.
We are renaming `enduser.id` to something specific (like `enduser.pseudo.id`) so that users don't put an authenticated user id under this attribute.
Can you summarize the reason for using `enduser.authentication.id` instead of `enduser.id`?
Does this mean we will prohibit future "embedding" of `user.id` into the `enduser.*` namespace? Or will we have 3 "id" attributes including `enduser.id`?
This PR was marked stale due to lack of activity. It will be closed in 7 days.
Fixes #1104