Update enduser domain and add enduser.authentication.id #1456

Open
wants to merge 26 commits into
base: main
from

Conversation

heyams
Contributor

@heyams heyams commented Oct 7, 2024

Fixes #1104

@heyams heyams force-pushed the heya/add-enduser-namespace branch from 0a8b0e8 to ee0970f Compare October 7, 2024 19:06
model/enduser/registry.yaml Outdated Show resolved Hide resolved
model/enduser/registry.yaml Show resolved Hide resolved
model/enduser/registry.yaml Show resolved Hide resolved
brief: >
  Describes information about the end user, which can be used as a subdomain of browser, client, or user domains.
attributes:
  - id: enduser.id
Contributor

It seems confusing to set enduser.id = "QdH5CAWJgqVT4rOr0qtumf" and enduser.authentication.id = "lmolkova", based on the

https://github.com/open-telemetry/semantic-conventions/pull/1146/files#r1712997369 and https://github.com/open-telemetry/semantic-conventions/pull/1146/files#r1710187141

It'd be more clear if we called this one enduser.tracking.id or enduser.anonymous.id so that people would not put PII there.

Contributor Author

per earlier discussion, it seemed that anonymous was confusing to some.
i'm good with enduser.anonymous_id or enduser.tracking_id. neither tracking nor anonymous is a namespace, so nesting here doesn't seem to follow the naming convention?

Contributor

do you mean that enduser.tracking.id would not follow the naming convention? Why?

Member

I like enduser.tracking.id

Contributor Author

@heyams heyams Oct 15, 2024

some examples

messaging.consumer_id: messaging.consumer.id

nesting (a namespace) uses .
if it's not a namespace, use _

Contributor Author

@lmolkova @trisch-me Please review and fill in anything that i may have missed:

enduser.pseudo.id
- pros: it's clear to know this id is not authenticated
- cons: it might lead to misinterpretation of this id, like it's not a real id, a testing id, a temporary id?

enduser.tracking.id
- pros: it's clear that this id is used to track a particular user.
- cons: tracking may raise privacy concerns, as it implies monitoring user behavior, which could lead to user distrust. it also lacks context of what exactly is being tracked (e.g. user actions, sessions, locations, etc)

enduser.unauth.id
- pros: unauth is short
- cons: unauth is ambiguous, as it can be unauthenticated or unauthorized. additionally, abbreviations are not a good naming practice and lead to more confusion.

enduser.temp.id or enduser.transient.id
- pros: it suggests that this id is temporary and associated with a user who has not been authenticated.
- cons: it lacks context about what the id is temporary for (e.g. session, authentication)

enduser.unauthenticated.id
- pros: it clearly indicates an unauthenticated user.
- cons: it collides with enduser.authentication.id, which can be renamed to enduser.authenticated.id, then it would have been fine?

enduser.anonymous.id
- pros: it's clear that this id is anonymous.
- cons: it can be confusing and lacks context. as long as we have clear documentation, this should be ok?

Member

I kind of like these three:

  • enduser.pseudo.id
  • enduser.transient.id
  • enduser.ephemeral.id

Contributor Author

I attended Client SIG this morning,
they also preferred enduser.pseudo.id after going through this list.

Contributor

@trisch-me trisch-me Dec 11, 2024

what were the cons for enduser.unauthenticated.id?

Contributor Author

we have already discussed and agreed to have enduser.authentication.id, with authentication as a sub-namespace under enduser. anything that is not under authentication is either unauthenticated or unauthorized or anonymous or some random id.

@heyams heyams changed the title Update enduser domain and add authentication as a subdomain Update enduser domain and add enduser.authentication.id Oct 22, 2024
@heyams heyams changed the title Update enduser domain and add enduser.authentication.id Update enduser domain and add enduser.authentication.id Oct 22, 2024
@heyams
Contributor Author

heyams commented Oct 23, 2024

need some advice with this weaver error, it didn't point to any files that I have modified. it's persisting.
image

@trask
Member

trask commented Oct 23, 2024

need some advice with this weaver error, it didn't point to any files that I have modified. it's persisting. image

if you scroll up from there, you'll see:

ℹ Validating: $/home/weaver/target/general/attributes.md
✖ Could not find: identity

which looks related to this file that was deleted in your PR: model/enduser/deprecated/common.yaml

@heyams heyams marked this pull request as ready for review November 22, 2024 23:41
@heyams heyams requested review from a team as code owners November 22, 2024 23:41
@heyams heyams requested a review from joaopgrassi November 26, 2024 19:16
Comment on lines 19 to 20
Identifier of an anonymous end user who interacts with a system.
This identifier may be unique only through best-effort means and does not imply that the user is authenticated to the system.
Member

meeting notes from semantic convention SIG meeting today:

  • is this something that OpenTelemetry instrumentation is planning to populate, or is this only something that vendor-specific instrumentation is planning to populate?
  • if it's something that OpenTelemetry instrumentation is planning to populate, what would the implementation look like, e.g. would this be stored in a persistent cookie? would it be stamped onto a specific event?
  • there is a desire not to add attributes into the semconv repo when only a single vendor has expressed interest in them
  • there is also a desire not to add attributes into the semconv repo without having any span/event/metric definitions that use them

the recommendation for next steps was to discuss this in the Client SIG

Contributor

@heyams were those questions addressed during client sig meeting?

Contributor Author

@heyams heyams Dec 11, 2024

yes, we kind of covered it. cc @MSNev

Member

can you post the answers here?

Contributor

is this something that OpenTelemetry instrumentation is planning to populate

  • Yes, this will be added by instrumentations either directly or indirectly (via a pseudo user manager, like the Session Manager)

if it's something that OpenTelemetry instrumentation is planning to populate, what would the implementation look like, e.g. would this be stored in a persistent cookie? would it be stamped onto a specific event?

  • It will need something like the SessionManager implementation to "manage" the lifecycle of this value. Some environments would just create a simple random value at app start (like Android), while in a browser, which is stateless, the "user manager" would most likely use cookies (but it could also use session storage)

there is a desire not to add attributes into the semconv repo when only a single vendor has expressed interest in them

  • This will not be single-vendor specific, but it is highly RUM specific

there is also a desire not to add attributes into the semconv repo without having any span/event/metric definitions that use them

  • It should be available on both Spans and Logs. While it could be available for Metrics, its cardinality (because it's a random value) means it should not necessarily be used for metrics...
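
For illustration only (not part of this PR): a minimal sketch of what such a "user manager" could look like, assuming it is handed some key/value storage that stands in for a cookie jar, session storage, or app-local state. The class name `PseudoUserManager`, its methods, and the choice of `secrets.token_urlsafe` are all hypothetical; nothing here is an existing OpenTelemetry API.

```python
import secrets
from typing import Dict, MutableMapping, Optional

# Attribute name as proposed in this thread; it may still change.
PSEUDO_ID_KEY = "enduser.pseudo.id"


class PseudoUserManager:
    """Hypothetical helper that owns the lifecycle of a pseudo user id."""

    def __init__(self, storage: Optional[MutableMapping[str, str]] = None) -> None:
        # `storage` stands in for cookies / session storage in a browser,
        # or plain in-memory state created at app start (e.g. on Android).
        self._storage = storage if storage is not None else {}

    def get_id(self) -> str:
        """Return the stored id, creating a random one on first use."""
        pseudo_id = self._storage.get(PSEUDO_ID_KEY)
        if pseudo_id is None:
            # Random, best-effort unique value; it carries no PII.
            pseudo_id = secrets.token_urlsafe(16)
            self._storage[PSEUDO_ID_KEY] = pseudo_id
        return pseudo_id

    def attributes(self) -> Dict[str, str]:
        """Attributes that could be stamped onto spans and log records."""
        return {PSEUDO_ID_KEY: self.get_id()}
```

With shared storage the id survives across manager instances, while a manager without storage gets a fresh value each time, which roughly mirrors the cookie-backed and app-start cases described above. Because the value is random, it would be a poor metric dimension.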

Member

thanks! a couple of followup questions:

  • if/when entity (or mutable resource) attributes are available, do you see enduser.pseudo.id as one of those?
  • are there any specific span or event semantic conventions that we can add enduser.pseudo.id to? or is enduser.pseudo.id hopefully just an entity (or mutable resource) attribute?

Contributor Author

I believe we discussed these in the client SIG. I will wait for @MSNev's response.

Contributor

@MSNev MSNev Dec 20, 2024

  • If/when mutable resource attributes are available (whether they are entities or shorter-lived resources), then this (and session) could possibly be represented there.

are there any specific span or event semantic conventions that we can add enduser.pseudo.id to?

No, not specifically as this really is just additional context details.

Contributor

@lmolkova lmolkova Jan 8, 2025

I believe having a document covering enduser convention as a whole was a critical feedback point.
We're trying to avoid adding attributes without also adding a signal that leverages them.

I'm surprised we don't have any signals to populate these on. Wouldn't we add them on session events?
https://github.com/open-telemetry/semantic-conventions/blob/f7362c7066856ff8591ac461d4e3b31ad7af3a4b/docs/general/session.md

If we're going to just stamp them on all telemetry items, let's document it as an attribute group - this would be the place to describe any additional guidance. This doc has to be updated anyway.

I believe it should cover:

  • who populates those attributes - would an HTTP instrumentation do it? Distro? End user apps?
  • when they are populated? Whenever they are known? Is it opt-in? etc
  • how to populate them - where would I get this information from? There are some mappings in existing docs - are they still relevant?
  • is user info propagated to other services (currently docs say it should be using baggage)
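
To make the last question concrete, here is a rough sketch of what carrying a user id in a W3C `baggage` header might involve, assuming propagation via baggage is kept. The helper names `encode_baggage` and `decode_baggage` are made up for this example and use only the standard library; real instrumentation would go through the OpenTelemetry baggage API and propagators. Propagating `enduser.pseudo.id` specifically is an assumption, not something decided in this thread.

```python
from typing import Dict
from urllib.parse import quote, unquote


def encode_baggage(entries: Dict[str, str]) -> str:
    """Build a baggage header value: comma-separated key=value members."""
    # Values are percent-encoded so that ',', ';', '=' and spaces are safe.
    return ",".join(
        f"{key}={quote(value, safe='')}" for key, value in entries.items()
    )


def decode_baggage(header: str) -> Dict[str, str]:
    """Parse a baggage header value, ignoring optional member properties."""
    result: Dict[str, str] = {}
    for member in header.split(","):
        member = member.strip()
        if not member:
            continue
        # Anything after ';' is a property of the member; drop it here.
        key_value = member.split(";", 1)[0]
        key, _, value = key_value.partition("=")
        result[key.strip()] = unquote(value.strip())
    return result


# Example id value taken from the discussion earlier in this thread.
header = encode_baggage({"enduser.pseudo.id": "QdH5CAWJgqVT4rOr0qtumf"})
```

Anything placed in baggage is sent to every downstream service, including third parties, so the guidance would probably need to say which of the enduser ids (if any) are safe to propagate this way.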

@heyams
Contributor Author

heyams commented Dec 10, 2024

image

it was a false-negative. that link works. how do i ignore this failure?

@trask
Member

trask commented Dec 10, 2024

it was a false-negative. that link works. how do i ignore this failure?

it's ok, it's not a required check, we're working on resolving the flakiness

attributes:
  - id: enduser.id
    type: string
    deprecated: Replaced by `enduser.pseudo.id` attribute.
Contributor

deprecated attributes should be moved from registry to the deprecated section

Contributor Author

my understanding is that if the namespace is deprecated, then it will move everything to the deprecated folder.
if some of the attributes are deprecated, then simply use deprecated:. please correct me if I'm wrong. an example will be helpful.

Contributor

Comment on lines +18 to +22
- ref: enduser.authentication.id
  requirement_level: required
  note: >
    The `enduser.authentication.id` attribute is intended to provide an unique identifier of an authenticated enduser.
    The deprecated attributes `enduser.authentication.role` and `enduser.authentication.scope` are removed from the enduser registry.
Member

I was thinking that enduser.id would be the normal (authenticated) user id, do I remember that correctly? thanks

Contributor Author

#1456 (comment)
we decided to use enduser.authentication.id for the authenticated user id since the beginning of this discussion.

renaming enduser.id to something specific (like enduser.pseudo.id) so that users don't put the authenticated user id under this attribute.
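
As a small sketch of the split being proposed (illustrative only): the helper `enduser_attributes` is hypothetical, and the attribute names are the ones under discussion in this PR, which may still change. The pseudo id is always present, and the authenticated id is added only once the user has signed in.

```python
from typing import Dict, Optional


def enduser_attributes(
    pseudo_id: str, authenticated_id: Optional[str] = None
) -> Dict[str, str]:
    """Build enduser attributes; names follow the proposal in this thread."""
    # Random, non-PII identifier that exists even for signed-out users.
    attrs = {"enduser.pseudo.id": pseudo_id}
    if authenticated_id is not None:
        # Only set when the user is actually authenticated.
        attrs["enduser.authentication.id"] = authenticated_id
    return attrs


# Example values taken from the review comment earlier in this thread.
signed_out = enduser_attributes("QdH5CAWJgqVT4rOr0qtumf")
signed_in = enduser_attributes("QdH5CAWJgqVT4rOr0qtumf", "lmolkova")
```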

Member

can you summarize the reason for using enduser.authentication.id instead of enduser.id?

does this mean we will prohibit future "embedding" of user.id into the enduser.* namespace? or will we have 3 "id" attributes including enduser.id?


github-actions bot commented Jan 5, 2025

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added Stale and removed Stale labels Jan 5, 2025
Development

Successfully merging this pull request may close these issues.

User.id for authenticated user id
8 participants