Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

profiles: update default value field in message Profile #608

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 3 additions & 4 deletions opentelemetry/proto/profiles/v1development/profiles.proto
Original file line number Diff line number Diff line change
Expand Up @@ -229,10 +229,6 @@ message Profile {
// for human-friendly content. The profile must stay functional if this field
// is cleaned.
repeated int32 comment_strindices = 15; // Indices into string table.
// Index into the string table of the type of the preferred sample
// value. If unset, clients should default to the last sample value.
int32 default_sample_type_strindex = 16;


// A globally unique identifier for a profile. The ID is a 16-byte array. An ID with
// all zeroes is considered invalid.
Expand Down Expand Up @@ -269,6 +265,9 @@ message Profile {
// The field is optional, however if it is present then equivalent converted data should be populated in other fields
// of this message as far as is practicable.
bytes original_payload = 21;

// Index into the sample_type array for the type of the default sample value.
int32 default_sample_type_index = 23;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the discussion in #606 (comment) this field ID could also be 16. 23 is chosen here, as this is the next free field ID once #606 is merged.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This avoids issues where the string that is pointed to by default_sample_type_strindex does not match with the value types in sample_type.

Did you see this issue in practice? It's not very obvious to me that this change is worth it:

  • The distinction between _index and _strindex can be subtle to some producers and consumers, causing bugs potentially.
  • The index can still be out of bound and needs care similar to the string. Arguably, bugs around not adjusting the index can be more subtle - e.g. if certain profile processing drops some sample types but does not adjust the default sample type, it may be easier to notice and debug that if the default type is a string.

Also note that this changes the semantics of unset default sample type compared to pprof: here the unset field will effectively mean first sample type, in pprof that means last:

  // Index into the string table of the type of the preferred sample
  // value. If unset, clients should default to the last sample value.
  int64 default_sample_type = 14;

Copy link
Contributor Author

@florianl florianl Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The named issues around _index vs _strindex and indexing issues of these fields can be argued about every _strindex and _index field in the protocol and therefore are not specific to this field.

Unfortunately there are implementations, where default_sample_type pointed to microseconds (in the string table) and the unit in sample_type was in nanoseconds. Which one has the precedence in such a case?
If default_sample_type_index is an index into sample_type then such issues will not happen.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that example it seems odd that the sample type name had the unit. I'd expect the name of the sample type to be "time" or something like that, and that's what the default sample type field would also contain. And microseconds or nanoseconds would be the unit field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I'm mixing something. The value in sample_type was [["cpu","nanoseconds"]] - just like the example in the documentation to sample_type. While default_sample_type pointed to a value in the string_table that referenced microseconds.
So to me the values in sample_type and default_sample_type are valid by themselves. But as the unit (nanoseconds) in sample_type conflicts with the value in default_sample_type it is hard to interpret the profile in the correct way.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default_sample_type_strindex is supposed to point to one of ValueType.type_strindex values.

For example, for sample types of [{"cpu_time","nanoseconds"}, {"num_samples", "count"}], default_sample_type_strindex can point to the string index for cpu_time or num_samples, but not to nanoseconds or count.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your example :) I think it points out exactly the issue with this field. While I run into issues with nanoseconds vs microseconds your example set it even to something different.
If one looks at the data one can usually differentiate and interpret data correctly. But such differences make it hard to build tooling for automation and potential recommendation.
For that reason, default_sample_type_strindex should become default_sample_type_index and point to an element in sample_type.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the unit name is clearly not unique - two or more field can use the same unit (e.g. bytes). I'm not sure why this is ambiguous tbh.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my motivation with this proposal is to avoid named issues in the future by pointing the index for the default value to the elements that already exist in sample_type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opened #610 as an alternative and will close this proposal.

}

// Represents a mapping between Attribute Keys and Units.
Expand Down
Loading