
feat: changed start and end time logic for consumer lag details #6605

Merged · 4 commits into develop on Dec 19, 2024

Conversation

@SagarRajput-7 (Contributor) commented Dec 9, 2024

Summary

Updated the start-time and end-time logic for the APIs under consumer lag details.

New logic

We have T, the point on the graph that was clicked. The start and end times are now:

Start: T - 5
End: T + 5

The reason for this is that related producer spans can also occur before the clicked point in time, so this creates a better timeframe.
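A minimal sketch of what this window logic might look like, assuming the clicked timestamp arrives in seconds and the offset is five minutes (the function name matches the one discussed in the review summary below; the exact signature is an assumption):

```typescript
// Minimal sketch of the T ± 5 window described above; the 5-minute offset
// and the seconds -> milliseconds conversion are taken from this PR's
// discussion, but the exact signature is an assumption.
const FIVE_MINUTES_IN_MS = 5 * 60 * 1000;

export function getStartAndEndTimesInMilliseconds(
  timestamp: number, // clicked graph point, assumed to be in seconds
): { start: number; end: number } {
  const pointInMs = timestamp * 1000; // seconds -> milliseconds
  return {
    start: pointInMs - FIVE_MINUTES_IN_MS, // T - 5 minutes
    end: pointInMs + FIVE_MINUTES_IN_MS, // T + 5 minutes
  };
}
```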

  • Also changed the tab order for the consumer lag and partition views

Related Issues / PRs

Screenshots

Screen.Recording.2024-12-09.at.5.57.49.PM.mov

Affected Areas and Manually Tested Areas

Tested the effects of all the related APIs in the demo.in environment.


Important

Update start/end time logic and default tab for consumer lag details in messaging queues.

  • Behavior:
    • Update getStartAndEndTimesInMilliseconds() in MessagingQueuesUtils.ts to calculate start and end times around a point in time, converting timestamp to milliseconds.
    • Change default tab in MessagingQueuesDetails in MQDetails.tsx from ConsumerDetails to ProducerDetails.
  • API Payload:
    • Modify getMetaDataAndAPIPerView() in MessagingQueuesUtils.ts to use microseconds for start and end in tableApiPayload for the consumerLag view (see the sketch below).
  • Misc:
    • Reorder MessagingQueueServiceDetailType in MQServiceDetailTypePerView in MQDetails.tsx for consumerLag and partitionLatency views.

This description was created by Ellipsis for 72252cd. It will automatically update as commits are pushed.
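A hedged sketch of the unit handling described here, building on the window sketch above (the payload shape and field names are illustrative, not the exact tableApiPayload type):

```typescript
// Example clicked timestamp, in seconds (illustrative value).
const timestamp = 1733742000;

// Window in milliseconds, per the getStartAndEndTimesInMilliseconds sketch.
const { start, end } = getStartAndEndTimesInMilliseconds(timestamp);

// The consumerLag table API is described as taking microseconds, so the
// millisecond values are scaled up by 1e3 (ms -> µs).
const tableApiPayload = {
  start: start * 1e3,
  end: end * 1e3,
  // ...other view-specific fields (illustrative)
};
```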

github-actions bot commented Dec 9, 2024

Build Error! No Linked Issue found. Please link an issue or mention it in the body using #<issue_id>

@github-actions bot added the enhancement label Dec 9, 2024
@ellipsis-dev bot (Contributor) left a comment


👍 Looks good to me! Reviewed everything up to 72252cd in 13 seconds

More details
  • Looked at 62 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. frontend/src/pages/MessagingQueues/MessagingQueuesUtils.ts:317
  • Draft comment:
    Ensure that the change from nanoseconds to milliseconds is consistent across all related functions and API calls to prevent any potential time-related bugs.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change from nanoseconds to milliseconds in the getMetaDataAndAPIPerView function is consistent with the change in the getStartAndEndTimesInMilliseconds function. This ensures that the time values are correctly handled in milliseconds throughout the code.
2. frontend/src/pages/MessagingQueues/MessagingQueuesUtils.ts:184
  • Draft comment:
    The conversion of timestamp to milliseconds by multiplying by 1000 is correct for consistency with other parts of the code that expect milliseconds.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The change in the getStartAndEndTimesInMilliseconds function to multiply the timestamp by 1000 is correct for converting seconds to milliseconds. This aligns with the change in the getMetaDataAndAPIPerView function where the start and end times are now multiplied by 1e6 instead of 1e9.
3. frontend/src/pages/MessagingQueues/MessagingQueuesUtils.ts:314
  • Draft comment:
    The function getAttributeDataFromOnboardingStatus should not be part of the ClickHouseReader interface as it is not related to ClickHouse. Access it through the DAO in the telemetry instance instead.
  • Reason this comment was not posted:
    Comment was not on a valid diff hunk.



@vikrantgupta25 (Collaborator)

When clicking on any point in the graph, the graph essentially shows results for the current aggregation interval.
For example, if the aggregation interval is 10 minutes, then a single graph point refers to the data in T-10 -> T. So ideally the logic should be based on the aggregation interval and not hardcoded to T-5 -> T+5. Please correct me if I misunderstood the intention here.

cc @shivanshuraj1333

@shivanshuraj1333 (Member)

Yes, this is a valid point. I was trying to think in that direction, but I don't have much context on aggregation intervals.

@vikrantgupta25 (Collaborator)

Yes, so again, choosing based on points wouldn't be the correct thing to do. The aggregation interval is calculated on the query-service end, and the frontend can do the same calculation if required. Any dot corresponds to previous data, not future data, so the logic should be T - aggregationInterval to T. We do the same thing when moving from APM to the Logs or Traces pages; we can use the same calculation here as well.
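A hedged sketch of the interval-based alternative proposed here (how aggregationIntervalSeconds is obtained is an assumption; per the comment, the real calculation lives in the query service and could be mirrored on the frontend):

```typescript
// Sketch of the proposal: window the lookup to the aggregation interval
// behind the clicked point instead of a fixed T ± 5 minutes.
function getIntervalWindowInMilliseconds(
  timestamp: number, // clicked graph point, in seconds
  aggregationIntervalSeconds: number, // assumed to be known to the frontend
): { start: number; end: number } {
  const endMs = timestamp * 1000;
  return {
    start: endMs - aggregationIntervalSeconds * 1000, // T - interval
    end: endMs, // a point represents past data, so no T + interval
  };
}
```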

@shivanshuraj1333 (Member)

But these metrics' aggregation windows don't correlate to spans; if the aggregation window is very small, we may not see any spans. In those cases, hardcoded values make more sense.

@SagarRajput-7 (Contributor, Author)

@shivanshuraj1333 - should I change the logic here or not, according to the above discussion?

@vikrantgupta25 (Collaborator)

> But these metrics' aggregation windows don't correlate to spans; if the aggregation window is very small, we may not see any spans. In those cases, hardcoded values make more sense.

Are these metrics derived from the spans or exported directly?

@shivanshuraj1333 (Member) commented Dec 16, 2024

> exported directly?

Metrics are collected separately, directly from Kafka brokers (code ref: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/kafkametricsreceiver/consumer_scraper.go#L152); spans come from SDKs/agents on the client side. There's no native correlation between them (w.r.t. timestamps).

@vikrantgupta25 (Collaborator)

IMO, since there is no exact correlation between them, is it correct to correlate them on the basis of timestamp? Not all the spans would be contributors to those peak metrics, right?

@shivanshuraj1333 (Member)

At the atomic level (here, the partition), we can do timestamp-based correlation per partition to approximate the spans contributing to the metric for that time period.
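A hedged sketch of what per-partition, timestamp-based correlation could look like (the span shape and field names are assumptions for illustration):

```typescript
// Illustrative span shape; field names are assumptions.
interface ProducerSpan {
  partition: number;
  timestampMs: number;
}

// Approximate the spans contributing to a partition's metric by keeping
// only spans for that partition whose timestamps fall inside the window.
function correlateSpansToPartition(
  spans: ProducerSpan[],
  partition: number,
  startMs: number,
  endMs: number,
): ProducerSpan[] {
  return spans.filter(
    (span) =>
      span.partition === partition &&
      span.timestampMs >= startMs &&
      span.timestampMs <= endMs,
  );
}
```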

@shivanshuraj1333 (Member)

For example:

GROUP               TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                                         HOST            CLIENT-ID
high-variable-group topic1          0          0               0               0               consumer-high-variable-group-1-4e811c9f-e504-4c0b-a6c6-473e4eace95d /172.18.0.10    consumer-high-variable-group-1
high-variable-group topic1          1          0               0               0               consumer-high-variable-group-1-4e811c9f-e504-4c0b-a6c6-473e4eace95d /172.18.0.10    consumer-high-variable-group-1
high-variable-group topic1          2          14193           30155           15962           consumer-high-variable-group-1-4e811c9f-e504-4c0b-a6c6-473e4eace95d /172.18.0.10    consumer-high-variable-group-1

@shivanshuraj1333 (Member)

@SagarRajput-7 let's get this one merged

@SagarRajput-7 (Contributor, Author)

> @SagarRajput-7 let's get this one merged

Sure, @shivanshuraj1333

@vikrantgupta25 please review

@SagarRajput-7 merged commit 7405bfb into develop on Dec 19, 2024
15 of 18 checks passed
@SagarRajput-7 deleted the kafka-consumer-lag-details-changes branch on December 19, 2024 at 07:31
Labels: docs not required, enhancement

5 participants