Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MeterProvider to return specific MetricError #2461

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions opentelemetry-sdk/src/metrics/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,12 @@
/// Invalid configuration
#[error("Config error {0}")]
Config(String),
/// Shutdown already invoked
#[error("Shutdown already invoked")]
AlreadyShutdown,
/// Shutdown failed due to timeout exceeding
#[error("Shutdown failed due to timeout exceeding")]
ShutdownTimeout,
/// Fail to export metrics
#[error("Metrics exporter {0} failed with {name}", name = .0.exporter_name())]
ExportErr(Box<dyn ExportError>),
Expand All @@ -27,6 +33,25 @@
InvalidInstrumentConfiguration(&'static str),
}

impl PartialEq for MetricError {
fn eq(&self, other: &Self) -> bool {
match (self, other) {
(MetricError::Other(a), MetricError::Other(b)) => a == b,
(MetricError::Config(a), MetricError::Config(b)) => a == b,

Check warning on line 40 in opentelemetry-sdk/src/metrics/error.rs

View check run for this annotation

Codecov / codecov/patch

opentelemetry-sdk/src/metrics/error.rs#L39-L40

Added lines #L39 - L40 were not covered by tests
(MetricError::AlreadyShutdown, MetricError::AlreadyShutdown) => true,
(MetricError::ShutdownTimeout, MetricError::ShutdownTimeout) => true,
(MetricError::ExportErr(a), MetricError::ExportErr(b)) => {
a.exporter_name() == b.exporter_name()

Check warning on line 44 in opentelemetry-sdk/src/metrics/error.rs

View check run for this annotation

Codecov / codecov/patch

opentelemetry-sdk/src/metrics/error.rs#L42-L44

Added lines #L42 - L44 were not covered by tests
}
(
MetricError::InvalidInstrumentConfiguration(a),
MetricError::InvalidInstrumentConfiguration(b),
) => a == b,
_ => false,

Check warning on line 50 in opentelemetry-sdk/src/metrics/error.rs

View check run for this annotation

Codecov / codecov/patch

opentelemetry-sdk/src/metrics/error.rs#L47-L50

Added lines #L47 - L50 were not covered by tests
}
}
}

impl<T: ExportError> From<T> for MetricError {
fn from(err: T) -> Self {
MetricError::ExportErr(Box::new(err))
Expand Down
8 changes: 2 additions & 6 deletions opentelemetry-sdk/src/metrics/meter_provider.rs
Original file line number Diff line number Diff line change
Expand Up @@ -123,9 +123,7 @@ impl SdkMeterProviderInner {
.shutdown_invoked
.load(std::sync::atomic::Ordering::Relaxed)
{
Err(MetricError::Other(
"Cannot perform flush as MeterProvider shutdown already invoked.".into(),
))
Err(MetricError::AlreadyShutdown)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, is this approach different from the one proposed in #2381, where we were using separate enum types for Shutdown, and possibly ForceFlush as well? In this case, the proposal is to use the same MetricError enum for the entire user-facing interface, correct?

Copy link
Contributor

@scottgerring scottgerring Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is a best practice to have error-per-call when there is a clear disunity of potential errors, like there is between shutdown and export (but probably not between export and forceFlush).

As it stands, shutdown() can return two errors - AlreadyShutdown and ShutdownTimeout (plus the common boxed one for PoisonError).

By adding these types to MetricError we're promising the caller that shutdown() may also return any of Config, ExportErr, InvalidInstrumentConfiguration errors. This is simply not true of the interface, and places extra burden on the caller to handle, leaving the caller with these options:

  • 1/ Treat all errors as unrecoverable - simply match error
  • 2/ Extensively dig through our code, work out that AlreadyShutdown and ShutdownTimeout are two that they care about handling, and _ => to ignore the rest
  • 3/ Exhaustively match all the arms without realising that a bunch of them will never be invoked

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By adding these types to MetricError we're promising the caller that shutdown() may also return any of Config, ExportErr, InvalidInstrumentConfiguration errors.

Agree this PR is the wrong approach. Thanks for the feedback. Need to rework.

Copy link
Contributor

@scottgerring scottgerring Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @cijothomas I think I overindexed on the single error type - sorry about that. I don't think this needs a dramatic rework!

What you have done with the named-logical-errors makes sense to me! I stole it and lifted it into the other PR for metrics/traces :D

} else {
self.pipes.force_flush()
}
Expand All @@ -137,9 +135,7 @@ impl SdkMeterProviderInner {
.swap(true, std::sync::atomic::Ordering::SeqCst)
{
// If the previous value was true, shutdown was already invoked.
Err(MetricError::Other(
"MeterProvider shutdown already invoked.".into(),
))
Err(MetricError::AlreadyShutdown)
} else {
self.pipes.shutdown()
}
Expand Down
20 changes: 14 additions & 6 deletions opentelemetry-sdk/src/metrics/periodic_reader.rs
Original file line number Diff line number Diff line change
Expand Up @@ -408,14 +408,19 @@
.send(Message::Shutdown(response_tx))
.map_err(|e| MetricError::Other(e.to_string()))?;

if let Ok(response) = response_rx.recv() {
if response {
Ok(())
} else {
// TODO: accept timeout from caller.
match response_rx.recv_timeout(Duration::from_secs(5)) {
Ok(response) => {
if response {
Ok(())
} else {
Err(MetricError::Other("Failed to shutdown".into()))

Check warning on line 417 in opentelemetry-sdk/src/metrics/periodic_reader.rs

View check run for this annotation

Codecov / codecov/patch

opentelemetry-sdk/src/metrics/periodic_reader.rs#L417

Added line #L417 was not covered by tests
}
}
Err(mpsc::RecvTimeoutError::Timeout) => Err(MetricError::ShutdownTimeout),

Check warning on line 420 in opentelemetry-sdk/src/metrics/periodic_reader.rs

View check run for this annotation

Codecov / codecov/patch

opentelemetry-sdk/src/metrics/periodic_reader.rs#L420

Added line #L420 was not covered by tests
Err(mpsc::RecvTimeoutError::Disconnected) => {
Err(MetricError::Other("Failed to shutdown".into()))
}
} else {
Err(MetricError::Other("Failed to shutdown".into()))
}
}
}
Expand Down Expand Up @@ -573,10 +578,12 @@
// calling shutdown again should return Err
let result = meter_provider.shutdown();
assert!(result.is_err());
assert_eq!(result.unwrap_err(), MetricError::AlreadyShutdown);

// calling shutdown again should return Err
let result = meter_provider.shutdown();
assert!(result.is_err());
assert_eq!(result.unwrap_err(), MetricError::AlreadyShutdown);
}

#[test]
Expand All @@ -598,6 +605,7 @@
// calling force_flush after shutdown should return Err
let result = meter_provider.force_flush();
assert!(result.is_err());
assert_eq!(result.unwrap_err(), MetricError::AlreadyShutdown);
}

#[test]
Expand Down
Loading