Skip to content

Commit

Permalink
Define instrumentation configuration API (#4128)
Browse files Browse the repository at this point in the history
Resolves #3535.

This introduces an API component to file configuration, which has been
limited to SDK (i.e. end user facing) up until this point.

The configuration model recently added the first surface area related to
instrumentation configuration properties in
open-telemetry/opentelemetry-configuration#91.

The API proposed in this PR is collectively called the "Instrumentation
config API", and provides a mechanism for instrumentation libraries to
participate in file configuration and read relevant properties during
initialization. The intent is for both OpenTelemetry-authored and native
instrumentation alike to be able to be configured by users in a standard
way. New API surface area is necessary to accomplish this to avoid
instrumentation libraries from needing to take a dependency on SDK
artifacts.

The following summarizes the additions:

- Introduce ConfigProvider, the instrumentation config API analog of
TracerProvider, MeterProvider, LoggerProvider. This is the entry point
to the API.
- Define "Get instrumentation config" operation for ConfigProvider. This
returns something called ConfigProperties, which is a programmatic
representation of a YAML mapping node. The ConfigProperties returned by
"Get instrumentation config" represents the
[`.instrumentation`](https://github.com/open-telemetry/opentelemetry-configuration/blob/670901762dd5cce1eecee423b8660e69f71ef4be/examples/kitchen-sink.yaml#L438-L439)
node defined in a config file.
- Rebrand "file configuration" to "declarative configuration". This
expresses the intent without coupling to the file representation, which
although will be the most popular way to consume these features is just
one possible way to represent the configuration model and use these
tools.
- Break out dedicated `api.md`, `data-model.md`, and `sdk.md` files for
respective API, data model, and SDK portions of declarative
configuration. This aligns with other portions of the spec. The
separation should improve clarity regarding what should and should not
be exposed in the API.

I've prototyped this new API in `opentelemetry-java` here:
open-telemetry/opentelemetry-java#6549

cc @open-telemetry/configuration-maintainers,
@open-telemetry/specs-semconv-maintainers
  • Loading branch information
jack-berg authored Aug 15, 2024
1 parent 254b78f commit 9e1aeb7
Show file tree
Hide file tree
Showing 10 changed files with 649 additions and 526 deletions.
5 changes: 3 additions & 2 deletions spec-compliance-matrix.md
Original file line number Diff line number Diff line change
Expand Up @@ -294,10 +294,11 @@ Note: Support for environment variables is optional.
| OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION | | + | | | | | | | | | |
| OTEL_EXPERIMENTAL_CONFIG_FILE | | | | | | | | | | | |

## File Configuration
## Declarative configuration

See [File Configuration](./specification/configuration/file-configuration.md)
See [declarative configuration](./specification/configuration/README.md#declarative-configuration)
for details.
Disclaimer: Declarative configuration is currently in Development status - work in progress.

| Feature | Go | Java | JS | Python | Ruby | Erlang | PHP | Rust | C++ | .NET | Swift |
|-------------------------------------------------------------------------------------------------------------------------|----|------|----|--------|------|--------|-----|------|-----|------|-------|
Expand Down
2 changes: 1 addition & 1 deletion specification/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ path_base_for_github_subdir:
- [Metrics](metrics/sdk.md)
- [Logs](logs/sdk.md)
- [Resource](resource/sdk.md)
- [Configuration](configuration/sdk-configuration.md)
- [Configuration](configuration/README.md)
- Data Specification
- [Semantic Conventions](overview.md#semantic-conventions)
- [Protocol](protocol/README.md)
Expand Down
57 changes: 56 additions & 1 deletion specification/configuration/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,59 @@ path_base_for_github_subdir:
to: configuration/README.md
--->

# Configuration
# Overview

OpenTelemetry SDK components are highly configurable. This specification
outlines the mechanisms by which OpenTelemetry components can be configured. It
does not attempt to specify the details of what can be configured.

## Configuration Interfaces

### Programmatic

The SDK MUST provide a programmatic interface for all configuration.
This interface SHOULD be written in the language of the SDK itself.
All other configuration mechanisms SHOULD be built on top of this interface.

An example of this programmatic interface is accepting a well-defined
struct on an SDK builder class. From that, one could build a CLI that accepts a
file (YAML, JSON, TOML, ...) and then transforms into that well-defined struct
consumable by the programmatic interface (
see [declarative configuration](#declarative-configuration)).

### Environment variables

Environment variable configuration defines a set of language agnostic
environment variables for common configuration goals.

See [OpenTelemetry Environment Variable Specification](./sdk-environment-variables.md).

### Declarative configuration

Declarative configuration provides a mechanism for configuring OpenTelemetry
which is more expressive and full-featured than
the [environment variable](#environment-variables) based scheme, and language
agnostic in a way not possible with [programmatic configuration](#programmatic).
Notably, declarative configuration defines tooling allowing users to load
OpenTelemetry components according to a file-based representation of a
standardized configuration data model.

Declarative configuration consists of the following main components:

* [Data model](./data-model.md) defines data structures which allow users to
specify an intended configuration of OpenTelemetry SDK components and
instrumentation. The data model includes a file-based representation.
* [Instrumentation configuration API](./api.md) allows
instrumentation libraries to consume configuration by reading relevant
configuration options during initialization.
* [Configuration SDK](./sdk.md) defines SDK capabilities around file
configuration, including an In-Memory configuration model, support for
referencing custom extension plugin interfaces in configuration files, and
operations to parse configuration files and interpret the configuration data
model.

### Other Mechanisms

Additional configuration mechanisms SHOULD be provided in whatever
language/format/style is idiomatic for the language of the SDK. The
SDK can include as many configuration mechanisms as appropriate.
80 changes: 80 additions & 0 deletions specification/configuration/api.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# Instrumentation Configuration API

**Status**: [Development](../document-status.md)

<!-- toc -->

- [Overview](#overview)
* [ConfigProvider](#configprovider)
+ [ConfigProvider operations](#configprovider-operations)
- [Get instrumentation config](#get-instrumentation-config)
* [ConfigProperties](#configproperties)

<!-- tocstop -->

## Overview

The instrumentation configuration API is part of
the [declarative configuration interface](./README.md#declarative-configuration).

The API allows [instrumentation libraries](../glossary.md#instrumentation-library)
to consume configuration by reading relevant configuration during
initialization. For example, an instrumentation library for an HTTP client can
read the set of HTTP request and response headers to capture.

It consists of the following main components:

* [ConfigProvider](#configprovider) is the entry point of the API.
* [ConfigProperties](#configproperties) is a programmatic representation of a
configuration mapping node.

### ConfigProvider

`ConfigProvider` provides access to configuration properties relevant to
instrumentation.

Instrumentation libraries access `ConfigProvider` during
initialization. `ConfigProvider` may be passed as an argument to the
instrumentation library, or the instrumentation library may access it from a
central place. Thus, the API SHOULD provide a way to access a global
default `ConfigProvider`, and set/register it.

#### ConfigProvider operations

The `ConfigProvider` MUST provide the following functions:

* [Get instrumentation config](#get-instrumentation-config)

TODO: decide if additional operations are needed to improve API ergonomics

##### Get instrumentation config

Obtain configuration relevant to instrumentation libraries.

**Returns:** [`ConfigProperties`](#configproperties) representing
the [`.instrumentation`](https://github.com/open-telemetry/opentelemetry-configuration/blob/670901762dd5cce1eecee423b8660e69f71ef4be/examples/kitchen-sink.yaml#L438-L439)
configuration mapping node.

If the `.instrumentation` node is not set, get instrumentation config MUST
return nil, null, undefined or another language-specific idiomatic pattern
denoting empty.

### ConfigProperties

`ConfigProperties` is a programmatic representation of a configuration mapping
node (i.e. a YAML mapping node).

`ConfigProperties` MUST provide accessors for reading all properties from the
mapping node it represents, including:

* scalars (string, boolean, double precision floating point, 64-bit integer)
* mappings, which SHOULD be represented as `ConfigProperties`
* sequences of scalars
* sequences of mappings, which SHOULD be represented as `ConfigProperties`
* the set of property keys present

`ConfigProperties` SHOULD provide access to properties in a type safe manner,
based on what is idiomatic in the language.

`ConfigProperties` SHOULD allow a caller to determine if a property is present
with a null value, versus not set.
148 changes: 148 additions & 0 deletions specification/configuration/data-model.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
# Configuration Data Model

**Status**: [Development](../document-status.md)

<!-- toc -->

- [Overview](#overview)
* [Stability definition](#stability-definition)
* [File-based configuration model](#file-based-configuration-model)
+ [YAML file format](#yaml-file-format)
+ [Environment variable substitution](#environment-variable-substitution)

<!-- tocstop -->

## Overview

The OpenTelemetry configuration data model is part of
the [declarative configuration interface](./README.md#declarative-configuration).

The data model defines data structures which allow users to specify an intended
configuration of OpenTelemetry SDK components and instrumentation.

The data model is defined
in [opentelemetry-configuration](https://github.com/open-telemetry/opentelemetry-configuration)
using [JSON Schema](https://json-schema.org/).

The data model itself is an abstraction with multiple built-in representations:

* [File-based configuration model](#file-based-configuration-model)
* [SDK in-memory configuration model](./sdk.md#in-memory-configuration-model)

### Stability definition

TODO: define stability guarantees and backwards compatibility

### File-based configuration model

A configuration file is a serialized file-based representation of
the configuration data model.

Configuration files SHOULD use one the following serialization formats:

* [YAML file format](#yaml-file-format)

#### YAML file format

[YAML](https://yaml.org/spec/1.2.2/) configuration files SHOULD follow YAML spec
revision >= 1.2.

YAML configuration files SHOULD be parsed using [v1.2 YAML core schema](https://yaml.org/spec/1.2.2/#103-core-schema).

YAML configuration files MUST use file extensions `.yaml` or `.yml`.

#### Environment variable substitution

Configuration files support environment variables substitution for references
which match the following PCRE2 regular expression:

```regexp
\$\{(?:env:)?(?<ENV_NAME>[a-zA-Z_][a-zA-Z0-9_]*)(:-(?<DEFAULT_VALUE>[^\n]*))?\}
```

The `ENV_NAME` MUST start with an alphabetic or `_` character, and is followed
by 0 or more alphanumeric or `_` characters.

For example, `${API_KEY}` and `${env:API_KEY}` are valid, while `${1API_KEY}`
and `${API_$KEY}` are invalid.

Environment variable substitution MUST only apply to scalar values. Mapping keys
are not candidates for substitution.

The `DEFAULT_VALUE` is an optional fallback value which is substituted
if `ENV_NAME` is null, empty, or undefined. `DEFAULT_VALUE` consists of 0 or
more non line break characters (i.e. any character except `\n`). If a referenced
environment variable is not defined and does not have a `DEFAULT_VALUE`, it MUST
be replaced with an empty value.

When parsing a configuration file that contains a reference not matching
the references regular expression but does match the following PCRE2
regular expression, the parser MUST return an empty result (no partial
results are allowed) and an error describing the parse failure to the user.

```regexp
\$\{(?<INVALID_IDENTIFIER>[^}]+)\}
```

Node types MUST be interpreted after environment variable substitution takes
place. This ensures the environment string representation of boolean, integer,
or floating point fields can be properly converted to expected types.

It MUST NOT be possible to inject YAML structures by environment variables. For
example, see references to `INVALID_MAP_VALUE` environment variable below.

It MUST NOT be possible to inject environment variable by environment variables.
For example, see references to `DO_NOT_REPLACE_ME` environment variable below.

For example, consider the following environment variables,
and [YAML](#yaml-file-format) configuration file:

```shell
export STRING_VALUE="value"
export BOOL_VALUE="true"
export INT_VALUE="1"
export FLOAT_VALUE="1.1"
export HEX_VALUE="0xdeadbeef" # A valid integer value written in hexadecimal
export INVALID_MAP_VALUE="value\nkey:value" # An invalid attempt to inject a map key into the YAML
export DO_NOT_REPLACE_ME="Never use this value" # An unused environment variable
export REPLACE_ME='${DO_NOT_REPLACE_ME}' # A valid replacement text, used verbatim, not replaced with "Never use this value"
```

```yaml
string_key: ${STRING_VALUE} # Valid reference to STRING_VALUE
env_string_key: ${env:STRING_VALUE} # Valid reference to STRING_VALUE
other_string_key: "${STRING_VALUE}" # Valid reference to STRING_VALUE inside double quotes
another_string_key: "${BOOL_VALUE}" # Valid reference to BOOL_VALUE inside double quotes
string_key_with_quoted_hex_value: "${HEX_VALUE}" # Valid reference to HEX_VALUE inside double quotes
yet_another_string_key: ${INVALID_MAP_VALUE} # Valid reference to INVALID_MAP_VALUE, but YAML structure from INVALID_MAP_VALUE MUST NOT be injected
bool_key: ${BOOL_VALUE} # Valid reference to BOOL_VALUE
int_key: ${INT_VALUE} # Valid reference to INT_VALUE
int_key_with_unquoted_hex_value: ${HEX_VALUE} # Valid reference to HEX_VALUE without quotes
float_key: ${FLOAT_VALUE} # Valid reference to FLOAT_VALUE
combo_string_key: foo ${STRING_VALUE} ${FLOAT_VALUE} # Valid reference to STRING_VALUE and FLOAT_VALUE
string_key_with_default: ${UNDEFINED_KEY:-fallback} # UNDEFINED_KEY is not defined but a default value is included
undefined_key: ${UNDEFINED_KEY} # Invalid reference, UNDEFINED_KEY is not defined and is replaced with ""
${STRING_VALUE}: value # Invalid reference, substitution is not valid in mapping keys and reference is ignored
recursive_key: ${REPLACE_ME} # Valid reference to REPLACE_ME
# invalid_identifier_key: ${STRING_VALUE:?error} # If uncommented, this is an invalid identifier, it would fail to parse
```

Environment variable substitution results in the following YAML:

```yaml
string_key: value # Interpreted as type string, tag URI tag:yaml.org,2002:str
env_string_key: value # Interpreted as type string, tag URI tag:yaml.org,2002:str
other_string_key: "value" # Interpreted as type string, tag URI tag:yaml.org,2002:str
another_string_key: "true" # Interpreted as type string, tag URI tag:yaml.org,2002:str
string_key_with_quoted_hex_value: "0xdeadbeef" # Interpreted as type string, tag URI tag:yaml.org,2002:str
yet_another_string_key: "value\nkey:value" # Interpreted as type string, tag URI tag:yaml.org,2002:str
bool_key: true # Interpreted as type bool, tag URI tag:yaml.org,2002:bool
int_key: 1 # Interpreted as type int, tag URI tag:yaml.org,2002:int
int_key_with_unquoted_hex_value: 3735928559 # Interpreted as type int, tag URI tag:yaml.org,2002:int
float_key: 1.1 # Interpreted as type float, tag URI tag:yaml.org,2002:float
combo_string_key: foo value 1.1 # Interpreted as type string, tag URI tag:yaml.org,2002:str
string_key_with_default: fallback # Interpreted as type string, tag URI tag:yaml.org,2002:str
undefined_key: # Interpreted as type null, tag URI tag:yaml.org,2002:null
${STRING_VALUE}: value # Interpreted as type string, tag URI tag:yaml.org,2002:str
recursive_key: ${DO_NOT_REPLACE_ME} # Interpreted as type string, tag URI tag:yaml.org,2002:str
```
Loading

0 comments on commit 9e1aeb7

Please sign in to comment.