Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support getting event sample rate from TraceState p-value #77

Open
spencerwilson opened this issue May 11, 2022 · 2 comments
Open

Support getting event sample rate from TraceState p-value #77

spencerwilson opened this issue May 11, 2022 · 2 comments
Labels
type: enhancement New feature or request

Comments

@spencerwilson
Copy link

spencerwilson commented May 11, 2022

Is your feature request related to a problem? Please describe.

Honeycomb can't easily interoperate with OpenTelemetry systems that perform sampling. Specifically, in a multi-service OTel-first system, there's no convenient method or place to adapt OTel p-values into sampleRate attributes.

Describe the solution you'd like

Update api.honeycomb.io's OTLP/gRPC ingest to behave like so: For each span during TranslateTraceRequest,

  1. Deserialize a span's trace state, sourced from Span.GetTraceState. See details on OpenTelemetry's use of trace state here: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.11.0/specification/trace/tracestate-handling.md
  2. If trace state defines a p-value in the interval [0, 62], set the Honeycomb event's SampleRate to 2 ** p. The OTel spec refers to this quantity as "adjusted count".
  3. If trace state defines a p-value equal to 63, drop the event.
  4. Otherwise, use the existing getSampleRate logic that consults span attributes.

Note: I'm unsure of the correctness of step (3). Am seeking clarification with the OTel Sampling SIG now: https://cloud-native.slack.com/archives/C027DS6GZD3/p1652294663462109

Describe alternatives you've considered

A processor in the OpenTelemetry Collector that copies/adapts p values into a sampleRate span attribute might be feasible, but no one's open-sourced such a processor yet.

Additional context

  • Honeycomb's OTLP ingest supports span attributes sampleRate and SampleRate as a means of conveying "this is how much to re-weight this span by when doing aggregate calculations". (src)
  • Since OpenTelemetry spec v1.9.0 there's been an (experimental) OTel-native way to convey this information: a "p-value" in a span's TraceState.
@spencerwilson spencerwilson added the type: enhancement New feature or request label May 11, 2022
@JamieDanielson JamieDanielson added the status: oncall Flagged for awareness from Honeycomb Telemetry Oncall label May 11, 2022
@MikeGoldsmith
Copy link
Contributor

Hi @spencerwilson - thanks for raising this issue.

We agree adding support for OpenTelemetry sampling would be good to add. We'll look into what is required and prioritise accordingly.

@MikeGoldsmith MikeGoldsmith removed the status: oncall Flagged for awareness from Honeycomb Telemetry Oncall label May 12, 2022
@spencerwilson
Copy link
Author

spencerwilson commented Jun 10, 2022

If trace state defines a p-value equal to 63, drop the event.

fwiw I confirmed that this is indeed the correct behavior, if you want estimated quantities (counts, etc.) to be valid. 63 would only be indicated if the span was collected due to a "non-probability sampler" choosing it, and such a sampler is a thing that kind of exists "outside of" the statistically quantifiable realm of sampling.

AFAIK it's a pretty theoretical concern at present. No one's advertised that they've built any non-probability sampler yet, so spans with p=63 will be virtually nonexistent in practice.

The best-best treatment of spans with p=63 would be to both:

  • store the spans; and
  • give them a weight of 0 in calculations.

I'm not sure if the Honeycomb query engine supports 0-weighted events, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants