[BUG] Opensearch trace not marking by Error at the parent level #5325

Open
berezinsn opened this issue Jan 13, 2025 · 0 comments
Labels
bug Something isn't working


Setup:
Otel Agents -> Otel collector -> Jaeger / DataPrepper -> Opensearch -> OpensearchDashboards

Versions:
Opensearch Helm Chart version: 2.27.1, appVersion: 2.18.0
Opensearch-Dashboards Helm Chart version: 2.25.0, appVersion: 2.18.0
Jaeger Helm Chart version: 3.3.3, appVersion: 1.53.0
DataPrepper Helm Chart version: 0.1.0, appVersion: 2.8.0

Describe the issue:
I have a setup with instrumented applications using OpenTelemetry (Otel) agents, which push traces to an Otel collector. The Otel collector sends data to both Jaeger and DataPrepper. However, I am noticing a difference in the behavior of the same traces when viewed in OpenSearch Dashboards depending on the data source selected (Jaeger vs. DataPrepper).

Specifically, when I select DataPrepper as the data source, I do not see the entire trace being marked as a trace with errors, and the errors are not displayed on the dashboard. In contrast, when using Jaeger as the data source, the errors are correctly visualized, and the entire trace is marked as an "error trace" if any span within the trace contains an error.
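
To make the expected behavior concrete, here is a minimal sketch of the two ways a trace-level error flag could be derived from its spans. It is only an illustration: the `Span` class and both function names are hypothetical, and the idea that the DataPrepper-backed view looks only at the root span is an assumption, not something confirmed in this setup. The status codes follow the OTLP convention (0 = UNSET, 1 = OK, 2 = ERROR).

```python
from dataclasses import dataclass

# OTLP span status codes: 0 = UNSET, 1 = OK, 2 = ERROR
STATUS_ERROR = 2


@dataclass
class Span:
    # Hypothetical minimal span record, for illustration only.
    trace_id: str
    span_id: str
    parent_span_id: str  # empty string for the root span
    status_code: int


def trace_has_error_any_span(spans):
    """Behavior seen with the Jaeger source: any failing span marks the trace."""
    return any(s.status_code == STATUS_ERROR for s in spans)


def trace_has_error_root_only(spans):
    """Assumed alternative: only the root span's status marks the trace."""
    return any(
        s.parent_span_id == "" and s.status_code == STATUS_ERROR for s in spans
    )


# A trace whose root span is fine but whose child span failed.
spans = [
    Span("t1", "a", "", 0),
    Span("t1", "b", "a", STATUS_ERROR),
]

print(trace_has_error_any_span(spans))   # any-span rule
print(trace_has_error_root_only(spans))  # root-only rule
```

With a trace like the one above, the two rules disagree, which would match the difference I see between the two data sources.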

Configuration:
Jaeger:

jaeger:
  agent:
    enabled: false
  provisionDataStore:
    cassandra: false
    elasticsearch: false
  collector:
    enabled: true
    annotations: {}
    image:
      registry: ""
      repository: jaegertracing/jaeger-collector
      tag: ""
      digest: ""
    envFrom: []
    cmdlineParams: {}
    basePath: /
    replicaCount: 1
    service:
      otlp:
        grpc:
          name: "otlp-grpc"
          port: 4317
        http:
          name: "otlp-http"
          port: 4318
    serviceAccount:
      create: true
  storage:
    type: elasticsearch
    elasticsearch:
      scheme: http
      host: opensearch-cluster-master.opensearch-otel.svc.cluster.local
      port: 9200
      anonymous: true
      usePassword: false
        - name: SPAN_STORAGE_TYPE
          value: "opensearch"
        - name: ES_TAGS_AS_FIELDS_ALL
          value: "true"
      tls:
        enabled: false

DataPrepper:

    config:
      otel-trace-pipeline:
        delay: "1000"
        source:
          otel_trace_source:
            ssl: false
        buffer:
          bounded_blocking:
            buffer_size: 10240
            batch_size: 160
        sink:
          - pipeline:
              name: "raw-traces-pipeline"
          - pipeline:
              name: "otel-service-map-pipeline"
      raw-traces-pipeline:
        source:
          pipeline:
            name: "otel-trace-pipeline"
        buffer:
          bounded_blocking:
            buffer_size: 10240
            batch_size: 160
        processor:
          - otel_trace_raw:
          - otel_trace_group:
              hosts: [ "http://opensearch-cluster-master:9200" ]
              insecure: true
        sink:
          - opensearch:
              hosts: [ "http://opensearch-cluster-master:9200" ]
              insecure: true
              index_type: trace-analytics-raw
      otel-service-map-pipeline:
        delay: "1000"
        source:
          pipeline:
            name: "otel-trace-pipeline"
        buffer:
          bounded_blocking:
            buffer_size: 10240
            batch_size: 160
        processor:
          - service_map_stateful:
              window_duration: 300
        sink:
          - opensearch:
              hosts: [ "http://opensearch-cluster-master:9200" ]
              insecure: true
              index_type: trace-analytics-service-map
              index: otel-v1-apm-span-%{yyyy.MM.dd}
              #max_retries: 20
              bulk_size: 4

Relevant Logs or Screenshots:
DataPrepper source: the error is shown on the span, but the trace as a whole is not marked as an error, and no error statistics appear on the dashboard.
Screenshot 2024-12-24 at 16 31 59
Screenshot 2024-12-24 at 16 30 49

Jaeger source: the error is shown on the span, and the whole trace is marked as an error (top right corner of the second capture).
Screenshot 2024-12-24 at 16 31 43
Screenshot 2024-12-24 at 16 31 07
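
To check what DataPrepper actually wrote for this trace, a query along these lines could be run against the span indices. This is a sketch under assumptions: the field names `traceId`, `status.code`, and `traceGroupFields.statusCode` are what I would expect in the DataPrepper span documents, but I have not verified them against the index mapping, and the trace ID below is a placeholder.

```python
import json


def error_spans_query(trace_id):
    """Build a search body for spans of one trace with ERROR status (code 2).

    Field names are assumed, not verified against the actual index mapping.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"traceId": trace_id}},
                    {"term": {"status.code": 2}},
                ]
            }
        },
        # Return only the fields needed to compare span-level and
        # trace-group-level status.
        "_source": [
            "spanId",
            "parentSpanId",
            "name",
            "status.code",
            "traceGroupFields.statusCode",
        ],
    }


# Placeholder trace ID; replace with the real one from the dashboard.
body = error_spans_query("0123456789abcdef")
print(json.dumps(body, indent=2))
```

The printed body can be sent to the `_search` endpoint of the `otel-v1-apm-span-*` indices. If error spans come back while the trace-group status on them is not 2, that would suggest the span-level error is stored but not propagated to the trace level.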

The trace ID is the same in both cases. Any suggestions on how to fix this would be appreciated.
Thanks
