[BUG] Set observedTime and time to current time instead of epoch 0 (Jan 1, 1970) #5275

Open
JannikBrand opened this issue Dec 19, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@JannikBrand
Contributor

JannikBrand commented Dec 19, 2024

Describe the bug

Copied content from my comment.

When data is sent directly to Data Prepper without the observedTimeUnixNano and timeUnixNano fields, it creates documents in OpenSearch with the time and observedTime fields set to epoch 0 (Jan 1, 1970).
This makes the logs very hard to find. Users are sometimes under the impression that the logs weren't ingested at all, since they only check a recent time range.

Based on the spec of the observedTime field, it "is the time when OpenTelemetry’s code observed the event measured by the clock of the OpenTelemetry code", so Data Prepper should set it to the current time.

Based on the spec of the time field, setting it to epoch 0 seems wrong. Either the field should be dropped (because it is optional) or set to the value of observedTime. The latter would make sense, since the spec mentions: "Use Timestamp if it is present, otherwise use ObservedTimestamp".
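The fallback described in the two paragraphs above can be sketched as follows. This is only an illustration of the proposed rule, not Data Prepper's actual code: the function names `nanos_to_iso` and `resolve_timestamps` and the `now_ns` parameter are hypothetical. It assumes that an unset timestamp arrives as 0, which is the protobuf default for these fields and is how the epoch 0 values come about.

```python
import time
from datetime import datetime, timezone


def nanos_to_iso(nanos: int) -> str:
    """Format Unix-epoch nanoseconds as an ISO 8601 UTC string (second precision)."""
    seconds = nanos // 1_000_000_000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def resolve_timestamps(time_unix_nano: int, observed_time_unix_nano: int, now_ns=None):
    """Return (time, observedTime) in nanoseconds with the proposed fallbacks applied.

    Hypothetical helper: 0 is treated as "not set".
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # observedTime: fall back to the receiver's current clock when missing.
    observed = observed_time_unix_nano or now_ns
    # time: "Use Timestamp if it is present, otherwise use ObservedTimestamp".
    event_time = time_unix_nano or observed
    return event_time, observed


# A record without either field currently ends up at epoch 0:
print(nanos_to_iso(0))

# With the fallback, both fields resolve to the ingestion time instead:
event_time, observed = resolve_timestamps(0, 0)
print(nanos_to_iso(event_time), nanos_to_iso(observed))
```

Dropping the time field entirely, the other option mentioned above, would instead mean leaving it out of the document whenever `time_unix_nano` is 0.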

To Reproduce

First, create an "otel-log-without-time.json" file, e.g.:

{
  "resourceLogs": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.name",
            "value": { "stringValue": "my-application" }
          }
        ],
        "droppedAttributesCount": 0
      },
      "scopeLogs": [
        {
          "scope": {
            "name": "scopeName",
            "version": "version1"
          },
          "logRecords": [
            {
              "severityNumber": 9,
              "severityText": "Info",
              "body": { "stringValue": "This is a log message" },
              "attributes": [],
              "droppedAttributesCount": 0,
              "traceId": "08040201000000000000000000000000",
              "spanId": "0102040800000000"
            }
          ],
          "schemaUrl": "foo"
        }
      ],
      "schemaUrl": "bar"
    }
  ]
}

Second, send it via grpcurl to Data Prepper, e.g.:

grpcurl -insecure -d @ < otel-log-without-time.json <dp_endpoint>:<dp_otel_log_port> opentelemetry.proto.collector.logs.v1.LogsService/Export

The Data Prepper log pipeline looks something like this (note that proto_reflection_service is enabled so that grpcurl works):

logs-pipeline:
  source:
    otel_logs_source:
      ssl: false
      proto_reflection_service: true

  buffer:
    bounded_blocking:
      buffer_size: 12800
      batch_size: 200
  processor:
  sink:
    - opensearch:
        hosts: [ "<opensearch_endpoint>" ]
        insecure: true
        username: <os_username>
        password: <os_user_password>
        index: logs-otel-v1-%{yyyy.MM.dd}

Resulting OpenSearch doc:

{
  "_index": "logs-otel-v1-2024.12.19",
  "_type": "_doc",
  "_id": "GB2n3pMB0Mc1_i72fE8Y",
  "_version": 1,
  "_score": null,
  "_source": {
    "traceId": "d3cd38d36d35d34d34d34d34d34d34d34d34d34d34d34d34",
    "spanId": "d35d36d38d3cd34d34d34d34",
    "severityText": "Info",
    "flags": 0,
    "time": "1970-01-01T00:00:00Z",
    "severityNumber": 9,
    "droppedAttributesCount": 0,
    "serviceName": "my-application",
    "body": "This is a log message",
    "observedTime": "1970-01-01T00:00:00Z",
    "schemaUrl": "bar",
    "instrumentationScope.name": "scopeName",
    "resource.attributes.service@name": "my-application",
    "instrumentationScope.version": "version1"
  },
  "fields": {
    "observedTime": [
      "1970-01-01T00:00:00.000Z"
    ],
    "time": [
      "1970-01-01T00:00:00.000Z"
    ]
  },
  "sort": [
    0
  ]
}
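To check whether documents that are already indexed are affected, a small helper along these lines could flag them. It is a rough sketch that works on the `_source` part of a document like the one above; the function name `epoch_zero_fields` is made up for illustration, and how the documents are fetched from OpenSearch is left out.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_zero_fields(source: dict, fields=("time", "observedTime")) -> list:
    """Return the names of the timestamp fields in `source` that sit at epoch 0."""
    hits = []
    for name in fields:
        value = source.get(name)
        if not isinstance(value, str):
            continue  # field missing or not a string timestamp
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
        if parsed == EPOCH:
            hits.append(name)
    return hits


# Trimmed-down copy of the "_source" from the resulting document
source = {
    "time": "1970-01-01T00:00:00Z",
    "observedTime": "1970-01-01T00:00:00Z",
    "body": "This is a log message",
}
print(epoch_zero_fields(source))
```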

Expected behavior

The OpenSearch document should have observedTime set to the time at which Data Prepper ingested the log.

Optionally, time could be set to the value of observedTime when it is missing. That way the time field can always be used as the time field of an OpenSearch index pattern.
Alternatively, users could achieve this behavior with a Data Prepper processor.

Screenshots

image

=> The case where the log contains no time field. Here the logs-otel-v1-* index pattern uses time as its time field, so all affected logs sit at epoch 0 and are therefore hard to find.

Environment (please complete the following information):

  • DP version: 2.9.0
@dblock
Member

dblock commented Jan 6, 2025

[Catch All Triage - 1, 2, 3, 4, 5, 6]
