Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reported events timestamp is wrong #4505

Closed
yanivagman opened this issue Jan 15, 2025 · 5 comments · Fixed by #4513
Closed

Reported events timestamp is wrong #4505

yanivagman opened this issue Jan 15, 2025 · 5 comments · Fixed by #4513
Assignees
Labels
Milestone

Comments

@yanivagman
Copy link
Collaborator

yanivagman commented Jan 15, 2025

Description

Reported timestamp is incorrect.
It was found that tracee is incorrectly using monotonic clock in userspace and boot time clock in bpf code.
During tracee initialization, this code is called:

	// Checking the kernel symbol needs to happen after obtaining the capability;
	// otherwise, we get a warning.
	usedClockID := traceetime.CLOCK_BOOTTIME
	err = capabilities.GetInstance().EBPF(
		func() error {
			supported, innerErr := bpf.BPFHelperIsSupported(bpf.BPFProgTypeKprobe, bpf.BPFFuncKtimeGetBootNs)

                       fmt.Printf("supported: %v, innerErr: %v\n", supported, innerErr)

			// only report if operation not permitted
			if errors.Is(innerErr, syscall.EPERM) {
				return innerErr
			}

			// If BPFFuncKtimeGetBootNs is not available, eBPF will generate events based on monotonic time.
			if !supported {
				usedClockID = traceetime.CLOCK_MONOTONIC
			}
			return nil
		})
	if err != nil {
		return errfmt.WrapError(err)
	}

Running with this debug print message returns: supported: false, innerErr: operation failed for function ktime_get_boot_ns with program type BPF_PROG_TYPE_KPROBE: no such file or directory

  1. We need to figure out why this error is returned and fix it
  2. The code that checks for EPERM is also incorrect. This error will never be returned as can be seen in libbpfgo: https://github.com/aquasecurity/libbpfgo/blob/5d6a4fcd14d7cf760b017317ff0c00d2e841be22/libbpfgo.go#L105
  3. We need to fallback to clock boottime in case an error is returned since it is supported from kernel 5.5 and more common

Output of tracee version:

(paste your output here)

Output of uname -a:

(paste your output here)

Additional details

@geyslan
Copy link
Member

geyslan commented Jan 15, 2025

@NDStrahilevitz we need to enlarge the scope to consider returning other errors as well. @rscampos FYI.

tracee/pkg/ebpf/tracee.go

Lines 389 to 397 in a215fa6

// only report if operation not permitted
if errors.Is(innerErr, syscall.EPERM) {
return innerErr
}
// If BPFFuncKtimeGetBootNs is not available, eBPF will generate events based on monotonic time.
if !supported {
usedClockID = traceetime.CLOCK_MONOTONIC
}

@rscampos rscampos assigned rscampos and unassigned NDStrahilevitz Jan 15, 2025
@rscampos
Copy link
Collaborator

rscampos commented Jan 15, 2025

Folks, I took ownership of this issue since I worked on this in the past.

Figure out the issue... libbpf checks /proc/version_signature to get ubuntu version, but in the newer ubuntu kernels (or other dists) this file doesn't existis. This results into no such file or directory. The error occurs in libbpfgo, which is incorrectly checking errno and returning an error to the user.

I'll rework the libbpfgo func BPFHelperIsSupported since we need a way to warn user when the users don't have permission.

@geyslan
Copy link
Member

geyslan commented Jan 16, 2025

@rscampos I believe it's an issue in the libbpf itself which is propagating the errno not related to the scope called. The helper that attempts to get the kernel version and fails (trying for Ubuntu), should clear errno, since it doesn't fail - it has a fallback to get the release from uname:

https://github.com/libbpf/libbpf/blob/0ff2f8e0ee22453b08fe857d0ce4a6adb44b1427/src/libbpf_probes.c#L31

https://github.com/libbpf/libbpf/blob/0ff2f8e0ee22453b08fe857d0ce4a6adb44b1427/src/libbpf_probes.c#L97

strstr from Debian helper doesn't set errno, sscanf is successful and probably doesn't touch erro then.

The code path is keeping errno tainted when it should be cleared.

@rscampos
Copy link
Collaborator

rscampos commented Jan 16, 2025

Thank you for this analysis, I agree with you - the fact the libbpf don't clear the errno and also, the fact that we relied in the errno in libbpfgo and return it direct to the user was the root cause of the Tracee issue reported by Yaniv.

Besides that, in the end, the only errno which matters for us was is if the user has or not privilege (in order to execute the probe). Take a look on bpftool:
https://github.com/libbpf/bpftool/blob/main/src/feature.c#L689-L701

So the ideia, for libbpfgo, is only rely in errno if is EPERM. If its the case, we return EPERM to user. (Let me validate this idea in practice).

@geyslan
Copy link
Member

geyslan commented Jan 16, 2025

So the ideia, for libbpfgo, is only rely in errno if is EPERM. If its the case, we return EPERM to user. (Let me validate this idea in practice).

I agree, but keep in mind that as they don't clear errno out... other errno could be set as:

  • faccessat: various (EPERM included, but not the case since it calls as read only).
  • fscanf: various.
  • sscanf: EILSEQ, EINVAL, ENOMEM (rare).
  • uname: EFAULT - not apply since the buf is ok.
  • fopen: various.
    ...

In other words, consider that it might be tainted with EPERM in a way not expected.

Consider to raise a fix to libbpf as a continuation of the libbpfgo fix. 👍🏼

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants