Test TestMetricsAndPprofExist intermittently fails due to Tracee timeout and subsequent "already running" error #4487

Open
ShohamBit opened this issue Jan 13, 2025 · 0 comments
ShohamBit (Collaborator) commented Jan 13, 2025

Description

The TestMetricsAndPprofExist test is experiencing intermittent failures related to how the Tracee process is managed. The failure manifests in two stages:

Stage 1: Timeout

Initially, the test fails because Tracee does not start within the defined timeout. See issue #4486 for more details.

Stage 2: "Already Running"

When the test is rerun immediately after a timeout failure, it fails again, but this time due to the error "tracee is already running" (testutils.TraceeAlreadyRunning). This is because the Tracee process from the previous, timed-out run is still active in the background and not properly terminated by the test framework.

The core issue is that when TestMetricsAndPprofExist times out, the Tracee process is left orphaned. The test has no mechanism to gracefully terminate a timed-out Tracee instance before it is rerun. Currently, manual intervention is required: the lingering Tracee process has to be found with pgrep and killed (or killed directly with pkill) before the test can be rerun successfully.

pgrep tracee

kill [output of pgrep]
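
One possible direction for a fix, sketched below, is to have the test stop the process it started instead of relying on the manual steps above. This is only a minimal sketch and not the existing testutils code: stopProcess, startStandIn, isAlive and defaultGrace are hypothetical names, and `sleep 60` stands in for the Tracee binary. It sends SIGTERM, waits for a grace period, and falls back to SIGKILL.

```go
package main

import (
	"errors"
	"os/exec"
	"syscall"
	"time"
)

// defaultGrace is an assumed grace period; the right value for Tracee is not known here.
const defaultGrace = 2 * time.Second

// startStandIn starts a long-running child process. "sleep 60" is only a
// stand-in for Tracee so that the sketch runs on any Linux machine.
func startStandIn() (*exec.Cmd, error) {
	cmd := exec.Command("sleep", "60")
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// stopProcess asks an already started process to exit with SIGTERM, waits up
// to grace for it to exit, and falls back to SIGKILL if it does not.
func stopProcess(cmd *exec.Cmd, grace time.Duration) error {
	if cmd.Process == nil {
		return errors.New("process was never started")
	}

	// Reap the child in the background so it cannot linger as a zombie.
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	// A signal error (for example, the process already exited) is ignored;
	// the Wait call above observes the exit either way.
	_ = cmd.Process.Signal(syscall.SIGTERM)

	select {
	case <-done:
		return nil
	case <-time.After(grace):
		// The process ignored SIGTERM within the grace period: force it.
		_ = cmd.Process.Kill()
		<-done
		return nil
	}
}

// isAlive reports whether a process with the given PID can still be signaled.
// Signal 0 performs the existence check without delivering a signal.
func isAlive(pid int) bool {
	return syscall.Kill(pid, 0) == nil
}
```

In a Go test, a helper like this could be registered with t.Cleanup right after the process is started, so that it also runs when the test fails on the startup timeout. Whether this fits the way the test framework launches Tracee (for example, if it runs under sudo) would need to be checked.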

Output of tracee version:

Tracee version: main-4cdea40dce

Output of uname -a:

Linux shoham 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Additional details
