/proc parsing refactor #4364

Merged on Oct 28, 2024 (5 commits)

@geyslan geyslan commented Oct 23, 2024

1. Explain what the PR does

5acb711 fix(time): ticks is uint64
c4bdcab chore,fix(proc): improve /proc stat parsing
b9aa357 chore(proc): add benchmark for /proc stat parsing
b3e18a2 chore(proc): improve /proc status parsing
a1f36cb chore(proc): add benchmark for /proc status parsing

5acb711 fix(time): ticks is uint64

The started_time field, read from the /proc stat file and passed as
ticks to ClockTicksToNsSinceBootTime(), is a uint64.
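
As a rough illustration of why the type matters, the sketch below parses a start time as an unsigned 64-bit value and converts it to nanoseconds. The helper `clockTicksToNs` and the constant `userHZ` are hypothetical stand-ins, not the project's `ClockTicksToNsSinceBootTime()`; the value 100 for USER_HZ is an assumption that holds on most Linux systems.

```go
package main

import (
	"fmt"
	"strconv"
)

// userHZ is an assumption: /proc reports times in USER_HZ units, which is
// 100 on most Linux systems. Real code should query sysconf(_SC_CLK_TCK).
const userHZ uint64 = 100

// clockTicksToNs is a hypothetical stand-in for the project's conversion
// helper. It takes uint64 so the parsed value needs no signed conversion.
func clockTicksToNs(ticks uint64) uint64 {
	return ticks * (1_000_000_000 / userHZ)
}

func main() {
	// Parse the field as unsigned 64-bit, matching the type the helper expects.
	ticks, err := strconv.ParseUint("123456", 10, 64)
	if err != nil {
		panic(err)
	}
	fmt.Println(clockTicksToNs(ticks))
}
```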

c4bdcab chore,fix(proc): improve /proc stat parsing

This short-circuits stat parsing by no longer parsing every stat
field. It also removes the unused ProcStat fields, leaving only the
ones we are interested in, which reduces the memory footprint.

To reduce the size of the function, it now uses an array to look up the
field handler functions.
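
A minimal sketch of the array-lookup idea, assuming a trimmed-down struct and only two fields of interest. The names `procStat`, `statHandler`, `handlers`, and `parseStatFields` are hypothetical and do not come from the PR; the field positions follow proc(5), counted from the first field after comm.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// procStat is a hypothetical, trimmed-down struct; the real ProcStat may differ.
type procStat struct {
	ppid      int64
	startTime uint64
}

// Field positions counted after the comm field (state is index 0), per proc(5).
const (
	idxPpid      = 1
	idxStartTime = 19
	idxLast      = idxStartTime // nothing past this index is needed
)

type statHandler func(s *procStat, v string) error

// handlers is indexed by field position; nil entries mean "skip this field".
var handlers = [idxLast + 1]statHandler{
	idxPpid: func(s *procStat, v string) (err error) {
		s.ppid, err = strconv.ParseInt(v, 10, 64)
		return err
	},
	idxStartTime: func(s *procStat, v string) (err error) {
		s.startTime, err = strconv.ParseUint(v, 10, 64)
		return err
	},
}

// parseStatFields walks the space-separated fields that follow comm and
// stops after idxLast, so trailing fields are never split or parsed.
func parseStatFields(rest string) (procStat, error) {
	var s procStat
	for i := 0; i <= idxLast; i++ {
		field, tail, _ := strings.Cut(rest, " ")
		if h := handlers[i]; h != nil {
			if err := h(&s, field); err != nil {
				return s, fmt.Errorf("field %d: %w", i, err)
			}
		}
		rest = tail
	}
	return s, nil
}

func main() {
	rest := "S 1 100 100 0 -1 4194560 500 0 0 0 10 5 0 0 20 0 1 0 98765 1000000 200"
	s, err := parseStatFields(rest)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", s)
}
```

An array indexed by position avoids a long switch statement and makes the cut-off point explicit: the loop bound is the last index that has a handler.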

It also fixes a possible error when parsing the stat file of a process
whose comm field contains inner parentheses. Beyond that, instead of
using a complex regex to change the comm field, it now uses a simple
index-based approach.
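
One way such an index-based split could look is sketched below. It assumes the comm field is delimited by the first '(' and the last ')' on the line, which tolerates parentheses inside the name; `splitStat` is a hypothetical helper, not the function from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitStat is a hypothetical helper: it isolates comm using the first '('
// and the LAST ')' on the line, so a name such as "my (weird) proc" stays
// intact. No regular expression is involved.
func splitStat(line string) (pid, comm, rest string, err error) {
	open := strings.IndexByte(line, '(')
	closeIdx := strings.LastIndexByte(line, ')')
	if open < 1 || closeIdx < open {
		return "", "", "", errors.New("malformed stat line")
	}
	pid = strings.TrimSpace(line[:open])
	comm = line[open+1 : closeIdx]
	rest = strings.TrimSpace(line[closeIdx+1:])
	return pid, comm, rest, nil
}

func main() {
	pid, comm, rest, err := splitStat("1234 (my (weird) proc) S 1 1234")
	if err != nil {
		panic(err)
	}
	fmt.Printf("pid=%q comm=%q rest=%q\n", pid, comm, rest)
}
```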

---

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem
-run=^$ -tags ebpf -bench ^Benchmark_newProcStat$
github.com/aquasecurity/tracee/pkg/utils/proc -benchtime=10000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/utils/proc
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_newProcStat-32 10000000   5152 ns/op   2168 B/op   8 allocs/op
PASS
ok      github.com/aquasecurity/tracee/pkg/utils/proc   51.527s

---

| Metric                   | Old Bench  | New Bench  | Improvement (%) |
|--------------------------|------------|------------|-----------------|
| Execution Time (ns/op)   | 14336      | 5152       | ~64.1% faster   |
| Memory Allocations (B/op)| 2976       | 2168       | ~27.2% reduction|
| Allocations (allocs/op)  | 45         | 8          | ~82.2% reduction|
| Total Time (seconds)     | 143.369    | 51.527     | ~64.1% faster   |

b9aa357 chore(proc): add benchmark for /proc stat parsing

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem
-run=^$ -tags ebpf -bench ^Benchmark_newProcStat$
github.com/aquasecurity/tracee/pkg/utils/proc -benchtime=10000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/utils/proc
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_newProcStat-32 10000000  14336 ns/op  2976 B/op   45 allocs/op
PASS
ok      github.com/aquasecurity/tracee/pkg/utils/proc   143.369s
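
For readers who want to reproduce this kind of measurement, the sketch below shows the general shape of an allocation-reporting benchmark. It is not the PR's `Benchmark_newProcStat`: `fieldCount` and `sampleStat` are hypothetical stand-ins for the parser and its input, and `testing.Benchmark` is used only so the example runs as a plain program. In the output above, `-benchtime=10000000x` asks `go test` for a fixed iteration count instead of a time budget.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sampleStat is a made-up stat line used as fixed in-memory input.
const sampleStat = "1234 (bash) S 1 1234 1234 0 -1 4194560 500 0 0 0 10 5 0 0 20 0 1 0 98765"

// fieldCount is a hypothetical stand-in for the parser under test.
func fieldCount(line string) int {
	return len(strings.Fields(line))
}

// benchmarkFieldCount has the same shape as a Benchmark_* function in a
// _test.go file: report allocations, then run the code b.N times.
func benchmarkFieldCount(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = fieldCount(sampleStat)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	res := testing.Benchmark(benchmarkFieldCount)
	fmt.Println(res.String(), res.MemString())
}
```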

b3e18a2 chore(proc): improve /proc status parsing

This short-circuits status parsing by no longer parsing every status
field. It also makes ProcStatus a concrete type that contains only the
fields we are interested in, which reduces the memory footprint.

To reduce the size of the function, it now uses a map to look up the
field handler functions.
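
A minimal sketch of the map-lookup idea with an early exit, assuming only three fields are wanted. The names `procStatus`, `statusHandlers`, and `parseStatus` are hypothetical, and the real ProcStatus may hold different fields.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// procStatus is a hypothetical concrete type holding only the wanted fields.
type procStatus struct {
	name string
	tgid int
	pid  int
}

type statusHandler func(s *procStatus, v string) error

// statusHandlers maps a status key to the function that stores its value.
var statusHandlers = map[string]statusHandler{
	"Name": func(s *procStatus, v string) error { s.name = v; return nil },
	"Tgid": func(s *procStatus, v string) (err error) { s.tgid, err = strconv.Atoi(v); return err },
	"Pid":  func(s *procStatus, v string) (err error) { s.pid, err = strconv.Atoi(v); return err },
}

// parseStatus reads "Key:\tvalue" lines and stops scanning as soon as every
// key that has a handler was seen, so later lines are never touched.
func parseStatus(content string) (procStatus, error) {
	var s procStatus
	remaining := len(statusHandlers)
	sc := bufio.NewScanner(strings.NewReader(content))
	for remaining > 0 && sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		h, wanted := statusHandlers[key]
		if !wanted {
			continue
		}
		if err := h(&s, strings.TrimSpace(val)); err != nil {
			return s, fmt.Errorf("%s: %w", key, err)
		}
		remaining--
	}
	return s, sc.Err()
}

func main() {
	content := "Name:\tbash\nUmask:\t0022\nState:\tS (sleeping)\nTgid:\t1234\nNgid:\t0\nPid:\t1234\nPPid:\t1\nUid:\t1000\t1000\t1000\t1000\n"
	s, err := parseStatus(content)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", s)
}
```

A map suits the status file because its lines are keyed by name, whereas the stat file is positional and fits an array better.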

---

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem
-run=^$ -tags ebpf -bench ^Benchmark_newProcStatus$
github.com/aquasecurity/tracee/pkg/utils/proc -benchtime=10000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/utils/proc
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_newProcStatus-32 10000000  5551 ns/op  5136 B/op  29 allocs/op
PASS
ok      github.com/aquasecurity/tracee/pkg/utils/proc   55.516s

---

| Metric                   | Old Bench  | New Bench  | Improvement (%) |
|--------------------------|------------|------------|-----------------|
| Execution Time (ns/op)   | 13203      | 5551       | ~58.0% faster   |
| Memory Allocations (B/op)| 17851      | 5136       | ~71.2% reduction|
| Allocations (allocs/op)  | 137        | 29         | ~78.8% reduction|
| Total Time (seconds)     | 132.033    | 55.516     | ~58.0% faster   |

a1f36cb chore(proc): add benchmark for /proc status parsing

Running tool: /home/gg/.goenv/versions/1.22.4/bin/go test -benchmem
-run=^$ -tags ebpf -bench ^Benchmark_newProcStatus$
github.com/aquasecurity/tracee/pkg/utils/proc -benchtime=10000000x

goos: linux
goarch: amd64
pkg: github.com/aquasecurity/tracee/pkg/utils/proc
cpu: AMD Ryzen 9 7950X 16-Core Processor
Benchmark_newProcStatus-32 10000000  13203 ns/op  17851 B/op  137 allocs/op
PASS
ok      github.com/aquasecurity/tracee/pkg/utils/proc   132.033s

2. Explain how to test it

3. Other comments

@geyslan geyslan self-assigned this Oct 23, 2024
@geyslan geyslan requested a review from yanivagman October 23, 2024 14:11
@geyslan geyslan changed the title from "proc parsing refactor" to "/proc parsing refactor" Oct 23, 2024
}

// GetUid returns UID in the following order: real, effective, saved set, filesystem.
func (ps ProcStatus) GetUid() [4]int {
Collaborator

Shouldn't we keep those uid and gid parsers? Why did we have them in the first place?

Member Author

Yep, we could leave them be, but they are not currently used by any caller, so I opted to not include their holders, keeping ProcStatus' memory footprint small.


geyslan commented Oct 28, 2024

/fast-forward

@github-actions github-actions bot merged commit 3f60a6f into aquasecurity:main Oct 28, 2024
31 checks passed
@geyslan geyslan mentioned this pull request Oct 28, 2024