Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store query texts in temp file #620

Merged
merged 2 commits into from
Oct 23, 2024
Merged

Store query texts in temp file #620

merged 2 commits into from
Oct 23, 2024

Conversation

seanlinsley
Copy link
Member

#605 succeeded in reducing memory usage, but it also caused collector query time in pg_stat_statements to increase. This PR tries to solve both issues by writing query texts to a tempfile so the query can finish quickly, and then reading each query text in one-by-one to keep memory usage down.

@seanlinsley seanlinsley requested a review from a team October 22, 2024 17:24
@seanlinsley seanlinsley changed the title Store pg_stat_statements in temp file Store query texts in temp file Oct 22, 2024
Copy link
Contributor

@msakrejda msakrejda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment, but looks good.

queryStats := make([]state.PostgresStatementStats, 0)
queryLength := make([]int, 0)
if showtext {
tmpFile, err = ioutil.TempFile("", "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is deprecated in Go now; it just calls another function.

Suggested change
tmpFile, err = ioutil.TempFile("", "")
tmpFile, err = os.CreateTemp("", "")

I'll clean up other usage, but we should avoid introducing more.

// Currently, Aurora gives wildly incorrect blk_read_time and blk_write_time values
// for utility statements; ignore I/O timing in this situation.
if !postgresVersion.IsAwsAurora || !receivedQuery.Valid {
if !postgresVersion.IsAwsAurora || receivedQuery != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if !postgresVersion.IsAwsAurora || receivedQuery != "" {
if !postgresVersion.IsAwsAurora || receivedQuery == "" {

queryLength = append(queryLength, len(receivedQuery.String))
tmpFile.WriteString(receivedQuery.String)
} else {
if ignoreIOTiming(postgresVersion, "") {
Copy link
Member

@lfittl lfittl Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I just discussed with Sean, this is actually not doing what we intended, because it will only ignore I/O timing when showtext=true (i.e. the once every 10 minute snapshot), not in the every 1min snapshots - effectively causing those to collect the bogus I/O times for utility statements.

We should be able to address this correctly by moving this logic to the output part of the collector (where we have the text available and can cleanup the stats for a given query). At that point we may also want to check if Aurora changelog notes this bug being fixed.

For this PR we can remove this check here (since it doesn't do anything). Note this PR also changes the ignoreIOTiming for showtext=true to run outside the for loop. That's actually very good to do because it removes a pg_query parse call inside the for loop (and as such should reduce the observed runtime for this query on the database side).

@@ -230,6 +234,18 @@ func GetStatements(ctx context.Context, server *state.Server, logger *util.Logge
statementTextsByFp := make(state.PostgresStatementTextMap)
statementStats := make(state.PostgresStatementStatsMap)

queryKeys := make([]state.PostgresStatementKey, 0)
queryStats := make([]state.PostgresStatementStats, 0)
queryLength := make([]int, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would make more sense to me if its called queryLengths (i.e. with a "s" at the end)

}

if err = rows.Err(); err != nil {
return nil, nil, nil, err
}

tmpFile.Seek(0, io.SeekStart)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whilst this doesn't crash when tmpFile is not initialized (I don't understand why, but [Go Playground confirms it](tmpFile.Seek(0, io.SeekStart))), it might be best if we wrap this and the for loop below in if showtext, just so a future reader doesn't think it might be a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think that's because the first thing Seek does is call File.checkValid, which returns an error if File is nil. I agree wrapping this in a showtext check is worthwhile, and checking the return value from Seek may be worth doing, too, to avoid surprises.

if err != nil {
return nil, nil, nil, err
}
query := string(bytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could technically avoid the memory copy here (see e.g. https://www.reddit.com/r/golang/comments/14xvgoj/converting_string_byte/ for a way of doing it), but not sure if its worth saving on that (maybe worth a benchmark, not necessarily right now)

@seanlinsley seanlinsley merged commit 6b769dc into main Oct 23, 2024
3 checks passed
@seanlinsley seanlinsley deleted the statements-tmpfile branch October 23, 2024 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants