Relation stats: Call pg_stat_get_* directly instead of using system views #512

Merged
merged 1 commit into from
Mar 22, 2024

lfittl
Member

@lfittl lfittl commented Mar 7, 2024

Because of how pg_stat_all_tables and pg_statio_all_tables are defined, it's significantly more expensive to query through the views when a table filter is in place: all the data gets pulled, and only then is the filter applied.

Instead, call the underlying pg_stat_get_* functions directly. This has the same effect as querying the views (they are simple views without any security barrier), and shows a major speedup in cases where a large portion of tables is filtered out. It should also provide a minor speedup in regular operations, even with no table filter in place.
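
For readers who have not looked at the view definitions, here is a minimal sketch of the idea, not the query from this PR. It builds a SQL string that joins pg_class and pg_namespace and calls the statistics functions per relation, with the table filter in the same WHERE clause; the intent is that the functions only need to be evaluated for relations that pass the filter. The pg_stat_get_* functions and the column names are the ones the stock views use. The helper name `build_relation_stats_query`, the `$1` regexp parameter, the exact filter expression, and the relkind list are assumptions made for illustration.

```python
# Column name -> expression pairs, following how pg_stat_all_tables and
# pg_statio_all_tables derive these columns from the underlying functions.
STAT_COLUMNS = [
    ("seq_scan", "pg_stat_get_numscans(c.oid)"),
    ("seq_tup_read", "pg_stat_get_tuples_returned(c.oid)"),
    ("n_tup_ins", "pg_stat_get_tuples_inserted(c.oid)"),
    ("n_tup_upd", "pg_stat_get_tuples_updated(c.oid)"),
    ("n_tup_del", "pg_stat_get_tuples_deleted(c.oid)"),
    ("n_live_tup", "pg_stat_get_live_tuples(c.oid)"),
    ("n_dead_tup", "pg_stat_get_dead_tuples(c.oid)"),
    ("heap_blks_read",
     "pg_stat_get_blocks_fetched(c.oid) - pg_stat_get_blocks_hit(c.oid)"),
    ("heap_blks_hit", "pg_stat_get_blocks_hit(c.oid)"),
]


def build_relation_stats_query(with_filter: bool = True) -> str:
    """Return a SQL string that reads table stats without going through the views.

    Hypothetical helper: when with_filter is true, the query expects one
    parameter ($1), a case-insensitive regexp of "schema.table" names to skip.
    """
    select_list = ",\n       ".join(
        f"{expr} AS {name}" for name, expr in STAT_COLUMNS
    )
    query = (
        "SELECT c.oid AS relid,\n"
        "       n.nspname AS schemaname,\n"
        "       c.relname AS relname,\n"
        f"       {select_list}\n"
        "  FROM pg_catalog.pg_class c\n"
        "  JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace\n"
        # Assumed relkind list: tables, TOAST tables, matviews, partitioned tables.
        " WHERE c.relkind IN ('r', 't', 'm', 'p')"
    )
    if with_filter:
        # The filter sits next to the catalog scan instead of on top of a view.
        query += "\n   AND (n.nspname || '.' || c.relname) !~* $1"
    return query
```

Calling `build_relation_stats_query()` returns the filtered variant, and `build_relation_stats_query(with_filter=False)` returns the same query without the `$1` condition. Running either against a server is left to whatever driver is in use.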

@msakrejda
Contributor

For the record, this rewrite of the query did not perform faster for the customer for whom this was a problem (we suspect they had more general I/O issues). Is this still worth doing?

@lfittl
Member Author

lfittl commented Mar 20, 2024

> For the record, this rewrite of the query did not perform faster for the customer for whom this was a problem (we suspect they had more general I/O issues). Is this still worth doing?

Maybe? I haven't identified a downside to this approach, and it does save a few cycles in all cases, and more so when a table filter is active (even if it's not necessarily a big performance impact on regular hardware).

@lfittl lfittl marked this pull request as ready for review March 20, 2024 23:02
@lfittl lfittl requested a review from a team March 20, 2024 23:02
@msakrejda
Contributor

Yeah, makes sense.

Contributor

@msakrejda msakrejda left a comment

Double-checked the SQL against the source view definitions, and it looks good to me.

@lfittl lfittl merged commit 60101c4 into main Mar 22, 2024
7 checks passed
@lfittl lfittl deleted the speed-up-relation-stats branch March 22, 2024 17:38