Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support for PARALLEL SAFE #32

Open
gshennessy opened this issue Mar 15, 2021 · 13 comments
Open

support for PARALLEL SAFE #32

gshennessy opened this issue Mar 15, 2021 · 13 comments

Comments

@gshennessy
Copy link

Would it make sense to add PARALLEL SAFE to the SQL definitions for the q3c functions?
The Postgresql Documentation states:

Functions should be labeled parallel unsafe if they modify any database state, or if they make changes to the transaction such as using sub-transactions, or if they access sequences or attempt to make persistent changes to settings (e.g., setval). They should be labeled as parallel restricted if they access temporary tables, client connection state, cursors, prepared statements, or miscellaneous backend-local state which the system cannot synchronize in parallel mode (e.g., setseed cannot be executed other than by the group leader because a change made by another process would not be reflected in the leader). In general, if a function is labeled as being safe when it is restricted or unsafe, or if it is labeled as being restricted when it is in fact unsafe, it may throw errors or produce wrong answers when used in a parallel query. C-language functions could in theory exhibit totally undefined behavior if mislabeled, since there is no way for the system to protect itself against arbitrary C code, but in most likely cases the result will be no worse than for any other function. If in doubt, functions should be labeled as UNSAFE, which is the default.

From my admittedly limited understanding of the guts of q3c, it doesn't try to do anything that would prevent it from being PARALLEL SAFE, so is this a feasible change?

@segasai
Copy link
Owner

segasai commented Mar 15, 2021

There is one part in q3c_radial_query() and q3c_join where I avoid some repeated calculations in q3_get_nearby_it() by using a static variable in the C code. I am not sure that's a problem for PARALLEL, but I thought to be on the safe side I didn't set the functions to be PARALLEL SAFE. I would prefer some analysis/testing before making this change.

@gshennessy
Copy link
Author

What kind of testing would you be interested in?

@segasai
Copy link
Owner

segasai commented Mar 15, 2021

testing involving parallel plans and showing that things work normally.
But I still would need some analysis of https://github.com/segasai/q3c/blob/master/q3c.c and static variables. I just haven't studied how that works in the cases of parallel workers. (it's probably fine, but still).

@tony-12345
Copy link

I added a parallel test which runs your test scripts 5 times in background threads, with some sleep commands to try to prevent the different threads from all running the same query at the same time. This worked for me on Centos with no errors.
tony-12345@5007376

I also added some query plans to show that some of your test queries are in fact using parallel workers.
https://github.com/tony-12345/q3c/tree/master/queryplans

How do you feel about this showing that parallel safe functions work correctly?

@segasai
Copy link
Owner

segasai commented Jan 6, 2022

Thanks for looking into this @tony-12345 . I guess, what you show seems enough for me to turn on parallelism.
I'll change the function definitions in next few days.

@esabol
Copy link
Contributor

esabol commented May 26, 2022

I'll change the function definitions in next few days.

Did this ever happen? Just curious. I'm not seeing anything in the commits.

@segasai
Copy link
Owner

segasai commented May 26, 2022

Sorry, no. The uncommitted diff is still sitting on my machine, I think I got stuck on whether I could just change the q3c--2.0.0.sql (what I've done), or I needed to create a new q3c--2.0.1.sql and the migration sql. I never quite fully understood how extension versions supposed to work...

@esabol
Copy link
Contributor

esabol commented May 26, 2022

I would think it would need to be a new version with SQL to migrate, but I'm by no means an expert.

@gshennessy
Copy link
Author

would it be helpful for me to learn how to do a 'pull request'? :)

@segasai
Copy link
Owner

segasai commented Aug 11, 2022

Yes, please! (the sql update scripts will be also needed)
Personally I'm seeing no benefit in parallel safe (as q3c doesn't stop parallelization of other parts of the query), that's why I never got around to implement it.

@gshennessy
Copy link
Author

My DB servers have 20 cores, which is why I want parallel safe. Most of the time I'm IO bound, but there are times I am CPU bound, so any little bit helps.

@segasai
Copy link
Owner

segasai commented Aug 11, 2022

Yes, I also have multiple cores, but i'm not sure I've seen many cases where I was cpu limited because of q3c. I'm occasionally cpu limited when doing nearest neighbors queries, but that's usually limited by the PG code rather than q3c.

@segasai
Copy link
Owner

segasai commented Dec 12, 2023

It's long overdue, but I've made a pr #36 that makes some q3c functions parallel safe (the ones I was confident about, i.e. q3c_ang2ipix ).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants