Skip to content
This repository has been archived by the owner on Feb 24, 2024. It is now read-only.

Add more fine grained event trigger for automatic schema cache reloading #445

Closed
steve-chavez opened this issue Oct 19, 2021 · 7 comments · Fixed by #489
Closed

Add more fine grained event trigger for automatic schema cache reloading #445

steve-chavez opened this issue Oct 19, 2021 · 7 comments · Fixed by #489
Labels
explanations background and context about how the project works.

Comments

@steve-chavez
Copy link
Member

Right now https://postgrest.org/en/v8.0/schema_cache.html#automatic-schema-cache-reloading reloads on every DDL change but we don't need to reload on changes on triggers, grants, policies, etc.

See PostgREST/postgrest#1512 (comment) for a basic idea on how to do a more fine grained even trigger.

@steve-chavez steve-chavez added the explanations background and context about how the project works. label Oct 19, 2021
@wolfgangwalther
Copy link
Member

For future reference:

  1. all object_type values are to be found here: https://github.com/postgres/postgres/blob/639a86e36aaecb84faaf941dcd0b183ba0aba9e9/src/backend/catalog/objectaddress.c#L630-L844
  2. We probably need the following object_types:
  • collation: allColumns
  • foreign table column: allColumns, pfkSourceColumns
  • foreign table: allTables
  • function: procsSqlQuery
  • materialized view: allTables
  • procedure: when Add support for Stored Procedures postgrest#1976 is merged
  • rule: allTables, might change insertable, updatable, deletable
  • schema: renaming db-schemas might change the whole schema cache
  • table column: allColumns, pfkSourceColumns
  • table constraint: allColumns, allM2ORels, allPrimaryKeys, pfkSourceColumns - I assume this includes PKs and FKs
  • table: allTables
  • trigger: allTables, might change insertable, updatable, deletable
  • type: procsSqlQuery, allColumns
  • view: allTables, pfkSourceColumns

@steve-chavez
Copy link
Member Author

A problem with the current event trigger is that a create temp table fires it.

Just reproduced it with a function like:

create or replace function what() returns text as $$ 
  create temp table if not exists wut(); 
  select 'what'::text 
$$ language sql;

Each call with POST will print:

127.0.0.1 - - [04/Jan/2022:21:18:18 -0500] "POST /rpc/what HTTP/1.1" 200 - "" "curl/7.76.1"
04/Jan/2022:21:18:18 -0500: Schema cache loaded

127.0.0.1 - - [04/Jan/2022:21:18:20 -0500] "POST /rpc/what HTTP/1.1" 200 - "" "curl/7.76.1"
04/Jan/2022:21:18:21 -0500: Schema cache loaded

A GET does fail:

{"hint":null,"details":null,"code":"25006","message":"cannot execute CREATE TABLE in a read-only transaction"}

This of course affects the performance of functions that use temp tables for internal logic.

@steve-chavez
Copy link
Member Author

@wolfgangwalther Can you remind me why we need collations?


I've settled on the following function for now(also took event trigger matrix as reference), since there's no event for CREATE TEMP TABLE I excluded every object created on pg_temp.

CREATE OR REPLACE FUNCTION pgrst_watch() RETURNS event_trigger AS $$
DECLARE
  obj record;
BEGIN
  IF tg_tag IN (
    -- schemas
    'CREATE SCHEMA', 'ALTER SCHEMA', 'DROP SCHEMA'
    -- tables
  , 'CREATE TABLE', 'CREATE TABLE AS', 'SELECT INTO', 'ALTER TABLE', 'DROP TABLE'
    -- foreign tables
  , 'CREATE FOREIGN TABLE', 'ALTER FOREIGN TABLE', 'DROP FOREIGN TABLE'
    -- views
  , 'CREATE VIEW', 'ALTER VIEW', 'DROP VIEW'
    -- materialized views
  , 'CREATE MATERIALIZED VIEW', 'ALTER MATERIALIZED VIEW', 'DROP MATERIALIZED VIEW'
    -- functions
  , 'CREATE FUNCTION', 'ALTER FUNCTION', 'DROP FUNCTION'
    -- procedures
  , 'CREATE PROCEDURE', 'ALTER PROCEDURE', 'DROP PROCEDURE'
    -- triggers
  , 'CREATE TRIGGER', 'DROP TRIGGER'
    -- types
  , 'CREATE TYPE', 'DROP TYPE'
    -- rules
  , 'CREATE RULE', 'DROP RULE'
    -- drops many types of objects
  , 'DROP OWNED'
  ) AND 
  -- don't notify in case of CREATE TEMP table or other objects created on pg_temp
  'pg_temp' NOT IN (SELECT schema_name FROM pg_event_trigger_ddl_commands())
  THEN
    RAISE NOTICE '%', tg_tag;
    NOTIFY pgrst, 'reload schema';
  END IF;
END;
$$ LANGUAGE plpgsql;

@steve-chavez
Copy link
Member Author

Only triggering on the schemas in db-schemas + db-extra-search-path would be ideal, but unless the in-db configuration is used, it's not possible.

For solving that I think we need a new NOTIFY pgrst payload, that allows us to send the schema name.

@wolfgangwalther
Copy link
Member

@wolfgangwalther Can you remind me why we need collations?

I assumed we'd need it because of the following two lines in the allColumns query:

https://github.com/PostgREST/postgrest/blob/6b5370f145ad2f346ec803c0ddaa3f47ab86ed76/src/PostgREST/DbStructure.hs#L581-L582

But it seems like this is a LEFT JOIN from which we don't select any columns. So... we can probably remove that join alltogether.


  -- don't notify in case of CREATE TEMP table or other objects created on pg_temp
  'pg_temp' NOT IN (SELECT schema_name FROM pg_event_trigger_ddl_commands())

I would have to try to reproduce this, but I think this might lead to problems. pg_event_trigger_ddl_commands can return multiple commands, so if only one of those is doing something in pg_temp we reject the whole reload, even though others might be relevant.

Maybe something more like this?

  EXISTS (SELECT 1 FROM pg_event_trigger_ddl_commands WHERE schema_name <> 'pg_temp')

Only triggering on the schemas in db-schemas + db-extra-search-path would be ideal, but unless the in-db configuration is used, it's not possible.

Even if in-db configuration is used, we still wouldn't know the username of the authenticator - so it's not clear to me how we would even query for those settings?

But restricting this to the schemas in db-schemas and db-extra-search-path wouldn't be enough anyway. The recursive view query does only query for views in those schemas - but the base table can be in a completely different, hidden schema. And since we infer some parameters for the views from those tables, we'd need to know about a change of the base table, too.

@steve-chavez
Copy link
Member Author

steve-chavez commented Jan 11, 2022

I would have to try to reproduce this, but I think this might lead to problems. pg_event_trigger_ddl_commands can return multiple commands, so if only one of those is doing something in pg_temp we reject the whole reload, even though others might be relevant.

Yeah, the docs say:

pg_event_trigger_ddl_commands returns one row for each base command executed; some commands that are a single SQL sentence may return more than one row.

https://www.postgresql.org/docs/13/functions-event-triggers.html

Asked on IRC:

steve-chavez | Hello. Q: In the docs about pg_event_trigger_ddl_commands, it says:        
             | "pg_event_trigger_ddl_commands returns one row for each base command       
             | executed; some commands that are a single SQL sentence may return more than
             | one row."                                                                  
steve-chavez | Which are those commands?                                                  
        Myon | I guess something like "create table" that might create a toast table      
        Myon | probably best to check the source                                          

So I guess another schema that needs to be excluded is pg_toast(I've also seen pg_toast_temp_{1,2,3} so these too). No idea what's the full list of those commands though.

Tried changing the logic of the above to instead iterate over pg_event_trigger_ddl_commands() but that is not reliable because somehow command_tag doesn't contain DROP SCHEMA, DROP TABLE, etc. unlike tg_tag which does contain all. So the current logic won't prevent a schema reload in case of dropping a temp table, because pg_event_trigger_ddl_commands is empty for DROPs.

@steve-chavez
Copy link
Member Author

The above behavior is documented here: https://www.postgresql.org/docs/current/event-trigger-definition.html

So unfortunately it seems we'll need two event triggers, one with ddl_command_end and other with sql_drop to be able to exclude created and dropped temp tables.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
explanations background and context about how the project works.
Development

Successfully merging a pull request may close this issue.

2 participants