Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow different schema for tmp tables created during table materialization #664

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

pierrebzl
Copy link
Contributor

@pierrebzl pierrebzl commented May 29, 2024

Description

As described in https://github.com/dbt-athena/dbt-athena/issues/662, this intend to extend temp_schema option table materialization that requires to create transient temporary tables.

After this PR, with temp_schema set on all your models that use table maaterialization, you should not see any tmp tables created inside model target schemas.

Models used to test - Optional

Checklist

  • You followed contributing section
  • You kept your Pull Request small and focused on a single feature or bug fix.
  • You added unit testing when necessary
  • You added functional testing when necessary

@pierrebzl pierrebzl changed the title feat: Allow to define a different schema for tmp tables created during table materialization feat: Allow different schema for tmp tables created during table materialization May 29, 2024
@nicor88
Copy link
Contributor

nicor88 commented Jun 12, 2024

@pierrebzl is this ready for review? If so we can move it from the draft state, and I can start to review it.

@pierrebzl
Copy link
Contributor Author

@pierrebzl is this ready for review? If so we can move it from the draft state, and I can start to review it.

Yes, sorry I was away for the past 2 weeks.
Yes, this part is RFR but I don't think it covers all the use case of temp table creation yet.

Copy link
Contributor

@svdimchenko svdimchenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the feature seems to be pretty useful. Pls review my comments regarding code style and all materialisation types

@svdimchenko
Copy link
Contributor

functional test seems to be failed, pls take a look there

@pierrebzl pierrebzl requested a review from svdimchenko June 16, 2024 18:19
{%- set old_relation = adapter.get_relation(database=database, schema=schema, identifier=identifier) -%}
{%- set old_tmp_relation = adapter.get_relation(identifier=identifier ~ '__ha',
schema=schema,
database=database) -%}
{%- if temp_schema is not none and old_tmp_relation is not none-%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully get this check, why do you need to include a check for old_tmp_relation not being none here? If old_tmp_relation is none, we still want to create the new tmp table in the tmp schema.

@@ -36,6 +36,14 @@
{%- endcall %}
{%- endmacro %}

{% macro set_table_relation_schema(relation, schema) %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add some comments to this method?

{% macro set_table_relation_schema(relation, schema) %}
{%- if temp_schema is not none -%}
{%- set relation = relation.incorporate(path={"schema": schema}) -%}
{%- do create_schema(relation) -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% convinced by this. pretty_much create_schema will be called by every model where we setup the temp_schema. It will be good too find a a better way to handle that.

@@ -36,6 +36,14 @@
{%- endcall %}
{%- endmacro %}

{% macro set_table_relation_schema(relation, schema) %}
{%- if temp_schema is not none -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{%- if temp_schema is not none -%}
{%- if schema is not none -%}

or could you explain how do you get temp_schema here ?

Copy link
Contributor

@nicor88 nicor88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of of all thanks for the contribution, I left some comments, please have a look.

Also, I would like to know if this feature works also for hive tables with ha=true, maybe consider to add a test for those, and be sure to include them in the implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants