
[Bug] Non-partitioned staging table is being created in s3_data_dir instead of s3_tmp_table_dir for incremental models #608

Open
2 tasks done
sandeepmullangi2 opened this issue Jan 15, 2025 · 4 comments
Labels
pkg:dbt-athena Issue affects dbt-athena type:bug Something isn't working as documented

Comments

@sandeepmullangi2

Is this a new bug in dbt-athena?

  • I believe this is a new bug in dbt-athena
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

In the create_table_as_with_partitions macro, the non-partitioned staging table is currently created under the S3 path configured as s3_data_dir, because temporary is false at that point. s3_tmp_table_dir is not considered at all.

Our setup creates the final target tables in the data lake AWS account and temporary tables in our own AWS account, where we have full access. In this scenario, the temporary staging table, which should be created in our AWS account, is created in the data lake AWS account instead, and we are unable to delete its S3 objects.

Expected Behavior

The non-partitioned staging table should be created in the location configured as s3_tmp_table_dir.
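
To illustrate the difference between the current and expected behavior, here is a minimal Python sketch of the directory selection as I understand it. The function resolve_table_dir is a hypothetical helper written for this issue, not the adapter's actual code; it assumes the temporary directory is only consulted when the caller flags the table as temporary. The bucket names are the ones from the profile in this report.

```python
from typing import Optional


def resolve_table_dir(
    s3_data_dir: str,
    s3_tmp_table_dir: Optional[str],
    is_temporary_table: bool,
) -> str:
    """Pick the base S3 directory for a table.

    Hypothetical helper for illustration only, not dbt-athena's implementation.
    """
    # Assumption: s3_tmp_table_dir is used only when the table is flagged
    # as temporary; otherwise s3_data_dir always wins.
    if is_temporary_table and s3_tmp_table_dir:
        return s3_tmp_table_dir.rstrip("/")
    return s3_data_dir.rstrip("/")


DATA_DIR = "s3://datalake-bucket/models/"
TMP_DIR = "s3://my-bucket/models/"

# Reported behavior: the staging table is created with temporary=false.
current = resolve_table_dir(DATA_DIR, TMP_DIR, is_temporary_table=False)
# Expected behavior: the staging table is treated as temporary.
expected = resolve_table_dir(DATA_DIR, TMP_DIR, is_temporary_table=True)

print(current)
print(expected)
```

Under that assumption, the fix amounts to making the staging table take the temporary branch rather than changing how s3_data_dir is resolved.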

Steps To Reproduce

  1. Create a model file named local_test.sql with the following config:

{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    partitioned_by=['dt'],
    unique_key=['date_key'],
    s3_data_dir='s3://sandeep-dagster/models/data/',
    s3_tmp_table_dir='s3://sandeep-dagster/models/temporary/',
    force_batch='true'
)
}}

SELECT 1 as id, '2022-01-01' AS dt
union all
SELECT 2, '2022-01-02' AS dt

  2. Use the following profile:

dev:
  type: athena
  s3_data_dir: s3://datalake-bucket/models/
  s3_tmp_table_dir: s3://my-bucket/models/
  s3_data_naming: schema_table
  region_name: us-east-2
  database: awsdatacatalog
  schema: test
  work_group: test

  3. Run the model. The local_test__tmp_not_partitioned table is now created under s3_data_dir instead of s3_tmp_table_dir.
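
To confirm the result of the last step above, one option is to read the staging table's location (for example from the StorageDescriptor.Location field of the table in the Glue catalog) and check which configured prefix it falls under. The sketch below is a small helper for that check. The name classify_location and the sample location string are hypothetical; the exact path the adapter generates may differ.

```python
def classify_location(location: str, s3_data_dir: str, s3_tmp_table_dir: str) -> str:
    """Return 'tmp', 'data', or 'other' depending on which prefix contains location."""

    def under(prefix: str) -> bool:
        # Compare with a trailing slash so "s3://b/models" does not match "s3://b/models2".
        return (location.rstrip("/") + "/").startswith(prefix.rstrip("/") + "/")

    if under(s3_tmp_table_dir):
        return "tmp"
    if under(s3_data_dir):
        return "data"
    return "other"


# Hypothetical location of the staging table, using the model config above.
staging_location = "s3://sandeep-dagster/models/data/test/local_test__tmp_not_partitioned"

result = classify_location(
    staging_location,
    s3_data_dir="s3://sandeep-dagster/models/data/",
    s3_tmp_table_dir="s3://sandeep-dagster/models/temporary/",
)
print(result)
```

A result of 'data' for the staging table matches the bug described here; 'tmp' is what the expected behavior would produce.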

Relevant log output

Environment

- OS: macOS Sequoia 15.1
- Python: 3.10.15
- dbt-core: 1.8.7
- dbt-athena: 1.8.4

Additional Context

https://getdbt.slack.com/archives/C013MLFR7BQ/p1736430731459869

@sandeepmullangi2
Author

It's a minor fix. This should solve our problem: dbt-labs/dbt-athena#779

@nicor88
Contributor

nicor88 commented Jan 15, 2025

@sandeepmullangi2 I believe that the dbt-athena implementation is being moved to https://github.com/dbt-labs/dbt-adapters

If I understood @mikealfare correctly, this repo is in deprecation mode.

@mikealfare
Contributor

That's correct. Future issues and pull requests should be submitted against https://github.com/dbt-labs/dbt-adapters. We are midway through the process so some docs have not been updated to reflect the new process. We appreciate your patience as we work through that. In the meantime, I will transfer this issue over to that repo. Unfortunately I cannot do the same for your pull request.

@mikealfare mikealfare added pkg:dbt-athena Issue affects dbt-athena type:bug Something isn't working as documented labels Jan 15, 2025
@mikealfare mikealfare transferred this issue from dbt-labs/dbt-athena Jan 15, 2025
@sandeepmullangi2
Author

Thanks @nicor88 and @mikealfare for moving my issue from the other repo to this one.
That's fine if the PR cannot be migrated. I will redo the PR and submit it once the maintainers are OK with it in this repo.
