-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Managed Iceberg] support BQMS catalog #33511
Conversation
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
|
||
def bqmsLocation = "$buildDir/libs" | ||
task downloadBqmsJar(type: Copy) { | ||
def jarUrl = 'https://storage.googleapis.com/spark-lib/bigquery/iceberg-bigquery-catalog-1.5.2-0.1.0.jar' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we define the version somewhere to control this and add the comments about when to change the version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
related, currently it downloads from a gcs bucket through googleapis. Is iceberg-bigquery-catalog available in some public (trusted) source / a maven repository?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is iceberg-bigquery-catalog available in some public
Not yet. But I believe the BQ team guarantees the availability of the jar in location and it can be trusted. I don't think we have a clear ETA regarding when this jar will be available in a public repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for defining the version elsewhere for easy tracking. Probably we can just define this at the BeamModulePlugin.
def activemq_version = "5.14.5" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely convinced about defining a version here, since this jar is downloaded from a plain GCS bucket path, not a repository that keeps track of versions.
e.g. there's no guarantee the BigQuery team will use the same naming template if they hypothetically release a new catalog version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we just define versions as constants there, not really related to any repositories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, we shouldn't define the dependency there, just the version so that it's better visible at the top level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that a new GCS path may not have the same naming template, so defining a version may not be helpful here. I don't have strong feelings about it though, I can add it in if it's a blocker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just define it there for consistency. We can always change or define a new constant if the naming pattern changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in the latest commit. I'm not the biggest fan 😅 but it should work.
Lmk if this is what you had in mind
Assigning reviewers. If you would like to opt out of this review, comment R: @robertwb for label java. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
...va/io/iceberg/src/test/java/org/apache/beam/sdk/io/iceberg/catalog/IcebergCatalogBaseIT.java
Outdated
Show resolved
Hide resolved
LGTM other than the existing comments. |
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
Outdated
Show resolved
Hide resolved
...iceberg/src/test/java/org/apache/beam/sdk/io/iceberg/catalog/BigQueryMetastoreCatalogIT.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. LGTM.
* Add BQMS catalog * trigger integration tests * build fix * use shaded jar * shadowClosure * use global timeout for tests * define version in BeamModulePlugin * address comments
Downloads the BQMS jar during build time (best effort)
Adds validation tests for BQMS catalog