
Revise import table feature for time-related types #2461

Open
wants to merge 2 commits into base: add_time_related_types from add_time_related_types_revise_import

Conversation

@Torch3333 (Contributor) commented on Jan 10, 2025

Description

This PR brings three main changes to the import table feature:

  1. Make the time-related types that previously could not be imported importable.

  2. When importing a table, the default ScalarDB type mapping for a given column can now be overridden. As of now, this is only useful for some time-related types:

    • The Oracle DATE type, which can be mapped to the ScalarDB DATE (default mapping), TIME, or TIMESTAMP type
    • The Oracle TIMESTAMP type, which can be mapped to the ScalarDB DATE or TIMESTAMP (default mapping) type
    • The MySQL DATETIME type, which can be mapped to the ScalarDB TIMESTAMP (default mapping) or TIMESTAMPTZ type

    To do so, the user defines the columns requiring a type override through the Schema Loader import table schema file (see the sketch after this list) or the Admin API.

  3. To verify that there is no encoding-decoding issue with the JDBC driver for all data types that ScalarDB supports through the import table feature, I added an integration test that performs a put and then a get of a generic value for each column.
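
For reference, a minimal sketch of what the Schema Loader import schema file could look like with such an override. The table key and the "transaction" field follow the schema file diff shown later in this conversation; the column name and the exact value syntax inside override-columns-type are assumptions for illustration only.

    {
      "sample_namespace1.sample_table1": {
        "transaction": true,
        "override-columns-type": {
          "col25": "TIMESTAMPTZ"
        }
      }
    }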

Related issues and/or PRs

Changes made

See the description above and the comments left below.

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

This PR will be merged into the feature branch master...add_time_related_types

Release notes

N/A

@Torch3333 force-pushed the add_time_related_types_revise_import branch from d69ffca to c6999af on January 10, 2025 at 09:24
@Torch3333 force-pushed the add_time_related_types_revise_import branch from c6999af to 9c1de6d on January 14, 2025 at 08:53
@@ -179,9 +187,14 @@ private LinkedHashMap<String, String> prepareColumnsForMysql() {
columns.put("col18", "TINYBLOB");
columns.put("col19", "MEDIUMBLOB");
columns.put("col20", "LONGBLOB");
columns.put("col21", "BINARY(255)");
columns.put("col21", "BINARY(5)");
Torch3333 (author) commented:

I changed the column size to facilitate inserting a BLOB value for the simple write-read integration test. For this data type, the value is right-padded if the value length is shorter than the column size.
I made a similar change for some other data types that are right-padded.

public void close() throws SQLException {
dataSource.close();
}

@SuppressWarnings("UseCorrectAssertInTests")
public static class JdbcTestData implements TestData {
Torch3333 (author) commented:

I refactored the JdbcAdminImportTestUtils class to store the test data for importable and non-importable tables in this object, to improve readability.

@@ -147,4 +156,84 @@ private void importTable_ForNonExistingTable_ShouldThrowIllegalArgumentException
() -> admin.importTable(getNamespace(), "non-existing-table", Collections.emptyMap()))
.isInstanceOf(IllegalArgumentException.class);
}

public void importTable_ForImportedTable_ShouldPutThenGetCorrectly() throws ExecutionException {
Torch3333 (author) commented:

To verify that there is no encoding-decoding issue with the JDBC driver for all data types that ScalarDB supports through the import table feature, I added an integration test that performs a put and then a get of a generic value for each column.
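
A rough sketch of the put-then-get idea (not the PR's actual test code; it assumes ScalarDB's DistributedStorage builder API, and the namespace, table, key, and column names are made up for illustration):

    Key pk = Key.ofInt("pk", 1);
    storage.put(
        Put.newBuilder()
            .namespace("ns")
            .table("imported_table")
            .partitionKey(pk)
            .textValue("col2", "sample value") // one generic value per imported column
            .build());
    Optional<Result> result =
        storage.get(
            Get.newBuilder()
                .namespace("ns")
                .table("imported_table")
                .partitionKey(pk)
                .build());
    assert result.isPresent() && "sample value".equals(result.get().getText("col2"));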

@@ -1,6 +1,10 @@
{
"sample_namespace1.sample_table1": {
"transaction": true
"transaction": true,
"override-columns-type": {
@Torch3333 (author) commented on Jan 15, 2025:

By setting columns with their desired data type inside the override-columns-type JSON object of the import table schema file, the user can override the default type mapping. Only the columns that require an override need to be set.
This is currently only useful for some time-related types on MySQL and Oracle.

Comment on lines +459 to 476
/**
* Imports an existing table that is not managed by ScalarDB.
*
* @param namespace an existing namespace
* @param table an existing table
* @param options options to import
* @param overrideColumnsType a map of column data type by column name. Only set the column for
* which you want to override the default data type mapping.
* @throws IllegalArgumentException if the table is already managed by ScalarDB, if the target
* table does not exist, or if the table does not meet the requirement of ScalarDB table
* @throws ExecutionException if the operation fails
*/
void importTable(
String namespace,
String table,
Map<String, String> options,
Map<String, DataType> overrideColumnsType)
throws ExecutionException;
@Torch3333 (author) commented on Jan 15, 2025:

Add the Map<String, DataType> overrideColumnsType parameter to override the type for columns in the Admin.importTable(...) method.
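
For illustration, a hedged usage sketch of the new parameter (the namespace, table, and column names are hypothetical; the mapping mirrors the MySQL DATETIME-to-TIMESTAMPTZ override mentioned in the description):

    // Hypothetical caller-side sketch, not code from this PR.
    Map<String, DataType> overrideColumnsType =
        Collections.singletonMap("col25", DataType.TIMESTAMPTZ); // MySQL DATETIME column
    admin.importTable("ns", "imported_table", Collections.emptyMap(), overrideColumnsType);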

Comment on lines +62 to +76
/**
* Get import table metadata in the ScalarDB format.
*
* @param namespace namespace name of import table
* @param table import table name
* @param overrideColumnsType a map of column data type by column name. Only set the column for
* which you want to override the default data type mapping.
* @throws IllegalArgumentException if the table does not exist
* @throws IllegalStateException if the table does not meet the requirement of ScalarDB table
* @throws ExecutionException if the operation fails
* @return import table metadata in the ScalarDB format
*/
TableMetadata getImportTableMetadata(
String namespace, String table, Map<String, DataType> overrideColumnsType)
throws ExecutionException;
@Torch3333 (author) commented on Jan 15, 2025:

Add the Map<String, DataType> overrideColumnsType parameter to override the type for columns in the Admin.getImportTableMetadata(...) method.
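
And a similarly hedged sketch for the metadata variant, assuming TableMetadata exposes column types through getColumnDataType (names are hypothetical):

    // Hypothetical caller-side sketch, not code from this PR.
    TableMetadata metadata =
        admin.getImportTableMetadata("ns", "imported_table", overrideColumnsType);
    // The overridden column should be reported with the requested ScalarDB type.
    assert metadata.getColumnDataType("col25") == DataType.TIMESTAMPTZ;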

Comment on lines +729 to +734
JDBC_IMPORT_DATA_TYPE_OVERRIDE_NOT_SUPPORTED(
Category.USER_ERROR,
"0158",
"The underlying storage data type %s is not supported as ScalarDB %s data type: %s",
"",
""),
Torch3333 (author) commented:

@josh-wong
Do you mind checking the English of this error message?

@Torch3333 Torch3333 marked this pull request as ready for review January 15, 2025 00:54
@Torch3333 Torch3333 self-assigned this Jan 15, 2025
columns.put("col23", "TIME(6)");
columns.put("col24", "DATETIME(6)");
columns.put("col25", "DATETIME(6)"); // override to TIMESTAMPTZ
// With Mysql 5.7, if a TIMESTAMP column is not explicitly declared with the NULL attribute, it
A reviewer (Contributor) commented:

Suggested change
-// With Mysql 5.7, if a TIMESTAMP column is not explicitly declared with the NULL attribute, it
+// With MySQL 5.7, if a TIMESTAMP column is not explicitly declared with the NULL attribute, it

return new JdbcTestData(
tableName, createTableSql, overrideColumnsType, tableMetadata, columns);
}
/**
A reviewer (Contributor) commented:

Suggested change
/**
/**

// Both MySQL TIMESTAMP and DATETIME data types are mapped to the TIMESTAMP JDBC type
case TIMESTAMP:
if (overrideDataType == DataType.TIMESTAMPTZ || typeName.equalsIgnoreCase("TIMESTAMP")) {
return DataType.TIMESTAMPTZ;
A reviewer (Contributor) commented:

With this PR, can the following requirement be addressed?

  • The underlying database has some timestamps in DATETIME columns
  • The timestamps in the columns are JST-based
  • The user wants to import the timestamps to ScalarDB as ScalarDB TIMESTAMPTZ, not ScalarDB TIMESTAMP

I guess timezone issues would occur, since the DATETIME columns don't carry timezone information and there is no way to specify the timezone offset between JST and UTC at import time. Is my understanding correct?
