Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Control file module files and validation #2445

Draft
wants to merge 44 commits into
base: master
Choose a base branch
from

Conversation

inv-jishnu
Copy link
Contributor

@inv-jishnu inv-jishnu commented Jan 2, 2025

Description

This PR adds files related to control file and its validation. A control file is used to map column names specified in import source file to actual table columns.

Related issues and/or PRs

Please review and merge this PR once the following PRs are merged

Changes made

I have added files related to control file and its validation

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

Road map to merge remaining data loader core files. Current status

Release notes

NA

@inv-jishnu inv-jishnu marked this pull request as draft January 2, 2025 05:03
@inv-jishnu inv-jishnu self-assigned this Jan 7, 2025
@ypeckstadt ypeckstadt added the enhancement New feature or request label Jan 8, 2025
@ypeckstadt ypeckstadt marked this pull request as ready for review January 8, 2025 23:47
DATA_LOADER_MISSING_COLUMN_MAPPING(
Category.USER_ERROR,
"0159",
"No mapping found for column '%s' in table '%s' in the control file. \\nControl file validation set at 'FULL'. All columns need to be mapped.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"No mapping found for column '%s' in table '%s' in the control file. \\nControl file validation set at 'FULL'. All columns need to be mapped.",
"No mapping found for column '%s' in table '%s' in the control file. Control file validation set at 'FULL'. All columns need to be mapped.",

Ignore this comment if the newline is really necessary.

DATA_LOADER_ERROR_CRUD_EXCEPTION(
Category.INTERNAL_ERROR,
"0047",
"Something went wrong while trying to save the data. Details %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Something went wrong while trying to save the data. Details %s",
"Something went wrong while trying to save the data. Details: %s",

DATA_LOADER_ERROR_SCAN(
Category.INTERNAL_ERROR,
"0048",
"Something went wrong while scanning. Are you sure you are running in the correct transaction mode? Details %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Something went wrong while scanning. Are you sure you are running in the correct transaction mode? Details %s",
"Something went wrong while scanning. Are you sure you are running in the correct transaction mode? Details: %s",

@JsonProperty("namespace")
private String namespace;

/** The name of the table in ScalarDB. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namespace above is also just the name of a namespace. How about renaming this to table as well or the above namespace to namespaceName for consistency?

""),
DATA_LOADER_CONTROL_FILE_MISSING_DATA_MAPPINGS(
Category.USER_ERROR, "0160", "The control file is missing data mappings", "", ""),
DATA_LOADER__MISSING_NAMESPACE_OR_TABLE(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DATA_LOADER__MISSING_NAMESPACE_OR_TABLE(
DATA_LOADER_MISSING_NAMESPACE_OR_TABLE(

String namespace,
String table,
Key partitionKey,
Key clusteringKey,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Key clusteringKey,
@Nullable Key clusteringKey,

Java is a poor language unable to represent nullable or non-nullable with type, and @Nullable annotation is very helpful for us.

logger.info(String.format(GET_COMPLETED_MSG, loggingKey));
return result;
} catch (CrudException e) {
throw new ScalarDBDaoException("error GET " + loggingKey, e.getCause());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw new ScalarDBDaoException("error GET " + loggingKey, e.getCause());
throw new ScalarDBDaoException("error GET " + loggingKey, e);

Just including e would be a bit better not to lose the context.

logger.info(SCAN_START_MSG);
Scanner scanner = storage.scan(scan);
List<Result> allResults = scanner.all();
scanner.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible this close isn't called if scanner.all() throws an exception. How about using try-with-resources statement for scanner?

/* Distributed storage for ScalarDB connection that is running in storage mode. */
@Nullable private final DistributedStorage storage;
/* Distributed Transaction manager for ScalarDB connection that is running in transaction mode */
private final DistributedTransactionManager transactionManager;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 3 variables should have @Nullable annotation as well.

}

/** @return storage for ScalarDB connection that is running in storage mode */
public DistributedStorage getDistributedStorage() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public DistributedStorage getDistributedStorage() {
@Nullable public DistributedStorage getDistributedStorage() {

@inv-jishnu inv-jishnu marked this pull request as draft January 9, 2025 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants