-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reporting the Advisory locks, XML Functions and System columns on the DDLs in the unsupported features in assess/analyze #2061
Reporting the Advisory locks, XML Functions and System columns on the DDLs in the unsupported features in assess/analyze #2061
Conversation
… DDLs in the unsupported_features in analyze
87dbbac
to
43f1853
Compare
@@ -50,7 +50,7 @@ func (d *TableIssueDetector) DetectIssues(obj queryparser.DDLObject) ([]QueryIss | |||
// Check for generated columns | |||
if len(table.GeneratedColumns) > 0 { | |||
issues = append(issues, NewGeneratedColumnsIssue( | |||
TABLE_OBJECT_TYPE, | |||
obj.GetObjectType(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a TableIssueDetector
will always return issues for Table object types, right?
Is this just a change to not use a constant and use the function, or is there a certain bug/case because of which you had to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just a change to not use a constant and use the function,
Yes right, that way I was able to directly use ddlObj.GetObjectType()
to get the respective type.
return issues, nil | ||
} | ||
|
||
genericIssues, err := p.genericIssues(query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment here explaining why we're making this call for DDLs. Give an example.
@@ -289,6 +269,22 @@ func (p *ParserIssueDetector) getDDLIssues(query string) ([]QueryIssue, error) { | |||
issues[i].SqlStatement = query | |||
} | |||
} | |||
|
|||
if _, ok := ddlObj.(*queryparser.Object); ok { // In case the DDL doesn't have any processor skip checking generic issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a ddlobject is an interface that already adheres to GetObjectName
and GetSchemaName
. So why do we need to additionally check if you can type cast it to a queryparser.Object
. As long as ddlObject is of type DDLObject interface
, you should be good, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there were reasons to add this check here-
- To handle the scenario where the stmt coming to getDDLIssues is DML stmt which can be in case of PLPGQueries where we call internal getAllIssues(), so with that generic function is getting called for the select stmt twice giving the duplicate issues. - This can be solved by just unique issues by some Key or anything
- To handle the scenario where the ObjectType is not known in this code path (the case where DDLObject is not implemented for some DDL types yet) so it might return. the objectType -
OBJECT
(NoOpProcessor). For such case, the analyze code to convertIssue toAnalyzeIssue() where we set invalidCount of objectname based on ObjectTYpe will fail with nil pointer as it doesn't know about it. - This can also be solved by handling it properly in that function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, changed this condition with isDDL in the starting of getDDLIssues()
} | ||
|
||
// TODO: in future when we will DDL issues detection here we need `GetDDLIssues` | ||
func (p *ParserIssueDetector) getDMLIssues(query string) ([]QueryIssue, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would recommend still keeping the getDMLIssues
and GetDMLIssues
both. The idea was to not touch the GetDMLIssues
often and keep the getIssuesNotFixedInTargetDbVersion
in it so that it does not get missed.
If we have just one function, there is a high possibility that we accidentally just write some return from the middle of that function which will lead to issues not being filtered out..
|
||
assert.Equal(t, len(expectedIssues), len(issues), "Mismatch in issue count for statement: %s", stmt) | ||
for _, expectedIssue := range expectedIssues { | ||
found := slices.ContainsFunc(issues, func(QueryIssue QueryIssue) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, see if we can use cmp.Equal for this.
@@ -58,22 +39,6 @@ func IsMviewObject(parseTree *pg_query.ParseResult) bool { | |||
return isCreateAsStmt && createAsNode.CreateTableAsStmt.Objtype == pg_query.ObjectType_OBJECT_MATVIEW | |||
} | |||
|
|||
func GetSelectStmtQueryFromViewOrMView(parseTree *pg_query.ParseResult) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so we don't really need these anymore because parser can directly parse MVIEW/VIEW, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah
@priyanshi-yb It would be good to also include details for those issues in the report.
I believe these details are not available right now, am i right? |
Yes, currently it is not available we might have to enhance the traversal logic to return all the variety of unsupported functions/columns.. we found in the statement |
@@ -210,6 +219,11 @@ func TestAllIssues(t *testing.T) { | |||
}, | |||
} | |||
|
|||
//Should modify it in similar way we do it actual code as the particular DDL issue in plpgsql can have different Details map on the basis of objectType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused. What was wrong with the previous approach where you define issues directly with "FUNCTION", "list_high_earners" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
example for this cluster on -
func NewClusterONIssue(objectType string, objectName string, sqlStatement string) QueryIssue {
details := map[string]interface{}{}
//for ALTER AND INDEX both same struct now how to differentiate which one to not
if objectType == "TABLE" {
details["INCREASE_INVALID_COUNT"] = false
}
return newQueryIssue(clusterOnIssue, objectType, objectName, sqlStatement, details)
}
earlier I was using the New function for getting queryissue -
NewClusterONIssue("FUNCTION", "list_high_earners", "ALTER TABLE employees CLUSTER ON idx;")
giving this QueryIssue, with details as empty as -
{{CLUSTER_ON ALTER TABLE CLUSTER not supported yet. Remove it from the exported schema. https://github.com/YugaByte/yugabyte-db/issues/1124 https://docs.yugabyte.com/preview/yugabyte-voyager/known-issues/postgresql/#unsupported-alter-table-ddl-variants-in-source-schema map[]} FUNCTION list_high_earners ALTER TABLE employees CLUSTER ON idx; map[]}
But the actual query issue will look like this where details will be populated based on TABLE object type-
{{CLUSTER_ON ALTER TABLE CLUSTER not supported yet. Remove it from the exported schema. https://github.com/YugaByte/yugabyte-db/issues/1124 https://docs.yugabyte.com/preview/yugabyte-voyager/known-issues/postgresql/#unsupported-alter-table-ddl-variants-in-source-schema map[]} FUNCTION list_high_earners ALTER TABLE employees CLUSTER ON idx; map[INCREASE_INVALID_COUNT:false]}
In the code to detect these issues in PLPGSQL, we modify the issues later (in getAllPLPGSQLIssues()) to change the objectType and objectName to the actual object name and type of the PLPGSQL object.
So now with cmp.Equal, it also compares details
map, I had to change the way I generate the expected issues to be able to check properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Orthogonal point:
INCREASE_INVALID_COUNT
should not be in queryissue layer.
It is an analyze-schema detail, we should have some logic of figuring it out at that layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree. This was just a hack while refactoring. I already removed it in this PR with handling all cases for invalid count #2073
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh nice, will check!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Describe the changes in this pull request
Unsupported features
in the reports Advisory locks, XML functions and System columns in the DDL.GetDDLIssues()
in theparser_issue_detector_test.go
Fixes #2025
Sample Assessment report -
Sample Analyze report -
Describe if there are any user-facing changes
Added new features section under Unsupported Features
How was this pull request tested?
Existing tests were enough.
Added unit tests
Does your PR have changes that can cause upgrade issues?