Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Skip calling Read after Create/Update operations for notebooks #4173

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

alexott
Copy link
Contributor

@alexott alexott commented Oct 30, 2024

Changes

It's based on #4190.

It was found that the import API returns object_id as a result of its execution, so we don't really need to call the Read operation that will call get-status to fill all attributes (when we create a notebook using the SOURCE format).

The other formats (DBC and JUPYTER) will still call the Read to fill in the missing parameters, such as language.

It also changed the language to optional,computed so we can safely set the value and don't trigger the diff.

This should help when we import a large number of notebooks, i.e., when applying exported resources.

Tests

  • make test run locally
  • relevant change in docs/ folder
  • covered with integration tests in internal/acceptance
  • relevant acceptance tests are passing
  • using Go SDK

@alexott alexott requested review from a team as code owners October 30, 2024 12:21
@alexott alexott requested review from rauchy and removed request for a team October 30, 2024 12:21
@alexott alexott temporarily deployed to test-trigger-is October 30, 2024 12:21 — with GitHub Actions Inactive
@alexott alexott requested a review from mgyucht October 30, 2024 12:21
workspace/resource_notebook.go Outdated Show resolved Hide resolved
workspace/resource_notebook.go Outdated Show resolved Hide resolved
@alexott alexott changed the title [Fix] Skip calling get-status for notebooks in Read after Create/Update operations [Feature] Allow to skip calling Read after Create/Update operations Oct 30, 2024
@alexott alexott changed the title [Feature] Allow to skip calling Read after Create/Update operations [Feature] WIP: Allow to skip calling Read after Create/Update operations Oct 30, 2024
@alexott alexott force-pushed the notebooks-skip-get-status branch from 7ecd15c to 2ddb757 Compare October 30, 2024 17:46
@alexott alexott temporarily deployed to test-trigger-is October 30, 2024 17:46 — with GitHub Actions Inactive
@alexott alexott changed the title [Feature] WIP: Allow to skip calling Read after Create/Update operations [Feature] WIP: Skip calling Read after Create/Update operations for notebooks Nov 3, 2024
alexott added a commit that referenced this pull request Nov 3, 2024
This PR adds the ability for a resource to specify that it may not need to call `Read`
after `Create` and `Update` operations so we can avoid performing another API call(s). The
resource may implement `CanSkipReadAfterCreateAndUpdate` function that can decide if the
`Read` operation should be skipped.

I decided to move common part from #4173 to make it easier to review
@alexott
Copy link
Contributor Author

alexott commented Nov 3, 2024

@mgyucht I've opened #4190 with changes in common so it will be easier to review. This PR will be used only for notebook changes.

github-merge-queue bot pushed a commit that referenced this pull request Nov 4, 2024
## Changes
<!-- Summary of your changes that are easy to understand -->

This PR adds the ability for a resource to specify that it may not need
to call `Read` after `Create` and `Update` operations so we can avoid
performing another API call(s). The resource may implement
`CanSkipReadAfterCreateAndUpdate` function that can decide if the `Read`
operation should be skipped.

I decided to move common part from #4173 to make it easier to review

## Tests
<!-- 
How is this tested? Please see the checklist below and also describe any
other relevant tests
-->

- [x] `make test` run locally
- [ ] relevant change in `docs/` folder
- [ ] covered with integration tests in `internal/acceptance`
- [x] relevant acceptance tests are passing
- [ ] using Go SDK
@alexott alexott force-pushed the notebooks-skip-get-status branch from 2ddb757 to e80cda6 Compare November 5, 2024 13:49
@alexott alexott temporarily deployed to test-trigger-is November 5, 2024 13:50 — with GitHub Actions Inactive
@alexott alexott temporarily deployed to test-trigger-is November 5, 2024 13:50 — with GitHub Actions Inactive
@alexott alexott temporarily deployed to test-trigger-is November 6, 2024 10:26 — with GitHub Actions Inactive
@alexott alexott temporarily deployed to test-trigger-is November 6, 2024 10:26 — with GitHub Actions Inactive
@alexott alexott changed the title [Feature] WIP: Skip calling Read after Create/Update operations for notebooks [Feature] Skip calling Read after Create/Update operations for notebooks Nov 19, 2024
@alexott alexott requested a review from mgyucht November 26, 2024 10:13
Copy link
Contributor

@rauchy rauchy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR is doing a bit more than what the title suggests, especially now that #4190 lets us skip calling Read, and this PR is using that. I'm missing some specific Notebook context, so it's tough for me to fully judge if the change is correct. My review is mainly focused on the tests and code, and I don't see any issues there. @mgyucht, do you think you should have another look at it as well?

workspace/resource_notebook.go Outdated Show resolved Hide resolved
@alexott
Copy link
Contributor Author

alexott commented Dec 4, 2024

@rauchy I've added more context to the PR description

Copy link
Contributor

@mgyucht mgyucht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misunderstanding parts of this, as databricks_notebook is a pretty complex resource, but I think there are some concerns we need to address before merging. LMK what you think.

oldFormat := d.Get("format").(string)
if oldFormat == "" {
source := d.Get("source").(string)
// check if `source` is set, and if it is, use file exension to determine format
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// check if `source` is set, and if it is, use file exension to determine format
// check if `source` is set, and if it is, use file extension to determine format

return nil
},
Read: func(ctx context.Context, d *schema.ResourceData, c *common.DatabricksClient) error {
oldFormat := d.Get("format").(string)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, we are doing this in the Read() method for resource import? We should set these fields correctly on create/update definitely. IIUC we won't have access to the source field during import, so there will be a diff anyways after importing a notebook, and this would be set in the update anyways. So I'm wondering why we need to set these in Read() at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is for importing, specifically for DBC files - If I remember correctly, the problem is that when no format is set and the DBC file is the source, then the Update will delete the original content...

}
if err != nil {
return err
}
}
if d.Get("object_type").(string) == "" {
d.Set("object_type", Notebook)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to know the object type here? Isn't it always notebook for a databricks_notebook? Or is it because this can be a DBC file which can contain other stuff as well?

I am also missing when it would be set before (when this if block would be skipped). It's marked as deprecated and should always be notebook, I guess?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's because DBC could create a directory

@@ -272,6 +279,10 @@ func ResourceNotebook() common.Resource {
Path: path,
Overwrite: true,
}
if createNotebook.Format == "" && createNotebook.Language != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can users specify format? It's not documented on the public docs: https://registry.terraform.io/providers/databricks/databricks/1.61.0/docs/resources/notebook. If not, we can remove the first branch of the if since it is always true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I can encode my .ipynb or .dbc file and submit as content_base64 in terraform.

Comment on lines +282 to +297
if createNotebook.Format == "" && createNotebook.Language != "" {
createNotebook.Format = "SOURCE"
d.Set("format", createNotebook.Format)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this easier to follow, it seems like we should do something like the following:

if createNotebook.Language == "" {
	// Users must not specify `content_base64` without a language, so
	// `source` must be set.
	ext := strings.ToLower(filepath.Ext(d.Get("source").(string)))
	createNotebook.Language = extMap[ext].Language
	createNotebook.Format = extMap[ext].Format
	// Overwrite cannot be used for Dbc format
	createNotebook.Overwrite = extMap[ext].Overwrite
	d.Set("language", createNotebook.Language)
} else {
	createNotebook.Format = "SOURCE"
}
d.Set("format", createNotebook.Format)

Is this logic correct?

Comment on lines +63 to +64
"source": "acceptance/testdata/tf-test-jupyter.ipynb",
"format": "JUPYTER",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these will normally be part of the resource state during read for a newly imported resource. What is this test meant to cover?

@@ -119,22 +155,8 @@ func TestResourceNotebookCreate_DirectoryExist(t *testing.T) {
Overwrite: true,
Format: "SOURCE",
},
},
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice that we save multiple API calls here!

@@ -0,0 +1,42 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a readme for the .dbc files? It's not clear what is in those.

// ResourceNotebook manages notebooks
func ResourceNotebook() common.Resource {
s := FileContentSchema(map[string]*schema.Schema{
"language": {
Type: schema.TypeString,
Optional: true,
Computed: true, // we need it because it will be filled by the provider or backend
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thank you for correcting this

}
ext := strings.ToLower(filepath.Ext(source))
return old == extMap[ext].Language
},
},
"format": {
Type: schema.TypeString,
Optional: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to change it in this PR, but is this really optional? From the docs, users can't specify this, and it seems like we always set it in Create based on the language/source rather than accepting the user-provided value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The language needs to be set for notebooks. When we use the file content then we can guess it from the file extension, but if we provide content as base64, then we don't have this information and user must provide it.

alexott and others added 5 commits January 2, 2025 11:53
It was found that the import API returns `object_id` as a result of its execution, so we
don't really need to call get-status to fill all attributes. This should help when we
import a large number of notebooks, i.e., when applying exported resources.

This also changes `format` and `language` attributes to `optional,computed` to avoid
having issues with defaults, and suppress diff.
@alexott alexott force-pushed the notebooks-skip-get-status branch from 7f1b4e7 to 04f60ca Compare January 2, 2025 11:45
@alexott alexott temporarily deployed to test-trigger-is January 2, 2025 11:46 — with GitHub Actions Inactive
@alexott alexott temporarily deployed to test-trigger-is January 2, 2025 11:46 — with GitHub Actions Inactive
Copy link

github-actions bot commented Jan 2, 2025

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/terraform

Inputs:

  • PR number: 4173
  • Commit SHA: 04f60ca88e895f2a50c960f8e404e58a18b4b013

Checks will be approved automatically on success.

@eng-dev-ecosystem-bot
Copy link
Collaborator

Test Details: go/deco-tests/12581928742

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants