Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Router custom-learned-route-priority undefined behavior #12355

Merged
merged 12 commits into from
Jan 17, 2025

Conversation

akshat-jindal-nit
Copy link
Contributor

@akshat-jindal-nit akshat-jindal-nit commented Nov 19, 2024

Bug: b/378841773
Issue: custom-learned-route-priority is an optional field .
However, TF would set custom-learned-route-priority to be 0 even if the user has not explicitly set the value to be 0 when there is an update to the resource.

Repro steps:

Create a Router peer resource without custom-learned-route-priority value set.

resource "google_compute_router_peer" "peer" {
  name                      = "tf-test-my-router-peer-1"
  router                    = google_compute_router.router.name
  region                    = google_compute_router.router.region
  interface                 = "interface-1"
  peer_asn                  = 65513
  advertise_mode            = "CUSTOM"
}

Query gcloud to check the value of custom-learned-route-priority, it will be empty
gcloud compute routers describe {router-name}

Update the router peer resource (ex: advertised_route_priority = 100) without acustom-learned-route-priority. However, TF would add custom-learned-route-priority = 0 in the update api call.

resource "google_compute_router_peer" "peer" {
  name                      = "tf-test-my-router-peer-1"
  router                    = google_compute_router.router.name
  region                    = google_compute_router.router.region
  interface                 = "interface-1"
  peer_asn                  = 65513
  advertised_route_priority = 100
  advertise_mode            = "CUSTOM"
}

Query gcloud to check the value of custom-learned-route-priority, it will be 0 even though it is not set by the user.
gcloud compute routers describe {router-name}

Release Note Template for Downstream PRs (will be copied)

compute: fixed a issue where `custom_learned_route_priority` was accidentally set to 0 during updates in 'google_compute_router_peer'
bug: added support for setting `custom_learned_route_priority` to 0 in 'google_compute_router_peer' by adding the `zero_custom_learned_route_priority` field

@github-actions github-actions bot requested a review from slevenick November 19, 2024 04:15
Copy link

Hello! I am a robot. Tests will require approval from a repository maintainer to run.

@slevenick, a repository maintainer, has been assigned to review your changes. If you have not received review feedback within 2 business days, please leave a comment on this PR asking them to take a look.

You can help make sure that review is quick by doing a self-review and by running impacted tests locally.

@akshat-jindal-nit
Copy link
Contributor Author

We see this type of similar issue for advertised_route_policy , @harshithpatte-g made that fix #11613.

@harshithpatte-g
Copy link
Contributor

LGTM.

Copy link
Contributor

@slevenick slevenick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an update test that tests this behavior specifically?

@akshat-jindal-nit
Copy link
Contributor Author

Can you add an update test that tests this behavior specifically?

I tested it manually by doing a gcloud query for the repro steps. We can't write a test or add into an existing test.
I have tried different permutations ,
update the existing resources without custom_learned_route_priority with the fix.
update the existing resources with custom_learned_route_priority with the fix.
Create a new resource with a fix.
Create a new resource with custom_learned_route_priority with a fix.
Remove custom_learned_route_priority from an existing resource with custom_learned_route_priority.
and so on.

@github-actions github-actions bot requested a review from slevenick November 20, 2024 10:07
@slevenick
Copy link
Contributor

Can you add an update test that tests this behavior specifically?

I tested it manually by doing a gcloud query for the repro steps. We can't write a test or add into an existing test. I have tried different permutations , update the existing resources without custom_learned_route_priority with the fix. update the existing resources with custom_learned_route_priority with the fix. Create a new resource with a fix. Create a new resource with custom_learned_route_priority with a fix. Remove custom_learned_route_priority from an existing resource with custom_learned_route_priority. and so on.

We should be able to write test check steps to validate that a field is set to a certain value after a config has been applied.

That's what we should do here, apply a config with no value and check that it doesn't have one, then apply a config with the 0 value and check that it gets applied, etc.

Look for occurrences of resource.TestCheckResourceAttr for examples of this

Copy link
Contributor

@slevenick slevenick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above

@akshat-jindal-nit
Copy link
Contributor Author

slevenick

I think we can't write the automate test as terraform consider empty or null int field value as 0 .
FYI
I manually check it using these steps https://googlecloudplatform.github.io/magic-modules/develop/test/run-tests/#optional-test-manually.
Here we seeing the field is not set at all when we use the gcloud by fetching out the details of that particular resource.

@github-actions github-actions bot requested a review from slevenick November 22, 2024 09:00
@slevenick
Copy link
Contributor

slevenick

I think we can't write the automate test as terraform consider empty or null int field value as 0 . FYI I manually check it using these steps https://googlecloudplatform.github.io/magic-modules/develop/test/run-tests/#optional-test-manually. Here we seeing the field is not set at all when we use the gcloud by fetching out the details of that particular resource.

We should be able to write a test that sets the field to 0. And then we should be able to verify that Terraform correctly set the value to 0 in the API. Can you add a test that does so?

@akshat-jindal-nit
Copy link
Contributor Author

akshat-jindal-nit commented Nov 25, 2024

@slevenick

  1. When the customer explicitly sets custom_learned_route_prioirty as 0. TF would consider this as no change and do not make an api call.

  2. (The important issue) Without explicitly setting custom_learned_route_prioirty as 0, any update to resource like advertised_route_priority would make a patch call with custom_learned_route_prioirty = 0 (side effect) along with the change (advertised_route_priority = true)

So here we go with solution 2. To use ResourceData.getOk() instead of deprecated ResourceData.getOkExists(). Because the data type is an integer, empty value(nil) is always set as 0. ResourceData.getOk() will consider 0 as resource value doesn't exist.

Solution to point 1. We don't have it currently as TF SDK doesn't support nil for integer type. The scope of point 1 is greater than just this field.
That's why writing automated test is not possible in this case.

@slevenick
Copy link
Contributor

@slevenick

  1. When the customer explicitly sets custom_learned_route_prioirty as 0. TF would consider this as no change and do not make an api call.
  2. (The important issue) Without explicitly setting custom_learned_route_prioirty as 0, any update to resource like advertised_route_priority would make a patch call with custom_learned_route_prioirty = 0 (side effect) along with the change (advertised_route_priority = true)

So here we go with solution 2. To use ResourceData.getOk() instead of deprecated ResourceData.getOkExists(). Because the data type is an integer, empty value(nil) is always set as 0. ResourceData.getOk() will consider 0 as resource value doesn't exist.

Solution to point 1. We don't have it currently as TF SDK doesn't support nil for integer type. The scope of point 1 is greater than just this field. That's why writing automated test is not possible in this case.

Sorry, I think I'm not stating my question well enough.

Is 0 a valid value in the API for this field?

If so, can the user set that value via their Terraform resource?

@akshat-jindal-nit
Copy link
Contributor Author

akshat-jindal-nit commented Nov 27, 2024

Is 0 a valid value in the API for this field?

Yes 0 is the valid value for this particular field custom_learned_route_priority image
Yes user can update or insert any value b/w 0 to 65335 for custom_learend_router_priority field.
This particular test TestAccComputeRouterPeer_advertiseMode is running successfully for my changes.
I see there is no other test for inserting custom_learned_route_priority and updating it. @slevenick if we need to add that particular tests also please let me know I will add those tests.

Copy link

@slevenick This PR has been waiting for review for 3 weekdays. Please take a look! Use the label disable-review-reminders to disable these notifications.

Copy link
Contributor

@slevenick slevenick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test that specifies custom_learned_route_priority = 0

Specifically we should update from custom_learned_route_priority = 200 to custom_learned_route_priority = 0 to check that the update from another value to 0 works as expected.

@akshat-jindal-nit
Copy link
Contributor Author

The Problem:

Before the fix: The custom_learned_route_priority field was unintentionally being set to 0 whenever any other field was updated. This is bad because it could overwrite a user's intentional configuration and they might not notice.
After the fix: The field no longer defaults to 0 on updates, but Terraform can't set it to 0 either . This means users must use gcloud or the Pantheon UI to set it to 0 if needed.

Why the second fix is better (but not perfect):
While not ideal, the second fix is preferable because it avoids unexpected behavior and data loss. Forcing users to explicitly set the field to 0 through other tools ensures they are aware of this specific configuration.

cc: @harshithpatte-g
@slevenick what's your point of view on this situation.

@github-actions github-actions bot requested a review from slevenick December 2, 2024 08:29
Copy link

github-actions bot commented Dec 5, 2024

@slevenick This PR has been waiting for review for 3 weekdays. Please take a look! Use the label disable-review-reminders to disable these notifications.

Copy link

github-actions bot commented Dec 9, 2024

@GoogleCloudPlatform/terraform-team @slevenick This PR has been waiting for review for 1 week. Please take a look! Use the label disable-review-reminders to disable these notifications.

Copy link
Contributor

@slevenick slevenick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that in general this change could be ok, but we would need to accompany it with a way for users to set 0 if that's what they intend. This can be done by adding an additional boolean that controls of 0 specifically is sent to the API

Copy link

@akshat-jindal-nit, this PR is waiting for action from you. If no action is taken, this PR will be closed in 28 days.

Please address any comments or change requests, or re-request review from a core reviewer if no action is required.

Image showing the re-request review button

This notification can be disabled with the disable-automatic-closure label.

@github-actions github-actions bot requested a review from slevenick December 23, 2024 16:24
@modular-magician modular-magician added the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label Dec 23, 2024
Copy link

@slevenick This PR has been waiting for review for 3 weekdays. Please take a look! Use the label disable-review-reminders to disable these notifications.

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 1078
Passed tests: 998
Skipped tests: 73
Affected tests: 7

Click here to see the affected service packages
  • compute

Action taken

Found 7 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
  • TestAccComputeRouterPeer_EnableDisableIpv6
  • TestAccComputeRouterPeer_Ipv4BasicCreateUpdate
  • TestAccComputeRouterPeer_UpdateIpv6Address
  • TestAccComputeRouterPeer_UpdateRouterCustomLearnedRoutePriority
  • TestAccComputeRouterPeer_advertiseMode
  • TestAccComputeRouterPeer_bfd
  • TestAccComputeRouterPeer_enable

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

🟢 Tests passed during RECORDING mode:
TestAccComputeRouterPeer_Ipv4BasicCreateUpdate [Debug log]
TestAccComputeRouterPeer_UpdateRouterCustomLearnedRoutePriority [Debug log]
TestAccComputeRouterPeer_bfd [Debug log]
TestAccComputeRouterPeer_enable [Debug log]

🟢 No issues found for passed tests after REPLAYING rerun.


🔴 Tests failed during RECORDING mode:
TestAccComputeRouterPeer_EnableDisableIpv6 [Error message] [Debug log]
TestAccComputeRouterPeer_UpdateIpv6Address [Error message] [Debug log]
TestAccComputeRouterPeer_advertiseMode [Error message] [Debug log]

🔴 Errors occurred during RECORDING mode. Please fix them to complete your PR.

View the build log or the debug log for each test

@modular-magician modular-magician added awaiting-approval Pull requests that need reviewer's approval to run presubmit tests and removed awaiting-approval Pull requests that need reviewer's approval to run presubmit tests labels Jan 6, 2025
@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

google provider: Diff ( 3 files changed, 174 insertions(+), 58 deletions(-))
google-beta provider: Diff ( 3 files changed, 174 insertions(+), 58 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 1078
Passed tests: 1003
Skipped tests: 73
Affected tests: 2

Click here to see the affected service packages
  • compute

Action taken

Found 2 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
  • TestAccComputeRouterPeer_EnableDisableIpv6
  • TestAccComputeRouterPeer_UpdateIpv6Address

Get to know how VCR tests work

@modular-magician
Copy link
Collaborator

🟢 Tests passed during RECORDING mode:
TestAccComputeRouterPeer_EnableDisableIpv6 [Debug log]
TestAccComputeRouterPeer_UpdateIpv6Address [Debug log]

🟢 No issues found for passed tests after REPLAYING rerun.


🟢 All tests passed!

View the build log or the debug log for each test

…outer_bgp_peer_test.go.tmpl

Co-authored-by: Sam Levenick <[email protected]>
@github-actions github-actions bot requested a review from slevenick January 8, 2025 06:15
@modular-magician modular-magician added the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label Jan 8, 2025
Copy link

@slevenick This PR has been waiting for review for 3 weekdays. Please take a look! Use the label disable-review-reminders to disable these notifications.

@modular-magician modular-magician removed the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label Jan 13, 2025
@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

google provider: Diff ( 4 files changed, 194 insertions(+), 58 deletions(-))
google-beta provider: Diff ( 4 files changed, 194 insertions(+), 58 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 0
Passed tests: 0
Skipped tests: 0
Affected tests: 0

Click here to see the affected service packages
  • compute
#### Non-exercised tests

🔴 Tests were added that are skipped in VCR:

  • TestAccComputeRouterPeer_UpdateRouterCustomLearnedRoutePriority
    🔴 Errors occurred during REPLAYING mode. Please fix them to complete your PR.

View the build log

@modular-magician modular-magician added the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label Jan 13, 2025
@akshat-jindal-nit
Copy link
Contributor Author

@slevenick can you please re-run the test, there was some typo in it, I correct it.

@modular-magician modular-magician removed the awaiting-approval Pull requests that need reviewer's approval to run presubmit tests label Jan 13, 2025
@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

google provider: Diff ( 4 files changed, 194 insertions(+), 58 deletions(-))
google-beta provider: Diff ( 4 files changed, 194 insertions(+), 58 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 1081
Passed tests: 1008
Skipped tests: 73
Affected tests: 0

Click here to see the affected service packages
  • compute

🟢 All tests passed!

View the build log

Copy link

@GoogleCloudPlatform/terraform-team @slevenick This PR has been waiting for review for 1 week. Please take a look! Use the label disable-review-reminders to disable these notifications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants