Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-2525: privatelink: add privatelink controller unit tests #2520

Merged
merged 1 commit into from
Dec 18, 2024

Conversation

jstuever
Copy link
Contributor

No description provided.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Nov 14, 2024

@jstuever: This pull request references HIVE-2525 which is a valid jira issue.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 14, 2024
@openshift-ci openshift-ci bot requested review from 2uasimojo and dlom November 14, 2024 21:39
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 14, 2024
Copy link

codecov bot commented Nov 15, 2024

Codecov Report

Attention: Patch coverage is 93.05556% with 5 lines in your changes missing coverage. Please review.

Project coverage is 49.79%. Comparing base (3ccd44d) to head (cad3386).
Report is 8 commits behind head on master.

Files with missing lines Patch % Lines
pkg/test/logger/logger.go 64.28% 3 Missing and 2 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2520      +/-   ##
==========================================
+ Coverage   46.26%   49.79%   +3.52%     
==========================================
  Files         279      281       +2     
  Lines       32913    32986      +73     
==========================================
+ Hits        15227    16424    +1197     
+ Misses      16406    15233    -1173     
- Partials     1280     1329      +49     
Files with missing lines Coverage Δ
...er/privatelink/actuator/mock/actuator_generated.go 100.00% <100.00%> (ø)
pkg/test/clusterdeployment/clusterdeployment.go 100.00% <100.00%> (+2.75%) ⬆️
pkg/test/clusterprovision/clusterprovision.go 97.72% <100.00%> (+0.08%) ⬆️
pkg/test/logger/logger.go 64.28% <64.28%> (ø)

... and 11 files with indirect coverage changes

@jstuever jstuever changed the title HIVE-2525: privatelink: add privatelink controller unit tests WIP: HIVE-2525: privatelink: add privatelink controller unit tests Nov 18, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 18, 2024
@jstuever jstuever changed the title WIP: HIVE-2525: privatelink: add privatelink controller unit tests HIVE-2525: privatelink: add privatelink controller unit tests Nov 18, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 18, 2024
@jstuever jstuever changed the title HIVE-2525: privatelink: add privatelink controller unit tests WIP: HIVE-2525: privatelink: add privatelink controller unit tests Nov 19, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 19, 2024
@jstuever jstuever force-pushed the HIVE-2525 branch 2 times, most recently from f5dd15d to b82d227 Compare November 28, 2024 00:13
@jstuever jstuever changed the title WIP: HIVE-2525: privatelink: add privatelink controller unit tests HIVE-2525: privatelink: add privatelink controller unit tests Dec 3, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 3, 2024
@jstuever
Copy link
Contributor Author

jstuever commented Dec 3, 2024

/assign @2uasimojo

Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review 1/

pkg/test/logger/logger.go Show resolved Hide resolved
pkg/test/logger/logger.go Outdated Show resolved Hide resolved
pkg/controller/privatelink/conditions/conditions_test.go Outdated Show resolved Hide resolved
pkg/controller/privatelink/conditions/conditions_test.go Outdated Show resolved Hide resolved
pkg/controller/privatelink/conditions/conditions_test.go Outdated Show resolved Hide resolved
Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial 2/

@jstuever jstuever force-pushed the HIVE-2525 branch 3 times, most recently from 6829e6e to 2086ed2 Compare December 10, 2024 19:31
Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial 3/

I've started rolling the code-side issues into https://issues.redhat.com/browse/HIVE-2687 -- feel free to resolve those comments (after confirming that they're correctly represented in the card).

@jstuever jstuever force-pushed the HIVE-2525 branch 3 times, most recently from 12ad27e to e403502 Compare December 11, 2024 21:07
Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial 4/

}},
},
expectError: "unable to find an associatedVPC that uses the primary AWS PrivateLink credentials",
}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we could use a test case where VPCEndpointID is unset, but getAssociatedVPCs fails.

Copy link
Contributor Author

@jstuever jstuever Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I purposely skipped this test case because I found it is an impossible situation. getAssociatedVPCs can only return an error when VPCEndpointID is set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mm, I see. An "unreachable" comment on the code side, then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jstuever jstuever force-pushed the HIVE-2525 branch 4 times, most recently from 05b0965 to d6f2766 Compare December 13, 2024 21:53
Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial 5/

I need to re-look at conditions_test.go and then I think we'll be ready.

Reason: "ServiceAttachmentSubnetReconcileFailed",
Message: "googleapi: Error 401: not authorized to GetSubnet, AccessDenied",
}},
expectError: "failed to reconcile the Service Attachment Subnet: googleapi: Error 401: not authorized to GetSubnet, AccessDenied",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have coverage for all the paths that fail to SetErrConditionWithRetry. Which I get would be hard to do as currently written. But when we fix that thing up per HIVE-2686 it should become possible. Leaving this note to remind self to follow up on the coverage when that's done.

I will note, however, that some of those paths actually return from the reconciler rather than just logging. Which makes the missing coverage "heavier".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, let's move this conversation to #2528

@jstuever jstuever force-pushed the HIVE-2525 branch 4 times, most recently from 351b11a to 651f14c Compare December 16, 2024 20:00
test.completed = corev1.ConditionFalse
}

logger, _ := testlogger.NewLoggerWithHook()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto hook now unused

}),
testcd.WithCondition(hivev1.ClusterDeploymentCondition{
Type: hivev1.PrivateLinkReadyClusterDeploymentCondition,
Status: corev1.ConditionFalse,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't match the comment on L474.

I think if the ready condition is true but completed is false, we won't attempt to update the ready condition at all. Which would be inherently covered by changing this and L497 both to ConditionTrue. But we could prove that we don't even attempt to update the ready condition in that path by making the pre-existing condition's reason and/or message different.

...and this behavior seems pretty odd to me. Why would we not want to revert the ready condition from true to false if we call SetReadyCondition with completed==ConditionFalse? But I guess that's for discussion in the fup.

Copy link
Contributor Author

@jstuever jstuever Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch... fixed.

As for the ready condition, I'm not exactly sure. I think it limits flapping of the ready condition. When it is called after something is modified (ie, hzModified), then it always uses false. The only time it uses true is in the privatelink.go, and only after all of the actuator reconciles return true. I found that at least one of the awshubactuator functions always returned modified=true, so this would always flap false, and then back to true.

Once we move to the one-change-per-reconcile flow (where I fixed this one function), it might make more sense to flap here. If you think this is still a desired change, we can continue the discussion in the followup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I figured it probably had something to do with preventing flapping. The part that doesn't make sense to me is how the Ready condition could have been set to True in the first place if we haven't actually completed the setup. That would seem to point to a bug somewhere in the code flow, and this odd logic was put in place to paper over it.

Alternatively the question may be: how did we become un ready after having become ready at some point? If that is legitimately possible, I would argue that we should be setting the Ready condition to false in that scenario.

But yeah, let's dig into it some more in the followup.

Good catch... fixed.

A thing I was trying to point out (albeit uneloquently) that this delta still doesn't cover: as written, we can't tell which path we took to end up no-op'ing: either the second branch of the weird anti-flapping condition, or the implicit else of that chunk. Since completed is false, we expect it to be the latter, but nothing about the test case proves that as currently written. My suggestion to remedy that was:

But we could prove that we don't even attempt to update the ready condition in that path by making the pre-existing condition's reason and/or message different.

That is, if the initial Reason was different, the else if path would revert it, but the implicit else path would not. The latter would be proven because we would have to change the Reason in the expected value as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flapping is happening after a successful reconcile (which sets the condition to true). In an ideal world, this -could- happen if one of the resources were deleted from the cloud provider. A future reconcile would identify that it does not exist and should report the resource as modified, causing a status (message) update.

However, in the code as currently written, ReconcileHostedZoneRecords always returns modified if there is no error. As a result, recordsModified is always true and the status is always updated. Then it is updated again when the reconcile completes successfully.

I think a key point here is that completed != ready. Completed just indicates that we have completed the reconcile successfully and should update to a ready condition. In all other cases, we should just update the status message to match the latest change(s). At no point does not-completed indicate that ready condition should be false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so arguably both of those things are bugs -- which we can discuss in the fup.

For now all I'm asking is to change the Reason and/or Message to prove which code path is being taken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the implicit completed: False mean it cannot logically be due to else if completed == corev1.ConditionTrue {?

I've added more no change cases to help test/clarify each of the three 1) if, 2) else-if, and 3) implicit else paths of the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And Done.

@2uasimojo
Copy link
Member

/test e2e e2e-gcp

@2uasimojo
Copy link
Member

/lgtm

Thanks for the long push and patience.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 17, 2024
Copy link
Contributor

openshift-ci bot commented Dec 17, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo, jstuever

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

openshift-ci bot commented Dec 18, 2024

@jstuever: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jstuever
Copy link
Contributor Author

/override "Red Hat Konflux / hive-mce-27-on-pull-request" "Red Hat Konflux / hive-on-pull-request"

Copy link
Contributor

openshift-ci bot commented Dec 18, 2024

@jstuever: Overrode contexts on behalf of jstuever: Red Hat Konflux / hive-mce-27-on-pull-request, Red Hat Konflux / hive-on-pull-request

In response to this:

/override "Red Hat Konflux / hive-mce-27-on-pull-request" "Red Hat Konflux / hive-on-pull-request"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-bot openshift-merge-bot bot merged commit 425785d into openshift:master Dec 18, 2024
12 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants