-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
metrics: add utility to record metrics for sidecar #78
base: main
Are you sure you want to change the base?
metrics: add utility to record metrics for sidecar #78
Conversation
|
Welcome @Nikhil-Ladha! |
Hi @Nikhil-Ladha. Thanks for your PR. I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
29ec31e
to
187c10a
Compare
@Nikhil-Ladha: GitHub didn't allow me to request PR reviews from the following users: Rakshith-R, iPraveenParihar. Note that only kubernetes-csi members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
187c10a
to
816790e
Compare
Since, there are no metrics being emitted as of now from the sidecar. This is the only metrics being logged in the server.
P.S: I have added unit tests to support working of other metrics utility functions. Hope, that should suffice for now. |
816790e
to
f9be3e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you propose how basic timing of GetMetadataAllocated
and GetMetadataDelta
would be measured?
f9be3e6
to
1d3a2a0
Compare
1d3a2a0
to
a3d4faf
Compare
@Nikhil-Ladha I have a high level question: What was wrong with the original use of the |
a3d4faf
to
00c3a4a
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Nikhil-Ladha The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@carlbraganza updated the PR based on our discussion offline, please take a look now. |
// Record metrics | ||
s.config.Runtime.RecordMetricsWithLabels(opLabel, runtime.MetadataAllocatedOperationName, startTime, err) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not record the time taken for s.streamGetMetadataAllocatedResponse(ctx, stream, csiStream)
! Also, this logic doesn't record earlier failures in the subroutine. The same is true for the Delta
case.
I suggest the following changes to handle all cases with just one call to your record function:
- Change the method signature to make
err
a named return variable. e.g.
func (s *Server) GetMetadataAllocated(req *api.GetMetadataAllocatedRequest, stream api.SnapshotMetadata_GetMetadataAllocatedServer) (err error) {
- Remove
err
from all auto-definitions - you will have to declare some variables explicitly where multiple auto-definitions occur. e.g.
var (
csiReq *csi.GetMetadataAllocatedRequest
csiStream csi.SnapshotMetadata_GetMetadataAllocatedClient
)
- Make sure that the last statement assigns to
err
:
err = s.streamGetMetadataAllocatedResponse(ctx, stream, csiStream)
- At the top of the function you can define something like this:
defer func(startTime time.Time) {
opLabel := map[string]string{
runtime.LabelTargetSnapshotName: fmt.Sprintf("%s/%s", req.Namespace, req.SnapshotName),
}
s.config.Runtime.RecordMetricsWithLabels(opLabel, runtime.MetadataAllocatedOperationName, startTime, err)
}(time.Now())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't track the failures in other subroutines purposefully, as I thought if the operation hasn't even reached the CSI driver then what's the use of tracking the metrics for it? Also, regarding the stream response, I wasn't sure if we should be recording that.
Anyway, this seems fine as well and I will update the code accordingly.
Though one question, why do I need to do this-
Remove err from all auto-definitions - you will have to declare some variables explicitly where multiple auto-definitions occur. e.g.
The auto-definitions use the same named err
error variable so we should be able to track the err
value while recording the metrics in the end. Can you please help me understand what difference does that make?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... you may be right in that it won't get redefined - I don't know the latest Go semantics but some experiments in the playground shows that it is much more restrictive than before. Please check! Also, the UTs should help in ensuring existing behavior continues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the code accordingly, please take a look.
Also, add UT for the metrics wherever necessary.
00c3a4a
to
adcd2a3
Compare
/ok-to-test |
adcd2a3
to
fd8d379
Compare
enable prometheus metrics for the sidecar using the csi-lib-utils metrics package. The httpServer for metrics can be enable by adding the httpEndpoint arg to the sidecar. The server will record metrics for operations like GetMetadataAllocated, GetMetadataDelta in the format `snapshot_metadata_controller_operation_total_seconds_bucket{driver_name="",operation_name="",operation_status="",target_snapshot="", base_snapshot="",le="0.1"}`, for GetMetaAllocated operations "base_snapshot" value will be empty. Signed-off-by: Nikhil-Ladha <[email protected]>
fd8d379
to
8331a93
Compare
Enable prometheus metrics for the sidecar using the csi-lib-utils metrics package. The httpServer for metrics can be enable by adding the httpEndpoint arg to the sidecar. The server will record metrics for operations like GetMetadataAllocated, GetMetadataDelta in the format
snapshot_metadata_controller_operation_total_seconds_bucket{driver_name="",operation_name="",operation_status="",target_snapshot="", base_snapshot="",le="0.1"}
, for GetMetaAllocated operations "base_snapshot"value will be empty.
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR enables recording prometheus metrics for the snapshot-metadata sidecar.
Which issue(s) this PR fixes:
Fixes #11
Does this PR introduce a user-facing change?: