-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Duration histogram addition and subtraction #50
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, and please add unit-test(s) for moving average.
func NewMovingAverageHistogram( | ||
typ DurationHistogramType, | ||
maxDuration time.Duration, | ||
durationPerHistogram time.Duration, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't it be easier to pass histCount
and durationPerHistogram
as parameters (instead of maxDuration
& durationPerHistogram
) ?
This would allows to replace StatLastMinute
with StatLastN
where the caller knows what n
means since he passed the values in the ctor.
func (m *MovingAverageHistogram) StatLastN(n int, f func(h *DurationHistogram) time.Duration) time.Duration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's reasonable, yes. I do want to keep StatLastMinute
because realistically, this will be used for metrics largely, and that way the metrics call stays exactly the same regardless of the duration / frequency and what not.
countToKeep := int(time.Minute / m.durationPerHistogram) | ||
if time.Minute%m.durationPerHistogram != 0 { | ||
countToKeep++ | ||
} | ||
if countToKeep == 1 { | ||
countToKeep++ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so countToKeep
is 2 if m.durationPerHistogram
>= 30s ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the logic to be more correct / hopefully more clear.
// We calculate this countToKeep in order to at least capture the last minute worth of data, | ||
// accounting for the possibility of an empty period that just started. | ||
countToKeep := int(time.Minute/m.durationPerHistogram) + 1 | ||
// This case occurs with the duration per histogram is not a divisor of a minute. | ||
if time.Duration(countToKeep)*m.durationPerHistogram < time.Minute+m.durationPerHistogram { | ||
countToKeep++ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This produces:
durationPerHistogram countToKeep
1s 61
5s 13
10s 7
20s 4
29s 4
30s 3
1m0s 2
1m0.000000001s 2
2m0s 2
3m0s 2
3m0.000000001s 2
4m0s 2
Instead I propose:
func countToKeep(durationPerHistogram, durationToCover time.Duration) int {
countToKeep := int((durationToCover + 1) / durationPerHistogram)
if time.Duration(countToKeep)*durationPerHistogram < durationToCover {
countToKeep++
}
return countToKeep
}
func TestCountToKeep(t *testing.T) {
type testCase struct {
durationPerHistogram time.Duration
toCover time.Duration
wantCount int
}
for i, tc := range []*testCase{
{durationPerHistogram: time.Second * 1, toCover: time.Minute, wantCount: 60},
{durationPerHistogram: time.Second * 5, toCover: time.Minute, wantCount: 12},
{durationPerHistogram: time.Second * 10, toCover: time.Minute, wantCount: 6},
{durationPerHistogram: time.Second * 20, toCover: time.Minute, wantCount: 3},
{durationPerHistogram: time.Second * 29, toCover: time.Minute, wantCount: 3},
{durationPerHistogram: time.Second * 30, toCover: time.Minute, wantCount: 2},
{durationPerHistogram: time.Minute * 1, toCover: time.Minute, wantCount: 1},
{durationPerHistogram: time.Minute + 1, toCover: time.Minute, wantCount: 1},
{durationPerHistogram: time.Minute * 2, toCover: time.Minute, wantCount: 1},
{durationPerHistogram: time.Minute * 3, toCover: time.Minute, wantCount: 1},
{durationPerHistogram: time.Minute*3 + 1, toCover: time.Minute, wantCount: 1},
{durationPerHistogram: time.Minute * 4, toCover: time.Minute, wantCount: 1},
{durationPerHistogram: time.Second * 1, toCover: time.Hour, wantCount: 3600},
{durationPerHistogram: time.Second * 5, toCover: time.Hour, wantCount: 720},
{durationPerHistogram: time.Minute * 1, toCover: time.Hour, wantCount: 60},
{durationPerHistogram: time.Minute * 5, toCover: time.Hour, wantCount: 12},
} {
ck := countToKeep(tc.durationPerHistogram, tc.toCover)
require.Equal(t, tc.wantCount, ck, "failed at: %d / %v", i, tc.durationPerHistogram)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main problem with this is that it's possible to report basically no data, depending on the rotation schedule. If we always synchronized rotation of the histograms with collection of stats, it would be fine, but that is not the case and in general I prefer to design the system assuming that there is no synchronization between the rotation and data being requested.
For instance, with your example for covering a minute with a one minute histogram, consider the following scenarios. In all these cases, the last histogram was rotated at 12:00:15
(a random time), and so the next histogram will start being filled at 12:01:15
.
Case 1 (good case)
A stat over data in the last minute is requested at 12:01:14
. countToKeep
is 1, so the histogram used to calculate the output statistic holds data from 12:00:15
to 12:01:14
, 59 seconds.
Case 2 (average case)
A stat over data in the last minute is requested at 12:00:45
. The histogram used to calculate the output statistic holds data from 12:00:15
to 12:00:45
, 30 seconds.
Case 3 (bad case)
A stat over data in the last minute is requested at 12:01:16
. The histogram has just rotated, and as only one histogram is requested, it holds less than a second of data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Or keep in memory when
the rotation occurred and decide - based on when the call happens - if it is necessary to add 1.
Or always add 1 and weight the current and the older buckets based on the time spent since rotation.
I need this change for some of the backend metrics work. It's useful to be able to add and subtract these in order to handle the monotonic counts that come out of influx.