
Saving websocket RTT samples #234

Open · wants to merge 7 commits into main
Conversation

darkk
Contributor

@darkk darkk commented Jan 21, 2020

That's WIP for #192, reopened per the comment in #232.

I have a question I'm unsure about:

  • is it okay to carry `start` around the way it's implemented? It looks a bit messy to me, but I'm quite okay with that.

WIP is:

  • send only the first WSInfo during the download test. It breaks the feedback loop and makes queue management easier. Biased RTT estimates are not that interesting in real time.


darkk added 2 commits January 20, 2020 17:23
An L7 ping may be significantly different from L4 pings, so a separate
endpoint `/ping` is used to keep the spec intact until further discussion
happens.

One L7 ping is sent before the `/download` test to get one sample that is
not biased by the queue of bytes.

The L7 ping is logged as a `WSInfo` sub-object, as it's currently unclear
whether AppInfo, which is part of the ndt7 spec :), should be extended or not.

See 3520181 and m-lab#192
@darkk darkk mentioned this pull request Jan 21, 2020
@coveralls
Collaborator

coveralls commented Jan 21, 2020

Pull Request Test Coverage Report for Build 1113

  • 52 of 232 (22.41%) changed or added relevant lines in 13 files are covered.
  • 5 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-6.6%) to 72.66%

Changes Missing Coverage          Covered Lines  Changed/Added Lines  %
ndt7/upload/sender/sender.go      4              5                    80.0%
ndt7/download/sender/sender.go    6              10                   60.0%
ndt7/handler/handler.go           2              6                    33.33%
ndt7/ping/message/message.go      19             23                   82.61%
ndt7/ping/ping.go                 0              9                    0.0%
ndt7/ping/sender/sender.go        0              31                   0.0%
ndt7/ping/receiver/receiver.go    0              60                   0.0%
ndt7/ping/mux/mux.go              0              67                   0.0%

Files with Coverage Reduction     New Missed Lines  %
ndt7/ping/ping.go                 1                 0%
ndt7/upload/sender/sender.go      4                 67.65%
Totals Coverage Status
Change from base Build 1111: -6.6%
Covered Lines: 1576
Relevant Lines: 2169

💛 - Coveralls

Contributor

@bassosimone bassosimone left a comment


Asked two clarifying questions

receiverch := receiver.StartDownloadReceiver(wholectx, conn)
measurerch := measurer.Start(wholectx, conn, resultfp.Data.UUID, start)
receiverch, pongch := receiver.StartDownloadReceiver(wholectx, conn, start, measurerch)
senderch := sender.Start(conn, measurerch, start, pongch)
Contributor


I am confused by the fact that a new channel has been introduced. Isn't it possible to use the measurerch for passing around information? I understood that the PING is another nullable pointer within a Measurement.

Contributor Author


The download.sender.loop goroutine may be naturally blocked on conn.WritePreparedMessage, so there is a risk that send(measurerch) will block the receiver.loop goroutine. A blocked measurer.loop is fine, it'll just skip a few ticks. A blocked receiver.loop leads to skewed RTT measurements, and I'd rather avoid that.

There are several possible options to overcome that.

The current implementation creates a buffered channel for RTT measurements with a buffer size equal to the maximum possible number of pings issued during the test. It was good enough for a PoC, but I consider it wasteful behavior optimized for the worst case.

I'm refactoring the code to send fewer WSPingInfo messages to the client. The download sub-test will send only the first message back to the client via a buffered channel with a small buffer, and tail-drop other messages if the channel is not writable because the sender is blocked. The WSPingInfo messages would still be logged, and I assume that writing to the logging channel does not naturally block for an unpredictable amount of time (unlike conn.WritePreparedMessage in download.sender.loop). I think logging WSPingInfo in the ClientMeasurements section is okay. Anyway, the elapsed value in the ping/pong frame is not signed, so the client can fake it; putting it in ClientMeasurements makes sense to me :-)

BTW, should the ping payload be signed, or formatted as {ndt7: $value}, as a safety net against a client sending unsolicited pongs carrying some numerical value? Clients can send unsolicited pongs for heartbeat purposes per the RFC.

It may make sense to apply the same logic to the measurements coming from measurer.loop. Currently the messages are logged if and only if dst is available for writing (the sender is not blocked). Maybe it makes sense to split that channel into two: one going to the log (which blocks on "disk") and another going to the client (which may tail-drop messages if the client is not fast enough to fetch the WritePreparedMessage blob). What do you think?

Contributor Author


BTW, a slow client blocking TCPInfo logging seems to be a real-world case, not a hypothetical one. I've checked my few ndt7 data samples, and some of them have a gap as large as 1.2s between TCPInfo.ElapsedTime samples (twice spec.MaxPoissonSamplingInterval, which is 625ms). Those 11 files with 321 TCPInfo samples have 15 TCPInfo samples that are more than 650ms apart.

Contributor


Are you talking about client side traces or server side traces?

Contributor


That you don't get measurements on the client side because of head-of-line blocking is a known issue that I think we cannot address. However, the channel has a buffer, so the measurements were actually saved on disk when I checked. The improvement I foresee here is actually separating saving measurements from sending measurements to the client. In the second case, I think it is reasonable to consider sending the client the most recent sample, not the most ancient.

Contributor


Another thought that I have is that I'd try to not change the channel structure in this PR but rather consider changing the channel structure later. The code is using a pipeline pattern and channels are closed to signal EOF. However, there are better patterns where there are multiple workers per stage and where contexts are used rather than closing channels. I have a sense that the right solution here is to measure and reorganise channels to have more parallelism when needed.

Contributor Author


However, the channel has a buffer

As far as I see, the created channels do not have buffers, all make(chan...) calls under ndt7/ do not specify any non-zero capacity. Do you mean golang channels here?

I was worried about the following backpressure situation:

  • the client is slow and its TCP proxy has a large TCP buffer (the easiest way for me to reproduce the case is Tor)
  • the download.sender.loop goroutine blocks on conn.WritePreparedMessage for some time (the Tor case gives me delays like 8500ms)
  • download.sender.loop does not fetch messages from m, ok := <-src as the goroutine is blocked
  • the measurer.loop goroutine blocks on sending to dst as the channel is not drained and has no buffer
  • measurer.loop samples TCP_INFO and BBR_INFO possibly less often than MaxPoissonSamplingInterval as it's blocked on sending to dst and skips some ticker.C ticks
  • the TCPInfo samples are not written to the log as they're not actually sampled :-)

Note, I assume that ndt7 has a (weak) guarantee to take TCPInfo samples at least as often as MaxPoissonSamplingInterval. That's just my assumption and it may be wrong.

This backpressure case worries me as it means that receiver.loop being blocked on writing RTT estimate to the channel drained by sender may provide wrong RTT estimate for the next sample in a queue. receiver should be mostly blocked on conn.ReadMessage to provide reasonably accurate data.

The original goal of introducing pongch was to mitigate that backpressure with quite a deep buffer. But I agree that this backpressure issue should be out of the scope of this pull request.

Back to answering your original question. My goal is to have some of the WSPingInfo samples sent back to the ndt7 client during the ping test runtime. The master branch has a channel passing both the messages and the EOF signal from measurer.loop to ${testname}.sender.loop. Having two concurrent writers (measurer.loop and receiver.loop) sending to a channel that is eventually closed(!) will lead to a goroutine panic (on an attempt to send to a closed channel).

So, I'm still going to use pongch as a channel from receiver.loop to ${testname}.sender.loop to keep the "pipeline + EOF" pattern.

Contributor


TL;DR I acknowledge the issue you report (I did see it myself). I will not accept a patch with pongch. I explain how I'd change the architecture to avoid losing measurements. I am not asking you to implement that, however. I am just asking you to kindly focus here on a simple patch that implements the /ping you need and does not intersect with existing subtests.

The channel has a buffer [...]

I was wrong. I used a buffer in a development branch back then. I must have concluded that was not the right fix, which is to change the architecture.

[...] the pipeline+EOF pattern

That pattern is wrong. A pattern where an expired ctx tells a goroutine to stop is right. We'll fix that.

[...] I'm still going to use pongch

Your attitude is wrong. The right attitude is to listen. We are on the same page on more than 99% of the topics here.

Let me summarize my position: thank you for exposing a bunch of architectural issues. I am not going to accept a patch that significantly complicates the code base. Please focus on writing a /ping endpoint that does its job and does not intersect with the implementation of other experiments.

That patch, I will expedite its review. I am not willing to spend further time arguing the merits of adding complexity to the code base when this can be avoided. That is 無駄無駄無駄無駄 (wasted, wasted, wasted effort).

Worried about the following back pressure situation [...]

Your analysis is right. Your remediation is wrong. Thank you for spelling out this issue so clearly. As I mentioned to you (I guess in private?), I did see the same for a 2G connection.

[...] weak guarantee to take TCP samples

This is right or wrong depending on the meaning of take. It's right if take also implies save. It's wrong if take also implies sending the data to the client. We send data to the client on a best-effort basis, but we should be doing our best to collect and save samples.

(On this note: M-Lab has a sidecar service that samples TCPInfo at high frequency. The data is collected separately and can be joined after submission. Why are we still collecting BBR and TCPInfo here? The original design was to stop early when the BBR bandwidth estimation became stable, but we have not implemented this yet. Anyway, this should help you understand why dealing with this issue has medium priority, not high.)

The right fix that significantly mitigates all that have been said so far is to change the code architecture as follows:

  1. the measurer is a class and stores measurements in itself in a lock-free way

  2. all channels become non-blocking when writing

  3. LIFO is better than FIFO when passing measurements downstream

  4. we use a WaitGroup to wait for all the goroutines to complete

  5. then we're again single-threaded and we just save on disk, accessing directly what was measured

This is the right architectural fix for the current situation.

goal is to send back samples to the client

We already agreed that your use case is best served by a separate subtest called ping. We already agreed that we also need a single sample at the beginning of the /download. Everything else seems irrelevant to the objective, hence wrong.

If adding support for WebSocket level ping to download and upload is such a burden with the current architecture, the right thing to do is to open an issue, just take the first sample of the connection, and get rid of the remaining code. A future code architecture may accommodate this feature with less effort. This is something that would tell us that the architecture is right. Now it is clearly wrong because it does not allow us to do that easily.

Instead, please focus on writing /ping to fulfill your goals. Please do so without complicating the implementation of existing endpoints in exchange for seriously marginal gains.

✌️

Contributor Author

@darkk darkk Jan 30, 2020


(writing that down, so the essence is not lost in a log of a private chat)

I've refactored the code under the following rules:

  • keep measurer and saver intact
  • avoid two ping frames in-flight (drop ping sample if the previous pong has not arrived yet)
  • have LIFO logic for samples sent to client
  • make the channels "almost non-blocking" for the sender. I mean structuring the code in a way that blocking on chan <- value MAY happen in some cases, but those cases are an "erratic" code path, not a "usual" one.
  • use EOF as a completion signal for most of the goroutines to terminate them in a predictable order (as the final sink, namely saver, depends on EOF)
  • use context.cancel() to stop memoryless timer early in case of a network error from a client

complicating ... in exchange for seriously marginal gains

Yeah, I'm sorry, that's an (unlucky) trait I have indeed! Thanks (no kidding) for nudging me towards awareness, better focus and simplicity.

M-Lab has a sidecar service that samples TCPInfo at high frequency

I've also completely forgotten about that. That's why I overestimated the utility of TCPInfo samples being logged.

LastRTT: int64(rtt / time.Microsecond),
MinRTT: int64(minRTT / time.Microsecond),
}
pongch <- wsinfo // Liveness: buffered (sender)
Contributor


So, here I believe it should be possible to use dst to emit a measurement containing PingInfo, right?

Contributor Author

@darkk darkk Jan 21, 2020


Yes, it may be written to the resulting JSON file here; it will be put under ClientMeasurements in that case, and it's probably a good way to reduce complexity. But some kind of pongch is still useful to send the message back to the ndt7 client, for the liveness reasons mentioned above.

ndt7/model/measurement.go — outdated discussion, resolved
@bassosimone bassosimone added the 2020-02-meeting To discuss during 2020-02-meeting label Feb 4, 2020
bassosimone pushed a commit that referenced this pull request Feb 10, 2020
This diff has been written by @darkk as part of #234 and
has been extracted from #234 by @bassosimone, with the objective
of merging these low-hanging fruits while we look into the
proper way of implementing the `/ping` endpoint.

Closes #242
@SaiedKazemi SaiedKazemi changed the base branch from master to main August 10, 2022 23:29