Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poll-based interface for DmaFile #446

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

waynexia
Copy link
Contributor

Signed-off-by: Ruihang Xia [email protected]

What does this PR do?

Expose poll-based interface for DmaFile.

Motivation

#445

Related issues

#445

Additional Notes

Anything else we should know when reviewing?

Checklist

[] I have added unit tests to the code I am submitting
[] My unit tests cover both failure and success scenarios
[] If applicable, I have discussed my architecture

Signed-off-by: Ruihang Xia <[email protected]>
@github-actions
Copy link

Greetings @waynexia!

It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes.
Some possible reasons this could happen:

  • One of the dependencies you added uses a restricted license. See deny.toml for a list of licenses we allow;
  • One of the dependencies you added has a known security vulnerability;
  • You added or updated a dependency and didn't update the LICENSE-3rdparty.csv file. To do so, run the following and commit the changes:
$ cargo install cargo-license
$ cargo license --all-features -a -j --no-deps -d | jq -r '(["Component","Origin","License","Copyright"]) as $cols | map(. as $row | ["name", "repository", "license", "authors"] | map($row[.])) as $rows | $cols, $rows[] | @csv' > LICENSE-3rdparty.csv

Thank you!

@waynexia waynexia mentioned this pull request Oct 19, 2021
@waynexia
Copy link
Contributor Author

Greetings @waynexia!

It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes. Some possible reasons this could happen:

* One of the dependencies you added uses a restricted license. See `deny.toml` for a list of licenses we allow;

* One of the dependencies you added has a known security vulnerability;

* You added or updated a dependency and didn't update the `LICENSE-3rdparty.csv` file. To do so, run the following and commit the changes:
$ cargo install cargo-license
$ cargo license --all-features -a -j --no-deps -d | jq -r '(["Component","Origin","License","Copyright"]) as $cols | map(. as $row | ["name", "repository", "license", "authors"] | map($row[.])) as $rows | $cols, $rows[] | @csv' > LICENSE-3rdparty.csv

Thank you!

Unrelated change. Fix patch in #447

@glommer
Copy link
Collaborator

glommer commented Oct 19, 2021

Hi @waynexia

How is this better than using

let x = file.read_dma_aligned(...);
pin!(x);
futures_lite::poll_fn(|cx|, x.poll(cx));

I understand the above is obviously more code, but we don't expose a poll interface to anything, and the above is a idiomatic way how those things are usually done in Rust to the best of my knowledge

Also @duarten , if you want to chime in, please do!

@waynexia
Copy link
Contributor Author

Thanks @glommer.

To use those async methods in the "poll" context, we need to store the future somewhere (to poll the same future again). And it's hard to write down the generated async future's type for keeping it without dyn and one extra indirection (box, raw pointer or reference, etc.) to my knowledge.

But this is not a big overhead after all. However exposing the poll-based interface obviously adds a lot of methods and structures. So this is just a draft proposal to bring a discussion on it.

@glommer
Copy link
Collaborator

glommer commented Oct 20, 2021

Can you post a concrete example where you had issues with this ?
Storing the future, for example, very often can be done in the stack as long as you pin it.

@waynexia
Copy link
Contributor Author

pub fn poll_read_aligned(file: &DmaFile, pos: u64, size: usize) -> PollReadAligned<'_> {
    PollReadAligned {
        task: Box::pin(file.read_at_aligned(pos, size)),
    }
}

pub struct PollReadAligned<'a> {
    task: Pin<Box<dyn Future<Output = Result<ReadResult>> + 'a>>,
}

impl Future for PollReadAligned<'_> {
    type Output = Result<ReadResult>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        self.get_mut().task.as_mut().poll(cx)
    }
}

Something like this wraps the async API back to return Poll. It uses one box to store the generated future.

... adds a lot of methods and structures.

Correction: we can replace the async API with the poll-based version as ce17478 does. This only adds the new Future structs and the API can still be used like before.

@github-actions
Copy link

Greetings @waynexia!

It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes.
Some possible reasons this could happen:

  • One of the dependencies you added uses a restricted license. See deny.toml for a list of licenses we allow;
  • One of the dependencies you added has a known security vulnerability;
  • You added or updated a dependency and didn't update the LICENSE-3rdparty.csv file. To do so, run the following and commit the changes:
$ cargo install cargo-license
$ cargo license --all-features -a -j --no-deps -d | jq -r '(["Component","Origin","License","Copyright"]) as $cols | map(. as $row | ["name", "repository", "license", "authors"] | map($row[.])) as $rows | $cols, $rows[] | @csv' > LICENSE-3rdparty.csv

Thank you!

@duarten
Copy link
Member

duarten commented Oct 20, 2021

I'm a bit confused because the PR just seems to manually writes out the future that async would generate. You'll also need to wrap the resulting future in a Box to poll() it.

Maybe I missed the actual goal here. Is it to have a non-future based interface where you periodically poll() for some data, without needing a Context?

@HippoBaro
Copy link
Member

It just so happens that I'm dealing with a similar issue with some Glommio app myself.

My case is slightly different but ends up being essentially the same. I am trying to compose multiple Glommio streams (DmaStreamReader) into a higher-order one. It's a pretty typical use-case.

Of course, I want this API to be zero-copy, so I can't use AsyncRead and AsyncReadExt. The method get_buffer_aligned defined as such does what I want on paper:

/// Allows access to the buffer that holds the current position with no
/// extra copy
/// [...]
pub async fn get_buffer_aligned(&mut self, len: u64) -> Result<ReadResult> { ... }

Except that it's an async function: I can't use it directly from poll_next (a poll function is obviously not async).

Now back to the topic at hand: I could box the opaque type returned, store it on my stream struct, and call it a day. But I would have to use dyn, and I don't want the overhead. If the function had returned a struct implementing Future, it would be much simpler. There would be no boxing, no dyn, no overhead.

I understand @glommer's argument that this works just fine when you store the future on the stack, but that's not an option here.

I am not sure the approach @waynexia is pushing for is the correct one. But I do see that none of the functions in AsyncReadExt are async. They all return a public struct implementing Future. I reckon that's the reason why.

@duarten
Copy link
Member

duarten commented Oct 21, 2021

Except that it's an async function: I can't use it directly from poll_next (a poll function is obviously not async).

Why can't you call an async function from poll_next?

Now back to the topic at hand: I could box the opaque type returned, store it on my stream struct, and call it a day. But I would have to use dyn, and I don't want the overhead. If the function had returned a struct implementing Future, it would be much simpler. There would be no boxing, no dyn, no overhead.

Why would you need to use dyn? An async function is a function returning a struct implementing Future. You do need boxing in any case to be able to call poll on the pinned future, no?

@waynexia
Copy link
Contributor Author

Greeting @duarten!

Why can't you call an async function from poll_next?

You can, but you can't .await it and you need to do something like this as "a poll function is obviously not async". And because we want to poll on the same future every time, we need to store it rather than create one each time. That is why @glommer's suggestion is not suited for this scenario to my understanding.

Why would you need to use dyn? An async function is a function returning a struct implementing Future. You do need boxing in any case to be able to call poll on the pinned future, no?

Using dyn because the generated opaque Future struct can't be written down. Then the box is needed to give the dyn Future<> a determined size as well as to put it on the heap.

For the proposed poll api, you can either use the .await combinator like those async things, or store the returned Future structure and poll it at your opinion.

@duarten
Copy link
Member

duarten commented Oct 21, 2021

Greeting @duarten!

Howdy! :)

Why can't you call an async function from poll_next?

You can, but you can't .await it and you need to do something like this as "a poll function is obviously not async". And because we want to poll on the same future every time, we need to store it rather than create one each time. That is why @glommer's suggestion is not suited for this scenario to my understanding.

But it's the same thing with a non-async function: you also need to store the returned future.

Why would you need to use dyn? An async function is a function returning a struct implementing Future. You do need boxing in any case to be able to call poll on the pinned future, no?

Using dyn because the generated opaque Future struct can't be written down. Then the box is needed to give the dyn Future<> a determined size as well as to put it on the heap.

For the proposed poll api, you can either use the .await combinator like those async things, or store the returned Future structure and poll it at your opinion.

That is true for the async version too: you can await or store the returned future. If the difference is just having a named type, I don't think it's worth the trouble. You can use generics instead of dyn. For example:

async fn f() {}

struct Pollable<F: Future> {
    f: Pin<Box<F>>,
}

impl<F: Future> Pollable<F> {
    pub fn poll(&mut self, cx: &mut Context<'_>) -> Poll<F::Output> {
        self.f.as_mut().poll(cx)
    }
}

fn as_pollable<F: Future>(f: F) -> Pollable<F> {
    Pollable { f: Box::pin(f) }
}

fn main() {
    let waker = noop_waker_ref();
    let mut cx = std::task::Context::from_waker(waker);
    let mut f = as_pollable(f());
    let _ = f.poll(&mut cx);
}

@HippoBaro
Copy link
Member

@duarten your example works because you use the stack for storage. Try to make this work with a stream, and you'll realize you can't always do that.

I'm looking for a solution where boxing is a no-go (I don't want to allocate) and where dyn is a no-go (I don't want virtual function calls).

You do need boxing in any case to be able to call poll on the pinned future, no?

That's not always true. My stream struct is already pinned (via the pin_project macro) so I can store futures directly.

@duarten
Copy link
Member

duarten commented Oct 21, 2021

@duarten your example works because you use the stack for storage. Try to make this work with a stream, and you'll realize you can't always do that.

Hum.. not sure I understand. Fundamentally, using a generated type vs a named typed implies you have to use generics, but that should be orthogonal to storage?

I'm looking for a solution where boxing is a no-go (I don't want to allocate) and where dyn is a no-go (I don't want virtual function calls).

You do need boxing in any case to be able to call poll on the pinned future, no?

That's not always true. My stream struct is already pinned (via the pin_project macro) so I can store futures directly.

I don't quite understand that either. I thought pin_project didn't pin anything, it only provided a convenient way to access pinned fields of a struct (I guess it uses Pin::new_unchecked for those), meaning you still had to ensure the pin_project struct is itself pinned, which you can either do with Box::pin or Pin::new_unchecked, no?

(To be clear, I'm not pushing back on anything, I'm just trying to understand the issue. As you see, I don't have a lot of experience with the lower levels of async rust.)

@HippoBaro
Copy link
Member

(To be clear, I'm not pushing back on anything, I'm just trying to understand the issue. As you see, I don't have a lot of experience with the lower levels of async rust.)

I think it's the right approach too! I've been trying to get this stream struct of mine working, and it's been a pain so far 😆. I find that not having a poll_* variant of async functions is the main reason.

I did some additional investigation yesterday, and I came to the conclusion that what waynexia suggest is better than a named future for the following reason:

  • Creating a future from a poll function is trivial (via futures_lite::poll_fn for instance), but the other way around is not;
  • Futures often have a lifetime, making it extremely hard to work within a stream because rust forbids a struct member to have a lifetime based on another member. The borrow-checker can't express that. There are workarounds, but they are non-trivial.

If this is useful, maybe @waynexia and I can write some small code examples to demonstrate our point?

@duarten
Copy link
Member

duarten commented Oct 22, 2021

Ah, yes, I certainly see how those poll functions are easier to work with! Thanks for the example, the approach makes a lot of sense to me 👍🏻

Copy link
Member

@HippoBaro HippoBaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a net improvement, and I'm ready to merge this, but before, I would like to ask that:

  • The equivalent interface be written for the non-aligned variant read_at;
  • The ImmutableFile functions reflect this change too (should be a matter of changing the function signatures)

@duarten
Copy link
Member

duarten commented Oct 22, 2021

Hum.. I understand adding the poll_collect_rw function, but I still don't understand how hand-rolling PollDmaReadAligned is better, or why it's easier to store that instead of the future generated by the async function (or using poll_fn).

@duarten
Copy link
Member

duarten commented Oct 22, 2021

Okay, I chatted with @HippoBaro and now I understand the issues. The goal is to make asynchronous functions more composable, either by providing poll_ functions that can be called directly without the need for the caller to store a future, or by providing a named future that can be stored without the need for dyn. Generics can help with the latter but are limited; in particular, if you only have access to the name of the async-generated future after having constructed your struct, then, well, you can't construct your struct. I think rust-lang/rfcs#2515 can eventually help here.

Until then, named futures seem like a good approach. Maybe we can introduce a macro that will generate the future to avoid repeating this boilerplate all over the codebase.

@HippoBaro
Copy link
Member

Thanks @duarten you definitely say it better than I do 😅

Until then, named futures seem like a good approach. Maybe we can introduce a macro that will generate the future to avoid repeating this boilerplate all over the codebase.

@waynexia how do you feel about giving this a try? There is potential to limit the boilerplate here.

@waynexia
Copy link
Contributor Author

Sorry for the delay!

@waynexia how do you feel about giving this a try? There is potential to limit the boilerplate here.

I'd like to😉. Let me try DmaFile first and see what happens.

@github-actions
Copy link

Greetings @waynexia!

It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes.
Some possible reasons this could happen:

  • One of the dependencies you added uses a restricted license. See deny.toml for a list of licenses we allow;
  • One of the dependencies you added has a known security vulnerability;
  • You added or updated a dependency and didn't update the LICENSE-3rdparty.csv file. To do so, run the following and commit the changes:
$ cargo install cargo-license
$ cargo license --all-features -a -j --no-deps -d | jq -r '(["Component","Origin","License","Copyright"]) as $cols | map(. as $row | ["name", "repository", "license", "authors"] | map($row[.])) as $rows | $cols, $rows[] | @csv' > LICENSE-3rdparty.csv

Thank you!

1 similar comment
@github-actions
Copy link

Greetings @waynexia!

It looks like your PR added a new or changed an existing dependency, and CI has failed to validate your changes.
Some possible reasons this could happen:

  • One of the dependencies you added uses a restricted license. See deny.toml for a list of licenses we allow;
  • One of the dependencies you added has a known security vulnerability;
  • You added or updated a dependency and didn't update the LICENSE-3rdparty.csv file. To do so, run the following and commit the changes:
$ cargo install cargo-license
$ cargo license --all-features -a -j --no-deps -d | jq -r '(["Component","Origin","License","Copyright"]) as $cols | map(. as $row | ["name", "repository", "license", "authors"] | map($row[.])) as $rows | $cols, $rows[] | @csv' > LICENSE-3rdparty.csv

Thank you!

/// Future of [`DmaFile::read_at_aligned`].
#[derive(Debug)]
#[must_use = "future has no effect unless you .await or poll it"]
pub struct PollDmaReadAtAligned<'a> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't figure out a good way to cover these structs into a macro 🤦‍♂️ Both the members and fn poll()'s bodies are different...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put these structs in here for now. We can consider moving them to a separate file if more are added.

@@ -264,6 +264,15 @@ impl Source {
})
.await
}

pub(crate) fn poll_collect_rw(&self, cx: &mut Context<'_>) -> Poll<io::Result<usize>> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether we can make more use of this, like dma write or other files/resources' operation. Maybe we can track the progress in #445.

@HippoBaro HippoBaro force-pushed the master branch 2 times, most recently from 616409a to d480c63 Compare February 27, 2022 03:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants