FR: faster range #6705

Open
m-muecke opened this issue Jan 2, 2025 · 4 comments

@m-muecke
Contributor

m-muecke commented Jan 2, 2025

Not sure if this fits the scope of the project, but it would be lovely to get a faster range function, e.g. frange(), since base R just calls min() and max() separately instead of computing both in a single loop. The R implementation, for reference: https://github.com/r-devel/r-svn/blob/d56b2b2cbfc1ca3699dcab9d0c3d571ed71e4853/src/library/base/R/range.R#L22. A faster C implementation from collapse: https://github.com/SebKrantz/collapse/blob/f319f91853721b18fee45ac52d56359d8db7427e/src/programming.c#L1033
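
To make the two-pass versus single-pass distinction concrete, here is a minimal sketch in plain R. `range_two_pass()` and `range_single_pass()` are hypothetical helper names used only for illustration; they are not part of base R, data.table, or collapse. The single-pass version assumes a non-empty numeric vector without NA values.

```r
# Two full passes over x, which is effectively what base::range() does
# for numeric input.
range_two_pass <- function(x) c(min(x), max(x))

# Single pass: track the running minimum and maximum together.
# Hypothetical helper for illustration; assumes non-empty numeric input
# with no NA values (the stopifnot() is validation, not part of the idea).
range_single_pass <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0L, !anyNA(x))
  lo <- hi <- x[[1L]]
  for (v in x[-1L]) {
    if (v < lo) lo <- v else if (v > hi) hi <- v
  }
  c(lo, hi)
}

x <- c(3, -1, 7, 2)
range_single_pass(x)
```

An interpreted R loop like this is likely slower than two vectorised calls, so it only shows the logic. Any real speedup would have to come from a compiled implementation, such as the C code in collapse linked above.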

@MichaelChirico
Member

Could you provide some motivating benchmarks showing how inefficient it is to call min()/max() separately vs. calculate both in a single loop?

@m-muecke
Contributor Author

m-muecke commented Jan 3, 2025

# integer
bench::press(
  n = c(10, 1e3, 1e6),
  {
    x <- sample(n)
    bench::mark(
      range = range(x),
      frange = collapse::frange(x)
    )
  }
)
#> Running with:
#>         n
#> 1      10
#> 2    1000
#> 3 1000000
#> # A tibble: 6 × 7
#>   expression       n      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>   <dbl> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 range           10   1.11µs   1.19µs   781631.    14.2KB      0  
#> 2 frange          10    410ns 492.01ns  1888247.    6.25MB    189. 
#> 3 range         1000   3.12µs    3.4µs   269203.    3.95KB     26.9
#> 4 frange        1000   1.02µs   1.11µs   871791.        0B      0  
#> 5 range      1000000   2.02ms   2.21ms      447.    3.81MB     45.2
#> 6 frange     1000000 600.12µs 612.38µs     1622.        0B      0

# real
bench::press(
  n = c(10, 1e3, 1e6),
  {
    x <- rnorm(n)
    bench::mark(
      range = range(x),
      frange = collapse::frange(x)
    )
  }
)
#> Running with:
#>         n
#> 1      10
#> 2    1000
#> 3 1000000
#> # A tibble: 6 × 7
#>   expression       n      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>   <dbl> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 range           10   1.11µs   1.23µs   735998.        0B     73.6
#> 2 frange          10    410ns 492.01ns  1889826.        0B      0  
#> 3 range         1000   4.22µs    4.8µs   200672.    7.86KB     20.1
#> 4 frange        1000   1.52µs    1.6µs   600102.        0B      0  
#> 5 range      1000000   3.64ms   3.73ms      267.    7.63MB     89.1
#> 6 frange     1000000   1.12ms   1.14ms      874.        0B      0

Created on 2025-01-03 with reprex v2.1.1
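
For a quick read of the tables above, the median timings can be turned into speedup ratios. The sketch below only copies the medians from the benchmark output (converted to nanoseconds) and divides them; the data frame and its column names are made up for this illustration.

```r
# Median timings copied from the benchmark output above, in nanoseconds.
medians <- data.frame(
  type   = rep(c("integer", "real"), each = 3),
  n      = rep(c(10, 1e3, 1e6), times = 2),
  range  = c(1190, 3400, 2210000, 1230, 4800, 3730000),
  frange = c(492.01, 1110, 612380, 492.01, 1600, 1140000)
)

# Ratio of base::range() median to collapse::frange() median.
medians$speedup <- round(medians$range / medians$frange, 2)
medians
```

On these numbers the ratio lands between roughly 2.4x and 3.6x, and it is largest for the integer vector of length 1e6.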

@ben-schwen
Member

ben-schwen commented Jan 6, 2025

But base::range can also handle other types like character? We could ofc implement grange to support range for grouped calculations but without a strong need this just gets at the end of our backlog.
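
As a small illustration of both points, the sketch below uses base R only. It shows `range()` on a character vector and one way to get a per-group range today with `split()` and `vapply()`. The `grange` mentioned above is only an idea from this thread, so nothing here assumes it exists.

```r
# base::range() is not limited to numeric input; character values are
# compared using the session's collation order.
range(c("pear", "apple", "fig"))

# Grouped range in base R: split x by group, then take the range of each
# piece. Each column of the result holds one group's minimum and maximum.
x <- c(5, 1, 9, 4, 2, 8)
g <- c("a", "a", "a", "b", "b", "b")
grouped <- vapply(split(x, g), range, numeric(2))
rownames(grouped) <- c("min", "max")
grouped
```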

@m-muecke
Contributor Author

m-muecke commented Jan 6, 2025

> But base::range can also handle other types like character? We could ofc implement grange to support range for grouped calculations but without a strong need this just gets at the end of our backlog.

Yes ofc, thought it was a long shot. Perhaps trying to create an issue for R-devel makes more sense.
