Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure data.table is linked (not just compiled) with -fopenmp in this particular branch of configure for Darwin #6633

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dvg-p4
Copy link
Contributor

@dvg-p4 dvg-p4 commented Dec 4, 2024

At least, that's what I think this does? See #6622, which this partially fixes. Empirically this change allows me to install data.table from source with multithreading working properly into my R4.2 library on my MacBook. (Doesn't work for R4.4 but there's a precompiled binary for that version so I don't care enough to keep futzing with it.)

Holding off on writing a NEWS entry since as mentioned I'm not entirely sure what this change actually does.

@dvg-p4 dvg-p4 changed the title Make sure data.table is linked (not just compiled) with -fopenmp in this particular branch for Darwin Make sure data.table is linked (not just compiled) with -fopenmp in this particular branch of configure for Darwin Dec 4, 2024
@dvg-p4 dvg-p4 changed the title Make sure data.table is linked (not just compiled) with -fopenmp in this particular branch of configure for Darwin Make sure data.table is linked (not just compiled) with -fopenmp in this particular branch of configure for Darwin Dec 4, 2024
Copy link

codecov bot commented Dec 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.61%. Comparing base (23dac21) to head (f55e69a).
Report is 48 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6633   +/-   ##
=======================================
  Coverage   98.61%   98.61%           
=======================================
  Files          79       79           
  Lines       14536    14535    -1     
=======================================
  Hits        14334    14334           
+ Misses        202      201    -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@MichaelChirico
Copy link
Member

MichaelChirico commented Dec 4, 2024

I am no expert on compiler flags (@jangorecki any idea?). I just poked around a bit to evaluate the PR:

data.table/configure

Lines 116 to 117 in 98cf24e

export PKG_CFLAGS="${PKG_CFLAGS} -Xclang -fopenmp"
export PKG_LIBS="${PKG_LIBS} -lomp"

  • This PR does not touch the other case where PKG_LIBS is not paired with PKG_CFLAGS, should we make the change there too?

data.table/configure

Lines 102 to 103 in 98cf24e

export PKG_CFLAGS="${PKG_CFLAGS} -fopenmp"
export R_OPENMP_ENABLED=1

cc @kevinushey, is there a reason to pair/not pair PKG_CFLAGS with PKG_LIBS?

@kevinushey
Copy link
Contributor

My understanding was that -fopenmp, at least on macOS, was purely used to control whether OpenMP was handled at compile time, and linking to the OpenMP libraries was managed via -lomp or something similar.

FWIW, installation of data.table from sources works fine for me locally (except I needed to add #include <float.h> for DBL_MAX in froll.c).

clang -arch arm64 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -L/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib -L/opt/R/arm64/lib -o data.table.so assign.o between.o bmerge.o chmatch.o cj.o coalesce.o dogroups.o fastmean.o fcast.o fifelse.o fmelt.o forder.o frank.o fread.o freadR.o froll.o frollR.o frolladaptive.o fsort.o fwrite.o fwriteR.o gsumm.o idatetime.o ijoin.o init.o inrange.o nafill.o negate.o nqrecreateindices.o openmp-utils.o programming.o quickselect.o rbindlist.o reorder.o shift.o snprintf.o subset.o transpose.o types.o uniqlist.o utils.o vecseq.o wrappers.o -lomp -L/usr/lib -lz -L/opt/homebrew/opt/libomp/lib -lomp -L/opt/homebrew/lib -F/Library/Frameworks/R.framework/Versions/4.4-arm64 -framework R -Wl,-framework -Wl,CoreFoundation
PKG_CFLAGS = -Xclang -fopenmp
PKG_LIBS = -lomp -L/usr/lib -lz

@kevinushey
Copy link
Contributor

@dvg-p4 did you compile R with clang or gcc?

@dvg-p4
Copy link
Contributor Author

dvg-p4 commented Dec 4, 2024

@aitap understands this far better than I do

@dvg-p4
Copy link
Contributor Author

dvg-p4 commented Dec 4, 2024

@dvg-p4 did you compile R with clang or gcc?

clang

Output when I successfully install this branch, R4.2

dgealow@Dans-P4-MacBook-Pro data.table % R

R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
[...]
> install.packages('.', repos = NULL, type = 'source')
Installing package into ‘/Users/dgealow/Library/R/arm64/4.2/library’
(as ‘lib’ is unspecified)
* installing *source* package ‘data.table’ ...
** using staged installation
zlib 1.2.11 is available ok
* checking if R installation supports OpenMP with "-Xclang -fopenmp" ... no
* checking if R installation supports OpenMP with "-fopenmp" ... yes
** libs
clang -arch arm64 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/opt/R/arm64/lib -o data.table.so assign.o between.o bmerge.o chmatch.o cj.o coalesce.o dogroups.o fastmean.o fcast.o fifelse.o fmelt.o forder.o frank.o fread.o freadR.o froll.o frollR.o frolladaptive.o fsort.o fwrite.o fwriteR.o gsumm.o idatetime.o ijoin.o init.o inrange.o nafill.o negate.o nqrecreateindices.o openmp-utils.o programming.o quickselect.o rbindlist.o reorder.o shift.o snprintf.o subset.o transpose.o types.o uniqlist.o utils.o vecseq.o wrappers.o -fopenmp -lz -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: -single_module is obsolete
ld: warning: -multiply_defined is obsolete
ld: warning: -single_module is obsolete
ld: warning: -multiply_defined is obsolete
PKG_CFLAGS = -fopenmp
PKG_LIBS = -fopenmp -lz
if [ "data.table.so" != "data_table.so" ]; then mv data.table.so data_table.so; fi
if [ "" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id data_table.so data_table.so; fi
installing to /Users/dgealow/Library/R/arm64/4.2/library/00LOCK-data.table/00new/data.table/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (data.table)

@dvg-p4
Copy link
Contributor Author

dvg-p4 commented Dec 4, 2024

Output when I unsuccessfully try to install master, R4.2

> install.packages('.', repos = NULL, type = 'source')
Installing package into ‘/Users/dgealow/Library/R/arm64/4.2/library’
(as ‘lib’ is unspecified)
* installing *source* package ‘data.table’ ...
** using staged installation
zlib 1.2.11 is available ok
* checking if R installation supports OpenMP with "-Xclang -fopenmp" ... no
* checking if R installation supports OpenMP with "-fopenmp" ... yes
** libs
clang -arch arm64 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/opt/R/arm64/lib -o data.table.so assign.o between.o bmerge.o chmatch.o cj.o coalesce.o dogroups.o fastmean.o fcast.o fifelse.o fmelt.o forder.o frank.o fread.o freadR.o froll.o frollR.o frolladaptive.o fsort.o fwrite.o fwriteR.o gsumm.o idatetime.o ijoin.o init.o inrange.o nafill.o negate.o nqrecreateindices.o openmp-utils.o programming.o quickselect.o rbindlist.o reorder.o shift.o snprintf.o subset.o transpose.o types.o uniqlist.o utils.o vecseq.o wrappers.o -lz -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: -single_module is obsolete
ld: warning: -multiply_defined is obsolete
ld: warning: -single_module is obsolete
ld: warning: -multiply_defined is obsolete
PKG_CFLAGS = -fopenmp
PKG_LIBS = -lz
if [ "data.table.so" != "data_table.so" ]; then mv data.table.so data_table.so; fi
if [ "" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id data_table.so data_table.so; fi
installing to /Users/dgealow/Library/R/arm64/4.2/library/00LOCK-data.table/00new/data.table/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘data.table’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/Users/dgealow/Library/R/arm64/4.2/library/00LOCK-data.table/00new/data.table/libs/data_table.so':
  dlopen(/Users/dgealow/Library/R/arm64/4.2/library/00LOCK-data.table/00new/data.table/libs/data_table.so, 0x0006): symbol not found in flat namespace '___kmpc_barrier'
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/Users/dgealow/Library/R/arm64/4.2/library/data.table’
* restoring previous ‘/Users/dgealow/Library/R/arm64/4.2/library/data.table’
Warning message:
In install.packages(".", repos = NULL, type = "source") :
  installation of package ‘.’ had non-zero exit status

@kevinushey
Copy link
Contributor

Are you using LLVM clang or Apple clang? Maybe LLVM clang needs to also include -fopenmp as a linker option?

Either way, the change is probably fine ...

@dvg-p4
Copy link
Contributor Author

dvg-p4 commented Dec 4, 2024

Looks like llvm

% which clang
/opt/homebrew/opt/llvm/bin/clang

though I get the same error if I try to take it off my path and force apple clang instead (80% sure I've actually tested what I think I'm testing, 20% it's still using the llvm one because some internal environment thing is overriding my overriding)

@aitap
Copy link
Contributor

aitap commented Dec 5, 2024

Actually WRE suggests PKG_LIBS = $(SHLIB_OPENMP_CFLAGS)?

Exactly! With OpenMP, there's a number of possibilities:

  1. The happy path: OpenMP both supported by the toolchain and enabled in R during build configuration. Then the WRE recommendation of setting $(SHLIB_OPENMP_CFLAGS) just works. How to test for it:
    • Switch to a temporary directory
    • Create Makevars:
      PKG_CFLAGS = $(SHLIB_OPENMP_CFLAGS)
      PKG_LIBS = $(SHLIB_OPENMP_CFLAGS)
      
    • Create a test file:
      #ifdef _OPENMP
       #include <omp.h>
      #endif
      void test_openmp(int * numprocs) {
       *numprocs =
      #ifdef _OPENMP
        omp_get_max_threads()
      #else
        0
      #endif
       ;
      }
    • Compile it:
      R="${R_HOME}/bin/R"
      "$R" CMD SHLIB --preclean test.c # will pick up Makevars
    • Test that it works:
      "$R" -q -s -e '
       dyn.load(paste0("test", .Platform$dynlib.ext));
       q(status = .C("test_openmp", ans = integer(1))$ans <= 0)
      '
  2. OpenMP supported by the toolchain but not enabled in R. data.table assumes that the user will be better served by using OpenMP anyway and has its own configure checks for this case. I can't dig up the exact mailing list post, but I think that someone from the CRAN team was deliberately building R without configuring OpenMP in order to check packages in parallel, but some packages that used OpenMP anyway caused them trouble. Still, packages are allowed up to two parallel processes or threads during CRAN checks and data.table does its best in this regard. How to test: as above, but instead of creating Makevars, can set the environment variables PKG_CFLAGS and PKG_LIBS directly.
    • Apple introduced a number of problems here by shipping a toolchain that almost but not quite supports OpenMP, requiring the package developers to test for PKG_LIBS=-lomp, PKG_CPPFLAGS="-Xclang -fopenmp". (Why CPPFLAGS instead of CFLAGS? CPPFLAGS are used for all languages that use the C preprocessor, including C++ and Objective C. CFLAGS are for the C language only.)
    • @dvg-p4's Homebrew toolchain instead works with -fopenmp in both CFLAGS and LIBS, but not with the flags specific to Apple toolchain. Unfortunately, this only breaks during the shared library loading, which the current configure script doesn't test.
  3. OpenMP not supported by the toolchain. Hopefully it will fail to compile, fail to link, or at least fail to return an above-zero result from omp_get_max_threads(). (Feel free to suggest a different test! Maybe omp_get_num_procs() is better?)

@aitap
Copy link
Contributor

aitap commented Dec 5, 2024

A more audacious rewrite is in the check-openmp-cflags branch. It's not impossible to write a shell script for this, but much less convenient. (There's no mktemp in POSIX. The suggested replacement is to use set -o noclobber with pathchk, which doesn't work in my /bin/sh.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants