QUDA v1.0.0

@mathiaswagner mathiaswagner released this 10 Jan 19:31
· 9 commits to release/1.0.x since this release
66729fd

Version 1.0.0 - 10 January 2020

  • Add support for CUDA 10.2: QUDA 1.0.0 is supported on CUDA 7.5-10.2
    using either GCC or clang compilers. CUDA 10.x and either GCC >=
    6.x or clang >= 6.x are highly recommended.

  • Significant improvements to the CMake build system and removal of the
    legacy configure build.

  • Added more targeted compilation options to constrain which
    precisions and reconstruct types are compiled. QUDA_PRECISION is a
    cmake parameter holding a 4-bit number whose bits select the
    enabled precisions: 1 = quarter, 2 = half, 4 = single and 8 =
    double. The default is 14, which enables double, single and half
    precision. QUDA_RECONSTRUCT is a 3-bit number whose bits select
    the enabled reconstruct types: 1 = reconstruct-8/9, 2 =
    reconstruct-12/13 and 4 = reconstruct-18. The default is 7, which
    enables all reconstruct types.

  • Completely rewrote all dslash kernels using the accessor
    framework. This dramatically reduces code complexity and improves
    performance.

  • New physics functionality added: gauge Laplace kernel, Gaussian
    quark smearing, topological charge density.

  • QUDA can now be built to use either texture-memory reads or
    direct memory reads (cmake option QUDA_TEX). Textures are on by
    default, though we note that since Pascal it can be advantageous
    to disable textures and use direct reads.

  • QUDA is no longer supported on the Fermi generation of GPUs (sm_20
    and sm_21). Compilation and running should still be possible but
    will require compilation with texture objects disabled.

  • Added support for quarter precision (QUDA_QUARTER_PRECISION) for
    the linear operator and associated solvers.

  • Implemented both CA-CG and CA-GCR communication-avoiding solvers,
    for use either as stand-alone solvers or as a means to accelerate
    multigrid.

  • Continued evolution and optimization of the multigrid framework.
    We nevertheless advise users to use the latest develop branch when
    using multigrid, since it remains a fast-moving target with
    continual focus on optimization and improvement.

  • An implementation of the Thick Restarted Lanczos Method (TRLM) for
    eigenvector solving of the normal operator.

  • Lanczos-accelerated multigrid through the use of coarse-grid
    deflation and / or using singular vectors to define the prolongator.

  • Removal of the legacy contraction and covariant derivative
    algorithms, and replacement with accessor-based rewrites.

  • Improved heavy-quark residual convergence, which ensures correct
    convergence for MILC heavy-quark observables.

  • Experimental support for Just-In-Time (JIT) compilation using Jitify.

  • Significantly improved unit testing framework using ctest.

  • QUDA can now be built to target Google's address sanitizer
    (CMAKE_BUILD_TYPE option is SANITIZE) for improved debugging.

  • QUDA can now download and install the USQCD libraries QMP and QIO
    automatically as part of the compilation process. To enable this,
    set the option QUDA_DOWNLOAD_USQCD=ON. As with the Eigen
    installation, this requires internet access.

  • QUDA can now download and install the ARPACK library automatically
    if the QUDA_DOWNLOAD_ARPACK option is enabled.

  • Updated to CUB 1.8.

  • Multiple bug fixes and cleanup of the library. Many of these are
    listed here: https://github.com/lattice/quda/milestone/21?closed=1
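
The QUDA_PRECISION and QUDA_RECONSTRUCT bitmasks described in the notes above can be checked with a short sketch. This is an illustration only: the helper names `decode` and `encode` are hypothetical and not part of QUDA, and only the bit values are taken from these notes.

```python
# Bit values as listed in the release notes above.
PRECISION_BITS = {1: "quarter", 2: "half", 4: "single", 8: "double"}
RECONSTRUCT_BITS = {
    1: "reconstruct-8/9",
    2: "reconstruct-12/13",
    4: "reconstruct-18",
}


def decode(mask, bits):
    """Return the option names whose bit is set in `mask` (hypothetical helper)."""
    if not 0 <= mask <= sum(bits):
        raise ValueError(f"mask {mask} is outside the range 0..{sum(bits)}")
    return [name for bit, name in sorted(bits.items()) if mask & bit]


def encode(names, bits):
    """Return the bitmask that enables exactly `names` (hypothetical helper)."""
    lookup = {name: bit for bit, name in bits.items()}
    mask = 0
    for name in names:
        mask |= lookup[name]
    return mask


if __name__ == "__main__":
    print(decode(14, PRECISION_BITS))   # default precision mask
    print(decode(7, RECONSTRUCT_BITS))  # default reconstruct mask
    print(encode(["single", "double"], PRECISION_BITS))
```

Under these assumptions, a build limited to single and double precision would use the mask 4 + 8 = 12, presumably passed to cmake in the usual way for cache variables (for example -DQUDA_PRECISION=12).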