Release QUDA v1.0.0 · lattice/quda

Version 1.0.0 - 10 January 2020

Added support for CUDA 10.2: QUDA 1.0.0 is supported on CUDA 7.5-10.2
using either GCC or clang compilers. CUDA 10.x and either GCC >=
6.x or clang >= 6.x are highly recommended.
Significant improvements to the CMake build system and removal of the
legacy configure build.
Added more targeted compilation options to constrain which
precisions and reconstruct types are compiled. QUDA_PRECISION is a
cmake parameter that is a 4-bit number corresponding to which
precisions are enabled, with 1 = quarter, 2 = half, 4 = single and 8
= double; the default is 14, which enables double, single and half
precision. QUDA_RECONSTRUCT is a 3-bit number corresponding to
which reconstruct types are enabled, with 1 = reconstruct-8/9, 2 =
reconstruct-12/13 and 4 = reconstruct-18; the default is 7, which
enables all reconstruct types.
Completely rewrote all dslash kernels using the accessor
framework. This dramatically reduces code complexity and improves
performance.
New physics functionality added: gauge Laplace kernel, Gaussian
quark smearing, topological charge density.
QUDA can now be built either to utilize texture-memory reads or to
use direct memory access (cmake option QUDA_TEX). Textures are on
by default, though we note that since Pascal it can be
advantageous to disable textures and utilize direct reads.
QUDA is no longer supported on the Fermi generation of GPUs (sm_20
and sm_21). Compilation and running should still be possible but
will require compilation with texture objects disabled.
Added support for quarter precision (QUDA_QUARTER_PRECISION) for
the linear operator and associated solvers.
Implemented both CA-CG and CA-GCR communication-avoiding solvers, for
use either as stand-alone solvers or as a means to accelerate
multigrid.
Continued evolution and optimization of the multigrid framework.
Regardless, we advise users to use the latest develop branch when
using multigrid, since it continues to be a fast-moving target with
continual focus on optimization and improvement.
An implementation of the Thick Restarted Lanczos Method (TRLM) for
eigenvector solving of the normal operator.
Lanczos-accelerated multigrid through the use of coarse-grid
deflation and / or using singular vectors to define the prolongator.
Removal of the legacy contraction and covariant derivative
algorithms, and replacement with accessor-based rewrites.
Improved heavy-quark residual convergence, which ensures correct
convergence for MILC heavy quark observables.
Experimental support for Just-In-Time (JIT) compilation using Jitify.
Significantly improved unit testing framework using ctest.
QUDA can now be built to target Google's address sanitizer
(CMAKE_BUILD_TYPE option is SANITIZE) for improved debugging.
QUDA can now download and install the USQCD libraries QMP and QIO
automatically as part of the compilation process. To enable this,
the option QUDA_DOWNLOAD_USQCD=ON should be set. As with the Eigen
installation, this requires internet access.
QUDA can now download and install the ARPACK library automatically
if the QUDA_DOWNLOAD_ARPACK option is enabled.
Updated to CUB 1.8.
Multiple bug fixes and cleanups to the library. Many of these are
listed here: https://github.com/lattice/quda/milestone/21?closed=1
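
The QUDA_PRECISION and QUDA_RECONSTRUCT values described above are plain
bitmasks, so it can help to see how a given number maps to the enabled
options. The following is a minimal Python sketch, not part of QUDA: the
helper names decode and encode are hypothetical, and only the bit values
and defaults are taken from these notes.

```python
# Bit values as stated in the release notes (QUDA_PRECISION, QUDA_RECONSTRUCT).
PRECISION_BITS = {1: "quarter", 2: "half", 4: "single", 8: "double"}
RECONSTRUCT_BITS = {1: "reconstruct-8/9", 2: "reconstruct-12/13", 4: "reconstruct-18"}


def decode(mask, bits):
    """Return the option names whose bit is set in mask (hypothetical helper)."""
    limit = sum(bits)  # sum of the dict keys, i.e. all bits set
    if not 0 <= mask <= limit:
        raise ValueError(f"mask {mask} outside 0..{limit}")
    return [name for bit, name in sorted(bits.items()) if mask & bit]


def encode(names, bits):
    """Return the integer mask that enables exactly the given option names."""
    lookup = {name: bit for bit, name in bits.items()}
    mask = 0
    for name in names:
        mask |= lookup[name]  # KeyError on an unknown name is intentional
    return mask


if __name__ == "__main__":
    print(decode(14, PRECISION_BITS))    # default QUDA_PRECISION
    print(decode(7, RECONSTRUCT_BITS))   # default QUDA_RECONSTRUCT
    print(encode(["single", "double"], PRECISION_BITS))
```

Under these assumptions, a build restricted to single and double precision
would correspond to the value 12, which would be passed to cmake in the
usual way for a cache variable (for example -DQUDA_PRECISION=12).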
