Releases: diku-dk/futhark
0.25.7
Added
- `futhark autotune` now supports the `hip` backend.
- Better parallelisation of `scatter` when the target is
  multidimensional (#2035).
Fixed
- Very large `iota`s now work.
- Lambda lifting in `while` conditions (#2038).
- Size expressions in local function parameters had an interesting
  interaction with defunctionalisation (#2040).
- The `store` command in server executables did not properly
  synchronise when storing opaque values, which would lead to
  use-after-free errors.
0.25.6
Added
- The various C API functions that accept strings now perform a copy,
  meaning the caller does not have to keep the strings alive.
- Slightly better lexer error messages.
- Fusion across slicing is now possible in some cases.
- New tool: `futhark profile`.
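
The copy semantics noted above for string arguments can be pictured with a small standalone analogue. This is only a sketch of the general pattern, not the Futhark C API: `demo_config`, `demo_config_set_cache_file`, `demo_config_free` and `demo_copy_survives_caller_free` are hypothetical names invented for illustration, and the file name is made up.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library configuration object
   (not a Futhark type). */
struct demo_config {
  char *cache_file; /* owned by the config, never by the caller */
};

/* Stores a private copy of `path`.
   Returns 0 on success, 1 on allocation failure. */
int demo_config_set_cache_file(struct demo_config *cfg, const char *path) {
  size_t len = strlen(path) + 1;
  char *copy = malloc(len);
  if (copy == NULL) return 1;
  memcpy(copy, path, len);
  free(cfg->cache_file); /* drop any previously stored copy */
  cfg->cache_file = copy;
  return 0;
}

/* Releases the copy held by the config. */
void demo_config_free(struct demo_config *cfg) {
  free(cfg->cache_file);
  cfg->cache_file = NULL;
}

/* Returns 1 if the stored string is intact after the caller has
   overwritten and freed its own buffer, 0 otherwise. */
int demo_copy_survives_caller_free(void) {
  struct demo_config cfg = { NULL };
  char *tmp = malloc(16);
  if (tmp == NULL) return 0;
  strcpy(tmp, "kernels.cache");
  if (demo_config_set_cache_file(&cfg, tmp) != 0) {
    free(tmp);
    return 0;
  }
  memset(tmp, 'x', 15); /* scribble over the caller's buffer */
  free(tmp);            /* the caller's string is gone now */
  int ok = strcmp(cfg.cache_file, "kernels.cache") == 0;
  demo_config_free(&cfg);
  return ok;
}
```

Because the setter owns its copy, the caller can pass a temporary or stack-allocated string and release it immediately. A setter that stored only the pointer would leave the configuration holding a dangling reference.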
Fixed
- Inefficient locking for certain segmented histograms (#2024).
0.25.5
0.25.4
Fixed
- Invalid simplification (#2015).
- Rarely occurring deadlock for fused map-scan compositions in CUDA
  backend, when a bounds check failed in the map function.
- Compiler and interpreter crash for tricky interactions of abstract
  types and sizes (#2016). Solved by banning such uses - in principle
  this could break code.
- Incomplete alias tracking could cause removal of necessary copies,
  leading to compiler crash (#2018).
0.25.3
Added
- pyopencl backend: compatibility with future versions of PyOpenCL.
- New backend: hip.
Fixed
- Exotic problems related to intra-group reductions with array
  operands. (Very rare in real code, although sometimes generated by
  AD.)
- Interpreter issue related to sizes in modules (#1992, #1993, #2002).
- Incorrect insertion of size arguments in complex cases (#1998).
- Incorrect handling of `match` in lambda lifting (#2000).
- Regression in checking of consumption (#2007).
- Error in type checking of horizontally fused `scatter`s could crash
  the compiler (#2009).
- Size-polymorphic value bindings with existential sizes are now
  rejected by type checker (#1993).
- Single pass scans with complicated fused map functions were
  insufficiently memory-expanded (#2023).
- Invalid short circuiting (#2013).
0.25.2
Added
- Flattening/unflattening as the final operation in an entry point no
  longer forces a copy.
- The `opencl` backend no longer always fails on platforms that do
  not support 64-bit integer atomics, although it will still fail if
  the program needs them.
- Various performance improvements to the compiler itself;
  particularly the frontend. It should be moderately faster.
Fixed
0.25.1
Added
- Arbitrary expressions of type `i64` are now allowed as sizes. Work
  by Lubin Bailly.
- New prelude function `resize`.
Removed
- The prelude functions `concat_to` and `flatten_to`. They are often
  not necessary now, and otherwise `resize` is available.
Changed
- The prelude functions `flatten` and `unflatten` (and their
  multidimensional variants), as well as `split`, now have more
  precise types.
- Local and anonymous (lambda) functions that must return unique
  results (because they are passed to a higher order function that
  requires this) must now have an explicit return type ascription that
  declares this, using `*`. This is very rare (in practice
  unobserved) in real programs.
Fixed
- `futhark doc` produced some invalid links.
- `flatten` did not properly check for claimed negative array sizes.
- Type checker crash on some ill-typed programs (#1926).
- Some soundness bugs in memory short circuiting (#1927, #1930).
- Global arrays with size parameters no longer have aliases.
- `futhark eval` no longer crashes on ambiguously typed expressions
  (#1946).
- A code motion pass was ignorant of consumption constraints, leading
  to compiler crash (#1947).
- Type checker could get confused and think unknown sizes were
  available when they really weren't (#1950).
- Some index optimisations removed certificates (#1952).
- GPU backends can now transpose arrays whose size does not fit in a
  32-bit integer (#1953).
- Bug in alias checking for the core language type checker (#1949).
  Actually (finally) a proper fix of #803.
- Defunctionalisation duplicates less code (#1968).
0.24.3
Fixed
- Certain cases of noninlined functions in `multicore` backend.
- Defunctionalisation of `match` where the constructors carry
  functions (#1917).
- Shape coercions involving sum types (#1918). This required
  tightening the rules a little bit, so some coercions involving
  anonymous sizes may now be rejected. Add expected sizes as needed.
- Defunctionalisation sometimes forgot about sizes bound at top level
  (#1920).
0.24.2
Added
- `futhark literate` (and FutharkScript in general) is now able to do
  a bit of type-coercion of constants.
Fixed
- Accumulators (produced by AD) had defective code generation for
  intra-group GPU kernel versions. (#1895)
- The header file generated by `--library` contained a prototype for
  an undefined function. (#1896)
- Crashing bug in LSP caused by stray `undefined` (#1907).
- Missing check for anonymous sizes in type abbreviations (#1903).
- Defunctionalisation crashed on projection of holes.
- Loop optimisation would sometimes remove holes.
- A potential barrier divergence for certain GPU kernels that fail
  bounds checking.
- A potential infinite loop when looking up aliases (#1915).
- `futhark literate`: less extraneous whitespace.
0.24.1
Added
- The manifest file now lists which tuning parameters are relevant for
  each entry point. (#1884)
- A command `tuning_params` has been added to the server protocol.
Changed
- If part of a function parameter is marked as consuming ("unique"),
the entire parameter is now marked as consuming.