Releases: apache/orc
v2.1.0
Milestone
Branch
This is a new minor release for which we cannot provide a changelog.
New Feature
[ORC-262] [C++] Support async prefetch in Orc reader
[ORC-1388] [C++] Support schema evolution from decimal to timestamp/string group
[ORC-1389] [C++] Support schema evolution from string group to numeric/string group
[ORC-1390] [C++] Support schema evolution from string group to decimal/timestamp
[ORC-1807] [C++] Native support for vcpkg
[ORC-1622] [C++] Support conan packaging
Improvement
[ORC-1264] [C++] Add a writer option to align compression block with row group boundary
[ORC-1365] [C++] Use BlockBuffer to replace DataBuffer of rawInputBuffer in the CompressionStream
[ORC-1635] Try downloading orc-format from dlcdn.apache.org before archive.apache.org
[ORC-1645] Evaluate stripe stats before loading stripe footer
[ORC-1658] [C++] uniform identifiers naming style.
[ORC-1661] [C++] Better handling when TZDB is unavailable
[ORC-1664] Enable the removeUnusedImports function in spotless-maven-plugin
[ORC-1665] Enable the importOrder function in spotless-maven-plugin
[ORC-1667] Add check tool to check the index of the specified column
[ORC-1669] [C++] Deprecate HDFS support
[ORC-1672] Modify the package name of TestCheckTool
[ORC-1675] [C++] Print decimal values as strings
[ORC-1677] [C++] remove m prefix of variables.
[ORC-1683] Fix instanceof of BinaryStatisticsImpl merge method
[ORC-1684] [C++] Find tzdb without TZDIR when in conda-environments
[ORC-1685] Use Pattern Matching for instanceof in RecordReaderImpl
[ORC-1686] [C++] Avoid using std::filesystem
[ORC-1687] [C++] Enforce naming style.
[ORC-1688] [C++] Do not access TZDB if there is no timestamp type
[ORC-1689] [C++] Generate CMake config file
[ORC-1690] [C++] Refactor CMake to use imported third-party libraries
[ORC-1710] Reduce enum array allocation
[ORC-1711] [C++] Introduce a memory block size parameter for writer option
[ORC-1720] [C++] Unified compressor/decompressor exception types
[ORC-1724] JsonFileDump utility should print user metadata
[ORC-1730] [C++] Add finishEncode support for the encoder
[ORC-1732] [C++] Can't detect Protobuf installed by Homebrew on macOS
[ORC-1733] [C++][CMake] Fix CMAKE_MODULE_PATH not to use PROJECT_SOURCE_DIR
[ORC-1751] [C++] Syntax error in ThirdpartyToolchain
[ORC-1767] [C++] Improve writing performance of encoded string column and support EncodedStringVectorBatch for StringColumnWriter
[ORC-1796] [C++] Reading an ORC file that lacks statistics may give wrong results
[ORC-1810] Offline build support
Bug Fix
[ORC-1654] [C++] Count up EvaluatedRowGroupCount correctly.
[ORC-1657] Fix building apache orc with clang-cl on Windows
[ORC-1706] [C++] Fix build break w/ BUILD_CPP_ENABLE_METRICS=ON
[ORC-1725] [C++] Statistics for BYTE type are calculated incorrectly on ARM
[ORC-1738] Wrong Int128 maximum value
[ORC-1811] Use the recommended closer.lua URL to download ORC format
[ORC-1813] Incompatibility with ORC files written in version 0.12 due to missing hasNull field in C++ Reader
Task
[ORC-1573] Setting version to 2.1.0-SNAPSHOT
[ORC-1594] Add IntelliJ conf in the project root directory to support JIRA/PR autolinks
[ORC-1649] [C++][Conan] Add 2.0.0 to conan recipe and update release guide
[ORC-1655] Add label definition to conan directory
[ORC-1656] Skip build and test on conan updates
[ORC-1666] Remove extra newlines at the end of Java files
[ORC-1758] Use OpenContainers Annotations in docker images
[ORC-1802] Enable tag protection
Test
[ORC-1589] Bump spotbugs-maven-plugin to 4.8.3.0
[ORC-1590] Bump spotless-maven-plugin to 2.42.0
[ORC-1603] Bump checkstyle to 10.13.0
[ORC-1606] Upgrade spotless-maven-plugin to 2.43.0
[ORC-1611] Bump junit to 5.10.2
[ORC-1651] Bump checkstyle to 10.14.0
[ORC-1652] Bump extra-enforcer-rules to 1.8.0
[ORC-1653] Bump maven-assembly-plugin to 3.7.0
[ORC-1659] Bump guava to 33.1.0-jre
[ORC-1660] Bump checkstyle to 10.14.2
[ORC-1673] Remove test packages o.a.o.tools.[count|merge|sizes]
[ORC-1676] Use Hive 4.0.0 in benchmark
[ORC-1678] Bump checkstyle to 10.15.0
[ORC-1680] Bump bcpkix-jdk18on to 1.78
[ORC-1691] Bump spotbugs-maven-plugin to 4.8.4.0
[ORC-1694] Upgrade gson to 2.9.0 for Benchmarks Hive
[ORC-1695] Upgrade gson to 2.10.1
[ORC-1699] Fix SparkBenchmark in Parquet format according to SPARK-40918
[ORC-1700] Write parquet decimal type data in Benchmark using FIXED_LEN_BYTE_ARRAY type
[ORC-1704] Migration to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark
[ORC-1707] Fix sun.util.calendar IllegalAccessException when SparkBenchmark runs on JDK17
[ORC-1708] Support data/compress options in Hive benchmark
[ORC-1709] Upgrade GitHub Action setup-java to v4 and use built-in cache feature
[ORC-1713] Bump spotbugs-maven-plugin to 4.8.5.0
[ORC-1716] Bump com.puppycrawl.tools:checkstyle to 10.16.0
[ORC-1719] Bump guava to 33.2.0-jre
[ORC-1722] Bump checkstyle to 10.17.0
[ORC-1726] Bump guava to 33.2.1-jre
[ORC-1727] Bump maven-enforcer-plugin to 3.5.0
[ORC-1728] Bump maven-shade-plugin to 3.6.0
[ORC-1729] Bump maven-checkstyle-plugin to 3.4.0
[ORC-1731] Upgrade maven-dependency-plugin to 3.7.0
[ORC-1735] Upgrade maven-dependency-plugin to 3.7.1
[ORC-1736] Bump junit to 5.10.3
[ORC-1737] Bump spotbugs-maven-plugin to 4.8.6.1
[ORC-1739] Bump spotbugs-maven-plugin to 4.8.6.2
[ORC-1745] Remove Ubuntu 20.04 Support
[ORC-1750] Bump protobuf-java to 3.25.4
[ORC-1756] Bump snappy-java to 1.1.10.6 in bench module
[ORC-1760] Upgrade junit to 5.11.0
[ORC-1761] Upgrade guava to 33.3.0-jre
[ORC-1763] Upgrade checkstyle to 10.18.0
[ORC-1764] Upgrade maven-checkstyle-plugin to 3.5.0
[ORC-1765] Upgrade maven-dependency-plugin to 3.8.0
[ORC-1771] Upgrade checkstyle to 10.18.1
[ORC-1772] Bump spotbugs-maven-plugin to 4.8.6.3
[ORC-1774] Upgrade snappy-java to 1.1.10.7 in bench module
[ORC-1776] Remove MacOS 12 from GitHub Action CI and docs
[ORC-1778] Upgrade Spark to 4.0.0-preview2
[ORC-1779] Upgrade extra-enforcer-rules to 1.9.0
[ORC-1780] Upgrade spotbugs-maven-plugin to 4.8.6.4
[ORC-1783] Add MacOS 15 to GitHub Action MacOS CI and docs
[ORC-1786] Upgrade guava to 33.3.1-jre
[ORC-1788] Upgrade checkstyle to 10.18.2
[ORC-1789] Upgrade junit to 5.11.2
[ORC-1790] Upgrade parquet to 1.14.3 in bench module
[ORC-1794] Upgrade checkstyle to 10.19.0
[ORC-1795] Upgrade junit to 5.11.3
[ORC-1797] Upgrade spotbugs-maven-plugin to 4.8.6.5
[ORC-1799] Upgrade maven-checkstyle-plugin to 3.6.0
[ORC-1801] Upgrade checkstyle to 10.20.0
[ORC-1804] Upgrade parquet to 1.14.4 in bench module
[ORC-1805] Upgrade checkstyle to 10.20.1
[ORC-1806] Upgrade spotbugs-maven-plugin to 4.8.6.6
[ORC-1809] Upgrade checkstyle to 10.20.2
[ORC-1812] Upgrade parquet to 1.15.0 in bench module
[ORC-1816] Upgrade checkstyle to 10.21.0
[ORC-1820] Bump junit.version to 5.11.4
[ORC-1821] Upgrade guava to 33.4.0-jre
[ORC-1822] [C++][CI] Use cpp-linter-action for clang-tidy and clang-format
[ORC-1823] Upgrade checkstyle to 10.21.1
[ORC-1826] [C++] Add ASAN to CI
Build and Dependency Changes
[ORC-1608] Upgrade Hadoop to 3.4.0
[ORC-1617] Upgrade slf4j to 2.0.12
[ORC-1640] Upgrade cyclonedx-maven-plugin to 2.7.11
[ORC-1650] Bump maven-shade-plugin to 3.5.2
[ORC-1670] Upgrade zstd-jni to 1.5.6-1
[ORC-1679] Bump zstd-jni to 1.5.6-2
[ORC-1682] Bump maven-assembly-plugin to 3.7.1
[ORC-1692] Bump slf4j to 2.0.13
[ORC-1693] Bump maven-jar-plugin to 3.4.0
[ORC-1698] Upgrade commons-cli to 1.7.0
[ORC-1701] Bump threeten-extra to 1.8.0
[ORC-1702] Bump bcpkix-jdk18on to 1.78.1
[ORC-1703] Bump maven-jar-plugin to 3.4.1
[ORC-1705] Upgrade zstd-jni to 1.5.6-3
[ORC-1712] Bump maven-shade-plugin to 3.5.3
[ORC-1714] Bump commons-csv to 1.11.0
[ORC-1715] Bump org.objenesis:objenesis to 3.3
[ORC-1718] Upgrade build-helper-maven-plugin to 3.6.0
[ORC-1723] Upgrade commons-cli to 1.8.0
[ORC-1734] Bump maven-jar-plugin to 3.4.2
[ORC-1748] Upgrade commons-lang3 to 3.15.0
[ORC-1755] Bump commons-lang3 to 3.16.0
[ORC-1757] Bump slf4j to 2.0.14
[ORC-1759] Upgrade commons-cli to 1.9.0
[ORC-1762] Bump slf4j to 2.0.16
[ORC-1766] Upgrade brotli4j to 1.17.0
[ORC-1768] Upgrade commons-lang3 to 3.17.0
[ORC-1773] Bump reproducible-build-maven-plugin to 0.17
[ORC-1775] Upgrade aircompressor to 2.0.2
[ORC-1777] Upgrade protobuf-java to 3.25.5
[ORC-1781] Upgrade zstd-jni to 1.5.6-6
[ORC-1782] Upgrade Hadoop to 3.4.1
[ORC-1784] Upgrade Maven to 3.9.9
[ORC-1785] Upgrade commons-csv to 1.12.0
[ORC-1791] Remove commons-lang3 dependency
[ORC-1798] Upgrade maven-dependency-plugin to 3.8.1
[ORC-1803] Upgrade zstd-jni to 1.5.6-7
[ORC-1808] Upgrade zstd-jni to 1.5.6-8
[ORC-1817] Upgrade brotli4j to 1.18.0
[ORC-1825] [C++] Bump Snappy to 1.2.1
[ORC-1827] [C++] Bump ZLIB to 1.3.1
[ORC-1828] [C++] Bump LZ4 to 1.10.0
Documentation
[ORC-642] Update PatchedBase doc with patch ceiling in spec
[ORC-1634] Fix some outdated descriptions in Building ORC documentation
[ORC-1668] Add merge command to Java tools documentation
[ORC-1800] Upgrade bcpkix-jdk18on to 1.79
[ORC-1814] Use Ubuntu 24.04/Jekyll 4.3/Rouge 4.5 to generate website
[ORC-1815] Remove broken people.apache.org links
[ORC-1819] Publish snapshot website through GitHub Pages
[ORC-1824] Update Python documentation with PyArrow 18.1.0 and Dask 2024.12.1
[ORC-1830] Fix release table hyperlink to use baseurl
v2.0.3
Milestone
Branch
Bug Fix
- ORC-1796: [C++] Fix wrong result returned when hasNull is missing
Test
- ORC-1680: Bump bcpkix-jdk18on to 1.78
- ORC-1702: Bump bcpkix-jdk18on to 1.78.1
- ORC-1756: Bump snappy-java to 1.1.10.6 in bench module
- ORC-1774: Upgrade snappy-java to 1.1.10.7 in bench module
- ORC-1770: Upgrade parquet to 1.14.2 in bench module
- ORC-1776: Remove MacOS 12 from GitHub Action CI and docs
- ORC-1778: Upgrade Spark to 4.0.0-preview2 in bench module
- ORC-1783: Add MacOS 15 to GitHub Action MacOS CI and docs
- ORC-1790: Upgrade parquet to 1.14.3 in bench module
- ORC-1800: Upgrade bcpkix-jdk18on to 1.79
Build and Dependency Changes
- ORC-1608: Upgrade Hadoop to 3.4.0
- ORC-1750: Bump protobuf-java to 3.25.4
- ORC-1769: Upgrade zstd-jni to 1.5.6-5
- ORC-1775: Upgrade aircompressor to 2.0.2
- ORC-1777: Bump protobuf-java to 3.25.5
- ORC-1781: Upgrade zstd-jni to 1.5.6-6
- ORC-1782: Upgrade Hadoop to 3.4.1
- ORC-1784: Upgrade Maven to 3.9.9
- ORC-1785: Upgrade commons-csv to 1.12.0
- ORC-1791: Remove commons-lang3 dependency
v1.9.5
v1.8.8
v1.7.11
Milestone
Branch
Bug Fix
Test
- ORC-1540: Remove MacOS 11 from GitHub Action CI and docs
- ORC-1556: Add Rocky Linux 9 Docker Test
- ORC-1557: Add GitHub Action CI for Docker Test
- ORC-1561: Remove Java11 and clang variants from docker/os-list.txt in branch-1.7
- ORC-1578: Fix SparkBenchmark on sales data according to SPARK-40918
- ORC-1696: Fix ClassCastException when reading avro decimal type in benchmark
v2.0.2
Milestone
Branch
Improvements (tools)
- ORC-1724: JsonFileDump utility should print user metadata
- ORC-1740: Avoid the dump tool repeatedly parsing ColumnStatistics
- ORC-1742: Support printing the id, name and type of each column in dump tool
Bug Fix
- ORC-1732: [C++] Fix detecting Homebrew-installed Protobuf on MacOS
- ORC-1733: [C++][CMake] Fix CMAKE_MODULE_PATH not to use PROJECT_SOURCE_DIR
- ORC-1738: [C++] Fix wrong Int128 maximum value
- ORC-1741: Respect decimal reader isRepeating flag
- ORC-1749: Fix supportVectoredIO for hadoop version string with optional patch labels
- ORC-1751: [C++] fix syntax error in ThirdpartyToolchain
Test
- ORC-1694: Upgrade gson to 2.9.0 for Benchmarks Hive
- ORC-1697: Fix IllegalArgumentException when reading json timestamp type in benchmark
- ORC-1700: Write parquet decimal type data in Benchmark using FIXED_LEN_BYTE_ARRAY type
- ORC-1743: Upgrade Spark to 4.0.0-preview1
- ORC-1744: Add ubuntu-24.04 to GitHub Action
- ORC-1746: Bump netty-all to 4.1.110.Final in bench module
- ORC-1752: Fix NumberFormatException when reading json timestamp type in benchmark
- ORC-1753: Use Avro 1.12.0 in bench module
Build and Dependency Changes
v1.9.4
Milestone
Changelog
BugFix
- ORC-1696 Fix ClassCastException when reading avro decimal type in benchmark
- ORC-1721 Upgrade aircompressor to 0.27
- ORC-1738 Wrong Int128 maximum value
Test
- ORC-1619 Add MacOS 14 to GitHub Action
- ORC-1699 Fix SparkBenchmark in Parquet format according to SPARK-40918
Task
- ORC-1540 Remove MacOS 11 from GitHub Action CI
v2.0.1
Milestone
Branch
Improvements (tools)
- ORC-1644: Add merge tool to merge multiple ORC files into a single ORC file
- ORC-1647: Tips for supporting ORC in the convert command
- ORC-1667: Add check tool to check the index of the specified column
Bug Fix
- ORC-1646: Close the reader when reading the schema with the convert command
- ORC-1654: [C++] Count up EvaluatedRowGroupCount correctly
- ORC-1684: [C++] Find tzdb without TZDIR when in conda-environments
- ORC-1688: [C++] Do not access TZDB if there is no timestamp type
- ORC-1696: Fix ClassCastException when reading avro decimal type in benchmark
Task
- ORC-1649: [C++][Conan] Add 2.0.0 to conan recipe and update release guide
- ORC-1669: [C++] Deprecate HDFS support
- ORC-1686: [C++] Avoid using std::filesystem
Test
- ORC-1648: Add test to convert ORC in the convert command
- ORC-1663: [C++] Enable TestTimezone.testMissingTZDB on Windows
- ORC-1672: Remove test packages o.a.o.tools.check
- ORC-1673: Remove test packages o.a.o.tools.[count|merge|sizes]
- ORC-1676: Use Hive 4.0.0 in benchmark
- ORC-1681: Remove redundant import statement in tests to fix checkstyle failures
- ORC-1699: Fix SparkBenchmark in Parquet format according to SPARK-40918
- ORC-1704: Migration to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark
- ORC-1707: Fix sun.util.calendar IllegalAccessException when SparkBenchmark runs on JDK17
- ORC-1708: Support data/compress options in Hive benchmark
Build and Dependency Changes
- ORC-1670: Upgrade zstd-jni to 1.5.6-1
- ORC-1679: Bump zstd-jni to 1.5.6-2
- ORC-1695: Upgrade gson to 2.10.1
- ORC-1698: Upgrade commons-cli to 1.7.0
- ORC-1705: Upgrade zstd-jni to 1.5.6-3
- ORC-1714: Bump commons-csv to 1.11.0
- ORC-1715: Bump org.objenesis:objenesis to 3.3
Documentation
- ORC-1668: Add merge command to Java tools documentation
v1.8.7
Milestone
Changelog
Bug
ORC-1528: Fix readBytes potential overflow in RecordReaderUtils.ChunkReader#create
ORC-1602: [C++] limit compression block size
Test
ORC-1556: Add Rocky Linux 9 Docker Test
ORC-1557: Add GitHub Action CI for Docker Test
ORC-1560: Remove Java11 and clang variants from docker/os-list.txt in branch-1.8
ORC-1562: Bump guava to 33.0.0-jre
ORC-1578: Fix SparkBenchmark on sales data according to SPARK-40918
ORC-1621: Switch to oraclelinux9 from rocky9
Documentation
ORC-1536: Remove hive-storage-api link from maven-javadoc-plugin
ORC-1563: Fix orc.bloom.filter.fpp default value and orc.compress notes of Spark and Hive config docs
v1.9.3
Milestone
Changelog
BugFix
- ORC-634 Fix the json output for double NaN and infinite
- ORC-1553 Reading information from Row group, where there are 0 records of SArg column
- ORC-1563 Fix orc.bloom.filter.fpp default value and orc.compress notes of Spark and Hive config docs
- ORC-1578 Fix SparkBenchmark according to SPARK-40918
- ORC-1586 Fix IllegalAccessError when SparkBenchmark runs on JDK17
- ORC-1602 [C++] limit compression block size
- ORC-1607 Fix testDoubleNaNAndInfinite to use TestFileDump.checkOutput
- ORC-1609 Fix the compilation problem of TestJsonFileDump in branch 1.9
Test
- ORC-1556 Add Rocky Linux 9 Docker Test
- ORC-1557 Add GitHub Action CI for Docker Test
- ORC-1559 Remove Java11 and clang variants from docker/os-list.txt from branch-1.9
Task
- ORC-1532 Upgrade opencsv to 5.9
- ORC-1536 Remove hive-storage-api link from maven-javadoc-plugin
- ORC-1576 Upgrade spark.jackson.version to 2.15.2 in bench module
- ORC-1591 Lower log level from INFO to DEBUG in *ReaderImpl/WriterImpl/PhysicalFsWriter
- ORC-1592 Suppress KeyProvider missing log
- ORC-1616 Upgrade aircompressor to 0.26
- ORC-1618 Disable building tests for snappy
Documentation
- ORC-1535 Remove generated Java docs from source tree