Deprecation Notice
The ONEDPL_USE_AOT_COMPILATION and ONEDPL_AOT_ARCH CMake options are deprecated and will be removed in a future
release. Please use the relevant compiler flags to enable ahead-of-time (AOT) compilation instead.
New Features
- Added parallel range algorithms in namespace oneapi::dpl::ranges: set_intersection, set_union,
  set_difference, set_symmetric_difference, includes, unique, unique_copy, destroy,
  uninitialized_fill, uninitialized_move, uninitialized_copy, uninitialized_value_construct,
  uninitialized_default_construct, reverse, reverse_copy, swap_ranges. These algorithms operate with
  C++20 random access ranges.
- Improved performance of gpu::inclusive_scan kernel template and added support for binary operator and type
  combinations which do not have a SYCL known identity.
- Improved performance of inclusive_scan_by_segment, exclusive_scan_by_segment, set_union,
  set_difference, set_intersection, and set_symmetric_difference when using device policies.
- Improved performance of search operations (e.g., find, all_of, equal, search, etc.), is_heap and
  is_heap_until algorithms on Intel® Arc™ B-series GPU devices.
Fixed Issues
- Removed requirement of GPU double precision support to use set_union, set_difference, set_intersection,
  and set_symmetric_difference on Windows operating systems.
- Removed default-constructible requirements from the value type for reduce and transform_reduce algorithms,
  as well as copy-constructible requirements when these algorithms are used with a native ("host") policy.
- Fixed an issue with ranges::merge when projections of the two input ranges were not the same.
- Fixed equal returning false for empty input sequences; it now returns true.
- Fixed a compilation error "SYCL kernel cannot use exceptions" occurring with libstdc++ version 10 when calling
  adjacent_find, is_sorted, and is_sorted_until range algorithms with device policies.
- Fixed an issue with PSTL_USE_NONTEMPORAL_STORES macro having no effect.
- Fixed a bug where unique called with a device policy returned an incorrect result iterator.
- Fixed a bug in exclusive_scan, inclusive_scan, transform_exclusive_scan, transform_inclusive_scan,
  exclusive_scan_by_segment, and inclusive_scan_by_segment algorithms when using device policies with different
  input and output value types.
- Fixed a bug in return value types of minmax_element and mismatch range algorithms.
- Fixed compile errors in set_union and set_symmetric_difference when using device policies
  with different second-input and output value types.
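As a point of reference for the equal fix above, the C++ standard library treats two empty sequences as equal. The sketch below uses std::equal as a stand-in, not the oneDPL overload; the function name is hypothetical and no execution policy is involved.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper for illustration: compares two empty sequences with
// the serial standard-library algorithm (not oneDPL).
bool empty_sequences_compare_equal() {
    std::vector<int> a;  // empty input
    std::vector<int> b;  // empty input
    // With no elements to differ, the comparison is vacuously true.
    return std::equal(a.begin(), a.end(), b.begin(), b.end());
}
```

A oneDPL call with a device or host policy would be expected to agree with this result after the fix.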
Known Issues and Limitations
New in This Release
- copy_if, unique_copy, set_union, set_intersection, set_difference, set_symmetric_difference
  range algorithms require the output range to have sufficient size to hold all resulting elements.
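One way to meet the output-size requirement is to allocate for the worst case before the call. For a set union that bound is the sum of the two input sizes. The sketch below uses the serial std::set_union as a stand-in, since the oneDPL range signature is not shown in these notes; the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Worst case for a set union: the inputs share no elements, so all are kept.
std::size_t max_union_size(std::size_t n1, std::size_t n2) { return n1 + n2; }

// Hypothetical helper: writes the union of two sorted inputs into an output
// buffer that is sized up front, then trims the unused tail.
std::vector<int> union_into_presized_output(const std::vector<int>& a,
                                            const std::vector<int>& b) {
    std::vector<int> out(max_union_size(a.size(), b.size()));
    auto last = std::set_union(a.begin(), a.end(), b.begin(), b.end(), out.begin());
    out.erase(last, out.end());  // keep only the elements actually written
    return out;
}
```

Analogous upper bounds are the smaller input size for set_intersection, the first input size for set_difference, and the input size for copy_if and unique_copy.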
Existing Issues
See oneDPL Guide for other restrictions and known limitations.
- histogram algorithm requires the output value type to be an integral type no larger than four bytes
  when used with a device policy on hardware that does not support 64-bit atomic operations.
- For transform_exclusive_scan and exclusive_scan to run in-place (that is, with the same data
  used for both input and destination) and with an execution policy of unseq or par_unseq,
  it is required that the provided input and destination iterators are equality comparable.
  Furthermore, the equality comparison of the input and destination iterator must evaluate to true.
  If these conditions are not met, the result of these algorithm calls is undefined.
- Incorrect results may be produced by exclusive_scan, inclusive_scan, transform_exclusive_scan,
  transform_inclusive_scan, exclusive_scan_by_segment, inclusive_scan_by_segment, reduce_by_segment
  with unseq or par_unseq policy when compiled by Intel® oneAPI DPC++/C++ Compiler 2024.1 or earlier
  with -fiopenmp, -fiopenmp-simd, -qopenmp, -qopenmp-simd options on Linux.
  To avoid the issue, pass -fopenmp or -fopenmp-simd option instead.
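The in-place condition for exclusive_scan described above can be illustrated with a minimal sketch. It assumes the serial std::exclusive_scan from the standard library as a stand-in (no unseq or par_unseq policy, no oneDPL headers), and the function name is hypothetical. The point is only that the input and destination iterators have the same type and compare equal.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: runs an exclusive scan whose destination is the
// input buffer itself. Serial standard-library call, not a oneDPL policy call.
std::vector<int> exclusive_scan_in_place(std::vector<int> data) {
    auto first = data.begin();
    auto result = data.begin();
    // Same iterator type, and the two iterators compare equal: this is the
    // condition the release note asks for when scanning in place.
    assert(first == result);
    std::exclusive_scan(first, data.end(), result, 0);
    return data;
}
```

Passing iterators of different types that happen to alias the same memory (for example, a raw pointer for input and a wrapper iterator for output) would not satisfy the stated condition.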