TensorSpan<T>.FlattenTo is 80x slower with the .NET 10 package #121463

@MitchRazga

Description

There is a roughly 80x performance regression in System.Numerics.Tensors TensorSpan<T>.FlattenTo (and ReadOnlyTensorSpan<T>.FlattenTo).

Prior to 10.0.0-preview.4.25258.110, the implementation copied to the destination in blocks; it now copies element by element.
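
To illustrate the difference between the two copy strategies, here is a minimal sketch using plain spans. It is not the library's code: `FlattenPerElement`, `FlattenPerBlock`, and the `rowStride` layout are hypothetical names and assumptions chosen for illustration. It flattens a strided 2D view once with one indexed store per element and once with one `Span<T>.CopyTo` call per contiguous row.

```csharp
using System;

// Hypothetical helper (not the library implementation): copies a rows x cols
// view with the given row stride into a dense destination, one element at a time.
static void FlattenPerElement(ReadOnlySpan<float> source, int rows, int cols, int rowStride, Span<float> destination)
{
    for (int r = 0; r < rows; r++)
    {
        for (int c = 0; c < cols; c++)
        {
            // One bounds-checked load and store per element.
            destination[r * cols + c] = source[r * rowStride + c];
        }
    }
}

// Hypothetical helper (not the library implementation): copies the same view,
// but moves each contiguous row with a single block copy.
static void FlattenPerBlock(ReadOnlySpan<float> source, int rows, int cols, int rowStride, Span<float> destination)
{
    for (int r = 0; r < rows; r++)
    {
        // Span<T>.CopyTo performs one bulk memory copy for the whole row.
        source.Slice(r * rowStride, cols).CopyTo(destination.Slice(r * cols, cols));
    }
}

// Backing buffer of 4 rows with stride 8; only the first 6 columns of each row are viewed.
float[] backing = new float[4 * 8];
for (int i = 0; i < backing.Length; i++)
{
    backing[i] = i;
}

float[] a = new float[4 * 6];
float[] b = new float[4 * 6];
FlattenPerElement(backing, 4, 6, 8, a);
FlattenPerBlock(backing, 4, 6, 8, b);

// Both strategies produce identical output; only the cost per element differs.
Console.WriteLine(a.AsSpan().SequenceEqual(b));
```

Both helpers write the same values; the block version simply replaces the inner loop with a bulk copy, which is the kind of work the older package did per contiguous run.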

Reproduction + benchmark solution here
Run FlattenTo.Before and FlattenTo.After console apps to benchmark against each package version.

Regression?

This performance issue was not present in the stable 9.0.10 release; however, the package was marked Experimental before .NET 10.

10.0.0-preview.3.25171.5 is the last version before the rewrite #114927 that introduced TensorOperation.

Data

| Method | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|
| FlattenToBefore | 25.16 μs | 0.136 μs | 0.120 μs | 48 B |
| FlattenToAfter | 1.936 ms | 0.0013 ms | 0.0011 ms | - |

Before-Benchmarks.log

After-Benchmarks.log

Analysis

release/9.0

release/10.0-rc2


public static void Invoke<TOperation, TArg, TResult>(in ReadOnlyTensorSpan<TArg> x, in Span<TResult> destination)

cc: @tannergooding
