Improve using generics to select float precision and array sizes. #6138

tomas-davidovic · 2025-01-20T16:33:16Z

I am trying to convert our MLPs from hardcoded values and defines to a more generics-based solution, with an interface like this:

interface IMLP
{
    associatedtype PrecisionType : __BuiltinFloatingPointType; // precision of the IO, can be float or half
    static const uint kInputWidth;
    static const uint kOutputWidth;
    void eval(PrecisionType input[kInputWidth], out PrecisionType output[kOutputWidth]); // other parameters not relevant
}

This is then used in this manner:

void decorate_and_eval<MLP : IMLP, let TLatentSizes : uint>(
    float latents[TLatentSizes],
    out float result[MLP::kOutputWidth],
    out float resultExponentiated[MLP::kOutputWidth]
)
{
    MLP::PrecisionType inputs[MLP::kInputWidth];
    for (uint i = 0; i < TLatentSizes; ++i)
    {
        inputs[i] = latents[i]; // This does not work, it won't cast, even though it can be cast with any conforming type
        inputs[i] = (MLP::PrecisionType)latents[i]; // requires explicit cast
    }

    for (uint i = TLatentSizes; i < MLP::kInputWidth; ++i)
    {
        inputs[i] = 0.f;                     // This does not work either
        inputs[i] = (MLP::PrecisionType)0.f; // must have an explicit cast
    }

    MLP::PrecisionType outputs[MLP::kOutputWidth] = {}; // this does not work, says the size is not statically known.

    // Must do this explicitly as a workaround to get an equivalent result
    [ForceUnroll]
    for (uint i = 0; i < MLP::kOutputWidth; ++i)
        outputs[i] = MLP::PrecisionType(0.f);

    MLP mlp;
    mlp.eval(inputs, outputs);

    for (uint i = 0; i < MLP::kOutputWidth; ++i)
    {
        result[i] = outputs[i];                    // this does not work
        result[i] = (float)outputs[i];             // nor does this
        result[i] = float(outputs[i]);             // nor this
        result[i] = __realCast<float>(outputs[i]); // must do this
    }

    for (uint i = 0; i < MLP::kOutputWidth; ++i)
    {
        resultExponentiated[i] = exp(outputs[i]); // this does not work, despite exp having generic arg of type __BuiltinFloatingPointType
        resultExponentiated[i] = exp(__realCast<float>(outputs[i])); // if PrecisionType is half, this is unnecessarily expensive
        resultExponentiated[i] = exp(__realCast<half>(outputs[i]));  // if PrecisionType is float, this is too imprecise
    }
}

I am aware this is multiple issues in one report, but I believe the context is needed and will provide the individual pieces as subissues.


tomas-davidovic · 2025-01-20T16:39:31Z

Sorry, it turns out I cannot create subissues, but I hope the individual problems are described well enough in the text that they can be broken up. In case that is a problem, these would be the subissues:

Array size as a generic argument prevents using = {} initialization.

void empty_init<let TSize : uint>()
{
    float array[TSize] = {}; // does not work, even though the size is clearly known at codegen time
    // Workaround (WAR):
    [ForceUnroll]
    for (uint war = 0; war < TSize; ++war)
        array[war] = 0.f;
}

Assigning a float literal into a generic argument of the type __BuiltinFloatingPointType is not possible without an explicit cast.

void assign_float<TFloat : __BuiltinFloatingPointType>()
{
    TFloat foo;
    foo = 1.0f; // does not work without an explicit cast

    TFloat bar;
    bar = TFloat(1.0f); // Must explicitly cast, even though there is an implicit cast from 1.f to every __BuiltinFloatingPointType
}

Assigning from a variable with type defined by generic argument of type __BuiltinFloatingPointType into a float is not possible even with C-style casts, __realCast is required:

interface IGenerator
{
    associatedtype TFloat : __BuiltinFloatingPointType;
    TFloat generate();
};

float gen_plus_1<Gen : IGenerator>()
{
    Gen g;
    Gen::TFloat f = g.generate();
    float result0 = 1.f + f;                    // no overload for `+` applicable to types
    float result1 = 1.f + (float)f;             // expected a function, got `typeof(float)`
    float result2 = 1.f + float(f);             // expected a function, got `typeof(float)`
    float result3 = 1.f + __realCast<float>(f); // feels like a workaround, not a recommended workflow
    return result0;
}

Calling a built-in function exp that is defined for both half and float is not possible using a generic of type __BuiltinFloatingPointType without forcing an explicit cast that will have unintended consequences on the codegen:

float gen_exp<Gen : IGenerator>()
{
    Gen g;
    Gen::TFloat f = g.generate();

    float e0 = exp(__realCast<float>(f)); // if f is `half`, does exp(float) instead of exp(half) -> slow
    float e1 = exp(__realCast<half>(f));  // if f is `float`, does exp(half) instead of exp(float) -> imprecise
    float e2 = exp(f); // does not work, even though exp is generic over __BuiltinFloatingPointType
                       // and TFloat conforms to __BuiltinFloatingPointType
}
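
For reference, the two conversions that do compile in the examples above (`TFloat(x)` for float to generic, `__realCast<float>(x)` for generic to float) could be wrapped in small helpers until the implicit conversions work. This is only a sketch under that assumption: `to_float` and `from_float` are hypothetical names, not part of Slang, and it does nothing for the `exp` case.

```slang
// Hypothetical helper: convert any built-in floating-point value to float.
// Uses __realCast, the only generic-to-float conversion that compiled above.
float to_float<T : __BuiltinFloatingPointType>(T value)
{
    return __realCast<float>(value);
}

// Hypothetical helper: convert a float to the generic precision type.
// Uses the explicit constructor-style cast, which compiled above.
T from_float<T : __BuiltinFloatingPointType>(float value)
{
    return T(value);
}
```

With these, the copy loops in `decorate_and_eval` would presumably read `inputs[i] = from_float<MLP::PrecisionType>(latents[i]);` and `result[i] = to_float(outputs[i]);`, which keeps the workaround in one place.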
