Improve using generics to select float precision and array sizes. #6138

Open
tomas-davidovic opened this issue Jan 20, 2025 · 1 comment

Comments

@tomas-davidovic

I am trying to turn our MLPs from hardcoded values and defines into a more generics-based solution, with an interface like this:

interface IMLP
{
    associatedtype PrecisionType : __BuiltinFloatingPointType; // precision of the IO, can be float or half
    static const uint kInputWidth;
    static const uint kOutputWidth;
    void eval(PrecisionType input[kInputWidth], out PrecisionType output[kOutputWidth]); // other parameters not relevant
}

This is then used in this manner:

void decorate_and_eval<MLP : IMLP, let TLatentSizes : uint>(
    float latents[TLatentSizes],
    out float result[MLP::kOutputWidth],
    out float resultExponentiated[MLP::kOutputWidth]
)
{
    MLP::PrecisionType inputs[MLP::kInputWidth];
    for (uint i = 0; i < TLatentSizes; ++i)
    {
        inputs[i] = latents[i]; // This does not work, it won't cast, even though it can be cast with any conforming type
        inputs[i] = (MLP::PrecisionType)latents[i]; // requires explicit cast
    }

    for (uint i = TLatentSizes; i < MLP::kInputWidth; ++i)
    {
        inputs[i] = 0.f;                     // This does not work either
        inputs[i] = (MLP::PrecisionType)0.f; // must have an explicit cast
    }

    MLP::PrecisionType outputs[MLP::kOutputWidth] = {}; // this does not work, says the size is not statically known.

    // Must initialize explicitly as a workaround to get an equivalent result
    [ForceUnroll]
    for (uint i = 0; i < MLP::kOutputWidth; ++i)
        outputs[i] = MLP::PrecisionType(0.f);

    MLP mlp;
    mlp.eval(inputs, outputs);

    for (uint i = 0; i < MLP::kOutputWidth; ++i)
    {
        result[i] = outputs[i];                    // this does not work
        result[i] = (float)outputs[i];             // nor does this
        result[i] = float(outputs[i]);             // nor this
        result[i] = __realCast<float>(outputs[i]); // must do this
    }

    for (uint i = 0; i < MLP::kOutputWidth; ++i)
    {
        resultExponentiated[i] = exp(outputs[i]); // this does not work, despite exp having generic arg of type __BuiltinFloatingPointType
        resultExponentiated[i] = exp(__realCast<float>(outputs[i])); // if PrecisionType is half, this is unnecessarily expensive
        resultExponentiated[i] = exp(__realCast<half>(outputs[i]));  // if PrecisionType is float, this is too imprecise
    }
}

I am aware this is multiple issues in one report, but I believe the context is needed and will provide the individual pieces as subissues.

@tomas-davidovic
Author

Sorry, it turns out I cannot create subissues, but I hope the individual problems are described well enough in the text that they can be broken up. In case there is a problem, these would be the subissues:

Array size as a generic argument prevents using = {} initialization.

void empty_init<let TSize : uint>()
{
    float array[TSize] = {}; // does not work, even though the size is clearly known at codegen time.
                             // Workaround (WAR):
    [ForceUnroll]
    for (uint war = 0; war < TSize; ++war)
        array[war] = 0.f;
}
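
Until `= {}` works in this situation, the loop workaround can be factored into a small helper. This is a minimal, untested sketch: `zeroFill` and `example` are hypothetical names, not part of Slang, and it assumes that an `out` array parameter may take its length from a generic value parameter, in the same way that `float array[TSize]` does above.

```slang
// Hypothetical helper (not part of Slang): zero-fills an array whose
// length N is a generic value parameter, standing in for `= {}`.
void zeroFill<T : __BuiltinFloatingPointType, let N : uint>(out T values[N])
{
    [ForceUnroll]
    for (uint i = 0; i < N; ++i)
        values[i] = T(0.0f); // explicit construction from a float literal
}

// Hypothetical usage with concrete arguments.
void example()
{
    float data[4];
    zeroFill<float, 4>(data); // every element of data is now zero
}
```

Whether the generic arguments can instead be supplied from an interface, as in `MLP::PrecisionType` and `MLP::kOutputWidth`, is not something this sketch has verified.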

Assigning a float literal to a variable whose type is a generic parameter constrained to __BuiltinFloatingPointType is not possible without an explicit cast.

void assign_float<TFloat : __BuiltinFloatingPointType>()
{
    TFloat foo;
    foo = 1.0f; // does not work without an explicit cast

    TFloat bar;
    bar = TFloat(1.0f); // Must explicitly cast, even though there is an implicit cast from 1.f to every __BuiltinFloatingPointType
}

Assigning a variable whose type is an associated type constrained to __BuiltinFloatingPointType to a float is not possible even with C-style casts; __realCast is required:

interface IGenerator
{
    associatedtype TFloat : __BuiltinFloatingPointType;
    TFloat generate();
};

float gen_plus_1<Gen : IGenerator>()
{
    Gen g;
    Gen::TFloat f = g.generate();
    float result0 = 1.f + f;                    // no overload for `+` applicable to types
    float result1 = 1.f + (float)f;             // expected a function, got `typeof(float)`
    float result2 = 1.f + float(f);             // expected a function, got `typeof(float)`
    float result3 = 1.f + __realCast<float>(f); // feels like a workaround, not a recommended workflow
    return result0;
}
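
The two conversions that the examples above report as working, explicit construction from `float` and `__realCast` back to `float`, can be kept in one place so call sites stay readable. This is a minimal, untested sketch: `fromFloat`, `toFloat` and `halfPlusOne` are hypothetical names, not part of Slang, and the usage function assumes a target that supports `half`.

```slang
// Hypothetical helpers (not part of Slang).

// float -> T, using the explicit construction that the report found to compile.
T fromFloat<T : __BuiltinFloatingPointType>(float v)
{
    return T(v);
}

// T -> float, using __realCast, the conversion the report found to work.
float toFloat<T : __BuiltinFloatingPointType>(T v)
{
    return __realCast<float>(v);
}

// Hypothetical usage with a concrete type.
float halfPlusOne()
{
    half h = fromFloat<half>(0.5f);
    return 1.0f + toFloat<half>(h);
}
```

The generic arguments are written out explicitly in the usage; whether they can be inferred when the argument has an associated type such as `Gen::TFloat` is an open question here.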

Calling a built-in function exp that is defined for both half and float is not possible using a generic of type __BuiltinFloatingPointType without forcing an explicit cast that will have unintended consequences on the codegen:

float gen_exp<Gen : IGenerator>()
{
    Gen g;
    Gen::TFloat f = g.generate();

    float e0 = exp(__realCast<float>(f)); // if f is `half`, does exp(float) instead of exp(half) -> slow
    float e1 = exp(__realCast<half>(f));  // if f is `float`, does exp(half) instead of exp(float) -> imprecise
    float e2 = exp(f); // does not work, although exp is generic over __BuiltinFloatingPointType and TFloat conforms to it
}
