Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation for buffer types #5410

Merged
merged 4 commits into from
Oct 28, 2024
Merged

Conversation

kaizhangNV
Copy link
Contributor

@kaizhangNV kaizhangNV added the pr: non-breaking PRs without breaking changes label Oct 25, 2024
@kaizhangNV kaizhangNV requested a review from csyonghe October 25, 2024 17:29
source/slang/hlsl.meta.slang Show resolved Hide resolved
source/slang/hlsl.meta.slang Show resolved Hide resolved
source/slang/hlsl.meta.slang Outdated Show resolved Hide resolved
@@ -4216,6 +4364,10 @@ struct $(item.name)
}
}

/// Load two 32-bit unsigned integers from the buffer at the specified location with alignment
/// of `uint2`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speak out what the alignment is, I. This case it is 8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems still 8.
I just checked:
source/slang/slang-ir-layout.cpp:236

source/slang/hlsl.meta.slang Show resolved Hide resolved
source/slang/hlsl.meta.slang Show resolved Hide resolved
source/slang/hlsl.meta.slang Show resolved Hide resolved
source/slang/hlsl.meta.slang Outdated Show resolved Hide resolved
Copy link
Collaborator

@csyonghe csyonghe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the alignments for LoadAligned is not correct: 2-vector should be 8 bytes (offset must be multiple of 8 too), and 4-vector should be 16.

}
}

/// Load two 32-bit unsigned integers from the buffer at the specified location with alignment
/// of `uint2`, which is 4.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is 8?

}
}

/// Load four 32-bit unsigned integers from the buffer at the specified location.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be 16.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint4 is four 16bits integer?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is four 32 bit integers, and LoadAlign will align to the vector boundary, which is 16 bytes.

@kaizhangNV
Copy link
Contributor Author

kaizhangNV commented Oct 25, 2024

I think the alignments for LoadAligned is not correct: 2-vector should be 8 bytes (offset must be multiple of 8 too), and 4-vector should be 16.

I think offset is different from alignment.
If you see some definition here:
void __byteAddressBufferStore(RWByteAddressBuffer buffer, int offset, int alignment, T value);

Alignment put the restriction on the address.

Also, I checked the alignment by looking at our code at

source/slang/slang-ir-layout.cpp:236

    case kIROp_VectorType:
    {
        auto vecType = cast<IRVectorType>(type);
        IRSizeAndAlignment elementTypeLayout;
        getSizeAndAlignment(optionSet, rules, vecType->getElementType(), &elementTypeLayout);
        *outSizeAndAlignment = rules->getVectorSizeAndAlignment(elementTypeLayout, getIntegerValueFromInst(vecType->getElementCount()));
        return SLANG_OK;
    }

outSizeAndAlignment does show that uint{2,3,4} have alignment of 4.

/// Load two 32-bit unsigned integers from the buffer at the specified location
/// with additional alignment.
///@param location The input address in bytes.
///@param alignment Specifies the alignment of the location, which must be a multiple of 4.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8

}
}

/// Load four 32-bit unsigned integers from the buffer at the specified location.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is four 32 bit integers, and LoadAlign will align to the vector boundary, which is 16 bytes.

@csyonghe
Copy link
Collaborator

OK, I think the current behavior is LoadAligned will treat the buffer as an array of T, and the offset must be at the boundary of T's stride. Therefore, the result will be undefined if you do LoadAligned<float4>(4).

You can use slang-playground to compile this code to spirv:

RWByteAddressBuffer bb;

[numthreads(1,1,1)]
void computeMain()
{
    var v = bb.LoadAligned<float4>(16);
    bb.Store(0, v);
}

You can see in the spirv we have:

%computeMain = OpFunction %void None %3
          %4 = OpLabel
         %11 = OpAccessChain %_ptr_StorageBuffer_v4float %bb %int_0 %int_1
         %16 = OpLoad %v4float %11
         %17 = OpAccessChain %_ptr_StorageBuffer_v4float %bb %int_0 %int_0
               OpStore %17 %16
               OpReturn
               OpFunctionEnd

Which is treating bb as float4[].

@csyonghe
Copy link
Collaborator

Looks like we are generating wrong code for composite types...
Try

RWByteAddressBuffer bb;

struct VV
{
    float4 v1;
    float2 v2;
};
[numthreads(1,1,1)]
void computeMain()
{
    var v = bb.LoadAligned<VV>(16);
    bb.Store(0, v);
}

The load for the v2 field is at a completely invalid address.

@csyonghe
Copy link
Collaborator

Let's put in the document that this currently only works for scalar, vector and matrix types.

@csyonghe
Copy link
Collaborator

and the semantics is that it assumes the address to load/store is aligned at the stride boundary of the vector type.

@kaizhangNV
Copy link
Contributor Author

OK, I will update the document accordingly.

@kaizhangNV
Copy link
Contributor Author

kaizhangNV commented Oct 27, 2024

I tried the code like:

ByteAddressBuffer byteBuffer;
RWStructuredBuffer<uint2> output;

[shader("compute")]
void computeMain(int3 dispatchThreadID : SV_DispatchThreadID)
{
    uint2 out;
    out = byteBuffer.Load2Aligned(4);
    output[0] = out;
}

and it generates:

%10 = OpAccessChain %_ptr_StorageBuffer_uint %byteBuffer %int_0 %int_1
%15 = OpLoad %uint %10
%17 = OpAccessChain %_ptr_StorageBuffer_uint %byteBuffer %int_0 %int_2

and you can see that, we treat the whole buffer as uint type, so the alignment is always 4.
I tried uint2, uint3, and uint4, same behavior. They're all aligned to 4.

But if I give an input location that is aligned to size of the vector, it will treat the buffer as the expecting types, e.g.

byteBuffer.Load2Aligned(8);  => OpAccessChain %_ptr_StorageBuffer_v2uint %byteBuffer %int_0 %int_1

byteBuffer.Load3Aligned(12);  => OpAccessChain %_ptr_StorageBuffer_v3uint %byteBuffer %int_0 %int_1 

byteBuffer.Load4Aligned(16); => OpAccessChain %_ptr_StorageBuffer_v4uint %byteBuffer %int_0 %int_1 

so it looks to me that, as long as the minimum alignment is met (which is 4), it can still generate the correct result.

@kaizhangNV
Copy link
Contributor Author

So as long as we define that

out = byteBuffer.Load2Aligned(4); 

=>

%10 = OpAccessChain %_ptr_StorageBuffer_uint %byteBuffer %int_0 %int_1
%17 = OpAccessChain %_ptr_StorageBuffer_uint %byteBuffer %int_0 %int_2

is an unexpected behavior, otherwise, alignment of 4 should be valid statement.

Update the doc for all Load{2,3,4}Aligned and LoadxAligned<T> functions of
buffer type. We assume that those aligned version of Load{2,3,4} and
Load<T> will treat the whole buffer as type of unit{2,3,4} or T,
so the address must be aligned to size of the loaded type.
@kaizhangNV
Copy link
Contributor Author

@csyonghe add a new commit to update all the aligned version of load and store, please review.
For other load and store methods, doc remains unchanged.

@csyonghe csyonghe merged commit 0557a19 into shader-slang:master Oct 28, 2024
13 checks passed
@csyonghe csyonghe mentioned this pull request Oct 28, 2024
@kaizhangNV kaizhangNV deleted the doc branch January 8, 2025 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr: non-breaking PRs without breaking changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants