Work on mapping examples
ccummingsNV committed Jan 13, 2025
1 parent 98745bc commit 5dc7d2e
Showing 5 changed files with 159 additions and 38 deletions.
18 changes: 12 additions & 6 deletions docs/broadcasting.rst
Expand Up @@ -12,13 +12,13 @@ are passed either equally sized buffers, or single values. For example:

.. code-block:: python
# Adding 2 buffers of equal size
# Adding all elements in 2 buffers of equal size
mymodule.add(np.array([1, 2, 3]), np.array([4, 5, 6]))
# Adding a single value to every element of a buffer
mymodule.add(5, np.array([4, 5, 6]))
# Adding 2 2D buffers of the same shape (2x2)
# Adding all elements in 2 2D buffers of the same shape (2x2)
mymodule.add(np.array([[1, 2], [3, 4]]), np.array([[5, 6], [7, 8]]))
The process of taking the arguments and inferring how to vectorize the function is known as
Expand All @@ -30,6 +30,8 @@ Before diving in, some terminology:
a 2D buffer has a `dimensionality` of 2, a volume texture has a `dimensionality` of 3 etc.
- `Shape`: The size of each dimension of a value. For example, a 1D buffer of size 3 has a `shape` of (3,), a 32x32x32 volume texture has a shape of (32,32,32).
In effect, `dimensionality` is equal to the length of the `shape` tuple.
Note: For those new to broadcasting, a common point of confusion is that a `3D vector` does **not** have
a `dimensionality` of 3! Instead, it has a `dimensionality` of 1, and its `shape` is (3,).
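The distinction between `dimensionality` and `shape` can be checked with plain NumPy, which uses the same terminology (``ndim`` and ``shape``). This is only an illustrative sketch; it does not call SlangPy.

```python
import numpy as np

# A "3D vector" is a 1D array holding 3 components.
vec = np.array([1.0, 2.0, 3.0])
print(vec.ndim, vec.shape)

# A 32x32x32 volume has 3 dimensions.
volume = np.zeros((32, 32, 32))
print(volume.ndim, volume.shape)

# Dimensionality is always the length of the shape tuple.
print(len(vec.shape) == vec.ndim, len(volume.shape) == volume.ndim)
```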
Expand Down Expand Up @@ -69,16 +71,20 @@ and generates an output of a given **shape**:
B (5,3,4)
Out Error
SlangPy will also support broadcasting a single value to all dimensions of the output. Conceptually,
this is similar to adding dimensions of size 1 to the value until it matches the output's dimensionality:
SlangPy will also support broadcasting a single value to all dimensions of the output. Programmatically,
a single value can be thought of as a value that isn't indexed - its dimensionality is 0, and its shape
is ().
Conceptually, broadcasting the same value to all dimensions is similar to adding dimensions of size
1 to the value until it matches the output's dimensionality:
.. code-block:: python
# For a function Out[x,y,z] = A[x,y,z] + B[x,y,z]
# A single value is broadcast to all dimensions of the output
# Out[x,y,z] = A[0] + B[x,y,z]
A (1)
# Out[x,y,z] = A + B[x,y,z]
A ()
B (10,3,4)
Out (10,3,4)
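As a rough NumPy analogy (not SlangPy code), a scalar has shape ``()`` and dimensionality 0, and adding it to a buffer produces an output with the buffer's shape:

```python
import numpy as np

# A single value: not indexed, so its shape is the empty tuple.
a = np.float32(5.0)
b = np.random.rand(10, 3, 4)

# The scalar is applied to every element of b.
out = a + b

print(a.shape, a.ndim)
print(out.shape)
```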
Expand Down
125 changes: 125 additions & 0 deletions docs/mapping.rst
@@ -0,0 +1,125 @@
Mapping
=======

In the previous broadcasting section, we saw how SlangPy applies broadcasting rules to automatically vectorize a function. Mapping gives more control over this process by allowing the user to explicitly specify the relationship between argument and kernel dimensions.

.. code-block:: python
a = np.random.rand(10,3,4)
b = np.random.rand(10,3,4)
result = mymodule.add(a,b, _result='numpy')
In this example:

- ``a`` and ``b`` are `arguments` to the ``add`` kernel, each with shape ``(10,3,4)``.
- The kernel is dispatched with overall shape ``(10,3,4)``.
- Any given thread, ``[i,j,k]``, reads ``a[i,j,k]`` and ``b[i,j,k]`` and writes ``result[i,j,k]``.

There is a simple 1-to-1 mapping of argument dimensions to kernel dimensions.
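To make the per-thread view concrete, the following NumPy-only sketch emulates the dispatch with a loop over every kernel index. It is an illustration of the 1-to-1 mapping, not how SlangPy executes the kernel on the GPU.

```python
import numpy as np

a = np.random.rand(10, 3, 4)
b = np.random.rand(10, 3, 4)

# The kernel shape matches the argument shape.
result = np.empty(a.shape)

# Each iteration plays the role of one thread [i, j, k].
for i, j, k in np.ndindex(a.shape):
    result[i, j, k] = a[i, j, k] + b[i, j, k]

# Same answer as the vectorized element-wise sum.
print(np.allclose(result, a + b))
```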

Re-mapping dimensions
---------------------

``map`` can be used to change how argument dimensions correspond to kernel dimensions. In the above example, we could have written:

.. code-block:: python
a = np.random.rand(10,3,4)
b = np.random.rand(10,3,4)
result = mymodule.add.map((0,1,2), (0,1,2))(a,b, _result='numpy')
The tuples passed to ``map`` specify, for each argument, which kernel dimension each of its dimensions maps to. In this case, dimension 0 maps to dimension 0, dimension 1 to dimension 1, and dimension 2 to dimension 2. This 1-to-1 mapping is the default behaviour.
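One way to reason about a mapping is to work out the kernel shape it implies. The helper below is a hypothetical, pure-Python sketch: ``kernel_shape`` is not part of SlangPy, and it assumes that entry ``d`` of a mapping tuple names the kernel dimension that argument dimension ``d`` maps to, and that unmapped arguments use the default 1-to-1 mapping.

```python
def kernel_shape(arg_shapes, mappings=None):
    """Hypothetical helper: infer the kernel shape implied by a mapping."""
    mappings = dict(mappings or {})
    # Arguments without an explicit mapping use the default 1-to-1 mapping.
    resolved = {
        name: tuple(mappings.get(name, range(len(shape))))
        for name, shape in arg_shapes.items()
    }
    # The kernel needs one more dimension than the highest index used.
    ndim = 1 + max((k for m in resolved.values() for k in m), default=-1)
    sizes = [None] * ndim
    for name, shape in arg_shapes.items():
        mapping = resolved[name]
        if len(mapping) != len(shape):
            raise ValueError(f"{name}: mapping length must match dimensionality")
        for size, kernel_dim in zip(shape, mapping):
            if sizes[kernel_dim] not in (None, size):
                raise ValueError(f"{name}: size mismatch in kernel dimension {kernel_dim}")
            sizes[kernel_dim] = size
    if None in sizes:
        raise ValueError("a kernel dimension has no argument mapped to it")
    return tuple(sizes)


# Default 1-to-1 mapping for two equally shaped arguments.
print(kernel_shape({"a": (10, 3, 4), "b": (10, 3, 4)}))
# Two 1D arguments mapped to different kernel dimensions.
print(kernel_shape({"a": (10,), "b": (20,)}, {"a": (0,), "b": (1,)}))
```

The sketch deliberately rejects size mismatches rather than padding, mirroring the rule that dimensions are not auto-extended.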

Mapping works with named parameters too, which can be a little clearer:

.. code-block:: python
# Assume the slang add function has signature add(float3 a, float3 b)
a = np.random.rand(10,3,4)
b = np.random.rand(10,3,4)
result = mymodule.add.map(a=(0,1,2), b=(0,1,2))(a=a,b=b, _result='numpy')
----

**Mapping arguments with different dimensionalities**

As we've already seen, SlangPy, unlike NumPy, deliberately doesn't auto-pad dimensions. When padding is desirable, explicit mapping can be used to tell SlangPy exactly how to map the dimensions of the smaller input to those of the overall kernel:

.. code-block:: python
a = np.random.rand(8,8)
b = np.random.rand(8)
# Fails in SlangPy, as b is not auto-extended
result = mymodule.add(a,b, _result='numpy')
# Works, as we are explicitly mapping
# This is equivalent to padding b with empty dimensions, as numpy would
# result[i,j] = a[i,j] + b[j]
result = mymodule.add.map(a=(0,1), b=(1,))(a=a,b=b, _result='numpy')
# The same thing (didn't need to specify a as 1-to-1 mapping is default)
result = mymodule.add.map(b=(1,))(a=a,b=b, _result='numpy')
# Also works, as we are explicitly mapping
# result[i,j] = a[i,j] + b[i]
result = mymodule.add.map(b=(0,))(a=a,b=b, _result='numpy')
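For comparison, the two mappings of ``b`` can be reproduced in plain NumPy by inserting the extra axis by hand. This is an analogy for the indexing only and does not call SlangPy.

```python
import numpy as np

a = np.random.rand(8, 8)
b = np.random.rand(8)

# Like b=(1,): b varies along kernel dimension 1.
# by_column[i, j] = a[i, j] + b[j]
by_column = a + b[np.newaxis, :]

# Like b=(0,): b varies along kernel dimension 0.
# by_row[i, j] = a[i, j] + b[i]
by_row = a + b[:, np.newaxis]

print(by_column.shape, by_row.shape)
```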
----

**Mapping arguments to different dimensions**

Another use case is broadcasting every element of one argument across every element of another. The simplest example is the mathematical outer product:

.. code-block:: python
# Assume the slang multiply function has signature multiply(float a, float b)
# a is mapped to dimension 0, giving kernel dimension [0] size 10
# b is mapped to dimension 1, giving kernel dimension [1] size 20
# overall kernel (and thus result) shape is (10,20)
# result[i,j] = a[i] * b[j]
a = np.random.rand(10)
b = np.random.rand(20)
result = mymodule.multiply.map(a=(0,), b=(1,))(a=a,b=b, _result='numpy')
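The expected result of this mapping can be cross-checked against NumPy, where the same outer product is written by placing ``a`` along axis 0 and ``b`` along axis 1. This is a NumPy-only sketch of the indexing, not SlangPy code.

```python
import numpy as np

a = np.random.rand(10)
b = np.random.rand(20)

# outer[i, j] = a[i] * b[j]
outer = a[:, np.newaxis] * b[np.newaxis, :]

print(outer.shape)
print(np.allclose(outer, np.outer(a, b)))
```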
----

**Mapping to re-order dimensions**

Similarly, dimension indices can be adjusted to re-order the dimensions of an argument. A trivial example, transposing a matrix (swapping rows and columns), would be:

.. code-block:: python
# Assume the slang copy function has signature float copy(float val)
# and just returns the value you pass it.
# result[i,j] = a[j,i]
a = np.random.rand(10,20)
result = mymodule.copy.map(val=(1,0))(val=a, _result='numpy')
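The index relationship ``result[i,j] = a[j,i]`` can be emulated with a NumPy-only loop, which also shows that the result has shape ``(20,10)``. This sketch illustrates the re-ordering and is not SlangPy code.

```python
import numpy as np

a = np.random.rand(10, 20)

# Argument dimension 0 (size 10) becomes kernel dimension 1,
# argument dimension 1 (size 20) becomes kernel dimension 0.
result = np.empty((20, 10))
for i, j in np.ndindex(result.shape):
    result[i, j] = a[j, i]

print(result.shape)
print(np.array_equal(result, a.T))
```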
----

**Mapping to resolve ambiguities**

In addition to performing more complex broadcasting, mapping can also be used to resolve ambiguities that would prevent SlangPy from vectorizing normally. For example, consider the following generic function (from the `nested` section):

.. code-block::
void copy_generic<T>(T src, out T dest)
{
dest = src;
}
One way to resolve the ambiguities is to map dimensions as follows:

.. code-block:: python
# Map argument types explicitly
src = np.random.rand(100)
dest = np.zeros_like(src)
module.copy_generic.map(src=(0,), dest=(0,))(
src=src,
dest=dest
)
Because both `src` and `dest` are mapped to 1 dimension, and both are 1D arrays of floats, SlangPy can infer that you want to pass `float` into `copy_generic` and generate the correct kernel.

20 changes: 8 additions & 12 deletions examples/broadcasting/broken_sampler.py
Expand Up @@ -15,7 +15,6 @@

# Load module
module = spy.Module.load_from_file(device, "example.slang")
"""

# Add 2 identically shaped 2d float buffers
a = np.random.rand(10, 5).astype(np.float32)
Expand Down Expand Up @@ -63,7 +62,7 @@
print("")

# Add a float3 and an array of 3 floats!
a = sgl.float3(1,2,3)
a = sgl.float3(1, 2, 3)
b = np.random.rand(3).astype(np.float32)
res = module.add_floats(a, b, _result='numpy')
print(f"A Shape: {a.shape}")
Expand All @@ -74,40 +73,38 @@
# Should get a shape mismatch error, as slangpy won't 'pad' dimensions
try:
a = np.random.rand(3).astype(np.float32)
b = np.random.rand(5,3).astype(np.float32)
b = np.random.rand(5, 3).astype(np.float32)
res = module.add_floats(a, b, _result='numpy')
except ValueError as e:
#print(e)
# print(e)
pass

# Now using add_vectors(float3, float3), no shape mismatch error
# Now using add_vectors(float3, float3), no shape mismatch error
# as a is treated as a single float3, and b is an array of 5 float3s,
# and SlangPy will auto-pad single values.
a = np.random.rand(3).astype(np.float32)
b = np.random.rand(5,3).astype(np.float32)
b = np.random.rand(5, 3).astype(np.float32)
res = module.add_vectors(a, b, _result='numpy')
print(f"A Shape: {a.shape}")
print(f"B Shape: {b.shape}")
print(f"Res Shape: {res.shape}")
print("")

"""

# Create a sampler and texture
sampler = device.create_sampler()
tex = device.create_texture(width=32, height=32, format=sgl.Format.rgb32_float,
usage=sgl.ResourceUsage.shader_resource)
tex.from_numpy(np.random.rand(32, 32, 3).astype(np.float32))
"""


# Sample the texture at a single UV coordinate. Results in 1 thread,
# as the uv coordinate input is a single float 2.
a = sgl.float2(0.5,0.5)
a = sgl.float2(0.5, 0.5)
res = module.sample_texture_at_uv(a, sampler, tex, _result='numpy')
print(f"A Shape: {a.shape}")
print(f"Res Shape: {res.shape}")
print(res)
"""

# Sample the texture at a single UV coordinate. Results in 1 thread,
# as the uv coordinate input is a single float 2.
ad = np.random.rand(20, 2).astype(np.float32)
Expand All @@ -117,4 +114,3 @@
res = module.sample_texture_at_uv(a, sampler, tex, _result='numpy')
print(f"A Shape: {a.shape}")
print(f"Res Shape: {res.shape}")
print(res)
2 changes: 1 addition & 1 deletion examples/broadcasting/example.slang
Expand Up @@ -13,5 +13,5 @@ float3 add_vectors(float3 a, float3 b)

float4 sample_texture_at_uv(float2 uv, SamplerState sampler, Texture2D<float4> texture)
{
return texture.Sample(sampler, uv);
return texture.SampleLevel(sampler, uv, 0);
}
32 changes: 13 additions & 19 deletions examples/broadcasting/main.py
Expand Up @@ -15,7 +15,6 @@

# Load module
module = spy.Module.load_from_file(device, "example.slang")
"""

# Add 2 identically shaped 2d float buffers
a = np.random.rand(10, 5).astype(np.float32)
Expand Down Expand Up @@ -63,7 +62,7 @@
print("")

# Add a float3 and an array of 3 floats!
a = sgl.float3(1,2,3)
a = sgl.float3(1, 2, 3)
b = np.random.rand(3).astype(np.float32)
res = module.add_floats(a, b, _result='numpy')
print(f"A Shape: {a.shape}")
Expand All @@ -74,47 +73,42 @@
# Should get a shape mismatch error, as slangpy won't 'pad' dimensions
try:
a = np.random.rand(3).astype(np.float32)
b = np.random.rand(5,3).astype(np.float32)
b = np.random.rand(5, 3).astype(np.float32)
res = module.add_floats(a, b, _result='numpy')
except ValueError as e:
#print(e)
# print(e)
pass

# Now using add_vectors(float3, float3), no shape mismatch error
# Now using add_vectors(float3, float3), no shape mismatch error
# as a is treated as a single float3, and b is an array of 5 float3s,
# and SlangPy will auto-pad single values.
a = np.random.rand(3).astype(np.float32)
b = np.random.rand(5,3).astype(np.float32)
b = np.random.rand(5, 3).astype(np.float32)
res = module.add_vectors(a, b, _result='numpy')
print(f"A Shape: {a.shape}")
print(f"B Shape: {b.shape}")
print(f"Res Shape: {res.shape}")
print("")

"""

# Create a sampler and texture
sampler = device.create_sampler()
tex = device.create_texture(width=32, height=32, format=sgl.Format.rgb32_float,
usage=sgl.ResourceUsage.shader_resource)
tex.from_numpy(np.random.rand(32, 32, 3).astype(np.float32))
"""

# Sample the texture at a single UV coordinate. Results in 1 thread,
# as the uv coordinate input is a single float 2.
a = sgl.float2(0.5,0.5)
a = sgl.float2(0.5, 0.5)
res = module.sample_texture_at_uv(a, sampler, tex, _result='numpy')
print(f"A Shape: {a.shape}")
print(f"Res Shape: {res.shape}")
print(res)
"""
# Sample the texture at a single UV coordinate. Results in 1 thread,
# as the uv coordinate input is a single float 2.
ad = np.random.rand(20, 2).astype(np.float32)
a = spy.NDBuffer(device, element_type=sgl.float2, shape=(20,))
a.from_numpy(ad)
print(a)

# Sample the texture at 20 UV coordinates. Results in 20 threads.
# Although the texture has shape [32,32,3] (32x32 pixels of float3s),
# in this case it acts as a single value, as it is being passed to
# a function that takes an [n,m,3] structure (a float3 texture). As a
# result, the texture is effectively *broadcast* to all threads.
a = np.random.rand(20, 2).astype(np.float32)
res = module.sample_texture_at_uv(a, sampler, tex, _result='numpy')
print(f"A Shape: {a.shape}")
print(f"Res Shape: {res.shape}")
print(res)
