
Dot product optimization pass #20

Open · jklontz opened this issue Nov 2, 2013 · 16 comments

@jklontz (Member) commented Nov 2, 2013

At the heart of many computer vision algorithms (subspace learning, deep learning, wavelets) is a dot product between an incoming image and a filter constructed offline. The idea is to introduce a suite of LLVM optimization passes that exploit the fact that the filter is known at compile time. Specifically:

  1. Completely unroll the dot product loop based on the known filter size.
  2. Substitute memory access instructions with the known constant values for the filter.
  3. Eliminate instructions where the filter value is 0 (or perhaps near-zero).

Together these passes convert a generic dense dot product into a hard-coded sparse dot product.
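To make the intended transformation concrete, here is a hand-written C sketch (the filter values are invented for illustration; the real passes would perform this rewrite on LLVM IR rather than on source code):

```c
/* Before: a generic dense dot product; the filter x is a runtime value. */
int dot(const int *y, const int *x, int len)
{
    int response = 0;
    for (int i = 0; i < len; i++)
        response += y[i] * x[i];
    return response;
}

/* After: what the passes would effectively produce for a filter known at
 * compile time to be {3, 0, 0, 1}. The loop is fully unrolled (pass 1),
 * the loads of x are replaced by constants (pass 2), and the zero-valued
 * taps are eliminated entirely (pass 3). */
int dot_3001(const int *y)
{
    return y[0] * 3 + y[3];
}
```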

As a stretch goal:

  1. Further optimizations that become possible when the dot product only needs to be approximated within a pre-specified margin of error (one possible scheme is sketched below).
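One possible realization (my assumption; nothing above prescribes a particular scheme) is to greedily zero out filter taps whose worst-case contribution fits within the remaining error budget, after which pass 3 removes them:

```c
#include <math.h>

/* Hypothetical pruning rule: dropping tap i changes the response by at
 * most |x[i]| * max|y|, so by the triangle inequality the total error
 * stays within the pre-specified margin eps as long as the sum of the
 * dropped taps' worst cases does. */
void prune(float *x, int len, float max_abs_y, float eps)
{
    float budget = eps;
    for (int i = 0; i < len; i++) {
        float worst_case = fabsf(x[i]) * max_abs_y;
        if (worst_case <= budget) {
            budget -= worst_case;
            x[i] = 0; /* pass 3 then eliminates this tap entirely */
        }
    }
}
```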

This is a long-term research idea and could make a good paper on its own. It is also an example of the kind of interesting idea that becomes possible with a computer vision DSL.

@jklontz (Member Author) commented Nov 12, 2013

@mburge I think this is the optimization we discussed yesterday.

@jklontz (Member Author) commented Mar 12, 2014

caotto: so basically you build a look-up-table for convolution with a filter at compile time?
[1:26pm] caotto: jklontz: or do I not get the idea
[1:27pm] jklontz: I would go further than a LUT
[1:27pm] caotto: elaborate?
[1:27pm] jklontz: and inline the LUT into the code
[1:27pm] jklontz: for example
[1:29pm] jklontz: in our current scheme say we learn a 2-dimensional filter, x, with values {1, 2}
[1:29pm] jklontz: conventionally we would apply that filter to another vector y like:
[1:30pm] jklontz: int response = 0; for (int i=0; i<x.len; i++) response += y[i] * x[i]
[1:30pm] jklontz: however, since we know x at compile time now
[1:30pm] jklontz: the compiler can simplify this to
[1:30pm] jklontz: int response = y[0] * 1 + y[1] * 2;
[1:31pm] jklontz: pretty powerful, no?
[1:32pm] jklontz: this then opens the door to conventional algebraic simplification
[1:32pm] caotto: so it's not a LUT based on input values, rather you are inlining the math for doing the convolution
[1:32pm] jklontz: removing the *1, eliminating any operations that are *0
[1:32pm] jklontz: yeah
[1:32pm] caotto: I see, so it's not a LUT at all
[1:33pm] jklontz: and the compiler can turn that dense multiplication into a sparse one automatically
[1:33pm] jklontz: yeah I guess LUT is not the right word
[1:33pm] caotto: Well that sounds pretty interesting.
[1:34pm] jklontz: I think it could be a real game changer for deep learning which involves learning a lot of filters
[1:35pm] caotto: so this is basically saying, we do the convolution in time domain, and just optimize the hell out of it
[1:36pm] caotto: have you considered how this compares/would apply to fft based techniques
[1:36pm] jklontz: I haven't
[1:36pm] jklontz: but I think it's relevant to anything where one or both of the operands is known at training time
[1:37pm] caotto: so for example I'm pretty sure OpenCV's filter engine will do fft, apply the filter in frequency domain, then inverse fft
[1:37pm] caotto: I guess you could just do the fft at compile time?
[1:37pm] jklontz: yup
[1:37pm] caotto: and do something similar based on that
[1:37pm] jklontz: precisely

@jklontz (Member Author) commented Jul 10, 2014

@bklare This is one of the optimizations that I think would be particularly cool for a deep learning neural net.

@bklare-zz commented

This is a pretty cool idea. I am still getting up to speed on CNNs. Do you know if the filters are the same at each layer? That is, the first layer is likely some form of Gabor filters, to which you could apply your hard-coded, sparse approach. I am wondering if the same holds at the subsequent layers as well. If not, the benefit may be lessened.

@jklontz (Member Author) commented Jul 10, 2014

I think the filters are different at each layer, but they could be equally sparse?

@JordanCheney (Contributor) commented

https://www.facebook.com/publications/546316888800776/

If we wanted to build a deep learning network this could be a pretty cool one :)

@bklare-zz commented

@jklontz Yes, I think they could be equally sparse. I guess my concern is that the selected filters for the subsequent layers would be unknown until after training.

@bklare-zz commented

@JordanCheney Yeah, that paper has gotten a lot of buzz. It was presented at CVPR a few weeks ago as well. It is worth studying if you haven't already done so.

@JordanCheney (Contributor) commented

I was excited talking about this yesterday, and I was wondering whether there are any thoughts yet about implementing convolution in likely. I think it could be done in the likely standard library as a function that takes a base matrix, a filter matrix, and a stride value. A dot product could then be computed between the filter and successive slices of the base matrix (is slicing implemented in likely?), with the filter "moving" across the base matrix according to the specified stride. My only question is whether implementing this in the standard library would take advantage of all of the optimizations discussed above.
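In C-like terms, I am imagining something like the following 1-D sketch (the names and the 1-D, single-channel simplification are mine; the Likely version would be written in the standard library's own syntax):

```c
/* Strided "valid" correlation: slide the filter across the base signal in
 * steps of `stride`, computing one dot product per position. With a
 * compile-time-constant filter, each inner loop is exactly the dot product
 * that the passes proposed in this issue would specialize. */
void correlate(const float *base, int base_len,
               const float *filter, int filter_len,
               int stride, float *out)
{
    for (int pos = 0; pos + filter_len <= base_len; pos += stride) {
        float response = 0;
        for (int i = 0; i < filter_len; i++)
            response += base[pos + i] * filter[i];
        out[pos / stride] = response;
    }
}
```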

@jklontz (Member Author) commented Jul 20, 2014

I completely agree that it should be implemented in the standard library, and I have been working on 47d06c0 as a precursor. The convolution function should take only the image and filter as arguments and infer the rest. The main feature currently missing is the ability to access a specified location in the image.

@jklontz (Member Author) commented Jul 20, 2014

To follow up on this, the most immediate next step is to extend kernelArgument::evaluateOperator to support arguments. For example:

```
image = (read "../data/misc/lenna.tiff") ; currently works
(image 0 128 64)                         ; doesn't work, should return the value at column = 0, channel = 128, row = 64
```

@JordanCheney (Contributor) commented

are the column and channel arguments switched here?

@JordanCheney (Contributor) commented

Quick question: kernelArgument::evaluateOperator takes a likely_ast as one argument. I thought the point of an ast was that it could represent an arbitrarily sized list of arguments, which is the desired outcome here. Is there a limitation that I am not understanding?

@jklontz (Member Author) commented Jul 20, 2014

Yeah, I switched column and channel in the example!

You should first confirm that ast->is_list, and then you can access the parameters as ast->atoms[0] through ast->atoms[ast->num_atoms-1].

Note that in the example above, ast->atoms[0] == "image" and ast->atoms[1] == 0 and ast->num_atoms == 4. This is because the ast is isomorphic with the source code s-expression: (image 0 128 64).
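In other words, the argument handling might look roughly like this (a sketch only: it uses just the fields named above plus the corrected channel/column order, and everything else is hypothetical rather than existing API):

```c
/* Sketch of handling (image 0 128 64) inside kernelArgument::evaluateOperator. */
if (ast->is_list && ast->num_atoms == 4) {
    likely_ast matrix  = ast->atoms[0]; /* "image" */
    likely_ast channel = ast->atoms[1]; /* 0   */
    likely_ast column  = ast->atoms[2]; /* 128 */
    likely_ast row     = ast->atoms[3]; /* 64  */
    /* ... evaluate each index atom and emit the indexed load ... */
}
```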

@JordanCheney (Contributor) commented

A quick sanity and understanding check for me: my understanding is that likely functions as a stack of environments, each of which has certain operations registered in some type of LUT. The root of this stack holds all of the likely_standard_library functions. First, is this correct? And if so, does creating a new likely_matrix automatically register it in the current environment's LUT so that it can be processed later on? This matters here because "image" is a likely_matrix, not an operation, so I wanted to check whether ast->atoms[0] can be looked up using the same lookup() function or whether a different method must be used.

@jklontz (Member Author) commented Jul 23, 2014

The first point is correct, but as you've noticed, it's more of a "lookup linked list" than a "lookup table". The second point about being able to lookup an image will be true in a few days, as I still have a few more patches that need to land first. Right now if you look up image you will get back (read "../data/misc/lenna.tiff") unevaluated.
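For what it's worth, the "lookup linked list" can be pictured like this (a generic illustration only, not likely's actual types):

```c
#include <stddef.h>
#include <string.h>

/* Each definition is one node; lookup walks from the innermost binding
 * back toward the root, so inner definitions shadow outer ones. */
typedef struct Environment {
    const char *name;
    void *value;
    struct Environment *parent;
} Environment;

void *lookup(const Environment *env, const char *name)
{
    for (; env != NULL; env = env->parent)
        if (strcmp(env->name, name) == 0)
            return env->value;
    return NULL;
}
```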
