Dot product optimization pass #20
@mburge I think this is the optimization we discussed yesterday.
caotto: So basically you build a look-up table for convolution with a filter at compile time?
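For concreteness, here is one way such a compile-time look-up table could work, as a hypothetical C++17 sketch (the filter values and names are made up, and this is only one reading of the idea, not necessarily what is planned for likely): the per-tap products for every possible 8-bit pixel value are precomputed, so the convolution inner loop becomes lookups and additions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 3-tap filter, fixed at compile time.
constexpr std::array<int, 3> kFilter = {1, -2, 1};

// Precompute tap * pixel for every possible 8-bit pixel value (C++17),
// so the convolution inner loop becomes lookups and adds.
constexpr auto makeTables() {
    std::array<std::array<int, 256>, kFilter.size()> t{};
    for (std::size_t k = 0; k < kFilter.size(); ++k)
        for (int v = 0; v < 256; ++v)
            t[k][v] = kFilter[k] * v;
    return t;
}
constexpr auto kTables = makeTables();

// 1-D "valid" convolution using the precomputed tables.
void convolve(const std::uint8_t *src, int *dst, std::size_t n) {
    for (std::size_t i = 0; i + kFilter.size() <= n; ++i) {
        int acc = 0;
        for (std::size_t k = 0; k < kFilter.size(); ++k)
            acc += kTables[k][src[i + k]];
        dst[i] = acc;
    }
}
```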
@bklare This is one of the optimizations that I think would be particularly cool for a deep learning neural net.
This is a pretty cool idea. I am still getting up to speed on CNNs. Do you know if the filters are the same at each layer? That is, the first layer is likely some form of Gabor filters, which you could use your hard-coded and sparse approach for. I am wondering if the same is true at the subsequent layers as well. If not, the benefit may be lessened.
I think the filters are different at each layer, but they could be equally sparse?
https://www.facebook.com/publications/546316888800776/ If we wanted to build a deep learning network, this could be a pretty cool one :)
@jklontz Yes, I think they could be equally sparse. I guess my concern is that the selected filters for the subsequent layers would be unknown until after training.
@JordanCheney Yeah, that paper has gotten a lot of buzz. It was presented at CVPR a few weeks ago as well. It is worth studying if you haven't already done so.
I was excited talking about this yesterday and was wondering: are there any thoughts about implementing convolution in likely yet? I think it could be done in the likely standard library as a function that takes a base matrix, a filter matrix, and a stride value. A dot product can then be computed for slices of the base matrix (is slicing implemented in likely?) and the filter. The filter would "move" across the base matrix in accordance with the specified stride value. My only question is: would implementing this in the standard library take advantage of all of the optimizations discussed above?
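Roughly the semantics I have in mind, as a plain C++ sketch (the Matrix struct, row-major layout, and names are placeholders for illustration, not likely syntax):

```cpp
#include <cstddef>
#include <vector>

// Minimal dense matrix placeholder (row-major), just for illustration.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
    float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// "Valid" 2-D convolution: slide the filter over the base matrix with the
// given stride and take a dot product of the filter against each slice.
// Assumes the filter fits entirely within the base matrix.
Matrix convolve(const Matrix &base, const Matrix &filter, std::size_t stride) {
    Matrix out{(base.rows - filter.rows) / stride + 1,
               (base.cols - filter.cols) / stride + 1, {}};
    out.data.resize(out.rows * out.cols);
    for (std::size_t r = 0; r < out.rows; ++r)
        for (std::size_t c = 0; c < out.cols; ++c) {
            float acc = 0;
            for (std::size_t i = 0; i < filter.rows; ++i)      // dot product of the
                for (std::size_t j = 0; j < filter.cols; ++j)  // slice with the filter
                    acc += base.at(r * stride + i, c * stride + j) * filter.at(i, j);
            out.data[r * out.cols + c] = acc;
        }
    return out;
}
```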
I completely agree that it should be implemented in the standard library, and I have been working on 47d06c0 as a precursor. The convolution function should only take the image and filter as arguments and be able to infer the rest. The main feature currently lacking for this is being able to specify where in the image you want to access.
To follow up on this, the most immediate next step is to extend ...
Are the column and channel arguments switched here?
Quick question: kernelArgument::evaluateOperator takes a likely_ast as one argument. I thought the purpose of an AST was that it could hold an arbitrarily sized list of arguments, which is the desired outcome here. Is there a limitation here that I am not understanding?
Yeah, I switched column and channel in the example! You should first confirm that ... Note that in the example above, ...
Quick sanity and understanding check for me: my understanding right now is that likely functions as a stack of environments, each of which has certain operations registered in some type of LUT. The root of this stack has all of the likely_standard_library functions in it. First, is this correct, and, if so, does creating a new likely_matrix automatically register it in the current environment's LUT so that it can be processed later on? The second point ties into this because "image" is a likely_matrix, not an operation, so I wanted to check whether ast->atoms[0] can be looked up using the same lookup() function or whether a different method must be used.
The first point is correct, but as you've noticed, it's more of a "lookup linked list" than a "lookup table". The second point, about being able to look up an image, will be true in a few days, as I still have a few more patches that need to land first. Right now if you look up ...
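To illustrate the "lookup linked list" idea with a toy sketch (the Env struct and field names here are made up for illustration, not likely's real data structures): each environment holds a binding and a pointer to its parent, and lookup walks toward the root, where the standard library definitions would live.

```cpp
#include <string>

// Toy environment: a linked list of (name, value) bindings.
// lookup() walks parent pointers toward the root environment.
struct Env {
    const Env *parent;   // enclosing environment (nullptr at the root)
    std::string name;    // name bound in this environment
    int value;           // stand-in for an operator or matrix definition

    const int *lookup(const std::string &key) const {
        for (const Env *e = this; e; e = e->parent)
            if (e->name == key)
                return &e->value;
        return nullptr;  // not found anywhere up the chain
    }
};

// Usage sketch:
//   Env root{nullptr, "+", 0};      // standard library binding at the root
//   Env child{&root, "image", 42};  // a matrix bound in a child environment
//   child.lookup("+");              // found by walking up to the root
```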
At the heart of many computer vision algorithms (subspace learning, deep learning, wavelets) is a dot product of an incoming image against a filter constructed offline. The idea is to introduce a suite of LLVM optimization passes that leverage the fact that the filter is known at compile time. Specifically: ...
Together, these passes convert between a generic dense dot product and a hard-coded sparse dot product.
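As a rough illustration of the before/after (a hand-written C++ sketch of what such a transformation could amount to, not actual pass output; the filter values are made up):

```cpp
// Generic dense dot product: the filter is a runtime argument.
float dotDense(const float *x, const float *w, int n) {
    float acc = 0;
    for (int i = 0; i < n; ++i)
        acc += x[i] * w[i];
    return acc;
}

// Specialized form for a filter known at compile time, e.g.
// w = {0, 0, 3, 0, -1, 0, 0, 2}: the loop is unrolled and the
// zero coefficients disappear entirely.
float dotSparseHardCoded(const float *x) {
    return 3.0f * x[2] - 1.0f * x[4] + 2.0f * x[7];
}
```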
As a stretch goal: ...
This is a long-term research idea and a good paper on its own. It is also an example of an interesting idea that becomes possible with a computer vision DSL.