Decide How to design the Executor #3

tim-becker · 2015-03-28T19:39:01Z

Here is a proposal for how our project should work based on our discussions so far:

The user navigates to a starting clnum, and selects memory and registers that they want to be symbolic. Additionally, they specify a desired outcome state that contains constraints, for example

%eip = 0x80486B3
%eax = 0
*0x0804A024 = 0

We will create an initial state using the starting clnum and symbolic values, and begin executing from the current PC. After each instruction, we will check the desired outcome with a solver. If it is satisfiable, the solver's model gives us values for the initial symbolic state. Note that performance here should not be a huge concern, as most of these checks should fail instantly due to an incorrect concrete value (e.g. %eip != 0x80486B3).
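To make the "most checks fail instantly" point concrete, here is a minimal sketch of a pre-filter that only hands a state to the solver when no concrete value already contradicts the goal. The `Sym` class, the `needs_solver` helper, and the `"mem:..."` key format are hypothetical names for illustration, not part of any existing code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """Stand-in for a symbolic value (hypothetical; a real executor would
    store a solver expression here instead of just a name)."""
    name: str


def needs_solver(state, goal):
    """Return False as soon as a concrete value contradicts the goal.

    Return True when every concrete location matches, meaning the remaining
    symbolic locations still have to be checked by the solver.
    """
    for loc, want in goal.items():
        have = state.get(loc)
        if isinstance(have, Sym):
            continue  # symbolic: cannot decide without the solver
        if have != want:
            return False  # concrete mismatch: reject without a query
    return True


goal = {"eip": 0x80486B3, "eax": 0, "mem:0x0804A024": 0}
state = {"eip": 0x8048690, "eax": Sym("eax0"), "mem:0x0804A024": 0}

print(needs_solver(state, goal))  # False: eip is concrete and wrong
state["eip"] = 0x80486B3
print(needs_solver(state, goal))  # True: only eax is left for the solver
```

Since %eip is always concrete in this design, nearly every intermediate state would be rejected by the first comparison.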

Along the execution, if we reach a conditional that depends on a symbolic value, we should split into two separate states (one for each outcome), and continue executing on each. If there are a lot of symbolic comparisons, this will likely be very expensive, but let's hope that the abundance of concrete values will limit this growth.
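One possible shape for the split, as a rough sketch: each state carries its own copy of the registers plus the list of branch conditions taken so far, and a symbolic conditional produces two children. The `State` class, its fields, and the string-encoded conditions are all assumptions made for illustration; a real executor would store solver expressions.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    pc: int
    regs: dict = field(default_factory=dict)
    # Branch conditions accumulated along this path, kept as strings here
    # purely for illustration.
    path: list = field(default_factory=list)

    def fork(self, cond, taken_pc, fallthrough_pc):
        """Split on a symbolic condition into (taken, not_taken) states.

        Registers are copied so the two children cannot affect each other.
        """
        taken = State(taken_pc, dict(self.regs), self.path + [cond])
        not_taken = State(
            fallthrough_pc, dict(self.regs), self.path + [f"not ({cond})"]
        )
        return taken, not_taken


parent = State(0x8048690, {"eax": "sym_eax"})
taken, not_taken = parent.fork("sym_eax == 0", 0x80486B3, 0x8048695)
print(hex(taken.pc), taken.path)
print(hex(not_taken.pc), not_taken.path)
```

Copying the whole register map on every fork is the simplest option but not the cheapest; sharing unchanged data between children would be a later optimization.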

Does this about summarize our discussion so far?

Also, here are some further questions to discuss:

What is the right way to represent this "state-splitting" that we'll do for symbolic comparisons?
When we split into two branches on a symbolic comparison, should we continue depth-first or breadth-first?
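For the depth-first versus breadth-first question, a single worklist can support both, so the choice can be deferred and measured. The sketch below uses a toy "program" with two levels of symbolic branches; the `explore` and `children` functions are hypothetical, and states are just strings recording the branch outcomes taken so far.

```python
from collections import deque


def explore(root, children, strategy="dfs"):
    """Visit every state reachable from root.

    Returns (visit order, peak number of states waiting in the worklist).
    DFS takes the newest state first, BFS takes the oldest.
    """
    work = deque([root])
    order = []
    peak = len(work)
    while work:
        state = work.pop() if strategy == "dfs" else work.popleft()
        order.append(state)
        work.extend(children(state))
        peak = max(peak, len(work))
    return order, peak


def children(path):
    # Toy program: every state forks on a symbolic branch until depth 2.
    return [path + "T", path + "F"] if len(path) < 2 else []


print(explore("", children, "dfs"))
print(explore("", children, "bfs"))
```

On this tiny example the depth-first worklist peaks at 3 pending states and the breadth-first one at 4; the gap widens as the number of symbolic branches grows, since breadth-first holds a whole level of states at once.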


cganas · 2015-03-28T19:52:37Z

Yeah, this seems like a good summary. It may be useful if we serialize how we are going to deal with redefinition of memory and SSA.

As for splitting into branches on a symbolic comparison, we may have to do some testing to determine the most responsive search strategy. When we are working with z3 we have no way of estimating or guaranteeing that a given model will be found in a reasonable amount of time. It seems to me that we should do DFS, which will greatly reduce the risk of exponential space blowup in the executor, with a timeout set on the constraint query.

It would be interesting to have an algorithm that adapts to how long the SAT queries take. Initially we set a strict timeout on z3, and every query that times out has its state serialized and put into a backup reserve. If we determine that no query solves the SAT problem within our timeout, or that many of them do not, then we increase the timeout and bring back the longer queries.

This has the interesting property that we may be able to find the "fastest" assignment that satisfies the problem.

Edit: It would also be useful to take advantage of incremental solving so we can avoid calculating a large expression that depends on some unsat formula being sat. For example, before we convert to SSA to deal with redefinitions, or split the state to take branches, we can hopefully short-circuit if the formula so far is unsat.

dbrumley · 2015-03-29T13:37:57Z

It's not clear how you will connect the values in the goal state with the input state. For example, if you state that you want zf == 0 to take a jump at a destination, how do you know what part of the input influences zf at that point?

As an alternate approach, SAGE takes the backwards slice at all decision points, calculates the formula over that, and then solves. I would consider doing a backward slice from the desired point.
In particular, if it were me I would start out just getting a single query working as part of a backward slice (I'm guessing that's most useful for crackmes) before worrying too much about a path selection strategy.
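As a rough illustration of the backward-slice idea (not SAGE's actual implementation), the sketch below walks a recorded trace from the end and keeps only the instructions that the target value depends on. The trace encoding as `(dest, [sources])` pairs, the `backward_slice` function, and the `input0`/`input1` names are assumptions made for this example.

```python
def backward_slice(trace, target):
    """Return (kept instruction indices, inputs the target depends on).

    trace is a list of (dest, [sources]) in execution order. Walking
    backwards, an instruction is kept if it defines something still wanted;
    its sources then become wanted in its place.
    """
    wanted = {target}
    kept = []
    for i in range(len(trace) - 1, -1, -1):
        dest, sources = trace[i]
        if dest in wanted:
            wanted.discard(dest)
            wanted.update(sources)
            kept.append(i)
    kept.reverse()
    return kept, wanted


trace = [
    ("eax", ["input0"]),  # 0: load first input byte into eax
    ("ebx", ["input1"]),  # 1: load second input byte into ebx
    ("ecx", ["ebx"]),     # 2: unrelated computation
    ("eax", ["eax"]),     # 3: e.g. xor eax with a constant
    ("zf", ["eax"]),      # 4: compare eax with a constant, sets zf
]

print(backward_slice(trace, "zf"))  # ([0, 3, 4], {'input0'})
```

Whatever is still wanted when the walk reaches the start of the trace is the part of the input that influences the target, which is the connection between goal state and input state asked about above. Only the kept instructions need to be turned into a formula for the single query.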

Re: exponential blowup in the size of the formula: you can either do substitution and get blowup or convert to SSA and not get blowup. Both are actually reasonable choices. I don't get what you mean by DFS in this context, though.
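To put numbers on the substitution-versus-SSA trade-off, consider a hypothetical chain of n instructions where each value is the previous one added to itself (x1 = x0 + x0, x2 = x1 + x1, and so on). The sketch counts expression nodes in both encodings; the function names are made up for illustration.

```python
def substituted_size(n):
    """Node count of the formula for x_n when every definition is inlined."""
    size = 1  # x_0 is a single leaf
    for _ in range(n):
        # Two full copies of the previous expression plus one '+' node.
        size = 2 * size + 1
    return size


def ssa_size(n):
    """Total node count when each step stays a separate definition.

    Each definition x_{i+1} = x_i + x_i costs one '+' node and two
    references to the previous name.
    """
    return 3 * n


for n in (1, 5, 10, 20):
    print(n, substituted_size(n), ssa_size(n))
```

For n = 10 the inlined formula has 2047 nodes against 30 for the SSA form. Real traces rarely reuse a value this aggressively, which is part of why substitution can still be a reasonable choice.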

For Z3 performance: most queries should come back relatively quickly (<30s) or not at all (>5min) in my experience. If you end up with a bunch of queries that take minutes, then you should queue them up and use an exponential bucket backoff system to prevent blocking. Put all queries in bucket 1, which has a 30s timeout; stop anything not solved and move it to bucket 2. Bucket 2 is 1m, repeat. Bucket 3 is 2m, repeat. Bucket 4 is 4m, repeat, and so on. This will prevent blocking and be within a factor of two of hindsight optimal.
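The bucket scheme can be sketched as a small simulation. No real solver is called here: `solve_times` is a hypothetical table of how long each query would need (`None` meaning it never finishes), and the sketch assumes each retry restarts the query from scratch with the larger timeout.

```python
def run_buckets(solve_times, first_timeout=30.0, max_buckets=6):
    """Simulate exponential bucket backoff over a set of queries.

    Returns (solved, unsolved): solved maps query name to the 1-based bucket
    in which it finished; unsolved lists the queries still pending after the
    last bucket.
    """
    pending = list(solve_times)
    solved = {}
    timeout = first_timeout
    for bucket in range(1, max_buckets + 1):
        still_pending = []
        for query in pending:
            need = solve_times[query]
            if need is not None and need <= timeout:
                solved[query] = bucket
            else:
                still_pending.append(query)  # stopped at timeout, retried later
        pending = still_pending
        if not pending:
            break
        timeout *= 2  # 30s, 1m, 2m, 4m, ...
    return solved, pending


times = {"fast": 5, "medium": 45, "slow": 200, "hopeless": None}
print(run_buckets(times))
# ({'fast': 1, 'medium': 2, 'slow': 4}, ['hopeless'])
```

Because the timeouts double, the total time spent on a query before the bucket that solves it is less than that bucket's own timeout, which is where the factor-of-two bound comes from.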

cganas mentioned this issue Apr 20, 2015

Concolic Executor UI Discussion #5
