Funding
Australian citizens who undertake this project may be eligible for generous project funding. Please contact us to find out more.
Context
This project is one stream in a larger effort that seeks to leverage high-performance computing to drastically improve the runtime efficiency and scalability of Boolean SAT(isfiability) solving, in particular, SAT solving for transition system models. We are guided by bottlenecks in SAT solving for the following application domains:
- Program analysis via symbolic execution; e.g. [1]
- The cryptanalysis of hash functions; e.g. [2]
- Analysis of security properties, and in particular privacy properties; e.g. [3]
We seek to advance SAT solving system technology, with the concrete objectives being to create a SAT system that:
1. Is convincingly faster (walltime) compared to existing monolithic serial systems – i.e., ideally we show an exponential separation, however at the very least a polynomial speedup factor.
2. Scales gracefully to problems where the given CNF representation is very large – e.g., billions of clauses.
Our application focus shall be domains where the underlying decision problems are most naturally modelled as state transition systems. We expect to achieve our objectives through careful scientific investigation of high-performance computing technologies for the efficient and targeted discovery and communication of information between the processing elements in a distributed search.
Background
There are two paradigms for solving Boolean SAT problems using serial processing: (i) stochastic local search (SLS), or otherwise (ii) a backtracking search, such as conflict-driven clause learning (CDCL). Both have their merits, but the latter is best suited, in isolation, to solving transition system problems. These techniques have yet to be successfully coupled in a flexible SAT-solving framework.
Parallel computation has found a number of uses in solving SAT problems. Parallelism can be exploited in: (i) the tuning of algorithm configurations, (ii) search-space splitting, and (iii) portfolio approaches. Identifying and sharing good constraints is key to efficient parallelism. Bottlenecks have been identified
in the structure of resolution refutations which place hard limits on parallelism in search-space splitting for CDCL solvers. [4] Finally, once we start solving bounded model checking (BMC) problems--as we expect to do in this project--parallelization is raised as a topic when considering the search horizon at which to pose queries.
A fundamental process in any state-of-the-art symbolic execution tool is the translation/compilation of queries to decision problems, and then a call to a satisfiability/decision procedure which is used to resolve the query. Mayhem and angr use the SAT (Modulo Theory) system Z3. S2E supports a number of solvers, including Z3, that is to our knowledge used most frequently for industry-strength solving. The open-source (MIT License) Z3 tool itself implements a bespoke serial Boolean satisfiability procedure to solve hard decision problems. The focus of this project shall be on developing efficient parallelization strategies, for variable selection, value selection, and constraint propagation in the decision engine that underlies an effective symbolic execution toolchain.