Skip to content

Commit

Permalink
Merge pull request #76 from freespek/igor/alloy-exp67
Browse files Browse the repository at this point in the history
Two more Alloy experiments
  • Loading branch information
konnov authored Nov 1, 2024
2 parents d577779 + ae4440c commit eeba7fd
Show file tree
Hide file tree
Showing 7 changed files with 55 additions and 23 deletions.
27 changes: 17 additions & 10 deletions EXPERIMENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -362,9 +362,17 @@ This experiment takes about 37 minutes on a single node. The interesting thing
is that we can parallelize the enumeration of 100 runs across multiple cores.
For instance, we use 20 CPU cores to check 5 symbolic runs each.

## 6. Induction checking Spec 4
## 8. Spec3b-smt: Checking Accountable Safety with SMT

### 6.1. Inductiveness check
See the detailed experiments in [README.md](./spec3b-smt/README.md).

## 7. Spec3c-alloy: Checking Accountable Safety with Alloy

See the detailed experiments in [README.md](./spec3c-alloy/README.md).

## 8. Induction checking Spec 4

### 8.1. Inductiveness check

In this experiment, we show that `IndInv` is an inductive invariant for `Spec 4`:

Expand All @@ -384,7 +392,7 @@ $ apalache-mc check --length=1 --inv=IndInv \

This experiment takes about 2 seconds.

### 6.2. Inductive checking of Accountable Safety
### 8.2. Inductive checking of Accountable Safety

```sh
$ cd ./spec4
Expand All @@ -395,9 +403,9 @@ $ JVM_ARGS=-Xmx20G apalache-mc check \

This experiment took 19 hours 48 min 29 sec.

## 7. Induction checking with Spec 4b
## 9. Induction checking with Spec 4b

### 7.1. Checking Accountable Safety for M3
### 9.1. Checking Accountable Safety for M3

```sh
$ apalache-mc check --length=0 --init=IndInit_C1 \
Expand All @@ -406,7 +414,7 @@ $ apalache-mc check --length=0 --init=IndInit_C1 \

This experiment took XXX sec.

### 7.2. Checking Accountable Safety for M4a
### 9.2. Checking Accountable Safety for M4a

```sh
$ apalache-mc check --length=0 --init=IndInit_C4 \
Expand All @@ -415,7 +423,7 @@ $ apalache-mc check --length=0 --init=IndInit_C4 \

This experiment took XXX sec.

### 7.3. Checking Accountable Safety for M4b
### 9.3. Checking Accountable Safety for M4b

```sh
$ apalache-mc check --length=0 --init=IndInit_C2 \
Expand All @@ -424,7 +432,7 @@ $ apalache-mc check --length=0 --init=IndInit_C2 \

This experiment took XXX sec.

### 7.4. Checking Accountable Safety for M5a
### 9.4. Checking Accountable Safety for M5a

```sh
$ apalache-mc check --length=0 --init=IndInit_C1 \
Expand All @@ -433,7 +441,7 @@ $ apalache-mc check --length=0 --init=IndInit_C1 \

This experiment took XXX sec.

### 7.5. Checking Accountable Safety for M5b
### 9.5. Checking Accountable Safety for M5b

```sh
$ apalache-mc check --length=0 --init=IndInit_C3 \
Expand All @@ -442,7 +450,6 @@ $ apalache-mc check --length=0 --init=IndInit_C3 \

This experiment took XXX sec.


[Apalache]: https://apalache-mc.org/
[Apalache installation]: https://apalache-mc.org/docs/apalache/installation/index.html
[Apalache releases]: https://github.com/apalache-mc/apalache/releases
Expand Down
4 changes: 3 additions & 1 deletion reports/spec3-alloy.tex
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,9 @@ \subsection{Model Checking Results}\label{sec:alloy-results}
\\
8 & 5 & 7 & 4 & 6 & 24 & 1h 27m & 156M
\\
9 & --- & --- & --- & --- & --- & --- & ---
9 & 5 & 10 & 4 & 8 & 24 & over 8 days (timeout) & 198 MB
\\
9 & 5 & 10 & 4 & 8 & 32 & over 8 days (timeout) & 220 MB
\\
10 & 3 & 15 & 4 & 5 & 12 & 31s & 56M
\\
Expand Down
33 changes: 21 additions & 12 deletions spec3c-alloy/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,24 +50,26 @@ conduct experiments for the same kinds of inputs:

| Input | #blocks | #checkpoints | #signatures | #ffg_votes | #votes | runtime | memory |
|------------|--------:|-------------:|------------:|-----------:|-------:|---------:|--------:|
| [ffg-exp1] | 3 | 5 | 4 | 5 | 12 | 4 sec | 35 MB |
| [ffg-exp2] | 4 | 5 | 4 | 5 | 12 | 10 sec | 40 MB |
| [ffg-exp3] | 5 | 5 | 4 | 5 | 12 | 15 sec | 45 MB |
| [ffg-exp4] | 3 | 6 | 4 | 6 | 15 | 57 sec | 52 MB |
| [ffg-exp5] | 4 | 6 | 4 | 6 | 15 | 167 sec | 55 MB |
| [ffg-exp6] | 5 | 6 | 4 | 6 | 15 | 245 sec | 57 MB |
| [ffg-exp7] | 6 | 6 | 4 | 6 | 15 | 360 sec | 82 MB |
| [ffg-exp8] | 5 | 7 | 4 | 6 | 24 | 1h 27m | 156 MB |
| [ffg-exp1][] | 3 | 5 | 4 | 5 | 12 | 4 sec | 35 MB |
| [ffg-exp2][] | 4 | 5 | 4 | 5 | 12 | 10 sec | 40 MB |
| [ffg-exp3][] | 5 | 5 | 4 | 5 | 12 | 15 sec | 45 MB |
| [ffg-exp4][] | 3 | 6 | 4 | 6 | 15 | 57 sec | 52 MB |
| [ffg-exp5][] | 4 | 6 | 4 | 6 | 15 | 167 sec | 55 MB |
| [ffg-exp6][] | 5 | 6 | 4 | 6 | 15 | 245 sec | 57 MB |
| [ffg-exp7][] | 6 | 6 | 4 | 6 | 15 | 360 sec | 82 MB |
| [ffg-exp8][] | 5 | 7 | 4 | 6 | 24 | 1h 27m | 156 MB |
| [ffg-exp9][] | 5 | 10 | 4 | 8 | 24 | >8 days (timeout) | 198 MB |
| [ffg-exp9a][] | 5 | 10 | 4 | 8 | 32 | >8 days (timeout) | 220 MB |

In addition to the above experiments, we ran a few experiments that have
inputs comparable in size to those produced by our TLA+ specification:

| Input | #blocks | #checkpoints | #signatures | #ffg_votes | #votes | runtime | memory |
|-------------|--------:|-------------:|------------:|-----------:|-------:|--------:|--------:|
| [ffg-exp10] | 3 | 15 | 4 | 5 | 12 | 31 sec | 56 MB |
| [ffg-exp11] | 4 | 20 | 4 | 5 | 12 | 152 sec | 94 MB |
| [ffg-exp12] | 5 | 25 | 4 | 5 | 12 | 234 sec | 117 MB |
| [ffg-exp13] | 7 | 15 | 4 | 10 | 40 | >10 days | 300 MB |
| [ffg-exp10][] | 3 | 15 | 4 | 5 | 12 | 31 sec | 56 MB |
| [ffg-exp11][] | 4 | 20 | 4 | 5 | 12 | 152 sec | 94 MB |
| [ffg-exp12][] | 5 | 25 | 4 | 5 | 12 | 234 sec | 117 MB |
| [ffg-exp13][] | 7 | 15 | 4 | 10 | 40 | >16 days (timeout) | 300 MB |

As we can see, the running times increase dramatically, when we increase the
maximum number of FFG votes and votes. While we do not have a precise
Expand All @@ -78,6 +80,11 @@ gives us approximately 9 justified checkpoints. Since justified checkpoints may
refer to on another (via FFG votes), 9 checkpoints may build longer
justification chains.

The experiments [ffg-exp9][], [ffg-exp9a][], and [ffg-exp13][] push the boundary
beyond the minimal interesting small instances. In all of them the SAT solver
did not terminated. Interestingly, in case of [ffg-exp9][] and [ffg-exp9a][],
Kissat was solving the remaining 3% for multiple days.

<!-- References -->

[Alloy]: https://alloytools.org/
Expand All @@ -91,6 +98,8 @@ justification chains.
[ffg-exp6]: ./ffg-exp6.als
[ffg-exp7]: ./ffg-exp7.als
[ffg-exp8]: ./ffg-exp8.als
[ffg-exp9]: ./ffg-exp9.als
[ffg-exp9a]: ./ffg-exp9a.als
[ffg-exp10]: ./ffg-exp10.als
[ffg-exp11]: ./ffg-exp11.als
[ffg-exp12]: ./ffg-exp12.als
Expand Down
7 changes: 7 additions & 0 deletions spec3c-alloy/ffg-exp9.als
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
module abs/ffg_exp8
-- experiment 8

open abs/ffg

run noAccountableSafety for 10 but 5 Block, 7 Checkpoint, 4 Signature, 6 FfgVote, 24 Vote, 5 int

Binary file added spec3c-alloy/ffg-exp9.cnf.gz
Binary file not shown.
7 changes: 7 additions & 0 deletions spec3c-alloy/ffg-exp9a.als
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
module abs/ffg_exp9a
-- experiment 9a

open abs/ffg

run noAccountableSafety for 10 but 5 Block, 10 Checkpoint, 4 Signature, 8 FfgVote, 32 Vote, 5 int

Binary file added spec3c-alloy/ffg-exp9a.cnf.gz
Binary file not shown.

0 comments on commit eeba7fd

Please sign in to comment.