diff --git a/EXPERIMENTS.md b/EXPERIMENTS.md index 51f5013..2941402 100644 --- a/EXPERIMENTS.md +++ b/EXPERIMENTS.md @@ -362,9 +362,17 @@ This experiment takes about 37 minutes on a single node. The interesting thing is that we can parallelize the enumeration of 100 runs across multiple cores. For instance, we use 20 CPU cores to check 5 symbolic runs each. -## 6. Induction checking Spec 4 +## 8. Spec3b-smt: Checking Accountable Safety with SMT -### 6.1. Inductiveness check +See the detailed experiments in [README.md](./spec3b-smt/README.md). + +## 7. Spec3c-alloy: Checking Accountable Safety with Alloy + +See the detailed experiments in [README.md](./spec3c-alloy/README.md). + +## 8. Induction checking Spec 4 + +### 8.1. Inductiveness check In this experiment, we show that `IndInv` is an inductive invariant for `Spec 4`: @@ -384,7 +392,7 @@ $ apalache-mc check --length=1 --inv=IndInv \ This experiment takes about 2 seconds. -### 6.2. Inductive checking of Accountable Safety +### 8.2. Inductive checking of Accountable Safety ```sh $ cd ./spec4 @@ -395,9 +403,9 @@ $ JVM_ARGS=-Xmx20G apalache-mc check \ This experiment took 19 hours 48 min 29 sec. -## 7. Induction checking with Spec 4b +## 9. Induction checking with Spec 4b -### 7.1. Checking Accountable Safety for M3 +### 9.1. Checking Accountable Safety for M3 ```sh $ apalache-mc check --length=0 --init=IndInit_C1 \ @@ -406,7 +414,7 @@ $ apalache-mc check --length=0 --init=IndInit_C1 \ This experiment took XXX sec. -### 7.2. Checking Accountable Safety for M4a +### 9.2. Checking Accountable Safety for M4a ```sh $ apalache-mc check --length=0 --init=IndInit_C4 \ @@ -415,7 +423,7 @@ $ apalache-mc check --length=0 --init=IndInit_C4 \ This experiment took XXX sec. -### 7.3. Checking Accountable Safety for M4b +### 9.3. Checking Accountable Safety for M4b ```sh $ apalache-mc check --length=0 --init=IndInit_C2 \ @@ -424,7 +432,7 @@ $ apalache-mc check --length=0 --init=IndInit_C2 \ This experiment took XXX sec. -### 7.4. Checking Accountable Safety for M5a +### 9.4. Checking Accountable Safety for M5a ```sh $ apalache-mc check --length=0 --init=IndInit_C1 \ @@ -433,7 +441,7 @@ $ apalache-mc check --length=0 --init=IndInit_C1 \ This experiment took XXX sec. -### 7.5. Checking Accountable Safety for M5b +### 9.5. Checking Accountable Safety for M5b ```sh $ apalache-mc check --length=0 --init=IndInit_C3 \ @@ -442,7 +450,6 @@ $ apalache-mc check --length=0 --init=IndInit_C3 \ This experiment took XXX sec. - [Apalache]: https://apalache-mc.org/ [Apalache installation]: https://apalache-mc.org/docs/apalache/installation/index.html [Apalache releases]: https://github.com/apalache-mc/apalache/releases diff --git a/reports/spec3-alloy.tex b/reports/spec3-alloy.tex index a6ff363..b0c8cf6 100644 --- a/reports/spec3-alloy.tex +++ b/reports/spec3-alloy.tex @@ -154,7 +154,9 @@ \subsection{Model Checking Results}\label{sec:alloy-results} \\ 8 & 5 & 7 & 4 & 6 & 24 & 1h 27m & 156M \\ - 9 & --- & --- & --- & --- & --- & --- & --- + 9 & 5 & 10 & 4 & 8 & 24 & over 8 days (timeout) & 198 MB + \\ + 9 & 5 & 10 & 4 & 8 & 32 & over 8 days (timeout) & 220 MB \\ 10 & 3 & 15 & 4 & 5 & 12 & 31s & 56M \\ diff --git a/spec3c-alloy/README.md b/spec3c-alloy/README.md index 171546f..7f4ea35 100644 --- a/spec3c-alloy/README.md +++ b/spec3c-alloy/README.md @@ -50,24 +50,26 @@ conduct experiments for the same kinds of inputs: | Input | #blocks | #checkpoints | #signatures | #ffg_votes | #votes | runtime | memory | |------------|--------:|-------------:|------------:|-----------:|-------:|---------:|--------:| -| [ffg-exp1] | 3 | 5 | 4 | 5 | 12 | 4 sec | 35 MB | -| [ffg-exp2] | 4 | 5 | 4 | 5 | 12 | 10 sec | 40 MB | -| [ffg-exp3] | 5 | 5 | 4 | 5 | 12 | 15 sec | 45 MB | -| [ffg-exp4] | 3 | 6 | 4 | 6 | 15 | 57 sec | 52 MB | -| [ffg-exp5] | 4 | 6 | 4 | 6 | 15 | 167 sec | 55 MB | -| [ffg-exp6] | 5 | 6 | 4 | 6 | 15 | 245 sec | 57 MB | -| [ffg-exp7] | 6 | 6 | 4 | 6 | 15 | 360 sec | 82 MB | -| [ffg-exp8] | 5 | 7 | 4 | 6 | 24 | 1h 27m | 156 MB | +| [ffg-exp1][] | 3 | 5 | 4 | 5 | 12 | 4 sec | 35 MB | +| [ffg-exp2][] | 4 | 5 | 4 | 5 | 12 | 10 sec | 40 MB | +| [ffg-exp3][] | 5 | 5 | 4 | 5 | 12 | 15 sec | 45 MB | +| [ffg-exp4][] | 3 | 6 | 4 | 6 | 15 | 57 sec | 52 MB | +| [ffg-exp5][] | 4 | 6 | 4 | 6 | 15 | 167 sec | 55 MB | +| [ffg-exp6][] | 5 | 6 | 4 | 6 | 15 | 245 sec | 57 MB | +| [ffg-exp7][] | 6 | 6 | 4 | 6 | 15 | 360 sec | 82 MB | +| [ffg-exp8][] | 5 | 7 | 4 | 6 | 24 | 1h 27m | 156 MB | +| [ffg-exp9][] | 5 | 10 | 4 | 8 | 24 | >8 days (timeout) | 198 MB | +| [ffg-exp9a][] | 5 | 10 | 4 | 8 | 32 | >8 days (timeout) | 220 MB | In addition to the above experiments, we ran a few experiments that have inputs comparable in size to those produced by our TLA+ specification: | Input | #blocks | #checkpoints | #signatures | #ffg_votes | #votes | runtime | memory | |-------------|--------:|-------------:|------------:|-----------:|-------:|--------:|--------:| -| [ffg-exp10] | 3 | 15 | 4 | 5 | 12 | 31 sec | 56 MB | -| [ffg-exp11] | 4 | 20 | 4 | 5 | 12 | 152 sec | 94 MB | -| [ffg-exp12] | 5 | 25 | 4 | 5 | 12 | 234 sec | 117 MB | -| [ffg-exp13] | 7 | 15 | 4 | 10 | 40 | >10 days | 300 MB | +| [ffg-exp10][] | 3 | 15 | 4 | 5 | 12 | 31 sec | 56 MB | +| [ffg-exp11][] | 4 | 20 | 4 | 5 | 12 | 152 sec | 94 MB | +| [ffg-exp12][] | 5 | 25 | 4 | 5 | 12 | 234 sec | 117 MB | +| [ffg-exp13][] | 7 | 15 | 4 | 10 | 40 | >16 days (timeout) | 300 MB | As we can see, the running times increase dramatically, when we increase the maximum number of FFG votes and votes. While we do not have a precise @@ -78,6 +80,11 @@ gives us approximately 9 justified checkpoints. Since justified checkpoints may refer to on another (via FFG votes), 9 checkpoints may build longer justification chains. +The experiments [ffg-exp9][], [ffg-exp9a][], and [ffg-exp13][] push the boundary +beyond the minimal interesting small instances. In all of them the SAT solver +did not terminated. Interestingly, in case of [ffg-exp9][] and [ffg-exp9a][], +Kissat was solving the remaining 3% for multiple days. + [Alloy]: https://alloytools.org/ @@ -91,6 +98,8 @@ justification chains. [ffg-exp6]: ./ffg-exp6.als [ffg-exp7]: ./ffg-exp7.als [ffg-exp8]: ./ffg-exp8.als +[ffg-exp9]: ./ffg-exp9.als +[ffg-exp9a]: ./ffg-exp9a.als [ffg-exp10]: ./ffg-exp10.als [ffg-exp11]: ./ffg-exp11.als [ffg-exp12]: ./ffg-exp12.als diff --git a/spec3c-alloy/ffg-exp9.als b/spec3c-alloy/ffg-exp9.als new file mode 100644 index 0000000..9eabf87 --- /dev/null +++ b/spec3c-alloy/ffg-exp9.als @@ -0,0 +1,7 @@ +module abs/ffg_exp8 +-- experiment 8 + +open abs/ffg + +run noAccountableSafety for 10 but 5 Block, 7 Checkpoint, 4 Signature, 6 FfgVote, 24 Vote, 5 int + diff --git a/spec3c-alloy/ffg-exp9.cnf.gz b/spec3c-alloy/ffg-exp9.cnf.gz new file mode 100644 index 0000000..db5ed22 Binary files /dev/null and b/spec3c-alloy/ffg-exp9.cnf.gz differ diff --git a/spec3c-alloy/ffg-exp9a.als b/spec3c-alloy/ffg-exp9a.als new file mode 100644 index 0000000..3803109 --- /dev/null +++ b/spec3c-alloy/ffg-exp9a.als @@ -0,0 +1,7 @@ +module abs/ffg_exp9a +-- experiment 9a + +open abs/ffg + +run noAccountableSafety for 10 but 5 Block, 10 Checkpoint, 4 Signature, 8 FfgVote, 32 Vote, 5 int + diff --git a/spec3c-alloy/ffg-exp9a.cnf.gz b/spec3c-alloy/ffg-exp9a.cnf.gz new file mode 100644 index 0000000..8f98975 Binary files /dev/null and b/spec3c-alloy/ffg-exp9a.cnf.gz differ