Merge pull request #76 from freespek/igor/alloy-exp67

Two more Alloy experiments
freespek · Nov 1, 2024 · eeba7fd · eeba7fd
2 parents d577779 + ae4440c
commit eeba7fd
Show file tree

Hide file tree

Showing 7 changed files with 55 additions and 23 deletions.
diff --git a/EXPERIMENTS.md b/EXPERIMENTS.md
@@ -362,9 +362,17 @@ This experiment takes about 37 minutes on a single node. The interesting thing
 is that we can parallelize the enumeration of 100 runs across multiple cores.
 For instance, we use 20 CPU cores to check 5 symbolic runs each.
 
-## 6. Induction checking Spec 4
+## 8. Spec3b-smt: Checking Accountable Safety with SMT
 
-### 6.1. Inductiveness check
+See the detailed experiments in [README.md](./spec3b-smt/README.md).
+
+## 7. Spec3c-alloy: Checking Accountable Safety with Alloy
+
+See the detailed experiments in [README.md](./spec3c-alloy/README.md).
+
+## 8. Induction checking Spec 4
+
+### 8.1. Inductiveness check
 
 In this experiment, we show that `IndInv` is an inductive invariant for `Spec 4`:
 
@@ -384,7 +392,7 @@ $ apalache-mc check --length=1 --inv=IndInv \
 
 This experiment takes about 2 seconds.
 
-### 6.2. Inductive checking of Accountable Safety
+### 8.2. Inductive checking of Accountable Safety
 
 ```sh
 $ cd ./spec4
@@ -395,9 +403,9 @@ $ JVM_ARGS=-Xmx20G apalache-mc check \
 
 This experiment took 19 hours 48 min 29 sec.
 
-## 7. Induction checking with Spec 4b
+## 9. Induction checking with Spec 4b
 
-### 7.1. Checking Accountable Safety for M3
+### 9.1. Checking Accountable Safety for M3
 
 ```sh
 $ apalache-mc check --length=0 --init=IndInit_C1 \
@@ -406,7 +414,7 @@ $ apalache-mc check --length=0 --init=IndInit_C1 \
 
 This experiment took XXX sec.
 
-### 7.2. Checking Accountable Safety for M4a
+### 9.2. Checking Accountable Safety for M4a
 
 ```sh
 $ apalache-mc check --length=0 --init=IndInit_C4 \
@@ -415,7 +423,7 @@ $ apalache-mc check --length=0 --init=IndInit_C4 \
 
 This experiment took XXX sec.
 
-### 7.3. Checking Accountable Safety for M4b
+### 9.3. Checking Accountable Safety for M4b
 
 ```sh
 $ apalache-mc check --length=0 --init=IndInit_C2 \
@@ -424,7 +432,7 @@ $ apalache-mc check --length=0 --init=IndInit_C2 \
 
 This experiment took XXX sec.
 
-### 7.4. Checking Accountable Safety for M5a
+### 9.4. Checking Accountable Safety for M5a
 
 ```sh
 $ apalache-mc check --length=0 --init=IndInit_C1 \
@@ -433,7 +441,7 @@ $ apalache-mc check --length=0 --init=IndInit_C1 \
 
 This experiment took XXX sec.
 
-### 7.5. Checking Accountable Safety for M5b
+### 9.5. Checking Accountable Safety for M5b
 
 ```sh
 $ apalache-mc check --length=0 --init=IndInit_C3 \
@@ -442,7 +450,6 @@ $ apalache-mc check --length=0 --init=IndInit_C3 \
 
 This experiment took XXX sec.
 
-
 [Apalache]: https://apalache-mc.org/
 [Apalache installation]: https://apalache-mc.org/docs/apalache/installation/index.html
 [Apalache releases]: https://github.com/apalache-mc/apalache/releases

diff --git a/reports/spec3-alloy.tex b/reports/spec3-alloy.tex
@@ -154,7 +154,9 @@ \subsection{Model Checking Results}\label{sec:alloy-results}
             \\
         8 & 5 & 7 & 4 & 6 & 24 & 1h 27m & 156M
             \\
-        9 & --- & --- & --- & --- & --- & --- & ---
+        9 & 5 & 10 & 4 & 8 & 24 & over 8 days (timeout) & 198 MB
+            \\
+        9 & 5 & 10 & 4 & 8 & 32 & over 8 days (timeout) & 220 MB
             \\
         10 & 3 & 15 & 4 & 5 & 12 & 31s & 56M
             \\

diff --git a/spec3c-alloy/README.md b/spec3c-alloy/README.md
@@ -50,24 +50,26 @@ conduct experiments for the same kinds of inputs:
 
 | Input      | #blocks | #checkpoints | #signatures | #ffg_votes | #votes | runtime  | memory  |
 |------------|--------:|-------------:|------------:|-----------:|-------:|---------:|--------:|
-| [ffg-exp1] |    3    |      5       |      4      |      5     |   12   |  4 sec   |  35 MB  |
-| [ffg-exp2] |    4    |      5       |      4      |      5     |   12   | 10 sec   |  40 MB  |
-| [ffg-exp3] |    5    |      5       |      4      |      5     |   12   | 15 sec   |  45 MB  |
-| [ffg-exp4] |    3    |      6       |      4      |      6     |   15   | 57 sec   |  52 MB  |
-| [ffg-exp5] |    4    |      6       |      4      |      6     |   15   | 167 sec  |  55 MB  |
-| [ffg-exp6] |    5    |      6       |      4      |      6     |   15   | 245 sec  |  57 MB  |
-| [ffg-exp7] |    6    |      6       |      4      |      6     |   15   | 360 sec  |  82 MB  |
-| [ffg-exp8] |    5    |      7       |      4      |      6     |   24   | 1h 27m   |  156 MB |
+| [ffg-exp1][] |    3    |      5       |      4      |      5     |   12   |  4 sec   |  35 MB  |
+| [ffg-exp2][] |    4    |      5       |      4      |      5     |   12   | 10 sec   |  40 MB  |
+| [ffg-exp3][] |    5    |      5       |      4      |      5     |   12   | 15 sec   |  45 MB  |
+| [ffg-exp4][] |    3    |      6       |      4      |      6     |   15   | 57 sec   |  52 MB  |
+| [ffg-exp5][] |    4    |      6       |      4      |      6     |   15   | 167 sec  |  55 MB  |
+| [ffg-exp6][] |    5    |      6       |      4      |      6     |   15   | 245 sec  |  57 MB  |
+| [ffg-exp7][] |    6    |      6       |      4      |      6     |   15   | 360 sec  |  82 MB  |
+| [ffg-exp8][] |    5    |      7       |      4      |      6     |   24   | 1h 27m   |  156 MB |
+| [ffg-exp9][] |    5    |      10      |      4      |      8     |   24   | >8 days (timeout)  |  198 MB |
+| [ffg-exp9a][] |    5   |      10      |      4      |      8     |   32   | >8 days (timeout)  |  220 MB |
 
 In addition to the above experiments, we ran a few experiments that have
 inputs comparable in size to those produced by our TLA+ specification:
 
 | Input       | #blocks | #checkpoints | #signatures | #ffg_votes | #votes | runtime | memory  |
 |-------------|--------:|-------------:|------------:|-----------:|-------:|--------:|--------:|
-| [ffg-exp10] |    3    |      15      |      4      |      5     |   12   | 31 sec  | 56 MB   |
-| [ffg-exp11] |    4    |      20      |      4      |      5     |   12   | 152 sec | 94 MB   |
-| [ffg-exp12] |    5    |      25      |      4      |      5     |   12   | 234 sec | 117 MB  |
-| [ffg-exp13] |    7    |      15      |      4      |      10    |   40   | >10 days | 300 MB  |
+| [ffg-exp10][] |    3    |      15      |      4      |      5     |   12   | 31 sec  | 56 MB   |
+| [ffg-exp11][] |    4    |      20      |      4      |      5     |   12   | 152 sec | 94 MB   |
+| [ffg-exp12][] |    5    |      25      |      4      |      5     |   12   | 234 sec | 117 MB  |
+| [ffg-exp13][] |    7    |      15      |      4      |      10    |   40   | >16 days (timeout) | 300 MB  |
 
 As we can see, the running times increase dramatically, when we increase the
 maximum number of FFG votes and votes. While we do not have a precise
@@ -78,6 +80,11 @@ gives us approximately 9 justified checkpoints. Since justified checkpoints may
 refer to on another (via FFG votes), 9 checkpoints may build longer
 justification chains.
 
+The experiments [ffg-exp9][], [ffg-exp9a][], and [ffg-exp13][] push the boundary
+beyond the minimal interesting small instances. In all of them the SAT solver
+did not terminated. Interestingly, in case of [ffg-exp9][] and [ffg-exp9a][],
+Kissat was solving the remaining 3% for multiple days.
+
 <!-- References -->
 
 [Alloy]: https://alloytools.org/
@@ -91,6 +98,8 @@ justification chains.
 [ffg-exp6]: ./ffg-exp6.als
 [ffg-exp7]: ./ffg-exp7.als
 [ffg-exp8]: ./ffg-exp8.als
+[ffg-exp9]: ./ffg-exp9.als
+[ffg-exp9a]: ./ffg-exp9a.als
 [ffg-exp10]: ./ffg-exp10.als
 [ffg-exp11]: ./ffg-exp11.als
 [ffg-exp12]: ./ffg-exp12.als

diff --git a/spec3c-alloy/ffg-exp9.als b/spec3c-alloy/ffg-exp9.als
@@ -0,0 +1,7 @@
+module abs/ffg_exp8
+-- experiment 8
+
+open abs/ffg
+
+run noAccountableSafety for 10 but 5 Block, 7 Checkpoint, 4 Signature, 6 FfgVote, 24 Vote, 5 int
+
diff --git a/spec3c-alloy/ffg-exp9.cnf.gz b/spec3c-alloy/ffg-exp9.cnf.gz
diff --git a/spec3c-alloy/ffg-exp9a.als b/spec3c-alloy/ffg-exp9a.als
@@ -0,0 +1,7 @@
+module abs/ffg_exp9a
+-- experiment 9a
+
+open abs/ffg
+
+run noAccountableSafety for 10 but 5 Block, 10 Checkpoint, 4 Signature, 8 FfgVote, 32 Vote, 5 int
+
diff --git a/spec3c-alloy/ffg-exp9a.cnf.gz b/spec3c-alloy/ffg-exp9a.cnf.gz