Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finishing the experiments with Spec4b #77

Merged
merged 17 commits into from
Nov 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
63 changes: 24 additions & 39 deletions EXPERIMENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -401,54 +401,39 @@ $ JVM_ARGS=-Xmx20G apalache-mc check \
MC_ffg_b3_ffg5_v12.tla
```

This experiment took 19 hours 48 min 29 sec.
This experiment did not finish after 6 days.

## 9. Induction checking with Spec 4b

### 9.1. Checking Accountable Safety for M3
We ran the experiments using the following scripts:

```sh
$ apalache-mc check --length=0 --init=IndInit_C1 \
--inv=AccountableSafety MC_ffg_b1_ffg5_v12.tla
```

This experiment took XXX sec.

### 9.2. Checking Accountable Safety for M4a

```sh
$ apalache-mc check --length=0 --init=IndInit_C4 \
--inv=AccountableSafety MC_ffg_b3_ffg5_v12.tla
```

This experiment took XXX sec.

### 9.3. Checking Accountable Safety for M4b
- [check-inductive.sh](./spec4b-optimizations/check-inductive.sh)
to check inductiveness of our invariants.

```sh
$ apalache-mc check --length=0 --init=IndInit_C2 \
--inv=AccountableSafety MC_ffg_b3_ffg5_v12.tla
```

This experiment took XXX sec.
- [check-accountable-safety.sh](./spec4b-optimizations/check-accountable-safety.sh)
to check accountable safety against the inductive invariant.

### 9.4. Checking Accountable Safety for M5a
The table below summarizes the experiments with inductiveness checking:

```sh
$ apalache-mc check --length=0 --init=IndInit_C1 \
--inv=AccountableSafety MC_ffg_b3_ffg5_v12.tla
```
| Instance | Init | Invariant | Memory | Time |
|------------------------|----------|-----------|--------|------------|
| MC_ffg_b1_ffg5_v12 | Init | IndInv | 580 MB | 7s |
| MC_ffg_b3_ffg5_v12 | Init | IndInv | 700 MB | 7s |
| MC_ffg_b1_ffg5_v12 | Init_C1 | IndInv | 1.4 GB | 2min 8s |
| MC_ffg_b3_ffg5_v12 | Init_C1 | IndInv | 1.8 GB | 19min 10s |
| MC_ffg_b3_ffg5_v12 | Init_C2 | IndInv | 1.6 GB | 13min 16s |
| MC_ffg_b3_ffg5_v12 | Init_C3 | IndInv | 1.6 GB | 17min 39s |
| MC_ffg_b3_ffg5_v12 | Init_C4 | IndInv | 1.6 GB | 16min 23s |

This experiment took XXX sec.

### 9.5. Checking Accountable Safety for M5b

```sh
$ apalache-mc check --length=0 --init=IndInit_C3 \
--inv=AccountableSafety MC_ffg_b3_ffg5_v12.tla
```
The table below summarizes the experiments with accountable safety:

This experiment took XXX sec.
| Instance | Init | Invariant | Memory | Time |
|------------------------|---------|-------------------|--------|------------|
| MC_ffg-b1_ffg5_v12 | Init_C1 | AccountableSafety | 1.2 GB | 11h 31min |
| MC_ffg-b3_ffg5_v12 | Init_C4 | AccountableSafety | 1.4 GB | TO (> 6d) |
| MC_ffg-b3_ffg5_v12 | Init_C2 | AccountableSafety | 1.3 GB | 1day 6h |
| MC_ffg-b3_ffg5_v12 | Init_C3 | AccountableSafety | 1.2 GB | 1h 53min |
| MC_ffg-b3_ffg5_v12 | Init_C1 | AccountableSafety | 1.5 GB | TO (> 6d) |

[Apalache]: https://apalache-mc.org/
[Apalache installation]: https://apalache-mc.org/docs/apalache/installation/index.html
Expand Down
17 changes: 9 additions & 8 deletions reports/spec3.tex
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,10 @@

\section{\SpecThree{}: Adding a state machine}\label{sec:spec3}

In the course of writing \texttt{Spec 2}, we realized
In the course of writing~\SpecTwo{}, we realized
that the executable Python specification is essentially sequential. In other
words, even though the 3SF algorithm is distributed, its Python
specification as well as \texttt{Spec 1} and \texttt{Spec 2} are encoding
specification as well as~\SpecOne{} and~\SpecTwo{} are encoding
all possible protocol states in a single specification state.

\subsection{State machines in \tlap{}}
Expand All @@ -32,7 +32,7 @@ \subsection{State machines in \tlap{}}

\subsection{Introducing a state machine}

Having the observations above in mind, we introduce \texttt{Spec 3} in which we
Having the observations above in mind, we introduce~\SpecThree{} in which we
specify a state machine that incrementally builds possible protocol states by
constructing the following data structures:

Expand Down Expand Up @@ -77,7 +77,7 @@ \subsection{Introducing a state machine}
\begin{figure}
\centering
\includegraphics[width=\textwidth]{images/abstract-ffg-next.pdf} % Include the PDF file
\caption{The transition predicate of Spec 3}\label{fig:abstract-ffg-next}
\caption{The transition predicate of~\SpecThree{}}\label{fig:abstract-ffg-next}
\end{figure}

\subsection{Model checking experiments}
Expand Down Expand Up @@ -113,8 +113,8 @@ \subsection{Model checking experiments}
\\ \midrule
\texttt{check}
& $\textit{ExistTwoFinalizedConflictingBlocks}$
& N
& 1h 27min
& 6
& 52min
\\ \midrule
\texttt{simulate}
& $\textit{AccountableSafety}$
Expand All @@ -124,9 +124,10 @@ \subsection{Model checking experiments}
\texttt{check}
& $\textit{AccountableSafety}$
& 6
& N
& 16h 13min
\\ \bottomrule
\end{tabular}
\caption{Model checking experiments with Spec 3}\label{tab:abstract-ffg-mc}
\caption{Model checking experiments
with~\SpecThree{}}\label{tab:abstract-ffg-mc}
\end{table}

7 changes: 4 additions & 3 deletions reports/spec4.tex
Original file line number Diff line number Diff line change
Expand Up @@ -86,13 +86,14 @@ \subsection{Model checking experiments}
& \texttt{Next}
& \texttt{AccountableSafety}
& 0
& XXX
& XXX
& 1.5 GB
& TO ($> 6$d)
\\
\bottomrule
\end{tabular}
\caption{Model checking experiments
with~\SpecFourB{} on the instance~\texttt{MC\_ffg\_b3\_ffg5\_v12}
with~\SpecFour{} on the instance~\texttt{MC\_ffg\_b3\_ffg5\_v12}
(``TO'' means timeout)
}\label{tab:spec4-experiments}
\end{table}

100 changes: 81 additions & 19 deletions reports/spec4b.tex
Original file line number Diff line number Diff line change
Expand Up @@ -94,20 +94,24 @@ \subsection{Decomposition of chain configurations}\label{sec:decomposition}

\subsection{Model checking experiments}

Table~\ref{tab:spec4b-experiments} summarizes our experiments with Apalache for
various configurations. One interesting effect of the optimizations,
especially of the ones presented in Section~\ref{sec:decomposition}, is
a significant drop in the memory consumption of the SMT solver. In our
experiments, Z3 required from 700~MB to 1.5~GB\@. While this is still a factor of
20 in comparison to the Alloy experiments in Section~\ref{sec:alloy}, this is
significantly better than our initial experiments with~\SpecTwo{}
and~\SpecThree{}, which required up to 20~GB of RAM\@.

We notice significant variations in the running times on different
configurations. This is probably due to the effect of different runs of the Z3
SMT solve having great variety of running times, which is well-known in the
computer-aided community. To further confirm these variations, we could run
multiple experiments with `hyperfine`.
Tables~\ref{tab:spec4b-experiments} and~\ref{tab:spec4b-inductiveness}
summarize our experiments with Apalache for various configurations. One
interesting effect of the optimizations, especially of the ones presented in
Section~\ref{sec:decomposition}, is a significant drop in the memory
consumption of the SMT solver. In our experiments, Z3 required from 700~MB to
1.5~GB\@. While this is still a factor of 20 in comparison to the Alloy
experiments in Section~\ref{sec:alloy}, this is significantly better than our
initial experiments with~\SpecTwo{} and~\SpecThree{}, which required up to
20~GB of RAM\@. Interestingly, inductiveness checks in
Table~\ref{tab:spec4b-inductiveness} take significantly longer than in the case
of~\SpecFour{}. We conjecture that this is caused by the need to check more
specialized graphs in conjunction with steps.

We observe that the run-times depend on the different
configurations used. This is probably due to the effect of built-in variance of runs of the Z3
SMT solver, which is well-known in the
community of computer-aided verification. To further confirm these variations,
we could run multiple experiments with~\texttt{hyperfine}.

\begin{table}
\centering
Expand All @@ -127,8 +131,8 @@ \subsection{Model checking experiments}
M4a: Fig.~\ref{fig:four-top}
& \texttt{MC\_ffg\_b3\_ffg5\_v12}
& \texttt{Init\_C4}
& XXX
& XXX
& 1.4 GB
& TO ($>6$d)
\\
M4b: Fig.~\ref{fig:four-bottom}
& \texttt{MC\_ffg\_b3\_ffg5\_v12}
Expand All @@ -145,12 +149,70 @@ \subsection{Model checking experiments}
M5b: Fig.~\ref{fig:five2}
& \texttt{MC\_ffg\_b3\_ffg5\_v12}
& \texttt{Init\_C1}
& XXX
& XXX
& 1.5 GB
& TO ($>6$d)
\\
\bottomrule
\end{tabular}
\caption{Checking accountable safety
against~\SpecFourB{}}\label{tab:spec4b-experiments}
against~\SpecFourB{} (``TO'' means timeout)}%
\label{tab:spec4b-experiments}
\end{table}

\begin{table}
\centering
\begin{tabular}{lllrr}
\tbh{Instance}
& \tbh{Init}
& \tbh{Invariant}
& \tbh{Memory}
& \tbh{Time}
\\ \toprule
\texttt{MC\_ffg\_b1\_ffg5\_v12}
& \texttt{Init}
& \texttt{IndInv}
& 580 MB
& 7s
\\
\texttt{MC\_ffg\_b3\_ffg5\_v12}
& \texttt{Init}
& \texttt{IndInv}
& 700 MB
& 7s
\\
\texttt{MC\_ffg\_b1\_ffg5\_v12}
& \texttt{Init\_C1}
& \texttt{IndInv}
& 1.4 GB
& 2min 8s
\\
\texttt{MC\_ffg\_b3\_ffg5\_v12}
& \texttt{Init\_C1}
& \texttt{IndInv}
& 1.8 GB
& 19min 10s
\\
\texttt{MC\_ffg\_b3\_ffg5\_v12}
& \texttt{Init\_C2}
& \texttt{IndInv}
& 1.6 GB
& 13min 16s
\\
\texttt{MC\_ffg\_b3\_ffg5\_v12}
& \texttt{Init\_C3}
& \texttt{IndInv}
& 1.6 GB
& 17min 39s
\\
\texttt{MC\_ffg\_b3\_ffg5\_v12}
& \texttt{Init\_C4}
& \texttt{IndInv}
& 1.6 GB
& 16min 23s
\\
\bottomrule
\end{tabular}
\caption{Checking inductiveness
for~\SpecFourB{}}\label{tab:spec4b-inductiveness}
\end{table}

8 changes: 6 additions & 2 deletions spec4b-optimizations/MC_ffg_b1_ffg5_v12.tla
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,6 @@ IndInit ==
/\ fork_number \in -MAX_BLOCK_BODY..0
/\ chain2_fork_block_number = fork_number
/\ justified_checkpoints = Gen(5)
/\ InitAccountableSafety
/\ IndInv

IndInit_C1 ==
Expand All @@ -72,6 +71,7 @@ IndInit_C1 ==
LET b1 == [ body |-> 1, slot |-> i ]
b2 == [ body |-> -1, slot |-> j ]
IN
/\ 0 < i /\ 0 < j
/\ all_blocks = { GenesisBlock, b1, b2 }
/\ chain1 = { GenesisBlock, b1 }
/\ chain1_tip = b1
Expand All @@ -82,8 +82,12 @@ IndInit_C1 ==
/\ ffg_votes = Gen(5) \* must be >= 4 to observe disagreement
/\ votes = Gen(12) \* must be >= 12 to observe disagreement
/\ justified_checkpoints = Gen(5)
/\ InitAccountableSafety
/\ VotesInv
/\ CheckpointsInv

\* restrict the initial condition to have accountable safety
IndInit_C1_AS ==
/\ IndInit_C1
/\ InitAccountableSafety

=============================================================================
29 changes: 23 additions & 6 deletions spec4b-optimizations/MC_ffg_b3_ffg5_v12.tla
Original file line number Diff line number Diff line change
Expand Up @@ -68,11 +68,16 @@ IndInit_C1 ==
/\ ffg_votes = Gen(5) \* must be >= 4 to observe disagreement
/\ votes = Gen(12) \* must be >= 12 to observe disagreement
/\ justified_checkpoints = Gen(5)
\*/\ InitAccountableSafety
/\ VotesInv
/\ CheckpointsApproxInv
\*/\ CheckpointsInv
/\ CheckpointsInv
\*/\ CheckpointsApproxInv

\* restrict the initial condition to have accountable safety
IndInit_C1_AS ==
/\ IndInit_C1
/\ InitAccountableSafety

\* a very restricted initial condition
IndInit_C1_1_2_3_4 ==
(*
/ [+1] - [+2]
Expand Down Expand Up @@ -121,10 +126,14 @@ IndInit_C2 ==
/\ ffg_votes = Gen(5) \* must be >= 4 to observe disagreement
/\ votes = Gen(12) \* must be >= 12 to observe disagreement
/\ justified_checkpoints = Gen(5)
/\ InitAccountableSafety
/\ VotesInv
/\ CheckpointsInv

\* restrict the initial condition to have accountable safety
IndInit_C2_AS ==
/\ IndInit_C2
/\ InitAccountableSafety

IndInit_C3 ==
(*
/ [+2]
Expand All @@ -147,10 +156,14 @@ IndInit_C3 ==
/\ ffg_votes = Gen(5) \* must be >= 4 to observe disagreement
/\ votes = Gen(12) \* must be >= 12 to observe disagreement
/\ justified_checkpoints = Gen(5)
/\ InitAccountableSafety
/\ VotesInv
/\ CheckpointsInv

\* restrict the initial condition to have accountable safety
IndInit_C3_AS ==
/\ IndInit_C3
/\ InitAccountableSafety

IndInit_C4 ==
(*
/ [+1] - [+2]
Expand All @@ -173,8 +186,12 @@ IndInit_C4 ==
/\ ffg_votes = Gen(5) \* must be >= 4 to observe disagreement
/\ votes = Gen(12) \* must be >= 12 to observe disagreement
/\ justified_checkpoints = Gen(5)
/\ InitAccountableSafety
/\ VotesInv
/\ CheckpointsInv

\* restrict the initial condition to have accountable safety
IndInit_C4_AS ==
/\ IndInit_C4
/\ InitAccountableSafety

=============================================================================
13 changes: 13 additions & 0 deletions spec4b-optimizations/check-accountable-safety.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
#!/usr/bin/env bash
#
# Run the experiments to check accountable safety.

if [ "$APALACHE_HOME" == "" ]; then
echo "set APALACHE_HOME to the Apalache installation directory"
exit 1
fi

parallel --joblog joblog-as.txt --results out/ --colsep ',' \
-a experiments-accountable-safety.csv \
/usr/bin/time -f "'%Uuser %Ssystem %elapsed %Mmaxk'" \
${APALACHE_HOME}/bin/apalache-mc check --length=0 --init={2} --inv={3} {1}
Loading