This web site contains additional information on our paper “From Innermost to Full Probabilistic Term Rewriting: Almost-Sure Termination, Complexity, and Modularity”. The conference version of our paper was published at FoSSaCS 2024 and a longer conference version including all appendices is available at arXiv.
We now wrote an revised and extended journal version of our paper that is also available at arXiv. This website provides information on our implementation in the tool AProVE, and we describe our experiments for the conference paper at FoSSaCS. Moreover, we now added a new section (Experiments Journal Version) on the evaluation of our revised and extended journal version, where we consider the current strategy used within AProVE.
The easiest way to run AProVE is via its web interface (for PTRSs in WST format or in ARI format).
Moreover, an artifact of AProVE is available at Zenodo. It includes an OVA file for a VM that has everything installed.
In addition, recent releases of AProVE are available for download on GitHub. Master_2023_12_29 is the newest release that is able to analyze full almost-sure termination (fAST) by checking the conditions of our newly developed theorems and then analyzing innermost almost-sure termination (iAST). For termination analysis of probabilistic term rewrite systems, the following dependencies have to be installed (see here for more installation instructions):
Extending the path environment is necessary so that AProVE can find these programs.
In the following, we first describe the input syntax of a PTRS file. Then, we explain more details about our implementation. In the end, we describe how to execute the implementation via the web interface and via the JAR file on GitHub.
Our implementation allows for multiple different input formats of PTRSs.
First, AProVE can parse the input syntax for PTRSs that was first described in the paper ADY20. Here, the PTRS file needs to have the extension .ptrs. The syntax extends the WST format for term rewrite systems by adding a new syntax for probabilistic rules. A probabilistic rewrite rule has the form:
lhs -> w1 : rhs1 || ... || wk : rhsk
Here, lhs, rhs1, …, rhsk are terms and w1, …, wk are natural numbers > 0. This corresponds to the rule:
lhs -> {w1/w : rhs1, ..., wk/w : rhsk}
where w is the sum of w1, …, wk.
This syntax does not use the actual probabilities but rather computes them after summing up all occurring numbers in a rule. Hence, we also allow the following syntax for probabilistic rules:
lhs -> a1/b1 : rhs1 || ... || ak/bk : rhsk
Here, lhs, rhs1, …, rhsk are again terms and a1, …, ak and b1, …, bk are natural numbers > 0 that describe the actual probabilities. This corresponds to the rule:
lhs -> {a1/b1 : rhs1, ..., ak/bk : rhsk}
However, one has to ensure that the value ai/bi is actually between 0 and 1, and that the sum of all probabilities in a rule is 1. Otherwise, AProVE rejects the input file with an exception. The PTRS may contain rules in both notations.
One can use AProVE to prove termination, AST, or PAST of PTRSs w.r.t. either full or innermost rewriting, and w.r.t. either arbitrary or only basic start terms.
The proof goal can be indicated via the keywords STRATEGY
, GOAL
, and STARTTERM
.
The input file has to contain either (GOAL TERMINATION)
, (GOAL AST)
, or (GOAL PAST)
.
If the input file contains (STRATEGY INNERMOST)
, then AProVE tries to prove the provided goal w.r.t. innermost evaluation, and if the file does not contain the STRATEGY
keyword, then AProVE tries to prove the provided goal w.r.t. full rewriting.
This is analogous for the STARTTERM
keyword: If the input file contains (STARTTERM BASIC)
, then AProVE tries to prove the provided goal w.r.t. basic start terms, and if the file does not contain the STARTTERM
keyword, then AProVE tries to prove the provided goal w.r.t. arbitrary start terms.
Second, AProVE can also parse the new ARI format for (probabilistic) term rewrite systems from the annual termination competition. Then, the PTRS file must have the extension .ari.
As described in the input syntax section, we only allow for rational probabilities. Internally, we store probabilities as fractions, by storing the numerator and denominator as integers. However, once we need to check satisfiability of polynomial (in)equations, we multiply the (in)equations with appropriate integers so that the resulting SMT-problem only deals with natural polynomials.
Up to now, our main technique for termination analysis of PTRSs was the probabilistic DP framework from KG23 and its improved version from KDG24 which however can only prove iAST. Now, if one wants to analyze fAST for basic start terms, then we first try to prove that the conditions of our new Thm. 33 are satisfied. If this succeeds, then we can use the full probabilistic DP framework in order to prove iAST, which then implies fAST. Otherwise, we try to prove all conditions of our new Thm. 38. If this succeeds, then we can use most of the processors from the probabilistic DP framework to prove iAST, which again implies fAST. If neither Thm. 33 nor Thm. 38 can be applied, then we try to prove fAST using the direct application of polynomial orderings from KG23. For the remaining strategy to prove iAST/fPAST/iPAST, we refer to here and here. The strategy for arbitrary start terms is completely analogous but with Thm. 17 and Thm. 27 instead of Thm. 33 and Thm. 38, respectively.
Our implementation can be tested via a web interface (for PTRSs in WST format or in ARI format).
Enter Program Code
shows the program that is currently analyzed. The user can modify the program or replace it by a new one.Upload a File
of a program to analyze (where the file should have the extension .ptrs
).Submit
calls AProVE in order to prove almost-sure termination of the given PTRS.Return to program input
takes you back to the web page with the input program.After downloading AProVE and installing all dependencies, make sure that the path variables are set as described above. Then AProVE can be run on the file rw1.ptrs
by executing
java -ea -jar aprove.jar -m wst rw1.ptrs
If you want to see the full proof, you can use the option -p plain
for a plain text version or -p html
for an HTML format.
java -ea -jar aprove.jar -m wst rw1.ptrs -p plain
java -ea -jar aprove.jar -m wst rw1.ptrs -p html
To evaluate the power of our approach for our conference paper at FoSSaCS 2024, we used the set of 100 benchmarks from here and added 18 new examples (including the 11 examples from the paper).
The full set of benchmarks can be downloaded here, where the subdirectory FoSSaCS24
contains our 18 new examples.
For a detailed explanation of every single new benchmark see below.
In the following table, we compare AProVE’s results for basic start terms when using the different new theorems to switch from proving fAST to iAST (Thm. 17, Thm. 27, Thm. 33, Thm. 38). Note that Thm. 33 subsumes Thm. 17 and Thm. 38 subsumes Thm. 27 for basic start terms, since right-linearity implies spareness. Furthermore, we show the results for the old version of AProVE that only uses the direct application of polynomials of KG23 for proving fAST, and the new version of AProVE that combines Thm. 33 and Thm. 38, as described above.
We tested AProVE using StarExec, which provides an Intel Xeon E5-2609 CPU with 2.4 GHz and 264 GB of RAM.
Here, the result YES
indicates a successful fAST proof, and
the result MAYBE
indicates that the tool crashed or terminated the verification process within the given time without a result.
The result NO
indicates a successful fAST disproof, however this can only occur if the PTRS only contains trivial probabilities and is non-terminating, as we did not implement any specific technique to disprove fAST yet.
As in the Termination Competition, a timeout of 300 seconds was used for each example.
The following table shows the overall results for basic start terms.
Examples | old AProVE | Thm. 17 | Thm. 27 | Thm. 33 | Thm. 38 | new AProVE |
---|---|---|---|---|---|---|
118 | 36 | 48 | 44 | 58 | 56 | 61 |
From the 58 examples that we can solve by using both Thm. 33 and Thm. 38 in ``new AProVE’’, 5 examples (that are all right-linear) can only be solved by Thm. 33, 3 examples (where one is right-linear and the others only spare) can only be solved by Thm. 38, and 50 can be solved by both. Furthermore, AProVE can prove iAST for 90 of the 115 examples.
A detailed table with the result for each individual benchmark can be downloaded here.
Finally, we also give a table to show the overall results for arbitrary start terms.
Examples | old AProVE | Thm. 17 | Thm. 27 | new AProVE |
---|---|---|---|---|
118 | 36 | 48 | 44 | 49 |
We added some examples to demonstrate the strengths and the limits of the new theorems Thm. 27 and Thm. 38, since all but two of the benchmarks from the earlier collection are left-linear, whereas Thm. 27 and Thm. 38 are specifically needed for non-left-linear PTRSs.
We added three examples (tree01.ptrs - tree03.ptrs) which illustrate that we are able to prove fAST for probabilistic algorithms dealing with data structures that use non-left-linear rules. All of them have the following two rules in order to create a random tree:
createT -> 1 : Leaf || 1 : NodeOne(createT,numR) || 1 : NodeTwo(createT,createT,numR)
numR -> 1 : 0 || 1 : s(numR)
The trees in this algorithm may have zero, one, or two successor nodes.
The PTRS tree01.ptrs contains a simplify function (with a non-linear left-hand side) that can merge two subtrees if they are completely the same:
simplify(NodeTwo(x,x,n)) -> 1 : NodeOne(x,n)
Furthermore, we have some rules to add up all numbers occurring in a tree:
sumAll(Leaf) -> 1 : 0
sumAll(NodeOne(x,n)) -> 1 : plus(n,sumAll(x))
sumAll(NodeTwo(x,y,n)) -> 1 : plus(n,plus(sumAll(x),sumAll(y)))
plus(0,y) -> 1 : y
plus(s(x),y) -> 1 : s(plus(x,y))
Note that first simplifying or first adding up all integers may lead to different results. However, in all cases, i.e., for all evaluation strategies, the above rules terminate. Together with the random tree generation, the PTRS is fAST, and we are able to prove this by using Thm. 27 and by proving iAST using the DP framework from KG23.
We are also able to deal with non-right linear rules that are spare as demonstrated by tree02.ptrs. Here, instead of summing up all numbers, we are summing up the square of all numbers in the tree:
sumAll(Leaf) -> 1 : 0
sumAll(NodeOne(x,n)) -> 1 : plus(times(n,n),sumAll(x))
sumAll(NodeTwo(x,y,n)) -> 1 : plus(times(n,n),plus(sumAll(x),sumAll(y)))
plus(0,y) -> 1 : y
plus(s(x),y) -> 1 : s(plus(x,y))
times(0,y) -> 1 : 0
times(s(x),y) -> 1 : plus(y,times(x,y))
We are able to prove fAST using Thm. 38 (since the PTRS is non-right-linear but spare) and the DP framework.
Finally, tree03.ptrs represents a search algorithm that checks whether the random tree generator created a tree with contains a specific number. Again, we are able to prove fAST using Thm. 38 (since the PTRS is non-right-linear but spare) and the DP framework.
We also added the probabilistic quicksort algorithm from KG23 and included a simplify function to remove multiple entries of the same number in the list after sorting it:
simplify(cons(x,cons(x,xs)))-> 1 : cons(x,xs)
However, for this system we are unable to prove spareness, and hence, we are not allowed to use any of our new theorems. Since a direct application of polynomials also fails, we are not able to prove fAST for the PTRS (while we are able to prove iAST).
Another problem of our new theorems is that we cannot deal with overlapping PTRSs. The following PTRS (div.ptrs) is the integer division from KG23 but with an additional short-cut rule to compute x-x faster. This additional rule does not interfere with the property of being fAST or not, however, the new system is now overlapping, and we are again unable to use any of our new theorems to prove fAST.
minus(x, x) -> 1 : 0
minus(x, 0) -> 1 : x
minus(s(x), s(y)) -> 1 : minus(x, y)
div(0, s(x)) -> 1 : 0
div(s(x), s(y)) -> 1 : s(div(minus(x, y), s(y))) || 1 : div(s(x), s(y))
The following PTRS (coinflips01.ptrs) describes a program that flips two coins and terminates if the results of both coin flips differ. Otherwise, we do the same two simultaneous coin flips again.
coin -> 1 : head || 1 : tail
eq(x,x) -> 1 : true
loop(true) -> 1 : loop(eq(coin,coin))
This PTRS is fAST, but in order to prove this, we need the rewriting processor. However, the PTRS is non-left-linear, which means we can only apply Thm. 27 or Thm. 38 and after applying these theorems (and moving to simultaneous rewriting), we are not able to use the rewriting processor anymore. Hence, we cannot prove fAST for this system.
In the next example (coinflips02.ptrs), we have a different implementation of the equality function that is left-linear.
coin -> 1 : head || 1 : tail
eq(head,head) -> 1 : true
eq(tail,tail) -> 1 : true
eq(tail,head) -> 1 : false
eq(head,tail) -> 1 : false
loop(true) -> 1 : loop(eq(coin,coin))
Now we can use Thm. 33 without moving to simultaneous rewriting and use the rewriting processor to prove iAST, hence fAST.
old Thm. 4.2 Thm. 5.5 Thm. 5.13 Thm. 5.20 new
For the journal version of our paper, we performed new experiments in order to compare our results to the first DP framework for fAST which we developed in the meanwhile in KG24. For this evaluation, we used the set of 130 benchmarks that was used in the evaluation of KG24. The full set of all 130 benchmarks can be downloaded here.
In the following table, we compare AProVE’s results for basic start terms when using the different new theorems to switch from proving fAST to iAST (Thm. 4.2 (Thm. 17 in the conference paper), Thm. 5.5 (previously Thm. 27), Thm. 5.13 (previously Thm. 33), Thm. 5.20 (previously Thm. 38)). Furthermore, we show the results for a version of AProVE (called old in the tables below) that uses the direct application of polynomials of KG23 and the DP framework of KG24 for proving fAST, and the new version of AProVE (called new in the tables below) that combines Thm. 5.13, Thm. 5.20, and the DP framework of KG24. So compared to the implementation described above, we now additionally try to use the DP framework for fAST, if both Thm. 5.13 and Thm. 5.20 are not applicable.
Again, we tested AProVE using StarExec and used a timeout of 300 seconds was used for each example, as in the Termination Competition.
The following table shows the overall results for basic start terms.
Examples | old | Thm. 4.2 | Thm. 5.5 | Thm. 5.13 | Thm. 5.20 | new |
---|---|---|---|---|---|---|
130 | 58 | 64 | 60 | 69 | 67 | 72 |
From the 72 examples that we can solve by using both Thm. 5.13 and Thm. 5.20 in new for basic start terms, there are 5 examples (that are all linear) which can only be solved by Thm. 5.13 but not by Thm. 5.20. The same 5 examples can be solved by Thm. 4.2 but not by Thm. 5.5 for basic and arbitrary start terms. Moreover, there are 3 examples (where one of them is right-linear and the other two are just spare) which can only be solved by Thm. 5.20 but not by Thm. 5.13. The right-linear example can also be solved by Thm. 5.5 but not by Thm. 4.2. These are the same examples that we discussed in the experiments of our conference paper. Furthermore, AProVE can prove iAST for 109 of the 130 examples.
Finally, we also give a table to show the overall results for arbitrary start terms.
Examples | old | Thm. 4.2 | Thm. 5.5 | new |
---|---|---|---|---|
130 | 51 | 57 | 53 | 58 |
A detailed table with the result for each individual benchmark can be downloaded here.