This web site contains additional information on our paper “Dependency Pairs for Expected Innermost Runtime Complexity and Strong Almost-Sure Termination of Probabilistic Term Rewriting”. More precisely, we provide information on our implementation in the tool AProVE and we describe our experiments.
The easiest way to run AProVE is via its web interface (for PTRSs in WST format or in ARI format).
Moreover, recent releases of AProVE are available for download on GitHub. The version master_2025_06_04 is able to analyze almost-sure termination (AST), strong almost-sure termination (SAST), and expected complexity w.r.t. different rewrite strategies and different start terms. For termination and complexity analysis of probabilistic term rewrite systems, some external dependencies have to be installed (see here for more installation instructions). Extending the PATH environment variable is necessary so that AProVE can find these programs.
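For example, on a Unix-like system, if the required programs were installed to a (hypothetical) directory ~/aprove-deps/bin, the PATH can be extended as follows:

export PATH="$PATH:$HOME/aprove-deps/bin"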
In the following, we first describe the input syntax of a PTRS file. Then, we give more details on our implementation. Finally, we describe how to execute the implementation via the web interface and via the JAR file on GitHub.
One can use AProVE to prove SAST and to analyze the expected complexity of PTRSs w.r.t. either full or innermost rewriting, and w.r.t. either basic or arbitrary start terms. The proof goal can be indicated via the keywords ‘STRATEGY’ and ‘GOAL’. So the input file can contain ‘(GOAL SAST)’ or ‘(GOAL COMPLEXITY)’. Further goals that can be handled by AProVE are ‘(GOAL TERMINATION)’ for certain termination, ‘(GOAL AST)’ for almost-sure termination, and ‘(GOAL PAST)’ for positive almost-sure termination. If the input file contains ‘(STRATEGY INNERMOST)’, then AProVE tries to prove the provided goal w.r.t. innermost evaluation; if the file does not contain the ‘STRATEGY’ keyword, then AProVE considers full rewriting. The syntax is similar for the start terms: if the input file contains ‘(STARTTERM BASIC)’, then AProVE tries to prove the provided goal w.r.t. basic start terms; if the file does not contain the ‘STARTTERM’ keyword, then arbitrary start terms are considered.
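As an illustration, a minimal input file in the ‘.ptrs’ format could look as follows, assuming the usual ‘VAR’ and ‘RULES’ sections of the WST format (the rule itself is only a hypothetical example; the rule syntax is explained below):

(GOAL COMPLEXITY)
(STRATEGY INNERMOST)
(STARTTERM BASIC)
(VAR x)
(RULES
  f(x) -> 1/2 : f(x) || 1/2 : x
)

For this file, AProVE would analyze the expected innermost runtime complexity w.r.t. basic start terms.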
Regarding the rules of the PTRS, our implementation allows for multiple different input formats.
First, AProVE can parse the input syntax for PTRSs that was first described in the paper ADY20. Here, the PTRS file needs to have the extension ‘.ptrs’. The syntax extends the WST format for term rewrite systems by adding a new syntax for probabilistic rules. A probabilistic rewrite rule has the form:
lhs -> w1 : rhs1 || ... || wk : rhsk
Here, lhs, rhs1, …, rhsk are terms and w1,…,wk are natural numbers > 0. This corresponds to the rule:
lhs -> {w1/w : rhs1, ..., wk/w : rhsk}
where w is the sum of all w1,…,wk.
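For example, the following rule (over hypothetical symbols f, a, and b) uses the weights 1 and 2:

f(x) -> 1 : a || 2 : b

Since w = 1 + 2 = 3, this corresponds to the rule f(x) -> {1/3 : a, 2/3 : b}.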
We also allow the following syntax for probabilistic rules that might be easier to read:
lhs -> a1/b1 : rhs1 || ... || ak/bk : rhsk
Here, lhs, rhs1,…,rhsk are again terms and a1,…,ak and b1,…,bk are natural numbers > 0 that describe the actual probabilities. This corresponds to the rule:
lhs -> {a1/b1 : rhs1, ..., ak/bk : rhsk}
However, one has to ensure that the value ai/bi is actually between 0 and 1, and that the sum of all probabilities in a rule is 1. Otherwise, AProVE rejects the input file with an exception. The PTRS may contain rules in both notations.
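For example, the weighted rule from above can equivalently be written with explicit probabilities:

f(x) -> 1/3 : a || 2/3 : b

In contrast, a rule like f(x) -> 1/2 : a || 1/3 : b would be rejected, since 1/2 + 1/3 < 1.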
Moreover, AProVE can also parse the new ARI format for (probabilistic) term rewrite systems from the annual Termination Competition. In this case, the PTRS file must have the extension ‘.ari’.
As described in the input syntax section, we only allow for rational probabilities. Internally, we store probabilities as fractions, i.e., we store the numerator and denominator as integers. However, once we need to check satisfiability of polynomial (in)equations (e.g., in the direct application of polynomial interpretations), we multiply the (in)equations by appropriate integers so that the resulting SMT problem only deals with polynomials over the naturals.
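As a simple illustration (with a hypothetical constraint), a rule with probabilities 1/3 and 2/3 gives rise to a constraint of the form

Pol(lhs) > 1/3 * Pol(rhs1) + 2/3 * Pol(rhs2)

Multiplying both sides by 3 yields 3 * Pol(lhs) > Pol(rhs1) + 2 * Pol(rhs2), which only contains natural coefficients.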
Regarding expected complexity analysis, if the input does not already focus on innermost rewriting, we first try to use the criteria from KFG24 to move from full rewriting to innermost rewriting. Afterwards, if the input does not only consider basic start terms, we use the transformation from Fuhs19 to move from arbitrary to basic start terms. Note that this transformation adds non-constant function symbols and may lead to a linear overhead. Hence, we perform a simple search for a constant polynomial interpretation to prove constant expected derivational complexity before applying the transformation. In general, the strategy before applying the ADP framework is:

1. If the input considers full rewriting, try to use the criteria from KFG24 to move to innermost rewriting.
2. If the input considers arbitrary start terms, search for a constant polynomial interpretation to prove constant expected derivational complexity directly.
3. If Step 2 fails, use the transformation from Fuhs19 to move from arbitrary to basic start terms.
4. Apply the ADP framework.
The implementation of the ADP framework and the strategy for when to use which processor are very similar to the original non-probabilistic DT framework for innermost runtime complexity. We first use the processors that do not rely on finding an interpretation and that always simplify the ADP problem when they are applicable (i.e., all processors except the reduction pair processor and the rule overlap instantiation processor) until they cannot be applied anymore. Then we use the reduction pair processor to remove some ADPs from the set of ADPs that count for complexity. As soon as one of the processors is applied successfully, we restart the strategy, since other processors might be applicable again on the simplified subproblems. Moreover, before the first application of the reduction pair processor, we use the transformational rule overlap instantiation processor (on a fixed number of terms). However, this processor is not applied again when restarting the strategy later on, since it does not always help in inferring an upper bound and often increases the number of ADPs.
In general, the strategy within the ADP framework is the following:

1. Apply all processors that do not rely on finding an interpretation until none of them is applicable anymore.
2. Before the first application of the reduction pair processor, apply the rule overlap instantiation processor (on a fixed number of terms).
3. Apply the reduction pair processor to remove some ADPs from the set of ADPs that count for complexity.
4. Whenever a processor was applied successfully, restart at Step 1 on the simplified subproblems (without repeating Step 2).
REMARK (Dependency Graph Processor): Our implementation extends the implementation of the non-probabilistic DT framework for innermost runtime complexity, where the dependency graph processor was split into multiple sub-processors. Therefore, the output of AProVE splits our novel dependency graph processor into multiple simpler sub-processors as well, to make the proofs more readable. To be precise, the output splits the dependency graph processor into the ‘Leaf Removal Processor’ (removing annotations from ADPs that have no corresponding DP with a successor in the dependency graph), the ‘Usable Terms Processor’ (removing the remaining annotations from ADPs whose corresponding DPs do not reach any SCC-prefix in the dependency graph), and the ‘SCC-prefix Split Processor’ (performing the split into multiple different problems for each SCC-prefix).
In the future, we intend to extend the reduction pair processor to more advanced orders (like polynomials with real coefficients or matrix orderings as described in ADY20).
Finally, to analyze SAST, we perform the same techniques and simply check whether we obtain a finite bound on the complexity.
Our implementation can be tested via a web interface for PTRSs in WST format or in ARI format.
‘Enter Program Code’ shows the program that is currently analyzed. The user can modify the program or replace it by a new one.
‘Upload a File’ lets you upload the file of a program to analyze (where the file should have the extension ‘.ptrs’ or ‘.ari’, respectively).
‘Submit’ calls AProVE in order to prove almost-sure termination of the given PTRS.
‘Return to program input’ takes you back to the web page with the input program.

After downloading AProVE and installing all dependencies, make sure that the path variables are set as described above. Then AProVE can be run by executing
java -ea -jar aprove.jar -m benchmark rw1.ptrs
If you want to see the full proof, you can use the option ‘-p plain’ for a plain text version or ‘-p html’ for an HTML version:
java -ea -jar aprove.jar -m benchmark rw1.ptrs -p plain
We compared our new implementation in the tool AProVE (‘new’) against the previous version of AProVE (‘polo’), which could only prove SAST by a direct application of (monotonic) polynomial interpretations, and against the tool NaTT (‘NaTT’), which uses polynomial and matrix interpretations to prove SAST. (We cannot compare against other tools for expected complexity analysis of PTRSs, since our implementation is the first tool that can perform such an analysis.)
For this evaluation, we used the set of 128 benchmarks from the TPDB, which were also used for the evaluation of KG24. (We omitted two examples of the evaluation of KG24 which do not satisfy the syntactic restrictions of well-defined PTRSs.) Since these 128 benchmarks were originally developed for evaluating tools that analyze almost-sure termination, many of them are AST but not SAST, i.e., they have infinite expected runtime complexity. Hence, we added 10 additional interesting examples that are targeted towards expected complexity analysis. These also include all examples from our paper. You can download the overall benchmark set here.
We tested AProVE on a computer with an Intel Core i7-7920HQ CPU with 3.10 GHz and 32 GB of RAM, with a timeout of 30 seconds per example. This differs from the timeout and hardware used in the annual Termination Competition, since the StarExec site (https://starexec.net/starexec/) that was used for the competition in previous years is no longer available.
The following table shows the overall results for proving SAST:
Strategy | Start Terms | polo | NaTT | new |
---|---|---|---|---|
Full | Arbitrary | 30 | 33 | 35 |
Full | Basic | 30 | 33 | 44 |
Innermost | Arbitrary | 30 | 33 | 54 |
Innermost | Basic | 30 | 33 | 62 |
The next table shows the upper bounds on the expected complexity inferred by our new implementation. Here, ‘Pol_n’ stands for a polynomial bound of degree n (i.e., Pol_0 is constant, Pol_1 is linear, and Pol_2 is quadratic), ‘Exp’ and ‘2-Exp’ stand for exponential and doubly exponential bounds, and ‘infinity’ means that no finite bound could be inferred.
Strategy | Start Terms | Pol_0 | Pol_1 | Pol_2 | Exp | 2-Exp | infinity |
---|---|---|---|---|---|---|---|
Full | Arbitrary | 2 | 21 | 1 | 11 | 0 | 103 |
Full | Basic | 15 | 25 | 1 | 3 | 0 | 94 |
Innermost | Arbitrary | 2 | 47 | 0 | 5 | 0 | 84 |
Innermost | Basic | 25 | 35 | 0 | 2 | 0 | 76 |
Finally, the following table shows the upper bounds on the complexity that can be inferred when applying a polynomial interpretation directly to the PTRS. So it shows which bounds can be inferred by polo for those 30 examples where it can prove SAST.
Start Terms | Pol_0 | Pol_1 | Pol_2 | Exp | 2-Exp | infinity |
---|---|---|---|---|---|---|
Arbitrary | 2 | 16 | 0 | 11 | 1 | 108 |
Basic | 4 | 21 | 1 | 4 | 0 | 108 |
The detailed results for each benchmark w.r.t. each strategy can be downloaded here.
As mentioned before, we added 10 more benchmarks that focus on expected runtime complexity of PTRSs.
The leading examples R_1 and R_2 from our paper generate a random number following a geometric distribution. While the right-hand sides of our rules only contain finite multi-distributions, the repeated evaluation of these rules represents a distribution with infinite support. This gives rise to several interesting analysis tasks.
The examples geo_div.ari (leading example R_1), geo_double_div.ari, geo_f.ari (leading example R_2), and times.ari from our novel benchmark set demonstrate this behavior.
In addition to the geometric distribution, we also added examples that use a Markov chain (modeled by the PTRS) to simulate a dice roll using only rewrite rules whose multi-distributions have the probability 1/2. In the following, ‘dice’ is the initial state of the corresponding Markov chain. Each of the final states ‘stateIII’, ‘stateIOI’, ‘stateIOO’, ‘stateOII’, ‘stateOOI’, and ‘stateOOO’ is reached with probability 1/6. Note that this algorithm is AST, has a constant expected runtime complexity, and creates a number between 1 and 6; hence, the repeated evaluation only represents a finite distribution.
dice -> 1/2 : stateI || 1/2 : stateO
stateI -> 1/2 : stateII || 1/2 : stateIO
stateII -> 1/2 : stateIII || 1/2 : stateI
stateIO -> 1/2 : stateIOI || 1/2 : stateIOO
stateO -> 1/2 : stateOI || 1/2 : stateOO
stateOI -> 1/2 : stateOII || 1/2 : stateO
stateOO -> 1/2 : stateOOI || 1/2 : stateOOO
stateIII -> s(O)
stateIOI -> s(s(O))
stateIOO -> s(s(s(O)))
stateOII -> s(s(s(s(O))))
stateOOI -> s(s(s(s(s(O)))))
stateOOO -> s(s(s(s(s(s(O))))))
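To see why each final state is indeed reached with probability 1/6, consider stateIII as an example: let p denote the probability of eventually reaching stateIII from stateI (the branch via stateIO can never reach stateIII). From stateI we move to stateII with probability 1/2, and from stateII we reach stateIII with probability 1/2 and fall back to stateI with probability 1/2. Hence,

p = 1/2 * (1/2 + 1/2 * p)

which gives p = 1/3. Since dice reaches stateI with probability 1/2, the overall probability of ending in stateIII is 1/2 * 1/3 = 1/6. The computations for the other final states are analogous.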
Our benchmark set contains examples where we replaced the geometric distribution by the dice roll algorithm. It is noteworthy that dice_f.ari (R_2, but with the dice roll instead of a geometric distribution) is SAST, because the distribution resulting from the generation of the numbers is no longer infinite.
The examples dice_div.ari (leading example R_1 with the dice roll), dice_random.ari, dice_f.ari (leading example R_2 with the dice roll), and dice.ari (only the dice roll algorithm) from our novel benchmark set demonstrate this behavior.
There are plenty of probabilistic data structures where parts of the structure are changed or removed probabilistically. We present a very small example that iterates over a given list and, in each step, may remove the remaining list with a certain probability.
listdrop(nil) -> 1 : nil
listdrop(cons(x,xs)) -> 1/2 : nil || 1/2 : cons(x,listdrop(xs))
Even though the algorithm might iterate over the whole list of length n, the expected runtime complexity is constant (in O(1)). A direct application of polynomial interpretations can only prove a linear runtime bound, due to the monotonicity constraints. In contrast, with our novel ADP framework, we can infer a constant upper bound on the expected runtime complexity.
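To see why the expected runtime is constant, let E_n denote the expected number of rewrite steps for a list of length n. Each application of a listdrop rule costs one step, and with probability 1/2 the remaining list is dropped, so

E_0 = 1 and E_n = 1 + 1/2 * E_{n-1} for n > 0

which yields E_n = 2 - (1/2)^n < 2, i.e., a bound that is independent of the length of the list.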
The examples listdrop.ari and treedrop.ari from our novel benchmark set illustrate the analysis of such algorithms.