This web site contains additional information on our paper “Dependency Pairs for Expected Innermost Runtime Complexity and Strong Almost-Sure Termination of Probabilistic Term Rewriting”. More precisely, we provide information on our implementation in the tool AProVE and we describe our experiments.
The easiest way to run AProVE is via its web interface (for PTRSs in WST format or in ARI format).
Moreover, recent releases of AProVE are available for download on GitHub. The version master_2025_06_04 is able to analyze almost-sure termination (AST), strong almost-sure termination (SAST), and expected complexity w.r.t. different rewrite strategies and different start terms. For termination and complexity analysis of probabilistic term rewrite systems, some external dependencies have to be installed (see here for more installation instructions). Extending the PATH environment variable is necessary so that AProVE can find these programs.
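For example, on a Unix-like system, if the required programs were installed to a (hypothetical) directory ~/aprove-deps/bin, the PATH can be extended as follows:

export PATH="$PATH:$HOME/aprove-deps/bin"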
In the following, we first describe the input syntax of a PTRS file. Then, we give more details on our implementation. Finally, we describe how to execute the implementation via the web interface and via the JAR file on GitHub.
One can use AProVE to prove SAST and to analyze the expected complexity of PTRSs w.r.t. either full or innermost rewriting, and w.r.t. either basic or arbitrary start terms. The proof goal can be indicated via the keywords ‘STRATEGY’ and ‘GOAL’. So the input file can contain ‘(GOAL SAST)’ or ‘(GOAL COMPLEXITY)’. Further goals that can be handled by AProVE are ‘(GOAL TERMINATION)’ for certain termination, ‘(GOAL AST)’ for almost-sure termination, and ‘(GOAL PAST)’ for positive almost-sure termination. If the input file contains ‘(STRATEGY INNERMOST)’, then AProVE tries to prove the provided goal w.r.t. innermost evaluation; if the file does not contain the ‘STRATEGY’ keyword, then AProVE considers full rewriting. The syntax is similar for the start terms: if the input file contains ‘(STARTTERM BASIC)’, then AProVE tries to prove the provided goal w.r.t. basic start terms; if the file does not contain the ‘STARTTERM’ keyword, then arbitrary start terms are considered.
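As an illustration, a minimal input file in the ‘.ptrs’ format could look as follows, assuming the usual ‘VAR’ and ‘RULES’ sections of the WST format (the rule itself is only a hypothetical example; the rule syntax is explained below):

(GOAL COMPLEXITY)
(STRATEGY INNERMOST)
(STARTTERM BASIC)
(VAR x)
(RULES
  f(x) -> 1/2 : f(x) || 1/2 : x
)

For this file, AProVE would analyze the expected innermost runtime complexity w.r.t. basic start terms.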
Regarding the rules of the PTRS, our implementation allows for multiple different input formats.
First, AProVE can parse the input syntax for PTRSs that was first described in the paper ADY20. Here, the PTRS file needs to have the extension ‘.ptrs’. The syntax extends the WST format for term rewrite systems by adding a new syntax for probabilistic rules. A probabilistic rewrite rule has the form:
lhs -> w1 : rhs1 || ... || wk : rhsk
Here, lhs, rhs1, …, rhsk are terms and w1,…,wk are natural numbers > 0. This corresponds to the rule:
lhs -> {w1/w : rhs1, ..., wk/w : rhsk}
where w is the sum of all w1,…,wk.
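For example, the following rule (over hypothetical symbols f, a, and b) uses the weights 1 and 2:

f(x) -> 1 : a || 2 : b

Since w = 1 + 2 = 3, this corresponds to the rule f(x) -> {1/3 : a, 2/3 : b}.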
We also allow the following syntax for probabilistic rules that might be easier to read:
lhs -> a1/b1 : rhs1 || ... || ak/bk : rhsk
Here, lhs, rhs1,…,rhsk are again terms and a1,…,ak and b1,…,bk are natural numbers > 0 that describe the actual probabilities. This corresponds to the rule:
lhs -> {a1/b1 : rhs1, ..., ak/bk : rhsk}
However, one has to ensure that the value ai/bi is actually between 0 and 1, and that the sum of all probabilities in a rule is 1. Otherwise, AProVE rejects the input file with an exception. The PTRS may contain rules in both notations.
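For example, the weighted rule from above can equivalently be written with explicit probabilities:

f(x) -> 1/3 : a || 2/3 : b

In contrast, a rule like f(x) -> 1/2 : a || 1/3 : b would be rejected, since 1/2 + 1/3 < 1.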
Moreover, AProVE can also parse the new ARI format for (probabilistic) term rewrite systems from the annual Termination Competition. In this case, the PTRS file must have the extension ‘.ari’.
As described in the input syntax section, we only allow for rational probabilities. Internally, we store probabilities as fractions, i.e., we store the numerator and denominator as integers. However, once we need to check satisfiability of polynomial (in)equations (e.g., in the direct application of polynomial interpretations), we multiply the (in)equations by appropriate integers so that the resulting SMT problem only deals with polynomials over the naturals.
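As a simple illustration (with a hypothetical constraint), a rule with probabilities 1/3 and 2/3 gives rise to a constraint of the form

Pol(lhs) > 1/3 * Pol(rhs1) + 2/3 * Pol(rhs2)

Multiplying both sides by 3 yields 3 * Pol(lhs) > Pol(rhs1) + 2 * Pol(rhs2), which only contains natural coefficients.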
Regarding expected complexity analysis, if the input does not already focus on innermost rewriting, we first try to use the criteria from KFG24 to move from full rewriting to innermost rewriting. Afterwards, if the input does not only consider basic start terms, we use the transformation from Fuhs19 to move from arbitrary to basic start terms. Note that this transformation adds non-constant function symbols and may lead to a linear overhead. Hence, we perform a simple search for a constant polynomial interpretation to prove constant expected derivational complexity before applying the transformation. In general, the strategy before applying the ADP framework is:

1. If the input considers full rewriting, try to use the criteria from KFG24 to move to innermost rewriting.
2. If the input considers arbitrary start terms, search for a constant polynomial interpretation to prove constant expected derivational complexity directly.
3. If Step 2 fails, use the transformation from Fuhs19 to move from arbitrary to basic start terms.
4. Apply the ADP framework.
The implementation of the ADP framework and the strategy for when to use which processor are very similar to the original non-probabilistic DT framework for innermost runtime complexity. We first use the processors that do not rely on finding an interpretation and that always simplify the ADP problem when they are applicable (i.e., all processors except the reduction pair processor and the rule overlap instantiation processor) until they cannot be applied anymore. Then we use the reduction pair processor to remove some ADPs from the set of ADPs that count for complexity. As soon as one of the processors is applied successfully, we restart the strategy, since other processors might be applicable again on the simplified subproblems. Moreover, before the first application of the reduction pair processor, we use the transformational rule overlap instantiation processor (on a fixed number of terms). However, this processor is not applied again when restarting the strategy later on, since it does not always help in inferring an upper bound and often increases the number of ADPs.
In general, the strategy within the ADP framework is the following:

1. Apply all processors that do not rely on finding an interpretation until none of them is applicable anymore.
2. Before the first application of the reduction pair processor, apply the rule overlap instantiation processor (on a fixed number of terms).
3. Apply the reduction pair processor to remove some ADPs from the set of ADPs that count for complexity.
4. Whenever a processor was applied successfully, restart at Step 1 on the simplified subproblems (without repeating Step 2).
REMARK (Dependency Graph Processor): Our implementation extends the implementation of the non-probabilistic DT framework for innermost runtime complexity, where the dependency graph processor was split into multiple sub-processors. Therefore, the output of AProVE splits our novel dependency graph processor into multiple simpler sub-processors as well, to make the proofs more readable. To be precise, the output splits the dependency graph processor into the ‘Leaf Removal Processor’ (removing annotations from ADPs that have no corresponding DP with a successor in the dependency graph), the ‘Usable Terms Processor’ (removing the remaining annotations from ADPs whose corresponding DPs do not reach any SCC-prefix in the dependency graph), and the ‘SCC-prefix Split Processor’ (performing the split into multiple different problems for each SCC-prefix).
In the future, we intend to extend the reduction pair processor to more advanced orders (like polynomials with real coefficients or matrix orderings as described in ADY20).
Finally, to analyze SAST, we perform the same techniques and simply check whether we obtain a finite bound on the complexity.
Our implementation can be tested via a web interface for PTRSs in WST format or in ARI format.
‘Enter Program Code’ shows the program that is currently analyzed. The user can modify the program or replace it by a new one.
‘Upload a File’ lets you upload the file of a program to analyze (where the file should have the extension ‘.ptrs’ or ‘.ari’, respectively).
‘Submit’ calls AProVE in order to prove almost-sure termination of the given PTRS.
‘Return to program input’ takes you back to the web page with the input program.

After downloading AProVE and installing all dependencies, make sure that the path variables are set as described above. Then AProVE can be run by executing
java -ea -jar aprove.jar -m benchmark rw1.ptrs
If you want to see the full proof, you can use the option ‘-p plain’ for a plain text version or ‘-p html’ for an HTML version:
java -ea -jar aprove.jar -m benchmark rw1.ptrs -p plain
We compared our new implementation in the tool AProVE (‘new’) against the previous version of AProVE (‘polo’), which could only prove SAST by a direct application of (monotonic) polynomial interpretations, and against the tool NaTT (‘NaTT’), which uses polynomial and matrix interpretations to prove SAST. (We cannot compare against other tools for expected complexity analysis of PTRSs, since our implementation is the first tool that can perform such an analysis.)
For this evaluation, we used the set of 128 benchmarks from the TPDB, which were also used for the evaluation of KG24. (We omitted two examples of the evaluation of KG24 which do not satisfy the syntactic restrictions of well-defined PTRSs.) Since these 128 benchmarks were originally developed for evaluating tools that analyze almost-sure termination, many of them are AST but not SAST, i.e., they have infinite expected runtime complexity. Hence, we added 10 additional interesting examples that are targeted towards expected complexity analysis. These also include all examples from our paper. You can download the overall benchmark set here.
We tested AProVE on a computer with an Intel Core i7-7920HQ CPU with 3.10 GHz and 32 GB of RAM, with a timeout of 30 seconds per example. This differs from the timeout and hardware used in the annual Termination Competition, since the StarExec site (https://starexec.net/starexec/) that was used for the competition in previous years is no longer available.
The following table shows the overall results for proving SAST:
Strategy | Start Terms | polo | NaTT | new |
---|---|---|---|---|
Full | Arbitrary | 30 | 33 | 35 |
Full | Basic | 30 | 33 | 44 |
Innermost | Arbitrary | 30 | 33 | 54 |
Innermost | Basic | 30 | 33 | 62 |
The next table shows the upper bounds on the expected complexity inferred by our new implementation. Here, ‘Pol_n’ stands for a polynomial bound of degree n (i.e., Pol_0 is constant, Pol_1 is linear, and Pol_2 is quadratic), ‘Exp’ and ‘2-Exp’ stand for exponential and doubly exponential bounds, and ‘infinity’ means that no finite bound could be inferred.
Strategy | Start Terms | Pol_0 | Pol_1 | Pol_2 | Exp | 2-Exp | infinity |
---|---|---|---|---|---|---|---|
Full | Arbitrary | 2 | 21 | 1 | 11 | 0 | 103 |
Full | Basic | 15 | 25 | 1 | 3 | 0 | 94 |
Innermost | Arbitrary | 2 | 47 | 0 | 5 | 0 | 84 |
Innermost | Basic | 25 | 35 | 0 | 2 | 0 | 76 |
Finally, the following table shows the upper bounds on the complexity that can be inferred when applying a polynomial interpretation directly to the PTRS. So it shows which bounds can be inferred by polo for those 30 examples where it can prove SAST.
Start Terms | Pol_0 | Pol_1 | Pol_2 | Exp | 2-Exp | infinity |
---|---|---|---|---|---|---|
Arbitrary | 2 | 16 | 0 | 11 | 1 | 108 |
Basic | 4 | 21 | 1 | 4 | 0 | 108 |
The detailed results for each benchmark w.r.t. each strategy can be downloaded here.
As mentioned before, we added 10 more benchmarks that focus on expected runtime complexity of PTRSs.
The leading examples R_1 and R_2 from our paper generate a random number following a geometric distribution. While the right-hand sides of our rules only contain finite multi-distributions, the repeated evaluation of these rules represents a distribution with infinite support. This gives rise to several interesting analysis tasks.
The examples geo_div.ari (leading example R_1), geo_double_div.ari, geo_f.ari (leading example R_2), and times.ari from our novel benchmark set demonstrate this behavior.
In addition to the geometric distribution, we also added examples that use a Markov chain (modeled by the PTRS) to simulate a dice roll using only rewrite rules whose multi-distributions have the probability 1/2. In the following, ‘dice’ is the initial state of the corresponding Markov chain. Each of the final states ‘stateIII’, ‘stateIOI’, ‘stateIOO’, ‘stateOII’, ‘stateOOI’, and ‘stateOOO’ is reached with probability 1/6. Note that this algorithm is AST, has a constant expected runtime complexity, and creates a number between 1 and 6; hence, the repeated evaluation only represents a finite distribution.
dice -> 1/2 : stateI || 1/2 : stateO
stateI -> 1/2 : stateII || 1/2 : stateIO
stateII -> 1/2 : stateIII || 1/2 : stateI
stateIO -> 1/2 : stateIOI || 1/2 : stateIOO
stateO -> 1/2 : stateOI || 1/2 : stateOO
stateOI -> 1/2 : stateOII || 1/2 : stateO
stateOO -> 1/2 : stateOOI || 1/2 : stateOOO
stateIII -> s(O)
stateIOI -> s(s(O))
stateIOO -> s(s(s(O)))
stateOII -> s(s(s(s(O))))
stateOOI -> s(s(s(s(s(O)))))
stateOOO -> s(s(s(s(s(s(O))))))
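To see why each final state is indeed reached with probability 1/6, consider stateIII as an example: let p denote the probability of eventually reaching stateIII from stateI (the branch via stateIO can never reach stateIII). From stateI we move to stateII with probability 1/2, and from stateII we reach stateIII with probability 1/2 and fall back to stateI with probability 1/2. Hence,

p = 1/2 * (1/2 + 1/2 * p)

which gives p = 1/3. Since dice reaches stateI with probability 1/2, the overall probability of ending in stateIII is 1/2 * 1/3 = 1/6. The computations for the other final states are analogous.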
Our benchmark set contains examples where we replaced the geometric distribution by the dice roll algorithm. It is noteworthy that dice_f.ari (R_2, but with the dice roll instead of a geometric distribution) is SAST, because the distribution resulting from the generation of the numbers is no longer infinite.
The examples dice_div.ari (leading example R_1 with the dice roll), dice_random.ari, dice_f.ari (leading example R_2 with the dice roll), and dice.ari (only the dice roll algorithm) from our novel benchmark set demonstrate this behavior.
There are plenty of probabilistic data structures where parts of the structure are changed or removed probabilistically. We present a very small example that iterates over a given list and, in each step, may remove the remaining list with a certain probability.
listdrop(nil) -> 1 : nil
listdrop(cons(x,xs)) -> 1/2 : nil || 1/2 : cons(x,listdrop(xs))
Even though the algorithm might iterate over the whole list of length n, the expected runtime complexity is constant (in O(1)). A direct application of polynomial interpretations can only prove a linear runtime bound, due to the monotonicity constraints. In contrast, with our novel ADP framework, we can infer a constant upper bound on the expected runtime complexity.
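To see why the expected runtime is constant, let E_n denote the expected number of rewrite steps for a list of length n. Each application of a listdrop rule costs one step, and with probability 1/2 the remaining list is dropped, so

E_0 = 1 and E_n = 1 + 1/2 * E_{n-1} for n > 0

which yields E_n = 2 - (1/2)^n < 2, i.e., a bound that is independent of the length of the list.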
The examples listdrop.ari and treedrop.ari from our novel benchmark set illustrate the analysis of such algorithms.