Solving Simple Tag Problems

June 02, 2025

This post is a continuation of my last post on The Problem of Tag where I introduced a marble machine called a tag system, and the possible behaviors they can have, namely halting, looping, and diverging. I also talked about the problem of tag, which asks whether a particular tag system operating on a particular starting sequence eventually halts, loops, or diverges. I had also alluded to the fact that tag systems are Turing complete, which is a really important “no-go” theorem when discussing the problem of tag – in particular, it means that there is no fully general solution to the problem. That is, there’s no algorithm that always runs in a finite amount of time that correctly decides which behavior a particular tag system has on a particular string. I’m not going to actually prove this result here, but considering a very naive approach gives some intuition of why this might be true. Consider an algorithm that simulates the running of the tag system:

class TagSystem:
    def __init__(self, v, rules):
        self.v = v
        self.rules = rules

def decide(T, A):
    while len(A) >= T.v:
        read = A[0]
        A = A[T.v:] + T.rules[read]
    return "halts"

T = TagSystem(2, [[2], [1, 0], [1, 1, 0]])
print(decide(T, [2, 2]))

With this approach, if \(T\) halts on \(A\), then the algorithm will always correctly return "halts" (that is, this algorithm recognizes halting strings). However, if \(T\) loops on \(A\), then the algorithm will loop forever because… well… it doesn’t halt. Luckily, a tag system that loops will continue to revisit the same finite set of strings at the same fixed interval over and over again, so we can catch this by recording all the strings we see and returning “loops” if we ever see the same one twice:

def decide(T, A):
    seen = set()
    while len(A) >= T.v:
        if tuple(A) in seen:
            return "loops"
        seen.add(tuple(A))
        read = A[0]
        A = A[T.v:] + T.rules[read]
    return "halts"

T = TagSystem(2, [[2], [1, 0], [1, 1, 0]])
print(decide(T, [1, 0, 2]))

With this revision, we can recognize any string that halts or loops. The only other possibility is that a string diverges, which is where we get stuck: how can we distinguish a string that diverges from a string that loops? Well, a diverging string eventually blows up in length, whereas a looping string is transient for some number of steps before repeating strings. But is there a reliable way to tell whether we’re in the middle of the transient of a looping string or diverging (potentially slowly)? We could try some different tricks, but they’ll all be confounded by the fact that some looping strings explode in length by a lot before they loop, and some diverging strings grow really slowly. Tag systems being Turing complete shows that this problem is impossible.

However, it doesn’t mean that the problem is never solvable, because there are tag systems where it’s not only possible, but actually easy. We’ve already seen a nontrivial partial solution to the problem of tag: if one can prove that a particular tag system never diverges, then the algorithm above solves the problem of tag for it. At this point, you might wonder what sorts of tag systems we can prove never diverge – I’ll answer that in a moment, but I’ll first suggest a detour to an even simpler tag system to analyze, which removes two marbles from the machine and never inserts anything:

\[\begin{cases} \mathrm{remove}\, 2\, \mathrm{marbles} \\ R \rightarrow \epsilon \\ G \rightarrow \epsilon \\ B \rightarrow \epsilon \end{cases}\]

This tag system clearly halts. On every step, we take two marbles out and add zero back in, and so the total number of marbles decreases by two. If we start with N marbles, after \(\lfloor \frac{N}{2} \rfloor\) steps we have less than two marbles left, and halt. We can also concoct a similarly easy to analyze tag system that usually loops:

\[\begin{cases} \mathrm{remove}\, 2\, \mathrm{marbles} \\ R \rightarrow GB \\ G \rightarrow BR \\ B \rightarrow RG \end{cases}\]

For any string of length less than two, this tag system has already halted. For any other string, we take two marbles out and add two back in on every step. Therefore, the total number of marbles stays the same at each step. For any particular starting sequence of length \(N > 2\), the length after any number of steps is still \(N\). Since there are finitely many strings of length \(N\), by the pigeonhole principle we eventually have to repeat one, and so this tag system loops.

As a final motivating example before I actually state the next theorems, consider this tag system that usually diverges:

\[\begin{cases} \mathrm{remove}\, 2\, \mathrm{marbles} \\ R \rightarrow RGB \\ G \rightarrow GBR \\ B \rightarrow BRG \end{cases}\]

As before, any string of length less than two has already halted. For any other string, we take two marbles out and add three back in on every step. Therefore, the total number of marbles increases at each step. Therefore, this tag system diverges on all such strings.

Wang’s Decidability Criteria

I definitely wasn’t the first person to make these observations. Emil Post alluded to these criteria in 1943, although the earliest detailed proof that I found is from Hao Wang’s 1963 paper, Tag Systems and Lag Systems. The following proofs stick very close to his, although they’re a bit more verbose and more closely follow my translation of the proofs into Lean.

Theorem 1 (Wang’s Halting Criterion): For a tag system \(T = (v, \Sigma, \sigma)\), if the length of the appendant for all symbols is less than \(v\), then the tag system halts on all strings. Formally,

\[(\forall a \in \Sigma, \left|\sigma(a)\right| < v) \rightarrow \forall A \in \Sigma^\ast, halts(T, A)\]

Proof: Suppose that \(\left|\sigma(a)\right| < v\) for any symbol \(a\). We proceed by strong induction on the length of \(A\). Note that any string of length less than \(v\) has already halted, including the empty string. Now, suppose that \(T\) halts on any string strictly shorter than \(A\), and that \(v \le \left|A\right|\). So we have that

\[next(T, A) = drop (A \, \texttt{++} \, \sigma(head(A)), v)\]

And so

By the induction hypothesis, \(halts(T, next(T, A))\), and therefore \(halts(T, A)\).

□

Theorem 2 (Wang’s Looping Criterion): For a tag system \(T = (v, \Sigma, \sigma)\), if the length of the appendant for all symbols is precisely \(v\), then \(T\) halts on all strings shorter than \(v\) and loops on all others. Formally,

\[(\forall a \in \Sigma, \left|\sigma(a)\right| < v) \rightarrow \forall A \in \Sigma^\ast, \neg halted(T, A) \rightarrow loops(T, A)\]

Proof: Suppose that \(\left|\sigma(a)\right| = v\) for any symbol \(a\), and that \(T\) hasn’t halted on \(A\). It follows from the latter that

\[next(T, A) = drop (A \, \texttt{++} \, \sigma(head(A)), v)\]

And so

In fact, this doesn’t depend on the particulars of \(A\) beyond that \(T\) isn’t halted on it, which is purely a function of its length. It follows by induction that \(\left|next^i(T, A)\right| = \left|A\right|\). There are finitely many strings of length \(\left|A\right|\), and so by the pigeonhole principle we have \(i_1\) and \(i_2\) such that \(i_1 \ne i_2\) and \(next^{i_1}(T, A) = next^{i_2}(T, A)\). Without loss of generality, suppose \(i_1 < i_2\).

This gives us a loop: the transient length is \(i_1\) and the period is \(i_2 - i_1\). To show that this is a loop, it must be that \(T\) hasn’t halted after \(i_1\) steps; to see this, note that \(T\) hasn’t halted on \(A\), and \(\left|next^{i_1}(T, A)\right| = \left|A\right|\), so it also hasn’t halted. Next, we must demonstrate that the period is positive: since \(i_1 < i_2\), clearly \(0 < i_2 - i_1\). Lastly, we must show that \(next^{i_1}(T, A) = next^{i_2}(T, A)\), but this was already shown by the pigeonhole argument from earlier, which completes the proof.

□

Theorem 3 (Wang’s Divergence Criterion): For a tag system \(T = (v, \Sigma, \sigma)\), if the length of the appendant for all symbols is greater than \(v\), then \(T\) halts on all strings shorter than \(v\) and diverges on all others. Formally,

\[(\forall a \in \Sigma, v < \left|\sigma(a)\right|) \rightarrow \forall A \in \Sigma^\ast, \neg halted(T, A) \rightarrow diverges(T, A)\]

Proof: Suppose that \(v < \left|\sigma(a)\right|\) for any symbol \(a\), and that \(T\) hasn’t halted on \(A\). Now, note that for any \(A'\) on which \(T\) hasn’t halted (including \(A\)), we have that

\[next(T, A') = drop (A' \, \texttt{++} \, \sigma(head(A')), v)\]

And so

Next, after any number of steps \(i\), we need to find another number of steps \(j > i\) such that we have an even longer string. \(j = i + 1\) is such a choice: clearly \(j > i\), and taking \(A' = next^i(T, A)\) we have that \(\left|next^j(T, A)\right| > \left|next^i(T, A)\right|\).

□

Which tag systems does this let us decide?

At this point, we have a few tools in our tool belt for solving the problem of tag: using Wang’s criteria, we already have an answer for any tag system where all of the rules put in less than \(v\) marbles, exactly \(v\) marbles, or more than \(v\) marbles. We also have the algorithm from earlier, where we can run any non-diverging tag system until it either halts or repeats a string; together with Wang’s criteria, we can see that a tag system never diverges if all of its rules put in no more than \(v\) marbles, which allows us to solve the problem of tag for an even larger class of tag systems.

Two useful complexity measures for tag systems used in the literature are the deletion number, \(v\), and alphabet size, \(\mu = \left|\Sigma\right|\). These classes are denoted \(\mathrm{TS}(\mu, v)\). Intuitively, the higher the deletion number, the more “state information” you can embed in the string, and similarly the more symbols in the alphabet (or colors of marbles if you prefer), the more expressive the transition function becomes. Therefore, one should expect that as these numbers get higher, the more computationally powerful that class of tag systems are. For example, Wang’s decidability criteria alone are sufficient to solve the problem of tag for any tag system in \(\mathrm{TS}(1, v)\):

Theorem 4 (Decidability of \(\mathrm{TS}(1, v))\): For a tag system \(T = (v, \left\{a_1\right\}, \sigma)\), the problem of tag is solvable.

Proof: Note that we have exactly one symbol in the alphabet, so we trivially have that \(\forall a \in \Sigma, \sigma(a) = \sigma(a_1)\). Let \(L \coloneq \left|\sigma(a_1)\right|\). Now we have three cases:

Case 1: suppose \(L < v\). By theorem 1, we have \(halts(T, A)\).

Case 2: suppose \(L = v\). By theorem 2, we have \(loops(T, A)\).

Case 3: suppose \(L > v\). By theorem 3, we have \(diverges(T, A)\).

□