## 1 Introduction

Multiple context-free languages (MCFLs) form a class of languages which contains the context-free languages and is contained in the context-sensitive languages. MCFLs were introduced to better model natural languages, for which it had been shown that context-free languages do not provide enough expressive power [9]. MCFLs allow some cross-serial dependencies found in natural languages such as Swiss German; for nice examples, see [10]. They share several properties with context-free languages. Indeed, they form a cone of languages, they are semilinear, they are not closed under intersection and they satisfy some form of pumping lemma [12]. MCFLs also have some useful decidability properties; for instance, one can decide membership in polynomial time [12].

Given a presentation for a group *G*, it is natural to ask whether two words represent the same element of *G*.
Using the elementary fact that two words represent the same element if and only if the concatenation of one with the inverse of the other represents the identity, one can encode this question as a formal language, the *word problem* of *G* (the set of words representing the identity), and study it via language-theoretic instruments.
A remarkable result of Muller and Schupp [8], which relies on results of Stallings and Dunwoody [13, 3], shows that the class of groups that have context-free word problem coincides with the class of virtually free groups.

With a complete classification of groups whose word problem is context-free, it is natural to look at larger classes.
We will be interested in the class of multiple context-free languages (MCFLs); we will give a rigorous definition of this class in due course.
The class was first studied in [12].
The class of MCFLs is strictly larger than the class of CF languages.
For example, the language $\{{a}^{n}{b}^{n}{c}^{n}\mid n\in \mathbb{N}\}$ is multiple context-free but not context-free.

It is natural then to ask what are the closure properties of this class. It is shown in [12] that the class is closed under finite extensions and taking finitely generated subgroups. It is shown in [4] that the class is not closed under direct products.

In this paper, we prove the following result.

*Let G be the fundamental group of a finite graph of groups.
Assume that all the vertex groups have multiple context-free word problem and all the edge groups are finite.
Then G has multiple context-free word problem.*

Since the class of groups with regular word problem coincides with the class of finite groups, one could rephrase this result as saying that the class of MCF groups is closed under amalgamation over groups with regular word problem.
This result is no longer true when regular groups are replaced by CF groups; see [4].

## 2 Background

We are interested in the study of formal languages. In this section, we will give an introduction to formal languages and MCFLs. For a more comprehensive treatment, we refer to [6].

Given a finite set Σ, we denote by ${\Sigma}^{*}$ the *free monoid over Σ*, i.e., the set of all finite words in Σ with the concatenation operation.
We will denote by ε the trivial element of ${\Sigma}^{*}$, the empty word.

Given a finite set Σ, we say that a subset of ${\Sigma}^{*}$ is a *language* over Σ.

Since the definition of language is very broad, we will restrict our attention to languages that have a nice description.
The reader should think of this as the same meta-distinction that exists between arbitrary functions and continuous ones.

Hence we want to prescribe a general recipe that will allow us to produce languages.

### Chomsky grammars and hierarchy

A *Chomsky grammar* *G* is a tuple $(\Sigma ,N,S,\delta )$, where Σ and *N* are (disjoint) finite sets, $S\in N$, and δ is a finite set of pairs, written $x\to y$, where $x,y\in {\left(\Sigma \cup N\right)}^{*}$ and *x* contains at least one symbol of *N*.
We call Σ the set of *terminals* of *G*, *N* the set of *non-terminals*, *S* the *starting symbol* and δ the *production rules*.

We will often use the following conventions:
the elements of Σ will be denoted by lowercase letters (e.g., *a*, *b*, *c*) and the elements of *N* by uppercase letters (e.g., *A*, *B*, *S*).

Given a grammar $G=(\Sigma ,N,S,\delta )$, we define the set of *derivable* words in ${\left(\Sigma \cup N\right)}^{*}$ as follows.

- •*S* is derivable.
- •For $u,v,w\in {\left(\Sigma \cup N\right)}^{*}$, if *uvw* is derivable and the rule $v\to x$ is an element of δ, then *uxw* is derivable. In particular, we say that *uxw* is *derivable* from *uvw*.

We say that a sequence of words, each derivable from the previous one, starting at *S* and ending at a word *w*, is a *derivation* (for *w*). The *language associated* to the grammar *G*, denoted $L\left(G\right)$, is the intersection of the set of derivable words with ${\Sigma}^{*}$.

Let *G* be the grammar with terminals $\Sigma =\{a,b,c\}$, non-terminals $N=\{S,A,B\}$, starting symbol *S* and the following production rules:

- •${\tau}_{1}:S\to AB$,
- •${\tau}_{2}:A\to aAb$,
- •${\tau}_{3}:B\to ABc$,
- •${\tau}_{4}:A\to \epsilon $,
- •${\tau}_{5}:B\to \epsilon $.

To generate a word in the language, we start with the word *S*.
The only rule we can apply at the first step is ${\tau}_{1}$, which produces the word *AB*.
Then we can substitute *A* with *aAb*, using rule ${\tau}_{2}$, obtaining *aAbB*.
Applying ${\tau}_{2}$ *k* more times gives ${a}^{k+1}A{b}^{k+1}B$; applying ${\tau}_{4}$ and ${\tau}_{5}$ then yields the word ${a}^{k+1}{b}^{k+1}$.

We now give a classification of some grammars.

A Chomsky grammar $G=(\Sigma ,N,S,\delta )$ is called:

- •*regular* if all the elements of δ have the form $X\to wY$, where $X\in N$, $Y\in N\cup \left\{\epsilon \right\}$ and $w\in {\Sigma}^{*}$;
- •*context-free* if all the elements of δ have the form $X\to w$, where $X\in N$ and $w\in {\left(\Sigma \cup N\right)}^{*}$;
- •*unrestricted* otherwise.

The language $L\left(G\right)$ is *regular* (respectively, *context-free* or *recursively enumerable*) if *G* is regular (respectively, context-free or unrestricted).

The intuitive idea that one should have about the above definition is the following: a derivation in a regular language consists of substituting the last letter of a word with a new string of letters. A derivation in a context-free language consists of substituting a single letter (but not necessarily the last one) of a word with a new string of letters. The last case covers all other possibilities.
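To make derivations concrete, here is a minimal sketch in Python; the grammar used ($S\to aSb$ and $S\to \epsilon $, generating $\{{a}^{n}{b}^{n}\}$) is a standard illustration of our own choosing, not the example grammar above.

```python
# Minimal sketch of derivations in a context-free grammar.
# The grammar S -> aSb | epsilon is our own illustrative choice; it
# generates the language { a^n b^n : n >= 0 }.

RULES = [("S", "aSb"), ("S", "")]  # production rules X -> w

def step(word):
    """All words derivable from `word` in one step: replace one
    occurrence of a non-terminal X with w, for some rule X -> w."""
    out = set()
    for lhs, rhs in RULES:
        for i, letter in enumerate(word):
            if letter == lhs:
                out.add(word[:i] + rhs + word[i + 1:])
    return out

def derivable(max_steps):
    """All words derivable from S in at most `max_steps` steps."""
    seen, frontier = {"S"}, {"S"}
    for _ in range(max_steps):
        frontier = set().union(*(step(w) for w in frontier)) - seen
        seen |= frontier
    return seen

# The associated language: derivable words containing only terminals.
language = {w for w in derivable(6) if "S" not in w}
print(sorted(language, key=len))
# ['', 'ab', 'aabb', 'aaabbb', 'aaaabbbb', 'aaaaabbbbb']
```

Restricting the rules to the form $X\to wY$ would give a regular grammar, matching the intuition that a regular derivation only rewrites the final non-terminal.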

The gap between being context-free and being recursively enumerable seems (and in fact is) very big. The class of multiple context-free languages (MCFLs) that we are going to describe is one of the classes that properly lives in this gap, namely, properly contains context-free languages, and is properly contained in the class of recursively enumerable languages [12].

As before, we are going to describe a grammar that defines the class of MCFLs.
It should be noted that this will not be a Chomsky grammar.
We start with the definition of *linear rewriting function*.
The idea is very simple, but the definition may look a bit convoluted.
Intuitively, a linear rewriting function is a function that “pastes words together”, possibly adding some string of letters.
For instance, if $f({x}_{1},{x}_{2})=a{x}_{1}b{x}_{2}$, then $f(u,v)=aubv$ for all words *u* and *v*.

Fix a finite alphabet Σ, and let $X=\{{x}_{1},\dots ,{x}_{n}\}$ be a finite set of variables. A *rewriting* on the variables *X* is a word $w\in {\left(\Sigma \cup X\right)}^{*}$. A rewriting *w* is *linear* if each element of *X* occurs at most once.

Given a rewriting *w*, we can associate to it the function ${f}_{w}:{\left({\Sigma}^{*}\right)}^{n}\to {\Sigma}^{*}$ that sends $({u}_{1},\dots ,{u}_{n})$ to the word obtained by substituting in *w* each occurrence of the variable ${x}_{i}$ with the word ${u}_{i}$.
A rewriting function is *linear* if it comes from a linear rewriting.

We say that a function $f:{\left({\Sigma}^{*}\right)}^{n}\to {\left({\Sigma}^{*}\right)}^{m}$ is a *(multiple) rewriting function* if it is a rewriting function in each component.
A (multiple) rewriting function coming from rewritings ${w}_{1},\dots ,{w}_{m}$ is *linear* if each variable occurs at most once in the concatenation ${w}_{1}\cdots {w}_{m}$.

Note that being linear in each component is not enough for a multiple rewriting function to be linear.
In fact, the whole word ${w}_{1}\cdots {w}_{m}$ has to be a linear rewriting: for instance, the function ${x}_{1}\mapsto ({x}_{1},{x}_{1})$ is linear in each component but is not a linear rewriting function.
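A small sketch of these definitions, under an encoding of our own (a rewriting is a list whose entries are letters or integer variable indices):

```python
# Sketch: rewritings and their associated rewriting functions.
# Encoding (ours): a rewriting is a list of single-character letters and
# integer variable indices, e.g. ["a", 0, "b", 1] stands for a x1 b x2.

def rewriting_function(template):
    """Return the function substituting input words for the variables."""
    def f(*words):
        return "".join(t if isinstance(t, str) else words[t] for t in template)
    return f

def is_linear(*templates):
    """A multiple rewriting is linear if each variable occurs at most
    once across all components together."""
    used = [t for template in templates for t in template if isinstance(t, int)]
    return len(used) == len(set(used))

f = rewriting_function(["a", 0, "b", 1])  # f(u, v) = a u b v
print(f("xx", "y"))                       # axxby

# Linear in each component, yet not linear as a multiple rewriting:
# the pair (x1, x1) uses the variable x1 in both components.
print(is_linear([0], [1]))                # True
print(is_linear([0], [0]))                # False
```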

A *stratified set* is a set *N* equipped with a function $\parallel \cdot \parallel :N\to \mathbb{N}$, called the *dimension*.

A *multiple context-free grammar* (MCFG) on an alphabet Σ is a tuple $(\Sigma ,N,S,F)$, where:

- •Σ is a finite set of *terminals*.
- •*N* is a finite stratified set of *non-terminals*.
- •$S\in N$ is the *starting symbol*, such that $\parallel S\parallel =1$.
- •*F* is a finite set of elements of the form $(A,f,{B}_{1},\dots ,{B}_{s})$, where $A,{B}_{1},\dots ,{B}_{s}$ are elements of *N* and $f:{\left({\Sigma}^{*}\right)}^{\parallel {B}_{1}\parallel +\dots +\parallel {B}_{s}\parallel}\to {\left({\Sigma}^{*}\right)}^{\parallel A\parallel}$ is a linear rewriting function.

Given an element $(A,f,{B}_{1},\dots ,{B}_{s})$ of *F*, we will denote it by $A\to f({B}_{1},\dots ,{B}_{s})$. We say that the grammar is *k*-MCF if $\parallel A\parallel \le k$ for every $A\in N$.

As in the case of Chomsky grammars, given an MCFG *H*, we want to associate a language to it.

Let $H=(\Sigma ,N,S,F)$ be an MCFG. For each $A\in N$, we define the set ${D}_{H}\left(A\right)\subseteq {\left({\Sigma}^{*}\right)}^{\parallel A\parallel}$ of *derivable tuples* inductively as follows:

- •if $\tau =A\to f\left(\epsilon \right)$, then $f\left(\epsilon \right)\in {D}_{H}\left(A\right)$;
- •if $\tau =A\to f({B}_{1},\dots ,{B}_{s})$ and ${y}_{1}\in {D}_{H}\left({B}_{1}\right),\dots ,{y}_{s}\in {D}_{H}\left({B}_{s}\right)$, then $f({y}_{1},\dots ,{y}_{s})\in {D}_{H}\left(A\right)$.

For an MCFG *H*, we define the *language associated to H* as $L\left(H\right)={D}_{H}\left(S\right)\subseteq {\Sigma}^{*}$. A language *L* is a *multiple context-free language* if there is an MCFG *H* such that $L=L\left(H\right)$; it is *k*-MCF if *H* can be chosen to be *k*-MCF.
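To illustrate the definitions, the following sketch (encoding ours) simulates a 2-MCFG for $\{{a}^{n}{b}^{n}{c}^{n}\}$: the non-terminal *A* has dimension 2 and derives the pairs $({a}^{n}{b}^{n},{c}^{n})$, and the starting rule concatenates the two components.

```python
# Sketch of a 2-MCFG for { a^n b^n c^n : n >= 0 }, multiple context-free
# but not context-free. Rules: A -> (eps, eps), A -> g(A) with
# g(x1, x2) = (a x1 b, x2 c), and S -> f(A) with f(x1, x2) = x1 x2.
# The direct simulation below is our own illustration.

def derive_A(n):
    """Element of D_H(A) obtained by applying g to (eps, eps) n times."""
    x1, x2 = "", ""
    for _ in range(n):
        x1, x2 = "a" + x1 + "b", x2 + "c"
    return x1, x2

def language(up_to):
    """An initial part of L(H) = D_H(S)."""
    return {"".join(derive_A(n)) for n in range(up_to)}

print(sorted(language(4), key=len))  # ['', 'abc', 'aabbcc', 'aaabbbccc']
```

The pair of components of *A* grows in lockstep, which is exactly the extra power a dimension-2 non-terminal provides over a context-free one.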

## 3 Grammars and automata

The goal of this section is to explain the relation between grammars and automata. In what follows, an automaton should be thought of as a “computer with limitations”, namely as a machine that can do some operations, but does not possess the power (usually memory) of a Turing machine. As in the case of grammars, an automaton is naturally associated to a language. The intuitive explanation for this is the following: an automaton is associated to an algorithm that, given a word, either “accepts” or “rejects” it. The language associated to an automaton is the set of all “accepted” words.

In what follows, we fix a finite alphabet Σ, and all the definitions are understood to be dependent on Σ.
Recall that a *partial function* $f:A\to B$ is a function defined on a subset of *A*, called the *domain* of *f*.

A *storage type* is a tuple $(C,P,F,{C}_{I})$, where *C* is a set, called the set of *storage configurations*; *P* is a subset of the power set of *C*, and the elements of *P* are called *predicates*; *F* is a set of partial functions $C\to C$, called *instructions*; and ${C}_{I}\subseteq C$ is the set of *initial configurations*.

An *automaton with storage* is a tuple $\mathcal{M}=(Q,\Sigma ,S,I,\delta )$, where *Q* is a finite set of *states*, Σ is a finite alphabet, $S=(C,P,F,{C}_{I})$ is a storage type, and *I* is a tuple $({q}_{I},{Q}_{F},{c}_{I})$ consisting of an *initial state* ${q}_{I}\in Q$, a set of *final states* ${Q}_{F}\subseteq Q$ and an *initial storage configuration* ${c}_{I}\in {C}_{I}$.
Finally, $\delta \subseteq Q\times (\Sigma \cup \left\{\epsilon \right\})\times P\times F\times Q$ is a finite set of *transitions*.

Given an automaton with storage $\mathcal{M}$, we define the *graph realisation of $\mathcal{M}$*, denoted by $\Gamma \left(\mathcal{M}\right)$, as follows.

- •The vertices of $\Gamma \left(\mathcal{M}\right)$ are the elements of $Q\times C$.
- •To each $\tau =({q}_{1},\sigma ,p,f,{q}_{2})\in \delta $, we associate an oriented edge between the pair $(({q}_{1},{c}_{1}),({q}_{2},{c}_{2}))$ if ${c}_{1}\in p$ and $f\left({c}_{1}\right)={c}_{2}$. In that case, the label of this edge is σ.

Note that *f* is a partial function, so such an edge exists only if ${c}_{1}$ lies in the domain of *f*.

Let Σ be an alphabet, and let $w\in {\Sigma}^{*}$. We say that a word ${w}^{\prime}\in {(\Sigma \cup \left\{\epsilon \right\})}^{*}$ is an *ε-expansion* of *w* if ${w}^{\prime}$ is obtained from *w* by inserting finitely many occurrences of ε.

Given an automaton with storage $\mathcal{M}$, a word *w* is in the language $L\left(\mathcal{M}\right)$ *accepted* by $\mathcal{M}$ if there is a path in $\Gamma \left(\mathcal{M}\right)$ from $({q}_{I},{c}_{I})$ to some $(q,c)$ with $q\in {Q}_{F}$ whose sequence of labels is an ε-expansion of *w*.

In order to improve the readability of the above definitions, we will provide a fairy tale example to clarify the role of the various entities above.

Imagine there is a group of children playing a treasure hunt in a town.
The town is finite (as towns tend to be), and each block of the town is one of the states *Q*.
The children possess an extremely bad memory, but luckily each of them is equipped with a book to write notes.
The set *C* consists of all possible books with all possible contents opened to any page.
The set *P* contains some description about the state of the book, for example “the set of all books open on a blank page” or “all books open to the 12th page”.

Now suppose that there is a voice guiding the game in order to help the children find the treasure and, in particular, every now and then it reads out loud some hint (a letter of the alphabet Σ).
The voice represents the word *w* in the alphabet.
When a hint (letter) is read, the children will perform an action, and the possible actions are encoded in the set δ.

At the start of the game, the children will all be in the central block of the city (the initial state in *Q*); they then listen to what the voice is saying (an element of Σ) and look at whether there is something written in the book (an element of *P*).
Then each child decides which strategy to apply on that turn (i.e., picks an element of δ) which is compatible with the information from *Q*, Σ and *P*.
Following such a strategy, they may change page or write something on the book (an element of *F*) and go to a new block (an element of *Q*) accordingly.
If, at any time, a child cannot perform an action, then he or she is disqualified from the game.
When the voice stops giving hints, each child will start digging exactly where they stand and see if a treasure is found.

If at least one child has found a treasure, then the instructions were correct (and hence the word *w* is accepted).

Let us start with some famous automata in order to familiarise ourselves with the above concepts.

A *trivial storage* is a storage type $(\left\{c\right\},\left\{\left\{c\right\}\right\},\left\{\mathrm{Id}\right\},\left\{c\right\})$ with a single storage configuration *c*.

A *finite state automaton* (FSA) is an automaton with storage with trivial storage.

It is a very easy exercise to see that an FSA is completely described by a finite oriented graph with edges labelled by elements of Σ (and nothing more, since the trivial storage carries no information).

The following theorem forms a bridge between languages associated to grammars, and languages accepted by automata.

*For a language L, the following are equivalent:*

- •*L is associated to a regular grammar;*
- •*L is accepted by an FSA.*
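As a concrete illustration of an FSA, consider the following sketch (the parity automaton is our own example, not one from the text): since the storage is trivial, a state and a letter determine the next state, and the automaton is just a labelled graph walked by the input word.

```python
# Minimal FSA sketch (our own example): accepts words over {a, b}
# containing an even number of a's. The storage is trivial, so a state
# and a letter determine the next state.

DELTA = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}

def accepts(word, initial="even", final=("even",)):
    state = initial
    for letter in word:
        if (state, letter) not in DELTA:
            return False            # no applicable transition: reject
        state = DELTA[(state, letter)]
    return state in final           # accept iff we end in a final state

print(accepts("abba"))  # True  (two a's)
print(accepts("ab"))    # False (one a)
```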

A *push-down storage* over a finite alphabet Ω is a storage type $(C,P,F,{C}_{I})$, where:

- •$C={\Omega}^{*}$.
- •We define the set $\mathtt{equals}\left(\omega \right)$ as the set of words in ${\Omega}^{*}$ that end with ω (note that $\mathtt{equals}\left(\epsilon \right)$ is the set $\left\{\epsilon \right\}$). Then $P=\{\mathtt{equals}\left(\omega \right)\mid \omega \in \Omega \cup \left\{\epsilon \right\}\}$.
- •We define the function $\mathtt{push}\left(\omega \right):{\Omega}^{*}\to {\Omega}^{*}$ that sends *x* to $x\omega $. Furthermore, we define a partial function ${\mathtt{pop}}_{\omega}:\mathtt{equals}\left(\omega \right)\to {\Omega}^{*}$ that sends $x\omega $ to *x*. Then $F=\left\{\mathrm{Id}\right\}\cup \{{\mathtt{pop}}_{\omega},\mathtt{push}\left(\omega \right)\mid \omega \in \Omega \}$.
- •${C}_{I}=\left\{\epsilon \right\}$.

The intuitive idea behind the push-down storage is to have a stack of papers that can grow arbitrarily large, but the automaton can read only what is written on the top-most paper.
This corresponds to the predicates $\mathtt{equals}\left(\omega \right)$.

A *push-down automaton* is an automaton with storage with push-down storage.

*For a language L, the following are equivalent:*

- •*L is associated to a context-free grammar;*
- •*L is accepted by a push-down automaton.*
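The following sketch (our own example) mirrors the push-down storage just defined: the configuration is a word over $\Omega =\{\ast \}$, and the automaton accepts $\{{a}^{n}{b}^{n}\}$, a context-free language that no FSA accepts.

```python
# Sketch of an automaton with push-down storage (our own example)
# accepting { a^n b^n : n >= 0 }. The storage configuration is a word
# over Omega = {"*"}; the moves below mirror push(*) and pop_*.

def accepts(word):
    state, stack = "reading_a", ""       # initial state, empty stack
    for letter in word:
        if state == "reading_a" and letter == "a":
            stack += "*"                  # push(*)
        elif letter == "b" and stack.endswith("*"):
            state = "reading_b"           # from now on, only b's
            stack = stack[:-1]            # pop_*: defined only on equals(*)
        else:
            return False                  # no applicable transition
    return stack == ""                    # accept with empty stack

print(accepts("aaabbb"))  # True
print(accepts("aabbb"))   # False
```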

We now want to describe the last automaton we are interested in, namely, the tree-stack automaton.

Let *S* be a set.
Given $u,v\in {S}^{*}$, we say that *u* is a *prefix* of *uv*.
Given a set $D\subseteq {S}^{*}$, we say that *D* is *prefix-closed* if, for each word $w\in D$, all the prefixes of *w* are in *D*.
Similarly, we say that *v* is a *suffix* of *uv*.

Given an alphabet Ω, an *Ω-tree* is a partial function $T:{\mathbb{N}}^{*}\to \Omega \cup \{◆\}$ whose domain is prefix-closed and such that $T\left(\epsilon \right)=◆$ and $T\left(p\right)\in \Omega $ for every $p\ne \epsilon $ in the domain.

Note that this corresponds to a rooted tree in the usual graph-theory sense, where each edge is labelled by a natural number, the root is labelled by the symbol ◆ and every other vertex is labelled by an element of Ω.

An *Ω-tree with a pointer* is a pair $(T,p)$, where *T* is an Ω-tree and $p\in \mathrm{domain}\left(T\right)$.

One should think of the pointer as a selected vertex of the tree. Figure 1 may provide some clarification.

Given a partial function *T* and an element *c* not in its domain, we denote by $T[c\mapsto x]$ the partial function that coincides with *T* on its domain and, additionally, sends *c* to *x*.

A *tree-stack storage* over a finite alphabet Ω is a storage type $(C,P,F,{C}_{I})$, where:

- •$C=\{(T,p)\mid (T,p)$ is an Ω-tree with a pointer$\}$.
- •For $\omega \in \Omega \cup \{◆\}$, we set $\mathtt{equals}\left(\omega \right)=\{(T,p)\in C\mid T\left(p\right)=\omega \}$ and $\mathtt{notequals}\left(\omega \right)=\{(T,p)\in C\mid T\left(p\right)\ne \omega \}$. Then $P=\{\mathtt{equals}\left(\omega \right),\mathtt{notequals}\left(\omega \right)\mid \omega \in \Omega \cup \{◆\}\}\cup \left\{C\right\}$.
- •For $n\in \mathbb{N}$ and $\gamma \in \Omega $, we define the following partial functions:
  **–** ${\mathtt{push}}_{n}\left(\gamma \right):\{(T,p)\mid pn\notin \mathrm{domain}\left(T\right)\}\to C$ as the map $(T,p)\mapsto (T[pn\mapsto \gamma ],pn)$,
  **–** ${\mathtt{up}}_{n}:\{(T,p)\mid pn\in \mathrm{domain}\left(T\right)\}\to C$ as the map $(T,p)\mapsto (T,pn)$,
  **–** $\mathtt{down}:C-\mathtt{equals}\left(◆\right)\to C$ as the map that sends $(T,pm)\mapsto (T,p)$, for $m\in \mathbb{N}$,
  **–** ${\mathtt{set}}_{\gamma}:C-\mathtt{equals}\left(◆\right)\to C$ as the map that sends $(T,p)$ to $({T}^{\prime},p)$, where ${T}^{\prime}$ is obtained from *T* by changing the value at *p* to γ.

  Then $F=\{\mathrm{Id},{\mathtt{push}}_{n}\left(\gamma \right),{\mathtt{up}}_{n},\mathtt{down},{\mathtt{set}}_{\gamma}\mid \gamma \in \Omega ,n\in \mathbb{N}\}$.
- •${C}_{I}=\left\{(\epsilon \mapsto ◆,\epsilon )\right\}$.

One should note that the command ${\mathtt{up}}_{n}$ can be performed only if there is an edge labelled *n* emanating from the vertex *p*, while ${\mathtt{push}}_{n}\left(\gamma \right)$ requires that there is no such edge.
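The four instructions can be sketched directly; in the encoding below (ours), a tree is a dictionary from addresses in ${\mathbb{N}}^{*}$ (tuples) to labels, with the root labelled ◆.

```python
# Sketch of the tree-stack instructions (encoding ours): a tree is a
# dict from addresses in N* (tuples of ints) to labels; the pointer is
# an address in the domain; the root is labelled "◆".

ROOT = "◆"

def initial():
    return {(): ROOT}, ()                 # (epsilon -> ◆, epsilon)

def push(T, p, n, gamma):                 # push_n(gamma)
    assert p + (n,) not in T              # only if the edge does not exist
    T = dict(T); T[p + (n,)] = gamma
    return T, p + (n,)

def up(T, p, n):                          # up_n
    assert p + (n,) in T                  # only if the edge already exists
    return T, p + (n,)

def down(T, p):                           # down: undefined at the root
    assert T[p] != ROOT
    return T, p[:-1]

def set_label(T, p, gamma):               # set_gamma: the root keeps ◆
    assert T[p] != ROOT
    T = dict(T); T[p] = gamma
    return T, p

T, p = initial()
T, p = push(T, p, 0, "x")                 # grow an edge labelled 0
T, p = down(T, p)                         # back to the root
T, p = up(T, p, 0)                        # revisit the same vertex
print(T[p])                               # x
```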

For a subset *F* of Ω, we will write $\mathtt{equals}\left(F\right)={\bigcup}_{\omega \in F}\mathtt{equals}\left(\omega \right)$.

A *tree-stack automaton* is an automaton with storage with tree-stack storage.

We say that a tree-stack automaton is *k-restricted* if, for any accepted run and any vertex *p* of the resulting tree-stack, the run contains at most *k* edges of the form ${\mathtt{push}}_{n}\left(\gamma \right)$ or ${\mathtt{up}}_{n}$ moving the pointer onto *p*.

Intuitively, Definition 3.17 states that every vertex in the tree-stack can be accessed from below a uniformly finite number of times. We will see in Lemma 4.3 that this is equivalent to the fact that each vertex in the tree-stack is only accessed for a uniformly bounded amount of time.

*For a language L, the following are equivalent:*

- •*L is associated to a k-MCFG;*
- •*L is accepted by a k-restricted tree-stack automaton.*

A tree-stack automaton is *cycle-free* if, for every non-trivial loop in the graph realisation, at least one edge of the loop is labelled by an element of Σ.

*Given a k-restricted tree-stack automaton $\mathcal{M}$, there is a k-restricted cycle-free tree-stack automaton ${\mathcal{M}}^{\prime}$ such that $L\left({\mathcal{M}}^{\prime}\right)=L\left(\mathcal{M}\right)$.*

It is true that a 1-restricted tree-stack automaton is equivalent to a push-down automaton. It is tempting to think that this equivalence can be realised just taking the stack of the push-down automaton as the tree-stack, using push each time a push command is issued and down each time a pop command is issued. However, this cannot be done with finitely many branches at each vertex.

We may try and subvert this issue by replacing subsequent push commands with up commands; however, this will quickly fail to be 1-restricted (it will in fact not be *k*-restricted for any *k*).

We include here a method for associating a tree-stack automaton to a push-down automaton.
We mimic the pop command as follows.
Observe that there are no up commands in the automaton.
This means that if we are in a vertex *v* and a command down is performed, the whole branch above *v* is no longer accessible, mimicking the fact that the symbols of that branch were removed from the stack.
To make sure that this does not cause loss of information, all the down commands performed at vertices labelled by a letter of the alphabet Ω correspond to pop commands of the original automaton.

If one collapses all the edges labelled with a 0, we arrive at the tree one would get if each push command added a new edge at a vertex and each pop command moved down in the tree.

This requires us to open a new branch of the tree each time a new push command is issued.
This is the purpose of the auxiliary states $({\square}_{\omega},q)$ and of the edges labelled 0 in the construction below.

Let $\mathcal{M}=(Q,\Sigma ,S,I,\delta )$ be a push-down automaton over the alphabet Ω.

We define a tree-stack automaton ${\mathcal{M}}^{\prime}=({Q}^{\prime},\Sigma ,{T}^{\prime},{I}^{\prime},{\delta}^{\prime})$ as follows.

- •For each element $\omega \in \Omega $, let ${\square}_{\omega}$ be an extra symbol. Then ${Q}^{\prime}=Q\cup \{({\square}_{\omega},q)\mid \omega \in \Omega ,q\in Q\}$.
- •${T}^{\prime}$ is the tree-stack storage with respect to an ${\Omega}^{\prime}$-tree, where ${\Omega}^{\prime}=\Omega \cup \{{\star}_{\omega}\mid \omega \in \Omega \}$.
- •The initial and final states of ${I}^{\prime}$ are the same as those of *I* (because $Q\subseteq {Q}^{\prime}$).
- •${\delta}^{\prime}$ will be the set containing the following instructions:
  - (a)For each rule $\tau =({q}_{1},\sigma ,p,f,{q}_{2})\in \delta $, there is a corresponding rule ${\tau}^{\prime}=({q}_{1},\sigma ,{p}^{\prime},{f}^{\prime},{q}_{2}^{\prime})\in {\delta}^{\prime}$ as follows. If $p=\mathtt{equals}\left(\omega \right)$, then ${p}^{\prime}=\mathtt{equals}(\{\omega ,{\star}_{\omega}\})$ (note that those predicates have the same names, but are subsets of different power sets). Similarly, if *p* represents the whole set of configurations of the push-down storage, then ${p}^{\prime}$ will represent the whole set of configurations of the tree-stack storage. If $f=\mathtt{push}\left(\omega \right)$, then ${f}^{\prime}={\mathtt{push}}_{0}\left({\star}_{\omega}\right)$ and ${q}_{2}^{\prime}=({\square}_{\omega},{q}_{2})$. If $f={\mathtt{pop}}_{\omega}$, then ${f}^{\prime}=\mathtt{down}$ and ${q}_{2}^{\prime}={q}_{2}$.
  - (b)For the state $({\square}_{\omega},q)$, we have the instruction $(({\square}_{\omega},q),\epsilon ,C,{\mathtt{push}}_{1}\left(\omega \right),q)$.
  - (c)For every $q\in Q$ and $\omega \in \Omega $, we have the instruction $(q,\epsilon ,\mathtt{equals}\left({\star}_{\omega}\right),\mathtt{down},q)$.


We also include an application of this example to give a tree-stack automaton which recognises the word problem in $\mathbb{Z}$.

Define a tree-stack automaton as follows.
It is obtained by applying the above construction to the evident push-down automaton for the word problem of $\mathbb{Z}$, which pushes and pops a single stack symbol to keep track of the surplus of one generator over the other; thus *T* is the tree-stack storage over the corresponding alphabet, and *S* is the start state, with the empty stack as the start stack.
The final states are those of the original push-down automaton.

This automaton accepts words that contain an equal number of the letter *t* and the letter *T*, which coincides with the word problem in $\mathbb{Z}=\langle t\rangle $, where *T* represents ${t}^{-1}$.
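Since the tree-stack in this example only ever records the surplus of *t*'s over *T*'s, a minimal sketch (ours) can track just the signed distance of the pointer from the root:

```python
# Sketch (ours) of the acceptance condition for the word problem of
# Z = <t>, with T standing for t^{-1}. The tree-stack here degenerates
# to a path, so we only track the signed depth of the pointer.

def accepts(word):
    depth = 0                  # signed distance of the pointer from the root
    for letter in word:
        if letter == "t":
            depth += 1         # move away from the root (push/up)
        elif letter == "T":
            depth -= 1         # move towards the root, or past it onto
                               # the opposite branch
        else:
            return False       # not a letter of the generating set
    return depth == 0          # accept iff the pointer is back at the root

print(accepts("ttTT"))  # True
print(accepts("tTt"))   # False
```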

For some explicit examples of 2-restricted tree-stack automata, see [2, Examples 3.2 and 3.3].

## 4 Closure under free products

In this section, we prove that the class of groups whose word problem is multiple context-free is closed under free products.
To do this, we will show that, given tree-stack automata accepting the word problems of ${G}_{1}$ and ${G}_{2}$, we can combine them into a tree-stack automaton accepting the word problem of ${G}_{1}*{G}_{2}$. We first record some auxiliary lemmas.

*Let $\mathcal{M}$ be a tree-stack automaton accepting the language L.
Then there exists a tree-stack automaton ${\mathcal{M}}^{\prime}$ such that $L\left({\mathcal{M}}^{\prime}\right)=L$ and ${\mathcal{M}}^{\prime}$ accepts a non-empty word only if the tree-stack storage is in the configuration $(T,\epsilon )$ for some Ω-tree T.*

We build a new automaton which accepts the same language as follows.
Add two extra states ${q}_{\mathrm{down}}$ and ${q}_{F}$: from every original accept state, there is an ε-transition to ${q}_{\mathrm{down}}$; at ${q}_{\mathrm{down}}$, there are ε-transitions performing the command down as long as the pointer is not at the root, and an ε-transition to ${q}_{F}$ once the pointer is at the root.

We change the set of accept states to $\left\{{q}_{F}\right\}$.

It will also be useful to know that the amount of time spent at any vertex in the tree-stack is uniformly bounded.

A *run* in a tree-stack automaton is a path in the graph realisation.
This can be seen as a valid sequence of instructions.
An *accepted run* is a run which ends in an accept state.

*If M is a k-restricted cycle-free tree-stack automaton, then there is an n such that, for each run and each vertex p of the tree-stack, the run visits configurations with pointer p at most n times.*

Consider the two possibilities for entering a configuration whose pointer is a given vertex *p*, where *p* is fixed and the state *q* and the tree *T* may vary.
Either we have an edge performing a down command from a vertex *pl* for some *l*, or we have an edge performing an up or push command from the vertex below.
There are only *k* possibilities of the second kind since the automaton is *k*-restricted.

In the first kind, there must previously have been an edge entering the vertex *pl* from below, and there are at most *k* such edges by *k*-restrictedness.
Since δ is finite, there can only be a finite number of instructions that contain a push command.
Therefore, there is a bounded number of choices for the label *l* and, since the automaton is cycle-free, the pointer can stay at *p* only for a bounded number of consecutive steps.

We will not require the exact bound; however, it can be calculated, and it is linear in *k*.

*If the word problems of ${G}_{1}$ and ${G}_{2}$ are multiple context-free, then the word problem of ${G}_{1}*{G}_{2}$ is multiple context-free.*

Let *W* be the word problem in ${G}_{1}*{G}_{2}$, and let ${\mathcal{M}}_{1}$ and ${\mathcal{M}}_{2}$ be tree-stack automata accepting the word problems of ${G}_{1}$ and ${G}_{2}$, respectively.

We will assume that these automata are *k*-restricted, cycle-free and accept a word if and only if the stack pointer is at the root.
Let *n* be the maximum of the two bounds obtained from Lemma 4.3 applied to ${\mathcal{M}}_{1}$ and ${\mathcal{M}}_{2}$.

We now define the automaton $\mathcal{M}$ accepting *W*.
The automaton is depicted in Figure 2.

The states of $\mathcal{M}$ are the union of the states of ${\mathcal{M}}_{1}$ and ${\mathcal{M}}_{2}$ together with a start state *S* and a final state *F*; the storage *T* is the set of tree-stacks on the alphabet ${\Omega}_{1}\cup {\Omega}_{2}\cup \{(q,{\square}_{i})\mid q\in Q,i\in \{1,2\}\}$; the start state is *S*, with empty initial tree, and the final state is *F*.
The transitions include

- •${\mathcal{D}}_{=}=\{({q}_{1},\sigma ,\mathtt{equals}\left(◆\right),f,{q}_{2})\in {\delta}_{1}\cup {\delta}_{2}\}$,
- •${\mathcal{D}}_{\ne}=\{({q}_{1},\sigma ,\mathtt{notequals}\left(◆\right),f,{q}_{2})\in {\delta}_{1}\cup {\delta}_{2}\}$,
- •${\mathcal{S}}_{=}=\{({q}_{1},\sigma ,\mathtt{equals}\left((q,{\square}_{i})\right),f,{q}_{2})\mid ({q}_{1},\sigma ,\mathtt{equals}\left(◆\right),f,{q}_{2})\in {\mathcal{D}}_{=},q\in Q\}$,
- •${\mathcal{S}}_{\ne}=\{({q}_{1},\sigma ,\mathtt{notequals}\left((q,{\square}_{i})\right),f,{q}_{2})\mid ({q}_{1},\sigma ,\mathtt{notequals}\left(◆\right),f,{q}_{2})\in {\mathcal{D}}_{\ne},q\in Q\}$,

together with transitions that move between the two automata, opening and closing branches as described below.

The reader should note that $\mathcal{M}$ is *k*-restricted, since all of its commands come from the *k*-restricted automata ${\mathcal{M}}_{1}$ and ${\mathcal{M}}_{2}$.
We want to show that $L\left(\mathcal{M}\right)=W$.

The way the automaton above works is as follows.
We start with our word and move to one of the automata ${\mathcal{M}}_{i}$. If the next letter belongs to the alphabet of the current automaton, we process it from the current state *q*; otherwise, we open a new branch, recording $(q,{\square}_{i})$ at its base, and move to the other automaton.

An accepted run Λ of the automaton will have the pointer start and end at the root of the tree-stack.
Let Θ be an instruction of Λ.

For each instruction Θ, there are two possible pointers, namely before and after the instruction is performed; these can be viewed as vertices of the tree-stack. In particular, every vertex *v* of the tree-stack visited by Λ determines the instructions of Λ performed at *v*.

Using the above, Λ decomposes as a concatenation of subruns, each contained in one of the two automata, and the word *w* corresponding to the run Λ decomposes as a concatenation of subwords accordingly; if each subword *v* is an element of *W*, then so is *w*.

For the base case, note that if the run never leaves one of the automata ${\mathcal{M}}_{i}$, then the word is accepted by ${\mathcal{M}}_{i}$ and hence belongs to *W*.

For the other direction, we will use induction on the free product length of the word. The *free product length* of *w* is the minimal *p* such that $w={v}_{1}\cdots {v}_{p}$, where each ${v}_{j}$ is a word in the generators of one of the two factors.

If *w* has free product length *p* and is an element of *W*, then there is an *i* such that the subword ${v}_{i}$ represents the trivial element in its factor.

To make sure that we can do this process, we have to be able to push a new edge at the correct moment.
This may not be possible if we have already pushed *n* edges at this vertex.
However, we assumed that the automata are cycle-free, so, by Lemma 4.3, the number of instructions performed at each vertex is bounded, and we can always choose a fresh label in $\mathbb{N}$ for the new edge.

In fact, in the proof, we have shown a slightly stronger result.

*If the word problems of ${G}_{1}$ and ${G}_{2}$ are k-MCF, then the word problem of ${G}_{1}*{G}_{2}$ is k-MCF.*

It is clear from the proof of Theorem 4.4 that the automaton constructed is *k*-restricted, which proves the corollary.

## 5 Amalgamated free products

In this section, we generalise the previous result to show that the class of groups with multiple context-free word problem is closed under amalgamation over finite subgroups.

The idea is similar to the previous proof; there are however more details. We feel that the interested reader should understand the proof of Theorem 4.4, which encapsulates most of the details in an easier setting. The key idea is the following:

*Let G be a group with multiple context-free word problem.
Let H be a finite subset of G.
Then the set of words representing elements of H is a multiple context-free language.*

For each $h\in H$, fix a word ${r}_{h}$ representing it, and let $R=\{{r}_{h}\mid h\in H\}$; since *H* is a finite set, so is *R*.
Let *l* be the maximal length of a word in *R*.
Let $\mathcal{M}$ be a tree-stack automaton accepting the word problem of *G*, with start state ${S}_{0}$, where *T* is the set of tree-stacks over the alphabet Ω.
Assume that this automaton has been modified as in Lemma 4.1.

The idea is the following:
Let *w* be the input word.
We will build an automaton that will “guess” an element of *H*, say, *h*, and then proceed to process the word $w{r}_{h}^{-1}$; the suffix ${r}_{h}^{-1}$ is fed to $\mathcal{M}$ via ε-transitions, and the states keep track of the current state *q* and the first letter of the remaining part *v* of ${r}_{h}^{-1}$ (that is, the next letter to be processed).

More formally, we will build a new automaton whose states are pairs $(q,v)$, where *q* is a state of $\mathcal{M}$ and *v* is a suffix of a word in ${R}^{-1}$.

The automaton will have start state *S* and final states the pairs $(q,\epsilon )$ with *q* a final state of $\mathcal{M}$.

We stress once more that everything boils down to the fact that, given an automaton accepting a language *L* and a finite set *R* of words, one can build an automaton accepting the set of words *w* such that *wv* lies in *L* for some $v\in R$.

If *H* is a finite normal subgroup of *G*, then the word problem in $G/H$ is the set of words representing elements of *H*.
Thus we immediately get the following corollary.

*If G is a group with multiple context-free word problem and H is a finite normal subgroup of G, then $G/H$ has multiple context-free word problem.*

We recall the following result from [7].

*Let $G={G}_{1}{*}_{H}{G}_{2}$, and let ${c}_{1},\dots ,{c}_{n}$ be elements of G such that*

- (i)$n\ge 2$,
- (ii)*each* ${c}_{i}$ *is in one of the factors* ${G}_{1}$ *or* ${G}_{2}$,
- (iii)*the words* ${c}_{i}$, ${c}_{i+1}$ *come from different factors,*
- (iv)*no* ${c}_{i}$ *is in* *H*.

*Then the product ${c}_{1}{c}_{2}\cdots {c}_{n}$ is not trivial in G.*

With Proposition 5.1, we can prove our main theorem; as previously stated, the idea is similar to Theorem 4.4 with a few extra details.

*Let ${G}_{1}$ and ${G}_{2}$ be groups with multiple context-free word problem, and let H be a finite group embedding in ${G}_{1}$ and ${G}_{2}$. Then the word problem of ${G}_{1}{*}_{H}{G}_{2}$ is multiple context-free.*

The idea is the following:
Suppose that the word *w* represents the trivial element, and decompose *w* into (maximal) subwords each of which contains only generators of ${G}_{1}$ or only generators of ${G}_{2}$.
Theorem 5.3 gives an *i* such that the *i*-th subword represents an element of *H*.
Let *u* be the subword of *w* associated to *i*. The automaton will guess *i* and the element *h* of *H* that *u* really represents, verify the guess using Proposition 5.1, read *u* and proceed as if it had, instead, read a word *v* representing *h* in the other factor. This is possible because *H* is finite.

It is clear that the word *w* will be accepted if and only if the automaton accepts the word obtained from *w* by substituting *u* with *v*.
By induction on the length of the sequence of subwords, we can then conclude.

More formally, let *W* be the word problem in *G*.
We build an automaton similar to that of Theorem 4.4 accepting the language *W*.

The states of the automaton consist of the states of the automata for ${G}_{1}$ and ${G}_{2}$, a start state, a final state and auxiliary states used to perform the substitution described above.

The transitions will consist of the following:

Before explaining in detail the rules, there is one key and central observation.
If the automaton is in a state whose second variable is a non-empty word, then the only rules that can be applied are those of the groups (5.3) and (5.4) below.

In other words, if there is a non-empty word *w* in the second variable, the only possible rule that can be applied is one mimicking the behaviour of one of the original automata as if the first letter of *w* had been read: the priority is always to deplete the second variable of the states.

The elements of group (5.1) consist of the very final instruction and the two instructions that start processing letters in one of the two alphabets.

The elements of the group (5.2) consist of the second to last move in a run; they are triggered when the complete word has been read and the tree-stack is one step away from the root.

The elements of groups (5.3) and (5.4) consist of the same type of rules, with the roles of the two factors exchanged; they apply when there is a non-empty word *w* in the second variable.
What happens is that we effectively substitute the subword representing an element of *H* with the chosen word representing it in the other factor.

We will now give a precise proof of the theorem.
This automaton works similarly to the automaton in Theorem 4.4.
Let Λ be an accepted run for the automaton, and let *T* be the final tree-stack of Λ.

There is a subtree of *T* associated to each maximal subrun of Λ contained in one of the two original automata.

The run Λ decomposes as a concatenation of such subruns.

Since the tree *T* is finite, this decomposition is finite.

The original decomposition expresses *w* as a concatenation of subwords, each contained in one of the two factors.

It should be noted that the final tree for the run witnesses, by induction, that *w* belongs to *W*.

We must now prove that this automaton accepts all words in *W*.
We will use the free product length of a word once again.
Let *w* be a word of free product length *k*.
If this word represents the trivial word, then, by Theorem 5.3, there is a subword representing an element of *H*.
Let *u* be this subword, let *h* be the corresponding element of *H* and let *v* be a word representing *h* in the other factor.

The automaton will leave the automaton it is currently in at the start of *u*, verify the guess and process *v* instead of *u*.
The resulting word has smaller free product length; thus, by induction, *w* is in $L\left(\mathcal{M}\right)$.

## 6 HNN extensions and graphs of groups

The goal of this section is to prove Theorem 5.4 for HNN extensions with finite associated subgroups. We recall the definition of HNN extension.

Let *G* be a group.
Let ${H}_{1}$ and ${H}_{2}$ be subgroups of *G*, and let $\varphi :{H}_{1}\to {H}_{2}$ be an isomorphism. The *HNN extension* is the group $G{*}_{\varphi}$ given by the presentation $\langle G,t\mid th{t}^{-1}=\varphi \left(h\right)\text{ for all }h\in {H}_{1}\rangle $.

Our goal is to prove the following result.

*Let G be a finitely generated group whose word problem is multiple context-free.
Let ${H}_{1}$ and ${H}_{2}$ be finite subgroups of G, and let $\varphi :{H}_{1}\to {H}_{2}$ be an isomorphism. Then the word problem of $G{*}_{\varphi}$ is multiple context-free.*

The proof of Theorem 6.2 almost coincides with the proof in the case of the amalgamated product, modulo the following lemma.

*Consider a word $w={g}_{0}{t}^{{\epsilon}_{1}}{g}_{1}{t}^{{\epsilon}_{2}}{g}_{2}\cdots {t}^{{\epsilon}_{n}}{g}_{n}$, with ${g}_{i}\in G$ and ${\epsilon}_{i}\in \{1,-1\}$, representing the trivial element of $G{*}_{\varphi}$. Then*

- •*either* $n=0$ *and* ${g}_{0}=1$ *in* *G*,
- •*or* $n>0$ *and, for some* $i\in \{1,\dots ,n-1\}$*, one of the following holds:*
  - (a)${\epsilon}_{i}=1$ *and* ${\epsilon}_{i+1}=-1$ *and* ${g}_{i}\in {H}_{1}$,
  - (b)${\epsilon}_{i}=-1$ *and* ${\epsilon}_{i+1}=1$ *and* ${g}_{i}\in {H}_{2}$.


The proof here is similar to the proof of Theorem 5.4.
Instead of changing automaton when we change alphabet, we instead note that, each time we read a *t* or a ${t}^{-1}$, we open a new branch of the tree-stack; when a pinch ${t}^{{\epsilon}_{i}}{g}_{i}{t}^{{\epsilon}_{i+1}}$ as in Lemma 6.3 occurs, the automaton guesses the element ${g}_{i}$, verifies the guess using Proposition 5.1 and proceeds as if it had read a word representing $\varphi \left({g}_{i}\right)$ or ${\varphi}^{-1}\left({g}_{i}\right)$ instead.

We have now all the ingredients to prove Theorem A.

*Let G be the fundamental group of a finite graph of groups.
Assume that all the vertex groups have multiple context-free word problem and all the edge groups are finite.
Then G has multiple context-free word problem.*

By the structure theory of graphs of groups, *G* can be obtained from the vertex groups by finitely many amalgamated free products and HNN extensions over the (finite) edge groups. The result then follows by induction from Theorem 5.4 and Theorem 6.2.

We greatly thank Bob Gilman for introducing us to the subject and making this project possible. The second author would like to thank UC Berkeley for inviting him as a visiting scholar. The first author would like to thank Alessandro Sisto for inviting him to complete this work at the ETH. We thank the anonymous referee for helpful comments and suggestions, in particular, the addition of Section 6. Finally, we would like to thank Neil Fullarton for his invaluable work with a stapler.

## References

- [1] N. Chomsky, Context-free grammars and pushdown storage, Quart. Progress Rep. 65 (1962), 187–194.
- [2] T. Denkinger, An automata characterisation for multiple context-free languages, Developments in Language Theory, Lecture Notes in Comput. Sci. 9840, Springer, Berlin (2016), 138–150.
- [3] M. J. Dunwoody, The accessibility of finitely presented groups, Invent. Math. 81 (1985), no. 3, 449–457.
- [4] R. H. Gilman, R. P. Kropholler and S. Schleimer, Groups whose word problems are not semilinear, Groups Complex. Cryptol. 10 (2018), no. 2, 53–62.
- [5] M.-C. Ho, The word problem of $\mathbb{Z}^{n}$ is a multiple context-free language, Groups Complex. Cryptol. 10 (2018), no. 1, 9–15.
- [6] J. E. Hopcroft and J. D. Ullman, Formal Languages and Their Relation to Automata, Addison-Wesley, Reading, 1969.
- [7] R. C. Lyndon and P. E. Schupp, Combinatorial Group Theory, Classics Math., Springer, Berlin, 2001.
- [8] D. E. Muller and P. E. Schupp, Context-free languages, groups, the theory of ends, second-order logic, tiling problems, cellular automata, and vector addition systems, Bull. Amer. Math. Soc. (N. S.) 4 (1981), no. 3, 331–334.
- [9] C. Pollard, Generalized phrase structure grammars, head grammars, and natural language, Ph.D. thesis, Stanford University, 1984.
- [10] S. Salvati, Multiple context-free grammars. Course 1: Motivations and formal definition, 2011.
- [11] S. Salvati, MIX is a 2-MCFL and the word problem in $\mathbb{Z}^{2}$ is captured by the IO and the OI hierarchies, J. Comput. System Sci. 81 (2015), no. 7, 1252–1277.
- [12] H. Seki, T. Matsumura, M. Fujii and T. Kasami, On multiple context-free grammars, Theoret. Comput. Sci. 88 (1991), no. 2, 191–229.
- [13] J. Stallings, Group Theory and Three-dimensional Manifolds, Yale University Press, New Haven, 1971.