HYPER-REF: A General Model of Reference for First-Order Logic and First-Order Arithmetic

In this article I present HYPER-REF, a model to determine the referent of any given expression in First-Order Logic (FOL). I also explain how this model can be used to determine the referent of a first-order theory such as First-Order Arithmetic (FOA). By reference or referent I mean the non-empty set of objects that the syntactical terms of a well-formed formula (wff) pick out given a particular interpretation of the language. To do so, I will first draw on previous work to make explicit the notion of reference and its hyperintensional features. Then I present HYPER-REF and offer a heuristic method for determining the reference of any formula. Then I discuss some of the benefits and most salient features of HYPER-REF, including some remarks on the nature of self-reference in formal languages.


Introduction
If Lakatos were to write Proofs and Refutations (Lakatos 1976) today, he would need to do something different. Indeed, much of mathematical practice today has not only shifted towards more formal ways of dealing with mathematical objects but has also become the concern of an ever-growing literature that tries to philosophically inform mathematical notions. How philosophy of science and ontology have done so is well documented, given the focus that philosophy of mathematics had during the twentieth century. But there is still much to be gained if we use resources from other areas of philosophy.
Here I am not trying to rewrite Proofs and Refutations. Far from it: my goal is to formally characterize a notion that is very important in mathematics and logic using insights from philosophy of language, bringing with me tools and intuitions from recent work on hyperintensionality (Leitgeb 2019; Sedlár 2019). This notion is that of reference, and explaining how it behaves in formal languages, so as to better understand mathematical and metamathematical results, is what I take to be Lakatos' Undone Work.
There is a long-standing position in analytic philosophy according to which the reference of a sentence is its truth-value. Indeed, a defense of this stance was presented by Frege himself in Über Sinn und Bedeutung. However, in contemporary analytic philosophy the word reference is also used to talk about the object(s) which the sentence is about (Leitgeb 2002). This paper is about the latter notion, which is also called referent precisely to distinguish it from reference as truth-value (Dummett 1973, pp. 409-412). HYPER-REF is a model that takes a formula in First-Order Logic (FOL) and helps us determine the referent(s) of that formula with the aid of (1) a reference function r that maps formulae to their referents and (2) algorithms that decompose such formulae into their parts to facilitate the process. With few changes, we can also show how this model can be used to determine the referent of formulae of first-order theories, such as First-Order Arithmetic (FOA).
Understanding reference as the referent of a sentence was implicit in much of the talk about self-reference in arithmetic¹ and formal semantics,² but the study of reference in formal languages has only recently begun. Indeed, even if one can trace debates about self-reference in arithmetic back to Gödel's incompleteness theorems and, more recently, to debates on whether Yablo's Paradox is self-referential, it was not until the recent work of Lavinia Picollo that we were able to better understand the behavior of reference in formal languages (Picollo 2018, 2020a, 2020b). Here I offer several criticisms of Picollo's account and present a new one, HYPER-REF, with a model to prove its effectiveness. I define the reference of a sentence or formula as the non-empty set of objects that the syntactical terms of a well-formed formula (wff) pick out given a particular interpretation. Which objects it picks out depends on the interpretation given to the language. A more familiar way to put this is to take reference as the object(s) the sentence is about, and so HYPER-REF will not exactly point out which particular object of the domain is being talked about, but will give a general procedure to determine the referent.
Reference in formal languages as it has been defined here displays some hyperintensional features which HYPER-REF needs to address (Leitgeb 2002; Picollo 2018, 2020a). That is, referential equivalence violates […] (the subject of the sentence) and putting it as the first term of the formula. This is of course not accidental, and the point is that if we can identify the subject of a sentence in natural language this way, we can also put it as the first term of a formula.
Given this, we can notice that a formula Q(a, b) that expresses the property 'a is greater than b' is not talking about both elements of the relation, but about the first. That is, it is stating that the first element is greater than the second one. The same applies to a formula R(a, b, c) that represents the property 'a is greater than b but smaller than c': it refers to a, since it states that the object denoted by a is greater than b and smaller than c, and so on.³ Now, in those cases it is clear that Q(b, a) is not referentially equivalent to Q(a, b), nor is R(b, a, c) referentially equivalent to R(a, b, c), and this is presumably because Q and R are asymmetric. Yet I hold that this happens even with symmetric predicates.
Indeed, even if P(a, b) is logically equivalent to P(b, a) when P(x, y) expresses, for instance, 'x is the twin of y', the reference of the two expressions is different, i.e. they are not referentially equivalent. In the former we are picking out (referring to) a and saying of it that b is its twin, while in the latter we are saying of b that a is its twin. Although both have the same truth conditions, their referents differ simply because their construction is syntactically different, which amounts to a shift in reference. And so, even if P(a, b) ⊨ P(b, a), that is not enough to say that they have the same reference, precisely because the objects picked out by the syntactical terms of the formulae are different. If one wanted to say, for instance, that they are both twins (not necessarily of one another), one could write T(a) ∧ T(b) to construct a formula that refers to both elements; and one could even write P(a, b) ∧ P(b, a) as a formula that refers to both elements and in which it is said of them that they are twins of one another.
As will be shown, HYPER-REF takes these hyperintensional features of reference into account. In the next section, I will inductively define the reference of formulae in FOL and present a heuristic procedure, based on this definition, that allows us to take any given formula in FOL and determine the object(s) its syntactical terms pick out. This heuristic procedure is based on previous applications of graph theory to the analysis of semantic paradoxes found in Davis (1979), Beringer and Schindler (2017) and Rossi (2019). The graph-theoretical notation is based on Bondy and Ram Murty (2008).
Based on the definition of reference provided in the previous section, I will now inductively define the reference of the different formulae of FOL. The language of FOL (L_FOL) contains a set of variables {x_1, x_2, x_3, …}, a set of constants {a_1, a_2, a_3, …}, a set of logical connectives {¬, ∧, ∨, →}, a set of quantifiers {∀, ∃}, a set of n-ary function symbols {f_1, f_2, f_3, …} and a set of n-place predicate symbols {P_1, P_2, P_3, …}, including a binary predicate for identity that can be written as =.
For simplicity's sake, I use x, y, z for arbitrary variables, g as a metavariable, a, b, c for arbitrary constants and P, Q, S for arbitrary predicates. I also use lowercase Greek letters for arbitrary formulae and uppercase Greek letters for arbitrary sets of formulae.
Wffs are generated by the following BNF, where ∘ is any dyadic connective: […]
We assume a non-empty domain D for the language and a function val such that:
for all d ∈ D, we add a constant a_d to the language such that val(a_d) = d;
if P is a predicate symbol of arity n, then val(P) ⊆ D^n;
if f is a function symbol of arity n, then val(f) is a function D^n ↦ D.
An interpretation of the language is a pair J = 〈D, val〉. This ensures that every member of the domain has a constant that represents it in the language, which is very important for a theory of reference, as we will soon see. Truth is defined the usual way.
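For orientation, a grammar of the kind announced above can be written as follows. This is only a sketch that assumes the standard presentation of first-order syntax over the vocabulary just listed; the metavariables t (for terms) and ϕ (for formulae) are notational assumptions, and the clause for = is just infix notation for the binary identity predicate.

```latex
\begin{align*}
t &::= x \mid a \mid f(t_1, \ldots, t_n) \\
\varphi &::= P(t_1, \ldots, t_n) \mid t_1 = t_2 \mid \neg\varphi
  \mid (\varphi \circ \varphi) \mid \forall x\,\varphi \mid \exists x\,\varphi
\end{align*}
```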
Based on the language presented, I will define the reference of the different formulae of L_FOL. This will be done via a function r that maps wffs of FOL (For_FOL) to their referents, drawn from the set P(D)\{∅} ∪ {x_1, …, x_n}, where P(D) is the power set of D, such that r : For_FOL ↦ P(D)\{∅} ∪ {x_1, …, x_n}. Recall that g is a metavariable.
Definition 2.1. Reference of L_FOL formulae: For any wff ϕ we can define a function r with domain For_FOL and co-domain P(D)\{∅} ∪ {x_1, …, x_n} that, given a wff, returns its reference inductively by:
1. r(P(t_1, …, t_n)) = {val(t_1)}
2. r(¬P(t_1, …, t_n)) = r(P(t_1, …, t_n))
3. r(ψ ∘ γ) = r(ψ) ∪ r(γ)
4. r(¬(ψ ∘ γ)) = r(ψ ∘ γ)
5. r(∀g_1, …, g_n(ψ)) = D
6. r(¬∃g_1, …, g_n(ψ)) = r(∀g_1, …, g_n(ψ))
7. r(¬∀g_1, …, g_n(ψ)) = {x_i}
8. r(∃g_1, …, g_n(ψ)) = r(¬∀g_1, …, g_n(ψ))
In the next section I will explain the rationale behind all the cases of the function, but for now it is important to understand what I am defending here. The first thing to address is why reference for atomic formulae works as it does. According to the previous definition, P(t_1, …, t_n) refers to the object denoted by the first term in the formula, t_1. This only works if we take into account the syntactic restriction mentioned above, according to which the subject of the sentence is always written as the first term of any given formula. Using the function r, we can see that the reference of P(t_1, …, t_n) is {d} such that t_1 is the name of d according to the interpretation of the language, and so I write r(P(t_1, …, t_n)) = {val(t_1)}. Something similar happens with negated atomic formulae: according to r, any negated atomic formula ¬P(t_1, …, t_n) refers to the same object that the non-negated formula refers to. This is because here we are still referring to the same element, but just saying that it lacks a property.
Complex formulae of the form (ψ ∘ γ) refer to whatever ψ refers to and whatever γ refers to, regardless of whether they are quantified or non-quantified formulae. In the next section I will defend that this is true even for formulae of the form ψ → γ. Something similar to what happened with negated atomic formulae happens with negated complex formulae: the reference stays the same. Thus, the reference of ∀xP(x) ∧ ∀yQ(y) or ∀xP(x) → ∀yQ(y) is the union of r(∀xP(x)) and r(∀yQ(y)). Via this function we can already see how the model reflects the other hyperintensional feature mentioned above: r(P(a) ∨ ¬P(a)) ≢ r(P(b) ∨ ¬P(b)), as the reference of the former is r(P(a)) ∪ r(¬P(a)), i.e. {val(a)}, while the reference of the latter is r(P(b)) ∪ r(¬P(b)), i.e. {val(b)}.
Regarding quantified formulae, two remarks need to be made at this point: (1) universally quantified formulae refer to every object of the domain; (2) existentially quantified formulae refer to some element of the domain. It is impossible to tell which one if we analyze this type of formula in its abstract form (∃g_1, …, g_n(ψ)), but we can at least say that it refers to an element of the domain. More on this in the next section.
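The clauses of Definition 2.1 can be sketched as a short recursive function. What follows is only an illustration under stated assumptions, not part of the model itself: formulae are encoded as nested tuples (a hypothetical encoding), atomic formulae at the top level are assumed to start with a constant, the marker class `Var` stands for the variable-referent {x_i}, and x_i is taken to be the first variable bound by the quantifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """Hypothetical marker for the variable-referent {x_i} of clauses 7 and 8."""
    name: str


BINARY = ("and", "or", "imp")


def r(phi, domain, val):
    """Reference of phi, following clauses 1-8 of Definition 2.1.

    Assumed encoding: ("atom", "P", ("a", "b")), ("not", phi),
    ("and" | "or" | "imp", phi, psi), ("forall" | "exists", ("x", ...), phi).
    """
    tag = phi[0]
    if tag == "atom":                                  # clause 1
        return frozenset({val[phi[2][0]]})
    if tag in BINARY:                                  # clause 3
        return r(phi[1], domain, val) | r(phi[2], domain, val)
    if tag == "forall":                                # clause 5
        return frozenset(domain)
    if tag == "exists":                                # clause 8
        return frozenset({Var(phi[1][0])})
    if tag == "not":
        inner = phi[1]
        if inner[0] == "atom" or inner[0] in BINARY:   # clauses 2 and 4
            return r(inner, domain, val)
        if inner[0] == "exists":                       # clause 6
            return frozenset(domain)
        if inner[0] == "forall":                       # clause 7
            return frozenset({Var(inner[1][0])})
    raise ValueError(f"no clause of Definition 2.1 covers {phi!r}")


# A toy interpretation (an assumption made only for this example).
domain = {"alice", "bob"}
val = {"a": "alice", "b": "bob"}

Pa = ("atom", "P", ("a",))
Pb = ("atom", "P", ("b",))
print(r(("or", Pa, ("not", Pa)), domain, val))   # frozenset({'alice'})
print(r(("or", Pb, ("not", Pb)), domain, val))   # frozenset({'bob'})
```

The two tautologies are logically equivalent, yet the sketch assigns them different referents, which is the hyperintensional behavior described above. Doubly negated formulae are rejected because no clause covers them.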
I will now show how from function r we can construct a method to decompose more complex formulae. To do so I use a particular type of graph that I will call reference graphs or reference trees. Reference graphs are trees that branch out from a root, which in this case is the formula whose reference we wish to know, and therefore they do not contain loops.
For example, say we wanted to know the reference of a formula such as ∀x∃y((Px ∧ Qy) ∧ Pa). What I want is to take this formula and build a tree that has it as its root and whose branching is a decomposition of the formula based on some guiding principles provided by r. Such a tree would look like this, with some vertices in between (Figure 2).
However, before presenting the algorithms it must be noted that formulae to be evaluated with r need to be of a particular form to ease the task of determining their reference. For instance, a formula such as ∀x∃y((Px ∧ Qy) ∧ Pa) is in prenex form, which hinders our ability to see what the formula is about: the subformula Pa states something about a that is independent of whatever is being said about x, as a is not bound by the quantifier.
What we need, then, is to find a logically equivalent formula that helps us make this intuition explicit. That is, a formula that has the same truth value in every model but where every subformula that can be pushed out of the scope of the quantifiers has been pushed out. In the case of the previous formula, ∀x∃y(Px ∧ Qy) ∧ Pa would be such a formula. And so, only when all the subformulae of a formula with no bound variables have been successfully pushed out, and quantifiers have been pushed in next to the subformulae whose variables they can bind, can we begin applying Definition 2.1.
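The pushing-out step just described can be sketched as follows. This is a minimal illustration under several assumptions: formulae are encoded as nested tuples (a hypothetical encoding), only ∧ and ∨ are handled, a subformula is moved out when it mentions none of the variables bound by the enclosing quantifiers, and it may sit on either side of the connective. The function names `mentions` and `push_out` are illustrative only.

```python
CONNECTIVES = ("and", "or")


def mentions(phi, names):
    """True if some variable in `names` occurs in phi without being rebound."""
    tag = phi[0]
    if tag == "atom":
        return any(term in names for term in phi[2])
    if tag == "not":
        return mentions(phi[1], names)
    if tag in CONNECTIVES:
        return mentions(phi[1], names) or mentions(phi[2], names)
    # Quantifier: the variables it binds are no longer the outer ones.
    return mentions(phi[2], names - set(phi[1]))


def push_out(phi, bound=frozenset()):
    """Move variable-free conjuncts or disjuncts out of quantifier scopes."""
    tag = phi[0]
    if tag == "atom":
        return phi
    if tag == "not":
        return ("not", push_out(phi[1], bound))
    if tag in CONNECTIVES:
        return (tag, push_out(phi[1], bound), push_out(phi[2], bound))
    quant, variables = tag, phi[1]
    scope = bound | set(variables)
    body = push_out(phi[2], scope)            # innermost quantifiers first
    if body[0] in CONNECTIVES:
        op, left, right = body
        if not mentions(right, scope):
            return (op, push_out((quant, variables, left), bound), right)
        if not mentions(left, scope):
            return (op, left, push_out((quant, variables, right), bound))
    return (quant, variables, body)


Px = ("atom", "P", ("x",))
Qy = ("atom", "Q", ("y",))
Pa = ("atom", "P", ("a",))

# ∀x∃y((Px ∧ Qy) ∧ Pa)
prenex = ("forall", ("x",), ("exists", ("y",), ("and", ("and", Px, Qy), Pa)))
print(push_out(prenex))
```

On the example from the text, the sketch returns the encoding of ∀x∃y(Px ∧ Qy) ∧ Pa: the subformula Pa leaves the scope of both quantifiers, while Px and Qy stay where they are because each mentions a bound variable.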
I will call this syntactic restriction the postnex form of a formula.⁴ To achieve this I will introduce an algorithm very similar to Picollo's (2018) to show how to translate any formula into its postnex form. To that end, I introduce the notion of primes:
Definition 2.2. Prime: A formula ϕ of L_FOL is a prime if and only if it is an atomic formula, the negation of an atomic formula, a universally quantified expression whose nested quantifiers (if any) have the same domain, the negation of a universally quantified expression whose nested quantifiers (if any) have the same domain, an existentially quantified expression whose nested quantifiers (if any) have the same domain, or the negation of an existentially quantified expression whose nested quantifiers (if any) have the same domain.
⁴ What follows is heavily inspired by Picollo's own postnex disjunctive normal form and the more familiar prenex normal form (Picollo 2018).
By "whose nested quantifiers (if any) have the same domain" I mean that formulae with nested quantifiers can indeed be primes, but only if the nested quantifiers cannot be pushed inward any further. For example, while ∀x∃y(Py ∧ (Qx ∨ Qy)) is not a prime, ∃y∀z(Pzy ∧ ∀x(Qzx ∨ Qzy)) is.
Formulae such as ¬(P(a) ∧ P(b)) and ∀x∃y((Px ∧ Qy) ∧ Pa) are not primes. Based on this, I will now introduce the definition of the postnex form of a formula. Herein I will use the schema Cg(ϕ), where C stands for either ∀ or ∃ and g is our metavariable.
Definition 2.3. Postnex form of a formula: A formula ϕ of L_FOL is in postnex form if and only if:
1. ϕ is a prime, or
2. every subformula is a disjunction of conjunctions of primes, and
3. ϕ does not contain quantifiers that do not bind any variable, and
4. every subformula of the form […]
To arrive at the postnex form of a formula, it must undergo a process called postnexation, which I will now specify and which explains how to comply with the restrictions set by the definition. To obtain the postnex form of a formula, nested quantifiers sometimes need to be taken into account. To that end, I introduce the notion of depth.
Definition 2.4. Depth: Let dep be an assignment of numbers to formulae of L_FOL such that […]
The dep of a formula ϕ is just the length of the longest chain of nested quantifiers occurring in ϕ. Now we proceed to define the process by which we can turn a formula into its postnex form.
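As a brief aside on Definition 2.4, the informal gloss (the length of the longest chain of nested quantifiers) can be sketched as a recursive function. The tuple encoding of formulae is a hypothetical one, and counting a quantifier block that binds several variables once per variable is an assumption of this sketch.

```python
def dep(phi):
    """Length of the longest chain of nested quantifiers in phi.

    Assumed encoding: ("atom", "P", ("x", "y")), ("not", phi),
    ("and" | "or" | "imp", phi, psi), ("forall" | "exists", ("x", ...), phi).
    """
    tag = phi[0]
    if tag == "atom":
        return 0
    if tag == "not":
        return dep(phi[1])
    if tag in ("and", "or", "imp"):
        return max(dep(phi[1]), dep(phi[2]))
    # Quantifier block: one link in the chain per bound variable (assumption).
    return len(phi[1]) + dep(phi[2])


Pxy = ("atom", "P", ("x", "y"))
Qy = ("atom", "Q", ("y",))
Pz = ("atom", "P", ("z",))
Pa = ("atom", "P", ("a",))

# ((P(x, y) → Qy) ∧ Pz) ∧ Pa, then prefixed with ∀x, ∃y and ∀z in turn.
matrix = ("and", ("and", ("imp", Pxy, Qy), Pz), Pa)
phi = ("forall", ("z",), ("exists", ("y",), ("forall", ("x",), matrix)))
print(dep(matrix), dep(phi))   # 0 3
```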
Definition 2.5. i-postnexation: The i-postnexation [ϕ]_i of a formula ϕ ∈ L_FOL is the result of applying the following transformations to each subformula Cgψ of ϕ of depth i. The process of postnexation of ϕ must be performed starting with formulae of depth i = 0 and ends after postnexing the formulae with the maximum depth, i = max{dep(Cgψ) : Cgψ is a subformula of ϕ}.
1. Replace every subformula of the form ψ → γ with ¬ψ ∨ γ until they no longer occur, starting with the innermost.
2. Erase double negations.
3. Replace every subformula of the form ¬(ψ ∧ γ) or ¬(ψ ∨ γ) with (¬ψ ∨ ¬γ) or (¬ψ ∧ ¬γ), respectively, until they no longer occur, starting with the innermost.
4. Erase double negations.
5. Replace every subformula of the form Cg_1, …, g_n(ψ ∘ γ), where γ is a prime containing no bound variables, with Cg_1, …, g_n(ψ) ∘ γ until they no longer occur, starting with the innermost.
6. Replace every subformula of the form ψ […] respectively, until they no longer occur, starting with the innermost.
7. Replace every subformula of the form […]
As the process of postnexation only involves the application of semantic equivalences, it produces a logically equivalent formula that allows us to apply function r successfully. For example, consider the formula ϕ = ∀z∃y∀x(((P(x, y) → Qy) ∧ Pz) ∧ Pa). Its 0-postnexation, [ϕ]_0, consists in applying the transformations of Definition 2.5 to its subformulae of depth 0, i.e. ((P(x, y) → Qy) ∧ Pz) ∧ Pa. This results in: ∀z∃y∀x(((¬P(x, y) ∧ Pz) ∨ (Qy ∧ Pz)) ∧ Pa). Its 1-postnexation, [ϕ]_1, does the same to its subformulae of depth 1, i.e. ∀x(((¬P(x, y) ∧ Pz) ∨ (Qy ∧ Pz)) ∧ Pa). This results in: ∀z∃y((∀x(¬P(x, y) ∧ Pz) ∨ (Qy ∧ Pz)) ∧ Pa).

Its 2-postnexation, [ϕ]_2, consists in applying the same procedure to subformulae of depth 2, i.e. ∃y((∀x(¬P(x, y) ∧ Pz) ∨ (Qy ∧ Pz)) ∧ Pa). This results in a very slight change where the existential moves one place to the right: […] Its 3-postnexation, [ϕ]_3, would be: […]
By this process we are identifying the main quantifier of the formula (the determiner of the sentence) and also other independent formulae that are not in the domain of said quantifier. And now, with its 3-postnexation, we can move on to draw a reference tree that takes us from the formula to its referents. As I stated before, the trees or graphs that I will be working with here will not have cycles or loops and will be branching trees. That means they are a type of directed, rooted graph with a starting node, called the root of the tree, from which other nodes branch out. From this root we will specify how to get to the reference of a formula thanks to a series of algorithms defined below. In what follows I borrow heavily from Rossi (2019).
So first I will show how to expand vertices that are labeled with formulae that are not complex and do not contain quantifiers (Definition 2.6). Then I will move to explain how to expand vertices that are labeled with complex formulae (Definition 2.7). Lastly I will explain how to expand vertices labeled with quantified formulae (Definition 2.8).
And so, for every ϕ ∈ L_FOL, I will define a labeled tree 〈V_ϕ, A_ϕ, L_ϕ〉, where V_ϕ is the set of vertices {v_1, …, v_n} of the tree, A_ϕ is the set of arcs or branches between vertices {(v_x, v_y), …, (v_z, v_w)} and L_ϕ is the labeling function of the tree that assigns FOL formulae to the nodes of the tree such that L : V ↦ For_FOL. These trees will have a starting vertex, labeled with ϕ itself in postnex form, which will be the root or starting point from which the tree is expanded. I will refer to this vertex as root.
To expand them I will use new trees that add nodes and arcs. That is, for example, given a tree 〈V, A〉 and a labeling function L, I will define a new tree, say 〈V′, A′〉, and a new labeling function, say L′, which extend the labeled tree 〈V, A, L〉_ϕ such that this new function assigns either referents or expressions of FOL generated by the grammar to the new vertices according to Definitions 2.6-2.8.
So let us start with operations for vertices labeled with expressions that do not contain quantifiers or dyadic connectives (quantifier- and dyadic-free expressions). This next definition will show how to expand a tree that contains vertices labeled with this type of subformulae.
Definition 2.6. Operations with quantifier- and dyadic-free formulae: For every tree 〈V, A〉 labeled with For_FOL and a root vertex root, and for every function L, define the following set by induction: […]
What this is saying is that, for example, you can take a tree and generate a new one with an extra vertex (v_i) if the original tree contains a vertex v that is labeled t_1 (L(v) = t_1). In this case the label (σ) of the new vertex (v_i) would be {val(t_1)}.
Now we move to formulae with dyadic connectives, which require a different process.
Definition 2.7. Operations with formulae with dyadic connectives: For every tree 〈V, A〉 labeled with For_FOL and a root vertex root, and for every function L, define the following set by induction: […]
Here what happens is that when we get a tree in which a vertex's label is (ψ ∘ γ), we get to define a new tree with two new branches. In each of those there is a new vertex (v_i and v_j, respectively), whose labels are ψ and γ. Let us finish with nodes labeled with quantified formulae.
Definition 2.8. Operations with quantified formulae: For every tree 〈V, A〉 labeled with For_FOL and a root vertex root, and for every function L, define the following set by induction: […]
Here what happens is similar to Definition 2.6. A single new vertex is branched out, carrying a new label that depends on whether the formula is universally or existentially quantified.
Lastly, in order to define all the new trees, we just need to put together Definitions 2.6, 2.7 and 2.8, such that for every formula of L_FOL, the labeled reference tree 〈V_ϕ, A_ϕ, L_ϕ〉 is the result of applying Definitions 2.6, 2.7 and 2.8 to a graph consisting only of the root (labeled with ϕ) until a fixed point is reached.
Definition 2.9. For every ϕ ∈ L_FOL, the labeled reference tree T = 〈V_ϕ, A_ϕ, L_ϕ〉 generated from ϕ is the least fixed point of the following definition:
- At stage 0, put: […]
- For an arbitrary successor stage m + 1, put: […]
We can put all this to use by establishing the reference of a formula ϕ such as ∀z∃y∀x(((P(x, y) → Qy) ∧ Pz) ∧ Pa), whose postnexation we presented earlier.
We start by assigning 〈V_ϕ, A_ϕ, L_ϕ〉 its root, which, according to our labeling function, must be ϕ (Figure 3).
But we can still apply r further: we can apply it to the formula on the left and the formula on the right, and represent this by applying Clause 2 of Definition 2.8 and Clause 2 of Definition 2.6, respectively (Figure 5).
Finally, the new branch on the right can be expanded using Clause 1 of Definition 2.6 to obtain the following (Figure 6).
Given the function r, the reference of the formula would be the union of the sets {y} and {val(a)}.
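The expansion rules of Definitions 2.6 to 2.8, as glossed in the prose above, can be sketched as a worklist procedure that stops when no vertex remains to be expanded, which plays the role of the fixed point of Definition 2.9. This is only an illustration under assumptions: formulae are nested tuples (a hypothetical encoding), terms are strings, referents are frozensets, variable-referents are written as ("var", name), and the treatment of negated quantified vertices simply mirrors clauses 6 and 7 of Definition 2.1.

```python
BINARY = ("and", "or", "imp")


def reference_tree(phi, domain, val):
    """Expand a root labeled phi until no rule applies.

    Returns (labels, arcs): labels maps vertex ids to formulae, terms or
    referents; arcs is a list of (parent, child) pairs.
    """
    labels, arcs, pending = {0: phi}, [], [0]

    def add(parent, label):
        child = len(labels)
        labels[child] = label
        arcs.append((parent, child))
        pending.append(child)

    while pending:
        v = pending.pop()
        label = labels[v]
        if isinstance(label, frozenset):      # a referent: nothing to expand
            continue
        if isinstance(label, str):            # term t_1 yields {val(t_1)}
            add(v, frozenset({val[label]}))
            continue
        tag = label[0]
        if tag == "atom":                     # P(t_1, ..., t_n) yields t_1
            add(v, label[2][0])
        elif tag in BINARY:                   # two branches, one per side
            add(v, label[1])
            add(v, label[2])
        elif tag == "forall":
            add(v, frozenset(domain))
        elif tag == "exists":
            add(v, frozenset({("var", label[1][0])}))
        elif tag == "not" and label[1][0] == "atom":
            add(v, label[1])
        elif tag == "not" and label[1][0] == "exists":   # assumption
            add(v, frozenset(domain))
        elif tag == "not" and label[1][0] == "forall":   # assumption
            add(v, frozenset({("var", label[1][1][0])}))
        else:
            raise ValueError(f"vertex label not in postnex form: {label!r}")
    return labels, arcs


def referents(labels):
    """Union of all referent labels found at the leaves of the tree."""
    result = frozenset()
    for label in labels.values():
        if isinstance(label, frozenset):
            result |= label
    return result


domain = {"alice", "bob"}
val = {"a": "alice", "b": "bob"}
Pa = ("atom", "P", ("a",))
Pb = ("atom", "P", ("b",))
labels, arcs = reference_tree(("and", Pa, Pb), domain, val)
print(len(labels), len(arcs))   # 7 6
```

For P(a) ∧ P(b) the sketch produces a root, two formula vertices, two term vertices and two referent leaves, and the union of the leaves contains the objects named by a and b.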
This concludes the presentation of HYPER-REF. Based on the model presented thus far, one could establish the reference of any formula in FOL. In what follows I

Using HYPER-REF to Determine the Reference of FOL Formulae
In the last section, I presented HYPER-REF, which consists of a function r, which establishes the reference of formulae in their postnex form, and a procedure to construct reference graphs. In this section I will motivate and explain the cases of the function for FOL formulae. In the following section I do the same for formulae in FOA.
The most basic formulae are atomic formulae, such as P(a). This formula says of a that it has the property P, for instance, 'a is Powerful'.⁵ Setting P(a) as the root and applying rules 2 and 1 of Definition 2.6, one obtains the following graph (Figure 7).
The reference of a formula expressing a relational predicate can be determined similarly. Given any n-place predicate of the form P(t_1, …, t_n), a wff expressing an n-ary relation between (t_1, …, t_n) refers to the object denoted by the first term. This may look counter-intuitive, as pointed out in the previous section, but it is just the consequence of the syntactic constraint mentioned earlier. The graph for this formula is just Figure 7 with P(t_1, …, t_n) as its root, t_1 as the next vertex and {val(t_1)} as the last one.
We may continue with formulae with logical connectives. We start with negation. A formula ¬P(a) expresses that a certain object denoted by a lacks the property P. It may be decomposed by first applying Clause 3 of Definition 2.6, and then Clauses 2 and 1 of Definition 2.6 (Figure 8).
Now it is time for the dyadic connectives ∧, ∨, and →. The connective ↔ can be defined the usual way. Each formula of the form ϕ ∧ ψ, ϕ ∨ ψ, or ϕ → ψ refers both to the object(s) that ϕ refers to and to the object(s) that ψ refers to. For instance, think of the formula Pa ∧ Pb as stating that both a and b have the property P; as such, it refers to both of them. The same is the case with the other dyadic connectives. Indeed, a formula such as Pa ∨ Pb states that either a or b or both have the property P; to even establish which has the property, we have to acknowledge which elements may have it, that is, which elements we are talking about. Likewise, a formula such as Pa → Pb states that if a has the property P, then b has said property too.
And so, any formula (ψ ∘ γ) (where ∘ represents either ∧, ∨ or →) refers to whatever object(s) ψ refers to and to whatever object(s) γ refers to. A graph for a formula P(a) ∧ P(b) can be drawn using rule 1 of Definition 2.7 and rules 2 and 1 of Definition 2.6 (Figure 9).
Now we move to quantified formulae, which allow us to refer to bound variables through the use of quantifiers. Establishing reference for this type of formula is slightly more complicated than with atomic formulae, but not as cumbersome as Picollo (2018, 2020b) seems to think. The simplest quantified formulae of FOL are ∀xP(x) and ∃xP(x). The former refers to everything and states that every element has the property P; the latter refers to an element and states that it has the property P. At this point we must again remember that we are analyzing reference and not truth, and as such we are only interested in determining what a given quantified formula of FOL is talking about.
Thus, it is because we are referring with the universal quantifier that we can tell that we are referring to everything; and it is because we are referring with the existential quantifier that we can tell we are referring to some object. Which one? Again, that is not the point of referring with the existential quantifier; the point is to state that there is an object that satisfies some conditions expressed by the formula. This will be relevant very soon for a discussion of Picollo's own view. Graphs for basic quantified formulae can be obtained using rules 1 and 2 of Definition 2.8, respectively (Figures 10 and 11).
Here we can see that by introducing formulae with quantifiers we are introducing a new mechanism of reference altogether. Previously we had only been able to refer to single, determinate objects via the constants that denote them thanks to the interpretation of the language (m-reference). Through reference via quantifiers (q-reference) we can do even more: we can talk about all of the elements in the domain, some of them and even just one (think of definite descriptions). Here we are not referring as precisely as with m-reference: with the universal quantifier we are not talking about one determinate object of the domain, but about all elements of the domain; with the existential quantifier we are referring to some element of the domain. Indeed, when we say ∃xP(x) we are not referring to an element in particular, we are just establishing the fact that there is one element of the domain that satisfies P(x).⁶
Now, a possible objection to my treatment of reference via quantifiers would be that it leads to some undesirable results, namely that function r forces us to hold that r(∃xϕ) ≠ r(∃yϕ), since r(∃xϕ) = {x} ≠ {y} = r(∃yϕ); but that is implausible, since we know that the reference of both formulae is the same: they refer to some element of the domain. On the contrary, I think this is an interesting feature of HYPER-REF that precisely reflects how we use variables in FOL. When we write ∃xϕ, we want to talk about an element of the domain, and we use the variable x to represent this. When we write ∃xϕ and also write ∃yϕ, in both cases we want to say something about an element of the domain, but maybe not the same element of the domain.⁷ In cases such as these, there is no syntactic restriction that could force us to believe that x = y such that {x} = {y}; therefore there is no reason to believe that concluding r(∃xϕ) ≠ r(∃yϕ) is a disadvantage of the system.
So {x} and {y} still represent some element of the domain; the fact that we use different variables is just there to represent that there is perhaps reason to think that they are not the same. This is why I think the set-theoretic notation is important: x may just be a variable of the language as much as y is, and by the same token {x} is some element of the domain as much as {y} is.⁸
Returning to the issue of nested quantifiers, we can also see that formulae with nested quantifiers whose quantifiers cannot be pushed inward q-refer to what the first or main quantifier refers to. For instance, formulae such as ∀x_1∀x_2ϕ and ∀x_1∃x_2ϕ, where quantifiers cannot be pushed further in, still refer to every x ∈ D. This is because in each case the second quantifier is not used to determine what is being talked about, but for predication. For instance, take a formula ∀x∀y(P(x, y)), which states that all the xs (the subject of the formula) stand in the relation P to all the ys. P could be 'x is greater than y', and thus the formula would state that all the xs are greater than all the ys.
Now we add negations to simple quantified formulae. Even if ∀xP(x) refers to every element in the domain, ¬∀xP(x) does not, because of the semantics of the universal quantifier: negating a universal quantifier results in referring not to all elements of the domain, but only to some elements (at least one) of the domain, which was exactly what ∃xP(x) referred to. As such, they refer to the same thing(s), some x ∈ D. As expected, something very similar happens with expressions such as ∀xP(x) and ¬∃xP(x), as they refer to every element in the domain.
⁶ The original usage of the terms m-reference and q-reference is found in Picollo (2018).
⁷ In a very Finean spirit, we could hold that (∃xϕ) and (∃yϕ) are precisely not synonymous when they jointly occur in a formula (say, ∃x∃yϕ) (Fine 2007, ch. 1); yet they may be synonymous in other instances, when they do not appear jointly, but the point here is that they need not be.
⁸ Though similar, I think this is not strictly related to another area of Kit Fine's work regarding models for arbitrary objects (Fine 1985). Indeed, there Fine was interested in establishing a theory of arbitrary objects, but the reference of ∃gϕ is not an arbitrary object, precisely because its reference is part of the domain, whereas arbitrary objects are not.
However, quantified formulae do not always come so simple. Universally quantified formulae often come in the form ∀x(Px → Qx) and existential formulae often come in the form ∃x(Px ∧ Qx). But of course that is not the end of the story, as it is possible to construct quantified formulae with any connective (not just the ones just mentioned), nested quantifiers and every conceivable syntactic combination. This is where I think that Picollo overcomplicates things with q-reference. Picollo's starting point is simple quantified formulae such as ∀xP(x) and ∃xP(x), which she says refer, respectively, to every element of the domain and to some element(s) of the domain. But then she jumps to formulae such as ∀x(Px → Qx), and says that it refers to every element that has the property Q provided that said element has the property P (Picollo 2018, pp. 584-585).⁹ Both formulae ∀xP(x) and ∀x(Px → Qx) have the same determiner, i.e. the same quantifier, and so I do not think there is really a shift in reference. What determines the reference of a quantified formula is the quantifier, not the predicate. And if they have the same quantifier, I think positing this shift complicates things. The predicate of a quantified formula has no bearing on what the formula refers to, in the sense that changing the predicate does not change what we are referring to. Changing the predicate changes what we are saying about the referent, sure, but it does not change what we are talking about. And so Picollo is introducing a novel way of referring which she does not explain, namely reference through predicates: if ∀xP(x) and ∀x(Px → Qx) refer to different objects, there is a shift in reference for which she is not accounting.
Here is where things get a tad harder for Picollo and others who agree with her (Beringer and Schindler 2017, p. 448; Leitgeb 2005, pp. 180-181), for this amounts to saying that every semantically distinct formula of FOL refers to different objects. And so she is forced to devise algorithms to transform any formula into its postnex disjunctive normal form, and only then can we establish what a sentence is referring to. That is, we take a formula ϕ and rewrite it such that (1) it does not contain →, ↔, ∃ nor superfluous quantifiers, (2) every subformula of ϕ is a disjunction of conjunctions of primes (a prime being an atomic formula, its negation, a universal formula, or its negation), and (3) in every subformula of the form ∀g_1 … ∀g_n(ϕ_1 ∨ … ∨ ϕ_n), g_i is free in ϕ_j. 10 And so, for instance, Picollo states that even if formulae such as ∀x(ϕ → ψ) and ∀x(¬ψ → ¬ϕ) (which I will call Π) refer to the same objects, and even if formulae such as ∀xϕ and ∀x¬¬ϕ (which I will call Σ) refer to the same objects, formulae Π and Σ do not refer to the same objects (Picollo 2020a, p. 428). But if we take into account what has been said so far, one can see that both pairs of sentences refer to the same objects: all the objects in the domain. Again, here the quantifier (what actually counts when establishing the referent of a formula) says the same, so I see no reason to say that Π and Σ have different referents. What makes them different is what they attribute to or predicate of such objects, but they still refer to every object in the domain, because that is the role that the universal quantifier fulfills with regard to reference.
What Picollo ultimately offers is a very robust and precise theory of predication, one that allows us to determine when two expressions are predicating the same thing of the same objects. But this is different from what I have sought to explain, and so a theory of predication needs to be deferred until we have a theory of reference.

Using HYPER-REF to Determine the Reference of FOA Formulae
In the previous section I used HYPER-REF to determine the reference of a number of formulae in FOL in order to explain the function r. In this section I will do the same for the formulae of FOA, whose language is an instance of FOL. My interest here is to explain how the theory can be applied to other first-order languages that can afford self-reference. To that end I will explain the particularities of r for FOA formulae, and then review the strong and weak diagonal lemmas to understand the extent to which they are self-referential.
The language of FOA (L_FOA) is an instance of the language of FOL with a different interpretation. For that, we need to introduce 0 as the only individual constant, S as the monadic function symbol for the successor operation, + and × as dyadic function symbols, and the axioms of Robinson Arithmetic (Q), the weakest theory in which self-reference is possible. The new interpretation is N = 〈ω, val〉, comprising the domain ω of natural numbers and the function val, now adjusted to the new domain.
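For reference, the axioms of Q mentioned above can be stated as follows in one common presentation. The labels Q1 to Q7 are a convention assumed here for readability and are not taken from this article.

```latex
% Robinson Arithmetic (Q) over the signature {0, S, +, \times}; requires amsmath.
\begin{align*}
\text{(Q1)}\quad & \forall x\,(Sx \neq 0) \\
\text{(Q2)}\quad & \forall x\,\forall y\,(Sx = Sy \rightarrow x = y) \\
\text{(Q3)}\quad & \forall x\,(x \neq 0 \rightarrow \exists y\,(x = Sy)) \\
\text{(Q4)}\quad & \forall x\,(x + 0 = x) \\
\text{(Q5)}\quad & \forall x\,\forall y\,(x + Sy = S(x + y)) \\
\text{(Q6)}\quad & \forall x\,(x \times 0 = 0) \\
\text{(Q7)}\quad & \forall x\,\forall y\,(x \times Sy = (x \times y) + x)
\end{align*}
```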
To do that, I will introduce gödelization. A gödelization of L_FOA is a function that assigns to each symbol of our language (and therefore to every string of symbols) a unique natural number (which belongs to the domain). Thus, if θ is a string of symbols, #θ will be its code or Gödel number. We also need numerals: the individual term consisting of n occurrences of the symbol S followed by 0 is called the numeral of n. And so, if #θ is the code or Gödel number of θ, £θ· is the numeral of its code.
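The relation between a string θ, its code #θ and the numeral £θ· can be made concrete with a minimal sketch. The alphabet and the base-12 coding below are illustrative assumptions of this sketch, not the numbering presupposed by HYPER-REF, and the names `code`, `decode` and `numeral` are hypothetical.

```python
# Toy gödelization of a small fragment of L_FOA.
# Assumption: one base-12 digit per symbol; this is NOT the paper's numbering.
ALPHABET = ["0", "S", "+", "×", "=", "(", ")", "x", "P", "¬", "∀"]
BASE = len(ALPHABET) + 1  # digits 1..11 are used, digit 0 never is


def code(expr: str) -> int:
    """#expr: read the string as a base-BASE numeral, one digit per symbol."""
    n = 0
    for ch in expr:
        n = n * BASE + (ALPHABET.index(ch) + 1)
    return n


def decode(n: int) -> str:
    """Inverse of code: recover the string from its Gödel number."""
    symbols = []
    while n > 0:
        n, digit = divmod(n, BASE)
        symbols.append(ALPHABET[digit - 1])
    return "".join(reversed(symbols))


def numeral(n: int) -> str:
    """The numeral of n: n occurrences of S followed by 0."""
    return "S" * n + "0"


theta = "S0=S0"
goedel_number = code(theta)              # #θ, an element of ω
name_of_theta = numeral(goedel_number)   # £θ·, a term of the language

print(code("S0"))                        # 25
print(decode(goedel_number) == theta)    # True
```

In this sketch #θ is a number and £θ· is a (very long) term, which keeps apart the two things the text distinguishes: the object of the domain and the piece of syntax that names it.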
Thus, for all n ∈ ω we add a constant £n· to the language such that:
val(£n·) = #n;
if P is a predicate symbol of arity n, then val(P) ⊆ ω^n;
if f is a function symbol of arity n, then val(f) is a function ω^n → ω.
As £ψ· is just a term of FOA (an instance of FOL), any formula that uses numerals to refer (such as P(£ψ·)) is just a special instance of case 2.1 of Definition 2.1: that is, P(£ψ·) refers to #ψ. Consequently, something very similar applies to Clause 2.6 of Definition 2.6, where the new node contains {val(£ψ·)}, that is, {#ψ}.
This is also a departure from Picollo, for whom P(£ψ·) m-refers to ψ and not to #ψ (Picollo 2018, §4; 2020a, sec. 3.2). But I think this is not correct. For one thing, she supposes that m-reference has instances of transitivity, which I will deny shortly; for another, ψ is simply not an object of the domain, whereas #ψ is. Although her model has its merits, I think it is problematic to say that one can refer to objects outside the domain.
As our domain is ω, the reference of 0 is the empty set. Furthermore, the reference of a function is the range of values it can have given its domain. Since our domain is ω, the reference of any instance of the functions just introduced (S, + and ×) is the value the function has: the reference of S0 is the value of the function, which is 1; the reference of the term 5 + 4 (which is short for +(5, 4)) is 9. This means that reference through functions is just another case of m-reference. Now, even if S0 and SS0 appear in SSS0, neither of them should be taken as a subterm of SSS0 in such a way that SSS0 refers to whatever S0 and SS0 refer to. This is so because with SSS0 we are solely interested in determining the successor of SS0, and so claiming that the reference of SSS0 is also 1 and 2 is wrong, because the successor of SS0 is neither (they are not identical to 3). And so the mere fact that S0 and SS0 occur in SSS0 does not mean that we are referring to them, because we are not taking them as values of the function. 11
We now move to self-reference. For this we first need to detail the conditions for a successful gödelization, as it cannot be arbitrary on pain of triviality. 12 As such, the coding must be effective, ε-adequate and strongly monotonic. A numbering is effective if there is an algorithm through which we can establish which expression σ a number n codifies, and vice versa (Picollo 2018, p. 574). A numbering is ε-adequate if it represents a large portion of syntactic relations and operations by elementary relations and operations on ω (Grabmayr and Visser 2020, p. 3). A numbering is monotonic if, for a sub-expression σ of σ′, it assigns codes such that #σ ≤ #σ′, and it is strong if, in addition, the code of the Gödel numeral of an expression is larger than the code of the expression itself (Grabmayr and Visser 2020, p. 3). 13 This is needed because the codings must preserve m-reference (Grabmayr and Visser 2020, sec. 6.2).
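To illustrate the earlier point about closed terms (that SSS0 refers to 3 and not to the values of the terms occurring inside it), here is a minimal sketch of an evaluator. It assumes that natural numbers are modelled as Python integers rather than as sets, and the class and function names are hypothetical; `r` here covers only closed terms, not the full function r of HYPER-REF.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Zero:
    """The individual constant 0."""


@dataclass(frozen=True)
class Succ:
    arg: "Term"


@dataclass(frozen=True)
class Plus:
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class Times:
    left: "Term"
    right: "Term"


Term = Union[Zero, Succ, Plus, Times]


def val(t: Term) -> int:
    """Value of a closed term in the standard interpretation over ω."""
    if isinstance(t, Zero):
        return 0
    if isinstance(t, Succ):
        return val(t.arg) + 1
    if isinstance(t, Plus):
        return val(t.left) + val(t.right)
    if isinstance(t, Times):
        return val(t.left) * val(t.right)
    raise TypeError(f"not a term: {t!r}")


def r(t: Term) -> frozenset:
    """Reference of a closed term: its value only, not the values of inner terms."""
    return frozenset({val(t)})


def numeral(n: int) -> Term:
    """Build the numeral of n: n applications of S to 0."""
    t: Term = Zero()
    for _ in range(n):
        t = Succ(t)
    return t


three = numeral(3)                    # SSS0
nine = Plus(numeral(5), numeral(4))   # 5 + 4, short for +(5, 4)

print(r(three))  # frozenset({3})
print(r(nine))   # frozenset({9})
```

Computing `val(three)` does pass through the values of SS0 and S0, but those intermediate values never enter the set returned by `r`; this mirrors the distinction drawn in the text between how a value is established and what the term refers to.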
These constraints may seem exaggerated, but they allow us to introduce self-reference in the system with relative ease. 14
All this technicality aside, let us analyze some features of reference via numerals. First of all, we must observe how reference works here. Given an expression ϕ of L_FOA, our coding must assign it a code, a number #ϕ that belongs to the domain; the numeral of that code, £ϕ·, is what is used in the wffs.
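The monotonicity constraints just described can be checked mechanically on small examples. The sketch below is only an illustration under stated assumptions: the base-8 numbering and all function names are hypothetical, and the check ranges over all contiguous substrings, which is a superset of the genuine sub-expressions.

```python
# Toy numbering. Assumption: one base-8 digit per symbol; NOT the paper's numbering.
ALPHABET = ["0", "S", "+", "×", "=", "(", ")"]
BASE = len(ALPHABET) + 1  # digits 1..7 are used, digit 0 never is


def code(expr: str) -> int:
    """#expr: read the string as a base-BASE numeral, one digit per symbol."""
    n = 0
    for ch in expr:
        n = n * BASE + (ALPHABET.index(ch) + 1)
    return n


def numeral(n: int) -> str:
    """The numeral of n: n occurrences of S followed by 0."""
    return "S" * n + "0"


def parts(expr: str) -> set:
    """All non-empty contiguous substrings of expr."""
    return {expr[i:j] for i in range(len(expr)) for j in range(i + 1, len(expr) + 1)}


def monotonic_on(expr: str) -> bool:
    """Check #σ ≤ #expr for every part σ of expr."""
    return all(code(p) <= code(expr) for p in parts(expr))


def strong_on(expr: str) -> bool:
    """Check that the code of the numeral of #expr exceeds #expr itself."""
    return code(numeral(code(expr))) > code(expr)


samples = ["S0", "0=0", "S0=S0"]
results = {e: (monotonic_on(e), strong_on(e)) for e in samples}
print(results)
```

Because no symbol is coded by the digit 0, a longer string always receives a larger code in this toy numbering, which is why both checks succeed on the samples. Checking finitely many samples is of course no proof that a numbering is monotonic.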
Given that our gödelization is effective and strongly monotonic, we can take any numeral £ϕ· and from there establish which number it refers to, namely #ϕ, and hence which expression ϕ it names. After all, with gödelization we are just assigning names to the expressions, such that the name refers to an expression of the language itself. As said earlier, the case of r for this expression is just a special instance of case 2.1, where r(P(£ϕ·)) = {#ϕ}. The rules for graphs containing Gödel numbers are just special instances of rule 2.6 of Definition 2.6, such that from P(£ϕ·) we must branch to {£ϕ·} and then to {#ϕ}.
11 A very different issue is the fact that to establish the value of SSS0 we need to establish the value of SS0 and S0, which is the whole point of induction and an epistemological claim independent of the fact that the reference of SSS0 is neither SS0 nor S0.
12 See Heck (2007, sec. 2.1) and Halbach and Visser (2014a) for some difficulties of more traditional numberings. Examples of non-standard Gödel numberings for which self-reference fails are introduced in Grabmayr (2021, sec. 2.3).
13 This concept is actually attributed to Volker Halbach, who formulated it to Albert Visser and/or Balthasar Grabmayr in a personal communication. Mortals like me who have no access to such correspondence are obliged to quote it as has been done here.
14 Indeed, as has been argued by Grabmayr (2021), these constraints assure us relative invariance with respect to the main results that are provable in first-order Robinson arithmetic. This is a huge source of intensionality in the system (Halbach and Visser 2014a, §2.1), as with two different codes we could be referring to the same expression, or with the same code be referring to different expressions, but for the present purposes it is only relevant that one is chosen.
Picollo (2018, pp. 582-583) noted an inconvenience, later corrected and presumably solved in Picollo (2020a, pp. 424-425): an expression with code £ϕ· m-refers (besides the sentence coded) to every well-formed expression whose code is smaller. However, if we see these codes as names of expressions (as is done here), we can see that there is no such problem.
M-reference is not a transitive relation: if ϕ m-refers to ψ and ψ m-refers to a, ϕ does not m-refer to a, even if we are using gödelization. Indeed, when one m-refers to an object of the domain, one does so by using a name: although an expression such as P(£Q(£0 = 0·)·) refers to Q(£0 = 0·), we do not want to say that P(£Q(£0 = 0·)·) refers to 0 = 0 (the reference of Q(£0 = 0·)). If this were the case, it would amount to saying that Q(£0 = 0·) is identical to 0 = 0, as they would both be the reference of P(£Q(£0 = 0·)·). However, they are not identical, because they are different formulae: the former states that a formula with code £0 = 0· has the property Q, while the latter states that 0 is equal to 0.
Another example may be in order. Consider the formula P(£ϕ·): this formula refers to whatever formula £ϕ· stands for, but not to whatever ϕ refers to, as the reference of ϕ can be quite different from ϕ. A similar phenomenon can be observed in natural language: if I say 'the grass is green' and right after say 'the sentence I just said has 4 words', with my last utterance I am not referring to grass, nor stating that the grass satisfies the property of having four words. With the latter, I am referring to the expression 'the grass is green' but not to grass itself.
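The failure of transitivity in the nested example above can be traced step by step with a small sketch. It assumes, purely for readability, that a code is identified with the expression it names, so that the function returns an expression rather than a number; the function name is hypothetical.

```python
def named_expression(formula: str) -> str:
    """Return the expression named by the outermost numeral £…· in formula."""
    start = formula.index("£")
    depth = 0
    for i in range(start, len(formula)):
        if formula[i] == "£":
            depth += 1
        elif formula[i] == "·":
            depth -= 1
            if depth == 0:
                # Everything strictly between the matching £ and · is what is named.
                return formula[start + 1:i]
    raise ValueError("unbalanced numeral brackets")


outer = "P(£Q(£0 = 0·)·)"
inner = named_expression(outer)       # what the outer formula is about
innermost = named_expression(inner)   # what the inner formula is about

print(inner)      # Q(£0 = 0·)
print(innermost)  # 0 = 0
```

Each call takes exactly one step along a name. Reaching `0 = 0` from the outer formula takes a second, separate call on a different formula, so the two referents are not collapsed into one.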
Returning to self-reference, there are two standard ways to enrich the language so it can afford it. One way leads us to diagonalization (strong and weak) and other related functions such as normalization (Smullyan 1957; 1994, chaps. 1 and 5); the other leads us to self-referential numberings, which were hinted at in Kripke (1975, footnote 6) and Feferman (1984, p. 80) and developed later on. 15 Here I will only focus on strong and weak diagonalization. Self-referential numberings have recently been frowned upon for their lack of monotonicity, and so are subject to a number of criticisms, such as the fact that m-self-reference is not attainable in the language (Heck 2007). This, however, was recently challenged by Grabmayr and Visser (2020), who showed that there are self-referential numberings for expressively weak languages. 16
15 Kripke actually constructed such numberings in his 1982 seminar on truth. Allen Hazen allowed me to see his notes on the seminar and for that I am grateful.
16 See also Kripke (2021).
Self-reference through strong diagonalization is achieved by enriching the language with function symbols for primitive recursive functions, such that the resulting language L^+_FOA contains a term d̥ which represents a diagonal function. Given a formula ϕ with x free, the diagonalization of ϕ returns a formula where all instances of x in ϕ have been replaced by £ϕ·. Let y⃗ abbreviate y_1, …, y_n, a possibly empty sequence of individual variables.
Lemma 4.1. Strong diagonalization: For every formula P(x, y⃗), where x is different from each of y_1, …, y_n, there is a term t such that Q ⊢ t = £P(t, y⃗)·.

Proof.
Let
From the perspective of HYPER-REF, we can see that strong diagonalization does not exactly offer us self-reference upfront, at least not as the indexical 'I' does in natural language. Indeed, r(P(x, y⃗)) is {val(x)}, but {val(x)} is equivalent neither to x nor to P(x, y⃗), but rather to {#x}, an object of the domain (a number).
Consequently, r(P(t, y⃗)), where t = £P(t, y⃗)·, is not P(t, y⃗), but rather {#P(t, y⃗)}, a number that is an object of the domain, and not the formula itself, which is not part of the domain. Thus, t is not equivalent to the sentence that says of itself that it has the property P, but rather equivalent to that sentence's name. It is self-referential not because it literally refers to itself (r(t) ≠ t), but rather because we can prove (because we know) that the sentence whose name is identical to t refers to t (r(P(t, y⃗)) = t). We can create a graph to show this better (Figure 12).
17 This result is due to Jeroslow (1973). This particular way of presenting the proof is due to Picollo (2018, p. 582).
And so, the graph never takes us back to P(t, y⃗), but to val(t) (or val(£P(t, y⃗)·) in this case), which is identical to #P(t, y⃗), a name, and not the formula itself.
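The fixed point asserted by Lemma 4.1 can be reproduced in miniature, with the sequence of parameter variables taken to be empty. The sketch below rests on several assumptions that are not part of the article: quotation marks £…· stand in for numerals, the string function `diag` stands in for the function represented by the diagonal term, and `code` is an arbitrary effective coding (UTF-8 bytes read as an integer) with no claim to monotonicity. All names are hypothetical.

```python
def code(expr: str) -> int:
    """#expr: a toy effective coding, the UTF-8 bytes read as one big integer."""
    return int.from_bytes(expr.encode("utf-8"), "big")


def decode(n: int) -> str:
    """Inverse of code: recover the expression from its number."""
    return n.to_bytes((n.bit_length() + 7) // 8, "big").decode("utf-8")


def quote(expr: str) -> str:
    """£expr·: a name for expr (a quotation here, not a genuine numeral)."""
    return "£" + expr + "·"


def diag(expr: str) -> str:
    """Replace every occurrence of the variable x in expr by the name of expr."""
    return expr.replace("x", quote(expr))


def val(term: str) -> int:
    """Value of a term of the form diag(£e·): the code of the diagonalization of e."""
    prefix, suffix = "diag(£", "·)"
    if not (term.startswith(prefix) and term.endswith(suffix)):
        raise ValueError("only terms of the form diag(£e·) are handled")
    return code(diag(term[len(prefix):-len(suffix)]))


# The term t applies diag to the name of the formula P(diag(x)).
t = "diag(" + quote("P(diag(x))") + ")"
sentence = "P(" + t + ")"   # the sentence P(t)

print(val(t) == code(sentence))   # True: t denotes the code of P(t)
print(decode(val(t)) == sentence) # True: decoding that number yields P(t)
```

The value of `t` is an integer, an object of the domain, and only the separate `decode` step connects that integer to the sentence P(t). This is the semantic counterpart of the point made in the text: what the graph reaches is #P(t), and the link back to the formula depends on our knowledge of the coding.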
On the other hand, the weak diagonal lemma states that for every predicate P there is a formula ϕ materially equivalent to P(£ϕ·) (that is, Q ⊢ ϕ ↔ P(£ϕ·)). The sentences that fulfill these requirements are either of the form ∀x∀y(x = £∀y(Diag(x, y) → P(y))· → (Diag(x, y) → P(y))) or of the form ∃x∃y(x = £∃y(Diag(x, y) ∧ P(y))· ∧ (Diag(x, y) ∧ P(y))), depending on the method chosen to define diagonalization (see Boolos, Burgess, and Jeffrey 1974; Milne 2007, §17.2). Notice, however, that d̥ represents diagonalization whereas Diag strongly represents it. In both cases, however, self-reference is not obtained upfront, because in the former case the sentence is referring to every x and in the latter it is referring to some x. What really enables us to establish that these sentences refer to themselves is what is being said of them: that there is a formula y that is the diagonalization of a formula x, that y has the property P, and that we know that ϕ is that formula.
Graphs for formulae obtained via the weak diagonal lemma are pretty straightforward, since the postnex of such a formula is just ∃x(x = £∃y(Diag(x, y) ∧ P(y))· ∧ ∃y(Diag(x, y) ∧ P(y))). Its graph is obtained by labeling the root with this formula and applying Clause 2 of Definition 2.8 (Figure 13).
The simplicity of the graph is due to the fact that there are no independent subformulae that can be pushed out of the main formula, and so the formula is only referring via the first existential quantifier. This helps to show in a heuristic way what I just stated regarding self-reference through the weak diagonal lemma: the claim that formulae like this one are self-referential rests on an epistemological fact, namely that it is because we know what is being ascribed to the referent of the formula that we can know that the formula is actually referring to itself.
There has been an enduring claim in the literature according to which diagonalization (either strong or weak) is equivalent to other mechanisms of self-reference in natural languages, such as indexicals (Kleene 1986, p. 134; Gaifman 2006, p. 713; Raatikainen 2018, sec. 4.3; Smullyan 1992, p. 11), and this appears to be the received view (Milne 2007, Appendix); others claim that they are equivalent to demonstratives (Priest 1995); while some others think of them as definite descriptions (Beringer and Schindler 2017; Halbach 2016, p. 162; Heck 2007, p. 5). However, in the light of HYPER-REF these claims look ill-founded, as this sort of epistemic self-reference is not evidently mirrored by any of these mechanisms for self-reference. This is without a doubt an interesting and open research topic.

Conclusion
The remarks made in this paper can at least help us to understand three things. The first is that there are formulae that are referentially identical but semantically distinct, such as ϕ and ¬ϕ. The second is that under HYPER-REF there is also room for referentially distinct yet semantically equivalent formulae, such as P(a, b) and P(b, a), whenever P is symmetric. Lastly, although the possibility of self-reference in formal languages has been acknowledged, claims of self-reference through diagonalization (strong and weak) rest upon epistemological claims.
The study of reference and self-reference in formal languages is still nascent. Important philosophical and logical questions remain open whose answers this article is admittedly presupposing. For me, the most pressing of these is the very possibility of reference in formal languages: reference is a term that was introduced to discuss certain aspects of natural language, a language that does not depend on an interpretation assigned to it. The fact that in formal languages we need not only the language itself, but also an interpretation of it in terms of an ordered pair 〈D, val〉, makes me question the role that the interpretation of the language has in making reference in formal languages possible.
Here I assumed that an interpretation of the language is sufficient to establish relations of reference between formulae and objects of the domain, but this is hardly the same relation that holds between sentences of natural language and objects of the world. If reference in formal languages is possible after all, then there is much to say about the differences between reference in formal and natural languages, including how different or similar the mechanisms of self-reference in both kinds of language are. As was hinted in the previous section, strong diagonalization cannot be equated with self-reference via indexicals, and as of now I see no reason to think that weak diagonalization can.
like to also thank the other members of the program (both mentors and mentees) for their valuable comments, especially Jonas Raab. My gratitude is also extended to the organizers of the event (Sophie Nagler, Hannah Pillin and Deniz Sarikaya) and the two anonymous reviewers from Kriterion.