Folklore as an evidential category

Abstract
Folklore refers to information that we have learnt as a part of the history of our own people and that has passed on from generation to generation for hundreds, or even thousands of years. This paper shows that as an information source folklore has features in common with other information sources, most notably hearsay, but it nevertheless constitutes an information source of its own, characterized as [−personal] [−direct] and [+internalized]. In addition, the paper proposes a formal-functional typology based on the element used for folklore coding. It is also shown that the semantic similarity of the coded element with the proposed definition of folklore corresponds to its frequency. Finally, the paper discusses the central theoretical implications this study has for our understanding of evidentiality.


Introduction
Most statements we make are based on some kind of evidence, which may be our own (as with visual evidence), or we may have to rely on other people's evidence, which is the case with hearsay. However, a part of our understanding of the world is based on less concrete evidence, whose origins and truthfulness are not necessarily clear, which is perhaps best manifested in folklore, traditional stories, and myths of one's own culture. Folklore may be highly relevant to our identity, but it clearly differs in its nature from e.g. visual and reported evidence. For example, we have not observed the states-of-affairs referred to in any way and we cannot cite the original source directly. This type of information source is the focus of this paper. It discusses the nature of folklore and its similarities to and differences from other information sources. Folklore is used as an umbrella term for all information sources that somehow represent traditional stories of different groups of people. In addition to providing a detailed definition of folklore and distinguishing folklore from other evidence types, a formal-functional typology of folklore coding will be proposed, and the rationale behind the typology will be discussed.
Folklore has been chosen as the topic of this paper because it differs from all other evidence types in its multi-faceted nature. However, precisely because of its heterogeneous nature, folklore has common features with many other evidence types, most notably with hearsay. This makes its linguistic coding an interesting topic, which will likely add to our understanding of evidentiality in more general terms: why certain strategies are chosen and when. Finally, and in a way more importantly in this context, there are no previous studies on folklore from any kind of broad comparative perspective.
A note on data collection is in order before proceeding further. Even though it is possible to define folklore in the way it is defined in Section 2 (without saying anything on whether this is the best, let alone the only way of defining the notion), it has been necessary to be more liberal in data gathering. Consequently, the discussion also includes cases that do not correspond fully to the proposed definition but that can be regarded as good enough manifestations of folklore. The reason for this is practical; the number of languages considered would have been very low if only genuine examples of folklore had been included in the discussion. The discussed examples therefore also include instances of narratives and (traditional) stories, regardless of the exact term/definition used in the source. The problems in finding data also mean that the present study cannot be considered a genuine typological paper: the languages discussed are neither numerous nor diverse enough, and the focus is on languages spoken in the Americas. This is not a deliberate choice, however, for the coding of folklore was most often discussed in descriptions of these languages. The goals of this paper are thus primarily theoretical in nature. This also means that the proposed typology may not be the final word on folklore coding, but at least it is a first step towards a broader understanding of the notion. It is also noteworthy that the paper does not have concrete examples of all languages whose data is discussed. This is inevitable because not all sources consulted provide actual linguistic examples of folklore coding but only mention the evidential marker that typically appears with folklore.
The organization of the paper is as follows. In Section 2, folklore, as the notion is understood in this paper, is defined. Section 3 illustrates a formal-functional typology based on the element used for folklore coding, while Section 4 discusses the rationale behind the attested types. Section 5 is concerned with the central theoretical implications this study has for our understanding of folklore and evidentiality in more general terms. Section 6 presents a brief summary.

Defining folklore
Folklore is far from being a homogeneous notion, and different cultures differ enormously in how they treat their traditional stories and also in whether folklore is oral and/or written. Despite the evident differences in the nature of folklore, some common traits can be given that can be seen as independent of specific cultures and that are characteristic of folklore regardless of the culture whose folklore we are dealing with. In this paper, folklore is understood as an information source that meets the following requirements:
1. Folklore presents the (oral or written) heritage of one's own culture including myths, traditional stories, history etc., which has been passed on from generation to generation.
2. The speaker has not been involved in the events depicted in any way, and they consequently do not have any personal evidence (of any kind) for the information they are referring to.
3. Folklore resembles reported evidence, but in contrast to typical reported evidence, the original source of information is completely unknown.
4. Due to the origin and nature of folklore, the speaker has no evidence for or against its truth value. However, due to its importance for one's own culture, the speaker may believe folklore to be true and have subjective certainty of its truth value.
The first characteristic distinguishes folklore from stories that may be created and told by anyone, such as (written) fiction, jokes, bedtime stories, and stories based on real events that we have experienced ourselves. Everyone can tell these kinds of stories, and we have the right to create them. On the other hand, we do not have the right to create new stories considered folklore. Moreover, we may tell stories and create fiction about any topic we can possibly imagine while folklore is restricted to quite specific events, whose nature is very culture-dependent. Finally, the speaker may have been involved in their own stories and/or they may have witnessed them while folklore has happened in the remote past, which excludes the speaker's own involvement in the events depicted. Second, the speaker does not have any kind of personal evidence for folklore whereas we may have different kinds of evidence for most other events. For example, we may experience the event 'John and Lisa are having a barbeque' via our senses (visual, olfactory), we may infer or assume this, based on some indirect evidence, or we may hear this from someone else. On the other hand, we may only have reported or similar evidence for folklore. This also has the consequence that the reliability of folklore does not vary contextually, while context and the type of evidence are highly relevant to other events. The differences between information sources are relevant for normal events, and for them claims based on visual evidence are more reliable than those made on mere hearsay, but these differences are lacking for folklore. In this sense, folklore resembles general knowledge for which the reliability of the original information source is irrelevant when the given piece of information has become internalized by the speaker (see Kittilä 2019 for the coding of general knowledge). 
Moreover, the evidence we have for other events may be temporally recent ('Lisa was just having breakfast') or more remote ('John visited Botswana 25 years ago') while folklore is always located in the remote past that has not been witnessed by any person alive at the moment when folklore is referred to. The exact timing is usually unknown, and it is also irrelevant.
Third, because folklore constitutes information that has been passed on from generation to generation, its transmission resembles reported evidence. However, folklore differs from typical reported evidence in that the speaker usually knows their source for hearsay, and it is always known in the case of quotation, but the speaker cannot name the original source nor have they learnt the given piece of information from its original source for folklore. Because we usually know the original source of information for reported evidence, our judgement on the reliability of the information at hand is influenced by how reliable we find the original source of information to be. On the other hand, we usually consider folklore as highly reliable evidence, regardless of the source we have learnt it from. The reliability follows from the nature of the information itself, not from its source; folklore is reliable evidence due to its relevance for us as representative of our own culture.
Fourth, we can prove the truth value of most of our claims by referring to the non-linguistic occurrence our claim is based on. For example, we can prove a statement like 'Lisa lives in a big house' by driving to Lisa's place and showing that this is actually the case, or we can disprove this by showing that Lisa lives in a tiny flat. Put another way, we usually have a way of providing some evidence for or against statements concerning the states-of-affairs we are referring to. In contrast, this kind of evidence is not available for folklore. This would require that we can refer to the actual events folklore is based on and concretely show that they are either correct or incorrect. The special nature of folklore is manifested also in the fact that we believe folklore to be true even though we have no concrete evidence for its truth value. Finally, new reliable evidence that contradicts our current understanding of the world usually changes our ideas about normal states-of-affairs (including facts and pieces of general knowledge), but similar contradicting evidence is not available for folklore. This follows from the subjective nature of the speaker's certainty. As such, there is some subjectivity involved whenever the speaker has no direct (sensory) evidence for their claim. For example, the choice between inferential and assumptive evidentials is often determined by how reliable the speaker finds the available evidence to be. Folklore, however, differs from these evidence types in that the speaker is not making their decision based on whether they find the available evidence reliable or not, but the choice is based on the fact that believing folklore to be true is a part of our own culture.
Above, folklore has been discussed in light of features relevant for defining folklore for this study and distinguishing it from other information sources. Below the notion will be discussed from a more general perspective, showing how it relates to other information sources. Plungian (2010: 37) serves as the basis for the discussion, but I have added the feature [+/−internalized] in order to render an explicit distinction between folklore and some closely related evidence types possible:
Direct/personal (=attested, witnessed, firsthand, confirmative)
- Participatory/endophoric; common knowledge
- Visual (with subtypes)
- Non-visual (sensory)
Indirect/personal
- Inferential (based on observed results)
- Presumptive (based on plausible reasoning) (common knowledge)
Indirect/non-personal (secondhand)
- Reported (with subtypes)
As can be seen above, folklore is not considered in Plungian's typology directly, but if we consider its nature in light of the features given above, folklore represents indirect and non-personal evidence. This classification, however, does not distinguish folklore from reported evidence because the two evidence types have the exact same definition in light of these features. Consequently, the feature [+/−internalized] has been added to the definition in order to enable an explicit distinction between reported evidence and folklore. This means that folklore has become the speaker's internal information (in the same way as general knowledge and facts, see Kittilä 2019), while this does not apply to reported evidence. The definition proposed for folklore in the present paper is thus [−direct], [−personal] and [+internalized].
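The feature-based reasoning above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names `Source`, `SOURCES`, `distinguishing_features` and `collapsed_without_internalized` are hypothetical, the values for [direct] and [personal] follow the classification cited from Plungian (2010: 37), and [internalized] is specified only for folklore and reported evidence because the text gives no value for the other sources.

```python
from typing import NamedTuple, Optional


class Source(NamedTuple):
    """Feature bundle for one information source (hypothetical encoding)."""
    direct: bool
    personal: bool
    internalized: Optional[bool]  # None = value not specified in the text


# [direct] and [personal] follow the three groups of the cited classification.
# [internalized] is filled in only where the text states a value (assumption:
# leaving the rest unspecified is the most neutral choice).
SOURCES = {
    "visual": Source(True, True, None),
    "non-visual sensory": Source(True, True, None),
    "inferential": Source(False, True, None),
    "presumptive": Source(False, True, None),
    "reported": Source(False, False, False),
    "folklore": Source(False, False, True),
}


def distinguishing_features(a: str, b: str) -> list:
    """Names of features on which both sources are specified and differ."""
    fa, fb = SOURCES[a], SOURCES[b]
    return [
        name
        for name in Source._fields
        if getattr(fa, name) is not None
        and getattr(fb, name) is not None
        and getattr(fa, name) != getattr(fb, name)
    ]


def collapsed_without_internalized(a: str, b: str) -> bool:
    """True if two sources are identical once [internalized] is ignored."""
    fa, fb = SOURCES[a], SOURCES[b]
    return (fa.direct, fa.personal) == (fb.direct, fb.personal)


if __name__ == "__main__":
    print(collapsed_without_internalized("folklore", "reported"))
    print(distinguishing_features("folklore", "reported"))
    print(distinguishing_features("folklore", "inferential"))
```

On this encoding, folklore and reported evidence fall together when only [direct] and [personal] are considered, and the added feature is the only one that separates them, which is the point made in the text.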
Formal-functional typology of folklore coding
In this section, a formal-functional typology of folklore coding will be proposed based on the element languages use for referring to folklore. Only markers with clear evidential functions are considered, and possible other elements (such as tense or aspect markers) are not taken into account (even though some markers may also comprise other elements, see e.g. [2] below). The focus of this section lies on the illustration of the different types; their rationale (e.g., how well they correspond to the semantics of folklore) will be discussed in Section 4. The order of the discussed types does not correlate with their frequencies.
A couple of methodological notes are in order before proceeding to the typology itself. First, languages where folklore is coded by a multi-functional zero, whose evidentiality value is not specified, are excluded because zero can be seen as a kind of default marking, and its exact evidentiality value is determined contextually. This means that, for example, languages like Finnish, German and English are excluded. Second, the typology proposed is based on the discussion of the data in the sources consulted. It is possible, for example, that a certain language may employ a variety of mechanisms for coding folklore, but this may have gone unnoticed by the author of a given grammar. This is, however, not a major problem because the goal of this study is to propose a typology of language types (i.e. what are the possible mechanisms attested), for which individual languages merely serve as data. In practice, this means that a given language may be classified as belonging to the wrong type, but this does not affect the typology itself. Rather closely related to this, I will not provide any real statistics for the discussed language types.

Type 1: Dedicated folklore marker
The first type is illustrated by languages that have a marker whose primary function is the coding of folklore. Some examples are found in (1)-(3):
(1) Ladakhi
mi-gun i-ləm-ne čhen-yot-kək.
man-PL PROX.DEM-path-ABL go-REP-NARR
'Men had been passing by this way (lit. going from this way).' (Koshal 1979: 206)
(2) Matses
matses-n cun tita bed-pa-ac
Matses-ERG 1.GEN mother grab-comment-NARR.PST
ca-denne-c ubi usun-sho.
say-REM.PST-IND 1.ABS be.pregnant.with-when:S/A/0>0
'They tell that Matses captured my mother while she was pregnant with me.' (Fleck 2003: 421-422)
(3) Yuki
se=éi hul-koʔói náu-mil.
and.DIFFERENT.SUBJECT=MYTHIC eye-gopher look-FINITE
'And Coyote watched.' (Mithun 1999: 199)
In Ladakhi, the marker -kək is used in narratives without connotations of oral or second-hand evidence. The use of the marker indicates that the speaker has not witnessed the events of their story, and that they do not have any direct evidence for them (Koshal 1979: 185, 205). The use of the narrative affix seems to be confined to stories, which makes it a dedicated folklore marker as the term is defined in this paper. In Matses, a specific form is used for mythic and historical past (Fleck 2003: 421). This form is a combination of the topic continuity affix -pa and the affix -ac that is formally identical to the recent past inferential. In addition, the quoted sentence bears the remote past experiential marker -denne. Speakers may use somewhat different strategies for this, but the combination pac + denne is the most common one (Fleck 2003: 421-422). Hearsay evidentials cannot be used for this. Finally, the marker in Yuki refers, according to Mithun (1999: 576), to mythical time rather than source of information, which is rather directly compatible with the semantics of folklore even though the marker is not strictly speaking a marker of the information source.
In addition to the three languages above, markers that come close to being primary folklore markers are attested at least in Tonkawa and Kalapalo. In Tonkawa (Hoijer 1933-1938: 140, see also Mithun 1999), the suffix -lakno'o is used in myths to show that the events depicted happened a long time ago. The use of the suffix underlines the fact that the speaker has not witnessed the events themselves and that the people who have told them the story have heard it from someone else. If the current speaker has heard a story from someone who has actually witnessed the events personally, the quotative affix -no'o would be used (Hoijer 1933-1938). Kalapalo illustrates an interesting case in that it has three markers that emphasize different semantic aspects of folklore (the following illustration is from Basso 2012). The first one of these, -kita, is the most evident folklore marker used in historical narratives. It also appears in the leader's speeches. The second marker, -kili, refers to inherited knowledge, an aspect that is characteristic of folklore even though inherited knowledge comprises more types of information than just folklore. Third, the suffix -tï codes authoritative hearsay that the speaker has not personally witnessed but has learnt from other authorities.
Finally, there are two further types of languages that do not have a dedicated marker for folklore per se, but where a specific coding strategy is employed for folklore. First, there are languages that use a specific combination of markers discussed in more detail in Section 3.5. The second type is illustrated by languages that code evidentiality by grammaticalized markers (which are at least to some extent obligatory), but where evidentiality coding is omitted for folklore. This is attested in Nganasan, where verbs usually contain information about the information source, but evidentiality coding is omitted in clauses describing folk tales or myths (Gusev 2007). On the other hand, the reported evidential does appear in narratives describing something that the narrator learnt from particular people, most often from their ancestors. It is also used by a shaman recounting what spirits have told. The use of the reported evidential presupposes the existence of a firsthand information source. Such a source is lacking for folklore, so the omission of evidentiality coding corresponds well to the semantics of folklore. Nganasan thus has a specific strategy for folklore coding even though it lacks a specific dedicated marker for this.
To summarize, even though folklore constitutes a rather specific type of information source, there are languages that code it with dedicated markers and provide us with the best possible evidence for the existence of folklore as an independent information source. It is interesting that most of these languages are spoken in the Americas (especially in North America). Their number is too low to make any valid generalizations, but it would be interesting to investigate whether the occurrence of (a) dedicated folklore marker(s) reveals something about how different cultures view folklore.

Type 2: Direct evidential
In the second type, folklore is coded by direct evidentials. This type is very rare in my data; the only potential attestation is found in Wanka Quechua. According to Floyd (1993: 102-106), direct evidentials may also be used in Wanka Quechua in cases where the speaker has no direct evidence for their claims, such as stories about their parents' childhood and the history of their own people. Floyd suggests that information about the history of one's own people is a part of the speaker's personal sphere, which triggers the use of a direct evidential (Floyd 1993: 102-106). This goes well with the internalized nature of folklore discussed in Section 2.
However, we should note that in Wanka Quechua the choice of evidential seems to depend on who is speaking of folklore. Shamans have the epistemic authority to speak about it as their own experience because they are believed to have direct access to folklore, which enables them to use direct evidentials in the same way as normal people may use direct evidentials when speaking of events they have witnessed, e.g., visually.

Type 3: Indirect evidentials
The third type is represented by languages where different kinds of indirect evidential markers are employed for folklore coding. This type comprises all markers that code information sources that have been labelled as [+personal] and [−direct] by Plungian (2010: 37). As the discussion below shows, the exact nature and terminology used for these markers in different sources varies enormously (see also Keinänen 2017 for terminology of evidentiality), but they all have in common that they constitute the speaker's personal, yet indirect information. In other words, the speaker has some kind of evidence, but they have not witnessed the event depicted directly. A typical example is illustrated by inferentials that may be used, e.g., in cases where we have visual evidence for the result but not for the cause. In addition, many languages seem to employ markers that indicate that the speaker has not witnessed the state-of-affairs they are referring to, but the nature of evidence is not specified in any detail (e.g. it could be general knowledge, assumption etc.).
DET man clever=REL=ERG 3A-fool-V-REP-CP=EV DET other=REL man
'The clever man again fooled another man (they say).' (Faarlund 2012: 138)
(6) Slave
ejǫ sįá xaokedak'ǫ nǫ.
there CONJECT 3PL.made_fire apparently
'She found a place where they had made fire.' (Rice 1989: 397)
The particle -kɨ of Comanche indicates that the speaker has no first-hand evidence for their story. In addition, the particle is used in folktales and it is also used for citing other people's stories (in which function the particle could also be classified as a reported evidential). In the first function, it is best regarded as a general non-first-hand evidential. The clitic =ʻuŋ 'allegedly' of Chiapas Zoque is functionally rather close to -kɨ; it also means that the speaker only has secondhand evidence for their claim, and it is used in folklore. Finally, the Slave particle sįá codes conjecture and it also appears in tales. Example (6) is from a tale, in which it implies that certain events and places have been mentioned but they are not known to the speaker. Moreover, the indirect inferred marker also occurs in folktales if the speaker knows its details.
As noted above, indirect evidentials are rather commonly used for coding folklore. Examples similar to those illustrated in (4)-(6) are also attested, for example, in Meithei (Chelliah 1997: 224), and Kolyma Yukaghir (Maslova 2003: 231). The exact semantics of the indirect evidential used varies; the marker may be a kind of general non-first-hand evidential or its other functions may include assumption or inference. Many of the indirect markers imply that the speaker has no evidence of their own, which is very well in line with the basic semantics of folklore. Moreover, in cases where folklore concerns the creation of one's people, we may say that we have evidence for the result but not for the cause.

EVID
'"You, Raven, how come you have such a pretty wife?" they said.' (Holton and Levick 2008: 4, cited in Tenenbaum 2006: 90)
Seminole Creek has a suffix that Nathan (1977: 115) has labelled a quotative-distant past marker. The marker means that the speaker has not witnessed the event depicted but has only heard about it. Moreover, the use of the marker usually also implies that the state-of-affairs referred to happened a long time ago. These features explain the use of the suffix for folklore coding. Assiniboine has both a general hearsay marker (káya / káa / káyapt), which translates best as 'they say', and a quotative particle hųštá, which means more or less 'it is said/so they tell it' (Cumberland 2005: 330). Both markers imply that the speaker has no evidence of their own for their claim. The quotative is primarily used in narratives, especially in those that refer to events that happened in the remote past. However, hearsay and quotative markers are at least to some extent interchangeable in folklore (Cumberland 2005: 334). Finally, Dena'ina has the hearsay inferential particle ɬu that is common in traditional narratives. In contrast to Seminole Creek and Assiniboine, the primary function of the marker is to code general hearsay, not quotation. The use of reported evidentials for folklore coding is rather common across languages (see Willett 1988: 57 and Aikhenvald 2012: 270 for similar notes), and examples similar to those in (7)-(9) are attested also in, e.g., Cora (Casad 1984: 179), Wintu (Aikhenvald 2004: 314, cited from Pitkin 1984), Jamul Tiipay (Miller 2001: 276-278), Huastec (Edmondson 1988: 389), and Kham (Watters 2002: 299-300) among many others.
As noted above, Type 4 comprises both general hearsay and quotative evidentials. It is interesting that in most languages of Type 4, it is the quotative, and not the general hearsay marker, that is employed for this function. This may have good semantic (and/or pragmatic) reasons (see Section 4 for a discussion) but it may also follow simply from terminological choices made by the authors of the consulted grammars. It would be interesting to study the evidentiality systems of Type 4 languages in more detail in order to find out whether they actually have two reported evidentials or whether languages of other types have a quotative (while using another evidential for folklore coding). This, however, lies outside the scope of this paper.

Type 5: Combination of different markers
Thus far, the discussion has been concerned with languages that either use a dedicated marker for folklore or use a single one of the available markers for coding folklore. Type 5, in turn, comprises languages that use a combination of different markers for folklore coding. As already noted above, these languages could be classified as Type 1 languages because the mechanism (combining two markers) they employ for coding folklore is unique. However, they are here viewed as a type of their own because strictly speaking we are not dealing with markers that are used exclusively for folklore coding. Moreover, my goal is to illustrate the entire variety of mechanisms languages employ for coding folklore; for this reason, these languages are discussed separately. Type 5 does not seem to be very common, but two somewhat different manifestations are found below:
Kyuquot, Nuu-chah-nulth dialect
huya•ɬ-(y)i:-č hiɬ-a•c-qin.
dance-INDEF-INF there-at.bow-at.head
'It was dancing in the bow of the boat, the story goes.' (Rose 1981: 229)
As shown in (10), a combination of inferential and hearsay markers is used for coding folklore in Qiang. In Kyuquot, a combination of inferential evidential and indefinite mood occurs in similar cases, e.g. stories and religious dogma, and in sentences whose contents are believed to be true by the speaker or that have recently come to their attention (Rose 1981: 229). LaPolla does not elaborate on (possible) other uses of the hearsay-inferential combination, so it seems to be exclusively used for folklore coding. A potential further example is attested in Jarawara, where a combination of non-first-hand and reported appears for folklore (Aikhenvald 2012: 270), but the use of the reported evidential for this function is not obligatory even though it appears in 90% of the cases (see also Dixon 2004: 203 for a different view on this).
Finally, Matses, discussed in Section 3.1, could also be seen as an example of Type 5 but Matses is not considered a Type 5 language here because one of the markers involved (topic continuity suffix) is not related to evidentiality in any way.

Variable marking
The last type is represented by languages in which the coding of folklore varies according to its nature. This kind of variation occurs naturally with stories (understood in a broad sense) because stories may differ, e.g., depending on whether the speaker has been involved in them themselves or not, or whether they are fiction or based on real events. In other words, different types of stories constitute different kinds of information source, and they may be based on different types of evidence, which very well accounts for their differential coding. This is directly manifested in Tariana, where all five evidentials of the language may appear in tales/stories (understood in a broad sense, see Aikhenvald 2004: 311-312). In this paper, the discussion is limited to cases that correspond to the definition of folklore given in Section 2. This means, for example, that only two of the Tariana evidentials are relevant here.
This type does not seem to be very common (when restricted to folklore only), but relevant cases seem to be attested in Tariana and Desano. In both languages, the evidentials participating in the variation are reported and inferential evidentials, which are functionally rather common representatives of these categories in both languages; a reported evidential is used when the speaker has not witnessed a state-of-affairs personally, while an inferential evidential codes events that the speaker has not witnessed but infers to have occurred based on some observable evidence. What is interesting, however, is that basically the same evidentials seem to work very differently in these two languages. In Desano (Kaye 1970: 32-35), reported evidentials appear in traditional, oral literature of one's own culture, while an inferential is used in stories and legends of other cultures that are not a part of the tradition of the Desano people. On the other hand, in Tariana, an inferential is used in culturally important stories, such as the travels of the Tarianas' ancestors (who are supposed to have left signs, like stones and caves etc., behind them), while most other stories are told using a reported evidential (Aikhenvald 2003: 139-140). The inferential is thus used for one's own culture in Tariana while it codes other peoples' stories in Desano. On the other hand, a reportative is used for one's own folklore in Desano, and for other stories in Tariana.
In Desano and Tariana, the variation in coding is motivated by the inherently different nature of the coded instances of folklore. Moreover, there are languages where the coding may vary according to the person who is talking about folklore. An example was discussed above in light of Wanka Quechua, where high, religious authorities may use the direct evidential to refer to folklore while normal people must use an indirect evidential. In this sub-section, the focus is on languages like Tariana and Desano, where the variation attested is available for all speakers of the language, but a note on Quechua is nevertheless in order.

Rationale
In this section, the rationale behind the attested language types is discussed. The discussion is motivated by how well a given marker corresponds to the definition of folklore as based on the features [+internalized], [−personal] and [−direct]. This multi-faceted nature of folklore is directly manifest in its coding, and languages may stress one or more of these features yielding massive variation in its formal treatment. It will be shown that the semantics of folklore and that of the used marker explain quite well why certain markers are more common than others. The discussion proceeds from markers that correspond best to the definition to markers whose semantics is least compatible with folklore.

Dedicated folklore markers
The first type is illustrated by languages that employ a dedicated marker for coding folklore. Because folklore is defined as information that is [+internalized], [−personal], and [−direct], the semantics of dedicated folklore markers corresponds directly to the semantic definition of folklore, as expected. This definition is naturally somewhat circular, but in a similar vein, visual evidentials code visual evidence, and their semantics directly corresponds to what we view as visual evidence.
Because dedicated folklore evidentials code exactly the type of information that is here viewed as folklore, we could expect this type of marker to be the most frequent marker type across languages. However, their cross-linguistic frequency is not very high when compared, for example, to visual evidentials. There are good reasons for this. Languages need dedicated visual evidentials, because visual evidence is a very important way of gathering information about the surrounding world; we witness hundreds of states-of-affairs visually every day (if we are not visually impaired, that is). In addition, the distinction between visual and e.g., hearsay evidence may be communicatively highly relevant because statements based on visual evidence are more reliable than those based on hearsay. Moreover, we are usually not able to infer from contextual cues or by means of our world knowledge whether a given statement is based on visual or hearsay evidence, and we may have both kinds of evidence for most events. These features do not apply to folklore. First, the number of states-of-affairs we may refer to as folklore is very low; in other words, we do not witness a range of events regarded as folklore daily. Second, folklore is usually not in contrast with other evidence types and it is thus not relevant to mark a piece of information explicitly as folklore, e.g., as opposed to hearsay evidence. Consequently, there is no urgent communicative need for dedicated folklore markers.

Reported evidentials
The evidential marker that corresponds second best to the semantics of folklore is represented by reported evidentials. Reported evidence is not our internal evidence (and it is thus not [+internalized]), but it is [−personal] and [−direct]. Consequently, folklore and reported evidence have a number of features in common. In contrast to dedicated folklore markers, this semantic similarity is manifested in the cross-linguistically frequent use of reported evidentials as markers of folklore. This seems intuitively rather natural because folklore can be viewed as one type of story, and stories are often based on other people's experiences. We thus depend on other people's evidence in case we have not experienced the events of the story personally. The same applies to reported evidence, which rather directly explains the use of reported evidentials for folklore coding.
As discussed above, the use of reported evidentials for folklore coding follows very naturally. One further intriguing point, also noted above, is that languages seem to favour quotative evidentials over hearsay evidentials for folklore coding. This may, as also noted above, be due to terminological choices, or it may follow from the fact that languages where any kind of reported evidential is used for folklore coding happen to be richer in quotatives. However, this unequal distribution may well also be rooted in the semantics of the given evidentials. General hearsay evidence is second-hand evidence, whose source may be unknown to the speaker and whose nature is not specified in any way. This type of evidence thus probably illustrates the least direct evidence type. Quotatives differ from hearsay evidentials in that they specify the source of our second-hand evidence. The semantics of folklore is thus more compatible with quotatives. First, even though the speaker usually has not learnt a given piece of folklore from its original source, our ancestors or some kind of high authority may be seen as the source, who is quoted by using quotatives. Second, hearsay and quotative evidentials differ from each other also in that quotatives may, in favourable contexts, be used to provide more reliable evidence for a given claim. For example, we may quote a high authority, such as the principal of a school, to provide more convincing evidence for our claims. This difference is manifest, for example, between statements like 'They say that there will be no classes tomorrow' and 'According to the principal there will be no classes tomorrow'. The latter of these constitutes a more convincing claim because we are quoting the person who actually has the right to cancel classes. Because folklore is part of the history of our people, it can be regarded as reliable information, which also favours the use of quotatives rather than hearsay evidentials. Finally, quotatives constitute more objective evidence because we are simply quoting someone without our own subjective evaluation.

Non-first-hand/indirect evidentials
The next type is represented by non-first-hand evidentials that are also commonly used for coding folklore. They are [−direct] and [−internalized] evidence but in contrast to reported evidence they are [+personal] according to Plungian's (2010) classification. This means that the speaker is making a statement based on their own observation, but differently from, e.g., visual evidence, they have not observed the state-of-affairs referred to directly. The personal nature of non-first-hand evidence is in clear contrast to folklore that constitutes non-personal evidence. Moreover, folklore is not based on inference or assumption, but the speaker has learnt it as oral tradition from their own people. In these two respects, reported evidentials correspond more closely to the semantics of folklore.
Despite the clear differences between reported and non-first-hand evidentials, non-first-hand evidentials occur rather naturally with folklore due to their [−direct] nature. We may say that non-first-hand evidentials concern more the nature of the information, while hearsay evidentials focus more on its source (which makes the information in question indirect and non-personal). We may also note that inferentials seem to be more common with folklore than assumptives. As in the case of hearsay and quotative evidentials, this may follow simply from terminological choices or from the general cross-linguistic predominance of inferentials, but it may also have a functional explanation (see also de Haan 2001). Inference represents more reliable information than, for example, assumption because inference is typically based on some kind of concrete evidence (such as the result of an event), whereas assumption is typically based on our general knowledge of the world. In light of this, it is rather natural that languages opt for inferentials for folklore coding in case either of these two markers is chosen. Folklore can, however, also be seen as a kind of general (internalized) knowledge, which could favour the use of assumptives, since these are used when the speaker makes a claim based on their general knowledge of the world. In any case, inferentials seem to be more common with folklore.

Direct evidential
Direct evidentials are very rarely, if ever, used as markers of folklore (the Wanka Quechua case discussed above is the only potential instance of this that I have come across). It is highly likely that this is not a mere coincidence but is directly explained by the very evident semantic differences between direct evidence and folklore. Direct evidentials do not correspond to the proposed definition of folklore in any way; rather, they constitute the exact opposite of it. Put another way, direct evidence is [−internalized], [+direct] and [+personal]. In Wanka Quechua, where direct evidentials may be used for folklore coding, only high authorities may under favourable circumstances employ direct evidentials for this purpose. This follows because they are deemed to have direct access to the original source of folklore. This accounts rather well for the use of direct evidentials here, because it parallels their typical uses: in both cases, we are dealing with reliable evidence. This resembles the use of direct evidentials for coding general knowledge/facts in that, e.g., visual evidence and general knowledge constitute very different information sources but share the feature of high reliability (Kittilä 2019).

Combinatory type
Languages that code folklore by a combination of markers differ from the languages discussed thus far in that the nature of the evidence they code cannot be directly labelled as representing a specific type because their exact definition depends on which markers are combined. However, the type per se is intriguing because it underlines the very heterogeneous nature of folklore; in some languages a single marker does not suffice for capturing the exact nature of folklore, but two markers need to be combined. For example, in Qiang, inferential and hearsay evidentials together code folklore; either of them alone would yield the wrong impression of how the evidence in question should be interpreted. As shown in this paper, in some languages either of these markers alone suffices for folklore coding, but combining them underlines the peculiar nature of folklore. Two other things are relevant in this regard. First, all markers found in the attested combinations are indirect evidentials. The sample is naturally very small, but the use of indirect evidentials provides more evidence for the indirect nature of folklore as an evidence type. Second, and more importantly, it seems that this kind of combinatory marking is attested only for folklore (and perhaps other stories), which provides more evidence for the multi-faceted nature of folklore and its special place among evidence types.

Variable type
The variable type resembles the combinatory type in that it is harder to classify on the basis of a single evidence type. For example, Desano and Tariana may be distinguished according to which feature of folklore they focus on. As discussed above, an inferential evidential codes other peoples' folklore in Desano, while in Tariana it appears with the folklore of one's own culture. Reported evidentials work in the opposite way. Inferentials code more direct evidence than reported evidentials in that inference is based on the speaker's personal, yet indirect, evidence, while the speaker has no evidence whatsoever of their own for reported evidence. In other words, Tariana may be said to emphasize the directness of information, since folklore of one's own culture may be viewed as more direct and reliable evidence for us. Desano, in turn, places more stress on the original source of information by using reported evidentials. This kind of variation is analogous to, e.g., the differential formal treatment of inference and assumption in some languages. These two evidence types are close to each other, but some languages distinguish between them explicitly, based, for example, on the nature of evidence. In a similar vein, some languages distinguish between different types of folklore explicitly.
The other type of variation briefly discussed in Section 3.2 cannot be directly explained by the semantic nature of evidence we have, but context and epistemic authority, along with the right to know and access to knowledge, play a more important role. In other words, in Wanka Quechua, the variation between direct and indirect evidentials in the coding of folklore is determined by whether the speaker is viewed as having direct access to the coded information or not. This does not necessarily correspond to whether the speaker actually has direct access to the information, but rather, it is determined by whether they are viewed as having epistemic authority.

Summary
In this section, the rationale behind the formal types of folklore coding has been discussed. The variation in coding is expected because folklore is a very heterogeneous notion and languages may thus stress any of the features associated with it in its coding. The types display massive variation as regards which feature(s) they are based on. What is interesting is that the number of languages using a certain coding strategy seems to correlate with how well the given type corresponds to the semantic definition of folklore; the closer a given type is to folklore, the higher the number of languages employing the given strategy is. The only exception to this is presented by dedicated folklore markers that are attested but are less common than e.g. reported and indirect evidentials. This generalization should be approached with caution, however, because the sample this study is based on is rather limited and a larger study may make revisions necessary.
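The feature-based comparison in this section can be made explicit with a small illustrative sketch. The Python snippet below is not part of the analysis itself; all names in it are hypothetical, and the [−internalized] value for direct evidentials is an assumption derived from the statement that direct evidence is the exact opposite of folklore. It merely counts how many of the three features each single-marker type shares with the definition of folklore, and it says nothing about actual cross-linguistic frequencies (for which, as noted, dedicated folklore markers are an exception).

```python
# Feature bundles as described in the text: True stands for "+", False for "−".
FEATURES = ("internalized", "personal", "direct")

# Folklore is defined as [+internalized], [−personal], [−direct].
FOLKLORE = {"internalized": True, "personal": False, "direct": False}

MARKERS = {
    "dedicated folklore": {"internalized": True, "personal": False, "direct": False},
    "reported": {"internalized": False, "personal": False, "direct": False},
    "non-first-hand": {"internalized": False, "personal": True, "direct": False},
    # Assumption: "internalized" is set to False because direct evidence is
    # described as the exact opposite of folklore.
    "direct": {"internalized": False, "personal": True, "direct": True},
}


def shared_features(marker):
    """Count the features on which a marker type agrees with folklore."""
    return sum(marker[f] == FOLKLORE[f] for f in FEATURES)


# Order the marker types from the best to the worst semantic match.
ranking = sorted(MARKERS, key=lambda name: shared_features(MARKERS[name]), reverse=True)

for name in ranking:
    print(f"{name}: {shared_features(MARKERS[name])} of {len(FEATURES)} features shared")
```

Under these assumptions, the resulting order (dedicated folklore markers, reported, non-first-hand, direct) mirrors the order in which the marker types are discussed in this section.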

Theoretical implications
First of all, and perhaps most importantly, I have shown that folklore constitutes an evidence type in its own right, which is best manifested in languages such as Ladakhi and Matses. Folklore unarguably has common features with some other evidence types but it also differs drastically from all other evidence types, which is manifested, for example, in the fact that Plungian's classification cannot deal with it directly and that the additional feature [+internalized] is needed for this. The present study has thus shown that internalized evidence should also be included in our understanding of evidence types. In other words, evidentiality is not only about what we can concretely witness, for example, by our senses; what we know without any concrete evidence is also relevant, and we can make claims based on this type of evidence. Evidentiality is thus not only about the information source but it relates to speaking about knowledge in broader terms. The notion comprises basically every possible way of gathering information about the world we live in.
Second, rather closely related to the first point, the discussion in this paper has provided more evidence for the fact that evidentiality is not only about the information source but that how we conceptualize it and who has the right to know also make an important contribution (see e.g. Evans et al. 2018 and Floyd et al. 2018 for more detailed discussions). First, as an evidence type, folklore is basically the same for all cultures (even though there are significant differences in the traditional stories of each culture) and it can be defined in the way proposed in Section 2. However, as the discussion here has shown, there is massive variation in the formal treatment of folklore. This means that speakers of different languages conceptualize the same type of evidence differently by according it different coding. This is perhaps most evident in languages such as Tariana and Desano, where different types of folklore receive different coding. In a similar vein, the speaker chooses between inferential and assumptive evidentials, on the one hand, and general hearsay and quotative evidentials, on the other (if these distinctions are available in the given language) depending on how they conceptualize their evidence and/or how they want to present it to the hearer. Second, languages such as Wanka Quechua underline the relevance of the right to know or epistemic authority for our understanding of evidentiality. As has been shown by, e.g., Evans et al. (2018), evidentiality is not only about the speaker's information source but the right to know also plays an important role (see also Bergqvist 2016 and Floyd et al. 2018). As regards the coding of folklore, the choice of evidentials is determined rather directly by the identity of the speaker, and we may say that the right to know is very strongly integrated in the culture. However, the basic principle is the same as, for example, in Kogi (Bergqvist 2016), where the speaker chooses an evidential based on whether they assume epistemic authority or not. Put together, these two features provide more evidence for the non-objective nature of evidentiality.
Third, even though there are languages that code folklore by a dedicated marker, the number of these languages is quite low, which is probably no coincidence. The speaker is, at least in the great majority of cases, aware of their information sources, but these are not known to the hearer. The nature of the information source may be communicatively very relevant because the speaker's information source largely determines how reliable the hearer finds a given statement to be. For example, claims based on visual evidence are more reliable than those based on hearsay. It is thus communicatively important to have specific markers for, e.g., direct and indirect evidence because the hearer is usually not able to infer the type of evidence we have, which renders explicit coding necessary. On the other hand, we usually know the folklore of our own people, and it only has one possible source, which excludes any kind of contrast between e.g. visual and hearsay evidence. This makes dedicated folklore markers rather redundant communicatively, probably contributing to their low cross-linguistic frequency. Moreover, folklore resembles other evidence types, especially reported and indirect evidence, which makes it economical to use these existing markers for its coding. This is also manifested in the nature of the evidentiality systems; languages with dedicated folklore markers usually have a rather rich array of evidentials at their disposal. Moreover, grammaticalization of evidentials does not start with folklore or factual evidentials. Folklore is highly relevant to our own identity, however, which may explain the fact that some languages have a dedicated marker for folklore despite its communicative irrelevance. In this regard, it would be very interesting to take a closer look at how important folklore along with the history of one's own people is for those cultures where these languages are spoken.
Fourth, the discussion in this paper is relevant to how certain evidence types are coded in languages. Languages show uniformity in the coding of the basic evidence types, i.e., languages have distinct evidentials for sensory evidence, inference/assumption and reported evidence (naturally it depends on the language which of these actually exist). But languages display more variation in how they deal with other types, such as folklore, dreams and general knowledge. This variation reflects the type of evidence coded rather directly; general knowledge, for example, is usually coded by direct evidentials (Kittilä 2019) while there is massive variation for folklore. It seems to be most typical of languages to code folklore by some kind of reported evidential, which is in line with its semantic nature. Folklore also clearly resembles common knowledge and endophoric evidence, but most languages seem to opt for coding it by reported evidentials, stressing rather its general semantic nature. The massive variation in folklore coding may also follow from the fact that it is viewed differently by different people and cultures. General knowledge, for its part, illustrates a more homogeneous evidence type, which is reflected in its rather uniform coding.

Summary
In this paper, I have discussed folklore as an evidential category. In Section 2, folklore was defined as evidence that is [+internalized], [−personal] and [−direct]. Folklore thus resembles other information sources, most notably reported evidence, but it can, and also should, be viewed as an evidence type in its own right. Formally, languages display massive variation in the coding of folklore, which very well reflects its multi-faceted nature. Some languages code it by a dedicated folklore marker, but most languages resort to using one of the existing markers, which is, depending on the language, usually an indirect or a reported evidential.
As the discussion in Section 4 shows, the distribution of markers is rather directly explained by how well the semantics of the employed marker corresponds to the semantic nature of folklore. The discussion in the paper broadens our view of evidentiality, for example, by showing that evidentiality is not only about the events that we observe externally, e.g. by our senses, but that our internalized evidence also plays an important role if we wish to arrive at a holistic understanding of evidentiality.
The present paper is, however, only a first small step towards a better understanding of folklore as an evidential category. First, the number of languages for which I have been able to find reliable and useful data is not very high, which means that I was not able to present any real statistics for the proposed types even though some types seem more common than others in light of this small sample. Moreover, most of the languages illustrated are spoken in the Americas, and it would be interesting to see how other languages deal with folklore. Second, it would be interesting to take a look at how the employed markers correlate with the general evidentiality system of given languages. For example, do larger systems tend to use more indirect evidentials while smaller systems favour reported evidentials? Third, closely related to the lack of statistics, the variable type may be far more common across languages than the sources suggest; its absence from the sources may be due to lack of space (evidentiality is only one of the many categories that need to be discussed in a grammar). Fourth, as briefly noted above as well, it would be interesting to take a closer look at how the relevance and importance of folklore correlates with its coding in languages. For example, is folklore viewed as a more integral part of the history of one's own people in cultures where folklore is coded by a dedicated folklore marker? Does the directness of coding correlate with the relevance of folklore for a given culture, i.e. are folklore and one's own heritage more important for cultures where, e.g., inferential evidentials are used for this than in languages where folklore is coded by hearsay markers? This information may be very hard to find in grammars, but it may be an interesting field of study for people working on individual languages (and cultures). Finally, the relation of folklore to other types of stories may be worth studying.