Collective expression: how robotic swarms convey information with group motion

Abstract When designing decentralized behaviors for groups of collaborating robots, strategies inspired by swarm intelligence often leave the human operator out of the loop, granting the swarm full autonomy. However, field missions require the swarm to share at least its output with the operator. Unfortunately, little is known about users' perception of group behavior and dynamics, and there is no clearly optimal interaction modality for swarms. In this paper, we focus on the movement of the swarm as a channel to convey information to a user: we believe that the interpretation of artificial states based on group motion can lead to promising natural interaction modalities. We implement a grammar of decentralized control algorithms to explore their expressivity. We define the expressivity of a movement as a metric measuring how natural, readable, or easily understandable it appears. We then correlate expressivity with the control parameters of the distributed behavior of the swarm. A first user study confirms the relationship between inter-robot distance, temporal and spatial synchronicity, and the perceived expressivity of the robotic system. We follow up with a small group of users tasked with designing expressive motion sequences to convey internal states using our grammar of algorithms. We comment on their design choices and assess the interpretation performance with a larger group of users. We show that some of the internal states were perceived as designed and discuss the parameters influencing this performance.


Introduction
As robots make their way into our world, the number of application domains where they are likely to interact and cooperate with humans multiplies. Each of these domains offers an opportunity to develop more intuitive relationships with robots, by pushing forward their capacity to detect social attitudes and adopt expressive stances. While robotics often deals with humanoid and zoomorphic artefacts, recent technological advances result in the emergence of new forms as well as new action opportunities, sometimes remote from familiar modes of operation. Robot swarms are one of these new entities, composed of large numbers of robots that can evolve in formation and adapt to multiple environments. The robustness of swarm systems comes mostly from their distributed and scalable control. An increasing number of low-cost commercial swarm systems are available, but the complexity of decentralized control, based on local interactions between the robots and with their environment, is what still holds back their expansion to real-world applications [1].
Domain-specific programming languages [2] and software architectures [3] try to address these issues, so that researchers can focus on novel user interaction design. Concerning the interaction with humans, what makes swarms special is that they have no defined physicality: they can adopt emerging configurations depending on environmental constraints, internal policies, and commands issued by a user [4]. As such, the motion of swarms is often likened to that of biological collectives and considered a type of biological motion [5]. However, swarm motions have no underlying form that rigidly determines the relationship between parts, as opposed to the motions of the human body for instance. This absence of predictable structure, and the necessity for an observer to consider multiple individuals, possibly as a single entity, make it necessary to develop new methods for the analysis of human-swarm interaction. Those methods should investigate whether the perception of swarm motion is sensitive to the structure of moving swarms and whether the human interpretation is coherent with an expressed internal state of the swarm (e.g., system alerts or important new information available for the operator). As opposed to the research on animated group motion, active for decades (for instance with the major work of Reynolds [6]), research on robot swarms' expressive capabilities has only started [7]. So far we possess scant information about how a swarm's motion impacts a user's emotional response [8]. Specifically, we do not know how the state attributed to a swarm (e.g. is it considered a single entity, an aggregate of autonomous robots, an ephemeral formation?) affects its perceived psychological traits (nervous, shy, aggressive, etc.), as well as the expressivity that may be attributed to its behavior. Is the perception of a robot swarm similar to the observation of a school of fish or a flock of birds?
How can a robot swarm convey the sense of a collective movement organized towards a goal? What collective features govern the transmission of information about the swarm's perceptive and social states?
This paper addresses these questions, elaborating on the notion of swarm expressive behavior. In particular, we examine how various parameters contribute to the organization perceived in a swarm's behavior, and how this organization translates into the swarm's expressivity and the possibility to identify internal states such as emotions. These questions are addressed with two user studies involving a small swarm of tabletop robots. The article proceeds in four steps. First, we introduce the topic of robotic swarms' expressive behavior (Section 2). Second, we describe the software infrastructure, the control algorithms, and the control attributes implemented to study a swarm's expressive behavior (Section 3). Third, in a first experiment, we investigate the relationship between expressive behavior and perceived key attributes of a swarm (Section 4). In particular, we test whether the expressivity attributed to the swarm's behavior depends on attributes of temporal and spatial synchronicity, and whether variations in that expressivity are correlated with variations of the parameters of organization perceived in the swarm. Finally, in a second experiment (Section 5), we further investigate the expression of internal states by a robot swarm, using expressive motion sequences designed by choreographers to represent specific emotions. We evaluate the success of this representation and determine which parameters of perceived organization the expressive sequences rely on.

Expressive behavior of robotic swarms
An important issue for the supervision of a semiautonomous swarm is the possibility to efficiently convey information about the swarm's current state, its future states, and the effects of human input on its behavior. This work originates from recent key contributions to human-swarm interaction [5, 9-14] and to the use of nonverbal communication by robots [15-19]. Both domains are discussed in this section, leading to the key concept of the cohesion of a swarm for the group-level perception of motion.

Human-swarm interaction
Human-Swarm Interaction (HSI) differs from conventional Human-Robot Interaction (HRI) in the large number of units involved and in its heavy reliance on the local interactions from which group behaviors emerge [10]. Such self-organized, emergent behaviors are more challenging to visualize than deterministic and predictable control strategies. This is where the current design paradigm of commercial centralized mission planners falls short. For instance, when rendering the live positions of a deployed aerial fleet on a screen, the operator has more difficulty understanding group motion than individual mission-oriented goals. All robot activities have to be encompassed in a supportive visual interpretation that facilitates the operator's decision-making [20]. The information conveyed to the swarm operator is determined from the collective movement of the swarm as it progresses towards a specific goal. This requires examining the possible visual configurations of the swarm to identify the most efficient means of communicating, for instance, directions or danger. A stepping stone towards the intuitive visualization of swarm behaviors is the identification of invariants for the design of interactions between humans and collectives of robots [9]. The design of appropriate control algorithms notwithstanding, one of the current HSI challenges is state estimation and visualization for swarms [10]. A key issue is whether humans can understand swarm motion dynamics [11] and react to them properly, which calls for the design of swarm motion dynamics compatible with human cognitive skills of interpretation.
Humans are generally good at recognizing patterns of collective motion [13]. However, because human attention can fluctuate and the capacity of human working memory is limited, the number of robots a single operator can control is also limited [12,21]. In fact, in tasks where operators have to recognize a common type of swarm behavior (e.g. flocking), they report taking a holistic approach to the perception of the collective motion inherent to emergent swarm behaviors [13]. Walker and Lewis [13] observed operators applying strategies such as "unfocusing their eyes" and/or "watching for a global pattern to emerge". Those strategies are the cognitive system's responses to the increase in control workload during swarm interaction [22,23]. Research in the emerging field of HSI has often used user studies to investigate workload and performance [5,14]. Seiffert et al. [5], for instance, considered motion perception for the evaluation of the swarm configuration, and observed that the discrimination of organized swarms is better than that of scrambled systems without structure, but worse than the discrimination of motions of rigid structures. Therefore, to address the swarm's specificity with respect to human interaction, one needs to take into account its distributed nature and develop adequate concepts to determine how socially impactful a swarm can be. To the best of our knowledge, previous studies have rarely focused on how a human perceives a swarm based on the expression of its internal state.

Nonverbal communication in HSI
To convey information about swarm states, a flexible strategy is to use iconic representations that users can recognize without having to recall them, such as the top LEDs on each of the robots [15], or to make the robots emit sounds [24]. Note that the latter uses sounds to help the user become aware of a malfunction in the swarm, not to share high-level state information. For broader use, one needs to define the information conveyed by a swarm, a nontrivial task that Cappo et al. [25] addressed with swarm behavior descriptors defined as: (1) action, the global motion of the fleet; (2) goal, i.e. the destination of the fleet; (3) shape, a geometry maintained over the whole motion; (4) heading of the robots; and (5) manner, i.e. trajectory variations giving various dynamic attributes to the movement. The researchers simulated over 1000 possible combinations of behavior descriptors, but without performing any user interaction study. The shape descriptor is restrictive for general swarm motion, as it removes the possibility of using distributed path planning algorithms that would not maintain a shape throughout the complete motion. Beyond issues of communication and supervision, the representation of swarm states is also a matter of social presence. As robot swarms are bound to evolve inside social territories, they need to develop communication modalities beyond symbols and signs. Nonverbal behaviors, social attitudes, and emotional expressions constitute important ingredients to establish a social bond [26]. For such a connection to be formed and maintained, several paths have been explored with traditional forms of robotics. Mimicking the human silhouette and postural structures, a humanoid robot can express emotional states using a combination of body postures, facial, and gestural expressions [17]. Yet more abstract, high-level motion patterns can contribute to emotional expression without requiring a humanoid appearance, or even specific emotions to be expressed [27].
For instance, the kinematics of movement has been shown to participate in the emotional appraisal of an action [18,28]. Motion characteristics such as path curvature and acceleration are correlated with different levels of perceived arousal and valence [19,29]. A common denominator of the different modalities of social presence is the notion of expressivity. An expressive behavior can be considered one that successfully transmits a particular emotion, an attitude, or a general disposition to act and react in certain ways. As phrased by Simmons and Knight [16], expressivity represents the ability to "convey an agent's attitude towards its task or environment". The expressivity of a movement determines how natural, readable, or easily understandable this movement may appear. Thus, expressivity determines to a great extent the capability for an intuitive and transparent interaction with a robot, including the interaction with a robot swarm.
Because of the distributed nature of robot swarms, the notion of expressivity is bound to take a different meaning from traditional approaches that connect expressivity to gestural and morphological properties. A swarm has no body nor body parts to express feelings or attitudes. Without a definite physicality, a swarm can reconfigure and adapt to different environments and commands coming from the user. In this context, an observer has to consider the emergent properties resulting from multiple individual behaviors, for instance the tendency for the individuals to remain close to each other, or to adopt similar velocities. Determining a swarm's expressivity is therefore a different process than considering the movements of a single robot, or even of a small group of centrally controlled robots.

Perceiving the swarm as a coherent entity
Instead of relying on body and motion perception, assessing the behavior of a swarm depends on at least three domains of computation: 1. ensemble coding, 2. perceptual grouping, and 3. perception of motion features. In order to convey internal states of a swarm, one must first understand what contributes to the user's perception of the swarm as a single entity.
Research on the perception of ensembles (1) has determined that sets are represented in a qualitatively different way than single items [30,31]: from a set of objects people have the ability to rapidly extract information about size, orientation, motion direction, or even social features such as emotions attached to facial expressions [32]. Observing the behavior of the robots composing the swarm, a person may extract statistical summaries, relative for instance to the average velocity or average direction of the robots' movements.
The perception of a swarm as a coherent ensemble (2) is also determined by Gestalt factors. The visual system integrates elements of the visual scene as parts of the same structure when those elements, in addition to being close to each other, move coherently, that is, with similar speed and direction [33,34]. This property of common fate governs the possibility to consider the swarm as a cohesive entity and to attribute to this entity a number of traits defining its behavior. Gestalt factors also determine some dynamic motion patterns, such as the perception of chasing [35,36]: when two or more mobiles give the impression of chasing each other, this may contribute to the expressive behavior of the swarm. In general, the rapid detection of temporal contingencies between changes in speed or direction [37] provides a perceptual basis for identifying meaningful interactions among the swarm's robots.
Kinematic and dynamic features (3) constitute a third class of information picked up by the visual system when considering the behavior of the swarm. Movement qualities may be related to emotion expression [28,38,39]. While the general level of movement activity and spatial extent are considered important features for the distinction of emotion categories, variations in movement patterns may provide further evidence to distinguish levels of valence and arousal. Dietz et al. [7] have recently investigated the impact of such variations on the perception of a swarm's behavior and found that an increase in speed and smoothness had a significant effect on the perceived emotional state.
Together, ensemble coding, perceptual grouping, and perception of motion features conspire to produce the perception of different global states characterizing a swarm. These states may vary in terms of perceived cohesion (i.e. whether the robots give the appearance of a cohesive entity) and perceived organization (i.e. whether the robots give the appearance of manifesting an organized behavior). Traditionally, the literature on swarm behavior distinguishes two different parameters that govern the representation of a swarm as a single entity [40,41]: a parameter of cohesion that represents a tendency for individuals to remain close to each other, and a parameter of synchronization, which can be in terms of velocity or alignment.

Table 1: Parameters of perceived organization.
aggregation: the tendency to perceive the robots as remaining close to each other
synchronization: the tendency to perceive the robots as synchronizing their movements
leadership: the tendency to perceive the robots as following one of theirs
figure: the tendency to perceive the robots as forming a figure altogether
For the purpose of this article, we make a distinction between four parameters of perceived organization (presented in Table 1): a parameter of aggregation, corresponding to the impression for an observer that the robots forming the swarm tend to stay together rather than scattering; a parameter of synchronization, or the impression that the robots are aligning their movements; a parameter of leadership addressing the impression that the robots are following or chasing a member of the swarm; and a parameter of figure composition (second study only), or the impression that the robots are forming a figure. More specifically, we will use the term cohesion as a key concept to refer to a global property resulting from the interaction between the three aforementioned parameters. This perceived cohesion can be seen as a pre-condition to the representation of the robot ensemble as a coherent entity potentially able to express internal states through its behavior.

Research questions and general hypotheses
This article attempts to evaluate the sources of a swarm's expressive behavior. This endeavor required the implementation of a flexible swarm control infrastructure for the design of decentralized group motions (Section 3), and the construction of evaluation tools to assess how an observer perceives and evaluates these movements. Based on these two sets of tools, we could determine the relationship between motion observables, as determined by decentralized control algorithms, and the qualifications attributed to collective movements.

Figure 1: Structure of the collective expression explored: from swarm control algorithms (1), we extract common control attributes (2) in order to assess the elements of the swarm's perceived organisation (3), and finally relate these elements to the cohesion, expressivity, and emotional states (4) of the swarm.
The architecture of this study on collective expression has a four-tier structure of key concepts (Figure 1): 1. five decentralized swarm control algorithms are implemented to create expressive swarm behaviors (Section 3.2); 2. we determine a set of control attributes to tune these algorithms and design group motion sequences from them (Section 3.4); 3. these sequences are investigated with respect to the user's evaluation of parameters of perceived organization (introduced in Section 2.3); 4. these parameters and their respective scales allow us to determine some perceptual determinants of the global cohesion attributed to the swarm, the expressivity attached to its movements, and the possible emotional states identified.
Given the necessity of considering multiple individual robots, we surmise that an observer has to represent the swarm as a single entity before attributing any kind of properties to its behavior. We suppose therefore that a certain level of perceived cohesion is necessary for expressivity to develop, and we should expect to observe a relationship between the perception of a swarm as a coherent entity, as measured by parameters of organization, and a score of expressivity attributed to the swarm's behavior. When swarm behaviors are designed by experts (choreographers) to convey emotional states, we expect these parameters to play a role in the way collective motions are channeled to produce recognizable emotions.
We make the following hypotheses:
1. Considering the swarm as a coherent and stable entity should depend on the ability to identify parameters of aggregation, synchronization, and leadership in the swarm's movements (first experiment).
2. Expressivity should also be related to the parameters of aggregation, synchronization, and leadership, inasmuch as a sufficient level of organization is necessary for the swarm to be considered a single entity. However, excessive organization may be detrimental to overall expressivity if it results in stereotyped motion patterns (first experiment).
3. Users can distinguish internal states (e.g. attitudes or emotions) of a robotic swarm based on group motion designed by an expert in expressive motion (e.g. a choreographer) (second experiment).
4. The recognition performance for a given set of expressive motions designed from internal states also relies, perhaps not consciously, on perceived attributes of organization in the swarm's behavior (second experiment).

Implementation of swarm expressive behaviors
Literature on swarm intelligence covers a plethora of decentralized control algorithms for connected groups of robots [42]. Within this body of knowledge, interaction studies often focus on a single control mechanism at a time to relate control inputs to user perception. Instead, our interest lies in the relation between the motion attributes of the group and user perception. We implement and study multiple control algorithms in terms of the motion they generate. These motions can then be analyzed in relation to user perception. A flexible and generic system for the design of decentralized group motions requires specialized tools. We introduce in this section our software infrastructure, built on a swarm-specific programming language used uniformly across all control algorithms. We then detail the control algorithms we implemented for this study and the attributes we extract from the generated motions.

Software ecosystem
Even the implementation of known algorithms for a swarm can be very challenging, especially considering that swarms are in essence decentralized systems, the behavior of which is based only on local interactions. To the best of our knowledge, only one solution can provide portability, scalability and fast development time: Buzz. Buzz is both a programming language and a virtual machine to run its scripts. It was created by our research group in 2016 to accelerate the implementation of swarm behaviors [2]. Buzz provides special constructs to address three essential concepts: a) shared memory (virtual stigmergy), b) swarm aggregation, and c) neighbour operations. The Buzz virtual machine (BVM) must run on every unit of the swarm and with the exact same script, but units can differ (i.e., a heterogeneous swarm), since the language is platform-agnostic. Example scripts are available online [43], as well as the code for the compiler and BVM [44]. The behaviors described in this section are also open-source [45]. Using Buzz, we ensure our code can be deployed on many hardware platforms. In this work we also leveraged its swarm-level primitives: virtual stigmergy [46] and neighbour operations. We use the former (a tuple space shared across the swarm) to agree on swarm-wide variables, such as the current state in a swarm-wide state machine. We use the latter (a local communication system) as it is the root of most swarm intelligence algorithms: local interaction.
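The virtual stigmergy mechanism can be pictured as a replicated key-value store in which conflicting writes are resolved by logical timestamps. The following Python mock is illustrative only (it is not Buzz code, and the class and method names are our own):

```python
# Illustrative Python mock of the virtual-stigmergy idea (not Buzz code):
# each robot keeps a local replica of the tuple space; entries carry a
# logical timestamp and the writer's id, and the newest write wins.

class VirtualStigmergy:
    def __init__(self, robot_id):
        self.robot_id = robot_id
        self.store = {}  # key -> (value, timestamp, writer_id)

    def put(self, key, value):
        _, ts, _ = self.store.get(key, (None, 0, self.robot_id))
        self.store[key] = (value, ts + 1, self.robot_id)

    def get(self, key):
        entry = self.store.get(key)
        return entry[0] if entry else None

    def merge(self, key, value, ts, writer_id):
        """Called when a neighbour gossips one of its entries."""
        local = self.store.get(key)
        # Newest timestamp wins; ties broken by writer id for determinism.
        if local is None or (ts, writer_id) > (local[1], local[2]):
            self.store[key] = (value, ts, writer_id)

# Two robots agreeing on a swarm-wide state variable:
a, b = VirtualStigmergy(1), VirtualStigmergy(2)
a.put("state", "aggregate")
value, ts, wid = a.store["state"]
b.merge("state", value, ts, wid)  # b now sees the same state as a
```

In Buzz itself, a virtual stigmergy is created with `stigmergy.create()` and exposes similar put/get operations, with propagation handled by the BVM.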

Control algorithms
As shown in Figure 2, we implemented a set of five common swarm behaviors as Buzz scripts: aggregation, formation from graph descriptions, cyclic pursuit, autonomous deployment, and flocking. All scripts require only local interaction between neighbours: for n robots in the swarm, each robot i knows, for each neighbour j, the bearing b_ij and the distance d_ij between the two. In the following subsections, we detail how each algorithm computes each robot's velocity vector from these inputs, sometimes in conjunction with consensus mechanisms. For each algorithm, we also mention its usage in both experiments, including the emotional state (fear, anger, happiness, sadness, surprise, or disgust) with which it was associated.
In a Buzz script, this velocity vector is an argument to a function dealing with low-level hardware control to actuate the robot. In the end, while the exact path of each robot is not determined, the group motion parameters and goal locations are scripted.

Flocking
Among the most popular formalizations of biological swarm behaviors, potential functions are a simple yet flexible control approach. Algorithms averaging potential forces are often referred to as flocking behaviors. Each robot computes a virtual force vector by summing contributions f_i(d_i) along the directions θ_i, where θ_i and d_i are the direction and distance of the i-th perceived obstacle or robot, and the function f_i(d_i) is derived from an artificial potential function. One of the most commonly used artificial potentials is the Lennard-Jones potential, adapted for our physical system as shown in Figure 3. The two parts of the potential equation represent the attractive and repulsive effects, driven by only two parameters: a target distance (D) and a gain (ϵ). In this control approach, a goal (target location) is represented as an attractor influencing the whole group simultaneously. By manually tuning the function's gains, we generated sequences highlighting the control attributes of Section 3.2. The wide spectrum of motions provided by this control algorithm alone made it the ideal candidate for the first phase of our user study (see Section 4). In the second phase of our user study, flocking is the control algorithm selected by a focus group to represent sadness (see Section 5).

Figure 3: The Lennard-Jones potential adapted for wheeled robot formations. The '-' and '+' domains are respectively the repulsive and attractive parts, for which the pivot point is set with parameter t. D is the distance between two robots and ϵ a parameter acting as a control gain on the potential.
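This flocking rule can be sketched in a few lines of Python. The Lennard-Jones adaptation and the gain values below are illustrative assumptions, not the exact function of our Buzz scripts: each robot averages Lennard-Jones-style forces over its perceived neighbors, repulsive below the target distance and attractive above it.

```python
import math

def lj_magnitude(dist, target, epsilon):
    # Lennard-Jones-style interaction magnitude, one common adaptation
    # for robot swarms: negative (repulsive) below `target`, positive
    # (attractive) above it. `target` plays the role of D in the text
    # and `epsilon` that of the gain.
    return -(epsilon / dist) * ((target / dist) ** 4 - (target / dist) ** 2)

def flocking_velocity(neighbors, target=20.0, epsilon=10.0):
    # Average the force contributions over perceived neighbors, each
    # given as a (distance, bearing) pair in the robot's local frame.
    if not neighbors:
        return (0.0, 0.0)
    fx = fy = 0.0
    for dist, bearing in neighbors:
        mag = lj_magnitude(dist, target, epsilon)
        fx += mag * math.cos(bearing)
        fy += mag * math.sin(bearing)
    n = len(neighbors)
    return (fx / n, fy / n)
```

A robot closer than the target distance to a neighbor is pushed away (negative magnitude along the bearing), while a distant one is pulled back, so the group settles around the target inter-robot distance.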

Aggregation
Aggregation is a simple behavior regrouping all robots at a point, often the swarm centroid. As mentioned by Sahin [47], it is frequently observed in biological systems and constitutes a pre-condition for most collective behaviors. It was shown to be feasible without any computation on the robots, by directly adapting the velocity vector to the average relative position of the neighbours [48]. We used a more formal model for aggregation, in which each robot's linear and rotational velocities follow a gradient minimizing the distance and bearing to nearby robots, ultimately making the swarm converge to the same point. To force the swarm to regroup at a different location, we add a component driven by d_it and b_it, the distance and bearing to the target meeting point, respectively. Aggregation is the control algorithm selected by the focus group to represent fear in our second user study (see Section 5).
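As an illustration, the aggregation rule can be sketched as a unicycle controller steering toward the average relative position of the neighbours. The Python function below is a hedged sketch under that reading (the gains and names are our own, not the exact model of the paper):

```python
import math

def aggregation_cmd(neighbors, target=None, k_v=0.5, k_w=1.0):
    # Steer toward the average relative position of the neighbours,
    # each given as (d_ij, b_ij); `target`, if set, is an extra
    # rallying point given as (d_it, b_it). Gains are illustrative.
    points = [(d * math.cos(b), d * math.sin(b)) for d, b in neighbors]
    if target is not None:
        d_it, b_it = target
        points.append((d_it * math.cos(b_it), d_it * math.sin(b_it)))
    if not points:
        return (0.0, 0.0)
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    # Linear velocity grows with the distance to the local centroid,
    # rotational velocity with the bearing error in the robot's frame.
    return (k_v * math.hypot(cx, cy), k_w * math.atan2(cy, cx))
```

Running this rule on every robot shrinks the distances between neighbours at each step, which is why the swarm eventually collapses onto a single point (or onto the rallying point when one is given).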

Cyclic pursuit
Even with the simplest form of robotic swarm, devoid of computational resources, two additional behaviors (together with the aggregation previously introduced) were shown to emerge from local reactive control: dispersion and cyclic pursuit [48]. Pursuit is an important behavior for many robotic applications, such as patrolling around a point of interest, circling an intruder, or scanning a building. Further analysis of the pursuit transient states revealed more complex patterns with significant potential for expressive motion [4]. In this work, we selected a formal model without the transient states, in which each robot's linear and rotational velocities are computed from r, the distance to the point of interest; b_it, the bearing toward this point; b_ip, the bearing toward robot i's predecessor in the circle (i.e. the closest robot in front); and f and k, parameters of the pursuit behavior similar to those used by Kubo et al. [49]. Cyclic pursuit is the control algorithm selected by the focus group to represent happiness in our second user study (see Section 5).
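For illustration, one cyclic-pursuit step can be sketched as a constant-speed robot blending pursuit of its predecessor with a correction keeping it at a target radius around the point of interest. The Python function below is a loose sketch under these assumptions, not the paper's exact formal model:

```python
import math

def cyclic_pursuit_cmd(r, b_it, b_ip, r_target=1.0, f=0.8, k=1.0, v0=0.3):
    # Loose cyclic-pursuit sketch: cruise at constant speed v0 and steer
    # with a blend of the bearing toward the predecessor (b_ip) and a
    # radius-keeping correction built from the bearing toward the point
    # of interest (b_it), weighted by the radial error r - r_target.
    # f trades pursuit against orbit keeping; k is an overall gain.
    radial_correction = math.tanh(r - r_target) * b_it
    omega = k * (f * b_ip + (1.0 - f) * radial_correction)
    return (v0, omega)
```

At the target radius the correction vanishes and each robot simply chases the robot in front, which is the basic ingredient that makes the group settle into a rotating circle.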

Graph formation
Swarm intelligence has not been inspired only by behaviors observed in biological systems. As mentioned earlier, a decentralized algorithm leverages local interactions with neighbouring robots and thus relies heavily on the swarm network topology. Therefore, many algorithms find their roots in network engineering, such as the body of work around graph-based formations. A directed graph is composed of nodes having predecessors and successors (see Figure 2-b), a representation useful for modelling control structures, information flows, and error propagation [50]. The challenge is to progressively reach a formation matching a given shape, as long as a directed graph can be generated for that shape. Our implementation represents the target shape as an acyclic directed graph in which each robot can find its position using, as references, two other robots (predecessors) that have already taken their place in the shape. We assume that all robots possess the graph representation, but none is initially assigned to a specific position. The overall shape is built dynamically and iteratively: each new robot joins the shape only after being granted permission by one of the parents, using local communication exclusively. The resulting algorithm is completely decentralized and parallel: multiple robots can join different parts of the shape at any given time. This method is detailed in [51]. To summarize, when a robot gets close to another that is already in formation, it bids to be its successor in a known graph structure and, if granted, it moves toward the relative position of the next node in the directed graph. While this control structure has a lot of potential for figurative representations (icons or symbols) in HSI, we restricted its usage to abstract geometrical shapes to focus on the motion attributes of the group.
Graph formation is the control algorithm selected by the focus group to represent disgust (alternating between a 'C' shape and its mirror) in our second user study (see Section 5).
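The bidding logic can be illustrated with a toy sequential simulation. For simplicity, each node here references a single predecessor, whereas our implementation uses two reference robots and runs in parallel; the shape and all names are illustrative:

```python
# Toy sequential sketch of the graph-formation bidding: the target shape
# is an acyclic graph of nodes, each placed at an offset from an already
# filled predecessor. A free robot bids for the first unfilled node whose
# predecessor is in place.

SHAPE = {
    0: {"offset": (0.0, 0.0), "pred": None},  # root node, placed first
    1: {"offset": (1.0, 0.0), "pred": 0},
    2: {"offset": (0.0, 1.0), "pred": 0},
    3: {"offset": (1.0, 1.0), "pred": 1},
}

def try_join(assigned, robot_id):
    # `assigned` maps node id -> robot id; returns the granted node,
    # or None when no node with a placed predecessor is free.
    for node, spec in SHAPE.items():
        if node in assigned:
            continue
        if spec["pred"] is None or spec["pred"] in assigned:
            assigned[node] = robot_id
            return node
    return None

assigned = {}
for rid in ["r1", "r2", "r3", "r4"]:
    try_join(assigned, rid)
# Predecessors are always granted before their successors.
```

Because a node can only be granted once its predecessor is filled, the shape grows outward from the root, which is the invariant that keeps the decentralized construction consistent.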

Autonomous deployment
The last control algorithm implemented for this study emerges from computational geometry. Instead of considering only the relative motion of the robots, a surrounding region is split between the swarm members, a process referred to as surface tessellation. Some major application scenarios benefit from this approach, e.g. search and rescue missions and the deployment of sensor networks. The Voronoi tessellation [52] is an algorithm that has been extensively studied for multi-robot deployment. It usually takes the initial robot positions as seeds for the tessellation problem and then partitions the area. The logic is simple: create a frontier halfway between each pair of robots, and stop those lines when they cross another frontier or the region's borders. We integrated in Buzz the sweeping-line algorithm, also known as Fortune's algorithm, one of the most efficient ways to extract cell lines from a set of seeds [53]. We then cut the open cells with a user-defined convex polygonal boundary. From this point on, each robot has knowledge of its cell's limits. For a uniform distribution of the robots over the area, we use a simple gradient descent towards the centroid of each cell, as in [54]. Each robot recomputes the tessellation following updates on the relative positions of its neighbours, an approach that is robust to both packet loss and environmental dynamics. Within each robot's cell, the user can also substitute any other rule for computing the robot's goal instead of the centroid. For instance, to explore a region while aiming for maximum coverage, one can generate random goals within each cell.
Autonomous deployment is the control algorithm selected by the focus group to represent anger (random goals) and surprise (uniform) in our second user study (see Section 5).
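The gradient descent toward cell centroids can be sketched with a discrete Lloyd iteration. This is an illustrative approximation in plain Python (the authors extract exact cells with Fortune's algorithm): each robot claims the grid points nearest to it, an approximate Voronoi cell, and steps toward their centroid, which spreads the swarm uniformly over the bounded region. All names and the grid resolution are ours.

```python
def lloyd_step(robots, width=10.0, height=10.0, res=20, gain=0.5):
    """One discrete Lloyd iteration over a width x height region.
    robots: list of (x, y); returns the updated positions."""
    cells = {i: [] for i in range(len(robots))}
    for gx in range(res):
        for gy in range(res):
            p = ((gx + 0.5) * width / res, (gy + 0.5) * height / res)
            # Assign each sample point to its nearest robot (its cell owner).
            owner = min(range(len(robots)),
                        key=lambda i: (robots[i][0] - p[0]) ** 2
                                      + (robots[i][1] - p[1]) ** 2)
            cells[owner].append(p)
    new = []
    for i, (x, y) in enumerate(robots):
        if cells[i]:
            cx = sum(p[0] for p in cells[i]) / len(cells[i])
            cy = sum(p[1] for p in cells[i]) / len(cells[i])
            # Gradient step toward the approximate cell centroid.
            new.append((x + gain * (cx - x), y + gain * (cy - y)))
        else:
            new.append((x, y))
    return new
```

Iterating this step from a clustered start quickly disperses the robots toward a uniform coverage of the region.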

Hardware selection
The robotic platform selected for this study has to be portable and avoid as much as possible any bias due to an anthropomorphic or zoomorphic appearance. We selected the Zooids [55], a group of small tabletop cylindrical robots 2.6 cm in diameter, localized using structured light emitted by a ceiling projector. While our behavioral scripts can be ported to any hardware platform (as explained above), we selected the Zooids for their minimal setup time, low manufacturing cost, open-source controller code, and the simplicity of their manipulation. Although abstract shapes are less common in human-robot interaction studies, prior examples showed they can, for instance, cause less embarrassment for the user [56]. We built a charging station and made enough Zooid units for two sets: one can charge while the other performs.
The Buzz low-level actuating functions implemented on the Zooids call their embedded controller. To explore the expressivity of the robots' motion through the quality of their movements, we manipulate two low-level variables of the controller: 1. the maximum velocity changes the average velocity of the swarm motion without rendering the system unstable (as playing with the controller's gains can do), because the Zooids controller is tuned for small, precise movements toward its goals and saturates for large displacements; 2. artificial delays on the movement commands allow us to manipulate the synchronicity of the movements, for instance by creating an implicit leader in the swarm.
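The two handles can be sketched as follows; the function and parameter names are ours, not the Zooids firmware API. A velocity command is saturated at `v_max`, and each robot can hold a goal back for a fixed `delay` to break temporal synchrony (an implicit leader would get delay 0).

```python
def command(pos, goal, v_max, delay, t):
    """Per-robot step command at time t: zero before the robot registers
    the goal, saturated at v_max afterwards."""
    if t < delay:                       # goal not yet registered by this robot
        return (0.0, 0.0)
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= v_max or dist == 0.0:
        return (dx, dy)                 # within reach: no saturation needed
    s = v_max / dist                    # scale the step down to v_max
    return (dx * s, dy * s)
```

Varying `v_max` changes the average swarm velocity without touching the controller gains, while per-robot `delay` values shape the temporal synchronicity.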

Control attributes
Each algorithm has its own parameters, quickly increasing the complexity of the analysis of their influence. Moreover, simple behaviors in mobile swarm systems, such as flocking and cyclic pursuit, often lead to emerging transient states [4]. To derive and deconstruct the states generated by a given set of control parameters, one must run numerous simulations. Instead, to influence the level of perceived organization and expressivity, we designed a set of higher-level motion control attributes. These control attributes determine objective interdependences between the robots. These relationships are the basis of our evaluation of perceived organization and expressivity: 1. the average inter-robot distance, 2. the spatial synchronicity of the swarm, i.e. the robots move as a cohesive group, and 3. the temporal synchronicity of the swarm, i.e. the robots move simultaneously.
Each attribute is positioned on a continuous range (close/far, synchronized/unsynchronized) by the behavior control parameters. For instance, increasing the distance (target) parameter alone in a Lennard-Jones potential leads to an unstable and unpredictable inter-robot distance over time. Therefore, the epsilon/target pair has to be manipulated together to get a stable formation for each inter-robot distance. Leveraging the unstable spectrum of the range of these two parameters, one can also influence the spatial synchronicity of the group. In other words, the more unstable a given pair of parameters is, the sparser the robot motion will be. Temporal synchronicity requires another control parameter in the potential definition: the delay, or latency, for each robot in registering a goal attractor. By delaying the influence of a goal's attraction on certain robots, we influence the temporal synchronicity. For instance, a leader robot can notice the goal attractor seconds before the rest of the swarm, thus creating a break in the temporal synchronicity. These three control attributes serve to generate motions deprived of internal state (or conceptual meaning) to objectively study the cohesion and expressivity of the swarm in our first experiment. We then extract the value of these attributes from the motions designed by choreographers in our second experiment. We numerically estimated each control attribute by computing the standard deviation of the rotational velocity for spatial synchronicity, the maximum individual velocity gap from the group for temporal synchronicity, and the standard deviation of the average distance to the swarm centroid for the inter-robot distance.
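The Lennard-Jones interaction mentioned above can be sketched as a virtual-force magnitude. This is a common swarm variant of the potential; the exact exponents and gains in the authors' Buzz script may differ. `target` sets the preferred inter-robot distance and `epsilon` the interaction strength; an ill-matched pair destabilizes the spacing.

```python
def lj_magnitude(dist, target, epsilon):
    """Virtual-force magnitude between two robots at distance `dist`:
    zero at `target`, repulsive (negative) below it, attractive above it."""
    return -(epsilon / dist) * ((target / dist) ** 4 - (target / dist) ** 2)
```

Each robot would sum this magnitude along the direction of every neighbour to obtain its velocity command, which is why epsilon and target must be tuned jointly to hold a stable inter-robot distance.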

First experiment: influence of perceived organization on collective expression
As a first step to enhance our knowledge of the relationships between perceived organization and expressivity in a robot swarm, we conducted a user study to validate hypotheses 1 and 2 (see Section 2.4). From high and low values of the attributes detailed in Section 3.4, we generated eight abstract non-figurative motion sequences and assessed their levels of perceived organization and expressivity from the scores attributed by participants in live sessions.

Participants
We recruited 27 participants with good knowledge and experience of dance. For this study on swarm motion perception, we intentionally targeted this specific population to give us insights into the slight differences between the swarm motion states: dancers and choreographers are among the experts of body motion, be it human or artificial. We believe the conclusions obtained from their answers can better help us define the motion parameters for a broader spectrum of users. Of the 27 participants, 4 identified themselves as men, 22 as women, and 1 as "other"; two thirds are dance students (19), while the others are freelancers (8). The participants did not receive any kind of financial compensation for the study, but rather participated out of curiosity about natural interaction with robotic systems. The study protocol was approved by the Paris 8 University research board and Polytechnique de Montréal's ethical committee. Participants signed an informed consent form to partake in the study.

Methods
To illustrate the different motion patterns in multiple sequential sessions, we alternated between two sets of six Zooids robots. As shown in Table 2, the three high-level motion attributes described earlier were used as binary inputs, generating 8 possible combinations, i.e. 8 different motion scripts. Each motion followed the same goal sequence (see Figure 4): (1) from point A to point B, (2) from point B to point C, (3) from C to B, (4) from B to C, and (5) from C to A.
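The enumeration of the 8 motion scripts from the three binary attributes can be sketched as follows; the attribute names and level labels are ours.

```python
from itertools import product

# Every combination of the three binary motion attributes yields one of
# the 2^3 = 8 motion scripts used in the study.
ATTRS = ("distance", "spatial_sync", "temporal_sync")
scripts = [dict(zip(ATTRS, combo))
           for combo in product(("low", "high"), repeat=3)]
```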
Participants were asked to sit in front of the table on which the Zooids performed. They had 14 questions to answer on a tablet (available in French and English) after observing each sequence. Each motion script was run only once per participant, but the scripts were played following one of three possible orders: 1-2-3-4-5-6-7-8, 5-6-7-8-1-2-3-4, and 1-2-7-8-3-4-5-6. The motion sequences were triggered one at a time by the experimenters once the participant confirmed all questions were answered. The experimenter also explained beforehand that an unknown number of motion sequences going through the same goals would be automatically generated with different motion attributes.
To assess the values of cohesion and expressivity attributed to the swarm, participants completed a survey comprising three different scales (see Table 3): (i) a scale evaluating the organization perceived in the swarm's behavior; (ii) a scale measuring the cohesion attributed to the swarm (i.e. whether it is considered as a coherent and stable entity); and (iii) a scale assessing the level of expressivity of the swarm's behavior. For each item of the different scales, we used a seven-point Likert scale with responses ranging from 0 (strongly disagree) to 6 (strongly agree).

Results
This study presents a large number of tied ranks for a relatively small dataset (27 participants). We used Kendall's τb correlation test to assess the contribution of each parameter in our dataset. We extracted the perceived organization from the measures of cohesion and expressivity. As we could not assume that the psychological distances between the scores of expressivity and between those of cohesion were equivalent, we used an ordinal logistic regression to examine the effect of spatial synchronization, temporal synchronization, and distance on both perceived cohesion and expressivity.
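The correlation step can be sketched with synthetic scores (not study data), using SciPy's implementation, whose tau-b variant handles the many tied ranks of seven-point Likert responses.

```python
from scipy.stats import kendalltau

# Synthetic 0-6 Likert scores for illustration only (not study data):
# one organization item against the cohesion item, per observation.
organization = [5, 6, 4, 2, 6, 3, 5, 1]
cohesion     = [4, 6, 5, 2, 5, 2, 6, 1]

# kendalltau computes tau-b by default, correcting for ties on both scales.
tau, p = kendalltau(organization, cohesion)
```

The same call, applied to each organization item against cohesion and expressivity, yields the τb values reported in the results below.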

How do the parameters of perceived organization contribute to the cohesion attributed to the swarm?
We found a positive correlation between cohesion and the tendency to stay in groups: a higher perceived tendency for the robots to stay in groups is more likely to be associated with a higher perceived cohesion (τb = 0.398, p < 0.001). Similarly, we found that a higher perceived tendency for the robots to synchronize their movements is more likely to be associated with a higher perceived cohesion (τb = 0.440, p < 0.001). Finally, we found a significant positive association between the perceived tendency for the robots to follow one of theirs and the cohesion attributed to the swarm (τb = 0.309, p < 0.001).
[Table 3 — Organization (0-6): "the robots tend to stay in groups", "the robots tend to synchronize their movements", "the robots tend to follow one of theirs"; Cohesion (0-6): "the robots form a coherent and stable group and seem to progress while connected to each other"; Expressivity (0-6): "how would you evaluate the expressivity of the robot swarm?"]

How do the parameters of perceived organization contribute to the expressivity of the swarm?
The correlation scores between expressivity and the perceived tendency for the robots to stay in groups were weaker than for cohesion, but we still found a significant positive association (τb = 0.148, p = 0.008), as we did for the perceived tendency for the robots to follow one of theirs (τb = 0.197, p < 0.001). While the tendency to stay in groups and the possibility of perceiving chasing relationships between the robots seem to benefit the expressivity of the swarm, an excessive level of synchronization may be detrimental to this measure, as indicated by the absence of a significant association between the perceived tendency for the robots to synchronize their movements and expressivity (τb = 0.033, p = 0.585).

How do the motion control attributes affect the expressivity and cohesion of the swarm?
Temporal synchronicity had a significant effect on the expressivity of the swarm (see Figure 5). In temporally asynchronous conditions, expressivity was 1.647 (95% CI, 1.012 to 2.681) times more likely to increase (χ2(1) = 4.028, p = 0.045). However, we did not find an impact of spatial synchronicity: the odds of spatially asynchronous conditions being considered expressive were similar to those of spatially synchronous conditions (odds ratio of 0.698, 95% CI, 0.430 to 1.134), χ2(1) = 2.107, p = 0.147. Similarly, the odds of large-spacing conditions being considered expressive did not differ from those of small-spacing conditions (odds ratio of 1.389, 95% CI, 0.855 to 2.256), χ2(1) = 1.761, p = 0.184.

Discussion
In this first experiment, we assumed that the expressivity of a swarm depends on a sense of cohesion emanating from the robots' movements, itself contingent upon parameters of perceived aggregation, synchronization, and leadership. We verified the hypothesis that the score of cohesion (measuring to what extent people consider the swarm as a coherent and stable entity) is linked to the possibility of identifying moments of aggregation and synchronization. Indeed, we found that all three parameters we measured (tendency to perceive the robots as staying in groups, synchronizing their movements, and following a leader) were positively associated with the score of cohesion. As predicted, the relationship between those parameters and expressivity is slightly more complicated. Motion patterns considered expressive are more likely to be associated with a higher level of aggregation and with the impression that the robots were following a leader, but we did not find a significant correlation with the scores of synchronization (both temporal and spatial). We also found that conditions more favorable to expressivity are those in which movements are temporally asynchronous, confirming the idea that a high synchronicity may be detrimental to expressive patterns. It is interesting to observe that, contrary to the score of expressivity, the score of cohesion is mainly affected by spatial synchronicity, with spatially asynchronous conditions being considered less cohesive. We observe an interesting relationship between the two parameters: we postulate that a sufficient level of cohesion is necessary for the swarm to be considered expressive (hence the positive correlation between expressivity and the aggregation and organization parameters), but cohesion and expressivity dissociate with respect to the impact of temporal synchronicity (detrimental to expressivity) and spatial asynchrony (detrimental to cohesion).

Second experiment: expression of emotional states using collective movements
The previously validated swarm attributes (cohesion and expressivity) can now be related to the design of expressive sequences: we conducted a second user study addressing hypotheses 3 and 4 (Section 2.4). The domain of emotion expression through swarm movements is largely uncharted, and we do not know which movement patterns are responsible for the expression of specific emotions. For this reason, we assigned a group of choreographers the task of designing from scratch expressive sequences that, according to them, would evoke one of the six emotions known as basic emotions: happiness, sadness, fear, anger, disgust, and surprise [57]. Subsequently, we tested whether naive observers could identify the emotions associated with these expressive sequences, and we used the measures of perceived organization devised for the previous experiment to determine to what extent the identification of emotions (and the possible ambiguity between emotional states) could be related to variations in organization patterns.

Participants
After the design phase, this second study was entirely conducted online. The participants were recruited through direct email invitations and by promoting the study over social networks. We reached 41 participants, 34% men and 66% women. While the previous study focused on students and young professionals, the majority of participants in this one (59%) were over 30 years old. The participants did not receive financial compensation for this study, and the protocol was approved by both universities' ethical committees (Paris 8 University and Polytechnique Montreal). Before the online questionnaire started, all participants had to accept the consent form for this study.

Methods
To create the sequences, we tasked three choreographers with the design of six expressive motions using a small tabletop swarm of six Zooids. The experiment is twofold: we first gathered a small focus group to design expressive sequences, and we then presented the results to a larger number of participants. We did not expect the motion sequence designers to have good knowledge of decentralized programming, so we designed a simple software interface to ease their iterative design. All the control algorithms detailed in Section 3.2 can be selected by the user from the interface and then tuned using parameters such as maximum velocity, overall group shape, inter-robot distance, and temporal synchronicity (leadership). Each choreographer separately practiced the spectrum of control actions with a programmer. Subsequently, the choreographers met and decided together how to best represent six emotions using the Zooids' expressive motions: fear, happiness, sadness, surprise, disgust, anger. These emotions are known to be the easiest to name (self-recognize) [57]. We then conducted a small (six participants) qualitative assessment to confirm the perceived emotions for each designed sequence, and we made small adjustments according to the participants' feedback. The resulting six Zooids' emotions are detailed in terms of control algorithms and velocity in Table 4. A compilation video of all six sequences is available online¹. The participants were asked to complete a survey (11 questions on a Likert scale) after watching each sequence. To conduct this part online, we filmed six short video sequences made with the Zooids. Each sequence related to one of the six emotions. For each expressive sequence, participants had to evaluate on six seven-point Likert scales whether the sequence evoked fear, surprise, disgust, anger, happiness, and sadness.
Participants had also to evaluate the sequences with the three scales of perceived organization presented in the first experiment as well as the fourth one introduced in Section 2.3: the tendency to perceive the robots as forming a figure altogether.

Feedback from choreographers
We acknowledge that the design of the six sequences is limited by the available control algorithms and the selected control attributes. This limitation can only be relaxed with a larger set of control options than what we implemented; we believe, however, that this would bring a level of complexity to the system that may influence the designers. In any case, discussions with the three choreographers highlighted the characteristics of their design choices, and they did not feel constrained. The designers reached a smooth and fast consensus on the representation of Anger, Fear, Surprise, and Happiness. The choreographers selected the fast random deployment for anger because the robots seemed "crazy", i.e. disorganized, without any apparent logic in their movements, and sometimes even colliding with one another. Fear was the easiest for them to design, and this is reflected in the results mentioned above: aggregation seemed like the obvious choice to them. Surprise and happiness both turned out to be represented by circles, though not on purpose. Happiness used cyclic pursuit, as a form of tribal dance around a fire, a celebration of the group. Surprise, on the other hand, used uniform autonomous deployment: for them, a more abstract representation of a spurt or a sudden heart-rate burst. Disgust was the most difficult emotion to represent: in the end, the "C" shape made from graph formation control aimed at creating the impression of a jury, as far as possible from the center (user focus), sometimes whispering (shaking) from their high moral authority perspective. Finally, the focus group never reached an agreement on the representation of sadness, but the selected behavior (flocking from right to left close to the user) was perceived by them as comforting, a behavior we generally seek when sad.
The design approach based on high-level algorithm selection and a few control attribute options in this second experiment is far more intuitive than the regulated tuning of parameters required to generate the motion sequence variations of the first experiment. Nevertheless, we extracted from the resulting sequences measures quantifying the underlying control attributes. To ensure the values faithfully matched the video sequences presented to the participants, we extracted the position of all robots from each sequence. Using each robot's position recorded at 30 Hz, we computed the velocity vectors of all robots and the average for the whole swarm. The spatial synchronicity is measured with the standard deviation of each robot's rotational velocity, averaged over the entire sequence. With a larger standard deviation, the spatial synchronicity decreases. The temporal synchronicity is measured with the largest difference between the maximum velocity of the swarm and that of the slowest member, over the whole sequence. A large difference means a low temporal synchronicity. Finally, the inter-robot distance is measured with the spatial dispersion of the swarm: the standard deviation of the distance between each member and the swarm centroid. Table 5 presents the values of each control attribute for all sequences. Fear having the smallest inter-robot distance and Anger the largest are expected consequences of the aggregation and random deployment algorithms. Happiness has the highest standard deviation of rotational velocity, due to the circular movement, and thus the lowest spatial synchronicity, closely followed by Anger. Disgust has the highest temporal synchronicity, since most of the time all the robots stand still together. Here again, Fear stands out as the least synchronized sequence.
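The three estimators can be sketched as follows, assuming trajectories recorded as per-robot lists of (x, y) positions (sampled at 30 Hz in the study). The function names and the exact averaging choices are ours, not the authors' analysis code.

```python
import math
import statistics

def velocities(traj):
    """Finite-difference velocity vectors along one robot's trajectory."""
    return [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(traj, traj[1:])]

def headings(traj):
    return [math.atan2(vy, vx) for vx, vy in velocities(traj)]

def spatial_sync(trajs):
    """Std dev of rotational velocity across robots, averaged over the
    sequence: a larger value means a lower spatial synchronicity."""
    rot = [[h1 - h0 for h0, h1 in zip(h, h[1:])]
           for h in map(headings, trajs)]
    return statistics.mean(statistics.pstdev(step) for step in zip(*rot))

def temporal_sync(trajs):
    """Largest gap between the fastest and slowest robot over the whole
    sequence: a large gap means a low temporal synchronicity."""
    speeds = [[math.hypot(vx, vy) for vx, vy in velocities(t)]
              for t in trajs]
    return max(max(s) - min(s) for s in zip(*speeds))

def dispersion(trajs):
    """Std dev of each robot's distance to the swarm centroid (the
    inter-robot distance attribute), averaged over the sequence."""
    out = []
    for frame in zip(*trajs):
        cx = statistics.mean(p[0] for p in frame)
        cy = statistics.mean(p[1] for p in frame)
        out.append(statistics.pstdev(math.hypot(x - cx, y - cy)
                                     for x, y in frame))
    return statistics.mean(out)
```

Two robots translating in lockstep score zero on all three measures, which is the fully synchronized, fixed-spacing baseline against which the designed sequences differ.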

Results
We present two interrelated datasets: 1. the recognition of emotions conveyed by each sequence and 2. the influence of the parameters of perceived organization on the expressivity of these sequences.
Figure 7: Confusion matrix representing, for each expressive sequence (fear, happiness, sadness, surprise, disgust, anger), the % of participants having considered a given emotional state as the best candidate for this sequence. For instance, 33% of participants considered Surprise as the best candidate for the sequence designed to convey the emotion of fear.

How well were the emotional states distinguished by the participants?
The participants filled in a series of Likert-scale questions assessing the level to which each sequence evokes a specific emotion. In general, participants had difficulty associating emotional states with the sequences they were presented with. The best scores were found for the Happiness, Surprise, and Fear scales, but only with an average of approximately two on a seven-point scale. Even if the sequences do not evoke salient emotions, participants still responded to them in different ways. This is what we can observe in the classification distribution shown in Figure 7. For each sequence and each participant, the best candidate (the emotion reaching the highest score) was extracted from the Likert-scale scores (including ties). Fear and Happiness are the two most successful, in the sense that they are associated more frequently with their corresponding sequence (the one choreographers intended to convey this specific emotion). We ran a Kendall's W test to evaluate to what extent participants agreed on the rank attributed to each emotion for each sequence. Coefficients of concordance are indicated in Table 6. We found a significant concordance for each sequence, except for the sequence 'disgust'.
The matrix also reveals potential misclassifications: emotions preferentially attributed to a sequence that was not intended to convey them. If we look specifically at the sequences 'fear' and 'happiness', we can observe that both tend to be equally associated with their corresponding emotion and with the emotion Surprise. Wilcoxon signed-rank tests conducted on the rank scores associated with each emotional state confirm that, for the sequence 'fear', the comparison between Fear and Surprise is not significant (z = 1.023, p = 0.306), while all the other comparisons are significant. Similarly, for the sequence 'happiness', the comparison between Happiness and Surprise is not significant (z = 1.157, p = 0.247), while all the other comparisons are significant. We verified to what extent the identification of the different emotional states was associated with parameters of perceived organization in the swarm's behavior. As shown in Table 7, some parameters of perceived organization are specifically related to emotional states. The tendency for the robots to stay in groups is more likely to be associated with higher scores of Fear (τb = 0.115, p < 0.05). This is consistent with the fact that choreographers chose to depict a strong level of aggregation for the robots, similar to the way the members of a biological swarm maintain close proximity to protect themselves from predation. The tendency for the robots to synchronize their movements is more likely to be associated with higher scores of Happiness (τb = 0.161, p < 0.005). We also found that Happiness is specifically associated with the tendency for the robots to form figures (τb = 0.216, p < 0.0005). In combination, the parameters of synchronicity and figure seem to be critical to the particular sequence choreographers chose to express happiness.
Robots were represented as engaged in a sort of circle dance, evolving along the lines of a virtual figure, with a high level of interdependence between the robots' movements. The tendency for the robots to follow one of their peers was associated with most emotional states, except Happiness and Anger. In fact, the Fear, Surprise, Sadness, and Disgust sequences all used at some point an element of sequential transformation: one dynamic state (for instance, the robots aggregated in the bottom left corner of the table) followed by another state, with a transition phase in which one robot is perceived as leading the way. The emotional states not significantly associated with the parameter of leadership are those where this element of sequentiality is not present, with the robots scattered randomly all over the table (Anger) or staying inside the same zone during the entire sequence (Happiness).

Discussion
The Buzz programming language and its virtual machine proved a versatile tool to design expressive sequences. Thanks to the simplified design parameters, the choreographers achieved a mapping between the target emotional states and the control algorithms without requiring decentralized programming expertise. Each available control algorithm was in the end associated with a unique target emotional state, such as fear with 'aggregation', whereas anger was uniquely associated with the 'random deployment' algorithm. However, this mapping between emotions and control algorithms did not translate into a unique recognition profile for the emotional states: in general, participants had difficulty identifying the emotions intended by the choreographers, while still being able to respond differentially to these emotions. Whereas Fear and Happiness were more frequently associated with their corresponding sequences, other emotions such as sadness or disgust proved especially difficult to represent using collective motions. It is difficult to determine whether Fear and Happiness were more successful because they can be conveyed using abstract patterns [58] that are suited to swarm expression, whereas other emotions are more tightly linked to facial, gestural, and postural configurations (e.g. a defensive posture in the case of Disgust). Some of the perception confusions observed in our study are similar to those already pointed out by Barakova and Lourens [59] with the use of the Laban dance notation on robotic motion: an overlap arises between the coding of 'fear', 'anger', and 'happiness'. It is also possible that, in order to better illustrate emotions, what was lacking was a fine control of the movement qualities that are known to evoke specific emotions (e.g., jerky movements for anger; large and fast movements for happiness [28,60]).
Misclassifications, especially between Fear and Surprise and between Happiness and Surprise, could also result from the limited ability of the control parameters to finely tune the motion. In addition, we can surmise that Fear was confused with Surprise because of the sudden reconfiguration of the swarm at certain points of the sequence, implying an element of rapid adjustment to external variations. Happiness and Surprise, on the other hand, could be confused because of an impression of high arousal due to frequent changes of configuration.
An interesting element for the study of collective expressions is the diverse range of intuitions choreographers relied upon when designing the expressive sequences. Based on the variations we observed in the parameters of perceived organization, we can delineate at least four expressive features: collective behaviors, pictorial elements, narrative elements, and variations in interdependence. To depict emotions, choreographers could draw from a repertoire of collective behaviors observed in animal groups. Flocks of birds and fish schools display self-organization behaviors [61] that may inspire collective expression in robotic swarms. In our experiment, choreographers seemed to base their design of fear-related motion on animal behaviors when they animated the swarm as a flock of sheep fleeing from a predator. In certain sequences, a pictorial element is involved, when robots adopt a configuration that, to a human observer, may suggest a geometrical figure. Such an element was present in the sequence representing happiness. In this sequence, choreographers depicted a circle, thus making use of the feature of roundness, a feature frequently associated with the expression of positive emotions [58,62]. In some other sequences, such as Fear and Disgust, choreographers based their design on a narrative approach, portraying a sequence of events. Successive changes of the swarm configuration, and chasing sequences where one robot is seen as leading the way, were used to convey attitudes and emotions. These sequences echo those used in numerous experimental investigations of animacy perception [63,64], and suggest that the motion patterns thus depicted correspond to basic expressive patterns. Finally, choreographers made use of variations in the robots' interdependencies to illustrate certain emotional states. Happiness was associated by observers with a high level of synchronicity and exemplifies the expressive potential of dynamic interrelationships.
In this sequence, the robots were engaged in a highly dynamic game of position adjustment that could transmit a playful attitude of joy. These subtle patterns constitute an interesting element to tap into when designing expressive collective motions.

Conclusion
In this work, we addressed the challenge of representing the internal states of a swarm. We designed two sets of user studies, each increasing our understanding of the motion parameters involved. A flexible implementation was required to conduct these studies, so we presented our decentralized software infrastructure. Based on a swarm-specific programming language, we implemented a series of common swarm control algorithms for motion designers to pick and tune, independently of the underlying hardware. Each algorithm has specific parameters, quickly increasing the complexity of the analysis of their influence. To narrow the analysis, we propose a small set of three high-level motion attributes: temporal synchronicity, spatial synchronicity, and inter-robot distance.
The first experiment relates the expressivity and cohesion of the robot group to the high-level control attributes, injected as control parameters of a flocking behavior. The results show that the perceived cohesion of the group increases with the robots' tendency to stay in groups (be organized) and their spatial synchronicity. The expressivity of the swarm was also increased by the robots' tendency to stay in groups, but was reduced by temporal synchronicity.
The second experiment tasked a small group of professional choreographers with the design of six expressive motion sequences to illustrate internal emotional states. The results show that half the sequences were attributed emotions with significant agreement across all the online participants: fear, happiness, and surprise. Fear and happiness were associated with high synchronicity, and happiness also with the tendency to form figures. We also observed that anger was significantly associated with the absence of leadership in the swarm.
Using these results, the swarm motion can be tuned to share high-level information to its operator. For instance, while conducting an exploration mission, part of the deployed swarm can synchronously aggregate when detecting a gas leak to inform their operator of the danger. In a broader perspective, we believe these preliminary results represent a stepping stone on the path to a better understanding of artificial swarm perception aimed at improving non-verbal communication between human and swarm during collaborative tasks.
Further steps include understanding the expressive figures that develop in relation to the swarm's dynamic state changes. They also involve understanding the relationship between such expressive figures and whether the swarm is perceived as a friendly, indifferent, or intimidating presence. Finally, our next experiments will integrate the design of expressive motions in task-oriented interaction scenarios to explore how to best leverage these findings.