A fully automated touch-response behavior inspection pipeline on zebrafish larvae

A touch-evoked response of zebrafish larvae provides information on the mechanisms of functional gene expression. Recently, an automated system has been developed for precise and repeated touch-response experimentation with minimal human intervention. To quantify the collected data, we propose a fully automated multi-larvae touch-response behavior inspection pipeline based on larva tracking and segmentation. For demonstration, experimental data from different treatments are analyzed with the proposed inspection platform, and the results show that the platform can generate comparable touch-response behavior inspection readouts efficiently and automatically. The initial results were published in the 31. Workshop Computational Intelligence; this paper summarizes and extends the main work of that article.


Introduction
Zebrafish larvae are commonly used animal models for organism-based screenings due to their small size, high fecundity, and short reproductive cycle [1]. Their specific (repeatable and obvious) behaviors indicate certain functional mechanisms of mutants under treatment [2,3], which makes large-scale high-throughput screening of chemicals or drugs possible. Automated experimental systems for acquiring data on these behaviors have been developed [3][4][5][6][7], so automated high-throughput inspection of the data from these systems is in increasing demand, as visual inspection is time-consuming and not statistically comparable. In particular, an automated touch-response experimental system for zebrafish larvae has been developed that controls a blunt needle to touch the larvae at a specific position and with a predefined force (Figure 1). The touch-evoked response of a zebrafish larva consists of three components: a C-Bend (the larva bends its body into a C-shape), reverse C-Bends, and an escape movement (a change of position). The touch-response experimental data (videos) are recorded at a high frame rate [8,9], so automated inspection is essential in this case. During the touch-evoked response of zebrafish larvae, four time points are important: the time when the touch is applied ($t_1$), when the response begins ($t_2$), when the response peaks ($t_3$), and when the response stops ($t_4$). Five criteria are to be quantified: the latency time ($t_l$), C-Bend curvature maximum ($c_m$), C-Bend peak time ($t_{cp}$), response time ($t_r$), and escape distance ($d_e$). However, it is challenging to determine precise values of the C-Bend curvature and escape distance manually [7]. Furthermore, operators cannot apply the same criteria consistently to every video, as a video has more than ten thousand frames on average. Thus, we proposed a touch-response quantification pipeline for a single zebrafish larva in [8]. The multi-larvae case poses more challenges: (i) multiple larvae need to be tracked and segmented at the same time; (ii) the larva that is touched must be identified; (iii) the quantification of multiple larvae has higher computational costs. To solve these problems, we proposed an AI-based Multi-larvae Touch-response behavior Inspection Pipeline (AMTIP) in the work published in the 31. Workshop Computational Intelligence [10]; the main work is summarized and extended in this paper.
In AMTIP, the tracking procedure plays a vital role, especially when tracking multiple larvae [9]. Recently, machine learning and deep learning based tracking methods have emerged that improve tracking accuracy [11,12], and much previous work has focused on the tracking and segmentation of single or multiple adult zebrafish [13][14][15][16]. To make the best of the deep learning methods, we use a U-Net [17] based segmentation method for the initialization of tracking. However, such computationally expensive methods are difficult to apply in the tracking procedure for our high-frame-rate videos. To make the inspection pipeline less complex, we propose an optical flow based needle tracking procedure and a particle filter based larvae tracking procedure. The segmentation of each larva is also important for the analysis of the movements. In [18], a Gaussian Mixture Model (GMM) based segmentation is used to detect the moving objects, and the noise is filtered according to the region size by using a global Otsu method (a conventional automated thresholding method). However, considering global information in our platform makes the procedure more computationally expensive. Therefore, a local region growing based segmentation method is applied to each larva according to the result of the tracking procedure. Based on the tracking and segmentation results, we propose AMTIP to find the touched larvae and generate the behavior quantification according to the proposed experiment criteria. To test the performance of the proposed platform, we conduct six sets of experiments with different drugs and analyze the experiment criteria and detected errors (failure cases). As verified by the experiment results, AMTIP analyzes the touch-response experimental data efficiently and reduces the effort required from the operators involved in the experiments. The methods used in AMTIP can contribute to optimizing object tracking methods for videos that are computationally expensive to analyze. AMTIP can also be adapted into an inspection pipeline for other organisms (such as medaka) and extended with further quantification criteria.
The article is organized as follows. Section 2 describes the design of the proposed AMTIP. Section 3 provides the setup of the experiments, the quantification criteria, and the results, as well as the discussion. Based on these results, conclusions are drawn in Section 4.

Multi-larvae touch-response inspection pipeline
The touch-response inspection procedure converts the raw data collected by the acquisition platform into variables (criteria) that are meaningful to humans. Figure 2 visualizes the architecture of AMTIP, which takes the videos collected by the data acquisition system [8,9] and generates the quantification criteria of the touch-response behaviors, including latency time $t_l$, C-Bend curvature maximum $c_m$, C-Bend curvature peak time $t_{cp}$, response time $t_r$, and escape distance $d_e$. Four time points are vital to the quantification of the touch response: $t_1$ (touch applied), $t_2$ (response begins), $t_3$ (response peak), and $t_4$ (response stops). AMTIP contains three essential parts: initialization, the tracking and segmentation procedure, and quantification. The initial positions of the needle and larvae (initialization, Step 1 in Figure 2), obtained from the first frame, are vital to the accuracy of the tracking procedure in AMTIP. Thus, a U-Net is used to segment the needle and larvae, and the result can be used directly as the initial positions for the tracking procedure.
Given the initial positions, each following frame of the video is processed by the tracking and segmentation procedure (Step 2 in Figure 2), which includes an optical flow based needle tracking [19,20], a particle filter based larva tracking [11], and a region growing based larva segmentation. As the needle moves slowly, optical flow is sufficient to estimate the needle position in each video frame (with coordinates in the frame denoted as $X^{n}_{t_j}$, where $n$ indicates the needle and $t_j$ indicates frame $j$). Optical flow, however, cannot be applied to track the larvae, which move rapidly, so a particle filter based larvae tracking is used to generate the positions of the larvae in each frame ($X^{l}_{t_j}$, where $l$ indicates the larva). To enable a detailed analysis of the touch-response behaviors of the larvae, a region growing based larva segmentation is used. The seed points of the region growing are chosen from the larva positions generated by the larvae tracking procedure, as described in [10]. The outputs of the tracking and segmentation contain the image patches of all larvae, as well as the positions of the larvae and needle in each frame.
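To illustrate the idea of seeding a local region growing step from a tracked larva position, the following is a minimal sketch, not the implementation used in AMTIP. The function name `region_grow`, the 4-connectivity, the intensity tolerance `tol`, and the toy frame are all assumptions made for illustration; the source does not specify the growing criterion.

```python
from collections import deque

import numpy as np


def region_grow(image, seed, tol=10.0):
    """Grow a 4-connected region from `seed` given as (row, col).

    A pixel joins the region when its intensity differs from the seed
    intensity by at most `tol` (a hypothetical tolerance).
    """
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = float(image[seed])
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                # Cast to float so uint8 images do not wrap around.
                if abs(float(image[nr, nc]) - seed_val) <= tol:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask


# Toy frame: bright background with one dark "larva" blob.
frame = np.full((6, 6), 200, dtype=np.uint8)
frame[2:4, 1:5] = 50   # larva body (8 pixels)
frame[0, 0] = 50       # dark pixel that is not connected to the larva
larva_mask = region_grow(frame, seed=(2, 2))
```

Because the region only grows outward from the seed, the cost depends on the size of the larva rather than on the size of the frame, which is in line with the motivation for a local rather than global segmentation.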
With the results of the tracking and segmentation procedure, the quantification criteria (latency time $t_l$, C-Bend curvature maximum $c_m$, C-Bend curvature peak time $t_{cp}$, response time $t_r$, and escape distance $d_e$) are generated in the following three steps:
1. Each video contains multiple larvae (Larva #1, Larva #2, Larva #l, etc.), so AMTIP first identifies the larva that is actually touched by the needle among the larvae in the video for the subsequent quantification (Step 3-1 in Figure 2). The needle stops at the larva position after the touch is applied, so the touched larva is the one whose initial position ($X^{l}_{0}$ at $t = 0$) is closest to the final position of the needle ($X^{n}_{t_f}$ at $t = t_f$).
2. For the latency time $t_l$ and response time $t_r$, the essential time points $t_1$ (touch applied), $t_2$ (response begins), and $t_4$ (response stops) need to be computed as detailed in Figure 2. As shown in Step 3-2 in Figure 2, the distance between the needle and the larva is computed for each frame from $t = 0$ until the first time point at which the distance falls below a heuristic threshold $T_{nl}$; this time point is $t_1$ (when the touch is applied). $t_2$ is obtained as the time point when the larva moves and the response begins (with a heuristic threshold $T_{mq}$: the percentage of the particles used in the particle filter in Step 2). Similarly, as shown in Step 3-3 in Figure 2, $t_4$ is searched backwards from $t_f$ until the time point when the response stops. Consequently, the latency time is computed as $t_l = t_2 - t_1$, and the response time is computed as $t_r = t_4 - t_2$. The escape distance is computed as the sum of the distances between the larva positions in consecutive frames from $t_2$ to $t_4$.
3. To quantify the amplitude of the touch-response behaviors, the curvature of the C-Bend in each frame is analyzed according to the skeleton of the larva. From this, the C-Bend curvature maximum $c_m$ can be computed, as well as the time point of $c_m$ ($t_3$, response peak).

Experiment

Experiment setup
Different chemicals can have different influences on the touch-evoked behaviors of zebrafish larvae. Thus, experiments on long-term treatments (denoted as $E_{lt}$) with different chemicals are conducted on zebrafish larvae to analyze the differences between the chemicals. The protocol is visualized on a timeline in Figure 3, and the experimental setup is outlined in Table 1.
In Experiment $E_{lt}$, the larvae at 73 h post-fertilization (hpf) are placed in the well plate and touched on the body. The details of the  Isoproterenol hydrochloride (Iso) with unknown effects, Caffeine (Caffi), also for reduction of movements [21], and Suberoylanilide hydroxamic acid (Saha) with unknown effects, respectively. Each treatment has a concentration of 100 μmol/mL for the demonstration. The larvae are dechorionated and treated at 27 hpf for long-term treatment. The proposed AMTIP is used to quantify the data collected in Experiment $E_{lt}$ to verify the assumption that AMTIP can generate different touch-response behavior criteria for different long-term chemical treatments of zebrafish larvae. Even though the proposed AMTIP is designed for the multi-larvae case, it can also be used to quantify data in the single-larva case.

Experiment results
The experiment on long-term treatment ($E_{lt}$) is conducted to collect 173 videos (24 videos for Wild, 27 videos for DMSO, 38 videos for Dia, 30 videos for Iso, 24 videos for Caffi, and 30 videos for Saha; the dataset is denoted as DA-Elt in Table 2). The quantification is run via AMTIP, and the results are visualized in Figure 4, including latency time $t_l$, C-Bend curvature maximum $c_m$, C-Bend curvature peak time $t_{cp}$, response time $t_r$, and escape distance $d_e$.

Evaluation of AMTIP
The proposed AMTIP can fail in the touch-response quantification owing to inaccurate segmentation or to objects missed by the tracking procedure used in the inspection pipeline. Thus, the detected errors (failure cases) are analyzed to evaluate the proposed AMTIP. The collected video data contain some unquantifiable videos, for example those in which the larvae are not touched or in which the larvae or needle cannot be detected. The dataset DA-Elt collected in the experiment on long-term treatment $E_{lt}$ is used to analyze the detected errors, including the number of videos with no larvae touched ($\#NT$) and the number with failures of quantification ($\#QF$). Among the videos collected ($\#C$), shown in Table 3, the ground-truth numbers of videos with no larvae touched ($\#NT_g$, generated by visual screening) are compared with the numbers output by AMTIP ($\#NT_p$), together with the false positive rate (FPR) and false negative rate (FNR). The numbers of failures of quantification ($\#QF$) are also given with the percentage ($E_{QF}$). AMTIP generates $\#NT_p$ and $\#QF$ automatically and finds on average more than 90 % of the videos without any larvae touched (1 − FNR). Besides, around 10 % of the valid videos ($\#C - \#NT_g$) cannot be quantified by AMTIP (failure cases). In addition, larvae under Dia treatment are assumed to scarcely respond. Thus, the latency time is expected to be infinite, and the other quantification criteria (C-Bend curvature maximum, C-Bend curvature peak time, response time, and escape distance) are expected to be 0. However, AMTIP can only output finite latency times shorter than the duration of the videos (15 s in our case). Nevertheless, Figure 4a shows that the latency time of Dia is still useful for comparison with the controls, as it is much longer than those of wild and DMSO. Furthermore, the results in Figure 4b-e are greater than zero (contrary to the expected outputs), for the following reasons: (i) some larvae still have a slight response; (ii) the movements of the needle can push the larvae away (fake response); (iii) the tracking procedure generates the movements of the larvae because

Discussions
The results verify that the proposed inspection pipeline AMTIP can work as an automated quantification tool for touch-response data recorded at a high frame rate. AMTIP uses the following advantageous strategies:
- The time point when the touch is applied ($t_1$), as well as the larva that is actually touched, is obtained from the final position of the needle and the initialized positions of the larvae, as the local segmentation during the tracking procedure is not as accurate as the initial segmentation by the U-Net.
- The response of the larvae is defined by the particles used in the particle filter based larva tracking (details in Section 2) instead of by the change of the larva center, because the centers of the larvae can change slightly but constantly during the tracking procedure even if the larvae do not move.
- The time point when the touch response stops ($t_4$) is searched backwards from the last frame, since the larva can move slowly (no significant changes of pixels) for a moment and then start moving strongly again.
- The quantification is performed after the tracking and segmentation of all frames in the video, which makes it possible to consider the global information of the video.
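The strategies above can be sketched in a few lines, assuming that per-frame needle and larva positions and a per-frame fraction of moved particles are already available from the tracking procedure. This is an illustrative sketch only: the function names, the array layout, the frame interval `dt`, and the choice of the last moving frame as $t_4$ are assumptions, while `T_nl` and `T_mq` stand for the heuristic thresholds $T_{nl}$ and $T_{mq}$ whose values are not given here.

```python
import numpy as np


def find_touched_larva(larva_init, needle_final):
    """Index of the larva whose initial position is closest to the
    final needle position."""
    diff = np.asarray(larva_init, float) - np.asarray(needle_final, float)
    return int(np.argmin(np.linalg.norm(diff, axis=1)))


def quantify(needle, larva, moved_frac, dt, T_nl, T_mq):
    """Return (t_l, t_r, d_e), or None if a time point is not found.

    needle, larva: (F, 2) positions per frame (hypothetical layout).
    moved_frac:    (F,) fraction of particles that moved in each frame.
    dt:            time between frames.
    """
    needle = np.asarray(needle, float)
    larva = np.asarray(larva, float)
    moving = np.asarray(moved_frac, float) > T_mq

    # t_1: first frame where the needle-larva distance drops below T_nl.
    dist = np.linalg.norm(needle - larva, axis=1)
    touch = np.flatnonzero(dist < T_nl)
    if touch.size == 0:
        return None  # no larva touched
    j1 = touch[0]

    # t_2: first frame at or after t_1 in which the larva is moving.
    resp = np.flatnonzero(moving[j1:])
    if resp.size == 0:
        return None  # no response detected
    j2 = j1 + resp[0]

    # t_4: last moving frame, i.e. the first moving frame met when
    # scanning backwards from the final frame.
    j4 = np.flatnonzero(moving)[-1]

    # d_e: summed distance between consecutive positions from t_2 to t_4.
    steps = np.linalg.norm(np.diff(larva[j2:j4 + 1], axis=0), axis=1)
    return (j2 - j1) * dt, (j4 - j2) * dt, float(steps.sum())


# Toy data: the needle approaches a larva that escapes two frames later.
needle = [(x, 0) for x in (0, 2, 4, 6, 8, 9, 9, 9, 9, 9)]
larva = [(10, 0)] * 7 + [(13, 4), (16, 8), (16, 8)]
moved_frac = [0.0] * 7 + [0.9, 0.8, 0.0]
result = quantify(needle, larva, moved_frac, dt=0.001, T_nl=1.5, T_mq=0.5)
```

Searching for the last moving frame over the whole video, rather than stopping at the first pause after $t_2$, reflects the point that a larva may pause briefly and then resume a strong movement.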
However, some drawbacks still need to be considered carefully when users apply AMTIP to their own data. The tracking procedure and the local segmentation of the larvae are the keys to AMTIP, but they may fail in the following cases: (i) the larvae overlap with each other when moving; (ii) the well edge area has a brightness similar to that of the larvae; (iii) the needle overlaps with the larvae. It is essential to conduct touch-response experiments on a large scale, so the proposed AMTIP is vital in such cases.

Conclusions
In this work, we introduce an AI-based inspection platform for the touch response of zebrafish larvae, which can generate five quantification indices (latency time, C-Bend curvature maximum, C-Bend curvature peak time, response time, and escape distance) automatically without human intervention. This platform uses an automated inspection pipeline based on a multi-larvae tracking procedure, with a U-Net for the initialization of the tracking procedure, optical flow and a particle filter for tracking, and region growing for the local segmentation of larvae. Six sets of experiments (two controls and four treatments) are conducted, and the results generated by this platform, as well as the analysis of the detected errors, verify the effectiveness of the platform.
According to the corresponding experimental results, AMTIP yields conclusions that are consistent with our assumption. It is also efficient, requiring on average 63 ms per frame for the inspection pipeline on a CPU. AMTIP can be applied to the inspection of animal behaviors and to systems that need to analyze position changes in videos and to quantify the movements into criteria.

Figure 1: The diagram of conducting touch-response experiment and the corresponding response.

Figure 2: The diagram of the AI-based multi-larvae touch-response behavior inspection pipeline (AMTIP), including the initialization, tracking and segmentation procedure, quantification, and quantification criteria. The steps are marked in red.

Figure 3: Protocol of the quantification experiment $E_{lt}$, which is marked in red.

Figure 4: Five quantification indices on DA-Elt with six experiment cases (wild, DMSO, Dia, Iso, Caffi, and Saha) generated by AMTIP, including latency time $t_l$, C-Bend curvature maximum $c_m$, C-Bend curvature peak time $t_{cp}$, response time $t_r$, and escape distance $d_e$. (a) Latency time. (b) C-Bend curvature maximum. (c) C-Bend curvature peak time. (d) Response time. (e) Escape distance.

Table 1: The experiment setup of experiment $E_{lt}$. Wild: larvae in fish water. DMSO: larvae in dimethyl sulfoxide. Dia: larvae treated with diazepam. Iso: larvae treated with isoprenaline hydrochloride. Caffi: larvae treated with caffeine. Saha: larvae treated with suberoylanilide hydroxamic acid.

Table 2: The datasets of experiment $E_{lt}$.
1 As each treatment is prepared with DMSO, the experiments on the larvae with only DMSO (1 %) are also conducted as controls.

Table 3: The analysis of the detected errors (failure cases) of the proposed AMTIP. $\#C$: the number of collected videos. $\#NT_g$: the ground-truth number of videos with no larvae touched, generated by visual inspection. $\#NT_p$: the predicted number of videos with no larvae touched, generated by AMTIP. FPR: false positive rate of the videos without larvae touched. FNR: false negative rate of the videos without larvae touched. $\#QF$: the number of failures of quantification. $E_{QF}$: the percentage of failures of quantification, $E_{QF} = \#QF / |\#C - \#NT_g|$.

of the slight environment changes or other inaccuracy. Nonetheless, the results of Dia in Figure 4b-e are much lower than those of wild and DMSO. In other words, even with slight variance, the proposed AMTIP verifies our assumption that the Dia treatment reduces the touch response of zebrafish larvae. Finally, owing to the proposed efficient tracking and segmentation procedure, AMTIP performs the quantification more efficiently (on average 63 ms per frame on CPU) than the U-Net (on average 2.60 s per frame on CPU).
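The error measures used in Table 3 can be computed as in the following minimal sketch. It assumes per-video boolean labels for "no larva touched" (ground truth and prediction), from which FPR and FNR follow by their standard definitions; the function name, the input layout, and the toy labels are hypothetical and are not taken from the experiments.

```python
def error_rates(nt_true, nt_pred, n_quant_failed):
    """Compute FPR, FNR and E_QF from per-video labels.

    nt_true / nt_pred: booleans, True when no larva is touched
                       (ground truth / predicted).
    n_quant_failed:    number of failures of quantification (#QF).
    """
    pairs = list(zip(nt_true, nt_pred))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    tn = sum(1 for g, p in pairs if not g and not p)

    n_c = len(pairs)     # #C
    n_nt_g = tp + fn     # #NT_g
    n_valid = abs(n_c - n_nt_g)
    return {
        "FPR": fp / (fp + tn) if fp + tn else 0.0,
        "FNR": fn / (fn + tp) if fn + tp else 0.0,
        # E_QF = #QF / |#C - #NT_g|
        "E_QF": n_quant_failed / n_valid if n_valid else 0.0,
    }


# Toy labels for ten videos (illustrative values only).
truth = [True, True] + [False] * 8
pred = [True, False, True] + [False] * 7
rates = error_rates(truth, pred, n_quant_failed=1)
```

Dividing the quantification failures by the number of valid videos, rather than by all collected videos, keeps videos without a touch from diluting the failure percentage.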