Robotic ecology: Tracking small dynamic animals with an autonomous aerial vehicle

Estimation algorithms for wildlife tracking with an autonomous aerial robot are supported by field validation with wild swift parrots. Understanding animal movements that underpin ecosystem processes is fundamental to ecology. Recent advances in animal tags have increased the ability to remotely locate larger species; however, this technology is not suitable for up to 70% of the world’s bird and mammal species. The most widespread technique for tracking small animals is to manually locate low-power radio transmitters from the ground with handheld equipment. Despite this labor-intensive technique being used for decades, efforts to reduce or automate this process have had limited success. Here, we present an approach for tracking small radio-tagged animals by using an autonomous and lightweight aerial robot. We present experimental results where we used the robot to locate critically endangered swift parrots (Lathamus discolor) within their winter range. The system combines a miniaturized sensor with newly developed estimation algorithms to yield unambiguous bearing- and range-based measurements with associated measures of uncertainty. We incorporated these measurements into Bayesian data fusion and information-based planning algorithms to control the position of the robot as it collected data. We report estimated positions that lie within about 50 meters of the true positions of the birds on average, which are sufficiently accurate for recapture or observation. Further, in comparison with experienced human trackers from locations where the signal was detectable, the robot produced a correct estimate as fast or faster than the human. These results provide validation of robotic systems for wildlife radio telemetry and suggest a way for widespread use as human-assistive or autonomous devices.


INTRODUCTION
Conservation management of certain critically endangered species relies on understanding how these species interact with their environment. This is achieved by tagging and tracking individual animals in the wild (1)(2)(3). Aerial robot systems can access rugged areas that are difficult for humans to traverse and thus are viewed as a potentially revolutionary tool for data collection in wildlife ecology (4,5). However, this potential remains largely unrealized. Robot systems have yet to achieve levels of signal detection, tracking accuracy, and speed that are sufficient to legitimize their role as a replacement for human trackers. Despite recent advances in automated wildlife telemetry tracking, very little is known about the movement of small, dynamic migratory species, of which many have reached critically endangered status. For large to medium animals, the miniaturization of GPS tags with remote data readout has facilitated a marked increase in understanding the movements of a diversity of species (6,7). Methods such as satellite telemetry have far-reaching applications, from investigating migration routes and wintering areas of large migratory birds (8)(9)(10) to studying the dynamics of aquatic predators (11,12). Unfortunately, these approaches are still only suitable for about 70% of bird species and 65% of mammal species (2). In the case of smaller species that return to the same breeding areas seasonally, miniature nontransmitting data loggers can be used (2); however, retrieving these data requires relocating the animals in situ. Because of this challenge, very high frequency (VHF) tracking has become one of the most useful techniques in ecology and management (13). This involves attaching small radio transmitters to animals and subsequently tracking the target species. Although scientists have been using VHF tracking since the early 1960s (14), data yielded by this approach are sparse because of the manual labor involved (2) and the need to constantly seek higher elevations to increase signal detectability. Thus, researchers are increasingly using low-cost unmanned aerial vehicles (UAVs) equipped with visual sensors as alternative tools for improving conservation management and wildlife monitoring (4,5). However, the capability of these systems is limited in terms of identifying individual animals and locating animals in unknown locations. Practical considerations when using remotely piloted aircraft for fieldwork are discussed in (15).
In recent years, there has been increased interest in end-to-end wildlife telemetry tracking with robotic systems (4), where the robot moves autonomously to track a live target animal. The usefulness of these systems, however, has yet to be proven in direct performance comparison with the traditional manual approach. Ongoing research is aimed at tracking radio-tagged carp in Minnesotan lakes using autonomous surface vehicles on the water and mobile ground vehicles when the lake is frozen (16)(17)(18)(19)(20)(21). Although this project has yielded seminal work in the field, the use of ground and surface vehicles is untenable for wildlife situated in rugged habitats. We recently validated the use of a multirotor UAV for autonomously localizing live radiotagged birds in such environments (the Manorina melanocephala) (22). Here, we present validation that robotic systems can perform comparably to experienced human trackers in head-to-head experiments. Our system is illustrated in Fig. 1.
The majority of research in radio tracking with an aerial vehicle has focused on isolated subsystems. Although these systems were typically motivated by the idea of tracking small animals [e.g., bird (23)(24)(25)(26) and fish species (27,28)], only simulations or prototypes have been presented with limited field testing. Alternatively, when tracking a relatively stationary target, the observations can be considered more robust, and thus, attention in this field has shifted to optimizing planning for single- (18,20,21) or multirobot (19,29) systems. The main assumption the authors have made is that the sequential observations are homoscedastic, meaning that the uncertainty over each measurement is constant or bounded. However, with a sporadic and unpredictable live target, this assumption is violated because of the resulting wide spectrum of observation quality from noisy to precise. As we show in this paper, this induces heteroscedastic observations, where the uncertainty varies with every observation. Failing to distinguish between low-and high-quality observations can lead to overconfident measurements that cause spurious location estimates or highly uncertain location estimates that are of little value.
A mathematically valid observation model is critical in planning the motion of the robot to improve the location estimate. In robotics, this general problem is known as active perception (30, 31) and intro-duces a coupling between data collection and planning. The idea of passively locating transmitting radio sources has been investigated in operations research motivated by search-and-rescue missions, where stationary distress beacons must be recovered rapidly. Hence, the task is a coverage problem solved via offline strategies with an emphasis on minimizing path cost over the entire area or teleoperated by humans (32). Alternatively, when the wildlife habitat is known and bounded, sensor networks can be placed to precisely track an animal's location (33,34). In our case, we require fast, precise estimates without intervention and thus use active strategies where the observation quality relies crucially on an appropriate sequence of viewpoints (35). Our objective is to reduce uncertainty (entropy) of the target location; thus, the task of actively tracking targets falls under the informative path planning paradigm (36). This problem is known to be NP-hard (37) and has been studied extensively over the past decade (38), with many applications focusing specifically on UAVs (39)(40)(41)(42)(43)(44). In this paper, we leveraged these results to obtain an approximately optimal sequence of actions by greedily selecting the most informative viewpoints at each decision step.
In this work, we present rigorous theoretical analysis and in-field validation of a complete system for autonomous wildlife tracking. We show that this system addresses many theoretical and engineering challenges to a degree that is sufficient to match or surpass the performance of skilled human trackers from a location where signals are detectable. First, we provided a mathematical derivation for our data-driven sensor model, which has previously been validated over a number of trials on real birds and stationary targets (22). This range-azimuth model was further used to predict the quality of future viewpoints in planning an approximately optimal sequence of observations. We then directly compared this system with human operators in the problem of tracking the critically endangered swift parrot (Lathamus discolor) species in the wild. In six of eight trials, the estimated bird locations fell within 50 m of the true locations on average, which was sufficient for recapture, detailed field observation, or data readout. In the remaining two trials, we were unable to establish ground truth position because the target bird moved during the flight and visual confirmation was lost. Moreover, the time taken to achieve these estimates was comparable to, and often faster than, experienced human trackers. Our demonstration validates the concept of a robotic A C D E B Fig. 1. The aerial robot system is designed to track small animals with lightweight radio tags attached. Swift parrots (L. discolor) (A) are considered here; noisy miners (M. melanocephala) (B) were considered in our previous work (22). (C) The robot was able to track swift parrots and yielded performance comparable with an expert human operator performing traditional wildlife telemetry tracking. The multirotor platform, shown from the front (D) and rear (E), includes a lightweight directional antenna system and payload that receives the signal strength from the tag (22). These data were then transmitted to a GCS for processing and online decision-making.
system for wildlife telemetry tracking used as an autonomous or humanassistive device in real field conditions. This milestone motivates further engineering development that may enable more widespread use of autonomous robots in movement ecology and in conservation management of small, dynamic moving species.

RESULTS
In this section, we report on the approaches used for sensor modeling, data fusion, and decision-making. We then present field trial results, followed by a discussion of insights into the global and local spatiotemporal movements of swift parrots that were gathered from the trials. The material in this section has its basis in the bearing-only, heuristic approach presented in previous work (22). Here, we present a full range-azimuth algorithm derived rigorously from first principles. We also provide variance analysis and proof that the objective function is monotone submodular, an important property that is useful in designing efficient planning algorithms.

Likelihood functions for observations
The most critical component of the system is the sensor model, which allowed us to convert the signal received from the radio tag to an instantaneous estimate of the target's location. An inaccurate or overconfident observation can lead to poor decision-making and imprecise final location estimates.
Let X ∈S denote the vehicle location, Y ∈S the targets' geographic coordinates within some workspace S & ℝ 2 , and Z an observation in some measurement space H. We are interested in learning the likelihood function ℓ(y; x, z), that is, the probability of receiving the measurement z at location x, given the target location y: Note that we could consider uncertainty in the vehicle location x by including it in the measurement z; however, we assumed full knowledge of the vehicle state in this paper. We use the convention that uppercase symbols (e.g., Y and Z) are random variables, lowercase symbols (e.g., y and z) denote their realizations, and bold denotes a set. Further, true (or optimal) quantities are denoted with an asterisk (e.g., y* is the true location of the bird), and estimates are denoted with a hat (e.g., ŷ is the target estimate).
To construct our sensor models, we must determine what we are measuring and the uncertainty over these measurements. In this work, we take both range and azimuth readings of the target, where both observations are assumed to be normally distributed. This results in each measurement comprising the mean and variance z = {m, s 2 }. Given a measurement function h: ðS Â SÞ→H that maps the vehicle x and target state y to the measurement space H, the Gaussian likelihood function is ℓðy; x; zÞ ¼ f ðhðx; yÞ; m; s 2 Þ ð 2Þ where f is the probability density function (PDF) of the normal distribution.
Observed and expected sensor data To derive our likelihood functions, we first characterized the raw sensor data and described the model used for obtaining our measurements.
More details on the system collecting these measurements are given in Materials and Methods. The radio tag emits an on-off keyed pulse signal; this transmission was received by the payload on board the UAV, and the received signal strength indicator (RSSI) values of the signal were captured and filtered. These RSSI values are linearly related to the power received during a transmission and are used as the raw sensor data for the observation. The range and bearing likelihood functions are based on these raw values and the learned sensor model.
The antenna used on board the UAV was a two-point phased array: a lightweight, unambiguous directional antenna designed for radio telemetry with multirotor vehicles (22). The array comprised two monopoles fed through a passive combiner circuit, which yielded a radiation pattern with a front lobe and back null. To reduce noise and spurious readings due to multipath propagation, the UAV remained stationary while yawing through a full rotation. During this rotation, the continuous RSSI values were filtered and sampled at a constant rate to give a scalar value g n associated with the bearing of the nth value f n . These values were then transmitted to a base station, giving the recorded gain pattern g = (g 1 , g 2 , … , g N ). As a result, the random vector G = g is a function of the vehicle X and target Y location. Further, let b(x, y) denote the bearing from x to y. The true bearing to the target from robot location x is then q* = b(x, y*). We assume that the error for each recorded RSSI value is normally distributed with an unknown variance s 2 (q*) that remains constant throughout an observation, that is, for arbitrary g n ∈ g where s 2 (q*) = V(G n |Q = q*). We obtain the expected gain pattern Ε[G|Q] by linear regression. Specifically, we fit the expected gain pattern to a Jth-order Fourier series φ: ℝ → ℝ, that is, given the true bearing q*, From this Fourier model, we obtain the expected gain pattern Given the expected and observed sensor output, φ and g, the main goal of Bayesian sensor data fusion is to compute PDFs of the bearing and range to a target from the robot. Given that the likelihoods are assumed to be Gaussian, the measurement tuple Z equals {m(G), s 2 (G)}. To learn the mapping from G to Z, we use a data-driven approach described in the following subsection.

Azimuth likelihood function
We model the likelihood of each azimuth measurement with a Gaussian bearing-error model (45), where Z Q ¼ fm Q ðGÞ; s 2 Q ðGÞg. That is, the difference between the true bearing to the target q* and the estimated bearingq (i.e., the bearing error) is Gaussian distributed. The bearing estimateq ¼ m Q ðgÞ and its variance s 2 Q ðgÞ are not measured directly but instead given as functions of observation quality (i.e., the correlation coefficient, discussed below). As a result, when G = g, the bearing-error likelihood function ℓ Q is given by Now, given our model φ of the gain pattern, our problem becomes that of inverse regression to find the expected bearing and uncertainty. The Gaussian bearing-error assumption stateŝ whenq ¼ Ε½Qjg and s 2 Q ¼ VðQjgÞ. We find the expected azimuth by minimizing the sum of squares of the residuals, that is, To infer the variance V(Q|g) for a given signal g, we note that the collection of {G} is heteroscedastic, that is, the conditional variance can change with each observation. This is shown in the scattergram in Fig. 2, where the bearing error is plotted against observation quality (correlation). Sensor data used to construct this example were collected from a stationary radio tag as described in our previous work (22). We assume that this unexplained variance is due to hidden causes of observation noise, such as the target animal moving during Obtaining range-azimuth likelihood functions from observations. (Top row) Two example observations taken online with a stationary target. The radial plots illustrate real RSSI readings (green line) g and a third-order Fourier series model φ(q) of the radiation pattern (black line). The model is offset (rotated) such that it is oriented toward the true bearing to the target q*, and the RSSI values are offset by the maximum correlation m Q (g) = arg max q r ϕ(q),g . These offsets are illustrated with dashed green and black radial lines. (Left) The maximum value correlation coefficient rφ ; g maps to a bearing-error s 2 Q ðgÞ, which is illustrated in the grid plots below. (Right) The maximum RSSI value g max maps to an expected range m R (g) with a fixed range-errorŝ 2 R , giving the associated grid plots below. a measurement or spurious recordings due to multipath interference. In typical regression, heteroscedasticity is considered undesirable and is reduced by introducing more regressors or nonlinear transformations of the existing variables. In our case, given that this knowledge is hidden, we cannot introduce more variables and instead marginalize out this quantity to infer the conditional variance from data. Below, we show how the coefficient of determination expresses the proportion of variability in our model (i.e., the heteroscedasticity is attributed to bearing error).
In the context of regression, we can obtain the fraction of variance unexplained (FVU) for a response variable through the coefficient of determination. In linear regression, where we have the sample variance s 2 g as an estimate of the population variance V(G), the FVU is given by the sample correlation coefficient r 2 :

VðGjQ ¼qÞ
VðGÞ ≃ However, we are interested in the bearing variance V(Q|G), which we can approximate from the model variance V(φ(Q)|G) by Taylor expansion. Recall that our estimateq ¼ m Q ðg 1 ; g 2 ; …g N Þ is a function of the random vector G. We can approximate the variance of this mapping via a first-order Taylor expansion (46), Now, because the measurement G comprises independent and identically distributed variables G n , the covariance matrix is given by S = V(G|Q)I N , where I N is the identify matrix. This gives the conditional variance in Eq. 9 as Because small changes in each realization of G will introduce small changes in m Q , the variance in Eq. 10 is approximately linear for low noise n G ; however, the approximation becomes worse as n G becomes large. By using the coefficient of determination (Eq. 8), we can express the variance of a given sensor reading g in Eq. 10 as Thus, s 2 (g) can be expressed as a function of s 2 g ð1 À r 2 φðqÞ;g Þ. In practice, we regress only on s 2 (g), assuming that the variable is a piecewise continuous function of the explanatory variable ð1 À r 2 φðqÞ;g Þ. We can also determine azimuthq ¼ m Q ðgÞ for each measurement g by the correlation coefficient r φðqÞ;g . That is, after each observation, the recorded gain pattern is correlated against the model φ(q) with regular phase offsets q, and the lag that corresponds to the maximum correlation then gives the estimated angle of arrival; that is, Eq. 7 becomes m Q (g) = arg max q∈ [0,2p) r φ(q),g . This process of obtaining an azimuth observation is illustrated on the left of Fig. 2, and example likelihood functions from one trial can be seen in Fig. 3.

Range likelihood function
Next, we estimate the distance to the target using a Gaussian rangeerror model where the set Z R ¼ fm R ðGÞ;ŝ 2 R g. The range errors are assumed to be logarithmic, as discussed below. Furthermore, unlike the bearing observations, the scattergram in Fig. 2 does not indicate that the noise is heteroscedastic, that is, the variance is constant for each observation. This yields the likelihood function In general, range measurements in cluttered environments can be highly imprecise due to multipath interference. We anticipate the vehicle to be deployed in similar environments and estimate the variance under these conditions. Although the error in range measurements can be substantial, including such observations is still useful. Because the noise is homoscedastic, we can rely on range measurements to provide an approximate location. The ability to focus on an approximate location is particularly beneficial when the search area would otherwise be expansive, such as in tracking scenarios where there is little prior knowledge of the target's location and when bearing uncertainty is high.
We are interested in mapping the sensor output g to the distance between transmitter and receiver. Because of atmospheric interactions, the signal amplitude will decrease with range. We denote d(x, y) as the Euclidean distance between our receiver x and the transmitter y. Then, the received power p r is a function of the transmitted power p t and the attenuation per meter a (47): In Eq. 13, we have assumed that p r and p t take into account the link budget, which characterizes all gains and losses in the telecommunication system. Most of these components are fixed for a given system (e.g., transmitter and receiver losses); however, for a directional antenna, the gain relative to the average radiation intensity (the isotropic directivity) depends on the immediate angle of arrival, f n . As a result, the RSSI values g n are a function of the received power p r and angle of arrival f n . The isotropic directivity is approximately constant if we take the maximum RSSI value g max = max n g n . Thus, we use the value g max to estimate distance. Now, let the true distance to the target be r* = d(x, y*) and its estimate be a function of g, that is,r ¼ m R ðgÞ. From the above discussion and Eq. 13, p r = p(g max ) for some linear function p: ℝ → ℝ. Moreover, r* is a function of log p(g max ), and the Gaussian range-error assumption may be expressed as where s 2 R ¼ VðlogRÞ. We thus obtain the estimated ranger as The function m R (G) can be fitted to a first-degree polynomial function of log g max . The variance s 2 R is estimated by the sample variance The procedure for obtaining a Gaussian range-error observation is illustrated on the right of Fig. 2, and example range likelihood functions can be seen in Fig. 3.

Combined likelihood function
The individual likelihood functions may be combined to obtain a rangeazimuth likelihood function ℓ(y; u), where Z ¼ fm Q ðGÞ; s 2 Q ðGÞ; m R ðGÞ;ŝ 2 R g. That is, assuming independent errors n Q and n R , the likelihood functions are multiplied pointwise (45), that is, We tested the null hypothesis that these errors are independent by computing the sample correlation coefficient. Because the errors are assumed to be normal, the hypothesis was tested via a Student's t distribution with 95% confidence and 150 observations. The results showed a correlation of r v Q ;v R ¼ À0:08 ± 0:136, giving a confidence of less than 66% that the errors are correlated. This result further supports the heteroscedasticity assertion, that is, that poor-quality observations are not significantly correlated with distance.

Bayesian data fusion
Given the likelihood function in Eq. 16, we can combine numerous observations to determine the most likely position of the target animal. To achieve this, we use Bayesian data fusion, assuming independent observations; this process is illustrated in Fig. 3.
By time t, we have obtained observations of the target animal at a set of times 0 ≤ t 1 … ≤ t k ≤ t. The above notation is extended to include the observation number k. Denote Y k = Y(t k ) as the target's location at time t k and, similarly, Z k = Z(t k ) is the observation taken at this time. Moreover, let U k = {X k , Z k }; denote the measurement process as Z 1:k = (Z 1 ,…, Z k ) and the values it takes on as z 1:k = (z 1 , . . . , z k ). We are ultimately interested in knowing the probability of the target's state after all K observations, that is, the posterior belief (36,45), Further, we assume that the target can transition between observations such that y k = y k−1 + n Y with n Y e N ð0; ∑ Y Þ for some co-variance S Y . This leads to the transition density Computing the posterior belief (Eq. 17) becomes simpler if the process (Y(t)) t≥0 is assumed to be Markovian and each observation Z k only depends on Y k , that is,ℓðy 1:K ; u 1:K Þ ¼ ∏ K k¼1 ℓðy k ; u k Þ. Because the likelihood function in Eq. 16 is defined this way, recursive Bayesian filtering (45) can be used to update the belief. That is, the posterior belief is computed as where h is a normalization constant such that ∫dy k p(t k , y k ) and ℓ(y k ; u k ) is the likelihood function (Eq. 16). The first step, Eq. 19, gives a motion update, and the second step, Eq. 20, gives the information update to obtain a new belief of the target location (45). Early approaches to recursive Bayesian filtering focused on Gaussian implementations because of convenient analytical solutions to computing the posterior belief in Eq. 17, for example, Kalman filters (KFs) and extensions such as the unscented and extended KFs. However, these methods are approximations to the nonlinear, non-Gaussian Bayesian filter (shown in Eqs. 19 and 20). Grid-based filtering allows for resolution-complete recursive estimation (45,48) and can be computed in reasonable time over our workspace. Thus, we represent our workspace S as an I × J grid in ℝ 2 .
The evolution model, Eq. 19, is functionally equivalent to Gaussian convolution. Further, given our grid-based workspace S , this convolution is simply a Gaussian blur, a spatial (low-pass) filter commonly used in image processing. To efficiently implement this model, we leverage results from computer vision for convolution and use a separable Gaussian kernel of width 3|S Y |.
Last, we require an estimate of the location of the target ŷ k given the posterior p(t k , y k ). Two obvious choices are the expected value of the posterior, E[Y k ] = ∫dy k y k p(t k , y k ), or the maximum a posteriori probability (MAP) estimate, arg max y k ∈S pðt k ; y k Þ. The MAP estimate performed marginally better in preliminary trials; however, in practice, the target does not remain stationary, and so we instead maximize recursively over all posteriors: y k ¼ arg max u∈½1;kÞ;y u ∈S pðt u ; y u Þ In this way, the location estimate likelihood is strictly increasing. The bearing-only likelihood function ℓ Q , the range-only likelihood function ℓ R , the combined likelihood function ℓ, and the posterior belief p(·). In each column, the first observation (k = 1) is shown in the bottom grid, the last observation (k = 4) is at the top, and higher probability mass is represented as darker, raised regions. The UAV location x k is indicated by a green dot, the target location y Ã k in purple, and the maximum likelihood estimate ŷ k in yellow.

Decision-making by information gain
Our overall objective is to know where our target animal is and with what certainty. Thus, the problem can be considered under the framework of information gathering (36). To quantify uncertainty, we use Shannon entropy, a standard measure for this purpose. The conditional (Shannon) entropy of a random variable V, given another variable W, quantifies the uncertainty over the outcomes of V in the context of W; mathematically, this is given by H(V|W) = E[log Pr(V|W)].
In this context, we aim to choose a sequence u 1:K = {x 1:K , z 1:K } of state-measurement pairs such that the final entropy of the belief H(Y N ) is minimized. That is, letting U ¼ ðS Â HÞ and fixing the measurement space H, the objective can be stated as U * 1:K ¼ arg min where the posterior belief p(t k , y k ) is a function of the robot position and measurements u 1:k . However, it is more convenient to consider the equivalent problem of maximizing the information gain of each observation. Let Y À k be distributed according to the target belief after the motion update step, that is, Eq. 19. The information gained in taking the action U k = u k is quantified by the mutual information IðY k ; Y À k Þ between the posterior and the prior belief: Decomposing Eq. 22 using the chain rule, the entropy minimization problem defined in Eq. 21 can be expressed as The objective of Eq. 23 is equivalent to entropy minimization and is, in general, nonconvex and analytically intractable. However, the mutual information given in Eq. 22 is monotone submodular, and thus, the quality of the solution provided by a greedy algorithm is at least 63% of optimal (37). That is, given a deterministic greedy algorithm that selects the action at each decision step, the resulting path û 1 : K is within a constant factor of optimal of the objective shown in Eq. 23, that is, Furthermore, this is the most efficient algorithm to obtain such a bound unless P = NP (37).
Optimizing each observation U k is constrained in that only the vehicle locations x k ⊂ u k can be selected and, consequently, only the expected information gain at each sample s can be computed; that is, we choose future waypoints x k such that As mentioned above, we assume independent errors in the likelihood functions shown in Eq. 16, giving Pr(Z) = Pr(Z Q ) Pr(Z R ). However, even solving for independent priors requires inverting all possible distributions at all sample locations s∈S; this is generally intractable.
As an efficient alternative, we assume that the target location for the next observation is the maximum likelihood position after the motion update, that is, Y k ¼ŷ À k . As a result, for a fixed viewpoint s, the expected range measurement E½RjX k ¼ s; Y k ¼ŷ À k ¼ dðs;ŷ À k Þ and expected bearing measurement E½QjX k ¼ s; Y k ¼ŷ À k ¼ bðs;ŷ À k Þ to the target are given. Moreover, the expected varianceŝ 2 Q is given by marginalizing out G such thatŝ 2 Q ¼ E½s 2 Q ðGÞ≃0:2 radian. In this case, the expected observation is a function of the viewpoint s:ẑ k ðsÞ ¼ fdðs;ŷ À k Þ;ŝ 2 Q ; bðs;ŷ À k Þ;ŝ 2 R g and the optimization over potential viewpoints s from Eq. 26 becomes To reduce computation time, instead of sampling every location in the workspace s∈ S as indicated in Eq. 27, we simply sample a uniformly distributed subset. Given the stochastic nature of observations, this does not appear to affect the quality of the planner.
Evaluating the performance of the system To validate our approach in real field conditions, we compared the performance of the robotic system to human tracker performance in locating swift parrots in the wild. The box plot in Fig. 4 collates the tracking performance from eight flights at four different sites near Temora, New South Wales, Australia. The data for individual trials are presented in table S1 for further detail. At each site, we obtained the GPS trajectory of a novice and an expert tracker performing manual wildlife telemetry. Once the human tracker had established the true location y for the target through visual confirmation, the UAV began its flight trial. The procedure for human trackers locating the bird is provided in Materials and Methods. After confirmation of the target by the human, a flight trial was performed for each type of tracker (novice and expert). Thus, we obtained two tests of the robot system at each site with a known true bird location.
For fair comparison, both the robot and the humans began trials from the same initial coordinates, with the target animal location unknown. This starting location was chosen such that the radio signal was strong enough to be measured by the onboard payload. To quantify performance, we compared the robotic tracker estimate ŷ k with the Euclidean distance between the GPS positions of human tracker and the final ground truth y at time t k . Specifically, in Fig. 4, the y axis shows (i) the error in the robot system's target estimate ŷ k for each observation k (in green and blue) and (ii) the Euclidean distance between the human tracker's location at time t k and their final estimate (in white and gray). The x-axis labels provide the observation number (e.g., observation 1) and the time interval during which that observation was taken for all flights (e.g., 46 to 60 s). Each data point in the box plot thus represents an observation from one of the eight trials. In two of the flights, the final location was uncertain because the bird moved during the flight and visual confirmation was lost. The case where data from these two flights were removed is labeled "Robot (certain)" (blue boxes).
The results shown in Fig. 4 and table S1 indicate that the robot was able to approximate the location of the target species in less time than human trackers. Specifically, by the second observation (between 143 and 289 s), the robot successfully located the bird to within 55 m in five of the trials (see table S1). In each of these five trials, the human tracker error was greater than the robots. Moreover, the final position estimates show an average of 51.47 m and are obtained in fewer than five observations (about 10 min). Table S1 indicates that the observation error does not monotonically decrease over time (e.g., in trial 7). This is not surprising because of the various environmental conditions that can cause observation noise; these include when the bird moves during readings as well as strong winds interfering with the UAV remaining stationary while spinning. In future work, we aim to be able to detect these conditions to minimize the effect of such noise.

Ecological significance of trials
The quantitative data from our trials demonstrate the potential of our system to shed light on the movements of small, highly mobile animals such as swift parrots. The Temora region was chosen because, based on a small number of sightings, it was assumed that numerous swift parrots had migrated to the area in the weeks leading up to the trial (see Materials and Methods). The results in this paper were obtained over a 7-day trial in the region, and the posterior estimates from all flights were aggregated to yield the heatmap shown in Fig. 5. The figure shows that the flocks used two distinct areas for foraging and roosting, including sites where the species had not previously been recorded.
Swift parrots are small, critically endangered migratory birds that depend on highly variable winter nectar resources. As a result, the small population (less than 2000 birds) spreads across vast areas of southeastern Australia each year in search of suitable food. Given their small body and hence tag size, as well as their capacity for highly variable and large movements, this species has not been successfully radio-tracked.

DISCUSSION
In this paper, we validated that our customized aerial robot can be used to perform autonomous wildlife tracking. We presented rig-orous mathematical derivation of all algorithmic components of the system, including an approach to computing the uncertainty of each bearing-only observation where heteroscedasticity is assumed. By creating a high point wherever the UAV was launched, our system performed comparably to, and often better than, skilled humans in tracking the critically endangered swift parrot (L. discolor).
Wildlife tracking is known to be an important but difficult problem, and tracking members of this species is particularly challenging because of their small size and highly dynamic movements. The ability of our system to track such animals thus exemplifies the capability of robotic wildlife trackers and the possibility of these systems to facilitate conservation management. Although we performed this study within a landscape that is relatively easy to traverse on foot (as discussed in Materials and Methods), the greatest benefits of this aerial tracking technology are likely to be realized within densely vegetated areas or rugged and dangerous landscapes. These situations require increased amounts of time and effort on the ground to locate tagged animals (relative to flat terrain) but no additional time or effort when tracking from the air. Tracking from the air enables more frequent direct line of sight with the tags and increases the likelihood of signal detection, which is a major challenge when radio-tracking wildlife.
Although previous work showed that the presence of aerial vehicles could disturb the target animals in some circumstances (49), we did not observe any such disturbances. The birds appeared to be more disturbed by the human tracker on the ground than by the robot flying nearby. Whereas birds would often move between adjacent trees when humans approached, they would often continue with observed behaviors when the robot was flying nearby.
There are many avenues of inquiry for improving the system hardware and decision-making algorithms. The signal detection range of our current system is about 500 m; increasing this range is a good target for further system hardware development. Algorithmically, multirobot extensions and long time-horizon planning with travel costs would allow for efficient search and tracking of numerous animals simultaneously. The problem of multirobot wildlife telemetry tracking has been partially addressed by designing optimal information-gathering algorithms without considering travel cost (19). However, these algorithms have yet to be used in real tracking experiments. A recent approach to information gathering for decentralized active perception allows for any general reward functions to perform distributed optimization (50).
When studying fine-scale movement patterns (e.g., of highly dynamic animals), it is desirable to maintain real-time information about individual trajectories. In robotics, this general problem is known as persistent monitoring. Existing approaches, however, often seek to maintain information about an entire, continuous environment (51) rather than monitoring a small number of discrete features (e.g., birds). Recent approaches partitioned the environment into discrete spatial locations and modeled the likelihood of observing birds by using Poisson processes (52). Extensions to this model have been  made where it is assumed that the presence of a robot interferes with the animals' behavior (53).

Robot system
The aerial robot system used in this work was originally presented in (22). We summarize key details here for convenience. The system comprised a commercial eight-rotor UAV (Ascending Technologies' Falcon 8), custom antenna array, sensor payload, and ground control station (GCS) laptop running ROS (54). All algorithmic component implementations were executed on the GCS. A radio link between the GCS and the UAV was allowed for telemetry and autonomous waypoint following.
The radio signal from each radio-tagged bird was received and filtered by custom electronics onboard the UAV and transmitted to the GCS (producing the sampled signal described in the "Observed and expected sensor data" section). After each observation and subsequent decision-making step, the GCS communicated a new waypoint to the UAV for autonomous navigation.
The design of our custom antenna array minimizes its vulnerability to sources of interference generated by the robot. In our field trials, we did not observe interference between the robot's flight systems and its receiver.

Experiments
Here, we present the procedure used to choose the survey location and describe its terrain. We then briefly explain the manual and robotic tracking procedures.
The trials were performed from 3 through 7 July 2017 in Temora, New South Wales, Australia. Before these trials, six birds were detected by an experienced volunteer undertaking targeted surveys in the surrounding Riverina region, where swift parrots are known to migrate on a regular basis. Follow-up surveys were conducted by the authors in late June, confirming that the survey location was suitable for this trial by detecting at least 30 birds. By the end of August, at least 200 swift parrots (10% of the global population) were detected in the area. The study site is an open, grassy, Box Ironbark woodland and thus relatively easy to traverse on foot. However, locating the birds was often complicated by logistical issues, such as limited road accessibility, fence lines, and different land tenures, including private property.
For the trials, several birds were captured, and subsequently, a BioTrack Pip Ag393 VHF radio tag was taped to their back feathers. Each tag transmitted on a unique frequency within the 150-to 152-MHz band that was preprogrammed into both the manual and the robotic receiver systems. The tags emitted an on-off key modulated signal with a pulse width of 10 ms and a period of 1.05 s. Moreover, they were lightweight (about 2 g) and hence low-power (less than 1 mW) transmitters due to the small size of the species. The use of the 150-to 152-MHz band is permitted for wildlife tracking in Australia; in other countries, alternative frequency bands could be used by retuning the receivers.
Manual tracking was undertaken by using a Titley Australis 26k VHF radio receiver system and a Yagi three-element handheld directional antenna (shown earlier in Fig. 1). The approximate location of a bird was identified by driving in an off-road-capable vehicle to different sites until a radio signal was audible from the receiver. Once a signal was detected, the tracker continued to point the antenna toward the strongest (loudest) signal while walking through the landscape. This procedure involved constant adjustment of the volume and the gain of the receiver and continued until the bird was sighted. The GPS trajectory followed by the manual tracker was recorded.
Once visual confirmation of a bird's location was achieved, the human tracker continued to observe the tagged bird to confirm the exact location while the UAV was launched from the same starting location as that of the human tracker. The UAV trajectory and raw sensor data were recorded in real time and later replayed to generate the figures reported. Each flight was performed at a constant altitude of 75 m (such that the canopy was cleared), and each observation took about 45 s to complete. For planning viewpoints, the UAV was constrained to choose locations within 300 m of the GCS (i.e., the starting position) for the pilot to maintain visual line of sight.
For Bayesian data fusion, the workspace S & ℝ 2 was discretized into a square, 300-by-300 grid, that is, I = 300 and J = 300. Each cell represented a 5 m-by-5 m area, and thus, the workspace extended 750 m in all cardinal directions from the GCS. We assumed a uniform prior on the target location and evolution model covariance S Y = s Y I 2 , where s Y = 20 m.