How to scale up from animal movement decisions to spatiotemporal patterns: An approach via step selection

Abstract Uncovering the mechanisms behind animal space use patterns is of vital importance for predictive ecology, thus conservation and management of ecosystems. Movement is a core driver of those patterns so understanding how movement mechanisms give rise to space use patterns has become an increasingly active area of research. This study focuses on a particular strand of research in this area, based around step selection analysis (SSA). SSA is a popular way of inferring drivers of movement decisions, but, perhaps less well appreciated, it also parametrises a model of animal movement. Of key interest is that this model can be propagated forwards in time to predict the space use patterns over broader spatial and temporal scales than those that pertain to the proximate movement decisions of animals. Here, we provide a guide for understanding and using the various existing techniques for scaling up step selection models to predict broad‐scale space use patterns. We give practical guidance on when to use which technique, as well as specific examples together with code in R and Python. By pulling together various disparate techniques into one place, and providing code and instructions in simple examples, we hope to highlight the importance of these techniques and make them accessible to a wider range of ecologists, ultimately helping expand the usefulness of SSA.

movements fundamentally affect other ecological processes, including population (Hamilton & May, 1977) and community dynamics (Costa-Pereira et al., 2022), transport processes (Abbas et al., 2012), disease spread (Merkle et al., 2018) and ecosystem processes (Doughty et al., 2016). Under current global change, it is becoming increasingly important to robustly predict changes in individual movement decisions and how these scale up to emergent spatial patterns of animal distributions.
Focusing on environmental conditions and resources, habitat selection methods aim to identify and quantify the link between animal movements and distributions and the environment (Fieberg et al., 2021) and are fundamentally based on a comparison between the distribution of environmental resources and the use of these by the individuals (Manly et al., 2002). A key limitation of such approaches lies in the definition of what is available to each individual (Buskirk & Millspaugh, 2006), as animals are fundamentally limited by their movement capacities and cannot reach every point in the landscape at every single movement step, an issue which has led to long-standing discussions and methodological debates in the literature (McClean et al., 1998;Northrup et al., 2013). This crucial limitation can be solved by integrated step selection analysis (iSSA), which allows simultaneous modelling of movement and habitat selection decisions by animals (Avgar et al., 2016), building upon the earlier technique of step selection analysis (SSA) (Duchesne et al., 2015;Forester et al., 2009;Fortin et al., 2005;Rhodes et al., 2005;Thurfjell et al., 2014). Not only has this fundamental methodological advancement lead to an explosion of the use of SSA and iSSA in recent years (Huggler et al., 2022;Northrup et al., 2022;Viana et al., 2018), and methodological extensions (Klappstein et al., 2021;Munden et al., 2021), but researchers have increasingly shown how the movement kernels parameterised during SSA can be 'scaled up' to predict broader-scale space use patterns (Avgar et al., 2016;Fieberg et al., 2021;Potts, Mokross, et al., 2014;Potts & Schlägel, 2020;Signer et al., 2017). Even though this markedly increases the level of understanding and the quality of predictions which can be obtained from animal movement analyses, such upscaling of SSA is seldom done by many studies using SSA, perhaps due to a lack of knowledge or perceived methodological difficulties.
The focus of this methods guide is to elucidate the various methods for scaling up from step selection to utilisation distributions and other broader space use patterns. We aim to guide the reader through various techniques used for scaling up, when they can be applied and when not, and summarising the pros and cons when more than one technique is potentially applicable. We also give example code, in both R and Python, of some simple case studies, to give the reader a practical way of getting started with these techniques.
Throughout, we will distinguish between two different types of movement models, which each require slightly different methods of analysis. The first type models the movement decisions animals make due to spatial variables that remain essentially unaffected by the animals' presence (e.g. terrain, weather). In these situations, one can use a correlative model, whereby the animal movement decision is the response variable and the unaffected spatial variable is the explanatory variable, and propagate that model forwards through time, as exemplified by studies such as Potts, Bastille-Rousseau, et al. (2014) and Signer et al. (2017).
In the second type of movement model, there are variables that both affect an animal's movement and are affected by the same animal's presence. Examples include between-animal interactions, whereby the presence of individual A (either in the present or in the recent past) affects the movement of individual B, but in turn the movements of individual A are affected by the presence individual B (Couzin et al., 2002;Giuggioli et al., 2013). Another example is where animal movement is affected by the presence or absence of some resource that they then consume and deplete. Thus, these animals also affect the resource landscape by their presence, creating a feedback loop between animal location and landscape variable (Riotte-Lambert & Matthiopoulos, 2020), which may also be mediated by memory . In all these situations, while it is possible to use correlative models to make inference about the effect of variable U on variable V (e.g. the presence of individual A on the movement of individual B), these correlative models cannot be reliably propagated forwards in time to predict broad space use patterns, since the reality is that both variables affect each other.
There is no a priori sense in which one is the explanatory variable and the other the response variable. Instead, scaling up to broader spatiotemporal scales requires a dynamic modelling approach, for example via individual-based models (IBMs) of interacting individuals (Avgar et al., 2013) or systems of partial differential equations (PDEs) (Moorcroft & Lewis, 2006;Potts & Lewis, 2019).
Analysing these dynamic models is much more complicated than the case of purely correlative models, and we cannot give a complete hands-on guide here. Our approach will therefore be to provide a gateway into these techniques by explaining how to construct such IBMs and PDEs from the output of step selection, and pointing to the possible methods of analysis that might be used. In the case of IBMs, we do provide some code to help the reader get started.
However, in the case of PDE models, there is a world of analytic techniques, which are standard tools for many applied mathematicians, but may not be familiar to those without an applied mathematics background. Our guide on PDE techniques will not cover that vast background itself, but rather will guide collaborations between ecologists and applied mathematicians by showing how to interface questions about spatial arrangements of ecological systems with mathematical tools for deriving emergent spatial phenomena. We contend that such interdisciplinary collaborations are perhaps the only way forwards for answering many important questions in spatial ecology.
The 'scaling up' procedure we will describe is outlined as follows.
The first step is to parametrise a step selection function from empirical data. This step is typically called step selection analysis. A recent 'How to' paper already exists that covers the practicalities and interpretation of SSA (Fieberg et al., 2021), so we will be relatively brief here, referring the interested reader to that paper, also noting the recent review by Northrup et al. (2022). The second step involves using the parametrised step selection function to infer broad-scale space use patterns. We will describe various existing techniques for this, which use different mathematical formalisms. Some of these techniques give exact results and some are approximate, some stochastic and some deterministic. Moreover, not all of these techniques can be used in every situation. Therefore, we will give a guide as to which technique can (or should) be used in which situations.
Finally, we will explain the sort of information one can gain from this approach. This includes (a) predicting whether or not one might expect steady state space use distributions to emerge (the alternative being those that are in perpetual flux), (b) predicting the shape of such emergent steady state distributions (e.g. stable home ranges) where they exist, and (c) using the emergent spatial patterns as a goodness-of-fit test, which can help detect missing features in step selection models. The methods presented throughout necessarily involve quite a bit of mathematical formalism. To aid the reader, we include various supplementary appendices that give specific examples of our mathematical formalisms, as well as code for implementing the methods in these simple examples, in both R and Python.

| S TEP S ELEC TI ON MODEL S OF ANIMAL MOVEMENT
The principle aim of SSA is to understand the drivers of animal movement. Such movements can be affected by a very wide range of possible phenomena, including food distribution, predator avoidance, social interactions, corridors to ease passage, physical barriers, topography and many more, together called movement covariates. All of these can potentially be revealed through a step selection approach, as long as one has the right data. SSA is a type of resource selection analysis (RSA), that is, it is a way of estimating the probability that an animal will use a given spatial area, as a function of that area's 'resource value ' (e.g. access to food, mates, resting area, thermal refuge). SSA specifically focuses on the movement between two locations. It compares the observed movement step to all other locations the animal could have reached during that same time step, given the movement capacity of the animal, aiming to determine why the animal took the particular observed step instead of the many available alternatives.
With the increasingly detailed data on animal movement and its covariates that has become available over the past years, SSA has become a very popular approach for inferring drivers of movement.
There have also been various variants of SSA introduced, such as iSSA (Avgar et al., 2016) and time-varying iSSA (Munden et al., 2021).
However, for simplicity, we will refer to all of these as SSA unless there is a good reason to be specific. To understand how to use SSA for inference, there is a recent 'How to' paper, which also deals with habitat selection more broadly (Fieberg et al., 2021). Those unfamiliar with SSA may want to read this paper before continuing. Other useful papers related to inference in SSA are Thurfjell et al. (2014) and Northrup et al. (2022).
Here, however, our focus is different. Instead of focusing on inference, our aim is to give a guide for how to use the output of SSA for predicting space use patterns, given the information provided by SSA on both an animal's preferences for resources (i.e. movement covariates) and its movement behaviour. In more formal terms, we start by observing that a by-product of SSA (or iSSA) is the parametrisation of a movement kernel, describing the probability density, p z| x, x , t of moving from one location, x, to another location, z, between times t and t + τ, given all the information we have on the movement covariates, and given that the animal travelled to x on a bearing of x . We want to examine how to derive a utilisation distribution (UD) from this movement kernel, which is the probability distribution of finding the animal at any given location, and closely related to the concept of 'home range' . This requires setting up quite a bit of mathematical notation, but a picture of what is going on underneath these equations is given in Figure 1, and we give some foundational examples in Appendix A to help the reader become comfortable with the mathematical formalisms. We will also focus on location data that are recorded at fixed intervals at a relatively low frequency (e.g. one every few minutes or hours), but see Appendix B for modifications away from this case (following Munden et al., 2021;Potts et al., 2018).

| The movement kernel from step selection
Right away, let us write down the general form for our movement kernel, which combines two sets of equations, one to estimate the intrinsic movement behaviour of the animals and one for estimating the selection for resources, and is written as We now explain all the terms from Equation (1) in detail. First, Z(x, z, t) is a vector of movement covariates. We write Z = Z 1 , … , Z n where, for each i = 1, …, n, the function Z i = Z i (x, z, t) may depend upon any combination of the start location x, the end location z, or the current time t. However, not every Z i need depend upon all of these aspects.
For example, if the ith covariate is something constant in time over the duration of the study, for example, height above sea level, then Z i would not depend upon t. Also, Z i may depend upon the start location (at time t), the end location (at time t + τ) or both. If Z i depends on both x and z, this implicitly means it could depend upon any of the points between x and z (note that this does not imply that the animal has actually moved in a straight line between x and z, merely that it is possible to let Z i depend on any points on this line).
Second, is a vector denoting the strength of the effect of each covariate, Z i . We use the subscript τ in Equation (1) to emphasise that may depend upon τ (Fieberg et al., 2021). However, for notational convenience, we will usually drop this subscript and write = = 1 , … , n . The function exp ⋅ Z(x, z, t) is sometimes called a step selection function (Fortin et al., 2005), but this term has also been used synonymously with the movement kernel (Forester et al., 2009;Potts, Bastille-Rousseau, et al., 2014). Note that we can expand the scalar product of and Z(x, z, t) in Equation (1) to give Third, z, x, x , t is a selection-free movement kernel, and has historically taken various functional forms, discussed in Fieberg et al. (2021, Section 3.1). The function ϕ τ is sometimes referred to as a resource-free or resource-independent movement kernel.
However, here we will use the term 'selection-free', as in principle ϕ τ could itself depend upon resources and other environmental features (Avgar et al., 2016). This kernel can be thought of as describing the intrinsic ability of an animal to move in a given environment, disregarding any decisions the animal makes about where to move.
In the simplest examples, ϕ τ is just a decaying function of the speed have a more complicated dependence on the landscape and/or time (Avgar et al., 2016). Note that the possible dependence on x allows for ϕ τ to incorporate a distribution of turning angles, allowing for correlated movement.
Finally, Ω is the study area and the denominator in Equation (1) ensures that the movement kernel integrates to 1 with respect to z, making p z| x, x , t a genuine probability density function. It is sometimes convenient to write Implicit in Equations (1) and (2) is the fact that the movement kernel is truncated so that it is not possible for an animal to move outside of Ω, which leads to a boundary condition on any simulation of Equation (1), sometimes called a 'no go' condition (Potts, Bastille-Rousseau, et al., 2014). One could also choose other boundary conditions (e.g. reflective or periodic) but we will stick with this boundary condition throughout, for simplicity. Equation (1) appears as equation (13)

| Incorporating animal interactions
While Equation (1) models the effect of covariates on an animal's movement, some covariates are also affected by the animal's movement. This means the two features of interest, animal locations and covariate values, feedback on one another. A key example is when an animal's movement is affected by the space use of a second animal, whose movement is, in turn, affected by the space use of the first animal. Such two-way interactions may be social, competitive, mutualistic or predator-prey. In any such case, SSA will lead to a different movement kernel for each animal, which interact with each other.
This situation leads to a system of coupled movement kernels, sometimes called 'coupled step selection functions' Potts, Mokross, & Lewis, 2014). If we have N interacting animals, then the general form of such a system is where j = 1, …, N indexes the animals.
An example system of coupled movement kernels is where each animal j has at least one covariate Z i,j (for some i between 1 and n) that denotes the recent space use of a different animal, say animal j ' where j ' ≠ j (perhaps mediated through terrain marking or memory F I G U R E 1 From a movement kernel to a utilisation distribution. Panel (a) shows the movement kernel for a simulated animal that has a biases towards (i) better resources, denoted by darker green, (ii) a central place at x C and (iii) to continuing in the same direction (i.e. correlated movement). It has arrived at location x at time t on a bearing of x (measured clockwise from north). The subsequent location, z, is sampled from the probability distribution given by the contours. Specifically, the animal has a 95% (respectively 50%) probability of being located within the blue (respectively magenta) contour at time t + τ (i.e. after one time step). Panel (b) shows the utilisation distribution after 2000 time steps, where the blue (respectively magenta) curves show 95% (respectively 50%) kernel density estimator, mimicking how home ranges are often calculated from field data. This paper aims to explain various techniques for mathematically deriving, hence predicting, the utilisation distribution (panel b) from knowledge of the movement kernel (panel a).
(a) (b) [Moorcroft & Lewis, 2006;Riotte-Lambert et al., 2015]). This means that the movement of animal j is affected by the locations of animal j ' . But if, in turn, animal j ' has a covariate, say Z i',j' , that denotes the space use of animal j, then the two animal's movements depend on one another, generating a feedback loop between them. This is what we mean by saying the equations are 'coupled'.
In principle, Equation (3) can be parametrised using SSA in exactly the same way as Equation (1), for each individual j = 1, …, N.
However, more needs to be said to put this into practice, as it is not a trivial task to determine the 'recent space use' of each animal. Indeed, even conceptually the idea of 'recent space use' begs questions. How recent? What constitutes 'space use'? The precise one-dimensional path the animal has travelled or some broader area demarcated by that path? How do we infer this area from data?
None of these questions have a single catch-all answer. However, an important step forward was made by Schlägel et al. (2019). Their method starts using the notion of an occurrence distribution (OD) to describe the 'recent space use' of an animal .
The OD is constructed by taking consecutive measured locations of an animal, over a user-defined time-frame, and building a theoretically optimal estimation of the distribution of actual locations. The R package ctmm enables users to construct this OD with just a few lines of code, as explained in Calabrese et al. (2016). However, it can take a few minutes for the package to fit the model. Furthermore, the output can be quite finely resolved, and in practice animals may respond to a more smoothed version of the OD. Therefore, it is worth users also experimenting with using either a spatial averaging of the OD, or something more intrinsically smoothed-out, like kernel density estimation (KDE; Worton, 1989) or autocorrelated KDE (AKDE; Fleming et al., 2015).
Whichever method is chosen, to use the OD for SSA simply involves considering the OD in exactly the same way as any other environmental layer that varies through time. To put this in mathematical notation, let us denote the value of the OD of animal j at time t and location by O j,t ( ) (where stands for either x or z). If we hypothesise that this OD covaries with the decision of an animal j ' to move to a location then we set Z i,j� (x, z, t) = O j,t (z) for some i (Schlägel et al., 2019). Alternatively, if we hypothesise that O j,t ( ) covaries with the decision of j ' to move from a location then we set for some i. Following this procedure for each pair of animals j, j ' = 1, …, N leads to a collection of covariates in a system of coupled movement kernels (Equation 3), which can be parametrised using SSA (Avgar et al., 2016;Fieberg et al., 2021).

| EMERG ENT PAT TERN S FROM SC ALING UP
We now turn to the main topic of this paper, which is how to scale up from the movement kernel of Equation (1) to a description of broad-scale space use patterns. For this, we use the concept of a utilisation distribution (UD), which measures the probability of finding an animal at a location x at time t. We denote the UD by u(x, t).
We will sometimes be interested in the UD as it changes over time, but we will also examine situations where it is possible to derive a steady state UD, u * (x), which denotes the limit as t → ∞ of u(x, t) . Mathematically, it is not always the case that u * (x) exists, as u(x, t) may exhibit oscillatory behaviour at long times, or more complicated spatiotemporal fluctuations (Potts & Lewis, 2019). However, where it does exists, and where this limit is approached in an ecologically relevant time frame, the UD corresponds to what is usually called a home range.
In this section, we will examine how to derive both exact and approximate expressions for u(x, t) from Equation (1). We will explain the situations in which one can use each of these expressions, and the various benefits and drawbacks of each. In general, while exact expressions are theoretically ideal, they may be either difficult to compute in practice, not amenable to mathematical analysis, or only apply to certain subcases of Equation (1). Approximate expressions are therefore also valuable. We will explain how to calculate each of the expressions in practice and also provide a gateway into mathematical analysis. Finally, we will explain how to determine whether u * (x) exists or not, and how to compute it where possible.

| Mathematical formalisms for scaling up: Integro-difference equations, PDEs and IBMs
Here we describe how to use mathematical methods to derive the space use distribution of an animal, such as a home range, starting from the SSA-estimated movement kernel (combining the movement capacity of the animal and the selection for resources in the external environment). We then show how it is then possible to predict the space use distribution of an animal in a landscape with a certain distribution of resources, based on knowledge of its step-tostep movement decisions.
Perhaps the most general form linking a movement kernel to a UD is the so-called master equation (Merkle et al., 2017;van Kampen, 1981), which is an example of an integro-difference equation (IDE). We will begin by assuming that p τ is independent of x (i.e. no correlation in movement), so that p z| x, 2. then multiply this by the probability density of moving from z to x in a timestep of length τ, given by p (z | x, t), 3. do this for all x and integrate, 4. and then the result is the probability distribution at time t + τ.
It is also possible to incorporate the aspect of correlated movement (i.e. dependence on x ) with some extra notational baggage. We explain this in Appendix C, but focus here on uncorrelated movement for ease of explanation.
One approach to calculating u(x, t) is to solve Equation (4) numerically. In practice, this involves discretising the study region, Ω, and turning the integral into a sum. Let us denote by S a set of points obtained from discretising Ω. This discretisation can be chosen by the user, but in practice it makes sense to use a grid that is related to the underlying environmental covariates, as these themselves often arrive as a discrete-space raster. Then, for any pair of grid cells, s and s ' , write the probability of moving from s ' to s as P τ (s|s ' ,t).
Next let U(s,t) denote the probability of finding an animal at grid point s at time t. Then, given this discretisation, Equation (4) becomes An example of calculating Equation (5)  Equation (4) gives an exact solution for the time-evolution of the UD, given a movement kernel. Therefore, in principle, it gives a complete description of how to scale up from a movement kernel to a UD. Furthermore, Equation (5) gives a way of calculating Equation (4), with only some minor approximations due to the discretisation. So it might feel like the job is done. However, there are two key downsides to these equations. The first is that they are not particularly amenable to exact mathematical analysis, so there is not much exact theory that one can draw on (but see , which we discuss in Section 3.2). The second is that a numerical solution can be very time-consuming. Calculating Equation (5) requires calculation of P τ (s|s ' ,t) for every s,s ' ∈ S, and P τ (s|s ' ,t) itself requires computational of a numerical integral, the denominator in Equation (1). We now deal with each drawback in turn and how to mitigate against them.
First, to make use of mathematical theory, it is beneficial to derive a PDE from Equation (4), as PDEs are in general far more amenable to mathematical analysis than IDEs. However, it does require approximations to be made. Potts and Schlägel (2020) examined the PDE limit in a situation where the master equation is of the form in Equation (4). Additionally, they made two assumptions about the movement kernel in Equation (1). First, ϕ τ must be function of only Second, Z must be function of only z (the end of the step) and t. They also take the limit as τ → 0. Given these assumptions, Potts and Schlägel (2020) showed that the UD is approximately governed by the following PDE where D τ is the diffusion constant, calculated as x ′ is a dummy variable, and | x � | is the length of the vector x ′ . The τ → 0 approximation means that any directional autocorrelation in the data is ignored. Often, this is a perfectly adequate assumptio for predicting broad-scale space use patterns (e.g. the probability density function of a correlated random walk is asymptotically the same as a Brownian motion). However, if a user wants to use Equation (6) to make quantitative predictions for a particular study animal, we recommend they truth-test the PDE approximation by running an IBM that includes any directional autocorrelation detected in their data. Finally, from the perspective of calculating u(x, t) numerically, there is in our experience no advantage to using this PDE over the master equation. However, the analytic tools for studying PDEs are vast and can give crucial qualitative information about space use, which we will discuss more in Section 4.
As an alternative to numerical solutions of either the PDE in Equation (6) or the IDE in Equation (4), a conceptually simple approach is to use a stochastic IBM (Avgar et al., 2016;Signer et al., 2017). This simply involves simulating stochastic realisations of the movement kernel in Equation (1)  amt with those calculated from the IDE and PDE methods described here needs to take this discrepancy into account. In the context of home range calculations, the amt method is closer to what is done in practice when measuring home ranges from data, as typically people will measure home ranges using highly autocorrelated movement data.
However, conceptually, a home range is usually thought of as reflecting the probability density function of an animal's locations, which is more accurately described by u(x, t).
While IBMs are conceptually simple, it can be time-consuming to use them for calculating u(x, t). For this, it is necessary to simulate the movement kernel up to time t sufficiently many times to obtain a good measure of the probability distribution, which is the distri- drift up the gradient If you do want to calculate u(x, t) using IBMs but without requiring prohibitively intensive simulations, it is instead possible to combine simulations with a smoothing method, like KDE (Potts, Mokross, et al., 2014). To do this, simulate the movement kernel perhaps a few hundred times, each time starting at the same location and running the simulation to time t (this much can be done in amt). Then take the end-point of each simulation to give a dataset of genuinely independent samples of u(x, t). Finally, construct the KDE of this dataset, which is an estimation of u(x, t).
In all these methods, if comparing u(x, t) to data, it is advisable to ensure that the value of u(x, t) is as small as possible at the boundary of the study area, Ω, if at all possible. This avoids any bias that might be caused be the effect of the 'no go' boundary conditions (see sentences after Equation (2)).

| The steady state UD in the absence of animal interactions
One of the most oft-studied emergent spatial patterns from animal movement data is the home range, that is, the space use distribution often observed in nature, where animals restrict their movements over time to a certain area in space, and do not roam over the entire available landscape . Mathematically, this can be thought of as the steady state of a UD, defined to be a configuration that does not change over time. As such, if u(x, t) satisfies Equation (4), then the steady state of u(x, t) is a function, u * (x), satisfying the following equation if such a function exists. Here we have to assume that p (z| x) = p (z | x, t) , that is, the movement kernel does not change over time, otherwise no steady state can exist. That said, ecological systems can exhibit multiple time-scales; for example a movement kernel may give rise to a UD that becomes close to a steady state over a season, but next season the animal's movement may change, causing the UD to change . In this case, we can assume that p (z| x) = p (z | x, t) for the duration of a season and use the above techniques to calculate the seasonal steady state; but over the course of a year, the UD fluctuates.
In this section, we will assume that p (z| x) is independent of u(x, t) . That is, the probability of moving to a specific location does not depend upon the past or present UD. This means that Equation (9) is linear in u * (x), which is a requirement for the techniques presented in this section. Notice that this linearity requirement precludes the case where we have multiple coupled movement kernels, like in Equation (3) (whereby the movement of one animal depends on the UD of another, whose movement depends on the UD of the first animal). We will return to this case in Section 3.3.
The linearity requirement also precludes memory effects whereby animals are attracted to their own UD.
Let us now write the discrete space version of Equation (9), following the notation of Equation (5), as As described in Section 3.1, Equation (10) is what we tend to calculate in practice. This equation can, in theory, be calculated exactly using matrix inversion (Appendix E) but to our knowledge this has never been done in the context of step selection, perhaps due to computational intensity.
Another exact method, which is also relatively computationally efficient, is that of Barnett and Moorcroft (2008). This, however, relies on two assumptions. The first is that z, x, x , t can be written The second is that the functions Z i (x, z, t) only depend upon the end-point of the step, that is, Z i (x, z, t) =Z i (z) for some function Z i (z). With these assumptions in place, Barnett and Moorcroft (2008) show that the following exact expression for u * (x) holds We show how to compute examples of this, with code, in Appendix F.
It is interesting to look at two limiting cases. First, if ψ τ is a uniform distribution, meaning that animals can move over the whole of Ω in a single timestep, then (Appendix F) which is just the usual expression for a resource selection function (Manly et al., 2002). Although this assumption on ψ τ (l) is quite restrictive, there are real examples. For example, an urban fox can often traverse its whole territory in just a few minutes , so if Ω were the territory of an urban fox then it makes sense to use a uniform distribution for ψ τ .
The other extreme is where ψ τ is arbitrarily narrow, so the animal is making distinct movement choices over much smaller spatial scales than Ω, as is often the case with animals with very large home ranges. In this case, we have (Appendix F) which is identical to Equation (12) but with switched for 2 . In other words, the effect of selection on space use doubles as one moves from selection on a very broad spatial scale to a very narrow spatial scale . Notice that this factor of 2 also appears in the PDE from Equation (6) (before D τ ); indeed, the steady state of Equation (6) is precisely Equation (13) (Potts & Schlägel, 2020). Figure 2 gives an example of the steady state UD estimations from Equations (11)-(13) for a movement kernel of the following type (9) u * (z) = ∫ Ω p (z| x)u * (x)dx, where R(z) is a resource layer and x C is a localising point (e.g. a den or nest site). This models movement in a heterogeneous environment where there is additionally some localising tendency towards a single point, such as a den or nest site. Observe from Figure 2 that Equation (12) overestimates the UD size but Equation (13) is an underestimation. We give instructions and example code for reproducing Figure 2 and calculating Equations (11)-(13) in Appendix F.

| UDs for interacting animals
In Section 3.2, we examined how to find the steady state distribu- In such situations, the techniques of Section 3.2 do not directly apply. Indeed, there is no guarantee that a steady state emerges at all, and one may instead find UDs that fluctuate in perpetuity, never settling (Potts, Giunta, et al., 2022). To understand these patterns, there are two broad approaches: numerical and analytic. For the numerical approach, one could use the IDE formalism from Equation (1) or the PDE of Equation (6). However, we recommend using an IBM instead. The principal reason for this is that IDEs and PDEs only keep track of the probability distribution of animal locations, but IBMs keep track of the actual location of animals (Wang & Grimm, 2007).
This has two advantages. First, if performing numerical simulations, one might as well keep as much realism in them as possible (i.e. why not use an IBM?). Second, animals will respond to the actual (past and present) locations of themselves and other animals, not a distribution that reflects the probability of all possible locations that each animal could have taken. For analytic approaches, PDEs are the best tool and we will discuss this more in Section 4.
Writing code for an IBM depends a lot on the specific situation that is being modelled, especially for interacting objects. We give a basic example in Appendix G, of animals that have a mutual avoidance tendency and attraction to a single static resource layer, to help the uninitiated get started. However, we caution the reader that construction of an IBM for their specific situation is likely to require F I G U R E 2 Steady state utilisation distributions (UDs). Panel (a) shows locations of a simulated animal with movement process corresponding to Equation (14) with the resource layer shown in the background (darker green means higher quality resources). Panels (b-d) overlay this with predicted UDs from Equations (11)-(13), respectively, shown as a contour line surrounding the 95% kernel of the UD estimation. Panel (b) is an exact steady state solution to Equation (8), so captures the space use well. Panels (c), proportional to exp ⋅Z(x) , and panel (d), proportional to exp 2 ⋅Z(x) , respectively over-and under-estimate space use. Here, λ = 0.2, β C = 0.2, and β R = 1.5, x C = (50, 50).
quite significant thought and modification/re-writing of this simple model, which will depend on the particular structure of Equation (1).

| Goodness-of-fit analysis from emergent patterns
Having described tools for ascertaining emergent spatial patterns from models parametrised at a finer spatiotemporal scale, we now turn our attention to what we can learn from this process. The principle aim is to examine the extent to which these fine-scale processes capture the observed broad-scale patterns. This process can help reveal missing features from the model and generate new hypotheses ).
An example of this is shown in Figure 3 using simulated data. In this example, we assume the following movement kernel describes the movement of animal j, for j ∈ {1, 2, 3, 4} where R(z) is a resource layer, x j,C is a localising point for animal j and O j� (z, t) is the OD of animal j ' at time t. Figure 3a shows the KDE distributions of each simulated animal, moving according to an IBM based on the movement kernel in Equation (15). Figure 3b-d shows the distributions of simulated animals where one of the covariates is missing. There are some technical considerations when constructing an IBM of animals that interact through their OD . In short, one needs to think about how to construct O j� (z, t) at each step of the simulation, which involves not just the locations where the animal makes a turn but also locations in between turns. One way to deal with this is to simulate a stepping-stone process (Avgar et al., 2013(Avgar et al., , 2016. Details of how we have constructed such a stepping-stone process from an example system of coupled movement kernels are given in Appendix G. Comparing panels in Figure 3 reveals that a failure to incorporate social interactions (i.e. β j,j ' = 0) leads to more overlap between home ranges than is actually the case (compare panels (a) and (b)), a failure to incorporate localising tendency (i.e. β j,C = 0) leads to UDs emerging in the wrong place (panels (a) and (c)), and a failure to include the resource layer (i.e. β j,R = 0) leads to UDs that fail to grasp the environmental heterogeneity in R(z) (panels (a) and (d)). This is perhaps quite obvious in the omniscient situation of simulated data.
However, if in a real situation using empirical data, a researcher has not realised about one of these features, and then has parametrised a model that does not include that feature, then comparing empirical data on space use to emergent patterns from simulations of that model can help reveal this missing feature .

F I G U R E 3
The effect of missing covariates. Empirically parametrised individual-based models (IBMs) can be used to detect missing features from a step selection model. Panel (a) shows some simulated data from four animals in a 100 × 100 box, whose movement is governed by three things: Mutual avoidance, attraction to resources and central place attraction. The green shades represent the resource layer (darker green implies better resources), the contours give utilisation distribution (UD) of animal locations, one colour for each animal ( As well as visual examination of discrepancies between data and IBM output, various metrics can be calculated to assess goodnessof-fit. Two possible metrics are to (i) compare the UD or OD sizes between the data and the IBM output, and (ii) to measure the UD or OD overlap between data and IBM. UD size can be measured using any number of metrics, but the locational variance is perhaps the simplest, as it is proportional to standard measures, like 95% KDE, but does not require interpolation or smoothing. Following Fieberg & Kochanny (2005), we recommend using Bhattacharyya's Affinity to measure UD overlap. Details of all these methods are given in , which also mentions relations to existing goodness-of-fit test for SSA that do not examine emergent spatial patterns (Fieberg et al., 2018).

| E XPLORING EMERG ENT PAT TERN S
While IBMs are valuable for comparing model output with data (Section 3.4), they are not always so amenable to mathematical analysis. This is where PDEs come into their own. There is a wealth of techniques for analysing PDEs in the applied mathematics literature (Buttenschön & Hillen, 2021;Evans, 2022;Murray, 2003;Robinson & Pierre, 2003). Many of these are quite technically demanding, and so our best 'how to' suggestion for those who do not have a deep background in applied mathematics is to form collaborations with those who do. The trick for forming such collaborations is to know broadly the sort of questions that can be answered by mathematical techniques and how to phrase them in the language of applied maths in a way that might entice collaborators while keeping firmly grounded in ecological and natural history knowledge. Our philosophy here will be to try to explain how to do this, with the ultimate aim of helping readers form useful collaborations, rather than doing the mathematics themselves.
Perhaps the most elementary technique in pattern formation analysis of PDEs is linear stability analysis (LSA; also sometimes called Turing pattern analysis, after Turing, 1952). This technique asks the following question: if a system is homogeneous (in our case, this means each animal is equally likely to be found anywhere on the whole landscape) and is then perturbed slightly (which will happen naturally as animals move and interact), will those perturbations grow in time? In practice, this means that UDs will segregate or aggregate spontaneously. Therefore, it can be used to answer questions such as whether avoidance processes are sufficiently strong to cause territorial segregation , or whether attraction processes are sufficient to enable aggregations to emerge spontaneously (Briscoe et al., 2002). Such analysis may also help researchers to separate-out the effect of social interactions on spatial distribution patterns from environmental interactions.
A second question that can be answered by LSA is: as the patterns grow from a homogeneous state, will they oscillate? This means that any segregations or aggregations that emerge will not be stationary but move around in space. This is of key importance in measuring UDs from data, because if a collection of animals have movement processes that lead to perpetually fluctuating space use patterns, then this has to be taken into account when measuring UDs from data. For this, one has to consider a set of locations across a time window. The size of this window should be determined by the natural period of any emergent oscillatory patterns.
The next question, which requires tools beyond LSA, is whether patterns are likely to form suddenly as parameters change. This means that a small environmental perturbation might give rise to a dramatic change in the structure of UDs (i.e. a 'tipping point'). Figure 4b gives an example of this (in the context of IBMs) whereby an increasing tendency for attraction to conspecifics leads to a sudden switch in spatial distribution from homogeneous to highly aggregated (which could, e.g., be driven by increased fear of predation).

F I G U R E 4
Transition from homogeneity to heterogeneous patterns. Panel (a) (respectively panel (b)) shows simulation output from an stochastic IBM consisting of two mutually avoiding (respectively mutually attracting) populations. When the strength of avoidance (respectively attraction) is low, the populations are well mixed, indistinguishable from non-interacting populations. As this strength is increased past the Turing bifurcation point, the populations begin to segregate (respectively aggregate). While the segregation patterns in panel (a) emerge in a continuous fashion, the aggregation patterns in panel (b) appear quite suddenly, and a hysteresis effect is observed. The precise definitions of the quantities in these plots are given in Appendix H.

(a) (b)
The existence of these sorts of tipping points can be ascertained by a variety of techniques, perhaps the most well used of which is weakly nonlinear analysis (others include Crandall-Rabinowitz bifurcation theory and centre manifold theory, and different mathematicians have different tastes regarding which to use, so it is valuable to be aware of the nomenclature). These techniques can determine whether the point at which patterns start to form (known as the bifurcation point) is supercritical, meaning that the size of the patterns is continuously dependent on the underlying process (Figure 4a), or subcritical, meaning there is a discontinuous switch from no patterns to patterns (Figure 4b).
The subcritical case is often accompanied by a hysteresis phenomenon, whereby the existence of patterns depends upon the history of the system. For example, in Figure 4b, in the range a min < a < a max , it is possible to see either aggregations or homo- for doing this, tailored to the specific case of understanding emergent space use patterns from animal movement processes (Potts, Giunta, et al., 2022). This research shows both how to relate IBMs to PDEs in a rigorous fashion, and gives methods for determining whether patterns emerge, whether they are stationary or fluctuating, aggregative or segregative. Indeed, Figure 4 gives an example of the output of such techniques. In particular, Figure 4a shows the analytically computed Turing bifurcation point of the PDE system, which is close to where the IBMs bifurcate from homogeneous to heterogeneous patterns.

| Why scale up?
We have described various existing techniques for scaling up from step selection to broader-scale space use patterns, some of which require some relatively involved mathematical and/or numerical analysis. So what is the value in learning and using these techniques?
We highlight two key points. Underlying both of these is a conceptual move from uncovering predictors to building predictive models. Correlative models, such as SSA, RSA and SDMs, are well developed for uncovering predictors, but they are less developed regarding making actual predictions.
Accurate model predictions require that the underlying models both capture the dynamics correctly (Point 1 above) and contain all the necessary mechanisms required for accurate predictions (Point 2 above). Prediction in spatial animal ecology is notoriously tricky (Hao et al., 2019) and requires significant future development. However, by plugging these two conceptual gaps, the methods described here should provide an important step in improving the predictive power of animal space use models.

| Future directions
Building models of animal movement that are able to predict broadscale space use patterns is a fundamental goal of movement ecology, with a huge range of potential applications to conservation and management (see Section 1). However, there is still a long way to go before accurate predictions from fine-grained movement models become widely possible. We end with a few ideas for where we would like to see these methods going.
1. Application across taxa and geography. Many of the methods described here are relatively new and have yet to be applied in earnest to a wide range of datasets. Broad application across different data and research groups is perhaps the best way to ascertain the value of these methods and discover practical ideas for improvement. Ideally, these would include different taxa, populations in contrasting environments, and data sampled over different time-scale and intensities.
2. Biologging data. Alongside movement, modern datasets often contain a wealth of biologging data, such as heart rate, acceleration, body temperature and neurological sensors (Williams et al., 2020). These can help inform the behavioural mode of animals, which, in turn, affects how they move and use space. Methods to incorporate these into movement models, ascertaining the extent to which they feed up to broad space use patterns, is a key research frontier (Klappstein et al., 2021).

Continuous time formulations.
Animals move in continuous time, they may make decisions at any point in time, data may be gathered at completely different points in time, all of which makes a continuous time framework appealing (Parton et al., 2016). In Appendix I, we discuss some current efforts to this end, and some possible ways forward. It is also worth mentioning that methods from Sections 3.4 and 4 are not implicitly tied to a discrete time framework, so it would be valuable to examine how to use these in the context of continuous time models.
4. Mathematical analysis of emergent phenomena. Our mathematical understanding of emergent phenomena from moving, interacting populations is still in relative infancy (Eftimie, 2018;Potts & Lewis, 2019). Specifically, efforts are required that explicitly tie these into empirically measured movement processes, for example using methods like those in Moorcroft and Lewis (2006) or Potts and Schlägel (2020). This will require strong collaborations between ecologists and applied mathematicians.

Jonathan R. Potts and Luca Börger developed the idea for this
Research Methods Guide. Jonathan R. Potts led the writing of the manuscript. Both authors contributed critically to the drafts and gave final approval for publication.

JRP acknowledges support of Engineering and Physical Sciences
Research Council (EPSRC) grant EP/V002988/1. We thank Roberto Salguero-Gomez for encouraging us to write this guide and for valuable comments during the process of writing. We also thank Brian J. Smith and an anonymous reviewer for helpful comments.

CO N FLI C T O F I NTE R E S T
The authors have no conflicts of interest to declare.

DATA AVA I L A B I L I T Y S TAT E M E N T
Code available from the Zenodo repository https://doi.org/10.5281/ zenodo.7225910 .