Spatial Interaction Models
The following text is based on and cited from De Vries, Nijkamp & Rietveld (2000).
Spatial Interaction Models are powerful tools and have been a source of intensive research in the past decades. Various spatial phenomena such as migration, passenger transport, international trade, shopping behavior and hospital admissions can all be described by models of this class. A very general and flexible Spatial Interaction Model was proposed by Alonso (1973, 1978). Alonso’s model contains equations for flows between regions, total outflow from a region, and total inflow to a region. The three components are connected by balancing factors. Alonso (1978) calls his model system ‘A Theory of Movements’. It is also denoted as a ‘systemic model’, ‘Alonso’s General Theory of Movement’, ‘Three Component Model’ or ‘Extended Gravity Model’. There is no standard name for the model. The term ‘Alonso-model’ is not suited for this model, as this mostly refers to Alonso’s (1964) model of urban land use. We will use here the term General Theory of Movement (GTM). This General Theory of Movement occupies a somewhat isolated position in the literature, and has not often been applied. This may be caused by the lack of a clear interpretation, and by the difficulty of estimating the model econometrically. In this paper we will survey the GTM. We give a consistent formulation of Spatial Interaction Models and Alonso’s GTM, analyze the statistical properties of the GTM, and discuss its interpretation and estimation problems. In this way we hope to unveil the potential of this model.
Alonso’s General Theory of Movement stands in a tradition which started in the 19th century with the gravity model. Many spatial flow phenomena can be modelled as the product of the sizes of the origin and the destination, divided by a power function of distance. In the course of time various improvements of the model were proposed. A great step forward was made by Wilson (1967, 1970, 1974), who related Spatial Interaction Models to the entropy concept, and introduced a family of models: the unconstrained gravity model, the production constrained model, the attraction-constrained model and the doubly constrained model. Alonso’s GTM established a further improvement. Alonso (1973) developed his model as part of a large demographic model for the United States. The submodel for interregional migration is a Spatial Interaction Model, which offered a new specification of flows, inflows, outflows and their interrelationship. Alonso (1978) elaborated the model as a general framework for Spatial Interaction Models. Essentially the same model was independently developed by Bikker (1987, 1992) and Bikker and De Vos (1992) for international trade and hospital admissions.
In the first years after the presentation of the GTM by Alonso (1978), it stimulated a lively discussion (Hua and Porell (1979), Anselin and Isard (1979), Hua (1980), Wilson (1980), Alonso (1980), Ledent (1980, 1981), Anselin (1982), Fotheringham and Dignan (1984), Tabuchi (1984)). This did not result, however, in a generally accepted view on the model. Standard works on Spatial Interaction Models such as Batten and Boyce (1986), Fotheringham and O'Kelly (1989) and Nijkamp and Reggiani (1992) do mention Alonso’s model, but in most cases the model is not integrated in the treatment and no further pathways are explored. The situation has been sketched strikingly by Hua (1999):
“Many have been intrigued by this theory, and tried to clarify, to put it in operation, or to develop it further. But it seems frustration has prevailed and enthusiasm has been dampened somewhat now twenty years after the theory’s publication. This is not due to the exhaustion of possible development of the theory, but, to the contrary, to the non-conclusion of those studies on the theory. The mystery of the theory remains as it was and that has retarded the needed progress.”
We suggest that the confusion around the GTM can mainly be attributed to the role of two variables in the model, which are called ‘systemic variables’ or ‘balancing factors’. These variables are essential in the model, as they ensure the coherence and equilibrium in the system, but their interpretation is not clear. This has several consequences. In the first place, a clear view on the model is hampered by the fact that those mysterious variables occur in almost every equation. This means that those equations cannot be explained separately, but only in relation to the whole system of equations. Secondly, there is not an obvious single way to formulate the equations. Various representations of the model are possible, which can be derived from each other by substitutions or transformations (Hua (1999), Alonso (1978)). And thirdly, confusion arises around these variables in the process of estimation and prediction, as they are unobserved endogenous variables.
Several approaches can be thought of to interpret the GTM. The simplest is to see the model as an interpolation between the various members of Wilson’s Family of Spatial Interaction Models. It can be demonstrated that Alonso’s GTM contains the models of Wilson’s Family as special cases. The reverse is not true. The GTM is more flexible, as it enables interaction between flows and marginal totals (see Section 3). A change in transport costs on a single link, through new infrastructure, for example, will generally affect all flows and all marginal totals. Such a property is not found in Wilson’s family of models. The nature of the GTM can be clarified by studying the effect of changes in exogenous variables on the flows. This can be done for a change in a single value, or for overall changes. A more substantial approach is to give a direct interpretation of the balancing factors, by seeing them as (inverted) prices, shadow prices, or costs. Finally, the model could be derived from a more complex model, preferably an economic model, involving actors making choices based on prices.
The GTM is easy to compute. In most cases this can simply be done on a spreadsheet. Estimation of the parameters is, however, rather complicated. The balancing factors are unobserved, so they have to be estimated first. This can be achieved by estimating the allocation part of the model, and using the results to compute balancing factors. As these balancing factors are endogenous variables, the GTM is a simultaneous equation model, and application of Ordinary Least Squares (OLS) will produce biased and inconsistent estimates. Estimation by Instrumental Variables or Maximum Likelihood is then required.
In this section, we will describe Alonso’s General Theory of Movement in a more formal way. As the purpose of this paper is to discuss the general properties of the model in relation to Spatial Interaction Models, we use a notation which corresponds as much as possible with the usual notation on SIMs (Wilson (1974), Batten and Boyce (1986), Fotheringham and O'Kelly (1989), Nijkamp and Reggiani (1992)) in the Wilson tradition. This notation differs widely from the notation used by Alonso (1973, 1978). In the choice of additional symbols needed we try to follow Alonso (1978) and Hua (1999). After describing the setting of Spatial Interaction Modeling and defining the notation, we state the five equations which constitute the GTM. From these we derive five additional relations, which can be used in alternative representations of the model and which are useful in the analysis (Alonso (1978), Hua (1999)). We show that the four models of Wilson’s Family of SIMs (Wilson (1967, 1970, 1974)) are special cases of the GTM. Finally, we show in an example the effect of changes in exogenous variables on the flows, and thus demonstrate that the GTM is more general than Wilson’s Family.
First we will formally describe the GTM. It maps out flows from n origins to m destinations. The variables to be explained are Tij (flow from origin i to destination j), and their marginal totals Oi (total outflow from i) and Dj (total inflow to j). The exogenous variables are summarized in functions Fij, Vi and Wj, respectively related to connections, origins and destinations. Fij, the facility of movement between i and j, is a decreasing function of distance or travel costs. Vi indicates the size of origin i, and Wj the size of destination j. As we concentrate on the structure of the model, further specification of these functions is not necessary here. The GTM can then be represented by the following five equations:
$$T_{ij} = A_i B_j O_i D_j F_{ij} \tag{15}$$
$$O_i = A_i^{-\alpha} V_i \tag{16}$$
$$D_j = B_j^{-\beta} W_j \tag{17}$$
$$\sum_i T_{ij} = D_j \tag{18}$$
$$\sum_j T_{ij} = O_i \tag{19}$$
The first three equations are behavioral equations, containing exogenous variables. (15) is well-known from the doubly constrained model of Wilson (1970, 1974). It describes flow Tij as proportional to total outflow Oi from origin i, total inflow Dj to destination j, and two proportionality factors Ai and Bj. Contrary to Wilson’s doubly constrained model, however, Oi and Dj are not treated as given. (16) and (17) relate them to the proportionality factors and the exogenous variables. The parameters α and β are generally assumed to fall between zero and one. We will discuss their effect on the behavior of the model later. The identities (18) and (19) define Oi and Dj as sums of flows. Note that (15), (16) and (17), after taking logs, are linear in the logs of the variables, while (18) and (19) are linear in the variables themselves. So the model as a whole is nonlinear.
The model is in fact an equilibrium model. The behavioral equations (15), (16) and (17) constitute (m × n + m + n) relations, while on the left hand side there are only (m × n) variables to be determined (due to (18) and (19)). The remaining m + n endogenous variables are the proportionality factors Ai and Bj. These are determined by the system of equations (15) to (19). We can see this by substituting (15) into (18) and (19). After some rearrangements we get
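$$A_i = \Big[ \sum_j B_j D_j F_{ij} \Big]^{-1} \tag{20}$$
$$B_j = \Big[ \sum_i A_i O_i F_{ij} \Big]^{-1} \tag{21}$$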
These equations are well-known from the doubly constrained model of Wilson (1974). If we now consider the system of equations consisting of (15), (16), (17), (20) and (21), we have a more common representation of the model as a system of simultaneous equations. Still there are three behavioral equations and two identities. As (20) and (21) now relate various parts of the model, they can be seen as equilibrium conditions. (20) and (21) determine Ai and Bj up to a constant. If we consider the complete model, we see that they are completely determined. Substituting Oi and Dj from (16) and (17) into (20) and (21) we obtain
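$$A_i = \Big[ \sum_j B_j^{1-\beta} W_j F_{ij} \Big]^{-1} \tag{22}$$
$$B_j = \Big[ \sum_i A_i^{1-\alpha} V_i F_{ij} \Big]^{-1} \tag{23}$$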
The model can now be solved. Given values for Fij, Vi and Wj, the balancing factors Ai and Bj can be computed by iteration of (22) and (23). The resulting values are substituted into (16), (17) and (15) to obtain Oi, Dj and Tij. By substitution of (16) and (17) into (15) a 10th equation is obtained:
$$T_{ij} = A_i^{1-\alpha} V_i B_j^{1-\beta} W_j F_{ij} \tag{24}$$
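The solution procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the balancing-factor equations have the form Ai = 1/Σj Bj^(1−β) Wj Fij and Bj = 1/Σi Ai^(1−α) Vi Fij, which is what substituting (16) and (17) into the doubly constrained balancing conditions gives. The function name `solve_gtm`, the starting values, the tolerance and the data are all hypothetical. The iteration is a contraction when α and β are not both zero.

```python
import numpy as np

def solve_gtm(V, W, F, alpha, beta, tol=1e-12, max_iter=10_000):
    """Solve the GTM by fixed-point iteration on the balancing factors.

    V: origin sizes (n,), W: destination sizes (m,), F: facility of
    movement (n, m). Returns flows T, totals O and D, and factors A, B.
    """
    V, W, F = (np.asarray(x, dtype=float) for x in (V, W, F))
    A, B = np.ones(V.size), np.ones(W.size)   # arbitrary positive start
    for _ in range(max_iter):
        A_new = 1.0 / (F @ (B ** (1.0 - beta) * W))          # equation (22)
        B_new = 1.0 / (F.T @ (A_new ** (1.0 - alpha) * V))   # equation (23)
        change = max(np.abs(np.log(A_new / A)).max(),
                     np.abs(np.log(B_new / B)).max())
        A, B = A_new, B_new
        if change < tol:
            break
    O = A ** (-alpha) * V              # equation (16)
    D = B ** (-beta) * W               # equation (17)
    T = np.outer(A * O, B * D) * F     # equation (15)
    return T, O, D, A, B

# Hypothetical data: 3 origins, 2 destinations, F = exp(-0.1 * cost).
V = [100.0, 200.0, 150.0]
W = [120.0, 180.0]
cost = np.array([[5.0, 12.0], [9.0, 4.0], [15.0, 7.0]])
T, O, D, A, B = solve_gtm(V, W, np.exp(-0.1 * cost), alpha=0.5, beta=0.5)

print("row sums equal O:   ", np.allclose(T.sum(axis=1), O))   # identity (19)
print("column sums equal D:", np.allclose(T.sum(axis=0), D))   # identity (18)
```

The two checks at the end confirm that the computed flows satisfy the identities (18) and (19), which is what makes the fixed point a solution of the full system.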
As already noted by Alonso (1978), various representations of the model are possible by choice of equations from the above. Actually, Alonso (1978) mentions only seven equations. He does not mention equation (15), while (18) and (19) are implicit in his notation. Hua (1999) lists these ten equations. Five are needed to specify the model. The representation of the model by equations (15) to (19) is suitable to clarify the structure of the model. The system of (15), (16), (17), (20) and (21) proves to be useful in estimation. The system of (22), (23), (24), (16) and (17) can be used to solve the model.
A simple way to get an impression of the possibilities of the model is to note that it contains Wilson’s (1967, 1970, 1974) Family of Spatial Interaction Models as special cases. This was already indicated by Alonso (1978) and demonstrated nicely by Wilson (1980). If we set α = 0 and β = 0 we have the doubly constrained model. As can be seen from (16) and (17), Oi and Dj are in that case completely determined by Vi and Wj . The remaining equations, (15), (18) and (19), or (15), (20) and (21), are exactly those of Wilson. If we set α = 1 and β = 1, we have the unconstrained gravity model. This can be seen from (24), where the proportionality factors cancel and (1) remains. In a similar way does the choice α = 1 and β = 0 result in the attraction-constrained model, and α = 0 and β = 1 in the production-constrained model. Four other special cases arise if we restrict only one of the parameters α and β to either zero or one. If α is zero, the model is constrained at the origin, if it is one, there is no substitution effect at the origin. The same applies to β at the destination.
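The four special cases can be checked numerically. The sketch below is an illustration under stated assumptions: the solver `solve_gtm` is a hypothetical helper that iterates the balancing factors in the form Ai = 1/Σj Bj^(1−β) Wj Fij and Bj = 1/Σi Ai^(1−α) Vi Fij, and the data are invented. The totals of V and W are chosen equal, because the doubly constrained case only has a solution when total outflow equals total inflow.

```python
import numpy as np

def solve_gtm(V, W, F, alpha, beta, tol=1e-12, max_iter=10_000):
    """Hypothetical helper: iterate the balancing factors to a fixed point."""
    A, B = np.ones(V.size), np.ones(W.size)
    for _ in range(max_iter):
        A_new = 1.0 / (F @ (B ** (1.0 - beta) * W))
        B_new = 1.0 / (F.T @ (A_new ** (1.0 - alpha) * V))
        change = max(np.abs(np.log(A_new / A)).max(),
                     np.abs(np.log(B_new / B)).max())
        A, B = A_new, B_new
        if change < tol:
            break
    O = A ** (-alpha) * V              # equation (16)
    D = B ** (-beta) * W               # equation (17)
    T = np.outer(A * O, B * D) * F     # equation (15)
    return T, O, D

# Hypothetical data with sum(V) == sum(W) == 450.
V = np.array([100.0, 200.0, 150.0])
W = np.array([180.0, 270.0])
F = np.exp(-0.1 * np.array([[5.0, 12.0], [9.0, 4.0], [15.0, 7.0]]))

# alpha = beta = 1: unconstrained gravity model, T_ij = V_i W_j F_ij.
T_u, _, _ = solve_gtm(V, W, F, 1.0, 1.0)
unconstrained_ok = np.allclose(T_u, np.outer(V, W) * F)

# alpha = 0, beta = 1: production-constrained, row sums fixed at V.
T_p, _, _ = solve_gtm(V, W, F, 0.0, 1.0)
production_ok = np.allclose(T_p.sum(axis=1), V)

# alpha = 1, beta = 0: attraction-constrained, column sums fixed at W.
T_a, _, _ = solve_gtm(V, W, F, 1.0, 0.0)
attraction_ok = np.allclose(T_a.sum(axis=0), W)

# alpha = beta = 0: doubly constrained, both margins fixed.
T_d, _, _ = solve_gtm(V, W, F, 0.0, 0.0)
doubly_ok = (np.allclose(T_d.sum(axis=1), V)
             and np.allclose(T_d.sum(axis=0), W))

print(unconstrained_ok, production_ok, attraction_ok, doubly_ok)
```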
There has been a lively discussion about the relationship between Alonso’s GTM and Wilson’s Family (Wilson (1980), Alonso (1980), Ledent (1981), Fotheringham and Dignan (1984), Rogerson (1984), Weber and Sen (1985), Weber (1987), Pooler (1994)). Wilson (1980) states that Alonso’s GTM and his Family of SIMs are equivalent, and that everything which could be achieved with the GTM can be achieved with the more familiar models, mentioned in the previous Section. He shows first that the unconstrained, production-constrained, attraction-constrained and doubly constrained models can be derived from the GTM by setting the parameters α and β to zero or one. Then he continues to argue that the GTM can be derived from the gravity model by choosing certain specifications for Vi and Wj which include Ai and Bj. However, Ai and Bj are endogenous variables, and by introducing them in a part of the model that formerly was fully exogenous, the structure of the model changes. As also stressed by Bröcker (1990), it is important to be precise about the classification of a variable as exogenous or endogenous. In a short reply Alonso (1980) stated: “the general formulation broadens the usual considerations by making explicit the joint variability of the values of cells and the values of marginals;” (page 733).
The GTM is not contained in Wilson’s family of models. Although there are some strong similarities (actually the equations (15), (18) and (19) of the GTM constitute the doubly constrained model), the models are fundamentally different, as in the GTM Oi and Dj are determined through the model. The difference between the GTM and Wilson’s Family becomes clear by analyzing the effect of changes in the exogenous variables. In the next Section we will do this systematically for overall changes, and so obtain multipliers. In this Section we will show in a numerical example that in the GTM a change in transport costs on a single link will change all flows and all marginal totals. In the doubly constrained model, the marginal totals are fixed, by assumption. In the unconstrained gravity model and in the GTM they can vary. On the other hand there are no substitution effects in the unconstrained model, which are present in the doubly constrained model and in the GTM.
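The numerical example itself is not reproduced in this text. A comparable experiment can be sketched as follows; the data, the 50% improvement on link (0, 0), the choice α = β = 0.5 for the GTM and the helper names are all assumptions made for illustration, and the solver assumes the balancing-factor equations in the form Ai = 1/Σj Bj^(1−β) Wj Fij and Bj = 1/Σi Ai^(1−α) Vi Fij.

```python
import numpy as np

def solve_gtm(V, W, F, alpha, beta, tol=1e-12, max_iter=10_000):
    """Hypothetical helper: iterate the balancing factors to a fixed point."""
    A, B = np.ones(V.size), np.ones(W.size)
    for _ in range(max_iter):
        A_new = 1.0 / (F @ (B ** (1.0 - beta) * W))
        B_new = 1.0 / (F.T @ (A_new ** (1.0 - alpha) * V))
        change = max(np.abs(np.log(A_new / A)).max(),
                     np.abs(np.log(B_new / B)).max())
        A, B = A_new, B_new
        if change < tol:
            break
    O = A ** (-alpha) * V
    D = B ** (-beta) * W
    return np.outer(A * O, B * D) * F, O, D

def relative_changes(V, W, F_old, F_new, alpha, beta):
    """Relative change in T, O and D when F_old is replaced by F_new."""
    T0, O0, D0 = solve_gtm(V, W, F_old, alpha, beta)
    T1, O1, D1 = solve_gtm(V, W, F_new, alpha, beta)
    return T1 / T0 - 1, O1 / O0 - 1, D1 / D0 - 1

# Hypothetical 2 x 2 system with equal totals for V and W.
V = np.array([100.0, 200.0])
W = np.array([120.0, 180.0])
F = np.exp(-0.1 * np.array([[5.0, 10.0], [10.0, 5.0]]))
F_new = F.copy()
F_new[0, 0] *= 1.5          # improve a single link only

cases = {"GTM": (0.5, 0.5),
         "doubly constrained": (0.0, 0.0),
         "unconstrained": (1.0, 1.0)}
results = {}
for name, (alpha, beta) in cases.items():
    dT, dO, dD = results[name] = relative_changes(V, W, F, F_new, alpha, beta)
    print(name)
    print("  change in T:", np.round(dT, 4).tolist())
    print("  change in O:", np.round(dO, 4).tolist())
    print("  change in D:", np.round(dD, 4).tolist())
```

In this setup the GTM run should show nonzero changes in every flow and every marginal total, the doubly constrained run should leave O and D untouched while redistributing flows, and the unconstrained run should change only the flow on the improved link and the two totals that contain it.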
Alonso’s General Theory of Movement is a very useful model. It avoids some limitations which are present in other Spatial Interaction Models. The GTM is not easy to interpret, however. Alonso (1978) called it a ‘systemic model’, as everything in the system is related. This Section is devoted to issues around the interpretation of the GTM. First we will discuss the so-called systemic variables Ai and Bj . Then we will clarify the behavior of the model by analyzing the effect of overall changes in the exogenous variables. Thereafter, we will describe a possible foundation of the GTM in a model of choice. Finally, we will mention some other approaches to provide a basis for the GTM. Much research remains to be done on these issues.
There is much confusion about the entities which we denote by the symbols Ai and Bj. These are mostly denoted as variables, but sometimes as parameters. Consideration of this issue leads to the very basic question of what we mean by a variable or a parameter. The fact that Ai and Bj are not observed, but can be estimated, resembles properties of parameters. The way they function in the model, however, is more like variables. They occur at the left hand side of certain equations ((20), (21), (22) and (23)), and if the number of regions changes, the number of Ai and Bj changes as well. We argue that Ai and Bj should be considered as variables, for the following reason. If we use the model for prediction, we solve it given values for the parameters, and given exogenous variables. Ai and Bj are not given then, but rather the result of the solution. That they are not observed is not an uncommon feature. Ai and Bj are latent variables.
Further confusion is caused by the difficulty of interpreting these variables. As we have shown in Section 3, several formulations of the model are possible. In all cases Ai and Bj are mutually influenced, and have no closed form expression. Consequently they cannot be interpreted by simply relating them to observable variables, but function in the whole of the system. That is why they are called ‘systemic variables’. They are also denoted as ‘balancing factors’ or ‘proportionality factors’, connected to Wilson’s doubly constrained model. Alonso (1973, 1978) describes $A^{-1}$ as ‘opportunity’, ‘demand’ or ‘draw’: “If many opportunities are available from a locality, the flow of its out-migrants may be expected to increase as a whole, but the flow to any particular destination will decrease since there are other attractive destinations.” (Alonso (1973), page 11), and $B^{-1}$ as ‘competition’, ‘congestion’, ‘potential pool of moves’: “If a great number of migrants is competing for the opportunities at a destination, one may expect a negative feedback reducing the value of its attractiveness and diminishing the flows.” (Alonso (1973), page 12). Ai and Bj can also be interpreted as accessibility indicators, and are in that context also denoted as ‘indices’ (Bikker (1987, 1992), Bikker and De Vos (1992)). These concepts give an idea of what is represented by these variables, but it remains rather vague. It is difficult to draw conclusions about plausible values for the parameters α and β. A more direct and measurable interpretation of Ai and Bj is desirable. We will discuss some possibilities in the remainder of this Section.
A simple interpretation of the GTM is to see it as a model which encapsulates Wilson’s Family of SIMs and allows for intermediate cases. This makes it plausible to restrict α and β to be between zero and one (inclusive). To interpret values of α and β, it is useful to analyze the effect of changes in the exogenous variables on the flows. This will shed light on the question how the GTM behaves compared with the quadrupling problem of the original gravity model, and facilitates comparison to the models of Wilson’s Family. In practical applications this analysis can be used to present the results of estimation. Following De Vos and Bikker (1982), Bikker and De Vos (1992), we distinguish between the effect of a change in all Vi, all Wj or all Fij by the same percentage (macro-elasticities), and the effect of a change in a single element (micro-elasticities). The macro-elasticities can be computed analytically and hold exactly. The micro-elasticities are linear approximations, and can only be used to judge small changes. We will not derive the micro-elasticities here, but concentrate on the macro-elasticities. We will give an example to show how a macro-elasticity can be computed. Suppose we add r to all ln Vi. Using (15), (16) and (17), it can be shown that ln Ai increases by r(1 − β)/(α + β − αβ), ln Bj increases by −r/(α + β − αβ), and ln Tij, ln Oi and ln Dj increase by rβ/(α + β − αβ). So the eventual change is β/(α + β − αβ) times the direct effect of the change in V. This is called the origin multiplier. In combination with parameters in V the macro-elasticity of an explanatory variable can be derived (De Vos and Bikker (1989)). Table 5 presents the multipliers and the corresponding changes in the balancing factors.
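The step summarized by "it can be shown" can be traced in a few lines. The sketch below assumes the balancing-factor equations in the form obtained by substituting (16) and (17) into (20) and (21), and uses a and b as hypothetical symbols for the uniform shifts in ln Ai and ln Bj.

```latex
% Assumed form of the balancing-factor equations:
\begin{align*}
A_i^{-1} &= \sum_j B_j^{1-\beta} W_j F_{ij}, &
B_j^{-1} &= \sum_i A_i^{1-\alpha} V_i F_{ij}.
\intertext{Add $r$ to every $\ln V_i$ and look for uniform shifts
$a = \Delta \ln A_i$ and $b = \Delta \ln B_j$:}
a &= -(1-\beta)\, b, &
b &= -(1-\alpha)\, a - r.
\intertext{Solving the two linear equations gives}
a &= \frac{(1-\beta)\, r}{\alpha+\beta-\alpha\beta}, &
b &= \frac{-r}{\alpha+\beta-\alpha\beta}.
\intertext{With $T_{ij} = A_i^{1-\alpha} V_i B_j^{1-\beta} W_j F_{ij}$ and
$(1-\alpha)(1-\beta) + (\alpha+\beta-\alpha\beta) = 1$:}
\Delta \ln T_{ij} &= (1-\alpha)\, a + r + (1-\beta)\, b
 = \frac{\beta\, r}{\alpha+\beta-\alpha\beta}.
\end{align*}
```

The same shifts inserted into Oi = Ai^(−α) Vi and Dj = Bj^(−β) Wj give the identical change rβ/(α + β − αβ) for ln Oi and ln Dj.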
These are elasticities. If V increases by 1%, then T will increase by β/(α + β − αβ) %. We have added a column for the effect of a simultaneous change at the origins and destinations. This is of interest in the light of the quadrupling question. Moreover this effect can also be computed for the doubly constrained model. In Table 6 we summarize the multipliers in the GTM and the four models of Wilson’s Family. The latter can be derived by substituting the values of α and β as indicated, or directly from Wilson’s equations.
In the doubly constrained model a change only at the origins or the destinations is not possible. (See Bröcker (1990) for an analysis of this problem.) The other two elasticities for this model can be derived as a limit (substituting 0 for α and β will lead to 0 divided by 0). We see that in the GTM the elasticity of a change both at the origins and the destinations can vary between 1 and 2, dependent on the values of the parameters. The effect of an overall change in F is of interest for analyzing the effect of changes in transportation technology or costs. If this macro-elasticity is positive, the model implies that an improvement in the facility of movement, for example, by introduction of a new transportation technology, will lead to an overall increase in the flows. This effect appears to be present only in the unconstrained gravity model and in the GTM.
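Because the macro-elasticities hold exactly, they can be verified numerically. The sketch below is an illustration only. The origin multiplier β/(α + β − αβ) is taken from the text; the combined multiplier (α + β)/(α + β − αβ) for a simultaneous change in all Vi and Wj is an assumption here, obtained by adding the origin multiplier and its mirror image α/(α + β − αβ), since the tables are not reproduced. The solver, parameter values and data are hypothetical.

```python
import numpy as np

def solve_flows(V, W, F, alpha, beta, tol=1e-13, max_iter=10_000):
    """Hypothetical helper: solve the GTM and return the flow matrix T."""
    A, B = np.ones(V.size), np.ones(W.size)
    for _ in range(max_iter):
        A_new = 1.0 / (F @ (B ** (1.0 - beta) * W))
        B_new = 1.0 / (F.T @ (A_new ** (1.0 - alpha) * V))
        change = max(np.abs(np.log(A_new / A)).max(),
                     np.abs(np.log(B_new / B)).max())
        A, B = A_new, B_new
        if change < tol:
            break
    # Flows from the substituted form T = A^(1-alpha) V B^(1-beta) W F.
    return np.outer(A ** (1.0 - alpha) * V, B ** (1.0 - beta) * W) * F

# Hypothetical data and parameters.
V = np.array([100.0, 200.0, 150.0])
W = np.array([120.0, 180.0])
F = np.exp(-0.1 * np.array([[5.0, 12.0], [9.0, 4.0], [15.0, 7.0]]))
alpha, beta, r = 0.3, 0.6, 0.05
den = alpha + beta - alpha * beta

T0 = solve_flows(V, W, F, alpha, beta)
T_v = solve_flows(V * np.exp(r), W, F, alpha, beta)                # all ln V + r
T_vw = solve_flows(V * np.exp(r), W * np.exp(r), F, alpha, beta)   # V and W

origin_mult = np.log(T_v / T0) / r    # one value per flow, all identical
both_mult = np.log(T_vw / T0) / r

print("origin multiplier:", origin_mult.mean(), "expected:", beta / den)
print("combined multiplier:", both_mult.mean(),
      "expected:", (alpha + beta) / den)
```

For any α and β in the unit interval, not both zero, the combined value lies between 1 and 2, in line with the range stated above.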
A possible basis for the GTM is a process of choice. Actually most derivations of the model use such a framework (Alonso (1973, 1978), Bikker (1987), Bikker and De Vos (1992)). The reasoning depends on the application area. We have to decide which agent takes what kind of decisions, and with which restrictions the agent is confronted. In the case of migration, the relevant agent is someone who is considering migration. The first decision is whether to migrate or not. This is influenced by the possibilities. The second decision is to which region to migrate. This depends on the attractiveness of the destinations. A similar reasoning holds for hospital admission. For commuting a different reasoning would apply, as here the economic agent (worker) chooses both a job and a residence location. It is reasonable to assume that employers do not discriminate between workers from different residential locations. The derivation of the model for the case of migration could proceed as follows. Suppose that the distribution over the destinations of migrants from origin i is given by the following choice equation:
(25)
This equation states that the share of destination j in the outflow Oi is proportional to (going backwards through the formula) the facility of movement Fij, the relevant characteristics Wj of the destination, and a power of the ratio between the actual inflow Dj and the natural inflow Wj. This ratio indicates the crowding at j. The denominator is chosen so that
$$\sum_j T_{ij} = O_i \tag{26}$$
Suppose further that the outflow Oi from origin i is given by the following generation equation:
(27)
This equation states that the outflow is proportional to the relevant characteristics Vi of the origin, and a power of the denominator in (25). This denominator indicates the opportunities at i. This is analogous to a nested logit model (Nijkamp and Reggiani (1988)). Actually, we have now already the complete model, for if we define the inflows as a summation:
$$D_j = \sum_i T_{ij} \tag{28}$$
and use the following definitions to simplify the notation
(29) (30)
we can derive the equations (15), (16) and (17), which in combination with (26) and (28) constitute the GTM. (17) follows directly from (30). (16) follows from (27) using the notation from (29). (15) follows from (25) in combination with (17), using (29) and (30).
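Equations (25), (27), (29) and (30) are not reproduced in this text. One specification that matches the verbal description, and from which (15), (16) and (17) follow, is sketched here. The exponent (β − 1)/β and the summation index k are assumptions chosen for consistency with (15) to (17); they are not necessarily the authors' exact notation.

```latex
\begin{align*}
% (25) choice equation: share of destination j in the outflow from i
\frac{T_{ij}}{O_i} &=
  \frac{(D_j/W_j)^{(\beta-1)/\beta}\, W_j F_{ij}}
       {\sum_k (D_k/W_k)^{(\beta-1)/\beta}\, W_k F_{ik}} \\
% (27) generation equation: outflow as a power of the denominator of (25)
O_i &= V_i \Big[ \sum_k (D_k/W_k)^{(\beta-1)/\beta}\, W_k F_{ik} \Big]^{\alpha} \\
% (29) opportunities at origin i
A_i^{-1} &= \sum_k (D_k/W_k)^{(\beta-1)/\beta}\, W_k F_{ik} \\
% (30) crowding at destination j
B_j^{-1} &= (D_j/W_j)^{1/\beta} \\[4pt]
% Consequences:
D_j &= B_j^{-\beta} W_j
  && \text{from (30), which is (17)} \\
O_i &= A_i^{-\alpha} V_i
  && \text{from (27) and (29), which is (16)} \\
T_{ij} &= O_i A_i B_j^{1-\beta} W_j F_{ij} = A_i B_j O_i D_j F_{ij}
  && \text{from (25), (29), (30) and (17), which is (15)}
\end{align*}
```

Under this specification the share of destination j varies with the crowding term to the power β − 1, and the definition of the crowding term requires β to be nonzero.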
It is remarkable that from these assumptions, which are asymmetric between origin and destination, the fully symmetric GTM results. We see that α can take any value, while β should be nonzero. So this case cannot be attraction-constrained nor doubly constrained, which corresponds with the behavioural assumptions. The migrants from a certain origin choose a destination. The destination can become less attractive but does not discriminate between origins. This choice model helps with the interpretation of A and B, and the associated parameters α and β. α is the elasticity of the outflow with respect to the opportunities $A_i^{-1}$, so it will be positive. The elasticity of the share of a destination with respect to crowding $B_j^{-1}$ is β − 1, and this will generally be negative, implying β to be smaller than one.