Next: Linear models in gstat
Up: Getting started
Previous: Prediction
Simulation
Simulation [6,9,17] is done by setting up a command
file for simple kriging (section 2.5) and changing the
default action to Gaussian simulation by adding the command
method: gs;
(example 6 and example 7), or to indicator simulation by adding the command
method: is;
If valid data are present (i.e., data are available in the neighbourhoods
defined), conditional simulation is done. Unconditional simulation is
done when only dummy variables (dummy, section 4.2)
or data outside every possible neighbourhood are defined.
The sequential simulation algorithm [9] is used for the
simulation. This algorithm visits each simulation location, following a
random path. Each simulated value (or set of values in the
multivariable case) is added to the conditioning data before the next
location is visited.
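The loop described above can be sketched in a few lines of Python. This is only an illustration of the sequential idea on a 1-D transect, not gstat's implementation: the exponential covariance model, the function names (`exp_cov`, `sequential_gaussian_simulation`) and the parameters `max_nb` and `seed` are all assumptions made for the example.

```python
import numpy as np

def exp_cov(h, sill=1.0, range_=10.0):
    """Exponential covariance model (an assumed choice for this sketch)."""
    return sill * np.exp(-np.abs(h) / range_)

def sequential_gaussian_simulation(x_sim, x_obs, z_obs, mean=0.0, max_nb=8, seed=0):
    """Simulate values at x_sim, conditional on (x_obs, z_obs), using simple kriging."""
    rng = np.random.default_rng(seed)
    xs = [float(x) for x in x_obs]   # conditioning locations; grows during the run
    zs = [float(z) for z in z_obs]   # conditioning values
    out = np.empty(len(x_sim))
    for i in rng.permutation(len(x_sim)):            # random path
        x0 = float(x_sim[i])
        if xs:
            xa, za = np.array(xs), np.array(zs)
            nb = np.argsort(np.abs(xa - x0))[:max_nb]    # local neighbourhood
            xa, za = xa[nb], za[nb]
            C = exp_cov(xa[:, None] - xa[None, :])       # data-to-data covariances
            c0 = exp_cov(xa - x0)                        # data-to-target covariances
            w = np.linalg.solve(C, c0)                   # simple kriging weights
            mu = mean + w @ (za - mean)
            var = float(exp_cov(0.0) - w @ c0)
        else:
            mu, var = mean, float(exp_cov(0.0))          # unconditional start
        if var < 1e-12:                                  # target coincides with a datum
            var = 0.0
        z0 = rng.normal(mu, np.sqrt(var))                # draw from the conditional distribution
        out[i] = z0
        if var > 0.0:                                    # avoid duplicate locations
            xs.append(x0)
            zs.append(z0)
    return out
```

Calling the function with empty `x_obs` and `z_obs` gives an unconditional simulation; passing observations gives a conditional one, which mirrors the distinction made above. The `max_nb` limit plays the role of a local kriging neighbourhood.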
A few notes on the practice of (indicator or Gaussian) simulation with
gstat are:
- Even for simulating small fields (e.g. 500 cells) it is strongly
recommended to limit the kriging neighbourhood in order to get local
kriging (section 2.5). Because every simulated point
or block is added to the data, 'global' simulation would soon lead to
large kriging systems, which quickly slows down the simulation.
- Gstat can create many simulations simultaneously in an efficient
way: a single random path is followed and for each simulation location,
the neighbourhood selection and the solution to the kriging system are
reused for all subsequent simulations. When many simulations are required
(e.g. for a Monte Carlo study) the time saving will be significant
(see set nsim). (Note that the use of local approximations
(local kriging) results in slightly dependent realizations when obtained
by following a single random path.)
- When simulation is done on a regular grid (using a mask map), a
recursively refining random path ("multiple-steps simulation", [9]) is
followed in order to reproduce the statistical properties up to
reasonably large distances, as shown in the accompanying graph.
The random path starts on a (randomly located) coarse grid (A: the
coarsest 2 × 2 grid, with a grid spacing that is a power of 2).
The simulation grid is then refined recursively, halving the grid
spacing at each step: 4 × 4 (B), 8 × 8 (C, black dots), until all grid
locations are visited (C, grey dots). Neighbourhood definitions (see
section 4.2, and also the force flag) may ensure that
necessary conditioning data are used for reproducing the statistical
properties sufficiently.
- Simple kriging results in a correct conditional distribution, and
in Gaussian simulations that honour the specified variogram. Universal or
ordinary kriging can be used if enough conditioning data are
available. It leads to 'rougher' simulations that honour the
variogram less closely but are better adjusted to a non-stationary mean
("major heterogeneities", [8]), when present in the conditioning
data. The parameter n_uk (section 4.4) controls the
choice between simple and universal (ordinary) kriging. Another way to
obtain non-stationary simulations is to add a varying mean to a
stationary (simple kriging) simulation afterwards.
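The recursively refining random path mentioned in the notes above can be illustrated as follows. This is a minimal sketch under stated assumptions, not gstat's code: the function name `refining_random_path`, the way the coarsest spacing is chosen, and the random origin of the coarse grid are choices made for the example.

```python
import random

def refining_random_path(nrows, ncols, seed=0):
    """Return every (row, col) cell once, ordered coarse to fine,
    in random order within each refinement level."""
    rng = random.Random(seed)
    step = 1
    while step * 2 < max(nrows, ncols):
        step *= 2                      # coarsest spacing: a power of 2
    # randomly located origin of the coarse grid (assumption)
    r0, c0 = rng.randrange(step), rng.randrange(step)
    visited = set()
    path = []
    while step >= 1:
        # cells on the grid with the current spacing that are not yet visited
        level = [(r, c)
                 for r in range(r0 % step, nrows, step)
                 for c in range(c0 % step, ncols, step)
                 if (r, c) not in visited]
        rng.shuffle(level)             # random order within the level
        path.extend(level)
        visited.update(level)
        step //= 2                     # halve the grid spacing
    return path
```

For an 8 × 8 grid the first four cells of the returned path lie on a 2 × 2 grid with spacing 4, the first sixteen on a 4 × 4 grid with spacing 2, and the remaining cells fill in the full grid.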
Gaussian block simulation
Gstat simulates block averages when a non-zero block size is specified
(section 3.5). The implementation is rather
inefficient. Simulation will be faster when nblockdiscr is
set to a low value (3 or 2, section 4.4), at the expense of
the accuracy of point-to-block and block-to-block covariance calculations
(see Appendix A.3).
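The trade-off between speed and accuracy can be made concrete with a small sketch that approximates a block-to-block covariance by averaging point covariances over discretization points. A regular n × n discretization of square blocks and an exponential covariance model are assumed here purely for illustration; the scheme gstat actually uses is the one described in Appendix A.3 and may differ. The names `discretize_block` and `block_cov` are hypothetical.

```python
import math

def exp_cov(h, sill=1.0, range_=10.0):
    """Exponential point covariance (assumed model for this sketch)."""
    return sill * math.exp(-h / range_)

def discretize_block(center, size, n):
    """n x n regularly spaced points inside a square block of side `size`."""
    cx, cy = center
    offsets = [(-0.5 + (i + 0.5) / n) * size for i in range(n)]
    return [(cx + dx, cy + dy) for dx in offsets for dy in offsets]

def block_cov(center_a, center_b, size, n):
    """Average point covariance between the discretization points of two blocks.
    Needs n**4 covariance evaluations in two dimensions."""
    pa = discretize_block(center_a, size, n)
    pb = discretize_block(center_b, size, n)
    total = sum(exp_cov(math.dist(p, q)) for p in pa for q in pb)
    return total / (len(pa) * len(pb))
```

With n = 2 a block-to-block covariance costs 16 point covariance evaluations, with n = 4 it costs 256, which shows why a low discretization value speeds up simulation considerably.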
Indicator simulation
Table 2.1: Order relation corrections
| Indicator | order violation | correction | set order |
| independent | | | 1-4 |
| independent | | | 1-4 |
| categorical, open | | | 2 |
| categorical, closed | | | 3 |
| cumulative | | | 4 |
From data definitions alone, gstat cannot decide whether it is working
with indicator variables or not. For prediction this is not
crucial: procedurally, indicator kriging is identical to simple
or ordinary kriging. When indicator simulation is done for multiple
variables, a number of different situations may occur, and for correct
results it should be specified explicitly whether the set of indicator
variables is (i) independent, (ii) cumulative or (iii) disjunct:
- if the set is independent, each simulated indicator variable can
take a value 1 or 0 independently of the other indicator variables
- if the ordered set of indicator variables $I_1, \ldots, I_n$
is cumulative, then $I_j = 1$
implies that $I_{j+1}, \ldots, I_n$
are all 1 (in gstat, the order of a set of indicator
variables equals the order in which the variables appear in the
command file)
- if a set of indicators is disjunct, then $I_j = 1$
implies that $I_k = 0$
for all $k \neq j$.
Independent indicators may represent independent variables. A set
of cumulative indicators may represent the cumulative distribution
function of a single continuous variable and a set of disjunct
indicators can represent the categories of a categorical variable (see
also the data command options c and Category). Table
2.1 shows the corrections done for the different types of
indicator variables, and the value to which order should be set to
obtain them (in terms of the estimated probability and the
corrected estimate). Cumulative indicators are corrected using
the "upward-downward" approach, see [8, p. 80] or
[10, p. 324].
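To make the idea of order relation corrections concrete, the sketch below shows correction rules of the kind commonly used for each type of indicator set. These are assumptions for illustration only: the exact rules gstat applies for each value of order are those of Table 2.1, and the function names are hypothetical. The cumulative case averages an upward (running maximum) and a downward (running minimum) pass, which is one common reading of the "upward-downward" approach.

```python
def clip01(p):
    """Force an estimated probability into the interval [0, 1]."""
    return min(1.0, max(0.0, p))

def correct_independent(probs):
    """Independent indicators: each estimate is clipped on its own."""
    return [clip01(p) for p in probs]

def correct_categorical(probs, closed):
    """Disjunct indicators. Open set: rescale only if the sum exceeds 1.
    Closed set: always rescale so the probabilities sum to 1 (assumed rules)."""
    probs = [clip01(p) for p in probs]
    total = sum(probs)
    if total > 0 and (closed or total > 1.0):
        probs = [p / total for p in probs]
    return probs

def correct_cumulative(probs):
    """Cumulative indicators: average of an upward and a downward pass,
    giving a non-decreasing sequence in [0, 1]."""
    probs = [clip01(p) for p in probs]
    up, running = [], 0.0
    for p in probs:                    # upward pass: running maximum
        running = max(running, p)
        up.append(running)
    down, running = [], 1.0
    for p in reversed(probs):          # downward pass: running minimum
        running = min(running, p)
        down.append(running)
    down.reverse()
    return [(u + d) / 2 for u, d in zip(up, down)]
```

For example, the cumulative estimates 0.2, 0.5, 0.4, 0.9 violate the required ordering at the third value; the averaged correction replaces the second and third values by 0.45 and leaves the others unchanged.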
For multiple indicator simulation (no indicator cross variograms are
specified), by default independent indicator simulation is done. The
subsequent indicator variables are taken as cumulative indicators if the
command
set order=4;
is added to the command file (Table 2.1). They will be treated
as disjunct if order is set to 2 or 3 (see section 4.4).
Edzer Pebesma
1999-08-31