All variograms and covariograms are calculated from predicted residuals
,
with
the
ordinary least square estimates of
,
fitted globally (all
data of the variable are used in a linear model assuming IID errors),
unless one of dX, noresidual or gls is set.
The sample variogram is calculated from residuals from a single
realization
for regular distance intervals
.
by:
The sample cross covariogram
is calculated
by the sample cross covariance
Gstat provides calculation of sample variogram, covariogram, cross
variogram, pseudo cross variogram and cross covariogram, where width
,
number of intervals
and direction of
can be
controlled. When some cross variogram is requested, gstat decides
which one should be calculated: the first, `classic' cross variogram is
calculated when the two variables have the same number of observations
and identical coordinates and order, in any other case the pseudo cross
variogram is calculated, this information is written to the second line
of the file with the sample variogram.
Gstat provides several methods for estimating variogram model
parameters. Fitting of a variogram model to the sample variogram is
done by iteratively reweighted least squares (WLS, [5]),
minimizing
Fixing parameters in a weighted least squares fit can be done by putting an @ before the range or the sill parameter to fix: e.g. fitting the variogram 1 Sph(@ 0.2) will only fit the sill parameter (1), fitting @1 Sph(0.2) will only fit the range parameter (0.2).
Within gstat, iterative fitting stops when the number of steps exceeds
50 (or the value set by iter) or when the fit has converged. A
fit is considered as `converged' when the change in the weighted sum of
squares of differences between variogram model and sample variogram
becomes less then
the last value of this sum of squares
(this number is controlled with fit_limit).
Both gstat and gnuplot fix the fitting weights during iterations. For this reason, when the fitted model strongly differs from the initial (starting) model, another fitting round may result in a substantially different end model, due to the different weights. Reiterated weighted fitting may very well result in never converging cycles. Gstat uses Gauss-Newton fitting with (mostly) analytical derivative functions; gnuplot uses Levenberg-Marquardt fitting with numerical derivatives.
Finally, gstat provides REML (restricted maximum likelyhood) estimation of partial sill parameters [15,2] from sample data: this method is equivalent to a full weighted least squares fitting of the variogram (sill parameters) to the pairwise products of the observations (the covariogram cloud). REML estimation is shown to be equivalent to iterated MIVQUE (minimum variance quadratic unbiased estimation), and to iterated MINQUE (minimum norm quadratic unbiased estimation) with an Euclidean norm [2]. REML may be slow for moderate to large data sets (more than 100 observations).