

`data'

The general form of the data command is

data( identifier): 'file', options ;

The file name should refer to an existing file in ASCII table form, in simplified GeoEAS format, or in one of the supported grid map formats. Options are either single keywords such as log or expressions such as x=2. A column number of 0 means that the column is not defined (non-existent). The full list of options follows, with default values between square brackets:

General options

v=5
column 5 contains the data (measurement) variable [0, or obtained from grid map]

x=1
column 1 contains the x-coordinate [0, or obtained from grid map]

y=2
column 2 contains the y-coordinate [0, or obtained from grid map]

z=3
column 3 contains the z-coordinate [0, or obtained from grid map]

d=1
use a first order (polynomial) linear model in the coordinates as the trend; allowed order values are 0, 1, 2 and 3; see also X, sk_mean and b [0: only an intercept as trend]

mv=-1
define missing value as the value -1 [the string NA, see also set mv]

average
average values with identical locations (i.e., locations whose separation distance is less than the zero tolerance) [noaverage]

log
log transform the variable (natural logarithm) [no transform]

I=5
transform the observation variable $v$ to

\begin{displaymath}
I(v,5) = \left\{ \begin{array}{ll}
1 & \mbox{if $v \leq 5$,} \\
0 & \mbox{otherwise}
\end{array}\right.
\end{displaymath}

[no transform]

v=6, Category='sand'
transform the observation variable $v$ to 1 if the string in column 6 equals the Category string sand, and to 0 in any other case. [no transform]

standard
standardise variable (to mean 0, variance 1) [do not standardise]

X=8&9&x&y
apart from a default intercept, the values of the base functions at the data locations are the variables that are in columns 8, 9 and the x- and y-coordinate. (Polynomial coordinate base functions allowed are: x3 for $x^3$, y3 for $y^3$, z3 for $z^3$, x2 for $x^2$, y2 for $y^2$, z2 for $z^2$, x for $x$, y for $y$, z for $z$, x2y for $x^2 y$, xy2 for $x y^2$, x2z for $x^2 z$, xz2 for $x z^2$, y2z for $y^2 z$, yz2 for $y z^2$, xy for $xy$, xz for $x z$ and yz for $y z$, provided that the corresponding coordinate is defined) [use only an intercept (mean) as trend]

X=-1&8&9
the values of the base functions at the data locations are in columns 8 and 9, with no intercept [use only an intercept (mean) as trend]

b=[2.4, 1.7, -3.9]
define the (known) regression coefficients, corresponding to the X entries given. See also sk_mean; b generalises the concept of a known constant mean to a known mean function. [undefined; regression coefficients are unknown]

V=6
column 6 contains the proportionality factor $v_i$ to the residual variance $v_i \sigma^2$ of the v-variable (i.e. the diagonal entries of matrix $D$, see Ordinary and weighted least squares trend prediction in section 2.7, and Appendix A.2). This will have an effect on the least squares residuals (thus affecting the sample covariogram and pseudo cross variogram), as well as on uncorrelated least squares prediction, kriging prediction, and trend prediction. [0: assuming identical variances or variances strictly derived from the variograms]
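
The I= and Category= transforms above both turn the observation variable into a 0/1 indicator. The following is a minimal Python sketch of that arithmetic only, not of how gstat implements it; the function names indicator_transform and category_indicator are hypothetical and chosen for illustration.

```python
def indicator_transform(values, cutoff):
    """Return 1 where v <= cutoff and 0 otherwise, as in the I=cutoff option."""
    return [1 if v <= cutoff else 0 for v in values]


def category_indicator(labels, category):
    """Return 1 where the label equals the category string and 0 otherwise,
    as in the Category='...' option."""
    return [1 if label == category else 0 for label in labels]


# I=5 applied to three measurements: the cutoff value itself maps to 1.
measurements = [3.0, 5.0, 7.2]
indicators = indicator_transform(measurements, 5)

# Category='sand' applied to three string entries.
soil = ["sand", "clay", "sand"]
sand_indicator = category_indicator(soil, "sand")
```

Note that the comparison is non-strict, so a measurement exactly equal to the cutoff is coded as 1.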

Variogram modelling options

noresidual
do not calculate OLS (or GLS, see gls) residuals for sample variogram or covariogram estimation (Appendix A.1). For sample variogram estimation in absence of base functions, setting noresidual will yield identical results, but will result in a modest gain in speed and memory saving. In other cases, it will result in the estimation of non-centred covariograms or pseudo-cross variograms [calculate residuals before variogram or covariogram estimation]

dX=0.1
include a pair of data points $\{z(x_i), z(x_j)\}$ for sample variogram calculation only when $\vert\vert f(x_i) - f(x_j) \vert\vert \leq 0.1$, with $f(x_i) = (f_1(x_i), \ldots, f_p(x_i))$ and $\vert\vert u \vert\vert = \sqrt{u'u}$. This allows pooled estimation of within-strata variograms, or variograms of (near-)replicates in a linear model (for point pairs having similar values for regressors like depth, time, or a category variable) [do not evaluate]
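
The dX criterion can be illustrated with a short Python sketch. It assumes the base function values $f(x_i)$ are already available as one row of numbers per data point; the function name select_pairs is hypothetical, and the sketch shows only the pair filter, not the variogram calculation itself.

```python
import math


def select_pairs(base_values, dx):
    """Return index pairs (i, j), i < j, whose base function vectors differ
    by at most dx in Euclidean norm, i.e. ||f(x_i) - f(x_j)|| <= dx."""
    pairs = []
    for i in range(len(base_values)):
        for j in range(i + 1, len(base_values)):
            diff = [a - b for a, b in zip(base_values[i], base_values[j])]
            norm = math.sqrt(sum(d * d for d in diff))  # sqrt(u'u)
            if norm <= dx:
                pairs.append((i, j))
    return pairs


# One regressor (e.g. depth) per point: only the first two points are
# near-replicates under dX=0.1.
depth = [[0.0], [0.05], [1.0]]
near_replicates = select_pairs(depth, 0.1)
```

With a category variable coded as 0/1 base functions, any dX smaller than 1 restricts the pairs to points within the same stratum.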

Prediction or simulation options

radius=4.5
select observations in a local neighbourhood when they are within a distance of 4.5 [large: select all] (see section 2.5)

max=30
maximum number of observations in a local neighbourhood selection is 30 [large: no maximum] (see section 2.5)

min=10
minimum number of observations in a local neighbourhood selection is 10 [0] (see section 2.5)

omax=2
maximum number of observations per octant (3D), quadrant (2D) or secant (1D) is 2 (this only takes effect when a radius is also set) [0: don't evaluate] (see section 2.5, and also method: nr;, section 4.5)

square
select points in a square (or block) neighbourhood, with square (block) sizes equal to $2 \times$ radius [circular (spherical) neighbourhood]

vdist
use the variogram value as the distance criterion for min/max/omax neighbourhood selection (but define radius as Euclidean distance) [use Euclidean distance] (see section 2.5)

force
force neighbourhood selection to the minimum number of observations, disregarding the distance [unless simple kriging is used, generate a missing value if fewer than the required minimum number of observations are found within a distance defined by radius] (see section 2.5)

s=7
column 7 contains the variable that defines the strata for data() locations [0: no strata]

sk_mean=2.4
define the simple kriging mean to be 2.4 [not defined: an unknown mean (intercept) is assumed for each variable]. NOTE: the code sk_mean=2.4 is equivalent to b=[2.4]

dummy
define a dummy variable [require valid data to be read]
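
The interplay of radius, max, min and force can be sketched in Python as follows. This is an illustration under stated assumptions, not gstat's implementation: the function name select_neighbours is hypothetical, the nearest observations are assumed to be kept when more than max fall inside the radius, the radius test is assumed to be non-strict, and the omax, square and vdist options as well as the simple kriging exception are left out.

```python
import math


def select_neighbours(points, target, radius=math.inf, max_n=None,
                      min_n=0, force=False):
    """Return indices of the selected observations, nearest first, or None
    when a missing value would be generated at the target location."""
    # Euclidean distance from the target to every observation, nearest first.
    order = sorted((math.dist(p, target), i) for i, p in enumerate(points))
    # radius: keep observations within the search distance (assumed <=).
    selected = [i for d, i in order if d <= radius]
    # max: keep at most max_n observations (assumed: the nearest ones).
    if max_n is not None:
        selected = selected[:max_n]
    # min: too few observations inside the radius.
    if len(selected) < min_n:
        if not force:
            return None  # missing value at this location
        # force: take the nearest min_n observations, disregarding distance.
        selected = [i for _, i in order[:min_n]]
    return selected


obs = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
within = select_neighbours(obs, (0.0, 0.0), radius=4.5)
capped = select_neighbours(obs, (0.0, 0.0), radius=4.5, max_n=2)
too_few = select_neighbours(obs, (0.0, 0.0), radius=4.5, min_n=4)
forced = select_neighbours(obs, (0.0, 0.0), radius=4.5, min_n=4, force=True)
```

In this example the fourth observation lies outside the radius of 4.5, so it is only selected when force overrides the distance limit to satisfy min.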


Edzer Pebesma
1999-08-31