Title: | Local Time Space Kriging |
---|---|
Description: | Implements local spatial and local spatiotemporal Kriging based on local spatial and local spatiotemporal variograms, respectively. The method is documented in Kumar et al (2013) <https://www.nature.com/articles/jes201352)>. |
Authors: | Naresh Kumar, Dong Liang, Jin Chen, Jun Chen |
Maintainer: | Dong Liang <[email protected]> |
License: | GPL-2 |
Version: | 1.0.9 |
Built: | 2025-03-07 04:49:32 UTC |
Source: | https://github.com/dliang-cbl/ltsk |
ltsk library is a collection of programs for implementing local spatial and local spatiotemporal Kriging. Unlike global Kriging, ltsk subsets the sample around a given location and time where predicted is needed; estimates variogram using the subset of sample data. Product-sum model is implemented and automatically estimated using the data points within the local neighbourhood. A unique advantage of ltsk is that it addresses non-stationarity, which is difficult to handle in large spatiotemporal dataset.
Package: | ltsk |
Type: | Package |
Version: | 1.0 |
Date: | 2014-12-31 |
License: | GPL2.0 |
Naresh Kumar ([email protected]) Dong Liang ([email protected]) Jun chen ([email protected]) Jin Chen ([email protected])
Haas, Timothy C. "Local prediction of a spatio-temporal process with an application to wet sulfate deposition." Journal of the American Statistical Association 90.432 (1995): 1189-1199.
Iaco, S. De & Myers, D. E. & Posa, D., 2001. "Space-time analysis using a general product-sum model," Statistics & Probability Letters, Elsevier, vol. 52(1), pages 21-28, March.
Kumar, N., et al. (2013). "Satellite-based PM concentrations and their application to COPD in Cleveland, OH." Journal of Exposure Science and Environmental Epidemiology 23(6): 637-646.
Liang, D. and N. Kumar (2013). "Time-space Kriging to address the spatiotemporal misalignment in the large datasets." Atmospheric Environment 72: 60-69.
A neighbor search implementation based on ANN trees to identify observed data points within a given distance around location and time interval.
dnb(query, obs, th, xcoord='x',ycoord='y',tcoord='t', future=TRUE,by=NULL,nbin=NULL,Large=2000, cl=NULL,cluster=c(1,3))
dnb(query, obs, th, xcoord='x',ycoord='y',tcoord='t', future=TRUE,by=NULL,nbin=NULL,Large=2000, cl=NULL,cluster=c(1,3))
query |
a vector; the x, y coordinates and the time stamp of the query point |
obs |
a matrix; the x, y coordinates and time stamps of the spatiotemporal locations |
th |
a vector; the distance threshold and time lag |
xcoord |
a character constant, the field name for x coordinate in both |
ycoord |
a character constant, the field name for y coordinate in both |
tcoord |
a character constant, the field name for time coordinate in both |
future |
logical, whether include observed spatiotemporal points future in time relative to the query spatiotemporal location. |
by |
a vector of |
nbin |
if |
Large |
a numeric constant, upper limit of neighbor points, beyond which subsampling is performance |
cl |
a parallel cluster object (default NULL means single core) |
cluster |
if cl is a parallel object, the time and space domain in terms of |
Implementation involves first calculating the time lags between query point and observed data (with locational coordinates and time); for observed locations within time lag of query, the function calculates the Euclidean distances between query location and all potential neighbors and select those within specified distance threshold.
The future argument can be used to exclude data in the future in neighbor search. This is useful in an extrapolation application.
A list of vectors, row numbers in the observed data matrix, that are within the given distance threshold and time lag of the query location.
For large dataset, use ANN (for spatial kriging) and Range Tree for spatiotemporal Kriging.
Dong Liang ([email protected])
get.knn
in FNN
data(epa_cl) coords <- c('x','y','t') ii <- dnb(query[1,coords],obs[,coords],c(0.1,10))
data(epa_cl) coords <- c('x','y','t') ii <- dnb(query[1,coords],obs[,coords],c(0.1,10))
Function implements ordinary time and space kriging for large data sets, with automatic product-sum variogram estimation.
ltsk(query, obs, th, xcoord = "x", ycoord = "y", tcoord = "t", zcoord = "z", by=c(0.001,0.001,1),nbin=NULL,byvariog=c(0.5,0.5,5),subset=T,nmax=20, vth = NULL, vlen = NULL, llim = c(3, 3), verbose = T, Large = 2000, future=T, cl = NULL, cluster=c(2,10))
ltsk(query, obs, th, xcoord = "x", ycoord = "y", tcoord = "t", zcoord = "z", by=c(0.001,0.001,1),nbin=NULL,byvariog=c(0.5,0.5,5),subset=T,nmax=20, vth = NULL, vlen = NULL, llim = c(3, 3), verbose = T, Large = 2000, future=T, cl = NULL, cluster=c(2,10))
query |
a data frame containing query spatiotemporal locations for which predictions are needed |
obs |
a data frame containing spatiotemporal locations and observed data |
th |
a vector, distance threshold and time lag to define neighbors of a query point |
xcoord |
a character constant, the field name for x coordinate in both |
ycoord |
a character constant, the field name for y coordinate in both |
tcoord |
a character constant, the field name for time coordinate in both |
zcoord |
a character constant, the field name for data in |
by |
a vector of |
nbin |
if |
byvariog |
a vector of |
subset |
for local kriging: whether subset to observations within estimated time and space range. |
nmax |
for local kriging: the number of nearest observations that should be used for prediction where nearest is defined in terms of semi-variance of local model. By default all used. |
vth |
thersholds for local spatiotemporal varigoram (default 75% of the max lag difference) |
vlen |
numbers of bins for local spatiotemporal varigram(default,space 15, temporal for each day) |
llim |
lower limits for number of regions and intervals with observed data to calculate Kriging (default 3 spatial regions, 3 temporal intervals) |
verbose |
logical, whether print details information |
Large |
a numeric constant, upper limit of neighbor points, beyond which subsampling is performance |
future |
logical, whether including observed points in future relative to query points. |
cl |
a parallel cluster object (default NULL means single core) |
cluster |
if cl is a parallel object, the time and space domain in terms of |
Function implements automatic variogram estimation (when possible) within a local spatiotemporal neighborhoods, and ordinary krigng based on the produce-sum variogram within that neighborhood. An variogram is estimated for each query point to allow for possible non-stationarity in the data generating field.
If the number of neighbors exceeds a user-specified upper limit (Large
), neighbors are sub-sampled in a balanced way to reduce the neighborhood size.
Four variogram models: Gaussian, exponential, spherical and Matern are automatically fit to the empirical space and time variogram in the first lag. The range parameter is estimated from the first distance lag where the empirical variogram exceeds 80% of the maximum. Weighted least square is then used to estimate the nugget and partial sill parameters. Model with minimal residual sum of squares between the empirical and fitted variogram is chosen as the variogram model.
Kriging mean and standard deviation and quality flags.
0 | valid prediction |
1 | not enough temporal neighbors |
2 | not enough spatial neighbors |
3 | not enough neighbors |
4 | variogram could not be fit |
Naresh Kumar ([email protected]) Dong Liang ([email protected])
Haas, Timothy C. "Local prediction of a spatio-temporal process with an application to wet sulfate deposition." Journal of the American Statistical Association 90.432 (1995): 1189-1199.
Iaco, S. De & Myers, D. E. & Posa, D., 2001. "Space-time analysis using a general product-sum model," Statistics & Probability Letters, Elsevier, vol. 52(1), pages 21-28, March.
Kumar, N., et al. (2013). "Satellite-based PM concentrations and their application to COPD in Cleveland, OH." Journal of Exposure Science and Environmental Epidemiology 23(6): 637-646.
Liang, D. and N. Kumar (2013). "Time-space Kriging to address the spatiotemporal misalignment in the large datasets." Atmospheric Environment 72: 60-69.
## load the data data(ex) data(epa_cl) ## apply log transformation obs[,'pr_pm25'] = log(obs[,'pr_pm25']) ## run kriging system.time(out <- ltsk(ex2.query[1:2,],obs,c(0.10,10),zcoord='pr_pm25',verbose=FALSE)) table(out$flag)
## load the data data(ex) data(epa_cl) ## apply log transformation obs[,'pr_pm25'] = log(obs[,'pr_pm25']) ## run kriging system.time(out <- ltsk(ex2.query[1:2,],obs,c(0.10,10),zcoord='pr_pm25',verbose=FALSE)) table(out$flag)
These functions are working R functions that are called by the ltsk function. They should not be directly used.
query and observed data for Cleveland OH
data(epa_cl)
data(epa_cl)