GEONE - Variogram analysis and kriging for data in 1D with non-stationarity

The goal is to interpolate a non-stationary data set in 1D (based on simple or ordinary kriging) in a domain (grid), where non-stationarity are given as:

local factor (multiplier) for the ranges along the main axes of the covariance model
local factor (multiplier) for the total weight (variance, sill) of the covariance model

One or several of these non-stationary features can be considered.

Import what is required

[1]:

import numpy as np
import matplotlib.pyplot as plt
import time

# import package 'geone'
import geone as gn

[2]:

# Show version of python and version of geone
import sys
print(sys.version_info)
print('geone version: ' + gn.__version__)

sys.version_info(major=3, minor=13, micro=7, releaselevel='final', serial=0)
geone version: 1.3.1

Remark

The matplotlib figures can be visualized in interactive mode:

%matplotlib notebook: enable interactive mode
%matplotlib inline: disable interactive mode

I. Build non-stationary features on a grid

Here are built the following non-stationary features on a grid:

r_factor_loc: local factor (multiplier) for the range of the covariance model
w_factor_loc: local factor (multiplier) for the total weight (variance, sill) of the covariance model

Defining domain of simulation (grid)

[3]:

nx = 2000 # number of cells
sx = 0.5  # cell unit
ox = 0.0  # origin

xmin, xmax = ox, ox+nx*sx

# # Set an image with simulation grid geometry defined above, and no variable
# im = gn.img.Img(nx, ny, 1, sx, sy, 1., ox, oy, 0., nv=0)

Defining non-stationarities

Range

The factor (multiplier) for range r_factor locally varying in the grid is defined as im_r_factor: an image (with one variable on the grid).

Variance (sill)

The factor (multiplier) for the total weight (variance, sill) w_factor locally varying in the grid is defined as im_w_factor: an image (with one variable on the grid).

[4]:

# Define r_factor as r0 on the left of the grid and decreasing linearly when going to the right of the grid
# ---------------
r0 = 3.0 # on the left
r1 = 0.2 # minimal value

# "Empty" image
im_r_factor = gn.img.Img(nx, 1, 1, sx, 1., 1., ox, 0., 0., nv=0)

# Add variable
r = r0 + im_r_factor.x()/im_r_factor.x().max() * (r1-r0)
im_r_factor.append_var(r)

# Define w_factor as w0 on the left of the grid and decreasing linearly when going to the right of the grid
# ---------------
w0 = 2.5 # on the left
w1 = 0.1 # minimal value

# "Empty" image
im_w_factor = gn.img.Img(nx, 1, 1, sx, 1., 1., ox, 0., 0., nv=0)

# Add variable
w = w0 + im_w_factor.xx()/im_w_factor.xx().max() * (w1-w0)
im_w_factor.append_var(w)

# Plot
# ----
plt.figure(figsize=(15,5))
plt.plot(im_r_factor.x(), im_r_factor.val[0,0,0], label='r_factor')
plt.plot(im_w_factor.x(), im_w_factor.val[0,0,0], label='w_factor')
plt.legend()
plt.grid()
plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_8_0.png

II Non-stationary simulations

First define a reference covariance model and generate an unconditional simulation (reference). Then build a data set by extract some points from the simulation. Then, ignoring the covariance model and starting from the data set and the maps controlling the non-stationary features:

fit a covariance model
do kriging / conditional simulations

Several cases are proposed below with different non-stationary features.

A. Non-stationary range

Reference covariance model and non-stationarity

Define first a stationary reference covariance model in 1D. Then, add the desired non-stationarity feature.

[5]:

# Define a base covariance model (stationary)
cov_model_base = gn.covModel.CovModel1D(elem=[
    ('cubic', {'w':10, 'r':50.0}), # elementary contribution
    ], name='ref')

# Set list to handle non-stationarities
cov_model_non_stationarity_list = [
    ('multiply_r', im_r_factor.val[0]), # multiply range by `im_r_factor.val[0]` over the grid
]

Do an unconditional simulation (reference)

[6]:

# Simulation
nreal = 1
np.random.seed(2385)
geosclassic_output = gn.geosclassicinterface.simulate(
                        cov_model_base, nx, sx, ox,
                        method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        nneighborMax=64,
                        nreal=nreal,
                        nproc=1, nthreads_per_proc=8)

im_ref = geosclassic_output['image']

plt.figure(figsize=(15,5))
plt.plot(im_ref.x(), im_ref.val[0,0,0])
plt.grid()
plt.show()

simulate: pre-process data done: final number of data points : 0, inequality data points: 0
simulate: computational resources: nproc = 1, nthreads_per_proc = 8, nproc_sgs_at_ineq = 8
simulate: (Step 1-3 skipped) no data
simulate: (Step 4) do sgs (1 realizations) on the grid (at cell centers) using data points at cell centers...
simulate: call `run_MPDSOMPGeosClassicSim` [1 process of 8 thread(s) (OpenMP)] ...
simulate: `run_MPDSOMPGeosClassicSim` [1 process] complete

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_14_1.png

Build a non-stationary data set (1D)

The data set is defined by extracting some points from the simulation of reference.

Note: the data set should contain enough points to catch the non-stationarities.

[7]:

# Extract som points from the simulation
n = 40

# --- Choose 1. or 2. below ---
# # 1. Sampling the image
# ps = gn.img.sampleFromImage(im_ref, n, seed=234)
# # Data points and data value
# x = ps.x()
# v = ps.val[3]
# # Optionally: move points in the grid cells randomly, and add noise to values
# x = x + (np.random.random(n)-0.5)* im_ref.sx
# v = v + (np.random.random(n)-0.5)* 1.e-1

# 2. Using the function interpolating the image values
f = gn.img.Img_interp_func(im_ref, iy=0, iz=0)
np.random.seed(987)
x = im_ref.xmin() + np.random.random(n) * (im_ref.xmax()-im_ref.xmin())
v = f(x)
# ----- #

plt.figure(figsize=(15,5))
plt.plot(im_ref.x(), im_ref.val[0,0,0])
plt.plot(x, v, '+', c='k')
plt.grid()
plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_16_0.png

Simulation starting from a non-stationary data set in 1D and assuming “non-stationarity feature(s)” known

n: number of data points
x: location of data points (1-dimensional array of length n)
v: values at data points (1-dimensional array of length n)
im_r_factor: image of the factor (multiplier) r_factor in the grid; the function r_factor_inv_loc_func (interpolator of the inverse of r_factor in the grid) is built from this image

[8]:

# Set a function interpolating the inverse of the r_factor (given location)
im_tmp = gn.img.copyImg(im_r_factor)
im_tmp.val = 1.0/im_tmp.val
r_factor_inv_loc_func = gn.img.Img_interp_func(im_tmp, ind=0, iy=0, iz=0)
# -> specify iy=0, iz=0: consider only x coordinates along the "line" iy=0 and iz=0

Fitting covariance model accounting for non-stationarity

[9]:

cov_model_to_optimize = gn.covModel.CovModel1D(elem=[
        ('cubic', {'w':np.nan, 'r':np.nan}), # elementary contribution
         ], name='')

t1 = time.time()
cov_model_opt, popt = gn.covModel.covModel1D_fit(
                x, v, cov_model_to_optimize,
                coord_factor_loc_func=r_factor_inv_loc_func, # deal with non-stationarity (multiplier for range)
                loc_m=10, # loc_m > 0: number of sub-intervals btw pair of points to estimate local factor (default 1)
                          # loc_m = 0: take factor from one point
                bounds=([   0,   0],  # min value for param. to fit
                        [ 100, 500]), # max value for param. to fit
                hmax=100, make_plot=False) # figure size for plot
t2 = time.time()
print(f'Fitting covariance model - elapsed time: {t2-t1:.4g}')

cov_model_opt

Fitting covariance model - elapsed time: 0.008378

[9]:

*** CovModel1D object ***
name = ''
number of elementary contribution(s): 1
elementary contribution 0
    type: cubic
    parameters:
        w = 26.219740235828276
        r = 144.1487904701664
*****

Cross-validation of covariance model by leave-one-out error

The function geone.covModel.cross_valid_loo performs a cross-validation test by leave-one-out (LOO) error.

See jupyter notebook ex_vario_analysis_data1D for more details about this function.

[10]:

# Set a function interpolating the r_factor (given location)
r_factor_loc_func = gn.img.Img_interp_func(im_r_factor, ind=0, iy=0, iz=0)
# -> specify iy=0, iz=0: consider only x coordinates along the "line" iy=0 and iz=0

# Set list to handle non-stationarities at x
cov_model_non_stationarity_x_list = [
    ('multiply_r', r_factor_loc_func(x))
]

cv_est1, cv_std1, crps1, crps_def1, pvalue1, success1 = gn.covModel.cross_valid_loo(
                                        x, v, cov_model_opt,
                                        interpolator=gn.covModel.krige,
                                        interpolator_kwargs={'method':'ordinary_kriging'},
                                        cov_model_non_stationarity_x_list=cov_model_non_stationarity_x_list,
                                        print_result=True, make_plot=True, figsize=(15,10), nbins=20)
plt.show()

----- CRPS (negative; the larger, the better) -----
   mean = -0.8029
   def. = -0.9435
----- 1) "Normal law test for mean of normalized error" -----
   p-value = 0.3073
   success = True (wrt significance level 0.05)
      (-> model has no reason to be rejected)
----- 2) "Chi-square test for sum of squares of normalized error" -----
   p-value = 0
   success = False (wrt significance level 0.05)
      -> model should be REJECTED
----- Statistics of normalized error -----
   mean     = 0.1614 (should be close to 0)
   std      = 3.273 (should be close to 1)
   skewness = -0.08016 (should be close to 0)
   excess kurtosis = -0.6062 (should be close to 0)

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_22_1.png

Show experimental variogram, fitted model and reference model

[11]:

hmax = 100.0

hexp, gexp, cexp = gn.covModel.variogramExp1D(x, v, hmax=hmax,
                                              coord_factor_loc_func=r_factor_inv_loc_func, loc_m=10,
                                              ncla=20, make_plot=False)

plt.figure(figsize=(10,5))
gn.covModel.plot_variogramExp1D(hexp, gexp, cexp, c='red', label='vario exp')
cov_model_opt.plot_model(vario=True, hmax=hmax, c='red', label='vario opt')
cov_model_base.plot_model(vario=True, hmax=hmax, c='orange', ls='dashed', label='vario ref')
plt.legend()

plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_24_0.png

Kriging and conditional simulations

[12]:

# Kriging
t1 = time.time() # start time
geosclassic_output = gn.geosclassicinterface.estimate(
                        cov_model_opt, nx, sx, ox,
                        x=x, v=v,
                        method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        nneighborMax=20,
                        nthreads=8)
t2 = time.time() # end time
print(f'Kriging - elapsed time: {t2-t1:.4g} sec')

# Retrieve kriging estimate and standard deviation
im_krig = geosclassic_output['image']

estimate: pre-process data done: final number of data points : 40, inequality data points: 0
estimate: computational resources: nthreads = 8, nproc_sgs_at_ineq = 8
estimate: (Step 1) no inequality data
estimate: (Step 2) set new dataset gathering data and inequality data locations...
estimate: (Step 3) do kriging at the center of grid cells containing at least one data point...
estimate: (Step 4) do kriging on the grid (at cell centers) using data points at cell centers...
estimate: call `run_MPDSOMPGeosClassicSim` [1 process of 8 thread(s) (OpenMP)] ...
estimate: `run_MPDSOMPGeosClassicSim` [1 process] complete
Kriging - elapsed time: 0.02475 sec

[13]:

# Simulation
nreal = 1000
np.random.seed(22131)
t1 = time.time() # start time
geosclassic_output = gn.geosclassicinterface.simulate(
                        cov_model_opt, nx, sx, ox,
                        x=x, v=v, method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        nneighborMax=20,
                        nreal=nreal,
                        nproc=4, nthreads_per_proc=4)
t2 = time.time() # end time
print(f'{nreal} simul. - elapsed time: {t2-t1:.4g} sec')

# Retrieve the realizations
simul = geosclassic_output['image']

# Compute mean and standard deviation (pixel-wise)
simul_mean = gn.img.imageContStat(simul, op='mean')
simul_std = gn.img.imageContStat(simul, op='std')

simulate: pre-process data done: final number of data points : 40, inequality data points: 0
simulate: computational resources: nproc = 4, nthreads_per_proc = 4, nproc_sgs_at_ineq = 16
simulate: (Step 1) no inequality data
simulate: (Step 2) set new dataset gathering data and inequality data locations...
simulate: (Step 3) do kriging at the center of grid cells containing at least one data point...
simulate: (Step 4) do sgs (1000 realizations) on the grid (at cell centers) using data points at cell centers...
simulate: call `run_MPDSOMPGeosClassicSim` [4 process(es) of 4 thread(s) (OpenMP)] ...
simulate: `run_MPDSOMPGeosClassicSim` [4 process(es)] complete
1000 simul. - elapsed time: 4.33 sec

[14]:

# Plot the first simulations and the results of estimation
plt.figure(figsize=(15,5))

plt.plot(im_ref.x(), im_ref.val[0,0,0,:], c='red', ls='dashed', label='ref')

for i in range(3):
    plt.plot(simul.x(), simul.val[i,0,0,:], label='real #{}'.format(i+1))

plt.plot(im_krig.x(), im_krig.val[0,0,0,:], c='black', ls='dashed', label='estimate (krig.)')
plt.fill_between(im_krig.x(),
                 im_krig.val[0,0,0,:] - im_krig.val[1,0,0,:],
                 im_krig.val[0,0,0,:] + im_krig.val[1,0,0,:],
                 color='gray', alpha=.5, label='estim.  +/- sd (krig.)')

plt.fill_between(simul.x(), simul.val[:,0,0,:].min(axis=0), simul.val[:,0,0,:].max(axis=0),
                 color='red', alpha=.2, label='min-max ({} real)'.format(nreal))

plt.plot(x, v, '+', c='k', markersize=10, label='data') # add conditioning points

plt.grid()
plt.legend()
plt.title(f'Results ({nreal} sim.)')
plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_28_0.png

B. Non-stationary variance (sill)

Reference covariance model and non-stationarity

Define first a stationary reference covariance model in 1D. Then, add the desired non-stationarity feature.

[15]:

# Define a base covariance model (stationary)
cov_model_base = gn.covModel.CovModel1D(elem=[
    ('cubic', {'w':10, 'r':50.0}), # elementary contribution
    ], name='ref')

# Set list to handle non-stationarities
cov_model_non_stationarity_list = [
    ('multiply_w', im_w_factor.val[0]), # multiply weight by `im_w_factor.val[0]` over the grid
]

Do an unconditional simulation (reference)

[16]:

# Simulation
nreal = 1
np.random.seed(2485)
geosclassic_output = gn.geosclassicinterface.simulate(
                        cov_model_base, nx, sx, ox,
                        method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        nneighborMax=64,
                        nreal=nreal,
                        nproc=1, nthreads_per_proc=8)

im_ref = geosclassic_output['image']

plt.figure(figsize=(15,5))
plt.plot(im_ref.x(), im_ref.val[0,0,0])
plt.grid()
plt.show()

simulate: pre-process data done: final number of data points : 0, inequality data points: 0
simulate: computational resources: nproc = 1, nthreads_per_proc = 8, nproc_sgs_at_ineq = 8
simulate: (Step 1-3 skipped) no data
simulate: (Step 4) do sgs (1 realizations) on the grid (at cell centers) using data points at cell centers...
simulate: call `run_MPDSOMPGeosClassicSim` [1 process of 8 thread(s) (OpenMP)] ...
simulate: `run_MPDSOMPGeosClassicSim` [1 process] complete

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_33_1.png

Build a non-stationary data set (1D)

The data set is defined by extracting some points from the simulation of reference.

Note: the data set should contain enough points to catch the non-stationarities.

[17]:

# Extract som points from the simulation
n = 40

# --- Choose 1. or 2. below ---
# # 1. Sampling the image
# ps = gn.img.sampleFromImage(im_ref, n, seed=234)
# # Data points and data value
# x = ps.x()
# v = ps.val[3]
# # Optionally: move points in the grid cells randomly, and add noise to values
# x = x + (np.random.random(n)-0.5)* im_ref.sx
# v = v + (np.random.random(n)-0.5)* 1.e-1

# 2. Using the function interpolating the image values
f = gn.img.Img_interp_func(im_ref, iy=0, iz=0)
np.random.seed(987)
x = im_ref.xmin() + np.random.random(n) * (im_ref.xmax()-im_ref.xmin())
v = f(x)
# ----- #

plt.figure(figsize=(15,5))
plt.plot(im_ref.x(), im_ref.val[0,0,0])
plt.plot(x, v, '+', c='k')
plt.grid()
plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_35_0.png

Simulation starting from a non-stationary data set in 1D and assuming “non-stationarity feature(s)” known

n: number of data points
x: location of data points (1-dimensional array of length n)
v: values at data points (1-dimensional array of length n)
im_w_factor: image of the factor (multiplier) w_factor in the grid; the function w_factor_inv_loc_func (interpolator of the inverse of w_factor in the grid) is built from this image

[18]:

# Set a function interpolating the inverse of the w_factor (given location)
im_tmp = gn.img.copyImg(im_w_factor)
im_tmp.val = 1.0/im_tmp.val
w_factor_inv_loc_func = gn.img.Img_interp_func(im_tmp, ind=0, iy=0, iz=0)
# -> specify iy=0, iz=0: consider only x coordinates along the "line" iy=0 and iz=0

Fitting covariance model accounting for non-stationarity

[19]:

cov_model_to_optimize = gn.covModel.CovModel1D(elem=[
        ('cubic', {'w':np.nan, 'r':np.nan}), # elementary contribution
         ], name='')

t1 = time.time()
cov_model_opt, popt = gn.covModel.covModel1D_fit(
                x, v, cov_model_to_optimize,
                w_factor_loc_func=w_factor_inv_loc_func, # deal with non-stationarity (multiplier for weight)
                loc_m=10, # loc_m > 0: number of sub-intervals btw pair of points to estimate local factor (default 1)
                          # loc_m = 0: take factor from one point
                bounds=([   0,   0],  # min value for param. to fit
                        [ 100, 500]), # max value for param. to fit
                hmax=100, make_plot=False) # figure size for plot
t2 = time.time()
print(f'Fitting covariance model - elapsed time: {t2-t1:.4g}')

cov_model_opt

Fitting covariance model - elapsed time: 0.00691

[19]:

*** CovModel1D object ***
name = ''
number of elementary contribution(s): 1
elementary contribution 0
    type: cubic
    parameters:
        w = 21.250098976072714
        r = 81.84710526696728
*****

Cross-validation of covariance model by leave-one-out error

The function geone.covModel.cross_valid_loo performs a cross-validation test by leave-one-out (LOO) error.

See jupyter notebook ex_vario_analysis_data1D for more details about this function.

[20]:

# Set a function interpolating the w_factor (given location)
w_factor_loc_func = gn.img.Img_interp_func(im_w_factor, ind=0, iy=0, iz=0)
# -> specify iy=0, iz=0: consider only x coordinates along the "line" iy=0 and iz=0

# Set list to handle non-stationarities at x
cov_model_non_stationarity_x_list = [
    ('multiply_w', w_factor_loc_func(x))
]

cv_est1, cv_std1, crps1, crps_def1, pvalue1, success1 = gn.covModel.cross_valid_loo(
                                        x, v, cov_model_opt,
                                        interpolator=gn.covModel.krige,
                                        interpolator_kwargs={'method':'ordinary_kriging'},
                                        cov_model_non_stationarity_x_list=cov_model_non_stationarity_x_list,
                                        print_result=True, make_plot=True, figsize=(15,10), nbins=20)
plt.show()

----- CRPS (negative; the larger, the better) -----
   mean = -1.149
   def. = -1.151
----- 1) "Normal law test for mean of normalized error" -----
   p-value = 0.504
   success = True (wrt significance level 0.05)
      (-> model has no reason to be rejected)
----- 2) "Chi-square test for sum of squares of normalized error" -----
   p-value = 0.0003299
   success = False (wrt significance level 0.05)
      -> model should be REJECTED
----- Statistics of normalized error -----
   mean     = -0.1057 (should be close to 0)
   std      = 1.389 (should be close to 1)
   skewness = -0.0118 (should be close to 0)
   excess kurtosis = 0.2549 (should be close to 0)

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_41_1.png

Show experimental variogram, fitted model and reference model

[21]:

hmax = 200.0

hexp, gexp, cexp = gn.covModel.variogramExp1D(x, v, hmax=hmax,
                                              w_factor_loc_func=w_factor_inv_loc_func, loc_m=10,
                                              ncla=10, make_plot=False)

plt.figure(figsize=(10,5))
gn.covModel.plot_variogramExp1D(hexp, gexp, cexp, c='red', label='vario exp')
cov_model_opt.plot_model(vario=True, hmax=hmax, c='red', label='vario opt')
cov_model_base.plot_model(vario=True, hmax=hmax, c='orange', ls='dashed', label='vario ref')
plt.legend()

plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_43_0.png

Kriging and conditional simulations

[22]:

# Kriging
t1 = time.time() # start time
geosclassic_output = gn.geosclassicinterface.estimate(
                        cov_model_opt, nx, sx, ox,
                        x=x, v=v,
                        method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        searchRadius=200, nneighborMax=20,
                        nthreads=8)
t2 = time.time() # end time
print(f'Kriging - elapsed time: {t2-t1:.4g} sec')

# Retrieve kriging estimate and standard deviation
im_krig = geosclassic_output['image']

estimate: pre-process data done: final number of data points : 40, inequality data points: 0
estimate: computational resources: nthreads = 8, nproc_sgs_at_ineq = 8
estimate: (Step 1) no inequality data
estimate: (Step 2) set new dataset gathering data and inequality data locations...
estimate: (Step 3) do kriging at the center of grid cells containing at least one data point...
estimate: (Step 4) do kriging on the grid (at cell centers) using data points at cell centers...
estimate: call `run_MPDSOMPGeosClassicSim` [1 process of 8 thread(s) (OpenMP)] ...
estimate: `run_MPDSOMPGeosClassicSim` [1 process] complete
Kriging - elapsed time: 0.02381 sec

[23]:

# Simulation
nreal = 1000
np.random.seed(22131)
t1 = time.time() # start time
geosclassic_output = gn.geosclassicinterface.simulate(
                        cov_model_opt, nx, sx, ox,
                        x=x, v=v, method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        searchRadius=200, nneighborMax=20,
                        nreal=nreal,
                        nproc=4, nthreads_per_proc=4)
t2 = time.time() # end time
print(f'{nreal} simul. - elapsed time: {t2-t1:.4g} sec')

# Retrieve the realizations
simul = geosclassic_output['image']

# Compute mean and standard deviation (pixel-wise)
simul_mean = gn.img.imageContStat(simul, op='mean')
simul_std = gn.img.imageContStat(simul, op='std')

simulate: pre-process data done: final number of data points : 40, inequality data points: 0
simulate: computational resources: nproc = 4, nthreads_per_proc = 4, nproc_sgs_at_ineq = 16
simulate: (Step 1) no inequality data
simulate: (Step 2) set new dataset gathering data and inequality data locations...
simulate: (Step 3) do kriging at the center of grid cells containing at least one data point...
simulate: (Step 4) do sgs (1000 realizations) on the grid (at cell centers) using data points at cell centers...
simulate: call `run_MPDSOMPGeosClassicSim` [4 process(es) of 4 thread(s) (OpenMP)] ...
simulate: `run_MPDSOMPGeosClassicSim` [4 process(es)] complete
1000 simul. - elapsed time: 4.056 sec

[24]:

# Plot the first simulations and the results of estimation
plt.figure(figsize=(15,5))

plt.plot(im_ref.x(), im_ref.val[0,0,0,:], c='red', ls='dashed', label='ref')

for i in range(3):
    plt.plot(simul.x(), simul.val[i,0,0,:], label='real #{}'.format(i+1))

plt.plot(im_krig.x(), im_krig.val[0,0,0,:], c='black', ls='dashed', label='estimate (krig.)')
plt.fill_between(im_krig.x(),
                 im_krig.val[0,0,0,:] - im_krig.val[1,0,0,:],
                 im_krig.val[0,0,0,:] + im_krig.val[1,0,0,:],
                 color='gray', alpha=.5, label='estim.  +/- sd (krig.)')

plt.fill_between(simul.x(), simul.val[:,0,0,:].min(axis=0), simul.val[:,0,0,:].max(axis=0),
                 color='red', alpha=.2, label='min-max ({} real)'.format(nreal))

plt.plot(x, v, '+', c='k', markersize=10, label='data') # add conditioning points

plt.grid()
plt.legend()
plt.title(f'Results ({nreal} sim.)')
plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_47_0.png

C. Non-stationary for range and variance (sill)

Reference covariance model and non-stationarity

Define first a stationary reference covariance model in 1D. Then, add the desired non-stationarity feature.

[25]:

# Define a base covariance model (stationary)
cov_model_base = gn.covModel.CovModel1D(elem=[
    ('cubic', {'w':10, 'r':50.0}), # elementary contribution
    ], name='ref')

# Set list to handle non-stationarities
cov_model_non_stationarity_list = [
    ('multiply_w', im_w_factor.val[0]), # multiply weight by `im_w_factor.val[0]` over the grid
    ('multiply_r', im_r_factor.val[0]), # multiply range by `im_r_factor.val[0]` over the grid
]

Do an unconditional simulation (reference)

[26]:

# Simulation
nreal = 1
np.random.seed(2485)
geosclassic_output = gn.geosclassicinterface.simulate(
                        cov_model_base, nx, sx, ox,
                        method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        nneighborMax=64,
                        nreal=nreal,
                        nproc=1, nthreads_per_proc=8)

im_ref = geosclassic_output['image']

plt.figure(figsize=(15,5))
plt.plot(im_ref.x(), im_ref.val[0,0,0])
plt.grid()
plt.show()

simulate: pre-process data done: final number of data points : 0, inequality data points: 0
simulate: computational resources: nproc = 1, nthreads_per_proc = 8, nproc_sgs_at_ineq = 8
simulate: (Step 1-3 skipped) no data
simulate: (Step 4) do sgs (1 realizations) on the grid (at cell centers) using data points at cell centers...
simulate: call `run_MPDSOMPGeosClassicSim` [1 process of 8 thread(s) (OpenMP)] ...
simulate: `run_MPDSOMPGeosClassicSim` [1 process] complete

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_52_1.png

Build a non-stationary data set (1D)

The data set is defined by extracting some points from the simulation of reference.

Note: the data set should contain enough points to catch the non-stationarities.

[27]:

# Extract som points from the simulation
n = 70

# --- Choose 1. or 2. below ---
# # 1. Sampling the image
# ps = gn.img.sampleFromImage(im_ref, n, seed=234)
# # Data points and data value
# x = ps.x()
# v = ps.val[3]
# # Optionally: move points in the grid cells randomly, and add noise to values
# x = x + (np.random.random(n)-0.5)* im_ref.sx
# v = v + (np.random.random(n)-0.5)* 1.e-1

# 2. Using the function interpolating the image values
f = gn.img.Img_interp_func(im_ref, iy=0, iz=0)
np.random.seed(4253)
x = im_ref.xmin() + np.random.random(n) * (im_ref.xmax()-im_ref.xmin())
v = f(x)
# ----- #

plt.figure(figsize=(15,5))
plt.plot(im_ref.x(), im_ref.val[0,0,0])
plt.plot(x, v, '+', c='k')
plt.grid()
plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_54_0.png

Simulation starting from a non-stationary data set in 1D and assuming “non-stationarity feature(s)” known

n: number of data points
x: location of data points (1-dimensional array of length n)
v: values at data points (1-dimensional array of length n)
im_r_factor: image of the factor (multiplier) r_factor in the grid; the function r_factor_inv_loc_func (interpolator of the inverse of r_factor in the grid) is built from this image
im_w_factor: image of the factor (multiplier) w_factor in the grid; the function w_factor_inv_loc_func (interpolator of the inverse of w_factor in the grid) is built from this image

[28]:

# Set a function interpolating the inverse of the w_factor (given location)
im_tmp = gn.img.copyImg(im_w_factor)
im_tmp.val = 1.0/im_tmp.val
w_factor_inv_loc_func = gn.img.Img_interp_func(im_tmp, ind=0, iy=0, iz=0)
# -> specify iy=0, iz=0: consider only x coordinates along the "line" iy=0 and iz=0

# Set a function interpolating the inverse of the r_factor (given location)
im_tmp = gn.img.copyImg(im_r_factor)
im_tmp.val = 1.0/im_tmp.val
r_factor_inv_loc_func = gn.img.Img_interp_func(im_tmp, ind=0, iy=0, iz=0)
# -> specify iy=0, iz=0: consider only x coordinates along the "line" iy=0 and iz=0

Fitting covariance model accounting for non-stationarity

[29]:

cov_model_to_optimize = gn.covModel.CovModel1D(elem=[
        ('cubic', {'w':np.nan, 'r':np.nan}), # elementary contribution
         ], name='')

t1 = time.time()
cov_model_opt, popt = gn.covModel.covModel1D_fit(
                x, v, cov_model_to_optimize,
                w_factor_loc_func=w_factor_inv_loc_func,     # deal with non-stationarity (multiplier for weight)
                coord_factor_loc_func=r_factor_inv_loc_func, # deal with non-stationarity (multiplier for range)
                loc_m=10, # loc_m > 0: number of sub-intervals btw pair of points to estimate local factor (default 1)
                          # loc_m = 0: take factor from one point
                bounds=([   0,   0],  # min value for param. to fit
                        [ 100, 500]), # max value for param. to fit
                hmax=100, make_plot=False) # figure size for plot
t2 = time.time()
print(f'Fitting covariance model - elapsed time: {t2-t1:.4g}')

cov_model_opt

Fitting covariance model - elapsed time: 0.02022

[29]:

*** CovModel1D object ***
name = ''
number of elementary contribution(s): 1
elementary contribution 0
    type: cubic
    parameters:
        w = 8.450478582025847
        r = 52.9933842174848
*****

Cross-validation of covariance model by leave-one-out error

The function geone.covModel.cross_valid_loo performs a cross-validation test by leave-one-out (LOO) error.

See jupyter notebook ex_vario_analysis_data1D for more details about this function.

[30]:

# Set a function interpolating the w_factor (given location)
w_factor_loc_func = gn.img.Img_interp_func(im_w_factor, ind=0, iy=0, iz=0)
# Set a function interpolating the w_factor (given location)
r_factor_loc_func = gn.img.Img_interp_func(im_r_factor, ind=0, iy=0, iz=0)
# -> specify iy=0, iz=0: consider only x coordinates along the "line" iy=0 and iz=0

# Set list to handle non-stationarities at x
cov_model_non_stationarity_x_list = [
    ('multiply_w', w_factor_loc_func(x)),
    ('multiply_r', r_factor_loc_func(x))
]

cv_est1, cv_std1, crps1, crps_def1, pvalue1, success1 = gn.covModel.cross_valid_loo(
                                        x, v, cov_model_opt,
                                        interpolator=gn.covModel.krige,
                                        interpolator_kwargs={'method':'ordinary_kriging'},
                                        cov_model_non_stationarity_x_list=cov_model_non_stationarity_x_list,
                                        print_result=True, make_plot=True, figsize=(15,10), nbins=20)
plt.show()

----- CRPS (negative; the larger, the better) -----
   mean = -0.39
   def. = -0.8687
----- 1) "Normal law test for mean of normalized error" -----
   p-value = 0.745
   success = True (wrt significance level 0.05)
      (-> model has no reason to be rejected)
----- 2) "Chi-square test for sum of squares of normalized error" -----
   p-value = 0.002184
   success = False (wrt significance level 0.05)
      -> model should be REJECTED
----- Statistics of normalized error -----
   mean     = -0.03887 (should be close to 0)
   std      = 1.244 (should be close to 1)
   skewness = -0.03617 (should be close to 0)
   excess kurtosis = 0.5304 (should be close to 0)

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_60_1.png

Show experimental variogram, fitted model and reference model

[31]:

hmax = 200.0

hexp, gexp, cexp = gn.covModel.variogramExp1D(x, v, hmax=hmax,
                                              w_factor_loc_func=w_factor_inv_loc_func,
                                              coord_factor_loc_func=r_factor_inv_loc_func, loc_m=10,
                                              ncla=10, make_plot=False)

plt.figure(figsize=(10,5))
gn.covModel.plot_variogramExp1D(hexp, gexp, cexp, c='red', label='vario exp')
cov_model_opt.plot_model(vario=True, hmax=hmax, c='red', label='vario opt')
cov_model_base.plot_model(vario=True, hmax=hmax, c='orange', ls='dashed', label='vario ref')
plt.legend()

plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_62_0.png

Kriging and conditional simulations

[32]:

# Kriging
t1 = time.time() # start time
geosclassic_output = gn.geosclassicinterface.estimate(
                        cov_model_opt, nx, sx, ox,
                        x=x, v=v,
                        method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        nneighborMax=20,
                        nthreads=8)
t2 = time.time() # end time
print(f'Kriging - elapsed time: {t2-t1:.4g} sec')

# Retrieve kriging estimate and standard deviation
im_krig = geosclassic_output['image']

estimate: pre-process data done: final number of data points : 68, inequality data points: 0
estimate: computational resources: nthreads = 8, nproc_sgs_at_ineq = 8
estimate: (Step 1) no inequality data
estimate: (Step 2) set new dataset gathering data and inequality data locations...
estimate: (Step 3) do kriging at the center of grid cells containing at least one data point...
estimate: (Step 4) do kriging on the grid (at cell centers) using data points at cell centers...
estimate: call `run_MPDSOMPGeosClassicSim` [1 process of 8 thread(s) (OpenMP)] ...
estimate: `run_MPDSOMPGeosClassicSim` [1 process] complete
Kriging - elapsed time: 0.03166 sec

[33]:

# Simulation
nreal = 1000
np.random.seed(22131)
t1 = time.time() # start time
geosclassic_output = gn.geosclassicinterface.simulate(
                        cov_model_opt, nx, sx, ox,
                        x=x, v=v, method='ordinary_kriging',
                        cov_model_non_stationarity_list=cov_model_non_stationarity_list,
                        nneighborMax=20,
                        nreal=nreal,
                        nproc=4, nthreads_per_proc=4)
t2 = time.time() # end time
print(f'{nreal} simul. - elapsed time: {t2-t1:.4g} sec')

# Retrieve the realizations
simul = geosclassic_output['image']

# Compute mean and standard deviation (pixel-wise)
simul_mean = gn.img.imageContStat(simul, op='mean')
simul_std = gn.img.imageContStat(simul, op='std')

simulate: pre-process data done: final number of data points : 68, inequality data points: 0
simulate: computational resources: nproc = 4, nthreads_per_proc = 4, nproc_sgs_at_ineq = 16
simulate: (Step 1) no inequality data
simulate: (Step 2) set new dataset gathering data and inequality data locations...
simulate: (Step 3) do kriging at the center of grid cells containing at least one data point...
simulate: (Step 4) do sgs (1000 realizations) on the grid (at cell centers) using data points at cell centers...
simulate: call `run_MPDSOMPGeosClassicSim` [4 process(es) of 4 thread(s) (OpenMP)] ...
simulate: `run_MPDSOMPGeosClassicSim` [4 process(es)] complete
1000 simul. - elapsed time: 4.179 sec

[34]:

# Plot the first simulations and the results of estimation
plt.figure(figsize=(15,5))

plt.plot(im_ref.x(), im_ref.val[0,0,0,:], c='red', ls='dashed', label='ref')

for i in range(3):
    plt.plot(simul.x(), simul.val[i,0,0,:], label='real #{}'.format(i+1))

plt.plot(im_krig.x(), im_krig.val[0,0,0,:], c='black', ls='dashed', label='estimate (krig.)')
plt.fill_between(im_krig.x(),
                 im_krig.val[0,0,0,:] - im_krig.val[1,0,0,:],
                 im_krig.val[0,0,0,:] + im_krig.val[1,0,0,:],
                 color='gray', alpha=.5, label='estim.  +/- sd (krig.)')

plt.fill_between(simul.x(), simul.val[:,0,0,:].min(axis=0), simul.val[:,0,0,:].max(axis=0),
                 color='red', alpha=.2, label='min-max ({} real)'.format(nreal))

plt.plot(x, v, '+', c='k', markersize=10, label='data') # add conditioning points

plt.grid()
plt.legend()
plt.title(f'Results ({nreal} sim.)')
plt.show()

../_images/notebooks_ex_vario_analysis_data1D_2_non_stationary_66_0.png