GEONE - DEESSE - Multivariate simulations (II)

Main points addressed

deesse simulations with a non-stationary training image
how to deal with non-stationarity by using an auxiliary variable (bivariate simulations)

Import what is required

[1]:

import numpy as np
import matplotlib.pyplot as plt
import time
import os

# import package 'geone'
import geone as gn

[2]:

# Show version of python and version of geone
import sys
print(sys.version_info)
print('geone version: ' + gn.__version__)

sys.version_info(major=3, minor=13, micro=7, releaselevel='final', serial=0)
geone version: 1.3.0

Non stationary training image (TI)

Read the univariate training image and plot it.

[3]:

# Read file
data_dir = 'data'
filename = os.path.join(data_dir, 'ti3.txt')
ti = gn.img.readImageTxt(filename)

# Values taken in ti
ti.get_unique()

[3]:

array([0., 1., 2.])

Plot the image using the function geone.imgplot.drawImage2D. This image is clearly non-stationary: the density of “orange balls” depends on a vertical trend.

[4]:

categ_val = [0, 1, 2]
categ_col = ['lightblue', 'darkgreen', 'orange']

plt.figure(figsize=(5,7))
gn.imgplot.drawImage2D(ti, categ=True, categVal=categ_val, categCol=categ_col)
plt.title('TI')
plt.show()

../_images/notebooks_ex_deesse_09_multivariate_sim2_7_0.png

Simulation grid

Define the simulation grid (number of cells in each direction, cell unit, origin).

[5]:

nx, ny, nz = 300, 400, 1         # number of cells
sx, sy, sz = ti.sx, ti.sy, ti.sz # cell unit
ox, oy, oz = 0.0, 0.0, 0.0       # origin (corner of the "first" grid cell)

Univariate simulation (not accounting for non-stationarity)

Fill the input structure for deesse and launch deesse

[6]:

deesse_input = gn.deesseinterface.DeesseInput(
    nx=nx, ny=ny, nz=nz,
    sx=sx, sy=sy, sz=sz,
    ox=ox, oy=oy, oz=oz,
    nv=1, varname='code',
    TI=ti,
    distanceType='categorical',
    nneighboringNode=24,
    distanceThreshold=0.02,
    maxScanFraction=0.15,
    npostProcessingPathMax=1,
    seed=444,
    nrealization=2)

# Run deesse
t1 = time.time() # start time
deesse_output = gn.deesseinterface.deesseRun_mp(deesse_input, nproc=2, nthreads_per_proc=8)
t2 = time.time() # end time
print(f'Elapsed time: {t2-t1:.2g} sec')

deesseRun_mp: DeeSse running on 2 process(es)... [VERSION 3.2 / BUILD NUMBER 20230914 / OpenMP 8 thread(s)]
deesseRun_mp: DeeSse run complete (all process(es))
Elapsed time: 5.3 sec

One observes on the results that the non-stationarity is uncontrolled: region with high density of “orange balls” can be found anywhere.

[7]:

# Retrieve the results
sim = deesse_output['sim']

# Display
plt.subplots(1, 2, figsize=(11,7), sharey=True) # 1 x 2 sub-plots
for i in range(2):
    plt.subplot(1, 2, i+1) # select next sub-plot
    gn.imgplot.drawImage2D(sim[i], categ=True, categVal=categ_val, categCol=categ_col,
                           title=f'Real. #{i} (not accounting for non-stationarity)')
plt.show()

../_images/notebooks_ex_deesse_09_multivariate_sim2_13_0.png

Dealing with non-stationarity

To deal with the non-stationarity, the idea is to use an auxiliary variable to describe the non-stationarity in the TI. In this example, a simple vertical trend is sufficient.

Define an auxiliary variable in the TI describing its non-stationarity

[8]:

# Get the y coordinates of the centers of grid cell (meshgrid)
yy = ti.yy()

# Define an auxiliary variable describing the trend (non stationarity)
# - normalize yy
aux_var = (yy - np.min(yy))/(np.max(yy) - np.min(yy)) # vertical trend

# Insert the new variable (index 0)
ti.insert_var(aux_var, varname='aux')
ti

[8]:

*** Img object ***
name = 'data/ti3.txt'
(nx, ny, nz) = (250, 500, 1) # number of cells along each axis
(sx, sy, sz) = (1.0, 1.0, 1.0) # cell size (spacing) along each axis
(ox, oy, oz) = (0.0, 0.0, 0.0) # origin (coordinates of bottom-lower-left corner)
nv = 2  # number of variable(s)
varname = ['aux', np.str_('code')]
val: (2, 1, 500, 250)-array
*****

[9]:

# Display
plt.subplots(1, 2, figsize=(11,7), sharey=True) # 1 x 2 sub-plots

plt.subplot(1,2,1)
gn.imgplot.drawImage2D(ti, iv=0, title=f'TI, 1st var: {ti.varname[0]}')

plt.subplot(1,2,2)
gn.imgplot.drawImage2D(ti, iv=1, categ=True, categVal=categ_val, categCol=categ_col,
                       title=f'TI, 2nd var: {ti.varname[1]}')

plt.show()

../_images/notebooks_ex_deesse_09_multivariate_sim2_16_0.png

Bivariate simulation (accounting for non stationarity)

Build an auxiliary variable in the simulation grid

An auxiliary variable is built on the simulation grid to control the non-stationarity. This auxiliary variable is of the same nature as the auxiliary variable on the TI (same range of values) and allows to describe which types of pattern for the original variable will be simulated in which regions of the simulation grid. In this example, patterns with high density of “orange balls” will be simulated where the auxiliary variable has high values (close to one).

Below is a piece of python code to build an auxiliary variable on the simulation grid.

[10]:

# Set an image with simulation grid geometry defined above, and no variable
im = gn.img.Img(nx, ny, nz, sx, sy, sz, ox, oy, oz, nv=0)

# Get the x and y coordinates of the centers of grid cell (meshgrid)
xx = im.xx()
yy = im.yy()

# Define an auxiliary variable describing (controlling) the trend (non stationarity) in the simulation grid
aux = np.sqrt((xx - im.xmin())**2 + (yy - im.ymax())**2)
aux = 1.0 - (aux - np.min(aux))/(np.max(aux) - np.min(aux))

# Set variable aux in image im
im.append_var(aux, varname='aux')

# Display
plt.figure(figsize=(7,7))
gn.imgplot.drawImage2D(im, title=f'Sim. grid: {im.varname[0]}')
plt.show()

../_images/notebooks_ex_deesse_09_multivariate_sim2_18_0.png

Fill the input structure for deesse and launch deesse

Then a bivariate simulation is done, with the auxiliary variable and the original (categorical variable). The auxiliary variable is exhaustively known on the simulation grid, and then it is set as hard data. It can be passed to deesse as an image (keyword argument dataImage).

Notes:

For the auxiliary variable which is exhaustively known, it can be sufficient to specify a small number of neighbors. Furthermore, one can also relax the corresponding acceptation threshold, so that the auxiliary variable does not filter out sampled region in the TI too strictly.
As the auxiliary variable is exhaustively known, it is not necessary to retrieve it as output: the corresponding flag, keyword argument outputVarFlag, can be set to False.
Deesse can run faster if the auxiliary variable is set as the first one (index 0).

[11]:

deesse_input = gn.deesseinterface.DeesseInput(
    nx=nx, ny=ny, nz=nz,
    sx=sx, sy=sy, sz=sz,
    ox=ox, oy=oy, oz=oz,
    nv=2, varname=['aux', 'code'], # number of variable(s), name of the variable(s)
    TI=ti,
    dataImage=im,                 # set auxiliary variable as hard data (image defined above)
    outputVarFlag=[False, True],  # set which variable will be retrieved in output
    distanceType=['continuous', 'categorical'], # distance type for each variable
    nneighboringNode=[1, 24],     # max. number of neighbors (for the patterns), for each variable
    distanceThreshold=[.1,.02],   # acceptation threshold (for distance between patterns), for each variable
    maxScanFraction=0.15,
    npostProcessingPathMax=1,
    seed=444,
    nrealization=2)

# Run deesse
t1 = time.time() # start time
deesse_output = gn.deesseinterface.deesseRun_mp(deesse_input, nproc=2, nthreads_per_proc=8)
t2 = time.time() # end time
print(f'Elapsed time: {t2-t1:.2g} sec')

deesseRun_mp: DeeSse running on 2 process(es)... [VERSION 3.2 / BUILD NUMBER 20230914 / OpenMP 8 thread(s)]
deesseRun_mp: DeeSse run complete (all process(es))
Elapsed time: 6.2 sec

[12]:

# Retrieve the results
sim = deesse_output['sim'] # list of realization (image), each with one variable

# Display
plt.subplots(1, 3, figsize=(15,5), sharey=True) # 1 x 3 sub-plots

plt.subplot(1,3,1)
gn.imgplot.drawImage2D(im, title='Trend / auxiliary variable (1st var)')

for i in range(2):
    plt.subplot(1, 3, i+2) # select next sub-plot
    gn.imgplot.drawImage2D(sim[i], categ=True, categVal=categ_val, categCol=categ_col,
                           title=f'Real. #{i} (controlling the non-stationarity)')

plt.show()

../_images/notebooks_ex_deesse_09_multivariate_sim2_21_0.png

Adding conditioning data for the original variable

Of course, some hard data can be handled for the original (categorical) variable.

Define some hard data (point set).

[13]:

npt = 5 # number of points
nv = 4  # number of variables including x, y, z coordinates
varname = ['x', 'y', 'z', 'code'] # list of variable names
v = np.array([
    [  10.5,   10.5, 0.5, 0], # x, y, z, code: 1st point
    [ 100.5,   20.5, 0.5, 1], # ...
    [  60.5,  200.5, 0.5, 2],
    [ 120.5,  260.5, 0.5, 2],
    [ 276.5,  354.5, 0.5, 1]
    ]).T # variable values: (nv, npt)-array
hd = gn.img.PointSet(npt=npt, nv=nv, varname=varname, val=v)

# Figure
# ------
# Get the colors for values of the variable of index 3 in the point set,
# according to the color settings used for the TI
hd_col = gn.imgplot.get_colors_from_values(hd.val[3], categ=True, categVal=categ_val, categCol=categ_col)

# Plot
plt.figure(figsize=(8,5))

# Plot empty simulation grid and specify colors
gn.imgplot.drawImage2D(im, plot_empty_grid=True, categ=True, categVal=categ_val, categCol=categ_col)

# Add hard data points
plt.scatter(hd.x(), hd.y(), marker='o', s=50, color=hd_col, edgecolors='black', linewidths=1)

plt.title(f'Hard data ({hd.varname[3]})')
plt.show()

../_images/notebooks_ex_deesse_09_multivariate_sim2_23_0.png

Fill the input structure for deesse and launch deesse

[14]:

nreal = 20
deesse_input = gn.deesseinterface.DeesseInput(
    nx=nx, ny=ny, nz=nz,
    sx=sx, sy=sy, sz=sz,
    ox=ox, oy=oy, oz=oz,
    nv=2, varname=['aux', 'code'],
    TI=ti,
    dataImage=im,
    dataPointSet=hd,             # hard data for the variable 'code'
    outputVarFlag=[False, True],
    distanceType=['continuous', 'categorical'],
    nneighboringNode=[1, 24],
    distanceThreshold=[.1,.02],
    maxScanFraction=0.15,
    npostProcessingPathMax=1,
    seed=444,
    nrealization=nreal)

# Run deesse
t1 = time.time() # start time
deesse_output = gn.deesseinterface.deesseRun_mp(deesse_input, nproc=2, nthreads_per_proc=8)
t2 = time.time() # end time
print(f'Elapsed time: {t2-t1:.2g} sec')

deesseRun_mp: DeeSse running on 2 process(es)... [VERSION 3.2 / BUILD NUMBER 20230914 / OpenMP 8 thread(s)]
deesseRun_mp: DeeSse run complete (all process(es))
Elapsed time: 66 sec

[15]:

# Retrieve the realizations
sim = deesse_output['sim'] # list of realization (image), each with one variable

# Do some statistics on the realizations
# compute the pixel-wise proportion for the given categories
all_sim_stats = gn.img.imageListCategProp(sim, categ_val)

# Color map for proportion of each category
prop_col=['blue', 'darkgreen', 'orange'] # colors for the proportion maps
prop_cmap = [gn.customcolors.custom_cmap(['white', c]) for c in prop_col]

# Display
plt.subplots(2,3, figsize=(15,10), sharex=True, sharey=True)
plt.subplot(2,3,1)
gn.imgplot.drawImage2D(im, title='Trend / auxiliary variable (1st var)')

for i in range(2):
    plt.subplot(2, 3, i+2) # select next sub-plot
    gn.imgplot.drawImage2D(sim[i], categ=True, categVal=categ_val, categCol=categ_col, title=f'Real. #{i}')
    plt.scatter(hd.x(), hd.y(), marker='o', s=50,
                color=hd_col, edgecolors='black', linewidths=1) # add hard data points
    plt.scatter(hd.x(), hd.y(), marker='o', s=50,
                color=hd_col, edgecolors='black', linewidths=1) # add hard data points
for i in range(3):
    plt.subplot(2, 3, i+4) # select next sub-plot
    gn.imgplot.drawImage2D(all_sim_stats, iv=i, cmap=prop_cmap[i],
                           title=f'Prop. of categ. {i} (over {nreal} real.)')
    plt.plot(hd.x(), hd.y(), '+', c='black', markersize=30) # add hard data points

plt.show()

../_images/notebooks_ex_deesse_09_multivariate_sim2_26_0.png

… using pyramids

[16]:

# Deesse input
#   - with 2 additional levels (npyramidLevel=2)
#     and a reduction factor of 2 along x and y axes between the original image
#     and the 1st pyramid level and between the 1st pyramid level to the second one
#     (kx=[2, 2], ky=[2, 2], kz=[0, 0]: do not apply reduction along z axis)
pyrGenParams = gn.deesseinterface.PyramidGeneralParameters(npyramidLevel=2, kx=[2, 2], ky=[2, 2], kz=[0, 0])
pyrParams = [gn.deesseinterface.PyramidParameters(nlevel=2, pyramidType='continuous'),
             gn.deesseinterface.PyramidParameters(nlevel=2, pyramidType='categorical_auto')]

nreal = 20
deesse_input = gn.deesseinterface.DeesseInput(
    nx=nx, ny=ny, nz=nz,
    sx=sx, sy=sy, sz=sz,
    ox=ox, oy=oy, oz=oz,
    nv=2, varname=['aux', 'code'],
    TI=ti,
    dataImage=im,
    dataPointSet=hd,
    outputVarFlag=[False, True],
    distanceType=['continuous', 'categorical'],
    nneighboringNode=[1, 24],
    distanceThreshold=[.1,.02],
    maxScanFraction=0.15,
    pyramidGeneralParameters=pyrGenParams, # pyramid general parameters
    pyramidParameters=pyrParams,           # pyramid parameters for each variable
    npostProcessingPathMax=1,
    seed=444,
    nrealization=nreal)

# Run deesse
t1 = time.time() # start time
deesse_output = gn.deesseinterface.deesseRun_mp(deesse_input, nproc=2, nthreads_per_proc=8)
t2 = time.time() # end time
print(f'Elapsed time: {t2-t1:.2g} sec')

deesseRun_mp: DeeSse running on 2 process(es)... [VERSION 3.2 / BUILD NUMBER 20230914 / OpenMP 8 thread(s)]
deesseRun_mp: DeeSse run complete (all process(es))
Elapsed time: 1.4e+02 sec

[17]:

# Retrieve the realizations
sim = deesse_output['sim'] # list of realization (image), each with one variable

# Do some statistics on the realizations
# compute the pixel-wise proportion for the given categories
all_sim_stats = gn.img.imageListCategProp(sim, categ_val)

# Color map for proportion of each category
prop_col=['blue', 'darkgreen', 'orange'] # colors for the proportion maps
prop_cmap = [gn.customcolors.custom_cmap(['white', c]) for c in prop_col]

# Display
plt.subplots(2,3, figsize=(15,10), sharex=True, sharey=True)
plt.subplot(2,3,1)
gn.imgplot.drawImage2D(im, title='Trend / auxiliary variable (1st var)')

for i in range(2):
    plt.subplot(2, 3, i+2) # select next sub-plot
    gn.imgplot.drawImage2D(sim[i], categ=True, categVal=categ_val, categCol=categ_col, title=f'Real. #{i}')
    plt.scatter(hd.x(), hd.y(), marker='o', s=50,
                color=hd_col, edgecolors='black', linewidths=1) # add hard data points
    plt.scatter(hd.x(), hd.y(), marker='o', s=50,
                color=hd_col, edgecolors='black', linewidths=1) # add hard data points
for i in range(3):
    plt.subplot(2, 3, i+4) # select next sub-plot
    gn.imgplot.drawImage2D(all_sim_stats, iv=i, cmap=prop_cmap[i],
                           title=f'Prop. of categ. {i} (over {nreal} real.)')
    plt.plot(hd.x(), hd.y(), '+', c='black', markersize=30) # add hard data points

plt.show()

../_images/notebooks_ex_deesse_09_multivariate_sim2_29_0.png