5.3 - Probability Density Transformation.

# Importing GemPy
import gempy as gp
from examples.tutorials.ch5_probabilistic_modeling.aux_functions.aux_funct import plot_geo_setting_well
from gempy.bayesian import plot_posterior as pp
from gempy.bayesian.plot_posterior import default_red, default_blue

# Importing auxiliary libraries
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
import arviz as az
import os

Model definition

Same problem as before, let’s assume the observations are layer thickness measurements taken on an outcrop. Now, in the previous example we chose a prior for the mean arbitrarily: \(𝜇∼Normal(mu=10.0, sigma=5.0)\)–something that made sense for these specific data set. If we do not have any further information, keeping an uninformative prior and let the data to dictate the final value of the inference is the sensible way forward. However, this also enable to add information to the system by setting informative priors.

Imagine we get a borehole with the tops of the two interfaces of interest. Each of this data point will be a random variable itself since the accuracy of the exact 3D location will be always limited. Notice that this two data points refer to depth not to thickness–the unit of the rest of the observations. Therefore, the first step would be to perform a transformation of the parameters into the observations space. Naturally in this example a simple subtraction will suffice.

Now we can define the probabilistic models:

This is to make it work in sphinx gallery

cwd = os.getcwd()
if not 'examples' in cwd:
    path_dir = os.getcwd() + '/examples/tutorials/ch5_probabilistic_modeling'
else:
    path_dir = cwd
geo_model = gp.load_model(r'2-layers', path=path_dir+r'/2-layers', recompile=True)
plot_geo_setting_well(geo_model=geo_model)
Cell Number: mid Direction: y

Out:

Active grids: ['regular']
Active grids: ['regular' 'topography']
Setting kriging parameters to their default values.
Compiling theano function...
Level of Optimization:  fast_compile
Device:  cpu
Precision:  float64
Number of faults:  0
Compilation Done!
Kriging values:
                   values
range            1.3e+04
$C_o$            4.2e+06
drift equations      [3]
y_obs = [2.12]
y_obs_list = [2.12, 2.06, 2.08, 2.05, 2.08, 2.09,
              2.19, 2.07, 2.16, 2.11, 2.13, 1.92]

np.random.seed(4003)

# .. image:: /../../_static/computational_graph1.png

with pm.Model() as model:
    mu_top = pm.Normal('$\mu_{top}$', 3.05, .2)
    sigma_top = pm.Gamma('$\sigma_{top}$', 0.3, 3)
    y_top = pm.Normal('y_{top}', mu=mu_top, sd=sigma_top, observed=[3.02])

    mu_bottom = pm.Normal('$\mu_{bottom}$', 1.02, .2)
    sigma_bottom = pm.Gamma('$\sigma_{bottom}$', 0.3, 3)
    y_bottom = pm.Normal('y_{bottom}', mu=mu_bottom, sd=sigma_bottom, observed=[1.02])

    mu_t = pm.Deterministic('$\mu_{thickness}$', mu_top - mu_bottom)
    sigma_thick = pm.Gamma('$\sigma_{thickness}$', 0.3, 3)
    y = pm.Normal('y_{thickness}', mu=mu_t, sd=sigma_thick, observed=y_obs_list)

Sampling

with model:
    prior = pm.sample_prior_predictive(1000)
    trace = pm.sample(1000, discard_tuned_samples=False, cores=1)
    post = pm.sample_posterior_predictive(trace)

Out:

█
█
█
data = az.from_pymc3(trace=trace,
                     prior=prior,
                     posterior_predictive=post)
data
arviz.InferenceData
    • <xarray.Dataset>
      Dimensions:               (chain: 2, draw: 1000)
      Coordinates:
        * chain                 (chain) int64 0 1
        * draw                  (draw) int64 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
      Data variables:
          $\mu_{top}$           (chain, draw) float64 3.099 3.105 3.079 ... 3.165 3.09
          $\mu_{bottom}$        (chain, draw) float64 1.018 1.013 1.02 ... 1.068 1.013
          $\sigma_{top}$        (chain, draw) float64 0.03911 0.09512 ... 0.125 0.1367
          $\sigma_{bottom}$     (chain, draw) float64 0.01884 0.0128 ... 0.03597
          $\mu_{thickness}$     (chain, draw) float64 2.081 2.092 ... 2.097 2.077
          $\sigma_{thickness}$  (chain, draw) float64 0.06096 0.0743 ... 0.08205
      Attributes:
          created_at:                 2020-09-15T12:06:32.967086
          arviz_version:              0.9.0
          inference_library:          pymc3
          inference_library_version:  3.9.3
          sampling_time:              15.46685242652893
          tuning_steps:               1000

    • <xarray.Dataset>
      Dimensions:              (chain: 1, draw: 4000, y_{bottom}_dim_0: 1, y_{thickness}_dim_0: 12, y_{top}_dim_0: 1)
      Coordinates:
        * chain                (chain) int64 0
        * draw                 (draw) int64 0 1 2 3 4 5 ... 3995 3996 3997 3998 3999
        * y_{top}_dim_0        (y_{top}_dim_0) int64 0
        * y_{bottom}_dim_0     (y_{bottom}_dim_0) int64 0
        * y_{thickness}_dim_0  (y_{thickness}_dim_0) int64 0 1 2 3 4 5 6 7 8 9 10 11
      Data variables:
          y_{top}              (chain, draw, y_{top}_dim_0) float64 4.032 ... 3.25
          y_{bottom}           (chain, draw, y_{bottom}_dim_0) float64 0.7143 ... 0...
          y_{thickness}        (chain, draw, y_{thickness}_dim_0) float64 3.49 ... ...
      Attributes:
          created_at:                 2020-09-15T12:06:33.459960
          arviz_version:              0.9.0
          inference_library:          pymc3
          inference_library_version:  3.9.3

    • <xarray.Dataset>
      Dimensions:              (chain: 2, draw: 1000, y_{bottom}_dim_0: 1, y_{thickness}_dim_0: 12, y_{top}_dim_0: 1)
      Coordinates:
        * chain                (chain) int64 0 1
        * draw                 (draw) int64 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
        * y_{top}_dim_0        (y_{top}_dim_0) int64 0
        * y_{bottom}_dim_0     (y_{bottom}_dim_0) int64 0
        * y_{thickness}_dim_0  (y_{thickness}_dim_0) int64 0 1 2 3 4 5 6 7 8 9 10 11
      Data variables:
          y_{top}              (chain, draw, y_{top}_dim_0) float64 0.2755 ... 0.9398
          y_{bottom}           (chain, draw, y_{bottom}_dim_0) float64 3.049 ... 2.389
          y_{thickness}        (chain, draw, y_{thickness}_dim_0) float64 1.671 ......
      Attributes:
          created_at:                 2020-09-15T12:06:33.456800
          arviz_version:              0.9.0
          inference_library:          pymc3
          inference_library_version:  3.9.3

    • <xarray.Dataset>
      Dimensions:             (chain: 2, draw: 1000)
      Coordinates:
        * chain               (chain) int64 0 1
        * draw                (draw) int64 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
      Data variables:
          process_time_diff   (chain, draw) float64 0.001212 0.002141 ... 0.0042
          perf_counter_start  (chain, draw) float64 836.5 836.5 836.5 ... 846.8 846.8
          energy_error        (chain, draw) float64 -0.4256 0.002456 ... -0.006506
          perf_counter_diff   (chain, draw) float64 0.001211 0.00214 ... 0.004199
          depth               (chain, draw) int64 2 3 3 4 3 2 4 4 ... 3 3 3 2 3 3 4 4
          tree_size           (chain, draw) float64 3.0 7.0 7.0 11.0 ... 7.0 15.0 15.0
          diverging           (chain, draw) bool False False False ... False False
          energy              (chain, draw) float64 -12.09 -13.07 ... -12.19 -12.19
          step_size           (chain, draw) float64 0.2726 0.2726 ... 0.2512 0.2512
          max_energy_error    (chain, draw) float64 -0.4256 61.58 ... -0.2379 0.628
          mean_tree_accept    (chain, draw) float64 0.9372 0.8924 ... 0.9969 0.988
          lp                  (chain, draw) float64 14.77 15.66 14.04 ... 13.65 14.44
          step_size_bar       (chain, draw) float64 0.2769 0.2769 ... 0.2902 0.2902
      Attributes:
          created_at:                 2020-09-15T12:06:32.972920
          arviz_version:              0.9.0
          inference_library:          pymc3
          inference_library_version:  3.9.3
          sampling_time:              15.46685242652893
          tuning_steps:               1000

    • <xarray.Dataset>
      Dimensions:                     (chain: 1, draw: 1000)
      Coordinates:
        * chain                       (chain) int64 0
        * draw                        (draw) int64 0 1 2 3 4 5 ... 995 996 997 998 999
      Data variables:
          $\mu_{thickness}$           (chain, draw) float64 2.238 2.343 ... 2.13 1.897
          $\mu_{bottom}$              (chain, draw) float64 1.09 0.7497 ... 1.035
          $\sigma_{thickness}$        (chain, draw) float64 4.401e-07 ... 0.0009651
          $\mu_{top}$                 (chain, draw) float64 3.328 3.092 ... 2.931
          $\sigma_{thickness}$_log__  (chain, draw) float64 -14.64 -2.327 ... -6.943
          $\sigma_{top}$              (chain, draw) float64 0.1756 0.4165 ... 0.001586
          $\sigma_{bottom}$           (chain, draw) float64 0.001322 ... 2.649e-06
          $\sigma_{top}$_log__        (chain, draw) float64 -1.74 -0.8758 ... -6.446
          $\sigma_{bottom}$_log__     (chain, draw) float64 -6.629 -3.144 ... -12.84
      Attributes:
          created_at:                 2020-09-15T12:06:33.463132
          arviz_version:              0.9.0
          inference_library:          pymc3
          inference_library_version:  3.9.3

    • <xarray.Dataset>
      Dimensions:              (chain: 1, draw: 1000, y_{bottom}_dim_0: 1, y_{thickness}_dim_0: 12, y_{top}_dim_0: 1)
      Coordinates:
        * chain                (chain) int64 0
        * draw                 (draw) int64 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
        * y_{top}_dim_0        (y_{top}_dim_0) int64 0
        * y_{bottom}_dim_0     (y_{bottom}_dim_0) int64 0
        * y_{thickness}_dim_0  (y_{thickness}_dim_0) int64 0 1 2 3 4 5 6 7 8 9 10 11
      Data variables:
          y_{top}              (chain, draw, y_{top}_dim_0) float64 3.429 ... 2.93
          y_{bottom}           (chain, draw, y_{bottom}_dim_0) float64 1.088 ... 1.035
          y_{thickness}        (chain, draw, y_{thickness}_dim_0) float64 2.238 ......
      Attributes:
          created_at:                 2020-09-15T12:06:33.466667
          arviz_version:              0.9.0
          inference_library:          pymc3
          inference_library_version:  3.9.3

    • <xarray.Dataset>
      Dimensions:              (y_{bottom}_dim_0: 1, y_{thickness}_dim_0: 12, y_{top}_dim_0: 1)
      Coordinates:
        * y_{top}_dim_0        (y_{top}_dim_0) int64 0
        * y_{bottom}_dim_0     (y_{bottom}_dim_0) int64 0
        * y_{thickness}_dim_0  (y_{thickness}_dim_0) int64 0 1 2 3 4 5 6 7 8 9 10 11
      Data variables:
          y_{top}              (y_{top}_dim_0) float64 3.02
          y_{bottom}           (y_{bottom}_dim_0) float64 1.02
          y_{thickness}        (y_{thickness}_dim_0) float64 2.12 2.06 ... 2.13 1.92
      Attributes:
          created_at:                 2020-09-15T12:06:33.468232
          arviz_version:              0.9.0
          inference_library:          pymc3
          inference_library_version:  3.9.3



az.plot_trace(data)
plt.show()
$\mu_{top}$, $\mu_{top}$, $\mu_{bottom}$, $\mu_{bottom}$, $\sigma_{top}$, $\sigma_{top}$, $\sigma_{bottom}$, $\sigma_{bottom}$, $\mu_{thickness}$, $\mu_{thickness}$, $\sigma_{thickness}$, $\sigma_{thickness}$

sphinx_gallery_thumbnail_number = 3

az.plot_density([data, data.prior], shade=.9, data_labels=["Posterior", "Prior"],
                var_names=[
                    '$\\mu_{top}$',
                    '$\\mu_{bottom}$',
                    '$\\mu_{thickness}$',
                    '$\\sigma_{top}$',
                    '$\\sigma_{bottom}$',
                    '$\\sigma_{thickness}$'
                ],
                colors=[default_red, default_blue], bw=5);
plt.show()
$\mu_{top}$, $\mu_{bottom}$, $\mu_{thickness}$, $\sigma_{top}$, $\sigma_{bottom}$, $\sigma_{thickness}$
p = pp.PlotPosterior(data)

p.create_figure(figsize=(9, 5), joyplot=False, marginal=True, likelihood=False)
p.plot_marginal(var_names=['$\\mu_{top}$', '$\\mu_{bottom}$'],
                plot_trace=False, credible_interval=.70, kind='kde',
                marginal_kwargs={"bw": 1}
                )
plt.show()
ch5 3 probability density transformation
p = pp.PlotPosterior(data)
p.create_figure(figsize=(9, 6), joyplot=True)

iteration = 500
p.plot_posterior(['$\\mu_{top}$', '$\\mu_{bottom}$'],
                 ['$\mu_{thickness}$', '$\sigma_{thickness}$'],
                 'y_{thickness}', iteration, marginal_kwargs={"credible_interval": 0.94,
                                                              'marginal_kwargs': {"bw": 1},
                                                              'joint_kwargs': {"bw": 1}})
plt.show()
Likelihood
az.plot_pair(data, divergences=False, var_names=['$\\mu_{top}$', '$\\mu_{bottom}$'])
plt.show()
ch5 3 probability density transformation

Total running time of the script: ( 0 minutes 44.991 seconds)

Gallery generated by Sphinx-Gallery