PyFWI Within PyTorch
PyFWI is integrated with PyTorch to provide the gradient of a cost function with respect to model parameters. Here, a simple example is provided to show the application of this integration. This notebook can be compared to the Simple Example for a better understanding of the required changes and the results.
In this section, we first show the forward modeling, and then we estimate the gradient of the cost function with respect to \(V_P\).
1. Forward modeling
In this simple example, we use PyFWI to do forward modeling. So, we need to first import the following packages and modulus.
import matplotlib.pyplot as plt
import numpy as np
import PyFWI.wave_propagation as wave
import PyFWI.acquisition as acq
import PyFWI.seiplot as splt
import PyFWI.model_dataset as md
import PyFWI.fwi_tools as tools
import PyFWI.processing as process
import torch
from PyFWI.torchfwi import Fwi
A simple model can be created by using model_dataset
module as
Model = md.ModelGenerator('louboutin')
model = Model()
# Making medium acoustic
model['vs'] *= 0.0
model['rho'] = np.ones_like(model['rho'])
im = splt.earth_model(model, ['vp'], cmap='coolwarm')
Then we need to create an input dictionary as follow
model_shape = model[[*model][0]].shape
inpa = {
'ns': 5, # Number of sources
'sdo': 4, # Order of FD
'fdom': 15, # Central frequency of source
'dh': 7, # Spatial sampling rate
'dt': 0.004, # Temporal sampling rate
'acq_type': 1, # Type of acquisition (0: crosswell, 1: surface, 2: both)
't': 0.8, # Length of operation
'npml': 20, # Number of PML
'pmlR': 1e-5, # Coefficient for PML (No need to change)
'pml_dir': 2, # type of boundary layer
'device': 1, # The device to run the program. Usually 0: CPU 1: GPU
'seimogram_shape': '3d',
}
seisout = 0 # Type of output 0: Pressure
inpa['rec_dis'] = 1 * inpa['dh'] # Define the receivers' distance
Now, we obtain the location of sources and receivers based on specified parameters.
offsetx = inpa['dh'] * model_shape[1]
depth = inpa['dh'] * model_shape[0]
src_loc, rec_loc = acq.surface_seismic(inpa['ns'], inpa['rec_dis'], offsetx,
inpa['dh'], inpa['sdo'])
src_loc[:, 1] -= 5 * inpa['dh']
# Create the source
src = acq.Source(src_loc, inpa['dh'], inpa['dt'])
src.Ricker(inpa['fdom'])
Model properties should be with type of torch.tensor
. So, we need to
convert these properties.
vp = torch.tensor(model['vp'])
vs = torch.tensor(model['vs'])
rho = torch.tensor(model['rho'])
Finally, we can have the forward modelling as
# Create the wave object
W = wave.WavePropagator(inpa, src, rec_loc, model_shape, components=seisout)
# Call the forward modelling
taux_obs, tauz_obs = Fwi.apply(W, vp, vs, rho) # show=True can show the propagation of the wave
To compute the gradient using the adjoint-state method, we need to save the wavefield during the forward wave propagation. This must be done for the wavefield obtained from estimated model. For example, the wavefield at four time steps are presented here in addition to a shot gather.
fig = plt.figure(figsize=(4, 4))
ax = fig.add_subplot(111)
ax = splt.seismic_section(ax, taux_obs[..., 2], t_axis=np.linspace(0, inpa['t'], int(1 + inpa['t'] // inpa['dt'])))
fig.suptitle("Shot gather", fontweight='bold');
2. Gradient
To compute the gradient, we need the observed data and an initial model.
Note: For better visualization and avoiding crosstalk, I compute the gradient in acoustic media.
Then we create the initial model.
m0 = Model(smoothing=1)
m0['vs'] *= 0.0
m0['rho'] = np.ones_like(model['rho'])
# Convert to tensor
vp0 = torch.tensor(m0['vp'], requires_grad=True)
vs0 = torch.tensor(m0['vs'], requires_grad=True)
rho0 = torch.tensor(m0['rho'], requires_grad=True)
im = splt.earth_model(m0, ['vp'], cmap='coolwarm')
And we simulate the wave propagation to obtain estimated data. For
computing the gradient, we can smooth the gradient and scale it by
defining g_smooth
and energy_balancing
.
inpa['energy_balancing'] = True
We save the wavefield at 20% of the time steps (chpr = 20
) to be
used for gradient calculation. The value of wavefield is accessible
using the attribute W
which is a dictionary for \(V_x\),
\(V_z\), \(\tau_x\), \(\tau_z\), and \(\tau_{xz}\) as
vx
, vz
, taux
, tauz
, and tauxz
. Each parameter is a
4D tensor. For example, we can have access to the last time step of
\(\tau_x\) for the first shot as W.W['taux'][:, :, 0, -1]
.
Lam = wave.WavePropagator(inpa, src, rec_loc, model_shape,
chpr=20, components=seisout)
taux_est, tauz_est = Fwi.apply(Lam,
vp0,
vs0,
rho0
)
Now, we define the cost function and obtaine the residuals for adjoint-state method.
criteria = torch.nn.MSELoss(reduction='sum')
mse0 = 0.5 * criteria(taux_est, taux_obs)
mse1 = 0.5 * criteria(tauz_est, taux_obs)
mse = mse0 + mse1
# print(mse.item())
mse.backward()
Using the adjoint source, we can estimate the gradient as
grad = {'vp': vp0.grad,
'vs': vs0.grad,
'rho':rho0.grad
}
# Time to plot the results
splt.earth_model(grad, ['vp'], cmap='jet');