2. Background Theory¶

This section aims to provide the user with a basic review of the physics, discretization, and optimization techniques used to solve the frequency domain quasi-static electromagnetics problem. It is assumed that the user has some background in these areas. For further reading see [Nab91].

Important

The theory provided on this page works for the following right-handed coordinate systems:

X = Easting, Y = Northing, Z = Up (standard Cartesian)
X = Northing, Y = Easting, Z = Down (standard magnetotelluric which this code uses!)

2.1. Fundamental Physics¶

Maxwell’s equations provide the starting point from which an understanding of how electromagnetic fields can be used to uncover the substructure of the Earth. In the frequency domain Maxwell’s equations are:

(2.1)¶\[\begin{split}\begin{align} &\nabla \times \mathbf{E} - i\omega\mu \mathbf{H} = 0 \\ &\nabla \times \mathbf{H} - \sigma \mathbf{E} = \mathbf{s} \\ &\nabla \cdot \mathbf{J} = 0 \\ &\mathbf{J} = \sigma \mathbf{E} \end{align}\end{split}\]

where \(\mathbf{E}\) is the electric field, \(\mathbf{H}\) is the magnetic field, \(\mathbf{J}\) is the current density, \(\mathbf{s}\) is some external source and \(e^{-i\omega t}\) is suppressed. Symbols \(\mu\), \(\sigma\) and \(\omega\) are the magnetic permeability, conductivity, and angular frequency, respectively. This formulation assumes a quasi-static mode so that the system can be viewed as a diffusion equation (Weaver, 1994; Ward and Hohmann, 1988 in [Nab91]). By doing so, some difficulties arise when solving the system;

the curl operator has a non-trivial null space making the resulting linear system highly ill-conditioned

the conductivity \(\sigma\) varies over several orders of magnitude

2.2. Natural Sources: MT and ZTEM¶

The sources in the magnetotelluric (MT) and Z-axis tipper electromagnetic (ZTEM) methods are modeled as plane waves originating from natural phenomenon. These waves can be of very low frequency (< 1 Hz) and very high energy, making it possible to image very deep targets. This also implies that the source term is zero inside the domain of interest, and therefore the source term on the boundaries becomes very important. For natural source electromagnetic (NSEM) problems, Maxwell’s equations can be expressed as:

(2.2)¶\[\begin{split}\begin{align} \nabla \times &\mathbf{E} - i\omega\mu \mathbf{H} = 0 \\ \nabla \times &\mathbf{H} - \sigma \mathbf{E} = 0 \\ &\mathbf{E} \big |_{\partial \Omega} = \mathbf{E_0} \end{align}\end{split}\]

where \(\mathbf{E_0}\) is the electric field solution on the boundary \(\partial \Omega\).

Consider the case where the Earth is a uniform half-space with a plane surface. If the source field is assumed to be homogeneous, infinite in dimension, and is located at infinity, then the plane waves impinging on the Earth’s surface travel in the z-direction.

For plane waves polarized such that their electric fields lie along the x direction, the electric field is defined by the following Helmholtz equation:

(2.3)¶\[\frac{\partial^2 E_x}{\partial z^2} = k^2 E_x \;\;\; \textrm{s.t.} \;\;\; k^2 = -i\mu\omega\sigma\]

and the relationship between \(E_x\) and \(H_y\) is given by:

(2.4)¶\[i \omega \mu H_y = \frac{\partial E_x}{\partial z}\]

The solution to (2.3) takes the form:

(2.5)¶\[E_x = Q e^{-kz}\]

where \(Q\) is some constant. Taking the ratio of the electric and magnetic fields measured at the surface gives:

(2.6)¶\[Z_{xy} = \frac{E_x}{H_y} = \frac{-i\omega \mu}{k} = \sqrt{\dfrac{-i\omega\mu}{\sigma}}\]

This implies that conductivity \(\sigma\) of the Earth can be determined by taking measurements of the field components, and therefore the impedance constitutes the basic MT response function, or data. A 1D layered Earth model can be used to compute the source wave components by iteratively propagating a plane wave from the surface to depth.

2.2.1. Magnetotelluric (MT) Data¶

For a 3-dimensional Earth, the magnetotelluric data are defined by the impedance tensor. The impedance tensor can be defined using the ratios of electric and magnetic field components in both the x and y directions for 2 orthogonal plane wave polarizations; one polarization with the electric field along the x axis and one polarization with the electric file along the y axis. Where the impedance tensor \(\mathbf{Z}\) is a 2 by 2 matrix:

(2.7)¶\[\mathbf{Z} = \mathbf{E H}^{-1}\]

such that:

(2.8)¶\[\begin{split}\begin{bmatrix} Z_{xx} & Z_{xy} \\ Z_{yx} & Z_{yy} \end{bmatrix} = \begin{bmatrix} E_{x}^{(1)} & E_{x}^{(2)} \\ E_{y}^{(1)} & E_{y}^{(2)} \end{bmatrix} \begin{bmatrix} H_{x}^{(1)} & H_{x}^{(2)} \\ H_{y}^{(1)} & H_{y}^{(2)} \end{bmatrix}^{-1}\end{split}\]

where 1 and 2 refer to fields associated with plane waves polarized along two perpendicular directions.

Important

For standard MT data, X = Northing, Y = Easting and Z = Down; which this code uses! Thus:

Superscript \(\! ^{(1)}\) refers to fields resulting from a plane wave whose electric field is polarized along the Northing direction. And superscript \(\! ^{(2)}\) refers to fields resulting from a plane wave whose electric field is polarized along the Easting direction.
\(Z_{xy}\) is essentially the ratio of the electric field along the Northing and the magnetic field along the Easting.

2.2.2. ZTEM Data¶

The Z-Axis Tipper Electromagnetic Technique (ZTEM) (Lo2008) records the vertical component of the magnetic field everywhere above the survey area while recording the horizontal fields at a ground base reference station. In the same manner as demonstrated for MT, transfer functions are computed which relate the vertical fields to the ground based horizontal fields. This relation is given by:

(2.9)¶\[H_z(r) = T_{zx}(r,r_0)H_x(r_0) + T_{zy}(r,r_0)H_y(r_0)\]

where \(r\) is the location of the vertical field and \(r_0\) is the location of the ground base station. \(T_{zx}\) and \(T_{zy}\) are the vertical field transfer functions, from z to x and z to y respectively. For a 3-dimensional Earth, the transfer function can be defined using the magnetic field components for 2 orthogonal plane wave polarizations; one polarization with the electric field along the x axis and one polarization with the electric file along the y axis. In this case,

(2.10)¶\[\begin{split}\begin{bmatrix} H_z^{(1)} \\ H_z^{(2)} \end{bmatrix} = \begin{bmatrix} H_x^{(1)} & H_y^{(1)} \\ H_x^{(2)} & H_y^{(2)} \end{bmatrix} \begin{bmatrix} T_{zx} \\ T_{zy} \end{bmatrix}\end{split}\]

where 1 and 2 refer to fields associated with plane waves polarized along two perpendicular directions. Thus the transfer functions are given by:

\[\begin{split}\begin{bmatrix} T_{zx} \\ T_{zy} \end{bmatrix} = \big ( H_x^{(1)} H_y^{(2)} - H_x^{(2)} H_y^{(1)} \big )^{-1} \begin{bmatrix} - H_y^{(1)} H_z^{(2)} + H_y^{(2)} H_z^{(1)} \\ H_x^{(1)} H_z^{(2)} - H_x^{(2)} H_z^{(1)} \end{bmatrix}\end{split}\]

Important

For standard natural source data, X = Northing, Y = Easting and Z = Down; which this code uses! Thus:

Superscript \(\! ^{(1)}\) refers to fields resulting from a plane wave whose electric field is polarized along the Northing direction. And superscript \(\! ^{(2)}\) refers to fields resulting from a plane wave whose electric field is polarized along the Easting direction.
\(T_{zx}\) is the transfer function related to an incident plane wave whose electric field is polarized along the Northing direction; which produces magnetic fields with components in the Easting direction.

2.3. Discretization of Operators¶

The operators div, grad, and curl are discretized using a finite volume formulation. Although div and grad do not appear in (2.8) or (2.10), they are required for the solution of the system. The divergence operator is discretized in the usual flux-balance approach, which by Gauss’ theorem considers the current flux through each face of a cell. The nodal gradient (operates on a function with values on the nodes) is obtained by differencing adjacent nodes and dividing by edge length. The discretization of the curl operator is computed similarly to the divergence operator by utilizing Stokes theorem by summing the magnetic field components around the edge of each face. Please see [HHG+12] for a detailed description of the discretization process.

2.4. Forward Problem¶

2.4.1. Numerical Approach¶

(2.1) provides the starting point for solving the natural source electromagnetic (NSEM) problem. It is possible to discretize and solve this system in terms of the electric field. However a more stable approach is obtained by solving Maxwell’s equation in terms of potentials. To do this, the electric field is decomposed into the sum of a vector potential (\(\mathbf{a}\)) and the gradient of a scalar potential (\(\phi\)):

(2.11)¶\[\mathbf{E} = \mathbf{a} + \nabla \phi\]

To obtain a unique solution, we implement the Coulomb gauge condition:

(2.12)¶\[\nabla \cdot \mathbf{a} = 0\]

Replacing \(\mathbf{E}\) with Eq. (2.11) in Maxwell’s equations and using the Coulomb Gauge we obtain:

(2.13)¶\[\nabla^2 \mathbf{A} - i \omega \mu \sigma ( \mathbf{A} + \nabla \phi ) = 0\]

The algorithm used for the MT forward modelling is based on the separation of the total electric and magnetic fields into primary parts that exist in a background conductivity model, and secondary parts that exist because of the difference between the actual conductivity model and the background model. That is,

(2.14)¶\[\begin{split}\begin{align} \mathbf{E} &= \mathbf{E_p} + \mathbf{E_s} \\ \mathbf{H} &= \mathbf{H_p} + \mathbf{H_s} \end{align}\end{split}\]

where the primary fields \(\mathbf{E_p}\) and \(\mathbf{H_p}\) satisfy (2.1) for the background conductivity \(\sigma_b\) and the inhomogeneous boundary conditions. Equivalently, the vector and scalar potentials can be thought of as being divided into primary and secondary parts:

(2.15)¶\[\begin{split}\begin{align} \mathbf{A} &= \mathbf{A_p} + \mathbf{A_s} \\ \phi &= \phi_p + \phi_s \end{align}\end{split}\]

where \(\mathbf{A_p}\) and \(\phi_p\) satisfy (2.1) for the background conductivity model and the inhomogeneous boundary conditions. Substituting Eqs. (2.15) into Eq. (2.13) and the third expression in (2.1) gives

(2.16)¶\[\begin{split}\begin{align} \nabla^2 \mathbf{A_s} + i\omega \mu \sigma \big ( \mathbf{A_s} + \nabla \phi_s \big ) &= - i\omega \mu \Delta \sigma \mathbf{E_p} \\ \nabla \cdot \sigma \mathbf{A_s} + \nabla \cdot \sigma \nabla \phi_s &= - \nabla \cdot \Delta \sigma \mathbf{E_p} \end{align}\end{split}\]

where \(\Delta \sigma = \sigma - \sigma_b\). The preceding pair of simultaneous equations are the equations that are solved in the NSEM forward-modelling algorithm. The secondary potentials are assumed to vanish on the boundaries of the computational domain, that is, satisfy homogeneous boundary conditions. This pair of inhomogeneous equations, and the homogeneous boundary conditions, match the boundary value problem that is solved for the controlled-source case. The discretization and solution of Eqs. (2.16) is done in exactly the same way as for the controlled-source case; i.e. finite volume.

2.4.2. Source Term¶

For this approach, we solve a 1D wave equation of the following form:

(2.17)¶\[\mathbf{\tilde{A} \tilde{u}_e} = \mathbf{\tilde{q}}\]

where \(\mathbf{\tilde{u}_e}\) is the electric field for the 1D solution polarized along the x or y directions. \(\mathbf{\tilde{A}}\) is an operator of the form:

\[\mathbf{\tilde{A}} = \mathbf{L} + i \omega \mu_0 \tilde{\sigma}\]

such that \(\mathbf{L}\) is the Laplacian operator, \(\mu_0\) is the permeability of free-space and \(\tilde{\sigma}\) is a 1D conductivity model. The right-hand side \(\mathbf{\tilde{q}}\) is a vector of zeros except for \(\tilde{q}_1\). A Dirichlet condition is imposed by setting \(A_{11} = 1\) and \(\tilde{q}_1 = i\omega \mu_0 h^{-1}\); where \(h\) is the layer thickness. Once Eq. (2.17) is solved for a particular frequency, the solution is transferred to the edges of an tensor mesh. If the electric field is polarized along the x direction, there are no electric fields along y or z; similarly for a solution polarized along the y direction.

Let \(\mathbf{u_s}\) and \(\sigma_s\) be the electric fields and 1D conductivity model transferred to the edges of the tensor mesh, respectively. Then the source term in Eq. discrete e sys is computed for a given frequency and polarization using:

\[\frac{1}{i\omega} \mathbf{A u_s} = \mathbf{s}\]

where \(\mathbf{A}\) is similar to expression A operator, except the mass matrix \(\mathbf{M_\sigma}\) is formed using the transferred conductivity \(\sigma_s\).

2.5. Sensitivity¶

2.6. Inversion Problem¶

To solve the inverse problem, we minimize the following global objective function:

(2.18)¶\[\phi = \phi_d + \beta \phi_m\]

where \(\phi_d\) is the data misfit and \(\phi_m\) is the model objective function. The data misfit ensures the recovered model adequately explains the set of field observations. The model objective function adds geological constraints to the recovered model.

2.6.1. Data Misfit¶

Here, the data misfit is represented as the L2-norm of a weighted residual between the observed data (\(d_{obs}\)) and the predicted data for a given conductivity model \(\boldsymbol{\sigma}\), i.e.:

(2.19)¶\[\phi_d = \frac{1}{2} \big \| \mathbf{W_d} \big ( \mathbf{d_{obs}} - \mathbb{F}[\boldsymbol{\sigma}] \big ) \big \|^2\]

where \(W_d\) is a diagonal matrix containing the reciprocals of the uncertainties \(\boldsymbol{\varepsilon}\) for each measured data point, i.e.:

\[\mathbf{W_d} = \textrm{diag} \big [ \boldsymbol{\varepsilon}^{-1} \big ]\]

Important

For a better understanding of the data misfit, see the GIFtools cookbook .

2.6.2. Model Objective Function¶

Due to the ill-posedness of the problem, there are no stable solutions obtained by freely minimizing the data misfit, and thus regularization is needed. The regularization uses penalties for both smoothness, and likeness to a reference model \(m_{ref}\) supplied by the user. The model objective function is given by:

(2.20)¶\[\begin{split}\begin{align} \phi_m = \frac{\alpha_s}{2} \!\int_\Omega w_s | m - & m_{ref} |^2 dV + \frac{\alpha_x}{2} \!\int_\Omega w_x \Bigg | \frac{\partial}{\partial x} \big (m - m_{ref} \big ) \Bigg |^2 dV \\ &+ \frac{\alpha_y}{2} \!\int_\Omega w_y \Bigg | \frac{\partial}{\partial y} \big (m - m_{ref} \big ) \Bigg |^2 dV + \frac{\alpha_z}{2} \!\int_\Omega w_z \Bigg | \frac{\partial}{\partial z} \big (m - m_{ref} \big ) \Bigg |^2 dV \end{align}\end{split}\]

where \(\alpha_s, \alpha_x, \alpha_y\) and \(\alpha_z\) weight the relative emphasis on minimizing differences from the reference model and the smoothness along each gradient direction. And \(w_s, w_x, w_y\) and \(w_z\) are additional user defined weighting functions.

An important consideration comes when discretizing the regularization onto the mesh. The gradient operates on cell centered variables in this instance. Applying a short distance approximation is second order accurate on a domain with uniform cells, but only \(\mathcal{O}(1)\) on areas where cells are non-uniform. To rectify this a higher order approximation is used ([HHG+12]). The second order approximation of the model objective function can be expressed as:

\[\phi_m (\mathbf{m}) = \mathbf{\big (m-m_{ref} \big )^T W^T W \big (m-m_{ref} \big )}\]

where the regularizer is given by:

(2.21)¶\[\begin{split}\begin{align} \mathbf{W^T W} =& \;\;\;\;\alpha_s \textrm{diag} (\mathbf{w_s \odot v}) \\ & + \alpha_x \mathbf{G_x^T} \textrm{diag} (\mathbf{w_x \odot v_x}) \mathbf{G_x} \\ & + \alpha_y \mathbf{G_y^T} \textrm{diag} (\mathbf{w_y \odot v_y}) \mathbf{G_y} \\ & + \alpha_z \mathbf{G_z^T} \textrm{diag} (\mathbf{w_z \odot v_z}) \mathbf{G_z} \end{align}\end{split}\]

The Hadamard product is given by \(\odot\), \(\mathbf{v_x}\) is the volume of each cell averaged to x-faces, \(\mathbf{w_x}\) is the weighting function \(w_x\) evaluated on x-faces and \(\mathbf{G_x}\) computes the x-component of the gradient from cell centers to cell faces. Similarly for y and z.

If we require that the recovered model values lie between \(\mathbf{m_L \preceq m \preceq m_H}\) , the resulting bounded optimization problem we must solve is:

(2.22)¶\[\begin{split}\begin{align} &\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \preceq m \preceq m_H} \end{align}\end{split}\]

A simple Gauss-Newton optimization method is used where the system of equations is solved using ipcg (incomplete preconditioned conjugate gradients) to solve for each G-N step. For more information refer again to [HHG+12] and references therein.

2.6.3. Inversion Parameters and Tolerances¶

2.6.3.1. Cooling Schedule¶

Our goal is to solve Eq. (2.22), i.e.:

\[\begin{split}\begin{align} &\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m - m_{ref}}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \leq m \leq m_H} \end{align}\end{split}\]

but how do we choose an acceptable trade-off parameter \(\beta\)? For this, we use a cooling schedule. This is described in the GIFtools cookbook . The cooling schedule is defined by the following parameters:

beta_start: The initial value for \(\beta\)

beta_end: The minimum \(\beta\) for which Eq. (2.22) is solved before the inversion will quit

beta_factor: The factor at which \(\beta\) is decrease to a subsequent solution of Eq. (2.22)

Chi Factor: The inversion program stops when the data misfit is less than the target data misfit \(\phi_d^* = N \times Chi \; Factor\), where \(N\) is the number of data observations

If the user chooses the DEFAULT options, then we let:

\[\begin{split}\begin{align} beta \_ factor &= 0.16681 \\ beta \_ start &= 1000 \times \dfrac{\| \mathbf{Jr} \|^2}{\| \mathbf{Wr} \|^2} \\ beta \_ end &= 10^{-7} \times beta \_ start \end{align}\end{split}\]

2.6.3.2. Gauss-Newton Update¶

For a given trade-off parameter (\(\beta\)), the model \(\mathbf{m}\) is updated using the Gauss-Newton approach. Because the problem is non-linear, several model updates may need to be completed for each \(\beta\). Where \(k\) denotes the Gauss-Newton iteration, we solve:

(2.23)¶\[\mathbf{H}_k \, \mathbf{\delta m}_k = - \nabla \phi_k\]

using the current model \(\mathbf{m}_k\) and update the model according to:

(2.24)¶\[\mathbf{m}_{k+1} = \mathbf{m}_{k} + \alpha \mathbf{\delta m}_k\]

where \(\mathbf{\delta m}_k\) is the step direction, \(\nabla \phi_k\) is the gradient of the global objective function, \(\mathbf{H}_k\) is an approximation of the Hessian and \(\alpha\) is a scaling constant. This process is repeated until any of the following occurs:

The gradient is sufficiently small, i.e.:

\[\| \nabla \phi_k \|^2 < tol \_ nl\]

The smallest component of the model perturbation its small in absolute value, i.e.:

\[\textrm{max} ( |\mathbf{\delta m}_k | ) < mindm\]

A max number of GN iterations have been performed, i.e.

\[k = nit\]

2.6.3.3. Gauss-Newton Solve¶

Here we discuss the details of solving Eq. (2.23) for a particular Gauss-Newton iteration \(k\). Using the data misfit from Eq. (2.19) and the model objective function from Eq. (2.21), we must solve:

(2.25)¶\[\Big [ \mathbf{J^T W_d^T W_d J + \beta \mathbf{W^T W}} \Big ] \mathbf{\delta m}_k = - \Big [ \mathbf{J^T W_d^T W_d } \big ( \mathbf{d_{obs}} - \mathbb{F}[\mathbf{m}_k] \big ) + \beta \mathbf{W^T W m}_k \Big ]\]

where \(\mathbf{J}\) is the sensitivity of the data (\(\mathbf{Z}\) or \(\mathbf{T}\)) to the current model \(\mathbf{m}_k\); see sensitivity section to learn how sensitivities are computed. The system is solved for \(\mathbf{\delta m}_k\) using the incomplete-preconditioned-conjugate gradient (IPCG) method. This method is iterative and exits with an approximation for \(\mathbf{\delta m}_k\). Let \(i\) denote an IPCG iteration and let \(\mathbf{\delta m}_k^{(i)}\) be the solution to (2.25) at the \(i^{th}\) IPCG iteration, then the algorithm quits when:

the system is solved to within some tolerance and additional iterations do not result in significant increases in solution accuracy, i.e.:

\[\| \mathbf{\delta m}_k^{(i-1)} - \mathbf{\delta m}_k^{(i)} \|^2 / \| \mathbf{\delta m}_k^{(i-1)} \|^2 < tol \_ ipcg\]

a maximum allowable number of IPCG iterations has been completed, i.e.:

\[i = max \_ iter \_ ipcg\]