Seismology Lecture Notes: Forward Model and Model Resolution, Study notes of Geology

An introduction to the forward model in seismology, which establishes a mapping from model space to data space. The document also discusses the concept of model resolution and its testing using checkerboard resolution tests, and further explains Fresnel zones and multipathing/scattering, which provide more realistic 3-D kernels applicable to Earth models.

Typology: Study notes

2012/2013

Uploaded on 07/19/2013

shantii

TRAVELTIME TOMOGRAPHY (CONT.)

MODEL UNIQUENESS

The forward model

$$d_i = A_{ik}\, m_k \quad\Leftrightarrow\quad \mathbf{d} = \mathbf{A}\,\mathbf{m} \tag{1}$$

(data vector $\mathbf{d}$, sensitivity matrix $\mathbf{A}$, model vector $\mathbf{m}$)

states the linearized relationship between data and model via a mapping or sensitivity matrix, where i = 1, 2, … , N (data) and k = 1, 2, … , M (model parameters). In the case of GRT traveltime tomography, A can be thought of as a sparse sampling matrix whose nonzero elements are the lengths of segments of the raypath within each discretized voxel of the Earth.

$$\mathbf{A} \equiv \begin{bmatrix} l_{11} & \cdots & l_{1M} \\ \vdots & \ddots & \vdots \\ l_{N1} & \cdots & l_{NM} \end{bmatrix} \begin{matrix} \leftarrow \text{ray } 1 \\ \vdots \\ \leftarrow \text{ray } N \end{matrix}$$

  • Normally, N >> M – the number of data (typically ~10^6–10^8) is far greater than the number of model parameters (typically ~10^5–10^6) – making (1) nominally overdetermined.
  • Consequently, A is normally not square.
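As a concrete illustration (a sketch assuming NumPy; the grid, rays, and segment lengths are invented for the example, not taken from the notes), the forward map (1) for a tiny 2-D model looks like this:

```python
import numpy as np

# Toy model: 3x3 grid of slowness perturbations (s/km), flattened to M = 9 cells.
m = np.zeros(9)
m[4] = 0.05  # a slow anomaly in the central cell

# Sensitivity matrix A: each row is one ray; nonzero entries are the
# lengths (km) of the ray segments inside each cell the ray crosses.
A = np.zeros((2, 9))
A[0, [3, 4, 5]] = 10.0   # ray 1 crosses the middle row of cells
A[1, [1, 4, 7]] = 10.0   # ray 2 crosses the middle column of cells

# Forward model (1): traveltime perturbations d = A m
d = A @ m
print(d)  # both rays pick up 10 km * 0.05 s/km = 0.5 s from the central cell
```

Note how sparse A is: each ray "sees" only the 3 cells it traverses, and the other 6 columns of its row are zero.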

We can invert for m by first left-multiplying (1) by $\mathbf{A}^T$,

$$\mathbf{A}^T \mathbf{A}\,\mathbf{m} = \mathbf{A}^T \mathbf{d},$$

and then, since $\mathbf{A}^T \mathbf{A}$ is square, we complete the inversion to yield the classic least squares solution, as follows:

$$\hat{\mathbf{m}} = \left(\mathbf{A}^T \mathbf{A}\right)^{-1} \mathbf{A}^T \mathbf{d}. \tag{2}$$

The penalty function that is satisfied by (2) is

$$\varepsilon \equiv \left\| \mathbf{m} - \left(\mathbf{A}^T \mathbf{A}\right)^{-1} \mathbf{A}^T \mathbf{d} \right\|^2 \tag{3}$$

where $\hat{\mathbf{m}}$ (plugged in for $\mathbf{m}$ in (3)) minimizes $\varepsilon$ in the least squares sense. Aside: penalty metrics other than the squared norm in (3) can be used; the absolute norm, for example.
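Solution (2) can be checked numerically on a toy overdetermined problem (a sketch assuming NumPy, not part of the original notes; in practice one would use a stable solver such as `lstsq` rather than forming the normal-equations inverse explicitly):

```python
import numpy as np

rng = np.random.default_rng(0)

# Overdetermined toy problem: N = 50 data, M = 4 model parameters.
A = rng.standard_normal((50, 4))
m_true = np.array([1.0, -2.0, 0.5, 3.0])
d = A @ m_true  # noiseless synthetic data

# Classic least-squares solution (2): m_hat = (A^T A)^{-1} A^T d
m_hat = np.linalg.solve(A.T @ A, A.T @ d)

# Equivalent, numerically safer route
m_lstsq, *_ = np.linalg.lstsq(A, d, rcond=None)

print(np.allclose(m_hat, m_true), np.allclose(m_lstsq, m_true))  # True True
```

With noiseless data and a full-rank A, both routes recover m exactly (to machine precision); the interesting complications below arise when A is rank-deficient.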

As stated above, N >> M ⇒ the system is nominally overdetermined. But A is often rank-deficient – mathematically, rank(A) << M – and therefore the system behaves as though it were underdetermined. In other words, many of the N observations are linearly dependent and so do not provide complementary information on the subsurface. To understand this, consider the worst-case scenario where we have 10^6 observations that travel along the exact same raypath, as in the figure shown at right. Only cells traversed by the ray are actually sampled, meaning that unsampled cells are completely unconstrained. From an inverse point of view, the unconstrained parameter cells make the problem underdetermined – there are more parameters than

[Figure: a single raypath from source S to receiver R crossing a discretized grid; only the cells the ray traverses are sampled.]

30 March 2005

there are complementary pieces of information to constrain the system.

When an inverse problem is underdetermined, all the following statements are true and/or synonymous: {1} the problem is ill-posed; {2} A T A is singular/non-invertible/rank-deficient; {3} one or more rows or columns of A and A T A are linearly dependent; {4} one or more parameters are unconstrained; {5} A and A T A both have nontrivial null spaces.

All parts of a discretized model that are unconstrained – or, equivalently, that are not mapped by A from model space into data space – reside in the null space of A. This means that ε is independent of unconstrained parameters; an unconstrained parameter can take on any value without affecting ε. So the least squares solution is not unique if one or more parameters are unconstrained! In fact, there is an infinite set of solutions that equally minimize ε. Namely, any prospective solution that contains the true solution (which we don't know) linearly combined with any of the vectors in null(A) will equally minimize ε. This is termed nonuniqueness.
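This nonuniqueness is easy to demonstrate numerically (an illustrative sketch assuming NumPy; the matrix is invented to be rank-deficient): add any null-space vector of A to a solution and the predicted data do not change.

```python
import numpy as np

# Rank-deficient A: the third "ray" repeats the first, and cell 4 (last
# column) is never sampled, so its column is identically zero.
A = np.array([[1.0, 2.0, 0.0, 0.0],
              [0.0, 1.0, 3.0, 0.0],
              [1.0, 2.0, 0.0, 0.0]])
print(np.linalg.matrix_rank(A))  # 2, even though M = 4

m_true = np.array([1.0, 1.0, 1.0, 1.0])
d = A @ m_true

# Extract a null-space vector of A from the SVD and perturb the solution:
U, s, Vt = np.linalg.svd(A)
z = Vt[-1]                        # a vector with A @ z ~ 0
m_alt = m_true + 5.0 * z

print(np.allclose(A @ m_alt, d))  # True: the data cannot tell m_true from m_alt
```

Both the repeated ray (linearly dependent rows) and the never-sampled cell (a zero column) contribute to the null space, exactly as in the synonymous statements listed above.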

REGULARIZATION AND DAMPING

Model regularization essentially adds additional constraints to the inverse problem to address the solution nonuniqueness that arises as a consequence of ill-posedness. For example, the expression

a + b = 2

is ill-posed if we are asked to find values for both a and b. There are an infinite number of combinations of a and b that equally satisfy this expression, so the inverse problem is nonunique. But if we add an extra condition or constraint that specifies

$$\lambda_1 (a - b) = 0,$$

where λ₁ is a user-specified weight (a.k.a. Lagrange multiplier), we now have a well-posed system of two equations in two unknowns (provided λ₁ ≠ 0). The unique solution to this system is a = b = 1. This second expression is a form of model regularization. It assumes that a functional relationship exists between the model parameters (in this case, it basically assumes that a = b) that can be used to add information and thereby constrain the original problem.
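The two-equation toy system above can be verified numerically (a sketch assuming NumPy; the particular value of the weight is an arbitrary choice):

```python
import numpy as np

# Regularized toy system: a + b = 2 (data constraint) plus
# lam * (a - b) = 0 (regularization), with a hypothetical weight lam.
lam = 0.5
G = np.array([[1.0,  1.0],
              [lam, -lam]])
rhs = np.array([2.0, 0.0])

a, b = np.linalg.solve(G, rhs)
print(a, b)  # 1.0 1.0
```

Because the two equations are exactly consistent here, any nonzero λ₁ yields the same answer a = b = 1; the weight only starts to matter when the data and the regularization constraint pull in different directions.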

This idea can be applied to (3) by augmenting it with an additional regularization condition:


$$\varepsilon = \left\| \mathbf{m} - \left(\mathbf{A}^T \mathbf{A}\right)^{-1} \mathbf{A}^T \mathbf{d} \right\|^2 + \lambda_1 \left\| \mathbf{L}\,\mathbf{m} \right\|^2 \tag{4}$$

where, for example, L can be defined as the gradient or first-difference operator

$$\mathbf{L} \equiv \begin{bmatrix} 1 & -1 & 0 & 0 & \cdots \\ 0 & 1 & -1 & 0 & \cdots \\ \vdots & & \ddots & \ddots & \end{bmatrix}$$

that basically reduces the difference between adjacent model parameters in m. This type of constraint requires that neighboring model parameters be of similar magnitude. For this reason, the second term on the RHS of (4) is often referred to as a smoothness constraint. The strength


(cartooned by the bottom arrow). We would especially like to know how well we resolve model parameters via the inverse mapping. Combining (1) and (2), we get

$$\hat{\mathbf{m}} \cong \left(\mathbf{A}^T \mathbf{A}\right)^{-1} \left(\mathbf{A}^T \mathbf{A}\right) \mathbf{m}, \tag{8}$$

and we define the resolution matrix as

$$\mathbf{R} \equiv \left(\mathbf{A}^T \mathbf{A}\right)^{-1} \left(\mathbf{A}^T \mathbf{A}\right) \tag{9}$$

so that $\hat{\mathbf{m}} \cong \mathbf{R}\,\mathbf{m}$.

Clearly, if R = I then all the parameters are equally and perfectly resolved. In practice, R will often be diagonally dominant, but the elements of the main diagonal will not all be one and the off-diagonal elements will often differ from zero. By analyzing the diagonal of R, we gather information on how well resolved the parameters are (resolution increases as the diagonal values approach unity). The off-diagonal terms tell us about 'smearing' between parameters.
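For a toy problem, R can be computed directly (a sketch assuming NumPy; a damping term λ²I, in the spirit of the regularization discussion above, is added so the inverse exists, and it is precisely this damping that makes R differ from I):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 5))  # toy sensitivity matrix, N = 30, M = 5
lam = 2.0                         # hypothetical damping weight

AtA = A.T @ A
# Damped resolution matrix: R = (A^T A + lam^2 I)^{-1} (A^T A).
# With lam = 0 (and AtA invertible) this would give R = I exactly.
R = np.linalg.solve(AtA + lam**2 * np.eye(5), AtA)

print(np.round(np.diag(R), 3))    # diagonal values below 1: imperfect resolution
print(np.allclose(R, np.eye(5)))  # False once damping is present
```

Diagonal values close to unity indicate well-resolved parameters; the nonzero off-diagonals quantify the smearing between parameters.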

However, in seismic tomography, A frequently contains billions or even trillions of elements, so formally computing R is impractical (if not impossible). There are a number of ways to circumvent this difficulty. The two methods described here are the checkerboard resolution test and the hypothesis test.

The checkerboard resolution test basically utilizes a synthetic Earth model wherein seismic velocities alternate between slow and fast, in a checkerboard pattern:

$$\mathbf{m}_{\text{syn}} \equiv \text{checkerboard}.$$

$\mathbf{m}_{\text{syn}}$ is mapped to synthetic data by

$$\mathbf{d}_{\text{syn}} = \mathbf{A}\,\mathbf{m}_{\text{syn}},$$

and then $\mathbf{d}_{\text{syn}}$ is inverted to create the inversion model

$$\hat{\mathbf{m}}_{\text{syn}} = \left(\mathbf{A}^T \mathbf{A}\right)^{-1} \mathbf{A}^T \mathbf{d}_{\text{syn}}.$$

We can then consider the resolution metric

$$\left\| \mathbf{m}_{\text{syn}} - \hat{\mathbf{m}}_{\text{syn}} \right\|^2$$

or we can simply visually inspect $\hat{\mathbf{m}}_{\text{syn}}$ to see how well it tallies with $\mathbf{m}_{\text{syn}}$. The following observations can be made about checkerboard resolution tests:

  • The greatest model resolution, not surprisingly, occurs over continental landmasses.
  • Many regions over open oceans apparently cannot be resolved at all.
  • Deep mantle model parameters are generally better resolved than ones nearer the surface because there is better data coverage provided by deep diving waves.
  • Model regularizations used in the inversions tend to provide poorer amplitude resolution than spatial resolution.
  • These test types assume A is known. But A is an approximation of physical reality.
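A miniature version of the checkerboard test can be sketched as follows (assuming NumPy; the grid size, the random ray coverage, and the damping weight are all arbitrary illustrative choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)

# Checkerboard synthetic model on an 8x8 grid (+1/-1 alternating), flattened.
ix, iy = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
m_syn = np.where((ix + iy) % 2 == 0, 1.0, -1.0).ravel()

# Hypothetical ray coverage: 200 "rays", each sampling a random ~10% of cells.
A = (rng.random((200, 64)) < 0.1).astype(float)

d_syn = A @ m_syn                          # forward map: synthetic data
lam = 0.1                                  # small damping for numerical stability
AtA = A.T @ A + lam**2 * np.eye(64)
m_hat = np.linalg.solve(AtA, A.T @ d_syn)  # invert the synthetic data

misfit = np.linalg.norm(m_hat - m_syn) / np.linalg.norm(m_syn)
print(f"relative recovery error: {misfit:.3f}")
```

With dense, well-distributed coverage like this, the checkerboard is recovered almost perfectly; zeroing out blocks of rays (e.g., all rays touching "oceanic" cells) smears out the pattern there, which is exactly what the visual inspection step is meant to reveal.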


The hypothesis testing method is less arbitrary than the checkerboard method. Rather than test the resolution of an unrealistic Earth model, which is only marginally informative at best, this method tests a null hypothesis. The null hypothesis can be any Earth model that is similar to the current inversion model, but that differs in some way that would be important from an interpretive viewpoint. Let the current and null-hypothesis models be denoted, respectively, m_0 and m_null. The null-hypothesis Earth model is mapped to data space via A and then inverted back to model space via (A^T A)^{-1} A^T – so we're considering (A^T A)^{-1} (A^T A) m_null. If (A^T A)^{-1} (A^T A) m_null is similar to m_null at those places where m_null was purposely made to differ from m_0, then there is sufficient resolution in these regions and the null hypothesis is rejected. If (A^T A)^{-1} (A^T A) m_null looks like m_0, then the null hypothesis is accepted – the data do not constrain the parameters enough to distinguish between m_0 and m_null. So, when the null hypothesis is rejected, we have resolution in the regions of interest; but if it is accepted, then both the null hypothesis and the original inversion model can equally explain the data, which presents a problem as far as interpretation is concerned. Hypothesis testing is easily visualized using the block diagram shown in Figure 1.

Figure 1. Block diagram cartooning hypothesis testing for resolution. The top left panel depicts a tomographic model (m_0) with one anomaly; the top right panel shows the null-hypothesis model (m_null), with two separate anomalies and the outline of m_0 dashed for comparison. We would like to know, for example, whether the data provide enough resolution to unequivocally interpret the anomaly as a contiguous body or perhaps as two separate bodies (as in the null-hypothesis panel). To do this, apply (A^T A)^{-1} (A^T A) to m_null and check whether the two-anomaly character is preserved. If it is, the hypothesis is rejected: resolution is sufficient to distinguish between the two models. If not, the hypothesis is accepted: resolution is insufficient to reasonably distinguish between the null hypothesis and the original image.
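The workflow of the hypothesis test can be sketched numerically (assuming NumPy; the damped inverse stands in for (A^T A)^{-1} A^T so that the inverse exists, and the two-anomaly m_null is a hypothetical example):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 10))  # toy sensitivity matrix
lam = 1.0                          # hypothetical damping weight

# Null-hypothesis model: two separate anomalies, in cells 2 and 7.
m_null = np.zeros(10)
m_null[2] = 1.0
m_null[7] = 1.0

# Map to data space and back: m_rec = (A^T A + lam^2 I)^{-1} (A^T A) m_null
AtA = A.T @ A
m_rec = np.linalg.solve(AtA + lam**2 * np.eye(10), AtA @ m_null)

# If the two-anomaly character survives in m_rec, the null hypothesis is
# rejected: the data can resolve the difference between m_0 and m_null.
print(np.round(m_rec, 2))
```

Here the coverage is good, so both anomalies survive the round trip and the null hypothesis would be rejected; with sparse coverage the two peaks would smear into one and the test would come back "accepted".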


The Fresnel zone is shaped like a banana and is frequently referred to as the banana donut.

In cross-section, the banana donut is represented by a series of concentric contours that correspond to the spatial sensitivity of the banana donut at each point. Interestingly, for finite frequencies, the sensitivity is identically zero at the center of the cross-section, which provides the counterintuitive insight that the GRT raypath is a path of zero sensitivity! [Figure: banana donut sensitivity kernel in cross-section, with zero sensitivity at its center.]

The banana donut kernel (BDK) replaces the elements in A , as described earlier in these notes. Because the BDK is a volume and not a raypath, A is no longer nearly as sparse as it is in the case of GRT. Moreover, because sensitivity is now distributed in a volume, the BDK A is less ill-posed than the GRT A. Hence, banana donut kernels will tend to increase resolution, as compared to GRT kernels.

Multipathing

The preceding ideas also naturally give rise to multipathing , which acknowledges that more than one raypath can satisfy Fermat’s Principle. For example, a “scattered” ray might travel through a faster region and therefore arrive at the same time as the GRT ray. Consequently, the integral

$$d = \int_{\text{Volume}} G\,\delta\mu\; dV$$

can be simplified to the sum of two integrals

$$\delta t = \int_{\text{ray 1}} \Delta s\, dl \;+\; \int_{\text{ray 2}} \Delta s\, dl,$$

and this is the central idea of multipathing.
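In discretized form, the two-ray sum amounts to adding the path integrals of the slowness perturbation along each ray (a trivial numerical sketch; the Δs samples and segment length are made-up numbers for illustration):

```python
import numpy as np

# Hypothetical slowness-perturbation samples (s/km) along each raypath.
ds_ray1 = np.array([0.01, 0.02, 0.01])
ds_ray2 = np.array([-0.01, 0.0, 0.005])
dl = 5.0  # segment length (km), assumed uniform

# Discretized two-ray sum: dt = sum(ds * dl) over ray 1 + over ray 2
dt = np.sum(ds_ray1) * dl + np.sum(ds_ray2) * dl
print(dt)  # 0.175 s
```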