Difference between revisions of "I4DVAR ANA SENSITIVITY"

From WikiROMS
Jump to navigationJump to search
Line 1: Line 1:
{{#lst:Options|OBS_SENSITIVITY}}
{{#lst:Options|OBS_SENSITIVITY}}
<wikitex>
==Mathematical Formulation==
The mathematical development presented here closely parallels that of [[Bibliography#ZhuY_2008a | Zhu and Gelaro (2008)]]. The sensitivity of any functional, $\cal J$, of the analysis $\bf\Psi_a$ or forecast $\bf\Psi_f$ can be efficiently computed using the adjoint model which yields information about the gradients of ${\cal J}({\bf\Psi_a})$ and ${\cal J}({\bf\Psi_f})$. We can extent the concept of the adjoint sensitivity to compute the sensitivity of the [[IS4DVAR]] cost function, $J$, in (1) and any other function $\cal J$ of the forecast $\bf\Psi_f$ to the observations, $\bf y$.
In [[IS4DVAR]], we define a quadratic cost function:
$$J(\psi) = \frac{1}{2}\,\psi^T {\bf B}^{-1} \psi + \frac{1}{2}\,({\bf G}\psi - {\bf d})^T {\bf O}^{-1} ({\bf G}\psi - {\bf d}) \eqno{(1)} $$


==Technical Description==
where $\psi$ is the ocean state vector ($\zeta$, u, v, T, S, ...) with $\psi\equiv\psi(t_0)$, $\bf O$ is the observation error and error of representativeness covariance matrix, $\bf B$ represents the background error covariance matrix, $\bf d$ is the innovation vector that represents the difference between the nonlinear background solution ($\bf\Psi_b$) and the observations, ${\bf d}_i = {\bf y}_i - {\bf H}_i ({\bf\Psi_b}(t_i))$. Here, ${\bf H}_i$ is an operator that samples the nonlinear model at the observation location, ${\bf G} = {\bf H}_{i}^{\prime} {\bf R}(t_0,t_i)$, ${\bf H}_{i}^{\prime}$ is the linearization of ${\bf H}_i$, and the operator $\bf R$ is  referred as the tangent linear model propagator.


<wikitex>
At the minimum of (1), the cost function gradient ${\partial J}/{\partial\psi}$ vanishes and:


The mathematical development presented here closely parallels that of [[Bibliography#ZhuY_2008a | Zhu and Gelaro (2008)]]. The sensitivity of any functional, $\cal J$, of the analysis $\bf\Psi_a$ or forecast $\bf\Psi_f$ can be efficiently computed using the adjoint model which yields information about the gradients of ${\cal J}({\bf\Psi_a})$ and ${\cal J}({\bf\Psi_f})$. We can extent the concept of the adjoint sensitivity to compute the sensitivity of the [[IS4DVAR]] cost function, $J$,
$$\psi \equiv \psi_a = ({\bf B}^{-1} + {\bf G}^T {\bf O}^{-1} {\bf G})^{-1} {\bf G}^T {\bf O}^{-1}  {\bf d}, \eqno{(2)}$$


$$J(\psi) = \frac{1}{2}\,\psi^T {\bf B}^{-1} \psi + \frac{1}{2}\,({\bf G}\psi - {\bf d})^T {\bf O}^{-1} ({\bf G}\psi - {\bf d}) \eqno{(1)} $$
where $\psi_a$ is referred to as the analysis increment, the desired solution of the incremental data assimilation procedure.


and any other function $\cal J$ of the forecast $\bf\Psi_f$ to the observations, $\bf y$. Here, $\psi$ is the ocean state vector, $\bf O$ is the observation error and error of representativeness matrix, $\bf B$ represents the background error covariance matrix, $\bf d$ is the innovation vector that represents the difference between the nonlinear background solution and the observations, ${\bf d}_i = {\bf y}_i - {\bf H}_i ({\bf\Psi_b}(t))$, ${\bf H}_i$ is an operator that samples the nonlinear model at the observation location, and ${\bf G} = {\bf H}_{i}^{\prime} {\bf R}(t_0,t_i)$.
The [[IS4DVAR]] analysis can be written in the more traditional form as ${\bf\Psi_a} = {\bf\Psi_b} + {\bf K} {\bf d}$ ([[Bibliography#DaleyR_1991a | Daley, 1991]]), where $\psi_a = {\bf K} {\bf d}$, the Kalman gain matrix ${\bf K} = {\cal H}^{-1} {\bf G}^T {\bf O}^{-1}$, and ${\cal H} = {\partial^{2}J}/{\partial\psi^2} = {\bf B}^{-1} {\bf G}^T {\bf O}^{-1} {\bf G}$ is the Hessian matrix. The entire [[IS4DVAR]] procedure is therefore neatly embodied in $\bf K$. At the cost function minimum, (1) can be written as $J = \frac{1}{2} {\bf d}^T ({\bf I} - {\bf K}^T {\bf G}^T) {\bf O}^{-1} {\bf d}$ which yields the sensitivity of the [[IS4DVAR]] cost function to the observations.


In the current [[IS4DVAR]]/[[LANCZOS]] data assimilation algorithm, the cost function (1) is identified using the Lanczos method  [[Bibliography#Golub_1989a | (Golub and Van Loan, 1989)]], in which case:
In the current [[IS4DVAR]]/[[LANCZOS]] data assimilation algorithm, the cost function (1) is identified using the Lanczos method  [[Bibliography#GolubG_1989a | (Golub and Van Loan, 1989)]], in which case:


$${\bf\Psi_a} = {\bf\Psi_b} - {\bf Q}_{k} {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf G}^T {\bf O}^{-1} {\bf d} \eqno{(2)}$$
$${\bf\Psi_a} = {\bf\Psi_b} - {\bf Q}_{k} {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf G}^T {\bf O}^{-1} {\bf d} \eqno{(3)}$$


where ${\bf Q}_k = ({\bf q}_i)$ is the matrix of $k$ orthogonal Lanczos vectors, and ${\bf T}_k$ is a known tridiagonal matrix. Each of the $k\hyphen$iterations of [[IS4DVAR]] employed in finding the minimum of $J$ yields one column ${\bf q}_i$ of ${\bf Q}_k$. We can identify the Kalman gain matrix as ${\bf K} = - {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf G}^T {\bf O}^{-1}$ in which case ${\bf K}^T = - {\bf O}^{-1} {\bf G} {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T}$ represents the adjoint of the entire [[IS4DVAR]] system. The ${\bf K}^T$ on the vector ${\bf G}^T {\bf O}^{-1} {\bf d}$ above can be readily computed since the Lanczos vectors ${\bf q}_i$ and the matrix ${\bf T}_k$ are available at the end of each [[IS4DVAR]] assimilation cycle.
where ${\bf Q}_k = ({\bf q}_i)$ is the matrix of $k$ orthogonal Lanczos vectors, and ${\bf T}_k$ is a known tridiagonal matrix. Each of the $k\hyphen$iterations of [[IS4DVAR]] employed in finding the minimum of $J$ yields one column ${\bf q}_i$ of ${\bf Q}_k$. From (3) we can identify the Kalman gain matrix as ${\bf K} = - {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf G}^T {\bf O}^{-1}$ in which case ${\bf K}^T = - {\bf O}^{-1} {\bf G} {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T}$ represents the adjoint of the entire [[IS4DVAR]] system. The action of ${\bf K}^T$ on the vector ${\bf G}^T {\bf O}^{-1} {\bf d}$ in (3) can be readily computed since the Lanczos vectors ${\bf q}_i$ and the matrix ${\bf T}_k$ are available at the end of each [[IS4DVAR]] assimilation cycle.


Therefore, the basic observation sensitivity algorithm is as follows:
Therefore, the basic observation sensitivity algorithm is as follows:
Line 24: Line 32:
# Multiply ${\bf H}^{\prime}\psi(t)$ by $-{\bf O}^{-1}$ to yield ${\partial J}/{\partial{\bf y}}$.
# Multiply ${\bf H}^{\prime}\psi(t)$ by $-{\bf O}^{-1}$ to yield ${\partial J}/{\partial{\bf y}}$.


Consider now a forecast ${\bf\Psi_f}(t)$ fort the interval $t = [t_0+\tau, \; t_0+\tau+t_f]$ initialized from ${\bf\Psi_a}(t_0+\tau)$ obtained from an assimilation cycle over the interval $t = [t_0+\tau, \; t_0+\tau]$, where $t_f$ is the forecast lead time. In addition, consider ${\cal J}({\bf\Psi_f}(t))$ that depends on the forecast ${\bf\Psi_f}(t)$, and which characterizes some future aspect of the forecast circulation.  According to the chain rule, the sensitivity ${\partial\cal{J}}/{\partial{\bf y}}$ of $\cal{J}$ to the observations $\bf y$ collected during the assimilation cycle $t = [t_0+\tau, t_0+\tau]$ is given by:
Consider now a forecast ${\bf\Psi_f}(t)$ fort the interval $t = [t_0+\tau, \; t_0+\tau+t_f]$ initialized from ${\bf\Psi_a}(t_0+\tau)$ obtained from an assimilation cycle over the interval $t = [t_0+\tau, \; t_0+\tau]$, where $t_f$ is the forecast lead time. In addition, consider ${\cal J}({\bf\Psi_f}(t))$ that depends on the forecast ${\bf\Psi_f}(t)$, and which characterizes some future aspect of the forecast circulation.  According to the chain rule, the sensitivity ${\partial\cal{J}}/{\partial{\bf y}}$ of $\cal{J}$ to the observations $\bf y$ collected during the assimilation cycle $t = [t_0+\tau,t_0+\tau]$ is given by:


$$ {\pder{\cal J}{\bf y}} = {\pder{\bf\Psi_a}{\bf y}} {\pder{\bf\Psi_f}{\bf\Psi_a}} {\pder{\cal J}{\bf\Psi_f}} = - {\bf O}^{-1} {\bf H} {\bf R} {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf R}^{T} {\pder{\cal J}{\bf\Psi_f}} \eqno{(3)}$$
$$ {\pder{\cal J}{\bf y}} = {\pder{\bf\Psi_a}{\bf y}} {\pder{\bf\Psi_f}{\bf\Psi_a}} {\pder{\cal J}{\bf\Psi_f}} = - {\bf O}^{-1} {\bf H} {\bf R} {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf R}^{T} {\pder{\cal J}{\bf\Psi_f}} \eqno{(4)}$$


which again can be readily evaluated using the adjoint model denoted by ${\bf R}^{T}(t,t_0+\tau)$ and the adjoint of [[IS4DVAR]], ${\bf K}^{T}$.
which again can be readily evaluated using the adjoint model denoted by ${\bf R}^{T}(t,t_0+\tau)$ and the adjoint of [[IS4DVAR]], ${\bf K}^{T}$.


<wikitex>
==Technical Description==
 
</wikitex>

Revision as of 01:55, 10 July 2008

<wikitex>

Mathematical Formulation

The mathematical development presented here closely parallels that of Zhu and Gelaro (2008). The sensitivity of any functional, $\cal J$, of the analysis $\bf\Psi_a$ or forecast $\bf\Psi_f$ can be efficiently computed using the adjoint model which yields information about the gradients of ${\cal J}({\bf\Psi_a})$ and ${\cal J}({\bf\Psi_f})$. We can extent the concept of the adjoint sensitivity to compute the sensitivity of the IS4DVAR cost function, $J$, in (1) and any other function $\cal J$ of the forecast $\bf\Psi_f$ to the observations, $\bf y$.

In IS4DVAR, we define a quadratic cost function:

$$J(\psi) = \frac{1}{2}\,\psi^T {\bf B}^{-1} \psi + \frac{1}{2}\,({\bf G}\psi - {\bf d})^T {\bf O}^{-1} ({\bf G}\psi - {\bf d}) \eqno{(1)} $$

where $\psi$ is the ocean state vector ($\zeta$, u, v, T, S, ...) with $\psi\equiv\psi(t_0)$, $\bf O$ is the observation error and error of representativeness covariance matrix, $\bf B$ represents the background error covariance matrix, $\bf d$ is the innovation vector that represents the difference between the nonlinear background solution ($\bf\Psi_b$) and the observations, ${\bf d}_i = {\bf y}_i - {\bf H}_i ({\bf\Psi_b}(t_i))$. Here, ${\bf H}_i$ is an operator that samples the nonlinear model at the observation location, ${\bf G} = {\bf H}_{i}^{\prime} {\bf R}(t_0,t_i)$, ${\bf H}_{i}^{\prime}$ is the linearization of ${\bf H}_i$, and the operator $\bf R$ is referred as the tangent linear model propagator.

At the minimum of (1), the cost function gradient ${\partial J}/{\partial\psi}$ vanishes and:

$$\psi \equiv \psi_a = ({\bf B}^{-1} + {\bf G}^T {\bf O}^{-1} {\bf G})^{-1} {\bf G}^T {\bf O}^{-1} {\bf d}, \eqno{(2)}$$

where $\psi_a$ is referred to as the analysis increment, the desired solution of the incremental data assimilation procedure.

The IS4DVAR analysis can be written in the more traditional form as ${\bf\Psi_a} = {\bf\Psi_b} + {\bf K} {\bf d}$ ( Daley, 1991), where $\psi_a = {\bf K} {\bf d}$, the Kalman gain matrix ${\bf K} = {\cal H}^{-1} {\bf G}^T {\bf O}^{-1}$, and ${\cal H} = {\partial^{2}J}/{\partial\psi^2} = {\bf B}^{-1} {\bf G}^T {\bf O}^{-1} {\bf G}$ is the Hessian matrix. The entire IS4DVAR procedure is therefore neatly embodied in $\bf K$. At the cost function minimum, (1) can be written as $J = \frac{1}{2} {\bf d}^T ({\bf I} - {\bf K}^T {\bf G}^T) {\bf O}^{-1} {\bf d}$ which yields the sensitivity of the IS4DVAR cost function to the observations.

In the current IS4DVAR/LANCZOS data assimilation algorithm, the cost function (1) is identified using the Lanczos method (Golub and Van Loan, 1989), in which case:

$${\bf\Psi_a} = {\bf\Psi_b} - {\bf Q}_{k} {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf G}^T {\bf O}^{-1} {\bf d} \eqno{(3)}$$

where ${\bf Q}_k = ({\bf q}_i)$ is the matrix of $k$ orthogonal Lanczos vectors, and ${\bf T}_k$ is a known tridiagonal matrix. Each of the $k\hyphen$iterations of IS4DVAR employed in finding the minimum of $J$ yields one column ${\bf q}_i$ of ${\bf Q}_k$. From (3) we can identify the Kalman gain matrix as ${\bf K} = - {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf G}^T {\bf O}^{-1}$ in which case ${\bf K}^T = - {\bf O}^{-1} {\bf G} {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T}$ represents the adjoint of the entire IS4DVAR system. The action of ${\bf K}^T$ on the vector ${\bf G}^T {\bf O}^{-1} {\bf d}$ in (3) can be readily computed since the Lanczos vectors ${\bf q}_i$ and the matrix ${\bf T}_k$ are available at the end of each IS4DVAR assimilation cycle.

Therefore, the basic observation sensitivity algorithm is as follows:

  1. Force the adjoint model with ${\bf O}^{-1} {\bf d}$ to yield $\psi^\dagger = {\bf G}^{T} {\bf O}^{-1} {\bf d} \propto {\bf q}_i$.
  2. Operate $\psi^\dagger(t_0)$ with ${\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T}$ which is equivalent to a rank$\hyphen k$ approximation of the Hessian matrix.
  3. Integrate the results of step 2 forward in time using the tangent linear model and save the solution ${\bf H}^{\prime} \psi(t)$, that is, solution at observation locations.
  4. Multiply ${\bf H}^{\prime}\psi(t)$ by $-{\bf O}^{-1}$ to yield ${\partial J}/{\partial{\bf y}}$.

Consider now a forecast ${\bf\Psi_f}(t)$ fort the interval $t = [t_0+\tau, \; t_0+\tau+t_f]$ initialized from ${\bf\Psi_a}(t_0+\tau)$ obtained from an assimilation cycle over the interval $t = [t_0+\tau, \; t_0+\tau]$, where $t_f$ is the forecast lead time. In addition, consider ${\cal J}({\bf\Psi_f}(t))$ that depends on the forecast ${\bf\Psi_f}(t)$, and which characterizes some future aspect of the forecast circulation. According to the chain rule, the sensitivity ${\partial\cal{J}}/{\partial{\bf y}}$ of $\cal{J}$ to the observations $\bf y$ collected during the assimilation cycle $t = [t_0+\tau,t_0+\tau]$ is given by:

$$ {\pder{\cal J}{\bf y}} = {\pder{\bf\Psi_a}{\bf y}} {\pder{\bf\Psi_f}{\bf\Psi_a}} {\pder{\cal J}{\bf\Psi_f}} = - {\bf O}^{-1} {\bf H} {\bf R} {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf R}^{T} {\pder{\cal J}{\bf\Psi_f}} \eqno{(4)}$$

which again can be readily evaluated using the adjoint model denoted by ${\bf R}^{T}(t,t_0+\tau)$ and the adjoint of IS4DVAR, ${\bf K}^{T}$.

Technical Description

</wikitex>