I4DVAR ANA SENSITIVITY

Mathematical Formulation

<wikitex>The mathematical development presented here closely parallels that of Zhu and Gelaro (2008). The sensitivity of any functional, $\cal J$, of the analysis $\bf\Psi_a$ or forecast $\bf\Psi_f$ can be efficiently computed using the adjoint model, which yields information about the gradients of ${\cal J}({\bf\Psi_a})$ and ${\cal J}({\bf\Psi_f})$. We can extend the concept of adjoint sensitivity to compute the sensitivity of the IS4DVAR cost function, $J$, in (1), and of any other function $\cal J$ of the forecast $\bf\Psi_f$, to the observations, $\bf y$.

In IS4DVAR, we define a quadratic cost function:

$$J(\psi) = \frac{1}{2}\,\psi^T {\bf B}^{-1} \psi + \frac{1}{2}\,({\bf G}\psi - {\bf d})^T {\bf O}^{-1} ({\bf G}\psi - {\bf d}) \eqno{(1)} $$

where $\psi$ is the ocean state vector ($\zeta$, u, v, T, S, $\ldots$) with $\psi\equiv\psi(t_0)$, $\bf O$ is the observation error and error of representativeness covariance matrix, $\bf B$ is the background error covariance matrix, and $\bf d$ is the innovation vector representing the difference between the observations and the nonlinear background solution ($\bf\Psi_b$), ${\bf d}_i = {\bf y}_i - {\bf H}_i ({\bf\Psi_b}(t_i))$. Here, ${\bf H}_i$ is an operator that samples the nonlinear model at the observation locations, ${\bf G} = {\bf H}_{i}^{\prime} {\bf R}(t_0,t_i)$, ${\bf H}_{i}^{\prime}$ is the linearization of ${\bf H}_i$, and the operator $\bf R$ is referred to as the tangent linear model propagator.
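
For orientation, the following is a minimal numpy sketch of evaluating (1). It is not ROMS code: the arrays standing in for $\bf B$, $\bf O$, $\bf G$, $\bf y$, and ${\bf H}({\bf\Psi_b})$ are small, hypothetical placeholders chosen only to show how the two quadratic terms are weighted.

 # Minimal sketch of the quadratic cost function (1) with hypothetical stand-in arrays.
 import numpy as np
 rng = np.random.default_rng(1)
 n, m = 6, 4                          # state size, number of observations
 B = np.eye(n)                        # background error covariance (stand-in)
 O = 0.1 * np.eye(m)                  # observation error covariance (stand-in)
 G = rng.standard_normal((m, n))      # tangent linear model sampled at the obs points (stand-in)
 y = rng.standard_normal(m)           # observations (stand-in)
 Hxb = rng.standard_normal(m)         # background sampled at the obs points, H(Psi_b) (stand-in)
 d = y - Hxb                          # innovation vector
 def J(psi):
     """Quadratic IS4DVAR cost function, equation (1)."""
     misfit = G @ psi - d
     return 0.5 * psi @ np.linalg.solve(B, psi) + 0.5 * misfit @ np.linalg.solve(O, misfit)
 print(J(np.zeros(n)))                # cost of the background state (psi = 0)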

At the minimum of (1), the cost function gradient ${\partial J}/{\partial\psi}$ vanishes and:

$$\psi \equiv \psi_a = ({\bf B}^{-1} + {\bf G}^T {\bf O}^{-1} {\bf G})^{-1} {\bf G}^T {\bf O}^{-1} {\bf d}, \eqno{(2)}$$

where $\psi_a$ is referred to as the analysis increment, the desired solution of the incremental data assimilation procedure.

The IS4DVAR analysis can be written in the more traditional form as ${\bf\Psi_a} = {\bf\Psi_b} + {\bf K} {\bf d}$ (Daley, 1991), where $\psi_a = {\bf K} {\bf d}$, the Kalman gain matrix ${\bf K} = {\cal H}^{-1} {\bf G}^T {\bf O}^{-1}$, and ${\cal H} = {\partial^{2}J}/{\partial\psi^2} = {\bf B}^{-1} + {\bf G}^T {\bf O}^{-1} {\bf G}$ is the Hessian matrix. The entire IS4DVAR procedure is therefore neatly embodied in $\bf K$. At the cost function minimum, (1) can be written as $J = \frac{1}{2} {\bf d}^T ({\bf I} - {\bf K}^T {\bf G}^T) {\bf O}^{-1} {\bf d}$, which yields the sensitivity of the IS4DVAR cost function to the observations.
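
The closed form (2) and the Kalman gain identities above can be checked numerically. The sketch below is illustrative only; $\bf B$, $\bf O$, $\bf G$, and $\bf d$ are the same kind of small, hypothetical stand-ins used in the previous sketch, not ROMS quantities.

 # Check equation (2) and the Kalman gain identities with hypothetical stand-ins.
 import numpy as np
 rng = np.random.default_rng(1)
 n, m = 6, 4
 B, O = np.eye(n), 0.1 * np.eye(m)
 G = rng.standard_normal((m, n))
 d = rng.standard_normal(m)
 Binv, Oinv = np.linalg.inv(B), np.linalg.inv(O)
 Hess = Binv + G.T @ Oinv @ G                    # Hessian of J
 psi_a = np.linalg.solve(Hess, G.T @ Oinv @ d)   # analysis increment, equation (2)
 K = np.linalg.solve(Hess, G.T @ Oinv)           # Kalman gain K = Hessian^{-1} G^T O^{-1}
 grad = Binv @ psi_a + G.T @ Oinv @ (G @ psi_a - d)
 print(np.allclose(grad, 0.0))                   # True: gradient of J vanishes at the minimum
 print(np.allclose(psi_a, K @ d))                # True: psi_a = K d
 Jmin = 0.5 * psi_a @ (Binv @ psi_a) + 0.5 * (G @ psi_a - d) @ Oinv @ (G @ psi_a - d)
 print(np.allclose(Jmin, 0.5 * d @ (np.eye(m) - K.T @ G.T) @ Oinv @ d))   # True: J at the minimum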

In the current IS4DVAR/LANCZOS data assimilation algorithm, the cost function minimum of (1) is identified using the Lanczos method (Golub and Van Loan, 1989), in which case:

$${\bf\Psi_a} = {\bf\Psi_b} - {\bf Q}_{k} {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf G}^T {\bf O}^{-1} {\bf d} \eqno{(3)}$$

where ${\bf Q}_k = ({\bf q}_i)$ is the matrix of $k$ orthogonal Lanczos vectors, and ${\bf T}_k$ is a known tridiagonal matrix. Each of the $k$ iterations of IS4DVAR employed in finding the minimum of $J$ yields one column ${\bf q}_i$ of ${\bf Q}_k$. From (3) we can identify the Kalman gain matrix as ${\bf K} = - {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf G}^T {\bf O}^{-1}$, in which case ${\bf K}^T = - {\bf O}^{-1} {\bf G} {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T}$ represents the adjoint of the entire IS4DVAR system. The action of ${\bf K}^T$ on the vector ${\bf G}^T {\bf O}^{-1} {\bf d}$ in (3) can be readily computed since the Lanczos vectors ${\bf q}_i$ and the matrix ${\bf T}_k$ are available at the end of each IS4DVAR assimilation cycle.
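
The role of ${\bf Q}_k$ and ${\bf T}_k$ can be illustrated by running a plain Lanczos recursion on the Hessian of a small, hypothetical problem. The sketch below is not the ROMS LANCZOS minimization: it starts the recursion directly from ${\bf G}^T {\bf O}^{-1} {\bf d}$ (so the sign convention of (3) does not appear) and shows that ${\bf Q}_k {\bf T}_k^{-1} {\bf Q}_k^T$ applied to that vector reproduces the exact inverse Hessian action when $k$ equals the state dimension; for smaller $k$ the same expression gives a rank-$k$ approximation.

 # Illustrative Lanczos recursion on the Hessian of a small, hypothetical problem.
 import numpy as np
 rng = np.random.default_rng(1)
 n, m, k = 6, 4, 6                               # with k = n the approximation is exact
 B = np.diag(rng.uniform(0.5, 2.0, n))           # hypothetical diagonal background covariance
 O = 0.1 * np.eye(m)
 G = rng.standard_normal((m, n))
 d = rng.standard_normal(m)
 Oinv = np.linalg.inv(O)
 Hess = np.linalg.inv(B) + G.T @ Oinv @ G
 r = G.T @ Oinv @ d                              # starting vector G^T O^{-1} d
 Q = np.zeros((n, k))
 alpha, beta = np.zeros(k), np.zeros(k - 1)
 q_prev, q, b = np.zeros(n), r / np.linalg.norm(r), 0.0
 for i in range(k):
     Q[:, i] = q
     w = Hess @ q - b * q_prev                   # three-term Lanczos recurrence
     alpha[i] = q @ w
     w = w - alpha[i] * q
     b = np.linalg.norm(w)
     if i < k - 1:
         beta[i] = b
         q_prev, q = q, w / b
 T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)   # tridiagonal T_k
 inc_lanczos = Q @ np.linalg.solve(T, Q.T @ r)   # Q_k T_k^{-1} Q_k^T G^T O^{-1} d
 inc_direct = np.linalg.solve(Hess, r)           # exact inverse Hessian applied to the same vector
 print(np.allclose(inc_lanczos, inc_direct))     # True for k = n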

Therefore, the basic observation sensitivity algorithm is as follows (a schematic numerical transcription is given after the list):

  1. Force the adjoint model with ${\bf O}^{-1} {\bf d}$ to yield $\psi^\dagger = {\bf G}^{T} {\bf O}^{-1} {\bf d} \propto {\bf q}_1$.
  2. Operate on $\psi^\dagger(t_0)$ with ${\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T}$, which is equivalent to applying a rank-$k$ approximation of the inverse Hessian matrix.
  3. Integrate the result of step 2 forward in time using the tangent linear model and save the solution at the observation locations, ${\bf H}^{\prime} \psi(t_i)$.
  4. Multiply ${\bf H}^{\prime}\psi(t_i)$ by $-{\bf O}^{-1}$ to yield ${\partial J}/{\partial{\bf y}}$.
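
The four steps above can be strung together in a few lines. The sketch below is a schematic transcription, not the ROMS driver: $\bf G$ and ${\bf G}^T$ stand in for the tangent linear and adjoint models sampled at the observation points, and ${\bf Q}_k$, ${\bf T}_k$ are fabricated here from an eigendecomposition of the exact Hessian so that ${\bf Q}_k {\bf T}_k^{-1} {\bf Q}_k^T$ matches the inverse Hessian (in ROMS they are the stored Lanczos vectors and tridiagonal matrix).

 # Schematic transcription of steps 1-4 with hypothetical stand-in operators.
 import numpy as np
 rng = np.random.default_rng(1)
 n, m, k = 6, 4, 6                             # k = n so the fabricated Q, T are exact
 B = np.diag(rng.uniform(0.5, 2.0, n))
 O = 0.1 * np.eye(m)
 G = rng.standard_normal((m, n))
 d = rng.standard_normal(m)
 Oinv = np.linalg.inv(O)
 # Stand-ins for Q_k and T_k: an eigendecomposition of the exact Hessian, so that
 # Q @ inv(T) @ Q.T equals the inverse Hessian (T is diagonal here, not tridiagonal).
 Hess = np.linalg.inv(B) + G.T @ Oinv @ G
 evals, evecs = np.linalg.eigh(Hess)
 Q, T = evecs[:, :k], np.diag(evals[:k])
 psi_dag = G.T @ (Oinv @ d)                    # step 1: adjoint model forced with O^{-1} d
 psi0 = Q @ np.linalg.solve(T, Q.T @ psi_dag)  # step 2: apply Q_k T_k^{-1} Q_k^T (inverse Hessian)
 at_obs = G @ psi0                             # step 3: tangent linear model sampled at the obs
 dJdy = -Oinv @ at_obs                         # step 4: multiply by -O^{-1}
 print(dJdy)                                   # sensitivity of J to each observation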

Consider now a forecast ${\bf\Psi_f}(t)$ for the interval $t = [t_0+\tau, \; t_0+\tau+t_f]$ initialized from ${\bf\Psi_a}(t_0+\tau)$ obtained from an assimilation cycle over the interval $t = [t_0, t_0+\tau]$, where $t_f$ is the forecast lead time. In addition, consider a function ${\cal J}({\bf\Psi_f}(t))$ that depends on the forecast ${\bf\Psi_f}(t)$ and characterizes some future aspect of the forecast circulation. According to the chain rule, the sensitivity ${\partial{\cal J}}/{\partial{\bf y}}$ of $\cal{J}$ to the observations $\bf y$ collected during the assimilation cycle $t = [t_0,t_0+\tau]$ is given by:

$$ \frac{\partial{\cal J}}{\partial{\bf y}} = \frac{\partial{\bf\Psi_a}}{\partial{\bf y}}\,\frac{\partial{\bf\Psi_f}}{\partial{\bf\Psi_a}}\,\frac{\partial{\cal J}}{\partial{\bf\Psi_f}} = - {\bf O}^{-1} {\bf H} {\bf R} {\bf Q}_k {\bf T}_{k}^{-1} {\bf Q}_{k}^{T} {\bf R}^{T} \frac{\partial{\cal J}}{\partial{\bf\Psi_f}} \eqno{(4)}$$

which again can be readily evaluated using the adjoint model denoted by ${\bf R}^{T}(t,t_0+\tau)$ and the adjoint of IS4DVAR, ${\bf K}^{T}$. </wikitex>

Technical Description

<wikitex>This ROMS driver can be used to evaluate the impact of each observation on the IS4DVAR data assimilation algorithm by measuring the sensitivity of a specified circulation functional index, ${\cal J}$, to each observation. This algorithm is an extension of the adjoint sensitivity driver, AD_SENSITIVITY. As mentioned above, it is equivalent to taking the adjoint of the entire IS4DVAR algorithm.

The following steps are carried out in obs_sen_ocean.h (a schematic numerical sketch of steps 4 through 7 follows the list):

[Figure: observation sensitivity time diagram (obs sens time diagram.png)]
  1. We begin by running the incremental strong constraint 4DVar data assimilation algorithm (IS4DVAR) with the Lanczos conjugate gradient minimization algorithm (LANCZOS) using $k$ inner-loops ($\hbox{Ninner}=k$) and a single outer-loop ($\hbox{Nouter}=1$) for the period $t=[t_0,t_1]$. We denote the background initial condition by ${\bf\Psi_b}(t_0)$ and the observation vector by $\bf y$. The Lanczos vectors saved in the adjoint NetCDF file are denoted by ${\bf q}_i$, where $i=1,2,\ldots,k$.
  2. Next we run the nonlinear model for the combined assimilation plus forecast period $t=[t_0,t_2]$, where $t_2\geq t_1$. This represents the final sweep of the nonlinear model for the period $t=[t_0,t_1]$ after exiting the inner-loop in the IS4DVAR, plus the forecast period $t=[t_1,t_2]$. The initial condition for the nonlinear model at $t=t_0$ is ${\bf\Psi_b}(t_0)$ and not the new estimated initial condition, ${\bf\Psi}(t_0) = {\bf\Psi_b}(t_0) + \psi(t_0)$. We save the basic state trajectory, ${\bf\Psi_b}(t)$, of this nonlinear model run for use in the adjoint sensitivity calculation next, and for use in the tangent linear model run later in step 7. The time for which the sensitivity functional ${\cal J}(t)$ is defined dictates the choice of $t_2$. For example, if ${\cal J}(t)$ is a functional defined during the forecast interval $t_1<t<t_2$, then $t_2>t_1$ for this run of the nonlinear model. However, if ${\cal J}(t)$ is defined during the assimilation interval $t_0<t<t_1$, then $t_2=t_1$. That is, the definition of $t_2$ should be flexible depending on the choice of $\cal J$.
  3. The next step involves an adjoint sensitivity calculation run backward in time over the combined assimilation plus forecast period $t=[t_2,t_0]$. The basic state trajectory for this calculation is that from the nonlinear model run in step 2.
  4. After running the regular adjoint sensitivity calculation in step 3, we will have a full 3D adjoint state vector at time $t=t_0$. Let's call this vector $\psi^\dagger(t_0)$. The next thing we want to do is compute the dot-product of $\psi^\dagger(t_0)$ with each of the Lanczos vectors from the previous IS4DVAR run. Since we ran IS4DVAR with $k$ inner-loops, we have $k$ Lanczos vectors, denoted ${\bf q}_i$ where $i=1,2,\ldots,k$. We therefore compute the $k$ scalars $a_i = \psi^{\dagger T}(t_0)\,{\bf q}_i$ for $i=1,2,\ldots,k$.
  5. The next step is to invert the tridiagonal matrix associated with the Lanczos vectors. Let's denote this matrix as $\bf T$. We want to solve ${\bf T}{\bf b}={\bf a}$, where $\bf a$ is the $k\times 1$ vector of scalars $a_i$ from step 4, and $\bf b$ is the $k\times 1$ vector of unknowns. We solve for $\bf b$ using a tridiagonal solver.
  6. The next step is to compute a weighted sum of the Lanczos vectors. Let's call this $\bf z$, where ${\bf z} = \sum_i b_i {\bf q}_i$ for $i=1,2,\ldots,k$. The $b_i$ are obtained from solving the tridiagonal system in step 5, and the ${\bf q}_i$ are the Lanczos vectors. The vector $\bf z$ is a full state vector and is used as the initial condition for the tangent linear model in step 7.
  7. Finally, we run the tangent linear model over $t=[t_0,t_1]$ using $\bf z$ from step 6 as the initial condition. During this run of the tangent linear model, we read and process the observations $\bf y$ that were used in the IS4DVAR of step 1 and write the solution at the observation points and times to the MODname NetCDF file. The values written into MODname are the tangent linear solution at the observation points multiplied by $-{\bf O}^{-1}$, where $\bf O$ is the observation error covariance assigned to each observation during the IS4DVAR run of step 1.
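
Steps 4 through 7 above amount to a small amount of linear algebra on top of the stored Lanczos vectors. The following sketch is illustrative only and uses hypothetical stand-ins: ${\bf q}_i$ and $\bf T$ are fabricated rather than read from the adjoint NetCDF file, $\psi^\dagger(t_0)$ is a random vector standing in for the adjoint sensitivity solution from step 3, and $\bf G$ stands in for the tangent linear model sampled at the observation points.

 # Schematic sketch of steps 4-7 with hypothetical stand-ins for the ROMS arrays.
 import numpy as np
 rng = np.random.default_rng(1)
 n, m, k = 6, 4, 6
 O = 0.1 * np.eye(m)                       # observation error covariance from step 1 (stand-in)
 G = rng.standard_normal((m, n))           # tangent linear model sampled at the obs (stand-in)
 Oinv = np.linalg.inv(O)
 # Stand-ins for the k Lanczos vectors q_i (columns of Q) and the tridiagonal matrix T;
 # in ROMS these are read back from the IS4DVAR adjoint NetCDF file of step 1.
 Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
 T = (np.diag(rng.uniform(1.0, 2.0, k))
      + np.diag(0.1 * np.ones(k - 1), 1)
      + np.diag(0.1 * np.ones(k - 1), -1))
 psi_dag = rng.standard_normal(n)          # stand-in adjoint sensitivity solution at t0 (step 3)
 a = Q.T @ psi_dag                         # step 4: dot products a_i = psi_dagger . q_i
 b = np.linalg.solve(T, a)                 # step 5: solve the tridiagonal system T b = a
 z = Q @ b                                 # step 6: weighted sum of Lanczos vectors, z = sum_i b_i q_i
 dJdy = -Oinv @ (G @ z)                    # step 7: tangent linear run from z, sampled at the obs,
                                           #         then scaled by -O^{-1} and written to MODname
 print(dJdy)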

</wikitex>