Could it be that there is a bug in cgradient.F, specifically in the computation of the reduction of the gradient norm?
In cgradientF, the tridiagonal system is solved for the z vector in Eq. 22 of [1]. For innLoop=1, the back substitution into cg_Tmatrix(1,3) and the call to new_gradient() is skipped. Shouldn't this part be executed for innLoop=1? In other words, the 'END IF' for the following conditional
Code: Select all
IF (innLoop.gt.1) THEN
DO i=2,innLoop
cg_gamma(i,outLoop)=cg_beta(i,outLoop)/zbeta
zbeta=cg_delta(i,outLoop)- &
& cg_beta(i,outLoop)*cg_gamma(i,outLoop)
cg_zu(i,outLoop)=(-cg_QG(i,outLoop)- &
& cg_beta(i,outLoop)*cg_zu(i-1,outLoop))/zbeta
END DO
Code: Select all
g_2 = As_2 - b = A(s_1 + Vz) -b = g_1 + AVz =
g_1 + VTz + gamma_1 q_2 z
is approximately equal to
g_1 - V cg_QG + gamma_1 q_2 z
the first two terms cancel: V cg_QG = q_1 q_1^T g_1 = norm(g_1) q_1 = g_1
and z = z_1 \approx -norm(g_1)/delta_1
note that gamma_1/delta_1 = -sqrt(beta_1) = -norm(g_2)/norm(g_1)
g_2 is the second gradient
A is the hessian
T is the tridiagonal matrix of the Lanczos relation
V is the matrix of the Lanczos relation constiting of orthonormal column vectors q_i
q_1 is the first column of V
s_1 is the initial increment, i.e. zero
s_2 is the next increment
cg_QG = V^T g_1 is approximately Tz according to Eq. 22 of [1]
z is the solution of Eq. 22 of [1]
delta_1 is the first diagonal element of T
gamma_1 is the first upper diagonal element of T
The "gamma" here is called "beta" in the code, which is somewhat confusing.
The output of the I4DVAR tutorial currently is:
Code: Select all
....
(001,000): Initial gradient norm, Gnorm = 1.5065655E+03
...
(001,001): Reduction in the gradient norm, Greduc = 0.0000000E+00
(001,001): Lanczos algorithm coefficient, delta = 6.6899689E+02
* ROMS revision b7a47408, although slightly patched to use the old LAPACK due to compiler problems and fixing the bug in obs_k2z
* and roms_test revision e1af6099
* I'm using an outdated data revision, but I guess this doesn't change the control flow
It says the gradient reduction is zero after innLoop=1, but shouldn't we have already computed g_2, which should have a smaller norm than the initial g_1? Moving the 'END IF' upwards above the back-substitution of the tridiagonal solution algorithm gives
Code: Select all
...
(001,000): Initial gradient norm, Gnorm = 1.5065655E+03
...
(001,001): Reduction in the gradient norm, Greduc = 8.6059669E-01
(001,001): Lanczos algorithm coefficient, delta = 6.6899689E+02
Code: Select all
[drop@ocean I4DVAR]$ cat output.log |grep Reduction
(001,001): Reduction in the gradient norm, Greduc = 8.6059669E-01
(001,002): Reduction in the gradient norm, Greduc = 4.0482401E-01
(001,003): Reduction in the gradient norm, Greduc = 3.3909970E-01
(001,004): Reduction in the gradient norm, Greduc = 2.4363785E-01
(001,005): Reduction in the gradient norm, Greduc = 2.0550356E-01
(001,006): Reduction in the gradient norm, Greduc = 1.6836103E-01
(001,007): Reduction in the gradient norm, Greduc = 1.7527041E-01
(001,008): Reduction in the gradient norm, Greduc = 1.2164608E-01
What concerns me, though, is that I get completely different values for the initial gradient norm (in both of the above) when compared to a revision from 2022. Also, the TLcost_function output makes no sense in my own experiment, because the first value is computed differently from the subsequent values, and has vastly different values. May it's a problem in my setup and not a bug, but I think this one here is unrelated to the other problems...
[1] Fisher, M., 1998: Minimization algorithms for variational data !
data assimilation. In Recent Developments in Numerical !
Methods for Atmospheric Modelling, pp 364-385, ECMWF. !
diff:
Code: Select all
--- a/ROMS/Utility/cgradient.F
+++ b/ROMS/Utility/cgradient.F
@@ -992,6 +992,7 @@
cg_zu(i,outLoop)=(-cg_QG(i,outLoop)- &
& cg_beta(i,outLoop)*cg_zu(i-1,outLoop))/zbeta
END DO
+ END IF
!
! Back substitution.
!
@@ -1054,7 +1055,6 @@
& ad_ubar, ad_vbar, &
# endif
& ad_zeta)
- END IF
!
! Compute the new cost function.
!