next up previous contents index
Next: Convergence properties of the Up: Convergence properties of the Previous: subspace iteration in the   Contents   Index

The interaction between the subspace iteration and the Lanczos-Ritz values

Taking a look at the numerical examples in Chapter 6, sometimes it seems that there is no sign of the convergence of the subspace iteration. The proofs stated above are correct, but there is an interaction with the Lanczos behavior of the algorithm, which will sometimes lead to a slower subspace iteration convergence than expected. This can only happen when, all the vectors $ \langle e_{n-k},e_{n-k+1},\ldots,e_n\rangle $ have a small component in one or more directions of the eigenspace connected to the dominant eigenvalues. Before we show by examples, that this happens in practice, we first give a condition under which we are sure that the subspace iteration lets the matrix converge to one having a diagonal block containing the dominant eigenvalues. As soon as some of the Ritz values approximate all of the dominant eigenvalues, this convergence behavior appears. To show this we assume that the initial matrix $ A$ has two clusters of eigenvalues, $ \Lambda_1=\{ \lambda_{1,j} \vert j\in
J\}$ and $ \Lambda_{2}=\{ \lambda_{2,i} \vert i\in I\}$, with $ \char93 J = n_1$ and $ \char93 I=n_2$. Suppose also that there is a gap between cluster $ 2$ and cluster $ 1$, $ \min_{i \in I } \vert\lambda_{2,i}\vert \ggg
\max_{j \in J} \vert\lambda_{1,i}\vert $. Using the following notations $ \Delta_1= \diag(\Lambda_1)$ and $ \Delta_2=\diag(\Lambda_2)$, we can write the matrix $ A^{(k)}$ in the following way:

% latex2html id marker 34876
$\displaystyle \left( \begin{array}{cc} A_k & \time...
...}{cc}\ V_{11}^T & V_{21}^T \\  [0.1cm] V_{12}^T & V_{22}^T \end{array} \right),$ (5.7)

with $ S_k \in \mathbb{R}^{k \times k}$, $ V_{11}\in \mathbb{R}^{(n-k)\times n_1}$, $ V_{12} \in \mathbb{R}^{(n-k) \times n_2}$, $ V_{21}\in \mathbb{R}^{k\times n_1}$ and $ V_{22}\in \mathbb{R}^{k\times n_2}$. From equation (5.7), we have the following equation for $ S_k$
$\displaystyle S_k$ $\displaystyle =$ $\displaystyle V_{21} \Delta_1 V_{21}^T + V_{22} \Delta_2 V_{22}^T.$  

Let the eigenvalue decomposition of $ S_k$ be denoted as:
$\displaystyle S_k$ $\displaystyle =$ $\displaystyle \left( \begin{array}{cc} V^{S_k}_1 & V^{S_k}_2
\end{array} \right...
\left( \begin{array}{c}
{V^{S_k}_1}^T \\
\end{array} \right)$ (5.8)

where we assume that some of the eigenvalues of $ S_k$ (in fact the Ritz values) approximate already the dominant eigenvalues $ \Lambda_2$. Let us denote $ \Delta^{S_k}_2=\diag \left( \Lambda^{S_k}_2 \right)$ with $ \Lambda^{S_k}_2$ a set of eigenvalues $ \Lambda^{S_k}_2=\{\lambda^{S_k}_{2,i}\vert \forall i\in I\}$, then we assume that

$\displaystyle \lambda_{2,i}^{S_k} \approx \lambda_{2,i} \quad \forall i \in I.$    

subspace iteration needs weak conditions before convergence can occur: $ V_{22}$ has full rank and is far from being not of full rank. This corresponds to the demand that the last vectors $ \{ e_{n-k+1},\ldots,e_n\}$ projected on the invariant subspace connected to the dominant eigenvalues $ \lambda_{2,i}$ have a large component in every direction of this subspace.

Via an Householder transformation we can transform equation (5.7) into

$\displaystyle \left( \begin{array}{c\vert c}
\tilde{A} &
...0.3cm} \begin{array}{cc} 0 & \tilde{v}_k \end{array}}
& S_k \end{array} \right)$ $\displaystyle =$ $\displaystyle \left( \begin{array}{cc}
\tilde{V}_{11} & \tilde{V}_{12} \\
...}_{11}^T & V_{21}^T \\ [0.1cm]
\tilde{V}_{12}^T & V_{22}^T \end{array} \right).$  

Then we can substitute $ S_k$ by its eigenvalue decomposition (5.8):

\begin{displaymath}\left( \begin{array}{c\vert c}
\tilde{A} & {\rule[-0.4cm]{0cm...
...} & 0 \\
0 & \Delta^{S_k}_{2}
\end{array} \right)\end{displaymath} $\displaystyle =$ $\displaystyle \left( \begin{array}{cc}
\tilde{V}_{11} & \tilde{V}_{12} \\
...e{V}_{21}^T \\ [0.1cm]
\tilde{V}_{12}^T & \tilde{V}_{22}^T \end{array} \right).$  

Note that $ V_{22}$ is of full rank, if and only if $ \tilde{V}_{22}$ is of full rank. We get

% latex2html id marker 34944
$\displaystyle \Delta^{S_k}_{2} = \left( \begin{arr...
...y}{c} \tilde{V}_{12}^T P^T\\  [0.1cm] \tilde{V}_{22}^T P^T \end{array} \right),$    

where $ P$ is the projection matrix $ \left( 0\; I \right)$ of size $ n^{S_k}_{2}\times n$ where $ I$ is the identity matrix of size $ n^{S_k}_2\times n^{S_k}_2$, with $ n_2^{S_k}$ the dimension of $ \Delta^{S_k}_{2}$. This means that:

$\displaystyle \Delta^{S_k}_{2} - P \tilde{V}_{12} \Delta_1 \tilde{V}_{12}^T P^T = P \tilde{V}_{22} \Delta_2 \tilde{V}_{22}^T P^T.$ (5.9)

Note that $ P \tilde{V}_{22} $ is a square matrix.

Because there is a gap between the eigenvalues $ \lambda_{1,j}, j \in
J$ and $ \lambda_{2,i}, i \in I$, there will be a comparable gap between $ \lambda^{S_k}_{2,j}, j \in J$ and $ \lambda_{1,i}, i \in I$. Hence the matrix at the left-hand side of equation (5.9) is far from singular. Therefore this is also true for the right-hand side, and $ \tilde{V}_{22}$ is of full rank. Hence, $ V_{22}$ has full rank.

In this paragraph, a brief explanation is given, why in certain situations the subspace iteration does not work immediately from the start. Suppose the size of the block $ \Delta_2$ is $ 2$. We can write the matrix $ A^{(1)}=Q_0^TAQ_0$ in decomposed form:

% latex2html id marker 34982
$\displaystyle A^{(1)}= \left( \begin{array}{cc} A_...
...y}{cc} V_{11}^T & V_{21}^T \\  [0.1cm] V_{12}^T & V_{22}^T \end{array} \right).$ (5.10)

Note that $ \alpha$ equals $ S_1$. Applying already the Householder transformation on the matrix $ A$ gives us the following decomposition:
    $\displaystyle \left( \begin{array}{cc}
\tilde{H} & 0 \\ [0.1cm]
q^T & 0 \\ [0.1...
\tilde{H}^T & q & 0 \\ [0.1cm]
0 & 0 & 1
\end{array} \right)$ (5.11)
  $\displaystyle =$ $\displaystyle \left( \begin{array}{ccc}
\tilde{A} & a^T & 0 \\
a & \gamma & \beta \\ [0.1cm]
0 & \beta & \alpha
\end{array} \right) = \tilde{A}^{(1)}.$  

The Ritz values are the eigenvalues of the two by two matrix:

% latex2html id marker 34997
$\displaystyle \left( \begin{array}{cc} \gamma & \beta \\  \beta & \alpha \end{array} \right).$    

In fact, to complete the previous step a Givens transformation should still be performed on the matrix $ \tilde{A}^{(1)}$, to make the last two rows and columns dependent, and thereby transforming the matrix into $ A^{(2)}$. Note that this last Givens transformation does not change the eigenvalues of the bottom right submatrix.

We show now that the weak assumptions to let the subspace iteration converge, are not satisfied. We prove that

$\displaystyle \left( e_{n-1}\;\; e_n \right)$    

does not have two large linearly independent components in the span of

% latex2html id marker 35005
$\displaystyle \left( \begin{array}{cc} \tilde{H} &...
...d{array} \right) \left( \begin{array}{c} V_{12} \\  V_{22} \end{array} \right).$    

We can write $ \left( e_{n-1}\;\; e_n \right)$ as a linear combination of the eigenvectors:

% latex2html id marker 35009
$\displaystyle \left( e_{n-1}\;\; e_n \right) = \le...
...2} \end{array} \right) \left( \begin{array}{c} C_1 \\  C_2 \end{array} \right).$    

The coordinates $ C_2$ of $ \left( e_{n-1}, e_n \right)$ with respect to the dominant eigenvectors, are the following:
$\displaystyle C_2$ $\displaystyle =$ $\displaystyle \left( \begin{array}{cc}
V_{12}^T & V_{22}^T
\end{array} \right)
\end{array} \right)
e_{n-1} & e_n
\end{array} \right)$  
  $\displaystyle =$ $\displaystyle \left( \begin{array}{cc}
V_{12}^T & V_{22}^T
\end{array} \right)
...} \right)
= \left( \begin{array}{cc}
V_{12}^T q & V_{22}^T
\end{array} \right).$  

Using equation (5.10) and (5.11), we get that

% latex2html id marker 35026
$\displaystyle \left( \begin{array}{cc} V_{11}^T & ...
...ight)\left( \begin{array}{c} V_{21}^T \\  [0.1cm] V_{22}^T \end{array} \right).$    


$\displaystyle V_{12}^T v_n + V_{22}^T \alpha = \Delta_2 V_{22}^T.$    

Because of the Householder transformation we know that $ q=v_n/\Vert v_n \Vert$, hence

$\displaystyle V_{12}^Tq=\left( \Delta_2 - \alpha I \right) \frac{V_{22}^T}{\Vert v_n\Vert}.$    

$\displaystyle C_2$ $\displaystyle =$ $\displaystyle \left(\begin{array}{cc} V_{12}^T q & V_{22}^T \end{array}\right)$  
  $\displaystyle =$ $\displaystyle \left(\begin{array}{cc} \left( \Delta_2 - {\alpha} I \right) \frac{V_{22}^T}{\Vert v_n\Vert} & V_{22}^T \end{array}\right).$  

This means that if $ \Delta_2 V_{22}^T$ lies almost in the same direction as $ V_{22}^T$ then $ C_2$ is almost singular. If the two largest eigenvalues $ \lambda_{2,1} \approx \lambda_{2,2}$ this is the case. Note however that when $ \lambda_{2,1} \approx -\lambda_{2,2}$, the subspace iteration will work immediately, because they are both extreme and immediately located by the Lanczos procedure.

Briefly spoken we have the following interaction between the Lanczos-Ritz value behavior, and the subspace iteration: the subspace iteration, will start converging as soon as the Lanczos-Ritz values have converged close enough to the dominant eigenvalues.

next up previous contents index
Next: Convergence properties of the Up: Convergence properties of the Previous: subspace iteration in the   Contents   Index
Raf Vandebril 2004-05-03