Next: subspace iteration and the
Up: Convergence properties of the
Previous: Lanczos-Ritz values in a
  Contents
  Index
subspace iteration inside of the orthogonal similarity reduction of a
matrix into semiseparable form
In Section 5.2.1 we showed that the intermediate
semiseparable matrices have as eigenvalues the Lanczos-Ritz values.
This behavior is completely the same as the one observed in an
orthogonal similarity transformation to reduce a symmetric matrix into
a similar tridiagonal one. One might therefore doubt the usefulness of
this reduction, because it costs
more, and has not yet any
advantage over the tridiagonal approach. In this section we will
prove that the extra cost of
provides an extra interesting property for this
reduction.
At each step of the algorithm introduced in Theorem 49,
one more row (column) is added by means of orthogonal transformations
to the set of the rows (columns) of the matrix already proportional to
each other. In this section, using arguments similar to those
considered in [183,184,185,187,195], we show that this algorithm can be
interpreted as a kind of nested subspace iteration method [91],
where the size of the vector subspace is increased by one and a change
of coordinate system is made at each step of the algorithm. As a
consequence, depending on the gaps between the eigenvalues, the
semiseparable part of the matrix will converge to a block diagonal
matrix, and the eigenvalues of these blocks converge to the largest
eigenvalues in absolute value of the original symmetric matrix.
Given a matrix
and an initial subspace
the subspace iteration method [91]
can be described as follows
Under weak assumptions
on
and
the
converge to an invariant subspace (more details on these assumptions
will be investigated in the next subsection, because the subspace iteration will interact with the Lanczos convergence behavior).
We will see that the reduction algorithm from a symmetric to a semiseparable matrix can be
interpreted as such a kind of subspace iteration, where the dimension of the subspace grows by one at each step of the algorithm.
Let
. For the reduction algorithm as presented in
Section 4.1 in Chapter 4 we have
. Suppose we have only performed the first orthogonal
similarity transformations such that the rows (columns)
and
are already proportional (Let us capture all the necessary
Givens and Householder transformations to go from matrix
to
matrix
in one orthogonal matrix
):
 |
(5.3) |
where
has the semiseparable structure in the rows (columns)
and
and
From (5.3), we can write
Let
be the standard basis vectors of
.
From (5.4), because
of the structure of
, it can clearly be seen that:
This means that
the last column of
and
span the same one-dimensional space.
In fact one subspace iteration step is performed on
the vector
.
The first step of the algorithm is completed when the following
orthogonal transformation is performed,
The latter transformation can be
interpreted as a change of coordinate system:
and
represent the same linear transformation with respect to
different coordinate systems. Let
. Then
is
represented in the new system by
. This means that for the
vector
we get
. Summarizing, this
means that one step of subspace iteration on the subspace
is
performed, resulting in a new subspace
, and then, by
means of a coordinate transformation, it is transformed back into the
subspace
. So, instead of working with a fixed matrix and
changing subspaces, we work with fixed subspaces and changing
matrices. Therefore, denoting by
the eigenvector
corresponding to the largest eigenvalue in absolute value
of
, we can say that, if
has
a nonzero last component and if the largest eigenvalue is unique, the
sequence
converges to
, and,
consequently, the lower right element of
converges to
. Note that the last assumption of the nonzero component,
will play an important role in the next section, where the interaction
between the Lanczos behavior and the subspace iteration is investigated.
The second step can be interpreted in a completely analogous way.
Suppose we have already the semiseparable structure in the last two
rows (columns). Then we perform the following similarity
transformation on
,
 |
(5.5) |
in order to make the rows (columns)
up to
dependent.
Using (5.5),
can be written as follows,
Considering the subspace
and using the same notation as above, we have,
This means
that the second step of the algorithm is a step of subspace iteration
performed on a larger subspace. For every new dependency that
is created in the symmetric matrix
, the dimension of the
subspace is increased by one.
One can also see the algorithm as performing in each iteration step
a
step without shift on the semiseparable bottom-right submatrix.
Let us partition the matrix
(as in the proof of Theorem 49) as follows:
with
,
,
and
. From the semiseparable structure, we know that
has rank one.
In the first part of the
th iteration step, an
orthogonal matrix
is constructed such that
with
.
Hence, our matrix
is transformed into the following orthogonally similar matrix
Let us partition the matrix in a slightly different way, adding one more column and row to the semiseparable part.
Let
then we can define the semiseparable matrix
as follows
In the second part of the
th iteration step, we perform a
step on
the semiseparable matrix
of order
, i.e.,
we construct an orthogonal matrix
such that
with
lower triangular. Hence, the matrix of (5.6)
is transformed into the orthogonal similar matrix
The execution of the two steps is illustrated in Figure 5.1.
At the beginning of the iteration the last four rows and columns
have already a semiseparable structure. In (1), a similarity transformation
(either an Householder matrix or a product of Givens transformations) is considered
in order to annihilate the entries indicated by
.
Due to the semiseparable structure,
all the entries in the first four rows (colums) with column (row) indexes greater than 5
are annihilated, too. In steps (2), (3), (4) and (5), the Givens rotations are only applied to
the left, transforming the
principal submatrix in the right-bottom corner
to lower triangular form. To retrieve the semiseparable structure in the last 5 rows and columns, the transpose of the latter Givens rotation
must be applied to the right.
Figure 5.1:
Description of
step without shift applied in each iteration
of the semiseparable reduction.
 |
This means that from step
all the consecutive steps
perform subspace iterations on the subspace of dimension
.
From
[187], we know that these consecutive
iterations on subspaces tend to create block lower triangular
matrices. Hence, for a symmetric matrix these are block diagonal. Furthermore,
the process works for all the nested subspaces at the same time, and
so the semiseparable part of the matrices
generated by
the proposed algorithm, becomes more and more block diagonal, and the
blocks contain eigenvalues of the original matrix. This
explains why in the numerical examples (see Chapter 6), the lower right block already gives a good estimate of
the largest eigenvalues, since they are connected to a subspace on
which the subspace iteration is performed most.
This insight also opens several new perspectives. In many problems,
only the few largest eigenvalues (in absolute value) need to be
computed [22,114,181,182,192,193].
In such cases, the
proposed algorithm gives the required information after only few
steps, without running the algorithm to completion. Moreover,
because the sequence of similar matrices generated at each step of
the algorithm converges to a block diagonal matrix, the original
problem can be divided into smaller independent subproblems. A direct
application of this behavior can be found in the field of optical
flow. In this field one wants to reduce the complexity of storing
movies by not considering a movie as a sequence of images, but as
pictures in which the pixels are moving. In this way one can
associate a vector field to each pixel. Pixels which are moving have
the largest component in the vector field. This means that moving
objects in the image correspond to separate subspaces [112].
We know that at each step of the algorithm a step of
is performed on the semiseparable part of the matrix.
This insight opens the perspective to a new research topic. Is
it possible to incorporate a shift in this reduction, thereby
performing a
-step with shift on the already semiseparable part.
This will however destroy the semiseparable structure, but it will
reduce the matrix into a similar semiseparable plus diagonal matrix.
Moreover it is possible to extract a complete diagonal matrix,
not only a shift, from the semiseparable part and then perform the
-step. In the
following part we will prove that a
-step applied on a
semiseparable plus diagonal matrix is again a semiseparable plus
diagonal matrix, for which the diagonal did not change. This last
statement justifies our previous statements.
We finish this section with a theorem from [187] concerning
the speed of convergence of subspace iterations.
Definition 56 (From [][p. 8)
n890]
Denote with

and

two subspaces, then the
distance

between these two subspaces is
defined in the following way:
Using this definition, we can state the following convergence theorem:
Theorem 57 (From [][Theorem 5.4)
n890]
Let

, and let

be a polynomial of degree

. Let

denote the eigenvalues
of

, ordered so that

. Suppose

is a positive integer less
than

for which

. Let

be a sequence of polynomials of degree

such that

as

and

for

and all

. Let

. Let

and

be the invariant subspaces of

associated with

and

respectively. Consider the nonstationary subspace iteration
where

is a

-dimensional
subspace of

satisfying

.
Then for every

satisfying

there exists
a constant

such that
In our case Theorem 57 can be applied with the polynomials
.
We translate the previous theorem to the matrix case, for which we
apply the
-algorithm without shift to the matrix
. With
and
. We have that for
there exists a
, from Theorem 57, such that:
Next: subspace iteration and the
Up: Convergence properties of the
Previous: Lanczos-Ritz values in a
  Contents
  Index
Raf Vandebril
2004-05-03