Why can't Householder reflections diagonalize a matrix?

16

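When computing QR factorizations in practice, one uses Householder reflections to zero out the lower part of a matrix. When computing the eigenvalues of a symmetric matrix, the best you can do with Householder reflections is to reduce it to tridiagonal form. Is there an obvious way to see why it cannot be fully diagonalized this way? I am trying to explain this simply, but I cannot come up with a clear presentation.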

linear-algebra matrix

— Victor Liu

12

When computing the eigenvalues of a symmetric matrix $M\in\mathbb{R}^{n\times n}$, the best you can do with Householder reflectors is drive $M$ to tridiagonal form. As was mentioned in a previous answer, because $M$ is symmetric there is an orthogonal similarity transformation that results in a diagonal matrix, i.e., $D=S^TMS$. It would be convenient if we could find the action of the unknown orthogonal matrix $S$ strictly using Householder reflectors, by computing a sequence of reflectors and applying $H^T$ to $M$ from the left and $H$ to $M$ from the right. However, this is not possible because of the way the Householder reflector is designed to zero out columns. If we compute the Householder reflector that zeros out all the entries below $M_{11}$, we find

$M=\left(\!\!{\begin{array}{ccccc} * &* & * & *&* \\ * &* & * & *&* \\ * &* & * & *&* \\ * &* & * & *&* \\ * &* & * & *&* \\ \end{array}}\!\!\right)\rightarrow H^T_1M=\left(\!\!{\begin{array}{ccccc} * &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ \end{array}}\!\!\right).$

But now the entries $M_{12}$ through $M_{1n}$ have been altered by the reflector $H^T_1$ applied on the left. Thus, when we apply $H_1$ on the right, it no longer zeros out the first row of $M$, leaving only $M_{11}$. Instead we obtain

$H^T_1M=\left(\!\!{\begin{array}{ccccc} * &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ \end{array}}\!\!\right)\rightarrow H^T_1MH_1=\left(\!\!{\begin{array}{ccccc} * &*'' & *'' & *''&*'' \\ *' &*'' & *'' & *''&*'' \\ *' &*'' & *'' & *''&*'' \\ *' &*'' & *'' & *''&*'' \\ *' &*'' & *'' & *''&*'' \\ \end{array}}\!\!\right)$

where not only did we fail to zero out the row, but we may also destroy the zero structure we just introduced with the reflector $H^T_1$.

However, when you opt to drive $M$ to a tridiagonal structure, you leave the first row untouched by the action of $H^T_1$, so

$M=\left(\!\!{\begin{array}{ccccc} * &* & * & *&* \\ * &* & * & *&* \\ * &* & * & *&* \\ * &* & * & *&* \\ * &* & * & *&* \\ \end{array}}\!\!\right)\rightarrow H^T_1M=\left(\!\!{\begin{array}{ccccc} * &* & * & *&* \\ *' &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ \end{array}}\!\!\right).$

Thus when we apply the same reflector from the right we obtain

$H^T_1M=\left(\!\!{\begin{array}{ccccc} * &* & * & *&* \\ *' &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ 0 &*' & *' & *'&*' \\ \end{array}}\!\!\right)\rightarrow H^T_1MH_1=\left(\!\!{\begin{array}{ccccc} * &*' & 0 & 0&0 \\ *' &*'' & *'' & *''&*'' \\ 0 &*'' & *'' & *''&*'' \\ 0 &*'' & *'' & *''&*'' \\ 0 &*'' & *'' & *''&*'' \\ \end{array}}\!\!\right).$

Since $M$ is symmetric and applying $H_1$ from the right leaves the first column untouched, the first row is zeroed out in the same way the first column was.

Applied recursively, this allows us to drive $M$ to a tridiagonal matrix $T$. You can complete the diagonalization of $M$ efficiently, as was mentioned previously, using Jacobi or Givens rotations, both of which are found in the Golub and Van Loan book Matrix Computations. The accumulated actions of the sequence of Householder reflectors and Jacobi or Givens rotations allow us to find the action of the orthogonal matrices $S^T$ and $S$ without explicitly forming them.
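The two attempts above can be checked numerically. The following is only a small NumPy sketch under my own assumptions: the helper name `householder`, the random $5\times 5$ test matrix, and the seed are illustrative choices, not something taken from Golub and Van Loan.

```python
import numpy as np

def householder(x):
    """Symmetric orthogonal H = I - 2 v v^T / (v^T v) mapping x to a multiple of e_1.

    Hypothetical helper for illustration; assumes x is not the zero vector.
    """
    v = np.asarray(x, dtype=float).copy()
    # Add sign(x[0]) * ||x|| to the first entry to avoid cancellation.
    v[0] += np.copysign(np.linalg.norm(v), v[0])
    return np.eye(len(v)) - 2.0 * np.outer(v, v) / (v @ v)

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
M = A + A.T  # random symmetric test matrix

# Attempt 1: reflector built from the whole first column (zeros everything below M[0, 0]).
H_full = householder(M[:, 0])
left_full = H_full @ M            # first column is now (*, 0, 0, 0, 0)
both_full = H_full @ M @ H_full   # applying H from the right fills the zeros back in

# Attempt 2: reflector built from the part of the first column below the diagonal,
# embedded so that it leaves the first row and column index alone.
H_tri = np.eye(5)
H_tri[1:, 1:] = householder(M[1:, 0])
both_tri = H_tri @ M @ H_tri      # first row and column are now (*, *', 0, 0, 0)

print("attempt 1, first column after left multiply:", np.round(left_full[:, 0], 12))
print("attempt 1, first column after similarity:   ", np.round(both_full[:, 0], 12))
print("attempt 2, first column after similarity:   ", np.round(both_tri[:, 0], 12))
```

Both similarity transformations preserve the eigenvalues of `M`; only the second one keeps the zeros it introduces.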

— Andrew Winters

11

As the Comments to other Answers clarify, the real issue here is not a shortcoming of Householder matrices but rather a question as to why iterative rather than direct ("closed-form") methods are used to diagonalize (real) symmetric matrices (via orthogonal similarity).

Indeed any orthogonal matrix can be expressed as a product of Householder matrices, so if we knew the diagonal form of a symmetric matrix (its eigenvalues), we could solve for a complete set of orthonormalized eigenvectors and represent the corresponding change of basis matrix as a product of Householder transformations in polynomial time.

So let's turn to Victor's parenthetical comment "other than Abel's theorem", because we are effectively asking why iterative methods should be used to find the roots of a polynomial rather than a direct method. Of course the eigenvalues of a real symmetric matrix are the roots of its characteristic polynomial, and it is possible to go in the other direction as well. Given a real polynomial with only real roots, it is possible to construct a tridiagonal symmetric companion matrix from a Sturm sequence for the polynomial. See also that poster Denis Serre's Exercise 92 in this set. This is rather nice for showing the equivalence of those problems, since we've seen (@AndrewWinters) that the direct application of Householder matrices will tridiagonalize a real symmetric matrix.
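To make the easy direction of that equivalence concrete (matrix to polynomial), here is a rough sketch. For a symmetric tridiagonal matrix with diagonal $a_1,\dots,a_n$ and off-diagonal $b_1,\dots,b_{n-1}$, the characteristic polynomials of the leading principal submatrices satisfy the three-term recurrence $p_k(x) = (x-a_k)\,p_{k-1}(x) - b_{k-1}^2\,p_{k-2}(x)$ with $p_0 = 1$, and they form a Sturm sequence when all $b_k \neq 0$. The function name and the sample entries below are my own assumptions; this is not the Fiedler/Schmeisser construction for the reverse direction.

```python
import numpy as np
from numpy.polynomial import Polynomial

def tridiagonal_charpoly(a, b):
    """Characteristic polynomial of the symmetric tridiagonal matrix with
    diagonal a and off-diagonal b, via the three-term recurrence."""
    x = Polynomial([0.0, 1.0])
    p_prev, p = Polynomial([1.0]), x - float(a[0])
    for k in range(1, len(a)):
        p_prev, p = p, (x - float(a[k])) * p - p_prev * float(b[k - 1]) ** 2
    return p

# Illustrative entries (assumed, not from the answer).
a = [2.0, -1.0, 0.5, 3.0]
b = [1.0, 0.5, 2.0]
T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)

p = tridiagonal_charpoly(a, b)
roots = np.sort(np.real(p.roots()))
eigs = np.linalg.eigvalsh(T)
print("roots of the characteristic polynomial:", roots)
print("eigenvalues of T:                      ", eigs)
```

Note that `Polynomial.roots` itself computes eigenvalues of a companion matrix, which is the same equivalence seen from the other side.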

Analysis of the arithmetic complexity for an iterative (root isolation) method is given in Reif (1999), An Efficient Algorithm for the Real Root and Symmetric Tridiagonal Eigenvalue Problems. Reif's approach improves on tailored versions of QR for companion matrices, giving $O(n \log^3 n)$ instead of $O(n^2)$ complexity.

The Abel-Galois-Ruffini Theorem says that no general formula for roots of polynomials above degree four can be given in terms of radicals (and usual arithmetic). However there are closed forms for roots in terms of more exotic operations. In principle one might base eigenvalue/diagonalization methods on such approaches, but one encounters some practical difficulties:

The Bring radical (aka ultraradical) is a function of one variable, in that respect like taking a square root. Jerrard (c. 1835) showed that solving the general quintic could be reduced to solving $t^5 + t - a = 0$, so that univariate function $t(a)$ (used in addition to radicals and other usual arithmetic) allows all quintics to be solved.
This breaks down with degree six polynomials and above, although various ways can be found to solve them using functions of just two variables. Hilbert's 13th Problem was the conjecture that general degree seven polynomials could not be solved using only functions of at most two variables, but in 1957 V.I. Arnold showed they could. Among the multivariable function families that can be used to get solutions to arbitrary degree polynomials are Mellin integrals, hypergeometric functions, and Siegel theta functions.
Besides implementing somewhat exotic special functions of more than one argument, we need direct methods for solving polynomials which work for general degree $n$ rather than ad hoc or degree-specific methods. Guàrdia (2002) gives "a very simple expression of the roots of a polynomial of arbitrary degree in terms of derivatives of hyperelliptic theta functions." However, this approach requires making choices of Weierstrass points on the hyperelliptic curve $C_f: Y^2 = f(x)$, where all roots of the polynomial $f(x)$ are sought. A good choice leads to expressing less than half of those roots, and it appears this approach requires repeated trials to get all of them. Each trial involves solving a homogeneous linear system at $O(n^3)$ cost.

Therefore the indirect/iterative methods for isolating real roots (equiv. eigenvalues of symmetric matrices), even to high precision, currently have practical advantages over the known direct/exact methods for these problems.
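For a feel of what an iterative root isolation method looks like on a symmetric tridiagonal matrix, here is a textbook-style bisection sketch based on Sturm counts. It is not Reif's algorithm, and the names `count_below` and `kth_eigenvalue`, the sample entries, and the fixed iteration count are all my own assumptions. The count uses the pivots of $T - xI = LDL^T$: by Sylvester's law of inertia, the number of negative pivots equals the number of eigenvalues below $x$.

```python
import numpy as np

def count_below(a, b, x):
    """Number of eigenvalues of the symmetric tridiagonal matrix (diagonal a,
    off-diagonal b) that are less than x, counted via the signs of the LDL^T pivots."""
    count = 0
    d = 1.0
    for k in range(len(a)):
        off = b[k - 1] ** 2 if k > 0 else 0.0
        d = (a[k] - x) - off / d
        if d == 0.0:
            d = -1e-300  # nudge an exact zero pivot so the recurrence can continue
        if d < 0.0:
            count += 1
    return count

def kth_eigenvalue(a, b, k, iterations=60):
    """Bisection for the k-th smallest eigenvalue (k = 0, 1, ...), starting
    from the Gershgorin interval that contains the whole spectrum."""
    n = len(a)
    radius = [(abs(b[i - 1]) if i > 0 else 0.0) + (abs(b[i]) if i < n - 1 else 0.0)
              for i in range(n)]
    lo = min(a[i] - radius[i] for i in range(n))
    hi = max(a[i] + radius[i] for i in range(n))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(a, b, mid) > k:
            hi = mid  # at least k + 1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Illustrative entries (assumed).
a = [2.0, -1.0, 0.5, 3.0]
b = [1.0, 0.5, 2.0]
T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)

bisected = np.array([kth_eigenvalue(a, b, k) for k in range(len(a))])
print("bisection:", bisected)
print("eigvalsh: ", np.linalg.eigvalsh(T))
```

Each bisection step costs $O(n)$ arithmetic operations and halves the bracketing interval, so the precision is chosen by the caller rather than fixed by a closed form.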

— hardmath

Some notes: 1. a practical method for building the tridiagonal companion matrix from Sturm sequences was outlined in papers by Fiedler and Schmeisser; I gave a Mathematica implementation here, and it should not be too hard to implement in a more traditional language.

— J. M.

2. With respect to the "theta function" approach for polynomial roots (which I agree is a bit too unwieldy for practical use), Umemura outlines an approach using Riemann theta functions.

— J. M.

2

For what reason do you assume that this is impossible?

Any symmetric real matrix $S$ can be orthogonally diagonalized, i.e. $S = G D G^t$, where $G$ is orthogonal and $D$ is diagonal.

"Any orthogonal matrix of size $n \times n$ can be constructed as a product of at most $n$ such reflections" (Wikipedia, referring to Householder reflections). Therefore you have this decomposition.

I am not sure about the last statement, I just cite it (and I think it is correct). As far as I understand your question, it boils down to whether any orthogonal matrix can be decomposed into a sequence of Householder transforms.
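For what it is worth, the cited statement can be checked numerically. Below is a rough NumPy sketch under my own assumptions (the function name `householder_factors`, the tolerance, and the random test matrix are illustrative): it reduces an orthogonal $G$ to the identity one column at a time, skipping a reflection whenever the column already equals the corresponding unit vector.

```python
import numpy as np

def householder_factors(G, tol=1e-12):
    """Unit vectors v_1, ..., v_m (m <= n) with G = H_1 H_2 ... H_m,
    where H_i = I - 2 v_i v_i^T. Assumes G is orthogonal."""
    n = G.shape[0]
    A = np.asarray(G, dtype=float).copy()
    vectors = []
    for k in range(n):
        # Reflector that maps column k of A (zero above row k) onto e_k.
        v = np.zeros(n)
        v[k:] = A[k:, k]
        v[k] -= 1.0
        norm = np.linalg.norm(v)
        if norm < tol:
            continue  # column already equals e_k, no reflection needed
        v /= norm
        A -= 2.0 * np.outer(v, v @ A)  # A <- H A
        vectors.append(v)
    return vectors

rng = np.random.default_rng(1)
G = np.linalg.qr(rng.standard_normal((5, 5)))[0]  # random orthogonal test matrix

vectors = householder_factors(G)
product = np.eye(5)
for v in vectors:
    product = product @ (np.eye(5) - 2.0 * np.outer(v, v))

print("number of reflections:", len(vectors))
print("max reconstruction error:", np.abs(product - G).max())
```

Of course this only factors a $G$ that is already known; it does not say how to find $G$ without first solving the eigenvalue problem.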

— shuhalo

2

I should have been more specific. The first step to diagonalizing a symmetric matrix is applying Householder until it is tridiagonal. Next, QR iterations are performed. This process cannot be completed using only closed-form Householder transformations. Why? (other than Abel's theorem)

— Victor Liu

1

You can do it with Jacobi rotations. Golub and Van Loan write that Jacobi is the same as Givens. Householder is just another way of doing Givens. In practice, the "correct" way might be with QR if it is faster.

— power

1

If the eigenvalues are already known (from a preliminary calculation based on the usual approach), one can use them to triangularize a nonsymmetric matrix (or diagonalize a symmetric matrix) by a product of $n-1$ Householder reflections. In the $k$th step the $k$th column is brought to triangular form. (This also provides a simple inductive proof of the existence of the Schur factorization.)

It is actually useful for methods where one repeatedly needs the orthogonal matrix in a numerically stable factored form.
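A minimal sketch of this idea for the symmetric case, under assumptions of my own: the function name is hypothetical, the eigenvector of each trailing block is obtained as a null vector of $B - \lambda_k I$ through `np.linalg.svd` (itself an iterative routine, used here only for convenience), and the test matrix is built with well-separated eigenvalues $1,\dots,5$ so that they are known by construction.

```python
import numpy as np

def diagonalize_with_known_eigenvalues(M, eigenvalues):
    """Diagonalize symmetric M by n - 1 Householder reflections, given its eigenvalues.

    Returns (Q, D) with Q = H_1 ... H_{n-1} and D = Q^T M Q approximately diagonal.
    """
    n = M.shape[0]
    A = np.asarray(M, dtype=float).copy()
    Q = np.eye(n)
    for k in range(n - 1):
        B = A[k:, k:]
        # Unit eigenvector of the trailing block: the right singular vector
        # belonging to the smallest singular value of B - lambda_k I.
        u = np.linalg.svd(B - eigenvalues[k] * np.eye(n - k))[2][-1]
        if u[0] > 0:
            u = -u  # sign choice avoids cancellation in u - e_1
        v = np.zeros(n)
        v[k:] = u
        v[k] -= 1.0
        v /= np.linalg.norm(v)
        H = np.eye(n) - 2.0 * np.outer(v, v)  # swaps e_k and the eigenvector
        A = H @ A @ H                          # column k becomes lambda_k * e_k
        Q = Q @ H
    return Q, A

rng = np.random.default_rng(2)
G = np.linalg.qr(rng.standard_normal((5, 5)))[0]
known = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
M = G @ np.diag(known) @ G.T  # symmetric matrix with eigenvalues known by construction

Q, D = diagonalize_with_known_eigenvalues(M, known)
print(np.round(D, 10))
```

After step $k$ the trailing block has the remaining eigenvalues $\lambda_{k+1},\dots,\lambda_n$, which is why the eigenvalues are consumed in order.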

— Arnold Neumaier