I needed an algorithm that has
- little branching (hopefully CMOVs)
- no trigonometric function calls
- high numerical accuracy even with 32-bit floats
We want to calculate c1, s1, c2, s2, σ1 and σ2 as follows:

A = USV, which can be expanded like:

    [ a  b ]   [ c1   s1 ] [ σ1  0  ] [ c2  -s2 ]
    [ c  d ] = [ -s1  c1 ] [ 0   σ2 ] [ s2   c2 ]
The main idea is to find a rotation matrix V that diagonalizes A^T A, that is, V(A^T A)V^T = D is diagonal.
Recall that

    USV = A
    US = AV^-1 = AV^T                                  (since V is orthogonal)
    V(A^T A)V^T = (AV^T)^T (AV^T) = (US)^T (US) = S^T U^T U S = D
Multiplying on the left by S^-T and on the right by S^-1 we get

    (S^-T S^T) U^T U (S S^-1) = U^T U = S^-T D S^-1

Since D is diagonal, setting S to √D gives S^-T D S^-1 = D^(-1/2) D D^(-1/2) = I, and therefore U^T U = I. This means U is a rotation matrix, S is a diagonal matrix, V is a rotation matrix and USV = A, just what we are looking for.
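As a quick sanity check of the derivation (a throwaway numeric sketch over an arbitrary sample matrix, not part of the algorithm below), we can diagonalize A^T A with a rotation, set S = √D, recover U = AV^T S^-1 and measure how far U^T U is from the identity:

```cpp
#include <cmath>

// Numeric check of the derivation above: build a rotation V that
// diagonalizes A^T*A, set S = sqrt(D), recover U = A*V^T*S^-1 and
// return the largest deviation of U^T*U from the identity.
// The sample matrix here is arbitrary (any invertible A will do).
double utu_error() {
    double a = 3, b = 1, c = 2, d = -1;            // A = [a b; c d]
    double alpha = a*a + c*c;                      // (A^T A)(0,0)
    double beta  = b*b + d*d;                      // (A^T A)(1,1)
    double gamma = a*b + c*d;                      // off-diagonal entry
    // Jacobi angle: tan(2*theta) = 2*gamma/(alpha-beta) zeroes the
    // off-diagonal of V*(A^T A)*V^T
    double th = 0.5 * std::atan2(2*gamma, alpha - beta);
    double cv = std::cos(th), sv = std::sin(th);   // V = [cv sv; -sv cv]
    // Diagonal of D = V*(A^T A)*V^T
    double d1 = cv*cv*alpha + 2*cv*sv*gamma + sv*sv*beta;
    double d2 = sv*sv*alpha - 2*cv*sv*gamma + cv*cv*beta;
    double s1 = std::sqrt(d1), s2 = std::sqrt(d2); // S = sqrt(D)
    // U = A * V^T * S^-1 (S^-1 just scales the columns)
    double u00 = (a*cv + b*sv)/s1, u01 = (-a*sv + b*cv)/s2;
    double u10 = (c*cv + d*sv)/s1, u11 = (-c*sv + d*cv)/s2;
    // Largest entry of |U^T*U - I|
    double e00 = u00*u00 + u10*u10 - 1;
    double e11 = u01*u01 + u11*u11 - 1;
    double e01 = u00*u01 + u10*u11;
    return std::fmax(std::fabs(e01),
                     std::fmax(std::fabs(e00), std::fabs(e11)));
}
```

For the sample matrix the deviation comes out on the order of machine epsilon.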
Calculating the diagonalizing rotation can be done by solving the following equation:

    t2^2 - ((β - α)/γ)·t2 - 1 = 0

where

    A^T A = [ a  c ] [ a  b ]   [ a^2+c^2  ab+cd   ]   [ α  γ ]
            [ b  d ] [ c  d ] = [ ab+cd    b^2+d^2 ] = [ γ  β ]

and t2 is the tangent of the angle of V. This can be derived by expanding V(A^T A)V^T and making its off-diagonal elements equal to zero (they are equal to each other, since the product is symmetric).
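With ζ = (β - α)/γ, the roots of that quadratic are (ζ ± √(ζ^2 + 4))/2, and the smaller-magnitude one should be computed in rationalized form so that no two nearly-equal numbers get subtracted (the same |ζ| + √(ζ^2 + 4) denominator shows up in Svd2x2Helper below). A standalone sketch of that:

```cpp
#include <cmath>

// Smaller-magnitude root of t^2 - zeta*t - 1 = 0, in cancellation-free
// rationalized form:
//   t = -2*sign(zeta) / (|zeta| + sqrt(zeta^2 + 4))
// The naive (zeta - sqrt(zeta^2 + 4))/2 variant loses all precision for
// large positive zeta, where the two terms nearly cancel.
double stable_tangent(double zeta) {
    double s = (zeta >= 0) ? 1.0 : -1.0;
    return -2*s / (std::fabs(zeta) + std::sqrt(zeta*zeta + 4));
}
```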
The problem with this method is that it loses significant floating point precision when calculating β - α and γ for certain matrices, because of the subtractions involved. The solution is to do an RQ decomposition (A = RQ, R upper triangular, Q orthogonal) first, then use the algorithm to factorize USV' = R. This gives A = RQ = (USV')Q = US(V'Q), so V = V'Q. Notice how setting c to 0 (as in the upper triangular R) eliminates some of the additions/subtractions. (The RQ decomposition is fairly trivial from the expansion of the matrix product.)
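The RQ step is easy to check in isolation: building Q as the Givens rotation from the second row of A (the same c2 = d/√(c^2+d^2), s2 = -c/√(c^2+d^2) as in Rq2x2Helper below, minus its overflow guard) zeroes the lower-left entry of R = AQ^T, and R·Q reconstructs A. A sketch over an arbitrary sample matrix:

```cpp
#include <cmath>

// 2x2 RQ decomposition check: Q = [c2 s2; -s2 c2] built from the second
// row (c, d) of A makes R = A*Q^T upper triangular, so A = R*Q.
// Returns the worst error over R(1,0) and the reconstruction R*Q - A.
// The sample matrix is arbitrary (this sketch assumes c != 0).
double rq_check() {
    double a = 3, b = 1, c = 2, d = -1;   // A = [a b; c d]
    double den = 1/std::sqrt(c*c + d*d);
    double s2 = -c*den;
    double c2 =  d*den;
    // R = A*Q^T, with Q^T = [c2 -s2; s2 c2]
    double x   =  a*c2 + b*s2;            // R(0,0)
    double y   = -a*s2 + b*c2;            // R(0,1)
    double r10 =  c*c2 + d*s2;            // R(1,0), zero by construction
    double z   = -c*s2 + d*c2;            // R(1,1)
    // Reconstruct R*Q and compare against A
    double e00 = (x*c2 - y*s2) - a;
    double e01 = (x*s2 + y*c2) - b;
    double e10 = (r10*c2 - z*s2) - c;
    double e11 = (r10*s2 + z*c2) - d;
    double err = std::fabs(r10);
    err = std::fmax(err, std::fabs(e00));
    err = std::fmax(err, std::fabs(e01));
    err = std::fmax(err, std::fabs(e10));
    err = std::fmax(err, std::fabs(e11));
    return err;
}
```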
The algorithm naively implemented this way has some numerical and logical anomalies (e.g. should S be +√D or -√D?), which I fixed in the code below.
I threw about two billion randomized matrices at the code, and the largest numerical error produced was around 6⋅10^-7 (with 32-bit floats, error = ||USV - M|| / ||M||). The algorithm runs in about 340 clock cycles (MSVC 19, Ivy Bridge).
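The code below references a Matrix type and an impl::sign_nonzero helper whose definitions aren't shown; minimal stand-in definitions (a sketch — any equivalent row-major matrix and branch-free sign helper will do) look like this:

```cpp
#include <cmath>

// Minimal stand-ins for the scaffolding the code below relies on.
// (Sketch only; not necessarily the original definitions.)
template <class T, int Rows, int Cols>
class Matrix {
public:
    // Row-major storage; A(i, j) addresses row i, column j.
    T& operator()(int i, int j) { return data_[i*Cols + j]; }
    const T& operator()(int i, int j) const { return data_[i*Cols + j]; }
private:
    T data_[Rows*Cols] = {};
};

namespace impl {
    // Branch-free sign that never returns 0: +1 for x >= +0, -1 for
    // negative x (and -0). copysign keeps this CMOV/bit-op friendly.
    template <class T>
    T sign_nonzero(T x) {
        return std::copysign(T(1), x);
    }
}
```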
#include <algorithm>  // std::max
#include <cmath>      // std::abs, std::sqrt, std::hypot

template <class T>
void Rq2x2Helper(const Matrix<T, 2, 2>& A, T& x, T& y, T& z, T& c2, T& s2) {
    T a = A(0, 0);
    T b = A(0, 1);
    T c = A(1, 0);
    T d = A(1, 1);

    if (c == 0) {
        x = a;
        y = b;
        z = d;
        c2 = 1;
        s2 = 0;
        return;
    }

    // Scale the second row by its largest element to avoid overflow/underflow
    T maxden = std::max(std::abs(c), std::abs(d));
    T rcmaxden = 1/maxden;
    c *= rcmaxden;
    d *= rcmaxden;

    T den = 1/std::sqrt(c*c + d*d);

    T numx = (-b*c + a*d);
    T numy = (a*c + b*d);
    x = numx * den;
    y = numy * den;
    z = maxden/den;

    s2 = -c * den;
    c2 = d * den;
}
template <class T>
void Svd2x2Helper(const Matrix<T, 2, 2>& A, T& c1, T& s1, T& c2, T& s2, T& d1, T& d2) {
    // Calculate RQ decomposition of A
    T x, y, z;
    Rq2x2Helper(A, x, y, z, c2, s2);

    // Calculate tangent of rotation on R[x,y;0,z] to diagonalize R^T*R
    T scaler = T(1)/std::max(std::abs(x), std::abs(y));
    T x_ = x*scaler, y_ = y*scaler, z_ = z*scaler;
    T numer = ((z_-x_)*(z_+x_)) + y_*y_;
    T gamma = x_*y_;
    gamma = numer == 0 ? 1 : gamma;
    T zeta = numer/gamma;

    T t = 2*impl::sign_nonzero(zeta)/(std::abs(zeta) + std::sqrt(zeta*zeta+4));

    // Calculate sines and cosines
    c1 = T(1) / std::sqrt(T(1) + t*t);
    s1 = c1*t;

    // Calculate U*S = R*R(c1,s1)
    T usa = c1*x - s1*y;
    T usb = s1*x + c1*y;
    T usc = -s1*z;
    T usd = c1*z;

    // Update V = R(c1,s1)^T*Q
    t = c1*c2 + s1*s2;
    s2 = c2*s1 - c1*s2;
    c2 = t;

    // Separate U and S
    d1 = std::hypot(usa, usc);
    d2 = std::hypot(usb, usd);
    T dmax = std::max(d1, d2);
    T usmax1 = d2 > d1 ? usd : usa;
    T usmax2 = d2 > d1 ? usb : -usc;

    T signd1 = impl::sign_nonzero(x*z);
    dmax *= d2 > d1 ? signd1 : 1;
    d2 *= signd1;
    T rcpdmax = 1/dmax;

    c1 = dmax != T(0) ? usmax1 * rcpdmax : T(1);
    s1 = dmax != T(0) ? usmax2 * rcpdmax : T(0);
}
Ideas from:
http://www.cs.utexas.edu/users/inderjit/public_papers/HLA_SVD.pdf
http://www.math.pitt.edu/~sussmanm/2071Spring08/lab09/index.html
http://www.lucidarme.me/singular-value-decomposition-of-a-2x2-matrix/