Is Tikhonov regularization the same as ridge regression?


Answer:



Tikhonov regularization is a larger set than ridge regression. Here is my attempt to spell out exactly how they differ.

Suppose that for a known matrix $A$ and vector $b$, we wish to find a vector $x$ such that:

$$Ax = b.$$

The standard approach is ordinary least squares linear regression. However, if no $x$ satisfies the equation or more than one $x$ does—that is, the solution is not unique—the problem is said to be ill-posed. Ordinary least squares seeks to minimize the sum of squared residuals, which can be compactly written as:

$$\|Ax - b\|^2$$

where $\|\cdot\|$ is the Euclidean norm. In matrix notation the solution, denoted by $\hat{x}$, is given by:

$$\hat{x} = (A^T A)^{-1} A^T b$$

Tikhonov regularization minimizes

$$\|Ax - b\|^2 + \|\Gamma x\|^2$$

for some suitably chosen Tikhonov matrix $\Gamma$. An explicit solution in matrix form, denoted by $\hat{x}$, is given by:

$$\hat{x} = (A^T A + \Gamma^T \Gamma)^{-1} A^T b$$

For $\Gamma = 0$ this reduces to the unregularized least squares solution, provided that $(A^T A)^{-1}$ exists.
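As a quick numerical sanity check, here is a NumPy sketch of the two closed forms above (the data and the choice of $\Gamma$ are invented for illustration), confirming that the Tikhonov solution collapses to the OLS solution when $\Gamma = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 3))   # known matrix A (illustrative)
b = rng.normal(size=20)        # known vector b (illustrative)
G = 0.5 * np.eye(3)            # an illustrative Tikhonov matrix Gamma

# Ordinary least squares: x_hat = (A^T A)^{-1} A^T b
x_ols = np.linalg.solve(A.T @ A, A.T @ b)

# Tikhonov: x_hat = (A^T A + Gamma^T Gamma)^{-1} A^T b
x_tik = np.linalg.solve(A.T @ A + G.T @ G, A.T @ b)

# With Gamma = 0 the Tikhonov solution reduces to OLS
x_tik0 = np.linalg.solve(A.T @ A + np.zeros((3, 3)), A.T @ b)

print(np.allclose(x_tik0, x_ols))  # True
```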

Typically for ridge regression, two departures from Tikhonov regularization are described. First, the Tikhonov matrix is replaced by a multiple of the identity matrix

$$\Gamma = \alpha I,$$

giving preference to solutions with smaller norm, i.e., the $L_2$ norm. Then $\Gamma^T \Gamma$ becomes $\alpha^2 I$, leading to

$$\hat{x} = (A^T A + \alpha^2 I)^{-1} A^T b$$

Finally, for ridge regression, it is typically assumed that the variables in $A$ are scaled so that $X^T X$ has the form of a correlation matrix, and $X^T b$ is the vector of correlations between the $x$ variables and $b$, leading to

$$\hat{x} = (X^T X + \alpha^2 I)^{-1} X^T b$$

Note that in this form the Lagrange multiplier $\alpha^2$ is usually replaced by $k$, $\lambda$, or some other symbol, but retains the property $\lambda \geq 0$.
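The shrinkage behaviour of this ridge form can be illustrated with a short NumPy sketch (the standardized design matrix $X$ and the grid of $\lambda$ values are invented for the example): as $\lambda$ grows, the norm of $\hat{x}$ shrinks toward zero.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns, as assumed above
b = rng.normal(size=50)

def ridge(X, b, lam):
    """Ridge solution x_hat = (X^T X + lam*I)^{-1} X^T b, with lam >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ b)

# Larger lambda => stronger shrinkage of the solution norm
norms = [np.linalg.norm(ridge(X, b, lam)) for lam in (0.0, 1.0, 10.0, 100.0)]
print(all(n1 >= n2 for n1, n2 in zip(norms, norms[1:])))  # True
```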

In formulating this answer, I acknowledge borrowing liberally from Wikipedia and from *Ridge estimation of transfer function weights*.


(+1) For completeness, it is worth mentioning that in practical application the regularized system would typically be written in the form $$\begin{bmatrix} A \\ \alpha \Gamma \end{bmatrix} x \approx \begin{bmatrix} b \\ 0 \end{bmatrix}, \quad \text{i.e. } \hat{A}x \approx \hat{b},$$ which can then be solved as a standard linear least squares problem (e.g. via QR/SVD on $\hat{A}$, without explicitly forming the normal equations).
GeoMatt22

Good point. I'll add it in later.
Carl
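GeoMatt22's augmented-system formulation is easy to check numerically. The NumPy sketch below (random data and an identity $\Gamma$, chosen purely for illustration) confirms that solving the stacked least-squares problem gives the same $\hat{x}$ as the normal equations:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(30, 5))
b = rng.normal(size=30)
alpha = 0.7
G = np.eye(5)  # illustrative Tikhonov matrix

# Normal-equations solution: (A^T A + alpha^2 G^T G)^{-1} A^T b
x_ne = np.linalg.solve(A.T @ A + alpha**2 * (G.T @ G), A.T @ b)

# Stacked formulation: minimize ||[A; alpha*G] x - [b; 0]||^2
A_hat = np.vstack([A, alpha * G])
b_hat = np.concatenate([b, np.zeros(5)])
x_ls, *_ = np.linalg.lstsq(A_hat, b_hat, rcond=None)

print(np.allclose(x_ne, x_ls))  # True
```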

Are smoothing splines and similar basis expansion methods a subset of Tikhonov regularization?
Sycorax says Reinstate Monica

@Sycorax I do not expect so. For example, a B-spline would set derivatives at zero at endpoints, and match derivatives and magnitudes of spline to data in between endpoints. Tikhonov regularization will minimize whatever parameter error you tell it to by changing slope of fit. So, different things.
Carl

Also, Tikhonov regularization has a formulation in arbitrary dimensions for (separable?) Hilbert spaces.
AIM_BLB


Carl has given a thorough answer that nicely explains the mathematical differences between Tikhonov regularization and ridge regression. Inspired by the historical discussion here, I thought it might be useful to add a short example demonstrating how the more general Tikhonov framework can be useful.

First a brief note on context. Ridge regression arose in statistics, and while regularization is now widespread in statistics & machine learning, Tikhonov's approach was originally motivated by inverse problems arising in model-based data assimilation (particularly in geophysics). The simplified example below is in this category (more complex versions are used for paleoclimate reconstructions).


Imagine we want to reconstruct temperatures $u[x, t=0]$ in the past, based on present-day measurements $u[x, t=T]$. In our simplified model we will assume that temperature evolves according to the heat equation

$$u_t = u_{xx}$$
in 1D with periodic boundary conditions
$$u[x+L,\,t] = u[x,\,t].$$
A simple (explicit) finite difference approach leads to the discrete model
$$\frac{\Delta u}{\Delta t} = \frac{\mathbf{L}\,u}{\Delta x^2} \;\implies\; u_{t+1} = \mathbf{A}\,u_t$$
Mathematically, the evolution matrix A is invertible, so we have
$$u_t = \mathbf{A}^{-1} u_{t+1}$$
However numerically, difficulties will arise if the time interval T is too long.

Tikhonov regularization can solve this problem by solving

$$\begin{bmatrix} \mathbf{A} \\ \omega \mathbf{L} \end{bmatrix} u_t \approx \begin{bmatrix} u_{t+1} \\ 0 \end{bmatrix}$$
which adds a small penalty $\omega^2 \ll 1$ on the roughness $u_{xx}$.

Below is a comparison of the results:

[Figure: Tikhonov vs. checkerboard comparison]

We can see that the original temperature $u_0$ has a smooth profile, which is smoothed still further by diffusion to give $u_{fwd}$. Direct inversion fails to recover $u_0$, and the solution $u_{inv}$ shows strong "checkerboarding" artifacts. However, the Tikhonov solution $u_{reg}$ is able to recover $u_0$ with quite good accuracy.

Note that in this example, ridge regression would always push our solution towards an "ice age" (i.e. uniform zero temperatures). Tikhonov regularization allows us a more flexible physically-based prior constraint: here our penalty essentially says the reconstruction $u$ should be only slowly evolving, i.e. $u_t \approx 0$.


Matlab code for the example is below (can be run online here).

% Tikhonov Regularization Example: Inverse Heat Equation
n=15; t=2e1; w=1e-2; % grid size, # time steps, regularization
L=toeplitz(sparse([-2,1,zeros(1,n-3),1]/2)); % laplacian (periodic BCs)
A=(speye(n)+L)^t; % forward operator (diffusion)
x=(0:n-1)'; u0=sin(2*pi*x/n); % initial condition (periodic & smooth)
ufwd=A*u0; % forward model
uinv=A\ufwd; % inverse model
ureg=[A;w*L]\[ufwd;zeros(n,1)]; % regularized inverse
plot(x,u0,'k.-',x,ufwd,'k:',x,uinv,'r.:',x,ureg,'ro');
set(legend('u_0','u_{fwd}','u_{inv}','u_{reg}'),'box','off');

All compliments warmly received. It is worthwhile mentioning, even if slightly off topic, that both Tikhonov regularization and ridge regression can be used to target physically motivated solutions. (+1)
Carl

@Carl this is certainly true. We could even use it here, by switching variables to $v = \mathbf{L}u$! (In general, any Tikhonov problem with an invertible Tikhonov matrix can be converted to ridge regression.)
GeoMatt22
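GeoMatt22's parenthetical claim can also be verified numerically. In this NumPy sketch (the data and the invertible $\Gamma$ are invented for illustration), substituting $v = \Gamma x$ turns the Tikhonov problem into a plain ridge problem with design matrix $A\Gamma^{-1}$, and mapping the ridge solution back through $\Gamma^{-1}$ recovers the Tikhonov solution:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(25, 4))
b = rng.normal(size=25)
G = np.array([[2., 1., 0., 0.],   # an invertible (upper-triangular)
              [0., 2., 1., 0.],   # Tikhonov matrix, for illustration
              [0., 0., 2., 1.],
              [0., 0., 0., 2.]])

# Tikhonov solution in the original variable x
x_tik = np.linalg.solve(A.T @ A + G.T @ G, A.T @ b)

# Substitute v = G x: minimize ||(A G^{-1}) v - b||^2 + ||v||^2  (ridge, alpha = 1)
B = A @ np.linalg.inv(G)
v_ridge = np.linalg.solve(B.T @ B + np.eye(4), B.T @ b)

# Mapping back via x = G^{-1} v recovers the Tikhonov solution
print(np.allclose(np.linalg.solve(G, v_ridge), x_tik))  # True
```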
Licensed under cc by-sa 3.0 with attribution required.