Determining the mean and standard deviation in real time

31

What would be the ideal way to find the mean and standard deviation of a signal for a real-time application? I'd like to be able to trigger a controller when the signal is more than 3 standard deviations away from the mean for a certain amount of time.

I'm assuming a dedicated DSP would do this pretty readily, but is there any "shortcut" that may not require something so complicated?

statistics real-time measurement

— jonsca

Do you know anything about the signal? Is it stationary?

@Tim Let's say that it's stationary. For my own curiosity, what would be the ramifications of a non-stationary signal?

— jonsca

3

If it's stationary, then you could simply calculate a running mean and standard deviation. Things would be more complicated if the mean and standard deviation change with time.

5

Very related: en.wikipedia.org/wiki/ đá

— Dr. belisarius

36

There's a flaw in Jason R's answer, which is discussed in Knuth's "Art of Computer Programming" vol. 2. The problem comes if you have a standard deviation which is a small fraction of the mean: the calculation of E(x^2) - (E(x)^2) will suffer from severe sensitivity to floating-point rounding errors.

You can even try this yourself in a Python script:

ofs = 1e9
A = [ofs+x for x in [1,-1,2,3,0,4.02,5]] 
A2 = [x*x for x in A]
(sum(A2)/len(A))-(sum(A)/len(A))**2

I get -128.0 as an answer, which is clearly not computationally valid, since the math predicts that the result should be nonnegative.
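The size of that error is not a coincidence. A rough sketch, assuming IEEE 754 double precision and Python 3.9 or later (for `math.ulp`): both E(x^2) and E(x)^2 are close to 1e18, where adjacent doubles are 128 apart, so a true variance of about 4 is below the resolution of either operand.

```python
import math

ofs = 1e9
mean_sq_scale = ofs * ofs   # E(x^2) and E(x)^2 are both close to 1e18

# Spacing between adjacent doubles near 1e18 (unit in the last place).
spacing = math.ulp(mean_sq_scale)
print(spacing)   # 128.0

# The true variance (about 4) is far smaller than that spacing, so the
# difference of the two large terms can only come out as a multiple of it.
true_variance = 4.0114775510204073
print(true_variance < spacing)   # True
```

Seen this way, -128.0 is simply the nearest representable step below zero.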

Knuth cites an approach (I don't remember the name of the inventor) for calculating a running mean and standard deviation which goes something like this:

 initialize:
    m = 0;
    S = 0;
    n = 0;

 for each incoming sample x:
    prev_mean = m;
    n = n + 1;
    m = m + (x-m)/n;
    S = S + (x-m)*(x-prev_mean);

and then after each step, the value of m is the mean, and the standard deviation can be calculated as sqrt(S/n) or sqrt(S/(n-1)) depending on which is your favorite definition of standard deviation.

The equation I write above is slightly different from the one in Knuth, but it's computationally equivalent.

When I have a few more minutes, I'll code up the above formula in Python and show that you'll get a nonnegative answer (that hopefully is close to the correct value).

update: here it is.

test1.py:

import math

def stats(x):
  n = 0
  S = 0.0
  m = 0.0
  for x_i in x:
    n = n + 1
    m_prev = m
    m = m + (x_i - m) / n
    S = S + (x_i - m) * (x_i - m_prev)
  return {'mean': m, 'variance': S/n}

def naive_stats(x):
  S1 = sum(x)
  n = len(x)
  S2 = sum([x_i**2 for x_i in x])
  return {'mean': S1/n, 'variance': (S2/n - (S1/n)**2) }

x1 = [1,-1,2,3,0,4.02,5] 
x2 = [x+1e9 for x in x1]

print("naive_stats:")
print(naive_stats(x1))
print(naive_stats(x2))

print("stats:")
print(stats(x1))
print(stats(x2))

result:

naive_stats:
{'variance': 4.0114775510204073, 'mean': 2.0028571428571427}
{'variance': -128.0, 'mean': 1000000002.0028572}
stats:
{'variance': 4.0114775510204073, 'mean': 2.0028571428571431}
{'variance': 4.0114775868357446, 'mean': 1000000002.0028571}

You'll note that there's still some rounding error, but it's not bad, whereas naive_stats just pukes.
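For the real-time use case in the question, the same recurrence can be wrapped so that it is fed one sample at a time. This is only a sketch: the class name `RunningStats`, the method `is_outlier`, and the threshold parameter `k` are made up for illustration and are not part of the answer above.

```python
import math

class RunningStats:
    """Running mean and variance, updated one sample at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.S = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        prev_mean = self.mean
        self.mean += (x - prev_mean) / self.n
        self.S += (x - self.mean) * (x - prev_mean)

    def std(self):
        # Population definition sqrt(S/n); divide by n - 1 for the sample version.
        return math.sqrt(self.S / self.n) if self.n > 0 else 0.0

    def is_outlier(self, x, k=3.0):
        """True if x lies more than k standard deviations from the running mean."""
        return self.n > 1 and abs(x - self.mean) > k * self.std()


rs = RunningStats()
for sample in [1, -1, 2, 3, 0, 4.02, 5]:
    rs.update(sample)

print(rs.mean, rs.S / rs.n)
print(rs.is_outlier(2.5), rs.is_outlier(20.0))
```

The "for a certain amount of time" part of the question would still need a counter of consecutive outliers, which is left out here.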

edit: Just noticed Belisarius's comment citing Wikipedia, which does mention the Knuth algorithm.

— Jason S

1

+1 for the detailed answer with example code. This approach is superior to the one indicated in my answer when a floating-point implementation is needed.

— Jason R

1

One might also check this for a C++ implementation: johndcook.com/standard_deviation.html

— Rui Marques

1

yep, that's it. He uses the exact equations Knuth uses. You can optimize somewhat and avoid having to check for initial iteration vs. subsequent iterations if you use my method.

— Jason S

"Knuth cites an approach (I don't remember the name of the inventor) for calculating running mean" -- it's Welford's method, by the way.

— Jason S

I have posted a question related to this if anyone is able to help: dsp.stackexchange.com/questions/31812/…

— Jonathan

13

What would be the ideal way to find the mean and standard deviation of a signal for a real-time application? I'd like to be able to trigger a controller when the signal is more than 3 standard deviations away from the mean for a certain amount of time.

An exponentially weighted running average gives you an average over roughly the last $\tau$ seconds, which is probably what you want, rather than the usual arithmetic average over all samples ever seen.

In the frequency domain, an "exponentially weighted running average" is simply a real pole. It is simple to implement in the time domain.

Time domain implementation

Let mean and meansq be the current estimates of the mean and mean of the square of the signal. On every cycle, update these estimates with the new sample x:

% update the estimate of the mean and the mean square:
mean = (1-a)*mean + a*x
meansq = (1-a)*meansq + a*(x^2)

% calculate the estimate of the variance:
var = meansq - mean^2;

% and, if you want standard deviation:
std = sqrt(var);

Here $0 < a < 1$ is a constant that determines the effective length of the running average. How to choose $a$ is described below in "analysis".
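The update rule above can be sketched in Python as follows. The class name `EwmaStats`, the `initial` argument, and the clamp at zero are assumptions added for illustration; the two update lines are the ones from the answer.

```python
import math

class EwmaStats:
    """Exponentially weighted estimates of the mean and the mean square."""

    def __init__(self, a, initial=0.0):
        if not 0.0 < a < 1.0:
            raise ValueError("a must satisfy 0 < a < 1")
        self.a = a
        self.mean = initial
        self.meansq = initial * initial

    def update(self, x):
        a = self.a
        self.mean = (1 - a) * self.mean + a * x
        self.meansq = (1 - a) * self.meansq + a * x * x
        # Rounding can push the difference slightly below zero; clamp it.
        var = max(self.meansq - self.mean ** 2, 0.0)
        return self.mean, math.sqrt(var)


est = EwmaStats(a=0.05)
for i in range(2000):
    x = 1.0 if i % 2 == 0 else 3.0   # alternates around mean 2 with std 1
    mean, std = est.update(x)
print(mean, std)
```

After the start-up transient has died away, the printed estimates should sit close to 2 and 1, with a small ripple that depends on `a`.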

What is expressed above as an imperative program may also be depicted as a signal-flow diagram:

[Figure: signal-flow diagram]

Analysis

The above algorithm computes $y_i = a x_i + (1-a) y_{i-1}$ where $x_i$ is the input at sample $i$ , and $y_i$ is the output (i.e. estimate of the mean). This is a simple, single-pole IIR filter. Taking the $z$ transform, we find the transfer function

$H(z) = \frac{a}{1-(1-a)z^{-1}}$.

Condensing the IIR filters into their own blocks, the diagram now looks like this:

[Figure: signal-flow diagram with the IIR filters condensed into blocks]

To go to the continuous domain, we make the substitution $z = e^{s T}$ where $T$ is the sample time and $f_s = 1/T$ is the sample rate. Solving $1-(1-a)e^{-sT}=0$ , we find that the continuous system has a pole at $s = \frac{1}{T} \log (1-a)$ .

Choose $a$ so that this pole sits at $s = -2\pi/\tau$ (the exponent must be negative for $0 < a < 1$ to hold):

$a = 1 - \exp \left\{-2\pi\frac{T}{\tau}\right\}$
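As a quick numerical check, here is a small sketch that computes $a$ from $T$ and $\tau$ and maps it back to the pole location. It assumes the exponent is negative, which is what keeps $a$ between 0 and 1; the function name `smoothing_coefficient` and the example values of `T` and `tau` are hypothetical.

```python
import math

def smoothing_coefficient(T, tau):
    """Return a such that the continuous-time pole sits at s = -2*pi/tau.

    T is the sample period and tau the averaging time, in the same units.
    """
    return 1.0 - math.exp(-2.0 * math.pi * T / tau)


T = 1.0 / 1000.0   # hypothetical 1 kHz sample rate
tau = 0.5          # hypothetical averaging time of 0.5 s
a = smoothing_coefficient(T, tau)
pole = math.log(1.0 - a) / T   # s = (1/T) log(1 - a)
print(a, pole)
```

A longer `tau` gives a smaller `a`, that is, a slower and smoother estimate.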

References

The Simulink diagram source may be downloaded from https://gist.github.com/1942771

— nibot

1

Could you explain how $a$ determines the length of the running average? And what value of $a$ should be used? The specification 0 > a > 1 is impossible to meet.

— Dilip Sarwate

This is similar to Jason R's approach. This method will be less accurate but a little faster and lower on memory. This approach ends up using an exponential window.

— schnarf

Whoops! Of course I meant 0 < a < 1. If your system has sampling time T and you'd like an averaging time constant tau, then choose a = 1 - exp(-2*pi*T/tau).

— nibot

I think there may be a mistake in here. The single-pole filters don't have 0 dB gain at DC and since you are applying one filter in the linear domain and one in the squared domain the gain error is different for E<x> and E<x^2>. I'll elaborate more in my answer

— Hilmar

It does have 0 dB gain at DC. Substitute z=1 (DC) into H(z) = a/(1-(1-a)/z) and you get 1.

— nibot

5

A method I've used before in an embedded processing application is to maintain accumulators of the sum and sum-of-squares of the signal of interest:

$A_{x,i} = \sum_{k=0}^{i}x[k] = A_{x,i-1} + x[i], A_{x,-1} = 0$

$A_{x^2,i} = \sum_{k=0}^{i}x^2[k] = A_{x^2,i-1} + x^2[i], A_{x^2,-1} = 0$

Also, keep track of the current time instant $i$ in the above equations (that is, note the number of samples that you've added into the accumulators). Then, the sample mean and standard deviation at time $i$ are:

$\tilde\mu = \frac{A_{x,i}}{i+1}$

$\tilde\sigma = \sqrt{\frac{A_{x^2,i}}{i+1} - \tilde\mu^2}$

or you can use:

$\tilde\sigma = \sqrt{\frac{A_{x^2,i}}{i} - \tilde\mu^2}$

depending upon which standard deviation estimation method you prefer. These equations are based on the definition of the variance:

$\sigma^2 = \operatorname{E}(X^2) - (\operatorname{E}(X))^2$

I've used these successfully in the past (although I was only concerned with variance estimation, not standard deviation), although you do have to be careful about the numeric types you use to hold the accumulators if you're going to be summing over a long period of time; you don't want overflow.

Edit: In addition to the above comment on overflow, it should be noted that this is not a numerically robust algorithm when implemented in floating-point arithmetic, potentially causing large errors in the estimated statistics. Look at Jason S's answer for a better approach in that case.
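A minimal sketch of the accumulator idea, assuming integer samples (for example raw ADC counts) so that the sums are exact. The sliding window, the class name `WindowedAccumulators`, and the use of `Fraction` for the final division are assumptions added for illustration. Python integers cannot overflow; on fixed-width hardware the accumulator width still has to be chosen with care.

```python
from collections import deque
from fractions import Fraction

class WindowedAccumulators:
    """Sum and sum-of-squares accumulators over the last `size` integer samples."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.sum_x = 0    # A_x
        self.sum_x2 = 0   # A_{x^2}

    def update(self, x):
        self.window.append(x)
        self.sum_x += x
        self.sum_x2 += x * x
        if len(self.window) > self.size:
            old = self.window.popleft()   # sample that slid out of the window
            self.sum_x -= old
            self.sum_x2 -= old * old

    def mean_and_variance(self):
        n = len(self.window)
        mean = Fraction(self.sum_x, n)
        return mean, Fraction(self.sum_x2, n) - mean * mean


acc = WindowedAccumulators(size=4)
for x in [10**9 + v for v in (1, -1, 2, 3, 0, 4, 5)]:
    acc.update(x)
mean, var = acc.mean_and_variance()
print(float(mean), float(var))
```

Because every intermediate value is an exact integer or rational, the large offset of 10**9 does not disturb the variance here, unlike in the floating-point case.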

— Jason R

1

Maybe rewriting it as $A_{x,i}=x[i]+A_{x,i-1},\ A_{x,0}=x[0]$ will make it clear that you need to add only two numbers at each step, lest someone implement it as summing all $i$ elements of $x$ at each step.

— Lorem Ipsum

Yes, that's better. I tried to rewrite to make the recursive implementation more clear.

— Jason R

2

-1 when I have enough rep to do so: this has numerical problems. See Knuth vol. 2

— Jason S

There seem to be a couple of typos here. Why is the mean being subtracted under the square root sign for $\sigma$? It should be $\mu^2$ to match the displayed equation $\sigma^2 = E(X^2) - (E(X))^2$, no? Also, though I won't vote down this answer, I agree with Jason S that there can be numerical problems in this approach.

— Dilip Sarwate

2

@JasonS: I would disagree that the technique is inherently flawed, although I agree with your point that it is not a numerically robust method when implemented in floating point. I should have been more clear that I've used this successfully in an application that used integer arithmetic. Integer (or fixed-point implementations of fractional numbers) arithmetic doesn't suffer from the issue you pointed out that causes loss of precision. In that context, it is a suitable method that requires fewer operations per sample.

— Jason R

3

Similar to the preferred answer above (Jason S.), and also derived from the formula taken from Knuth (Vol. 2, p. 232), one can derive a formula to replace a value, i.e. remove one value and add another in a single step. According to my tests, replace delivers better precision than the two-step remove/add version.

The code below is in Java, mean and s get updated ("global" member variables), same as m and s above in Jason's post. The value count refers to the window size n.

/**
 * Replaces the value {@code x} currently present in this sample with the
 * new value {@code y}. In a sliding window, {@code x} is the value that
 * drops out and {@code y} is the new value entering the window. The sample
 * count remains constant with this operation.
 * 
 * @param x
 *            the value to remove
 * @param y
 *            the value to add
 */
public void replace(double x, double y) {
    final double deltaYX = y - x;
    final double deltaX = x - mean;
    final double deltaY = y - mean;
    mean = mean + deltaYX / count;
    final double deltaYp = y - mean;
    final double countMinus1 = count - 1;
    s = s - count / countMinus1 * (deltaX * deltaX - deltaY * deltaYp) - deltaYX * deltaYp / countMinus1;
}
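For comparison with the Python code earlier in the thread, here is a direct port of the update above. It is a sketch under one assumption: `s` holds the sum of squared deviations (the `S` of the earlier answer), so the variance is `s / count` or `s / (count - 1)`. The class name `SlidingStats` and its constructor are hypothetical, and `count` must be at least 2.

```python
class SlidingStats:
    """Mean and sum of squared deviations over a fixed-size window."""

    def __init__(self, values):
        self.count = len(values)
        self.mean = sum(values) / self.count
        self.s = sum((v - self.mean) ** 2 for v in values)

    def replace(self, x, y):
        """Swap the outgoing value x for the incoming value y."""
        delta_yx = y - x
        delta_x = x - self.mean
        delta_y = y - self.mean
        self.mean += delta_yx / self.count
        delta_yp = y - self.mean
        n1 = self.count - 1
        self.s -= (self.count / n1 * (delta_x * delta_x - delta_y * delta_yp)
                   + delta_yx * delta_yp / n1)


stats = SlidingStats([0.0, 1.0, 2.0])
stats.replace(0.0, 6.0)   # the window is now [1.0, 2.0, 6.0]
print(stats.mean, stats.s)
```

Recomputing from scratch on [1.0, 2.0, 6.0] gives a mean of 3 and a sum of squared deviations of 14, which is what the incremental update should reproduce.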

— marco

3

Jason's and Nibot's answers differ in one important aspect: Jason's method calculates the std dev and mean for the whole signal (since y = 0), while Nibot's is a "running" calculation, i.e. it weights more recent samples more strongly than samples from the distant past.

Since the application seems to require std dev and mean as a function of time, Nibot's method is probably the more appropriate one (for this specific application). However, the real tricky part will be to get the time weighting part right. Nibot's example uses a simple single pole filter.

The proper way to describe this is to say that we get an estimate of $E[x]$ by filtering $x[n]$ and an estimate of $E[x^2]$ by filtering $x[n]^2$. Estimation filters are typically low-pass filters. These filters should be scaled to have 0 dB gain at 0 Hz. Otherwise there is a constant gain error.

The choice of lowpass filter can be guided by what you know about your signal and the time resolution you need for your estimation. Lower cutoff frequencies and higher order will result in better accuracy but slower response time.

To complicate things further one filter is applied in the linear domain and another in the squared domain. Squaring significantly changes the spectral content of the signal so you may want to use a different filter in the squared domain.

Here is an example on how to estimate mean, rms and std dev as a function of time.

%% example
fs = 44100; n = fs; % 44.1 kHz sample rate, 1 second
% signal: white noise plus a low frequency drift at 5 Hz
x = randn(n,1) + sin(2*pi*(0:n-1)'*5/fs);
% mean estimation filter: since we are looking for effects in the 5 Hz range we use maybe a
% 25 Hz filter, 2nd order so it's not too sluggish
[b,a] = butter(2,25*2/fs);
xmeanEst = filter(b,a,x);
% now we estimate x^2; since most frequencies double we use twice the bandwidth
[b,a] = butter(2,50*2/fs);
x2Est = filter(b,a,x.^2);
% std deviation estimate: sqrt(E<x^2> - E<x>^2), clamped at zero because the
% two filters differ and the difference can dip slightly negative
xstd = sqrt(max(x2Est - xmeanEst.^2, 0));
% and plot it
h = plot([x, xmeanEst sqrt(x2Est) xstd]);
grid on;
legend('x','E<x>','sqrt(E<x^2>)','Std dev');
set(h(2:4),'Linewidth',2);
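The same idea can be tried without a filter-design toolbox. The sketch below swaps the 2nd-order Butterworth filters for single-pole low-pass filters, so it is not equivalent to the example above; the corner frequencies are reused from it, and the helper name `one_pole_lowpass` is made up for illustration.

```python
import math
import random

def one_pole_lowpass(samples, a):
    """y[i] = a*x[i] + (1-a)*y[i-1]; unity gain at DC."""
    y, out = 0.0, []
    for x in samples:
        y = a * x + (1.0 - a) * y
        out.append(y)
    return out

random.seed(0)
fs = 44100
n = fs
# White noise plus a 5 Hz drift, as in the example above.
x = [random.gauss(0.0, 1.0) + math.sin(2 * math.pi * 5 * i / fs) for i in range(n)]

a_mean = 1.0 - math.exp(-2 * math.pi * 25 / fs)   # roughly a 25 Hz corner
a_sq = 1.0 - math.exp(-2 * math.pi * 50 / fs)     # roughly a 50 Hz corner
mean_est = one_pole_lowpass(x, a_mean)
sq_est = one_pole_lowpass([v * v for v in x], a_sq)

# Clamp at zero: the two filters differ, so the difference can dip negative.
std_est = [math.sqrt(max(s - m * m, 0.0)) for s, m in zip(sq_est, mean_est)]
print(mean_est[-1], std_est[-1])
```

With unit-variance noise the standard deviation estimate should hover around 1, while the mean estimate follows the 5 Hz drift.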

— Hilmar

1

The filter in my answer corresponds to y1 = filter(a, [1, -(1-a)], x);.

— nibot

1

Good point on the distinction between the running statistics and the statistics of the overall sample. My implementation could be modified to compute running statistics by accumulating over a moving window, which can be done efficiently also (at each time step, subtract the time sample that just slid out of the window from each accumulator).

— Jason R

nibot, sorry you are right and I was wrong. I'll correct this right away

— Hilmar

1

+1 for suggesting different filtering for x and x^2

— nibot