Statistical variation in two Formula 1 qualifying formats

15

I just read this BBC article about the qualifying format in Formula 1.

The organisers want to make qualifying less predictable, i.e. to increase the statistical variation in the results. Glossing over a few irrelevant details, at the moment drivers are ranked by their best lap from (for concreteness) two attempts.

One F1 chief, Jean Todt, proposed that ranking drivers by the average of two laps would increase statistical variation, since drivers might be twice as likely to make a mistake. Other sources argued that any averaging would surely decrease statistical variation.

Can we say who is right under reasonable assumptions? I suppose it boils down to the relative variance of $\text{mean}(x,y)$ versus $\text{min}(x,y)$, where $x$ and $y$ are random variables representing a driver's two lap times?

variance

— không hài lòng

5

I think it depends on the distribution of lap times.

Let $X,Y$ be independent and identically distributed.

If $P(X=0)=P(X=1)=\frac{1}{2}$, then $Var(\frac{X+Y}{2}) = \frac{1}{8} < Var( \min(X,Y)) = \frac{3}{16}.$
However, if $P(X=0) = 0.9, P(X=100)=0.1$, then $Var(\frac{X+Y}{2}) = 450 > Var( \min(X,Y)) = 99.$

This agrees with the argument mentioned in the question about making mistakes (that is, occasionally setting an exceptionally long time with small probability). So we would have to know the distribution of lap times to decide.
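The two variances quoted above can be checked exactly by enumerating all outcomes. The following is only a small sketch: the helper names (`variance`, `pushforward`, `mean_and_min_variance`) are made up for illustration, and it assumes $X$ and $Y$ are i.i.d. with a finite discrete distribution.

```python
from fractions import Fraction
from itertools import product


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    return sum(p * (v - mean) ** 2 for v, p in dist.items())


def pushforward(dist, func):
    """Distribution of func(X, Y) when X and Y are i.i.d. with distribution dist."""
    out = {}
    for (x, px), (y, py) in product(dist.items(), repeat=2):
        z = func(x, y)
        out[z] = out.get(z, 0) + px * py
    return out


def mean_and_min_variance(dist):
    """Return (Var[(X+Y)/2], Var[min(X,Y)]) as exact fractions."""
    var_mean = variance(pushforward(dist, lambda x, y: Fraction(x + y, 2)))
    var_min = variance(pushforward(dist, min))
    return var_mean, var_min


# The two distributions from the answer (integer "lap times" for simplicity).
fair = {0: Fraction(1, 2), 1: Fraction(1, 2)}
rare_mistake = {0: Fraction(9, 10), 100: Fraction(1, 10)}

print(mean_and_min_variance(fair))          # (Fraction(1, 8), Fraction(3, 16))
print(mean_and_min_variance(rare_mistake))  # (Fraction(450, 1), Fraction(99, 1))
```

Using `Fraction` keeps the arithmetic exact, so the output can be compared directly with the hand-computed values.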

— cát

Interesting, I guess something like this works for continuous random variables too. What exactly went wrong in the previous proof?

— innisfree

1

As I understand it, the argument is that with $x\geq y$, the distance between $x$ and the mean is always smaller than the distance between $x$ and $\min(x,y)$, so the variance of the mean must be smaller than the variance of $\min(x,y)$. However, this does not follow: $\min(x,y)$ can stay consistently far away while the mean varies a lot. If the proof were based on an actual calculation, it would be easier to pinpoint exactly where it goes wrong (or to check whether it is valid).

— Sandris

2

Without loss of generality, suppose $y \leq x$, and that both variables are drawn from the same distribution with a given mean and variance.

$\{y,x\}$ improves on $\{x\}$ by:

case 1, mean: $\frac{y-x}{2}$,

case 2, min: $y-x$.

Therefore the mean has half the effect on the improvement (which is driven by the variance) that the minimum has (over 2 trials). That is, averaging reduces variability.

— James

I'm not convinced this is entirely correct. Could you please provide a formal explanation?

— Sandris

2

Here is my proof about Var[Mean].

For two random variables $x$, $y$ there is a relation between their mean and their max and min:

$2\,Mean(x,y) = Min(x,y) + Max(x,y)$

Therefore

$4\,Var[Mean] = Var[Min]+Var[Max]+2\,Cov[Min,Max]$

If we now assume that the distribution is symmetric around the mean, then

$Var[Min(x,y)]=Var[Max(x,y)]$

Then

$4\,Var[Mean] = 2\,Var[Min] + 2\,Cov[Min,Max]$

and, by the Cauchy-Schwarz inequality,

$Cov[Min,Max] \le \sqrt{Var[Min]\,Var[Max]} = Var[Min]$

Therefore

$Var[Mean] \le Var[Min]$

It is also easy to see from this derivation that, to reverse this inequality, you need a distribution that is very sharply truncated on the negative side of the mean. For example, for the exponential distribution the mean has a larger variance than the min.
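Both claims (the inequality for a symmetric distribution and its reversal for the exponential) can be checked with a quick Monte Carlo run. This is only a sketch under assumptions not in the answer: a standard normal stands in for "a symmetric distribution", the exponential has rate 1, and `simulate_variances` is a hypothetical helper name.

```python
import random


def variance(values):
    """Population variance of a list of floats."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def simulate_variances(draw, n_pairs=100_000, seed=0):
    """Estimate (Var[mean(x, y)], Var[min(x, y)]) for i.i.d. x, y from draw(rng)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means, mins = [], []
    for _ in range(n_pairs):
        x, y = draw(rng), draw(rng)
        means.append((x + y) / 2)
        mins.append(min(x, y))
    return variance(means), variance(mins)


# Symmetric case (assumed: standard normal). Theory: 1/2 versus 1 - 1/pi.
normal_mean, normal_min = simulate_variances(lambda rng: rng.gauss(0.0, 1.0))

# Sharply truncated case (assumed: exponential, rate 1). Theory: 1/2 versus 1/4.
expo_mean, expo_min = simulate_variances(lambda rng: rng.expovariate(1.0))

print("normal:      Var[mean] =", normal_mean, " Var[min] =", normal_min)
print("exponential: Var[mean] =", expo_mean, " Var[min] =", expo_min)
```

For the normal case the estimate of Var[Mean] should come out below Var[Min], and for the exponential case the order should flip, in line with the derivation.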

— sega_sai

1

Nice question, thank you! I agree with @sandris that the distribution of lap times matters, but would like to emphasize that the causal aspects of the question need to be addressed. My guess is that F1 wants to avoid a boring situation where the same team or driver dominates the sport year after year, and that they especially hope to introduce the (revenue-generating!) excitement of a real possibility that 'hot' new drivers can suddenly arise in the sport.

That is, my guess is that there is some hope to disrupt excessively stable rankings of teams/drivers. (Consider the analogy with raising the temperature in simulated annealing.) The question then becomes, what are the causal factors at work, and how are they distributed across the population of drivers/teams so as to create persistent advantage for current incumbents. (Consider the analogous question of levying high inheritance taxes to 'level the playing field' in society at large.)

Suppose incumbent teams are maintaining incumbency by a conservative strategy, heavily dependent on driver experience, that emphasizes low variance in lap times at the expense of mean lap time. Suppose that, by contrast, the up-and-coming teams with (say) younger drivers necessarily adopt a more aggressive (high-risk) strategy with larger variance, but that this involves some spectacular driving that sometimes 'hits it just right' and achieves a stunning lap time. Abstracting away from safety concerns, F1 would clearly like to see some such 'underdogs' in the race. In this causal scenario, it would seem that a best-of-n-laps policy (large $n$) would help give the upstarts a boost, assuming that the experienced drivers are 'set in their ways' and so couldn't readily adapt their style to the new policy.
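To make this scenario concrete, here is a toy simulation. Everything in it is an assumption for illustration only: lap times are taken to be normally distributed, the incumbent is given a mean of 90.0 s with a standard deviation of 0.2 s, the upstart a slower mean of 90.5 s with a standard deviation of 1.0 s, and `upstart_win_probability` is a hypothetical helper.

```python
import random


def upstart_win_probability(n_laps, trials=20_000, seed=0):
    """Estimate P(upstart's best of n_laps beats incumbent's best of n_laps)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    wins = 0
    for _ in range(trials):
        # Assumed: consistent incumbent, low variance, better mean lap time.
        incumbent = min(rng.gauss(90.0, 0.2) for _ in range(n_laps))
        # Assumed: aggressive upstart, worse mean lap time, much larger variance.
        upstart = min(rng.gauss(90.5, 1.0) for _ in range(n_laps))
        if upstart < incumbent:
            wins += 1
    return wins / trials


for n in (1, 2, 5, 10):
    print(n, upstart_win_probability(n))
```

Under these made-up numbers the upstart's chance of setting the faster best lap grows with $n$, because taking a minimum rewards the wider spread more than it penalizes the slower mean.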

Suppose, on the other hand, that engine failure is an uncontrollable event with the same probability across all teams, and that the current rankings correctly reflect genuine gradation in driver/team quality across many other factors. In this case, the bad luck of an engine failure promises to be the lone 'leveling factor' that F1 could exploit to achieve greater equality of opportunity--at least without heavy-handed ranking manipulations that destroy the appearance of 'competition'. In this case, a policy that heavily penalizes engine failures (which are the only factor in this scenario not operating relatively in favor of the incumbents) promises to promote instability in rankings. In this case, the best-of-n policy mentioned above would be exactly the wrong policy to pursue.

— David C. Norris

0

I generally agree with other answers that the average of two runs will have a lower variance, but I believe they are leaving out important aspects underlying the problem. A lot has to do with how drivers react to the rules and their strategies for qualifying.

For instance, with only one lap to qualify, drivers would be more conservative, and therefore more predictable and more boring to watch. The idea with two laps is to allow the drivers to take chances on one to try to get that "perfect lap", with another available for a conservative run. More runs would use up a lot of time, which could also be boring. The current setup might just be the "sweet spot" to get the most action in the shortest time frame.

Also note that with an averaging approach, the driver needs to find the fastest repeatable lap time. With the min approach, the driver needs to drive as fast as possible for only one lap, probably pushing further than they would under the averaging approach.

This discussion is closer to game theory. Your question might get better answers when framed in that light. Then one could propose other techniques, like the option for a driver to drop the first lap time in favor of a second run, and possibly a faster or slower time. Etc.

Also note that a change in qualifying was attempted this year that generally pushed drivers into one conservative lap (see https://en.wikipedia.org/wiki/2016_Formula_One_season#Qualifying). The result was viewed as a disaster and quickly cancelled.

— Maddenker