First, I could be wrong, but I think your solution will not work when the kernel density estimates (KDEs) intersect at more than one point. Second, although the overlap
package was written for time-of-day data, you can still use it to estimate the area of overlap of any two KDEs. You just need to rescale your data so that it lies between 0 and 2π.
For example:
# Simulate two samples
a <- rnorm(100)
b <- rnorm(100, 2)
# To use overlapTrue() from the overlap package, the scale must be in radians
# (i.e. 0 to 2*pi).
# To keep the *relative* values of a and b the same, combine a and b in the
# same data frame before rescaling. You'll need to load the 'scales' library.
# But first add a "Source" column to be able to distinguish between a and b
# after they are combined.
a <- data.frame(value = a, Source = "a")
b <- data.frame(value = b, Source = "b")
d <- rbind(a, b)
library(scales)
d$value <- rescale( d$value, to = c(0,2*pi) )
# Now you can create the rescaled a and b vectors
a <- d[d$Source == "a", 1]
b <- d[d$Source == "b", 1]
# You can then calculate the area of overlap as you did previously.
# It should give almost exactly the same answers.
# Or you can use either the overlapTrue() or the overlapEst() function
# provided with the overlap package.
# Note that overlapEst() fits the KDEs using a von Mises kernel, while
# density() below uses a Gaussian kernel by default.
library(overlap)
# Using overlapTrue():
# define limits of a common grid, adding a buffer so that tails aren't cut off
lower <- min(d$value)-1
upper <- max(d$value)+1
# generate kernel densities
da <- density(a, from=lower, to=upper, adjust = 1)
db <- density(b, from=lower, to=upper, adjust = 1)
# Compute overlap coefficient
overlapTrue(da$y,db$y)
# Using overlapEst():
overlapEst(a, b, kmax = 3, adjust=c(0.8, 1, 4), n.grid = 500)
# You can also plot the two KDEs and the region of overlap using overlapPlot(),
# but sadly I haven't found a way of changing the x scale so that its range
# corresponds to the initial x values rather than the rescaled values.
# You can only change the maximum value of the scale using the xscale argument
# (i.e. it always ranges from 0 to n, where n is set with xscale = n).
# So if some of your data take negative values, you're probably better off with
# a different plotting method. You can change the x label with the xlab
# argument.
overlapPlot(a, b, xscale = 10, xlab = "x metrics", rug = TRUE)
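
If you would rather stay on the original x scale (for instance because some values are negative), here is a rough base-R sketch of one alternative. It integrates the pointwise minimum of the two densities on a shared grid, so it never needs the intersection points and copes with KDEs that cross several times. Note the assumptions: `overlap_coef()` is a hypothetical helper I made up for illustration, it is not part of the overlap package, and it uses the default Gaussian kernel of `density()` rather than a von Mises kernel, so its value will differ somewhat from what `overlapEst()` returns.

```r
# Hypothetical helper (NOT from the overlap package): overlap coefficient of
# two samples, estimated from Gaussian KDEs evaluated on a shared grid.
overlap_coef <- function(x, y, n = 1024) {
  # Pad the grid by 3 bandwidths so that the tails are not cut off
  buffer <- 3 * max(bw.nrd0(x), bw.nrd0(y))
  lower <- min(c(x, y)) - buffer
  upper <- max(c(x, y)) + buffer
  dx <- density(x, from = lower, to = upper, n = n)
  dy <- density(y, from = lower, to = upper, n = n)
  step <- dx$x[2] - dx$x[1]
  # Riemann sum of the pointwise minimum of the two curves; no intersection
  # points are required, so multiple crossings are handled automatically.
  list(grid = dx$x, dx = dx$y, dy = dy$y,
       overlap = sum(pmin(dx$y, dy$y)) * step)
}

set.seed(1)
a_raw <- rnorm(100)      # unscaled samples, kept separate from a and b above
b_raw <- rnorm(100, 2)
res <- overlap_coef(a_raw, b_raw)
res$overlap

# Plot on the original scale, shading the region of overlap
if (interactive()) {
  plot(res$grid, res$dx, type = "n", xlab = "x metrics", ylab = "Density",
       ylim = range(0, res$dx, res$dy))
  polygon(c(res$grid, rev(res$grid)),
          c(pmin(res$dx, res$dy), rep(0, length(res$grid))),
          col = "grey80", border = NA)
  lines(res$grid, res$dx)
  lines(res$grid, res$dy, lty = 2)
}
```

Because the bandwidths and the grid both scale with the data, a linear rescaling such as the 0 to 2π mapping above should leave this estimate essentially unchanged.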