Conversation

@andreasnoack
Member

@andreasnoack andreasnoack commented Oct 30, 2025

This implements the confidence interval as described in Cleveland and Grosse (1991), except for the statistical approximation of the deltas described in section 4 of the paper. They don't seem to share the coefficients of their fit, so it is not easy to implement that part. In addition, computers have much more memory these days, so I don't think the big matrix is a problem in most cases. I'd be interested to hear if anybody knows of more recent approaches to calculating the deltas without the need for the big matrix.
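For readers unfamiliar with the deltas, here is a minimal sketch of the exact (non-approximated) quantities for a generic linear smoother with fitted values `ŷ = L * y`. The helper names `smoother_deltas` and `smoother_stderr` are hypothetical and are not part of this PR or of Loess.jl. The sketch assumes the full dense operator matrix `L` is available, which is exactly the "big matrix" mentioned above. The bounds at a point are then the fitted value plus or minus a t quantile with `ρ` degrees of freedom times the standard error.

```julia
using LinearAlgebra

# Hypothetical helper (not from Loess.jl): delta quantities for a linear
# smoother with operator matrix L, i.e. fitted values ŷ = L * y.
function smoother_deltas(L::AbstractMatrix)
    R = I - L              # residual operator, e = R * y
    A = R' * R             # (I - L)'(I - L), a dense n × n matrix
    δ1 = tr(A)
    δ2 = sum(abs2, A)      # equals tr(A^2) because A is symmetric
    return (δ1 = δ1, δ2 = δ2, ρ = δ1^2 / δ2)
end

# Hypothetical helper: pointwise standard errors of the fitted values.
function smoother_stderr(L::AbstractMatrix, y::AbstractVector)
    d = smoother_deltas(L)
    resid = y - L * y
    s = sqrt(sum(abs2, resid) / d.δ1)              # residual scale estimate
    se = s .* sqrt.(vec(sum(abs2, L; dims = 2)))   # s * norm of each row of L
    return (se = se, s = s, ρ = d.ρ)
end

# Sanity check with an ordinary least squares hat matrix, where the
# lookup degrees of freedom ρ should reduce to n - p.
n = 50
x = collect(range(0, 1; length = n))
X = [ones(n) x]
H = X * ((X' * X) \ Matrix(X'))
d = smoother_deltas(H)
res = smoother_stderr(H, sqrt.(x))
```

For a projection such as the least squares hat matrix, both deltas equal `n - p`, so the interval reduces to the familiar t interval of linear regression. For a Loess operator the deltas generally differ, which is why they need either the big matrix or the approximation from the paper.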

I tried to mimic the interface for predict in GLM but decided to use a struct to return the expensive helper quantities together with the confidence bounds. I'm not a big fan of using Symbols for finite sets of options, as is done here, but that is what GLM currently does. Maybe we should change it, but that is a separate concern.

With this, you can construct this plot:

Screenshot 2025-10-29 at 23 12 50

The same plot with ggplot2, which uses Cleveland and Grosse's code for the computations, is:

Screenshot 2025-10-29 at 23 12 59

Closes #29

@andreasnoack
Member Author

The new implementation stores the rows of the hat matrix (and a bit more) for each vertex when constructing the Loess fit. That requires much more memory than the old implementation, and it causes the benchmark runs to run out of memory because they test relatively large problems, see

```julia
for i in 2:6
    n = 10^i
    x = rand(MersenneTwister(42), n)
    y = sqrt.(x)
    SUITE["random"][string(n)] = @benchmarkable loess($x, $y)
end
```
It looks like R might be avoiding these rows in the loess function and only constructs them when you call predict on the loess object, but then you pay the price of essentially recomputing all the local fits in predict. In my opinion, the main use of Loess these days is for visualization with uncertainty bounds, so I'm leaning towards just accepting that the implementation can't handle problems as large as before.
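To give a rough sense of scale, here is a back-of-the-envelope sketch of the memory a dense `n × n` `Float64` operator would need at the benchmark sizes. This is an illustrative upper bound under the assumption of one stored row per observation. The implementation stores rows per vertex, so its actual footprint depends on the number of vertices and is not computed here. The helper name `dense_operator_bytes` is hypothetical.

```julia
# Hypothetical helper: bytes needed for a dense n × n Float64 matrix.
dense_operator_bytes(n::Integer) = sizeof(Float64) * n^2

# Same problem sizes as the benchmark loop.
for i in 2:6
    n = 10^i
    gib = dense_operator_bytes(n) / 2^30
    println("n = ", n, ": ", round(gib; digits = 3), " GiB")
end
```

At `n = 10^6` this is 8 × 10^12 bytes (about 8 TB), so anything that scales with the full matrix is out of reach for the largest benchmark case on ordinary hardware.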

Successfully merging this pull request may close these issues.

ENH: Add confidence intervals / standard errors?
