Improve speed and accuracy for correlation() #26135

Merged
merged 16 commits into from
May 15, 2021
Consistently use fsum()
rhettinger committed May 14, 2021
commit 9d05de138ff7c25e079dffc9453b7e6b023bbdd6
8 changes: 4 additions & 4 deletions Lib/statistics.py
@@ -882,8 +882,8 @@ def covariance(x, y, /):
         raise StatisticsError('covariance requires that both inputs have same number of data points')
     if n < 2:
         raise StatisticsError('covariance requires at least two data points')
-    xbar = fmean(x)
-    ybar = fmean(y)
+    xbar = fsum(x) / n
+    ybar = fsum(y) / n
     sxy = fsum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
     return sxy / (n - 1)

@@ -910,8 +910,8 @@ def correlation(x, y, /):
         raise StatisticsError('correlation requires that both inputs have same number of data points')
     if n < 2:
         raise StatisticsError('correlation requires at least two data points')
-    xbar = fmean(x)
-    ybar = fmean(y)
+    xbar = fsum(x) / n
+    ybar = fsum(y) / n
     sxy = fsum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
     s2x = fsum((xi - xbar) ** 2.0 for xi in x)
     s2y = fsum((yi - ybar) ** 2.0 for yi in y)
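For readers who want to try the changed lines outside of `Lib/statistics.py`, here is a minimal standalone sketch of both computations after this commit. The function names `covariance_sketch` and `correlation_sketch` are hypothetical, `ValueError` stands in for `StatisticsError`, and the final `return` of the correlation sketch is an assumption (the textbook Pearson formula), since the diff does not show the rest of `correlation()`.

```python
from math import fsum, sqrt


def _check_inputs(x, y):
    # Hypothetical helper: mirrors the length checks visible in the diff,
    # but raises ValueError instead of statistics.StatisticsError.
    n = len(x)
    if len(y) != n:
        raise ValueError('both inputs must have the same number of data points')
    if n < 2:
        raise ValueError('at least two data points are required')
    return n


def covariance_sketch(x, y):
    n = _check_inputs(x, y)
    # Means computed with fsum() / n, as in the lines added by this commit.
    xbar = fsum(x) / n
    ybar = fsum(y) / n
    sxy = fsum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / (n - 1)


def correlation_sketch(x, y):
    n = _check_inputs(x, y)
    xbar = fsum(x) / n
    ybar = fsum(y) / n
    sxy = fsum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    s2x = fsum((xi - xbar) ** 2.0 for xi in x)
    s2y = fsum((yi - ybar) ** 2.0 for yi in y)
    # Assumption: the diff ends before the return statement; this is the
    # standard Pearson formula. It raises ZeroDivisionError when either
    # input is constant.
    return sxy / sqrt(s2x * s2y)


if __name__ == '__main__':
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 6, 8, 10]
    print(covariance_sketch(x, y))
    print(correlation_sketch(x, y))
```

Both sketches take sequences rather than arbitrary iterables, because the inputs are traversed more than once.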