Commit cb6e381

add linear regression examples
1 parent 6c004d1 commit cb6e381

4 files changed

+214
-0
lines changed

linear_regression_class/mlr02.xls

751 Bytes
Binary file not shown.

linear_regression_class/moore.csv

+102
@@ -0,0 +1,102 @@
Intel 4004 2,300 1971 Intel 10,000 nm 12 mm²
Intel 8008 3,500 1972 Intel 10,000 nm 14 mm²
Intel 8080 4,500 1974 Intel 6,000 nm 20 mm²
Motorola 6800 4,100 1974 Motorola 6,000 nm 16 mm²
RCA 1802 5,000 1974 RCA 5,000 nm 27 mm²
TMS 1000 8,000 1974[7] Texas Instruments 8,000 nm
MOS Technology 6502 3,510[8] 1975 MOS Technology 8,000 nm 21 mm²
Intel 8085 6,500 1976 Intel 3,000 nm 20 mm²
Zilog Z80 8,500 1976 Zilog 4,000 nm 18 mm²
Intel 8086 29,000 1978 Intel 3,000 nm 33 mm²
Motorola 6809 9,000 1978 Motorola 5,000 nm 21 mm²
Intel 8088 29,000 1979 Intel 3,000 nm 33 mm²
Motorola 68000 68,000 1979 Motorola 3,500 nm 44 mm²
WDC 65C02 11,500[9] 1981 WDC 3,000 nm 6 mm²
Intel 80186 55,000 1982 Intel 3,000 nm 60 mm²
Intel 80286 134,000 1982 Intel 1,500 nm 49 mm²
WDC 65C816 22,000[10] 1983 WDC 9 mm²
Motorola 68020 190,000[11] 1984 Motorola 2,000 nm 85 mm²
ARM 1 25,000[11] 1985 Acorn 3,000 nm 50 mm²
Intel 80386 275,000 1985 Intel 1,500 nm 104 mm²
Novix NC4016 16,000[12] 1985[13] Harris Corporation 3,000 nm[14]
ARM 2 30,000[11] 1986 Acorn 2,000 nm 30 mm²
TI Explorer's 32-bit Lisp machine chip 553,000[15] 1987 Texas Instruments
DEC WRL MultiTitan 180,000[16] 1988 DEC WRL 1,500 nm 61 mm²
Intel i960 250,000[17] 1988 Intel 600 nm
ARM 3 300,000 1989 Acorn
Intel 80486 1,180,235 1989 Intel 1000 nm 173 mm²
ARM 6 35,000 1991 ARM
R4000 1,350,000 1991 MIPS 1,000 nm 213 mm²
Pentium 3,100,000 1993 Intel 800 nm 294 mm²
ARM700 578,977[18] 1994 ARM 68.51 mm²
Pentium Pro 5,500,000[19] 1995 Intel 500 nm 307 mm²
SA-110 2,500,000[11] 1995 Acorn/DEC/Apple 350 nm 50 mm²
AMD K5 4,300,000 1996 AMD 500 nm 251 mm²
AMD K6 8,800,000 1997 AMD 350 nm 162 mm²
Pentium II Klamath 7,500,000 1997 Intel 350 nm 195 mm²
Pentium II Deschutes 7,500,000 1998 Intel 250 nm 113 mm²
AMD K6-III 21,300,000 1999 AMD 250 nm 118 mm²
AMD K7 22,000,000 1999 AMD 250 nm 184 mm²
ARM 9TDMI 111,000[11] 1999 Acorn 350 nm 4.8 mm²
Pentium II Mobile Dixon 27,400,000 1999 Intel 180 nm 180 mm²
Pentium III Katmai 9,500,000 1999 Intel 250 nm 128 mm²
Pentium 4 Willamette 42,000,000 2000 Intel 180 nm 217 mm²
Pentium III Coppermine 21,000,000 2000 Intel 180 nm 80 mm²
Pentium III Tualatin 45,000,000 2001 Intel 130 nm 81 mm²
Itanium 2 McKinley 220,000,000 2002 Intel 180 nm 421 mm²
Pentium 4 Northwood 55,000,000 2002 Intel 130 nm 145 mm²
AMD K8 105,900,000 2003 AMD 130 nm 193 mm²
Barton 54,300,000 2003 AMD 130 nm 101 mm²
Itanium 2 Madison 6M 410,000,000 2003 Intel 130 nm 374 mm²
Itanium 2 with 9 MB cache 592,000,000 2004 Intel 130 nm 432 mm²
Pentium 4 Prescott 112,000,000 2004 Intel 90 nm 110 mm²
Pentium 4 Prescott-2M 169,000,000 2005 Intel 90 nm 143 mm²
Pentium D Smithfield 228,000,000 2005 Intel 90 nm 206 mm²
Cell 241,000,000 2006 Sony/IBM/Toshiba 90 nm 221 mm²
Core 2 Duo Conroe 291,000,000 2006 Intel 65 nm 143 mm²
Dual-core Itanium 2 1,700,000,000[26] 2006 Intel 90 nm 596 mm²
Pentium 4 Cedar Mill 184,000,000 2006 Intel 65 nm 90 mm²
Pentium D Presler 362,000,000 2006 Intel 65 nm 162 mm²
AMD K10 quad-core 2M L3 463,000,000[20] 2007 AMD 65 nm 283 mm²
ARM Cortex-A9 26,000,000[21] 2007 ARM 45 nm 31 mm²
Core 2 Duo Allendale 169,000,000 2007 Intel 65 nm 111 mm²
Core 2 Duo Wolfdale 411,000,000 2007 Intel 45 nm 107 mm²
POWER6 789,000,000 2007 IBM 65 nm 341 mm²
AMD K10 quad-core 6M L3 758,000,000[20] 2008 AMD 45 nm 258 mm²
Atom 47,000,000 2008 Intel 45 nm 24 mm²
Core 2 Duo Wolfdale 3M 230,000,000 2008 Intel 45 nm 83 mm²
Core i7 (Quad) 731,000,000 2008 Intel 45 nm 263 mm²
Six-core Xeon 7400 1,900,000,000 2008 Intel 45 nm 503 mm²
Six-core Opteron 2400 904,000,000 2009 AMD 45 nm 346 mm²
16-core SPARC T3 1,000,000,000[22] 2010 Sun/Oracle 40 nm 377 mm²
8-core POWER7 32M L3 1,200,000,000 2010 IBM 45 nm 567 mm²
8-core Xeon Nehalem-EX 2,300,000,000[30] 2010 Intel 45 nm 684 mm²
Quad-core Itanium Tukwila 2,000,000,000[28] 2010 Intel 65 nm 699 mm²
Quad-core z196[24] 1,400,000,000 2010 IBM 45 nm 512 mm²
Six-core Core i7 (Gulftown) 1,170,000,000 2010 Intel 32 nm 240 mm²
10-core Xeon Westmere-EX 2,600,000,000 2011 Intel 32 nm 512 mm²
Quad-core + GPU Core i7 1,160,000,000 2011 Intel 32 nm 216 mm²
Six-core Core i7/8-core Xeon E5 (Sandy Bridge-E/EP) 2,270,000,000[29] 2011 Intel 32 nm 434 mm²
61-core Xeon Phi 5,000,000,000[34] 2012 Intel 22 nm 350 mm²
8-core AMD Bulldozer 1,200,000,000[23] 2012 AMD 32 nm 315 mm²
8-core Itanium Poulson 3,100,000,000 2012 Intel 32 nm 544 mm²
8-core POWER7+ 80 MB L3 cache 2,100,000,000 2012 IBM 32 nm 567 mm²
Quad-core + GPU AMD Trinity 1,303,000,000 2012 AMD 32 nm 246 mm²
Quad-core + GPU Core i7 Ivy Bridge 1,400,000,000 2012 Intel 22 nm 160 mm²
Six-core zEC12 2,750,000,000 2012 IBM 32 nm 597 mm²
12-core POWER8 4,200,000,000 2013 IBM 22 nm 650 mm²
Apple A7 (dual-core ARM64 "mobile SoC") 1,000,000,000 2013 Apple 28 nm 102 mm²
Six-core Core i7 Ivy Bridge E 1,860,000,000 2013 Intel 22 nm 256 mm²
Xbox One main SoC 5,000,000,000 2013 Microsoft/AMD 28 nm 363 mm²
15-core Xeon Ivy Bridge-EX 4,310,000,000[33] 2014 Intel 22 nm 541 mm²
18-core Xeon Haswell-E5 5,560,000,000[35] 2014 Intel 22 nm 661 mm²
8-core Core i7 Haswell-E 2,600,000,000[31] 2014 Intel 22 nm 355 mm²
Apple A8 (dual-core ARM64 "mobile SoC") 2,000,000,000 2014 Apple 20 nm 89 mm²
Apple A8X (tri-core ARM64 "mobile SoC") 3,000,000,000[32] 2014 Apple 20 nm 128 mm²
Quad-core + GPU Core i7 Haswell 1,400,000,000[25] 2014 Intel 22 nm 177 mm²
Duo-core + GPU Iris Core i7 Broadwell-U 1,900,000,000[27] 2015 Intel 14 nm 133 mm²
IBM z13 3,990,000,000 2015 IBM 22 nm 678 mm²
IBM z13 Storage Controller 7,100,000,000 2015 IBM 22 nm 678 mm²
Quad-core + GPU GT2 Core i7 Skylake K cca 1,750,000,000 2015 Intel 14 nm 122 mm²
SPARC M7 10,000,000,000[37] 2015 Oracle 20 nm
22-core Xeon Broadwell-E5 ~7,200,000,000[36] 2016 Intel 14 nm 456 mm²
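
The transistor-count and year fields in the rows above carry thousands separators, bracketed footnote markers such as `[36]`, and occasional prefixes such as `~` or `cca`. Below is a minimal sketch of cleaning one such field. It assumes the columns are tab-separated in the order processor, transistor count, year (the layout `moore.py` relies on); the helper name `parse_int_field` is illustrative and not part of the commit.

```python
import re

# Matches every run of characters that is not a digit.
NON_DIGITS = re.compile(r'[^\d]+')

def parse_int_field(field):
    """Drop a trailing footnote marker like '[36]', then strip all non-digits."""
    without_reference = field.split('[')[0]
    return int(NON_DIGITS.sub('', without_reference))

# One row from the data, written with the assumed tab separators.
sample_row = "22-core Xeon Broadwell-E5\t~7,200,000,000[36]\t2016\tIntel\t14 nm\t456 mm²"
columns = sample_row.split('\t')

transistor_count = parse_int_field(columns[1])
year = parse_int_field(columns[2])
print(transistor_count, year)
```

Splitting on `[` before stripping non-digits matters: otherwise the digits of the footnote number would be appended to the value.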

linear_regression_class/moore.py

+61
@@ -0,0 +1,61 @@
# Shows how linear regression analysis can be applied to Moore's law.
#
# Notes for this course can be found at:
# https://www.udemy.com/data-science-linear-regression-in-python
# Transistor counts are from: https://en.wikipedia.org/wiki/Transistor_count
import re
import numpy as np
import matplotlib.pyplot as plt

X = []
Y = []

# some numbers show up as 1,170,000,000 (commas)
# some numbers have references in square brackets after them
non_decimal = re.compile(r'[^\d]+')

with open('moore.csv') as f:
    for line in f:
        r = line.split('\t')

        # column 2 is the year, column 1 is the transistor count
        x = int(non_decimal.sub('', r[2].split('[')[0]))
        y = int(non_decimal.sub('', r[1].split('[')[0]))
        X.append(x)
        Y.append(y)


X = np.array(X)
Y = np.array(Y)

plt.scatter(X, Y)
plt.show()

# the counts grow roughly exponentially, so fit a line to log(count) instead
Y = np.log(Y)
plt.scatter(X, Y)
plt.show()

# copied from lr_1d.py
denominator = X.dot(X) - X.mean() * X.sum()
a = ( X.dot(Y) - Y.mean()*X.sum() ) / denominator
b = ( Y.mean() * X.dot(X) - X.mean() * X.dot(Y) ) / denominator

# let's calculate the predicted Y
Yhat = a*X + b

plt.scatter(X, Y)
plt.plot(X, Yhat)
plt.show()

# determine how good the model is by computing the r-squared
d1 = Y - Yhat
d2 = Y - Y.mean()
r2 = 1 - d1.dot(d1) / d2.dot(d2)
print("a:", a, "b:", b)
print("the r-squared is:", r2)

# how long does it take to double?
# log(transistorcount) = a*year + b
# transistorcount = exp(b) * exp(a*year)
# 2*transistorcount = 2 * exp(b) * exp(a*year) = exp(ln(2)) * exp(b) * exp(a * year) = exp(b) * exp(a * year + ln(2))
# a*year2 = a*year1 + ln2
# year2 = year1 + ln2/a
print("time to double:", np.log(2)/a, "years")
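
The closed-form slope and intercept and the `ln(2)/a` doubling-time formula can be cross-checked against `np.polyfit`. The sketch below does not read `moore.csv`; it uses synthetic data, a count that doubles exactly every two years (an assumption chosen so the answer is known in advance). The function name `fit_line` is illustrative.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares slope a and intercept b for y ~ a*x + b (same formulas as moore.py)."""
    denominator = x.dot(x) - x.mean() * x.sum()
    a = (x.dot(y) - y.mean() * x.sum()) / denominator
    b = (y.mean() * x.dot(x) - x.mean() * x.dot(y)) / denominator
    return a, b

# Synthetic data (an assumption, not the real table):
# 2,300 transistors in 1971, doubling exactly every 2 years.
years = np.arange(1971, 2017, dtype=float)
counts = 2300.0 * 2.0 ** ((years - 1971) / 2.0)
log_counts = np.log(counts)

a, b = fit_line(years, log_counts)

# Independent reference fit: polyfit returns the highest-degree coefficient first.
a_ref, b_ref = np.polyfit(years, log_counts, 1)

doubling_time = np.log(2) / a
print("a:", a, "polyfit a:", a_ref)
print("b:", b, "polyfit b:", b_ref)
print("time to double:", doubling_time, "years")
```

Because the synthetic counts double every two years by construction, the printed doubling time should come out at about 2 years, and both fits should agree on `a` and `b`.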

linear_regression_class/systolic.py

+51
@@ -0,0 +1,51 @@
# need to sudo pip install xlrd to use pd.read_excel
# data is from:
# http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/mlr/frames/mlr02.html

# The data (X1, X2, X3) are for each patient.
# X1 = systolic blood pressure
# X2 = age in years
# X3 = weight in pounds

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.read_excel('mlr02.xls')
X = df.to_numpy()  # replaces df.as_matrix(), which was removed in pandas 1.0

# using age to predict systolic blood pressure
plt.scatter(X[:,1], X[:,0])
plt.show()
# looks pretty linear!

# using weight to predict systolic blood pressure
plt.scatter(X[:,2], X[:,0])
plt.show()
# looks pretty linear!

# add a constant column so the model has a bias (intercept) term
df['ones'] = 1
Y = df['X1']
X = df[['X2', 'X3', 'ones']]
X2only = df[['X2', 'ones']]
X3only = df[['X3', 'ones']]

def get_r2(X, Y):
    # solve the normal equations (X^T X) w = X^T Y for the weights
    w = np.linalg.solve( X.T.dot(X), X.T.dot(Y) )
    Yhat = X.dot(w)

    # determine how good the model is by computing the r-squared
    d1 = Y - Yhat
    d2 = Y - Y.mean()
    r2 = 1 - d1.dot(d1) / d2.dot(d2)
    return r2

print("r2 for x2 only:", get_r2(X2only, Y))
print("r2 for x3 only:", get_r2(X3only, Y))
print("r2 for both:", get_r2(X, Y))
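
`get_r2` solves the normal equations `(X^T X) w = X^T Y` and then compares the residual sum of squares with the total sum of squares. The sketch below runs the same computation on plain NumPy arrays and checks the weights against `np.linalg.lstsq`. The data are synthetic stand-ins: the coefficients, ranges and noise level are made up for illustration and are not the values in `mlr02.xls`.

```python
import numpy as np

def get_r2(X, Y):
    """R-squared of the least-squares fit Y ~ X.dot(w), via the normal equations."""
    w = np.linalg.solve(X.T.dot(X), X.T.dot(Y))
    Yhat = X.dot(w)
    d1 = Y - Yhat
    d2 = Y - Y.mean()
    return 1 - d1.dot(d1) / d2.dot(d2)

# Synthetic stand-in data (assumed values, NOT the contents of mlr02.xls).
rng = np.random.default_rng(0)
n = 50
age = rng.uniform(40, 80, size=n)
weight = rng.uniform(130, 230, size=n)
pressure = 30.0 + 0.9 * age + 0.3 * weight + rng.normal(0, 2.0, size=n)

# A column of ones gives each model an intercept term.
ones = np.ones(n)
X_both = np.column_stack([age, weight, ones])
X_age = np.column_stack([age, ones])
X_weight = np.column_stack([weight, ones])

r2_age = get_r2(X_age, pressure)
r2_weight = get_r2(X_weight, pressure)
r2_both = get_r2(X_both, pressure)

# Independent check of the weights with NumPy's least-squares solver.
w_normal = np.linalg.solve(X_both.T.dot(X_both), X_both.T.dot(pressure))
w_lstsq, *_ = np.linalg.lstsq(X_both, pressure, rcond=None)

print("r2 for age only:", r2_age)
print("r2 for weight only:", r2_weight)
print("r2 for both:", r2_both)
print("weights agree:", np.allclose(w_normal, w_lstsq))
```

When the models are nested and share the intercept column, the R² of the two-predictor fit cannot be lower than that of either single-predictor fit, which is the comparison the script prints.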
