Commit

Auto stash before merge of "master" and "data_analysis/master"

husseinmleng committed Apr 6, 2023
1 parent f4a87b8 commit 518e9df
Showing 16 changed files with 11,283 additions and 11,283 deletions.
144 changes: 72 additions & 72 deletions Case Study 1/winequality.names
Citation Request:
This dataset is publicly available for research. The details are described in [Cortez et al., 2009].
Please include this citation if you plan to use this database:

P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
Modeling wine preferences by data mining from physicochemical properties.
In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.

Available at: [@Elsevier] http://dx.doi.org/10.1016/j.dss.2009.05.016
[Pre-press (pdf)] http://www3.dsi.uminho.pt/pcortez/winequality09.pdf
[bib] http://www3.dsi.uminho.pt/pcortez/dss09.bib

1. Title: Wine Quality

2. Sources
Created by: Paulo Cortez (Univ. Minho), Antonio Cerdeira, Fernando Almeida, Telmo Matos and Jose Reis (CVRVV) @ 2009

3. Past Usage:

P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
Modeling wine preferences by data mining from physicochemical properties.
In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.

In the above reference, two datasets were created, using red and white wine samples.
The inputs include objective tests (e.g. pH values) and the output is based on sensory data
(the median of at least 3 evaluations made by wine experts). Each expert graded the wine quality
between 0 (very bad) and 10 (excellent). Several data mining methods were applied to model
these datasets under a regression approach. The support vector machine model achieved the
best results. Several metrics were computed: MAD, the confusion matrix for a fixed error tolerance (T),
etc. We also plot the relative importance of the input variables (as measured by a sensitivity
analysis procedure).
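The tolerance-based evaluation described above can be sketched in NumPy. The scores and predictions here are hypothetical stand-ins, not values from the paper; `T` plays the role of the fixed error tolerance:

```python
import numpy as np

# Hypothetical sensory scores and model predictions on the 0-10 scale.
y_true = np.array([5, 6, 5, 7, 4, 6, 5, 8])
y_pred = np.array([5.2, 5.4, 5.1, 6.8, 4.9, 6.3, 4.7, 6.9])

# Mean absolute deviation (MAD) between predictions and sensory scores.
mad = np.mean(np.abs(y_pred - y_true))

# Accuracy under a fixed error tolerance T: a prediction counts as
# correct if it falls within T of the true score.
T = 0.5
acc_T = np.mean(np.abs(y_pred - y_true) <= T)

print(mad, acc_T)
```

The same tolerance idea extends to a confusion matrix by binning predictions to the nearest integer score before comparing.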

4. Relevant Information:

The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine.
For more details, consult: http://www.vinhoverde.pt/en/ or the reference [Cortez et al., 2009].
Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables
are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).

These datasets can be viewed as classification or regression tasks.
The classes are ordered and not balanced (e.g. there are many more normal wines than
excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent
or poor wines. Also, we are not sure that all input variables are relevant, so
it could be interesting to test feature selection methods.
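The class imbalance mentioned above is easy to inspect with `np.unique`. The quality scores below are illustrative, not drawn from the actual dataset:

```python
import numpy as np

# Hypothetical quality scores showing the skew described above:
# most wines are "normal" (5-6), few are poor or excellent.
quality = np.array([5, 5, 6, 5, 6, 6, 5, 7, 6, 5, 6, 5, 3, 6, 5, 8])

# Count how many samples fall in each class.
classes, counts = np.unique(quality, return_counts=True)
for c, n in zip(classes, counts):
    print(f"quality {c}: {n} samples")

# Imbalance ratio: largest class vs. smallest class.
ratio = counts.max() / counts.min()
```

A large ratio suggests stratified splits or class weighting when treating the task as classification.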

5. Number of Instances: red wine - 1599; white wine - 4898.

6. Number of Attributes: 11 + output attribute

Note: several of the attributes may be correlated, thus it makes sense to apply some sort of
feature selection.
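The correlation noted above can be measured with `np.corrcoef`. The two variables here are synthetic stand-ins constructed to be correlated, loosely mimicking free vs. total sulfur dioxide (the total includes the free fraction, so the two move together):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two correlated attributes.
free_so2 = rng.normal(30, 10, size=500)
total_so2 = free_so2 + rng.normal(80, 15, size=500)

# Pearson correlation matrix; the off-diagonal entry is the pairwise r.
r = np.corrcoef(free_so2, total_so2)[0, 1]
print(f"correlation: {r:.2f}")
```

Strongly correlated attribute pairs are natural candidates for dropping one member during feature selection.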

7. Attribute information:

For more information, read [Cortez et al., 2009].

Input variables (based on physicochemical tests):
1 - fixed acidity
2 - volatile acidity
3 - citric acid
4 - residual sugar
5 - chlorides
6 - free sulfur dioxide
7 - total sulfur dioxide
8 - density
9 - pH
10 - sulphates
11 - alcohol
Output variable (based on sensory data):
12 - quality (score between 0 and 10)

8. Missing Attribute Values: None
240 changes: 120 additions & 120 deletions Lesson_1/Detectors_v1.4.py
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

ITERATION = 1000
USER = 16
RECEIVER = 128

CONS_ALPHABET = np.array([[-1, 1]], dtype=complex)  # BPSK constellation
signal_energy_avg = np.mean(np.square(np.abs(CONS_ALPHABET)))

snr_db_list = list(range(-2, 21, 2))  # SNR sweep in dB


def mmse(s, n0, h, y):
    # Regularized (MMSE) inversion: (H^H H + (N0/Es) I)^-1 H^H y
    p1 = np.matmul(h.conj().T, y)
    p2 = np.matmul(h.conj().T, h) + (n0 / signal_energy_avg) * np.identity(USER)
    xhat = np.matmul(np.linalg.inv(p2), p1)

    # Nearest-neighbour decision against the constellation alphabet.
    v1 = np.matmul(xhat, np.ones([1, CONS_ALPHABET.size]))
    v2 = np.matmul(np.ones([USER, 1]), CONS_ALPHABET)
    idxhat = np.argmin(np.square(np.abs(v1 - v2)), axis=1)

    estimated_symbol = CONS_ALPHABET[:, idxhat]
    accuracy_mmse = np.equal(estimated_symbol.flatten(), np.transpose(s))

    error_mmse = 1 - (np.sum(accuracy_mmse) / USER)
    return error_mmse


def zero_forcing(s, h, y):
    # Unregularized channel inversion: (H^H H)^-1 H^H y
    p1 = np.matmul(h.conj().T, h)
    p2 = np.matmul(h.conj().T, y)
    xhat = np.matmul(np.linalg.inv(p1), p2)

    # Nearest-neighbour decision against the constellation alphabet.
    v1 = np.matmul(xhat, np.ones([1, CONS_ALPHABET.size]))
    v2 = np.matmul(np.ones([USER, 1]), CONS_ALPHABET)
    idxhat = np.argmin(np.square(np.abs(v1 - v2)), axis=1)

    estimated_symbol = CONS_ALPHABET[:, idxhat]
    accuracy_zf = np.equal(estimated_symbol.flatten(), np.transpose(s))

    error_zf = 1 - (np.sum(accuracy_zf) / USER)
    return error_zf


def matched_filter(s, h, y):
    # Matched filter: correlate the received signal with the channel.
    p1 = np.matmul(h.conj().T, y)

    # Scale by the Frobenius norm of the channel (equal to the largest
    # singular value of the flattened channel vector).
    p2 = np.linalg.norm(h)
    xhat = p1 * (1 / p2)

    # Nearest-neighbour decision against the constellation alphabet.
    v1 = np.matmul(xhat, np.ones([1, CONS_ALPHABET.size]))
    v2 = np.matmul(np.ones([USER, 1]), CONS_ALPHABET)
    idxhat = np.argmin(np.square(np.abs(v1 - v2)), axis=1)

    estimated_symbol = CONS_ALPHABET[:, idxhat]
    accuracy_mf = np.equal(estimated_symbol.flatten(), np.transpose(s))

    error_mf = 1 - (np.sum(accuracy_mf) / USER)
    return error_mf


bers_mmse_in_iter = np.zeros([len(snr_db_list), ITERATION])
bers_zf_in_iter = np.zeros([len(snr_db_list), ITERATION])
bers_mf_in_iter = np.zeros([len(snr_db_list), ITERATION])

for iter_snr in range(len(snr_db_list)):
    snr_db = snr_db_list[iter_snr]

    for rerun in range(ITERATION):
        # Random BPSK symbols for each user.
        transmitted_symbol = np.transpose(np.sign(np.random.rand(1, USER) - 0.5))

        SNR_lin = 10 ** (snr_db / 10)
        noise_variance = signal_energy_avg * USER / SNR_lin

        # Complex Gaussian noise and Rayleigh fading channel.
        noise = np.sqrt(0.5) * (norm.ppf(np.random.rand(RECEIVER, 1)) + (1j * norm.ppf(np.random.rand(RECEIVER, 1))))
        channel = np.sqrt(0.5) * (norm.ppf(np.random.rand(RECEIVER, USER)) + (1j * norm.ppf(np.random.rand(RECEIVER, USER))))

        received_signal = np.matmul(channel, transmitted_symbol) + np.sqrt(noise_variance) * noise

        bers_mmse_in_iter[iter_snr][rerun] = mmse(transmitted_symbol, noise_variance, channel, received_signal)
        bers_zf_in_iter[iter_snr][rerun] = zero_forcing(transmitted_symbol, channel, received_signal)
        bers_mf_in_iter[iter_snr][rerun] = matched_filter(transmitted_symbol, channel, received_signal)

bers_mmse = np.mean(bers_mmse_in_iter, axis=1)
bers_zf = np.mean(bers_zf_in_iter, axis=1)
bers_mf = np.mean(bers_mf_in_iter, axis=1)

print('mmse error rate', bers_mmse)
print('zero forcing error rate', bers_zf)
print('matched filter error rate', bers_mf)

plt.figure('Bit Error Rate')
plt.subplot(111)
plt.semilogy(snr_db_list, bers_mmse, color='black', marker='*', linestyle='-', linewidth=1, markersize=6, label='MMSE')
plt.semilogy(snr_db_list, bers_zf, color='blue', marker='d', linestyle='-', linewidth=1, markersize=5, label='ZF')
plt.semilogy(snr_db_list, bers_mf, color='red', marker='x', linestyle='-', linewidth=1, markersize=6, label='MF')
plt.title('BER vs SNR')
plt.xlabel('SNR(dB)')
plt.xlim(-2, 16)
plt.xscale('linear')
plt.ylabel('BER')
plt.ylim(0.00001, 1)
plt.grid(True, which='major', color='#666666', linestyle='--')
plt.legend(title='Detectors:')
plt.show()
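A quick sanity check for the zero-forcing detector above: in the noiseless case, inverting the channel via the pseudo-inverse recovers the transmitted BPSK symbols exactly. This is a minimal self-contained sketch (small hypothetical dimensions, not part of the original script):

```python
import numpy as np

rng = np.random.default_rng(1)
users, receivers = 4, 16

# Random complex Rayleigh-style channel and BPSK symbols.
h = (rng.standard_normal((receivers, users))
     + 1j * rng.standard_normal((receivers, users))) / np.sqrt(2)
x = np.sign(rng.random((users, 1)) - 0.5)

# Noiseless received signal and zero-forcing estimate.
y = h @ x
xhat = np.linalg.pinv(h) @ y

# Hard decision back onto the {-1, +1} alphabet.
decided = np.sign(xhat.real)

assert np.allclose(decided, x)
print("zero-forcing recovers all symbols in the noiseless case")
```

With noise added, the three detectors diverge: MMSE degrades gracefully at low SNR, while pure inversion amplifies noise, which is what the BER curves in the script illustrate.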