Commit

Auto stash before merge of "master" and "data_analysis/master"

husseinmleng committed Apr 6, 2023
1 parent f4a87b8 commit 518e9df
Showing 16 changed files with 11,283 additions and 11,283 deletions.
144 changes: 72 additions & 72 deletions Case Study 1/winequality.names
Citation Request:
This dataset is publicly available for research. The details are described in [Cortez et al., 2009].
Please include this citation if you plan to use this database:

P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
Modeling wine preferences by data mining from physicochemical properties.
In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.

Available at: [@Elsevier] http://dx.doi.org/10.1016/j.dss.2009.05.016
[Pre-press (pdf)] http://www3.dsi.uminho.pt/pcortez/winequality09.pdf
[bib] http://www3.dsi.uminho.pt/pcortez/dss09.bib

1. Title: Wine Quality

2. Sources
Created by: Paulo Cortez (Univ. Minho), Antonio Cerdeira, Fernando Almeida, Telmo Matos and Jose Reis (CVRVV) @ 2009

3. Past Usage:

P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
Modeling wine preferences by data mining from physicochemical properties.
In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.

In the above reference, two datasets were created, using red and white wine samples.
The inputs include objective tests (e.g. pH values) and the output is based on sensory data
(the median of at least 3 evaluations made by wine experts). Each expert graded the wine quality
between 0 (very bad) and 10 (excellent). Several data mining methods were applied to model
these datasets under a regression approach. The support vector machine model achieved the
best results. Several metrics were computed: MAD, the confusion matrix for a fixed error tolerance (T),
etc. We also plot the relative importance of the input variables (as measured by a sensitivity
analysis procedure).
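The tolerance-based evaluation described above can be sketched in NumPy. The scores and predictions here are hypothetical stand-ins, not values from the paper; `T` plays the role of the fixed error tolerance:

```python
import numpy as np

# Hypothetical sensory scores and model predictions on the 0-10 scale.
y_true = np.array([5, 6, 5, 7, 4, 6, 5, 8])
y_pred = np.array([5.2, 5.4, 5.1, 6.8, 4.9, 6.3, 4.7, 6.9])

# Mean absolute deviation (MAD) between predictions and sensory scores.
mad = np.mean(np.abs(y_pred - y_true))

# Accuracy under a fixed error tolerance T: a prediction counts as
# correct if it falls within T of the true score.
T = 0.5
acc_T = np.mean(np.abs(y_pred - y_true) <= T)

print(mad, acc_T)
```

The same tolerance idea extends to a confusion matrix by binning predictions to the nearest integer score before comparing.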

4. Relevant Information:

The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine.
For more details, consult: http://www.vinhoverde.pt/en/ or the reference [Cortez et al., 2009].
Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables
are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).

These datasets can be viewed as classification or regression tasks.
The classes are ordered and not balanced (e.g. there are many more normal wines than
excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent
or poor wines. Also, we are not sure that all input variables are relevant, so
it could be interesting to test feature selection methods.
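The class imbalance mentioned above is easy to inspect with `np.unique`. The quality scores below are illustrative, not drawn from the actual dataset:

```python
import numpy as np

# Hypothetical quality scores showing the skew described above:
# most wines are "normal" (5-6), few are poor or excellent.
quality = np.array([5, 5, 6, 5, 6, 6, 5, 7, 6, 5, 6, 5, 3, 6, 5, 8])

# Count how many samples fall in each class.
classes, counts = np.unique(quality, return_counts=True)
for c, n in zip(classes, counts):
    print(f"quality {c}: {n} samples")

# Imbalance ratio: largest class vs. smallest class.
ratio = counts.max() / counts.min()
```

A large ratio suggests stratified splits or class weighting when treating the task as classification.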

5. Number of Instances: red wine - 1599; white wine - 4898.

6. Number of Attributes: 11 + output attribute

Note: several of the attributes may be correlated, thus it makes sense to apply some sort of
feature selection.
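The correlation noted above can be measured with `np.corrcoef`. The two variables here are synthetic stand-ins constructed to be correlated, loosely mimicking free vs. total sulfur dioxide (the total includes the free fraction, so the two move together):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two correlated attributes.
free_so2 = rng.normal(30, 10, size=500)
total_so2 = free_so2 + rng.normal(80, 15, size=500)

# Pearson correlation matrix; the off-diagonal entry is the pairwise r.
r = np.corrcoef(free_so2, total_so2)[0, 1]
print(f"correlation: {r:.2f}")
```

Strongly correlated attribute pairs are natural candidates for dropping one member during feature selection.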

7. Attribute information:

For more information, read [Cortez et al., 2009].

Input variables (based on physicochemical tests):
1 - fixed acidity
2 - volatile acidity
3 - citric acid
4 - residual sugar
5 - chlorides
6 - free sulfur dioxide
7 - total sulfur dioxide
8 - density
9 - pH
10 - sulphates
11 - alcohol
Output variable (based on sensory data):
12 - quality (score between 0 and 10)

8. Missing Attribute Values: None
240 changes: 120 additions & 120 deletions Lesson_1/Detectors_v1.4.py
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

ITERATION = 1000
USER = 16
RECEIVER = 128

CONS_ALPHABET = np.array([[-1, 1]], dtype=complex)  # BPSK constellation
signal_energy_avg = np.mean(np.square(np.abs(CONS_ALPHABET)))

snr_db_list = list(range(-2, 21, 2))  # SNR sweep in dB


def mmse(s, n0, h, y):
    # Regularized (MMSE) inversion: (H^H H + (N0/Es) I)^-1 H^H y
    p1 = np.matmul(h.conj().T, y)
    p2 = np.matmul(h.conj().T, h) + (n0 / signal_energy_avg) * np.identity(USER)
    xhat = np.matmul(np.linalg.inv(p2), p1)

    # Nearest-neighbour decision against the constellation alphabet.
    v1 = np.matmul(xhat, np.ones([1, CONS_ALPHABET.size]))
    v2 = np.matmul(np.ones([USER, 1]), CONS_ALPHABET)
    idxhat = np.argmin(np.square(np.abs(v1 - v2)), axis=1)

    estimated_symbol = CONS_ALPHABET[:, idxhat]
    accuracy_mmse = np.equal(estimated_symbol.flatten(), np.transpose(s))

    error_mmse = 1 - (np.sum(accuracy_mmse) / USER)
    return error_mmse


def zero_forcing(s, h, y):
    # Unregularized channel inversion: (H^H H)^-1 H^H y
    p1 = np.matmul(h.conj().T, h)
    p2 = np.matmul(h.conj().T, y)
    xhat = np.matmul(np.linalg.inv(p1), p2)

    # Nearest-neighbour decision against the constellation alphabet.
    v1 = np.matmul(xhat, np.ones([1, CONS_ALPHABET.size]))
    v2 = np.matmul(np.ones([USER, 1]), CONS_ALPHABET)
    idxhat = np.argmin(np.square(np.abs(v1 - v2)), axis=1)

    estimated_symbol = CONS_ALPHABET[:, idxhat]
    accuracy_zf = np.equal(estimated_symbol.flatten(), np.transpose(s))

    error_zf = 1 - (np.sum(accuracy_zf) / USER)
    return error_zf


def matched_filter(s, h, y):
    # Matched filter: correlate the received signal with the channel.
    p1 = np.matmul(h.conj().T, y)

    # Scale by the Frobenius norm of the channel (equal to the largest
    # singular value of the flattened channel vector).
    p2 = np.linalg.norm(h)
    xhat = p1 * (1 / p2)

    # Nearest-neighbour decision against the constellation alphabet.
    v1 = np.matmul(xhat, np.ones([1, CONS_ALPHABET.size]))
    v2 = np.matmul(np.ones([USER, 1]), CONS_ALPHABET)
    idxhat = np.argmin(np.square(np.abs(v1 - v2)), axis=1)

    estimated_symbol = CONS_ALPHABET[:, idxhat]
    accuracy_mf = np.equal(estimated_symbol.flatten(), np.transpose(s))

    error_mf = 1 - (np.sum(accuracy_mf) / USER)
    return error_mf


bers_mmse_in_iter = np.zeros([len(snr_db_list), ITERATION])
bers_zf_in_iter = np.zeros([len(snr_db_list), ITERATION])
bers_mf_in_iter = np.zeros([len(snr_db_list), ITERATION])

for iter_snr in range(len(snr_db_list)):
    snr_db = snr_db_list[iter_snr]

    for rerun in range(ITERATION):
        # Random BPSK symbols for each user.
        transmitted_symbol = np.transpose(np.sign(np.random.rand(1, USER) - 0.5))

        SNR_lin = 10 ** (snr_db / 10)
        noise_variance = signal_energy_avg * USER / SNR_lin

        # Complex Gaussian noise and Rayleigh fading channel.
        noise = np.sqrt(0.5) * (norm.ppf(np.random.rand(RECEIVER, 1)) + (1j * norm.ppf(np.random.rand(RECEIVER, 1))))
        channel = np.sqrt(0.5) * (norm.ppf(np.random.rand(RECEIVER, USER)) + (1j * norm.ppf(np.random.rand(RECEIVER, USER))))

        received_signal = np.matmul(channel, transmitted_symbol) + np.sqrt(noise_variance) * noise

        bers_mmse_in_iter[iter_snr][rerun] = mmse(transmitted_symbol, noise_variance, channel, received_signal)
        bers_zf_in_iter[iter_snr][rerun] = zero_forcing(transmitted_symbol, channel, received_signal)
        bers_mf_in_iter[iter_snr][rerun] = matched_filter(transmitted_symbol, channel, received_signal)

bers_mmse = np.mean(bers_mmse_in_iter, axis=1)
bers_zf = np.mean(bers_zf_in_iter, axis=1)
bers_mf = np.mean(bers_mf_in_iter, axis=1)

print('mmse error rate', bers_mmse)
print('zero forcing error rate', bers_zf)
print('matched filter error rate', bers_mf)

plt.figure('Bit Error Rate')
plt.subplot(111)
plt.semilogy(snr_db_list, bers_mmse, color='black', marker='*', linestyle='-', linewidth=1, markersize=6, label='MMSE')
plt.semilogy(snr_db_list, bers_zf, color='blue', marker='d', linestyle='-', linewidth=1, markersize=5, label='ZF')
plt.semilogy(snr_db_list, bers_mf, color='red', marker='x', linestyle='-', linewidth=1, markersize=6, label='MF')
plt.title('BER vs SNR')
plt.xlabel('SNR(dB)')
plt.xlim(-2, 16)
plt.xscale('linear')
plt.ylabel('BER')
plt.ylim(0.00001, 1)
plt.grid(True, which='major', color='#666666', linestyle='--')
plt.legend(title='Detectors:')
plt.show()
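A quick sanity check for the zero-forcing detector above: in the noiseless case, inverting the channel via the pseudo-inverse recovers the transmitted BPSK symbols exactly. This is a minimal self-contained sketch (small hypothetical dimensions, not part of the original script):

```python
import numpy as np

rng = np.random.default_rng(1)
users, receivers = 4, 16

# Random complex Rayleigh-style channel and BPSK symbols.
h = (rng.standard_normal((receivers, users))
     + 1j * rng.standard_normal((receivers, users))) / np.sqrt(2)
x = np.sign(rng.random((users, 1)) - 0.5)

# Noiseless received signal and zero-forcing estimate.
y = h @ x
xhat = np.linalg.pinv(h) @ y

# Hard decision back onto the {-1, +1} alphabet.
decided = np.sign(xhat.real)

assert np.allclose(decided, x)
print("zero-forcing recovers all symbols in the noiseless case")
```

With noise added, the three detectors diverge: MMSE degrades gracefully at low SNR, while pure inversion amplifies noise, which is what the BER curves in the script illustrate.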