Commit 6092d0e

Last changes
1 parent 061c453 commit 6092d0e

File tree: 1 file changed, 54 additions and 59 deletions

Tensor-GOT-Polymer/index.html
@@ -99,23 +99,24 @@ <h2>Data analysis</h2>
In this codelab, based on Game of Thrones data, we will walk you through building a neural network model that classifies the probability of death of Game of Thrones characters with the wide+deep model.
There is some data theory we will dive into when implementing the model, but first we will dig into the data a little in order to learn more about the behaviour of our dataset. </p>
<p>
- <p><a href="https://github.com/SoyGema/Tensorflow_CodeLab_Wide-Deep_learning/blob/master/wide%2Bdeep_code" target="_blank"><paper-button class="colored" raised><iron-icon icon="file-download"></iron-icon>Download source code</paper-button></a></p>
- You can download the data from the button below. The original dataset has been released in Kaggle and we have been filling the information in Game of Thrones wiki focusing on title, culture . </p>
+ <p><a href="https://github.com/codelab-tf-got/code/archive/master.zip" target="_blank"><paper-button class="colored" raised><iron-icon icon="file-download"></iron-icon>Download source code</paper-button></a></p>
+ <p>You can download the data from the button below. The original dataset was released on Kaggle, and we have been filling in information from the Game of Thrones wiki, focusing on title and culture. </p>
<p>
<img src="Table1.jpg" alt="Data">

- In one first approach of the dataset we extract the following useful information about the dataset, wich we consider a significant first approach for finding correlations and relationships in between variables
- NUMBER OF CHARACTERS : 1946
- MEDIAN OF AGE : 27
- CORRELATION BETWEEN Number of Death Relationships and Popularity : 0.663 </p>
+ In a first pass over the dataset we extract the following summary information, which we consider a useful starting point for finding correlations and relationships between variables. </p>
+
+ <p>NUMBER OF CHARACTERS: 1946</p>
+ <p>MEDIAN OF AGE: 27</p>
+ <p>CORRELATION BETWEEN number of dead relations and popularity: 0.663</p>
<p>
Here are some graphics that show correlations in the data; note that most of them relate popularity to other characteristics.
<img src="1_CultureGOT.png" alt="Culture Analysis"> The Game of Thrones character ecosystem shows a diverse mix of cultures, which without a doubt brings a rich atmosphere to enjoy. Among all cultures, the most significant are the Valyrians - such as House Targaryen members -, the Northmen - like the watchmen - who form the largest culture with 143 characters, and the Andals - like House Lannister.
<img src="histogram_pop.jpg" alt="Histogram of popularity"> In this first look at the histogram of popularity, more than 750 characters - about 40% - are ranked with 0 popularity, while the most popular characters - fewer than 20, about 1.02% - are listed and described together below.
However, the median popularity is 0.03344.
Below we show a comparative table between the top 10 most popular characters and the top 10 characters by likelihood of death. Can you find any relationship between them?
As a comparative conclusion, we might underline the popularity of the Baratheons and Starks and their absence from the high-probability-of-death list, and the number of Targaryens that seem to be involved in life-ending situations.
- ####--------------------------COMPONENT TABLE --------------------------------#####
+
Baratheons are among the most popular and also far from the highest probability of death.
<p>
<img src="3_Tensorflow_PDr.gif" alt="Popularity VS number of Death Relations">
@@ -171,7 +172,7 @@ <h2 class="checklist">CodeLab Structure</h2>
<img src="Download_thecodelab.jpg" alt="network">
<li>Download the code and the data_set and put them in a folder. It should contain the file wide+deep_Tensorflow_GOT and the file GOT_data.csvd </li>
<li>Open the file and change the path to your dataset: data_set = 'your_path_here' </li>
- <li>In console, execute the program :~ $ python 'program'</li>
+ <li>In the console, execute the program: ~$ python 'program' --training_mode learn_runner --model_dir /Base directory for output models --model_type 'wide_n_deep' --steps 200 (see the sketch after this list for how these flags might be declared)</li>
<li>In the console, launch TensorBoard: ~$ tensorboard --logdir=/tmp/model/ </li>
<li>A fair amount of time</li>
</ul>
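As noted in the list above, here is a hedged sketch of how the command-line flags used in the run step (--model_dir, --model_type, --steps, --training_mode) might be declared with the TF 1.x tf.app.flags module. The exact flag definitions in the codelab's script may differ.

# Hedged sketch of possible flag definitions; the codelab's own script may
# declare them differently.
import tensorflow as tf

flags = tf.app.flags
flags.DEFINE_string("model_dir", "/tmp/model", "Base directory for output models.")
flags.DEFINE_string("model_type", "wide_n_deep", "One of 'wide', 'deep', 'wide_n_deep'.")
flags.DEFINE_integer("steps", 200, "Number of training steps.")
flags.DEFINE_string("training_mode", "learn_runner", "How the training is launched.")
FLAGS = flags.FLAGS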
@@ -201,14 +202,25 @@ <h2>Base Features</h2>
<p>Categorical variables are also known as discrete or qualitative variables. Categorical variables can be further categorized as nominal, ordinal or dichotomous. Nominal variables are variables that have two or more categories but no intrinsic order.
Continuous variables are those that take continuous numeric values. </p>
<pre><code>
- CATEGORICAL_COLUMNS = ["alive", "title", "male", "culture",
-     "house", "spouse", "isAliveMother", "isAliveFather", "isAliveHeir",
-     "isAliveSpouse", "isMarried", "isNoble", "numDeadRelations",
-     "boolDeadRelations", "isPopular" , "popularity"]
+ CATEGORICAL_COLUMN_NAMES = only_existing([
+     'male',
+     'culture',
+     'mother',
+     'father',
+     'title',
+     'heir',
+     'house',
+     'spouse',
+     'numDeadRelations',
+     'boolDeadRelations',
+ ], COLUMNS)
+
+ CONTINUOUS_COLUMNS = only_existing([
+     'age',
+     'popularity',
+     'dateOfBirth',
+ ], COLUMNS)

- CONTINUOUS_COLUMNS = ["name", "dateOfBirth", "mother", "father",
-     "heir", "book1", "book2", "book3", "book4", "book5", "age",
-     "isAlive", "house", "title", "numDeadRelations"]
</code></pre>
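The lists above only name the columns. As a rough illustration of the next step (not the codelab's own code), the names could be turned into TF 1.x feature columns with tf.contrib.layers; the hash bucket size below is an arbitrary assumption.

# Illustrative sketch: build sparse columns for the categorical names and
# real-valued columns for the continuous names defined above.
import tensorflow as tf

categorical_columns = {
    name: tf.contrib.layers.sparse_column_with_hash_bucket(name, hash_bucket_size=1000)
    for name in CATEGORICAL_COLUMN_NAMES
}
continuous_columns = {
    name: tf.contrib.layers.real_valued_column(name)
    for name in CONTINUOUS_COLUMNS
}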
</google-codelab-step>
<google-codelab-step
@@ -223,18 +235,11 @@ <h2>Linear classifier: Memorization</h2>
We execute the linear classifier.

Here the design of the net comes from combinations of features that merge their information in order to offer a suitable conclusion. </p>
- <img src="4_Comic.png" alt="crossing_narrow_sea">
+ <img src="WideDEF.gif" alt="crossing_narrow_sea">
<pre><code>
- # Wide columns and deep columns.
- wide_columns = [name, dateOfBirth, DateoFdeath, mother, father, heir, book1, book2,
-     book3, book4, book5, age, isAlive
-     tf.contrib.layers.crossed_column([house, title],
-         hash_bucket_size=int(1e4)),
-     tf.contrib.layers.crossed_column(
-         [age_buckets, house, title],
-         hash_bucket_size=int(1e6)),
-     tf.contrib.layers.crossed_column([numDeadRelations, title],
-         hash_bucket_size=int(1e4))]
+ if FLAGS.model_type == "wide":
+     m = tf.contrib.learn.LinearClassifier(model_dir=model_dir,
+                                           feature_columns=wide_columns)
</code></pre>
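The snippet above assumes a wide_columns list already exists. A hedged sketch of what it might contain is shown below, following the crossed_column pattern of the removed code; the specific columns and bucket sizes are assumptions, not the codelab's exact definition.

# Hedged sketch of a possible wide_columns definition (TF 1.x contrib API).
import tensorflow as tf

house = tf.contrib.layers.sparse_column_with_hash_bucket("house", hash_bucket_size=1000)
title = tf.contrib.layers.sparse_column_with_hash_bucket("title", hash_bucket_size=1000)
culture = tf.contrib.layers.sparse_column_with_hash_bucket("culture", hash_bucket_size=100)

wide_columns = [
    house, title, culture,
    # Crossed features let the linear part memorize specific co-occurrences.
    tf.contrib.layers.crossed_column([house, title], hash_bucket_size=int(1e4)),
]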
</google-codelab-step>
<google-codelab-step
@@ -246,36 +251,30 @@ <h2>Deep layer</h2>
<p>Generalization - Deep Model - the model is a feedforward neural network that works with categorical features. It exploits transitivity of correlations and explores feature combinations that have never or rarely occurred in the past. It improves the diversity of recommended items. This generalization can be added by using features that are less granular.
In the end, this approach is great because it combines two different classification models into one neural network. You can see below how it is put to work. </p>
<h2>Deep layer</h2>
- <p>Embeddings are mathematical abstractions of categorical data. Their main purpose is to find relationships in between the data and show them in a 3 dimensional space.
- Tensorflow has released its own playground for embeddings using words and image data . You can find it here . </p>
<pre><code>
- deep_columns = [
-     tf.contrib.layers.embedding_column(title, dimension=8),
-     tf.contrib.layers.embedding_column(house, dimension=8),
-     tf.contrib.layers.embedding_column(culture, dimension=8),
-     tf.contrib.layers.embedding_column(isAliveNoble, dimension=8),
-     tf.contrib.layers.embedding_column(numberDeadRelations,
-         dimension=8),
-     tf.contrib.layers.embedding_column(popularity, dimension=8),
-     male,
-     spouse,
-     isPopular,
-     spouse,
-     isMarried,
- ]
- </google-codelab-step>
- <google-codelab-step
- </code></pre>
+ elif FLAGS.model_type == "deep":
+     m = tf.contrib.learn.DNNClassifier(model_dir=model_dir,
+                                        feature_columns=deep_columns,
+                                        hidden_units=[100, 50])
+ </code></pre>
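Likewise, deep_columns must already be defined for DNNClassifier to work. A hedged sketch, reusing the sparse columns from the previous step and the embedding_column pattern of the removed snippet, might look like this; the dimensions and the exact column list are assumptions.

# Hedged sketch of a possible deep_columns definition. 'house', 'title' and
# 'culture' are the sparse columns sketched in the previous step.
deep_columns = [
    tf.contrib.layers.embedding_column(house, dimension=8),
    tf.contrib.layers.embedding_column(title, dimension=8),
    tf.contrib.layers.embedding_column(culture, dimension=8),
    tf.contrib.layers.real_valued_column("age"),
    tf.contrib.layers.real_valued_column("popularity"),
]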
+ </google-codelab-step>
+ <google-codelab-step
+     label="Network Structure : Combining wide+deep learning model"
+     step="5.3"
+     duration="10">
<h2>Combining wide and deep learning model into one</h2>
<p>The wide model and the deep model are combined by summing up their final output log odds as the prediction, then feeding the prediction to a logistic loss function. All the graph definition and variable allocations have already been handled for you under the hood, so you simply need to create a DNNLinearCombinedClassifier: </p>
+ <p>In this case there are two hidden layers, with 100 and 50 neurons respectively. You can select your own number of layers and neurons.</p>
<pre><code>
- import tempfile
- model_dir = tempfile.mkdtemp()
- m = tf.contrib.learn.DNNLinearCombinedClassifier(
-     model_dir=model_dir,
-     linear_feature_columns=wide_columns,
-     dnn_feature_columns=deep_columns,
-     dnn_hidden_units=[100, 50])
+ else:
+     m = tf.contrib.learn.DNNLinearCombinedClassifier(
+         model_dir=model_dir,
+         linear_feature_columns=wide_columns,
+         dnn_feature_columns=deep_columns,
+         dnn_hidden_units=[100, 50],
+         fix_global_step_increment_bug=True,
+     )
+ return m
</code></pre>
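Once the estimator is built, it can be trained and evaluated through the standard tf.contrib.learn interface. The sketch below is assumed usage, not the codelab's source: build_estimator, train_input_fn and eval_input_fn are hypothetical names for the function that returns m above and for the input functions that feed the GOT dataframe as tensors.

# Assumed usage sketch; build_estimator, train_input_fn and eval_input_fn are
# hypothetical names, not taken from the codelab's code.
m = build_estimator(model_dir)
m.fit(input_fn=train_input_fn, steps=FLAGS.steps)
results = m.evaluate(input_fn=eval_input_fn, steps=1)
print("accuracy:", results["accuracy"])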
</google-codelab-step>
<google-codelab-step
@@ -306,14 +305,10 @@ <h2>Conclusions</h2>
The best result of the model, together with its accuracy, is pasted as follows:


- You can help us with the results of changing the model, generalizing in our two main hypothesis :
+ You can help us by changing the model and sharing the results, generalizing the following hypothesis:

- In the model, if you increases the hidden layers, the accuracy of the model …………..
- In the model, changing the activation function from ……………………. To ………………. increases/decreases accuracy in ………….
-
- Please, fill your model conclussions in this form
-
- There is still work to do to optimize this codelab.
+ In the model, if you increase the number of hidden layers and vary the number of neurons, does the accuracy of the model increase or decrease?
+ Please fill in your model conclusions in this <a href="https://docs.google.com/forms/d/1QLNq6nxWIJRuO-JiK3wgnLaeW2X3wbFS1edzzj59cLQ/edit" target="_blank" rel="noopener">Form</a>
Take this feedback form to tell us more about how useful the codelab was, and dig into it if you want to know more. </p>

<img src="5_Comic.png" alt="TensorFlow Game of Thrones">
