Commit

README.md update
emanuelegiona committed Apr 26, 2018
1 parent 23797fc commit 4959cab
Showing 1 changed file: README.md (10 additions, 10 deletions)
@@ -11,9 +11,9 @@ Dictionary built using [Google English One Million 1-grams](http://storage.googl
### Setup:

- Python 3.6.3 (Anaconda custom 64-bit)
-- PyTorch 0.3.1 (0.4.0 might not work due to ["torch.Tensor and autograd.Variable changes"](https://github.com/pytorch/pytorch/releases/tag/v0.4.0))
+- PyTorch 0.3.1 (0.4.0 might not work due to [torch.Tensor and autograd.Variable changes](https://github.com/pytorch/pytorch/releases/tag/v0.4.0))
- CUDA 8
-- spaCy v2.0 with English models (more ["here"](https://spacy.io/usage/))
+- spaCy v2.0 with English models (more [here](https://spacy.io/usage/))
- the project folder must contain a folder named `batches` in the same directory as the `train.py` file
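A possible way to reproduce the environment above (a sketch only: the conda channel and exact package specifiers are assumptions, not taken from this repository):

```shell
# Isolated environment with the Python version listed above
conda create -n word-guesser python=3.6
conda activate word-guesser

# PyTorch 0.3.1 built against CUDA 8 (channel/package names may vary by platform)
conda install pytorch=0.3.1 cuda80 -c pytorch

# spaCy v2.0 with its English models
pip install "spacy>=2.0,<2.1"
python -m spacy download en

# The training script expects a `batches` folder next to train.py
mkdir -p batches
```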

### Training
@@ -24,7 +24,7 @@ Start training by using this command:

where:
- the training set file is a UTF-8 encoded .txt file;
-- the model file is a pre-existent .pt file (by default: word_guesser.pt).
+- the model file is a pre-existing `.pt` file (by default: `word_guesser.pt`).

The model file is not mandatory: if it is not specified, the script assumes no model exists and creates a model file
named `word_guesser.pt`, overwriting it if it already exists. By starting a training run specifying a model file,
@@ -45,22 +45,22 @@ specifying a model file, the model stored in that file will be used for predicti

## Features

-- Multi-threaded operation in order to read from the training file, split to sentences, batching, training simultaneously (producer-consumer pattern)
+- Multi-threaded operation: reading from the training file, sentence splitting, batching, and training all run simultaneously (_producer-consumer pattern_)
- Low RAM usage, thanks to bounded queues between threads and periodic dumps of created batches
- Sentences are never padded: instead, they are grouped by length, and batches are created from sentences of the same length
- Dynamic batch size: batches of maximal size (hyper-parameter `batch_dim`) are created whenever possible, but batches smaller than the chosen size are not padded
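The batching and producer-consumer scheme above can be sketched as follows (a minimal illustration, not the repository's actual code; `batch_dim`, the queue size of 4, and the toy sentences are assumptions):

```python
import threading
import queue
from collections import defaultdict

def make_batches(sentences, batch_dim):
    """Group sentences by length, then cut each group into batches of at
    most `batch_dim` sentences; no padding is ever needed, since every
    sentence in a batch has the same length."""
    by_len = defaultdict(list)
    for sent in sentences:
        by_len[len(sent)].append(sent)
    batches = []
    for group in by_len.values():
        for i in range(0, len(group), batch_dim):
            batches.append(group[i:i + batch_dim])
    return batches

# A bounded queue keeps RAM usage low: the producer blocks as soon as
# `maxsize` batches are waiting to be consumed.
batch_queue = queue.Queue(maxsize=4)

def producer(sentences, batch_dim):
    for batch in make_batches(sentences, batch_dim):
        batch_queue.put(batch)      # blocks while the queue is full
    batch_queue.put(None)           # sentinel: no more batches

def consumer(results):
    while True:
        batch = batch_queue.get()
        if batch is None:
            break
        results.append(len(batch))  # a real consumer would train on the batch

sentences = [["a", "b"], ["c", "d"], ["e", "f", "g"], ["h", "i"]]
results = []
t1 = threading.Thread(target=producer, args=(sentences, 2))
t2 = threading.Thread(target=consumer, args=(results,))
t1.start(); t2.start(); t1.join(); t2.join()
```

With `batch_dim = 2`, the three length-2 sentences yield one full batch and one smaller (unpadded) batch, and the single length-3 sentence forms its own batch.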

-## Known bugs
+## Known bugs/problems

- Missing `batches` folder creation if it is not already present
- The only accepted training corpus format is UTF-8 encoded plain text
- Slow on large training corpora; might become faster by implementing hierarchical softmax or negative sampling
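Negative sampling, mentioned above as a possible speed-up, replaces the full softmax over the vocabulary with one positive and a few negative binary decisions per update. A minimal sketch of the loss (illustrative only; the vector arguments and pure-Python style are assumptions, not this repository's code):

```python
import math

def neg_sampling_loss(center_vec, true_vec, neg_vecs):
    """Logistic loss for one positive pair plus k sampled negatives,
    avoiding a softmax over the whole vocabulary."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    # Push the true context word's score up...
    loss = -math.log(sigmoid(dot(center_vec, true_vec)))
    # ...and the sampled negative words' scores down.
    for nv in neg_vecs:
        loss -= math.log(sigmoid(-dot(center_vec, nv)))
    return loss
```

The cost per update is proportional to the number of negatives (typically 5-20) instead of the vocabulary size.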

## Consulted resources

-- ["PyTorch Tutorials"](http://pytorch.org/tutorials/)
-- ["Practical PyTorch"](https://github.com/spro/practical-pytorch)
-- ["The Incredible PyTorch"](https://github.com/ritchieng/the-incredible-pytorch)
-- ["Optimizing PyTorch training code"](https://www.sagivtech.com/2017/09/19/optimizing-pytorch-training-code/)
-- ["Word Sense Disambiguation with LSTM: Do We Really Need 100 Billion Words?"](https://github.com/cltl/wsd-dynamic-sense-vector)
+- [PyTorch Tutorials](http://pytorch.org/tutorials/)
+- [Practical PyTorch](https://github.com/spro/practical-pytorch)
+- [The Incredible PyTorch](https://github.com/ritchieng/the-incredible-pytorch)
+- [Optimizing PyTorch training code](https://www.sagivtech.com/2017/09/19/optimizing-pytorch-training-code/)
+- [Word Sense Disambiguation with LSTM: Do We Really Need 100 Billion Words?](https://github.com/cltl/wsd-dynamic-sense-vector)
