Commit: Fix typos
magitz committed Mar 23, 2022
1 parent a900bd5 commit 65ef768
Showing 2 changed files with 19 additions and 34 deletions.
35 changes: 10 additions & 25 deletions 19_Natural_Language_Processing.ipynb
@@ -7,7 +7,7 @@
"source": [
"# Natural Language Processing Intro\n",
"\n",
"Natural Language Processing (NLP) is a large field of AI with many related, but distinct sub-disciplines. We won't have time to look at all of these, but you are most likely somewhat familir with many of the applications.\n",
"Natural Language Processing (NLP) is a large field of AI with many related, but somewhat distinct sub-disciplines. We won't have time to look at all of these, but you are most likely somewhat familiar with many of the applications.\n",
"\n",
"Some NLP Tasks:\n",
"* Summary generation, information extractions\n",
@@ -16,13 +16,13 @@
"* Auto-completion\n",
"* Sentiment Analysis\n",
"* Intent Detection\n",
"* Chat, automated writing, dialog generation, question answerting\n",
"* Chat, automated writing, dialog generation, question answering\n",
"* Voice assistant\n",
"* Document retrieval\n",
"\n",
"## Ambiguity in language\n",
"\n",
"NLP is not a simple task, and until deep learning, was quite limited. Part of the challenge is that human language tends to be ambiguous and recognizing words is really only the start to infering meaning. Take for example this sentence:\n",
"NLP is not a simple task, and until deep learning, was quite limited in its abilities, though as with most fields in AI, there is a long history, [dating back to the 1950s](https://en.wikipedia.org/wiki/Natural_language_processing). Part of the challenge is that human language tends to be ambiguous and recognizing words is really only the start to inferring meaning. Take for example this sentence:\n",
" > The boy saw a man with a telescope\n",
" \n",
" * Who had the telescope?\n",
@@ -33,8 +33,9 @@
"\n",
"One of the primary challenges of NLP is representing language as numbers--remember computers, and the ML/AI systems we have, primarily deal with numbers. For computer vision problems, this was relatively easy in that we took pixel intensities of an image and fed those in. But what to do with speech, words, text?\n",
"\n",
"The processes of converting text to numerical representation is called **tokenization**. There are many mehods of tokenization, but the idea is to break text into itemizable components.\n",
"The processes of converting text to numerical representation is called **tokenization**. There are many methods of tokenization, but the idea is to break text into itemizable components--tokens.\n",
"\n",
"Tokens can be words, letters, word fragments, or even sentences.\n",
"\n"
]
},
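To make the tokenization step concrete, here is a minimal sketch using Keras's `text_to_word_sequence`, the same helper the notebook uses below; the example sentence and the integer mapping are purely illustrative.

```python
# Minimal tokenization sketch; the sentence and the index mapping are illustrative.
from tensorflow.keras.preprocessing.text import text_to_word_sequence

sentence = "The boy saw a man with a telescope."
tokens = text_to_word_sequence(sentence)  # lowercases and strips punctuation
print(tokens)  # ['the', 'boy', 'saw', 'a', 'man', 'with', 'a', 'telescope']

# Each unique token can then be mapped to an integer index for use in a model.
word_to_index = {word: i for i, word in enumerate(sorted(set(tokens)))}
print(word_to_index)
```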
@@ -87,7 +88,7 @@
"import logging\n",
"tf.get_logger().setLevel(logging.ERROR)\n",
"\n",
"EPOCHS = 10 # origonally 32, reduced for time.\n",
"EPOCHS = 10 # originally 32, reduced for time.\n",
"BATCH_SIZE = 256\n",
"INPUT_FILE_NAME = 'data/frankenstein.txt'\n",
"WINDOW_LENGTH = 40\n",
@@ -104,11 +105,11 @@
"source": [
"## Cleaning, tokenization and creation of training set\n",
"\n",
"In the following block, we read in the Frankenstein text file and use the `text_to_word_sequence` to convert the text to a list of individual words. This also removes punctuation and converts everything to lower case. This command accomplishes our cleaning and tokenization steps.\n",
"In the following block, we read in the [*Frankenstein*](https://www.gutenberg.org/ebooks/84) text file and use the `text_to_word_sequence` to convert the text to a list of individual words. This also removes punctuation and converts everything to lower case. This command accomplishes our cleaning and tokenization steps.\n",
"\n",
"The next step is to creat the training set of `fragments` and corresponding `targets`. The hyperparameters were set above, and used here to make a tranining set by sliding a frame overs the text, `WINDOW_LENGTH=40` words at a time. \n",
"The next step is to create the training set of `fragments` and corresponding `targets`. The hyperparameters were set above, and used here to make a training set by sliding a window over the text, `WINDOW_LENGTH=40` words at a time. \n",
"\n",
"The following word becomes the target and the frame shifts down `WINDOW_STEP=3` words to make the next fragment/target pair."
"The word immediately after that becomes the `target` and the window shifts down `WINDOW_STEP=3` words to make the next fragment/target pair."
]
},
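As a rough illustration of the sliding-window step described above (a sketch of the idea, not necessarily the notebook's exact code), assuming `word_list` is the list of words returned by `text_to_word_sequence`:

```python
# Sliding-window sketch: build (fragment, target) pairs from a list of words.
# `word_list` is a placeholder here; in the notebook it would be the tokenized text.
WINDOW_LENGTH = 40
WINDOW_STEP = 3
word_list = ['word%d' % i for i in range(200)]

fragments, targets = [], []
for i in range(0, len(word_list) - WINDOW_LENGTH, WINDOW_STEP):
    fragments.append(word_list[i:i + WINDOW_LENGTH])  # 40 consecutive words
    targets.append(word_list[i + WINDOW_LENGTH])      # the word that follows the window

print(len(fragments), len(targets))  # one target per fragment
```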
{
@@ -218,7 +219,7 @@
"source": [
"## Build and fit the model\n",
"\n",
"Our first layer is the embedding layer that learns to conver the numbered words in the vocabulary to the embedding. Then we have two LSTM layers and a dense layer with a ReLU activation and a final layer with one neuron per word in the vocabulary and a softmax to make the output the probability of each word being the output from the model."
"Our first layer is the embedding layer that learns to convert the numbered words in the vocabulary to the embedding. Then we have two LSTM layers and a dense layer with a ReLU activation and a final layer with one neuron per word in the vocabulary and a softmax to make the output the probability of each word being the output from the model."
]
},
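For reference, here is a sketch of what a model like the one described above could look like in Keras; the vocabulary size, embedding width, and layer sizes are placeholder assumptions rather than the notebook's exact values.

```python
# Architecture sketch only; the sizes below are assumptions for illustration.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 10000      # assumed vocabulary size
EMBEDDING_WIDTH = 100   # assumed embedding dimension

model = Sequential([
    # Learns an embedding vector for each integer-encoded word.
    Embedding(VOCAB_SIZE, EMBEDDING_WIDTH),
    LSTM(128, return_sequences=True, dropout=0.2),
    LSTM(128, dropout=0.2),
    Dense(128, activation='relu'),
    # One neuron per vocabulary word; softmax turns the outputs into probabilities.
    Dense(VOCAB_SIZE, activation='softmax'),
])
# sparse_categorical_crossentropy assumes integer-encoded target words.
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.build(input_shape=(None, 40))  # a batch of 40-word fragments
model.summary()
```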
{
@@ -358,22 +359,6 @@
" print(word + ': ', distance)\n",
" print('')"
]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c426982f-9e36-4bac-abf3-f59fc4b162a5",
- "metadata": {},
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c51e4e1d-7196-4efa-b432-969741655a7b",
- "metadata": {},
- "outputs": [],
- "source": []
}
],
"metadata": {
18 changes: 9 additions & 9 deletions 20_NLP_Transformers.ipynb
@@ -14,11 +14,11 @@
"\n",
"This notebook largely follows, and quotes from, the [Hugging Face Transformer Course](https://huggingface.co/course/chapter1/3?fw=tf). It starts with some motivating examples.\n",
"\n",
"Here are some examples of what Trasformers can do in NLP:\n",
"Here are some examples of what Transformers can do in NLP:\n",
"\n",
"## Some Examples\n",
"\n",
"### Sentement Analysis"
"### Sentiment Analysis"
]
},
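A minimal sketch of the kind of pipeline call this example relies on; the default model is chosen by the library, and the input sentence and printed score are illustrative.

```python
# Sentiment-analysis pipeline sketch; the sentence and the score are illustrative.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("This workshop has been a great introduction to NLP!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```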
{
@@ -57,7 +57,7 @@
"id": "4cd60204-e7e6-4f82-9c6b-4cca96367ffc",
"metadata": {},
"source": [
"> By default, this pipeline selects a particular pretrained model that has been fine-tuned for sentiment analysis in English. The model is downloaded and cached when you create the `classifier` object. If you rerun the command, the cached model will be used instead and there is no need to download the model again.\n",
"> By default, this pipeline selects a particular pre-trained model that has been fine-tuned for sentiment analysis in English. The model is downloaded and cached when you create the `classifier` object. If you rerun the command, the cached model will be used instead and there is no need to download the model again.\n",
">\n",
"> There are three main steps involved when you pass some text to a pipeline:\n",
">\n",
@@ -81,7 +81,7 @@
"\n",
"### Zero-shot classification\n",
"\n",
"> We’ll start by tackling a more challenging task where we need to classify texts that haven’t been labelled. This is a common scenario in real-world projects because annotating text is usually time-consuming and requires domain expertise. For this use case, the zero-shot-classification pipeline is very powerful: it allows you to specify which labels to use for the classification, so you don’t have to rely on the labels of the pretrained model. You’ve already seen how the model can classify a sentence as positive or negative using those two labels — but it can also classify the text using any other set of labels you like."
"> We’ll start by tackling a more challenging task where we need to classify texts that haven’t been labelled. This is a common scenario in real-world projects because annotating text is usually time-consuming and requires domain expertise. For this use case, the zero-shot-classification pipeline is very powerful: it allows you to specify which labels to use for the classification, so you don’t have to rely on the labels of the pre-trained model. You’ve already seen how the model can classify a sentence as positive or negative using those two labels — but it can also classify the text using any other set of labels you like."
]
},
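A sketch of a zero-shot classification call; the text and the candidate labels here are made up, which is exactly the point: the labels are supplied at inference time rather than fixed by the pre-trained model.

```python
# Zero-shot classification sketch; the text and candidate labels are illustrative.
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
result = classifier(
    "This course teaches you how transformer models work",
    candidate_labels=["education", "politics", "business"],
)
print(result)  # the candidate labels ranked by score for this text
```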
{
@@ -641,32 +641,32 @@
"\n",
"Transformers are not that old! Only having been introduced in 2017! Vaswani *et al.*'s paper [*Attention is all you need*](https://arxiv.org/abs/1706.03762) has over 38,000 citations in Google Scholar (3/22/2022)\n",
"\n",
"![Tranformer timeline from Hugging Face](images/transformers_chrono_HuggingFace.png)\n",
"![Transformer timeline from Hugging Face](images/transformers_chrono_HuggingFace.png)\n",
"\n",
"Since its publication, transformers have revolutionized NLP and been applied to other fields, including computer vision.\n",
"\n",
"### Transformers are language models\n",
"\n",
"> All the Transformer models mentioned above (GPT, BERT, BART, T5, etc.) have been trained as language models. This means they have been trained on large amounts of raw text in a self-supervised fashion. **Self-supervised learning is a type of training in which the objective is automatically computed from the inputs of the model. That means that humans are not needed to label the data!**\n",
"> \n",
"> This type of model develops a statistical understanding of the language it has been trained on, but it’s not very useful for specific practical tasks. Because of this, the general pretrained model then goes through a process called transfer learning. During this process, the model is fine-tuned in a supervised way — that is, using human-annotated labels — on a given task.\n",
"> This type of model develops a statistical understanding of the language it has been trained on, but it’s not very useful for specific practical tasks. Because of this, the general pre-trained model then goes through a process called transfer learning. During this process, the model is fine-tuned in a supervised way — that is, using human-annotated labels — on a given task.\n",
"\n",
"### Transformers are **BIG** models\n",
"\n",
"The graph below is from the article [*We Might See A 100T Language Model In 2022*](https://analyticsindiamag.com/we-might-see-a-100t-language-model-in-2022/), published in Dec 2021:\n",
"![Image of transformer model size over time, from https://analyticsindiamag.com/we-might-see-a-100t-language-model-in-2022/](images/NVIDIA_NLP_Model_Size.png)\n",
"\n",
"And costly in terms of compute and CO2 emmisions...\n",
"And costly in terms of compute and CO2 emissions...\n",
"\n",
"![Relative CO2 emissions for a variety of human activities, from Hugging Face](images/carbon_footprint_HuggingFace.png)\n",
"\n",
"Luckily, as we've seen, transfer learning can help!!\n",
"\n",
"![Schematic of pre-training from Hugging Face](images/pretraining_HuggingFace.png)\n",
"\n",
"> This pretraining is usually done on very large amounts of data. Therefore, it requires a very large corpus of data, and training can take up to several weeks.\n",
"> This pre-training is usually done on very large amounts of data. Therefore, it requires a very large corpus of data, and training can take up to several weeks.\n",
">\n",
"> *Fine-tuning*, on the other hand, is the training done **after** a model has been pretrained. To perform fine-tuning, you first acquire a pretrained language model, then perform additional training with a dataset specific to your task."
"> *Fine-tuning*, on the other hand, is the training done **after** a model has been pre-trained. To perform fine-tuning, you first acquire a pre-trained language model, then perform additional training with a dataset specific to your task."
]
},
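To make the pre-training/fine-tuning distinction concrete, here is a hedged sketch of fine-tuning a pre-trained checkpoint with the Transformers Keras API; the checkpoint name, the two-example dataset, and the hyperparameters are all assumptions for illustration, not a recipe from this notebook.

```python
# Transfer-learning sketch: load a pre-trained checkpoint and fine-tune it on a
# tiny, made-up labelled dataset. Real fine-tuning uses a task-specific corpus.
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

texts = ["I loved this movie", "This was a terrible film"]  # made-up examples
labels = np.array([1, 0])                                    # 1 = positive, 0 = negative
batch = dict(tokenizer(texts, padding=True, truncation=True, return_tensors="np"))

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
model.fit(batch, labels, epochs=1)  # supervised fine-tuning on human-labelled data
```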
{
