Merge pull request cgpotts#124 from cgpotts/spring2023-prep
Spring2023 prep
cgpotts authored Mar 31, 2023
2 parents 81f217a + cd2a2dc commit 8f74373
Showing 50 changed files with 11,551 additions and 23,522 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -69,3 +69,4 @@ rel_ext_data*
 .DS_Store
 ColBERT*
 experiments*
+cache*
53 changes: 28 additions & 25 deletions README.md
@@ -2,77 +2,80 @@
 
 Code for [the Stanford course](http://web.stanford.edu/class/cs224u/).
 
-Spring 2022
+Spring 2023
 
 [Christopher Potts](http://web.stanford.edu/~cgpotts/)
 
 
-# Core components
+## Core components
 
 
-## `setup.ipynb`
+### `setup.ipynb`
 
 Details on how to get set up to work with this code.
 
 
-## `tutorial_*` notebooks
+### `hw_*.ipynb`
+
+The set of homeworks for the current run of the course.
+
+
+### `tutorial_*` notebooks
 
 Introductions to Jupyter notebooks, scientific computing with NumPy and friends, and PyTorch.
 
 
-## `torch_*.py` modules
+### `torch_*.py` modules
 
 A generic optimization class (`torch_model_base.py`) and subclasses for GloVe, Autoencoders, shallow neural classifiers, RNN classifiers, tree-structured networks, and grounded natural language generation.
 
 `tutorial_pytorch_models.ipynb` shows how to use these modules as a general framework for creating original systems.
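For orientation, the `torch_*.py` classes are meant to share one optimization loop and a scikit-learn-style `fit`/`predict` interface. A minimal sketch of that usage pattern follows, assuming the interface that `tutorial_pytorch_models.ipynb` describes; the class name and keyword arguments below are drawn from the module names above and are not verified against this commit.

```python
# Hedged sketch: assumes the torch_*.py subclasses expose a scikit-learn-style
# fit/predict interface on top of torch_model_base.py's shared training loop.
import numpy as np
from torch_shallow_neural_classifier import TorchShallowNeuralClassifier

rng = np.random.RandomState(42)
X = rng.randn(100, 20)                    # toy feature matrix
y = rng.choice(["pos", "neg"], size=100)  # toy string labels

model = TorchShallowNeuralClassifier(hidden_dim=50, max_iter=100)
model.fit(X, y)            # optimization handled by the shared base class
preds = model.predict(X)   # labels from the same space as y
```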


+### `evaluation_*.ipynb` and `projects.md`
+
+Notebooks covering key experimental methods and practical considerations, and tips on writing up and presenting work in the field.
+
+### `iit*` and `feature_attribution.ipynb`
+
+Part of our unit on explainability and model analysis.
+
-## `np_*.py` modules
+### `np_*.py` modules
+
+This is now considered background material for the course.
 
 Reference implementations for the `torch_*.py` models, designed to reveal more about how the optimization process works.
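The `np_*.py` line above promises a closer look at the optimization process itself. Here is a from-scratch sketch of the kind of explicit update loop such reference implementations make visible; it is illustrative only and borrows nothing from the actual modules.

```python
# From-scratch illustration of an explicit SGD loop (linear least squares).
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.randn(200)

w = np.zeros(3)
eta = 0.05                       # learning rate
for epoch in range(100):
    for x_i, y_i in zip(X, y):
        err = x_i @ w - y_i      # prediction error for this example
        w -= eta * err * x_i     # gradient step on the squared-error loss
print(np.round(w, 2))            # close to w_true
```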


-## `vsm_*` and `hw_wordrelatedness.ipynb`
+### `vsm_*`
+
+This is now considered background material for the course.
 
-A unit on vector space models of meaning, covering traditional methods like PMI and LSA as well as newer methods like Autoencoders and GloVe. `vsm.py` provides a lot of the core functionality, and `torch_glove.py` and `torch_autoencoder.py` are the learned models that we cover. `vsm_03_retroffiting.ipynb` is an extension that uses `retrofitting.py`, and `vsm_04_contextualreps.ipynb` explores methods for deriving static representations from contextual models.
+A unit on vector space models of meaning, covering traditional methods like PMI and LSA as well as newer methods like Autoencoders and GloVe. `vsm.py` provides a lot of the core functionality, and `torch_glove.py` and `torch_autoencoder.py` are the learned models that we cover. `vsm_03_contextualreps.ipynb` explores methods for deriving static representations from contextual models.
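The PMI reweighting mentioned in the `vsm_*` description is compact enough to sketch directly. This is a from-scratch illustration of (positive) PMI on a toy count matrix; `vsm.py`'s own helpers may differ in naming and defaults.

```python
# Positive PMI reweighting of a word-by-context count matrix, from scratch.
# Not vsm.py's implementation; a sketch of the underlying idea.
import numpy as np
import pandas as pd

counts = pd.DataFrame(
    [[10.0, 2.0],
     [3.0, 15.0]],
    index=["gnarly", "wicked"],
    columns=["awesome", "terrible"])

joint = counts / counts.values.sum()                        # P(word, context)
expected = np.outer(joint.sum(axis=1), joint.sum(axis=0))   # P(word) * P(context)
pmi = np.log(joint / expected)                              # observed vs. expected
ppmi = pmi.clip(lower=0)                                    # zero out negative values
print(ppmi.round(2))
```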

-## `sst_*` and `hw_sst.ipynb`
+### `sst_*`
+
+This is now considered background material for the course.
 
 A unit on sentiment analysis with the [English Stanford Sentiment Treebank](https://nlp.stanford.edu/sentiment/treebank.html). The core code is `sst.py`, which includes a flexible experimental framework. All the PyTorch classifiers are put to use as well: `torch_shallow_neural_network.py`, `torch_rnn_classifier.py`, and `torch_tree_nn.py`.
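For a rough picture of what a flexible experimental framework coordinates (a feature function, a training function, and held-out evaluation), here is a generic, self-contained sketch; it deliberately does not reproduce `sst.py`'s actual API.

```python
# Generic experiment pattern: swap phi or the model to run a new experiment.
# Illustrative only; sst.py's real interface is not reproduced here.
from collections import Counter
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def phi(text):
    """Bag-of-words feature function; experiments swap in alternatives here."""
    return Counter(text.lower().split())

train = [("a gorgeous film", "pos"), ("dull and tedious", "neg")]
test = [("gorgeous fun film", "pos")]

vec = DictVectorizer()
X_train = vec.fit_transform([phi(t) for t, _ in train])
X_test = vec.transform([phi(t) for t, _ in test])

model = LogisticRegression().fit(X_train, [y for _, y in train])
print(model.predict(X_test))   # should favor "pos" given the word overlap
```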

-## `rel_ext*` and `hw_rel_ext.ipynb`
-
-A unit on relation extraction with distant supervision.
-
-
-## `nli_*` and `hw_wordentail.ipynb`
-
-A unit on Natural Language Inference. `nli.py` provides core interfaces to a variety of NLI datasets, and an experimental framework. All the PyTorch classifiers are again in heavy use: `torch_shallow_neural_network.py`, `torch_rnn_classifier.py`, and `torch_tree_nn.py`.
-
-
-## `colors*`, `torch_color_describer.py`, and `hw_colors.ipynb`
-
-A unit on grounded natural language generation, focused on generating context-dependent color descriptions using the [English Stanford Colors in Context dataset](https://cocolab.stanford.edu/datasets/colors.html).
 
-## `finetuning.ipynb`
+### `finetuning.ipynb`
+
+This is now considered background material for the course.
 
 Using pretrained parameters from [Hugging Face](https://huggingface.co) for featurization and fine-tuning.
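The featurization route described for `finetuning.ipynb` can be sketched with the Hugging Face `transformers` library; the checkpoint below (`bert-base-uncased`) is an illustrative assumption, not necessarily what the notebook uses.

```python
# Featurization with pretrained Hugging Face parameters: encode text with a
# frozen model and treat its hidden states as fixed features. The checkpoint
# is an assumption for illustration, not taken from finetuning.ipynb.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

batch = tokenizer(["NLU is fun!"], return_tensors="pt")
with torch.no_grad():                            # no fine-tuning: frozen features
    hidden = model(**batch).last_hidden_state    # [batch, seq_len, dim]
features = hidden[:, 0]                          # [CLS] vector per example
```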


-
-## `evaluation_*.ipynb` and `projects.md`
-
-Notebooks covering key experimental methods and practical considerations, and tips on writing up and presenting work in the field.
-
-
-## `utils.py`
+### `utils.py`
 
 Miscellaneous core functions used throughout the code.
 
 
-## `test/`
+### `test/`
 
 To run these tests, use

250 changes: 0 additions & 250 deletions colors.py

This file was deleted.


