Text Algorithms in Economics

Companion Python notebooks to the article 'Text Algorithms in Economics' by Elliott Ash and Stephen Hansen. Notebooks developed by Yabra Muvdi. If you reuse them in teaching or research, please cite the published article in the Annual Review of Economics (2023).

Notebooks content outline

Each notebook includes an "Open In Colab" button that lets the user run it in Google Colab.

Notebook 1: Simple dictionary examples with regular expressions (here)

  • Summary: This notebook illustrates how a simple count of negative and positive terms can generate a sentiment index that correlates with GDP growth.
  • Data: Minutes from the Monetary Policy Committee at the Bank of England.
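
As a rough illustration of the dictionary approach described above, the sketch below counts positive and negative terms with regular expressions and combines them into a net sentiment score. The term lists, the sample sentences, and the names `compile_terms` and `sentiment_index` are hypothetical and are not taken from the notebook; the notebook's own dictionary and scoring rule may differ.

```python
import re

# Hypothetical term stems, chosen only for illustration.
POSITIVE = ["growth", "improve", "strong", "recover"]
NEGATIVE = ["weak", "decline", "risk", "uncertain"]


def compile_terms(stems):
    # \b anchors each stem at the start of a word; \w* also matches inflections
    # such as "risks" or "improved".
    return re.compile(r"\b(?:" + "|".join(stems) + r")\w*", flags=re.IGNORECASE)


POS_RE = compile_terms(POSITIVE)
NEG_RE = compile_terms(NEGATIVE)


def sentiment_index(text):
    """Net sentiment in [-1, 1]: (positive - negative) / (positive + negative)."""
    pos = len(POS_RE.findall(text))
    neg = len(NEG_RE.findall(text))
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


# Invented sentences standing in for meeting minutes.
minutes = {
    "2009-03": "Output was weak and risks to demand remained; the outlook was uncertain.",
    "2014-06": "Growth was strong and the labour market continued to improve.",
}
scores = {date: sentiment_index(text) for date, text in minutes.items()}
```

A series of such scores, one per meeting, is the kind of index that could then be compared with GDP growth.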

Notebook 2: Preprocessing and document-term matrix creation (here)

  • Summary: This notebook illustrates how to apply multiple preprocessing steps to clean text data and build a document-term matrix.
  • Data: Minutes from the Monetary Policy Committee at the Bank of England.
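
A minimal sketch of what preprocessing followed by document-term matrix construction can look like, using only the standard library. The stopword list, the sample sentences, and the function names are assumptions made for illustration; the notebook may apply different or additional steps (for example stemming or lemmatization).

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "of", "to", "and", "in", "was", "is"}


def preprocess(text):
    # Lowercase, keep alphabetic tokens only, then drop stopwords.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def document_term_matrix(docs):
    """Return (vocab, rows), where rows[d][j] counts vocab[j] in document d."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    rows = []
    for doc in tokenized:
        counts = Counter(doc)  # missing terms count as 0
        rows.append([counts[term] for term in vocab])
    return vocab, rows


# Invented sentences standing in for meeting minutes.
docs = [
    "Inflation rose and the Committee raised the rate.",
    "The Committee held the rate; inflation was stable.",
]
vocab, dtm = document_term_matrix(docs)
```

Each row of `dtm` is one document and each column one vocabulary term, so the matrix has as many columns as there are distinct terms left after preprocessing.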

Notebook 3: Dimensionality reduction with LDA (here)

  • Summary: This notebook illustrates how to reduce the dimension of the document-term matrix with one particular method: Latent Dirichlet Allocation (LDA).
  • Data: U.S. State of the Union addresses and Minutes from the Monetary Policy Committee at the Bank of England.
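
To make the dimensionality-reduction idea concrete, here is a from-scratch sketch of LDA estimated with a collapsed Gibbs sampler, written with the standard library only. It is not the notebook's implementation, which may well rely on a library; the function name `lda_gibbs`, the hyperparameter values, and the toy documents are all assumptions. The point is only that each document ends up represented by a short vector of topic shares instead of a full row of term counts.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA on tokenized documents (lists of words)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables updated as topic assignments change.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dt + alpha) * (n_tv + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic shares per document, computed from the final sample.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return vocab, theta, topic_word


# Invented, already-tokenized toy documents.
docs = [
    ["inflation", "rate", "prices", "inflation"],
    ["prices", "rate", "inflation"],
    ["jobs", "wages", "employment", "jobs"],
    ["wages", "employment", "jobs"],
]
vocab, theta, topic_word = lda_gibbs(docs, n_topics=2)
```

Here `theta` has one row per document and `n_topics` columns, so a document that occupied `len(vocab)` columns in the document-term matrix is summarized by two topic shares.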

Notebook 4: Word2Vec (here)

  • Summary: This notebook illustrates how to estimate word embeddings using the word2vec algorithm.
  • Data: Bank of England Inflation Reports and Minutes from the Monetary Policy Committee at the Bank of England.

Notebook 5: Large language models for feature generation (here)

  • Summary: This notebook illustrates multiple strategies to generate embedded representations of text sequences using BERT. It then compares the quality of these representations by using them for a regression task.
  • Data: 10-K reports for selected firms.

Notebook 6: Finetuning a large language model (here)

  • Summary: This notebook illustrates how to finetune a large language model for a particular classification task.
  • Data: 10-K reports for selected firms.

Notebook 7: GPT demonstration (here)

  • Summary: This notebook shows how to interact with GPT using OpenAI's API.