Participatory democracy analysis project using DistilBERT and multi-task learning, deployed with Vertex AI, Kubernetes and DVC.

dendarko/participedia-capstone

Participedia Capstone Project

Multi-task learning pipeline to analyze participatory democracy data, developed as part of the Participedia capstone project.

Overview

This project builds a multi-task learning framework for classification and text analysis of participatory democracy datasets. A pretrained DistilBERT language model serves as the shared base encoder, with task-specific heads on top for classification and embedding generation. The pipeline covers data preprocessing, fine-tuning, evaluation and deployment, and all experiments are tracked with DVC and MLflow.
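Since the codebase is not public, the following is only a minimal sketch of the shared-encoder idea, not the project's implementation. It assumes the encoder has already produced one pooled embedding per text (a 3-dimensional toy vector here; DistilBERT base uses 768 dimensions), that each task head is a single linear layer, and that the per-task cross-entropy losses are combined as a weighted sum. The task names `method` and `issue`, the weights, and all numbers are hypothetical.

```python
import math


def linear(x, weights, bias):
    """Apply one task head: logits[j] = dot(weights[j], x) + bias[j]."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])


def multitask_loss(embedding, heads, labels, task_weights):
    """Weighted sum of per-task losses computed from one shared embedding."""
    total = 0.0
    for task, (weights, bias) in heads.items():
        logits = linear(embedding, weights, bias)
        total += task_weights[task] * cross_entropy(logits, labels[task])
    return total


# Toy stand-in for the pooled output of the shared encoder (assumption).
embedding = [0.5, -1.0, 2.0]

# Hypothetical heads: a 2-class task and a 3-class task.
heads = {
    "method": ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.0, 0.0]),
    "issue": ([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0], [0.0, 0.0, 0.0]],
              [0.0, 0.0, 0.0]),
}
labels = {"method": 0, "issue": 0}
task_weights = {"method": 1.0, "issue": 0.5}

loss = multitask_loss(embedding, heads, labels, task_weights)
print(f"combined loss: {loss:.4f}")
```

Because every head reads the same embedding, the gradient of the combined loss with respect to the encoder is the weighted sum of the per-task gradients, which is what lets the tasks share one representation.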

Features

  • Data preparation: Cleans and preprocesses datasets from Participedia, performing tokenization and formatting for transformer inputs.
  • Multi-task learning: Fine-tunes a single shared DistilBERT encoder across several classification tasks; the encoder produces contextual embeddings and the task-specific heads classify the input texts.
  • Experiment tracking & versioning: Uses DVC and MLflow to version data and models, log metrics and manage experiments.
  • Deployment: Provides deployment scripts for serving the model on Vertex AI and Kubernetes.
  • Reproducible infrastructure: Includes containerization via Docker and infrastructure-as-code for consistent environments.
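As an illustration of the data preparation step listed above, here is a hedged sketch of cleaning raw text and formatting it as fixed-length transformer input. It is not the project's code: the cleaning rules, the function names `clean_text` and `encode`, and the tiny vocabulary are assumptions, and a whitespace split stands in for DistilBERT's WordPiece tokenizer so the example runs with the standard library alone.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")   # crude HTML tag matcher
WS_RE = re.compile(r"\s+")        # any run of whitespace


def clean_text(raw):
    """Decode HTML entities, drop tags, and collapse whitespace."""
    text = html.unescape(raw)
    text = TAG_RE.sub(" ", text)
    return WS_RE.sub(" ", text).strip()


def encode(text, vocab, max_length):
    """Map text to padded token ids plus an attention mask.

    Whitespace splitting is a stand-in for a real subword tokenizer.
    """
    tokens = clean_text(text).lower().split()[:max_length]
    ids = [vocab.get(tok, vocab["[UNK]"]) for tok in tokens]
    padding = max_length - len(ids)
    return {
        "input_ids": ids + [vocab["[PAD]"]] * padding,
        "attention_mask": [1] * len(ids) + [0] * padding,
    }


# Hypothetical toy vocabulary; a real one comes with the pretrained model.
vocab = {"[PAD]": 0, "[UNK]": 1, "citizens": 2, "assembly": 3}

sample = "<p>Citizens   assembly in\nIreland</p>"
encoded = encode(sample, vocab, max_length=6)
print(clean_text(sample))
print(encoded)
```

The attention mask marks which positions hold real tokens, so padded positions can be ignored by the encoder; truncating to `max_length` keeps every example the same shape for batching.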

Usage

This repository contains only a high-level description. The complete codebase and dataset remain proprietary. To learn more about the project or discuss collaboration opportunities, please contact me or explore the summary on my portfolio site.
