A multi-task learning pipeline for analyzing participatory democracy data, developed as part of the Participedia capstone project.
This project develops a multi-task learning framework for classification and text analysis on participatory democracy datasets. It uses a pretrained DistilBERT language model as the shared base encoder and attaches task-specific heads for classification and embedding generation. The pipeline covers data preprocessing, fine-tuning, evaluation, and deployment, with all experiments tracked using DVC and MLflow.
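The shared-encoder, task-specific-heads pattern described above can be sketched as follows. This is a minimal illustration, not the project's code: a toy embedding encoder stands in for the pretrained DistilBERT model, and all names and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """Shared encoder with two task-specific heads (sketch only;
    the real project uses pretrained DistilBERT as the encoder)."""

    def __init__(self, vocab_size=100, hidden=32, n_classes=4):
        super().__init__()
        # Toy stand-in for DistilBERT: embedding lookup + mean pooling.
        self.encoder = nn.Embedding(vocab_size, hidden)
        # Head 1: classify the input text into n_classes labels.
        self.classifier = nn.Linear(hidden, n_classes)
        # Head 2: project pooled states into a contextual embedding.
        self.projector = nn.Linear(hidden, hidden)

    def forward(self, input_ids):
        pooled = self.encoder(input_ids).mean(dim=1)  # [batch, hidden]
        return self.classifier(pooled), self.projector(pooled)

model = MultiTaskModel()
logits, embeddings = model(torch.randint(0, 100, (2, 8)))
```

Because both heads share one encoder, gradients from every task update the same contextual representation, which is the core idea behind the multi-task fine-tuning used here.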
- Data preparation: Cleans and preprocesses datasets from Participedia, performing tokenization and formatting for transformer inputs.
- Multi-task learning: Fine-tunes a shared DistilBERT model across classification tasks to extract contextual embeddings and classify input texts.
- Experiment tracking & versioning: Uses DVC and MLflow to version data and models, log metrics and manage experiments.
- Deployment: Provides deployment scripts for serving the model on Vertex AI and Kubernetes.
- Reproducible infrastructure: Includes containerization via Docker and infrastructure-as-code for consistent environments.
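Versioning with DVC is typically driven by a `dvc.yaml` pipeline file that chains the stages listed above. A hypothetical layout (stage names, scripts, and paths are illustrative, not the project's actual files):

```yaml
stages:
  preprocess:
    cmd: python src/preprocess.py   # hypothetical script path
    deps:
      - data/raw
    outs:
      - data/processed
  train:
    cmd: python src/train.py        # fine-tunes the shared encoder
    deps:
      - data/processed
    outs:
      - models/model.pt
```

With such a file, `dvc repro` re-runs only the stages whose dependencies changed, while MLflow logs the metrics produced by each run.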
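The data-preparation step above might look roughly like the sketch below. The function name, cleaning rules, and truncation length are illustrative assumptions, not the project's actual preprocessing code.

```python
import re

def clean_text(text: str, max_chars: int = 512) -> str:
    """Normalize a raw Participedia-style entry before tokenization
    (hypothetical rules; the real cleaning pipeline is project-specific)."""
    text = re.sub(r"<[^>]+>", " ", text)      # drop stray HTML tags
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return text[:max_chars]                   # truncate long inputs

print(clean_text("  A <b>public</b>   assembly\n case study "))
# → A public assembly case study
```

The cleaned string would then be passed to a transformer tokenizer to produce the model's input IDs.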
This repository contains only a high-level description. The complete codebase and dataset remain proprietary. To learn more about the project or discuss collaboration opportunities, please contact me or explore the summary on my portfolio site.