NorthJerseyProject/data at main · ChaseH01/NorthJerseyProject

README.md
combine_datasets.py
emotional_data.jsonl
environmental_grounding_data.jsonl
everyday_topics_data.jsonl
guardrails_data.jsonl
loading-hf-set.py
master_dataset.jsonl
multi_turn_data.jsonl
organized_environmental_grounding_data.jsonl
organized_multi_turn.jsonl
prompts.md
question_answer_data.jsonl

Dataset Breakdown

soprano_dpo_pairs dataset: Teaches the model to answer logical, conceptual, and factual questions in Tony's voice so it can handle structured prompts reliably.
Multi-turn dialog: Trains the model to sustain conversation across multiple exchanges, maintaining context and persona without resetting each turn.
Emotional variation: Gives the model the ability to shift tone (angry, tired, reflective, amused) so responses feel alive rather than flat or repetitive.
Everyday topics: Builds natural conversational range (food, sports, advice, small talk) so Tony feels like a real person instead of a narrow Q&A filter.
Guardrails: Ensures the model refuses unsafe or inappropriate requests in character, without breaking persona or reverting to a generic assistant tone.
Environmental grounding: Teaches the model how to handle contradictions, follow-ups, corrections, and shifts in context so replies stay coherent and adaptive.
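
The categories above live in separate `.jsonl` files next to a `master_dataset.jsonl`. As a rough illustration of how such files could be merged, here is a minimal sketch. It is not the repository's `combine_datasets.py`, whose contents are not shown here. The function name `combine_jsonl` and the added `category` field are hypothetical, and the sketch assumes each line of every file is one JSON object.

```python
import json


def combine_jsonl(sources, out_path):
    """Merge several JSONL files into a single JSONL file.

    `sources` maps a category label to a file path. Each record is tagged
    with a "category" field (a hypothetical addition, not part of the
    original data) unless it already has one. Blank lines are skipped.
    Returns the number of records written.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for category, path in sources.items():
            with open(path, encoding="utf-8") as src:
                for line_no, line in enumerate(src, start=1):
                    line = line.strip()
                    if not line:
                        continue  # tolerate blank lines between records
                    try:
                        record = json.loads(line)
                    except json.JSONDecodeError as exc:
                        # Report the file and line so bad rows are easy to find.
                        raise ValueError(f"{path}:{line_no}: invalid JSON") from exc
                    if not isinstance(record, dict):
                        raise ValueError(f"{path}:{line_no}: expected a JSON object")
                    # Keep an existing label; otherwise record the source category.
                    record.setdefault("category", category)
                    out.write(json.dumps(record, ensure_ascii=False) + "\n")
                    written += 1
    return written
```

A call such as `combine_jsonl({"emotional": "emotional_data.jsonl", "guardrails": "guardrails_data.jsonl"}, "master_dataset.jsonl")` would write every record from both files into one output and return the total count. Tagging each record with its source is a design choice that makes it possible to rebalance or filter categories later without going back to the original files.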