MedASL

MedASL is an American Sign Language (ASL) corpus focused on medical communication, with gloss and text annotations. It is designed to support researchers and industry professionals in advancing Sign Language Machine Translation systems. By incorporating medical terminology and advanced data acquisition hardware such as the Intel RealSense camera, MedASL enables the development of accurate, context-aware models that reflect real-world healthcare scenarios. The dataset consists of 500 medical and healthcare-related statements, generated via prompt engineering with ChatGPT and signed by an ASL expert, simulating realistic dialogues between patients and healthcare professionals.

The repository contains the MedASL dataset, which is divided into two subfolders: Annotations and Videos.

The Annotations subfolder includes a CSV file that lists the 500 medical-related sentences along with their corresponding glosses and the file paths of the related ASL videos.

The Videos subfolder comprises 500 subfolders, one per sentence in the dataset. Each sentence is signed in ASL, and its video is stored as a sequence of .npy files, each representing one second of the signed video.
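
For orientation, here is a minimal loading sketch for one sentence. The CSV file name and column names ("sentence", "gloss", "video_path") are assumptions inferred from the description above, not documented identifiers:

```python
# Minimal loading sketch. The CSV path and column names are assumptions.
import csv
import glob
import os

import numpy as np

with open("Annotations/medasl.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

row = rows[0]
# Each sentence folder holds one .npy file per second of signed video.
npy_files = sorted(glob.glob(os.path.join(row["video_path"], "*.npy")))
seconds = [np.load(p) for p in npy_files]
video = np.concatenate(seconds, axis=0)  # stack the one-second chunks
print(row["sentence"], row["gloss"], video.shape)
```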

Prompt Engineering Design:

To create MedASL, we design prompts using the following methodology:

• High-Level Prompt Structure
We design a high-level prompt to generate realistic medical conversations in the following format:
“Generate a realistic medical interaction between a patient and a [doctor/nurse/pharmacist/technician] in a healthcare setting. The conversation should involve common symptoms, medical advice, and questions about treatments or prescriptions. Ensure the language is clear, professional, and appropriate for real-world scenarios.”

• Refinement Process
We refine the high-level prompt by dividing it into low-level prompts, which improves the coherence and relevance of the generated sentences. A typical low-level prompt variation is:
“Generate 10 medical-related statements that a nurse might say when checking a patient’s vitals.”
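
As an illustration, the sketch below shows how such a low-level prompt could be sent to ChatGPT programmatically. The client usage, model name, and line-based parsing are assumptions, not the authors' documented pipeline:

```python
# Prompt-generation sketch (assumption): query ChatGPT with a low-level
# prompt via the openai Python client (v1.x style). Model name and
# output parsing are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = ("Generate 10 medical-related statements that a nurse "
          "might say when checking a patient's vitals.")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; any chat model works here
    messages=[{"role": "user", "content": prompt}],
)

# Assume one statement per line in the reply.
statements = [line.strip()
              for line in response.choices[0].message.content.splitlines()
              if line.strip()]
print(statements)
```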

Data Recording Process:

We recorded the sign videos using an Intel RealSense camera at a resolution of 1280×800 and stored them in “.npy” format.
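
The sketch below illustrates one way such a recording loop could look, assuming the pyrealsense2 Python bindings; the stream type, format, and chunking are assumptions, and supported resolutions vary by camera model:

```python
# Recording sketch (assumption): grab frames with pyrealsense2 and save
# each one-second chunk as a .npy file. Stream availability at 1280x800
# depends on the specific RealSense model.
import numpy as np
import pyrealsense2 as rs

FPS = 30
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 1280, 800, rs.format.bgr8, FPS)
pipeline.start(config)

try:
    buffer = []
    for i in range(FPS * 5):  # capture 5 seconds as an example
        frames = pipeline.wait_for_frames()
        color = frames.get_color_frame()
        buffer.append(np.asanyarray(color.get_data()))
        if len(buffer) == FPS:  # one second of video collected
            np.save(f"second_{i // FPS:03d}.npy", np.stack(buffer))
            buffer = []
finally:
    pipeline.stop()
```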

Data Pre-processing:

For the sign language gloss and spoken language text, we applied the following additional pre-processing steps (a toy sketch follows the list):

• Building Vocabularies: We create unique vocabularies for gloss and text data, including a special token to represent unknown words.
• Assigning Unique Indices: We assign a unique index to each word in the gloss and text vocabularies so that sequences can be encoded numerically.
• Tokenizing: We tokenize the gloss and text sequences into individual units to enable efficient input representation.
• Padding: We apply zero-padding to the sequences, ensuring uniform lengths for batch processing.
• Adding Special Tokens: We add special tokens such as <sos> (start of sequence) and <eos> (end of sequence) to mark the sequence boundaries.
• Gloss Alignment: We map each gloss annotation to its corresponding spoken language sentence.
• Data Serialization: We store the pre-processed gloss, text, and video data in a standardized “.pkl” format for efficient input loading during model training.
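
The following toy sketch walks through these steps on a single sentence pair. The special-token names, index conventions, and output file name are assumptions, not the released pre-processing code:

```python
# Toy sketch of the gloss/text pre-processing steps above. Special-token
# names, index order, and the output file name are assumptions.
import pickle

PAD, UNK, SOS, EOS = "<pad>", "<unk>", "<sos>", "<eos>"

def build_vocab(sentences):
    """Assign a unique index to every word, reserving special tokens."""
    vocab = {PAD: 0, UNK: 1, SOS: 2, EOS: 3}
    for sent in sentences:
        for tok in sent.split():  # simple whitespace tokenization
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(sentence, vocab, max_len):
    """Tokenize, add boundary tokens, map to indices, and zero-pad."""
    toks = [SOS] + sentence.split() + [EOS]
    ids = [vocab.get(t, vocab[UNK]) for t in toks]
    return ids + [vocab[PAD]] * (max_len - len(ids))

texts = ["your blood pressure looks normal today"]
glosses = ["BLOOD PRESSURE NORMAL TODAY"]  # gloss aligned to the text

text_vocab, gloss_vocab = build_vocab(texts), build_vocab(glosses)
sample = {
    "text": encode(texts[0], text_vocab, 12),
    "gloss": encode(glosses[0], gloss_vocab, 12),
}

with open("medasl_preprocessed.pkl", "wb") as f:  # serialization step
    pickle.dump(sample, f)
```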

To prepare the video data for training, we applied the following pre-processing steps (a keypoint-extraction sketch follows the list):

• Video Frame Extraction: We sample the videos at 30 frames per second (fps) to maintain motion fidelity.
• Video Keypoint Extraction: We represent every frame by its corresponding keypoints, extracted using the MediaPipe Python library.
• Frame Concatenation: We concatenate the video frames of each sentence into a continuous sequence aligned with the gloss annotations.
• Padding: We apply zero-padding to align frame lengths across all videos.
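
A minimal keypoint-extraction sketch is shown below, using the MediaPipe Holistic solution. The input file name comes from the recording sketch above, and restricting the output to the 33 pose landmarks is an assumption; the exact landmark set used for MedASL is not specified:

```python
# Keypoint-extraction sketch (assumption): run MediaPipe Holistic over
# the frames of one chunk, yielding one (33, 3) pose-keypoint array per
# frame. Hand and face landmarks could be added the same way.
import mediapipe as mp
import numpy as np

frames = np.load("second_000.npy")  # one second of BGR frames

keypoints = []
with mp.solutions.holistic.Holistic(static_image_mode=False) as holistic:
    for frame in frames:
        # MediaPipe expects contiguous RGB input; the recorder stored BGR.
        results = holistic.process(np.ascontiguousarray(frame[..., ::-1]))
        if results.pose_landmarks is None:
            kp = np.zeros((33, 3))  # zero-fill frames with no detection
        else:
            kp = np.array([[lm.x, lm.y, lm.z]
                           for lm in results.pose_landmarks.landmark])
        keypoints.append(kp)

sequence = np.stack(keypoints)  # (num_frames, 33, 3) for this chunk
print(sequence.shape)
```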
