A real-time data cleaning pipeline for medical and healthcare data using Apache Spark, Spark NLP, Spark Streaming, and Kafka.
Updated Mar 18, 2025 - Python
Python scripts to process and analyze log files using PySpark.
A project that captures information about all Dark Souls 3 (DS3) weapons and performs textual analysis on it.
Performs sentiment analysis on the Yelp dataset with Apache Spark.
An implementation of the NLP Sandbox PHI Annotator API based on Spark NLP.