Hands-on project demos covering infrastructure automation (Ansible, Docker), big-data processing & streaming (Hive, Spark, Kafka), and network experiments (MitM, TCP-over-UDP).

DevOps, DataOps & Networking

This repository is a collection of projects and demos across different technologies: Ansible, Docker, Hive, Spark, Hadoop, Kafka, HBase, Networking, and Jupyter.
Each project lives in its own folder with a dedicated README.md containing detailed setup and usage instructions.

Repository Contents

| Folder | Project | Description |
| --- | --- | --- |
| Ansible | Ansible Role: Nginx + Service State Cron | Installs nginx and a cron job that maintains service_state. Idempotent, OS-aware (CentOS, RHEL, Arch, Debian, Ubuntu). |
| Apache-Hive | Hive Transactions Analysis | Hive SQL tasks on transactions: format comparison (TEXT/ORC/PARQUET), profit analysis, shift violations. |
| Apache-Spark | Spark Solutions | BFS shortest path, collocations (NPMI), streaming user segmentation, and common friends analysis. Implemented with RDD, DataFrames, and Streaming. |
| Docker-Compose | Dockerized Web + DB Pipeline | MariaDB database + filler service (CSV loader) + web app with /health and / endpoints. |
| Hadoop-MapReduce | MapReduce Jobs | Hadoop streaming tasks: Wikipedia proper-name frequency and system log error analysis. |
| Kafka | Kafka & Spark Streaming Segmentation | Parses user agents and streams counts to Kafka topics. |
| MitM-attack | Man-in-the-Middle Network Attack | Dockerized network simulation: ARP spoofing and TCP injection by Eve to intercept requests. |
| NoSQL-HBase | HBase + Spark + HappyBase | Spark job ingests game logs into HBase; CLI reader queries top-10 weapons per match within coordinates. |
| TCP-over-UDP | Reliable UDP Delivery Protocol | Custom TCP-like protocol built on UDP with tunable MTU, sliding windows, and retransmission logic. |
| Tmux-Venv-JupyterNotebook | Jupyter Notebook Tmux Manager | Manages multiple Jupyter servers in isolated venvs, each in its own tmux window; state tracked in master.json. |

Getting Started

Clone the repository:

git clone https://github.com/mixaisealx/DevOps-n-DataOps.git
cd DevOps-n-DataOps

Each project can be explored individually by entering its folder and following the instructions in its README.md.
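
To see which folders are available before opening one, a small shell helper along these lines may be convenient. This is only a sketch: `list_projects` is a hypothetical name, not something shipped in the repository, and it assumes each project keeps its README.md at the top level of its folder.

```shell
# Hypothetical helper (not part of the repository): print the name of every
# top-level folder under the given root that contains a README.md.
list_projects() {
    root="${1:-.}"
    for dir in "$root"/*/; do
        # Skip the literal pattern when the root has no subfolders.
        [ -d "$dir" ] || continue
        if [ -f "${dir}README.md" ]; then
            basename "$dir"
        fi
    done
}
```

After defining the function, running `list_projects .` from the repository root should print one folder name per line; `cd` into any of them and open its README.md for the setup steps.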

Technologies Covered

  • Infrastructure automation -> Ansible, Docker Compose
  • Big Data -> Apache Hive, Apache Spark, Hadoop MapReduce
  • Streaming -> Kafka, Spark Streaming
  • NoSQL -> HBase (with Spark + HappyBase)
  • Networking -> Custom UDP protocol, MitM attack
  • Productivity tools -> Jupyter Notebook manager with Tmux + venv

License

This repository is licensed under the Apache License 2.0. See the LICENSE file for details.
