This repository is a collection of projects and demos across different technologies: Ansible, Docker, Hive, Spark, Hadoop, Kafka, HBase, Networking, and Jupyter.
Each project lives in its own folder with a dedicated README.md containing detailed setup and usage instructions.
Folder | Project | Description |
---|---|---|
Ansible | Ansible Role: Nginx + Service State Cron | Installs nginx and a cron job that maintains `service_state`. Idempotent, OS-aware (CentOS, RHEL, Arch, Debian, Ubuntu). |
Apache-Hive | Hive Transactions Analysis | Hive SQL tasks on transactions: format comparison (TEXT/ORC/PARQUET), profit analysis, shift violations. |
Apache-Spark | Spark Solutions | BFS shortest path, collocations (NPMI), streaming user segmentation, and common friends analysis. Implemented with RDD, DataFrames, and Streaming. |
Docker-Compose | Dockerized Web + DB Pipeline | MariaDB database + filler service (CSV loader) + web app with /health and / endpoints. |
Hadoop-MapReduce | MapReduce Jobs | Hadoop streaming tasks: Wikipedia proper-name frequency and system log error analysis. |
Kafka | Kafka & Spark Streaming Segmentation | Parses user agents and streams counts to Kafka topics. |
MitM-attack | Man-in-the-Middle Network Attack | Dockerized network simulation: ARP spoofing and TCP injection by Eve to intercept requests. |
NoSQL-HBase | HBase + Spark + HappyBase | Spark job ingests game logs into HBase; CLI reader queries top-10 weapons per match within coordinates. |
TCP-over-UDP | Reliable UDP Delivery Protocol | Custom TCP-like protocol built on UDP with tunable MTU, sliding windows, and retransmission logic. |
Tmux-Venv-JupyterNotebook | Jupyter Notebook Tmux Manager | Manages multiple Jupyter servers in isolated `venv`s, each in its own tmux window; state tracked in `master.json`. |
Clone the repository:
git clone https://github.com/mixaisealx/DevOps-n-DataOps.git
cd DevOps-n-DataOps
Each project can be explored individually by entering its folder and following the instructions in its README.md.
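To get a quick overview after cloning, something like the following sketch can list the project folders and confirm that each one ships a README.md. It is not part of the repository: the `list_projects` helper is a hypothetical name, and the script assumes that every non-hidden top-level directory is a project folder.

```python
from pathlib import Path


def list_projects(repo_root="."):
    """Return (folder_name, has_readme) pairs for each top-level folder.

    Assumption: every non-hidden top-level directory is a project folder.
    """
    root = Path(repo_root)
    projects = []
    for entry in sorted(root.iterdir(), key=lambda p: p.name):
        # Skip plain files (e.g. LICENSE) and hidden folders such as .git.
        if not entry.is_dir() or entry.name.startswith("."):
            continue
        projects.append((entry.name, (entry / "README.md").is_file()))
    return projects


if __name__ == "__main__":
    for name, has_readme in list_projects():
        status = "README.md found" if has_readme else "README.md missing"
        print(f"{name}: {status}")
```

Run it from the repository root (for example, after `cd DevOps-n-DataOps`) to see which folders to open first.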
- Infrastructure automation -> Ansible, Docker Compose
- Big Data -> Apache Hive, Apache Spark, Hadoop MapReduce
- Streaming -> Kafka, Spark Streaming
- NoSQL -> HBase (with Spark + HappyBase)
- Networking -> Custom UDP protocol, MitM attack
- Productivity tools -> Jupyter Notebook manager with Tmux + venv
This repository is licensed under the Apache License 2.0. See the LICENSE file for details.