Commit ee71fcd

image and 2 data structure folder
1 parent 8034b0c commit ee71fcd

3 files changed

+391
-0
lines changed

@@ -0,0 +1,391 @@
1+
# Data Science ML Full Stack
2+
3+
![Robot](https://github.com/hemansnation/Data-Science-ML-Full-Stack/blob/master/images/a-robot-with-lights-on-jx5pkvw0.jpeg)
4+
5+
### What will we do and gain?
6+
7+
- Build an in-depth understanding of all the data concepts.
8+
- Create a strong social media profile on LinkedIn and GitHub.
9+
- Build 15+ projects including 5+ Major Projects.
10+
- Showcase your skills with a portfolio of real projects.
11+
- Work on Live projects in parallel to understand how companies create end-to-end software solutions and apply ML models to real-life problems.
12+
13+
### The Roadmap is divided into 16 Sections
14+
15+
Duration: 256 Hours of Learning (8 Months), plus many more hours for practice and project building.
16+
17+
## Month 1 - May
18+
19+
1. [Python Programming and Logic Building](#1--python-programming-and-logic-building)
20+
2. [Data Structure & Algorithms](#2--data-structure--algorithms)
21+
22+
## Month 2 - June
23+
24+
3. [Pandas, NumPy, and Matplotlib](#3--pandas-numpy-matplotlib)
25+
4. [Statistics](#4--statistics)
26+
27+
## Month 3 - July
28+
29+
5. [Machine Learning](#5--machine-learning)
30+
6. [ML Operations](#6--mlops)
31+
32+
## Month 4 - August
33+
34+
7. [Natural Language Processing](#7--natural-language-processing)
35+
8. [Computer Vision](#8--computer-vision)
36+
37+
## Month 5 - September
38+
39+
9. [Data Visualization with Tableau](#9--data-visualization-with-tableau)
40+
10. [Structured Query Language (SQL)](#10--structure-query-language-sql)
41+
42+
## Month 6 - October
43+
44+
11. [Data Engineering](#11--data-engineering)
45+
12. [Data System Design](#12--data-system-design)
46+
47+
## Month 7 - November
48+
49+
13. [Five Major Capstone Projects](#13--five-major-projects-and-git)
50+
14. [Interview Preparation](#14--interview-preperation)
51+
52+
## Month 8 - December
53+
54+
15. [Git & GitHub](#15--git--github)
55+
16. [Personal Branding and Portfolio](#16--personal-profile--portfolio)
56+
57+
58+
### [Resources](#resources)
59+
60+
- [Dataset Collection]()
61+
62+
### Technology Stack
63+
64+
- Python
65+
- Data Structures
66+
- NumPy
67+
- Pandas
68+
- Matplotlib
69+
- Seaborn
70+
- Scikit-Learn
71+
- Statsmodels
72+
- Natural Language Toolkit (NLTK)
73+
- PyTorch
74+
- OpenCV
75+
- Tableau
76+
- Structured Query Language (SQL)
77+
- PySpark
78+
- Azure Fundamentals
79+
- Azure Data Factory
80+
- Databricks
81+
- 5 Major Projects
82+
- Git and GitHub
83+
84+
85+
# 1 | Python Programming and Logic Building
86+
I recommend the Python programming language. Python is a good choice for starting your programming journey. Here is the Python roadmap for logic building.
87+
88+
- Python basics, Variables, Operators, Conditional Statements
89+
- List and Strings
90+
- While Loop, Nested Loops, Loop Else
91+
- For Loop, Break, and Continue statements
92+
- Functions, Return Statement, Recursion
93+
- Dictionary, Tuple, Set
94+
- File Handling, Exception Handling
95+
- Object-Oriented Programming
96+
- Modules and Packages
97+
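As a rough illustration of how several of the topics above fit together (functions, recursion, dictionaries, loops, and exception handling), here is a minimal sketch. The function names `factorial`, `count_words`, and `safe_divide` are made up for this example and are not part of the course material.

```python
from typing import Optional


def factorial(n: int) -> int:
    """Return n! using recursion; n must be a non-negative integer."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Base case stops the recursion; otherwise multiply by the smaller problem.
    return 1 if n <= 1 else n * factorial(n - 1)


def count_words(text: str) -> dict:
    """Count how often each lowercase word appears, using a dictionary."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def safe_divide(a: float, b: float) -> Optional[float]:
    """Divide a by b, returning None instead of raising on division by zero."""
    try:
        return a / b
    except ZeroDivisionError:
        return None
```

Calling `factorial(5)` gives `120`, and `safe_divide(1, 0)` returns `None` rather than crashing the program.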
98+
<a href="https://github.com/hemansnation/Python-Roadmap-2022">In-Depth Roadmap of Python</a>
99+
100+
# 2 | Data Structure & Algorithms
101+
Data structures are among the most important topics to learn, not only for data scientists but for everyone working in computer science. Studying them gives you an understanding of how software works internally.
102+
103+
Understand these topics:
104+
105+
- Types of Algorithm Analysis
106+
- Asymptotic Notation, Big-O, Omega, Theta
107+
- Stacks
108+
- Queues
109+
- Linked List
110+
- Trees
111+
- Graphs
112+
- Sorting
113+
- Searching
114+
- Hashing
115+
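A small sketch of three of the topics above: a stack, a queue, and searching. It uses only the Python standard library; `binary_search` is a hypothetical helper written for this example, and it assumes the input list is already sorted.

```python
from collections import deque

# Stack: last in, first out. A Python list works well for this.
stack = []
stack.append(1)
stack.append(2)
top = stack.pop()          # removes the most recently added item

# Queue: first in, first out. deque gives O(1) pops from the left.
queue = deque()
queue.append("a")
queue.append("b")
first = queue.popleft()    # removes the oldest item


def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent.

    Each step halves the search range, so the running time is O(log n).
    """
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```

Binary search is a useful first exercise in asymptotic analysis: compare its O(log n) cost with the O(n) cost of scanning the list item by item.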
116+
117+
# 3 | Pandas Numpy Matplotlib
118+
NumPy adds n-dimensional arrays to Python. For two-dimensional (tabular) data, Pandas is a widely used library for analysis. Drag-and-drop tools can do similar work, but they are limited to the features they ship with. With Pandas you write code, so the analysis can be tailored to the real-life problem at hand.
119+
120+
### NumPy
121+
- Vectors and Matrices
122+
- Operations on Matrices
123+
- Mean, Variance, and Standard Deviation
124+
- Reshaping Arrays
125+
- Transpose and Determinant of a Matrix
126+
- Diagonal Operations, Trace
127+
- Add, Subtract, Multiply, Dot, and Cross Product
128+
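The NumPy topics above can be tried out in a few lines. This is a minimal sketch with made-up values; the variable names are chosen for this example only.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])

# Reshaping and transposing
grid = np.arange(6).reshape(2, 3)   # 2 rows, 3 columns
grid_t = grid.T                     # transpose has shape (3, 2)

# Matrix operations
det_a = np.linalg.det(a)            # determinant: 1*4 - 2*3
trace_a = np.trace(a)               # sum of the diagonal: 1 + 4
product = a @ b                     # matrix product
elementwise = a * b                 # element-by-element product

# Dot and cross product of two 3D vectors
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
dot_uv = np.dot(u, v)
cross_uv = np.cross(u, v)

# Summary statistics (population variance and standard deviation by default)
mean_a = a.mean()
var_a = a.var()
std_a = a.std()
```

Note that `*` multiplies element by element, while `@` performs matrix multiplication; mixing the two up is a common beginner mistake.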
129+
### Pandas
130+
- Series and DataFrames
131+
- Slicing, Rows, and Columns
132+
- Operations on DataFrame
133+
- Different ways to create DataFrame
134+
- Read, Write Operations with CSV files
135+
- Handling Missing Values, Replacing Values, and Regular Expressions
136+
- GroupBy and Concatenation
137+
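Here is a short sketch of several Pandas topics from the list above: creating a DataFrame, handling a missing value, grouping, concatenating, and selecting rows. The city names and sales figures are invented for illustration.

```python
import pandas as pd

# Create a DataFrame from a dictionary; None becomes a missing value (NaN).
df = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi"],
    "sales": [100.0, None, 250.0, 150.0],
})

# Handle the missing value by replacing it with 0.
df["sales"] = df["sales"].fillna(0.0)

# GroupBy: total sales per city.
totals = df.groupby("city")["sales"].sum()

# Concatenation: append rows from a second DataFrame.
extra = pd.DataFrame({"city": ["Mumbai"], "sales": [300.0]})
combined = pd.concat([df, extra], ignore_index=True)

# Row selection with a boolean condition.
pune_rows = combined[combined["city"] == "Pune"]
```

Whether a missing value should be filled with 0, filled with a mean, or dropped depends on the problem; `fillna(0.0)` is used here only to keep the example short.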
138+
139+
### Matplotlib
140+
- Graph Basics
141+
- Format Strings in Plots
142+
- Label Parameters, Legend
143+
- Bar Chart, Pie Chart, Histogram, Scatter Plot
144+
145+
146+
# 4 | Statistics
147+
148+
### Descriptive Statistics
149+
- Measure of Frequency and Central Tendency
150+
- Measure of Dispersion
151+
- Probability Distribution
152+
- Gaussian (Normal) Distribution
153+
- Skewness and Kurtosis
154+
- Regression Analysis
155+
- Continuous and Discrete Functions
156+
- Goodness of Fit
157+
- Normality Test
158+
- ANOVA
159+
- Homoscedasticity
160+
- Linear and Non-Linear Relationship with Regression
161+
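The measures of central tendency and dispersion listed above can be computed with Python's built-in `statistics` module. This is a minimal sketch on a made-up list of scores.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency
mean = statistics.mean(scores)        # arithmetic average
median = statistics.median(scores)    # middle value of the sorted data
mode = statistics.mode(scores)        # most frequent value

# Dispersion
pvariance = statistics.pvariance(scores)   # population variance (divide by n)
pstdev = statistics.pstdev(scores)         # population standard deviation
sample_var = statistics.variance(scores)   # sample variance (divide by n - 1)
```

The difference between `pvariance` and `variance` is the divisor: use the sample version when the data is a sample drawn from a larger population.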
162+
### Inferential Statistics
163+
- t-Test
164+
- z-Test
165+
- Hypothesis Testing
166+
- Type I and Type II errors
167+
- t-Test and its types
168+
- One-way ANOVA
169+
- Two-way ANOVA
170+
- Chi-Square Test
171+
- Implementing tests on continuous and categorical data
172+
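To make the t-test concrete, the sketch below computes the one-sample t statistic, t = (sample mean - mu0) / (s / sqrt(n)), using only the standard library. The helper name `one_sample_t` and the data are assumptions made for this example. It does not compute a p-value; for that you would typically use a statistics library such as SciPy (`scipy.stats.ttest_1samp`).

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the one-sample t statistic for the null hypothesis mean == mu0."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_std = statistics.stdev(sample)        # uses the n - 1 divisor
    standard_error = sample_std / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


t_stat = one_sample_t([1, 2, 3, 4, 5], mu0=2)
```

A t statistic far from zero suggests the sample mean differs from `mu0`; deciding whether the difference is significant requires comparing it against the t distribution with n - 1 degrees of freedom.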
173+
174+
# 5 | Machine Learning
175+
A practical way to learn machine learning algorithms is to work with the Scikit-Learn library. Scikit-Learn ships ready-made implementations of the algorithms, and you use one by creating an object of its class. These are the algorithms you must know, covering both Supervised and Unsupervised Machine Learning:
176+
177+
- Linear Regression
178+
- Logistic Regression
179+
- Decision Tree
180+
- Gradient Descent
181+
- Random Forest
182+
- Ridge and Lasso Regression
183+
- Naive Bayes
184+
- Support Vector Machine
185+
- KMeans Clustering
186+
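As an illustration of "creating an object of the class", here is a minimal Scikit-Learn sketch that fits a linear regression. The data is synthetic (it follows y = 2x + 1 exactly) and was made up for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Features must be 2D (samples x features); targets are 1D.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])   # generated from y = 2x + 1

model = LinearRegression()           # create the estimator object
model.fit(X, y)                      # learn slope and intercept from the data

prediction = model.predict(np.array([[5.0]]))
slope = model.coef_[0]
intercept = model.intercept_
```

Most Scikit-Learn estimators follow the same pattern: create the object, call `fit`, then call `predict`. Swapping `LinearRegression` for another estimator class usually leaves the rest of the code unchanged.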
187+
### Other Concepts and Topics for ML
188+
- Measuring Accuracy
189+
- Bias-Variance Trade-off
190+
- Applying Regularization
191+
- Elastic Net Regression
192+
- Predictive Analytics
193+
- Exploratory Data Analysis
194+
195+
# 6 | MLOps
196+
197+
198+
199+
# 7 | Natural Language Processing
200+
If you are interested in working with text, try some of the tasks an NLP engineer does and learn how language models work.
201+
202+
- Sentiment analysis
203+
- POS Tagging, Parsing
204+
- Text preprocessing
205+
- Stemming and Lemmatization
206+
- Sentiment classification using Naive Bayes
207+
- TF-IDF, N-grams
208+
- Machine Translation, BLEU Score
209+
- Text Generation, Summarization, ROUGE Score
210+
- Language Modeling, Perplexity
211+
- Building a text classifier
212+
- Identifying the gender
213+
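To show what TF-IDF and N-grams mean in practice, here is a hand-written sketch using only the standard library. It uses one common textbook variant (term frequency times the log of inverse document frequency); libraries such as Scikit-Learn apply smoothing and normalization, so their numbers will differ. The documents and the helper names `tf_idf` and `bigrams` are invented for this example.

```python
import math
from collections import Counter

docs = [
    "the movie was great",
    "the movie was boring",
    "great acting and great story",
]
# Very simple preprocessing: split on whitespace.
tokenized = [doc.split() for doc in docs]


def tf_idf(term, doc_tokens, corpus):
    """Score how characteristic term is for one document within a corpus."""
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    df = sum(1 for tokens in corpus if term in tokens)
    if df == 0:
        return 0.0           # term never appears in the corpus
    idf = math.log(len(corpus) / df)
    return tf * idf


def bigrams(tokens):
    """Return the list of adjacent word pairs (N-grams with N = 2)."""
    return list(zip(tokens, tokens[1:]))


score_boring = tf_idf("boring", tokenized[1], tokenized)
score_the = tf_idf("the", tokenized[1], tokenized)
```

Here "boring" scores higher than "the" for the second document, because "the" appears in more documents and is therefore less distinctive.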
214+
215+
# 8 | Computer Vision
216+
To work on image and video analytics, we need to master computer vision, which starts with understanding images.
217+
218+
- PyTorch Tensors
219+
- Understanding pretrained models like AlexNet and ResNet, trained on the ImageNet dataset
220+
- Neural Networks
221+
- Building a perceptron
222+
- Building a single layer neural network
223+
- Building a deep neural network
224+
- Recurrent neural network for sequential data analysis
225+
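"Building a perceptron" can be sketched in plain Python before moving to PyTorch. The example below trains a single perceptron with the classic perceptron learning rule to reproduce the logical AND function. The function names, learning rate, and epoch count are choices made for this example.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum plus bias is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Train weights and bias with the perceptron learning rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)
            # Move the decision boundary only when the prediction is wrong.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]
weights, bias = train_perceptron(inputs, and_labels)
outputs = [predict(weights, bias, x) for x in inputs]
```

A single perceptron can only learn linearly separable functions such as AND; a function like XOR needs at least one hidden layer, which is the motivation for the deeper networks later in the list.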
226+
### Convolutional Neural Networks
227+
228+
- Understanding the ConvNet topology
229+
- Convolution layers
230+
- Pooling layers
231+
- Image Content Analysis
232+
- Operating on images using OpenCV-Python
233+
- Detecting edges
234+
- Histogram equalization
235+
- Detecting corners
236+
- Detecting SIFT feature points
237+
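To connect convolution layers with edge detection, here is a minimal sketch that slides a Sobel kernel over a tiny synthetic image. It is written with NumPy instead of OpenCV so the sliding-window arithmetic stays visible; `convolve2d_valid` is a hypothetical helper for this example, and it computes cross-correlation (no kernel flip), which is what ConvNet layers typically do.

```python
import numpy as np


def convolve2d_valid(image, kernel):
    """Slide kernel over image with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the window by the kernel element-wise and sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Synthetic 5x5 image: dark on the left, bright in the last two columns.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

# Sobel kernel that responds to horizontal changes in brightness.
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

edges = convolve2d_valid(image, sobel_x)
```

The output is zero where the window covers a flat region and non-zero where it overlaps the dark-to-bright boundary, which is the basic idea behind edge detection.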
238+
239+
# 9 | Data Visualization with Tableau
240+
241+
How to use Tableau and the basics of visual perception.
242+
243+
- What it is, How it works, Why Tableau
244+
- Connecting to Data
245+
- Building charts
246+
- Calculations
247+
- Dashboards
248+
- Sharing our work
249+
- Advanced Charts, Calculated Fields, Calculated Aggregations
250+
- Conditional Calculation, Parameterized Calculation
251+
252+
253+
# 10 | Structure Query Language (SQL)
254+
255+
1. Introduction to SQL: Learn the basics of SQL syntax, commands, and data types.
256+
2. Retrieving Data: Learn how to write queries to retrieve data from a database using SELECT statements, filtering, sorting, and grouping.
257+
3. Joins: Learn how to combine data from multiple tables using INNER JOIN, OUTER JOIN, and other types of joins.
258+
4. Aggregating Data: Learn how to use aggregate functions like SUM, COUNT, AVG, and MAX to summarize data.
259+
5. Subqueries: Learn how to use subqueries to retrieve data from one or more tables based on conditions.
260+
6. Creating Tables: Learn how to create tables, define columns, and set constraints.
261+
7. Modifying Data: Learn how to insert, update, and delete data in a table.
262+
8. Advanced SQL: Learn advanced SQL concepts such as transactions, views, stored procedures, and functions.
263+
9. Database Design: Learn about database design principles, normalization, and ER diagrams.
264+
10. Practice, Practice, Practice: Practice writing SQL queries on real-world datasets, and work on projects to apply your knowledge.
265+
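Steps 2 to 7 above (creating tables, inserting data, joining, and aggregating) can be practiced without installing a database server by using Python's built-in `sqlite3` module. The table names, column names, and rows below are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary in-memory database

# Create tables with constraints, then insert sample rows.
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount      REAL NOT NULL
);
INSERT INTO customers (id, name) VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
INSERT INTO orders (customer_id, amount) VALUES (1, 120.0), (1, 80.0), (2, 50.0);
""")

# LEFT JOIN keeps customers without orders; COUNT and SUM aggregate per customer.
rows = conn.execute("""
SELECT c.name,
       COUNT(o.id)                AS order_count,
       COALESCE(SUM(o.amount), 0) AS total
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY total DESC
""").fetchall()

conn.close()
```

Replacing `LEFT JOIN` with `INNER JOIN` would drop the customer who has no orders, which is a quick way to see the difference between the two join types.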
266+
267+
# 11 | Data Engineering
268+
269+
### Big Data
270+
- What is Big Data?
271+
- How is Big Data applied in business?
272+
### PySpark
273+
- Resilient Distributed Datasets
274+
- Schema
275+
- Lambda Expressions
276+
- Transformations
277+
- Actions
278+
279+
### Data Modeling
280+
281+
- Duplicate Data
282+
- Descriptive Analysis on Data
283+
- Visualizations
284+
- MLlib
285+
- ML Packages
286+
- Pipelines
287+
288+
### Streaming
289+
290+
- Packaging Spark Applications
291+
292+
# 12 | Data System Design
293+
294+
- Foundation of Data Systems
295+
- Data Models
296+
- Storage
297+
- Encoding
298+
- Distributed Data
299+
- Replication
300+
- Partitioning
301+
- Derived Data
302+
- Batch Processing
303+
- Stream Processing
304+
- Microsoft Azure
305+
- Azure Data Workloads
306+
- Azure Data Factory
307+
- Azure HDInsight
308+
- Azure Databricks
309+
- Azure Synapse Analytics
310+
- Relational Database in Azure
311+
- Non-relational Database in Azure
312+
313+
314+
# 13 | Five Major Projects and Git
315+
316+
We follow project-based learning and we will work on all the projects in parallel.
317+
318+
319+
# 14 | Interview Preperation
320+
321+
322+
323+
# 15 | Git & GitHub
324+
325+
### [Git & GitHub Course](https://god-level-python.notion.site/Git-GitHub-Course-Make-Recruiters-reach-You-Build-your-stunning-profile-First-open-source-cont-1d4d70450aa94dd7ad2c062c0fec3cb8)
326+
327+
- Understanding Git
328+
- Commands and How to commit your first code?
329+
- How to use GitHub?
330+
- How to make your first open-source contribution?
331+
- How to work with a team? - Part 1
332+
- How to create your stunning GitHub profile?
333+
- How to build your own viral repository?
334+
- Building a personal landing page for your Portfolio for FREE
335+
- How to grow followers on GitHub?
336+
- How to work with a team? - Part 2: issues, milestones, and projects
337+
338+
339+
# 16 | Personal Profile & Portfolio
340+
341+
342+
343+
344+
# Resources
345+
346+
### Datasets
347+
348+
1️⃣ [Awesome Public Datasets](https://github.com/awesomedata/awesome-public-datasets)
349+
A topic-centric list of high-quality public data sources.
350+
351+
2️⃣ [NLP Datasets](https://github.com/niderhoff/nlp-datasets)
352+
Alphabetical list of free/public domain datasets with text data for use in NLP.
353+
354+
3️⃣ [Awesome Dataset Tools](https://github.com/jsbroks/awesome-dataset-tools)
355+
A curated list of awesome dataset tools.
356+
357+
4️⃣ [Awesome time series database](https://github.com/xephonhq/awesome-time-series-database)
358+
A curated list of time series databases.
359+
360+
5️⃣ [Awesome-Cybersecurity-Datasets](https://github.com/shramos/Awesome-Cybersecurity-Datasets)
361+
A curated list of amazingly awesome Cybersecurity datasets.
362+
363+
6️⃣ [Awesome Robotics Datasets](https://github.com/mint-lab/awesome-robotics-datasets)
364+
Robotics Dataset Collections.
365+
366+
367+
368+
369+
370+
371+
### Join Telegram for Data Science ML AI Resources:
372+
https://t.me/+sREuRiFssMo4YWJl
373+
374+
### Connect with me on these platforms:
375+
LinkedIn: https://www.linkedin.com/in/hemansnation/
376+
377+
Twitter: https://twitter.com/hemansnation
378+
379+
GitHub: https://github.com/hemansnation
380+
381+
Instagram: https://www.instagram.com/masterdexter.ai/
382+
383+
384+
### Are you a professional?
385+
I offer one-on-one sessions for Python, Data Science, Machine Learning, and Data Engineering.
386+
<br>Email your requirements here: connect@himanshuramchandani.co
387+
388+
389+
390+
391+
-90.3 KB
Binary file not shown.
-59.8 KB
Binary file not shown.
