
Repository documenting my journey through the Santander Bootcamp 2023 - Data Science with Python. Contains projects, exercises, and materials covering Python, data visualization, machine learning models, and ETL pipelines developed during this comprehensive DIO educational program.


Laurentius96/DIO-Santander_DataScience

DIO | Santander: Data Science

Santander Bootcamp 2023 - Data Science with Python

Materials, exercises, and projects developed during the bootcamp sponsored by Santander in partnership with DIO

Overview • Structure • Projects • Tools • Content • License

🔍 Overview

This repository documents my learning journey in Data Science through the Santander Bootcamp 2023, including code, practical projects, and study materials. The content reflects the complete bootcamp curriculum, offering a comprehensive view of the skills developed.

🌟 About the Bootcamp (Click to expand)

Important note: This bootcamp was originally conducted between August and October 2023, focusing on Data Science with Python.

Why is this bootcamp important?

1️⃣ Market-aligned content

  • Modules developed to reflect industry trends and company requirements
  • Focus on practical skills valued by employers

2️⃣ Intensive and comprehensive learning

  • In-depth coverage of Python, Databases, Visualization, and Machine Learning
  • Time distribution based on the relevance of topics in today's market

3️⃣ Practical and targeted learning

  • Coding challenges and projects for immediate application of knowledge
  • Updated educational materials with the latest tools and techniques
  • Projects that simulate real data scientist challenges
⚠️ Note

This bootcamp was originally offered in 2023; I am completing it in 2025 as a Global Student at DIO (Digital Innovation One). The platform keeps the educational content available after the official period ends, so students can work through it at a later date.

📚 Bootcamp Structure

📋 Curriculum Structure (Click to expand)

Prepare for the Journey (Onboarding)

  • DIO Bootcamps: Free Education and Employability Together! (1h)
  • Organizing Your Studies with DIO Roadmaps and Notion (2h)
  • Code Versioning with Git and GitHub (2h)
  • Project Challenges: Create a Winning Portfolio (1h)
  • Contributing to an Open Source Project on GitHub (1h)
  • Opening Class - Santander Bootcamps 2023 (2h)

Introduction to Data Science and Python

  • Development Environment and First Steps with Python (1h)
  • Getting to Know the Python Programming Language (2h)
  • Types of Operators in Python (2h)
  • Conditional and Loop Structures in Python (2h)
  • Manipulating Strings with Python (2h)
  • Working with Lists in Python (1h)
  • Getting to Know Tuples in Python (1h)
  • Exploring Sets in Python (1h)
  • Learning to Use Dictionaries in Python (1h)
  • Mastering Python Functions (1h)
  • Exploring Generative AI in an ETL Pipeline with Python (2h)

Solving Your First Code Challenges

  • Code Challenges: Improve Your Logic and Computational Thinking (1h)
  • Python Challenges: Balancing Balance (1h)
  • Python Challenges: Organizing Your Assets (1h)
  • Python Challenges: Conditionally Rich (1h)
  • Python Challenges: Compound Interest (1h)
  • Python Challenges: The Big Deposit (1h)

First Steps in SQL and NoSQL

  • Introduction to Relational Databases (SQL) (3h)
  • Introduction to NoSQL Databases (3h)

Data Visualization and Analysis with Power BI

  • Business Intelligence (BI) Fundamentals (2h)
  • Introduction to Data Analysis with SQL (3h)
  • Theoretical Foundations of ETL (1h)
  • First Steps with Power BI (3h)
  • Working with Visuals in Power BI (4h)
  • BI Fundamentals: KPIs and Metrics (1h)
  • Creating Interactive Dashboards with Power BI (2h)
  • Creating a Management Sales Report with Power BI (1h)
  • Data Collection and Extraction with Power BI (3h)
  • Data Cleaning and Transformation with Power BI (2h)
  • Creating a Corporate Dashboard with MySQL and Azure Integration (1h)

Machine Learning Fundamentals and Techniques

  • Introduction to Machine Learning (2h)
  • Bio-inspired Machine Learning Methods (1h)
  • Artificial Neural Networks (1h)
  • Genetic Algorithms (2h)
  • SVM (Support Vector Machine) Algorithms (1h)
  • Problem Classification: Exploring Datasets (1h)
  • Programming Languages for Machine Learning (1h)
  • Python for Machine Learning in Practice (2h)
  • Evaluate this Bootcamp (1h)
🚀 Mentoring Sessions (Live) (Click to expand)

Technical and Career Mentoring

  • Opening Class - Santander Bootcamps 2023 (2h)
  • Intelligent Development: Maximizing Your Productivity with Generative AI (2h)
  • Demystifying SQL and NoSQL Databases with ChatGPT (2h)
  • Challenges and Future Perspectives on Generative AI (2h)
  • Building Your Digital Brand: How to Highlight Your Developer Portfolio (2h)

📊 Projects and Challenges

📋 Project List

| Project | Description | Status |
| --- | --- | --- |
| 1. Contributing to an Open Source Project on GitHub | Contribution to an open source project using GitHub | 🚧 In progress |
| 2. Exploring Generative AI in an ETL Pipeline with Python | Development of ETL pipeline with Generative AI | 🚧 In progress |
| 3. Python Challenges: Balancing Balance | System for bank balance management | 🚧 In progress |
| 4. Python Challenges: Organizing Your Assets | Alphabetical organization of banking assets | 🚧 In progress |
| 5. Python Challenges: Conditionally Rich | Withdrawal system with balance verification | 🚧 In progress |
| 6. Python Challenges: Compound Interest | Compound interest calculation for investments | 🚧 In progress |
| 7. Python Challenges: The Big Deposit | Deposit system with validation | 🚧 In progress |
| 8. Creating a Management Sales Report with Power BI | Interactive sales report with Power BI | 🚧 In progress |
| 9. Creating a Corporate Dashboard with MySQL and Azure Integration | Dashboard with cloud database integration | 🚧 In progress |

1. Contributing to an Open Source Project on GitHub

Description

The world of Open Source awaits you! In the lab "Contributing to an Open Source Project on GitHub," you'll be introduced to the fascinating universe of open-source collaboration. This practical project was specially designed for technology students like you to dive in and experience firsthand the power of collaborative work and continuous innovation that Open Source provides.

Objective

Understand and practice the process of contributing to Open Source projects, using GitHub as a collaboration platform.

What to do?

  • Choose an Open Source project to contribute to
  • Make a contribution to the project (documentation improvements, feature additions, bug fixes, etc.)
  • For first-time contributors, the repository digitalinnovationone/dio-lab-open-source is recommended

Technologies

  • GitHub
  • Git
  • Markdown

Level

Basic

2. Exploring Generative AI in an ETL Pipeline with Python

Description

Get ready for a practical journey through the world of Data Science! We'll build an ETL (Extract, Transform, Load) pipeline, demonstrating the relationship between data, Artificial Intelligence (AI), and APIs.

  • Extraction: The adventure begins with a simple spreadsheet, from which we'll extract user IDs. Then, we'll use these IDs to access the 'Santander Dev Week 2023' API and obtain more detailed data.
  • Transformation: We'll enter the universe of AI with OpenAI's GPT-4, transforming this data into personalized marketing messages.
  • Loading: We'll finish the process by sending these messages back to the 'Santander Dev Week 2023' API.
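
The three stages above can be sketched as plain functions. This is only a minimal illustration, not the bootcamp's implementation: the `store` dictionary is a hypothetical in-memory stand-in for the 'Santander Dev Week 2023' API, the `news` field is an assumed record layout, and `template_message` is a placeholder where a real pipeline would call a generative model.

```python
from typing import Callable, Iterable


def extract(user_ids: Iterable[int], store: dict) -> list:
    """Look up each ID in the (hypothetical) store, skipping unknown IDs."""
    return [store[uid] for uid in user_ids if uid in store]


def transform(users: list, generate: Callable[[dict], str]) -> list:
    """Return copies of the user records with one generated message appended."""
    return [{**user, "news": user["news"] + [generate(user)]} for user in users]


def load(users: list, store: dict) -> int:
    """Write the enriched records back to the store and return the count."""
    for user in users:
        store[user["id"]] = user
    return len(users)


def template_message(user: dict) -> str:
    # Placeholder for the model call (assumption: one short message per user).
    return f"{user['name']}, investing early helps your money grow."
```

Passing the message generator as a parameter keeps the transformation step testable without network access: `load(transform(extract(ids, store), template_message), store)` runs the whole flow against the in-memory store.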

Objective

Reimagine the ETL process by applying the concepts learned in a new application domain.

Technologies

  • Python
  • REST
  • OpenAI API
  • ChatGPT
  • ETL

Level

Advanced

3. Python Challenges: Balancing Balance

Description

You've been hired by a banking company to assist with implementations and improvements to the business system. In an initial analysis, the finance team identified the need to develop a solution that allows customers to balance their bank accounts. The program should request an input representing the customer's current balance, followed by the values of two transactions: a deposit and a withdrawal. The program should update the balance based on the transactions and display the final balance.

Input

  • saldoAtual: decimal number representing the current bank account balance.
  • valorDeposito: decimal number representing the amount to be deposited into the account.
  • valorRetirada: decimal number representing the amount to be withdrawn from the account.

Output

A decimal number representing the updated balance in the bank account after processing the transactions.
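
One possible solution, shown as a sketch: the function name `update_balance` and the snake_case parameter names are my own choices, and the challenge statement does not say how the result should be formatted, so the function simply returns a float.

```python
def update_balance(saldo_atual: float, valor_deposito: float, valor_retirada: float) -> float:
    """Apply one deposit and one withdrawal to the current balance."""
    return saldo_atual + valor_deposito - valor_retirada
```

In a console version, the three values would be read with `float(input())` in the order listed under Input, and the returned balance printed.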

Technologies

  • Python
  • Basic Programming Principles

Level

Basic

4. Python Challenges: Organizing Your Assets

Description

After a careful analysis by the development team of a banking company, the need for a new feature was identified to optimize processes and improve the user experience. Your task is to implement a solution that sorts a user-provided list of assets in alphabetical order. Each asset is a string naming its type, such as Liquidity reserves or Intangible assets.

Input

  • An integer representing the number of assets the user has.
  • Then, the user should provide, on separate lines, the types (strings) of the respective assets.

Output

The list of Assets organized in alphabetical order, with each asset presented on a separate line.
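
A minimal sketch of the sorting step, assuming the input arrives as a list of lines with the count first. The function name `sort_assets` is hypothetical, and Python's default string ordering is used, which is case-sensitive; the challenge statement does not specify how mixed case should be handled.

```python
def sort_assets(raw_lines: list) -> list:
    """Read the asset count from the first line, then sort that many names."""
    count = int(raw_lines[0])
    assets = [line.strip() for line in raw_lines[1:1 + count]]
    return sorted(assets)
```

Printing the result one asset per line, for example with `print("\n".join(sort_assets(lines)))`, matches the output layout described above.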

Technologies

  • Python
  • Data Structures
  • Sorting

Level

Basic

5. Python Challenges: Conditionally Rich

Description

A new feature for a banking system was analyzed by the development team and will be one of the tasks to be worked on during the sprint. As a company developer, you received the requirements for the new implementation, which consists of an algorithmic solution that allows customers to make withdrawals at ATMs.

Withdrawal rules

  • Each customer will enter the value of their saldoTotal from their bank account and the valorSaque.
  • A withdrawal can only be made if the available balance in the account is equal to or greater than the requested amount.
  • If the balance is sufficient, the requested amount should be subtracted from the available balance, indicating that the withdrawal was made.
  • If the balance is insufficient, the withdrawal should not be made, and an appropriate message should be displayed.

Input

Two integer values representing the total account balance and the withdrawal amount.

Output

  • If the withdrawal is successful: "Saque realizado com sucesso! Novo saldo: {saldo}"
  • If the withdrawal is not possible: "Saldo insuficiente. Saque nao realizado!"
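
The withdrawal rules reduce to a single comparison. The sketch below uses a hypothetical function name, `withdraw`, and returns the message instead of printing it so that it is easy to check; the message strings are the ones given under Output.

```python
def withdraw(saldo_total: int, valor_saque: int) -> str:
    """Return the challenge's message for a withdrawal attempt."""
    if saldo_total >= valor_saque:
        # Sufficient funds: subtract the requested amount from the balance.
        novo_saldo = saldo_total - valor_saque
        return f"Saque realizado com sucesso! Novo saldo: {novo_saldo}"
    return "Saldo insuficiente. Saque nao realizado!"
```

Note that a withdrawal equal to the full balance is allowed, since the rule is "equal to or greater than".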

Technologies

  • Python
  • Conditional Structures

Level

Basic

6. Python Challenges: Compound Interest

Description

Imagine you're developing an application for a bank that wants to calculate the compound interest on an investment. Your goal is to create a function that takes three parameters: the initial investment value, the annual interest rate, and the time period in years. The function should calculate and return the final investment value after the specified period, taking into account compound interest.

Input

  • valor_inicial: integer or decimal number representing the initial investment value.
  • taxa_juros: decimal number representing the annual interest rate (e.g., 5% = 0.05).
  • periodo: integer representing the number of years of the investment.

Output

The final investment value after the specified period, considering compound interest, rounded to two decimal places.
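
A short sketch of the calculation, assuming interest is compounded once per year, so the final value is `valor_inicial * (1 + taxa_juros) ** periodo`. The function name `final_value` is my own; the parameter names come from the Input list.

```python
def final_value(valor_inicial: float, taxa_juros: float, periodo: int) -> float:
    """Compound the initial value annually and round to two decimal places."""
    return round(valor_inicial * (1 + taxa_juros) ** periodo, 2)
```

For example, 1000 invested at 5% for 2 years grows to 1000 × 1.05² = 1102.50.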

Technologies

  • Python
  • Financial Mathematics

Level

Basic

7. Python Challenges: The Big Deposit

Description

You've been hired by a bank to develop a program that helps its customers make deposits into their accounts. The program should ask the customer for the deposit amount and check if the value is valid. If the value is greater than zero, the program should add the value to the account balance. Otherwise, the program should display an error message. The program should request the deposit amount only once.

Input

The deposit amount entered by the customer (can be decimal, representing value in reais).

Output

  • If valid value (> 0): "Deposito realizado com sucesso! Saldo atual: R$ {valor}"
  • If invalid value (< 0): "Valor invalido! Digite um valor maior que zero."
  • If value is 0: "Encerrando o programa..."
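
The three output cases map onto one conditional chain. In this sketch the function name `deposit`, the starting balance of zero, and the two-decimal formatting of the balance are assumptions; the challenge statement gives only the message templates.

```python
def deposit(valor: float, saldo: float = 0.0) -> str:
    """Validate a single deposit and return the matching message."""
    if valor > 0:
        saldo += valor
        return f"Deposito realizado com sucesso! Saldo atual: R$ {saldo:.2f}"
    if valor == 0:
        return "Encerrando o programa..."
    return "Valor invalido! Digite um valor maior que zero."
```

Because the program asks for the amount only once, no loop is needed: one call handles the single input.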

Technologies

  • Python
  • Conditional Structures

Level

Basic

8. Creating a Management Sales Report with Power BI

Description

In this project, you will create a report in Power BI Desktop based on the Financials sample provided by Microsoft. The required data is described in the challenge and on GitHub.

Objective

Create an elaborate report with:

  • A defined structure
  • Navigation buttons for moving between pages
  • Slicers and buttons with associated images
  • Indicators and buttons to select different visuals on the same subject

Elements to be created:

  • Objects that define the report layout
  • Charts (visuals) and the fields that compose them
  • Buttons for navigation
  • Data slicers
  • Second page of the report
  • Publication of the report in Power BI Service

Technologies

  • Power BI

Level

Intermediate

9. Creating a Corporate Dashboard with MySQL and Azure Integration

Description

In this challenge, you will apply the steps of collecting, extracting, and transforming data with Power BI and MySQL on Azure. Follow the steps defined in the videos.

Objective

Process and transform data using Power BI integrated with MySQL database hosted on Azure.

Steps:

  1. Create a MySQL instance on Azure
  2. Explore the resource - MySQL Instance
  3. Connect to the Database with Cloud Shell
  4. Create Firewall Rule in Azure for Database Access
  5. Connect to MySQL on Azure using Workbench
  6. Integrate Power BI with MySQL on Azure

Technologies

  • Power BI
  • MySQL
  • Azure Cloud

Level

Intermediate

🛠️ Tools and Technologies

Technology Stack
  • Python: Pandas, NumPy, Matplotlib, Scikit-learn
  • Databases: SQL, NoSQL
  • Power BI: Visualization and dashboards
  • Git and GitHub: Versioning and collaboration
  • Machine Learning: Supervised and unsupervised models
  • Generative AI: Practical applications in data pipelines
Technology Diagram (Click to expand)
graph TD
    A[Data Science] --> B[Python Fundamentals]
    A --> C[Databases]
    A --> D[Visualization]
    A --> E[Machine Learning]
    
    B --> F[Data Types]
    B --> G[Control Structures]
    B --> H[Functions]
    B --> I[Libraries]
    
    I --> J[Pandas]
    I --> K[NumPy]
    I --> L[Matplotlib]
    
    C --> M[SQL]
    C --> N[NoSQL]
    
    D --> O[Power BI]
    D --> P[Interactive Dashboards]
    
    E --> Q[Supervised Algorithms]
    E --> R[Unsupervised Algorithms]
    E --> S[Generative AI]
    
    Q --> T[SVM]
    Q --> U[Neural Networks]
    
    R --> V[Clustering]
    
    S --> W[ETL with AI]

📁 Repository Content

Directory Structure
DIO_Santander_DataScience/
├── Module_1-Onboarding/
│   ├── Git_GitHub/
│   └── Project_OpenSource/
├── Module_2-Python_DataScience/
│   ├── Python_Fundamentals/
│   ├── Data_Structures/
│   └── Project_ETL_AI/
├── Module_3-Code_Challenges/
│   ├── Challenge_BalancingBalance/
│   ├── Challenge_OrganizingAssets/
│   ├── Challenge_ConditionallyRich/
│   ├── Challenge_CompoundInterest/
│   └── Challenge_BigDeposit/
├── Module_4-SQL_NoSQL/
│   ├── SQL_Fundamentals/
│   └── NoSQL_Introduction/
├── Module_5-PowerBI/
│   ├── BI_Fundamentals/
│   ├── Interactive_Dashboards/
│   └── Project_Sales_Dashboard/
├── Module_6-MachineLearning/
│   ├── ML_Introduction/
│   ├── Supervised_Algorithms/
│   └── Unsupervised_Algorithms/
├── .gitignore
├── LICENSE.md
└── README.md

🔄 Updates

This repository is updated with new materials and projects as I progress through the bootcamp.

📋 Update Log (Click to expand)
  • March/2025: Bootcamp start and repository setup

📜 License

CC BY-NC-ND 4.0 License

This repository is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

What this means:

  • ✅ You can share: You are free to copy and redistribute the material in any medium or format
  • ❌ No commercial use: You may not use the material for commercial purposes
  • ❌ No derivatives: You may not remix, transform, or build upon the material
  • ✅ Attribution required: You must give appropriate credit, provide a link to the license, and indicate if changes were made

For the complete license terms, please see the LICENSE.md file.

📫 Contact

For questions or suggestions about this repository, contact me through GitHub.


Note: This repository contains study materials from the Santander Bootcamp 2023 - Data Science with Python and serves as a portfolio for learning and developing skills in the field.
