Materials, exercises, and projects developed during the bootcamp sponsored by Santander in partnership with DIO
Overview • Structure • Projects • Tools • Content • License
This repository documents my learning journey in Data Science through the Santander Bootcamp 2023, including code, practical projects, and study materials. The content reflects the complete bootcamp curriculum, offering a comprehensive view of the skills developed.
About the Bootcamp
Important note: This bootcamp was originally conducted between August and October 2023, focusing on Data Science with Python.
1️⃣ Market-aligned content
- Modules developed to reflect industry trends and company requirements
- Focus on practical skills valued by employers
2️⃣ Intensive and comprehensive learning
- In-depth coverage of Python, Databases, Visualization, and Machine Learning
- Time distribution based on the relevance of topics in today's market
3️⃣ Practical and targeted learning
- Coding challenges and projects for immediate application of knowledge
- Updated educational materials with the latest tools and techniques
- Projects that simulate real data scientist challenges
⚠️ Note
This bootcamp was originally offered in 2023, but I am completing it in 2025 as a benefit of being a Global Student at DIO (Digital Innovation One). The platform keeps educational content accessible after the official period ends, which lets students like me develop these skills at a later date.
Curriculum Structure
- DIO Bootcamps: Free Education and Employability Together! (1h)
- Organizing Your Studies with DIO Roadmaps and Notion (2h)
- Code Versioning with Git and GitHub (2h)
- Project Challenges: Create a Winning Portfolio (1h)
- Contributing to an Open Source Project on GitHub (1h)
- Opening Class - Santander Bootcamps 2023 (2h)
- Development Environment and First Steps with Python (1h)
- Getting to Know the Python Programming Language (2h)
- Types of Operators in Python (2h)
- Conditional and Loop Structures in Python (2h)
- Manipulating Strings with Python (2h)
- Working with Lists in Python (1h)
- Getting to Know Tuples in Python (1h)
- Exploring Sets in Python (1h)
- Learning to Use Dictionaries in Python (1h)
- Mastering Python Functions (1h)
- Exploring Generative AI in an ETL Pipeline with Python (2h)
- Code Challenges: Improve Your Logic and Computational Thinking (1h)
- Python Challenges: Balancing Balance (1h)
- Python Challenges: Organizing Your Assets (1h)
- Python Challenges: Conditionally Rich (1h)
- Python Challenges: Compound Interest (1h)
- Python Challenges: The Big Deposit (1h)
- Introduction to Relational Databases (SQL) (3h)
- Introduction to NoSQL Databases (3h)
- Business Intelligence (BI) Fundamentals (2h)
- Introduction to Data Analysis with SQL (3h)
- Theoretical Foundations of ETL (1h)
- First Steps with Power BI (3h)
- Working with Visuals in Power BI (4h)
- BI Fundamentals: KPIs and Metrics (1h)
- Creating Interactive Dashboards with Power BI (2h)
- Creating a Management Sales Report with Power BI (1h)
- Data Collection and Extraction with Power BI (3h)
- Data Cleaning and Transformation with Power BI (2h)
- Creating a Corporate Dashboard with MySQL and Azure Integration (1h)
- Introduction to Machine Learning (2h)
- Bio-inspired Machine Learning Methods (1h)
- Artificial Neural Networks (1h)
- Genetic Algorithms (2h)
- SVM (Support Vector Machine) Algorithms (1h)
- Problem Classification: Exploring Datasets (1h)
- Programming Languages for Machine Learning (1h)
- Python for Machine Learning in Practice (2h)
- Evaluate this Bootcamp (1h)
Mentoring Sessions (Live)
- Opening Class - Santander Bootcamps 2023 (2h)
- Intelligent Development: Maximizing Your Productivity with Generative AI (2h)
- Demystifying SQL and NoSQL Databases with ChatGPT (2h)
- Challenges and Future Perspectives on Generative AI (2h)
- Building Your Digital Brand: How to Highlight Your Developer Portfolio (2h)
Project List
| Project | Description | Status |
|---|---|---|
| 1. Contributing to an Open Source Project on GitHub | Contribution to an open source project using GitHub | 🚧 In progress |
| 2. Exploring Generative AI in an ETL Pipeline with Python | Development of ETL pipeline with Generative AI | 🚧 In progress |
| 3. Python Challenges: Balancing Balance | System for bank balance management | 🚧 In progress |
| 4. Python Challenges: Organizing Your Assets | Alphabetical organization of banking assets | 🚧 In progress |
| 5. Python Challenges: Conditionally Rich | Withdrawal system with balance verification | 🚧 In progress |
| 6. Python Challenges: Compound Interest | Compound interest calculation for investments | 🚧 In progress |
| 7. Python Challenges: The Big Deposit | Deposit system with validation | 🚧 In progress |
| 8. Creating a Management Sales Report with Power BI | Interactive sales report with Power BI | 🚧 In progress |
| 9. Creating a Corporate Dashboard with MySQL and Azure Integration | Dashboard with cloud database integration | 🚧 In progress |
1. Contributing to an Open Source Project on GitHub
The world of Open Source awaits you! In the lab "Contributing to an Open Source Project on GitHub," you'll be introduced to the fascinating universe of open-source collaboration. This practical project was specially designed for technology students like you to dive in and experience firsthand the power of collaborative work and continuous innovation that Open Source provides.
Understand and practice the process of contributing to Open Source projects, using GitHub as a collaboration platform.
- Choose an Open Source project to contribute to
- Make a contribution to the project (documentation improvements, feature additions, bug fixes, etc.)
- For first-time contributors, the repository `digitalinnovationone/dio-lab-open-source` is recommended
- GitHub
- Git
- Markdown
Basic
2. Exploring Generative AI in an ETL Pipeline with Python
Get ready for a practical journey through the world of Data Science! We'll build an ETL (Extract, Transform, Load) pipeline, demonstrating the relationship between data, Artificial Intelligence (AI), and APIs.
- Extraction: The adventure begins with a simple spreadsheet, from which we'll extract user IDs. Then, we'll use these IDs to access the 'Santander Dev Week 2023' API and obtain more detailed data.
- Transformation: We'll enter the universe of AI with OpenAI's GPT-4, transforming this data into personalized marketing messages.
- Loading: We'll finish the process by sending these messages back to the 'Santander Dev Week 2023' API.
Reimagine the ETL process by applying the concepts learned in a new application domain.
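The three stages described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the bootcamp's reference solution: the function names, the `UserID` column, the `messages` field, and the sample users are all hypothetical, and the API and GPT calls are replaced by in-memory stand-ins so the example runs without network access or credentials.

```python
import csv
import io
from typing import Callable


def extract_user_ids(csv_text: str, column: str = "UserID") -> list[int]:
    """Extract: read user IDs from spreadsheet data exported as CSV text.

    The column name "UserID" is an assumption for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [int(row[column]) for row in reader]


def run_pipeline(
    user_ids: list[int],
    fetch_user: Callable[[int], dict],
    generate_message: Callable[[dict], str],
    update_user: Callable[[dict], None],
) -> list[dict]:
    """Run the three ETL stages with injected (hypothetical) callables."""
    # Extract: look up the detailed record for each ID.
    users = [fetch_user(user_id) for user_id in user_ids]
    for user in users:
        # Transform: produce a personalized message for this user.
        message = generate_message(user)
        # "messages" is a made-up field name, not the real API schema.
        user.setdefault("messages", []).append(message)
        # Load: send the enriched record back to the data store.
        update_user(user)
    return users


# In-memory stand-ins for the real API and the language model (made-up data).
fake_api = {1: {"id": 1, "name": "Ana"}, 2: {"id": 2, "name": "Bruno"}}


def fake_fetch(user_id: int) -> dict:
    return dict(fake_api[user_id])


def fake_generate(user: dict) -> str:
    return f"{user['name']}, investing early pays off!"


def fake_update(user: dict) -> None:
    fake_api[user["id"]] = user


ids = extract_user_ids("UserID\n1\n2\n")
result = run_pipeline(ids, fake_fetch, fake_generate, fake_update)
print(result[0]["messages"])
```

Passing the fetch, generate, and update steps in as callables keeps the pipeline logic testable; in a real run they would presumably wrap HTTP requests to the API and a call to the language model.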
- Python
- REST
- OpenAI API
- ChatGPT
- ETL
Advanced
3. Python Challenges: Balancing Balance
You've been hired by a banking company to assist with implementations and improvements to the business system. In an initial analysis, the finance team identified the need for a solution that allows customers to balance their bank accounts. The program should read the customer's current balance, followed by the values of two transactions: a deposit and a withdrawal. It should then update the balance based on the transactions and display the final balance.
- `saldoAtual`: decimal number representing the current bank account balance.
- `valorDeposito`: decimal number representing the amount to be deposited into the account.
- `valorRetirada`: decimal number representing the amount to be withdrawn from the account.
A decimal number representing the updated balance in the bank account after processing the transactions.
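A minimal sketch of the balance update follows. The function name `atualizar_saldo` and the sample values are assumptions made for illustration; the challenge itself reads the three values from standard input, and its exact output format is not specified here.

```python
def atualizar_saldo(saldo_atual: float, valor_deposito: float, valor_retirada: float) -> float:
    """Return the balance after applying one deposit and one withdrawal."""
    return saldo_atual + valor_deposito - valor_retirada


# Example values (made up for this sketch).
saldo_final = atualizar_saldo(1000.0, 200.0, 150.0)
print(saldo_final)  # 1050.0
```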
- Python
- Basic Programming Principles
Basic
4. Python Challenges: Organizing Your Assets
After a careful analysis, the development team of a banking company identified the need for a new feature to optimize processes and improve the user experience. Your task is to implement a solution that sorts a user-provided list of assets in alphabetical order. Each asset is a string naming its type, such as Liquidity reserves or Intangible assets.
- An integer representing the number of assets the user has.
- Then, the user should provide, on separate lines, the types (strings) of the respective assets.
The list of Assets organized in alphabetical order, with each asset presented on a separate line.
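The input and output format above could be handled roughly as follows. The helper names `ler_ativos` and `organizar_ativos` are hypothetical, and the sketch assumes plain `sorted()` ordering (which is case-sensitive) is acceptable for the challenge.

```python
def ler_ativos(linhas: list[str]) -> list[str]:
    """Parse the input: first line is the asset count, then one asset per line."""
    quantidade = int(linhas[0])
    return [linha.strip() for linha in linhas[1:1 + quantidade]]


def organizar_ativos(ativos: list[str]) -> list[str]:
    """Return the assets in alphabetical order (case-sensitive, an assumption)."""
    return sorted(ativos)


# Example input lines (made up for this sketch).
entrada = ["3", "Liquidity reserves", "Intangible assets", "Fixed assets"]
for ativo in organizar_ativos(ler_ativos(entrada)):
    print(ativo)
```

In an actual submission the lines would presumably come from `input()` calls rather than a hard-coded list.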
- Python
- Data Structures
- Sorting
Basic
5. Python Challenges: Conditionally Rich
A new feature for a banking system was analyzed by the development team and will be one of the tasks to be worked on during the sprint. As a company developer, you received the requirements for the new implementation, which consists of an algorithmic solution that allows customers to make withdrawals at ATMs.
- Each customer will enter their `saldoTotal` (the balance of their bank account) and the `valorSaque`.
- A withdrawal can only be made if the available balance in the account is equal to or greater than the requested amount.
- If the balance is sufficient, the requested amount should be subtracted from the available balance, indicating that the withdrawal was made.
- If the balance is insufficient, the withdrawal should not be made, and an appropriate message should be displayed.
Two integer values representing the total account balance and the withdrawal amount.
- If the withdrawal is successful: "Saque realizado com sucesso! Novo saldo: {saldo}"
- If the withdrawal is not possible: "Saldo insuficiente. Saque nao realizado!"
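The withdrawal rules above reduce to a single comparison. The sketch below uses a hypothetical function name, `sacar`, and returns the message instead of printing it so the logic is easy to check; the message strings are the ones listed above.

```python
def sacar(saldo_total: int, valor_saque: int) -> str:
    """Return the message for a withdrawal attempt against the given balance."""
    if saldo_total >= valor_saque:
        novo_saldo = saldo_total - valor_saque
        return f"Saque realizado com sucesso! Novo saldo: {novo_saldo}"
    return "Saldo insuficiente. Saque nao realizado!"


# Example values (made up for this sketch).
print(sacar(500, 200))
print(sacar(100, 200))
```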
- Python
- Conditional Structures
Basic
6. Python Challenges: Compound Interest
Imagine you're developing an application for a bank that wants to calculate the compound interest on an investment. Your goal is to create a function that takes three parameters: the initial investment value, the annual interest rate, and the time period in years. The function should calculate and return the final investment value after the specified period, taking into account compound interest.
- `valor_inicial`: integer or decimal number representing the initial investment value.
- `taxa_juros`: decimal number representing the annual interest rate (e.g., 5% = 0.05).
- `periodo`: integer representing the number of years of the investment.
The final investment value after the specified period, considering compound interest, rounded to two decimal places.
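One way to express this calculation is the standard annual compounding formula, final = initial × (1 + rate) ^ years. The function name `calcular_juros_compostos` and the sample values are assumptions for illustration, and the sketch assumes interest compounds once per year.

```python
def calcular_juros_compostos(valor_inicial: float, taxa_juros: float, periodo: int) -> float:
    """Return the final value after `periodo` years of annual compounding."""
    valor_final = valor_inicial * (1 + taxa_juros) ** periodo
    return round(valor_final, 2)


# Example: 1000 invested at 5% per year for 2 years (made-up values).
print(calcular_juros_compostos(1000, 0.05, 2))  # 1102.5
```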
- Python
- Financial Mathematics
Basic
7. Python Challenges: The Big Deposit
You've been hired by a bank to develop a program that helps its customers make deposits into their accounts. The program should ask the customer for the deposit amount and check if the value is valid. If the value is greater than zero, the program should add the value to the account balance. Otherwise, the program should display an error message. The program should request the deposit amount only once.
The deposit amount entered by the customer (can be decimal, representing value in reais).
- If valid value (> 0): "Deposito realizado com sucesso! Saldo atual: R$ {valor}"
- If invalid value (< 0): "Valor invalido! Digite um valor maior que zero."
- If value is 0: "Encerrando o programa..."
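The three output cases above map onto a simple branch. In this sketch the function name `depositar` is hypothetical, the starting balance is passed in explicitly, and formatting the balance with two decimal places is an assumption, since the exact number format is not stated.

```python
def depositar(saldo: float, valor: float) -> tuple[float, str]:
    """Return the (possibly updated) balance and the message to display."""
    if valor > 0:
        saldo += valor
        # Two decimal places is an assumed format for the balance.
        return saldo, f"Deposito realizado com sucesso! Saldo atual: R$ {saldo:.2f}"
    if valor == 0:
        return saldo, "Encerrando o programa..."
    return saldo, "Valor invalido! Digite um valor maior que zero."


# Example values (made up for this sketch).
for valor in (100.0, 0.0, -5.0):
    _, mensagem = depositar(0.0, valor)
    print(mensagem)
```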
- Python
- Conditional Structures
Basic
8. Creating a Management Sales Report with Power BI
In this project, you will create a report in Power BI Desktop based on the Financials sample provided by Microsoft itself. The necessary data is described in the challenge and on GitHub.
Create an elaborate report with:
- Defined structure
- Navigation buttons that provide navigability
- Slicers and buttons with associated images
- Indicators and buttons to select different visuals on the same subject
- Objects that define the report layout
- Charts (visuals) and the fields that compose them
- Buttons for navigation
- Data slicers
- Second page of the report
- Publication of the report in Power BI Service
- Power BI
Intermediate
9. Creating a Corporate Dashboard with MySQL and Azure Integration
In this challenge, you will apply the steps for collecting, extracting, and transforming data with Power BI and MySQL on Azure. Follow the steps defined in the videos.
Process and transform data using Power BI integrated with a MySQL database hosted on Azure.
- Create a MySQL instance on Azure
- Explore the resource - MySQL Instance
- Connect to the Database with Cloud Shell
- Create Firewall Rule in Azure for Database Access
- Connect to MySQL on Azure using Workbench
- Integrate Power BI with MySQL on Azure
- Power BI
- MySQL
- Azure Cloud
Intermediate
Technology Stack
- Python: Pandas, NumPy, Matplotlib, Scikit-learn
- Databases: SQL, NoSQL
- Power BI: Visualization and dashboards
- Git and GitHub: Versioning and collaboration
- Machine Learning: Supervised and unsupervised models
- Generative AI: Practical applications in data pipelines
Technology Diagram
graph TD
A[Data Science] --> B[Python Fundamentals]
A --> C[Databases]
A --> D[Visualization]
A --> E[Machine Learning]
B --> F[Data Types]
B --> G[Control Structures]
B --> H[Functions]
B --> I[Libraries]
I --> J[Pandas]
I --> K[NumPy]
I --> L[Matplotlib]
C --> M[SQL]
C --> N[NoSQL]
D --> O[Power BI]
D --> P[Interactive Dashboards]
E --> Q[Supervised Algorithms]
E --> R[Unsupervised Algorithms]
E --> S[Generative AI]
Q --> T[SVM]
Q --> U[Neural Networks]
R --> V[Clustering]
S --> W[ETL with AI]
Directory Structure
DIO_Santander_DataScience/
├── Module_1-Onboarding/
│   ├── Git_GitHub/
│   └── Project_OpenSource/
├── Module_2-Python_DataScience/
│   ├── Python_Fundamentals/
│   ├── Data_Structures/
│   └── Project_ETL_AI/
├── Module_3-Code_Challenges/
│   ├── Challenge_BalancingBalance/
│   ├── Challenge_OrganizingAssets/
│   ├── Challenge_ConditionallyRich/
│   ├── Challenge_CompoundInterest/
│   └── Challenge_BigDeposit/
├── Module_4-SQL_NoSQL/
│   ├── SQL_Fundamentals/
│   └── NoSQL_Introduction/
├── Module_5-PowerBI/
│   ├── BI_Fundamentals/
│   ├── Interactive_Dashboards/
│   └── Project_Sales_Dashboard/
├── Module_6-MachineLearning/
│   ├── ML_Introduction/
│   ├── Supervised_Algorithms/
│   └── Unsupervised_Algorithms/
├── .gitignore
├── LICENSE.md
└── README.md
This repository is updated with new materials and projects as I progress through the bootcamp.
Update Log
- March/2025: Bootcamp start and repository setup
CC BY-NC-ND 4.0 License
This repository is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
- You can share: you are free to copy and redistribute the material in any medium or format
- No commercial use: you may not use the material for commercial purposes
- No derivatives: you may not remix, transform, or build upon the material
- Attribution required: you must give appropriate credit, provide a link to the license, and indicate if changes were made
For the complete license terms, please see the LICENSE.md file.
For questions or suggestions about this repository, contact me through GitHub.
Note: This repository contains study materials from the Santander Bootcamp 2023 - Data Science with Python and serves as a portfolio for learning and developing skills in the field.