This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors (in alphabetical order): Katharina Rasch, Noa Tamir
By the end of this lab you will have completed one data science project based on open data, simulating the work of a freelance data science team.
You will gain familiarity with processes and practice the required communication skills at the core of a data scientist’s job.
You will get experience with weekly planning, receiving and following guidance from professional data science tech leads, team collaboration, communicating with stakeholders, documenting a project for handover, sharing ongoing work and collaborating on imperfect code and ideas, and peer-to-peer code review.
And, importantly, you will end up with a portfolio project (in the form of a presentation, blog post, GitHub repository, or another format you prefer) showcasing your work on a real-world data problem, which you can use in job applications.
We are striving for a co-learning atmosphere. You will be part of a team of about four people, working together and learning from each other. Your mentors will check in with you weekly to make sure your team is on the right path.
- Usage forecasting
- Resource prioritization
- Risk analysis
- Data journalism
Have a look at the projects folder.
The lab runs for ~10 weeks, with your team working through the main stages of a data science project:
- Client brief and project ideation
- Exploratory data analysis and data cleaning
- Evaluation setup and building a baseline model
- Feature construction and improved model
- Client handover and technical handover
You can find all the details about what you will be doing in each stage in our handbook.
The weekly 1-hour meetings with your mentors are your chance to check up on your progress and plan the next steps in the project. Typically meetings will look like this:
- First 30 minutes - Your team presents the results of the previous week in 10-15 minutes. Practice giving succinct technical presentations! Then you get feedback from your mentors and discuss open questions together.
- Remaining 30 minutes - Team and mentors plan the next week. Bring your own ideas for next steps!
In addition to the weekly mentor call, you should also be available for weekly sync meetings with your team.
Data science does not exist in a technical vacuum. In the end, you want to solve a real-world or business problem. It is important to us that you also practice how to:
- Understand the client's needs and help them scope those needs into a data science task.
- Always keep the client's needs in mind, even when deep in technical details.
- Communicate your results to the client in an accessible manner.
We will not be working with a real client in this lab. Instead, your mentors will put on their "client hats" when necessary.
We want you to leave the lab with a portfolio project that
- convinces hiring managers that you understand how to solve business problems using data science
- convinces tech leads that you have the technical chops necessary to work on data science projects
For example, you might end up preparing
- a client-focused presentation for the hiring managers
- a technical presentation / GitHub project for the tech leads
We'll discuss what the best options are for you specifically and give feedback.
We encourage self-organised pair programming and peer code reviews, and can advise you on how to get started. About one third of the way through the lab, we'll hold a team retrospective focused on team dynamics.
We'll also schedule a 1:1 call between you and one of the mentors, focusing on you/your career/your questions.
The program costs €30 (to buy your mentors some coffee). Payable on completion, because we want you to know what you’re paying for.
Noa (LinkedIn Profile) is a freelance data scientist and analytics consultant who built 3 data science teams in Berlin and supported dozens of data scientists through their personal professional development. She is comfortable with both Python and R, and has experience with causality, UX optimisation, and pricing models. Noa knows firsthand the benefits of an engaging professional women’s network, and in recent years has organised open meetups, development sprints, and conferences in Berlin, either dedicated to women and gender minorities, or open to feminist communities. You can listen to her on the Techpoint Charlie Podcast, for example in this episode about "What is the Role of a Data Scientist?".
Kat (personal website, LinkedIn) is a freelance data scientist / computer vision engineer / teacher. Kat holds a PhD in Computer Science from KTH Stockholm and has been working as a data scientist since 2014. Kat has several years of teaching experience, including three years teaching a data science lab course similar to this mentorship at HTW Berlin.
Sounds interesting? Apply here!
Have a look at our handbook.