MovieLens Analytics

Summary: This rush will help you to strengthen the skills acquired in the previous days.

Chapter III

Specific instructions for the day

No code in the global scope. Use functions!
Any exception not caught will invalidate the work, even in the event of an error that was asked you to test
The interpreter to use is Python 3
Any built-in function is allowed
You can import the following libraries: os, sys, urllib, requests, beautifulsoup, json, pytest, collections, functools, datetime, re
Use Jupyter Notebook for creating the report

Chapter IV

Mandatory part

In this rush, you are going to work on your own analytical report. You will analyze data from the MovieLens database. By the end of the rush, you will have two files:

movielens_analysis.py and movielens_report.ipynb. In the first file, you will need to create your own module with classes and methods. In the second file, you will create the report itself using only your module.

Module

Remember that the goal of the rush is to strengthen your skills.

Try to use as much as you can from what you have learned from the previous days.

Use a smaller version of MovieLens dataset, download it, please. Try to use the first 1000 values from the dataset
Read the README.txt very carefully. Focus on the file structures
In your module, you will need to create 4 classes corresponding to 4 files from the data and 1 class for testing
The classes and methods below are obligatory but you can add to them anything that suits your needs

Classes Ratings, Tags, Movies, Links can be found in code-samples.

Class Tests:

Create tests using PyTest for each and every method of the classes above.

They should check:

if the methods return the correct data types
if the lists elements have the correct data types
if the returned data sorted correctly

Run the tests before going to the next stage of the rush.

Report

Using only the classes and methods from movielens_analysis.py, prepare your report.

You should do it in Jupyter Notebook.

It is a great tool especially if you are a data scientist.

It gives you an opportunity to work with the code interactively by launching and relaunching different cells with different values.

You do not have to rerun your whole code from the beginning.

Also, you can put in the cells not only code but text too, which is a great feature for making reports.

Install it to your environment.

In this part of the rush, we will give you more freedom.

We are not going to define the structure of your report.

The goal of the report is to tell us an interesting story about the MovieLens dataset.

Find the good structure and the right sequence.

The only constraints:

you must use each and every method from movielens_analysis.py except class Tests
every cell in your notebook should contain magic command %timeit
all other imports are prohibited as well as using built-in functions. If you need them, put them in your module in advance

Chapter V

Bonus part

Add to the classes more methods that you may find useful and interesting for your report. Do not forget to test them too
Improve the tests. Check the correctness of your calculations as well. Precalculate manually some results and metrics and check if the methods return the correct information if you give them the specific input

Chapter VI

Turn-in and peer-evaluation

Turn in your work using your git repository, as usual. Only the work that’s in your repository will be graded during the evaluation.

During the correction, you will be evaluated on your turn-in (no functions that do all the heavy lifting for you) as well as your ability to present, explain, and justify your choices.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
code-samples		code-samples
data-samples		data-samples
datasets		datasets
materials		materials
misc		misc
src		src
.gitattributes		.gitattributes
.gitignore		.gitignore
ACTIVE_TAG_FILES.txt		ACTIVE_TAG_FILES.txt
CHANGELOG		CHANGELOG
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MovieLens Analytics

Contents

Chapter III

Specific instructions for the day

Chapter IV

Mandatory part

Module

Report

Chapter V

Bonus part

Chapter VI

Turn-in and peer-evaluation

About

Uh oh!

Languages

License

radaga1s/DS_Team

Folders and files

Latest commit

History

Repository files navigation

MovieLens Analytics

Contents

Chapter III

Specific instructions for the day

Chapter IV

Mandatory part

Module

Report

Chapter V

Bonus part

Chapter VI

Turn-in and peer-evaluation

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Languages