Commit
Try lazy + expression function to check perf bench #146 (#147)
* Try lazy + expression function to check perf bench #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update code with get_transactions_out expr #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* have to collect with lazy #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* Lazy read_activity_vcub #148 and update notebook transactions_out

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update bench test with lazy vcub_keeper_py312 #148

Signed-off-by: Armand <arm.gilles@gmail.com>

* update docstring #148

Signed-off-by: Armand <arm.gilles@gmail.com>

* collect lazy df to be a fair bench #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* collect lazy df to be a fair bench #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* collect lazy df to be a fair bench #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update transactions_in to be lazy and use an Expr function #149

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try to improve bench test on big result

Signed-off-by: Armand <arm.gilles@gmail.com>

* lazy and Expr function for transactions_all function #150

Signed-off-by: Armand <arm.gilles@gmail.com>

* try to fix bad perf on big dataset lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* try to fix bad perf on big dataset lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* try to fix bad perf on big dataset lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* Lazy expr for get_consecutive_no_transactions_out #151

Signed-off-by: Armand <arm.gilles@gmail.com>

* Lazy transform_json_api_bdx_station_data_to_df function #152

Signed-off-by: Armand <arm.gilles@gmail.com>

* Encoding time in Expr function and process_data_cluster in lazy mode #153

Signed-off-by: Armand <arm.gilles@gmail.com>

* add todo for ML with pandas

Signed-off-by: Armand <arm.gilles@gmail.com>

* Adapt code for pipeline bench lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try new lazy for pipeline bench

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try new lazy for pipeline bench

Signed-off-by: Armand <arm.gilles@gmail.com>

* process data with with_columns style & lazy #161

Signed-off-by: Armand <arm.gilles@gmail.com>

* forget previous commit

Signed-off-by: Armand <arm.gilles@gmail.com>

* have to collect these tests

Signed-off-by: Armand <arm.gilles@gmail.com>

* Small bench tests run in eager mode, big ones in lazy mode, to make a fair comparison

Signed-off-by: Armand <arm.gilles@gmail.com>

* Small bench tests run in eager mode, big ones in lazy mode, to make a fair comparison

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update notebook with with_columns style for feature creation

Signed-off-by: Armand <arm.gilles@gmail.com>

* Using pipe style with lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* cleaning

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
armgilles authored Oct 16, 2024
1 parent f903829 commit 61f56d4
Showing 16 changed files with 1,632 additions and 4,545 deletions.
40 changes: 15 additions & 25 deletions notebooks/01_Analyse/01_Raw/01_Activite_station_Vcub_files.ipynb
@@ -12,27 +12,19 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 5,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-08T10:10:29.490258Z",
"start_time": "2020-11-08T10:10:29.462913Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The autoreload extension is already loaded. To reload it, use:\n",
" %reload_ext autoreload\n"
]
}
],
"outputs": [],
"source": [
"import glob as glob\n",
"\n",
"import pandas as pd\n",
"import polars as pl\n",
"\n",
"from vcub_keeper.config import ROOT_DATA_CLEAN, ROOT_DATA_RAW\n",
"from vcub_keeper.transform.features_factory import get_transactions_all, get_transactions_in, get_transactions_out\n",
@@ -10926,8 +10918,6 @@
}
],
"source": [
"import polars as pl\n",
"\n",
"from vcub_keeper.create.creator import create_activity_time_series\n",
"from vcub_keeper.reader.reader import read_time_serie_activity\n",
"\n",
@@ -11137,7 +11127,7 @@
},
{
"cell_type": "code",
"execution_count": 71,
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-08T10:15:40.825772Z",
@@ -11155,7 +11145,7 @@
},
{
"cell_type": "code",
"execution_count": 64,
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-03T11:55:48.583245Z",
@@ -11167,12 +11157,12 @@
"name": "stdout",
"output_type": "stream",
"text": [
"bordeaux-2018.csv\n",
"(1118425, 5)\n",
"bordeaux-2020.csv\n",
"(11888595, 5)\n",
"bordeaux-2019.csv\n",
"(12527719, 5)\n",
"bordeaux-2020.csv\n",
"(11888595, 5)\n"
"bordeaux-2018.csv\n",
"(1118425, 5)\n"
]
}
],
@@ -11183,7 +11173,7 @@
},
{
"cell_type": "code",
"execution_count": 72,
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-08T10:16:19.650198Z",
@@ -11197,7 +11187,7 @@
},
{
"cell_type": "code",
"execution_count": 73,
"execution_count": 6,
"metadata": {},
"outputs": [
{
@@ -11225,7 +11215,7 @@
"└────────────┴────────────┴────────────┴────────────┴────────┴────────────┴────────────┴───────────┘"
]
},
"execution_count": 73,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -11237,7 +11227,7 @@
},
{
"cell_type": "code",
"execution_count": 74,
"execution_count": 7,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-08T10:16:19.664372Z",
@@ -11276,7 +11266,7 @@
},
{
"cell_type": "code",
"execution_count": 75,
"execution_count": 8,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-08T09:54:58.845575Z",
@@ -11290,7 +11280,7 @@
"(12794673, 8)"
]
},
"execution_count": 75,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}