DCA-1016: Confluence prepare data script update #657

opopovss · 2021-07-30T11:05:57Z

No description provided.

ometelytsia · 2021-07-30T19:37:30Z

app/util/data_preparation/confluence_prepare_data.py


 import urllib3

+from multiprocessing import cpu_count


Unused import.

ometelytsia · 2021-07-30T19:37:57Z

app/util/data_preparation/confluence_prepare_data.py

@@ -1,14 +1,17 @@
 import random
 import string



There are a bunch of missing imports related to the print_timing decorator.

ometelytsia · 2021-07-30T19:38:56Z

app/util/data_preparation/confluence_prepare_data.py

 ENGLISH_GB = 'en_GB'


+def print_timing(message, sep='-'):


This is duplicated code: jsm_prepare_data already has this decorator. Please extract it to a common place, e.g. a new util file shared by the prepare_data scripts.
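For illustration, a shared decorator could look roughly like the sketch below. Only the signature `print_timing(message, sep='-')` comes from the diff; the body, the printed format, and the `create_data_set` demo function are assumptions, not the code that was merged.

```python
import functools
import time


def print_timing(message, sep='-'):
    """Decorator factory: print `message`, run the call, then print its duration.

    Hypothetical body: only the signature is taken from the PR diff.
    """

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            print(sep * 20)
            print(message)
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            print(f'{func.__name__} finished in {elapsed:.2f} s')
            print(sep * 20)
            return result

        return wrapper

    return decorator


# Demo function (hypothetical stand-in for the real dataset builder).
@print_timing('Creating dataset started')
def create_data_set():
    return {'pages': [], 'blogs': []}


dataset = create_data_set()
```

Keeping this in a single util module would let both prepare_data scripts import the same decorator instead of each defining its own copy.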

ometelytsia · 2021-07-30T19:42:58Z

app/util/data_preparation/confluence_prepare_data.py

-    dataset[BLOGS] = __get_blogs(perf_user_api, 5000)
+
+    def __get_pages_blogs(func):
+        return func(perf_user_api, 5000)


With this approach, we lose the ability to set the number of pages and blogs separately.
The code is also quite hard to follow.

Let's try something simpler and more straightforward, like:

from multiprocessing.pool import ThreadPool

pool = ThreadPool(processes=2)
async_pages = pool.apply_async(__get_pages, (perf_user_api, 5000))
async_blogs = pool.apply_async(__get_blogs, (perf_user_api, 5000))
async_pages.wait()
async_blogs.wait()
dataset[PAGES] = async_pages.get()
dataset[BLOGS] = async_blogs.get()
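A self-contained sketch of the same idea, runnable outside the script. The `get_pages` and `get_blogs` stubs, the `PAGES` and `BLOGS` values, and the separate `pages_count` and `blogs_count` parameters are assumptions standing in for the real helpers, which call the Confluence API.

```python
from multiprocessing.pool import ThreadPool

PAGES = 'pages'  # assumed key names, for the demo only
BLOGS = 'blogs'


def get_pages(api, count):
    # Stub: the real helper would query the API for `count` pages.
    return [f'page-{i}' for i in range(count)]


def get_blogs(api, count):
    # Stub: the real helper would query the API for `count` blog posts.
    return [f'blog-{i}' for i in range(count)]


def create_data_set(api, pages_count, blogs_count):
    dataset = {}
    # Two worker threads, so both fetches run concurrently.
    with ThreadPool(processes=2) as pool:
        async_pages = pool.apply_async(get_pages, (api, pages_count))
        async_blogs = pool.apply_async(get_blogs, (api, blogs_count))
        # get() blocks until the result is ready and re-raises any
        # exception from the worker, so a separate wait() is optional.
        dataset[PAGES] = async_pages.get()
        dataset[BLOGS] = async_blogs.get()
    return dataset


dataset = create_data_set(api=None, pages_count=3, blogs_count=2)
```

Because each call takes its own count, the number of pages and blogs can be tuned independently.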

ometelytsia · 2021-08-02T15:36:26Z

app/util/data_preparation/confluence_prepare_data.py

+@print_timing('Creating dataset started')
 def __create_data_set(rest_client, rpc_client):
-    dataset = dict()
+    dataset = dict([])


Why was this line changed?

Probably it is just my mistake.
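For reference, the two spellings in the diff are equivalent, which is why the change had no functional effect. A quick check (variable names are illustrative):

```python
# dict() with no arguments and dict() over an empty iterable of pairs
# both produce an empty dictionary, the same as the literal {}.
empty_default = dict()
empty_from_list = dict([])
empty_literal = {}

all_equal = empty_default == empty_from_list == empty_literal
```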

DCA-1016: Confluence prepare data script update

fb9b279

opopovss requested review from SergeyMoroz0703 and ometelytsia July 30, 2021 11:05

opopovss self-assigned this Jul 30, 2021

opopovss requested review from astashys and mmizin as code owners July 30, 2021 11:05

ometelytsia reviewed Jul 30, 2021


DCA-1016: Async code modified, added the new file util.py

ea0bbf5

opopovss requested a review from ometelytsia August 2, 2021 09:36

ometelytsia approved these changes Aug 2, 2021


DCA-1016:fixed my mistake

5c519cd

mmizin approved these changes Aug 3, 2021


mmizin and others added 2 commits August 3, 2021 12:56

Merge branch 'dev' into confluence/dca-1016-prepare-data-script-improve

e57ffc2

Merge branch 'dev' into confluence/dca-1016-prepare-data-script-improve

575a85a

opopovss requested a review from OlehStefanyshyn as a code owner August 4, 2021 12:21

opopovss merged commit 63ce012 into dev Aug 4, 2021

opopovss deleted the confluence/dca-1016-prepare-data-script-improve branch August 4, 2021 12:29

4 participants
