
Refactor the todf() function of client-py to improve performance #4242

Merged
SteveYurongSu merged 7 commits into apache:master from fuwei3140:client-py-fix
Nov 19, 2021

Conversation

@fuwei3140
Contributor

Description

Optimized the processing logic of the todf() function in client_py to read the byte stream by column to avoid performance problems caused by reading by row.
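
As a rough illustration of why reading by column helps (a sketch only, not the code from this PR): assuming a column arrives as one contiguous buffer of big-endian 32-bit integers, the whole buffer can be decoded in a single vectorized call instead of unpacking one value per row. The helper names `decode_by_row` and `decode_by_column` and the buffer layout are assumptions made for this example.

```python
import struct

import numpy as np


def decode_by_row(buf: bytes) -> list:
    """Unpack one big-endian int32 at a time (one Python-level call per value)."""
    return [struct.unpack_from(">i", buf, offset)[0] for offset in range(0, len(buf), 4)]


def decode_by_column(buf: bytes) -> np.ndarray:
    """Decode the whole column buffer in a single vectorized call."""
    return np.frombuffer(buf, dtype=np.dtype(">i4"))


# Hypothetical column payload: four big-endian int32 values.
column_bytes = struct.pack(">4i", 1, -2, 3, 40)

row_values = decode_by_row(column_bytes)
column_values = decode_by_column(column_bytes)
```

Both helpers yield the same values; the column-wise version avoids a Python loop over every value, which is where the row-wise approach tends to lose time on large result sets.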


This PR has:

  • been self-reviewed.
    • concurrent read
    • concurrent write
    • concurrent read and write
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods.
  • added or updated version, license, or notice information
  • added comments explaining the "why" and the intent of the code wherever would not be obvious
    for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold
    for code coverage.
  • added integration tests.
  • been tested in a test IoTDB cluster.

Key changed/added classes (or packages if there are too many classes) in this PR


@github-actions github-actions bot left a comment


Hi, this is your first pull request in IoTDB project. Thanks for your contribution! IoTDB will be better because of you.

@SteveYurongSu
Member

Hi, @JulianFeinauer

Please take a look :D

@coveralls

coveralls commented Oct 27, 2021

Coverage Status

Coverage increased (+0.03%) to 66.927% when pulling 3d95863 on fuwei3140:client-py-fix into 7ff968c on apache:master.

@JulianFeinauer
Contributor

Thanks for pinging me, @SteveYurongSu. The code looks fine to me so far. What we are generally missing in our Python module are tests (we effectively have none), so I would really appreciate any pytest-based tests being added to the repo. What do you think, @fuwei3140?

@fuwei3140 fuwei3140 closed this Nov 1, 2021
@fuwei3140
Contributor Author

This is a good proposal, and a complete test is necessary @JulianFeinauer . I can try to use pytest to implement tests on various data type queries. @SteveYurongSu

@fuwei3140 fuwei3140 reopened this Nov 1, 2021
@SteveYurongSu
Member

> This is a good proposal, and a complete test is necessary @JulianFeinauer . I can try to use pytest to implement tests on various data type queries. @SteveYurongSu

Great!

@fuwei3140
Contributor Author

Hello @JulianFeinauer @SteveYurongSu,
1. Added test cases that verify queries over multiple data types, simple queries, queries with null values, and multi-batch queries.
2. Modified the null-value handling of the DataFrame. The original implementation substituted 0 and False for null values in INT32, INT64, and BOOLEAN columns, which misleads users because nulls become indistinguishable from real values. Because numpy does not support null values for these types, the new implementation uses pd.NA for null values of these data types.
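
The null-handling change can be sketched as follows. This is a minimal example under stated assumptions, not the PR's implementation: the `values` array and the `is_null` mask are hypothetical stand-ins for a decoded INT32 column and its null indicators. It relies on pandas' nullable `Int32` extension dtype, which can hold `pd.NA` alongside integers.

```python
import numpy as np
import pandas as pd

# Hypothetical decoded INT32 column: the 0 at index 1 is only a placeholder
# for a missing value, not a real measurement.
values = np.array([10, 0, 30], dtype=np.int32)
is_null = np.array([False, True, False])  # hypothetical null mask

# A plain numpy int32 array cannot represent "missing", so convert to the
# pandas nullable integer dtype and mark the null positions with pd.NA.
column = pd.Series(values).astype("Int32")
column[is_null] = pd.NA

missing = column.isna().tolist()
```

With this representation a caller can tell a genuine 0 from a missing value via `isna()`, and the column keeps an integer dtype rather than being cast to float. The same pattern applies to `Int64` and the nullable `boolean` dtype.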

Contributor

@JulianFeinauer JulianFeinauer left a comment


Thank you very much, I feel a lot better now about this code!

@fuwei3140
Contributor Author

fix a bug

@SteveYurongSu SteveYurongSu changed the title from "Refactor the todf() function of client_py to improve performance." to "Refactor the todf() function of client-py to improve performance" on Nov 19, 2021
Member

@SteveYurongSu SteveYurongSu left a comment


LGTM! Thanks for your contribution!

I made some minor changes by running `black .` (autoformatting) and `flake8 .` (linting).


5 participants