Use uv project manager and add CI tests #12
Pull Request Overview
Migrates the project from Poetry to uv by adopting PEP 621 metadata and restructuring dependencies.
- Consolidate metadata under a PEP 621 `[project]` table instead of `[tool.poetry]`
- Define core and development dependencies using `project.dependencies` and `[dependency-groups]`
- Configure `uv_build` as the PEP 517 build backend and pin the development Python version
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
pyproject.toml | Replace Poetry fields with PEP 621 [project] , add dependency-groups, and switch to uv_build backend |
.python-version | Pin the Python interpreter version to match the project’s minimum requirement |
Pull Request Overview
This PR integrates the uv project manager for dependency management and adds CI tests to ensure cross-version compatibility with Spark and Python.
- Updates test files to improve compatibility with PySpark 3.x by replacing the use of toArrow().
- Migrates project metadata to the new PEP 621 format and switches to uv_build.
- Introduces a GitHub Actions workflow that runs tests on multiple Python versions and package combinations.
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
tests/test_huggingface_writer.py | Updates test logic and fixture for environment-based token and converts DataFrame using pyarrow for Spark 3.x compatibility. |
tests/test_huggingface.py | Adds import for pyspark_huggingface. |
pyproject.toml | Migrates project configuration to PEP 621, defines dependency groups, and switches to the uv_build backend. |
.python-version | Specifies the default Python version for local development. |
.github/workflows/ci.yml | Introduces a CI workflow with a matrix for Python and package versions. |
Comments suppressed due to low confidence (1)
tests/test_huggingface_writer.py:134
- Collecting all rows from the DataFrame into memory for conversion to a PyArrow table may impact performance if the test data scale increases. Consider using a method that processes data in batches or limiting the test dataset size.
arrow_table = pa.Table.from_pylist([row.asDict() for row in df.collect()], schema=to_arrow_schema(df.schema))
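One way to act on the batching suggestion, sketched under assumptions: instead of materializing every row with `df.collect()`, rows could be consumed lazily (for example from `df.toLocalIterator()`) and grouped into fixed-size chunks, with each chunk then converted separately (for example via `pa.RecordBatch.from_pylist`). The helper name `iter_row_chunks` and the chunk size are hypothetical, and the sketch uses plain dicts so it runs without Spark or PyArrow installed.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List


def iter_row_chunks(
    rows: Iterable[Dict[str, Any]], chunk_size: int
) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of at most chunk_size rows without loading all rows at once."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Stand-in for (row.asDict() for row in df.toLocalIterator()).
sample_rows = ({"id": i} for i in range(5))
chunks = list(iter_row_chunks(sample_rows, chunk_size=2))
print([len(chunk) for chunk in chunks])
```

For a small test fixture the original single `collect()` call is likely fine; chunking only starts to matter if the test data grows.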
.python-version
@@ -0,0 +1 @@
+3.9
The .python-version file statically specifies Python 3.9, yet the CI workflow includes Python 3.13. It would be helpful to document the intended supported Python versions or update .python-version accordingly for consistency.
Nice!
Changes
- Migrate pyproject.toml to the standard format and switch to the uv project manager
- Add ci.yml to test combinations of different Python (3.9, 3.13) and PySpark (3.5.6, >=4.0.0) versions on GitHub PRs
Testing