feat: DIA-2062: Batch writes for cloud storage import #7372

Open
wants to merge 6 commits into
base: develop

hakan458
Collaborator

Changes how cloud storage import works so that Tasks, Annotations, and Predictions are created with bulk_create.
Previously, add_task was called for every new key (path) from cloud storage. Now tasks are created in batches (the batch size is controlled by a setting, currently 50).
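
The batching idea can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in this PR: `BATCH_SIZE`, `batched`, and `import_keys` are hypothetical names, and `bulk_create` here is any callable that accepts a list (a stand-in for the real ORM write).

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List

# Hypothetical constant; assumed to mirror the batch-size setting (currently 50).
BATCH_SIZE = 50


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def import_keys(
    keys: Iterable[str],
    bulk_create: Callable[[List[str]], None],
    batch_size: int = BATCH_SIZE,
) -> int:
    """Pass keys to `bulk_create` one batch at a time; return the number of write calls."""
    calls = 0
    for batch in batched(keys, batch_size):
        bulk_create(batch)  # one write per batch instead of one per key
        calls += 1
    return calls


# Usage: 120 storage keys are written in 3 calls (50 + 50 + 20).
written: List[str] = []
calls = import_keys((f"bucket/task_{i}.json" for i in range(120)), written.extend)
```

With one call per key the same import would issue 120 writes; the trade-off is that a failure inside a batch affects up to `batch_size` keys at once.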

@github-actions github-actions bot added the feat label Apr 15, 2025

sentry-io bot commented Apr 15, 2025

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: label_studio/io_storages/base_models.py

| Function | Unhandled Issue | Event Count |
|---|---|---|
| _scan_and_create_links | ValueError: Error loading JSON from file "label1.json". io_storages.base_models.import_sync_bac... | 3 |
| _scan_and_create_links | ValueError: Storage status (in_progress) must be QUEUED to move it IN_PROGRESS io_storages.base... | 2 |
| _scan_and_create_links | ValueError: Error on key kac_v6_collabo/annotation.json: For S3 your JSON file must be a dictionary with one task ... | 1 |
| _scan_and_create_links | ValueError: Error loading JSON from file "wafer_aoi_defects_batch1/Field Failure/N_PBI_ZN030_4964150_13.093.png". ... | 1 |
| _scan_and_create_links | ValueError: If you use "predictions" field in the task, you must put "data" field in the task too ... | 1 |

netlify bot commented Apr 15, 2025

Deploy Preview for label-studio-storybook ready!

Name Link
🔨 Latest commit 11c0211
🔍 Latest deploy log https://app.netlify.com/sites/label-studio-storybook/deploys/6802906daa0be60008cdb098
😎 Deploy Preview https://deploy-preview-7372--label-studio-storybook.netlify.app

To edit notification comments on pull requests, go to your Netlify site configuration.

netlify bot commented Apr 15, 2025

Deploy Preview for label-studio-docs-new-theme ready!

Name Link
🔨 Latest commit 11c0211
🔍 Latest deploy log https://app.netlify.com/sites/label-studio-docs-new-theme/deploys/6802906d2e0d8300086eb7b9
😎 Deploy Preview https://deploy-preview-7372--label-studio-docs-new-theme.netlify.app

netlify bot commented Apr 15, 2025

Deploy Preview for heartex-docs ready!

Name Link
🔨 Latest commit 11c0211
🔍 Latest deploy log https://app.netlify.com/sites/heartex-docs/deploys/6802906d10d3f80008cd3782
😎 Deploy Preview https://deploy-preview-7372--heartex-docs.netlify.app

codecov bot commented Apr 16, 2025

Codecov Report

Attention: Patch coverage is 66.12903% with 21 lines in your changes missing coverage. Please review.

Project coverage is 77.22%. Comparing base (849c3df) to head (11c0211).
Report is 15 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| label_studio/io_storages/base_models.py | 65.57% | 21 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7372      +/-   ##
===========================================
- Coverage    77.27%   77.22%   -0.06%     
===========================================
  Files          190      190              
  Lines        14697    14721      +24     
===========================================
+ Hits         11357    11368      +11     
- Misses        3340     3353      +13     
Flag Coverage Δ
pytests 77.22% <66.12%> (-0.06%) ⬇️
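
The project coverage figures in the diff above can be re-derived from the Hits and Lines rows. The snippet below is a small arithmetic check using only the numbers shown in the report; `coverage_pct` is a hypothetical helper, and rounding to two decimals is an assumption about how the report displays percentages.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals (assumed display rounding)."""
    return round(100 * hits / lines, 2)


# Values copied from the coverage diff (develop vs. #7372).
base = {"lines": 14697, "hits": 11357, "misses": 3340}
head = {"lines": 14721, "hits": 11368, "misses": 3353}

# Hits and misses should add up to the line count on each side.
for side in (base, head):
    assert side["hits"] + side["misses"] == side["lines"]

base_pct = coverage_pct(base["hits"], base["lines"])  # 77.27
head_pct = coverage_pct(head["hits"], head["lines"])  # 77.22

# The +24 new lines split into +11 hits and +13 misses.
new_lines = head["lines"] - base["lines"]
new_hits = head["hits"] - base["hits"]
new_misses = head["misses"] - base["misses"]
```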

@hakan458
Collaborator Author

hakan458 commented Apr 18, 2025

/fm sync

Workflow run

Member

@makseq makseq left a comment

I believe we should add a feature flag (FF) and keep the old implementation in place until we are completely sure that the new one is safe.
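
One way to read this suggestion is sketched below: the batched path runs only when a flag is on, and the per-key path stays as the default. Everything here is hypothetical and for illustration only: the flag name, the `flags` dictionary, and the `add_one` / `add_batch` callables are assumptions, not Label Studio's actual feature-flag API or import code.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical flag name, for illustration only.
BATCH_IMPORT_FLAG = "ff_batch_storage_import"


def import_from_storage(
    keys: Iterable[str],
    flags: Dict[str, bool],
    add_one: Callable[[str], None],
    add_batch: Callable[[List[str]], None],
    batch_size: int = 50,
) -> str:
    """Import keys and return which path handled them: 'batched' or 'legacy'."""
    key_list = list(keys)
    if flags.get(BATCH_IMPORT_FLAG, False):
        # New path: one write per batch of keys.
        for start in range(0, len(key_list), batch_size):
            add_batch(key_list[start:start + batch_size])
        return "batched"
    # Old path, kept as the default: one write per key.
    for key in key_list:
        add_one(key)
    return "legacy"


# Usage: the flag is off by default, so the legacy path is taken.
single_writes: List[str] = []
batch_writes: List[List[str]] = []
path_off = import_from_storage(["a.json", "b.json", "c.json"], {},
                               single_writes.append, batch_writes.append)
path_on = import_from_storage(["a.json", "b.json", "c.json"],
                              {BATCH_IMPORT_FLAG: True},
                              single_writes.append, batch_writes.append)
```

Defaulting the flag to off means the new path can be enabled gradually and switched back without a code change if problems appear.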

4 participants