Added a pipeline step for bulk stemming using Demucs #5

energydrink9 · 2024-12-16T11:42:42Z

PR Type

Enhancement

Description

This PR adds a new pipeline step for audio stem separation using Demucs:

Implements new stemming functionality to extract individual instrument tracks from whole audio files
Adds parallel processing support via Dask for efficient stem separation
Includes quality filters to exclude silent stems and low-quality piano/vocals stems
Moves silence detection to a shared utility function
Updates pipeline to include stemming as initial step
Adds necessary configuration and documentation updates

Changes walkthrough 📝

Relevant files

Enhancement

stem.py `New module for audio stem separation using Demucs` src/stem_continuation_dataset_generator/steps/stem.py Added new module for audio stem separation using Demucs Implements functions to process audio files and extract individual instrument tracks Handles file conversion and filtering of silent/low quality stems Includes parallel processing support via Dask	+90/-0
pipeline.py `Integration of stemming step into main pipeline` src/stem_continuation_dataset_generator/pipeline.py Added stem_all() step to the dataset creation pipeline Updated imports to include new stemming functionality	+4/-1
utils.py `Added audio silence detection utility` src/stem_continuation_dataset_generator/utils/utils.py Added is_mostly_silent() utility function to detect silent audio segments	+10/-0
constants.py `Added path constant for whole track files` src/stem_continuation_dataset_generator/constants.py Added get_whole_tracks_files_path() function for accessing source audio files	+4/-0
merge.py `Refactored silence detection functionality` src/stem_continuation_dataset_generator/steps/merge.py Moved is_mostly_silent() to utils module Updated silence detection implementation for remote files	+7/-12

Documentation

README.md `Updated documentation with stemming step` README.md Added documentation for new stemming step using Demucs	+1/-0

Dependencies

pyproject.toml `Added Demucs dependency` pyproject.toml Added demucs package dependency	+1/-0

💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

qodo-merge-pro · 2024-12-16T11:43:25Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Error Handling The stem_file() function lacks error handling for failed Demucs separations. Should handle potential exceptions from demucs.separate.main() Resource Management The stem() function opens files but doesn't explicitly close them. Should use context managers for file operations Configuration Issue RUN_LOCALLY is hardcoded to True which may cause issues when deploying to production environment Type Safety The is_mostly_silent() function uses type: ignore for librosa.load() call. Should handle potential type issues properly

qodo-merge-pro · 2024-12-16T11:43:54Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Score
Possible issue	Reset file position after reading to prevent subsequent read operations from failing Add file position reset after librosa.load() to ensure the file can be read again if needed, as librosa.load() moves the file pointer. src/stem_continuation_dataset_generator/utils/utils.py [47-48] def is_mostly_silent(file: Union[io.TextIOWrapper, io.BufferedReader], percentage_non_silent: float) -> bool: audio, sr = librosa.load(file) # type: ignore + file.seek(0) Apply this suggestion Suggestion importance[1-10]: 9 Why: Not resetting the file position after librosa.load() could cause subsequent file operations to fail, potentially breaking the silence detection functionality.	9
Possible issue	Add error handling to prevent pipeline failures when audio separation fails Add error handling for the Demucs model execution to prevent pipeline failures if the audio separation fails for a specific file. src/stem_continuation_dataset_generator/steps/stem.py [41] -demucs.separate.main(shlex.split(f'-n htdemucs_6s --clip-mode clamp --out "{output_directory}" "{file_path}"')) +try: + demucs.separate.main(shlex.split(f'-n htdemucs_6s --clip-mode clamp --out "{output_directory}" "{file_path}"')) +except Exception as e: + print(f"Failed to process {file_path}: {str(e)}") + return (output_directory, []) Apply this suggestion Suggestion importance[1-10]: 8 Why: Adding error handling for Demucs model execution is crucial for pipeline reliability, preventing complete pipeline failures when processing individual files fails.	8
General	Ensure proper cleanup of temporary files even if processing fails Close file handles explicitly after downloading and processing to prevent resource leaks, especially important when processing large batches. src/stem_continuation_dataset_generator/steps/stem.py [53-55] -fs.download(file_path, local_path) -stem_file(local_directory, local_path) -os.remove(local_path) +try: + fs.download(file_path, local_path) + stem_file(local_directory, local_path) +finally: + if os.path.exists(local_path): + os.remove(local_path) Apply this suggestion Suggestion importance[1-10]: 7 Why: Proper resource cleanup is important for system stability, especially when processing large batches of files. The try-finally block ensures cleanup even on errors.	7

qodo-merge-pro bot added the Review effort [1-5]: 4 label Dec 16, 2024

Added a pipeline step for bulk stemming using Demucs

0ecb5d2

energydrink9 force-pushed the feature/stemming branch from 6e6b1a3 to 0ecb5d2 Compare December 16, 2024 11:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Added a pipeline step for bulk stemming using Demucs #5

Added a pipeline step for bulk stemming using Demucs #5

energydrink9 commented Dec 16, 2024 •

edited by qodo-merge-pro bot

Loading

Uh oh!

qodo-merge-pro bot commented Dec 16, 2024

Uh oh!

qodo-merge-pro bot commented Dec 16, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Added a pipeline step for bulk stemming using Demucs #5

Are you sure you want to change the base?

Added a pipeline step for bulk stemming using Demucs #5

Conversation

energydrink9 commented Dec 16, 2024 • edited by qodo-merge-pro bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Type

Description

Changes walkthrough 📝

Uh oh!

qodo-merge-pro bot commented Dec 16, 2024

PR Reviewer Guide 🔍

Uh oh!

qodo-merge-pro bot commented Dec 16, 2024

PR Code Suggestions ✨

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

energydrink9 commented Dec 16, 2024 •

edited by qodo-merge-pro bot

Loading