Intelligent audio library organization with AI-powered analysis, interactive classification, and adaptive learning.
Transform your chaotic audio collections into intelligently organized, searchable libraries with AI that actually listens to your content and learns your creative patterns.
AudioAI doesn't just sort files - it understands them.
- Listens to actual audio content using advanced spectral analysis
- Learns your organization patterns and discovers new categories organically
- Interactive classification asks for help when uncertain, improving over time
- Semantic filename preservation keeps meaning while adding rich metadata
- Confidence-based processing - auto-handles obvious files, asks about edge cases
- Adaptive learning system gets smarter with every classification
downloads/
├── 88bpm_play playful_childlike_beat_ES_February_Moon.mp3
├── pulsing_signals_digital_space_Rhythmania.mp3
├── out-of-breath-male-176715.mp3
└── UK-Asian_Young_Female_Voice_35.wav
01_UNIVERSAL_ASSETS/
├── MUSIC_LIBRARY/by_mood/contemplative/
│   └── playful_childlike_February_Moon_Instrumental_MUS_88bpm_CONT_E7.mp3
├── SFX_LIBRARY/by_category/technology/
│   └── pulsing_signals_digital_space_Rhythmania_SFX_TECH_90bpm_MYST_E6.mp3
└── SFX_LIBRARY/by_category/human_elements/
    ├── Male_Out-of-Breath_SFX_176715.mp3
    └── UK-Asian_Young_Female_Voice_35.wav
04_METADATA_SYSTEM/
└── audio_metadata_20250708.xlsx  # Comprehensive searchable database
Plus: Confidence scores, AI reasoning, cross-references, and learning statistics!
# Python 3.8+ required
pip install openai librosa mutagen pandas openpyxl numpy
git clone https://github.com/user/AudioAI-organizer.git
cd AudioAI-organizer
pip install -r requirements.txt
# Set your OpenAI API key
export OPENAI_API_KEY="sk-your-key-here"
from audioai_organizer import AdaptiveAudioOrganizer
# Initialize with your library path
organizer = AdaptiveAudioOrganizer(
    openai_api_key="your-api-key-here",
    base_directory="/path/to/your/audio/library"
)
# Process a single file interactively
result = organizer.process_file_interactive("test_audio.mp3", dry_run=True)
# Batch process with smart interaction (recommended)
audio_files = ["/path/to/audio1.mp3", "/path/to/audio2.wav"]
organizer.interactive_batch_process(audio_files, confidence_threshold=0.7)
- Audio content analysis: BPM, brightness, texture, energy levels
- Pattern recognition: Learns your specific organization preferences
- Interactive learning: Asks targeted questions to improve accuracy
- Confidence scoring: Auto-processes obvious files, flags uncertain ones
- Tempo detection: Precise BPM extraction for rhythm-based organization
- Mood analysis: Emotional classification (contemplative, mysterious, energetic)
- Content type detection: Music vs SFX vs voice with high accuracy
- Spectral analysis: Brightness, texture, and tonal characteristics
- Semantic folder structures: Organized by mood, energy, and purpose
- Cross-reference system: Files can belong to multiple relevant categories
- Automatic folder creation: Discovers new categories from your content
- Filename enhancement: Preserves meaning while adding rich metadata
- Searchable metadata: Excel spreadsheets with full analysis data
- Learning statistics: Track system improvement over time
- Original filename preservation: Complete traceability
- Confidence and reasoning: Understand every AI decision
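The confidence-based behaviour listed above can be pictured as a simple threshold check. The sketch below is an illustration only, not AudioAI's actual implementation: `MODE_THRESHOLDS` and `route` are hypothetical names, and the threshold values are the ones this README gives for each interaction mode.

```python
# Illustrative sketch only; MODE_THRESHOLDS and route are hypothetical names,
# not part of the AudioAI API.

# Thresholds as documented for each interaction mode.
MODE_THRESHOLDS = {
    "always": 1.0,   # ask about every file
    "smart": 0.7,    # ask when the AI is less than 70% confident
    "minimal": 0.4,  # ask only about very uncertain files
    "never": 0.0,    # never ask
}


def route(confidence: float, mode: str = "smart") -> str:
    """Return "auto" to process without asking, or "ask" to prompt the user."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if mode == "always":
        # Assumption: "always" prompts even when the model is fully confident.
        return "ask"
    threshold = MODE_THRESHOLDS[mode]
    # Assumption: a file is auto-processed once confidence reaches the threshold.
    return "auto" if confidence >= threshold else "ask"
```

Under these assumptions, a file scored at 0.5 would trigger a prompt in `smart` mode but be handled automatically in `minimal` mode.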
"I had 10,000 samples scattered everywhere. AudioAI organized them by BPM, mood, and energy in 2 hours. Now I find the perfect 128bpm dark ambient pad instantly."
"Managing voice samples, SFX, and music was chaos. The semantic filename preservation means I never lose track of what files actually contain."
"The interactive classification caught edge cases I missed. The AI learned our specific sound design categories and now auto-sorts 95% of new assets."
"Building audio libraries for AI consciousness storytelling. The system understands emotional context and organizes themes like 'digital consciousness' and 'memory formation'."
# Smart mode - asks when uncertain (recommended)
organizer.set_interaction_mode('smart') # 70% confidence threshold
# Minimal mode - only very uncertain files
organizer.set_interaction_mode('minimal') # 40% confidence threshold
# Always interactive - maximum accuracy
organizer.set_interaction_mode('always') # 100% threshold
# Fully automatic - bulk processing
organizer.set_interaction_mode('never') # 0% threshold
# View learning statistics
organizer.show_learning_stats()
# Export classifications for backup
learning_data = organizer.export_learning_data()
# Import existing classifications
organizer.import_classifications("previous_library.json")
# Force learning update after manual corrections
organizer.update_learning_patterns()
# Define custom organization patterns
custom_categories = {
    "music_electronic": ["energetic", "euphoric", "dark", "minimal"],
    "sfx_nature": ["calming", "organic", "flowing", "textural"],
    "voice_ai": ["synthetic", "robotic", "processed", "emotional"]
}
organizer.add_custom_categories(custom_categories)
# Process with real-time audio preview (Jupyter/IPython)
organizer.interactive_batch_process(
    file_list,
    confidence_threshold=0.7,
    play_audio=True  # Hear uncertain files before classifying
)
- When the AI is less than 70% confident, it will prompt you for confirmation and play a 30-second audio preview (in Jupyter/IPython).
- You can accept, modify, or skip the classification, ensuring maximum accuracy and control.
- Every file move and filename change is logged in the metadata spreadsheet, including original and new filenames/paths.
- This allows you to recover or trace any file, even after large batch operations.
- For security and portability, set your API key and base directory using environment variables:
export OPENAI_API_KEY="sk-your-key-here"
export AUDIOAI_BASE_DIRECTORY="/path/to/your/audio/library"
- Test the system on a single file (with audio preview and user feedback) before running batch processing:
from audioai_organizer import AdaptiveAudioOrganizer
import os

organizer = AdaptiveAudioOrganizer(
    openai_api_key=os.getenv('OPENAI_API_KEY'),
    base_directory=os.getenv('AUDIOAI_BASE_DIRECTORY')
)
result = organizer.process_file_interactive("/path/to/test/file.mp3", dry_run=True)
- Process multiple files, with human-in-the-loop and audio preview for uncertain files:
audio_files = ["/path/to/audio1.mp3", "/path/to/audio2.wav"]
results = organizer.interactive_batch_process(audio_files, confidence_threshold=0.7, dry_run=True)
- The system will prompt you and play audio for files where the AI is unsure.
- The metadata spreadsheet (audio_metadata_YYYYMMDD.xlsx) contains all original and new filenames/paths for every processed file.
- Use this log to recover or trace any file if needed.
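As a rough illustration of how the log could be used for recovery, the sketch below builds a "move back" plan from log rows. It is not part of AudioAI: `plan_restore` and `restore` are hypothetical helpers, and the `original_path` and `new_path` keys are assumed column names that the real spreadsheet may not use.

```python
import shutil
from pathlib import Path
from typing import Dict, Iterable, List, Tuple


def plan_restore(rows: Iterable[Dict[str, str]]) -> List[Tuple[Path, Path]]:
    """Build (current_location, original_location) pairs from log rows.

    Assumption: each row has `original_path` and `new_path` keys; the real
    spreadsheet may use different column names.
    """
    plan = []
    for row in rows:
        current, original = Path(row["new_path"]), Path(row["original_path"])
        if current != original:  # files that never moved need no action
            plan.append((current, original))
    return plan


def restore(plan: List[Tuple[Path, Path]], dry_run: bool = True) -> List[str]:
    """Move files back to their original paths; with dry_run=True only report."""
    actions = []
    for current, original in plan:
        if not current.exists():
            actions.append(f"missing: {current}")
            continue
        if original.exists():
            # Never overwrite something already sitting at the original path.
            actions.append(f"skipped (target exists): {original}")
            continue
        if not dry_run:
            original.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(current), str(original))
        actions.append(f"{current} -> {original}")
    return actions
```

The rows could come from the spreadsheet via `pandas.read_excel(...).to_dict("records")`. Running with `dry_run=True` first mirrors the dry-run habit used elsewhere in this README.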
AudioAI creates an intelligent, expandable folder structure:
YOUR_AUDIO_LIBRARY/
├── 01_UNIVERSAL_ASSETS/
│   ├── MUSIC_LIBRARY/
│   │   └── by_mood/
│   │       ├── contemplative/
│   │       ├── tension_building/
│   │       ├── mysterious/
│   │       └── wonder_discovery/
│   ├── SFX_LIBRARY/
│   │   └── by_category/
│   │       ├── consciousness/
│   │       ├── human_elements/
│   │       ├── environmental/
│   │       └── technology/
│   └── VOICE_ELEMENTS/
│       ├── narrator_banks/
│       ├── processed_vocals/
│       └── character_voices/
├── THEMATIC_COLLECTIONS/
│   ├── human_machine_dialogue/
│   ├── digital_consciousness/
│   └── emergence_awakening/
├── 04_METADATA_SYSTEM/
│   ├── learning_data.pkl
│   ├── discovered_categories.json
│   └── audio_metadata_YYYYMMDD.xlsx
└── TO_SORT/
    └── [unprocessed files]
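If you want to inspect or pre-create this layout by hand, a small script along the following lines would do it. This is a sketch, not how AudioAI itself builds folders: `LAYOUT` and `create_skeleton` are hypothetical names, and the nesting of the folder names is an assumption based on the structure listed above.

```python
from pathlib import Path
from typing import Dict, List

# Assumed nesting of the documented folder names; adjust to match your library.
LAYOUT: Dict[str, List[str]] = {
    "01_UNIVERSAL_ASSETS/MUSIC_LIBRARY/by_mood": [
        "contemplative", "tension_building", "mysterious", "wonder_discovery",
    ],
    "01_UNIVERSAL_ASSETS/SFX_LIBRARY/by_category": [
        "consciousness", "human_elements", "environmental", "technology",
    ],
    "01_UNIVERSAL_ASSETS/VOICE_ELEMENTS": [
        "narrator_banks", "processed_vocals", "character_voices",
    ],
    "THEMATIC_COLLECTIONS": [
        "human_machine_dialogue", "digital_consciousness", "emergence_awakening",
    ],
    "04_METADATA_SYSTEM": [],
    "TO_SORT": [],
}


def create_skeleton(base: Path) -> List[Path]:
    """Create every folder in LAYOUT under `base` and return the leaf folders."""
    created = []
    for parent, children in LAYOUT.items():
        # A parent with no children is itself the leaf folder.
        targets = [base / parent / child for child in children] or [base / parent]
        for target in targets:
            target.mkdir(parents=True, exist_ok=True)  # safe to re-run
            created.append(target)
    return created
```

Because `mkdir` is called with `exist_ok=True`, running the script against an existing library only adds the folders that are missing.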
- Spectral Analysis: Librosa extracts tempo, brightness, texture
- Content Classification: ML distinguishes music/SFX/voice
- Mood Detection: Energy and harmonic analysis for emotional context
- Pattern Recognition: Compares against learned user preferences
- Confidence Scoring: Determines if human input needed
- User feedback integration: Every correction improves future classifications
- Pattern discovery: Identifies recurring organization themes
- Category evolution: Automatically discovers new meaningful groupings
- Confidence calibration: Learns when to ask vs auto-process
- Semantic preservation: Keeps original meaning and context
- Metadata integration: Adds BPM, energy, mood, classification codes
- Collision handling: Smart numbering for duplicate semantic content
- Reversibility: Complete traceability back to original names
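To make the filename rules above concrete, here is a minimal sketch that appends metadata codes to the original stem and numbers collisions. It is an assumption-laden illustration, not AudioAI's code: `enhance_filename` is a hypothetical helper, and the code order (type, category, BPM, mood, energy) is inferred from the example filenames shown earlier in this README.

```python
from pathlib import Path
from typing import Iterable, Optional


def enhance_filename(
    original: str,
    content_type: str,
    bpm: Optional[int] = None,
    mood: Optional[str] = None,
    energy: Optional[int] = None,
    category: Optional[str] = None,
    taken: Iterable[str] = (),
) -> str:
    """Keep the original stem, append metadata codes, and avoid name collisions."""
    path = Path(original)
    parts = [path.stem, content_type]  # the original meaning stays at the front
    if category:
        parts.append(category)
    if bpm is not None:
        parts.append(f"{bpm}bpm")
    if mood:
        parts.append(mood)
    if energy is not None:
        parts.append(f"E{energy}")
    base = "_".join(parts)
    candidate = base + path.suffix
    existing = set(taken)
    counter = 2
    # Assumed collision rule: append _2, _3, ... until the name is free.
    while candidate in existing:
        candidate = f"{base}_{counter}{path.suffix}"
        counter += 1
    return candidate
```

Called with `"pulsing_signals_digital_space_Rhythmania.mp3"`, type `"SFX"`, category `"TECH"`, 90 BPM, mood `"MYST"` and energy 6, this reproduces the `..._SFX_TECH_90bpm_MYST_E6.mp3` name from the example at the top of this README.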
We'd love your help making AudioAI even better!
- DAW integration (Ableton Live, Logic Pro, etc.)
- Cloud storage sync (Google Drive, Dropbox)
- Multi-language filename support
- Advanced genre-specific classification models
- Web interface for remote library management
- Audio similarity clustering
- Collaborative library sharing
Found an issue? Please include:
- Audio file format and duration
- Full error traceback
- Your system info (OS, Python version)
- Example file (if shareable)
git clone https://github.com/user/AudioAI-Organizer.git
cd AudioAI-Organizer
pip install -e ".[dev]"
pytest tests/
- Processing speed: ~50-100 files per hour (depending on interaction mode)
- Library size: Tested with 50,000+ file libraries
- Memory usage: ~200MB for typical audio analysis
- Storage overhead: ~5MB metadata per 1000 files
- Local processing: All audio analysis happens on your machine
- API usage: Only sends text descriptions to OpenAI, never audio files
- No data collection: Your library organization stays completely private
- Offline mode: Core audio analysis works without internet
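The "text descriptions only" point can be illustrated with a short sketch: features are computed locally, and only a plain-text summary of them would be placed in an API request. The function name `build_description` and the feature names are hypothetical; the actual prompt format AudioAI sends is not documented here.

```python
from typing import Dict, Union

Number = Union[int, float]


def build_description(filename: str, features: Dict[str, Number]) -> str:
    """Summarise locally computed features as text; no audio bytes are included."""
    lines = [f"Filename: {filename}"]
    for name in sorted(features):  # stable order keeps descriptions comparable
        lines.append(f"{name}: {features[name]:.2f}")
    return "\n".join(lines)
```

A string like this, rather than the audio file itself, is the kind of payload the privacy notes above describe.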
MIT License - see LICENSE for details.
Built with ❤️ for the audio community.
- librosa - Incredible audio analysis capabilities
- OpenAI - GPT-4 language understanding
- pandas - Data management and export
- mutagen - Audio metadata extraction
- The audio community - For inspiring better organization tools
Questions? Ideas? Success stories?
- Open an issue
- Email: user@example.com
- Substack: rt-max.substack.com
From audio chaos to intelligent organization. AudioAI learns, adapts, and grows with your creative vision.
