D^2-MoE: Delta Decompression for MoE-based LLMs Compression
Decentralized personalized federated learning based on a conditional sparse-to-sparser scheme (TNNLS)
[AICCSA 2025] Official Implementation of the paper "Z-Pruner: Post-Training Pruning of Large Language Models for Efficiency without Retraining".
Fine-tuning, inference, and optimization tasks (pruning, distillation) for better LLM performance.
Developing efficient deep learning models for real-world use. Covers knowledge distillation, quantization, pruning, and more. Focused on reducing size and latency while preserving accuracy. Includes training pipelines, visualizations, and performance reports.