M.S. student in Electrical and Computer Engineering at UC San Diego, working on computer vision, multimodal AI, and efficient deep learning.
- Multimodal Large Language Models — evaluation, benchmarking, tool-mediated reasoning
- Efficient Deep Learning — model pruning, on-device deployment, federated learning
- Generative Models — diffusion models, conditional image generation
| Project | Description | Status |
|---|---|---|
| MLLM-as-a-Judge | Benchmark for evaluating MLLMs as judges of vision-task outputs | Paper in prep |
| TAP-ViTs | Task-adaptive pruning for deploying ViTs on edge devices | Preprint (arXiv:2601.02437) |
| Noise-Level-Dependence | Measuring how conditioning effectiveness varies with noise level in diffusion models | Extended research |
| Face-Swap-Diffusion | Exploring sampling schedulers and identity guidance mechanisms for face swapping with conditional diffusion models | Undergraduate thesis |
- Zhibo Wang, Zuoyuan Zhang, Xiaoyi Pang, Qile Zhang, et al. TAP-ViTs: Task-Adaptive Pruning for On-Device Deployment of Vision Transformers. arXiv:2601.02437
- [Authors including Qile Zhang]. Evaluating Multi-modal Large Language Models as MLLM-as-Judge for Vision Tasks. In preparation.