- KUIS AI Center
- İstanbul
- http://ilkerkesen.github.io
- @ilker_kesen
Stars
- [NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
- PyTorch implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"
- Python tool for converting files and office documents to Markdown.
- Evaluation of language models on mono- or multilingual tasks.
- Meta Lingua: a lean, efficient, and easy-to-hack codebase for LLM research.
- Evaluation framework for the paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
- [CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
- Evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.org/abs/2404.12390 [ECCV 2024]
- 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale.
- Official code implementation for the paper "Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?"
- VILA is a family of state-of-the-art vision-language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
- Large Language Model Text Generation Inference
- A lightweight library for generating synthetic instruction-tuning datasets from your data without GPT.
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
- Finetune Llama 3.3, DeepSeek-R1, Mistral, Phi-4 & Gemma 2 LLMs 2-5x faster with 70% less memory
- Interpretability for sequence generation models 🐛 🔍
- A program to choose transfer languages for cross-lingual learning
- [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
- Modeling, training, evaluation, and inference code for OLMo
- ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
- Self-Alignment with Principle-Following Reward Models