- KAIST
- Daejeon
Stars
Code for the paper "Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models"
Some Python code to reproduce a nice optical illusion found on the web.
A simple optical illusion in python
A Parametric Framework to Generate Visual Illusions using Python
Measuring Massive Multitask Language Understanding | ICLR 2021
We collect papers about "large language models (LLM) for table-related tasks", e.g., using LLMs for the Table QA task. A curated collection of "Table + LLM" papers.
A beautiful, simple, clean, and responsive Jekyll theme for academics
LangFair is a Python library for conducting use-case level LLM bias and fairness assessments
The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Repo that contains resources to learn or get started with Large Language Models (LLMs)
✨✨Latest Advances on Multimodal Large Language Models
A list of free LLM inference resources accessible via API.
code for generating a high-quality knowledge graph with metadata about datasets and links to publications
Github Pages template for academic personal websites, forked from academicpages/academicpages.github.io
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
Data and Code for Program of Thoughts (TMLR 2023)
Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".
A state-of-the-art-level open visual language model | Multimodal pretrained model
[ACM MM 2020] CCD dataset for traffic accident anticipation.
A library of visualization tools for the interpretability and hallucination analysis of large vision-language models (LVLMs).
Developing VLMs for expert-level performance in specific medical specialties
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama mode…