Skip to content

hamzafer/Large-Vision-Models

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

37 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Large Vision Models Inferences

[Made Public]

Multimodal AI Inference

Overview

This repository is dedicated to running inference tasks using various large vision models (LVMs) over secure SSH connections. It serves as a growing collection of scripts that implement and manage inference for cutting-edge multimodal AI models, focusing on both vision and language tasks.

Key Features

  • Qwen Inference: Leverages Qwen, a robust language model, to process multimodal input through qwen_inference.py.
  • Llava Next Integration: Adds advanced visual understanding with Llava Next, utilizing llava_next_inference.py.
  • Continuous Expansion: As more large vision models are explored and integrated, the repository will expand with additional inference files.
  • SSH-based Inference: All inference processes are conducted remotely over SSH, providing scalable and secure access to compute resources.

Files (Growing Collection)

  • qwen_inference.py: Script for running inference tasks using the Qwen model.
  • llava_next_inference.py: Inference script for Llava Next, aimed at advanced visual understanding.

Use Case

This repository is designed for AI researchers and developers working on large vision models. It facilitates the remote deployment and inference of state-of-the-art vision and multimodal models, with secure SSH-based access.

Future Work

  • Integration of additional large vision models for comprehensive multimodal tasks.
  • Support for larger datasets and batch processing capabilities.
  • Performance benchmarking and optimization for inference tasks.

About

GPU 4090 RTX Large Vision Models Inference via SSH

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published