Description
Feature request
We would like to implement Gemini 1.5 and/or Gemini Vision in openadapt.adapters.gemini.
Related: #565
Gemini 1.5: https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/#sundar-note
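As a starting point for discussion, here is a minimal sketch of what such an adapter module might look like. It is not the project's actual API: the function names (`build_parts`, `prompt`), the `generate` hook, the default model name, and the use of a `GOOGLE_API_KEY` environment variable are all assumptions made for illustration. The network call relies on the `google-generativeai` package (`genai.configure`, `genai.GenerativeModel`, `generate_content`), imported lazily so the module can be loaded without it.

```python
"""Hypothetical sketch of openadapt/adapters/gemini.py (illustrative only)."""

import os
from typing import Any, Callable, List, Sequence

# Assumption: a vision-capable model name exposed by the public Gemini API.
DEFAULT_MODEL_NAME = "gemini-pro-vision"


def build_parts(text: str, images: Sequence[Any] = ()) -> List[Any]:
    """Combine a text prompt and optional images into one multimodal request."""
    if not text:
        raise ValueError("text prompt must not be empty")
    return [text, *images]


def _default_generate(parts: List[Any], model_name: str) -> str:
    """Send the parts to Gemini through the google-generativeai package."""
    # Imported lazily so importing this module does not require the dependency.
    import google.generativeai as genai

    # Assumption: the API key is supplied via the GOOGLE_API_KEY variable.
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel(model_name)
    return model.generate_content(parts).text


def prompt(
    text: str,
    images: Sequence[Any] = (),
    model_name: str = DEFAULT_MODEL_NAME,
    generate: Callable[[List[Any], str], str] = _default_generate,
) -> str:
    """Return the model's text completion for a prompt plus optional images.

    `generate` is a hypothetical hook that lets callers swap in a stub,
    which keeps the adapter testable without network access.
    """
    return generate(build_parts(text, images), model_name)
```

With the package installed and a key configured, a call could look like `prompt("Describe this screenshot.", images=[PIL.Image.open("screenshot.png")])`. Keeping the request construction separate from the network call is a design choice of this sketch, so that the same `prompt` signature could later be pointed at Gemini 1.5.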
Motivation
Gemini 1.5 offers a context window of up to 1 million tokens: https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/#sundar-note
https://www.linkedin.com/feed/update/urn:li:activity:7140972956314247168/
Gemini Pro Vision (multimodal) works well and is available to everyone! I did a quick test and the results I got were similar to GPT4-Vision.
Descriptions are accurate. Colors and directions of objects are correct! Something LLaVA did not get right, unfortunately ...
⚡ The good thing is you can use Gemini Pro already today, at 1 request per second compared to 100 requests per day for GPT-4 vision.
Google has enough GPUs to serve the world apparently 😏