Gemini API Demos

Hello! 👋 This is a repository of examples built with Google's Gemini API, which lets you build multimodal AI applications with text, images, and more.

What You'll Find:

Examples: These demos show the latest Gemini models (Flash 1.5, Pro, and others) in action. Dive into projects that demonstrate:

Image and Video understanding: Analyze content, classify objects, and even generate timestamped summaries.
Multimodal interaction: Combine text and image inputs to create engaging user experiences.
Technical Inspiration: Get hands-on with code examples that show you how to use the Gemini API effectively. Learn best practices for prompt engineering, caching and embedding, and integrating Gemini into your own applications.

Getting Started:

Obtain an API Key: To use the Gemini API, you'll need an API key. You can get one here or from AI Studio
Explore the Docs: The official documentation is your comprehensive guide to the Gemini API: https://ai.google.dev/gemini-api/docs/
Dive into the Demos: Choose a demo that sparks your interest and follow the instructions in its README. You'll be up and running in no time!

Important Notes:

API Usage Limits: Google may have usage limits and associated costs for the Gemini API. Be sure to review the details on their website.
Responsible AI: Please use the Gemini API responsibly and ethically. Avoid generating harmful or misleading content.
Feedback Welcome: We value your input! If you encounter issues, have suggestions, or want to share your creations, please open an issue or pull request.

Current Projects

Name	Description	Tools
Multimodal Embeddings	Using Gemini's new Multimodal Embeddings API, we'll explore high dimensional embedding space of text, images, and videos.	Multimodal Embeddings API, Firestore Vector Search
Gemini Video Scrubber	"GVS" is a prototype that uses Gemini's multimodal video understanding capabilities to create timestamped summaries of videos with a simple UI to play those timestamps back in sequence, giving you the ability to quickly scan videos for interesting moments, common occurences, and more!	Multimodal Gemini, File API, Caching
Voice Cursor	An experimental text editor that lets you highlight phrases and instantly hear them spoken by Gemini 2.0 in different expressive styles. Simply select text, choose a tone, and hear AI-generated speech with customizable prompts.	Gemini 2.0 Native Audio Output
Image to Code	An experimental site that uses Gemini 2.0 Flash to turn an image --> into a creative code sketch (p5.js).	Gemini 2.0 Flash

Experiments for all

This is an experiment, not an official Google product. We'll do our best to support and maintain this experiment but your mileage may vary.

We encourage open sourcing projects as a way of learning from each other. Please respect our and other creators' rights, including copyright and trademark rights when present, when sharing these works and creating derivative work. If you want more info on Google's policy, you can find that here.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
image-to-code		image-to-code
multimodal-embeddings		multimodal-embeddings
video-scrubber		video-scrubber
voice-cursor		voice-cursor
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Gemini API Demos

What You'll Find:

Getting Started:

Important Notes:

Current Projects

Experiments for all

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Gemini API Demos

What You'll Find:

Getting Started:

Important Notes:

Current Projects

Experiments for all

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages