Dalle integration PR #16

Merged
merged 7 commits into from
Feb 5, 2024

Conversation

jackitaliano
Contributor

Dalle Integration

Allow for image generation via DALL-E-3 within chats.

Can be used by asking Bidara to generate an image, or show you something you've already been discussing.

Example of how the interaction might go

...Conversation about using kingfisher beaks for aircraft design...

  • user: Show me an image of how that would look for the nose-cone of an aircraft
  • ai: (image)
  • ai: Now that you've seen that visualization, how might you incorporate the design into an aircraft?

Note: this is a rough approximation of the conversation.

Example generations

(Screenshot of an example generation, 2024-01-31, 10:44:18 AM)

Changes

Function Definition / Bidara Prompt

Problem

  • Trouble getting Bidara to actually use its function to generate images.

Solution

  • Tried many different "configurations" of the function descriptions/parameters.
  • Found the most consistent approach was to simplify the function while telling Bidara both about the function and about its ability to do what the function does (i.e. generate images).
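
As a rough illustration of what a simplified definition could look like, here is a minimal sketch in the OpenAI function-calling format. Only the function name `generate_image_from_description` and its single `description` parameter come from this PR's commit notes; the description strings and the `imageHint` wording are hypothetical, not the text actually used in Bidara's prompt.

```javascript
// Hypothetical sketch of a simplified function definition.
// Only the function name and the "description" parameter come from this PR;
// the human-readable description strings below are illustrative.
const generateImageFunction = {
  type: "function",
  function: {
    name: "generate_image_from_description",
    description: "Generate an image from a text description.",
    parameters: {
      type: "object",
      properties: {
        description: {
          type: "string",
          description: "A detailed description of the image to generate.",
        },
      },
      required: ["description"],
    },
  },
};

// Illustrative system-prompt hint (assumed wording) telling the assistant
// that it is able to generate images and which function to call.
const imageHint =
  "You can generate images. When the user asks to see or visualize something, " +
  "call generate_image_from_description.";
```

Keeping the schema to one required string parameter leaves the model little room to get the call wrong, which matches the observation above that simpler definitions were used more consistently.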

Inserting Images

An image can be inserted simply by returning its URL from funcCalling.

Problem

  • If voice mode is activated, it reads the entire (long) URL aloud, which isn't ideal
  • The displayed size of the image also can't be changed (it defaults to 1024x1024, the smallest size OpenAI currently offers for DALL-E 3), which is also not ideal

Solution

  • Insert the image with deepChatRef._addMessage, the renamed version of the previously deprecated deepChatRef.addMessage
  • This method is "unofficial" and not documented anywhere
  • See deep-chat issue#75
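
A minimal sketch of how the image might be inserted this way follows. Because `_addMessage` is undocumented, its exact signature is an assumption here, as are the `{ role, html }` message shape, the helper names `buildImageMessage` and `insertImage`, and the 256-pixel display size. The intent is that the image arrives as its own message at a chosen size, so the URL need not appear in the text that voice mode reads.

```javascript
// Build a chat message that shows the image at a reduced display size.
// ASSUMPTION: Deep Chat accepts messages shaped like { role, html }.
function buildImageMessage(imageUrl, displaySize = 256) {
  // Escape characters that could break out of the src attribute.
  const safeUrl = String(imageUrl)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
  return {
    role: "ai",
    html: `<img src="${safeUrl}" width="${displaySize}" height="${displaySize}" alt="Generated image">`,
  };
}

// Insert the image through the unofficial method.
// ASSUMPTION: _addMessage takes a single message object.
function insertImage(deepChatRef, imageUrl, displaySize) {
  // The method is undocumented, so guard against it being renamed or removed.
  if (!deepChatRef || typeof deepChatRef._addMessage !== "function") {
    return false;
  }
  deepChatRef._addMessage(buildImageMessage(imageUrl, displaySize));
  return true;
}
```

Returning `false` when the method is missing lets the caller fall back to returning the plain URL, which still works but brings back the two problems listed above.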

Other

  • Image generation was relatively simple via fetch to OpenAI. (See image gen docs)
  • On error, or if no image is received, Bidara should gracefully tell the user there was an error
  • After the image is given to the user, Bidara should tie it back into the biomimicry process
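
The fetch-based request and the graceful error path described above can be sketched roughly as follows. This is not the code in src/openaiUtils.js: the function name `generateImage`, the injectable `fetchFn` parameter, the result shape, and the user-facing error wording are all assumptions made for illustration. The endpoint and request fields follow OpenAI's image generation API.

```javascript
const OPENAI_IMAGES_URL = "https://api.openai.com/v1/images/generations";

// Hypothetical helper: request one DALL-E 3 image for a description.
// fetchFn is injectable so the function can be exercised without a network call.
async function generateImage(description, apiKey, fetchFn = globalThis.fetch) {
  try {
    const res = await fetchFn(OPENAI_IMAGES_URL, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: "dall-e-3",
        prompt: description,
        n: 1,
        size: "1024x1024",
      }),
    });
    if (!res.ok) {
      throw new Error(`Image request failed with status ${res.status}`);
    }
    const json = await res.json();
    const url = json && json.data && json.data[0] && json.data[0].url;
    if (!url) {
      throw new Error("No image was returned");
    }
    return { ok: true, url };
  } catch (err) {
    // Fail gracefully: give the caller a message that can be relayed to the user.
    return {
      ok: false,
      message: "Sorry, there was an error generating the image. Please try again.",
    };
  }
}
```

Returning a result object instead of throwing means the function-call handler always has something to hand back, so the conversation can continue into the biomimicry process either way.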

- using function calling (function: "generate_image_from_description"). Found GPT to be very picky about naming, parameters, etc.
- added "hints" to BIDARA in sys prompt to encourage it to use image generation when asked. Found GPT to also be very picky here
- currently just takes "description" parameter from function call to pass to dalle api
- add "openai" dependency
- moved some things to bidaraFunctions from image_gen
- replaced api usage of dalle with fetch in openaiUtils
- switch to fetch req to align with other changes
@jackitaliano jackitaliano linked an issue Jan 31, 2024 that may be closed by this pull request
@bruffridge bruffridge merged commit 95bf5da into main Feb 5, 2024
@bruffridge bruffridge deleted the js-dalle-integration branch February 5, 2024 22:00
Onsang1 pushed a commit that referenced this pull request Dec 16, 2024
Development

Successfully merging this pull request may close these issues.

DALLE-3 support for image generation
2 participants