
Release 2.7.0 #1540

Merged: 70 commits, Dec 12, 2024
Changes from 62 commits

Commits
987f285
feat: hunyuan_video
anuragts Dec 9, 2024
9092897
fix: lint
anuragts Dec 9, 2024
d9951ab
fix: create a fal tool
anuragts Dec 9, 2024
37b577e
fix: remove print
anuragts Dec 9, 2024
c9895bb
Update cookbook/agents/42_generate_fal_video.py
anuragts Dec 9, 2024
87c1afb
Update cookbook/agents/42_generate_fal_video.py
anuragts Dec 9, 2024
3243c5b
fix: move to fal tools
anuragts Dec 9, 2024
a59d392
fix: enum type
anuragts Dec 9, 2024
31f86fc
fix: name
anuragts Dec 9, 2024
7cf736d
fix: improve instructions
anuragts Dec 9, 2024
d04acf6
fix: return video/image url to agent
anuragts Dec 9, 2024
1b9513c
fix: add fal video agent to playground
anuragts Dec 9, 2024
a50da31
fix: send as schema
anuragts Dec 9, 2024
cf895cc
fix: return as mp4
anuragts Dec 9, 2024
c939f7e
fix: add enum
anuragts Dec 9, 2024
d0cb754
fix: data
anuragts Dec 9, 2024
45e9e86
fix: instruction for video model
anuragts Dec 9, 2024
dd75ea2
fix: more instruction
anuragts Dec 9, 2024
710aa30
Add replicate toolkit
dirkvolter Dec 9, 2024
22a07bb
Add replicate toolkit
dirkvolter Dec 9, 2024
b3bf340
Ignore missing imports
dirkvolter Dec 9, 2024
0a739bc
luma lab video generation
Ayush0054 Dec 9, 2024
11adf3e
Fix cookbook for replicate
dirkvolter Dec 10, 2024
2522dd7
Update image/video serialization
dirkvolter Dec 10, 2024
d438e3a
added image to video functionality, fixed formatting and mypy errors
Ayush0054 Dec 10, 2024
74bde7f
Fix style
dirkvolter Dec 10, 2024
9a10b50
Merge branch 'feature/replicate-toolkit' of github.com:phidatahq/phi…
Ayush0054 Dec 10, 2024
d3a5543
Merge remote-tracking branch 'origin/feature/replicate-toolkit' into …
anuragts Dec 10, 2024
4f16362
fix: improvements
anuragts Dec 10, 2024
a8b3c61
fix: rename file
anuragts Dec 10, 2024
935b680
fix: instruction update
anuragts Dec 10, 2024
e85e388
updated according to comments/review
Ayush0054 Dec 10, 2024
47f59f1
formatting
Ayush0054 Dec 10, 2024
c0b018c
Fix typo
dirkvolter Dec 10, 2024
5740cf8
fix: modal labs type mismatch
anuragts Dec 10, 2024
dff1e02
Bump version
dirkvolter Dec 10, 2024
d03c89f
Merge branch 'release/2.7.0' of https://github.com/phidatahq/phidata …
dirkvolter Dec 10, 2024
eccea9a
Add image cookbook
dirkvolter Dec 10, 2024
2443c27
fix: send gif in image
anuragts Dec 10, 2024
a643f21
Update
dirkvolter Dec 10, 2024
72605b4
Merge branch 'release/2.7.0' of https://github.com/phidatahq/phidata …
dirkvolter Dec 10, 2024
499693e
Merge branch 'feature/replicate-toolkit' of https://github.com/phidat…
dirkvolter Dec 10, 2024
312895b
Fix FAL interface
dirkvolter Dec 10, 2024
7b1f696
Merge
dirkvolter Dec 10, 2024
398e227
Fix FAL_KEY
dirkvolter Dec 10, 2024
5e3d3e2
Add modellabs gif to playground app
dirkvolter Dec 10, 2024
7ddc56c
Update name of replicate tools
dirkvolter Dec 10, 2024
a5c4681
Fix style
dirkvolter Dec 10, 2024
0d8c2b4
fix: remove duplicate
anuragts Dec 10, 2024
0e98701
Fix gemini reference
dirkvolter Dec 10, 2024
ad2023b
Merge branch 'hunyuan_video' of https://github.com/phidatahq/phidata …
dirkvolter Dec 10, 2024
8620e57
fix: remove correct
anuragts Dec 10, 2024
2ce1fe6
Fix mypy
dirkvolter Dec 10, 2024
b608861
Merge branch 'hunyuan_video' of https://github.com/phidatahq/phidata …
dirkvolter Dec 10, 2024
6ca39dd
Merge branch 'feature/replicate-toolkit' of https://github.com/phidat…
dirkvolter Dec 10, 2024
795ee52
Update lumalabs to work with new interface
dirkvolter Dec 10, 2024
d280dfa
Improve instructions
dirkvolter Dec 11, 2024
9cefcb7
Merge pull request #1526 from phidatahq/hunyuan_video
dirkbrnd Dec 11, 2024
6a60da1
Fix typo
dirkvolter Dec 11, 2024
2fd27e1
Merge branch 'release/2.7.0' of https://github.com/phidatahq/phidata …
dirkvolter Dec 11, 2024
d10bc9c
Fix style
dirkvolter Dec 11, 2024
51e8ac0
Merge pull request #1532 from phidatahq/lumalabs-video-generation
dirkbrnd Dec 11, 2024
1c3d341
Merge branch 'main' into release/2.7.0
anuragts Dec 11, 2024
09dd849
Update
dirkvolter Dec 11, 2024
629b5de
Merge branch 'release/2.7.0' of https://github.com/phidatahq/phidata …
dirkvolter Dec 11, 2024
2ef0ab5
Merge branch 'main' into release/2.7.0
anuragts Dec 12, 2024
f2d0d71
use-case-example-recipe-creator (#1511)
unnati914 Dec 12, 2024
6d72f5e
Update PR template (#1538)
saajann Dec 12, 2024
5980aee
Merge branch 'main' of https://github.com/phidatahq/phidata into rele…
dirkvolter Dec 12, 2024
08fc407
Pull in main
dirkvolter Dec 12, 2024
2 changes: 1 addition & 1 deletion cookbook/agents/15_generate_video.py
@@ -7,7 +7,7 @@
tools=[ModelsLabs()],
description="You are an AI agent that can generate videos using the ModelsLabs API.",
instructions=[
"When the user asks you to create a video, use the `create_video` tool to create the video.",
"When the user asks you to create a video, use the `generate_media` tool to create the video.",
"The video will be displayed in the UI automatically below your response, so you don't need to show the video URL in your response.",
"Politely and courteously let the user know that the video has been generated and will be displayed below as soon as its ready.",
],
24 changes: 24 additions & 0 deletions cookbook/agents/43_generate_replicate_video.py
@@ -0,0 +1,24 @@
from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.replicate import ReplicateTools

"""Create an agent specialized for Replicate AI content generation"""

video_agent = Agent(
name="Video Generator Agent",
model=OpenAIChat(id="gpt-4o"),
tools=[
ReplicateTools(model="tencent/hunyuan-video:847dfa8b01e739637fc76f480ede0c1d76408e1d694b830b5dfb8e547bf98405")
],
description="You are an AI agent that can generate videos using the Replicate API.",
instructions=[
"When the user asks you to create a video, use the `generate_media` tool to create the video.",
"Return the URL as raw to the user.",
"Don't convert video URL to markdown or anything else.",
],
markdown=True,
debug_mode=True,
show_tool_calls=True,
)

video_agent.print_response("Generate a video of a horse in the desert.")
22 changes: 22 additions & 0 deletions cookbook/agents/44_generate_replicate_image.py
@@ -0,0 +1,22 @@
from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.replicate import ReplicateTools

"""Create an agent specialized for Replicate AI content generation"""

image_agent = Agent(
name="Image Generator Agent",
model=OpenAIChat(id="gpt-4o"),
tools=[ReplicateTools(model="luma/photon-flash")],
description="You are an AI agent that can generate images using the Replicate API.",
instructions=[
"When the user asks you to create an image, use the `generate_media` tool to create the image.",
"Return the URL as raw to the user.",
"Don't convert image URL to markdown or anything else.",
],
markdown=True,
debug_mode=True,
show_tool_calls=True,
)

image_agent.print_response("Generate an image of a horse in the desert.")
20 changes: 20 additions & 0 deletions cookbook/agents/45_generate_fal_video.py
@@ -0,0 +1,20 @@
from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.fal_tools import FalTools

fal_agent = Agent(
name="Fal Video Generator Agent",
model=OpenAIChat(id="gpt-4o"),
tools=[FalTools("fal-ai/hunyuan-video")],
description="You are an AI agent that can generate videos using the Fal API.",
instructions=[
"When the user asks you to create a video, use the `generate_media` tool to create the video.",
"Return the URL as raw to the user.",
"Don't convert video URL to markdown or anything else.",
],
markdown=True,
debug_mode=True,
show_tool_calls=True,
)

fal_agent.print_response("Generate a video of a balloon in the ocean")
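
Note: this cookbook assumes Fal credentials are already configured in the environment. A minimal, hypothetical setup sketch (the FAL_KEY variable name comes from the "Fix FAL_KEY" commit in this release; the value is a placeholder):

import os

# Hypothetical setup for the Fal cookbook above; the fal client is assumed to
# read its API key from the FAL_KEY environment variable.
os.environ.setdefault("FAL_KEY", "your-fal-api-key")  # placeholder, not a real key
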
2 changes: 1 addition & 1 deletion cookbook/assistants/llms/vertexai/samples/multimodal.py
Reviewer comment (Contributor): I am not sure about making changes to assistants folder as we are trying to phase it out
@@ -11,7 +11,7 @@ def multimodal_example(project: Optional[str], location: Optional[str]) -> str:
# Load the model
multimodal_model = GenerativeModel("gemini-1.0-pro-vision")
# Query the model
response = multimodal_model.generate_content(
response = multimodal_model.generate_media(
[
# Add an example image
Part.from_uri("gs://generativeai-downloads/images/scones.jpg", mime_type="image/jpeg"),
2 changes: 1 addition & 1 deletion cookbook/assistants/llms/vertexai/samples/text_stream.py
@@ -11,7 +11,7 @@ def generate(project: Optional[str], location: Optional[str]) -> None:
# Load the model
model = GenerativeModel("gemini-1.0-pro-vision")
# Query the model
responses: Iterable[GenerationResponse] = model.generate_content("Who are you?", stream=True)
responses: Iterable[GenerationResponse] = model.generate_media("Who are you?", stream=True)
# Process the response
for response in responses:
print(response.text, end="")
61 changes: 48 additions & 13 deletions cookbook/playground/multimodal_agent.py
@@ -10,48 +10,83 @@
from phi.model.openai import OpenAIChat
from phi.tools.dalle import Dalle
from phi.tools.models_labs import ModelsLabs
from phi.model.response import FileType
from phi.playground import Playground, serve_playground_app
from phi.storage.agent.sqlite import SqlAgentStorage
from phi.tools.fal_tools import FalTools

image_agent_storage_file: str = "tmp/image_agent.db"

image_agent = Agent(
name="Image Agent",
name="DALL-E Image Agent",
agent_id="image_agent",
model=OpenAIChat(id="gpt-4o"),
tools=[Dalle()],
description="You are an AI agent that can generate images using DALL-E.",
instructions=[
"When the user asks you to create an image, use the `create_image` tool to create the image.",
"The image will be displayed in the UI automatically below your response, so you don't need to show the image URL in your response.",
"Politely and courteously let the user know that the image has been generated and will be displayed below as soon as its ready.",
"Don't provide the URL of the image in the response. Only describe what image was generated.",
],
markdown=True,
debug_mode=True,
add_history_to_messages=True,
add_datetime_to_instructions=True,
storage=SqlAgentStorage(table_name="image_agent", db_file="tmp/image_agent.db"),
storage=SqlAgentStorage(table_name="image_agent", db_file=image_agent_storage_file),
Reviewer comment (Contributor): Let's keep it in tmp folder as we have it in .gitignore
)

video_agent = Agent(
name="Video Agent",
agent_id="video_agent",
ml_gif_agent = Agent(
name="ModelsLab GIF Agent",
agent_id="ml_gif_agent",
model=OpenAIChat(id="gpt-4o"),
tools=[ModelsLabs(wait_for_completion=True)],
tools=[ModelsLabs(wait_for_completion=True, file_type=FileType.GIF)],
description="You are an AI agent that can generate gifs using the ModelsLabs API.",
instructions=[
"When the user asks you to create an image, use the `generate_media` tool to create the image.",
"Don't provide the URL of the image in the response. Only describe what image was generated.",
],
markdown=True,
debug_mode=True,
add_history_to_messages=True,
add_datetime_to_instructions=True,
storage=SqlAgentStorage(table_name="ml_gif_agent", db_file=image_agent_storage_file),
)

ml_video_agent = Agent(
name="ModelsLab Video Agent",
agent_id="ml_video_agent",
model=OpenAIChat(id="gpt-4o"),
tools=[ModelsLabs(wait_for_completion=True, file_type=FileType.MP4)],
description="You are an AI agent that can generate videos using the ModelsLabs API.",
instructions=[
"When the user asks you to create a video, use the `create_video` tool to create the video.",
"The video will be displayed in the UI automatically below your response, so you don't need to show the video URL in your response.",
"Politely and courteously let the user know that the video has been generated and will be displayed below as soon as its ready.",
"When the user asks you to create a video, use the `generate_media` tool to create the video.",
"Don't provide the URL of the video in the response. Only describe what video was generated.",
],
markdown=True,
debug_mode=True,
add_history_to_messages=True,
add_datetime_to_instructions=True,
storage=SqlAgentStorage(table_name="video_agent", db_file="tmp/video_agent.db"),
storage=SqlAgentStorage(table_name="ml_video_agent", db_file=image_agent_storage_file),
)

app = Playground(agents=[image_agent, video_agent]).get_app()
fal_agent = Agent(
name="Fal Video Agent",
agent_id="fal_agent",
model=OpenAIChat(id="gpt-4o"),
tools=[FalTools("fal-ai/hunyuan-video")],
description="You are an AI agent that can generate videos using the Fal API.",
instructions=[
"When the user asks you to create a video, use the `generate_media` tool to create the video.",
"Don't provide the URL of the video in the response. Only describe what video was generated.",
],
markdown=True,
debug_mode=True,
add_history_to_messages=True,
add_datetime_to_instructions=True,
storage=SqlAgentStorage(table_name="fal_agent", db_file=image_agent_storage_file),
)


app = Playground(agents=[image_agent, ml_gif_agent, ml_video_agent, fal_agent]).get_app(use_async=False)

if __name__ == "__main__":
serve_playground_app("multimodal_agent:app", reload=True)
45 changes: 45 additions & 0 deletions cookbook/tools/lumalabs_tool.py
@@ -0,0 +1,45 @@
from phi.agent import Agent
from phi.llm.openai import OpenAIChat
from phi.tools.lumalab import LumaLabTools

"""Create an agent specialized for Luma AI video generation"""

luma_agent = Agent(
name="Luma Video Agent",
agent_id="luma-video-agent",
llm=OpenAIChat(model="gpt-4o"),
tools=[LumaLabTools()], # Using the LumaLab tool we created
Reviewer comment (Contributor): unnecessary comment here
markdown=True,
debug_mode=True,
show_tool_calls=True,
instructions=[
"You are an agent designed to generate videos using the Luma AI API.",
"You can generate videos in two ways:",
"1. Text-to-Video Generation:",
" - Use the generate_video function for creating videos from text prompts",
" - Default parameters: loop=False, aspect_ratio='16:9', keyframes=None",
"2. Image-to-Video Generation:",
" - Use the image_to_video function when starting from one or two images",
" - Required parameters: prompt, start_image_url",
" - Optional parameters: end_image_url, loop=False, aspect_ratio='16:9'",
" - The image URLs must be publicly accessible",
"Choose the appropriate function based on whether the user provides image URLs or just a text prompt.",
"The video will be displayed in the UI automatically below your response, so you don't need to show the video URL in your response.",
"Politely and courteously let the user know that the video has been generated and will be displayed below as soon as its ready.",
"After generating any video, if generation is async (wait_for_completion=False), inform about the generation ID",
],
system_message=(
"Use generate_video for text-to-video requests and image_to_video for image-based "
"generation. Don't modify default parameters unless specifically requested. "
"Always provide clear feedback about the video generation status."
),
)

luma_agent.run("Generate a video of a car in the sky")
# luma_agent.run("Transform this image into a video of a tiger walking: https://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Walking_tiger_female.jpg/1920px-Walking_tiger_female.jpg")
# luma_agent.run("""
# Create a transition video between these two images:
# Start: https://img.freepik.com/premium-photo/car-driving-dark-forest-generative-ai_634053-6661.jpg?w=1380
# End: https://img.freepik.com/free-photo/front-view-black-luxury-sedan-road_114579-5030.jpg?t=st=1733821884~exp=1733825484~hmac=735ca584a9b985c53875fc1ad343c3fd394e1de4db49e5ab1a9ab37ac5f91a36&w=1380
# Make it a smooth, natural movement
# """)
22 changes: 11 additions & 11 deletions phi/agent/agent.py
@@ -28,6 +28,7 @@

from phi.document import Document
from phi.agent.session import AgentSession
from phi.model.content import Image, Video
from phi.reasoning.step import ReasoningStep, ReasoningSteps, NextAction
from phi.run.response import RunEvent, RunResponse, RunResponseExtraData
from phi.knowledge.agent import AgentKnowledge
@@ -57,9 +58,9 @@ class Agent(BaseModel):

# -*- Agent Data
# Images associated with this agent
images: Optional[List[Union[str, Dict[str, Any]]]] = None
images: Optional[List[Image]] = None
# Videos associated with this agent
videos: Optional[List[Union[str, Dict[str, Any]]]] = None
videos: Optional[List[Video]] = None

# Data associated with this agent
# name, model, images and videos are automatically added to the agent_data
@@ -573,9 +574,9 @@ def get_agent_data(self) -> Dict[str, Any]:
if self.model is not None:
agent_data["model"] = self.model.to_dict()
if self.images is not None:
agent_data["images"] = self.images
agent_data["images"] = [img.model_dump() for img in self.images]
if self.videos is not None:
agent_data["videos"] = self.videos
agent_data["videos"] = [vid.model_dump() for vid in self.videos]
return agent_data

def get_session_data(self) -> Dict[str, Any]:
@@ -588,7 +589,6 @@ def get_session_data(self) -> Dict[str, Any]:

def get_agent_session(self) -> AgentSession:
"""Get an AgentSession object, which can be saved to the database"""

return AgentSession(
session_id=self.session_id,
agent_id=self.agent_id,
@@ -632,13 +632,13 @@ def from_agent_session(self, session: AgentSession):
if "images" in session.agent_data:
images_from_db = session.agent_data.get("images")
if self.images is not None and isinstance(self.images, list):
self.images.extend(images_from_db) # type: ignore
self.images.extend([Image.model_validate(img) for img in images_from_db])
else:
self.images = images_from_db
if "videos" in session.agent_data:
videos_from_db = session.agent_data.get("videos")
if self.videos is not None and isinstance(self.videos, list):
self.videos.extend(videos_from_db) # type: ignore
self.videos.extend([Video.model_validate(vid) for vid in videos_from_db])
else:
self.videos = videos_from_db

@@ -2433,7 +2433,7 @@ def delete_session(self, session_id: str):
# Handle images and videos
###########################################################################

def add_image(self, image: Union[str, Dict]) -> None:
def add_image(self, image: Image) -> None:
if self.images is None:
self.images = []
self.images.append(image)
@@ -2442,7 +2442,7 @@ def add_image(self, image: Union[str, Dict]) -> None:
self.run_response.images = []
self.run_response.images.append(image)

def add_video(self, video: Union[str, Dict]) -> None:
def add_video(self, video: Video) -> None:
if self.videos is None:
self.videos = []
self.videos.append(video)
@@ -2451,10 +2451,10 @@ def add_video(self, video: Union[str, Dict]) -> None:
self.run_response.videos = []
self.run_response.videos.append(video)

def get_images(self) -> Optional[List[Union[str, Dict]]]:
def get_images(self) -> Optional[List[Image]]:
return self.images

def get_videos(self) -> Optional[List[Union[str, Dict]]]:
def get_videos(self) -> Optional[List[Video]]:
return self.videos

###########################################################################
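
The add_image/add_video and get_images/get_videos methods now take and return the typed Image and Video models from phi.model.content (defined later in this diff). A minimal usage sketch, with illustrative values only:

from phi.agent import Agent
from phi.model.content import Video

# Illustrative only: attach a typed Video to an agent and read it back.
agent = Agent(name="Demo Agent")
agent.add_video(Video(id="vid_1", url="https://example.com/clip.mp4", original_prompt="a horse in the desert"))
for video in agent.get_videos() or []:
    print(video.id, video.url)
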
4 changes: 3 additions & 1 deletion phi/llm/openai/chat.py
@@ -181,7 +181,9 @@ def to_dict(self) -> Dict[str, Any]:
if self.presence_penalty:
_dict["presence_penalty"] = self.presence_penalty
if self.response_format:
_dict["response_format"] = self.response_format
_dict["response_format"] = (
self.response_format if isinstance(self.response_format, dict) else str(self.response_format)
)
if self.seed is not None:
_dict["seed"] = self.seed
if self.stop:
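
The same guard appears in phi/model/openai/chat.py below. As far as the diff shows, the intent is to keep to_dict() JSON-serializable when response_format is not a plain dict (for example a Pydantic model class used for structured output). A small sketch under that assumption, using a hypothetical schema:

from pydantic import BaseModel

class MovieScript(BaseModel):  # hypothetical structured-output schema
    title: str

# Mirrors the guard above: dicts pass through, anything else is stringified.
for response_format in ({"type": "json_object"}, MovieScript):
    serialized = response_format if isinstance(response_format, dict) else str(response_format)
    print(serialized)
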
18 changes: 18 additions & 0 deletions phi/model/content.py
@@ -0,0 +1,18 @@
from typing import Optional

from pydantic import BaseModel


class Video(BaseModel):
id: str
url: str
original_prompt: Optional[str] = None
revised_prompt: Optional[str] = None
eta: Optional[str] = None


class Image(BaseModel):
id: str
url: str
original_prompt: Optional[str] = None
revised_prompt: Optional[str] = None
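
A short sketch of how these models round-trip through agent storage, matching the get_agent_data and from_agent_session changes in phi/agent/agent.py above (all values are illustrative):

from phi.model.content import Image, Video

image = Image(id="img_1", url="https://example.com/scone.png", original_prompt="a scone on a table")
video = Video(id="vid_1", url="https://example.com/clip.mp4", eta="30")

# get_agent_data() dumps the models to plain dicts for storage...
stored = {"images": [image.model_dump()], "videos": [video.model_dump()]}

# ...and from_agent_session() validates them back into typed objects.
restored_images = [Image.model_validate(img) for img in stored["images"]]
print(restored_images[0].url)
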
4 changes: 3 additions & 1 deletion phi/model/openai/chat.py
@@ -255,7 +255,9 @@ def to_dict(self) -> Dict[str, Any]:
if self.presence_penalty is not None:
model_dict["presence_penalty"] = self.presence_penalty
if self.response_format is not None:
model_dict["response_format"] = self.response_format
model_dict["response_format"] = (
self.response_format if isinstance(self.response_format, dict) else str(self.response_format)
)
if self.seed is not None:
model_dict["seed"] = self.seed
if self.stop is not None:
5 changes: 5 additions & 0 deletions phi/model/response.py
@@ -23,3 +23,8 @@ class ModelResponse:
tool_call: Optional[Dict[str, Any]] = None
event: str = ModelResponseEvent.assistant_response.value
created_at: int = int(time())


class FileType(str, Enum):
MP4 = "mp4"
GIF = "gif"
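
The playground cookbook above already parameterises ModelsLabs with this enum; a brief illustration of the two values:

from phi.model.response import FileType
from phi.tools.models_labs import ModelsLabs

# As in cookbook/playground/multimodal_agent.py: choose the output format per agent.
gif_tools = ModelsLabs(wait_for_completion=True, file_type=FileType.GIF)
video_tools = ModelsLabs(wait_for_completion=True, file_type=FileType.MP4)
print(FileType.MP4.value, FileType.GIF.value)  # -> mp4 gif
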
7 changes: 4 additions & 3 deletions phi/run/response.py
@@ -1,9 +1,10 @@
from time import time
from enum import Enum
from typing import Optional, Any, Dict, List, Union
from typing import Optional, Any, Dict, List

from pydantic import BaseModel, ConfigDict, Field

from phi.model.content import Video, Image
from phi.reasoning.step import ReasoningStep
from phi.model.message import Message, MessageReferences

@@ -48,8 +49,8 @@ class RunResponse(BaseModel):
session_id: Optional[str] = None
workflow_id: Optional[str] = None
tools: Optional[List[Dict[str, Any]]] = None
images: Optional[List[Union[str, Dict[str, Any]]]] = None
videos: Optional[List[Union[str, Dict[str, Any]]]] = None
images: Optional[List[Image]] = None
videos: Optional[List[Video]] = None
audio: Optional[Dict] = None
extra_data: Optional[RunResponseExtraData] = None
created_at: int = Field(default_factory=lambda: int(time()))
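
With RunResponse.images and RunResponse.videos now typed, downstream code can read media attributes directly. A hedged sketch reusing the Replicate video agent from the cookbook above:

# Illustrative only: run a cookbook agent and inspect the typed media on the response.
response = video_agent.run("Generate a video of a horse in the desert.")
for video in response.videos or []:
    print(video.id, video.url)
for image in response.images or []:
    print(image.id, image.url)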