This project aims to develop a solution for post-production editing of object poses in images using generative AI. The tasks include:
- Object Segmentation: Segment the object specified by a user prompt within an image.
- Pose Editing: Rotate the segmented object based on user-defined azimuth and polar angles, preserving the original background and scene realism.
- Goal: Identify and mask the object in an image based on a class prompt.
- Input: Image and class name.
- Output: Image with the segmented object highlighted with a red mask.
python t1.py --image ./example.jpg --class "chair" --output ./segmented_output.png
- Goal: Change the pose of the segmented object by adjusting its azimuth and polar angles.
- Input: Image, class name, and angle adjustments (azimuth and polar).
- Output: Image with the object's pose edited according to the specified angles.
python t2.py --image ./example.jpg --class "chair" --azimuth +72 --polar +0 --output ./pose_edited_output.png
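To make the two angles concrete, here is a minimal sketch of how an azimuth and a polar adjustment could be combined into one 3D rotation. The axis convention (azimuth about the vertical y-axis, polar about the horizontal x-axis, applied in that order) and the function names are assumptions made for illustration; the convention used by `t2.py` is not stated here.

```python
import math

def rotation_from_angles(azimuth_deg, polar_deg):
    """Return a 3x3 rotation matrix as nested lists.

    Assumed convention: rotate by azimuth about the y-axis first,
    then by polar about the x-axis, i.e. R = R_x(polar) @ R_y(azimuth).
    """
    az = math.radians(azimuth_deg)
    po = math.radians(polar_deg)
    ca, sa = math.cos(az), math.sin(az)
    cp, sp = math.cos(po), math.sin(po)
    # Product R_x(polar) @ R_y(azimuth), written out element by element.
    return [
        [ca, 0.0, sa],
        [sp * sa, cp, -sp * ca],
        [-cp * sa, sp, cp * ca],
    ]

def rotate_point(matrix, point):
    """Apply a 3x3 matrix to a 3D point given as (x, y, z)."""
    return tuple(sum(m * p for m, p in zip(row, point)) for row in matrix)
```

With this convention, `--azimuth +72 --polar +0` corresponds to a pure rotation of 72 degrees about the vertical axis.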
Install the required dependencies using the following command:
pip install -r requirements.txt
To run the object segmentation script:
python t1.py --image ./path_to_image.jpg --class "object_class_name" --output ./output_image.png
To run the pose editing script:
python t2.py --image ./path_to_image.jpg --class "object_class_name" --azimuth +degrees --polar +degrees --output ./output_image.png
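For readers who want to see how these flags might be read, the following is a rough sketch using the standard `argparse` module. It is not the contents of `t2.py`: the `build_parser` helper, the `class_name` attribute, and the choice of which flags are required or default to 0 are assumptions.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Rotate a segmented object in an image.")
    parser.add_argument("--image", required=True, help="Path to the input image.")
    # "class" is a reserved word in Python, so the value is stored as class_name.
    parser.add_argument("--class", dest="class_name", required=True, help="Object class to edit.")
    # type=float accepts signed values such as "+72" or "-30".
    parser.add_argument("--azimuth", type=float, default=0.0, help="Azimuth change in degrees.")
    parser.add_argument("--polar", type=float, default=0.0, help="Polar change in degrees.")
    parser.add_argument("--output", required=True, help="Path for the edited image.")
    return parser

# Parse the example command from this README without touching sys.argv.
args = build_parser().parse_args([
    "--image", "./example.jpg",
    "--class", "chair",
    "--azimuth", "+72",
    "--polar", "+0",
    "--output", "./pose_edited_output.png",
])
print(args.class_name, args.azimuth, args.polar)  # prints: chair 72.0 0.0
```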
A pre-trained segmentation model, such as the Segment Anything Model (SAM), identifies and masks the object in the image based on the input class. The segmented object is highlighted with a red mask to indicate successful identification.
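The red highlight itself can be sketched independently of the segmentation model. The example below assumes the model has already produced a boolean mask of shape (H, W) for an RGB `uint8` image of shape (H, W, 3); the `overlay_red_mask` name and the `alpha` blending parameter are illustrative and not taken from `t1.py`.

```python
import numpy as np

def overlay_red_mask(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend pure red into the masked pixels of an RGB uint8 image.

    image: (H, W, 3) uint8 array; mask: (H, W) boolean array.
    alpha=1.0 paints the object solid red, smaller values keep it visible.
    """
    out = image.astype(np.float32)  # astype copies, so the input is left untouched
    red = np.array([255.0, 0.0, 0.0], dtype=np.float32)
    out[mask] = (1.0 - alpha) * out[mask] + alpha * red
    return out.round().astype(np.uint8)
```

Pixels outside the mask are returned unchanged, so the background in the output matches the input exactly.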
After segmenting the object, its pose is modified by rotating it according to the user-provided azimuth and polar angles. The edited scene preserves the background and ensures that the object rotation looks natural.
Challenges & Solutions
Solution: Used fine-tuned pre-trained models and tested them across different images to refine boundary detection.
Solution: Employed advanced 3D rotation techniques to adjust object poses while ensuring the background remains intact.
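One simple way to keep the background intact is to copy only the rotated object's pixels back into the scene. The sketch below assumes the rotated object has already been rendered into an image of the same size as the scene, with a boolean mask marking its pixels, and that the area vacated by the original object has been filled in beforehand. The `composite_object` name and its arguments are hypothetical, not part of this project's code.

```python
import numpy as np

def composite_object(background: np.ndarray, rendered: np.ndarray, object_mask: np.ndarray) -> np.ndarray:
    """Paste the rendered object onto the background where object_mask is True.

    background, rendered: (H, W, 3) arrays of the same dtype.
    object_mask: (H, W) boolean array for the rotated object.
    """
    if background.shape != rendered.shape:
        raise ValueError("background and rendered must have the same shape")
    out = background.copy()  # leave the caller's background array unmodified
    out[object_mask] = rendered[object_mask]
    return out
```

Because only masked pixels are overwritten, every background pixel outside the rotated object is carried over unchanged.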
Sample results from both tasks can be found:
- Segmented objects: Red mask over the identified object.
- Pose-edited objects: Object rotated according to the user-specified angles while preserving the background.
- Improved Boundary Detection: Further refine the segmentation model to handle more complex scenes with multiple objects.
- Advanced Pose Editing: Incorporate more sophisticated 3D transformation techniques to enhance realism in object pose manipulation.
- Failure Analysis: Conduct deeper analysis on failure cases and provide solutions to handle edge cases better.