High Memory and VRAM Usage When Processing Images #18

@LytsingStudio

As the title says, processing a 4096x4096 image with the parameters below is extremely slow and needs more than 48GB of VRAM and more than 128GB of system memory.
I am using an RTX 4090 with 48GB of VRAM, and the test never completes because it fails with an OOM (out of memory) error. Is this behavior expected?

parser.add_argument("--pretrained_model_name_or_path", type=str, default="./stable-diffusion-3-medium-diffusers/", required=False, help='path to the pretrained sd3')
parser.add_argument("--lora_dir", type=str, default="checkpoint/tsdsr", help='path to tsd-sr lora weights')
parser.add_argument("--embedding_dir", type=str, default="dataset/default/", help='path to prompt embeddings')
parser.add_argument("--output_dir", '-o', type=str, default="outputs/", help='path to save results')
parser.add_argument('--input_dir', '-i', type=str, default="inputs4/", required=False, help='path to the input image')

parser.add_argument("--rank", type=int, default=64, help='rank for transformer')
parser.add_argument("--rank_vae", type=int, default=64, help='rank for vae')

# Note: argparse's type=bool converts any non-empty string (including "False") to True, so tiling stays enabled here.
parser.add_argument("--is_use_tile", type=bool, default=True, help='whether to use tiled vae')
parser.add_argument("--vae_decoder_tiled_size", type=int, default=1024, help='tiled size for tiled vae decoder')
parser.add_argument("--vae_encoder_tiled_size", type=int, default=1024, help='tiled size for tiled vae encoder')
parser.add_argument("--latent_tiled_size", type=int, default=1024, help='tiled size for transformer latent')
parser.add_argument("--latent_tiled_overlap", type=int, default=32, help='tiled overlap for transformer latent')

parser.add_argument("--device", type=str, default="cuda")
parser.add_argument("--seed", type=int, default=42)
parser.add_argument("--upscale", type=int, default=4, help='upscale factor')
parser.add_argument("--process_size", type=int, default=512, help='process size for images')
parser.add_argument("--mixed_precision", type=str, choices=['fp16', 'fp32'], default="fp16")
parser.add_argument("--align_method", type=str, choices=['wavelet', 'adain', 'nofix'], default='wavelet', help='color alignment method')
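
For scale, here is a rough back-of-the-envelope sketch of the tensor sizes these defaults imply. It assumes that 4096x4096 is the input resolution and that `--upscale 4` is applied to it, and it assumes an SD3-style VAE with 8x spatial downsampling and 16 latent channels. The helper `tensor_gib` is hypothetical and not part of the script; nothing here is measured from the actual run.

```python
# Back-of-the-envelope sizes. Every constant below is an assumption, not a measurement.

def tensor_gib(channels: int, height: int, width: int, bytes_per_element: int) -> float:
    """Size in GiB of one dense (channels, height, width) tensor."""
    return channels * height * width * bytes_per_element / 2**30

input_side = 4096       # from the report, assumed to be the input resolution
upscale = 4             # --upscale default
vae_downsample = 8      # assumed spatial factor of the SD3 VAE
latent_channels = 16    # assumed latent channel count of the SD3 VAE

output_side = input_side * upscale            # side length of the upscaled image
latent_side = output_side // vae_downsample   # side length of the full latent

rgb_fp16 = tensor_gib(3, output_side, output_side, 2)
rgb_fp32 = tensor_gib(3, output_side, output_side, 4)
latent_fp16 = tensor_gib(latent_channels, latent_side, latent_side, 2)

print(f"output image: {output_side}x{output_side}")
print(f"one RGB tensor, fp16: {rgb_fp16} GiB")
print(f"one RGB tensor, fp32: {rgb_fp32} GiB")
print(f"one latent tensor, fp16: {latent_fp16} GiB")
```

These figures cover a single full-resolution tensor only. Under the assumptions above, each full-frame fp32 copy of the output costs about 3 GiB, so a few such copies (for example while stitching tiles or aligning colors) would add up quickly. Whether the script actually keeps such copies is not confirmed here.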
