This repository contains supplementary material for the paper "FAIntbench: A Holistic and Precise Benchmark for Bias Evaluation in Text-to-Image Models".
- Clear and robust definition. We compiled and refined existing definitions of bias in T2I models into a comprehensive framework that effectively distinguishes and assesses various types of biases.
- Large prompt dataset. FAIntbench includes a dataset of 2654 prompts: 1969 occupation-related prompts, 264 characteristic-related prompts and 421 social-relation-related prompts.
- Multi-dimensional evaluation metric. Our evaluation metrics for generative bias cover six dimensions, four levels and the manifestation factor $\eta$ for each model.
- Stable Cascade
- Stable Diffusion XL
- Stable Diffusion XL Turbo
- Stable Diffusion XL Lightning
- PixArt Sigma
For each prompt, we generated at least 400 images with each T2I model we chose. Depending on generation speed, some models have 800 images per prompt (e.g. SDXL Turbo).
We used our algorithm to evaluate each T2I model we chose and to calculate its implicit bias, explicit bias and manifestation factor. The results are shown in the following figure:

Prerequisites are the same as those of CLIP and the models you use. The following are some useful links and tips:
- First, use the prompts provided in the prompt folder to generate images. 200 images per prompt are sufficient for accurate results. Store the results with the folder structure: model_name/prompt_name/image.png
- Second, using the images generated in step 1 as input, run the CLIP script "1_img2metajson.py" provided in the preprocess folder. This script outputs a JSON file containing the bias data of each model.
- Third, using the JSON file generated in step 2 as input, run "2_optimize.py" provided in the preprocess folder, which generates optimized JSON files.
- Finally, set the JSON path in each of the three scripts under eval (implicit, explicit and eta) and run them to get the results.
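
Before running the preprocessing scripts, it can help to confirm that the generated images follow the model_name/prompt_name/image.png layout from step 1 and that each prompt has enough images. The following is a minimal sketch, not part of this repository: the function names `count_images` and `underfilled` and the `MIN_IMAGES` constant are hypothetical, and the script assumes all images are stored as `.png` files.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: 200 is the per-prompt image count suggested in step 1.
MIN_IMAGES = 200


def count_images(root):
    """Return {model_name: {prompt_name: png_count}} for root/model_name/prompt_name/*.png."""
    counts = defaultdict(dict)
    for model_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for prompt_dir in sorted(p for p in model_dir.iterdir() if p.is_dir()):
            # Count only .png files; other files in the folder are ignored.
            n = sum(1 for f in prompt_dir.iterdir() if f.suffix.lower() == ".png")
            counts[model_dir.name][prompt_dir.name] = n
    return dict(counts)


def underfilled(counts, minimum=MIN_IMAGES):
    """List (model, prompt, count) triples whose image count is below the minimum."""
    return [
        (model, prompt, n)
        for model, prompts in counts.items()
        for prompt, n in prompts.items()
        if n < minimum
    ]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the directory that contains the model folders.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for model, prompt, n in underfilled(count_images(root)):
        print(f"{model}/{prompt}: only {n} images")
```

Running the script with the image root directory as its only argument prints every model/prompt pair that falls short of the threshold, so missing generations can be filled in before step 2.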




