First complete test run! #2

@dbuytaert

My script has reached a point where I'm ready to document the first complete run.

The captions.json file below was generated using the following command:

find . -name "*.jpg" -exec ./caption {} --time \; | jq -s '{"results": .}' > captions.json

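It generates captions for all test images using all available models and combines the results into a single JSON file: jq's -s (slurp) option collects the per-image JSON objects into one array, and the filter wraps that array under a "results" key. (You need to have jq installed for this.)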
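If jq is not available, the combining step can be approximated in Python. The following is a minimal sketch, not part of the caption script itself: it assumes the per-image output arrives as a stream of JSON objects one after another, and the function name `slurp_results` is hypothetical.

```python
import json


def slurp_results(text):
    """Parse concatenated JSON values and wrap them as {"results": [...]}.

    Roughly mirrors what `jq -s '{"results": .}'` does with its input.
    """
    decoder = json.JSONDecoder()
    results = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        results.append(value)
    return {"results": results}


# Illustrative input only: two small objects standing in for per-image output.
sample_stream = '{"image": "a.jpg"}\n{"image": "b.jpg"}\n'
combined = slurp_results(sample_stream)
print(json.dumps(combined, indent=2))
```

In practice the input text would come from the output of the find command (for example via `sys.stdin.read()`), and the result would be written to captions.json with `json.dump`.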

I plan to share a more detailed write-up and analysis in a future blog post on https://dri.es/. Until then, I might tinker with this script some more and welcome any suggestions or improvements.

Happy holidays! 🎄

{
  "results": [
    {
      "image": "./test-images/image-9.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A bird perched on top of a wooden post in front of a mountain range with a view of a mountain range and mountains.",
          "time": 10
        },
        "git": {
          "caption": "A wooden sign in front of a mountain with a statue on it that says \" meise museum \" on the top.",
          "time": 21
        },
        "blip": {
          "caption": "There is a birdhouse with a statue on top of it in the middle of a field with trees and mountains in the background.",
          "time": 7
        },
        "blip2-opt": {
          "caption": "A statue of mary on a wooden post in the middle of a field with mountains in the background.",
          "time": 34
        },
        "blip2-flan": {
          "caption": "A statue of the virgin mary sits on top of a wooden post in the middle of a mountain.",
          "time": 39
        },
        "llama32-vision-11b": {
          "caption": "A wooden shrine with a statue of mary and a signpost in front of mountains.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-8.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A man in a bathing suit is sitting on a boogie board at the water's edge with his feet propped up.",
          "time": 11
        },
        "git": {
          "caption": "A person is standing in the water and one of them is wearing a pair of shorts with the words \" i love you \" on it.",
          "time": 26
        },
        "blip": {
          "caption": "There is a man that is standing on the edge of a swimming pool with his feet in the water and a hose attached to the side of the pool.",
          "time": 7
        },
        "blip2-opt": {
          "caption": "A man in swim trunks standing on the edge of a pool.",
          "time": 26
        },
        "blip2-flan": {
          "caption": "A young boy is leaning on the railing of a dock in a body of water with his feet in the water.",
          "time": 40
        },
        "llama32-vision-11b": {
          "caption": "A person standing on a dock, with their legs submerged in water and holding onto a metal archway.",
          "time": 46
        }
      }
    },
    {
      "image": "./test-images/image-10.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A city at night with skyscrapers and a traffic light on the side of the street in front of a tall building.",
          "time": 12
        },
        "git": {
          "caption": "A busy city street is lit up at night, with the word qroi on the right side of the sign.",
          "time": 22
        },
        "blip": {
          "caption": "This is an aerial view of a busy city street at night with lots of people walking and cars on the side of the road.",
          "time": 8
        },
        "blip2-opt": {
          "caption": "An aerial view of a busy city street at night.",
          "time": 32
        },
        "blip2-flan": {
          "caption": "An aerial view of a busy street in tokyo, japanese city at night with large billboards.",
          "time": 41
        },
        "llama32-vision-11b": {
          "caption": "A bustling city street at night, with numerous billboards and advertisements lining the buildings.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-11.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A building that has a lot of windows on top of it and some cars parked on the side of the road in front of it.",
          "time": 10
        },
        "git": {
          "caption": "A laptop sits on a table in front of a cityscape, with the skyline in the background and a cityscape in the background.",
          "time": 24
        },
        "blip": {
          "caption": "There is a laptop sitting on top of a table on a balcony with a cityscape in the background and a solar panel on top of it.",
          "time": 8
        },
        "blip2-opt": {
          "caption": "A solar panel on top of a table in front of a city skyline.",
          "time": 27
        },
        "blip2-flan": {
          "caption": "A solar panel sits on top of a railing on a balcony overlooking a city skyline.",
          "time": 40
        },
        "llama32-vision-11b": {
          "caption": "A solar panel on top of a table with a city skyline in the background.",
          "time": 45
        }
      }
    },
    {
      "image": "./test-images/image-12.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A man standing on top of a boat next to another man holding a surfboard in one hand and a surfboard in the other.",
          "time": 11
        },
        "git": {
          "caption": "Two men are in a boat, one of them is wearing an orange hat and the other is wearing an orange hat.",
          "time": 21
        },
        "blip": {
          "caption": "There are two men riding on the back of a boat in the water, one of them is on a surfboard and the other is on a board.",
          "time": 8
        },
        "blip2-opt": {
          "caption": "Three young men sitting on the back of a boat.",
          "time": 26
        },
        "blip2-flan": {
          "caption": "A group of people sitting on a boat watching a man ride a surfboard in the middle of the water.",
          "time": 38
        },
        "llama32-vision-11b": {
          "caption": "Two shirtless men on a boat watching people wakeboarding or surfing behind it.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-1.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A candle is lit on a wooden table in front of a fire place with candles and other items on top of it.",
          "time": 11
        },
        "git": {
          "caption": "Two candles are lit next to each other on a table, one of them is lit up and the other is lit up.",
          "time": 23
        },
        "blip": {
          "caption": "There is a lit candle sitting on top of a wooden table next to a game board and a glass of wine on the table.",
          "time": 8
        },
        "blip2-opt": {
          "caption": "A candle sits on top of a wooden table.",
          "time": 27
        },
        "blip2-flan": {
          "caption": "A candle sits on a wooden table next to a backgammon board and a glass of wine.",
          "time": 37
        },
        "llama32-vision-11b": {
          "caption": "A dimly lit room with a wooden table, featuring a candle, a backgammon board, and other objects.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-3.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A large body of water with a bunch of birds on top of it, with a beach next to it in the distance.",
          "time": 11
        },
        "git": {
          "caption": "A large rock jutting out of the ocean is in the foreground and a couple of people are on the beach.",
          "time": 22
        },
        "blip": {
          "caption": "A group of people walking on the beach with a rock formation in the background and seagulls in the foreground.",
          "time": 8
        },
        "blip2-opt": {
          "caption": "Haystack rock, cannon beach, oregon.",
          "time": 35
        },
        "blip2-flan": {
          "caption": "Cannon beach, oregon at sunset with people on the beach and rock formations in the foreground.",
          "time": 37
        },
        "llama32-vision-11b": {
          "caption": "A large rock formation on a beach, with people standing nearby and seagulls flying overhead.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-2.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A hotel room with a view of the ocean and skyscrapers on the other side of the clock tower in the distance.",
          "time": 12
        },
        "git": {
          "caption": "A view of a city from the window of a hotel with a sunset in the background and a cityscape in the distance.",
          "time": 23
        },
        "blip": {
          "caption": "A bedroom with a view of the city from it ' s window overlooking the water and a cityscape in the distance.",
          "time": 8
        },
        "blip2-opt": {
          "caption": "A view of the city skyline through a window in a hotel room.",
          "time": 25
        },
        "blip2-flan": {
          "caption": "A bed in a room with a view of a city skyline at dusk on a cloudy day.",
          "time": 41
        },
        "llama32-vision-11b": {
          "caption": "A bedroom with a large window overlooking a city skyline at sunset or sunrise.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-6.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A man standing on top of a skateboard with his foot on the ground next to a skateboard in the grass.",
          "time": 11
        },
        "git": {
          "caption": "A person on a skateboard that has the word tron on the side of it and the word tron on the side.",
          "time": 24
        },
        "blip": {
          "caption": "There is a man that is standing on a skateboard in the grass with his feet on the board and one foot on the back of the board.",
          "time": 9
        },
        "blip2-opt": {
          "caption": "A man riding an electric scooter on a grassy field.",
          "time": 24
        },
        "blip2-flan": {
          "caption": "A man is standing on a skateboard in a grassy area with a camper in the background.",
          "time": 40
        },
        "llama32-vision-11b": {
          "caption": "A person riding an onewheel, a type of electric skateboard, with their hands on the wheel and wearing casual clothing.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-7.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A woman sitting on a ledge in front of a building with a view of the city from the balcony, looking out into the distance.",
          "time": 11
        },
        "git": {
          "caption": "A woman with red hair is standing in front of a large window that has an american flag on the wall behind her.",
          "time": 22
        },
        "blip": {
          "caption": "A woman standing in front of a window looking out on a cityscape with skyscrapers in the background.",
          "time": 7
        },
        "blip2-opt": {
          "caption": "A red - haired woman standing in front of a large window with skyscrapers in the background.",
          "time": 30
        },
        "blip2-flan": {
          "caption": "A woman is standing in front of a window looking at a city with tall buildings and skyscrapers.",
          "time": 41
        },
        "llama32-vision-11b": {
          "caption": "A woman with red hair standing in front of a large window, looking out at a cityscape.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-5.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A large group of people sitting at a table in front of a large building with a clock on the top of it.",
          "time": 11
        },
        "git": {
          "caption": "A group of people are sitting at a table in front of a building with the word sagrada familia on it.",
          "time": 23
        },
        "blip": {
          "caption": "A group of people sitting at a table with a view of the city and mountains in the background.",
          "time": 7
        },
        "blip2-opt": {
          "caption": "A group of people sitting at an outdoor table in front of the sagrada familia in barcelona, spain.",
          "time": 41
        },
        "blip2-flan": {
          "caption": "Barcelona's sagrada familia is one of the most iconic buildings in barcelona.",
          "time": 40
        },
        "llama32-vision-11b": {
          "caption": "A cityscape with a large building and people sitting at tables, possibly enjoying drinks or food.",
          "time": 43
        }
      }
    },
    {
      "image": "./test-images/image-4.jpg",
      "captions": {
        "vit-gpt2": {
          "caption": "A living room with a mirror, candles, and a vase of flowers on a table in front of a mirror.",
          "time": 11
        },
        "git": {
          "caption": "A picture frame is hanging on a wall next to a vase and a vase with the word tulips on it.",
          "time": 28
        },
        "blip": {
          "caption": "A room with a painting on the wall and two vases on the table in front of it.",
          "time": 8
        },
        "blip2-opt": {
          "caption": "A room with a painting on the wall, a picture frame, and a chandelier.",
          "time": 30
        },
        "blip2-flan": {
          "caption": "A room with a painting on the wall and a couple of framed pictures hanging on the wall next to it.",
          "time": 39
        },
        "llama32-vision-11b": {
          "caption": "An empty gold frame on a wall with ornate wallpaper, surrounded by other decorative items.",
          "time": 43
        }
      }
    }
  ]
}
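
As a starting point for analysis, the "time" values can be averaged per model. This is a minimal sketch that assumes the structure shown above ("results", then "captions", then one entry per model with a "time" field); the function name `mean_time_per_model` is hypothetical, and the unit of "time" is not stated in this issue.

```python
from collections import defaultdict
from statistics import mean


def mean_time_per_model(data):
    """Return the average "time" value for each model across all images."""
    times = defaultdict(list)
    for result in data["results"]:
        for model, info in result["captions"].items():
            times[model].append(info["time"])
    return {model: mean(values) for model, values in times.items()}


# Abbreviated sample using the vit-gpt2 and blip timings of the first two images.
sample = {
    "results": [
        {
            "image": "./test-images/image-9.jpg",
            "captions": {"vit-gpt2": {"time": 10}, "blip": {"time": 7}},
        },
        {
            "image": "./test-images/image-8.jpg",
            "captions": {"vit-gpt2": {"time": 11}, "blip": {"time": 7}},
        },
    ]
}

averages = mean_time_per_model(sample)
for model, avg in sorted(averages.items(), key=lambda item: item[1]):
    print(f"{model}: {avg}")
```

To run this on the full file, load it with `json.load(open("captions.json"))` and pass the result to the function.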
