imgcaption
A simple Python script that generates descriptive captions for images, using the Hugging Face Transformers library with the Salesforce/blip-image-captioning-base
model. It is essentially a thin wrapper around that library.
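As a rough illustration of what such a wrapper involves, here is a minimal sketch, not the script's actual code. The function names `load_blip` and `caption_image` and the default `max_new_tokens=30` are assumptions made for this example; `BlipProcessor` and `BlipForConditionalGeneration` are the Transformers classes for this model family.

```python
MODEL_NAME = "Salesforce/blip-image-captioning-base"


def load_blip(model_name=MODEL_NAME):
    """Load the BLIP processor and model (downloaded on first use)."""
    # Imported lazily so the heavy dependency is only needed when loading.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)
    return processor, model


def caption_image(image, processor, model, max_new_tokens=30):
    """Return a caption for an already-opened RGB image.

    The name and the max_new_tokens default are illustrative assumptions.
    """
    # Convert the image into the tensors the model expects.
    inputs = processor(images=image, return_tensors="pt")
    # Generate caption token ids, then decode them back into text.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    caption = processor.decode(output_ids[0], skip_special_tokens=True)
    return caption.strip()
```

With Pillow installed, a typical call would open the file with `Image.open(path).convert("RGB")`, then pass the result to `caption_image` together with the objects returned by `load_blip()`. Keeping the loading step separate means the model is loaded once and reused across many images.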
Inspired by:
- Building an Image Captioning Model Using Salesforce's BLIP Model - by Pranav Kumar - Medium
- What is Image Captioning and How to use Python to Generate Caption from an Image - by Jim Wang - Medium
- An Introduction to Image Captioning with BLIP - Aggregata
Usage
Important: this has been tested with Python 3.12.4 on Windows 10.
Set up a Python venv (virtual environment) and install some packages inside it:
Then you can use the script like this: