CLIP Interrogator lets you submit an image and have it work out what prompt could have been used to create it. CLIP has been used with various models in the past to give the AI feedback on whether the image being generated closely matches the prompt. One of my earliest ventures into the AI space was with VQGAN-CLIP, which used CLIP for exactly this kind of feedback.
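To give a feel for what "closely matching" means here, below is a toy sketch of a similarity score between an image and a few candidate phrases. It assumes the image and each phrase have already been turned into vectors. The three-number vectors are made up for illustration and are not real CLIP embeddings, and the function names are my own, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means 'pointing the same way'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_phrases(image_vec, phrase_vecs):
    """Return the candidate phrases ordered from best to worst match for the image."""
    scored = [(cosine_similarity(image_vec, vec), phrase)
              for phrase, vec in phrase_vecs.items()]
    return [phrase for _, phrase in sorted(scored, reverse=True)]

# Made-up vectors standing in for embeddings (assumption, for illustration only).
image_vec = [0.9, 0.1, 0.3]
phrase_vecs = {
    "a watercolor painting of a sea turtle": [0.8, 0.2, 0.4],
    "a photo of a motorcycle": [0.1, 0.9, 0.2],
    "a futuristic city": [0.2, 0.3, 0.9],
}

ranked = rank_phrases(image_vec, phrase_vecs)
print(ranked[0])  # the phrase whose vector is closest to the image vector
```

A tool that turns an image back into a prompt could, in principle, score a large list of phrases this way and keep the best ones. Treat that as a simplification rather than a description of how CLIP Interrogator is actually built.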
@pharmapsychotic is well known for his page of resources for AI experimentation and exploration. Check out his Tools page for a great collection of resources.
He recently released this tool on Hugging Face, where you can submit a URL or upload an image and the CLIP Interrogator will analyse your image and give you a prompt that may have been used to create it.
One of the example images, the turtle, demonstrates how CLIP Interrogator evaluates an image in detail. The prompt produced is something like this:
a watercolor painting of a sea turtle, a digital painting, by Kubisi art, featured on dribbble, medibang, warm saturated palette, beautiful painting of a tall, image overlays, high saturation colors, beautifully rendered, digital art h 9 6 0, green and red, wet – on – wet technique
Although most of it makes sense, I’m not sure what ‘h 9 6 0’ represents... is it the height of the image? ‘Beautiful painting of a tall’ also seems a bit irrelevant here, but we are off to a start. Let’s look at some other images.
Result: a man standing in front of a futuristic city, a detailed matte painting, by jessica rossier, movie still of the alien girl, sensory processing overload, puzzle-like room, anamorphic widescreen, bright blue future, wide angle portrait of astroboy, blueprint
Original prompt: a boy standing in front of a giangantic empty spaceship outside, cyberpunk world futuristic, otherworldly sea green lights, unreal engine, octane render, concept art by Michael Whelan, high contrast, highly detailed, cinematography
The opening description is quite similar, except for man vs boy. The references to a movie still and anamorphic widescreen describe the cinematography. The artist named is not the same as in the original. Comparing the two, the colours interpreted are also not exactly the same.
Result: a futuristic city with a motorcycle in the foreground, a detailed matte painting, by Jan Tengnagel, colors of tron legacy, the scooter ( edm band, editorial illustration colorful, panoramic anamorphic, 8 0 – s, gateway to futurisma, featured on artstattion, black light, animatic
Original Prompt: a retro tron motorcycle in front of a distant futuristic city, cyberpunk, wide angle, distant cityscape, trending on artstation
Having compared the two, you can see the interpretation is pretty good, and you could use either of these prompts to get a very similar result.
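If you want something a little more concrete than eyeballing the two prompts, here is a rough sketch that counts the words they share. It is only a crude word-overlap check of my own, not anything CLIP Interrogator does, and the stopword list is an arbitrary choice. The result prompt is shortened to its first few phrases to keep the example small.

```python
import re

# Small, arbitrary stopword list (assumption) so filler words don't count as matches.
STOPWORDS = {"a", "in", "of", "the", "with", "on", "by"}

def content_words(prompt):
    """Lowercase the prompt and return its set of words, minus stopwords."""
    return set(re.findall(r"[a-z]+", prompt.lower())) - STOPWORDS

def prompt_overlap(prompt_a, prompt_b):
    """Return the shared words and a Jaccard score (shared words / all distinct words)."""
    words_a, words_b = content_words(prompt_a), content_words(prompt_b)
    shared = words_a & words_b
    return shared, len(shared) / len(words_a | words_b)

# First few phrases of the interrogator result for the motorcycle image.
result = ("a futuristic city with a motorcycle in the foreground, "
          "a detailed matte painting, by Jan Tengnagel, colors of tron legacy")
original = ("a retro tron motorcycle in front of a distant futuristic city, "
            "cyberpunk, wide angle, distant cityscape, trending on artstation")

shared, score = prompt_overlap(result, original)
print(sorted(shared))
```

The shared words are the subject of the image (city, motorcycle, tron, futuristic), while the style and artist phrases differ. That fits with what you see reading the two prompts side by side.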
For another example, I grabbed this image of mine, which I tweeted some time ago, and ran it through CLIP Interrogator.
I simply right-clicked to get the image URL and fed it to the CLIP Interrogator, which spat out the prompt below.
Result: a strange looking creature with blue eyes, a character portrait, by lovecraft, zbrush central, ornate tentacles growing around, old humanoid ents, photo of cthulhu, anchor goatee, detailled face, ability image, post – apokalyptic, aquarius
Original Prompt: cthulhu profile picture, highly detailed, portrait, intricate ornate psychedelic, digital painting, trending on artstation, concept art, sharp focus, illustration, by lovencraft, art by artgerm
The CLIP Interrogator does a pretty good job of understanding the image, which gives you an idea of how to structure a prompt. I do feel that this tool could be used to re-create images similar to those of other AI artists or to copy their style, so I urge you to use it as a learning tool to build your knowledge and understanding of how to write a good descriptive prompt that an AI text-to-image model can use to bring your vision to life.
Since creating this Hugging Face Space, pharma has also created a Google Colab version, in case you prefer to use that.