On August 1, 2024, Black Forest Labs released three new Flux.1 models: Pro, Dev and Schnell. The Pro version is not open source and is only available through their API, but Dev and Schnell both have openly downloadable weights on their Hugging Face page.
Dev is a higher-quality model than Schnell, but Schnell is much faster (it needs only 4 steps). These are big models, though: each one weighs a whopping 23.8GB, and they need a lot of VRAM to run. It is recommended that you have 32GB RAM.
However, don’t be sad, because there is a way to run them on GPUs with less VRAM. I have an RTX 4080 with 16GB and I can run both Dev and Schnell. The only difference is that Dev takes about 3 minutes to generate a 1024px by 1536px image, while Schnell takes only 30-40 seconds to generate the same.
The buzz at the moment is that these models are on par with Midjourney, and after my own testing I would go further and say they are much better. They are better in many respects, actually:
- Resolution – the model can handle any image size you want, from extremely wide to extremely tall; there is no set resolution that you have to adhere to
- Prompts – it is much better at following the prompt and picking up its various nuances
- Quality – quality is much higher, even in this initial release. Hands are better formed, composition is almost always spot on and facial features are well defined
- Text – it renders text better than any other model out there, including SD3
Most importantly, it doesn’t apply its own recipe or sauce to make your image better, so it stays as close to your prompt as possible. Midjourney, by contrast, always adds its own influence to the image to make it look better, which can often make it hard to control the result with just a text prompt.
Download
To run these models, you need ComfyUI (updated to the latest version) and the following files.
- Model: Flux1-Schnell or Flux1-Dev (you need to agree to the terms). Files are 23.8GB each
- VAE: AE (its own VAE). File is 335MB
- Clip1: T5xxl_fp8_e4m3fn (for under 32GB VRAM) or T5xxl_fp16 (32GB or above VRAM)
- Clip2: Clip_l
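The Clip1 choice in the list above comes down to a single threshold, which can be sketched in a few lines of Python. This is only an illustration: the function name `pick_t5_encoder` and its `memory_gb` parameter are hypothetical and not part of ComfyUI; the 32GB cut-off and the two encoder names are taken straight from the list.

```python
def pick_t5_encoder(memory_gb: float, threshold_gb: float = 32.0) -> str:
    """Return the T5 encoder name suggested in the download list.

    Hypothetical helper for illustration only; not a ComfyUI function.
    """
    if memory_gb < 0:
        raise ValueError("memory_gb must be non-negative")
    # 32GB or above: the larger fp16 encoder; below that: the smaller fp8 one.
    if memory_gb >= threshold_gb:
        return "T5xxl_fp16"
    return "T5xxl_fp8_e4m3fn"


if __name__ == "__main__":
    for gb in (16, 24, 32, 48):
        print(f"{gb}GB -> {pick_t5_encoder(gb)}")
```

With my 16GB card, this would point to the fp8 file.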
Place the Model in the models\unet folder, the VAE in models\vae and both Clip files in the models\clip folder of your ComfyUI directory. Then restart ComfyUI and refresh your browser.
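If you want to double-check that everything landed in the right place, here is a minimal sketch of a checker script. It makes some assumptions: the `REQUIREMENTS` table and the `find_missing` function are my own illustration, the file names are lower-cased versions of the names listed above, and the match ignores the file extension because your downloads may be named slightly differently. The `"ComfyUI"` path at the bottom is a placeholder for your own install location.

```python
from pathlib import Path

# (folder relative to the ComfyUI root, acceptable file name stems).
# Any one stem per row is enough. The stems are assumptions based on the
# names in this post; adjust them if your downloaded files differ.
REQUIREMENTS = [
    ("models/unet", ("flux1-dev", "flux1-schnell")),
    ("models/vae", ("ae",)),
    ("models/clip", ("t5xxl_fp8_e4m3fn", "t5xxl_fp16")),
    ("models/clip", ("clip_l",)),
]


def find_missing(comfy_root):
    """Return a list of human-readable notes for files that were not found."""
    root = Path(comfy_root)
    missing = []
    for folder, alternatives in REQUIREMENTS:
        directory = root / folder
        present = set()
        if directory.is_dir():
            # Compare on the lower-cased stem so the extension does not matter.
            present = {p.stem.lower() for p in directory.iterdir() if p.is_file()}
        if not any(name in present for name in alternatives):
            missing.append(f"{folder}: need one of {', '.join(alternatives)}")
    return missing


if __name__ == "__main__":
    problems = find_missing("ComfyUI")  # placeholder path, change to yours
    print("\n".join(problems) if problems else "All files found.")
```

An empty result means all four files were found; otherwise each line names the folder that is still missing a file.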
The default workflows are provided by ComfyAnonymous on their GitHub page.
My adapted workflows are available for download as well. I provide two workflows, Text 2 Image and Image 2 Image. Just drag the PNG files from the zip into ComfyUI and install any missing nodes using ComfyUI Manager.
Flux.1 Txt2Img and Img2Img Workflows
My Image to Image workflow uses Florence 2 and Clip Interrogator (I got the original version online from somewhere I can’t recall) to generate an accompanying prompt that helps guide Flux. So you have the image influencing the generation plus the text prompt, and together they make the result super!
Sample Results
It’s been a wonderful breath of fresh air to get a model that can produce such high-quality, coherent results, and it has kicked off the month of August with a bang. I wonder what other excitement is awaiting us next. For my part, I keep exploring Flux and have already downgraded my Midjourney subscription. Is it time to ditch Midjourney altogether? We will see.