Midjourney has been at the cutting edge of text-to-image AI models, and for a little over a year their models have been lifting the game every few months. The current model, v4, can produce some amazing results.
So far, the biggest challenges for most AI models have been drawing hands and controlling the number of teeth, so the expectation is that the next Midjourney model, v5, will address both. Midjourney recently released two new pages where paying subscribers can rate “two distinct images” and “two similar images”. The idea is that you pick the image you like.
Although the team at Midjourney said that these images do not represent the final v5 model, it is obvious that they were generated with a model much more advanced than the current Midjourney v4. The level of detail, contrast, depth and clarity being achieved is certainly next level, and if the next model is as good as these samples or better, then that is an amazing step forward. I spent some hours reviewing and rating images, and I captured a few to show you how this new model will improve image quality when it's released.
Teeth have been quite challenging for most AI image generation models available until now. You would often get multiple rows of teeth, or simply too many teeth, in the resulting image. Looking at the samples, the teeth appear to be getting as close to real as possible. Of course I only have a handful of samples, but you be the judge.
Another very challenging area for AI image generation has been orienting hands correctly and drawing the right number of fingers. This was starting to be addressed with a custom fine-tuned model that we shared earlier on the blog in the post Protogen x3.4 for Stable Diffusion. However, if the early samples from Midjourney v5 are anything to go by, then this is improving 10-fold. See for yourself!
Faces have not been a challenge in the same way, but there was still room for improvement: they often lacked detail and texture, and at times AI-generated humans had artefacts that gave away that this was not a real photo. The eyes were sometimes off, the iris colour could differ between eyes, and the skin was too smooth, almost porcelain-like.
My initial thought is that the detail and realism in the Midjourney v5 images of people are now approaching photorealism. Can it get any better from here, or is this the pinnacle of what we'd expect from an AI model that creates human images? You be the judge and let me know your thoughts.
Not only has there been a HUGE improvement in the areas where AI-generated images were lacking, this next release, Midjourney v5, is also increasing the amount of detail in the images it produces. Is this the end of stock photography? Let me know your thoughts in the comments below.