Stable Diffusion V2: WHAT HAVE THEY DONE!?
Learn more about Stable Diffusion 2's new features here.
Stable Diffusion version 2 is here, and it includes a number of groundbreaking advancements.
Stable Diffusion's Improvements
The most noticeable improvement is the increase in coherence, detail, and complexity.
And this is down to a newly trained model.
This model is trained on an aesthetic subset of the image database.
In other words, they've chosen the more attractive images to train the model on.
Here’s an example comparing version one and version two.
You can see that the detail is noticeably increased, as is the coherence.
What is even more significant is that they’ve removed all of the not-safe-for-work content.
And they’ve also removed celebrities from the database.
But hey, I'd sacrifice a pair of knockers for a nice bunny rabbit, a pair of Ray-Bans, and a sock.
There is also a super-resolution upscaler included, which allows you to upscale images by up to four times.
You can now generate images that are up to 2048 pixels by 2048 pixels.
A big new feature is that you can now generate images natively at both 512 by 512 pixels and 768 by 768.
The feature I am most excited about is the depth-to-image model that's included.
They say depth-to-image infers the depth of an input image, which it does by understanding where objects in the image are in relation to each other.
So how far away are they from the viewer?
It then generates new images using both the text prompt and the depth information.
So it’s combining its understanding of the content and also the positioning of the elements inside of an image.
And this gives us a whole host of new possibilities especially I would imagine in the realm of animation.
Here’s an example of a little red man on a lectern being turned into a pineapple man or a little baby who is decreeing peace and asking humans to enact free milk for all.
But the key thing here is that it allows you to create structure-preserving images.
There’s an updated in-painting diffusion model which makes it super easy to switch out parts of an image intelligently and quickly.
And most importantly stable diffusion is still running on a single GPU.
You can install it yourself using Hugging Face and a web UI.
They say it will be available in DreamStudio in a few days.
Stable Diffusion's Interface
But where Stable Diffusion falls down for me is the lack of a coherent and intuitive user interface.
Many people like using the AUTOMATIC1111 interface, but for me it is overburdened with features thrown in your face; it does not create a streamlined, aesthetic, and artistic experience for a creative to hone their craft.
It's very much a fantastic tool that needs to be harnessed; it needs to be clothed.
It’s like we’ve got this naked rugged gorilla in front of us and someone needs to train it in table manners.
There is some interesting commentary from users on Reddit.
GambAntonio asks what will be next for Stable Diffusion.
Will they remove weapons?
Will they remove people showing their feet?
They’re commenting on the fact that Stable Diffusion first came out with no content restrictions.
And now they’re backtracking on that initial ethos and removing the adult content.
Users are noting that it’s now difficult to imitate certain artists’ styles.
One user on Discord asked, "what have you done to Greg?", referring to trying to get an image in the style of Greg Rutkowski, whose name has been used widely in AI art prompts.
Some people are also commenting that anatomy has taken a step backward, potentially because of the reduction of nudity in the image models.
Potential Legal Issues
People are speculating that these changes are due to potential legal issues facing Stability AI in relation to adult content and the use of artistic styles.
So we're not only seeing a huge step forward here in the technical abilities of Stable Diffusion 2.0.
But we're also seeing a slight course change in the ethos of the platform, which initially championed an open-source, unrestricted approach.
It’s now taking a much more restricted approach though Emad does deny this.
He said in an interview with The Verge that there has been no specific filtering of artists here.
Emad has gone on to say on Discord that this is properly open and interoperable.
"Midjourney and DALL·E are amazing, but you can't tell what data is in them. As folks fine-tune stuff back in, we can learn from that to make the next releases better."
So he is still claiming that there is a lot of value in his open-source approach and the reverse-engineering possibilities of Stable Diffusion.
And he's taking a slight dig at Midjourney and DALL·E.
He's also said that it makes sense to remove as much as possible, because it's easy to add new images to the dataset but hard to remove them.
Stable Diffusion's Image Quality
Stable Diffusion 2.0 takes a step forward in the quality of images coming out.
It’s increasing the resolution of the images.
And it's offering depth-to-image, which allows us to keep the positioning of objects inside of the frame coherent.
It still works on a single GPU and the other huge difference is the data set that it’s been using.
It's removing not-safe-for-work content, it's removing celebrities, and there are reports that it's removing details related to certain artists.
Stable Diffusion is charting a course with added features and more restrictions.
The possibilities are exciting and I for one can’t wait.
Let me know what you think of Stable Diffusion 2.0.
Thank you for watching.
I hope you enjoyed this video, if you did why not stick around for more videos on AI art?
Have a delightful day.
If you enjoyed this article, make sure to check out our other articles below.