
I'm not the guy who came up with this style! I just vacuum up catboxes and do a lot of tests/repros. I'm not at my main computer so I can't give you exact gen parameters, but the cliff notes version is:

  - DPM++ 2M K or DPM++ SDE K w/ a high eta are the best samplers in general, the former was used here
  - search seed space at whatever res is reasonable (512x704) before upscaling
  - once you have a good seed, hires fix with UltraSharp or Remacri @ 0.5-0.6 denoise (I prefer the former, ymmv)
  - do a second diffusion pass in i2i on top of the hires, but with a simpler prompt focusing on high quality, e.g. "gorgeous anime wallpaper, beautiful colors, impeccable lineart"
  - for the second pass (this will be taking you from ~1k x 1.5k to 2.5k x 4k) you're gonna run out of VRAM if you try to diffuse traditionally, so use the Tiled Diffusion addon with Tiled VAE enabled as well to conserve VRAM. DDIM seems to work best here, though I've gotten good results with the two samplers above as well.
  - using a style finetune helps a lot, Cardos Anime was used here
  - when in doubt, search Civitai or Hugging Face; there are tons of great models/LoRAs and textual inversions out there, and if you have anything specific in mind, having some form of finetune helps a ton
Obviously you're going to need to know how to prompt as well, which is definitely not something I can explain in a single post. Just like any kind of art, you have to practice a bunch to gain an intuition for it.
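
To make the resolution steps above concrete, here is a rough sketch of the arithmetic only. It does not call any Stable Diffusion tooling. The scale factors (2.0 for the hires fix, 2.5 for the second i2i pass) and the function names are assumptions chosen to land near the sizes mentioned in the post, not the actual settings used.

```python
def round_to_multiple(value, multiple=8):
    """Round to the nearest multiple of 8 (SD latents are 1/8 of pixel size)."""
    return int(round(value / multiple)) * multiple


def plan_passes(base=(512, 704), scales=(2.0, 2.5)):
    """Return the image size after each pass, starting from the seed-search size.

    `scales` is a hypothetical pair: hires fix factor, then second-pass factor.
    """
    sizes = [base]
    w, h = base
    for s in scales:
        w, h = round_to_multiple(w * s), round_to_multiple(h * s)
        sizes.append((w, h))
    return sizes


passes = plan_passes()
for name, (w, h) in zip(("seed search", "hires fix", "second pass"), passes):
    print(f"{name}: {w}x{h}")
```

With these assumed factors the hires fix lands at 1024x1408 and the second pass at 2560x3520, which is in the same ballpark as the "~1k x 1.5k to 2.5k x 4k" figures above.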

P.S. I've recently started a patreon, if any of you'd like to support my work on this stuff. I'm a big believer in sharing all my knowledge, so most of it will come out for free eventually, but I gotta eat. [0]

[0] https://www.patreon.com/thot_experiment



> gotta eat

Yeah :) Thanks for your infodump! I'm mostly curious how you achieve the insane amount of "clutter" in these images - is it done by referencing a specific artist or style in the prompt, or just by some difficult-to-find key phrase? I haven't been able to get anything near.


I think it's the tiled diffusion second/third pass that does it. You're essentially doing many gens on individual tiles of the image, and there's a natural visual density that SD tends toward, so since the picture is composed of many gens the density of the overall picture goes up. That being said, it's not something I've tested super extensively, and mostly only with this sort of style.
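
As a rough illustration of the "many gens" point, this sketch counts how many overlapping tiles it takes to cover an image. The 768 px tile size, the 64 px overlap, and the function names are hypothetical values picked for the example, not the Tiled Diffusion extension's defaults or API.

```python
import math


def tiles_along(length, tile, overlap):
    """Tiles of size `tile` needed to cover `length` when neighbours share `overlap` px."""
    if length <= tile:
        return 1
    stride = tile - overlap
    return math.ceil((length - tile) / stride) + 1


def tile_grid(width, height, tile=768, overlap=64):
    """Return (columns, rows) of tiles covering a width x height image."""
    return tiles_along(width, tile, overlap), tiles_along(height, tile, overlap)


cols, rows = tile_grid(2560, 3520)
print(f"{cols} x {rows} = {cols * rows} tiles")
```

Under these assumptions a 2560x3520 image is covered by a 4 x 5 grid, so the final picture is stitched from 20 separate diffusion regions, while a 512x704 image fits in a single tile.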


> but I gotta eat.

So do the people who made the art used to train these things.


Do not start with me with this shit. Of course artists need to eat. I'm on the fucking internet essentially begging because I struggle with capitalism and the idea that almost all forms of employment force me to restrict my intellectual property, or assign it to someone else. Every time I'm forced to do it because the bank account is low I can feel it grinding down my soul. It is antisocial to prevent people from building upon my ideas (or any ideas). We should take every step we can to strengthen the commons. (see "Everything is a Remix"[0])

I dropped out of fucking high school and was homeless couch surfing in LA for years trying to break into steady VFX work, I don't think for a lack of skill/blood/sweat/tears[1]. I'm well aware artists need to eat.

The problem isn't technological progress (jfc this is all open source!!! stability is doing god's work by undermining the value of ghouls who try to restrict access to these models for personal profit). It's certainly not that copyright is too weak in preventing people from building off the aggregate styles and work of artists. I learned style by fucking copying the masters[2]!! This is how art is supposed to work. The problem is the disgusting economic/political system that has twisted copyrights and patents into a device to enable rent seeking for those who can afford to purchase/commission them, rather than protecting the innovation and creativity that they were meant to.

[0] https://www.youtube.com/watch?v=nJPERZDfyWc

[1] https://www.reddit.com/gallery/unqmux

[2] https://imgur.com/a/NDRWVj8


What about the artists whose art said artists used to train themselves and so on and so forth? How much do we owe the people we learn from?

So many artists copy/adapt Disney's style totally anonymously for example. Should they be paying royalties to Disney?

As a human being, if I browse DA and reference other people's work/styles while I muck about on a graphics tablet am I in the wrong?


While I'm strongly on team "train the AI on everything and reduce IP protections to strengthen the commons" as much as such a team exists, I think it's important to point out that this is an argument that misses the forest for the trees.

It's only relevant when taking a very logical/semantic view of the situation; when you take a more holistic view it breaks down. The scale and implications of the two things you're comparing are completely different. While I probably agree with you in general, I think these sorts of arguments don't really help people understand our point of view, because they sidestep the economic and social implications which are at the heart of this.



