Progression over Time

Background

First Efforts

https://www.tightbytes.com/art/images/Cui/23/Charlotte.png

Quick history:

  • MidJourney… very quickly created characters such as Charlotte. It was so satisfying that I paid for a subscription. However, the characters all began to look very similar.

  • LeonardoAI… Subsequently, this produced a significantly broader variety of individual faces. The key character ‘Victoria’ evolved in Leonardo.

  • Discovered Stable Diffusion and Automatic1111, which opened the door to limitless image/character creation (including NSFW). For this, however, I needed a more powerful graphics card.

  • Purchased an NVidia 3070 card. Note: this card has only 8 gig of VRAM, whereas the 3060 has 12 and would have been the better choice.

  • Downloaded and (via Anaconda) installed Automatic1111 web-UI.

In A1111

https://www.tightbytes.com/art/images/Cui/23/Staci-CafeB1.png

The main objectives:

  • 1-to create as realistic (photo-real) characters as possible

  • 2-create consistent characters (back to MidJourney for this)

  • 3-have multiple characters in a scene

  • 4-have those multiple characters be unique (different from each other)

  • 5-have those multiple characters be reproducible

Was able to achieve 1, 2, 3 and 4. Installed roop but did not invoke it - did all face-swapping in MidJourney.

Discovered ComfyUI, and migrated all activities to ComfyUI.

Node-based Interface ComfyUI

Objective Breakdown

I installed ComfyUI in the recommended Python3.10.6 environment which eliminated many obscure Python error messages no one seemed to have an answer to. I reinstalled A1111 the same way: bog-standard Python3.10.6 venv. All models are currently saved in the A1111 folder. As of September, 2023, have cancelled all subscriptions to Leonardo and MidJourney: roop is very straightforward and easy to accomplish in ComfyUI.
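A quick way to confirm that a launch really is using the intended interpreter is to ask Python itself. The snippet below is only a sketch: the function names are made up for illustration, and the expected version is simply the 3.10.6 mentioned above. It relies on the standard-library fact that inside a venv `sys.prefix` differs from `sys.base_prefix`.

```python
import sys

# Version named in the notes above; adjust if your install differs.
EXPECTED_VERSION = (3, 10, 6)


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when running inside a venv (prefix differs from base prefix)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


def version_matches(version=None, expected=EXPECTED_VERSION):
    """Return True when the (major, minor, micro) version equals the expected one."""
    version = tuple(sys.version_info[:3]) if version is None else tuple(version)
    return version == tuple(expected)


if __name__ == "__main__":
    print("inside a venv:", in_virtualenv())
    print("version matches 3.10.6:", version_matches())
```

Run it with the same interpreter that starts ComfyUI or A1111; if either line prints False, the wrong Python is being picked up.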

The main objectives, current as of 7th Oct, 2023, are:

  • 1-to create as realistic (photo-real) characters as possible [DONE]

  • 2-create persistent characters [DONE using roop]

  • 3-have multiple characters in a scene [DONE]

  • 4-have those multiple characters be unique and reproducible [DONE dual roop]

  • 5-characters will be anatomically correct (minimal deformities)

  • 6-have those multiple characters interact (more than just staring blankly)

  • 7-clothe the characters differently

  • 8-develop poses / LoRA / LyCORIS etc

2 - Persistent Characters

https://www.tightbytes.com/art/images/Cui/23/Charlotte-Robyn-Library.png

The first characters (Charlotte, Victoria, Mabel, etc) were created for an odd little novella I was writing, one that has sort-of stalled as what I started writing seemed, with time, a bit trite and uninspired. They were ‘created’ in MidJourney and later LeonardoAI, satisfying the ‘photoreal’ requirement easily. I’ve since gone to using This Person Does Not Exist for reference faces as the features are a lot less airbrushed and produce far more realistic-looking individuals using roop. There is one carefully GIMPed real person reference image for a persona in some of my images.

3 - Multiple Characters

The length and detail of the prompt very much affects how consistently I can get two characters to appear. Too much emphasis on a detail results in one person appearing instead of two, regardless of whether I enclose the phrase ‘2 women’ in parentheses. Removing detail appears to resolve that problem.

4 - Unique, reproducible characters

Generally speaking, the same conditions that result in only one figure appearing where two were asked for also determine whether roop works consistently. I’ve had Comfy assign my first face to a tertiary individual in the image (very much a background figure), so that the first character receives the second face and none is assigned to the second character at all:

https://www.tightbytes.com/art/images/Cui/23/Cui-3rdChar.png

In this image, the first face was given to the obscure person on the couch on the left of the picture, the second figure - in the far-too-short skirt - was given the second face by roop, and the third figure was whatever Comfy / SD had in mind. Note: this brings up another interesting challenge… appropriate attire. Unfortunately, prompting has little effect on the length of skirts or the décolletage of the neckline.
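One plausible explanation for this mix-up, and it is only an assumption on my part, not something confirmed for roop, is that detected faces are numbered left to right and the source faces are handed out in that order. The sketch below illustrates that idea with made-up names (`DetectedFace`, `assign_faces`) and made-up coordinates; it is not roop's actual code.

```python
from typing import NamedTuple, Optional


class DetectedFace(NamedTuple):
    label: str     # who the face belongs to in the render
    left_x: float  # left edge of the face box, in pixels (illustrative)


def assign_faces(detected: list[DetectedFace],
                 source_faces: list[str]) -> dict[str, Optional[str]]:
    """Hand out source faces in left-to-right order (ASSUMED ordering rule).

    Faces beyond the number of source faces get None, i.e. they keep
    whatever the checkpoint rendered.
    """
    ordered = sorted(detected, key=lambda face: face.left_x)
    return {
        face.label: source_faces[i] if i < len(source_faces) else None
        for i, face in enumerate(ordered)
    }


if __name__ == "__main__":
    scene = [
        DetectedFace("main figure", 300.0),
        DetectedFace("second figure", 600.0),
        DetectedFace("background figure on couch", 50.0),
    ]
    print(assign_faces(scene, ["face 1", "face 2"]))
```

Under that assumption, a background figure who happens to sit furthest left takes the first face, which matches what happened in the image above.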

So, whilst I do feel this objective has been accomplished, getting consistent results very much depends on shorter, concise prompts. Details may need to come from LoRAs or - newer tech for artists this October - IP Adapters.

5 - Minimal, or no Deformities

While facial distortion appears, for the most part, to have been resolved - except when the faces are in close proximity to each other - hand deformities, frustratingly, persist:

https://www.tightbytes.com/art/images/Cui/23/EnBateau01.png

Note the hands (rendered 06.Oct.23) - this is currently my main objective: to consistently render non-deformed limbs, or at least be able to fix deformities via inpainting or something like that.

And yet, in the preceding image, the hands are fine.

Update: 23.10.08

The original developer of ComfyUI’s Manager, ltdrdata, has released a video proposing an automatic fix to hand deformations. The key node-set to manage all this is contained in his Impact Pack.

6 - Character Interaction

So far (Oct 2023) no real solution has presented itself. Indeed, getting one character to show one expression and the other something else eludes me. Even getting the characters to look at anything specific is “too-hard basket” stuff at the moment. I can get ‘selfie’-style shots galore, and, à la rigueur, two individuals ‘discussing’ while each sort of looks off into the distance, ostensibly contemplative, but even achieving that is inconsistent and really, how much need is there for that sort of pose? Getting consistent interaction remains a thorny problem. I can currently have characters hold hands and ‘embrace’ but (at least in SD1.5), a more intimate display of affection - i.e., a kiss - tends to distort the faces near the contact point. (Observed 04.Oct.23: appears to be a roop thing).


Node Sets - October Workflow

Nodes affect Objectives

23.10.08

I am discovering that prompts matter almost as much as nodes, if not more: specifically, the injudicious, casual use of keywords in a prompt will lead to undesirable outcomes in the final image. For example: I had the phrase “lace-up boots” in a prompt. This resulted in short skirts and lace-topped stockings. Removing it brought the skirt hemline down to more desirable levels and gave a far better image.

In a good prompt, less is more. Too many words in the negative prompt do absolutely nothing to improve the image and can adversely affect the outcome.

Challenges I have yet to resolve:

  • Camera distance from my subject

  • Keeping what the subject is looking at consistent with the prompt

HiRes Fix

23.10.07

From Automatic1111, I really missed ‘HiRes Fix’. Well, got that working, sort-of. I think. Open to new ideas, particularly if they work:

https://www.tightbytes.com/art/images/Cui/23/231007-HiResFix.png

23.10.09

Thanks to EndangeredAI and his excellent videos (linked below), I’ve had cause to revise the Sampling section of my workflow. Mind you: what was suggested wasn’t for HiRes Fix per se, and it was SDXL-based, so I might look at it again.

Nodal Noodles

The Efficiency Pack clears up a lot of noodle mess. It includes one LoRA and an empty latent image.

IP Adapter

23.10.13

At this juncture, I’m able to render two characters - in this case, women - with consistent faces. Despite the hand fix “module”, hands can still turn out a bit messy, perhaps because of the proximity of the characters. I’m cycling through the checkpoints to see if one of them does a better job at this. At the moment (1000h) the Swizz8-BakedVAE-FP16-Pruned checkpoint appeared initially to produce the best results. However, with subsequent renders, limbs were missing and the figures put themselves in impossible poses.


Animation - October Workflow

23.10.24 Anim-Begins

Animating my figures wasn’t even on the radar, but for some reason I thought I’d have a go. It involved installing the AnimateDiff-Evolved node set and a whole bunch of models. One can create GIFs or MP4s. The frame count limit is 32 unless you add the Uniform Context Options node.
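As I understand it, the general idea behind getting past a fixed frame limit is to process a long clip as a series of shorter, overlapping windows. The sketch below shows only that general sliding-window idea. It is not the Uniform Context Options node's actual scheduling, and the function name and the window and overlap values (16 and 4) are illustrative assumptions.

```python
def sliding_windows(total_frames: int, window: int = 16, overlap: int = 4) -> list[list[int]]:
    """Split frame indices into overlapping windows of equal length.

    Simplified illustration: consecutive windows share `overlap` frames,
    and the last window is shifted back so it still has `window` frames.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    if total_frames <= window:
        return [list(range(total_frames))]

    step = window - overlap
    windows = []
    start = 0
    while True:
        end = min(start + window, total_frames)
        windows.append(list(range(end - window, end)))
        if end == total_frames:
            break
        start += step
    return windows


if __name__ == "__main__":
    for w in sliding_windows(48):
        print(w[0], "to", w[-1])
```

Every frame lands in at least one window, and the shared frames at the edges are what let neighbouring windows be blended into one continuous clip.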

23.10.27 Rooping - FlowFrames

For the video to actually be smooth, I bring the MP4s into a programme - only available for Windows, unfortunately - called “FlowFrames”. It does a decent job of interpolating the missing frames, creating very viewable video and even a decent GIF, converting this GIF:

https://www.tightbytes.com/art/images/Cui/23/Vid/Dreaming-05.gif

to this:

https://www.tightbytes.com/art/images/Cui/23/Vid/MesReves-3.gif

The video was 4x frame-interpolated.
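To put numbers on what 4x interpolation does to a clip, here is a small arithmetic sketch. It assumes the interpolator inserts new frames between each existing pair and multiplies the frame rate by the same factor; FlowFrames' exact frame counts may differ, and the 32 frames at 8 fps in the example are illustrative values, not measurements from my renders.

```python
def interpolated_clip(frames: int, fps: float, factor: int) -> tuple[int, float]:
    """Return (frame count, fps) after inserting factor-1 frames between each pair.

    Assumption: in-between frames only, so N frames become (N - 1) * factor + 1,
    and the playback rate is multiplied by the factor.
    """
    if frames < 1 or factor < 1:
        raise ValueError("frames and factor must both be at least 1")
    return (frames - 1) * factor + 1, fps * factor


if __name__ == "__main__":
    out_frames, out_fps = interpolated_clip(frames=32, fps=8, factor=4)
    print(out_frames, "frames at", out_fps, "fps")
```

The clip stays roughly the same length in seconds; it is simply drawn with about four times as many frames.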

The figures seem to have come to life. So far, of course, I’ve only featured one figure at a time, which may have an impact on figure accuracy. Interestingly, all anatomy issues that were produced in still image ‘renders’ have magically resolved: no extra - or missing - fingers or limbs or anything of the sort. Again, perhaps that’s due to it only being one figure.

23.12.09 AnimateDiff

I am getting out-of-memory errors all the time, now. So, AUD$749 for a 4060ti with exactly twice the memory I have now. Rendering video with purposeful movement should finally be within the scope of the possible.


ComfyUI Tips and Tricks

The “Unofficial” Sub-Reddit ComfyUI has proven an amazing resource, more than anywhere else, to be honest. Hope to collect some of the “Tips and Tricks” I pick up here.

Inserting Node-sets

To insert a node-set into a workflow:

  • Save the node set to insert by highlighting the nodes, and [RtClick] -> ‘Save Selected as Template’

  • In the target workflow, [RtClick] -> ‘Node Templates’ (select the node set)

Caveat: if enclosed in a grouping box, the box is not saved into the template.

Note: templates are stored in the browser’s cache. Until recently, there was no GUI-based approach to removing unwanted templates: that has changed.


Media - Social and Other

The IPAdapter

How to use the models, by the Developer: Matt

with Latent Vision

ComfyUI Nodes: What they Do

Complete Comfy UI SDXL 1.0 Guide Part 1 - Beginner to Pro Series

with Endangered AI


Posts on SubReddit ComfyUI

23.10.07

Note: removed this post as it was not as good as those by EndangeredAI

Different artists, different solutions. You’re starting this learning process correctly: simple workflows are the way to go. I’m also finding that the simpler the prompts - positive and negative - the more consistent the output.

I’d be happy to share my workflow (via json-embedded pngs):

The workflow in both images is identical: it manages two individuals - well, women… haven’t worked out how to mix it up with multiple figures yet - in a variety of settings and the faces remain the same. In order to have a play with the workflow, you will need to install:

  • Efficiency Nodes

  • Ultimate SD Upscale

  • ComfyUI roop

  • Checkpoint: epiCRealismSin with add_detail and epiCRealismHelper LoRAs, but those are just my preference - any SD1.5 models will do

  • RgThree’s nodes, and probably some other stuff too - CivitAI is a great place to “shop”! :-)

All of these - and heaps more - can all be installed via ‘Manager’. You will need that, of course.

For faces, you can always generate and save a face in Comfy, but I’ve found that thispersondoesnotexist generates high-detail faces that work quite well with roop. Your mileage may vary, of course.
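The “json-embedded pngs” mentioned above can also be read back outside ComfyUI. The sketch below uses only the Python standard library to pull text chunks out of a PNG and parse one as JSON. It assumes the workflow sits in an uncompressed tEXt chunk under the key "workflow", which is my understanding of how ComfyUI saves it; compressed or international text chunks are not handled. The function names are made up for illustration, and `png_chunk` exists only to build tiny test files.

```python
import json
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC (handy for small tests)."""
    crc = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def read_png_text_chunks(data: bytes) -> dict[str, str]:
    """Return {keyword: text} for every tEXt chunk in the PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
    return chunks


def extract_workflow(data: bytes, key: str = "workflow"):
    """Parse the JSON stored under `key` (ASSUMED key name), or return None."""
    text = read_png_text_chunks(data).get(key)
    return json.loads(text) if text is not None else None
```

To try it on a shared image, read the file in binary mode and pass the bytes to `extract_workflow`; a result of None means no chunk with that key was found.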

Posts on YouTube

23.10.27

ComfyUI & Animate Diff 📕 Guide by PromptMuse.

Excellent video, PromptMuse! Thank you for it.

I completely agree with the hardware suggestion you made about having a 12 gig VRAM graphics card. I have this older PC (an i5 or i7) into which I put a 3070 NVidia card… poor choice, as it only has 8 gig VRAM. One of the things I wanted to achieve was interaction between figures in my little scenes. I had roop worked out and so had persistent characters, but getting the figures to do much more than selfie-smile or stare catatonically past each other wasn’t happening. To that end, I was hoping that with video2video I could finally “compel” that dialogue, perhaps even facial expressions reflecting involvement in that scene. With an 8 gig card, I have been able to achieve some things. If you are interested, I’d be happy to share my workflow, one that can achieve remarkable results with an 8 gig card:

https://www.tightbytes.com/art/images/Cui/23/Vid/Patr014.mp4

BTW, I use https://thispersondoesnotexist.com/ for original, high-quality “real-people” faces… roop likes them better than AI faces made in MidJourney or LeonardoAI. And, in order to get a smoother video, you can use FlowFrames (Windows only) to get that result.

End of the day, however, I suppose I’ll just have to save up for the 3060 card, as it does have 12 gig, in order to do vid2vid. All the best with your channel. I have subscribed!