So the way I used to get consistent characters was a mix of:
img2img (denoising strength: 0.7), with the same model posed in the desired position as the init image.
Then a weighted-down character LoRA plus a weighted-down celebrity name in the prompt, to give a consistent base.
And finally a prompt with all the same details each time. (Well, apart from changing the expression in the prompt each time.)
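The prompt side of that recipe can be sketched as a tiny helper, assuming Automatic1111-style syntax where `<lora:name:weight>` loads a LoRA at reduced strength and `(token:weight)` down-weights a token. The LoRA name, celeb tag, and character details below are placeholders, not the original poster's actual values:

```python
# Placeholder character details; in practice you'd keep these identical run to run.
BASE_DETAILS = "25yo woman, shoulder-length auburn hair, green eyes, freckles"

def build_prompt(expression: str) -> str:
    """Same details every time; only the expression changes per image."""
    lora = "<lora:my_character:0.4>"       # character LoRA, weighted down (placeholder name)
    celeb = "(some_celebrity:0.3)"         # celeb name, weighted down (placeholder)
    return f"{lora}, {celeb}, {BASE_DETAILS}, {expression} expression"

print(build_prompt("smiling"))
print(build_prompt("surprised"))
```

Keeping everything but the expression fixed is what makes the outputs comparable; the low LoRA and celeb weights nudge the face toward a stable identity without overpowering the prompt.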
Wait. So the way to get consistent characters without LoRA training is to use a LoRA?
I agree. My 3070 gets responses from the 8B Llama 3 model back in about 250ms, at least for short ones.