The Making of a LoRA: Ink & Lore
I've been saving random watercolor and colored pencil images for a while now without much of a plan for what I'd do with them. Recently, though, I discovered that Flux can do some unexpected and creative things when trained on a mix of styles, so I decided to train a LoRA without a particular style target in mind, aiming instead for a fusion of several mediums. I liked the way it turned out.
The Training Settings
- The dataset was 14 images, cropped to the following aspect ratios: 2 images (3:5), 6 images (7:9), 2 images (1:1), and 4 images (5:3).
- That dataset was used 4 times, once at each of 4 resolutions:
  - 256, batch size of 8, with 2 repeats, no buckets, random crop
  - 512, batch size of 4, with 2 repeats, buckets enabled
  - 768, batch size of 4, with 2 repeats, buckets enabled
  - 1024, batch size of 2, with 2 repeats, buckets enabled
- Training ran for 2000 steps, but the best model was saved at 1728 steps (epoch 56 of 63)
- Training used a network rank of 8, with the network alpha set equal to the rank.
- It was trained on the flux-dev2pro-fp8 base model
- Used the AdamW8bit optimizer with betas=(0.9, 0.999) and weight decay=0.01
- Used cosine annealing for the LR scheduler with a max LR of 6e-4 and a min LR of 2e-4, cycling every 200 steps for 8 cycles
- Trained CLIP-L with an LR of 5e-5
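For anyone who wants to translate that list into code, here is a minimal sketch of the optimizer side in PyTorch with bitsandbytes. It assumes the rank-8 LoRA parameters for the Flux transformer and CLIP-L have already been collected; the tensors below are just stand-ins, not any particular trainer's API.

```python
import torch
import bitsandbytes as bnb

# Stand-ins for the rank-8 LoRA parameters; in a real run these would be the
# adapter weights injected into the Flux transformer and into CLIP-L.
transformer_lora_params = [torch.nn.Parameter(torch.zeros(8, 3072))]
clip_l_lora_params = [torch.nn.Parameter(torch.zeros(8, 768))]

optimizer = bnb.optim.AdamW8bit(
    [
        {"params": transformer_lora_params, "lr": 6e-4},  # peak LR for the transformer LoRA
        {"params": clip_l_lora_params, "lr": 5e-5},       # separate, lower LR for CLIP-L
    ],
    betas=(0.9, 0.999),
    weight_decay=0.01,
)
```

The cosine annealing schedule from the list is sketched separately in the Learning Rate Schedule section below.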
A Closer Look
Multiple Resolutions
Though there are many ways to run multi-resolution training, I've found that a wide spread, going all the way down to 256, works well for an art style LoRA. I typically set the lowest resolution without bucketing and let it randomly crop. At the lowest resolution the model doesn't capture fine details, but rather overall stylistic elements. Because of this, I also found that it works well to run the lowest resolution without captions, which appears to enhance the strength of the style. Even though I can't train at 1024 with anything higher than batch size 2, mixing the training sets averages out the batch sizes, which speeds up training and lets me scale up the learning rate without burning the model.
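To make the mix concrete, here's how I'd write the four tiers down as data. The field names are my own shorthand, not a specific trainer's config format.

```python
# My own shorthand for the four-tier mix, not a specific trainer's config format.
dataset_mix = [
    {"resolution": 256,  "batch_size": 8, "repeats": 2, "buckets": False, "random_crop": True,  "captions": False},
    {"resolution": 512,  "batch_size": 4, "repeats": 2, "buckets": True,  "random_crop": False, "captions": True},
    {"resolution": 768,  "batch_size": 4, "repeats": 2, "buckets": True,  "random_crop": False, "captions": True},
    {"resolution": 1024, "batch_size": 2, "repeats": 2, "buckets": True,  "random_crop": False, "captions": True},
]

# 14 images x 2 repeats x 4 tiers = 112 crops per epoch, which works out to
# roughly 28/8 + 28/4 + 28/4 + 28/2 ≈ 32 optimizer steps per epoch.
images_per_epoch = 14 * 2 * len(dataset_mix)
```

That per-epoch step count is also roughly what lines up the 2000-step run with the 63 epochs mentioned in the settings above.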
Learning Rate Schedule
I actually made a mistake training this model that turned out to work to my advantage. Typically I've been training models with a cosine annealing LR scheduler, running 8 cycles of 200 steps each for a total of 1600 training steps. The restarts help break the model out of any slumps and make it easier to map out where the best epochs are likely to be. This time, however, I increased the number of training steps but forgot to proportionally increase the cycles, so the learning rate plateaued at a constant 2e-4 after the 8th cycle (1600 steps). The smooth finish of the run made it easy to pick out several good epochs from the last 500 or so steps.
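For anyone who wants to visualize that accidental schedule, here is my own reconstruction of it, not the trainer's code: cosine cycles between 6e-4 and 2e-4 every 200 steps, then a flat tail at 2e-4 once the 8 cycles are used up.

```python
import math

def lr_at_step(step, max_lr=6e-4, min_lr=2e-4, cycle_len=200, num_cycles=8):
    """Cosine annealing with warm restarts, holding at min_lr after the last cycle."""
    if step >= cycle_len * num_cycles:
        return min_lr  # past step 1600: the flat tail that smoothed out the finish
    t = (step % cycle_len) / cycle_len  # position within the current cycle, 0..1
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t))

# e.g. lr_at_step(0) == 6e-4, lr_at_step(100) ≈ 4e-4, lr_at_step(1800) == 2e-4
```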
Training on Dev2Pro
To improve the stability and quality of LoRA training, I opted to use the Flux-Dev2Pro model as the base. Flux-Dev2Pro is a fine-tuned version of Flux-Dev specifically designed to address common issues encountered during LoRA training, such as distorted outputs and model collapse. I've often repeated the same training run on both Flux-Dev2Pro and the original Flux-Dev base, and Flux-Dev2Pro yields the better LoRA in the majority of cases, with occasional ties in overall quality. One caveat: if you generate samples during training, they may not be trustworthy, since the script uses Flux-Dev2Pro for inference, which can result in subpar images. To save time, it's advisable to disable samples and evaluate your epochs once training is complete.
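Here's a rough sketch of how that post-training evaluation could look with diffusers on the standard Flux-Dev base, assuming the saved epochs are diffusers-loadable LoRA files. The filenames, prompt, and sampling settings below are placeholders, not the exact ones I used.

```python
import torch
from diffusers import FluxPipeline

# Evaluate saved epochs on the standard Flux-Dev base rather than trusting the
# Dev2Pro training samples. Paths and prompt are placeholders.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

for epoch_file in ["epoch-0054.safetensors", "epoch-0056.safetensors"]:
    pipe.load_lora_weights(epoch_file)
    image = pipe(
        "a fox in a forest, watercolor and colored pencil",
        num_inference_steps=28,
        guidance_scale=3.5,
        generator=torch.Generator("cuda").manual_seed(0),  # fixed seed for a fair comparison
    ).images[0]
    image.save(epoch_file.replace(".safetensors", ".png"))
    pipe.unload_lora_weights()  # reset before loading the next epoch
```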
Where to get the LoRA
Ink & Lore is available for free download on Civitai and can be run online on Mage.Space.