I guarantee that you will be up and running with Ostris AI-toolkit in less than 30 minutes using my in-depth tutorial published on YouTube.
I love using RunPod because I can access high-end GPUs for cents per hour when I want to do targeted heavy-lifting work such as training AI models or LoRAs.
Step by step
- Sign in to RunPod with your account. This assumes you have already signed up and added some credits (we give away credits from time to time on X – follow the blog so you can win)
- Deploy the standard RunPod PyTorch 2.2.0 template with a 24GB GPU. You can also use versions newer than 2.2.0
- Make sure you use Edit Template to modify the volumes and increase the Workspace volume to 80GB or 100GB
- Once the pod is up and running, connect to Jupyter Notebook and open a Terminal
- Run these commands to install the necessary components
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
git submodule update --init --recursive
python3 -m venv venv
source venv/bin/activate
pip3 install torch
pip3 install -r requirements.txt
- While the installation is running, you can upload your training images to the /workspace folder so you can reference them in a later step – the example below shows the expected layout.
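For reference, the training data is just a flat folder of images with matching caption text files, as described in the dataset comments of the YAML config further down. A quick sketch from the terminal (the folder matches my config below; the image file names are made-up examples):

mkdir -p /workspace/newspaper-collage
# after uploading, every image should have a caption .txt with the same base name:
ls /workspace/newspaper-collage
# image001.jpg  image001.txt  image002.png  image002.txt  ...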
- Hugging Face steps
- Sign in to Hugging Face and accept the model access terms for black-forest-labs/FLUX.1-dev
- Create a new file named env.txt in the root (ai-toolkit) folder using the File Explorer
- Get a READ token from Hugging Face and add it to the env.txt file like so: HF_TOKEN=insert_your_key_here
- Once you have saved the file, right-click it and rename it to .env
- As soon as you rename the file it will no longer appear in the File Explorer
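If you prefer the terminal over the File Explorer, the same hidden .env file can be created in one go (this is simply an equivalent shortcut using the same placeholder token, not an extra step):

# run from inside the ai-toolkit folder
echo "HF_TOKEN=insert_your_key_here" > .env
cat .env # hidden in the File Explorer, but still readable from the terminal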
- Download the config YAML file (you can switch and choose other examples based on your needs) and edit it to specify your preferences. Jump to this section in the video to follow along.
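Alternatively, you can start from the FLUX example config that ships with the toolkit and rename your copy (the example file name below is what the repo uses at the time of writing – check the config/examples folder if it has moved):

# run from inside the ai-toolkit folder
cp config/examples/train_lora_flux_24gb.yaml config/newspaper-collage.yaml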
- Once you are ready to run your training, run this command:
python run.py config/whatever_you_want.yaml
where you need to specify the YAML file you created. You can see my newspaper-collage LoRA sample YAML file below. NOTE: the FIRST run takes a bit of time as the models need to be downloaded from Hugging Face. Subsequent runs will be faster.
---
job: extension
config:
  # this name will be the folder and filename name
  name: "newspaper-collage"
  process:
    - type: 'sd_trainer'
      # root folder to save training sessions/samples/weights
      training_folder: "output"
      # uncomment to see performance stats in the terminal every N steps
      performance_log_every: 1000
      device: cuda:0
      # if a trigger word is specified, it will be added to captions of training data if it does not already exist
      # alternatively, in your captions you can add [trigger] and it will be replaced with the trigger word
      trigger_word: "newspaper collage style"
      network:
        type: "lora"
        linear: 32
        linear_alpha: 32
      save:
        dtype: float16 # precision to save
        save_every: 200 # save every this many steps
        max_step_saves_to_keep: 4 # how many intermittent saves to keep
      datasets:
        # datasets are a folder of images. captions need to be txt files with the same name as the image
        # for instance image2.jpg and image2.txt. Only jpg, jpeg, and png are supported currently
        # images will automatically be resized and bucketed into the resolution specified
        # on windows, escape back slashes with another backslash so
        # "C:\\path\\to\\images\\folder"
        - folder_path: "/workspace/newspaper-collage"
          caption_ext: "txt"
          caption_dropout_rate: 0.05 # will drop out the caption 5% of time
          shuffle_tokens: true # shuffle caption order, split by commas
          cache_latents_to_disk: true # leave this true unless you know what you're doing
          resolution: [ 512, 768, 1024 ] # flux enjoys multiple resolutions
      train:
        batch_size: 1
        steps: 3000 # total number of steps to train 500 - 4000 is a good range
        gradient_accumulation_steps: 1
        train_unet: true
        train_text_encoder: false # probably won't work with flux
        gradient_checkpointing: true # need this on unless you have a ton of vram
        noise_scheduler: "flowmatch" # for training only
        optimizer: "adamw8bit"
        lr: 1e-4
        # uncomment this to skip the pre training sample
        skip_first_sample: true
        # uncomment to completely disable sampling
        # disable_sampling: true
        # uncomment to use new bell curved weighting. Experimental but may produce better results
        # linear_timesteps: true
        # ema will smooth out learning, but could slow it down. Recommended to leave on.
        ema_config:
          use_ema: true
          ema_decay: 0.99
        # will probably need this if gpu supports it for flux, other dtypes may not work correctly
        dtype: bf16
      model:
        # huggingface model name or path
        name_or_path: "black-forest-labs/FLUX.1-dev"
        is_flux: true
        quantize: true # run 8bit mixed precision
        # low_vram: true # uncomment this if the GPU is connected to your monitors. It will use less vram to quantize, but is slower.
      sample:
        sampler: "flowmatch" # must match train.noise_scheduler
        sample_every: 200 # sample every this many steps
        width: 1024
        height: 1024
        prompts:
          # you can add [trigger] to the prompts here and it will be replaced with the trigger word
          # - "[trigger] holding a sign that says 'I LOVE PROMPTS!'"
          - "[trigger] woman with red hair, playing chess at the park, bomb going off in the background"
          - "[trigger] a woman holding a coffee cup, in a beanie, sitting at a cafe"
          - "[trigger] a horse is a DJ at a night club, fish eye lens, smoke machine, lazer lights, holding a martini"
        neg: "" # not used on flux
        seed: 42
        walk_seed: true
        guidance_scale: 4
        sample_steps: 20
# you can add any additional meta info here. [name] is replaced with config name at top
meta:
  name: "WeirdWonderfulAiArt"
  version: '1.0'
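A quick aside: if your terminal session drops and you reconnect later, the virtual environment needs to be re-activated before you launch (or re-launch) training. A minimal sketch, assuming you cloned the repo under /workspace and saved your config as config/newspaper-collage.yaml (substitute your own path and file name):

cd /workspace/ai-toolkit
source venv/bin/activate
python run.py config/newspaper-collage.yaml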
- Now you can see the training take place, and your LoRA files will be produced in the output folder with the names specified in the YAML file. You will also see the generated images, which will give you a feel for how the training is going. As you can see, LoRA training is in progress – 6/3000 steps – with the time remaining also indicated (3:31:41 remaining).
- The output LoRA files and the sample PNGs will appear in your /ai-toolkit/output/<your_lora_name> folder. The LoRA files will have the .safetensors extension.
- In the sub-folder called samples the script will also produce sample images based on the prompts you added in the YAML file, so you can start to see the results as the LoRA learns with every increasing step.
Once the LoRA training is finished you should download all the .safetensors files from the RunPod server, as they will be lost when you delete the RunPod instance.
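One convenient way to grab everything in a single download (my own habit rather than anything the toolkit requires) is to bundle the output folder into an archive and download that through the Jupyter File Explorer:

cd /workspace/ai-toolkit
tar -czf /workspace/newspaper-collage-lora.tar.gz output/newspaper-collage
# then download newspaper-collage-lora.tar.gz from /workspace via the File Explorer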
To test the LoRA you can use my LoRA tester workflow in ComfyUI or create your own following the preview below.
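Wherever you test it, ComfyUI picks up LoRAs from its models/loras folder, so copy the downloaded file there on the machine running ComfyUI (the file name below is a hypothetical example – use whichever .safetensors checkpoint you downloaded):

# on the machine running ComfyUI
cp newspaper-collage.safetensors /path/to/ComfyUI/models/loras/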
I hope you found my in-depth tutorial on running Ostris AI-toolkit using RunPod useful. Many of you have commented on and supported the video tutorial; you can also support the blog by using this referral link to RunPod, which doesn't cost you any more but supports this blog and our channel.
If you'd like to support our site, please consider buying us a Ko-fi, grabbing a product, or subscribing. Need a faster GPU? Get access to the fastest GPUs for less than $1 per hour with RunPod.io