* fix: update extend-exclude list in _typos.toml to include configs
* fix: exclude anima tests from pytest
* feat: add entry for 'temperal' in extend-words section of _typos.toml for Qwen-Image VAE
* fix: update default value for --discrete_flow_shift in anima training guide
* feat: add Qwen-Image VAE
* feat: simplify encode_tokens
* feat: use unified attention module, add wrapper for state dict compatibility
* feat: support model loading with dynamic fp8 optimization and LoRA
* feat: add anima minimal inference script (WIP)
* format: apply code formatting
* feat: simplify target module selection by regular expression patterns
* feat: keep caption dropout rate in cache and handle it in the training script
* feat: update train_llm_adapter and verbose default values to string type
* fix: use strategy instead of using tokenizers directly
* feat: add dtype property and all-zero mask handling in cross-attention in LLMAdapterTransformerBlock
* feat: support 5d tensor in get_noisy_model_input_and_timesteps
* feat: update loss calculation to support 5d tensor
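  A 5D-aware loss of the kind described in the two commits above can be sketched as follows; this is a minimal illustration assuming video latents shaped `(B, C, F, H, W)`, and `latent_mse` is a hypothetical name, not the actual function in `anima_train_utils`:

  ```python
  import torch

  # Minimal sketch: a per-sample MSE that accepts both 4D image latents
  # (B, C, H, W) and 5D video latents (B, C, F, H, W) by reducing over
  # every non-batch dimension. Function name is hypothetical.
  def latent_mse(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
      dims = tuple(range(1, pred.dim()))  # (1, 2, 3) for 4D, (1, 2, 3, 4) for 5D
      return ((pred - target) ** 2).mean(dim=dims)

  image_loss = latent_mse(torch.zeros(2, 4, 8, 8), torch.ones(2, 4, 8, 8))
  video_loss = latent_mse(torch.zeros(2, 4, 3, 8, 8), torch.ones(2, 4, 3, 8, 8))
  # both return one loss value per batch element
  ```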
* fix: update argument names in anima_train_utils to align with other architectures
* feat: simplify Anima training script and update empty caption handling
* feat: support LoRA format without `net.` prefix
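  One way to support both key layouts is to normalize keys when loading the state dict; this is only a sketch, and `normalize_lora_keys` and the example key names are illustrative assumptions:

  ```python
  # Hypothetical sketch: accept LoRA state dicts both with and without the
  # "net." key prefix by normalizing to the prefixed form on load.
  def normalize_lora_keys(state_dict: dict, prefix: str = "net.") -> dict:
      normalized = {}
      for key, value in state_dict.items():
          if not key.startswith(prefix):
              key = prefix + key  # add the missing prefix
          normalized[key] = value
      return normalized
  ```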
* fix: make fp8_scaled option work
* feat: add regex-based learning rates and dimensions handling in create_network
* fix: improve regex matching for module selection and learning rates in LoRANetwork
* fix: update logging message for regex match in LoRANetwork
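  The regex-based selection described in the three commits above can be sketched roughly as follows; `select_modules` and the pattern-to-learning-rate mapping are assumptions for illustration, not the actual LoRANetwork API:

  ```python
  import re

  # Minimal sketch of regex-based target-module selection with per-pattern
  # learning rates. Function, module, and pattern names are hypothetical.
  def select_modules(module_names, pattern_lrs):
      """Return {module_name: lr} for modules whose name matches any pattern."""
      selected = {}
      for name in module_names:
          for pattern, lr in pattern_lrs.items():
              if re.search(pattern, name):
                  selected[name] = lr
                  break  # first matching pattern wins
      return selected

  modules = ["blocks.0.attn.to_q", "blocks.0.mlp.fc1", "blocks.0.norm"]
  targets = select_modules(modules, {r"attn": 2e-4, r"mlp": 5e-5})
  # "blocks.0.norm" matches no pattern and is not selected
  ```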
* fix: keep latents 4D except DiT call
* feat: enhance block swap functionality for inference and training in Anima model
* feat: refactor Anima training script
* feat: optimize VAE processing by adjusting tensor dimensions and data types
* fix: wait for all block transfers before switching offloader mode
* feat: update Anima training guide with new argument specifications and regex-based module selection. Thank you Claude!
* feat: support LoRA for Qwen3
* feat: update Anima SAI model spec metadata handling
* fix: remove unused code
* feat: split CFG processing in do_sample function to reduce memory usage
* feat: add VAE chunking and caching options to reduce memory usage
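  Chunked decoding of the kind mentioned above typically looks like this; a generic sketch where `decode_in_chunks`, `decode_fn`, and `chunk_size` are illustrative names, not the script's actual options:

  ```python
  # Hypothetical sketch: decode latents in small batches so only chunk_size
  # samples occupy VAE activation memory at a time, trading speed for a
  # lower peak footprint.
  def decode_in_chunks(latents, decode_fn, chunk_size=1):
      decoded = []
      for start in range(0, len(latents), chunk_size):
          decoded.extend(decode_fn(latents[start:start + chunk_size]))
      return decoded
  ```

  `decode_fn` stands in for whatever maps a chunk of latents to decoded images; the pattern works with any indexable sequence.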
* feat: optimize RMSNorm forward method and remove unused torch_attention_op
* Update library/strategy_anima.py
Use torch.all instead of all.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
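  The distinction matters because Python's built-in `all()` iterates over the tensor's first dimension and calls `bool()` on each sub-tensor, which raises for multi-element slices, while `torch.all()` reduces over every element regardless of shape:

  ```python
  import torch

  mask = torch.zeros(2, 3)

  # Built-in all() calls bool() on each (3,) row, which is ambiguous for
  # multi-element tensors and raises RuntimeError:
  try:
      all(mask == 0)
  except RuntimeError:
      pass

  # torch.all() reduces over every element and is shape-agnostic:
  assert torch.all(mask == 0)
  ```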
* Update library/safetensors_utils.py
Fix duplicated new_key for concat_hook.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update anima_minimal_inference.py
Remove unused code.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update anima_train.py
Remove unused import.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update library/anima_train_utils.py
Remove unused import.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: review with Copilot
* feat: add script to convert LoRA format to ComfyUI compatible format (WIP, not tested yet)
* feat: add process_escape function to handle escape sequences in prompts
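  Such escape handling can be sketched like this; a guess at the behavior, assuming a literal `\n` in a prompt file should become a real newline, so the actual `process_escape` may differ:

  ```python
  # Hypothetical sketch of escape-sequence handling for prompt strings.
  # Unknown escapes are left untouched rather than raising.
  def process_escape(text: str) -> str:
      result = []
      i = 0
      while i < len(text):
          if text[i] == "\\" and i + 1 < len(text):
              nxt = text[i + 1]
              if nxt == "n":
                  result.append("\n")   # escaped newline
              elif nxt == "t":
                  result.append("\t")   # escaped tab
              elif nxt == "\\":
                  result.append("\\")   # escaped backslash
              else:
                  result.append(text[i:i + 2])  # leave unknown escapes as-is
              i += 2
          else:
              result.append(text[i])
              i += 1
      return "".join(result)

  prompt = process_escape(r"first line\nsecond line")
  assert "\n" in prompt  # the literal \n became a real newline
  ```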
* feat: enhance LoRA weight handling in model loading and add text encoder loading function
* feat: improve ComfyUI conversion script with prefix constants and module name adjustments
* feat: update caption dropout documentation to clarify cache regeneration requirement
* feat: add clarification on learning rate adjustments
* feat: add note on PyTorch version requirement to prevent NaN loss
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* doc: move sample prompt file documentation and remove branch-specific history
* doc: remove outdated FLUX.1 and SD3 training information from README
* doc: update README and training documentation for clarity and structure
* feat: add pyramid noise and noise offset options to generation script
* fix: make it work with SD1.5 models
* doc: update to match the latest gen_img.py
* doc: update README to clarify script capabilities and remove deprecated sections
* fix: support dataset with metadata
* feat: support another tagger model
* fix: improve handling of image size and caption/tag processing in FineTuningDataset
* fix: enhance metadata loading to support JSONL format in FineTuningDataset
* feat: enhance image loading and processing in ImageLoadingPrepDataset with batch support and output options
* fix: improve image path handling and memory management in dataset classes
* Update finetune/tag_images_by_wd14_tagger.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: add return type annotation for process_tag_replacement function and ensure tags are returned
* feat: add artist category threshold for tagging
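  Per-category thresholds of the kind added above can be sketched like this; `filter_tags`, the category names, and the default value are illustrative assumptions, not the tagger script's actual options:

  ```python
  # Hypothetical sketch: keep a tag only if its confidence clears the
  # threshold for its category, falling back to a shared default.
  def filter_tags(tag_probs, thresholds, default_threshold=0.35):
      """tag_probs: {tag: (category, prob)} -> list of tags kept."""
      kept = []
      for tag, (category, prob) in tag_probs.items():
          if prob >= thresholds.get(category, default_threshold):
              kept.append(tag)
      return kept

  tags = filter_tags(
      {"sky": ("general", 0.9), "some_artist": ("artist", 0.5)},
      {"artist": 0.8},
  )
  # "sky" clears the default threshold; "some_artist" falls below the
  # stricter artist-specific threshold
  ```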
* doc: add comment for clarification
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>