Kohya-ss-sd-scripts

mirror of https://github.com/kohya-ss/sd-scripts.git synced 2026-04-06 21:52:27 +00:00

Author	SHA1	Message	Date
Kohya S.	fa53f71ec0	fix: improve numerical stability by conditionally using float32 in Anima (#2302 ) * fix: improve numerical stability by conditionally using float32 in block computations * doc: update README for improvement stability for fp16 training on Anima in version 0.10.3	2026-04-02 12:36:29 +09:00
Kohya S	5fb3172baf	fix: AdaLN modulation to use float32 for numerical stability in fp16	2026-03-29 21:25:53 +09:00
Kohya S.	5cdad10de5	Fix/leco cleanup (#2294 ) * feat: add LECO training scripts for SD1.x/2.x and SDXL (#2285) * Add LECO training script and associated tests - Implemented `sdxl_train_leco.py` for training with LECO prompts, including argument parsing, model setup, training loop, and weight saving functionality. - Created unit tests for `load_prompt_settings` in `test_leco_train_util.py` to validate loading of prompt configurations in both original and slider formats. - Added basic syntax tests for `train_leco.py` and `sdxl_train_leco.py` to ensure modules are importable. * fix: use getattr for safe attribute access in argument verification * feat: add CUDA device compatibility validation and corresponding tests * Revert "feat: add CUDA device compatibility validation and corresponding tests" This reverts commit `6d3e51431b`. * feat: update predict_noise_xl to use vector embedding from add_time_ids * feat: implement checkpointing in predict_noise and predict_noise_xl functions * feat: remove unused submodules and update .gitignore to exclude .codex-tmp --------- Co-authored-by: Kohya S. <52813779+kohya-ss@users.noreply.github.com> * fix: format * fix: address review comments on LECO PR #2285 - Revert the getattr changes in train_util.py/deepspeed_utils.py and add dummy arguments to the LECO parser - Change the module-level import of sdxl_train_util to a local import - Make PromptEmbedsCache.__getitem__ raise KeyError on a cache miss - Change the config file format from YAML to TOML (to match the repository's conventions) - Consolidate duplicated code (build_network_kwargs, get_save_extension, save_weights) into leco_train_util.py - Simplify the redundant PromptSettings construction in _expand_slider_target - Add a dedicated batch_add_time_ids function for add_time_ids Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: greatly expand the LECO training guide. Add explanations of command-line arguments for every category, descriptions of all prompt TOML fields, the difference between the two guidance_scale values, a table of recommended settings, a guide for converting from YAML, and more. Bilingual layout with English body text and collapsible Japanese sections. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: dtype mismatch in apply_noise_offset. Fix an issue where latents were implicitly upcast because torch.randn defaults to float32. Adopt the safe pattern of generating in float32 on CPU and then converting to the dtype/device of latents. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Umisetokikaze <52318966+umisetokikaze@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-29 20:41:43 +09:00
Kohya S.	5f793fb0f4	Log d*lr for ProdigyPlusScheduleFree (#2289 )	2026-03-29 18:47:09 +09:00
woctordho	343c929e39	Log d*lr for ProdigyPlusScheduleFree	2026-03-21 11:09:56 +08:00
woctordho	1cd95b2d8b	Add `skip_image_resolution` to deduplicate multi-resolution dataset (#2273 ) * Add min_orig_resolution and max_orig_resolution * Rename min_orig_resolution to skip_image_resolution; remove max_orig_resolution * Change skip_image_resolution to tuple * Move filtering to __init__ * Minor fix	2026-03-19 08:43:39 +09:00
Kohya S.	2217704ce1	feat: Support LoKr/LoHa for SDXL and Anima (#2275 ) * feat: Add LoHa/LoKr network support for SDXL and Anima - networks/network_base.py: shared AdditionalNetwork base class with architecture auto-detection (SDXL/Anima) and generic module injection - networks/loha.py: LoHa (Low-rank Hadamard Product) module with HadaWeight custom autograd, training/inference classes, and factory functions - networks/lokr.py: LoKr (Low-rank Kronecker Product) module with factorization, training/inference classes, and factory functions - library/lora_utils.py: extend weight merge hook to detect and merge LoHa/LoKr weights alongside standard LoRA Linear and Conv2d 1x1 layers only; Conv2d 3x3 (Tucker decomposition) support will be added separately. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: Enhance LoHa and LoKr modules with Tucker decomposition support - Added Tucker decomposition functionality to LoHa and LoKr modules. - Implemented new methods for weight rebuilding using Tucker decomposition. - Updated initialization and weight handling for Conv2d 3x3+ layers. - Modified get_diff_weight methods to accommodate Tucker and non-Tucker modes. - Enhanced network base to include unet_conv_target_modules for architecture detection. 
* fix: rank dropout handling in LoRAModule for Conv2d and Linear layers, see #2272 for details * doc: add dtype comment for load_safetensors_with_lora_and_fp8 function * fix: enhance architecture detection to support InferSdxlUNet2DConditionModel for gen_img.py * doc: update model support structure to include Lumina Image 2.0, HunyuanImage-2.1, and Anima-Preview * doc: add documentation for LoHa and LoKr fine-tuning methods * Update networks/network_base.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update docs/loha_lokr.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: refactor LoHa and LoKr imports for weight merging in load_safetensors_with_lora_and_fp8 function --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-23 22:09:00 +09:00
Kohya S.	f90fa1a89a	feat: backward compatibility for SD/SDXL latent cache (#2276 ) * fix: improve handling of legacy npz files and add logging for fallback scenarios * fix: simplify fallback handling in SdSdxlLatentsCachingStrategy	2026-02-23 21:44:51 +09:00
Kohya S	892f8be78f	fix: cast input tensor to float32 for improved numerical stability in residual connections	2026-02-23 21:12:57 +09:00
woctordho	50694df3cf	Multi-resolution dataset for SD1/SDXL (#2269 ) * Multi-resolution dataset for SD1/SDXL * Add fallback to legacy key without resolution suffix * Support numpy 2.2	2026-02-23 15:30:36 +09:00
Kohya S	ef051427df	fix: `str is not "no"` to `str != "no"`	2026-02-16 07:58:15 +09:00
Kohya S.	573a7fa06c	Merge pull request #2262 from duongve13112002/fix_lumina Fix bug and optimization for Lumina model	2026-02-16 07:54:49 +09:00
Kohya S.	34e7138b6a	Add/modify some implementation for anima (#2261 ) * fix: update extend-exclude list in _typos.toml to include configs * fix: exclude anima tests from pytest * feat: add entry for 'temperal' in extend-words section of _typos.toml for Qwen-Image VAE * fix: update default value for --discrete_flow_shift in anima training guide * feat: add Qwen-Image VAE * feat: simplify encode_tokens * feat: use unified attention module, add wrapper for state dict compatibility * feat: loading with dynamic fp8 optimization and LoRA support * feat: add anima minimal inference script (WIP) * format: format * feat: simplify target module selection by regular expression patterns * feat: kept caption dropout rate in cache and handle in training script * feat: update train_llm_adapter and verbose default values to string type * fix: use strategy instead of using tokenizers directly * feat: add dtype property and all-zero mask handling in cross-attention in LLMAdapterTransformerBlock * feat: support 5d tensor in get_noisy_model_input_and_timesteps * feat: update loss calculation to support 5d tensor * fix: update argument names in anima_train_utils to align with other archtectures * feat: simplify Anima training script and update empty caption handling * feat: support LoRA format without `net.` prefix * fix: update to work fp8_scaled option * feat: add regex-based learning rates and dimensions handling in create_network * fix: improve regex matching for module selection and learning rates in LoRANetwork * fix: update logging message for regex match in LoRANetwork * fix: keep latents 4D except DiT call * feat: enhance block swap functionality for inference and training in Anima model * feat: refactor Anima training script * feat: optimize VAE processing by adjusting tensor dimensions and data types * fix: wait all block trasfer before siwtching offloader mode * feat: update Anima training guide with new argument specifications and regex-based module selection. 
Thank you Claude! * feat: support LORA for Qwen3 * feat: update Anima SAI model spec metadata handling * fix: remove unused code * feat: split CFG processing in do_sample function to reduce memory usage * feat: add VAE chunking and caching options to reduce memory usage * feat: optimize RMSNorm forward method and remove unused torch_attention_op * Update library/strategy_anima.py Use torch.all instead of all. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update library/safetensors_utils.py Fix duplicated new_key for concat_hook. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update anima_minimal_inference.py Remove unused code. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update anima_train.py Remove unused import. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update library/anima_train_utils.py Remove unused import. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: review with Copilot * feat: add script to convert LoRA format to ComfyUI compatible format (WIP, not tested yet) * feat: add process_escape function to handle escape sequences in prompts * feat: enhance LoRA weight handling in model loading and add text encoder loading function * feat: improve ComfyUI conversion script with prefix constants and module name adjustments * feat: update caption dropout documentation to clarify cache regeneration requirement * feat: add clarification on learning rate adjustments * feat: add note on PyTorch version requirement to prevent NaN loss --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-13 08:15:06 +09:00
Duoong	1640e53392	Fix bug and optimization Lumina training	2026-02-12 22:52:28 +07:00
duongve13112002	e21a7736f8	Support Anima model (#2260 ) * Support Anima model * Update document and fix bug * Fix latent normlization * Fix typo * Fix cache embedding * fix typo in tests/test_anima_cache.py * Remove redundant argument apply_t5_attn_mask * Improving caching with argument caption_dropout_rate * Fix W&B logging bugs * Fix discrete_flow_shift default value	2026-02-08 10:18:55 +09:00
Kohya S.	c6bc632ec6	fix: metadata dataset degradation and make it work (#2186 ) * fix: support dataset with metadata * feat: support another tagger model * fix: improve handling of image size and caption/tag processing in FineTuningDataset * fix: enhance metadata loading to support JSONL format in FineTuningDataset * feat: enhance image loading and processing in ImageLoadingPrepDataset with batch support and output options * fix: improve image path handling and memory management in dataset classes * Update finetune/tag_images_by_wd14_tagger.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix: add return type annotation for process_tag_replacement function and ensure tags are returned * feat: add artist category threshold for tagging * doc: add comment for clarification --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-01-18 15:17:07 +09:00
urlesistiana	f7fc7ddda2	fix #2201 : lumina 2 timesteps handling	2025-10-13 16:08:28 +08:00
Kohya S	5462a6bb24	Merge branch 'dev' into sd3	2025-09-29 21:02:02 +09:00
Kohya S	63711390a0	Merge branch 'main' into dev	2025-09-29 20:56:07 +09:00
Kohya S	60bfa97b19	fix: disable_mmap_safetensors not defined in SDXL TI training	2025-09-29 20:52:48 +09:00
Kohya S.	e7b89826c5	Update library/custom_offloading_utils.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-09-21 13:29:58 +09:00
Kohya S	806d535ef1	fix: block-wise scaling is overwritten by per-tensor scaling	2025-09-21 13:10:41 +09:00
Kohya S	3876343fad	fix: remove print statement for guidance rescale in AdaptiveProjectedGuidance	2025-09-21 13:09:38 +09:00
Kohya S	040d976597	feat: add guidance rescale options for Adaptive Projected Guidance in inference	2025-09-21 13:03:14 +09:00
Kohya S	9621d9d637	feat: add Adaptive Projected Guidance parameters and noise rescaling	2025-09-21 12:34:40 +09:00
Kohya S	f41e9e2b58	feat: add vae_chunk_size argument for memory-efficient VAE decoding and processing	2025-09-21 11:09:37 +09:00
Kohya S	b090d15f7d	feat: add multi backend attention and related update for HI2.1 models and scripts	2025-09-20 19:45:33 +09:00
Kohya S	f834b2e0d4	fix: --fp8_vl to work	2025-09-18 23:46:18 +09:00
Kohya S	f6b4bdc83f	feat: block-wise fp8 quantization	2025-09-18 21:20:54 +09:00
Kohya S	f5b004009e	fix: correct tensor indexing in HunyuanVAE2D class for blending and encoding functions	2025-09-17 21:54:25 +09:00
Kohya S	4e2a80a6ca	refactor: update imports to use safetensors_utils for memory-efficient operations	2025-09-13 21:07:11 +09:00
Kohya S	d831c88832	fix: sample generation doesn't work with block swap	2025-09-13 21:06:04 +09:00
Kohya S	bae7fa74eb	Merge branch 'sd3' into feat-hunyuan-image-2.1-inference	2025-09-13 20:13:58 +09:00
Kohya S.	e1c666e97f	Update library/safetensors_utils.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-09-13 20:03:55 +09:00
Kohya S	8783f8aed3	feat: faster safetensors load and split safetensor utils	2025-09-13 19:51:38 +09:00
Kohya S	209c02dbb6	feat: HunyuanImage LoRA training	2025-09-12 21:40:42 +09:00
Kohya S	a0f0afbb46	fix: revert constructor signature update	2025-09-11 22:27:00 +09:00
Kohya S	7f983c558d	feat: block swap for inference and initial impl for HunyuanImage LoRA (not working)	2025-09-11 22:15:22 +09:00
Kohya S	5149be5a87	feat: initial commit for HunyuanImage-2.1 inference	2025-09-11 12:54:12 +09:00
Kohya S	e836b7f66d	fix: chroma LoRA training without Text Encode caching	2025-08-30 09:30:24 +09:00
Kohya S	6edbe00547	feat: update libraries, remove warnings	2025-08-16 20:07:03 +09:00
Kohya S	351bed965c	fix model type handling in analyze_state_dict_state function for SD3	2025-08-13 21:38:51 +09:00
rockerBOO	9bb50c26c4	Set sai_model_spec to must	2025-08-03 00:43:09 -04:00
rockerBOO	10bfcb9ac5	Remove text model spec	2025-08-03 00:40:10 -04:00
rockerBOO	d24d733892	Update model spec to 1.0.1. Refactor model spec	2025-08-02 21:14:27 -04:00
Kohya S	96feb61c0a	feat: implement modulation vector extraction for Chroma and update related methods	2025-07-30 21:34:49 +09:00
Kohya S	6c8973c2da	doc: add reference link for input vector gradient requirement in Chroma class	2025-07-28 22:08:02 +09:00
Kohya S	9eda938876	Merge branch 'sd3' into feature-chroma-support	2025-07-21 13:32:22 +09:00
Kohya S.	d98400b06e	Merge pull request #2138 from kohya-ss/feature-lumina-image Feature lumina image	2025-07-21 13:21:26 +09:00
Kohya S	0b763ef1f1	feat: fix timestep for input_vec for Chroma	2025-07-20 20:53:06 +09:00

979 Commits
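
Commit ef051427df above replaces `str is not "no"` with `str != "no"`. The log does not show the affected code, so the following is only a minimal sketch of why that change matters; the function names and the "fp16" value are hypothetical and not taken from the repository. `is not` compares object identity, while `!=` compares values, and two equal strings are not guaranteed to be the same object.

```python
# Hypothetical illustration; none of these names come from sd-scripts.
NO = "no"


def enabled_by_identity(mode: str) -> bool:
    # Identity check: asks whether `mode` is the very same object as NO.
    # An equal string built at runtime may be a different object, so this
    # can report True for "no". The result depends on interpreter details.
    return mode is not NO


def enabled_by_value(mode: str) -> bool:
    # Value check: False exactly when the text equals "no".
    return mode != "no"


# Build "no" at runtime, as might happen when a value is parsed from input.
runtime_no = "".join(["n", "o"])

print("identity check:", enabled_by_identity(runtime_no))
print("value check:", enabled_by_value(runtime_no))
print("value check for fp16:", enabled_by_value("fp16"))
```

The value check is the dependable one: it returns False for any string equal to "no", regardless of how that string was created.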