Lab Notes: 2025

Sunday, March 9, 2025

Design of Electrolyte Using Deep Eutectic Solvents for High-Performance Rechargeable Nickel-Iodine Batteries

Abstract

Rechargeable nickel-ion batteries (RNiBs) have attracted significant attention because of their high volumetric density, low cost, environmental friendliness, and easy recyclability. In this study, a rechargeable nickel-iodine battery using a rational design of a deep eutectic solvent (DES) electrolyte based on a conversion reaction mechanism is first demonstrated. The rechargeable Ni-I2 battery with the DES electrolyte delivered a specific capacity of 201 mAh g−1 with a coulombic efficiency of 82.5% over 65 cycles at a current density of 0.3 A g−1. The energy storage mechanism can be attributed to I+/I− redox chemistry, which has been validated by ex situ Raman, X-ray photoelectron spectroscopy (XPS) and X-ray absorption spectroscopy (XAS). The study provides an avenue for exploring rechargeable nickel-ion batteries with DES electrolytes based on the conversion reaction mechanism.

https://doi.org/10.1002/smll.202412549

Monday, February 17, 2025

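Deep Learning Model Weight File Formats and Storage Directories

As deep learning models have developed, more and more developers share model weights through GitHub and Hugging Face so that others can download and use them. Different deep learning frameworks, however, have their own storage formats and folder structures, so knowing these conventions helps us find the model we need more quickly.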

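1. Common Deep Learning Weight File Formats

Different deep learning frameworks use different file formats to store model weights. The most common file extensions are: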

Extension	Purpose	Framework
.bin	PyTorch model weights (Hugging Face)	PyTorch
.pth / .pt	PyTorch weights (state_dict or full model)	PyTorch
.safetensors	Safer tensor storage format (no pickle)	PyTorch, Hugging Face
.pb	TensorFlow frozen graph or SavedModel	TensorFlow
.ckpt	TensorFlow or PyTorch checkpoint	TensorFlow, PyTorch
.h5	Keras/TensorFlow weights	TensorFlow, Keras
.tflite	TensorFlow Lite model	TensorFlow Lite
.msgpack	Flax weights (e.g. Hugging Face flax_model.msgpack)	Flax, JAX
.npz	NumPy archive, also used by JAX and Chainer	NumPy, JAX, Chainer
.onnx	ONNX format for cross-framework use	ONNX

When downloading a model from Hugging Face or GitHub, you can use these extensions to identify the model's format and choose a suitable framework to load it.
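
The lookup described above can be sketched in a few lines of Python. This is only an illustration: the dictionary covers a subset of the extensions in the table, and the names `EXTENSION_HINTS` and `guess_framework` are made up for this example.

```python
from pathlib import Path

# A subset of the extensions from the table; the labels are coarse hints only.
EXTENSION_HINTS = {
    ".bin": "PyTorch (Hugging Face)",
    ".pth": "PyTorch",
    ".pt": "PyTorch",
    ".safetensors": "safetensors (PyTorch, Hugging Face)",
    ".pb": "TensorFlow",
    ".ckpt": "TensorFlow or PyTorch checkpoint",
    ".h5": "Keras/TensorFlow",
    ".tflite": "TensorFlow Lite",
    ".onnx": "ONNX",
}


def guess_framework(filename: str) -> str:
    """Return a framework hint for a weight file, or 'unknown'."""
    suffix = Path(filename).suffix.lower()
    return EXTENSION_HINTS.get(suffix, "unknown")


print(guess_framework("pytorch_model-00001-of-00003.bin"))
print(guess_framework("model.onnx"))
```

An extension is only a hint. A `.ckpt` file, for example, can come from either TensorFlow or PyTorch, so the project's README or config files should have the final say.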

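2. Storage Directories for Deep Learning Model Weights

Frameworks and projects usually keep model weights in specific directories. The most common layouts and storage locations are:

(1) Hugging Face (`transformers`, `diffusers`, `sentence-transformers`, etc.)

Hugging Face models are usually stored in a model-related folder, for example: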

/model
  ├── config.json
  ├── pytorch_model.bin  # PyTorch weights
  ├── model.safetensors  # SafeTensors weights
  ├── tf_model.h5        # TensorFlow weights
  ├── tokenizer.json
  ├── special_tokens_map.json

Some large models (such as LLaMA) split their weights into several sharded .bin files:

/model
  ├── pytorch_model-00001-of-00003.bin
  ├── pytorch_model-00002-of-00003.bin
  ├── pytorch_model-00003-of-00003.bin
  ├── tokenizer.json
  ├── config.json

📌 Related directories: /model/, /weights/, /checkpoints/, /snapshots/
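
To see which weight files a downloaded model folder contains, single-file or sharded, a small standard-library helper is enough. The sketch below is hypothetical: `find_weight_files` is not part of any Hugging Face library, and preferring `.safetensors` over `.bin` is an assumption of this example.

```python
from pathlib import Path


def find_weight_files(model_dir: str) -> list[str]:
    """List weight files in a model folder, preferring .safetensors over .bin.

    Sharded files such as pytorch_model-00001-of-00003.bin come back in
    sorted order, so the shards appear in sequence.
    """
    root = Path(model_dir)
    for pattern in ("*.safetensors", "*.bin"):
        matches = sorted(p.name for p in root.glob(pattern))
        if matches:
            return matches
    return []
```

Calling `find_weight_files("/model")` on the sharded layout above would return the three `.bin` shards in order. In practice the framework's own loader resolves these files for you, so this helper is only useful for inspection.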

(2) PyTorch (project layout commonly seen on GitHub)

PyTorch model weights are usually stored in a weights or checkpoints directory:

/project_root
  ├── models/
  │   ├── model.py
  │   ├── __init__.py
  ├── weights/
  │   ├── best_model.pth
  │   ├── last_checkpoint.pth
  ├── checkpoints/
  │   ├── epoch_10.pth
  │   ├── epoch_20.pth

📌 Related directories: /weights/, /checkpoints/, /models/, /logs/
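
With per-epoch files like `epoch_10.pth` and `epoch_20.pth`, a common task is finding the most recent checkpoint. Here is a minimal sketch, assuming the `epoch_N.pth` naming shown above; the helper name `latest_epoch_checkpoint` is made up for this example.

```python
import re
from pathlib import Path
from typing import Optional

EPOCH_PATTERN = re.compile(r"epoch_(\d+)\.pth$")


def latest_epoch_checkpoint(checkpoint_dir: str) -> Optional[Path]:
    """Return the epoch_N.pth file with the largest N, or None if there is none."""
    best_epoch, best_path = -1, None
    for path in Path(checkpoint_dir).glob("epoch_*.pth"):
        match = EPOCH_PATTERN.search(path.name)
        # Compare epochs as integers so that epoch_9 sorts before epoch_20.
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path
```

The integer comparison matters: sorting the file names as plain strings would rank `epoch_9.pth` after `epoch_20.pth`.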

(3) TensorFlow/Keras

TensorFlow and Keras weights are usually stored in a checkpoints or saved_model directory:

/project_root
  ├── checkpoints/
  │   ├── model.ckpt.index
  │   ├── model.ckpt.data-00000-of-00001
  │   ├── checkpoint
  ├── saved_model/
  │   ├── assets/
  │   ├── variables/
  │   ├── saved_model.pb

📌 Related directories: /checkpoints/, /saved_model/, /logs/
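
The two layouts above can be told apart by their contents: a SavedModel folder holds `saved_model.pb` plus a `variables/` subfolder, while a checkpoint folder holds `.index` and `.data-*` files. The following sketch checks for those markers; `describe_tf_directory` is a hypothetical helper, not a TensorFlow API.

```python
from pathlib import Path


def describe_tf_directory(directory: str) -> str:
    """Classify a folder as 'saved_model', 'checkpoint', or 'unknown'."""
    root = Path(directory)
    # SavedModel: a saved_model.pb file next to a variables/ folder.
    if (root / "saved_model.pb").is_file() and (root / "variables").is_dir():
        return "saved_model"
    # Checkpoint: at least one *.index file (paired with *.data-* shards).
    if any(root.glob("*.index")):
        return "checkpoint"
    return "unknown"
```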

(4) ONNX (cross-framework models)

ONNX models are usually stored in an onnx_models or exported_models directory:

/project_root
  ├── onnx_models/
  │   ├── model.onnx
  ├── exported_models/
  │   ├── model.onnx

📌 Related directories: /onnx_models/, /exported_models/

(5) Diffusion models (Stable Diffusion, ControlNet)

Diffusion models usually use the .safetensors or .ckpt format and are stored in a models directory:

/stable-diffusion
  ├── models/
  │   ├── stable-diffusion-v1-4.ckpt
  │   ├── stable-diffusion-v2.safetensors
  ├── configs/
  │   ├── v1-inference.yaml

📌 Related directories: /models/, /diffusion_models/

Summary

Framework/type	Common storage directories
Hugging Face	/model/, /weights/, /checkpoints/, /snapshots/
PyTorch	/weights/, /checkpoints/, /models/, /logs/
TensorFlow	/checkpoints/, /saved_model/, /logs/
ONNX	/onnx_models/, /exported_models/
Diffusion models	/models/, /diffusion_models/

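Monday, January 13, 2025

Sparsity in Deep Learning: Boosting Efficiency or Weakening Capability?

In deep learning, sparsity is a key concept: it means that many of the values in the data or in the model parameters are zero. This property can improve computational efficiency, reduce memory requirements, and even improve a model's ability to generalize. But does that mean "the more zeros, the better"? In fact, the key is to control sparsity appropriately so as to reach the best balance. This post introduces several common types of sparsity in deep learning and their effects in practice.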

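1. Types of Sparsity and Their Applications

(1) Parameter sparsity (Model Sparsity)

This means that most entries of a neural network's weight matrices are zero. It can be achieved through L1 regularization (Lasso) or pruning. Low-rank factorization is a related compression technique, although it reduces the number of parameters rather than setting weights to zero.

Example: suppose a neural network has the following weight matrix: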

W = \begin{bmatrix} 0.5 & 0 & 0.2 \\ 0 & 0 & 0 \\ -0.3 & 0 & 0.8 \end{bmatrix}

Here 5 of the 9 parameters are zero, a sparsity of about 55.6%. Such a matrix needs less storage, and sparse matrix operations can speed up computation.
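
Sparsity here is just the fraction of entries that are exactly zero, which is easy to verify by counting. The snippet below is a plain-Python sketch for the matrix $W$ above; the function name `sparsity` is chosen for this example.

```python
def sparsity(matrix):
    """Fraction of entries in a 2-D list that are exactly zero."""
    values = [v for row in matrix for v in row]
    return sum(1 for v in values if v == 0) / len(values)


W = [
    [0.5, 0.0, 0.2],
    [0.0, 0.0, 0.0],
    [-0.3, 0.0, 0.8],
]

zero_count = sum(1 for row in W for v in row if v == 0)
print(zero_count, f"{sparsity(W):.1%}")
```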

(2) Activation sparsity (Activation Sparsity)

With the ReLU (Rectified Linear Unit) activation function, negative inputs become 0, so many neurons go "silent".

Example: input matrix $X$:

X = \begin{bmatrix} 2 & -1 & 0.5 \\ -3 & 0 & 4 \\ 1 & -2 & -0.7 \end{bmatrix}

After applying ReLU:

\text{ReLU}(X) = \begin{bmatrix} 2 & 0 & 0.5 \\ 0 & 0 & 4 \\ 1 & 0 & 0 \end{bmatrix}

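ReLU turned the 4 negative entries into zeros. Together with the entry that was already 0 in $X$, the output has 5 zeros out of 9 elements, a sparsity of about 55.6%. This helps reduce computation, but if too many neurons become 0, the model's ability to learn may suffer.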
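
The ReLU example can be reproduced with a few lines of plain Python. The sketch below separates the entries that ReLU newly sets to zero (the negative inputs) from the total number of zeros in the output; the variable names are chosen for this example.

```python
def relu(matrix):
    """Apply ReLU element-wise: negative values become 0."""
    return [[max(0.0, v) for v in row] for row in matrix]


X = [
    [2, -1, 0.5],
    [-3, 0, 4],
    [1, -2, -0.7],
]

activated = relu(X)
newly_zeroed = sum(1 for row in X for v in row if v < 0)       # zeros created by ReLU
total_zeros = sum(1 for row in activated for v in row if v == 0)  # all zeros in the output
print(activated)
print(newly_zeroed, total_zeros)
```

Distinguishing the two counts matters when measuring activation sparsity: the sparsity of the output includes any zeros that were already present in the input.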

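(3) Feature sparsity (Feature Sparsity)

This means the input data itself is sparse, as with the bag-of-words (BoW) model in natural language processing (NLP) or the user-item interaction matrix in recommender systems.

Example: a word-count vector (Bag of Words):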

\text{BoW} = \begin{bmatrix} 1 & 0 & 0 & 5 & 0 & 0 & 2 \end{bmatrix}

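There are only 3 nonzero values, meaning the text contains only 3 distinct words from the vocabulary. Sparse features like this can be stored compactly and processed more efficiently.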
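
One simple way to store such a vector compactly is to keep only the nonzero positions. The sketch below uses a plain dictionary of `{index: count}`; real libraries use dedicated sparse formats, so treat this as an illustration of the idea only.

```python
def to_sparse(dense):
    """Keep only the nonzero entries as {index: count}."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def to_dense(sparse, length):
    """Rebuild the full vector, filling missing positions with 0."""
    return [sparse.get(i, 0) for i in range(length)]


bow = [1, 0, 0, 5, 0, 0, 2]
sparse_bow = to_sparse(bow)
print(sparse_bow)  # {0: 1, 3: 5, 6: 2}
```

The round trip `to_dense(sparse_bow, len(bow))` gives back the original vector, so nothing is lost by dropping the zeros.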

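(4) Gradient sparsity (Gradient Sparsity)

During training, the gradients of some weights may be close to 0, meaning that those weights contribute little to the loss function.

Example: a gradient matrix: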

\text{Gradient} = \begin{bmatrix} 0.01 & 0 & -0.02 \\ 0 & 0 & 0 \\ 0 & 0.03 & 0 \end{bmatrix}

In distributed training, transmitting only the nonzero gradients reduces communication cost and improves efficiency.
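
The idea of sending only the significant gradients can be sketched as an encode/decode pair. This is a simplified illustration, not how any particular distributed-training library does it; the function names and the `threshold` parameter are assumptions of this example.

```python
def encode_sparse(grad, threshold=0.0):
    """Return (flat_index, value) pairs for entries with |value| > threshold."""
    cols = len(grad[0])
    return [
        (r * cols + c, v)
        for r, row in enumerate(grad)
        for c, v in enumerate(row)
        if abs(v) > threshold
    ]


def decode_sparse(pairs, rows, cols):
    """Rebuild a dense rows x cols gradient from (flat_index, value) pairs."""
    grad = [[0.0] * cols for _ in range(rows)]
    for flat_index, value in pairs:
        grad[flat_index // cols][flat_index % cols] = value
    return grad


gradient = [
    [0.01, 0, -0.02],
    [0, 0, 0],
    [0, 0.03, 0],
]

payload = encode_sparse(gradient)
print(payload)  # [(0, 0.01), (2, -0.02), (7, 0.03)]
```

Only 3 pairs are sent instead of 9 values. Raising `threshold` above 0 also drops near-zero gradients, at the cost of an approximation.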

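(5) Attention sparsity (Sparse Attention)

In Transformer models (such as BERT and GPT), self-attention costs $O(n^2)$ in the sequence length $n$. With sparse attention, the model focuses on the key information and reduces the amount of computation.

Example: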

A = \begin{bmatrix} 0.1 & 0.3 & 0.05 & 0.02 \\ 0.0 & 0.6 & 0.0 & 0.0 \\ 0.0 & 0.2 & 0.0 & 0.0 \\ 0.05 & 0.0 & 0.0 & 0.4 \end{bmatrix}

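Half of the entries in this matrix are zero, so those positions need not be computed (the values are illustrative, and the rows are not normalized). A design like this lowers the computational cost of the Transformer and improves efficiency.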
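
To make the saving concrete, here is a sketch of one common sparse pattern, sliding-window (local) attention, where each token attends only to neighbours within a fixed distance. The post does not say which pattern it has in mind, so the choice of pattern and the names below are assumptions of this example.

```python
def sliding_window_mask(n, window):
    """mask[i][j] is True when token i may attend to token j (|i - j| <= window)."""
    return [[abs(i - j) <= window for j in range(n)] for i in range(n)]


def attended_pairs(mask):
    """Number of (query, key) pairs that actually need to be computed."""
    return sum(sum(row) for row in mask)


n = 8
mask = sliding_window_mask(n, window=1)
print(attended_pairs(mask), "of", n * n, "pairs")
```

With a fixed window the number of computed pairs grows roughly linearly in $n$ instead of quadratically, which is where the efficiency gain comes from.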

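2. Are More Zeros Always Better?

Many people ask: "If more parameters become 0, does that mean a better model?" The answer is no. Excessive sparsity causes information loss and hurts model performance.

Effects of appropriate versus excessive sparsity

Use case	Benefit of appropriate sparsity	Risk of excessive sparsity
Model compression (pruning)	Smaller model, faster computation	Weaker expressive power, lower accuracy
ReLU activation	Filters out uninformative signals, improves efficiency	Too many neurons become 0, which hampers learning
Sparse attention in NLP	Attends only to important words, improves efficiency	Ignores important words, which hurts understanding
Recommender systems (feature sparsity)	Faster computation, lower storage needs	Missing important interaction information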

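3. How to Control Sparsity?

To let a model benefit from sparsity without hurting its ability to learn too much, consider the following approaches:

Adjust the pruning ratio step by step (for example 30%, 50%, 70%) and test the effect at each level.
Use L1 regularization to encourage, but not force, zero values.
Use dynamic sparsity techniques (Dynamic Sparsity) that let the model choose which parts to make sparse during training.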
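
The first approach, stepping through pruning ratios, can be sketched with simple magnitude pruning: zero out the fraction of weights with the smallest absolute value. Magnitude pruning is one common choice rather than the only one, and the weight values below are hypothetical numbers made up for this example.

```python
def magnitude_prune(weights, ratio):
    """Zero out the `ratio` fraction of weights with the smallest absolute value."""
    k = round(len(weights) * ratio)  # number of weights to remove
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:
        pruned[i] = 0.0
    return pruned


# Hypothetical weights, flattened into a single list for simplicity.
weights = [0.5, -0.05, 0.2, 0.01, -0.3, 0.08, 0.8, -0.02, 0.1, 0.4]

for ratio in (0.3, 0.5, 0.7):
    pruned = magnitude_prune(weights, ratio)
    kept = sum(1 for w in pruned if w != 0.0)
    print(f"ratio={ratio:.0%} kept={kept} pruned={pruned}")
```

In a real experiment each pruned model would then be evaluated on validation data, so that the ratio can be chosen where accuracy is still acceptable.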

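Conclusion

Sparsity is a powerful tool that can improve the computational efficiency of deep learning, but the idea that "the more zeros, the better" is wrong. The key is to balance sparsity against the model's expressive power in order to get the best trade-off between efficiency and accuracy.

Are you using sparsity to speed up your deep learning models? Feel free to share your experience in the comments!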
