[quantization] Introduce wrapper for Qwen3VLVisionRotaryEmbedding#496
dvsav wants to merge 1 commit into Samsung:main
Conversation
For Reviewers

Below is the source code of `transformers/models/qwen3_vl/modeling_qwen3_vl.py`:
```python
class Qwen3VLVisionRotaryEmbedding(nn.Module):
    inv_freq: torch.Tensor  # fix linting for `register_buffer`

    def __init__(self, dim: int, theta: float = 10000.0) -> None:
        super().__init__()
        self.dim = dim
        self.theta = theta
        inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    def forward(self, seqlen: int) -> torch.Tensor:
        seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype)
        freqs = torch.outer(seq, self.inv_freq)
        return freqs
```
FYI, we used to create positional embeddings "statically". When we run transformers on devices, we fix the sequence length for efficiency. By fixing the vision embedding's input seq_len, Qwen3VLVisionRotaryEmbedding folds into a constant table. @mhs4670go mentioned that, as a result, when you implement the wrapq class for the upper class (`Qwen3VLVisionModel`), the logic above will be treated as one constant and needs no wrapper.

To add more context: by casting seq_len to a Python integer in the code, torch treats it as a 'specialized variable' (see: #431 (comment)).
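To illustrate the constant-folding argument, here is a minimal pure-Python sketch (not TICO or transformers code; `rope_freq_table` is a hypothetical helper mirroring `forward` above): once `seqlen` is specialized to a fixed Python integer, the whole frequency table depends only on constants and can be baked in at export time.

```python
def rope_freq_table(seqlen: int, dim: int, theta: float = 10000.0):
    # Mirrors Qwen3VLVisionRotaryEmbedding.forward for a *fixed* seqlen:
    #   inv_freq[i] = theta ** (-2i / dim)
    #   freqs[p][i] = p * inv_freq[i]
    inv_freq = [1.0 / (theta ** (i / dim)) for i in range(0, dim, 2)]
    return [[p * f for f in inv_freq] for p in range(seqlen)]

# With seqlen fixed, the result is fully determined here, so an exporter
# can replace the whole module with this precomputed table.
TABLE = rope_freq_table(seqlen=4, dim=8)
```

This is why quantizing the wrapped upper module can treat the rotary embedding as a single constant tensor.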
This change introduces QuantQwen3VLVisionRotaryEmbedding wrapper to support post-training quantization of Qwen3VLVisionRotaryEmbedding module.

TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
force-pushed from d614df6 to f5d54ff
This change introduces `QuantQwen3VLVisionRotaryEmbedding` wrapper to support post-training quantization of the `Qwen3VLVisionRotaryEmbedding` module.

Why?

The `Qwen3VLVisionRotaryEmbedding` module is used in the image encoder of the Qwen model. Trying to quantize `Qwen3VLVisionRotaryEmbedding` via PTQ raises the exception `PTQQuantizer: no quantization wrapper for Qwen3VLVisionRotaryEmbedding`.

What

This change introduces:
- `QuantQwen3VLVisionRotaryEmbedding` (`tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_rotary_embedding.py`).
- `class TestQuantQwen3VLVisionRotaryEmbedding` (`test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py`), skipped if the `transformers` package is not installed.
- Registration of `tico.quantization.wrapq.wrappers.qwen_vl.quant_vision_rotary_embedding` in `_CORE_MODULES` (`tico/quantization/wrapq/wrappers/registry.py`).
- Example of `Qwen3VLVisionRotaryEmbedding` quantization and conversion to Circle (`tico/quantization/wrapq/examples/qwen/quantize_qwen_vision_rotary_embedding.py`).

Unit Tests
Unit tests results with coverage information:
Coverage info (irrelevant files skipped):
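For context on what post-training quantization of such a constant-folded table amounts to, here is a minimal sketch in plain Python (hypothetical helper names; this is not the wrapq API): affine quantization of a float table to 8-bit integers with a scale and offset.

```python
def quantize_table(table, num_bits: int = 8):
    # Affine PTQ of a 2-D list of floats:
    #   q = round((v - lo) / scale), with scale = (hi - lo) / (2**bits - 1)
    flat = [v for row in table for v in row]
    lo, hi = min(flat), max(flat)
    qmax = (1 << num_bits) - 1
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = [[round((v - lo) / scale) for v in row] for row in table]
    return q, scale, lo

def dequantize_table(q, scale, lo):
    # Inverse mapping: v ≈ q * scale + lo
    return [[v * scale + lo for v in row] for row in q]
```

The round-trip error of this scheme is bounded by half a quantization step (`scale / 2`), which is the usual correctness check for a PTQ wrapper over a constant tensor.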