
[quantization] Introduce wrapper for Qwen3VLVisionRotaryEmbedding #496

Draft

dvsav wants to merge 1 commit into Samsung:main from dvsav:quant_vision_rotary_embed

Conversation

@dvsav
Contributor

@dvsav dvsav commented Feb 16, 2026

This change introduces the QuantQwen3VLVisionRotaryEmbedding wrapper to support post-training quantization of the Qwen3VLVisionRotaryEmbedding module.

Why?

The Qwen3VLVisionRotaryEmbedding module is used in the image encoder of the Qwen model.
Attempting to quantize Qwen3VLVisionRotaryEmbedding via PTQ raises the exception PTQQuantizer: no quantization wrapper for Qwen3VLVisionRotaryEmbedding.

What

This change introduces:

  • Class QuantQwen3VLVisionRotaryEmbedding (tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_rotary_embedding.py).
  • Unit tests: class TestQuantQwen3VLVisionRotaryEmbedding (test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py), skipped if the transformers package is not installed.
  • A new entry tico.quantization.wrapq.wrappers.qwen_vl.quant_vision_rotary_embedding in _CORE_MODULES (tico/quantization/wrapq/wrappers/registry.py).
  • An example of Qwen3VLVisionRotaryEmbedding quantization and conversion to Circle (tico/quantization/wrapq/examples/qwen/quantize_qwen_vision_rotary_embedding.py).

Unit Tests

Unit test results with coverage information:

$ coverage run -m pytest test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py -v
======================================================================================= test session starts ========================================================================================
platform linux -- Python 3.10.12, pytest-8.4.0, pluggy-1.6.0 -- /home/d.savchenkov/myenv/bin/python3
cachedir: .pytest_cache
rootdir: /home/d.savchenkov/TICO
configfile: pyproject.toml
plugins: anyio-4.12.0, mock-3.15.1, xdist-3.7.0, cov-6.2.1
collected 10 items                                                                                                                                                                                 

test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_activation_stats_collected PASSED                                [ 10%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_different_sequence_lengths PASSED                                [ 20%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_dtype_override             PASSED                                [ 30%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_frequency_values_correct   PASSED                                [ 40%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_mode_transitions           PASSED                                [ 50%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_no_learnable_parameters    PASSED                                [ 60%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_observer_count             PASSED                                [ 70%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_output_shape               PASSED                                [ 80%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_quantised_output_close     PASSED                                [ 90%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_rotary_embedding.py::TestQuantQwen3VLVisionRotaryEmbedding::test_registration_in_registry   PASSED                                [100%]

================================================================================== 10 passed, 2 warnings in 6.43s ==================================================================================

Coverage info (irrelevant files skipped):

$ coverage report -m
Name                                                                        Stmts   Miss  Cover   Missing
---------------------------------------------------------------------------------------------------------
...
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_rotary_embedding.py      24      0   100%
...
---------------------------------------------------------------------------------------------------------
TOTAL                                                                       10227   6595    36%

@dvsav
Contributor Author

dvsav commented Feb 16, 2026

For Reviewers

Below is the source code of the Qwen3VLVisionRotaryEmbedding module, which can be used to check the correctness of the QuantQwen3VLVisionRotaryEmbedding implementation:

# transformers/models/qwen3_vl/modeling_qwen3_vl.py

class Qwen3VLVisionRotaryEmbedding(nn.Module):
    inv_freq: torch.Tensor  # fix linting for `register_buffer`

    def __init__(self, dim: int, theta: float = 10000.0) -> None:
        super().__init__()
        self.dim = dim
        self.theta = theta
        inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    def forward(self, seqlen: int) -> torch.Tensor:
        seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype)
        freqs = torch.outer(seq, self.inv_freq)
        return freqs
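As a quick sanity check of any reimplementation, the frequency table above can be reproduced in plain Python (no torch); `rotary_freqs` is a helper written just for this comment, not part of transformers:

```python
def rotary_freqs(seqlen: int, dim: int, theta: float = 10000.0):
    # inv_freq[i] = 1 / theta**(2*i / dim) for i in range(dim // 2),
    # mirroring `1.0 / (theta ** (torch.arange(0, dim, 2) / dim))` above.
    inv_freq = [1.0 / theta ** (2.0 * i / dim) for i in range(dim // 2)]
    # freqs = outer(arange(seqlen), inv_freq)
    return [[s * f for f in inv_freq] for s in range(seqlen)]

freqs = rotary_freqs(seqlen=3, dim=4)
print(freqs[0])     # [0.0, 0.0] -- row 0 is always zero
print(freqs[1][0])  # 1.0 -- row 1 equals inv_freq, and inv_freq[0] = 1/theta**0
```

A wrapper's float-mode output can be compared elementwise against this table for a few (seqlen, dim) pairs.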

@mhs4670go
Contributor

Are #496 and #498 necessary? You can refer to the llama decoder layer wrapper for position embedding creation.

@dayo09
Contributor

dayo09 commented Feb 20, 2026

FYI, we used to create positional embeddings "statically": when we run transformers on devices, we fix the sequence length for efficiency.

By fixing the vision embedding's input seq_len, Qwen3VLVisionRotaryEmbedding folds into a constant table.
And the rotary embedding generation logic that follows it (https://github.com/huggingface/transformers/blob/1618d44b9295361607ec74d7be860ba886aac039/src/transformers/models/qwen3_vl/modeling_qwen3_vl.py#L658) can be folded into a constant too.

As @mhs4670go mentioned, it seems that when you implement the wrapq class for the parent class (Qwen3VLVisionModel), the logic above will be treated as one constant, needing no wrapper.

@dayo09
Contributor

dayo09 commented Feb 20, 2026

To fill in more detail:

By casting seq_len to a Python integer in the code, torch treats it as a "specialized variable" (see #431 (comment)).
Even in this PR, the generated Circle model for Qwen3VLVisionRotaryEmbedding is simply a "weight": because forward takes seq_len as an integer, the whole computation is folded into a constant.
Likewise, try fixing seq_len in your Qwen3VLTextRotaryEmbedding code (#498). You will see that a static table is created based on the given seq_len.
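The static-table idea can be sketched in plain Python (the class name StaticRotaryTable is hypothetical, for illustration only): once seq_len is fixed to a Python int at construction time, the whole table becomes a constant that an exporter can fold into a single weight.

```python
class StaticRotaryTable:
    """Hypothetical sketch: precompute the rotary table for a FIXED seq_len."""

    def __init__(self, seq_len: int, dim: int, theta: float = 10000.0):
        inv_freq = [1.0 / theta ** (2.0 * i / dim) for i in range(dim // 2)]
        # seq_len is a plain Python int here, so the table is fully determined
        # at construction time -- effectively a constant weight.
        self.table = [[s * f for f in inv_freq] for s in range(seq_len)]

    def __call__(self):
        # Nothing to compute at runtime; just return the precomputed constant.
        return self.table

table = StaticRotaryTable(seq_len=8, dim=4)()
print(len(table), len(table[0]))  # 8 2
```

This mirrors what specialization does implicitly: with seq_len baked in, there is no runtime input left to quantize, so no wrapper is needed.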

@dvsav dvsav marked this pull request as draft March 4, 2026 06:48
This change introduces QuantQwen3VLVisionRotaryEmbedding wrapper to support post-training quantization of Qwen3VLVisionRotaryEmbedding module.

TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
@dvsav dvsav force-pushed the quant_vision_rotary_embed branch from d614df6 to f5d54ff on March 4, 2026 07:49