vllm.ir.ops ¶
Modules:
| Name | Description |
|---|---|
| activation | Activation function ops. |
| layernorm | Layer normalization ops. |
gelu_fast ¶
Fast GELU activation function.
Formula: 0.5 * x * (1.0 + tanh(x * 0.7978845608 * (1.0 + 0.044715 * x^2)))
A computationally efficient approximation of the GELU function.
Source code in vllm/ir/ops/activation.py
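A pure-Python sketch of the formula above (a scalar reference, not vLLM's implementation, which operates on tensors):

```python
import math

def gelu_fast(x: float) -> float:
    # Fast GELU: 0.5 * x * (1 + tanh(x * 0.7978845608 * (1 + 0.044715 * x^2)))
    # 0.7978845608 is a precomputed sqrt(2/pi).
    return 0.5 * x * (1.0 + math.tanh(x * 0.7978845608 * (1.0 + 0.044715 * x * x)))
```

For example, `gelu_fast(1.0)` evaluates to roughly 0.841, close to the exact GELU value at 1.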
gelu_new ¶
New GELU activation function.
Formula: 0.5 * x * (1.0 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
This is the GELU approximation used in GPT-2 and other transformer models.
Source code in vllm/ir/ops/activation.py
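A scalar sketch of the GPT-2 approximation above (for illustration only; the vLLM op works on tensors):

```python
import math

def gelu_new(x: float) -> float:
    # GPT-2 GELU: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

Mathematically this is the same approximation as `gelu_fast`, just written with `sqrt(2/pi)` factored differently.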
quick_gelu ¶
Quick GELU activation function.
Formula: x * sigmoid(1.702 * x)
A fast approximation of GELU used in various transformer models. Reference: https://github.com/huggingface/transformers/blob/main/src/transformers/activations.py#L90
Source code in vllm/ir/ops/activation.py
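A scalar sketch of the formula above (not the vLLM tensor op), writing the sigmoid out explicitly:

```python
import math

def quick_gelu(x: float) -> float:
    # Quick GELU: x * sigmoid(1.702 * x), with sigmoid(z) = 1 / (1 + exp(-z)).
    return x / (1.0 + math.exp(-1.702 * x))
```

It trades a tanh for a single sigmoid, which is why it is the cheapest of the three approximations; `quick_gelu(1.0)` is about 0.846 versus roughly 0.841 for the tanh-based variants.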
rms_norm ¶
rms_norm(
x: Tensor,
weight: Tensor | None,
epsilon: float,
variance_size: int | None = None,
) -> Tensor
Weighted root-mean-square layer normalization. Normalizes `x` by its root mean square plus `epsilon`, then optionally scales by `weight`.
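A minimal pure-Python sketch of RMS norm over a flat list, matching the documented signature; the interpretation of `variance_size` as "use only the first `variance_size` elements when estimating the mean square" is an assumption, not confirmed by the source:

```python
import math

def rms_norm(x, weight, epsilon, variance_size=None):
    # Reference sketch of weighted RMS layer normalization (not vLLM's
    # tensor implementation). `variance_size` limiting the variance window
    # is an assumption made for illustration.
    n = variance_size if variance_size is not None else len(x)
    rms = math.sqrt(sum(v * v for v in x[:n]) / n + epsilon)
    out = [v / rms for v in x]
    if weight is not None:
        out = [o * w for o, w in zip(out, weight)]
    return out
```

With `x = [2.0, 2.0]`, `weight = [0.5, 0.5]`, and `epsilon = 0.0`, the RMS is 2, so each element normalizes to 1 and scales to 0.5.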