vllm.kernels.vllm_c ¶
CUDA_ALIKE module-attribute ¶
Most kernels in this file are supported on all CUDA-alike platforms.
rms_no_var_size module-attribute ¶
rms_no_var_size = (
    lambda x, weight, epsilon, variance_size=None: (
        variance_size is None
        and (weight is None or weight.dtype == x.dtype)
    )
)
The vLLM kernel applies only when no variance_size override is given and the weight, if present, has the same dtype as the input.
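As a minimal sketch of how this predicate behaves, the snippet below evaluates it on stand-in objects. `FakeTensor` is a hypothetical helper introduced only for illustration (it is not part of vLLM), and the predicate is written out under the assumption that the comparison is between the weight's dtype and the input's dtype, as the description above suggests.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor; only the dtype attribute matters here."""
    dtype: str


def rms_no_var_size(x, weight, epsilon, variance_size=None):
    # Assumed reading of the predicate: no variance_size override, and the
    # weight (if any) shares the input's dtype.
    return variance_size is None and (weight is None or weight.dtype == x.dtype)


x = FakeTensor("float16")
print(rms_no_var_size(x, None, 1e-6))                        # True: no weight
print(rms_no_var_size(x, FakeTensor("float16"), 1e-6))       # True: matching dtypes
print(rms_no_var_size(x, FakeTensor("float32"), 1e-6))       # False: dtype mismatch
print(rms_no_var_size(x, FakeTensor("float16"), 1e-6, 64))   # False: variance_size override
```

A caller would presumably use a check like this to decide whether the vLLM kernel can handle the call or whether another implementation is needed; that dispatch logic is not shown on this page.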
gelu_fast ¶
Fast GELU activation function using the vLLM C++ kernel.
Formula: 0.5 * x * (1.0 + tanh(x * 0.7978845608 * (1.0 + 0.044715 * x^2)))
Source code in vllm/kernels/vllm_c.py
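The gelu_fast formula can be checked against a scalar reference in pure Python. This is only a sketch for sanity-checking values: `gelu_fast_ref` and `gelu_exact` are hypothetical names, and the code operates on single floats rather than tensors, so it says nothing about how the C++ kernel itself is implemented.

```python
import math


def gelu_fast_ref(x: float) -> float:
    """Scalar transcription of the gelu_fast formula given above."""
    return 0.5 * x * (1.0 + math.tanh(x * 0.7978845608 * (1.0 + 0.044715 * x * x)))


def gelu_exact(x: float) -> float:
    """Exact GELU via the error function, used here only for comparison."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


for v in (-2.0, -0.5, 0.0, 1.0, 3.0):
    # The tanh form is an approximation, so a small gap to the exact value is expected.
    print(v, gelu_fast_ref(v), gelu_exact(v))
```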
gelu_new ¶
New GELU activation function using the vLLM C++ kernel.
Formula: 0.5 * x * (1.0 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
Source code in vllm/kernels/vllm_c.py
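As written, the gelu_new and gelu_fast formulas are algebraically the same expression: factoring x out of `x + 0.044715 * x^3` gives `x * (1.0 + 0.044715 * x^2)`, and 0.7978845608 is a rounded value of sqrt(2/pi). The sketch below compares scalar transcriptions of both formulas; the function names are hypothetical, and any numerical difference between the actual kernels (for example in lower-precision dtypes) is not covered here.

```python
import math

SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)


def gelu_new_ref(x: float) -> float:
    """Scalar transcription of the gelu_new formula."""
    return 0.5 * x * (1.0 + math.tanh(SQRT_2_OVER_PI * (x + 0.044715 * x ** 3)))


def gelu_fast_ref(x: float) -> float:
    """Scalar transcription of the gelu_fast formula, repeated for comparison."""
    return 0.5 * x * (1.0 + math.tanh(x * 0.7978845608 * (1.0 + 0.044715 * x * x)))


samples = [-3.0, -1.0, 0.0, 0.5, 2.0]
# Largest gap between the two formulas over the sample points; it should be
# on the order of floating-point and constant-rounding error.
max_diff = max(abs(gelu_new_ref(v) - gelu_fast_ref(v)) for v in samples)
print(max_diff)
```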
quick_gelu ¶
Quick GELU activation function using the vLLM C++ kernel.
Formula: x * sigmoid(1.702 * x)
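The quick_gelu formula is simple enough to transcribe directly. The following scalar sketch uses a numerically stable sigmoid so that large negative inputs do not overflow `math.exp`; `sigmoid` and `quick_gelu_ref` are hypothetical helper names for illustration, not part of the module.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, branching on sign to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def quick_gelu_ref(x: float) -> float:
    """Scalar transcription of the quick_gelu formula: x * sigmoid(1.702 * x)."""
    return x * sigmoid(1.702 * x)


for v in (-1000.0, -1.0, 0.0, 1.0, 4.0):
    print(v, quick_gelu_ref(v))
```

Compared with the tanh-based variants, this form needs only one exponential per element, at the cost of a somewhat looser fit to the exact GELU.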