source code model weights

Qwen-VL is an LMM class model developed by Baidu.

Qwen2-VL

Use vllm to host an inference server and use AWQ quantization. At time of writing (28/9/24) you need to install the latest vllm from source.

python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-VL-2B-Instruct --model Qwen/Qwen2-VL-2B-Instruct-AWQ