Deploy any HuggingFace model with vLLM on Modal in minutes. A production-ready template for running custom LLM inference endpoints on serverless GPUs. Just change the `MODEL_NAME` environment variable to serve a different model.
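The pattern above can be sketched as a minimal Modal app. This is an illustrative example, not the template itself: the app name, GPU type, default model, and the `generate` function are assumptions, and running it requires a Modal account (`modal run app.py`) plus GPU quota, so it is not locally executable as-is.

```python
# app.py — minimal sketch: vLLM inference on a Modal serverless GPU.
# MODEL_NAME, the GPU type, and the default model are illustrative.
import os

import modal

# Swap models by changing this environment variable at deploy time.
MODEL_NAME = os.environ.get("MODEL_NAME", "Qwen/Qwen2.5-0.5B-Instruct")

# Container image with vLLM installed.
image = modal.Image.debian_slim().pip_install("vllm")
app = modal.App("vllm-inference", image=image)

@app.function(gpu="A10G", timeout=600)
def generate(prompt: str) -> str:
    # Import inside the function so it resolves in the remote container.
    from vllm import LLM, SamplingParams

    llm = LLM(model=MODEL_NAME)
    params = SamplingParams(max_tokens=128, temperature=0.7)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text

@app.local_entrypoint()
def main():
    print(generate.remote("Write a haiku about GPUs."))
```

Note that this sketch reloads the model on every call; a production template would typically use a Modal class with the model loaded once in a container-startup hook so warm containers reuse the loaded weights.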