
I'm serving AI models on Lambda Labs, and after some trial and error I found that a single vllm server along with caddy, behind Cloudflare DNS, works really well and is really easy to set up:

vllm serve ${MODEL_REPO} --dtype auto --api-key $HF_TOKEN --guided-decoding-backend outlines --disable-fastapi-docs &

sudo caddy reverse-proxy --from ${SUBDOMAIN}.sugaku.net --to localhost:8000 &
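For context, vllm exposes an OpenAI-compatible API, so a request through the proxy looks roughly like this (a sketch using the same ${SUBDOMAIN} and ${MODEL_REPO} placeholders as above, with the --api-key value as the bearer token):

# hypothetical smoke test of the proxied endpoint
curl https://${SUBDOMAIN}.sugaku.net/v1/chat/completions \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "'"${MODEL_REPO}"'", "messages": [{"role": "user", "content": "Hello"}]}'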




It's really best to avoid running web servers as root. It's easy to forward port 80 with iptables, change the kernel knob so unprivileged users can bind port 80 and above, or set the network capability on the binary.

https://stackoverflow.com/questions/413807/
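
For reference, the three options look roughly like this (ports and paths are illustrative, not from the original setup):

# option 1: redirect the privileged port to an unprivileged one caddy listens on
sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080
# option 2: let unprivileged processes bind ports 80 and above
sudo sysctl -w net.ipv4.ip_unprivileged_port_start=80
# option 3: grant the caddy binary the capability to bind low ports
sudo setcap 'cap_net_bind_service=+ep' "$(which caddy)"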


You can use Cloudflare Tunnel, which is even better and simpler than running an extra reverse proxy.
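
Something like this (a sketch; the tunnel name is made up and the hostname reuses the placeholder from above):

# authenticate, create a named tunnel, point DNS at it, and proxy to the local vllm server
cloudflared tunnel login
cloudflared tunnel create vllm
cloudflared tunnel route dns vllm ${SUBDOMAIN}.sugaku.net
cloudflared tunnel run --url http://localhost:8000 vllm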



