12 alternatives to ExLlamaV2 in the infrastructure category
ExLlamaV2
A fast inference library for running LLMs locally on modern consumer-class GPUs - turboderp-org/exllamav2
Inference performance drives profitability.
Make employees, applications, and networks faster and more secure everywhere, while reducing complexity and cost.
Train, deploy, observe, and evaluate LLMs from a single platform. Lower cost, faster latency, and dedicated support from Inference.net.
Bring your own code, and run CPU, GPU, and data-intensive compute at scale. The serverless platform for AI and data teams.
Serve and scale open-source and custom AI models on the fastest, most reliable inference platform.
An open source framework and developer platform for building, testing, deploying, scaling, and observing agents in production.