LLMPerf-v2.0

@avnishn released this 06 Dec 01:35

The latest version of LLMPerf brings a set of updates that make benchmarking of LLM inference more detailed and more customizable. These updates include:

  • Expanded metrics with quantile distributions (P25-P99): A fuller picture of performance than averages alone.
  • Customizable benchmarking parameters: Tailor benchmark runs to specific use case scenarios.
  • New load test and correctness test: Assess performance under stress and verify output accuracy.
  • Broad compatibility: Supports a range of products, including Anyscale Endpoints, OpenAI, Anthropic, together.ai, Fireworks.ai, Perplexity, Huggingface, Lepton AI, and the various APIs supported by the LiteLLM project.
  • Easy addition of new LLMs via the LLMClient API (see the sketch after this list).
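
To give a feel for what adding a new LLM involves, here is a minimal sketch of a client for a hypothetical HTTP completion endpoint. The class name, the llm_request method, and the request/response fields below are illustrative assumptions, not the exact LLMPerf interface; consult the repository for the real LLMClient signatures.

# Hypothetical sketch of an LLMClient-style backend for a fictional provider.
# Names and fields are assumptions for illustration only; see the LLMPerf
# repository for the actual API.
import time
import requests


class MyProviderClient:
    """Illustrative client for a fictional HTTP completion endpoint."""

    def __init__(self, api_base: str, api_key: str):
        self.api_base = api_base
        self.api_key = api_key

    def llm_request(self, prompt: str, max_tokens: int) -> dict:
        """Send one completion request and return the text plus basic timing."""
        start = time.monotonic()
        resp = requests.post(
            f"{self.api_base}/v1/completions",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"prompt": prompt, "max_tokens": max_tokens},
            timeout=60,
        )
        resp.raise_for_status()
        body = resp.json()
        end = time.monotonic()
        return {
            "generated_text": body.get("text", ""),
            "end_to_end_latency_s": end - start,
        }

A client along these lines, once registered with the benchmark runner, could then be exercised by the load and correctness tests in the same way as the built-in backends.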

The old LLMPerf code base can be found in the llmperf-legacy repo.