Ctrl K

Back to Discovery

10 models

Groq's LPU inference engine has excelled in the latest independent large language model (LLM) benchmarks, redefining the standards for AI solutions with its remarkable speed and efficiency. Groq represents instant inference speed, demonstrating strong performance in cloud-based deployments.

Supported Models

Llama 3.1 8B (Preview)

llama-3.1-8b-instant

128K

Maximum Context Length

128K

Maximum Output Length

8K

Input Price

$0.05

Output Price

$0.08

Llama 3.1 70B (Preview)

llama-3.1-70b-versatile

128K

Maximum Context Length

128K

Maximum Output Length

8K

Input Price

$0.59

Output Price

$0.79

Llama 3 Groq 8B Tool Use (Preview)

llama3-groq-8b-8192-tool-use-preview

8K

Maximum Context Length

8K

Maximum Output Length

--

Input Price

$0.19

Output Price

$0.19

Llama 3 Groq 70B Tool Use (Preview)

llama3-groq-70b-8192-tool-use-preview

8K

Maximum Context Length

8K

Maximum Output Length

--

Input Price

$0.89

Output Price

$0.89

Using Groq in LobeChat

Using Groq in LobeChat

Groq's LPU Inference Engine has excelled in the latest independent Large Language Model (LLM) benchmark, redefining the standard for AI solutions with its remarkable speed and efficiency. By integrating LobeChat with Groq Cloud, you can now easily leverage Groq's technology to accelerate the operation of large language models in LobeChat.

Groq's LPU Inference Engine achieved a sustained speed of 300 tokens per second in internal benchmark tests, and according to benchmark tests by ArtificialAnalysis.ai, Groq outperformed other providers in terms of throughput (241 tokens per second) and total time to receive 100 output tokens (0.8 seconds).

This document will guide you on how to use Groq in LobeChat:

Obtaining GroqCloud API Keys

First, you need to obtain an API Key from the GroqCloud Console.

Get GroqCloud API Key

Create an API Key in the API Keys menu of the console.

Save GroqCloud API Key

Safely store the key from the pop-up as it will only appear once. If you accidentally lose it, you will need to create a new key.

Configure Groq in LobeChat

You can find the Groq configuration option in Settings -> Language Model, where you can input the API Key you just obtained.

Groq service provider settings

Next, select a Groq-supported model in the assistant's model options, and you can experience the powerful performance of Groq in LobeChat.

Related Providers

OpenAI is a global leader in artificial intelligence research, with models like the GPT series pushing the frontiers of natural language processing. OpenAI is committed to transforming multiple industries through innovative and efficient AI solutions. Their products demonstrate significant performance and cost-effectiveness, widely used in research, business, and innovative applications.

Ollama provides models that cover a wide range of fields, including code generation, mathematical operations, multilingual processing, and conversational interaction, catering to diverse enterprise-level and localized deployment needs.

Anthropic is a company focused on AI research and development, offering a range of advanced language models such as Claude 3.5 Sonnet, Claude 3 Sonnet, Claude 3 Opus, and Claude 3 Haiku. These models achieve an ideal balance between intelligence, speed, and cost, suitable for various applications from enterprise workloads to rapid-response scenarios. Claude 3.5 Sonnet, as their latest model, has excelled in multiple evaluations while maintaining a high cost-performance ratio.

Bedrock is a service provided by Amazon AWS, focusing on delivering advanced AI language and visual models for enterprises. Its model family includes Anthropic's Claude series, Meta's Llama 3.1 series, and more, offering a range of options from lightweight to high-performance, supporting tasks such as text generation, conversation, and image processing for businesses of varying scales and needs.

Google's Gemini series represents its most advanced, versatile AI models, developed by Google DeepMind, designed for multimodal capabilities, supporting seamless understanding and processing of text, code, images, audio, and video. Suitable for various environments from data centers to mobile devices, it significantly enhances the efficiency and applicability of AI models.