Maintained by deepset

Integration: Groq

Use open Language Models served by Groq

Authors
deepset

Overview

Groq is an AI company that has developed the Language Processing Unit (LPU), a high-performance engine designed for fast inference of Large Language Models.

To start using Groq, sign up for an API key here. The key gives you access to the Groq API, which offers fast inference of open Language Models such as Mixtral and Llama 3.

Usage

The Groq API is OpenAI-compatible, so you can use it in Haystack through the OpenAI Generators by pointing them at the Groq endpoint and passing your Groq API key.
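
In practice, compatibility means that only the endpoint, the key, and the model name change. As a minimal sketch, the Groq-specific settings can be collected in one place and reused for both generators. The helper name `groq_generator_kwargs` is hypothetical and not part of Haystack; the parameter names and the endpoint URL are the ones used in the examples on this page.

```python
# Endpoint used by the examples on this page.
GROQ_BASE_URL = "https://api.groq.com/openai/v1"


def groq_generator_kwargs(model: str, max_tokens: int = 512) -> dict:
    """Hypothetical helper: collect the Groq-specific keyword arguments
    shared by OpenAIGenerator and OpenAIChatGenerator."""
    return {
        "api_base_url": GROQ_BASE_URL,
        "model": model,
        "generation_kwargs": {"max_tokens": max_tokens},
    }


# Possible usage (requires Haystack and a GROQ_API_KEY environment variable):
# from haystack.components.generators import OpenAIGenerator
# from haystack.utils import Secret
# llm = OpenAIGenerator(
#     api_key=Secret.from_env_var("GROQ_API_KEY"),
#     **groq_generator_kwargs("mixtral-8x7b-32768"),
# )
```

The API key is deliberately left out of the helper so that it is still read from the environment at the point where the generator is created.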

Using Generator

Here’s an example of using Mixtral served via Groq to perform question answering on a web page. You need to set the environment variable GROQ_API_KEY and choose a compatible model.

from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

fetcher = LinkContentFetcher()
converter = HTMLToDocument()
prompt_template = """
According to the contents of this website:
{% for document in documents %}
  {{document.content}}
{% endfor %}
Answer the given question: {{query}}
Answer:
"""
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator(
    api_key=Secret.from_env_var("GROQ_API_KEY"),
    api_base_url="https://api.groq.com/openai/v1",
    model="mixtral-8x7b-32768",
    generation_kwargs={"max_tokens": 512},
)
pipeline = Pipeline()
pipeline.add_component("fetcher", fetcher)
pipeline.add_component("converter", converter)
pipeline.add_component("prompt", prompt_builder)
pipeline.add_component("llm", llm)

pipeline.connect("fetcher.streams", "converter.sources")
pipeline.connect("converter.documents", "prompt.documents")
pipeline.connect("prompt.prompt", "llm.prompt")

result = pipeline.run(
    {
        "fetcher": {"urls": ["https://wow.groq.com/why-groq/"]},
        "prompt": {"query": "Why should I use Groq for serving LLMs?"},
    }
)

print(result["llm"]["replies"][0])
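
The example above assumes that GROQ_API_KEY is already set. If you would like a readable error before any component is built, a small fail-fast check along these lines may help. This is only a sketch: `require_env` is a hypothetical helper, not a Haystack function, and it uses nothing beyond the standard library.

```python
import os


def require_env(name: str = "GROQ_API_KEY", env=None) -> str:
    """Hypothetical helper: return the value of `name`, or raise a
    readable error if the variable is unset or empty."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable before running the pipeline."
        )
    return value


# Possible usage at the top of a script, before creating the generator:
# require_env("GROQ_API_KEY")
```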

Using ChatGenerator

Here's an example of a multi-turn conversation with Llama 3. You need to set the environment variable GROQ_API_KEY and choose a compatible model.

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_key=Secret.from_env_var("GROQ_API_KEY"),
    api_base_url="https://api.groq.com/openai/v1",
    model="llama3-8b-8192",
    generation_kwargs={"max_tokens": 512},
)


messages = []

while True:
    msg = input("Enter your message or Q to exit\n🧑 ")
    if msg == "Q":
        break
    messages.append(ChatMessage.from_user(msg))
    response = generator.run(messages=messages)
    assistant_resp = response['replies'][0]
    print("🤖 " + assistant_resp.content)
    messages.append(assistant_resp)
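
What makes the loop above multi-turn is that both the user message and the assistant reply are appended to the same list, so every call sees the full history. As a rough sketch, one turn can be factored into a function that is easy to test without a terminal or network access. The name `chat_turn` is hypothetical; the only assumption about the generator is that its `run` method accepts `messages=` and returns a dict with a `replies` list, as in the example above.

```python
from typing import Any, Callable, List


def chat_turn(run: Callable[..., dict], history: List[Any], user_message: Any) -> Any:
    """Hypothetical helper: perform one conversation turn.

    Appends the user message to the history, calls the generator with the
    whole history, then stores and returns the first reply.
    """
    history.append(user_message)
    response = run(messages=history)
    reply = response["replies"][0]
    history.append(reply)
    return reply


# Possible usage with the generator defined above:
# messages = []
# reply = chat_turn(generator.run, messages, ChatMessage.from_user("Hello!"))
```

Because the function only depends on the shape of `run`, a stand-in callable can replace the real generator when testing the history handling.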