AI Gateway lets developers expose multiple models through specific URL paths, simplifying the integration of AI models into applications. It provides proxy LLMs for different use cases: A/B testing between LLMs, querying multiple LLMs simultaneously, load balancing across LLMs, and selecting the best response from multiple LLMs. It also secures these LLMs with access rules and supports rate limiting at the user, team, or route level, enabling efficient management of AI model usage within applications.
Model Providers
Model providers give access to foundational and private models over a REST API. They abstract away the details of model location, security tokens, and availability.
- Proxy LLM which splits (load-balances) requests across multiple LLMs. Supported models: all of the above.
- multi-llm: Proxy LLM which sends the same query to multiple LLMs, for critical use cases. Supported models: all of the above.
- ab-llm: Proxy LLM which splits queries between multiple models in a given ratio. Supported models: all of the above.
- best-llm: Proxy LLM which sends the same query to multiple models and returns the best answer as scored by a different LLM. Supported models: all of the above.
Examples
Model providers are defined in the ai/routes/ folder of a product, in YAML files under the key model-providers. There can be multiple YAML files in this folder; the AI Gateway service coalesces all of them and resolves all providers.
Open AI, LLaMa 3 Providers
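A provider definition might look like the sketch below. All field names (name, type, models, api-key, endpoint) are assumptions about the schema, not the gateway's documented keys, and the endpoint URL is hypothetical:

```yaml
model-providers:
  - name: open-ai
    type: openai
    models: [gpt-4o]
    api-key: ${OPENAI_API_KEY}   # resolved by the gateway; never exposed to callers
  - name: local-llama3
    type: local
    model: llama3-8b
    endpoint: http://llama3.internal:8080   # hypothetical internal endpoint
```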
Load-balancing LLM Provider
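A load-balancing provider could be composed from base providers such as the two above. The type identifier and field names here are placeholders, since the actual identifiers are not shown in this document:

```yaml
model-providers:
  - name: balanced-llm
    type: lb-llm          # placeholder type name for the load-balancing proxy
    providers: [open-ai, local-llama3]
```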
Multi LLM Provider
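A multi-llm provider queries every listed provider with the same prompt. The field names in this sketch are assumed:

```yaml
model-providers:
  - name: critical-llm
    type: multi-llm       # sends the same query to all listed providers
    providers: [open-ai, local-llama3]
```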
A/B testing with LLM Providers
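An ab-llm provider splits traffic between models in a given ratio. The field names and ratios below are illustrative assumptions:

```yaml
model-providers:
  - name: experiment-llm
    type: ab-llm
    providers:
      - provider: open-ai
        ratio: 0.9        # 90% of queries
      - provider: local-llama3
        ratio: 0.1        # 10% of queries
```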
Best LLM Provider
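A best-llm provider queries multiple models and returns the answer scored highest by a separate LLM. The scorer field and other keys in this sketch are assumptions:

```yaml
model-providers:
  - name: best-answer-llm
    type: best-llm
    providers: [open-ai, local-llama3]
    scorer: open-ai       # the separate LLM that scores the candidate answers
```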
Routes
Routes are defined by developers in the ai/routes/ folder of a product, in YAML files under the key routes. There can be multiple YAML files in this folder; the AI Gateway service coalesces all of them and serves the combined set. The routes YAML holds three key pieces of information for each route:
Model Provider and configuration for that provider
Access Rules on who can access this route
Rate Limits defined at user, group and route level
Examples
Access Open AI gpt-4o
In the example route below, the developer has created a new route /ai/tasks which gives access to Open AI models without exposing or sharing the token. A user is allowed to make calls only if they have the llm-open-ai-allowed group in their JWT token claims. The developer has also applied additional usage limits to keep costs under control.
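A sketch of such a route definition follows. Only the path, the group name, and the limit values come from this document; every YAML key is an assumed field name, not the gateway's documented schema:

```yaml
routes:
  - path: /ai/tasks
    model-provider: open-ai
    config:
      model: gpt-4o
    access:
      groups: [llm-open-ai-allowed]   # matched against JWT token claims
    rate-limits:
      user:
        requests-per-hour: 60
      group:
        department: Sales
        requests-per-hour: 600
      route:
        requests-per-hour: 1000
      # token limits would be declared analogously (e.g. tokens-per-hour)
```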
On the above endpoint, a user can make 60 calls per hour, all users in a given department ('Sales') can make 600 requests per hour collectively, and the entire user base is limited to 1000 requests per hour. Token limits are applied similarly.
If any of the rate-limit thresholds is exceeded, the gateway returns a 429 HTTP status code.
Access local llama3-8b
Here the developer is exposing a local llama3-8b over a route /ai/know
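Such a route might be declared as in this sketch, again with assumed field names (only the path and model name come from the text):

```yaml
routes:
  - path: /ai/know
    model-provider: local-llama3
    config:
      model: llama3-8b
```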
Accessing the models
With the model-providers and routes configuration in place, the AI Gateway provides unified API access to all models. The response format matches the Open AI chat completion response format.
Examples
Here the developer calls the /tasks route with the POST verb and a JSON payload containing the prompt name and the parameters required to expand the prompt. Refer to Prompts to learn more about defining prompts.
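A minimal client-side sketch of such a call. The payload field names ("prompt", "params"), the prompt name, and the parameter values are all assumptions about the request schema; consult the Prompts documentation for the actual shape:

```python
import json

def build_request_body(prompt_name, params):
    """Build the JSON body sent with POST to the route (e.g. /ai/tasks)."""
    # Hypothetical schema: a prompt name plus the parameters used to expand it.
    return json.dumps({"prompt": prompt_name, "params": params})

body = build_request_body("summarize-tasks", {"project": "apollo"})
print(body)

# The HTTP call itself would look roughly like:
#   POST /ai/tasks
#   Authorization: Bearer <user JWT>
#   Content-Type: application/json
#   <body>
```

The gateway would return a response in the Open AI chat completion format, as described above.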