The Empathy AI API you will be using during the challenge is a service that lets the developers of the participating groups integrate large language models (LLMs) into their applications. It is compatible with the official OpenAI client format.
The API offers two different LLMs, both of which are open-source and open-weights, ensuring transparency and accessibility for developers.
Additionally, the GPU infrastructure supporting these models is self-hosted, meaning it operates independently of the commercial clouds commonly used in the industry. This setup fosters a healthy relationship with AI, prioritizing ethical use, data privacy, and sustainability.
You can use it for any kind of task you can imagine, like answering questions, providing recommendations, or assisting with technical queries. Remember that creativity will be judged!
The two LLMs you can consume are:
- mistralai/Mistral-7B-Instruct-v0.3
- qwen/Qwen2.5-Coder-7B-Instruct

Key Features:
- Two open-source, open-weights models
- Self-hosted GPU infrastructure, independent of commercial clouds
- Compatible with the official OpenAI client format
This documentation walks through a simple example of using the API; you can find more details and use cases in the OpenAI client documentation.
Quick Start Guide
Example Request
curl https://ai-challenge.empathy.ai/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen/Qwen2.5-Coder-7B-Instruct",
"stream": true,
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain how to implement a binary search in Java."}
]
}'
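Note that the request above sets "stream": true, so the response arrives incrementally rather than as the single JSON object shown in the Sample Response below. Most OpenAI-compatible servers deliver streaming responses as Server-Sent Events, one "data:" line per chunk; assuming this API does the same, the raw stream looks roughly like this (a sketch, not a verbatim capture):

data: {"id":"f3135ec1","choices":[{"delta":{"role":"assistant","content":"Binary"},"finish_reason":null}]}
data: {"id":"f3135ec1","choices":[{"delta":{"content":" search"},"finish_reason":null}]}
data: [DONE]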
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
model | string | Yes | The model to use: mistralai/Mistral-7B-Instruct-v0.3 or qwen/Qwen2.5-Coder-7B-Instruct |
messages | array | Yes | Array of JSON objects, each defining a role and its content. |
stream | boolean | No | Defaults to false. When set to true, the API returns partial results as they are generated instead of waiting for the entire completion to finish. |
Headers
Header | Required | Description |
---|---|---|
Authorization | Yes | Bearer token for authentication. |
Content-Type | Yes | Must be application/json. |
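If you prefer raw HTTP over the OpenAI client, the two tables above map directly onto a request. Here is a minimal sketch using the Python requests library (YOUR_API_KEY is a placeholder):

import requests

url = "https://ai-challenge.empathy.ai/v1/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",  # required
    "Content-Type": "application/json",      # required
}
payload = {
    "model": "qwen/Qwen2.5-Coder-7B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain how to implement a binary search in Java."},
    ],
    # "stream" omitted, so it defaults to false
}

response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])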
Sample Response
The request can take a few seconds to complete (if so, consider setting the "stream" parameter in your request); the output should look like this:
{
  "id": "f3135ec1",
  "created": 1735890548,
  "model": "qwen/Qwen2.5-Coder-7B-Instruct",
  "choices": [
    {
      "finish_reason": "stop",
      "message": {
        "content": "Binary search is an efficient algorithm for finding an item…",
        "role": "assistant"
      }
    }
  ]
}
(There are more fields in the response, but they are not relevant to this simple use case.)
Response Parameters
Choices Array
The choices array contains the generated responses. In this case, there is one item:
- finish_reason: Indicates why the generation stopped ("stop" means it completed normally).
- message: An object containing the generated response:
  - content: The actual text of the response.
  - role: The role of the entity providing the response ("assistant").
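For a quick illustration, here is how you might pull these fields out of a non-streaming response in Python (a minimal, self-contained sketch using the Sample Response above as the raw body):

import json

# The Sample Response from above, as a raw JSON string
raw_response_text = '''{
  "id": "f3135ec1",
  "created": 1735890548,
  "model": "qwen/Qwen2.5-Coder-7B-Instruct",
  "choices": [
    {
      "finish_reason": "stop",
      "message": {
        "content": "Binary search is an efficient algorithm for finding an item...",
        "role": "assistant"
      }
    }
  ]
}'''

payload = json.loads(raw_response_text)
choice = payload["choices"][0]
print(choice["finish_reason"])       # "stop": generation completed normally
print(choice["message"]["role"])     # "assistant"
print(choice["message"]["content"])  # the generated text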
Code Example
Here is an example of using the API in Python with the official OpenAI client:
import openai

# Initialize the OpenAI client, pointed at the Empathy AI endpoint
client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://ai-challenge.empathy.ai/v1"  # the client appends /chat/completions itself
)
def get_chat_completion(prompt: str, stream: bool = False) -> None:
try:
response = client.chat.completions.create(
model="qwen/Qwen2.5-Coder-7B-Instruct",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": prompt}
],
stream=stream
)
if stream:
# Handle streaming response
for chunk in response:
if chunk.choices[0].delta.content is not None:
content = chunk.choices[0].delta.content
print(content, end="", flush=True)
print() # New line after stream ends
else:
# Handle non-streaming response
print(response.choices[0].message.content)
except Exception as e:
print(f"Error: {str(e)}")
if __name__ == "__main__":
prompt = "Write a short poem about coding"
print("\nStreaming Response:")
get_chat_completion(prompt, stream=True)
print("Non-Streaming Response:")
get_chat_completion(prompt, stream=False)
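If you want to consume the streaming endpoint without the OpenAI client, the sketch below assumes the server emits OpenAI-style Server-Sent Events ("data: ..." lines ending with "data: [DONE]"); adjust it if the actual wire format differs:

import json
import requests

url = "https://ai-challenge.empathy.ai/v1/chat/completions"
headers = {"Authorization": "Bearer YOUR_API_KEY"}  # YOUR_API_KEY is a placeholder
payload = {
    "model": "qwen/Qwen2.5-Coder-7B-Instruct",
    "messages": [{"role": "user", "content": "Write a short poem about coding"}],
    "stream": True,
}

with requests.post(url, headers=headers, json=payload, stream=True, timeout=60) as r:
    r.raise_for_status()
    for line in r.iter_lines(decode_unicode=True):
        # SSE events arrive as "data: {...}" lines separated by blank lines
        if not line or not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":  # end-of-stream sentinel used by OpenAI-compatible servers
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            print(delta["content"], end="", flush=True)
print()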
Here is an OpenAPI specification for this API. Just copy and paste this code into editor.swagger.io and you will be able to consume it!
openapi: 3.0.0
info:
title: Empathy AI API
description: An API for conversational AI using advanced language models
version: 1.0.0
servers:
  - url: https://ai-challenge.empathy.ai/v1
paths:
/chat/completions:
post:
tags:
- Endpoints
      summary: Generate a chat completion. Add your API_KEY via the green "Authorize" button above.
requestBody:
required: true
content:
application/json:
schema:
type: object
required:
- model
- messages
properties:
model:
type: string
enum: [mistralai/Mistral-7B-Instruct-v0.3, qwen/Qwen2.5-Coder-7B-Instruct]
description: The model to use for completion
messages:
type: array
items:
type: object
required:
- role
- content
properties:
role:
type: string
enum: [system, user]
content:
type: string
example:
model: qwen/Qwen2.5-Coder-7B-Instruct
messages:
- role: system
content: You are a helpful assistant.
- role: user
content: Explain how to implement a binary search in Python.
responses:
'200':
description: Successful response
content:
application/json:
schema:
type: object
properties:
id:
type: string
created:
type: integer
model:
type: string
object:
type: string
system_fingerprint:
type: string
choices:
type: array
items:
type: object
properties:
finish_reason:
type: string
index:
type: integer
message:
type: object
properties:
content:
type: string
role:
type: string
usage:
type: object
properties:
completion_tokens:
type: integer
prompt_tokens:
type: integer
total_tokens:
type: integer
example:
id: f3135ec1
created: 1735890548
model: qwen/Qwen2.5-Coder-7B-Instruct
object: chat.completion
system_fingerprint: 3.0.1-sha-bb9095a
choices:
- finish_reason: stop
index: 0
message:
content: Here's an explanation of how to implement a binary search in Python...
role: assistant
usage:
completion_tokens: 501
prompt_tokens: 30
total_tokens: 531
components:
securitySchemes:
BearerAuth:
type: http
scheme: bearer
security:
- BearerAuth: []