Custom agents#

OpenVidu provides a set of built-in agents, each offering AI services that enhance the user experience in your Rooms. You can also create your own custom agents to fine-tune the AI capabilities of your OpenVidu application, using the LiveKit Agents framework.

1. Implement your custom agent using the LiveKit Agents framework#

A LiveKit agent is a Python or Node.js program that connects to LiveKit Rooms and runs an AI pipeline over the media tracks published to the Room by regular Participants.

The agent behaves like any other Participant of the Room, but thanks to its connection to Speech-to-Text services, LLMs and Text-to-Speech services, it can transcribe audio tracks, analyze video tracks, generate speech and more, and publish the results back to the Room. This makes it possible to build any kind of interaction flow between your users and the AI services, all in real time.

A large set of plugins makes it easy to integrate your agent code with the most popular AI providers. See the LiveKit Agents integrations documentation for further information.

Tip

The best way to start building your own custom agent is to follow LiveKit's Voice AI quickstart guide. Once you grasp the basics of the Agents framework, you can customize it to your needs. There is also a collection of recipes to inspire you.

2. Dockerize your custom agent#

Once you are satisfied with your custom agent implementation, you need to build a Docker image of it. When using the Python SDK and having a project structure similar to this...

.
├── agent.py
├── requirements.txt
└── Dockerfile

...here you have a typical example of an agent's Dockerfile:

# syntax=docker/dockerfile:1
# This is an example Dockerfile that builds a minimal container for running LK Agents
ARG PYTHON_VERSION=3.11.11
FROM python:${PYTHON_VERSION}-slim

# Prevents Python from writing pyc files.
ENV PYTHONDONTWRITEBYTECODE=1

# Keeps Python from buffering stdout and stderr to avoid situations where
# the application crashes without emitting any logs due to buffering.
ENV PYTHONUNBUFFERED=1

# Create a non-privileged user that the app will run under.
# See https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#user
ARG UID=10001
RUN adduser \
    --disabled-password \
    --gecos "" \
    --home "/home/appuser" \
    --shell "/sbin/nologin" \
    --uid "${UID}" \
    appuser

# Install gcc, g++ and other build dependencies.
RUN apt-get update && \
    apt-get install -y \
    gcc \
    g++ \
    python3-dev \
    git \
    && rm -rf /var/lib/apt/lists/*

USER appuser

RUN mkdir -p /home/appuser/.cache
RUN chown -R appuser /home/appuser/.cache

WORKDIR /home/appuser

COPY requirements.txt .
RUN python -m pip install --user --no-cache-dir -r requirements.txt

COPY ./*.py .

# ensure that any dependent models are downloaded at build-time
RUN python agent.py download-files

# Run the application.
CMD ["python", "agent.py", "start"]

3. Add your custom agent to your OpenVidu deployment#

1. SSH into an OpenVidu Node and go to configuration folder#

Depending on your OpenVidu deployment type:


If you are using OpenVidu Local (Development), simply navigate to the configuration folder of the project:

# For OpenVidu Local COMMUNITY
cd openvidu-local-deployment/community

# For OpenVidu Local PRO
cd openvidu-local-deployment/pro

If you are using OpenVidu Single Node, SSH into the only OpenVidu node and navigate to:

cd /opt/openvidu/config

If you are using OpenVidu Elastic, SSH into the only Master Node and navigate to:

cd /opt/openvidu/config/cluster/media_node

If you are using OpenVidu High Availability, SSH into any of your Master Nodes (doesn't matter which one) and navigate to:

cd /opt/openvidu/config/cluster/media_node

2. Add an `agent-AGENT_NAME.yaml` file#

In the configuration folder of your OpenVidu node, create a file named agent-AGENT_NAME.yaml, where AGENT_NAME must be a unique name for your agent. The minimal content of this file is:

# Docker image of the agent.
docker_image: YOUR_IMAGE

# Whether to run the agent or not.
enabled: true

CUSTOM_CONFIGURATION: ...

The docker_image field must be the full name of the Docker image you built in step 2. Of course, your OpenVidu nodes must have access to that Docker image's registry.
The enabled field indicates whether OpenVidu will start the agent. If set to false, the agent is not launched and is not available, even if you later try to dispatch it manually.
You can add as many other properties as you want to this YAML file. You can access them within your agent's code (see Accessing the agent's configuration file).
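
As a minimal sketch, the two fields described above can be checked before restarting OpenVidu. The function below is hypothetical (it is not part of OpenVidu) and assumes the file has already been parsed into a Python dict, for instance with a YAML parser. The image name and the greeting property are made-up examples.

```python
from typing import Any


def validate_agent_config(config: Any) -> list[str]:
    """Return a list of problems found in a parsed agent-AGENT_NAME.yaml."""
    if not isinstance(config, dict):
        return ["top level of the file must be a mapping"]
    problems = []
    # docker_image must be the full name of the image built in step 2.
    image = config.get("docker_image")
    if not isinstance(image, str) or not image.strip():
        problems.append("docker_image must be a non-empty string")
    # enabled decides whether OpenVidu starts the agent at all.
    if not isinstance(config.get("enabled"), bool):
        problems.append("enabled must be true or false")
    return problems


# Hypothetical parsed file: two required fields plus one custom property.
example = {
    "docker_image": "registry.example.com/my-agent:1.0",
    "enabled": True,
    "greeting": "hello",
}
print(validate_agent_config(example))
```

Any extra property, such as greeting here, is ignored by the check, since the file accepts arbitrary custom configuration.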

3. Restart OpenVidu#

Depending on your OpenVidu deployment type:

For OpenVidu Local (Development), run this command where docker-compose.yaml is located:

docker compose restart

For OpenVidu Single Node, run this command in your node:

sudo systemctl restart openvidu

For OpenVidu Elastic, run this command in your Master Node:

sudo systemctl restart openvidu

For OpenVidu High Availability, run this command in one of your Master Nodes:

sudo systemctl restart openvidu

After restarting OpenVidu your agent will be up and running, ready to process new Rooms.

Warning

If your agent container keeps restarting, there might be an error in your configuration. Check its logs to find out what is wrong.
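
One possible way to find the failing container and read its logs is sketched below. It relies only on the standard docker ps and docker logs commands. The helper names are hypothetical, and how OpenVidu names the agent containers is not assumed here, so the script simply reports every container whose status is "Restarting".

```python
import subprocess

# Real tab character between the two fields of each output line.
PS_FORMAT = "{{.Names}}\t{{.Status}}"


def restarting_containers(ps_output: str) -> list[str]:
    """Return the names of containers whose status starts with 'Restarting'."""
    names = []
    for line in ps_output.splitlines():
        name, _, status = line.partition("\t")
        if status.startswith("Restarting"):
            names.append(name)
    return names


def show_logs(container: str, tail: int = 100) -> None:
    """Print the last lines of a container's logs."""
    subprocess.run(["docker", "logs", "--tail", str(tail), container], check=False)


def main() -> None:
    """List all containers and print the logs of the ones that keep restarting."""
    ps = subprocess.run(
        ["docker", "ps", "-a", "--format", PS_FORMAT],
        capture_output=True,
        text=True,
        check=True,
    )
    for name in restarting_containers(ps.stdout):
        print(f"--- {name} ---")
        show_logs(name)
```

Calling main() on the node where the agent runs prints the last 100 log lines of each restarting container.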

Tips when coding your custom agent#

When developing your custom agent using the Python or Node SDKs, there are some tips that can help:

Dispatching your custom agent#

You can control when your agent is dispatched from the agent's own code. By default, agents are dispatched (connected) automatically to new Rooms. To control dispatch manually, add the property agent_name to your WorkerOptions when creating the agent:

Python:

opts = WorkerOptions(
    ...
    agent_name="my-custom-agent",
)

Node.js:

const opts = new WorkerOptions({
  ...
  agentName: "my-custom-agent",
});

Property agent_name (agentName in Node.js) must match the value AGENT_NAME in the file agent-AGENT_NAME.yaml created in step 3.

Then you can manually dispatch your agent using the Dispatch API or via a Participant connection.
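
A minimal sketch of a manual dispatch from a Python backend follows, assuming the livekit-api package is installed and that LIVEKIT_URL, LIVEKIT_API_KEY and LIVEKIT_API_SECRET point at your OpenVidu deployment. The helper names, the Room name and the metadata keys are hypothetical.

```python
import json
from typing import Optional


def build_dispatch_request(room: str, agent_name: str, metadata: Optional[dict] = None) -> dict:
    """Build the fields of a dispatch request; metadata is sent as a JSON string."""
    return {
        "agent_name": agent_name,  # must match AGENT_NAME in agent-AGENT_NAME.yaml
        "room": room,
        "metadata": json.dumps(metadata or {}),
    }


async def dispatch_agent(room: str, agent_name: str, metadata: Optional[dict] = None) -> None:
    """Ask the server to dispatch the named agent to the given Room."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from livekit import api

    # With no arguments, the client reads its URL and credentials from the
    # LIVEKIT_URL, LIVEKIT_API_KEY and LIVEKIT_API_SECRET environment variables.
    lkapi = api.LiveKitAPI()
    try:
        request = api.CreateAgentDispatchRequest(
            **build_dispatch_request(room, agent_name, metadata)
        )
        dispatch = await lkapi.agent_dispatch.create_dispatch(request)
        print("created dispatch:", dispatch)
    finally:
        await lkapi.aclose()


print(build_dispatch_request("my-room", "my-custom-agent", {"lang": "en"}))
```

To send the request, run asyncio.run(dispatch_agent("my-room", "my-custom-agent")) from your application server.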

Accessing the agent's configuration file#

It can be very useful to access your agent's YAML configuration file from within your agent's code. OpenVidu automatically mounts the file agent-AGENT_NAME.yaml into your agent's Docker container and exposes its path in the environment variable AGENT_CONFIG_FILE. You can read the file directly in your agent's code with a YAML parser. For example:

Python:

import os
import yaml

with open(os.environ["AGENT_CONFIG_FILE"], "r") as f:
    config = yaml.safe_load(f)

print(config)

Node.js:

import fs from 'fs';
import yaml from 'js-yaml';

const configFile = process.env.AGENT_CONFIG_FILE;
const config = yaml.load(fs.readFileSync(configFile, 'utf8'));

console.log(config);
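
Once the file is parsed, custom properties are often read with a fallback value so the agent still starts when a property is missing. The helper below is a hypothetical sketch that works on the parsed dict. The llm.model and llm.temperature properties are invented for illustration and are not OpenVidu settings.

```python
from typing import Any


def get_setting(config: dict, path: str, default: Any = None) -> Any:
    """Look up a dotted path such as 'llm.model' in a parsed config dict."""
    current: Any = config
    for key in path.split("."):
        # Stop at the first missing key or non-mapping value.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical parsed agent-AGENT_NAME.yaml with one nested custom property.
config = {
    "docker_image": "my-agent:latest",
    "enabled": True,
    "llm": {"model": "example-model"},
}
print(get_setting(config, "llm.model"))             # value present in the file
print(get_setting(config, "llm.temperature", 0.7))  # missing, falls back to 0.7
```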

Custom agents vs OpenVidu agents#

Keep in mind that OpenVidu agents have an advantage over a regular LiveKit agent when running in a multi-node OpenVidu deployment (OpenVidu Elastic and OpenVidu High Availability): OpenVidu agents are designed to allow graceful shutdowns when scaling down Media Nodes.

This means that a Media Node flagged for termination will wait for all its OpenVidu agents to finish processing their assigned Rooms before allowing the Media Node to be stopped, while at the same time rejecting new job requests. This ensures a smooth experience for your users, avoiding downtimes when your cluster is scaled down.
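
The behavior described above can be pictured with a small toy model. This is only an illustrative sketch in plain asyncio, not OpenVidu's actual implementation. The class and job names are hypothetical.

```python
import asyncio


class DrainingWorker:
    """Toy worker: once draining, it rejects new jobs and waits for running ones."""

    def __init__(self) -> None:
        self._draining = False
        self._jobs: set[asyncio.Task] = set()

    def accept(self, job) -> bool:
        """Start a job (a coroutine) unless the worker is draining."""
        if self._draining:
            job.close()  # discard the coroutine without running it
            return False
        task = asyncio.create_task(job)
        self._jobs.add(task)
        task.add_done_callback(self._jobs.discard)
        return True

    async def drain(self) -> None:
        """Stop accepting jobs, then wait for the running ones to finish."""
        self._draining = True
        if self._jobs:
            await asyncio.gather(*self._jobs)


async def demo() -> list[str]:
    events: list[str] = []

    async def room_job(name: str) -> None:
        await asyncio.sleep(0.01)  # stands in for processing a Room
        events.append(f"{name} finished")

    worker = DrainingWorker()
    worker.accept(room_job("room-1"))
    drain = asyncio.create_task(worker.drain())  # node flagged for termination
    await asyncio.sleep(0)  # let the drain task start
    accepted = worker.accept(room_job("room-2"))
    events.append(f"room-2 accepted: {accepted}")
    await drain
    events.append("node can stop")
    return events


events = asyncio.run(demo())
print(events)
```

In this model the job for room-1 runs to completion, the request for room-2 is rejected, and only then is the node considered safe to stop.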