siraaj-litellm-service / README.md

Siraaj LiteLLM Service

LiteLLM proxy service with guardrails implementation for Siraaj

Last updated: 4/16/2026GitHub

Siraaj LiteLLM Service

Overview

This is a LiteLLM proxy service that provides unified LLM API access with guardrails for the Siraaj product. It acts as an AI gateway that routes requests to multiple LLM providers (OpenAI, vLLM on Local GPUs, etc.) through a single OpenAI-compatible API, while enforcing a multi-layer guardrail system to ensure response quality and safety.

The file structure is as follows:

siraaj-litellm-service/
├── guardrails/                           # Guardrail implementations
│   ├── __init__.py                       # Package initialization
│   ├── base_output_guardrail.py          # Base class for output guardrails
│   ├── input_guardrail.py                # Pre-call regex-based input guard
│   ├── combined_output_guardrail.py      # Post-call LLM-based output guard (intent + policy)
│   ├── prompts/                          # Guardrail prompt templates
│   │   ├── intent_validation.txt         # Intent guard prompt
│   │   ├── policy_validation.txt         # Output policy prompt
│   │   └── hallucination_validation.txt  # Hallucination check prompt (disabled)
│   └── README.md                         # Guardrails documentation
├── tests/                                # Testing & test cases
│   ├── test_cases.json                   # Test cases for guardrails
│   ├── test_baseline.py                  # Baseline LLM testing script
│   └── results/                          # Automated test results
├── docker-compose.yml                    # Local development
├── litellm_config.yaml                   # LiteLLM configuration
└── README.md

Flow

Key Features:

Unified API gateway for multiple LLM providers
Multi-layer guardrail system (input → output)
Request/response logging and cost tracking via Langfuse
Admin UI for monitoring and configuration
Database persistence for models and logs

Guardrail System

Two layers of protection run on every request:

Layer	Type	When	Covers
Input Guard	Regex	Pre-call	Prompt injection, jailbreak patterns
Output Guard	LLM	Post-call	Intent check (alcohol/substances, violence, self-harm, sexual content, malicious code, illegal activities) + policy check (toxicity, Islamic content, Omani leadership, regional politics, and more)

For a full breakdown of category coverage across all three layers see the guardrails README.

Technology Stack

Proxy Service: LiteLLM
Database: PostgreSQL
API Framework: FastAPI (via LiteLLM)
Observability: Langfuse
Guardrail LLM: Configurable via env vars (GUARDRAIL_MODEL_NAME)
Deployment: Docker Compose

Setup

Configuration

Configuration is split between two files:

litellm_config.yaml - Shared settings (mostly same across deployments):

UI, logging, security settings
Guardrail definitions
Feature flags

.env - Deployment-specific values:

Database credentials and URLs
API keys and secrets
Ports and hostnames

# Copy environment example
cp .env.example .env

# Edit .env with deployment-specific values

Local Development

# Clone the repository
git clone <repo_url>
cd siraaj-litellm-service

# Copy and configure environment
cp .env.example .env
# Edit .env with your values

# Start the services
docker-compose up -d

# Check logs
docker-compose logs -f litellm

# Access the UI
open http://localhost:4000/ui

Configuration Reference

Environment Variables (.env)

Required:

DATABASE_URL - PostgreSQL connection string
LITELLM_MASTER_KEY - Master API key for proxy authentication
GUARDRAIL_MODEL_NAME - Model used for guardrail checks
GUARDRAIL_API_BASE - Base URL for guardrail model API
GUARDRAIL_API_KEY - API key for guardrail model

Optional:

UI_USERNAME, UI_PASSWORD - UI authentication (recommended for production)
LITELLM_LOG - Logging level: DEBUG, INFO, WARNING, ERROR (default: INFO)
POSTGRES_PORT, LITELLM_PORT - Port configuration for local development
LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_HOST - Langfuse observability

LiteLLM Config (litellm_config.yaml)

Shared settings that are mostly the same across deployments:

ui: true - Enable admin UI
store_model_in_db: true - Store models in database
store_prompts_in_spend_logs: true - Store full prompts/responses in logs
json_logs: true - JSON-formatted logs
no_docs, no_redoc - API documentation (set to true for production)
drop_params: true - Automatically drop unsupported model parameters
guardrails - Guardrail class definitions

Testing

# Run baseline tests against all guardrail test cases
python tests/test_baseline.py

# Results are written to tests/results/

Test cases are defined in tests/test_cases.json covering 52 scenarios across all guardrail categories including acceptable queries, edge cases, and known attack patterns.

General Notes & References

Database Persistence

All models, logs, and configuration are stored in PostgreSQL. The database is persisted using Docker volumes (postgres_data).

Internal Documentation

LiteLLM + Langfuse - Internal documentation on observability and monitoring setup

LiteLLM Official Documentation

LiteLLM Documentation - Main documentation
All Settings Reference - Complete configuration options
Guardrails Documentation - Guardrails setup and configuration
Logging & Debugging - Logging configuration and troubleshooting
Production Best Practices - Security and performance guidelines

License & Attribution

This service uses the following open-source software:

LiteLLM

Licensed under the MIT License
Full License

PostgreSQL

Licensed under the PostgreSQL License (BSD/MIT-style permissive license)
Full License