siraaj-litellm-service / README.md

Siraaj LiteLLM Service

LiteLLM proxy service with guardrails implementation for Siraaj

Last updated: 4/16/2026GitHub

Siraaj LiteLLM Service

Overview

This is a LiteLLM proxy service that provides unified LLM API access with guardrails for the Siraaj product. It acts as an AI gateway that routes requests to multiple LLM providers (OpenAI, vLLM on Local GPUs, etc.) through a single OpenAI-compatible API, while enforcing a multi-layer guardrail system to ensure response quality and safety.

The file structure is as follows:

siraaj-litellm-service/
├── guardrails/                           # Guardrail implementations
│   ├── __init__.py                       # Package initialization
│   ├── base_output_guardrail.py          # Base class for output guardrails
│   ├── input_guardrail.py                # Pre-call regex-based input guard
│   ├── combined_output_guardrail.py      # Post-call LLM-based output guard (intent + policy)
│   ├── prompts/                          # Guardrail prompt templates
│   │   ├── intent_validation.txt         # Intent guard prompt
│   │   ├── policy_validation.txt         # Output policy prompt
│   │   └── hallucination_validation.txt  # Hallucination check prompt (disabled)
│   └── README.md                         # Guardrails documentation
├── tests/                                # Testing & test cases
│   ├── test_cases.json                   # Test cases for guardrails
│   ├── test_baseline.py                  # Baseline LLM testing script
│   └── results/                          # Automated test results
├── docker-compose.yml                    # Local development
├── litellm_config.yaml                   # LiteLLM configuration
└── README.md

Flow

Key Features:

  • Unified API gateway for multiple LLM providers
  • Multi-layer guardrail system (input → output)
  • Request/response logging and cost tracking via Langfuse
  • Admin UI for monitoring and configuration
  • Database persistence for models and logs

Guardrail System

Two layers of protection run on every request:

LayerTypeWhenCovers
Input GuardRegexPre-callPrompt injection, jailbreak patterns
Output GuardLLMPost-callIntent check (alcohol/substances, violence, self-harm, sexual content, malicious code, illegal activities) + policy check (toxicity, Islamic content, Omani leadership, regional politics, and more)

For a full breakdown of category coverage across all three layers see the guardrails README.

Technology Stack

  • Proxy Service: LiteLLM
  • Database: PostgreSQL
  • API Framework: FastAPI (via LiteLLM)
  • Observability: Langfuse
  • Guardrail LLM: Configurable via env vars (GUARDRAIL_MODEL_NAME)
  • Deployment: Docker Compose

Setup

Configuration

Configuration is split between two files:

litellm_config.yaml - Shared settings (mostly same across deployments):

  • UI, logging, security settings
  • Guardrail definitions
  • Feature flags

.env - Deployment-specific values:

  • Database credentials and URLs
  • API keys and secrets
  • Ports and hostnames
# Copy environment example
cp .env.example .env

# Edit .env with deployment-specific values

Local Development

# Clone the repository
git clone <repo_url>
cd siraaj-litellm-service

# Copy and configure environment
cp .env.example .env
# Edit .env with your values

# Start the services
docker-compose up -d

# Check logs
docker-compose logs -f litellm

# Access the UI
open http://localhost:4000/ui

Configuration Reference

Environment Variables (.env)

Required:

  • DATABASE_URL - PostgreSQL connection string
  • LITELLM_MASTER_KEY - Master API key for proxy authentication
  • GUARDRAIL_MODEL_NAME - Model used for guardrail checks
  • GUARDRAIL_API_BASE - Base URL for guardrail model API
  • GUARDRAIL_API_KEY - API key for guardrail model

Optional:

  • UI_USERNAME, UI_PASSWORD - UI authentication (recommended for production)
  • LITELLM_LOG - Logging level: DEBUG, INFO, WARNING, ERROR (default: INFO)
  • POSTGRES_PORT, LITELLM_PORT - Port configuration for local development
  • LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_HOST - Langfuse observability

LiteLLM Config (litellm_config.yaml)

Shared settings that are mostly the same across deployments:

  • ui: true - Enable admin UI
  • store_model_in_db: true - Store models in database
  • store_prompts_in_spend_logs: true - Store full prompts/responses in logs
  • json_logs: true - JSON-formatted logs
  • no_docs, no_redoc - API documentation (set to true for production)
  • drop_params: true - Automatically drop unsupported model parameters
  • guardrails - Guardrail class definitions

Testing

# Run baseline tests against all guardrail test cases
python tests/test_baseline.py

# Results are written to tests/results/

Test cases are defined in tests/test_cases.json covering 52 scenarios across all guardrail categories including acceptable queries, edge cases, and known attack patterns.

General Notes & References

Database Persistence

All models, logs, and configuration are stored in PostgreSQL. The database is persisted using Docker volumes (postgres_data).

Internal Documentation

LiteLLM Official Documentation

License & Attribution

This service uses the following open-source software:

LiteLLM

  • Copyright (c) 2023 Berri AI
  • Licensed under the MIT License
  • Full License

PostgreSQL

  • Copyright © 1996-2026, The PostgreSQL Global Development Group
  • Copyright © 1994, The Regents of the University of California
  • Licensed under the PostgreSQL License (BSD/MIT-style permissive license)
  • Full License