Siraaj
An enterprise AI platform that serves as your organization's intelligent assistant, seamlessly connecting to company documents, applications, and people. Siraaj provides a sophisticated chat interface that integrates with any LLM of your choice while maintaining synchronized knowledge and access controls across workplace tools. Since you control the deployment, your user data and conversations remain entirely under your organization's control. The platform combines the power of large language models with your team's unique knowledge, effectively creating a subject matter expert that understands your specific business context.
Imagine having an AI assistant that can answer questions like "A customer wants feature X, is this already supported?" or "Where's the pull request for feature Y?" by accessing your team's collective knowledge across Slack conversations, Google Drive documents, Confluence pages, and code repositories.
Problem Statement
Organizations face a fundamental challenge in the modern workplace: critical knowledge exists in scattered silos across multiple platforms, making it nearly impossible for teams to access the collective intelligence they need to make informed decisions. Employees waste countless hours searching through disconnected systems like Slack channels, Google Drive folders, Confluence spaces, and various databases, often recreating work that colleagues have already completed. Traditional search tools fall short because they rely on keyword matching rather than understanding context and meaning.
This fragmentation creates several cascading problems. Decision-making becomes slower as teams struggle to locate relevant precedents and documentation. New team members face extended onboarding periods because institutional knowledge isn't easily discoverable. Customer support teams cannot quickly access the technical information needed to resolve issues effectively. Project teams duplicate efforts because they cannot easily find similar work completed by other departments.
Siraaj addresses these challenges by creating a unified knowledge ecosystem where artificial intelligence can understand, synthesize, and provide contextual answers from your organization's collective information. Rather than replacing existing tools, Siraaj acts as an intelligent layer that connects and makes sense of information across all your workplace systems while maintaining the security boundaries and access controls that protect sensitive information.
Project Links
Getting Started
Prerequisites
Understanding Siraaj's architecture helps you appreciate why these prerequisites matter. The platform operates as a distributed system where each component serves a specific purpose in the knowledge management pipeline.
- Node.js (v20+) - Powers the Next.js frontend interface
- Docker (v24.0.2+) - Orchestrates the microservices architecture
- Docker Compose - Manages the multi-container deployment
Project Setup
Clone the Repository
```shell
git clone https://github.com/rihal-om/siraaj.git
cd siraaj
```
Create Shared Docker Network
Before running the application, create the shared Docker network that enables communication between Siraaj services and the OCR service when running on the same Docker host:
```shell
docker network create siraaj-shared-network
```
Why is this needed?
The OCR service runs as a separate docker-compose stack. When both the main Siraaj stack and the OCR stack run on the same Docker host, they need a shared network to communicate via Docker DNS (e.g., OCR_URL=http://ocr_workflow:8000). This applies when using either:
- Local OCR service: ocr/docker-compose.dev.yml
- External OCR repository (same host): siraaj-dot-ocr-service/docker-compose.yml
Without this shared network, each docker-compose stack would have isolated networks and containers couldn't communicate.
When is it NOT needed?
Only when the OCR service is running on a completely different server/instance. In that case, you would configure OCR_URL with the external server's URL in your .env file to communicate over standard HTTP, and the shared network is not required.
Note: You only need to create this network once. It will persist across container restarts and can be reused by multiple docker-compose stacks.
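As a minimal sketch of the same-host case, each docker-compose stack joins the pre-created network by declaring it as `external`. The service name below follows the `OCR_URL` example above; the actual compose files in this repository may differ:

```yaml
# Excerpt: attach a stack to the pre-created shared network
services:
  ocr_workflow:
    # ...image, ports, environment, etc...
    networks:
      - siraaj-shared-network

networks:
  siraaj-shared-network:
    external: true  # reuse the network created with `docker network create`
```

With `external: true`, docker-compose reuses the existing network instead of creating and tearing down its own. Both stacks must declare the network this way so their containers can resolve each other by service name via Docker DNS.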
Running the Application Locally
There are two options for running the application locally; refer to this document for details.
Quick Start with Makefile
Siraaj includes a comprehensive Makefile at the project root that simplifies common development and deployment tasks. These commands provide a clean, memorable interface for managing your Siraaj deployment.
Essential Commands
```shell
# Start core services (chat, search, document processing)
make run              # or: make up

# Start ALL services including agents, meeting, keycloak, monitoring
make all              # or: make run-all

# Stop all services
make down             # or: make stop

# Check service status
make status           # Clean formatted status
make ps               # Detailed docker information

# View logs
make logs             # All services (use Ctrl+C to exit)
make logs-api         # API server only
make logs-web         # Web server only
make logs-background  # Background workers only
```
Build & Rebuild Commands
```shell
# Build services
make build            # Build all images (incremental)
make rebuild          # Rebuild from scratch (preserves LLM models ✅)

# Build specific services
make build-backend    # API server & background workers only
make build-web        # Web server only
make build-models     # Model servers only

# Quick workflows
make rebuild-run      # Rebuild everything and start (keeps data)
make quick-start      # Pull latest images and start
```
Feature-Specific Services
```shell
make chat        # Start chat services only
make meet        # Start meeting/transcription services
make agents      # Start automation agents + n8n
make keycloak    # Start with Keycloak authentication
make monitoring  # Start monitoring (Vispana)
```
Maintenance & Management
```shell
# Restart services
make restart      # Restart all services
make restart-api  # Restart API server only
make restart-web  # Restart web server only

# Cleanup
make clean        # Stop and remove containers (keeps volumes/data)
make clean-all    # Remove EVERYTHING including data (⚠️ destructive)
make prune        # Clean up unused Docker resources

# Database operations
make db-shell     # Open PostgreSQL shell
make db-backup    # Backup database to ./deployment/docker_compose/backups/
```
Development Tools
```shell
# Development mode
make dev        # Start with live code mounting

# Container access
make shell-api  # Open shell in API container
make shell-web  # Open shell in web container

# Information
make urls       # Show all service URLs
make check      # Check service health
make help       # Show all available commands
```
Service URLs After Starting
Core Services (available with make run):
- Web UI: http://localhost:3000
- API Server: http://localhost:8080
- PostgreSQL: localhost:5432
- Redis: localhost:6379
- Vespa Search: http://localhost:8081
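Containers can take a few seconds to start listening after `make run` returns. If you script against these URLs, a small polling helper like the following (a sketch, not part of the Makefile) can gate on a port becoming reachable; it uses bash's `/dev/tcp`:

```shell
# Sketch: poll a TCP port until it accepts connections (bash-specific /dev/tcp)
wait_for_port() {
  local host="$1" port="$2" tries="${3:-30}"
  local i
  for i in $(seq 1 "$tries"); do
    # The subshell closes the probe socket automatically on exit
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Example: wait for the API server started by `make run`
# wait_for_port localhost 8080 && echo "API is up"
```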
Optional Services (require specific commands):
- Keycloak: http://localhost:8079/auth (use `make keycloak`)
- Temporal UI: http://localhost:3001 (use `make meet`)
- AI Meet API: http://localhost:5001 (use `make meet`)
- Agents: http://localhost:8501 (use `make agents`)
- n8n: http://localhost:5678 (use `make agents`)
- Vispana: http://localhost:4000 (use `make monitoring`)

💡 Tip: Use `make all` to start all services at once, or `make urls` to see which services are available.
Important Notes
LLM Models are Preserved: The `make rebuild` and `make clean` commands preserve downloaded LLM models stored in Docker volumes. Only `make clean-all` or `make fresh-start` will delete these models.
Data Persistence: Your databases and indexed documents are stored in Docker volumes and persist across container restarts. Use `make clean` to stop services while keeping all data intact.
First-Time Setup: After starting services for the first time, configure an LLM provider through the web interface (Settings → Admin → LLM Providers) to enable chat functionality.
Tech Stack
Each technology in the stack was chosen for a specific role in delivering a reliable, scalable, and secure platform for organizational knowledge management.
| Category | Tool | Description |
|---|---|---|
| Frontend | Next.js 14+ with TypeScript | Modern React framework providing server-side rendering, optimal performance, and type safety for complex user interfaces |
| Backend | Python FastAPI | High-performance async API framework that provides automatic documentation and handles concurrent requests efficiently |
| Authentication | Multi-provider Support | Comprehensive authentication system supporting OAuth, OIDC, SAML, and traditional credentials for enterprise identity integration |
| Primary Database | PostgreSQL 15 | Enterprise-grade relational database ensuring ACID compliance and reliable data persistence for user accounts and metadata |
| Search Engine | Vespa | Advanced distributed search platform providing both traditional keyword search and modern vector similarity search capabilities |
| Cache Layer | Redis | High-speed in-memory data store that dramatically improves response times for frequently accessed information |
| Object Storage | MinIO | S3-compatible storage system for documents, images, and other media files with built-in versioning and access controls |
| AI Model Services | Custom Inference Servers | Dedicated microservices for AI model operations, supporting both general inference tasks and specialized document indexing workloads |
| Workflow Engine | Temporal | Reliable distributed workflow orchestration system that handles complex background processing with fault tolerance and retry logic |
| Process Automation | N8N | Visual workflow builder enabling custom integrations and automated responses without requiring extensive programming knowledge |
| Observability | Jaeger + Sentry | Comprehensive monitoring solution providing distributed tracing for performance optimization and error tracking for rapid issue resolution |
| Deployment Platform | Docker + Docker Compose | Containerized architecture ensuring consistent deployment across development, staging, and production environments |
| Load Balancing | Nginx | High-performance reverse proxy and load balancer handling SSL termination and request distribution across backend services |
Comprehensive Workplace Connectors
The power of Siraaj lies in its ability to connect with virtually every tool your organization uses to create, store, and share knowledge. Rather than forcing teams to change their existing workflows, Siraaj integrates seamlessly with established systems to create a unified knowledge layer. The platform efficiently monitors and pulls the latest changes from communication platforms like Slack for capturing conversational knowledge and real-time discussions. Development teams benefit from GitHub integration that provides access to code repositories, documentation, and issue tracking information.
Document management systems receive comprehensive support through connectors for Google Drive, SharePoint, and local file systems, ensuring that both cloud-based and on-premise document repositories become searchable through the unified interface. Knowledge management platforms including Confluence, Notion, Slab, Document360, and Bookstack integrate seamlessly to make structured organizational knowledge instantly accessible.
Project management and workflow tools connect through specialized integrations with Jira for development tracking, Linear for modern project management, Productboard for feature planning, and similar platforms that capture decision-making processes and project evolution. Customer-facing systems integrate through connectors for Zendesk customer support platforms, HubSpot customer relationship management, Gong sales conversation intelligence, and Gmail communication history.
The platform also supports integration with specialized knowledge systems including Guru for verified company information, Bookstack for technical documentation, and even website crawling capabilities for public-facing information that teams reference regularly. This comprehensive connector ecosystem means that regardless of your organization's specific tool choices, Siraaj can likely integrate with your existing knowledge infrastructure without requiring disruptive migrations or workflow changes.
Project Team
The diverse expertise of our team reflects the complex challenges Siraaj addresses in enterprise knowledge management.
| Name | Role | GitHub |
|---|---|---|
| Alhaitham Al Jabri | Technical Lead | @aljab012 |
| Shihab Al Amri | Software Engineer | @shihabal3amri |
| Ali Al Aufi | New Ventures Manager | NA |
| Asma Al Hattali | Machine Learning Engineer | @asmaai |
| Bushra Al Jahwari | Machine Learning Engineer | @Bushrxh |
Deployment Flexibility and Scalability
One of Siraaj's most significant advantages lies in its deployment flexibility, which accommodates organizations with vastly different technical requirements, security constraints, and operational scales. This flexibility ensures that teams can implement effective knowledge management solutions regardless of their infrastructure preferences or organizational policies.
Local Development and Evaluation capabilities allow teams to run the complete Siraaj platform on individual laptops or workstations. This approach proves invaluable for technical evaluation, demonstration purposes, and small team pilots. The entire system, including all microservices, databases, and AI components, can operate effectively on modest hardware resources, enabling thorough testing before committing to larger deployment initiatives.
On-Premise Enterprise Deployment addresses the needs of organizations with strict data sovereignty requirements or existing infrastructure investments. The containerized architecture ensures consistent behavior whether deployed on a single virtual machine or distributed across multiple servers within corporate data centers. This deployment model provides complete control over data location, network access, and integration with existing enterprise systems while maintaining the full functionality of cloud-based alternatives.
The deployment documentation provides comprehensive guidance for each scenario, including infrastructure requirements, security considerations, and optimization strategies that help organizations achieve optimal performance while maintaining their specific compliance and operational requirements.
Development Workflow
Local Development
For frontend development, you can run the Next.js development server independently while connecting to the containerized backend services. This approach provides fast refresh capabilities while maintaining access to the full platform functionality.
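A sketch of that workflow is below. The directory name, environment variable, and npm script are assumptions for illustration; check the frontend directory's `.env` example and `package.json` in this repository for the actual names.

```shell
# Sketch: run the Next.js dev server against the containerized backend
cd web                                            # assumed frontend directory
export NEXT_PUBLIC_API_URL=http://localhost:8080  # API container from `make run`
npm install
npm run dev                                       # fast-refresh server on http://localhost:3000
```

This keeps the backend services running under Docker (`make run`) while the frontend rebuilds instantly on file changes.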
Testing Strategy
The platform includes end-to-end testing with Playwright, ensuring the complete user experience works correctly across different scenarios. The modular architecture also supports unit testing of individual services.
Deployment Considerations
The Docker Compose setup is designed for both development and production use. Environment variables control feature flags and integrations, allowing you to customize the platform for different deployment contexts.
Ready to transform your organization's relationship with its collective knowledge and expertise!