sjc-website-chatbot / README.md
sjc-website-chatbot
this is a chatbot for SJC website
sjc-website-chatbot
This is a chatbot for SJC (Supreme Judiciary Council) website with service category navigation and admin portal for document management.
Key Features
- Service Category Navigation: Users can select specific services or use general chatbot mode
- Category-Specific Q&A: Chatbot answers questions based on selected service category
- JWT-Protected Admin Portal: Secure admin interface for file management
- Arabic/English Support: Full bilingual support with RTL interface for Arabic
- Document Upload: Admin can upload and categorize documents (PDF, DOCX, TXT, images)
Backend Service
1. Overview
This is a RAG chatbot service that provides conversational AI through REST APIs. It processes user queries and generates responses using large language models. It is for SJC (Supreme Judiciary Council) website.
It is very similar to MOCIIP chatbot, so many scripts were reused.
The file structure is as follows:
backend/
├── Dockerfile # docker file used for development has all dependencies.
├── Dockerfile.prod # lighter, has the dependencies that only needed at runtime (no web scrapping)
├── app/
│ ├── init.py
│ ├── auth.py # JWT authentication utilities
│ ├── categories.py # Service categories definitions
│ ├── chunker.py # Splits large text/content into manageable chunks
│ ├── config.py # Configuration settings and environment variables
│ ├── index.py # Handles indexing and querying in the vector database
│ ├── llm.py # LLM initialization and calling
│ ├── main.py # Service entry point for API (includes admin endpoints)
│ ├── qdrant.py # Qdrant vector database integration (store and query embeddings in Qdrant)
│ ├── rag.py # Retrieval-Augmented Generation orchestration
│ ├── scrapper.py # Web scraping logic
│ └── test.py # testing scripts, used to run main tasks for data extraction and indexing
├── poetry.lock
└── pyproject.toml
frontend/
├── src/ # React chatbot widget with category selection
└── ...
admin-portal/
├── src/ # Separate React admin app for file management
└── README.md # Admin portal documentation
streamlit_app/
├── dockerfile
└── streamlit_app.py # Frontend interface for end users for internal testing
2. Workflow/Process Flow
simple rag workflow that handles follow-up questions
Get User Query → validate user message → retrieve relevant documents(handling follow up questions) → generate answer → send back response
3. Technology Stack
- Runtime: Python 3.10+
- API Framework: FastAPI
- ML Models: OpenAI API
- Vector Database: Qdrant
- Dependency Management: Poetry
4. Setup
# using Docker Compose:
docker-compose up --build -d
once all services are up, add Qdrant database snapshot.
by adding a new collection to : http://localhost:6333/dashboard
Note: for the current setup you should name the collection sjc
then you can ask questions using streamlit app: http://localhost:8501
3
5. Environment Variables
you need to add this in your .env file
OPENAI_API_KEY: API key for OpenAI models
6. Testing
python backend/app/test.py
7. General Notes
- Data preparation is a one-time process, since the dataset is not expected to be updated frequently.
- Preparing the Qdrant database involves cleaning documents and scraping the web using
test.pyandJupyter notebook. A snapshot of the prepared database can be requested for reuse. - Backend's Dockerfiles are provided for development (
Dockerfile) and production (Dockerfile.prod). - The
Dockerfile.prodimage is optimized for production use — it does not include the data preparation step. - To add the Qdrant database manually, create a collection named
sjcand load the provided snapshot into it.
8. Service Categories
The chatbot supports the following service categories:
-
خدمات المحاكم (بوابة قضاء) - Court Services (Qadaa Portal)
- Electronic lawsuit registration
- Case and session inquiries
- Request copies of judgments
- Session attendance certificates
- Case tracking
- Electronic memo submission
- Session postponement requests
- Execution orders tracking
-
خدمات التنفيذ (نظام تنفيذ) - Execution Services (Tanfeedh System)
- Execution file inquiries
- Execution request tracking
- Execution court appointments
- Electronic payment via ONEIC
- Freeze/unfreeze requests
- Automatic arrest/release orders
-
خدمات التركات (نظام تركات) - Inheritance Services (Taraakat System)
- Heir restriction requests
- Estate inventory
- Estate manager appointment
- Estate distribution
- Minor property sale
- Estate management agency
- Legal heir notifications
-
خدمات الكاتب بالعدل (منصة توثيق) - Notary Services (Tawtheeq Platform)
- Power of attorney issuance (all types)
- Power of attorney cancellation
- Contract authentication
- Declaration authentication
- Will and endowment registration
- On-site notary services
- Authentication transaction inquiries
-
خدمات محكمة الاستثمار والتجارة - Investment and Commerce Court Services
- Company and investment lawsuits
- Partnership contract registration
- Commercial lawsuit tracking
- Commercial judgment copies
- Electronic commercial appeals
Users can select a specific category for focused answers or use general mode for broader questions.
9. Admin Portal
A separate React admin application is available for managing documents and files.
Admin Setup
- Navigate to the admin portal directory:
cd admin-portal
npm install
- Configure admin credentials in backend
.env:
JWT_SECRET_KEY=your-strong-secret-key
ADMIN_USERNAME=admin
ADMIN_PASSWORD=your-secure-password
- Start the admin portal:
npm run dev
Access at: http://localhost:3001
Admin Features
- JWT Authentication: Secure login with token-based auth
- File Upload: Upload documents (PDF, DOCX, TXT, PNG, JPG)
- Category Tagging: Assign files to specific service categories
- File Management: View uploaded files with metadata
- Arabic Interface: Full RTL support
See admin-portal/README.md for detailed documentation.
10. API Endpoints
Chat Endpoint
POST /api/chat
Request body:
{
"message": "user question",
"history": [...],
"category": "courts" // optional, filter by category
}
Admin Endpoints (JWT Protected)
Login
POST /api/admin/login
Body: { "username": "admin", "password": "..." }
Returns: { "access_token": "...", "token_type": "bearer" }
Get Categories
GET /api/admin/categories
Returns: { "categories": [...] }
Upload File
POST /api/admin/upload
Headers: { "Authorization": "Bearer <token>" }
Form data: { "file": <file>, "category": "courts" }
List Files
GET /api/admin/files
Headers: { "Authorization": "Bearer <token>" }
Returns: { "files": [...] }
11. Security
- JWT tokens with configurable expiration (default 24 hours)
- Password-protected admin access
- File size limits (10MB default)
- CORS configuration for allowed origins
- Rate limiting on chat endpoint (20 requests/minute per IP)
- Session-based token quotas
Security Best Practices:
- Change default admin credentials in production
- Use strong JWT secret key
- Enable HTTPS in production
- Regularly rotate JWT secrets
- Monitor file uploads for malicious content
12. Chatbot Embed
To embed the chatbot, use the following HTML snippet:
<script src="http://localhost:3000/embed.js"></script>
<script>
ChatbotWidget.init({
widgetUrl: "http://localhost:3000/#/widget?lang=ar",
})
</script>