Purpose of This Document
This document is a reference for AI coding agents building this system. It describes the architectural patterns, system components, and design decisions needed to create a student AI chat interface with complete pedagogical control.
If you're a human reader, this document will help you understand the system design, but it's optimized to provide context to command-line AI coding assistants during implementation.
Implementation note: This document describes architectural patterns conceptually. The build stages in this guide implement these patterns using PHP on shared hosting, but the patterns themselves are platform-agnostic. If you're working with Node.js, Python, or another backend technology, these same architectural decisions and component relationships apply - you'll simply implement them using your platform's idioms and libraries.
System Overview
This is a web-based AI chat application designed for educational use. It allows educators to:
- Customize system instructions that shape AI behavior for learning
- Configure conversation frameworks and modular learning supports
- Deploy without authentication systems (privacy-first, simple deployment)
- Store student conversations locally in their browsers (data ownership)
- Pay only for API usage (no per-student licensing)
The architecture prioritizes educator autonomy and student privacy.
Core Architectural Decisions
No Authentication System
Why: Privacy, simplicity, and faster deployment. Each installation serves a single class or use case. Students access via URL. No user accounts, passwords, or session management is required, and no chat data is used for model training.
Trade-off: Multiple classes need multiple installations (different URLs or paths).
Local Browser Storage
Why: Student data ownership. Conversations never leave the student's device unless they explicitly export them. No cloud storage, no server-side conversation databases.
Implementation: IndexedDB via LocalForage library.
Server-Side Instruction Assembly
Why: Avoid ModSecurity WAF (Web Application Firewall) restrictions that block large POST payloads. Many shared hosting environments reject requests with very long system instructions.
Solution: Store only IDs in configuration. Server-side PHP assembles the full instruction text from multiple sources before sending it to the AI provider.
API-Agnostic Design
Why: Let educators choose the AI provider that best fits their needs and budget. Providers have different pricing, capabilities, and policies.
Implementation: The core architecture works with any REST-based LLM API. Only the API proxy needs adapting to each provider's request and response formats.
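One way to keep the proxy provider-agnostic is to hide each provider's request and response shape behind a small adapter. The sketch below uses JavaScript purely for illustration (the build stages use PHP). The adapter name and the wire format (`system`, `turns`, `output`, `usage`) are invented for the example and do not describe any real provider's API.

```javascript
// Hypothetical adapter: the wire format below is invented for illustration
// and does not describe any real provider's API.
const exampleAdapter = {
  // Build the provider-specific request body from the app's internal shape.
  buildRequest(systemInstruction, messages, model) {
    return {
      model,
      system: systemInstruction,
      turns: messages.map((m) => ({ speaker: m.role, text: m.content })),
    };
  },
  // Normalize the provider's response into what the frontend expects.
  parseResponse(raw) {
    return {
      text: raw.output.text,
      tokens: { input: raw.usage.input, output: raw.usage.output },
    };
  },
};

// The proxy talks only to the adapter interface, never to provider fields.
function prepareCall(adapter, systemInstruction, messages, model) {
  return adapter.buildRequest(systemInstruction, messages, model);
}

const body = prepareCall(
  exampleAdapter,
  "Be a patient tutor.",
  [{ role: "user", content: "What is osmosis?" }],
  "example-model"
);
console.log(JSON.stringify(body, null, 2));
```

Under this assumption, supporting a second provider means writing a second adapter object with the same two methods; the rest of the proxy stays unchanged.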
Conversation Instruction Snapshots
Why: Enables modular, curriculum-aligned learning. Teachers can change the system configuration (switch conversation frameworks or learning supports) for new conversations without affecting existing student work.
Implementation: Each conversation stores a complete copy of the system instruction that was active when it was created.
Pedagogical benefit: In Week 1, students create "brainstorming" conversations. In Week 2, the teacher switches the framework to "project planning". Students can still open and continue their Week 1 brainstorming conversations, while new conversations use the planning framework.
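The snapshot behavior can be illustrated with a minimal sketch. The helper name `createConversation` and the field names are assumptions, not the guide's actual schema.

```javascript
// Hypothetical helper: creates a conversation that carries its own copy
// of the system instruction active at creation time.
function createConversation(id, title, activeInstruction, now = Date.now()) {
  return {
    id,
    title,
    createdAt: now,
    updatedAt: now,
    messages: [],
    // Strings are immutable in JavaScript, so storing the value is a true snapshot.
    systemInstruction: activeInstruction,
  };
}

// Week 1: the teacher's configuration produces a brainstorming instruction.
let currentInstruction = "Core framework + brainstorming framework";
const week1 = createConversation("c1", "Topic ideas", currentInstruction);

// Week 2: the teacher switches frameworks; only new conversations see the change.
currentInstruction = "Core framework + project planning framework";
const week2 = createConversation("c2", "Project plan", currentInstruction);

console.log(week1.systemInstruction); // still the brainstorming instruction
console.log(week2.systemInstruction); // the planning instruction
```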
System Components
Configuration System
Three JSON files work together to provide flexible, modular configuration:
1. Main Configuration File
- Application settings (title, subtitle, version)
- Master enable/disable switch
- AI model identifier
- Core educational framework (base system instruction)
- Reference to selected conversation framework (by ID, not full text)
- Array of selected learning support IDs (not full text)
- Student-facing instructions and warnings
- UI customization options
2. Conversation Frameworks Library File
- Array of complete conversation framework definitions
- Each includes: title, description, full prompt text, example interactions
- Referenced by numeric ID (array position)
- Allows teachers to build a library of frameworks and switch between them
3. Learning Supports Library File
- Array of modular pedagogical strategies
- Each includes: unique ID, title, prompt text, usage notes
- Can be combined (multiple supports selected at once)
- Examples: scaffolding techniques, questioning strategies, feedback approaches
Why this structure:
- Modular: Mix and match supports without rewriting prompts
- Manageable: Change active configuration without editing large text blocks
- Avoids WAF issues: Full text assembled server-side, only IDs sent from client
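As a rough illustration of how the three files relate, the sketch below shows them as the objects a JSON parser would produce, plus a check that every ID in the main configuration resolves to a library entry. All key names are hypothetical: the document specifies what each file contains, not its exact keys.

```javascript
// All field names below are hypothetical.
const mainConfig = {
  title: "Biology Study Assistant",
  enabled: true,
  model: "example-model",
  coreFramework: "You are a patient tutor who guides students to their own answers.",
  selectedFrameworkId: 1, // array position in the frameworks library
  selectedSupportIds: ["scaffold", "socratic"],
};

const frameworksLibrary = [
  { title: "Brainstorming", description: "Generate ideas", prompt: "Help the student brainstorm.", examples: [] },
  { title: "Project Planning", description: "Plan a project", prompt: "Help the student plan.", examples: [] },
];

const supportsLibrary = [
  { id: "scaffold", title: "Scaffolding", prompt: "Break tasks into steps.", notes: "" },
  { id: "socratic", title: "Questioning", prompt: "Ask guiding questions.", notes: "" },
];

// Report any IDs in the main config that do not resolve to a library entry.
function findMissingReferences(config, frameworks, supports) {
  const missing = [];
  const idx = config.selectedFrameworkId;
  if (!Number.isInteger(idx) || idx < 0 || idx >= frameworks.length) {
    missing.push(`framework:${idx}`);
  }
  const known = new Set(supports.map((s) => s.id));
  for (const id of config.selectedSupportIds) {
    if (!known.has(id)) missing.push(`support:${id}`);
  }
  return missing;
}

console.log(findMissingReferences(mainConfig, frameworksLibrary, supportsLibrary)); // []
```

Because frameworks are referenced by array position, reordering the frameworks library silently changes which framework is selected; a check like this catches out-of-range IDs but not reordering.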
System Instruction Assembly Component
A server-side PHP component that dynamically builds complete system instructions.
Process:
- Receives configuration (or conversation-specific instruction if resuming saved conversation)
- If building fresh instruction:
  - Reads main config file for core framework
  - Looks up selected conversation framework by ID from frameworks library
  - Looks up selected learning supports by IDs from supports library
  - Retrieves any framework-specific customization notes
- Combines all components into single complete instruction
- Returns assembled instruction for use in API request
Output includes:
- Core educational framework
- Conversation framework prompt and examples
- All selected learning support strategies
- Customization notes
- Framework title markers (for UI display)
Why server-side:
- Keeps large text assembly off the client
- Avoids WAF restrictions on POST payload size
- Central logic for instruction composition
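The assembly process described above might look like the following sketch. It is written in JavaScript for illustration, whereas the build stages implement it in PHP. The key names, the `[Framework: ...]` marker format, and the blank-line separator are assumptions; example interactions are omitted for brevity.

```javascript
// Hypothetical sketch of instruction assembly; key names and the section
// separator are assumptions, not the guide's actual format.
function assembleInstruction(config, frameworks, supports) {
  const framework = frameworks[config.selectedFrameworkId];
  if (!framework) {
    throw new Error(`Unknown framework ID: ${config.selectedFrameworkId}`);
  }
  const selected = config.selectedSupportIds.map((id) => {
    const support = supports.find((s) => s.id === id);
    if (!support) throw new Error(`Unknown support ID: ${id}`);
    return support;
  });

  const parts = [
    config.coreFramework,
    `[Framework: ${framework.title}]`, // title marker the UI could display
    framework.prompt,
    ...selected.map((s) => s.prompt),
  ];
  if (config.customizationNotes) parts.push(config.customizationNotes);
  return parts.join("\n\n");
}

// A saved conversation's snapshot takes precedence over fresh assembly.
function resolveInstruction(savedInstruction, config, frameworks, supports) {
  return savedInstruction || assembleInstruction(config, frameworks, supports);
}

const instruction = resolveInstruction(
  null,
  { coreFramework: "Guide, don't tell.", selectedFrameworkId: 0, selectedSupportIds: ["s1"] },
  [{ title: "Brainstorming", prompt: "Help the student generate ideas." }],
  [{ id: "s1", prompt: "Ask one question at a time." }]
);
console.log(instruction);
```

Failing loudly on an unknown ID is one design choice; a deployment could instead skip the missing component and log a warning.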
API Proxy Layer
A PHP backend component that mediates between the client and the AI provider API.
Responsibilities:
Request Handling:
- Receives conversation history from client
- Retrieves or assembles system instruction
- Formats request for AI provider's API specification
- Includes API key from environment variable
- Sends request to AI provider
Rate Limiting:
- Tracks request timestamps per client IP address
- Enforces minimum interval between requests (prevents abuse)
- Uses filesystem-based locks for tracking
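The interval check itself is small. The sketch below keeps timestamps in an in-memory map as a stand-in for the filesystem-based tracking described above, so it can run anywhere; the class name and the 1000 ms default are illustrative assumptions.

```javascript
// In-memory stand-in for the filesystem-based tracking described above.
// The class name and the 1000 ms default are illustrative assumptions.
class RateLimiter {
  constructor(minIntervalMs = 1000) {
    this.minIntervalMs = minIntervalMs;
    this.lastSeen = new Map(); // client IP -> timestamp of last accepted request
  }

  // Returns true if the request may proceed, false if it arrived too soon.
  allow(ip, now = Date.now()) {
    const last = this.lastSeen.get(ip);
    if (last !== undefined && now - last < this.minIntervalMs) {
      return false;
    }
    this.lastSeen.set(ip, now);
    return true;
  }
}

const limiter = new RateLimiter(1000);
console.log(limiter.allow("203.0.113.5", 0));    // true
console.log(limiter.allow("203.0.113.5", 400));  // false (only 400 ms later)
console.log(limiter.allow("203.0.113.5", 1400)); // true
console.log(limiter.allow("198.51.100.7", 500)); // true (different client)
```

Note that in a classroom, many students may share one public IP address, so a strict per-IP interval can throttle a whole class; the interval should be chosen with that in mind.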
Error Handling & Retry Logic:
- Implements exponential backoff for rate limit responses (HTTP 429)
- Retries failed requests with increasing wait times
- Logs errors for debugging
- Returns meaningful error messages to client
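A minimal sketch of the retry behavior follows, assuming a doubling delay with a cap. The base delay, cap, retry count, and function names are all illustrative; the `send` and `sleep` functions are injected so the logic can be exercised without a network.

```javascript
// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ...
function backoffDelay(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Calls `send()` and retries on HTTP 429 with exponentially increasing waits.
// After `maxRetries` retries, the last response is returned as-is.
async function sendWithRetry(send, { maxRetries = 3, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await send();
    if (response.status !== 429 || attempt >= maxRetries) {
      return response;
    }
    await sleep(backoffDelay(attempt));
  }
}

// Delays for the first six attempts: 500, 1000, 2000, 4000, 8000, 8000
console.log([0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n)).join(", "));
```

Production implementations often add random jitter to each delay so that many clients do not retry in lockstep.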
Response Handling:
- Extracts AI response from provider's format
- Includes token usage metadata (for cost tracking)
- Returns formatted response to client
Security:
- API key stored in environment variable (never in code or config files)
- SSL verification enforced for all external requests
- Security headers (CSP, XSS protection, frame options)
- Input validation for conversation history format
- HTML escaping for user content before storage
Why a proxy:
- Keeps API keys server-side (not exposed to client)
- Centralizes rate limiting and error handling
- Allows logging and monitoring
- Abstracts provider-specific API details from frontend
Storage Layer
Browser-based storage using IndexedDB via the LocalForage library.
Storage Model:
- Each conversation is a complete object stored in IndexedDB
- Path-based namespacing allows multiple installations on same domain
- Maximum 15 conversations per namespace
Conversation Object Structure:
- Unique ID
- Title (user-editable)
- Creation and last-updated timestamps
- Array of message objects (role: user/model, content: text)
- Complete system instruction (frozen snapshot from creation time)
- Instruction type label (e.g., "Biology Study Assistant")
- Metadata: model name, token counts, namespace
Why this structure:
- Frozen instruction: Ensures pedagogical consistency within each conversation even if global config changes
- Complete messages array: Enables continuing multi-turn conversations
- Metadata: Allows cost tracking and conversation management
- Namespace: Enables multiple installations without conflicts
Path-Based Namespacing:
- Namespace derived from URL path
- /math-class/ and /history-class/ maintain separate conversation stores
- Enables single domain to host multiple class installations
- No cross-contamination of student data
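Deriving the namespace from the path could be done along these lines. The function name, the `chat_` prefix, and the `root` fallback are assumptions made for the sketch.

```javascript
// Derive a storage namespace from the URL path of the installation.
// The "chat_" prefix and "root" fallback are illustrative assumptions.
function namespaceFromPath(pathname) {
  const segments = pathname.split("/").filter(Boolean);
  // Drop a trailing file name such as index.html.
  if (segments.length > 0 && segments[segments.length - 1].includes(".")) {
    segments.pop();
  }
  const slug = segments.join("_").replace(/[^A-Za-z0-9_-]/g, "_");
  return "chat_" + (slug || "root");
}

console.log(namespaceFromPath("/math-class/"));              // chat_math-class
console.log(namespaceFromPath("/history-class/index.html")); // chat_history-class
console.log(namespaceFromPath("/"));                         // chat_root
```

In the browser, the input would typically be `window.location.pathname`, and the result could be passed as the `name` option of `localforage.createInstance()` so that each installation gets its own store.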
Automatic Limit Management:
- Storage validates before saving
- If at 15 conversations, identifies oldest by last-updated timestamp
- Asks the user to delete a conversation to make room for the new one (can suggest the oldest)
- User can manually delete conversations anytime
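The limit check can be expressed as a pure function over the list of stored conversations, as in this sketch. The field name `updatedAt` and the function names are assumptions; loading the list from LocalForage is left out so the logic stays self-contained.

```javascript
const MAX_CONVERSATIONS = 15;

// Returns the conversation with the oldest last-updated timestamp, or null.
function findOldest(conversations) {
  if (conversations.length === 0) return null;
  return conversations.reduce((oldest, c) =>
    c.updatedAt < oldest.updatedAt ? c : oldest
  );
}

// Decide whether a conversation can be saved and, if not, which existing
// one to suggest for deletion. Re-saving an existing conversation is always allowed.
function checkBeforeSave(conversations, idToSave) {
  const isUpdate = conversations.some((c) => c.id === idToSave);
  if (isUpdate || conversations.length < MAX_CONVERSATIONS) {
    return { ok: true, suggestDelete: null };
  }
  return { ok: false, suggestDelete: findOldest(conversations) };
}

const full = Array.from({ length: 15 }, (_, i) => ({ id: `c${i}`, updatedAt: 1000 + i }));
console.log(checkBeforeSave(full, "new").suggestDelete.id); // c0
console.log(checkBeforeSave(full, "c3").ok);                // true (updating an existing one)
```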
Why browser storage:
- Student data never leaves their device
- No server-side database needed (simpler deployment)
- Works offline for viewing saved conversations
- Student owns and controls their data
UI Components
Chat Interface:
- Message input field
- Send button and keyboard shortcuts
- Message display area with role distinction (user vs AI)
- Markdown rendering for AI responses
- Token usage display
- Loading states during API requests
Conversation Management:
- List of saved conversations with titles and timestamps
- Create new conversation
- Save current conversation (manual or auto-save)
- Load saved conversation (restores full context)
- Delete conversations
- Sort options (recent activity, newest, oldest)
- Active conversation indicator
Configuration Editors (Optional):
Two modes are possible, both for use by teachers:
- Module editor: Lightweight interface to select conversation frameworks and learning supports
- Full editor: Full administrative access to all configuration settings
Both editors:
- Read current configuration
- Provide selection interfaces (dropdowns, checkboxes)
- Save updated configuration back to file
- Use same backend handler with different permission levels
Why separate editors:
- Teachers maintain control over core framework and available options
- Teachers can opt for simpler module editor and leave the full editor to technology staff
- Single backend simplifies maintenance
Data Flow Diagrams
Request Flow
sequenceDiagram
participant User
participant UI
participant Storage
participant API Proxy
participant AI Provider
User->>UI: Enter message
UI->>Storage: Check for saved conversation
Storage-->>UI: Return conversation + instruction (if exists)
UI->>API Proxy: Send message + history + instruction
API Proxy->>API Proxy: Assemble or use provided instruction
API Proxy->>AI Provider: Format & send request
AI Provider-->>API Proxy: Return response + token usage
API Proxy-->>UI: Return formatted response
UI->>UI: Display response with markdown
UI->>Storage: Save conversation (if requested)
Storage-->>UI: Confirm saved
Configuration Assembly
flowchart TD
A[Incoming Chat Request] --> B{Has saved instruction?}
B -->|Yes| C[Use conversation snapshot]
B -->|No| D[Read main config]
D --> E[Get core framework]
D --> F[Get selected framework ID]
D --> G[Get selected support IDs]
F --> H[Look up framework in library]
G --> I[Look up supports in library]
E --> J[Combine all components]
H --> J
I --> J
J --> K[Complete system instruction]
C --> L[Send to AI provider]
K --> L
Storage Architecture
flowchart TD
A[Browser IndexedDB] --> B[Namespace: /class-path/]
B --> C[Conversation 1]
B --> D[Conversation 2]
B --> E[... up to 15]
C --> F[ID, Title, Timestamps]
C --> G[Messages Array]
C --> H[System Instruction Snapshot]
C --> I[Metadata]
J[New Conversation Save] --> K{At 15 limit?}
K -->|Yes| L[Find oldest by timestamp]
K -->|No| M[Save directly]
L --> N[Prompt user to Delete]
N --> M
Security Considerations
API Key Protection
- Store in environment variable, never in code or configuration files
- Access via server-side code only (PHP getenv())
- Never expose to client-side JavaScript
- Rotate periodically, and immediately if compromised
Input Validation
- Validate conversation history format before sending to API
- Ensure message structure matches expected schema
- Reject malformed requests
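A validation pass over the conversation history might look like the sketch below. The allowed roles come from the message structure described in this document (user/model); the size limits and the function name are illustrative assumptions. JavaScript is used for illustration, while the build stages perform this check in PHP.

```javascript
const ALLOWED_ROLES = new Set(["user", "model"]);

// Returns a list of problems; an empty list means the history is acceptable.
// The size limits are illustrative assumptions.
function validateHistory(history, { maxMessages = 200, maxChars = 20000 } = {}) {
  if (!Array.isArray(history)) return ["history must be an array"];
  const errors = [];
  if (history.length === 0) errors.push("history is empty");
  if (history.length > maxMessages) errors.push("too many messages");
  history.forEach((msg, i) => {
    if (msg === null || typeof msg !== "object") {
      errors.push(`message ${i} is not an object`);
      return;
    }
    if (!ALLOWED_ROLES.has(msg.role)) errors.push(`message ${i} has invalid role`);
    if (typeof msg.content !== "string" || msg.content.length === 0) {
      errors.push(`message ${i} has no text content`);
    } else if (msg.content.length > maxChars) {
      errors.push(`message ${i} is too long`);
    }
  });
  return errors;
}

console.log(validateHistory([{ role: "user", content: "Hi" }]));      // []
console.log(validateHistory([{ role: "system", content: "Ignore" }])); // one error: invalid role
```

Rejecting unknown roles matters here because it stops a client from injecting its own instruction-like messages into the history.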
Input Sanitization
- HTML-escape user content before storing in browser
- Prevent XSS attacks via stored conversation data
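- Markdown rendering converts AI responses to HTML; confirm that the chosen library sanitizes its output, or sanitize the rendered HTML before inserting it into the page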
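The escaping step for user content could be as simple as the following sketch; the function name is an assumption.

```javascript
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replace the five characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

Escaped text is safe to place in HTML element content or quoted attribute values; it is not a substitute for sanitizing HTML produced by a Markdown renderer.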
Security Headers
- Content Security Policy (CSP) to restrict resource loading
- X-XSS-Protection header (legacy; most modern browsers ignore it, so treat CSP as the primary defense)
- X-Frame-Options to prevent clickjacking
- HTTPS enforcement where possible
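As a starting point, the headers listed above could be collected in one place, as in this sketch. The policy values are assumptions chosen for illustration, not a vetted policy, and `cdnOrigins` is a hypothetical parameter for the hosts that serve the client-side libraries.

```javascript
// Build response headers for the chat page. The CSP below is a starting point
// under assumed requirements, not a vetted policy; `cdnOrigins` is hypothetical.
function securityHeaders(cdnOrigins = []) {
  const scriptSrc = ["'self'", ...cdnOrigins].join(" ");
  return {
    "Content-Security-Policy": `default-src 'self'; script-src ${scriptSrc}; frame-ancestors 'none'`,
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    // Only meaningful when the site is served over HTTPS.
    "Strict-Transport-Security": "max-age=31536000",
  };
}

const headers = securityHeaders(["https://cdn.example.com"]);
for (const [name, value] of Object.entries(headers)) {
  console.log(`${name}: ${value}`);
}
```

Any inline scripts or styles in the page would be blocked by a policy like this, so the real policy has to be tuned against the actual frontend.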
Rate Limiting
- IP-based request tracking
- Minimum interval between requests (typically 1 second)
- Prevents abuse and runaway API costs
- Filesystem-based lock mechanism for tracking
SSL/TLS
- Enforce SSL verification on all API requests to AI providers
- Use HTTPS for production deployments
- Protect API keys and conversation data in transit
Trade-offs and Alternatives
No Authentication
Trade-off: Anyone with the URL can access. Suitable for classroom environments where URL is shared only with enrolled students.
Alternative considered: User accounts with authentication. Rejected due to complexity, privacy concerns (storing student data), and deployment overhead.
Browser Storage Limit (15 conversations)
Trade-off: Students must delete old conversations to create new ones.
Alternative considered: Unlimited storage. Rejected due to browser storage quotas and UX complexity of managing large conversation lists.
Mitigation: Students can export conversations before deletion.
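Export can be a plain serialization of the stored conversation object. The sketch below shows two possible formats; the function names, the `exportedAt` field, and the "Student"/"AI" labels are assumptions.

```javascript
// Serialize a conversation, including its instruction snapshot, as JSON text.
function exportConversation(conversation, now = new Date()) {
  const payload = { exportedAt: now.toISOString(), conversation };
  return JSON.stringify(payload, null, 2);
}

// Produce a human-readable transcript of the messages only.
function toTranscript(conversation) {
  return conversation.messages
    .map((m) => `${m.role === "user" ? "Student" : "AI"}: ${m.content}`)
    .join("\n\n");
}

const sample = {
  id: "c1",
  title: "Cell biology review",
  messages: [
    { role: "user", content: "What does the mitochondrion do?" },
    { role: "model", content: "What do you already know about how cells get energy?" },
  ],
};

console.log(toTranscript(sample));
console.log(exportConversation(sample).length > 0); // true
```

In the browser, the resulting text could be offered as a download by wrapping it in a `Blob` and creating a temporary link with `URL.createObjectURL()`.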
Server-Side Configuration Files
Trade-off: Requires server access to change configuration. Not suitable for non-technical users without file access.
Alternative considered: Database-backed configuration. Rejected due to deployment complexity and shared hosting limitations.
Mitigation: Configuration editors provide UI for file editing.
Single Installation Per Use Case
Trade-off: Multiple classes need multiple installations (different paths or subdomains).
Alternative considered: Multi-tenant system with class/student management. Rejected due to authentication requirements and complexity.
Benefit: Simplicity, privacy, isolation between classes.
Deployment Considerations
Hosting Requirements
- PHP 7.4+ support
- Apache web server with mod_rewrite (for clean URLs)
- Ability to set environment variables
- File write permissions for logs and rate limiting
- HTTPS recommended for production
External Dependencies
- LocalForage: Browser storage abstraction (loaded via CDN)
- Showdown or similar: Markdown rendering (loaded via CDN)
- AI Provider API: Internet connectivity required for API requests
No Database Required
- All configuration stored in JSON files
- All conversation data stored client-side in browser
- Simplifies deployment and reduces hosting requirements