OAuth Storage Setup (Option 1: Self-Hosted with Persistence)
Overview
The OAuth authorization server now includes file-based persistent storage for:
- ✅ Registered OAuth clients
- ✅ Authorization codes (with expiration)
- ✅ Access tokens (with expiration)
- ✅ Refresh tokens
This implementation provides a production-ready foundation for small to medium single-instance deployments without requiring an external database.
What Changed
New Files
- src/corpusiq/oauth_storage.py - Thread-safe file-based storage for OAuth data
  - JSON storage in the .oauth_data/ directory
  - Automatic expiration handling
  - CRUD operations for clients, codes, and tokens
- src/corpusiq/oauth_cleanup.py - Background task that runs every hour
  - Automatically removes expired authorization codes
  - Automatically removes expired access tokens
  - Prevents storage bloat
- test_oauth_storage.py - Comprehensive test suite for OAuth storage
  - Validates all storage operations
  - Tests expiration and cleanup
Updated Files
- src/corpusiq/app.py - Integrated OAuth storage into all endpoints
  - /register: Now persists client data
  - /authorize: Validates clients and stores authorization codes
  - /token: Validates codes, clients, and stores access tokens
  - Background cleanup task starts on app initialization
- .gitignore - Added .oauth_data/ to prevent committing sensitive data
How It Works
Client Registration (/register)
1. Client sends registration request
2. Server validates redirect URIs
3. Server generates client_id and client_secret
4. Server STORES client in .oauth_data/clients.json
5. Server returns credentials
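The registration steps above can be sketched roughly as follows. This is an illustration only, not the code in app.py: the helper names `validate_redirect_uris` and `register_client` are hypothetical, the identifier lengths are arbitrary, and a plain dict stands in for .oauth_data/clients.json.

```python
import secrets
import time
from urllib.parse import urlparse


def validate_redirect_uris(uris):
    """Return True if every URI is an absolute http(s) URI without a fragment."""
    if not uris:
        return False
    for uri in uris:
        parsed = urlparse(uri)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            return False
        if parsed.fragment:  # RFC 6749 disallows fragments in redirect URIs
            return False
    return True


def register_client(clients, client_name, redirect_uris, now=None):
    """Create a client record and store it in the `clients` dict (hypothetical helper)."""
    if not validate_redirect_uris(redirect_uris):
        raise ValueError("invalid redirect_uris")
    now = int(time.time()) if now is None else now
    client_id = "client_" + secrets.token_urlsafe(16)
    record = {
        "client_id": client_id,
        "client_secret": "secret_" + secrets.token_urlsafe(32),
        "client_name": client_name,
        "redirect_uris": list(redirect_uris),
        "client_id_issued_at": now,
    }
    clients[client_id] = record  # the real server would persist this to clients.json
    return record
```

The `secrets` module is used rather than `random` because client secrets need to be unpredictable.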
Authorization Flow (/authorize)
1. Client requests authorization with client_id
2. Server VALIDATES client exists
3. Server VALIDATES redirect_uri matches registered URI
4. Server generates authorization code
5. Server STORES code in .oauth_data/codes.json (expires in 10 min)
6. Server redirects with code
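As a rough sketch of the authorization steps above, assuming in-memory dicts in place of the JSON files: the function name `authorize` and the optional `state` parameter are assumptions (the steps above do not mention `state`, though OAuth clients commonly send it), and the actual endpoint in app.py may differ.

```python
import secrets
import time
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

CODE_TTL_SECONDS = 600  # 10 minutes, matching the expiry described above


def authorize(clients, codes, client_id, redirect_uri, scope, state=None, now=None):
    """Validate the request, store a code, and return the redirect URL (hypothetical helper)."""
    client = clients.get(client_id)
    if client is None:
        raise ValueError("unknown client")
    if redirect_uri not in client["redirect_uris"]:  # exact string match only
        raise ValueError("redirect_uri mismatch")
    now = int(time.time()) if now is None else now
    code = "code_" + secrets.token_urlsafe(24)
    codes[code] = {
        "code": code,
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "expires_at": now + CODE_TTL_SECONDS,
    }
    # Append the code (and state, if given) while keeping any existing query string.
    parts = urlparse(redirect_uri)
    query = parse_qsl(parts.query) + [("code", code)]
    if state is not None:
        query.append(("state", state))
    return urlunparse(parts._replace(query=urlencode(query)))
```

Comparing the redirect URI by exact string match, rather than by prefix or pattern, keeps an attacker from redirecting the code to a look-alike URL.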
Token Exchange (/token)
1. Client submits authorization code + credentials
2. Server VALIDATES client exists
3. Server VALIDATES client_secret matches
4. Server RETRIEVES authorization code from storage
5. Server VALIDATES code not expired
6. Server VALIDATES code belongs to client
7. Server DELETES code (one-time use)
8. Server generates access token
9. Server STORES token in .oauth_data/tokens.json (expires in 1 hour)
10. Server returns access token
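The token exchange above might look roughly like this. It is a sketch under stated assumptions: `exchange_code` and `TokenError` are hypothetical names, dicts stand in for the JSON files, and the error strings follow the standard OAuth error codes rather than anything confirmed in app.py.

```python
import hmac
import secrets
import time

TOKEN_TTL_SECONDS = 3600  # 1 hour, matching the expiry described above


class TokenError(Exception):
    """Raised with an OAuth error code such as 'invalid_client' or 'invalid_grant'."""


def exchange_code(clients, codes, tokens, client_id, client_secret, code, now=None):
    """Exchange a stored authorization code for an access token (hypothetical helper)."""
    now = int(time.time()) if now is None else now
    client = clients.get(client_id)
    if client is None:
        raise TokenError("invalid_client")
    # Constant-time comparison avoids leaking the secret through timing differences.
    if not hmac.compare_digest(client["client_secret"].encode(), client_secret.encode()):
        raise TokenError("invalid_client")
    # pop() removes the code before any further checks, so it can never be replayed.
    record = codes.pop(code, None)
    if record is None or record["expires_at"] <= now:
        raise TokenError("invalid_grant")
    if record["client_id"] != client_id:
        raise TokenError("invalid_grant")
    access_token = "token_" + secrets.token_urlsafe(32)
    tokens[access_token] = {
        "access_token": access_token,
        "token_type": "Bearer",
        "client_id": client_id,
        "scope": record["scope"],
        "expires_at": now + TOKEN_TTL_SECONDS,
    }
    return {
        "access_token": access_token,
        "token_type": "Bearer",
        "expires_in": TOKEN_TTL_SECONDS,
        "scope": record["scope"],
    }
```

Note one deliberate difference from the numbered steps: this sketch deletes the code before validating it, so a code presented by the wrong client is also consumed. Whether that is desirable depends on how strictly failed attempts should be treated.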
Background Cleanup
Every hour:
1. Check all authorization codes
2. Delete expired codes
3. Check all access tokens
4. Delete expired tokens
5. Log cleanup results
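One way to picture the hourly cleanup is a small purge function driven by an asyncio loop. This is a sketch, not the contents of oauth_cleanup.py: `purge_expired` and `cleanup_loop` are hypothetical names, and dicts stand in for the stored codes and tokens.

```python
import asyncio
import logging
import time

logger = logging.getLogger("oauth_cleanup")


def purge_expired(entries, now=None):
    """Delete entries whose expires_at has passed; return how many were removed."""
    now = time.time() if now is None else now
    expired = [
        key for key, value in entries.items()
        if value.get("expires_at") is not None and value["expires_at"] <= now
    ]
    for key in expired:
        del entries[key]
    return len(expired)


async def cleanup_loop(codes, tokens, interval_seconds=3600):
    """Purge once immediately, then again after every interval, until cancelled."""
    while True:
        removed_codes = purge_expired(codes)
        removed_tokens = purge_expired(tokens)
        logger.info("Cleaned up %d expired authorization codes", removed_codes)
        logger.info("Cleaned up %d expired access tokens", removed_tokens)
        await asyncio.sleep(interval_seconds)
```

Inside a running event loop, such a task could be started with `asyncio.create_task(cleanup_loop(codes, tokens))` and cancelled on shutdown.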
Storage Format
.oauth_data/clients.json
{
"client_abc123": {
"client_id": "client_abc123",
"client_secret": "secret_xyz789",
"client_name": "ChatGPT",
"redirect_uris": ["https://chat.openai.com/aip/callback"],
"grant_types": ["authorization_code"],
"response_types": ["code"],
"token_endpoint_auth_method": "client_secret_basic",
"scope": "corpus:read corpus:search",
"client_id_issued_at": 1704240000,
"client_secret_expires_at": 1711996800
}
}
.oauth_data/codes.json
{
"code_def456": {
"code": "code_def456",
"client_id": "client_abc123",
"redirect_uri": "https://chat.openai.com/aip/callback",
"scope": "corpus:read corpus:search",
"code_challenge": null,
"code_challenge_method": null,
"expires_at": 1704240600,
"user_id": null
}
}
.oauth_data/tokens.json
{
"token_ghi789": {
"access_token": "token_ghi789",
"token_type": "Bearer",
"client_id": "client_abc123",
"scope": "corpus:read corpus:search",
"expires_at": 1704243600,
"refresh_token": "refresh_jkl012"
}
}
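All three files share the same shape: a JSON object keyed by identifier. A minimal sketch of a store for that shape follows, assuming a per-process lock and atomic file replacement. The class name `JsonFileStore` and its methods are illustrative and are not the API of oauth_storage.py.

```python
import json
import os
import tempfile
import threading
from pathlib import Path


class JsonFileStore:
    """Keyed JSON file with a lock around every read-modify-write cycle."""

    def __init__(self, path):
        self._path = Path(path)
        self._lock = threading.Lock()  # protects threads in this process only
        self._path.parent.mkdir(parents=True, exist_ok=True)

    def _read(self):
        if not self._path.exists():
            return {}
        with self._path.open("r", encoding="utf-8") as f:
            return json.load(f)

    def _write(self, data):
        # Write to a temp file in the same directory, then atomically swap it in,
        # so a crash mid-write cannot leave a truncated JSON file behind.
        fd, tmp = tempfile.mkstemp(dir=str(self._path.parent), suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(data, f, indent=2)
            os.replace(tmp, self._path)
        except BaseException:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise

    def get(self, key):
        with self._lock:
            return self._read().get(key)

    def put(self, key, value):
        with self._lock:
            data = self._read()
            data[key] = value
            self._write(data)

    def delete(self, key):
        """Remove a key; return True if it existed."""
        with self._lock:
            data = self._read()
            existed = data.pop(key, None) is not None
            if existed:
                self._write(data)
            return existed
```

A `threading.Lock` does not coordinate separate processes, which is why a shared database is the safer choice once more than one server instance writes to the same files.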
Testing
1. Test Storage Directly
# Run the test suite
$env:PYTHONPATH="$PWD\src"
python test_oauth_storage.py
2. Test Client Registration
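# In Windows PowerShell, "curl" is an alias for Invoke-WebRequest, so call curl.exe explicitly.
# Older PowerShell versions may strip the inner double quotes when passing JSON to curl.exe;
# if the server rejects the body, use Invoke-RestMethod as shown in step 3.
curl.exe -X POST https://mcp.sqltrainer.com/register `
-H "Content-Type: application/json" `
-d '{
"client_name": "Test Client",
"redirect_uris": ["https://example.com/callback"]
}'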
Check .oauth_data/clients.json - you should see the new client!
3. Test Authorization Flow
# Step 1: Register a client (get client_id from response)
$response = Invoke-RestMethod -Uri "https://mcp.sqltrainer.com/register" `
-Method POST `
-ContentType "application/json" `
-Body '{"client_name":"Test","redirect_uris":["http://localhost:8080/callback"]}'
$clientId = $response.client_id
# Step 2: Request authorization (will auto-approve and redirect)
Start-Process "https://mcp.sqltrainer.com/authorize?client_id=$clientId&redirect_uri=http://localhost:8080/callback&response_type=code&scope=corpus:read"
# Step 3: Extract code from redirect URL and exchange for token
# (Manual step - copy code from browser URL)
4. Test Token Validation
# After getting a token, check it in storage
Get-Content .oauth_data/tokens.json | ConvertFrom-Json
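The same check can be scripted. The sketch below assumes the tokens.json layout shown under Storage Format; `load_tokens` and `find_valid_token` are hypothetical helpers, not functions exported by the project.

```python
import json
import time
from pathlib import Path


def load_tokens(tokens_path=".oauth_data/tokens.json"):
    """Load the token file, returning an empty dict if it does not exist yet."""
    path = Path(tokens_path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def find_valid_token(tokens, access_token, now=None):
    """Return the token record if it exists and has not expired, else None."""
    record = tokens.get(access_token)
    if record is None:
        return None
    now = time.time() if now is None else now
    if record["expires_at"] <= now:
        return None
    return record


if __name__ == "__main__":
    tokens = load_tokens()
    live = [t for t in tokens if find_valid_token(tokens, t) is not None]
    print(f"{len(live)} of {len(tokens)} stored tokens are still valid")
```

Expired tokens can remain in the file until the next cleanup run, so the stored count and the valid count may differ.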
Security Considerations
✅ Implemented
- Thread-safe file operations (prevents race conditions within a single server process)
- Client secret validation
- Authorization code one-time use (deleted after exchange)
- Token expiration (codes: 10 min, tokens: 1 hour)
- Redirect URI validation
- Client validation on all endpoints
⚠️ Consider for Production
- Encrypt client secrets at rest (currently plain text in JSON)
- Add rate limiting per client (currently global only)
- Implement PKCE validation for code_challenge/code_verifier
- Add token revocation endpoint (/revoke)
- Add refresh token rotation
- Consider database storage for multi-instance deployments
- Add audit logging for compliance
- Implement client authentication for registration endpoint
Migration Path
Current State (Option 1)
- File-based storage
- Single instance deployments
- Development and small production workloads
Future Options
Option 2: Add Database
# Replace OAuthStorage with database-backed version
from corpusiq.oauth_storage_db import OAuthStorageDB
storage = OAuthStorageDB(connection_string="postgresql://...")
Option 3: Use OAuth Provider
- Migrate to Auth0, Keycloak, or similar
- Update endpoints to proxy to external provider
- Keep MCP server endpoints unchanged
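Switching backends is easier if the endpoints depend on a small interface rather than on a concrete class. The sketch below shows the idea with `typing.Protocol`. The interface `ClientStore`, its two methods, and `InMemoryClientStore` are assumptions for illustration and do not describe the actual OAuthStorage or OAuthStorageDB classes.

```python
from typing import Dict, Optional, Protocol


class ClientStore(Protocol):
    """The operations an endpoint needs, independent of where the data lives."""

    def get_client(self, client_id: str) -> Optional[dict]: ...

    def store_client(self, client: dict) -> None: ...


class InMemoryClientStore:
    """Stand-in backend; a file- or database-backed class would expose the same methods."""

    def __init__(self) -> None:
        self._clients: Dict[str, dict] = {}

    def get_client(self, client_id: str) -> Optional[dict]:
        return self._clients.get(client_id)

    def store_client(self, client: dict) -> None:
        self._clients[client["client_id"]] = client


def client_exists(store: ClientStore, client_id: str) -> bool:
    """Endpoint-side logic that works with any backend satisfying ClientStore."""
    return store.get_client(client_id) is not None
```

Because `Protocol` uses structural typing, a new backend only has to implement the same method signatures; it does not need to inherit from anything.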
Backup and Recovery
Backup OAuth Data
# Backup to timestamped folder
$timestamp = Get-Date -Format "yyyyMMdd_HHmmss"
Copy-Item .oauth_data .oauth_data_backup_$timestamp -Recurse
Restore OAuth Data
# Restore from backup
Remove-Item .oauth_data -Recurse -Force
Copy-Item .oauth_data_backup_20260103_120000 .oauth_data -Recurse
Export Registered Clients
# View all registered clients
Get-Content .oauth_data/clients.json | ConvertFrom-Json | ConvertTo-Json -Depth 10
Monitoring
Check Storage Health
import json
from corpusiq.oauth_storage import get_oauth_storage
storage = get_oauth_storage()
# List all clients
clients = storage.list_clients()
print(f"Registered clients: {len(clients)}")
# Check token count (the file may still hold expired tokens until the next cleanup run)
tokens_file = storage.tokens_file
with open(tokens_file) as f:
    tokens = json.load(f)
print(f"Active tokens: {len(tokens)}")
Log Messages to Watch
INFO - Stored client: client_xxx (ChatGPT)
INFO - Generated and stored authorization code for client client_xxx
INFO - Issued access token for client client_xxx
INFO - Cleaned up N expired authorization codes
INFO - Cleaned up N expired access tokens
WARNING - Token request for unknown client: client_xxx
WARNING - Invalid client_secret for client client_xxx
Troubleshooting
Problem: Clients not persisting after restart
Cause: Storage directory permissions or file write errors
Solution: Check logs for write errors, verify .oauth_data/ is writable
Problem: Authorization codes always invalid
Cause: Code expired or clock skew
Solution: Check system time, increase code expiration in app.py
Problem: Storage files growing too large
Cause: Cleanup task not running or tokens not expiring
Solution: Check cleanup logs, manually run storage.cleanup_expired_tokens()
Problem: Concurrent write errors
Cause: Multiple instances writing to same files
Solution: Use Option 2 (database) for multi-instance deployments
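For the first problem above (clients not persisting), a quick way to confirm that the storage directory is writable is to try creating a temporary file in it. This is a small diagnostic sketch; `is_writable_dir` is a hypothetical helper and not part of the project.

```python
import tempfile
from pathlib import Path


def is_writable_dir(path):
    """Return True if `path` is an existing directory where a file can be created."""
    directory = Path(path)
    if not directory.is_dir():
        return False
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=str(directory)):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print(".oauth_data writable:", is_writable_dir(".oauth_data"))
```

Run it as the same user account that runs the server, since permissions can differ between accounts.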
Next Steps
- ✅ Test with ChatGPT: Try adding your app in ChatGPT
- ✅ Monitor logs: Watch for client registrations and token issuance
- ✅ Verify persistence: Restart server and confirm clients remain
- ⏭️ Add token validation middleware: Protect API endpoints
- ⏭️ Implement refresh token flow: Allow long-lived sessions
- ⏭️ Add client management UI: View/revoke clients and tokens
Summary
You now have an OAuth authorization server that is ready for small to medium single-instance deployments, with:
- ✅ Persistent client registration
- ✅ Validated authorization codes
- ✅ Stored and validated access tokens
- ✅ Automatic expiration cleanup
- ✅ Thread-safe operations
- ✅ Comprehensive error handling
Your authorization server: https://mcp.sqltrainer.com
Storage location: .oauth_data/
Cleanup interval: Every 1 hour
Ready for ChatGPT integration! 🚀