Frequently Asked Questions (FAQ)
Quick answers to common questions about CorpusIQ.
Table of Contents
- General Questions
- Installation & Setup
- Usage Questions
- Technical Questions
- Security & Privacy
- Deployment & Production
- Troubleshooting
- Contributing & Support
- Advanced Questions
General Questions
What is CorpusIQ?
CorpusIQ is a production-ready MCP (Model Context Protocol) server that enables ChatGPT to search across multiple connected data sources like Gmail, Google Drive, OneDrive, QuickBooks, and more from a unified interface.
What is MCP?
MCP (Model Context Protocol) is an open standard protocol that allows AI applications like ChatGPT to communicate with external services and data sources. Think of it as a universal connector between AI and your tools.
Is CorpusIQ free?
Yes. CorpusIQ is free and open source: you can use it, modify it, and deploy it commercially without fees, subject to the terms in the LICENSE file.
What license is CorpusIQ released under?
See the LICENSE file in the repository for specific terms.
Who maintains CorpusIQ?
CorpusIQ is maintained by the CorpusIQ team and community contributors.
Can I use CorpusIQ for commercial purposes?
Yes, CorpusIQ can be used in commercial environments. Check the LICENSE file for full terms.
Installation & Setup
What are the system requirements?
Minimum:
- Python 3.10 or higher
- 512 MB RAM
- 100 MB disk space
- Internet connection
Recommended:
- Python 3.11+
- 1 GB RAM
- 500 MB disk space
Do I need to know Python to use CorpusIQ?
Not necessarily! If you’re just using it, you only need to run a few commands. However, basic Python knowledge helps if you want to customize it.
Can I run CorpusIQ on Windows?
Yes! CorpusIQ works on Windows, macOS, and Linux.
Does CorpusIQ work on ARM/M1/M2 Macs?
Yes, it works on Apple Silicon Macs.
Why does ChatGPT require HTTPS?
ChatGPT requires HTTPS for security. Use ngrok or Cloudflare Tunnel to expose your local server over HTTPS.
Can I use CorpusIQ without ChatGPT?
Yes. CorpusIQ is designed with ChatGPT in mind, but because it follows the MCP protocol it should work with any MCP-compatible client.
How do I update CorpusIQ?
    cd corpusiq-openai-sdk
    git pull
    pip install -e . --upgrade
Usage Questions
How do I search my data?
Just ask ChatGPT naturally:
- “Search my corpus for quarterly reports”
- “Find emails about the Smith project”
- “Show me documents related to budget planning”
Can I search specific data sources?
Yes! Be specific in your query:
- “Search my Gmail for messages from john@example.com”
- “Find documents in my OneDrive about marketing”
How many results can I get?
By default, you get 5 results. You can request up to 20:
- “Search for ‘project updates’ and show me 10 results”
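As a minimal sketch of how such a limit could be enforced, assuming only the default of 5 and the maximum of 20 stated above. The helper name `clamp_result_count` and both constants are hypothetical and not taken from the CorpusIQ codebase.

```python
from typing import Optional

DEFAULT_RESULTS = 5   # default stated in this FAQ
MAX_RESULTS = 20      # upper bound stated in this FAQ


def clamp_result_count(requested: Optional[int]) -> int:
    """Return a result count between 1 and MAX_RESULTS (hypothetical helper)."""
    if requested is None:
        return DEFAULT_RESULTS
    return max(1, min(requested, MAX_RESULTS))


print(clamp_result_count(None))  # 5
print(clamp_result_count(10))    # 10
print(clamp_result_count(50))    # 20
```

Clamping rather than rejecting an oversized request keeps a conversational query such as "show me 50 results" from failing outright.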
Can I save searches?
Saved searches are a planned feature for a future release.
How do I add a new data connector?
- Ask ChatGPT: “Open connectors”
- Click “Add Connector” in the widget
- Choose your data source
- Follow the OAuth authorization flow
Which data sources are supported?
Currently supported (or planned):
- Gmail / Google Workspace
- Google Drive
- OneDrive / SharePoint
- Dropbox
- QuickBooks
- Salesforce
- Slack
- GitHub
- And more being added!
Can I remove a connector?
Yes:
- Open the connectors interface
- Find the connector
- Click “Disconnect” or “Remove”
Do I need API keys for each data source?
No! CorpusIQ uses OAuth, so users authorize access through each service’s login page. No API keys needed.
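For readers curious what "authorizing through the service's login page" involves, the sketch below builds a generic OAuth authorization URL using the authorization code flow with PKCE (S256). It illustrates the standard flow only and is not CorpusIQ's confirmed implementation. The endpoint, client ID, redirect URI, and scope are placeholder assumptions.

```python
import base64
import hashlib
import secrets
from typing import Tuple
from urllib.parse import urlencode


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge) for the S256 method."""
    verifier = secrets.token_urlsafe(64)  # 86 URL-safe characters
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorization_url(auth_endpoint: str, client_id: str,
                            redirect_uri: str, scope: str,
                            state: str, code_challenge: str) -> str:
    """Build the URL the user's browser is sent to for consent."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return f"{auth_endpoint}?{urlencode(params)}"


# All values below are placeholders for illustration.
verifier, challenge = make_pkce_pair()
url = build_authorization_url(
    auth_endpoint="https://auth.example.com/authorize",
    client_id="example-client-id",
    redirect_uri="https://corpusiq.example.com/oauth/callback",
    scope="read",
    state=secrets.token_urlsafe(16),
    code_challenge=challenge,
)
print(url)
```

The `verifier` stays on the server and is sent later when the authorization code is exchanged for a token, so the user never handles a long-lived secret.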
Technical Questions
What technology stack does CorpusIQ use?
- Language: Python 3.10+
- Web Framework: FastAPI
- Server: Uvicorn (ASGI)
- Validation: Pydantic
- Protocol: MCP SDK
Can I customize CorpusIQ?
Absolutely! The code is open source and well-documented. You can:
- Add custom tools
- Integrate new data sources
- Modify the UI widget
- Change business logic
How do I add a custom tool?
See the User Guide and API Reference for detailed examples. Basic pattern:
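    # `mcp` is the MCP server instance created elsewhere in the CorpusIQ code.
    @mcp.tool()
    def my_custom_tool(param: str) -> dict:
        """My custom tool description."""
        # Your logic here; the echo below is only a placeholder.
        result = {"param": param}
        return result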
Does CorpusIQ support multiple users?
Yes, but you need to implement user authentication and data isolation based on your needs. The OAuth foundation is included.
Can I run multiple instances?
Yes! For high availability, run multiple instances behind a load balancer. Use Redis for shared rate limiting.
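To show why a shared store matters, here is a minimal fixed-window rate limiter sketch. The in-memory dict stands in for the shared store; with Redis, the same idea maps to `INCR` on a per-window key plus `EXPIRE`. The class name and parameters are hypothetical, and CorpusIQ's actual limiter may use a different algorithm.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` for each key."""

    def __init__(self, limit: int, window_seconds: int = 60,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # (key, window index) -> request count; stand-in for a shared store.
        self._counts: Dict[Tuple[str, int], int] = {}

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        # Drop counters from earlier windows so the dict does not grow forever.
        self._counts = {k: v for k, v in self._counts.items() if k[1] >= window}
        bucket = (key, window)
        self._counts[bucket] = self._counts.get(bucket, 0) + 1
        return self._counts[bucket] <= self.limit


# A fixed clock keeps the example deterministic.
limiter = FixedWindowRateLimiter(limit=3, window_seconds=60, clock=lambda: 120.0)
results = [limiter.allow("203.0.113.7") for _ in range(4)]
print(results)  # [True, True, True, False]
```

If each instance kept its own dict, a client could send `limit` requests to every instance behind the load balancer. Moving the counter into one shared store closes that gap.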
What databases does CorpusIQ support?
CorpusIQ itself doesn’t require a database for basic operation. You can add database connections for custom features.
Can I use CorpusIQ with Claude or other AIs?
Yes. Any AI client that supports the MCP protocol can use CorpusIQ, so Claude and other MCP-compatible assistants should work without changes to the server.
What’s the performance like?
Typical metrics:
- Search latency: <500ms (p95)
- Widget load: <200ms
- Concurrent users: 1000+ per instance
- Throughput: 100+ searches/second
Performance varies based on your deployment and connected data sources.
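To check the p95 figure against your own deployment, you can compute it from measured latencies. The sketch below uses Python's standard `statistics.quantiles`. The sample data is made up, and the `p95` helper and `LATENCY_BUDGET_MS` constant are hypothetical names.

```python
import statistics
from typing import List

LATENCY_BUDGET_MS = 500.0  # p95 target quoted in this FAQ


def p95(latencies_ms: List[float]) -> float:
    """Return the 95th percentile of at least two latency samples."""
    # quantiles(n=100) yields 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]


samples = [float(ms) for ms in range(1, 101)]  # made-up data: 1 ms to 100 ms
value = p95(samples)
print(f"p95 = {value:.2f} ms, within budget: {value < LATENCY_BUDGET_MS}")
```

A p95 below 500 ms means 95% of searches finished faster than that, while the slowest 5% may take considerably longer.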
Security & Privacy
Is my data safe?
Yes. CorpusIQ:
- Doesn’t store your search queries or results
- Uses HTTPS for all communication
- Implements OAuth for secure authentication
- Follows security best practices
Where is my data stored?
CorpusIQ doesn’t store your data. It queries your connected sources in real time and returns the results. Nothing is persisted except logs, which you control.
Can I run CorpusIQ on-premises?
Yes! CorpusIQ can be deployed on your own infrastructure, keeping all data within your network.
Does CorpusIQ log my searches?
Basic request logging (IP, timestamp, tool called) is standard. You can configure logging levels in the .env file. Search queries and results are not logged by default.
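As an illustration of this logging policy, the sketch below reads a level name from the environment and logs request metadata only. The variable name `CORPUSIQ_LOG_LEVEL` and both function names are assumptions for this example; check your `.env` template for the actual setting.

```python
import logging
import os
from typing import Mapping, Optional

_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(env: Optional[Mapping[str, str]] = None) -> int:
    """Map a level name from the environment to a logging constant."""
    env = os.environ if env is None else env
    # CORPUSIQ_LOG_LEVEL is a hypothetical variable name.
    name = env.get("CORPUSIQ_LOG_LEVEL", "INFO").strip().upper()
    return _LEVELS.get(name, logging.INFO)  # unknown names fall back to INFO


def log_request(logger: logging.Logger, client_ip: str, tool_name: str) -> None:
    """Log request metadata only; the query text is never passed in."""
    logger.info("tool=%s ip=%s", tool_name, client_ip)


logging.basicConfig(level=resolve_log_level({"CORPUSIQ_LOG_LEVEL": "debug"}))
log_request(logging.getLogger("corpusiq.example"), "203.0.113.7", "search")
```

Keeping the query out of the logging function's signature makes it harder to log search text by accident.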
How is authentication handled?
- Development: Optional (for testing)
- Production: OAuth 2.1 required
- Data Sources: OAuth for each connector
What permissions does CorpusIQ need?
Only what you authorize! Each data source connector requests specific permissions during OAuth setup. You can review and revoke these anytime.
Is CorpusIQ GDPR compliant?
CorpusIQ’s architecture supports GDPR compliance, but compliance also depends on how you deploy and configure it. It provides:
- Data minimization (queries in real-time, no storage)
- User consent (OAuth flows)
- Right to erasure (disconnect connectors)
- Data portability (standard formats)
Has CorpusIQ been security audited?
See SECURITY_REVIEW_SUMMARY.md for the latest security review information.
Deployment & Production
Can I deploy CorpusIQ to production?
Yes! CorpusIQ is production-ready. See DEPLOYMENT.md for detailed instructions.
What hosting options are available?
- Cloud Platforms: Heroku, AWS, GCP, Azure, DigitalOcean
- VPS: Your own server with Nginx/Apache
- Docker: Container deployment
- Kubernetes: Orchestrated deployment
Do I need a domain name?
For production, yes. ChatGPT requires a public HTTPS URL. For testing, ngrok or Cloudflare Tunnel work fine.
How do I set up HTTPS?
Use:
- Let’s Encrypt: Free SSL certificates
- Cloudflare: Free SSL and CDN
- Cloud Provider: Managed SSL (AWS ACM, etc.)
What about scaling?
CorpusIQ scales horizontally:
- Run multiple instances behind a load balancer
- Use Redis for shared rate limiting
- Deploy across regions for global users
How much does hosting cost?
Depends on your setup:
- Free tier: Heroku, AWS Free Tier, etc. (suitable for testing)
- Small production: $5-20/month (basic VPS)
- Medium: $50-200/month (managed platform)
- Enterprise: $500+ (high availability, multiple regions)
What monitoring should I set up?
Recommended:
- Uptime: UptimeRobot, Pingdom
- Errors: Sentry, Rollbar
- Performance: New Relic, Datadog
- Logs: ELK, Splunk, CloudWatch
Do I need to implement OAuth for production?
Yes! OAuth authentication is required for production ChatGPT integration. See DEPLOYMENT.md for setup instructions.
Troubleshooting
CorpusIQ won’t start
Check:
- Python version: python --version (needs 3.10+)
- Virtual environment activated
- Dependencies installed: pip install -e .
- Configuration file: .env exists
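The checks above can be sketched as a small preflight script. This is an illustrative helper, not something shipped with CorpusIQ. The function name `preflight` is hypothetical, and the module list is an assumption based on the technology stack named in this FAQ.

```python
import importlib.util
import sys
from pathlib import Path
from typing import List


def preflight(project_dir: str = ".") -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems: List[str] = []
    if sys.version_info < (3, 10):
        problems.append(f"Python 3.10+ required, found {sys.version.split()[0]}")
    # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
    if sys.prefix == sys.base_prefix:
        problems.append("No virtual environment appears to be active")
    # Assumed module names, taken from the stack listed in this FAQ.
    for module in ("fastapi", "uvicorn", "pydantic"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"Dependency not installed: {module}")
    if not (Path(project_dir) / ".env").is_file():
        problems.append(".env file not found")
    return problems


for problem in preflight("."):
    print("FAIL:", problem)
```

Run it from the project root. No output means none of the four checks found a problem.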
ChatGPT can’t connect
Check:
- Server is running
- Tunnel is active (ngrok/Cloudflare)
- HTTPS URL is correct
- CORS allows https://chat.openai.com
Search returns no results
Check:
- Connectors are connected (not just added)
- Data exists in connected sources
- OAuth tokens are valid
- Search query is clear and specific
Widget doesn’t show
Check:
- assets/corpusiq.html exists
- Browser console for errors
- CORS settings
- Widget file path in code
Rate limit errors (429)
Solutions:
- Wait 60 seconds
- Increase CORPUSIQ_RATE_LIMIT_REQUESTS_PER_MINUTE in .env
- Check if multiple users are hitting the same IP
OAuth errors
Check:
- OAuth provider configuration
- Well-known endpoints accessible
- Token validation implemented
- Callback URLs correct
For more detailed troubleshooting, see TROUBLESHOOTING.md.
Contributing & Support
How can I contribute?
We welcome contributions!
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
See contributing guidelines in the repository.
I found a bug. What should I do?
- Check if it’s already reported in GitHub Issues
- If not, create a new issue with:
  - Description of the bug
  - Steps to reproduce
  - Expected vs actual behavior
  - Environment details (OS, Python version, etc.)
I have a feature request
Great! Create a GitHub issue with the “feature request” label and describe:
- What you want to accomplish
- Why it would be useful
- Any implementation ideas
How do I get help?
- Documentation: Check this FAQ and other docs
- GitHub Issues: Search existing issues
- Discussions: Ask questions in GitHub Discussions
- Community: Connect with other users
Is professional support available?
Yes, for enterprises needing SLAs, custom development, or training, professional support options are available. Contact the CorpusIQ team.
Can I hire someone to help with deployment?
Yes, there are professional services available for:
- Custom deployment
- Integration development
- Training
- Ongoing support
How often is CorpusIQ updated?
CorpusIQ receives regular updates:
- Bug fixes: As needed (typically days)
- Security patches: Immediately
- Features: Monthly releases
- Major versions: Quarterly
Where can I find the changelog?
See CHANGES.md for detailed version history.
How do I stay informed about updates?
- Watch the GitHub repository
- Star the project to show support
- Follow project announcements
Advanced Questions
Can CorpusIQ work offline?
No, CorpusIQ requires internet connectivity to access cloud-based data sources and communicate with ChatGPT.
Does CorpusIQ support multi-tenancy?
Multi-tenancy isn’t built-in, but the architecture supports it. You’ll need to implement:
- Tenant isolation
- Per-tenant authentication
- Data segregation
- Usage tracking
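One possible starting point for the items above is to key every stored credential and counter by tenant. The sketch below is an in-memory illustration under that assumption. The `TenantStore` class and its methods are hypothetical, and a real deployment would back this with a database and real authentication.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TenantStore:
    """Keep connector tokens and usage counts separated per tenant."""

    _tokens: Dict[str, Dict[str, str]] = field(default_factory=dict)
    _usage: Dict[str, int] = field(default_factory=dict)

    def save_token(self, tenant_id: str, connector: str, token: str) -> None:
        self._tokens.setdefault(tenant_id, {})[connector] = token

    def get_token(self, tenant_id: str, connector: str) -> str:
        """Return the tenant's own token; other tenants' tokens are unreachable."""
        try:
            return self._tokens[tenant_id][connector]
        except KeyError:
            raise PermissionError(
                f"tenant {tenant_id!r} has no {connector!r} connector"
            ) from None

    def record_search(self, tenant_id: str) -> int:
        """Increment and return the tenant's search count (usage tracking)."""
        self._usage[tenant_id] = self._usage.get(tenant_id, 0) + 1
        return self._usage[tenant_id]


store = TenantStore()
store.save_token("acme", "gmail", "token-for-acme")
print(store.get_token("acme", "gmail"))  # token-for-acme
print(store.record_search("acme"))       # 1
```

Because every lookup requires a tenant ID, a request authenticated as one tenant has no code path to another tenant's tokens.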
Can I white-label CorpusIQ?
Yes! As an open-source project, you can customize branding, naming, and appearance. Check the LICENSE for any requirements.
Does CorpusIQ support webhooks?
Not currently, but this is a planned feature for future releases.
Can I use CorpusIQ as a library?
CorpusIQ is primarily designed as a server application, but you could import and use its components in your own Python projects.
What about data source rate limits?
Each data source (Gmail, Drive, etc.) has its own API rate limits. CorpusIQ implements respectful usage patterns, but high-volume deployments should monitor and handle rate limits from external APIs.
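A common way to handle upstream limits is to retry with exponential backoff and jitter. The sketch below shows the general technique. The `RateLimited` exception and `call_with_backoff` helper are hypothetical names and do not describe how CorpusIQ's connectors behave today.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised when an upstream API answers HTTP 429."""


def call_with_backoff(func: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller handle it
            # Full jitter: wait a random time up to base_delay * 2**attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError("max_attempts must be at least 1")


calls = {"n": 0}


def flaky_fetch() -> str:
    """Fail twice with RateLimited, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited()
    return "ok"


# The no-op sleep keeps the demo instant.
print(call_with_backoff(flaky_fetch, sleep=lambda _delay: None))  # ok
```

The random jitter spreads retries out so that many instances hitting the same limit do not all retry at the same moment.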
Can I contribute new data source connectors?
Absolutely! New connector contributions are highly valued. See the contributing guidelines and connector development docs.
Still Have Questions?
Check these resources:
- User Guide - Comprehensive usage documentation
- API Reference - Technical details
- Troubleshooting Guide - Fix common issues
- Deployment Guide - Production deployment
Get support:
- GitHub Issues: https://github.com/CorpusIQ/corpusiq-openai-sdk/issues
- Discussions: https://github.com/CorpusIQ/corpusiq-openai-sdk/discussions
Can’t find your answer? Create a GitHub Discussion post and the community will help!
Last Updated: January 2026
This FAQ is community-maintained. If you found something unclear or have suggestions for new questions, please open an issue!