encycloped.ai is an experimental, community-driven project that combines the power of ChatGPT 4.1 (or any other LLM with an API) with the collaborative spirit of Wikipedia. This platform dynamically generates encyclopedia-style articles with citations using the ChatGPT 4.1 (or any other LLM with an API) API, while allowing users to report inaccuracies and contribute missing information. AI moderation helps ensure that the content remains accurate and reliable, even as the community drives its evolution.
While Wikipedia has been a valuable resource, human moderated encyclopedias like Wikipedia suffer from several critical limitations that compromise their reliability and accuracy. These platforms are vulnerable to bias, vandalism, political manipulation, and inconsistent editorial standards that can lead to misinformation being presented as fact. The decentralized nature of human moderation means that articles can be edited by anyone with varying levels of expertise, leading to inconsistent quality and potential conflicts of interest. Additionally, the volunteer-based editorial process can be slow to correct errors and may be influenced by groupthink or dominant perspectives that don't represent the full spectrum of knowledge. This is why encycloped.ai uses AI moderation to ensure consistent, unbiased, and factually accurate content generation.
Note: All article and topic data is stored in a persistent PostgreSQL database, ensuring data durability and multi-user support. No in-memory storage is used for articles or topics.
-
Dynamic Content Generation:
Automatically generate encyclopedia articles on-the-fly using ChatGPT 4.1 (or any other LLM with an API) with full citations. -
Topic and Subtopic Navigation:
Access articles via URL paths (e.g.,/Python) and subtopics using anchors (#subtopic) or paths (e.g.,/Python/libraries). -
User Feedback:
Users can report issues or suggest additional information through intuitive modals. Feedback is sent to the ChatGPT 4.1 (or any other LLM with an API) API for validation and content updates. -
AI Moderation:
The system leverages AI (ChatGPT 4.1 or compatible) to review and integrate user contributions, ensuring that the content remains both accurate and reliable. -
Community-Driven:
Open-source and decentralized, contributions are welcome from anyone. However, final control and integration of contributions remain with the project maintainers, ensuring consistency and quality. -
Persistent Database Storage:
All articles and topics are stored in a PostgreSQL database for durability and reliability. This enables multi-user access, prevents data loss on server restarts, and supports future scalability. -
Flexible Topic Names: Topic names now support most common symbols, including spaces, parentheses, periods, commas, colons, semicolons, exclamation marks, question marks, slashes, brackets, braces, quotes, ampersands, asterisks, percent, dollar, at, caret, equals, tilde, pipe, angle brackets, and more. This allows for accurate representation of real-world article titles and disambiguation (e.g., "Python (programming language)", "C++", "Mercury (planet)", etc.).
-
Local LLM Support: Run the application with local LLMs using Ollama. Switch between OpenAI API and local models by using different startup commands. Supports any model available in Ollama, with DeepSeek-Coder as the recommended local option. Local LLM mode uses optimized prompts for better performance and includes improved error handling and timeout management.
-
Interactive Topic Suggestion: Enhanced user engagement with intelligent text selection features. When you highlight text in an article, a lens icon (π) appears near your selection. Clicking the icon opens a modal that:
- Extracts actual terms from your selected text as topic suggestions
- Provides 3 relevant topic options based on the highlighted content
- Allows custom topic input with real-time validation
- Automatically converts selected text into clickable links to new articles
- Creates an interconnected encyclopedia where new topics link back to original content
- Features a blurred background modal with loading animations and accessibility support
encycloped.ai implements comprehensive security measures to protect against web vulnerabilities and AI-specific attacks. For detailed information, see Security Documentation.
-
Prompt Injection Protection:
- Input delimiter wrapping with
"""quotes - Clear framing to prevent instruction execution
- Heuristic-based detection with pattern matching
- LLM system prompt hardening
- Input delimiter wrapping with
-
Cross-Site Scripting (XSS) Protection:
- HTML sanitization using
bleach - Strict tag and attribute whitelisting
- Safe template rendering
- HTML sanitization using
-
Denial of Service (DoS) Protection:
- IP-based rate limiting (5 requests per minute for sensitive endpoints)
- Request throttling per session
- Input size restrictions
-
Content Poisoning Prevention:
- Submission review queue with automated flagging
- Duplicate content detection
- Frequency-based abuse detection
- Admin review workflow
-
Content Security:
- Markdown sanitization
- Path traversal prevention
- Input validation and sanitization
- Contributor metadata logging
-
API Security:
- JSON payload validation
- Required field checking
- Error handling and logging
- Secure session management
For security best practices, incident response procedures, and threat modeling, see docs/SECURITY.md.
- Flask App: Handles all web requests, user feedback, and article generation.
- LLM Integration: Supports both OpenAI API and local LLMs via Ollama for generating and updating encyclopedia articles and topic suggestions. Local LLM mode includes optimized prompts, increased timeouts, and improved error handling for better reliability.
- PostgreSQL: Stores all articles, markdown, topic suggestions, and logsβpersistent, durable storage.
- Redis: Used only for rate limiting (not for article data)βensures fair usage and protects against abuse, even in a distributed/multi-instance setup.
- PostgreSQL is a robust, persistent database for all encyclopedia content and logs.
- Redis is a fast, in-memory data store used for rate limiting.
- Rate limiting requires atomic, high-speed operations and must be shared across all app instances.
- Redis is the industry standard for this use case; PostgreSQL is not suitable for distributed rate limiting.
-
Clone the Repository:
git clone https://github.com/VictoKu1/encycloped.ai.git cd encycloped.ai -
Set Up a Virtual Environment (Recommended):
python3 -m venv venv source venv/bin/activate # On Windows: .\venv\Scripts\activate
-
Install Python Dependencies:
pip install -r requirements.txt
-
Configure Environment Variables:
Create a
.envfile (or set the environment variables directly) to configure your API key, for example:Linux:
export OPENAI_API_KEY=your_openai_api_keyWindows:
- Using CMD:
set OPENAI_API_KEY=your_openai_api_key
- Using PowerShell:
$env:OPENAI_API_KEY="your_openai_api_key"
Follow these steps to set up the PostgreSQL database and Redis for rate limiting, and run the project. You do NOT need to run all commands in the same terminal, but you may find it convenient to use multiple terminals for different steps.
Open any terminal and run:
docker-compose up -d # On Linux it may require "sudo docker-compose up -d"This starts the PostgreSQL database and Redis in the background. You can close this terminal or use it for other commands.
Set the following environment variables in the terminal where you will run the app and the database initialization script. You can also use a .env file if your app loads it automatically.
Linux/macOS:
export DB_HOST=localhost
export DB_PORT=5432
export DB_NAME=encyclopedai
export DB_USER=encyclo_user
export DB_PASSWORD=encyclo_pass
export REDIS_HOST=localhostWindows PowerShell:
$env:DB_HOST="localhost"
$env:DB_PORT="5432"
$env:DB_NAME="encyclopedai"
$env:DB_USER="encyclo_user"
$env:DB_PASSWORD="encyclo_pass"
$env:REDIS_HOST="localhost"- In production/Docker, set
REDIS_HOST=redis.
In any terminal (after Docker is running and dependencies are installed):
python utils/db.py --initThis creates the necessary tables for topics and logs. You only need to run this once (or after making schema changes).
In a new terminal (so you can keep it open while using the app):
python app.pyLeave this terminal open while you use the app in your browser.
If you want to use a local LLM instead of OpenAI API:
-
Install Ollama: Visit ollama.ai and follow the installation instructions for your platform.
-
Start Ollama:
ollama serve
Note: If you get an error about the port being in use, it means Ollama is already running. This is normal and you can proceed to the next step.
-
Pull a model (e.g., DeepSeek-R1):
ollama pull deepseek-coder:6.7b
This model is recommended for good performance and quality. You can also try other models like
llama3.2:3bfor faster responses. -
Configure local LLM settings:
Option A: Interactive setup (Recommended):
python setup_local_llm.py
Option B: Manual configuration: Edit the
local_llm.jsonfile to specify your preferred model:{ "model": "deepseek-coder:6.7b", "base_url": "http://localhost:11434" } -
Test the local LLM integration (Recommended):
python test_local_llm.py
This will run a series of tests to verify that the local LLM integration is working correctly.
-
Run the app in local LLM mode:
python app.py localThe app will validate your Ollama setup and model availability before starting. Local LLM mode uses simplified prompts optimized for local models, providing faster responses while maintaining quality.
If you need to restart the PostgreSQL database cleanly (for example, after making changes to the Docker configuration or to reset the container), you can use the following commands:
Stop the database service:
docker-compose downRemove all stopped containers and volumes (WARNING: this will delete all data!):
docker-compose down -vStart the database service again:
docker-compose up -d-
Home Page:
Enter a topic in the search bar. Topic names can include most common symbols and punctuation, allowing for precise and disambiguated article titles (e.g., "Python (programming language)", "C++," "Mercury: The Planet"). If the topic already exists, you'll be directed to its page; otherwise, a new page is generated with a loading animation while content is created. -
Topic Pages:
View the generated article along with citations. Use the "Report an Issue" button to flag inaccuracies or the "Add Missing Information" button to contribute extra details or subtopics. -
Interactive Topic Suggestion:
New Feature! Select any text in an article (at least 10 characters) and a lens icon (π) will appear near your selection. Click the icon to:- See topic suggestions extracted from your selected text
- Enter a custom topic with real-time validation
- Generate new articles that automatically link back to the original content
- Create an interconnected web of related topics
-
User Feedback:
Feedback forms open in modals. Your input is sent via AJAX to the backend, where it is validated by the LLM (OpenAI API or local LLM) before updating the article content. -
LLM Mode Switching:
Easily switch between OpenAI API and local LLM modes by using different startup commands. The application validates your setup before starting to ensure everything works correctly. Local LLM mode provides offline capability and privacy while maintaining article quality.
The Interactive Topic Suggestion feature enhances article exploration by allowing users to discover and create related topics directly from the content they're reading.
- Text Selection: Select any text in an article (minimum 10 characters)
- Lens Icon: A lens icon (π) appears near your selection
- Topic Extraction: Click the icon to extract actual terms from your selected text
- Smart Suggestions: The system provides 3 relevant topic suggestions based on the highlighted content
- Custom Topics: Enter your own topic with real-time validation
- Automatic Linking: Selected text becomes a clickable link to the new article
- Interconnected Content: Creates a web of related topics that link back to original content
- Intelligent Extraction: Extracts actual terms from selected text, not generic suggestions
- Real-time Validation: Checks if custom topics are part of the selected text
- Visual Feedback: Button color changes based on validation status
- Accessibility: Keyboard navigation and screen reader support
- Mobile-Friendly: Responsive design works on all devices
- Loading Animations: Smooth loading states with progress indicators
- Blur Effects: Modal with blurred background for focus
- Select text: "Python is an interpreted programming language"
- Click lens icon: Opens modal with suggestions
- See suggestions: "Python", "interpreted", "programming language"
- Click suggestion: Creates new article and converts selected text to link
- Result: "Python is an interpreted programming language"
- Discoverability: Helps users find related topics they might not know about
- Content Creation: Encourages creation of new articles from existing content
- Interconnection: Creates a network of related articles
- User Engagement: Makes article exploration more interactive and fun
- Knowledge Discovery: Reveals connections between different topics
-
"Ollama is not running or not accessible"
- Make sure Ollama is installed and running:
ollama serve - Check if Ollama is accessible at
http://localhost:11434 - If you get a port binding error, Ollama is already running (this is normal)
- Make sure Ollama is installed and running:
-
"Model is not available in Ollama"
- Pull the model first:
ollama pull deepseek-coder:6.7b - Check available models:
ollama list - Update the model name in
local_llm.jsonif needed
- Pull the model first:
-
"Local LLM setup validation failed"
- Run the test script:
python test_local_llm.py - Use the setup script to reconfigure:
python setup_local_llm.py - Check the logs for specific error messages
- Ensure your model has enough memory and resources
- Run the test script:
-
"Error communicating with Ollama: Read timed out"
- Local LLMs are slower than OpenAI API
- The timeout has been increased to 120 seconds
- Consider using a smaller model for faster responses
- Ensure your system has sufficient RAM (at least 8GB recommended)
-
"OpenAI API error: local variable 'client' referenced before assignment"
- This error occurs when the OpenAI API key is not set but the app tries to use OpenAI mode
- Set your OpenAI API key:
$env:OPENAI_API_KEY="your_key" - Or use local LLM mode:
python app.py local
- Local LLMs may be slower than OpenAI API (30-60 seconds for article generation)
- Consider using smaller models for faster responses:
llama3.2:3b- Faster, smaller modeldeepseek-coder:6.7b- Good balance of speed and qualitydeepseek-coder:33b- Higher quality, slower
- Ensure your system has sufficient RAM (8GB minimum, 16GB recommended)
- Local LLM mode uses optimized prompts for better performance
- The application implements rate limiting to prevent abuse
- All user input is sanitized to prevent XSS attacks
- Content is validated and sanitized before storage
- Contributor actions are logged for accountability
- API endpoints are protected against common vulnerabilities
Contributions are welcome and encouraged! Please see the CONTRIBUTING.md file for guidelines on how to contribute to the project.
This project is licensed under the GNU General Public License v3 (GPL v3).
- Flask β The web framework powering this project.
- OpenAI β For providing the ChatGPT 4.1 API.
- PostgreSQL β The database powering this project.
- Redis β The rate limiting database powering this project.
- Docker β The containerization platform powering this project.
- The open-source community β For their inspiration and continuous contributions.
- Wikipedia and other collaborative knowledge-sharing platforms β For inspiring a decentralized approach to knowledge.
