19 changes: 19 additions & 0 deletions .dockerignore
@@ -0,0 +1,19 @@
.git
.gitignore
.venv
__pycache__/
*.pyc
*.pyo
*.pyd
.pytest_cache/
.mypy_cache/
.ruff_cache/
node_modules/
DIRECTEYE/
docs/
test.db
test.db-shm
test.db-wal
spectra.sqlite3
*.sqlite3
*.log
19 changes: 12 additions & 7 deletions Dockerfile
@@ -30,7 +30,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*

# Create non-root user for security
RUN groupadd -r spectra && useradd -r -g spectra spectra
RUN groupadd -r spectra && useradd -r -g spectra -d /home/spectra -m spectra

# Set working directory
WORKDIR /app
@@ -42,7 +42,7 @@ COPY --from=builder /root/.local /home/spectra/.local
COPY --chown=spectra:spectra . .

# Create necessary directories
RUN mkdir -p /app/data /app/logs /app/config && \
RUN mkdir -p /app/data /app/logs /app/config /app/media /app/checkpoints && \
chown -R spectra:spectra /app

# Set environment variables
@@ -52,17 +52,22 @@ ENV PATH="/home/spectra/.local/bin:$PATH" \
SPECTRA_TESTING=false \
SPECTRA_PORT=5000 \
SPECTRA_HOST=0.0.0.0 \
SPECTRA_JWT_SECRET=change-me-in-production
SPECTRA_BOOTSTRAP_SECRET= \
SPECTRA_SESSION_SECRET=change-me-in-production \
SPECTRA_JWT_SECRET=change-me-in-production \
SPECTRA_WEBAUTHN_ORIGIN= \
SPECTRA_WEBAUTHN_RP_ID=

# Set user
USER spectra

# Health check
# Health check targets the public login surface so it still works when
# the rest of the UI is auth-gated.
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD curl -f http://localhost:5000/health || exit 1
CMD curl -fsS http://localhost:5000/login || exit 1

# Expose port
EXPOSE 5000

# Default entrypoint: Web server
CMD ["python", "-m", "tgarchive.web", "--host", "0.0.0.0", "--port", "5000"]
# Default entrypoint: unified SPECTRA web UI
CMD ["sh", "-c", "python -m spectra_app.spectra_gui_launcher --host ${SPECTRA_HOST:-0.0.0.0} --port ${SPECTRA_PORT:-5000} --log-level ${SPECTRA_LOG_LEVEL:-INFO}"]
16 changes: 10 additions & 6 deletions README.md
@@ -10,7 +10,7 @@ SPECTRA is an advanced framework for Telegram data collection, network discovery

## Features

- 🔄 **Multi-account & API key rotation** with smart, persistent selection and failure detection
- 🔄 **Multi-account orchestration** with smart, persistent selection and failure detection
- 🕵️ **Proxy rotation** for OPSEC and anti-detection
- 🔎 **Network discovery** of connected groups and channels (with SQL audit trail)
- 📊 **Graph/network analysis** to identify high-value targets
@@ -20,7 +20,8 @@ SPECTRA is an advanced framework for Telegram data collection, network discovery
- ⚡ **Parallel processing** leveraging multiple accounts and proxies simultaneously
- 🖥️ **Modern TUI** (npyscreen) and CLI, both using the same modular backend
- ⚙️ **Streamlined Account Management** - Full CRUD operations directly in the TUI with keyboard shortcuts
- ☁️ **Forwarding Mode:** Traverse a series of channels, discover related channels, and download text/archive files with specific rules, using a single API key.
- ☁️ **Forwarding Mode:** Traverse a series of channels, discover related channels, and download text/archive files with specific rules.
- 🔐 **Dockerized web console** with first-run bootstrap admin enrollment and YubiKey/passkey WebAuthn sign-in
- 🛡️ **Red team/OPSEC features**: account/proxy rotation, SQL audit trail, sidecar metadata, persistent state

## ⚡ Quick Start
@@ -49,11 +50,14 @@ The repository also includes a local web launcher for orchestration, status, and
./spectra
```

Optional API key protection:
Docker-friendly browser authentication:

```bash
export SPECTRA_GUI_API_KEY="change-me"
./spectra --api-key "$SPECTRA_GUI_API_KEY"
export SPECTRA_BOOTSTRAP_SECRET="one-time-bootstrap-secret"
export SPECTRA_SESSION_SECRET="change-me-in-production"
export SPECTRA_WEBAUTHN_ORIGIN="http://localhost:5000"
export SPECTRA_WEBAUTHN_RP_ID="localhost"
./spectra
```
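
The placeholder values in the block above are meant to be replaced before real use. A minimal sketch of one way to produce random values for the two secret variables, using Python's standard `secrets` module. The variable names come from this README; the helper name `generate_secret_exports` and the 32-byte length are assumptions, since the required secret format is not stated here.

```python
import secrets

# Variable names are taken from the README; the 32-byte length is an assumption.
SECRET_VARS = ("SPECTRA_BOOTSTRAP_SECRET", "SPECTRA_SESSION_SECRET")


def generate_secret_exports(names=SECRET_VARS, nbytes=32):
    """Return shell `export` lines with a fresh random value for each name."""
    return [f'export {name}="{secrets.token_urlsafe(nbytes)}"' for name in names]


if __name__ == "__main__":
    # Print lines that can be pasted into a shell session.
    print("\n".join(generate_secret_exports()))
```

The output can be pasted into a shell in place of the two placeholder `export` lines.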

Standard machine-readable API surfaces:
@@ -144,7 +148,7 @@ npm run build # Build static HTML to docs/html/
#### Getting Started
- **[Installation Guide](docs/docs/getting-started/installation.md)** - Complete installation instructions
- **[Quick Start Guide](docs/docs/getting-started/quick-start.md)** - Get running in 30 seconds
- **[Configuration Guide](docs/docs/getting-started/configuration.md)** - Setting up API keys and accounts
- **[Configuration Guide](docs/docs/getting-started/configuration.md)** - Setting up accounts, sessions, and browser auth

#### User Guides
- **[TUI Usage Guide](docs/docs/guides/tui-usage.md)** - Complete guide to using the Terminal User Interface
10 changes: 10 additions & 0 deletions datasketch.py
@@ -0,0 +1,10 @@
"""Local fallback for the optional datasketch dependency."""


class MinHash:
    def __init__(self, num_perm=128):
        self.num_perm = num_perm
        self._tokens = set()

    def update(self, value):
        self._tokens.add(value)
26 changes: 15 additions & 11 deletions deployment/docker/Dockerfile
@@ -28,10 +28,8 @@ COPY requirements.txt /app/
RUN pip install --no-cache-dir --upgrade pip setuptools wheel && \
pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY tgarchive/ /app/tgarchive/
COPY setup.py /app/
COPY README.md /app/
# Copy application code needed by the unified web UI
COPY . /app/

# Install SPECTRA
RUN pip install --no-cache-dir -e .
@@ -45,22 +43,28 @@ USER spectra

# Environment variables (override at runtime)
ENV PYTHONUNBUFFERED=1 \
SPECTRA_HOST=0.0.0.0 \
SPECTRA_PORT=5000 \
SPECTRA_DB_PATH=/app/data/spectra.db \
SPECTRA_MEDIA_DIR=/app/media \
SPECTRA_LOG_DIR=/app/logs \
SPECTRA_CHECKPOINT_DIR=/app/checkpoints
SPECTRA_CHECKPOINT_DIR=/app/checkpoints \
SPECTRA_BOOTSTRAP_SECRET= \
SPECTRA_SESSION_SECRET=change-me-in-production \
SPECTRA_WEBAUTHN_ORIGIN= \
SPECTRA_WEBAUTHN_RP_ID=

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD python -c "from tgarchive.db.integrity_checker import quick_integrity_check; \
import sys; \
sys.exit(0 if quick_integrity_check('${SPECTRA_DB_PATH}') else 1)"
CMD python -c "import urllib.request, sys; \
url = 'http://127.0.0.1:5000/login'; \
sys.exit(0 if urllib.request.urlopen(url, timeout=5).getcode() < 400 else 1)"

# Volumes for persistent data
VOLUME ["/app/data", "/app/logs", "/app/media", "/app/checkpoints"]

# Expose health check port
EXPOSE 8080
# Expose unified web UI port
EXPOSE 5000

# Default command
CMD ["python", "-m", "tgarchive"]
CMD ["sh", "-c", "python -m spectra_app.spectra_gui_launcher --host ${SPECTRA_HOST:-0.0.0.0} --port ${SPECTRA_PORT:-5000} --log-level ${LOG_LEVEL:-INFO}"]
25 changes: 18 additions & 7 deletions deployment/docker/README.md
@@ -1,6 +1,6 @@
# SPECTRA Docker Deployment

Production-ready Docker containers for SPECTRA with TEMPEST Class C security controls.
Production-ready Docker containers for SPECTRA with TEMPEST Class C security controls and YubiKey/WebAuthn browser authentication.

## Quick Start

@@ -23,8 +23,10 @@ nano .env # Edit with your credentials

Required variables:
```env
TG_API_ID=your_api_id
TG_API_HASH=your_api_hash_32_chars
SPECTRA_BOOTSTRAP_SECRET=one-time-bootstrap-secret
SPECTRA_SESSION_SECRET=change-me-in-production
SPECTRA_WEBAUTHN_ORIGIN=http://localhost:5000
SPECTRA_WEBAUTHN_RP_ID=localhost
```
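
A minimal preflight sketch that flags required variables which are unset or still carry the example values shown above. The variable names and placeholder strings are copied from this README; the helper name `check_env` is hypothetical and not part of SPECTRA.

```python
import os

# Names and placeholder values are copied from the README examples.
REQUIRED = (
    "SPECTRA_BOOTSTRAP_SECRET",
    "SPECTRA_SESSION_SECRET",
    "SPECTRA_WEBAUTHN_ORIGIN",
    "SPECTRA_WEBAUTHN_RP_ID",
)
PLACEHOLDERS = {"one-time-bootstrap-secret", "change-me-in-production"}


def check_env(env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value in PLACEHOLDERS:
            problems.append(f"{name} still uses a placeholder value")
    return problems


if __name__ == "__main__":
    for problem in check_env():
        print(problem)
```

Running it before `docker compose up` gives an early warning if a placeholder secret is about to reach a real deployment.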

### 3. Build and Start
@@ -49,7 +51,7 @@ curl http://localhost:8080/health
## Services

### spectra
Main archiver service with full functionality.
Main SPECTRA web console service with the full operator workflow.

**Resources:**
- CPU: 0.5-2.0 cores
@@ -66,9 +68,9 @@ Main archiver service with full functionality.
Health check and monitoring endpoint.

**Endpoints:**
- `GET /health` - Overall health status
- `GET /metrics` - Resource metrics (Prometheus format)
- `GET /status` - Detailed component status
- `GET /login` - Browser login/bootstrap surface
- `GET /api/auth/bootstrap/status` - First-run bootstrap state
- `GET /api/system/status` - Detailed component and auth status
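
A minimal sketch of probing these endpoints from the host, in the same spirit as the container health check. The paths come from the list above; the base URL, the helper name `probe`, and the injectable `opener` parameter are assumptions for illustration. Only status codes are recorded, because the response bodies are not documented here.

```python
import urllib.error
import urllib.request

# Paths are taken from the README; the base URL assumes a local container.
PATHS = ("/login", "/api/auth/bootstrap/status", "/api/system/status")


def probe(base_url="http://127.0.0.1:5000", paths=PATHS, timeout=5,
          opener=urllib.request.urlopen):
    """Map each path to its HTTP status code, or None if unreachable."""
    results = {}
    for path in paths:
        try:
            with opener(base_url + path, timeout=timeout) as resp:
                results[path] = resp.getcode()
        except urllib.error.HTTPError as exc:
            # The server answered with an error status (for example when auth-gated).
            results[path] = exc.code
        except (urllib.error.URLError, OSError):
            # Nothing is listening, or the request failed at the network level.
            results[path] = None
    return results


if __name__ == "__main__":
    for path, status in probe().items():
        print(f"{path}: {status}")
```

A `None` for every path usually means the container is not up yet or the port mapping differs from the assumed `5000`.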

**Resources:**
- CPU: Up to 0.5 cores
@@ -173,6 +175,15 @@ volumes:
- Dropped Linux capabilities
- No privilege escalation

## First-Run Bootstrap

The first browser operator enrolled through `/login` becomes the admin.

1. Set `SPECTRA_BOOTSTRAP_SECRET` before starting the container.
2. Open `http://localhost:5000/login`.
3. Enter the bootstrap secret, username, and display name, then register a YubiKey or platform passkey.
4. Use the resulting admin session to enroll additional operators and credentials.
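
Passkey registration in step 3 depends on `SPECTRA_WEBAUTHN_RP_ID` being consistent with `SPECTRA_WEBAUTHN_ORIGIN`: WebAuthn requires the RP ID to equal the origin's host or be a parent domain of it. A minimal sketch of that check follows. The helper name `rp_id_matches_origin` is hypothetical, and the check is simplified (it does not consult the public suffix list), so treat it as a sanity test, not a full implementation of the browser's rules.

```python
from urllib.parse import urlsplit


def rp_id_matches_origin(origin, rp_id):
    """True if rp_id equals the origin host or is a parent domain of it.

    Simplified: ignores the public suffix list that browsers also enforce.
    """
    host = (urlsplit(origin).hostname or "").lower()
    rp_id = rp_id.lower()
    return bool(host) and (host == rp_id or host.endswith("." + rp_id))


if __name__ == "__main__":
    # Values from the README's example configuration.
    print(rp_id_matches_origin("http://localhost:5000", "localhost"))
```

With the example values from this README the check passes; a mismatch such as an RP ID of `spectra.example.com` with an origin of `https://example.com` would fail it.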

### Hardening Checklist

- [ ] Use `.env` file, not environment in compose
116 changes: 14 additions & 102 deletions deployment/docker/docker-compose.yml
@@ -5,40 +5,38 @@ services:
build:
context: ../..
dockerfile: deployment/docker/Dockerfile
container_name: spectra-archiver
container_name: spectra-ui
restart: unless-stopped

# Environment variables
environment:
- TG_API_ID=${TG_API_ID}
- TG_API_HASH=${TG_API_HASH}
- SPECTRA_HOST=0.0.0.0
- SPECTRA_PORT=5000
- SPECTRA_DB_PATH=/app/data/spectra.db
- SPECTRA_MEDIA_DIR=/app/media
- SPECTRA_LOG_DIR=/app/logs
- SPECTRA_CHECKPOINT_DIR=/app/checkpoints
- LOG_LEVEL=${LOG_LEVEL:-INFO}
- SPECTRA_BOOTSTRAP_SECRET=${SPECTRA_BOOTSTRAP_SECRET:-}
- SPECTRA_SESSION_SECRET=${SPECTRA_SESSION_SECRET:-change-me-in-production}
- SPECTRA_WEBAUTHN_ORIGIN=${SPECTRA_WEBAUTHN_ORIGIN:-http://localhost:5000}
- SPECTRA_WEBAUTHN_RP_ID=${SPECTRA_WEBAUTHN_RP_ID:-localhost}
- SPECTRA_JWT_SECRET=${SPECTRA_JWT_SECRET:-change-me-in-production}

# Alternative: Load from .env file
env_file:
- .env

# Volumes for persistent data
volumes:
- spectra-data:/app/data
- spectra-media:/app/media
- spectra-logs:/app/logs
- spectra-checkpoints:/app/checkpoints
- ./config:/app/config:ro # Mount config directory as read-only
- ./config:/app/config:ro

# Security options (TEMPEST Class C)
security_opt:
- no-new-privileges:true
cap_drop:
- ALL
cap_add:
- NET_BIND_SERVICE # Only if binding to privileged ports
read_only: false # Need write access to volumes

# Resource limits
deploy:
resources:
limits:
@@ -48,111 +46,25 @@ services:
cpus: '0.5'
memory: 1G

# Network
networks:
- spectra-network

# Health check
ports:
- "5000:5000"

healthcheck:
test: ["CMD", "python", "-c", "from tgarchive.db.integrity_checker import quick_integrity_check; import sys; sys.exit(0 if quick_integrity_check('/app/data/spectra.db') else 1)"]
test: ["CMD", "python", "-c", "import urllib.request, sys; sys.exit(0 if urllib.request.urlopen('http://127.0.0.1:5000/login', timeout=5).getcode() < 400 else 1)"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s

# Logging
logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "3"

spectra-health:
build:
context: ../..
dockerfile: deployment/docker/Dockerfile
container_name: spectra-health
restart: unless-stopped

environment:
- PYTHONUNBUFFERED=1

volumes:
- spectra-data:/app/data:ro # Read-only access
- spectra-logs:/app/logs

security_opt:
- no-new-privileges:true
cap_drop:
- ALL

deploy:
resources:
limits:
cpus: '0.5'
memory: 512M

networks:
- spectra-network

ports:
- "8080:8080"

command: ["python", "-m", "tgarchive.core.health_server", "--port", "8080"]

logging:
driver: "json-file"
options:
max-size: "5m"
max-file: "2"

spectra-scheduler:
build:
context: ../..
dockerfile: deployment/docker/Dockerfile
container_name: spectra-scheduler
restart: unless-stopped

environment:
- TG_API_ID=${TG_API_ID}
- TG_API_HASH=${TG_API_HASH}
- SPECTRA_DB_PATH=/app/data/spectra.db
- LOG_LEVEL=${LOG_LEVEL:-INFO}

env_file:
- .env

volumes:
- spectra-data:/app/data
- spectra-media:/app/media
- spectra-logs:/app/logs
- ./config:/app/config:ro

security_opt:
- no-new-privileges:true
cap_drop:
- ALL

deploy:
resources:
limits:
cpus: '1.0'
memory: 1G

networks:
- spectra-network

command: ["python", "-m", "tgarchive.services.scheduler_service"]

depends_on:
- spectra

logging:
driver: "json-file"
options:
max-size: "5m"
max-file: "2"

volumes:
spectra-data:
driver: local