From bec46e80d70617107d07878607fb523a417c0234 Mon Sep 17 00:00:00 2001
From: Ahmet Abdullah Gultekin <ahmetabdullahgultekin@gmail.com>
Date: Tue, 12 May 2026 17:55:43 +0000
Subject: [PATCH 1/2] fix(verify): enforce anti-spoof block, wire EAR, fix
 aged-threshold, pin SHA, add verify-challenge
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes 5 P0/P1 findings from the 2026-05-12 ML review:

Bug 1 (P0) — Anti-spoof `recommended_action='block'` is advisory
  AntispoofPipelineAssembler attached `recommended_action='block'` to /verify
  responses, but the route still returned 200/verified=True. Added
  `ANTISPOOF_BLOCK_ENFORCE=true` (default ON in prod). When the assembler
  returns recommended_action='block' (reason refined by face_usability_block or
  hybrid_fusion_is_spoof), or the EAR check reports closed eyes, the route now
  raises HTTP 403 with `{error_code: ANTISPOOF_BLOCKED, reason: <category>}`.
  Set the flag to false for a canary/observation rollout. Tests:
  tests/integration/test_verify_antispoof_block_enforce.py (8 assertions, 4 for Bug 1).
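
  The veto logic can be sketched without FastAPI as below. `merge_block_verdict`
  mirrors `_merge_block_verdict` from this patch; `response_status` is a
  hypothetical helper used only to illustrate how the flag gates enforcement.

```python
from typing import Optional


def merge_block_verdict(
    antispoof_pipeline: Optional[dict],
    ear_liveness: Optional[dict],
) -> Optional[str]:
    """Return a block-reason category, or None when the request may proceed."""
    if antispoof_pipeline is not None:
        action = str(antispoof_pipeline.get("recommended_action", "")).lower()
        if action == "block":
            # Sub-flags only refine the reason; they do not block on their own.
            if antispoof_pipeline.get("face_usability_block"):
                return "FACE_UNUSABLE"
            if antispoof_pipeline.get("hybrid_fusion_is_spoof"):
                return "HYBRID_FUSION_SPOOF"
            return "ANTISPOOF_BLOCK"
    if ear_liveness is not None and ear_liveness.get("eyes_closed") is True:
        return "EYES_CLOSED"
    return None


def response_status(block_reason: Optional[str], enforce: bool) -> int:
    """Hypothetical helper: 403 only when a veto exists AND enforcement is on."""
    return 403 if (block_reason is not None and enforce) else 200
```

  With enforcement off, a veto is still computed (and logged by the route) but
  the status stays 200, which is what makes the flag usable for canary runs.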

Bug 2 (P0) — Blink-cache / EAR work unreachable from /verify
  The 2026-05-11 spoof-detector paper-P0 (blink cache + EAR recalibration)
  lived in `src.infrastructure.analyzers.blink_analyzer` but was never wired
  into the route. Added `_evaluate_ear_liveness_safe()` that runs MediaPipe
  FaceLandmarker on the uploaded still frame, computes EAR via the
  spoof-detector library (EAR_THRESHOLD=0.18), and vetoes on closed eyes.
  Multi-frame BlinkAnalyzer state (V-shape detection) is explicitly out of
  scope here: the cache only helps with multi-face, multi-frame video
  sessions, which the current single-still-frame /verify contract doesn't
  provide. Default OFF (ANTISPOOF_EAR_VETO_ENABLED) until ops deploys the
  face_landmarker.task asset; the helper fails soft, returning None when the
  model or MediaPipe is missing. Companion spoof-detector PR exposes the
  blink_analyzer module on the public `spoof_detector.*` namespace
  (per `feedback_spoof_detector_architecture`, algorithms live there).
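
  For readers unfamiliar with EAR, a minimal sketch of the textbook six-point
  Eye Aspect Ratio follows. It is an assumption that spoof-detector's
  `compute_ear` uses this same formulation; `eye_aspect_ratio` and
  `eyes_closed` are hypothetical names, and only the 0.18 threshold comes
  from this commit.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
EAR_THRESHOLD = 0.18  # value quoted in this commit message


def eye_aspect_ratio(eye: Sequence[Point]) -> float:
    """EAR for six landmarks ordered p1..p6, where p1 and p4 are the corners."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)  # two lid-to-lid spans
    horizontal = math.dist(p1, p4)                    # corner-to-corner width
    return vertical / (2.0 * horizontal)


def eyes_closed(left: Sequence[Point], right: Sequence[Point]) -> bool:
    """Average both eyes, then compare against the threshold (as the route does)."""
    avg = (eye_aspect_ratio(left) + eye_aspect_ratio(right)) / 2.0
    return avg < EAR_THRESHOLD
```

  Because the two eyes are averaged, one fully open eye can keep the average
  above the threshold even when the other eye is shut.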

Bug 3 (P0) — VERIFICATION_THRESHOLD_AGED semantics inverted
  Comparator is `verified = distance < threshold`; default was
  THRESHOLD=0.45, THRESHOLD_AGED=0.38 — making aged users *stricter*
  (higher FRR), the opposite of the adaptive feature's intent. Raised
  THRESHOLD_AGED default to 0.55 (still well below Facenet cosine
  operating-point ceiling ~0.6 so FAR stays controlled). Added a
  Pydantic model_validator that hard-rejects aged < standard at
  config-load — the regression cannot silently come back via env-file
  edits. .env.example documents the comparator semantics inline.
  Tests: tests/unit/test_verification_threshold_aged.py (4 assertions).
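
  The inversion is easiest to see with a concrete distance. The sketch below
  uses a stdlib dataclass as a stand-in for the Pydantic validator in this
  patch; `Thresholds` and `is_verified` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    standard: float = 0.45
    aged: float = 0.55

    def __post_init__(self) -> None:
        # Same rule as the config-load validator: aged must not be stricter.
        if self.aged < self.standard:
            raise ValueError(
                f"aged threshold ({self.aged}) must be >= standard ({self.standard})"
            )


def is_verified(distance: float, threshold: float) -> bool:
    """Comparator from this patch: smaller cosine distance means closer match."""
    return distance < threshold
```

  A distance of 0.50 fails under both 0.45 and the old aged default 0.38, but
  passes under the new aged default 0.55, so only the new default is more
  lenient than the standard threshold.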

Bug 4 (P1) — Web puzzles call onSuccess client-side, no server validation
  Added POST /api/v1/liveness/verify-challenge for the web
  biometric-puzzles training surface. Single-action contract:
  `{action, start_timestamp_ms, end_timestamp_ms, confidence, ...}` →
  `{verified, action, duration_seconds, reason_code, message}`. Structural
  validation only (action enum, timestamps monotonic + sane duration
  120ms..60s, confidence floor 0.5). Heavier server-side detection
  belongs to multi-step /liveness/verify. Tests:
  tests/integration/test_verify_challenge_endpoint.py (7 assertions).
  Web-app wiring lands in a companion PR on web-app.
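
  The structural checks reduce to a small pure function. This is a sketch of
  the ordering used by the route in this patch; `challenge_reason_code` is a
  hypothetical name, while the constants and reason codes are the ones the
  route defines.

```python
from typing import Optional

MIN_DURATION_S = 0.12   # ~120 ms floor
MAX_DURATION_S = 60.0   # stale-session / replay ceiling
MIN_CONFIDENCE = 0.5    # client-reported detection floor


def challenge_reason_code(
    start_ms: float, end_ms: float, confidence: float
) -> Optional[str]:
    """Return the first failing check's reason code, or None when all pass."""
    if end_ms < start_ms:
        return "TIMESTAMPS_OUT_OF_ORDER"
    duration_s = (end_ms - start_ms) / 1000.0
    if duration_s < MIN_DURATION_S:
        return "DURATION_TOO_SHORT"
    if duration_s > MAX_DURATION_S:
        return "DURATION_TOO_LONG"
    if confidence < MIN_CONFIDENCE:
        return "CONFIDENCE_BELOW_FLOOR"
    return None
```

  All inputs are client-supplied, so these checks only filter malformed or
  trivially scripted submissions; they do not prove a gesture happened.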

Bug 5 (P1) — SHA256 model integrity pins empty / advisory
  `_verify_model_integrity` previously logged a WARNING when the pin was
  empty. Added `DEEPFACE_SHA256_REQUIRED=true` (default). With this flag
  on AND ENVIRONMENT=production, an empty pin now raises RuntimeError at
  model-load — defense against silent ~/.deepface/weights/ rotations.
  Operator action: compute `sha256sum` against the in-container
  facenet512_weights.h5 and pin it via DEEPFACE_FACENET512_SHA256 in
  .env.prod (captured 2026-05-12 from running container:
  3f76b5117a9ca574d536af8199e6720089eb4ad3dc7e93534496d88265de864f).
  The face/hand_landmarker.task hashes intentionally stay empty —
  those models are NOT loaded server-side; the server only delivers
  them as static SHA256-verified assets to clients. Tests:
  tests/unit/test_deepface_sha256_required.py (5 assertions).
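
  A minimal sketch of the pin check, assuming the real
  `_verify_model_integrity` behaves similarly. `sha256_of_file` and
  `check_model_pin` are hypothetical names, `required` and `environment`
  stand in for DEEPFACE_SHA256_REQUIRED and ENVIRONMENT, and raising on a
  hash mismatch is an assumption not stated in this commit.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large weight files are never fully held in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_model_pin(
    path: Path, expected: Optional[str], *, required: bool, environment: str
) -> None:
    """Raise RuntimeError on a missing-but-required pin or on a mismatch."""
    if not expected:
        if required and environment == "production":
            raise RuntimeError("SHA256 pin is empty but required in production")
        return  # advisory mode: nothing to compare against
    actual = sha256_of_file(path)
    if actual != expected.lower():  # assumed behaviour: mismatch is fatal
        raise RuntimeError(f"SHA256 mismatch for {path}: got {actual}")
```

  The streamed digest matches what `sha256sum` prints for the same file, so
  the operator-captured value can be pinned directly.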

Test results (DATABASE_URL=postgresql://test:test@localhost:5432/test):
  - 4 new unit tests (verification_threshold_aged)
  - 5 new unit tests (deepface_sha256_required)
  - 8 new integration tests (verify_antispoof_block_enforce)
  - 7 new integration tests (verify_challenge_endpoint)
  - 6 pre-existing integration tests (verify_antispoof_wiring) — now also
    run locally thanks to added `resemblyzer` mock (baseline-rot fix).
  - test_config_validator.py — 14 pre-existing tests still green.
  Total: 44 pass / 0 fail locally.

Operator action items:
  1. Pin `DEEPFACE_FACENET512_SHA256` in /opt/projects/fivucsas/biometric-processor/.env.prod
     with the value captured above (already added to local .env.prod, NOT committed
     because .env.prod is gitignored).
  2. Rebuild biometric-processor container to pick up these changes.
  3. Decide whether to flip `ANTISPOOF_BLOCK_ENFORCE=false` for a canary rollout
     before relying on the default-ON behavior.
  4. To enable Bug 2 EAR veto: deploy `models/face_landmarker.task`, set
     `FACE_LANDMARKER_MODEL_PATH`, then `ANTISPOOF_EAR_VETO_ENABLED=true`.
  5. Add the identity-core-api proxy for `/biometric/puzzles/verify-challenge`
     when convenient — web-app soft-passes on 404 until it lands.

Memory rules respected:
  - feedback_spoof_detector_architecture: algorithms come from spoof-detector
    via the new public shim; biometric-processor only imports + wires.
  - feedback_liveness_hybrid_vs_passive: no liveness backend changes; prod
    LIVENESS_BACKEND remains as configured by ops.
  - feedback_readonly_rootfs_cache_dirs: new lazy FaceLandmarker init
    respects the existing FACE_LANDMARKER_MODEL_PATH env contract; cache
    dirs unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .env.example                                  |  15 +
 app/api/routes/puzzle.py                      | 138 ++++++
 app/api/routes/verification.py                | 211 ++++++++-
 app/api/schemas/single_challenge.py           |  85 ++++
 app/api/schemas/verification.py               |  17 +-
 app/core/config.py                            |  87 +++-
 .../ml/extractors/deepface_extractor.py       |  26 +-
 .../test_verify_antispoof_block_enforce.py    | 431 ++++++++++++++++++
 .../test_verify_antispoof_wiring.py           |  11 +-
 .../test_verify_challenge_endpoint.py         | 132 ++++++
 tests/unit/test_deepface_sha256_required.py   | 166 +++++++
 .../unit/test_verification_threshold_aged.py  |  76 +++
 12 files changed, 1385 insertions(+), 10 deletions(-)
 create mode 100644 app/api/schemas/single_challenge.py
 create mode 100644 tests/integration/test_verify_antispoof_block_enforce.py
 create mode 100644 tests/integration/test_verify_challenge_endpoint.py
 create mode 100644 tests/unit/test_deepface_sha256_required.py
 create mode 100644 tests/unit/test_verification_threshold_aged.py
diff --git a/.env.example b/.env.example
index ac4b679..511f2fb 100644
--- a/.env.example
+++ b/.env.example
@@ -65,6 +65,21 @@ SIMILARITY_METRIC=cosine
 SIMILARITY_THRESHOLD=0.6
 EMBEDDING_DIMENSION=2622
 
+# ---------------------------------------------------------------------------
+# Verification thresholds (cosine distance)
+# ---------------------------------------------------------------------------
+# Comparator: verified = distance < threshold
+#   HIGHER threshold = MORE LENIENT (further-allowed distance still matches)
+#   LOWER  threshold = STRICTER   (only near-zero distances match)
+# VERIFICATION_THRESHOLD_AGED must be >= VERIFICATION_THRESHOLD; the config
+# loader rejects inverted values (see app/core/config.py
+# _validate_aged_threshold_lenience). Bug 2026-05-12: an earlier default of
+# 0.38 for aged users made them STRICTER, the opposite of the adaptive
+# feature's intent.
+# VERIFICATION_THRESHOLD=0.45
+# VERIFICATION_THRESHOLD_AGED_YEARS=2.0
+# VERIFICATION_THRESHOLD_AGED=0.55  # higher than 0.45 ⇒ more lenient for aged
+
 # Alternative Models (comment/uncomment to switch)
 # IMPORTANT: When using pgvector, EMBEDDING_DIMENSION must match your model!
 
diff --git a/app/api/routes/puzzle.py b/app/api/routes/puzzle.py
index a100f7e..d48c56c 100644
--- a/app/api/routes/puzzle.py
+++ b/app/api/routes/puzzle.py
@@ -18,6 +18,10 @@
     VerifyPuzzleRequest,
     VerifyPuzzleResponse,
 )
+from app.api.schemas.single_challenge import (
+    VerifyChallengeRequest,
+    VerifyChallengeResponse,
+)
 from app.application.use_cases.generate_puzzle import GeneratePuzzleUseCase
 from app.application.use_cases.verify_puzzle import VerifyPuzzleUseCase
 from app.core.container import get_generate_puzzle_use_case, get_verify_puzzle_use_case
@@ -268,3 +272,137 @@ async def verify_puzzle(
             status_code=500,
             detail="Failed to verify puzzle. Please try again.",
         )
+
+
+# ---------------------------------------------------------------------------
+# Bug 4 (2026-05-12) — single-challenge server validation for the web
+# biometric-puzzles training surface.
+# ---------------------------------------------------------------------------
+#
+# Before this endpoint, ``FacePuzzle.tsx`` and ``HandGesturePuzzle.tsx``
+# detected gestures client-side and called ``onSuccess()`` directly. A
+# malicious user could mock the component out and "pass" any challenge.
+# This endpoint adds a server round-trip the web layer waits on, so
+# ``onSuccess`` is only invoked when the backend confirms the structural
+# checks below.
+#
+# Scope: structural validation only (action enum, timestamp monotonicity,
+# duration sanity, confidence floor). Heavier server-side detection
+# (re-running MediaPipe on uploaded frames) belongs to the multi-step
+# ``/liveness/verify`` flow used by enrollment. The training surface is
+# explicitly lightweight.
+
+
+# Minimum challenge duration (seconds). A real human gesture takes at
+# least ~120 ms even for the fastest blinks; bot scripts firing the
+# endpoint immediately are caught here.
+_MIN_CHALLENGE_DURATION_S = 0.12
+
+# Maximum challenge duration (seconds). Anything beyond 60 s is a stale
+# session or a replay; reject.
+_MAX_CHALLENGE_DURATION_S = 60.0
+
+# Minimum detection confidence the client must report. Below this the
+# server treats the submission as "no detection" regardless of the local
+# verdict. The floor is conservative (matches the engine's typical
+# detected-pass threshold of 0.5).
+_MIN_CHALLENGE_CONFIDENCE = 0.5
+
+
+@router.post(
+    "/verify-challenge",
+    response_model=VerifyChallengeResponse,
+    summary="Verify a single liveness challenge (web training surface)",
+    description=(
+        "Lightweight server validation for the biometric-puzzles training "
+        "surface. Accepts one completed challenge, runs structural checks "
+        "(action enum, timestamp monotonicity, duration sanity, confidence "
+        "floor) and returns a verdict. The web layer must wait on this "
+        "before resolving its onSuccess()."
+    ),
+    responses={
+        200: {"description": "Verdict returned (verified=true|false)"},
+        400: {"description": "Malformed request"},
+    },
+)
+async def verify_challenge(
+    request: VerifyChallengeRequest,
+) -> VerifyChallengeResponse:
+    """Server validation for a single training puzzle challenge."""
+    duration_s = max(
+        0.0, (request.end_timestamp_ms - request.start_timestamp_ms) / 1000.0
+    )
+
+    # 1. Timestamps must be monotonic (end >= start).
+    if request.end_timestamp_ms < request.start_timestamp_ms:
+        logger.info(
+            "verify-challenge rejected: timestamps out of order action=%s",
+            request.action.value,
+        )
+        return VerifyChallengeResponse(
+            verified=False,
+            action=request.action,
+            duration_seconds=duration_s,
+            reason_code="TIMESTAMPS_OUT_OF_ORDER",
+            message="Challenge timestamps are not monotonic.",
+        )
+
+    # 2. Duration in sane bounds.
+    if duration_s < _MIN_CHALLENGE_DURATION_S:
+        logger.info(
+            "verify-challenge rejected: duration_too_short action=%s duration=%.3fs",
+            request.action.value,
+            duration_s,
+        )
+        return VerifyChallengeResponse(
+            verified=False,
+            action=request.action,
+            duration_seconds=duration_s,
+            reason_code="DURATION_TOO_SHORT",
+            message="Challenge duration is implausibly short.",
+        )
+    if duration_s > _MAX_CHALLENGE_DURATION_S:
+        logger.info(
+            "verify-challenge rejected: duration_too_long action=%s duration=%.1fs",
+            request.action.value,
+            duration_s,
+        )
+        return VerifyChallengeResponse(
+            verified=False,
+            action=request.action,
+            duration_seconds=duration_s,
+            reason_code="DURATION_TOO_LONG",
+            message="Challenge duration exceeds the allowed window.",
+        )
+
+    # 3. Confidence floor.
+    if request.confidence < _MIN_CHALLENGE_CONFIDENCE:
+        logger.info(
+            "verify-challenge rejected: confidence_below_floor action=%s conf=%.2f",
+            request.action.value,
+            request.confidence,
+        )
+        return VerifyChallengeResponse(
+            verified=False,
+            action=request.action,
+            duration_seconds=duration_s,
+            reason_code="CONFIDENCE_BELOW_FLOOR",
+            message="Detection confidence is below the acceptance floor.",
+        )
+
+    logger.info(
+        "verify-challenge accepted: action=%s tenant=%s user=%s "
+        "duration=%.2fs confidence=%.2f",
+        request.action.value,
+        request.tenant_id,
+        request.user_id,
+        duration_s,
+        request.confidence,
+    )
+    return VerifyChallengeResponse(
+        verified=True,
+        action=request.action,
+        duration_seconds=duration_s,
+        reason_code=None,
+        message="Challenge verified.",
+    )
diff --git a/app/api/routes/verification.py b/app/api/routes/verification.py
index 27183a0..7220c94 100644
--- a/app/api/routes/verification.py
+++ b/app/api/routes/verification.py
@@ -50,6 +50,15 @@
 _antispoof_assembler: Optional[Any] = None  # AntispoofPipelineAssembler when available
 _antispoof_assembler_init_failed = False
 
+# Bug 2 (2026-05-12) — single-frame EAR liveness signal. Wires the
+# spoof-detector EAR computation into /verify so the closed-eye signal can
+# veto a verification. Multi-frame BlinkAnalyzer state (cache + per-face
+# history) is out of scope here because /verify only receives one still
+# frame; that wiring belongs in the multi-frame /liveness/verify route
+# which already runs the puzzle pipeline.
+_face_landmarker_for_ear: Optional[Any] = None
+_face_landmarker_for_ear_init_failed = False
+
 
 def _get_device_spoof_risk_evaluator() -> DeviceSpoofRiskEvaluator:
     """Lazy-init singleton — DeviceSpoofRiskEvaluator constructs cv2 detectors
@@ -129,6 +138,164 @@ def _get_antispoof_assembler() -> Optional[Any]:
         return None
 
 
+def _get_face_landmarker_for_ear() -> Optional[Any]:
+    """Lazy-init a MediaPipe FaceLandmarker for single-frame EAR extraction.
+
+    Returns None if MediaPipe is not importable or the asset isn't present.
+    The result is cached so we don't pay model-init cost on every request.
+    Failures are recorded so we don't spam logs on every request.
+    """
+    global _face_landmarker_for_ear, _face_landmarker_for_ear_init_failed
+    if _face_landmarker_for_ear is not None:
+        return _face_landmarker_for_ear
+    if _face_landmarker_for_ear_init_failed:
+        return None
+    try:
+        import os
+        from pathlib import Path
+        import mediapipe as mp
+
+        # Reuse the same model path the active_liveness_manager loader uses,
+        # honouring FACE_LANDMARKER_MODEL_PATH so ops can override per-env.
+        default_path = (
+            Path(__file__).parent.parent.parent.parent / "models" / "face_landmarker.task"
+        )
+        model_path = Path(os.getenv("FACE_LANDMARKER_MODEL_PATH", str(default_path)))
+        if not model_path.exists():
+            logger.info(
+                "EAR check disabled — face_landmarker.task not found at %s "
+                "(set FACE_LANDMARKER_MODEL_PATH to a deployed asset to enable)",
+                model_path,
+            )
+            _face_landmarker_for_ear_init_failed = True
+            return None
+
+        options = mp.tasks.vision.FaceLandmarkerOptions(
+            base_options=mp.tasks.BaseOptions(model_asset_path=str(model_path)),
+            running_mode=mp.tasks.vision.RunningMode.IMAGE,
+            num_faces=1,
+            min_face_detection_confidence=0.4,
+            min_tracking_confidence=0.4,
+        )
+        _face_landmarker_for_ear = mp.tasks.vision.FaceLandmarker.create_from_options(
+            options
+        )
+        logger.info("FaceLandmarker initialised for single-frame EAR liveness check")
+        return _face_landmarker_for_ear
+    except Exception as exc:  # noqa: BLE001
+        logger.warning(
+            "FaceLandmarker init for EAR failed; closed-eye veto disabled: %s",
+            exc,
+        )
+        _face_landmarker_for_ear_init_failed = True
+        return None
+
+
+def _evaluate_ear_liveness_safe(image_path: str) -> Optional[dict]:
+    """Run a one-shot Eye Aspect Ratio check on a single still frame.
+
+    Uses the EAR computation from the spoof-detector library (paper-P0
+    calibration 2026-05-11: EAR_THRESHOLD=0.18). A frame whose average EAR
+    (mean of left and right eye) falls below the threshold is treated as a
+    spoof signal: closed eyes are rare in legitimate single-photo verification
+    flows, so the check complements texture-based liveness.
+
+    Returns a dict with shape:
+        {
+            "eyes_closed": bool,
+            "left_ear": float,
+            "right_ear": float,
+            "avg_ear": float,
+            "threshold": float,
+        }
+    or None when the check can't run (no MediaPipe, no model, no face).
+    """
+    if not settings.ANTISPOOF_EAR_VETO_ENABLED:
+        return None
+    try:
+        import cv2
+        import numpy as np
+        import mediapipe as mp
+        from spoof_detector.infrastructure.analyzers.blink_analyzer import (
+            BlinkAnalyzer,
+            LEFT_EYE,
+            RIGHT_EYE,
+            compute_ear,
+        )
+    except ImportError as exc:  # pragma: no cover - dep missing in CI
+        logger.warning("EAR check unavailable; import failed: %s", exc)
+        return None
+    try:
+        landmarker = _get_face_landmarker_for_ear()
+        if landmarker is None:
+            return None
+
+        frame_bgr = cv2.imread(image_path)
+        if frame_bgr is None or frame_bgr.size == 0:
+            return None
+
+        h, w = frame_bgr.shape[:2]
+        rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
+        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb)
+        result = landmarker.detect(mp_image)
+        face_landmarks = result.face_landmarks or []
+        if not face_landmarks:
+            return None
+
+        # Use the first detected face. The pixel-space conversion matches
+        # the spoof-detector blink_analyzer contract.
+        lm = np.array(
+            [[l.x * w, l.y * h, l.z] for l in face_landmarks[0]]
+        )
+        if len(lm) < 468:
+            return None
+
+        left = compute_ear(lm, LEFT_EYE)
+        right = compute_ear(lm, RIGHT_EYE)
+        avg = (left + right) / 2.0
+        threshold = BlinkAnalyzer.EAR_THRESHOLD
+        return {
+            "eyes_closed": bool(avg < threshold),
+            "left_ear": round(left, 4),
+            "right_ear": round(right, 4),
+            "avg_ear": round(avg, 4),
+            "threshold": threshold,
+        }
+    except Exception as exc:  # noqa: BLE001 — fail-soft
+        logger.warning("EAR liveness check failed: %s", exc)
+        return None
+
+
+def _merge_block_verdict(
+    *,
+    antispoof_pipeline: Optional[dict],
+    ear_liveness: Optional[dict],
+) -> Optional[str]:
+    """Conservative veto — any spoof-leaning signal wins.
+
+    Returns a reason category string when verification MUST be blocked, or
+    ``None`` when the request is allowed to proceed. The reason category is
+    surfaced in the 403 body so callers can branch on it.
+
+    Veto rules (any one triggers a block):
+      * ``antispoof_pipeline.recommended_action == "block"``
+      * EAR check says ``eyes_closed=True`` (single still frame of closed
+        eyes is a strong spoof indicator).
+    """
+    if antispoof_pipeline is not None:
+        action = str(antispoof_pipeline.get("recommended_action", "")).lower()
+        if action == "block":
+            # Use the most specific reason the assembler attached.
+            if antispoof_pipeline.get("face_usability_block"):
+                return "FACE_UNUSABLE"
+            if antispoof_pipeline.get("hybrid_fusion_is_spoof"):
+                return "HYBRID_FUSION_SPOOF"
+            return "ANTISPOOF_BLOCK"
+    if ear_liveness is not None and ear_liveness.get("eyes_closed") is True:
+        return "EYES_CLOSED"
+    return None
+
+
 def _evaluate_antispoof_pipeline_safe(image_path: str) -> Optional[dict]:
     """Run the full anti-spoof assembler on an on-disk image.
 
@@ -273,11 +440,52 @@ async def verify_face(
 
         # Anti-spoof attachments. Both fields default None and are populated
         # only when their respective flags are on. The helpers swallow any
-        # exception — they never block verification.
+        # exception — they never block verification by raising.
         device_spoof_risk: Optional[dict] = None
         if settings.ANTISPOOF_DEVICE_RISK_ENABLED:
             device_spoof_risk = _evaluate_device_spoof_risk_safe(image_path)
         antispoof_pipeline = _evaluate_antispoof_pipeline_safe(image_path)
+        ear_liveness = _evaluate_ear_liveness_safe(image_path)
+
+        # Bug 1 (2026-05-12) — enforce assembler's `recommended_action="block"`
+        # and the EAR single-frame closed-eye signal. Previously the assembler
+        # verdict was advisory: a "block" recommendation was attached to the
+        # response but the route still returned `verified=true`. With
+        # ANTISPOOF_BLOCK_ENFORCE=true (the default) we now return 403 with
+        # a structured body. An operator can flip the flag to false for
+        # observation-only / canary rollout.
+        block_reason = _merge_block_verdict(
+            antispoof_pipeline=antispoof_pipeline,
+            ear_liveness=ear_liveness,
+        )
+        if block_reason is not None and settings.ANTISPOOF_BLOCK_ENFORCE:
+            logger.warning(
+                "Verification BLOCKED by anti-spoof veto: user_id=%s reason=%s "
+                "assembler_action=%s ear_avg=%s",
+                user_id,
+                block_reason,
+                (antispoof_pipeline or {}).get("recommended_action"),
+                (ear_liveness or {}).get("avg_ear"),
+            )
+            raise HTTPException(
+                status_code=403,
+                detail={
+                    "error_code": "ANTISPOOF_BLOCKED",
+                    "reason": block_reason,
+                    "antispoof_pipeline": antispoof_pipeline,
+                    "ear_liveness": ear_liveness,
+                    "message": "Verification rejected by anti-spoof checks",
+                },
+            )
+        if block_reason is not None:
+            # Enforcement disabled: log the bypass loudly so it's visible in
+            # production log streams when an operator runs in observation mode.
+            logger.warning(
+                "Verification anti-spoof veto SUPPRESSED (ANTISPOOF_BLOCK_ENFORCE=false): "
+                "user_id=%s reason=%s",
+                user_id,
+                block_reason,
+            )
 
         response = VerificationResponse(
             verified=result.verified,
@@ -287,6 +495,7 @@ async def verify_face(
             message=message,
             device_spoof_risk=device_spoof_risk,
             antispoof_pipeline=antispoof_pipeline,
+            ear_liveness=ear_liveness,
         )
 
         # D1 log-only: persist client pre-filter embedding for offline analysis.
diff --git a/app/api/schemas/single_challenge.py b/app/api/schemas/single_challenge.py
new file mode 100644
index 0000000..f643c80
--- /dev/null
+++ b/app/api/schemas/single_challenge.py
@@ -0,0 +1,85 @@
+"""Schema for single-challenge server validation (Bug 4, 2026-05-12).
+
+The web biometric-puzzles training surface (``BiometricPuzzlesPage``) runs
+one challenge at a time with local MediaPipe detection. Before this fix, it
+called ``onSuccess`` purely client-side — anyone could trivially mock the
+component and "pass" the puzzle. This schema is for the new
+``/liveness/verify-challenge`` endpoint that records a server round-trip
+for each completed challenge and returns a server verdict.
+
+The contract is intentionally narrow:
+  * One action per request.
+  * Client supplies start/end timestamps and a detection confidence
+    derived from MediaPipe.
+  * Server runs the cheap structural validations (action is a known type,
+    timestamps are monotonic and within a reasonable window, confidence
+    above a floor) and returns a verdict.
+
+Heavier server-side detection (re-running MediaPipe on uploaded frames) is
+out of scope for the training surface — the deep validation belongs to
+multi-step ``/liveness/verify`` flows used by enrollment.
+"""
+
+from __future__ import annotations
+
+from typing import Any, Dict, Optional
+
+from pydantic import BaseModel, Field
+
+from app.api.schemas.active_liveness import ChallengeType
+
+
+class VerifyChallengeRequest(BaseModel):
+    """Single challenge completion record submitted by the web client."""
+
+    action: ChallengeType = Field(
+        ..., description="The completed challenge action (e.g. blink, smile, pinch)."
+    )
+    start_timestamp_ms: float = Field(
+        ...,
+        gt=0,
+        description=(
+            "Client clock (performance.now() base or unix-ms) when the "
+            "challenge started. Used for monotonicity + duration sanity."
+        ),
+    )
+    end_timestamp_ms: float = Field(
+        ...,
+        gt=0,
+        description="Client clock when the challenge was detected as completed.",
+    )
+    confidence: float = Field(
+        ...,
+        ge=0.0,
+        le=1.0,
+        description="Detection confidence reported by the client engine [0..1].",
+    )
+    tenant_id: Optional[str] = Field(default=None, description="Tenant identifier.")
+    user_id: Optional[str] = Field(default=None, description="User identifier.")
+    metrics: Dict[str, Any] = Field(
+        default_factory=dict,
+        description=(
+            "Optional metric payload (e.g. min_ear for blink, mar_ratio for "
+            "smile, finger_count for hand puzzles). Logged for audit, never "
+            "used as the sole pass/fail signal."
+        ),
+    )
+
+
+class VerifyChallengeResponse(BaseModel):
+    """Server verdict for a single challenge submission."""
+
+    verified: bool = Field(..., description="Whether the challenge passed.")
+    action: ChallengeType = Field(..., description="The echoed challenge action.")
+    duration_seconds: float = Field(
+        ..., ge=0.0, description="end - start, in seconds (post-validation)."
+    )
+    reason_code: Optional[str] = Field(
+        default=None,
+        description=(
+            "Failure category when ``verified=false`` "
+            "(e.g. TIMESTAMPS_OUT_OF_ORDER, DURATION_TOO_SHORT, "
+            "DURATION_TOO_LONG, CONFIDENCE_BELOW_FLOOR)."
+        ),
+    )
+    message: str = Field(default="", description="Human-readable result message.")
diff --git a/app/api/schemas/verification.py b/app/api/schemas/verification.py
index cff0db7..5f8febf 100644
--- a/app/api/schemas/verification.py
+++ b/app/api/schemas/verification.py
@@ -32,8 +32,20 @@ class VerificationResponse(BaseModel):
         description=(
             "Optional combined verdict from spoof_detector.pipeline.AntispoofPipelineAssembler. "
             "Populated only when at least one of ANTISPOOF_USABILITY_GATE_ENABLED / "
-            "ANTISPOOF_FUSION_ENABLED is true. The `recommended_action` is advisory; "
-            "this service never enforces it."
+            "ANTISPOOF_FUSION_ENABLED is true. When `recommended_action` is "
+            "'block' AND ANTISPOOF_BLOCK_ENFORCE is true (default since "
+            "2026-05-12), the route returns HTTP 403 instead of attaching the "
+            "verdict here."
+        ),
+    )
+    ear_liveness: Optional[dict[str, Any]] = Field(
+        default=None,
+        description=(
+            "Optional single-frame Eye Aspect Ratio liveness observation from "
+            "spoof_detector.infrastructure.analyzers.blink_analyzer. Populated "
+            "only when ANTISPOOF_EAR_VETO_ENABLED=true. When 'eyes_closed' is "
+            "True AND ANTISPOOF_BLOCK_ENFORCE is true, the route returns 403 "
+            "instead of attaching the verdict here."
         ),
     )
 
@@ -47,6 +59,7 @@ class VerificationResponse(BaseModel):
                 "message": "Face verified successfully",
                 "device_spoof_risk": None,
                 "antispoof_pipeline": None,
+                "ear_liveness": None,
             }
         }
     }
diff --git a/app/core/config.py b/app/core/config.py
index 41c19e5..dffbcaa 100644
--- a/app/core/config.py
+++ b/app/core/config.py
@@ -153,6 +153,13 @@ def parse_cors_origins(cls, v):
     )
 
     # Thresholds
+    # Comparator semantics: ``verified = distance < threshold``.
+    # → HIGHER threshold = MORE LENIENT (allows greater distance ⇒ still a match).
+    # → LOWER threshold  = STRICTER (only very close distances accepted).
+    # This is the cosine-distance convention used in
+    # ``verify_face.py`` (line 181). Do not flip the comparator without also
+    # flipping every threshold pin in .env.example / .env.prod, otherwise the
+    # FAR/FRR will silently invert.
     VERIFICATION_THRESHOLD: float = Field(default=0.45, ge=0.0, le=1.0)
     LIVENESS_THRESHOLD: float = Field(default=70.0, ge=0.0, le=100.0)
     QUALITY_THRESHOLD: float = Field(default=70.0, ge=0.0, le=100.0)
@@ -160,6 +167,12 @@ def parse_cors_origins(cls, v):
     # Adaptive verification threshold for aged embeddings (Faz 3-1)
     # When the stored embedding is older than VERIFICATION_THRESHOLD_AGED_YEARS,
     # a more lenient threshold is used to account for natural appearance changes.
+    # Bug fix 2026-05-12: previously default=0.38 which is LOWER than the
+    # standard 0.45 — under ``distance < threshold`` semantics that made aged
+    # users *stricter*, the opposite of intent (higher FRR). The default is
+    # now 0.55, raising the allowed distance ceiling so aged users match more
+    # easily, while staying well below the Facenet cosine-distance ceiling
+    # of ~0.6 (the model's known operating point for cosine distance).
     VERIFICATION_THRESHOLD_AGED_YEARS: float = Field(
         default=2.0,
         ge=0.0,
@@ -169,16 +182,40 @@ def parse_cors_origins(cls, v):
         ),
     )
     VERIFICATION_THRESHOLD_AGED: float = Field(
-        default=0.38,
+        default=0.55,
         ge=0.0,
         le=1.0,
         description=(
             "Cosine-distance threshold applied when embedding age exceeds "
-            "VERIFICATION_THRESHOLD_AGED_YEARS. Lower than the default (0.45) "
-            "to be more lenient with aged embeddings."
+            "VERIFICATION_THRESHOLD_AGED_YEARS. HIGHER than the default (0.45) "
+            "because the comparator is ``distance < threshold`` — a larger "
+            "allowed-distance ceiling means more lenient matching for aged "
+            "embeddings. Must remain below the Facenet cosine-distance "
+            "ceiling (~0.6) to keep FAR under control."
         ),
     )
 
+    @model_validator(mode="after")
+    def _validate_aged_threshold_lenience(self) -> "Settings":
+        """Catch the pre-2026-05-12 inversion regression at config-load time.
+
+        ``VERIFICATION_THRESHOLD_AGED`` must be >= ``VERIFICATION_THRESHOLD``
+        because the comparator is ``distance < threshold`` — a stricter
+        ceiling for aged embeddings is meaningless (it would force aged users
+        to match the standard with *additional* margin, the opposite of the
+        adaptive feature's purpose).
+        """
+        if self.VERIFICATION_THRESHOLD_AGED < self.VERIFICATION_THRESHOLD:
+            raise ValueError(
+                "Configuration inversion detected: VERIFICATION_THRESHOLD_AGED "
+                f"({self.VERIFICATION_THRESHOLD_AGED}) must be >= "
+                f"VERIFICATION_THRESHOLD ({self.VERIFICATION_THRESHOLD}) "
+                "under the ``distance < threshold`` comparator. A lower aged "
+                "threshold makes aged users *stricter*, not more lenient. "
+                "See app/application/use_cases/verify_face.py:181."
+            )
+        return self
+
     # ML Model Timeouts (prevents hung requests)
     ML_MODEL_TIMEOUT_SECONDS: int = Field(default=30, ge=5, le=120, description="Timeout for ML model operations")
 
@@ -678,6 +715,36 @@ def get_api_key_config(self) -> dict:
             "toggle."
         ),
     )
+    # Bug 1 (2026-05-12) — enforcement flag for `recommended_action="block"`.
+    # Before this, the assembler's block verdict was attached to the
+    # response but the route still returned 200/verified=True (advisory
+    # only). Default is now ON: any "block" verdict from the assembler or
+    # any closed-eye signal from the EAR check yields a 403 with a
+    # structured reason. Flip to false for canary/observation rollout.
+    ANTISPOOF_BLOCK_ENFORCE: bool = Field(
+        default=True,
+        description=(
+            "When True, AntispoofPipelineAssembler 'recommended_action=block' "
+            "and EAR closed-eye detection cause the /verify route to return "
+            "HTTP 403 (was advisory-only prior to 2026-05-12)."
+        ),
+    )
+    # Bug 2 (2026-05-12): single-frame EAR liveness signal flag.
+    # When True, /verify runs MediaPipe FaceLandmarker on the uploaded
+    # frame and computes Eye Aspect Ratio via the spoof-detector library
+    # (calibration EAR_THRESHOLD=0.18, paper-P0 2026-05-11). If both eyes
+    # are clearly closed and ANTISPOOF_BLOCK_ENFORCE is on, the request
+    # gets the same 403 as an assembler block. Default OFF until ops deploys
+    # face_landmarker.task; the helper fails soft to None without the model.
+    ANTISPOOF_EAR_VETO_ENABLED: bool = Field(
+        default=False,
+        description=(
+            "Enable the single-frame EAR (Eye Aspect Ratio) closed-eye veto "
+            "on /verify. Requires FACE_LANDMARKER_MODEL_PATH to point at a "
+            "deployed face_landmarker.task asset. Fails-soft to no-op when "
+            "MediaPipe or the model is unavailable."
+        ),
+    )
     GESTURE_HAND_LANDMARKER_MODEL_PATH: str = Field(
         default=str(_REPO_ROOT / "models" / "hand_landmarker.task"),
         description=(
@@ -732,6 +799,20 @@ def get_api_key_config(self) -> dict:
         default="",
         description="Expected SHA256 hex digest for Facenet512 weights file (empty = skip with warning)",
     )
+    # Bug 5 (2026-05-12) — fail-fast in prod when SHA pin is missing.
+    # Previously an empty DEEPFACE_FACENET512_SHA256 only logged a warning,
+    # which let an undetected weight rotation (or supply-chain compromise of
+    # ``~/.deepface/weights/``) silently change embeddings. With this flag on
+    # and ENVIRONMENT=production, an empty pin now raises at model-load time.
+    DEEPFACE_SHA256_REQUIRED: bool = Field(
+        default=True,
+        description=(
+            "When True (default) AND ENVIRONMENT=production, refuse to load "
+            "the DeepFace model unless DEEPFACE_FACENET512_SHA256 is pinned. "
+            "Set False to opt out (e.g. first-deploy of a new model version "
+            "before the hash has been captured)."
+        ),
+    )
 
     # ML-M5: server-side caps on find_similar threshold/limit (caller-controlled today).
     FIND_SIMILAR_FACE_MAX_THRESHOLD: float = Field(
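
Reviewer note on the config.py threshold hunks above: the interaction between
the `distance < threshold` comparator and the two thresholds can be sketched in
a few lines. This is a minimal illustration under stated assumptions, not the
code in verify_face.py. `Thresholds`, `pick_threshold` and `is_match` are
hypothetical names, and the strict `>` age comparison is an assumption based on
the "older than" wording of the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical stand-in for the three settings touched by this patch."""

    standard: float = 0.45    # VERIFICATION_THRESHOLD default
    aged: float = 0.55        # VERIFICATION_THRESHOLD_AGED default after this patch
    aged_years: float = 2.0   # VERIFICATION_THRESHOLD_AGED_YEARS default

    def __post_init__(self) -> None:
        # Same rule as the new model validator: the aged threshold may be
        # equal to the standard one, but never stricter (lower).
        if self.aged < self.standard:
            raise ValueError("aged threshold must be >= standard threshold")


def pick_threshold(embedding_age_years: float, t: Thresholds) -> float:
    """Use the lenient threshold only for embeddings older than the cutoff."""
    # Assumption: strict '>' comparison against the cutoff.
    return t.aged if embedding_age_years > t.aged_years else t.standard


def is_match(distance: float, embedding_age_years: float, t: Thresholds) -> bool:
    """Comparator from the patch: verified = distance < threshold."""
    return distance < pick_threshold(embedding_age_years, t)
```

With the defaults, a cosine distance of 0.50 matches a three-year-old embedding
but not a one-year-old one, and constructing `Thresholds(aged=0.38)` fails the
same way the pre-fix default would now fail at config load.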
diff --git a/app/infrastructure/ml/extractors/deepface_extractor.py b/app/infrastructure/ml/extractors/deepface_extractor.py
index 4ad7854..0463e17 100644
--- a/app/infrastructure/ml/extractors/deepface_extractor.py
+++ b/app/infrastructure/ml/extractors/deepface_extractor.py
@@ -88,9 +88,29 @@ def _verify_model_integrity(model_name: str) -> None:
         return
 
     if not expected:
-        # TODO: pin DEEPFACE_FACENET512_SHA256 in config.py once the known-good
-        # hash has been recorded from a trusted build. See ML-M1 in
-        # docs/audits/AUDIT_2026-04-19.md.
+        # Bug 5 (2026-05-12) — defense in depth: in production, an empty pin
+        # is no longer "warn + skip". An unpinned model means a weight
+        # rotation (or supply-chain compromise) of ~/.deepface/weights/ can
+        # land silently, so prod fails fast unless an operator explicitly
+        # opts out via DEEPFACE_SHA256_REQUIRED=False (e.g. during the very
+        # first deploy of a new model version where the hash hasn't been
+        # captured yet).
+        required = getattr(settings, "DEEPFACE_SHA256_REQUIRED", True)
+        env = (getattr(settings, "ENVIRONMENT", "") or "").lower()
+        if required and env == "production":
+            logger.error(
+                "DeepFace model integrity pin missing for %s while "
+                "DEEPFACE_SHA256_REQUIRED=true on production. Refusing to "
+                "load the model — set DEEPFACE_FACENET512_SHA256 in .env.prod "
+                "with the output of `sha256sum %s`.",
+                weight_path,
+                weight_path,
+            )
+            raise RuntimeError(
+                "DeepFace model integrity pin missing — refusing to load "
+                f"{weight_path}. Set DEEPFACE_FACENET512_SHA256 or set "
+                "DEEPFACE_SHA256_REQUIRED=false to opt out (not recommended)."
+            )
         logger.warning(
             "DeepFace model integrity check skipped (no pinned hash): %s. "
             "Set DEEPFACE_FACENET512_SHA256 once verified.",
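
Reviewer note on the integrity hunk above: the missing-pin gate and the way an
operator would capture the pin can be sketched as follows. This is an
illustration only. `sha256_of_file` and `check_pin` are hypothetical helpers,
and raising on a digest mismatch is an assumption, since the mismatch branch of
`_verify_model_integrity` is not part of this hunk.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex digest of a file, read in chunks so large weights stay out of RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_pin(weight_path: Path, expected: str, required: bool, environment: str) -> str:
    """Return 'verified' or 'skipped'; raise when the gate refuses to load."""
    if not expected:
        # Empty pin: fail fast only for required + production, as in the patch.
        if required and environment.lower() == "production":
            raise RuntimeError(f"integrity pin missing for {weight_path}")
        return "skipped"
    actual = sha256_of_file(weight_path)
    # Assumption: a mismatch is fatal in every environment.
    if not hmac.compare_digest(actual, expected.strip().lower()):
        raise RuntimeError(f"integrity mismatch for {weight_path}")
    return "verified"
```

The value returned by `sha256_of_file` is the same hex string that
`sha256sum <file>` prints, which is what the error message asks operators to
put into DEEPFACE_FACENET512_SHA256.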
diff --git a/tests/integration/test_verify_antispoof_block_enforce.py b/tests/integration/test_verify_antispoof_block_enforce.py
new file mode 100644
index 0000000..25effa6
--- /dev/null
+++ b/tests/integration/test_verify_antispoof_block_enforce.py
@@ -0,0 +1,431 @@
+"""Integration tests for ANTISPOOF_BLOCK_ENFORCE + ANTISPOOF_EAR_VETO_ENABLED.
+
+Bugs fixed 2026-05-12:
+  * Bug 1: AntispoofPipelineAssembler `recommended_action="block"` was
+    advisory — the route attached it to the response but still returned
+    200/verified=True. We now return 403 with a structured body when
+    enforce is on.
+  * Bug 2: blink-cache/EAR work from spoof-detector was unreachable from
+    /verify. We now wire `compute_ear` into a single-frame check and
+    veto when both eyes are closed.
+
+Per the existing test_verify_antispoof_wiring.py convention this file uses
+a module-scoped TestClient to avoid the anyio-portal closed-loop issue when
+the route's lru-cached deps are recreated mid-suite.
+"""
+
+from __future__ import annotations
+
+import io
+import sys
+from unittest.mock import AsyncMock, Mock, patch
+
+import cv2
+import numpy as np
+import pytest
+
+# Mock DeepFace before any imports that depend on it (same pattern as
+# test_verify_antispoof_wiring.py). Resemblyzer is required by main.py's
+# lifespan via SpeakerEmbedder, but the dev host doesn't have it installed
+# (it's a CPU-heavy optional dep), so it is mocked as well to let the
+# TestClient lifespan succeed. This is the "baseline rot" pattern documented
+# in bio main: 79 pre-existing failing tests share the same root cause.
+sys.modules.setdefault("deepface", Mock())
+sys.modules.setdefault("deepface.DeepFace", Mock())
+sys.modules.setdefault("resemblyzer", Mock(VoiceEncoder=Mock()))
+
+from fastapi.testclient import TestClient
+
+from app.api.routes import verification as verify_route
+from app.core.container import (
+    get_check_liveness_use_case,
+    get_client_embedding_observation_repository,
+    get_file_storage,
+    get_verify_face_use_case,
+)
+from app.domain.entities.liveness_result import LivenessResult
+from app.domain.entities.verification_result import VerificationResult
+from app.main import app
+
+
+@pytest.fixture(scope="module")
+def _module_client():
+    with TestClient(app) as c:
+        yield c
+
+
+@pytest.fixture
+def client(_module_client) -> TestClient:
+    verify_route._antispoof_assembler = None
+    verify_route._antispoof_assembler_init_failed = False
+    verify_route._device_spoof_risk_evaluator = None
+    verify_route._face_landmarker_for_ear = None
+    verify_route._face_landmarker_for_ear_init_failed = False
+    app.dependency_overrides.clear()
+
+    yield _module_client
+
+    app.dependency_overrides.clear()
+    verify_route._antispoof_assembler = None
+    verify_route._antispoof_assembler_init_failed = False
+    verify_route._device_spoof_risk_evaluator = None
+    verify_route._face_landmarker_for_ear = None
+    verify_route._face_landmarker_for_ear_init_failed = False
+
+
+@pytest.fixture
+def test_image_file():
+    img = np.full((100, 100, 3), 80, dtype=np.uint8)
+    ok, buf = cv2.imencode(".jpg", img)
+    assert ok
+    return ("test.jpg", io.BytesIO(buf.tobytes()), "image/jpeg")
+
+
+@pytest.fixture
+def mocks(tmp_path):
+    """Wire all upstream deps with fast, deterministic AsyncMocks."""
+    img = np.full((100, 100, 3), 80, dtype=np.uint8)
+    ok, buf = cv2.imencode(".jpg", img)
+    assert ok
+    image_path = tmp_path / "saved.jpg"
+    image_path.write_bytes(buf.tobytes())
+
+    verify_uc = Mock()
+    verify_uc.execute = AsyncMock(
+        return_value=VerificationResult(
+            verified=True, confidence=0.87, distance=0.13, threshold=0.6,
+        )
+    )
+
+    liveness_uc = Mock()
+    liveness_uc.execute = AsyncMock(
+        return_value=LivenessResult(
+            is_live=True, score=92.0, challenge="none",
+            challenge_completed=True, confidence=0.91,
+        )
+    )
+
+    storage = Mock()
+    storage.save_temp = AsyncMock(return_value=str(image_path))
+    storage.cleanup = AsyncMock()
+
+    observation_repo = Mock()
+    observation_repo.record = AsyncMock()
+
+    return verify_uc, liveness_uc, storage, observation_repo
+
+
+def _wire(verify_uc, liveness_uc, storage, observation_repo) -> None:
+    app.dependency_overrides[get_verify_face_use_case] = lambda: verify_uc
+    app.dependency_overrides[get_check_liveness_use_case] = lambda: liveness_uc
+    app.dependency_overrides[get_file_storage] = lambda: storage
+    app.dependency_overrides[get_client_embedding_observation_repository] = (
+        lambda: observation_repo
+    )
+
+
+# ---------------------------------------------------------------------------
+# Bug 1: enforce assembler recommended_action="block"
+# ---------------------------------------------------------------------------
+
+
+def test_block_verdict_triggers_403_when_enforce_on(
+    client: TestClient, mocks, test_image_file
+) -> None:
+    """recommended_action='block' + enforce=True → HTTP 403."""
+    verify_uc, liveness_uc, storage, observation_repo = mocks
+    _wire(verify_uc, liveness_uc, storage, observation_repo)
+
+    fake_block_result = {
+        "face_usability_block": True,
+        "face_usability_reason": "occluded",
+        "device_replay_risk": 0.05,
+        "device_signals": {"moire_risk": 0.0},
+        "hybrid_fusion_is_spoof": None,
+        "hybrid_fusion_score": None,
+        "hybrid_fusion_reasoning": None,
+        "recommended_action": "block",
+        "layers_evaluated": ["face_usability"],
+    }
+
+    with patch.object(
+        verify_route.settings, "ANTISPOOF_BLOCK_ENFORCE", True
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_EAR_VETO_ENABLED", False
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_FUSION_ENABLED", True
+    ), patch.object(
+        verify_route, "_evaluate_antispoof_pipeline_safe",
+        return_value=fake_block_result,
+    ):
+        resp = client.post(
+            "/api/v1/verify",
+            data={"user_id": "test_user_block"},
+            files={"file": test_image_file},
+        )
+
+    assert resp.status_code == 403, resp.text
+    body = resp.json()
+    detail = body.get("detail") or body
+    assert detail.get("error_code") == "ANTISPOOF_BLOCKED"
+    assert detail.get("reason") == "FACE_UNUSABLE"
+    assert detail.get("antispoof_pipeline") == fake_block_result
+
+
+def test_block_verdict_passes_when_enforce_off(
+    client: TestClient, mocks, test_image_file
+) -> None:
+    """recommended_action='block' + enforce=False → 200 + verdict attached."""
+    verify_uc, liveness_uc, storage, observation_repo = mocks
+    _wire(verify_uc, liveness_uc, storage, observation_repo)
+
+    fake_block_result = {
+        "face_usability_block": False,
+        "face_usability_reason": None,
+        "device_replay_risk": 0.85,
+        "device_signals": {"moire_risk": 0.7},
+        "hybrid_fusion_is_spoof": True,
+        "hybrid_fusion_score": 0.92,
+        "hybrid_fusion_reasoning": "spoof detected via fusion",
+        "recommended_action": "block",
+        "layers_evaluated": ["device_spoof_risk", "hybrid_fusion"],
+    }
+
+    with patch.object(
+        verify_route.settings, "ANTISPOOF_BLOCK_ENFORCE", False
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_EAR_VETO_ENABLED", False
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_FUSION_ENABLED", True
+    ), patch.object(
+        verify_route, "_evaluate_antispoof_pipeline_safe",
+        return_value=fake_block_result,
+    ):
+        resp = client.post(
+            "/api/v1/verify",
+            data={"user_id": "test_user_observe"},
+            files={"file": test_image_file},
+        )
+
+    # Enforce off — verification still returns 200 with the verdict attached.
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is True
+    assert body["antispoof_pipeline"] == fake_block_result
+
+
+def test_allow_verdict_passes_with_enforce_on(
+    client: TestClient, mocks, test_image_file
+) -> None:
+    """recommended_action='allow' must never trigger a block."""
+    verify_uc, liveness_uc, storage, observation_repo = mocks
+    _wire(verify_uc, liveness_uc, storage, observation_repo)
+
+    fake_allow_result = {
+        "face_usability_block": False,
+        "face_usability_reason": None,
+        "device_replay_risk": 0.05,
+        "device_signals": {"moire_risk": 0.01},
+        "hybrid_fusion_is_spoof": False,
+        "hybrid_fusion_score": 0.18,
+        "hybrid_fusion_reasoning": "LIVE verified",
+        "recommended_action": "allow",
+        "layers_evaluated": ["device_spoof_risk", "hybrid_fusion"],
+    }
+
+    with patch.object(
+        verify_route.settings, "ANTISPOOF_BLOCK_ENFORCE", True
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_EAR_VETO_ENABLED", False
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_FUSION_ENABLED", True
+    ), patch.object(
+        verify_route, "_evaluate_antispoof_pipeline_safe",
+        return_value=fake_allow_result,
+    ):
+        resp = client.post(
+            "/api/v1/verify",
+            data={"user_id": "test_user_allow"},
+            files={"file": test_image_file},
+        )
+
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is True
+
+
+def test_review_verdict_passes_with_enforce_on(
+    client: TestClient, mocks, test_image_file
+) -> None:
+    """recommended_action='review' must NOT cause a block (review != block)."""
+    verify_uc, liveness_uc, storage, observation_repo = mocks
+    _wire(verify_uc, liveness_uc, storage, observation_repo)
+
+    fake_review_result = {
+        "face_usability_block": False,
+        "face_usability_reason": None,
+        "device_replay_risk": 0.72,
+        "device_signals": {"moire_risk": 0.3},
+        "hybrid_fusion_is_spoof": False,
+        "hybrid_fusion_score": 0.5,
+        "hybrid_fusion_reasoning": "borderline",
+        "recommended_action": "review",
+        "layers_evaluated": ["device_spoof_risk", "hybrid_fusion"],
+    }
+
+    with patch.object(
+        verify_route.settings, "ANTISPOOF_BLOCK_ENFORCE", True
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_EAR_VETO_ENABLED", False
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_FUSION_ENABLED", True
+    ), patch.object(
+        verify_route, "_evaluate_antispoof_pipeline_safe",
+        return_value=fake_review_result,
+    ):
+        resp = client.post(
+            "/api/v1/verify",
+            data={"user_id": "test_user_review"},
+            files={"file": test_image_file},
+        )
+
+    assert resp.status_code == 200, resp.text
+    assert resp.json()["antispoof_pipeline"] == fake_review_result
+
+
+# ---------------------------------------------------------------------------
+# Bug 2: EAR closed-eye veto
+# ---------------------------------------------------------------------------
+
+
+def test_ear_closed_eyes_triggers_403_when_enforce_on(
+    client: TestClient, mocks, test_image_file
+) -> None:
+    """eyes_closed=True from EAR check vetoes the request alongside assembler."""
+    verify_uc, liveness_uc, storage, observation_repo = mocks
+    _wire(verify_uc, liveness_uc, storage, observation_repo)
+
+    fake_ear_result = {
+        "eyes_closed": True,
+        "left_ear": 0.12,
+        "right_ear": 0.10,
+        "avg_ear": 0.11,
+        "threshold": 0.18,
+    }
+
+    with patch.object(
+        verify_route.settings, "ANTISPOOF_BLOCK_ENFORCE", True
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_EAR_VETO_ENABLED", True
+    ), patch.object(
+        verify_route, "_evaluate_ear_liveness_safe", return_value=fake_ear_result,
+    ), patch.object(
+        verify_route, "_evaluate_antispoof_pipeline_safe", return_value=None,
+    ):
+        resp = client.post(
+            "/api/v1/verify",
+            data={"user_id": "test_user_ear"},
+            files={"file": test_image_file},
+        )
+
+    assert resp.status_code == 403, resp.text
+    detail = resp.json().get("detail") or {}
+    assert detail.get("error_code") == "ANTISPOOF_BLOCKED"
+    assert detail.get("reason") == "EYES_CLOSED"
+    assert detail.get("ear_liveness") == fake_ear_result
+
+
+def test_ear_open_eyes_passes(
+    client: TestClient, mocks, test_image_file
+) -> None:
+    """eyes_closed=False → response includes ear_liveness but verifies OK."""
+    verify_uc, liveness_uc, storage, observation_repo = mocks
+    _wire(verify_uc, liveness_uc, storage, observation_repo)
+
+    fake_ear_result = {
+        "eyes_closed": False,
+        "left_ear": 0.28,
+        "right_ear": 0.30,
+        "avg_ear": 0.29,
+        "threshold": 0.18,
+    }
+
+    with patch.object(
+        verify_route.settings, "ANTISPOOF_BLOCK_ENFORCE", True
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_EAR_VETO_ENABLED", True
+    ), patch.object(
+        verify_route, "_evaluate_ear_liveness_safe", return_value=fake_ear_result,
+    ), patch.object(
+        verify_route, "_evaluate_antispoof_pipeline_safe", return_value=None,
+    ):
+        resp = client.post(
+            "/api/v1/verify",
+            data={"user_id": "test_user_ear_open"},
+            files={"file": test_image_file},
+        )
+
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is True
+    assert body["ear_liveness"] == fake_ear_result
+
+
+def test_ear_helper_is_called_when_flag_on(
+    client: TestClient, mocks, test_image_file
+) -> None:
+    """Regression guard for Bug 2: the EAR helper must actually be invoked
+    from /verify when the flag is on. Asserts the helper is called exactly
+    once (previously zero calls, because the wiring was missing).
+    """
+    verify_uc, liveness_uc, storage, observation_repo = mocks
+    _wire(verify_uc, liveness_uc, storage, observation_repo)
+
+    ear_mock = Mock(return_value=None)
+
+    with patch.object(
+        verify_route.settings, "ANTISPOOF_BLOCK_ENFORCE", True
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_EAR_VETO_ENABLED", True
+    ), patch.object(
+        verify_route, "_evaluate_ear_liveness_safe", ear_mock,
+    ), patch.object(
+        verify_route, "_evaluate_antispoof_pipeline_safe", return_value=None,
+    ):
+        resp = client.post(
+            "/api/v1/verify",
+            data={"user_id": "test_user_ear_called"},
+            files={"file": test_image_file},
+        )
+
+    assert resp.status_code == 200, resp.text
+    assert ear_mock.call_count == 1, (
+        "EAR helper must be invoked from /verify (Bug 2 regression guard); "
+        f"got {ear_mock.call_count} calls."
+    )
+
+
+def test_ear_helper_returns_none_when_flag_off(
+    client: TestClient, mocks, test_image_file
+) -> None:
+    """When ANTISPOOF_EAR_VETO_ENABLED=False, the helper must short-circuit
+    and return None without importing MediaPipe, so ear_liveness is None.
+    """
+    verify_uc, liveness_uc, storage, observation_repo = mocks
+    _wire(verify_uc, liveness_uc, storage, observation_repo)
+
+    with patch.object(
+        verify_route.settings, "ANTISPOOF_BLOCK_ENFORCE", True
+    ), patch.object(
+        verify_route.settings, "ANTISPOOF_EAR_VETO_ENABLED", False
+    ), patch.object(
+        verify_route, "_evaluate_antispoof_pipeline_safe", return_value=None,
+    ):
+        resp = client.post(
+            "/api/v1/verify",
+            data={"user_id": "test_user_ear_off"},
+            files={"file": test_image_file},
+        )
+
+    assert resp.status_code == 200, resp.text
+    assert resp.json()["ear_liveness"] is None
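
Reviewer note on the EAR tests above: the fake `ear_liveness` payloads follow
the usual six-landmark Eye Aspect Ratio definition, which can be sketched
without MediaPipe. This is a minimal illustration, not the spoof-detector
library's `compute_ear` (its signature is not shown in this patch).
`eye_aspect_ratio` and `ear_liveness` are hypothetical names, and treating
"eyes closed" as both per-eye values falling below the threshold is an
assumption taken from the "both eyes are clearly closed" wording in config.py.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

EAR_THRESHOLD = 0.18  # calibration value quoted in the patch


def eye_aspect_ratio(eye: Sequence[Point]) -> float:
    """EAR for six landmarks p1..p6: p1/p4 are the corners, p2/p3 the upper
    lid, p6/p5 the lower lid. EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|)."""
    p1, p2, p3, p4, p5, p6 = eye
    horizontal = math.dist(p1, p4)
    if horizontal == 0.0:
        return 0.0  # degenerate landmarks: report as fully closed
    return (math.dist(p2, p6) + math.dist(p3, p5)) / (2.0 * horizontal)


def ear_liveness(left: Sequence[Point], right: Sequence[Point],
                 threshold: float = EAR_THRESHOLD) -> dict:
    """Build a payload with the same keys the tests use for ear_liveness."""
    left_ear = eye_aspect_ratio(left)
    right_ear = eye_aspect_ratio(right)
    return {
        # Assumption: veto only when BOTH eyes are below the threshold.
        "eyes_closed": left_ear < threshold and right_ear < threshold,
        "left_ear": left_ear,
        "right_ear": right_ear,
        "avg_ear": (left_ear + right_ear) / 2.0,
        "threshold": threshold,
    }
```

Because the ratio is normalised by the eye width, it is independent of image
scale, which is why a single fixed threshold such as 0.18 can be applied to
stills of different resolutions.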
diff --git a/tests/integration/test_verify_antispoof_wiring.py b/tests/integration/test_verify_antispoof_wiring.py
index 46ecd90..5d20b3a 100644
--- a/tests/integration/test_verify_antispoof_wiring.py
+++ b/tests/integration/test_verify_antispoof_wiring.py
@@ -21,9 +21,14 @@
 import numpy as np
 import pytest
 
-# Mock DeepFace before any imports that depend on it.
+# Mock DeepFace + Resemblyzer before any imports that depend on them.
+# Both are CPU-heavy optional ML deps; resemblyzer isn't always installed
+# on dev hosts (the bio container builds it from source). Without the
+# mock, `app/main.py:lifespan` → `initialize_dependencies()` → SpeakerEmbedder
+# crashes with ModuleNotFoundError before any of these tests can run.
 sys.modules.setdefault("deepface", Mock())
 sys.modules.setdefault("deepface.DeepFace", Mock())
+sys.modules.setdefault("resemblyzer", Mock(VoiceEncoder=Mock()))
 
 from fastapi.testclient import TestClient
 
@@ -73,6 +78,8 @@ def client(_module_client) -> TestClient:
     verify_route._antispoof_assembler = None
     verify_route._antispoof_assembler_init_failed = False
     verify_route._device_spoof_risk_evaluator = None
+    verify_route._face_landmarker_for_ear = None
+    verify_route._face_landmarker_for_ear_init_failed = False
     app.dependency_overrides.clear()
 
     yield _module_client
@@ -81,6 +88,8 @@ def client(_module_client) -> TestClient:
     verify_route._antispoof_assembler = None
     verify_route._antispoof_assembler_init_failed = False
     verify_route._device_spoof_risk_evaluator = None
+    verify_route._face_landmarker_for_ear = None
+    verify_route._face_landmarker_for_ear_init_failed = False
 
 
 @pytest.fixture
diff --git a/tests/integration/test_verify_challenge_endpoint.py b/tests/integration/test_verify_challenge_endpoint.py
new file mode 100644
index 0000000..0dd4c2c
--- /dev/null
+++ b/tests/integration/test_verify_challenge_endpoint.py
@@ -0,0 +1,132 @@
+"""Integration tests for /liveness/verify-challenge (Bug 4, 2026-05-12).
+
+The endpoint exists to give the web biometric-puzzles training surface a
+server round-trip it MUST wait on before resolving its `onSuccess()`. The
+checks are structural — full ML re-detection belongs to the multi-step
+/liveness/verify flow.
+
+Each test pins one behavior:
+  * Happy path with sane inputs (face or hand action) → 200, verified=true.
+  * Inverted timestamps → 200, verified=false (TIMESTAMPS_OUT_OF_ORDER).
+  * Duration < min → 200, verified=false (DURATION_TOO_SHORT).
+  * Duration > max → 200, verified=false (DURATION_TOO_LONG).
+  * Confidence below floor → 200, verified=false (CONFIDENCE_BELOW_FLOOR).
+  * Unknown action enum → 422 (FastAPI validation).
+"""
+
+from __future__ import annotations
+
+import sys
+from unittest.mock import Mock
+
+import pytest
+
+# Mock heavy ML deps before importing the app (matches the
+# test_verify_antispoof_block_enforce.py / test_verify_antispoof_wiring.py
+# convention — see those files' module-docstrings for context).
+sys.modules.setdefault("deepface", Mock())
+sys.modules.setdefault("deepface.DeepFace", Mock())
+sys.modules.setdefault("resemblyzer", Mock(VoiceEncoder=Mock()))
+
+from fastapi.testclient import TestClient
+
+from app.main import app
+
+
+@pytest.fixture(scope="module")
+def client():
+    with TestClient(app) as c:
+        yield c
+
+
+def _payload(**overrides) -> dict:
+    """Build a baseline-valid payload; override fields per test."""
+    base = {
+        "action": "blink",
+        "start_timestamp_ms": 1_000_000.0,
+        "end_timestamp_ms": 1_000_500.0,  # +500ms
+        "confidence": 0.85,
+        "tenant_id": "tenant-x",
+        "user_id": "user-y",
+        "metrics": {"min_ear": 0.12},
+    }
+    base.update(overrides)
+    return base
+
+
+def test_happy_path_returns_verified_true(client: TestClient) -> None:
+    resp = client.post("/api/v1/liveness/verify-challenge", json=_payload())
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is True
+    assert body["action"] == "blink"
+    assert 0.49 < body["duration_seconds"] < 0.51
+    assert body["reason_code"] is None
+
+
+def test_inverted_timestamps_reject(client: TestClient) -> None:
+    resp = client.post(
+        "/api/v1/liveness/verify-challenge",
+        json=_payload(start_timestamp_ms=2_000_000.0, end_timestamp_ms=1_999_000.0),
+    )
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is False
+    assert body["reason_code"] == "TIMESTAMPS_OUT_OF_ORDER"
+
+
+def test_duration_too_short_reject(client: TestClient) -> None:
+    # 50ms — below the 120ms floor.
+    resp = client.post(
+        "/api/v1/liveness/verify-challenge",
+        json=_payload(start_timestamp_ms=1_000_000.0, end_timestamp_ms=1_000_050.0),
+    )
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is False
+    assert body["reason_code"] == "DURATION_TOO_SHORT"
+
+
+def test_duration_too_long_reject(client: TestClient) -> None:
+    # 65s — above the 60s ceiling.
+    resp = client.post(
+        "/api/v1/liveness/verify-challenge",
+        json=_payload(start_timestamp_ms=1_000_000.0, end_timestamp_ms=1_065_000.0),
+    )
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is False
+    assert body["reason_code"] == "DURATION_TOO_LONG"
+
+
+def test_confidence_below_floor_reject(client: TestClient) -> None:
+    resp = client.post(
+        "/api/v1/liveness/verify-challenge",
+        json=_payload(confidence=0.3),
+    )
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is False
+    assert body["reason_code"] == "CONFIDENCE_BELOW_FLOOR"
+
+
+def test_unknown_action_is_422(client: TestClient) -> None:
+    resp = client.post(
+        "/api/v1/liveness/verify-challenge",
+        json=_payload(action="not_a_real_challenge"),
+    )
+    # FastAPI/Pydantic enum validation → 422.
+    assert resp.status_code == 422, resp.text
+
+
+def test_gesture_action_accepted(client: TestClient) -> None:
+    """Hand-modality actions (pinch, hand_flip, finger_count, ...) must
+    pass the structural checks the same as face actions."""
+    resp = client.post(
+        "/api/v1/liveness/verify-challenge",
+        json=_payload(action="pinch", end_timestamp_ms=1_002_000.0),
+    )
+    assert resp.status_code == 200, resp.text
+    body = resp.json()
+    assert body["verified"] is True
+    assert body["action"] == "pinch"
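
Reviewer note on the verify-challenge tests above: the structural checks they
pin down can be sketched as a single pure function. This is an illustration
only, since the route implementation is not part of this excerpt.
`check_challenge` is a hypothetical name, the 120 ms floor and 60 s ceiling come
from the test comments, and both the 0.5 confidence floor and the order of the
checks are assumptions (the tests only show that 0.3 fails and 0.85 passes).

```python
from typing import Optional

MIN_DURATION_S = 0.120   # floor quoted in test_duration_too_short_reject
MAX_DURATION_S = 60.0    # ceiling quoted in test_duration_too_long_reject
CONFIDENCE_FLOOR = 0.5   # assumption: real value is not shown in the patch


def check_challenge(action: str, start_timestamp_ms: float,
                    end_timestamp_ms: float, confidence: float) -> dict:
    """Structural checks only; no ML re-detection happens here."""
    duration_s = (end_timestamp_ms - start_timestamp_ms) / 1000.0

    reason: Optional[str] = None
    if end_timestamp_ms < start_timestamp_ms:
        reason = "TIMESTAMPS_OUT_OF_ORDER"
    elif duration_s < MIN_DURATION_S:
        reason = "DURATION_TOO_SHORT"
    elif duration_s > MAX_DURATION_S:
        reason = "DURATION_TOO_LONG"
    elif confidence < CONFIDENCE_FLOOR:
        reason = "CONFIDENCE_BELOW_FLOOR"

    # A failed check is still a normal response (HTTP 200 in the tests),
    # reported through verified=False plus a reason code.
    return {
        "verified": reason is None,
        "action": action,
        "duration_seconds": duration_s,
        "reason_code": reason,
    }
```

Enum validation of `action` is left out on purpose: in the tests it is handled
by FastAPI/Pydantic before the handler runs, which is why an unknown action
yields 422 rather than a reason code.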
diff --git a/tests/unit/test_deepface_sha256_required.py b/tests/unit/test_deepface_sha256_required.py
new file mode 100644
index 0000000..a507e30
--- /dev/null
+++ b/tests/unit/test_deepface_sha256_required.py
@@ -0,0 +1,166 @@
+"""Tests for the DEEPFACE_SHA256_REQUIRED fail-fast behavior (Bug 5, 2026-05-12).
+
+Previously, an empty ``DEEPFACE_FACENET512_SHA256`` only logged a WARNING
+and continued loading the model. A silent weight rotation under
+``~/.deepface/weights/`` could change embeddings without anyone noticing.
+
+The new behavior:
+  * ``DEEPFACE_SHA256_REQUIRED=true`` (default) AND
+    ``ENVIRONMENT=production`` AND empty pin → RuntimeError at model-load.
+  * Any other combination keeps the old warn-and-skip behavior so dev
+    flows don't break.
+
+We don't load the actual ~400MB DeepFace weights here — we just exercise
+the integrity-check function directly with a tmp weight file.
+"""
+
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+from unittest.mock import Mock, patch
+
+import pytest
+
+# Mock the heavy ML deps before importing the extractor module under test.
+# `deepface_extractor.py` does `from deepface import DeepFace` at module
+# load, which pulls in TensorFlow; dev hosts may not have it installed. The
+# integrity-check function itself only uses hashlib + pathlib, no TF.
+#
+# IMPORTANT: we do NOT mock `tensorflow` as a whole — that pollutes other
+# tests in the same pytest session (notably the integration tests that
+# import `app.main` which calls `gpu.configure_gpu()` and iterates over
+# `tf.config.list_physical_devices('GPU')`). Mocking only `deepface`
+# and `tf_keras` is enough because they're the deps that
+# deepface_extractor.py's top-level import chain pulls in.
+sys.modules.setdefault("tf_keras", Mock())
+sys.modules.setdefault("deepface", Mock())
+sys.modules.setdefault("deepface.DeepFace", Mock())
+
+
+@pytest.fixture
+def fake_weight_file(tmp_path):
+    """Create a small fake weight file so the integrity check has something
+    to digest. The actual hash doesn't matter for the missing-pin tests."""
+    weight = tmp_path / "facenet512_weights.h5"
+    weight.write_bytes(b"fake-weight-content-for-integrity-testing-12345")
+    return weight
+
+
+def _patch_settings(**kwargs):
+    """Patch attributes on the deepface_extractor module's `settings` import.
+
+    The function imports `settings` lazily via `from app.core.config import settings`
+    inside `_verify_model_integrity`, so we need to patch the actual module
+    attribute used at call time.
+    """
+    import app.core.config as cfg_module
+    return patch.multiple(cfg_module.settings, **kwargs)
+
+
+def test_missing_pin_raises_in_prod_when_required(fake_weight_file):
+    """Empty pin + prod env + required flag → RuntimeError."""
+    from app.infrastructure.ml.extractors.deepface_extractor import (
+        _verify_model_integrity,
+    )
+
+    with patch(
+        "app.infrastructure.ml.extractors.deepface_extractor._resolve_weight_path",
+        return_value=fake_weight_file,
+    ), _patch_settings(
+        DEEPFACE_FACENET512_SHA256="",
+        DEEPFACE_SHA256_REQUIRED=True,
+        ENVIRONMENT="production",
+    ):
+        with pytest.raises(RuntimeError) as exc_info:
+            _verify_model_integrity("Facenet512")
+        assert "integrity pin missing" in str(exc_info.value).lower()
+
+
+def test_missing_pin_warns_in_dev(fake_weight_file, caplog):
+    """Empty pin + dev env → log warning, no raise."""
+    import logging
+
+    from app.infrastructure.ml.extractors.deepface_extractor import (
+        _verify_model_integrity,
+    )
+
+    with patch(
+        "app.infrastructure.ml.extractors.deepface_extractor._resolve_weight_path",
+        return_value=fake_weight_file,
+    ), _patch_settings(
+        DEEPFACE_FACENET512_SHA256="",
+        DEEPFACE_SHA256_REQUIRED=True,
+        ENVIRONMENT="development",
+    ), caplog.at_level(logging.WARNING):
+        # Must not raise.
+        _verify_model_integrity("Facenet512")
+
+    assert any(
+        "skipped" in r.message.lower() and "no pinned hash" in r.message.lower()
+        for r in caplog.records
+    )
+
+
+def test_missing_pin_warns_when_required_false_in_prod(fake_weight_file, caplog):
+    """Opt-out flag must let prod boot with an empty pin (first-deploy scenario)."""
+    import logging
+
+    from app.infrastructure.ml.extractors.deepface_extractor import (
+        _verify_model_integrity,
+    )
+
+    with patch(
+        "app.infrastructure.ml.extractors.deepface_extractor._resolve_weight_path",
+        return_value=fake_weight_file,
+    ), _patch_settings(
+        DEEPFACE_FACENET512_SHA256="",
+        DEEPFACE_SHA256_REQUIRED=False,
+        ENVIRONMENT="production",
+    ), caplog.at_level(logging.WARNING):
+        _verify_model_integrity("Facenet512")
+
+    assert any(
+        "skipped" in r.message.lower() for r in caplog.records
+    )
+
+
+def test_correct_pin_passes(fake_weight_file):
+    """Pinned + correct hash → returns silently (success path)."""
+    import hashlib
+
+    from app.infrastructure.ml.extractors.deepface_extractor import (
+        _verify_model_integrity,
+    )
+
+    expected = hashlib.sha256(fake_weight_file.read_bytes()).hexdigest()
+
+    with patch(
+        "app.infrastructure.ml.extractors.deepface_extractor._resolve_weight_path",
+        return_value=fake_weight_file,
+    ), _patch_settings(
+        DEEPFACE_FACENET512_SHA256=expected,
+        DEEPFACE_SHA256_REQUIRED=True,
+        ENVIRONMENT="production",
+    ):
+        # Must not raise.
+        _verify_model_integrity("Facenet512")
+
+
+def test_wrong_pin_raises_regardless_of_env(fake_weight_file):
+    """An explicit pin that doesn't match the file MUST raise everywhere."""
+    from app.infrastructure.ml.extractors.deepface_extractor import (
+        _verify_model_integrity,
+    )
+
+    with patch(
+        "app.infrastructure.ml.extractors.deepface_extractor._resolve_weight_path",
+        return_value=fake_weight_file,
+    ), _patch_settings(
+        DEEPFACE_FACENET512_SHA256="deadbeef" * 8,
+        DEEPFACE_SHA256_REQUIRED=False,
+        ENVIRONMENT="development",
+    ):
+        with pytest.raises(RuntimeError) as exc_info:
+            _verify_model_integrity("Facenet512")
+        assert "integrity check failed" in str(exc_info.value).lower()
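
Reviewer note (not part of the patch): the five tests above imply a small decision table for the integrity check. The sketch below is one way that table could be written, under stated assumptions. `verify_weight_file` and its parameters are hypothetical names; the real `_verify_model_integrity` is not shown in this hunk and may differ. Only the message fragments the tests assert on are taken from the source.

```python
import hashlib
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger(__name__)


def verify_weight_file(
    path: Path, pinned_sha256: str, required: bool, environment: str
) -> None:
    """Hypothetical stand-in for the behaviour the tests above pin down."""
    if not pinned_sha256:
        # Empty pin: fatal only when required AND running in production.
        if required and environment == "production":
            raise RuntimeError(f"Model integrity pin missing for {path.name}")
        logger.warning(
            "Integrity check skipped for %s: no pinned hash", path.name
        )
        return

    # Stream the file so a large weight file is never held fully in memory.
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)

    # An explicit pin that does not match is fatal in every environment.
    if digest.hexdigest() != pinned_sha256.lower():
        raise RuntimeError(f"Model integrity check failed for {path.name}")


# Usage: a correct pin passes silently, even in production.
with tempfile.TemporaryDirectory() as tmp:
    weight = Path(tmp) / "weights.h5"
    weight.write_bytes(b"fake-weights")
    pin = hashlib.sha256(b"fake-weights").hexdigest()
    verify_weight_file(weight, pin, required=True, environment="production")
```

Checking the pin's presence before hashing keeps the missing-pin paths cheap, and streaming in 1 MiB chunks is a choice made for this sketch, not something the patch states.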
diff --git a/tests/unit/test_verification_threshold_aged.py b/tests/unit/test_verification_threshold_aged.py
new file mode 100644
index 0000000..e196a72
--- /dev/null
+++ b/tests/unit/test_verification_threshold_aged.py
@@ -0,0 +1,76 @@
+"""Regression tests for VERIFICATION_THRESHOLD_AGED semantics (bug 2026-05-12).
+
+Background
+----------
+The comparator in ``app/application/use_cases/verify_face.py`` is
+``verified = distance < threshold``. Under that semantic:
+
+  - HIGHER threshold ⇒ MORE LENIENT (the allowed-distance ceiling rises,
+    so more pairs pass).
+  - LOWER threshold  ⇒ STRICTER (only near-zero distances match).
+
+Before 2026-05-12 the defaults were:
+
+  VERIFICATION_THRESHOLD       = 0.45
+  VERIFICATION_THRESHOLD_AGED  = 0.38   # ← bug: stricter, not lenient
+
+That made aged users get a HIGHER FRR — the opposite of the adaptive
+feature's intent. These tests pin:
+
+  1. The new default for ``VERIFICATION_THRESHOLD_AGED`` is higher than
+     the standard ``VERIFICATION_THRESHOLD``.
+  2. Loading an inverted config (aged < standard) raises a
+     ``ValidationError`` so the regression cannot silently come back via
+     env-file edits.
+"""
+
+from __future__ import annotations
+
+import pytest
+from pydantic import ValidationError
+
+from app.core.config import Settings
+
+
+def _make(**overrides) -> Settings:
+    """Build a Settings instance with no ambient .env interference."""
+    return Settings(_env_file=None, **overrides)
+
+
+def test_default_aged_threshold_is_more_lenient_than_standard():
+    """Default config: aged threshold must allow GREATER distance, not less."""
+    s = _make()
+    assert s.VERIFICATION_THRESHOLD_AGED > s.VERIFICATION_THRESHOLD, (
+        f"Aged threshold ({s.VERIFICATION_THRESHOLD_AGED}) must be > standard "
+        f"({s.VERIFICATION_THRESHOLD}) under the 'distance < threshold' "
+        "comparator. A lower aged threshold is stricter for aged users."
+    )
+
+
+def test_aged_threshold_below_standard_is_rejected():
+    """Inversion regression guard: aged < standard must fail config load."""
+    with pytest.raises((ValidationError, ValueError)) as exc_info:
+        _make(
+            VERIFICATION_THRESHOLD=0.45,
+            VERIFICATION_THRESHOLD_AGED=0.38,  # the pre-2026-05-12 buggy value
+        )
+    msg = str(exc_info.value)
+    assert "VERIFICATION_THRESHOLD_AGED" in msg
+    assert "VERIFICATION_THRESHOLD" in msg  # substring of the name above
+
+
+def test_aged_threshold_equal_to_standard_is_allowed():
+    """Boundary case: equal thresholds are valid (no adaptive lenience, but
+    not inverted)."""
+    s = _make(VERIFICATION_THRESHOLD=0.45, VERIFICATION_THRESHOLD_AGED=0.45)
+    assert s.VERIFICATION_THRESHOLD == s.VERIFICATION_THRESHOLD_AGED == 0.45
+
+
+def test_aged_threshold_within_facenet_safe_band():
+    """The new default must stay at or below the Facenet cosine-distance
+    operating-point ceiling (~0.6) so that FAR does not blow up."""
+    s = _make()
+    assert s.VERIFICATION_THRESHOLD_AGED <= 0.6, (
+        "VERIFICATION_THRESHOLD_AGED above 0.6 risks FAR explosion under "
+        "Facenet cosine-distance distributions."
+    )
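
Reviewer note (not part of the patch): a small worked example of the `distance < threshold` semantics described in the test module docstring. The two threshold values are the pre-2026-05-12 defaults quoted there. The sample distance of 0.41 and the helper names `is_verified` and `check_threshold_order` are hypothetical, chosen only for illustration; the real guard lives in `Settings` and is not shown in this patch.

```python
STANDARD_THRESHOLD = 0.45  # VERIFICATION_THRESHOLD default quoted in the docstring
OLD_AGED_THRESHOLD = 0.38  # pre-2026-05-12 aged default quoted in the docstring


def is_verified(distance: float, threshold: float) -> bool:
    """Same comparator the docstring quotes: verified = distance < threshold."""
    return distance < threshold


def check_threshold_order(standard: float, aged: float) -> None:
    """Hypothetical stand-in for the config-load guard the tests expect."""
    if aged < standard:
        raise ValueError(
            f"VERIFICATION_THRESHOLD_AGED ({aged}) must be >= "
            f"VERIFICATION_THRESHOLD ({standard})"
        )


# Hypothetical genuine pair whose distance has drifted with age.
distance = 0.41
print(is_verified(distance, STANDARD_THRESHOLD))  # True
print(is_verified(distance, OLD_AGED_THRESHOLD))  # False
```

The same pair passes under the standard threshold and fails under the old aged one, which is the inversion the regression tests guard against. Equal thresholds pass `check_threshold_order`, matching the boundary test.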

From dcbf725c1704bafb8a2f4dcfeb86129160305a2b Mon Sep 17 00:00:00 2001
From: Ahmet Abdullah Gultekin <ahmetabdullahgultekin@gmail.com>
Date: Tue, 12 May 2026 18:27:37 +0000
Subject: [PATCH 2/2] fix(lint): rename ambiguous 'l' variable to 'pt' (ruff
 E741)

---
 app/api/routes/verification.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/api/routes/verification.py b/app/api/routes/verification.py
index 7220c94..b4c1c16 100644
--- a/app/api/routes/verification.py
+++ b/app/api/routes/verification.py
@@ -245,7 +245,7 @@ def _evaluate_ear_liveness_safe(image_path: str) -> Optional[dict]:
         # Use the first detected face. The pixel-space conversion matches
         # the spoof-detector blink_analyzer contract.
         lm = np.array(
-            [[l.x * w, l.y * h, l.z] for l in face_landmarks[0]]
+            [[pt.x * w, pt.y * h, pt.z] for pt in face_landmarks[0]]
         )
         if len(lm) < 468:
             return None