Real-time on-device glasses detection for the browser and Node.js
FrameFind detects whether someone is wearing glasses in real time, running entirely on-device. No frames are sent to any server — the ONNX model runs locally in the browser via WASM or in Node.js via the native runtime.
Most vision APIs need a round-trip to a server: your frame leaves the device, gets processed, and comes back. That adds latency, costs money per call, exposes biometric data, and breaks offline.
FrameFind runs the model on the device itself:
- No network latency — inference happens on the same machine that captured the frame
- Privacy by default — camera data never leaves the device
- No usage costs — once the model is cached (~6.2 MB), every inference is free
- Works offline — no connection required after first load
Measured on Chrome 124 / MacBook M2 with WASM backend, 200+ frames:
| Metric | Value |
|---|---|
| Median inference | ~27 ms |
| p95 inference | ~35 ms |
| Model size | 6.2 MB |
| Load time (model cached) | <50 ms |
| Input resolution | 112 × 112 |

Browser support:

| Browser | WASM | WebGPU |
|---|---|---|
| Chrome 112+ | ✅ | ✅ |
| Firefox 110+ | ✅ | 🚧 |
| Safari 16.4+ | ✅ | ✅ |
| Edge 112+ | ✅ | ✅ |
WASM works everywhere. WebGPU accelerates inference where supported.

| | WebGPU | WASM |
|---|---|---|
| Inference speed | ~8 ms | ~27 ms |
| Compatibility | Chrome/Safari TP | All modern browsers |
| GPU required | Yes | No |
| Fallback | → WASM | — |
FrameFind runs on WASM by default (via onnxruntime-web). To use WebGPU, select the WebGPU execution provider; where it's unavailable, inference falls back to WASM.
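How FrameFind itself exposes this switch isn't shown above; at the onnxruntime-web layer, backend selection is the `executionProviders` session option, and listing several providers makes the runtime try them in order. A minimal sketch:

```ts
import * as ort from "onnxruntime-web";
// Note: some onnxruntime-web versions ship WebGPU support in a separate
// bundle (e.g. "onnxruntime-web/webgpu"); check the version you install.

const modelUrl = "https://cdn.framefind.moraxh.dev/glasses/v1/glasses.onnx";

// Try WebGPU first, fall back to WASM where WebGPU is unavailable.
const session = await ort.InferenceSession.create(modelUrl, {
  executionProviders: ["webgpu", "wasm"],
});
```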
The detection pipeline:

```
Frame / Image
      │
      ▼
Face Landmarker (MediaPipe)
      │
      ├─ landmarks found → crop eye region (34 keypoints, 112×112)
      │
      └─ no landmarks   → centered crop fallback
      │
      ▼
ONNX Model (6.2 MB)
  logit → sigmoid → probability
      │
      ▼
Temporal smoothing (N frames)
      │
      ▼
{ glasses, probability, faceDetected }
```
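The last two stages are simple enough to sketch. The moving-average window and the 0.5 threshold below are assumptions; the actual smoothing strategy and window size N may differ:

```ts
const sigmoid = (logit: number): number => 1 / (1 + Math.exp(-logit));

// Assumed smoothing: average the last N probabilities to suppress
// frame-to-frame flicker, then threshold at 0.5.
class TemporalSmoother {
  private history: number[] = [];
  constructor(private readonly windowSize: number = 5) {}

  push(logit: number): { glasses: boolean; probability: number } {
    this.history.push(sigmoid(logit));
    if (this.history.length > this.windowSize) this.history.shift();
    const probability =
      this.history.reduce((sum, p) => sum + p, 0) / this.history.length;
    return { glasses: probability > 0.5, probability };
  }
}
```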
Package layout:

```
packages/
  core/   → GlassesDetector (browser) and GlassesDetectorNode (Node.js)
  react/  → useGlassesDetector hook
  utils/  → shared types, constants, and helpers
```
Install:

```bash
# Browser / React
npm install @framefind/core onnxruntime-web
npm install @framefind/react onnxruntime-web react

# Node.js
npm install @framefind/core onnxruntime-node
```

Browser:

```ts
import { GlassesDetector } from "@framefind/core";
const detector = new GlassesDetector({
modelUrl: "https://cdn.framefind.moraxh.dev/glasses/v1/glasses.onnx",
});
await detector.load();
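// `landmarks` can come from MediaPipe's Face Landmarker, which the pipeline
// uses for the eye-region crop. Assumption: the detector accepts the
// faceLandmarks array from @mediapipe/tasks-vision as-is; check FrameFind's
// docs for the expected shape. For example:
//   const res = landmarker.detectForVideo(video, performance.now());
//   const landmarks = res.faceLandmarks[0];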
const result = await detector.detectFromCanvas(canvas, landmarks);
console.log(result.glasses, result.probability);
```

React:

```tsx
import { useGlassesDetector } from "@framefind/react";

function Camera() {
const { result, loading, detect } = useGlassesDetector({
modelUrl: "https://cdn.framefind.moraxh.dev/glasses/v1/glasses.onnx",
});
// call detect() each frame via requestAnimationFrame
return <p>{result?.glasses ? "Wearing glasses" : "No glasses"}</p>;
}
```
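A sketch of the per-frame loop the comment above refers to. The call signature of `detect` is an assumption (mirroring `detectFromCanvas`); adapt it to the hook's actual API:

```tsx
import { useEffect } from "react";
import { useGlassesDetector } from "@framefind/react";

function CameraLoop({ canvas }: { canvas: HTMLCanvasElement }) {
  const { result, detect } = useGlassesDetector({
    modelUrl: "https://cdn.framefind.moraxh.dev/glasses/v1/glasses.onnx",
  });

  useEffect(() => {
    let raf = 0;
    const loop = async () => {
      await detect(canvas);              // assumed: classify the latest frame
      raf = requestAnimationFrame(loop); // then schedule the next one
    };
    raf = requestAnimationFrame(loop);
    return () => cancelAnimationFrame(raf); // stop when the component unmounts
  }, [canvas, detect]);

  return <p>{result?.glasses ? "Wearing glasses" : "No glasses"}</p>;
}
```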
Node.js:

```ts
import { GlassesDetectorNode } from "@framefind/core/node";

const detector = new GlassesDetectorNode({
modelPath: "./glasses.onnx",
});
await detector.load();
const result = await detector.detectFromImagePath("./photo.jpg"); // requires sharp
```

What happens on each call:

- Receives a video frame, canvas, or image buffer
- If MediaPipe landmarks are provided, extracts the eye region using 34 keypoints
- Resizes to 112×112 and normalizes with ImageNet mean/std (sketched after this list)
- Runs the ONNX model, gets a logit → sigmoid → probability
- Smooths the last N predictions to avoid flickering
- Returns `{ glasses, probability, faceDetected }`
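A minimal sketch of the normalization step, assuming the eye crop has already been drawn to a 112×112 canvas, the common NCHW float32 layout, and the standard ImageNet statistics (the model's exact input spec isn't documented here):

```ts
// Standard ImageNet per-channel statistics.
const MEAN = [0.485, 0.456, 0.406];
const STD = [0.229, 0.224, 0.225];

// Convert 112×112 RGBA pixels into a normalized NCHW float32 tensor.
// (NCHW layout is an assumption; check the model's actual input spec.)
function toInputTensor(img: ImageData): Float32Array {
  const { width, height, data } = img; // RGBA bytes from a 112×112 canvas
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    for (let c = 0; c < 3; c++) {
      // scale byte to [0, 1], then normalize per channel
      out[c * plane + i] = (data[i * 4 + c] / 255 - MEAN[c]) / STD[c];
    }
  }
  return out;
}
```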
Glasses detection is the starting point, not the ceiling. The name isn't tied to any single task — FrameFind is about understanding what's on and around a face, frame by frame.
Planned detectors:
- Mask — is the person wearing a face mask?
- Eyes open/closed — blink detection, drowsiness
- Face attributes — age range, expression, skin tone-agnostic attributes
- Head pose — yaw, pitch, roll estimation
- Attention — is the person looking at the screen?
- Iris tracking — gaze direction without eye-tracking hardware
Same architecture, same on-device approach. Each detector ships as its own model and package so you only pull in what you need.
