ComfyUI CAD Legend Processor

A ComfyUI custom node package that reads PDF CAD drawings (orthographic plans), detects the key/legend, extracts colour–label pairs, and generates binary masks for each colour in the drawing.

Built for UK Traffic Regulation Order (TRO) plans and similar CAD output where coloured zones represent different features (cycleways, bus lanes, footways, etc.), but works with any PDF CAD drawing that has a colour-coded legend/key.

What It Does

PDF CAD Drawing ──► Render at 600 DPI ──► Find KEY/Legend ──► Extract colour–label pairs ──► Generate binary masks

Given a multi-page PDF of CAD plans, this package:

  1. Renders each PDF page to a high-resolution raster image (600 DPI recommended)
  2. Locates the KEY/LEGEND heading using PDF text extraction (PyMuPDF)
  3. Extracts colour–label pairs by sampling the actual rendered pixels at each swatch position
  4. Generates per-colour binary masks using perceptual colour matching (CIE76 in LAB space)
  5. Saves individual mask PNGs, composite overlays, and structured JSON

Installation

Prerequisites

  • ComfyUI installed and working
  • Python 3.10+ (comes with ComfyUI)

Install the node package

cd ComfyUI/custom_nodes
git clone https://github.com/3Dunlop/comfyui_cad_legend.git
cd comfyui_cad_legend
pip install -r requirements.txt

Then restart ComfyUI. The nodes will appear under the CAD Legend Processor category.

Optional: Florence-2 support

If you want the Florence-2 fallback path (auto-detection of legend region via vision model), install:

cd ComfyUI/custom_nodes
git clone https://github.com/kijai/ComfyUI-Florence2.git

The Florence-2 model (microsoft/Florence-2-large) will be downloaded automatically on first use.

Note: Florence-2 is completely optional. The primary CAD: PDF Swatch Extractor node uses PyMuPDF text extraction and does not require any AI model.

Nodes

PDF Input

| Node | Description |
|---|---|
| CAD: PDF to Image | Loads a PDF page and renders it as a ComfyUI IMAGE tensor at a configurable DPI |

Legend Extraction (choose one path)

| Node | Description |
|---|---|
| CAD: PDF Swatch Extractor | Recommended. Direct PDF + raster extraction; no AI model needed. Uses PyMuPDF to find the KEY heading and label positions, then samples rendered pixels at each swatch location to get the actual colour |
| CAD: Swatch Extractor | Florence-2 OCR path. Takes a cropped legend image + Florence-2 OCR JSON and searches for solid-colour swatches adjacent to each text label |
| CAD: Crop Legend (Florence-2) | Crops the legend region using Florence-2 object detection |
| CAD: Crop Legend (Manual) | Crops the legend region using manually specified pixel coordinates |
| CAD: Legend Display | Formats and displays legend JSON as a readable table in the node preview |

Mask Generation & Output

| Node | Description |
|---|---|
| CAD: Batch Masks from Legend | Generates one binary mask per legend entry using LAB, HSV, or combined colour matching |
| CAD: Select Mask by Label | Picks a single mask from the batch by partial label match or index |
| CAD: Save Labeled Masks | Saves all masks to disk as PNGs named after their legend labels |
| CAD: Mask Preview Grid | Combines all masks into a single colour-tinted grid image for visual inspection |
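
As a rough illustration of "partial label match or index" selection, here is a minimal sketch. The function name `select_mask`, the case-insensitive substring rule, and the first-match behaviour are assumptions for illustration; the node's actual matching rules may differ.

```python
def select_mask(labels, masks, query):
    """Pick one (label, mask) pair by integer index or by partial label text.

    Assumption: text matching is a case-insensitive substring test and the
    first matching legend entry wins.
    """
    if isinstance(query, int):
        return labels[query], masks[query]
    needle = query.strip().lower()
    for label, mask in zip(labels, masks):
        if needle in label.lower():
            return label, mask
    raise KeyError(f"no legend label contains {query!r}")


labels = ["PROPOSED FOOTWAY", "PROPOSED CYCLEWAY"]
masks = ["mask-0", "mask-1"]  # placeholders standing in for mask arrays
print(select_mask(labels, masks, "cycle"))
```

With a first-match rule, a short query such as "footway" would return whichever footway entry appears first in the legend, so a longer query or an index is safer when labels overlap.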

Workflows

Two workflow JSON files are provided:

cad_legend_pdf_direct.json — Recommended (no Florence-2)

CAD: PDF to Image ──► CAD: PDF Swatch Extractor ──► CAD: Batch Masks from Legend ──► Save / Preview

A simple pipeline of three processing nodes followed by save/preview. Enter the same PDF path and DPI in both the PDF loader and the swatch extractor nodes. No AI model required.

cad_legend_processor.json — Florence-2 path

CAD: PDF to Image ──► Florence-2 (detect) ──► Crop Legend ──► Florence-2 (OCR) ──► Swatch Extractor ──► Batch Masks ──► Save / Preview

Uses Florence-2 for both legend region detection and OCR. More flexible for unusual legend layouts but slower and requires the Florence-2 model (~1.5 GB).

How It Works

The Problem

CAD drawings exported as PDF use vector graphics internally. The fill and stroke colours stored in the PDF's vector drawing data (read via PyMuPDF's get_drawings()) are the authoring colours and often do not match what appears in the rendered raster. For example:

  • A vector green #00DD6E may render as yellow-green #D0E080 at 600 DPI
  • Hatched patterns in the legend produce grey rasters, not the metadata colour
  • Anti-aliasing at region boundaries creates colour fringing

An approach that reads PDF vector colours directly can therefore generate masks that match little or nothing in the actual image.

The Solution: Raster Pixel Sampling

Instead of trusting PDF metadata, this package:

  1. Renders the PDF at high DPI to get the actual pixel colours
  2. Extracts text positions from the PDF's text layer (which IS reliable — PyMuPDF gives exact bounding boxes for every text span)
  3. Samples pixels at the known swatch position (just to the left of each label text) in the rendered raster
  4. Filters the sampled pixels to exclude paper-white background:
    • Primary filter: R+G+B < 690 AND max(R,G,B) - min(R,G,B) > 10 (coloured, not too bright)
    • Secondary filter: R+G+B < 640 (catches dark greys from hatching patterns)
    • If no coloured pixels found: defaults to #989898 grey (no swatch present)
  5. Takes the median of filtered pixels as the target colour — robust to noise and anti-aliasing
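
The filtering and median steps above can be sketched in plain Python. The thresholds come from the list above; the function names, the per-channel median, and the rule that the secondary filter is only tried when the primary filter rejects every pixel are assumptions for illustration, not a description of the package's internals.

```python
from statistics import median

DEFAULT_GREY = (0x98, 0x98, 0x98)  # #989898, returned when no swatch pixels survive


def is_coloured(r, g, b):
    """Primary filter: not too bright and with visible chroma."""
    return (r + g + b) < 690 and (max(r, g, b) - min(r, g, b)) > 10


def is_dark_grey(r, g, b):
    """Secondary filter: darker neutral pixels, e.g. from hatching."""
    return (r + g + b) < 640


def swatch_colour(pixels):
    """Return the per-channel median of the pixels that pass a filter.

    Assumption: the secondary filter is consulted only if the primary
    filter keeps nothing.
    """
    for keep in (is_coloured, is_dark_grey):
        kept = [p for p in pixels if keep(*p)]
        if kept:
            return tuple(int(median(channel)) for channel in zip(*kept))
    return DEFAULT_GREY


# Four paper-white pixels and three swatch pixels sampled left of a label
sample = [(255, 255, 255)] * 4 + [(210, 238, 129)] * 3
print(swatch_colour(sample))
```

Taking the median rather than the mean means a few anti-aliased edge pixels in the sample do not pull the result away from the swatch's flat fill colour.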

Colour Matching

Masks are generated using CIE76 perceptual distance in CIELAB colour space. This generally works better than plain RGB distance because CIELAB is designed to be approximately perceptually uniform:

  • Two colours that look similar have a small LAB distance
  • Two colours that look different have a large LAB distance
  • This matters for pastels, greys, and desaturated colours that are numerically close in RGB but visually distinct

Default tolerance is 25.0 ΔE, which works well for clean vector-rendered PDFs. Reduce to 15–20 for tighter matching, increase to 30–35 for scanned or noisy drawings.
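
To make the matching rule concrete, the sketch below converts sRGB to CIELAB with the standard D65 formulas, measures CIE76 distance (plain Euclidean distance in LAB), and thresholds a tiny image. It is a dependency-free stand-in: the package lists OpenCV for colour space conversion, and the names `srgb_to_lab`, `delta_e_cie76`, and `colour_mask`, as well as the nested-list image format, are assumptions for illustration.

```python
import math


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 reference white)."""
    def linearise(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearise(c) for c in rgb)
    # Linear sRGB -> XYZ, each axis normalised by the D65 white point
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        return t ** (1 / 3) if t > 216 / 24389 else (24389 / 27 * t + 16) / 116

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e_cie76(rgb1, rgb2):
    """CIE76 colour difference: Euclidean distance between LAB coordinates."""
    return math.dist(srgb_to_lab(rgb1), srgb_to_lab(rgb2))


def colour_mask(image, target, tolerance):
    """image is a list of rows of (R, G, B) tuples; returns rows of 0 or 255."""
    target_lab = srgb_to_lab(target)
    return [
        [255 if math.dist(srgb_to_lab(px), target_lab) <= tolerance else 0 for px in row]
        for row in image
    ]


image = [
    [(255, 255, 255), (192, 223, 255)],
    [(0, 0, 0), (196, 225, 255)],
]
print(colour_mask(image, target=(192, 223, 255), tolerance=10.0))
```

Very pale swatches sit close to paper white in LAB, so it is worth checking the distance between a swatch colour and (255, 255, 255) with `delta_e_cie76` before raising the tolerance.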

Deduplication

Legend entries with very similar colours (CIE76 distance < 12.0 ΔE) are flagged as duplicates. The duplicate entry is kept in the output (tagged with _dup in the source field) and still gets its own mask. This handles cases like:

  • "PROPOSED SHARED FOOTWAY AND CYCLEWAY" (#C0DFFF) vs "PROPOSED TRAFFIC SIGNALS" (#CDD9FF) — both pale blue, but the dup gets its own mask
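
A minimal sketch of the flagging step, assuming each legend entry is a dict that already carries its colour as a LAB triple. The field names `label`, `lab`, and `source`, the `"pdf"` source value, and the choice to compare only against earlier non-duplicate entries are hypothetical; the LAB values are approximate conversions of the two pale blues above.

```python
import math


def flag_duplicates(entries, threshold=12.0):
    """Append "_dup" to the source of entries too close to an earlier colour.

    Entries are never dropped; a flagged entry stays in the output list.
    """
    originals = []
    tagged = []
    for entry in entries:
        out = dict(entry)  # copy so the caller's data is left untouched
        if any(math.dist(entry["lab"], o["lab"]) < threshold for o in originals):
            out["source"] = out.get("source", "") + "_dup"
        else:
            originals.append(entry)
        tagged.append(out)
    return tagged


entries = [
    {"label": "PROPOSED SHARED FOOTWAY AND CYCLEWAY", "lab": (87.6, -3.3, -18.9), "source": "pdf"},
    {"label": "PROPOSED TRAFFIC SIGNALS", "lab": (86.9, 3.7, -19.8), "source": "pdf"},
    {"label": "PROPOSED ROAD MARKINGS", "lab": (0.0, 0.0, 0.0), "source": "pdf"},
]
for e in flag_duplicates(entries):
    print(e["source"], e["label"])
```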

Configuration Reference

CAD: PDF Swatch Extractor

| Parameter | Default | Description |
|---|---|---|
| dpi | 600 | Must match the DPI used in CAD: PDF to Image |
| key_height_pts | 130 | PDF points below the KEY heading to search for labels |
| swatch_width_pts | 45 | PDF points to the left of label text to sample for colour |
| dedup_threshold | 12.0 | CIE76 ΔE threshold for duplicate colour detection |
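
The reason dpi must match is that label positions are in PDF points (1/72 inch) while sampling happens in raster pixels. The sketch below shows the conversion and a swatch box to the left of a label; the helper names, the rounding, and the bounding-box layout `(x0, y0, x1, y1)` with a top-left origin are assumptions for illustration.

```python
def pts_to_px(points, dpi):
    """PDF points are 1/72 inch, so pixels = points * dpi / 72."""
    return int(round(points * dpi / 72.0))


def swatch_box_px(label_bbox_pts, dpi=600, swatch_width_pts=45):
    """Pixel box (left, top, right, bottom) covering the strip left of a label.

    label_bbox_pts is (x0, y0, x1, y1) in PDF points, origin at the top-left.
    """
    x0, y0, _x1, y1 = label_bbox_pts
    left = max(0, pts_to_px(x0 - swatch_width_pts, dpi))
    right = pts_to_px(x0, dpi)
    return left, pts_to_px(y0, dpi), right, pts_to_px(y1, dpi)


label = (100, 200, 180, 210)  # hypothetical label bounding box in points
print(swatch_box_px(label, dpi=600))
print(swatch_box_px(label, dpi=300))
```

The two calls return different pixel boxes for the same label, which is why sampling a 600 DPI raster with 300 DPI coordinates would miss the swatch.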

CAD: Batch Masks from Legend

| Parameter | Default | Description |
|---|---|---|
| mask_method | LAB | LAB (recommended), HSV (saturated colours), or BOTH (max recall) |
| tolerance | 20.0 | Colour distance threshold. LAB: 12–25 typical. HSV: 10–20 typical |
| morphology_kernel | 3 | Cleanup kernel size. 0=off, 3=gentle, 5–7=noisy scans |
| invert_masks | false | Invert mask polarity (black ↔ white) |

CAD: Swatch Extractor (Florence-2 path)

| Parameter | Default | Description |
|---|---|---|
| swatch_side | LEFT | Where to search for colour swatches: LEFT, RIGHT, or BOTH |
| swatch_search_width | 90 | Pixel width of the swatch search region |
| variance_threshold | 18.0 | Max RGB std-dev for a region to count as "solid colour" |
| dedup_cie76_threshold | 8.0 | CIE76 ΔE threshold for duplicate skipping |

Standalone Test Script

test_pipeline.py runs the full pipeline outside ComfyUI for development and validation:

cd ComfyUI/custom_nodes/comfyui_cad_legend

# Process page 0 (default)
python test_pipeline.py "D:/CAD/your_drawing.pdf"

# Process a specific page
python test_pipeline.py "D:/CAD/your_drawing.pdf" 2

Output is saved to D:/CAD/output/ (configurable in the script):

D:/CAD/output/
├── page00_raw.png              # Full-resolution render
├── page00_key_region.png       # Drawing with KEY region highlighted
├── page00_key_crop.png         # Cropped KEY area
├── page00_legend.json          # Extracted legend data
├── page00_composite.png        # All masks overlaid on drawing
├── page00_masks/
│   ├── 000_PROPOSED_CHANNEL_ALIGNMENT.png
│   ├── 001_PROPOSED_VERGE-LANDSCAPING.png
│   ├── ...
│   └── 010_PROPOSED_TRAFFIC_SIGNALS.png
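
The mask file names in the listing above look like a zero-padded index followed by the legend label with spaces turned into underscores and "/" turned into "-". A sketch of a naming helper that reproduces those names follows; the function name and the handling of any other punctuation are assumptions.

```python
import re


def mask_filename(index, label):
    """Build a mask file name such as 001_PROPOSED_VERGE-LANDSCAPING.png."""
    name = label.strip().replace("/", "-")
    name = re.sub(r"\s+", "_", name)
    # Assumption: drop anything else that is awkward in a file name
    name = re.sub(r"[^A-Za-z0-9_\-]", "", name)
    return f"{index:03d}_{name}.png"


print(mask_filename(1, "PROPOSED VERGE/LANDSCAPING"))
```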

The test script has a Florence-2 fallback path that activates if PDF text extraction fails (e.g., rasterised PDFs with no text layer).

Example Output

Processing a 4-page UK TRO plan (BSIP_Newhaven_Informal_TRO_plans_v3.pdf) at 600 DPI:

Legend Entries
==================================================
 1. #CDCDCD  RGB(205,205,205)  PROPOSED CHANNEL ALIGNMENT        0.74%
 2. #D2EE81  RGB(210,238,129)  PROPOSED VERGE/LANDSCAPING        0.64%
 3. #989898  RGB(152,152,152)  CARRIAGEWAY                       0.04%
 4. #FFC0BF  RGB(255,192,191)  PROPOSED 24/7 BUS LANE            0.66%
 5. #FFEFC0  RGB(255,239,192)  PROPOSED FOOTWAY                  0.24%
 6. #81A0FF  RGB(129,160,255)  PROPOSED CYCLEWAY                 0.08%
 7. #C0DFFF  RGB(192,223,255)  PROPOSED SHARED FOOTWAY           0.65%
 8. #FFC08F  RGB(255,192,143)  PROPOSED TACTILE PAVING           0.17%
 9. #000000  RGB(  0,  0,  0)  PROPOSED ROAD MARKINGS            1.84%
10. #FFC41A  RGB(255,196, 26)  DOUBLE YELLOW LINE MARKINGS       0.01%
11. #CDD9FF  RGB(205,217,255)  PROPOSED TRAFFIC SIGNALS          0.45%
==================================================

Coverage percentages show what fraction of the total drawing area each colour occupies.
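
Coverage of this kind is the share of mask pixels that are set. A minimal sketch, assuming a mask is a list of rows where any non-zero value counts as "on" (the function name is hypothetical):

```python
def coverage_percent(mask):
    """Percentage of pixels in a binary mask that are non-zero."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    on = sum(1 for row in mask for value in row if value)
    return 100.0 * on / total


print(coverage_percent([[255, 0], [0, 0]]))
```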

Known Limitations

  • Black (#000000) masks are noisy — PROPOSED ROAD MARKINGS samples as black, which also matches all line work, text, and boundary outlines in the drawing. This is inherent to using black as a map symbol.
  • Very similar colours (e.g., two shades of pale blue) may be flagged as duplicates even when they represent different features. Adjust dedup_threshold if needed.
  • No text layer = no PDF extraction — if the PDF is a pure raster scan with no embedded text, the PDF Swatch Extractor will fail and you'll need the Florence-2 path.
  • Legend must have a KEY/LEGEND heading — the PDF extractor searches for "KEY", "LEGEND", or similar headings. If the drawing uses a non-standard heading, it falls back to inferring from "PROPOSED" text lines.
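
The heading search and its fallback can be sketched on plain data. Here each text span is a `(text, bbox)` pair in PDF points with y increasing downwards, as PyMuPDF reports page coordinates. The function name, the exact heading set, and the way the "PROPOSED" fallback is applied are assumptions for illustration only.

```python
HEADINGS = {"KEY", "LEGEND"}


def find_legend_labels(spans, key_height_pts=130):
    """Return the spans treated as legend labels.

    spans: list of (text, (x0, y0, x1, y1)) tuples in PDF points.
    """
    heading = next(
        (s for s in spans if s[0].strip().rstrip(":").upper() in HEADINGS), None
    )
    if heading is None:
        # Fallback: no heading found, so infer labels from "PROPOSED" lines
        return [s for s in spans if s[0].strip().upper().startswith("PROPOSED")]
    top = heading[1][3]             # bottom edge of the heading
    bottom = top + key_height_pts   # search window below the heading
    return [s for s in spans if s is not heading and top <= s[1][1] <= bottom]


spans = [
    ("KEY", (500, 100, 530, 110)),
    ("PROPOSED FOOTWAY", (545, 120, 640, 128)),
    ("Scale 1:500", (500, 400, 560, 410)),
]
print(find_legend_labels(spans))
```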

File Structure

comfyui_cad_legend/
├── __init__.py          # Package entry point, node registry
├── color_utils.py       # Shared colour math, tensor helpers, mask generation
├── nodes_pdf.py         # CAD_PDFToImage node
├── nodes_legend.py      # Legend extraction nodes (5 nodes)
├── nodes_mask.py        # Mask generation and output nodes (4 nodes)
├── test_pipeline.py     # Standalone test script
├── requirements.txt     # Python dependencies
└── README.md            # This file

Dependencies

| Package | Version | Purpose |
|---|---|---|
| PyMuPDF | ≥ 1.23.0 | PDF rendering and text extraction |
| OpenCV (headless) | ≥ 4.8.0 | Colour space conversion, morphology, drawing |
| NumPy | ≥ 1.24.0 | Array operations |
| Pillow | ≥ 9.0.0 | Image I/O fallbacks |
| PyTorch | | Tensor operations (provided by ComfyUI) |

License

MIT
