
davidwhitenyc/citytracker


citytracker: Civic Data Dashboard, Powered by Marimo and Netlify

View Dashboard on Netlify: https://citytracker-nyc.netlify.app/

Last Updated: 3/15/2026 11:25 AM

Note

Why use this particular method to publish a data dashboard? The Marimo → Netlify workflow provides the following strategic benefits:

  • Unified dashboard for all sites

  • Preview deployments for every branch (critical for client review)

  • Git-based workflow (push to deploy)

  • Custom domain management per project

  • No server costs or maintenance

  • Scales automatically to traffic

These workflow advantages outweigh the package compatibility constraints for this use case.

🟡 Current Status

Currently deployed on Marimo's molab platform. In progress: migrating to self-hosted Netlify deployment using WASM export for improved workflow management, custom domain support, and preview deployments.

🔧 Prerequisites

Local Development:

  • Python 3.11+
  • uv for environment management
  • marimo for notebooks

Deployment:

  • GitHub account (free tier)
  • Netlify account (free tier)
  • Git installed and configured

Optional:

  • NYC Open Data API token (for data fetching)

🗃️ Repository Structure

citytracker/
├── notebooks/
│   ├── citytracker.py              # Main dashboard [TODO] Switch data source from API to local storage
│   ├── fetch-housing-data.py       # [TODO] Data fetching notebook (run locally)
│   └── test-wasm-packages.py       # [TODO] Package compatibility testing
├── data/
│   ├── housing.parquet             # [TODO] Static housing data (or .csv)
│   └── housing_metadata.json       # [TODO] Last updated timestamp and row count
├── docs/
│   ├── images/
│   ├── reference-information/
│   └── tutorials-and-guides/
│       └── Publishing Marimo to Netlify Guide.md
├── requirements.txt                # ✅ Created, needs WASM compatibility verification
├── netlify.toml                    # [TODO] Build configuration
├── .gitignore                      # ✅ Created
└── README.md

✅ Completed

  • Repository structure created (notebooks/, data/, docs/)
  • requirements.txt created with core packages
  • .gitignore configured
  • Git repository initialized and commits made
  • Style guide created and translated to CSS
  • First relevant data source identified on https://opendata.cityofnewyork.us/ and data exported via API
  • Main notebook functional with seaborn visualizations and interactive widgets

🚧 Next Steps

01. Perform WASM compatibility testing

Caution

Test whether current visualization approach will work in WASM deployment.

  • Create notebooks/test-wasm-packages.py with test cells for:
    • seaborn (critical - used for all current visualizations)
    • PyArrow (if planning to use Parquet)
    • great-tables (imported but not currently used)
  • Export test notebook to WASM:
    marimo export html-wasm notebooks/test-wasm-packages.py -o test-dist --mode run
  • Test locally:
    cd test-dist && python -m http.server 8000
    # Open http://localhost:8000 in browser, check console (F12) for errors
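
The package checks in this step could be written as a small helper inside notebooks/test-wasm-packages.py. The following is a minimal sketch, not the project's actual test notebook: the function name check_imports is hypothetical, and the import names (for example great_tables for the great-tables package) are assumptions to verify.

```python
import importlib


def check_imports(module_names):
    """Try to import each module.

    Returns a dict mapping each name to None on success,
    or to the error text when the import fails.
    """
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            results[name] = f"{type(exc).__name__}: {exc}"
    return results
```

Calling check_imports(["seaborn", "pyarrow", "great_tables"]) in a cell and displaying the result shows each failure in the page itself, in addition to the browser console.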

Important

DECISION POINT: Package compatibility results will determine the path forward:

graph TD
    TestResults[Test Results from<br/>WASM Export] --> Seaborn{Seaborn<br/>Works?}

    Seaborn -->|✅ Yes| KeepSeaborn[Keep seaborn in requirements.txt<br/>No code changes needed]
    Seaborn -->|❌ No| RefactorMPL[Must refactor visualizations:<br/>- Replace seaborn with matplotlib<br/>- Apply custom styling<br/>- Update Step 2 checklist]

    KeepSeaborn --> PyArrow{PyArrow<br/>Works?}
    RefactorMPL --> PyArrow

    PyArrow -->|✅ Yes| UseParquet[Use Parquet format:<br/>- Smaller files 50-70%<br/>- Faster loading<br/>- Types preserved]
    PyArrow -->|❌ No| UseCSV[Use CSV format:<br/>- Guaranteed compatibility<br/>- Larger files<br/>- Need type conversions]

    UseParquet --> UpdateReqs[Update requirements.txt<br/>and document decisions]
    UseCSV --> UpdateReqs

    style Seaborn fill:#5A89B3,color:#fff
    style PyArrow fill:#5A89B3,color:#fff
    style RefactorMPL fill:#E8692B,color:#fff
    style UseParquet fill:#2A5A8C,color:#fff
    style UseCSV fill:#2A5A8C,color:#fff
    style KeepSeaborn fill:#2A5A8C,color:#fff
  • Update requirements.txt based on test results
  • Document test results below in "WASM Compatibility Results" section
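
For documenting the outcome, the decision flow above can be expressed as a small function. This is only an illustrative sketch; the function name and the dictionary keys are hypothetical and not part of the repository.

```python
def plan_from_results(seaborn_ok: bool, pyarrow_ok: bool) -> dict:
    """Map the two WASM test results to the choices in the decision diagram."""
    return {
        # Seaborn works: keep it. Otherwise refactor to matplotlib.
        "visualization": "seaborn" if seaborn_ok else "matplotlib",
        "needs_refactor": not seaborn_ok,
        # PyArrow works: use Parquet. Otherwise fall back to CSV.
        "data_format": "parquet" if pyarrow_ok else "csv",
        "needs_type_conversion": not pyarrow_ok,
    }
```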

02. Update data architecture

  • Create notebooks/fetch-housing-data.py notebook
  • Implement data fetching from Socrata API:
    • Use existing API code from citytracker.py (lines 207-246)
    • Filter to only needed columns:
      • borough
      • project_start_date, project_completion_date
      • extremely_low_income_units through other_income_units (6 columns)
    • Limit to 100,000 rows or appropriate subset
  • Save data in chosen format:
    • housing.to_csv('data/housing.csv', index=False) OR
    • housing.to_parquet('data/housing.parquet', index=False) OR
    • Save both for flexibility
  • Create data/housing_metadata.json:
    import json
    from datetime import datetime
    metadata = {
        "last_updated": datetime.now().isoformat(),
        "row_count": len(housing),
        "source": "NYC Open Data - hg8x-zxpr",
        "columns": list(housing.columns)
    }
    with open('data/housing_metadata.json', 'w') as f:
        json.dump(metadata, f, indent=2)
  • Run fetch notebook locally to generate data files
  • Verify file size acceptable for browser download (<10 MB preferred)
  • Test loading data from both notebook locations:
    • From citytracker.py: pd.read_csv('../data/housing.csv')
    • Verify path works correctly
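
The file size check in this step can be automated at the end of the fetch notebook. A minimal sketch using only the standard library follows; the helper name size_report is hypothetical, and the 10 MB ceiling is simply the preferred limit stated above.

```python
import os

MAX_BYTES = 10 * 1024 * 1024  # 10 MB preferred ceiling from the checklist


def size_report(path, max_bytes=MAX_BYTES):
    """Report a file's size in megabytes and whether it is under the limit."""
    size = os.path.getsize(path)
    return {
        "path": path,
        "megabytes": round(size / (1024 * 1024), 2),
        "ok": size < max_bytes,
    }
```

For example, size_report('data/housing.csv') returns a dictionary whose "ok" entry is False when the file is too large for a comfortable browser download.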

03. Refactor main notebook

  • In notebooks/citytracker.py, replace API data loading (lines 207-246):
    • Remove: from dotenv import load_dotenv
    • Remove: from sodapy import Socrata
    • Remove: load_dotenv(), os.getenv(), Socrata client code
    • Replace with: housing = pd.read_csv('../data/housing.csv') (or read_parquet)
  • Handle data types:
    • If using CSV: keep existing type conversion code (lines 253-265)
    • If using Parquet: remove type conversion (types preserved automatically)
  • Add data freshness indicator cell (after line 337):
    @app.cell(hide_code=True)
    def _(mo):
        import json
        with open('../data/housing_metadata.json') as f:
            metadata = json.load(f)
        mo.md(f"*Data last updated: {metadata['last_updated']}*")
  • Update requirements.txt:
    • Remove: python-dotenv
    • Remove: sodapy
    • Keep or remove: great-tables (if unused)

Caution

If seaborn is incompatible: Refactor visualizations to matplotlib

  • Test notebook locally: marimo run notebooks/citytracker.py
  • Verify data loads correctly and visualizations work
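
The data freshness cell in this step shows the raw ISO timestamp written by datetime.now().isoformat(). If a friendlier display is wanted, a sketch like the one below could format it first. The helper name format_last_updated and the chosen display format are assumptions, and month names follow the active locale.

```python
from datetime import datetime


def format_last_updated(iso_timestamp: str) -> str:
    """Turn an ISO 8601 timestamp into a short human-readable string."""
    parsed = datetime.fromisoformat(iso_timestamp)
    return parsed.strftime("%B %d, %Y at %I:%M %p")
```

The freshness cell would then pass metadata['last_updated'] through this helper before handing the text to mo.md.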

04. Double-check WASM compatibility

  • Export main notebook to WASM:
    marimo export html-wasm notebooks/citytracker.py -o dist --mode run
  • Serve locally:
    cd dist && python -m http.server 8000
  • Open http://localhost:8000 in browser and verify:
    • Notebook loads without errors (check browser console with F12)
    • Data displays correctly
    • Year dropdown widget works
    • Housing type dropdown widget works
    • Bar chart renders correctly
    • Data freshness indicator shows correct timestamp
    • Initial load time acceptable (5-15 seconds for Pyodide is normal)
  • Test on multiple browsers (Chrome, Firefox, Safari)
  • If any issues found, debug and retest

05. Set up Netlify configuration

  • Create netlify.toml in project root:
    [build]
      command = "pip install marimo -r requirements.txt && marimo export html-wasm notebooks/citytracker.py -o dist --mode run"
      publish = "dist"
    
    [build.environment]
      PYTHON_VERSION = "3.11"
  • Update .gitignore to include test artifacts:
    test-dist/
    

Important

DECISION POINT: Decide on a data version control strategy:

graph TD
    Start[Data files ready in<br/>data/ directory] --> Consider{What's your<br/>update cadence?}

    Consider -->|Monthly/Weekly<br/>Manual updates| OptionA[Option A: Commit to Git<br/>recommended]
    Consider -->|Frequent updates<br/>Large files >10MB| OptionB[Option B: Git LFS]
    Consider -->|Want always-fresh<br/>data on deploy| OptionC[Option C: Build-time Fetch]

    OptionA --> A1[✅ Simplest setup<br/>✅ Data in version control<br/>✅ No extra configuration]
    OptionB --> B1[⚠️ Requires Git LFS setup<br/>✅ Better for large files<br/>✅ Keeps repo size small]
    OptionC --> C1[⚠️ Most complex<br/>⚠️ Requires API token in Netlify<br/>✅ Always current data]

    A1 --> Decide[Choose based on<br/>your needs]
    B1 --> Decide
    C1 --> Decide

    style Consider fill:#5A89B3,color:#fff
    style OptionA fill:#2A5A8C,color:#fff
    style OptionB fill:#B0B0B0,color:#000
    style OptionC fill:#B0B0B0,color:#000
  • Option A: Commit data files to git

    • Simplest approach
    • Fine for monthly/weekly updates
    • Data automatically included in deployments
  • Option B: Use Git LFS for data files

    • Better for frequent large file updates
    • Requires Git LFS setup
  • Option C: Fetch during Netlify build

    • Requires modifying build command to run fetch notebook
    • Requires storing API token in Netlify environment variables
    • More complex but always fresh data
  • Stage and commit all changes:

    git add .
    git status  # Review changes
    git commit -m "Configure for Netlify WASM deployment"

06. Deploy to Netlify (via GitHub)

  • Create new GitHub repository (if not exists): https://github.com/new

  • Push to GitHub:

    git remote add origin https://github.com/yourusername/citytracker.git
    git branch -M main
    git push -u origin main
  • Log in to Netlify: https://app.netlify.com

  • Add new site → Import existing project → GitHub

  • Select citytracker repository

  • Verify build settings (should auto-detect from netlify.toml):

    • Build command: pip install marimo -r requirements.txt && marimo export html-wasm notebooks/citytracker.py -o dist --mode run
    • Publish directory: dist
    • Python version: 3.11
  • Click "Deploy site"

  • Monitor build logs for errors

  • Once deployed, test live site:

    • Notebook loads (expect 5-15 seconds for Pyodide)
    • All interactive features work
    • Data displays correctly
    • Test on mobile device
  • (Optional) Configure custom domain:

    • Site settings → Domain management → Add custom domain
    • Follow DNS configuration instructions
    • Wait for HTTPS certificate provisioning
  • Document final deployment URL in this README

07. Update data (after deployment is complete)

  1. Run marimo edit notebooks/fetch-housing-data.py locally
  2. Execute all cells to fetch fresh data from Socrata API
  3. Verify data/housing.csv and data/housing_metadata.json updated
  4. Review data for anomalies: housing.info(), housing.describe()
  5. Commit changes:
    git add data/
    git commit -m "Update housing data: [date]"
    git push
  6. Netlify auto-deploys (2-3 minutes)
  7. Visit live site and verify data freshness indicator updated
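
As a complement to the manual anomaly review in step 4, the previous and new housing_metadata.json contents can be compared before committing. This is a sketch under stated assumptions: the helper name metadata_warnings is hypothetical, and the 10% row-drop threshold is an arbitrary starting point, not a project rule.

```python
def metadata_warnings(previous: dict, current: dict, max_drop: float = 0.10) -> list[str]:
    """Compare two metadata dicts (same keys as housing_metadata.json)."""
    warnings = []
    removed = set(previous["columns"]) - set(current["columns"])
    if removed:
        warnings.append(f"columns removed: {sorted(removed)}")
    if current["row_count"] < previous["row_count"] * (1 - max_drop):
        warnings.append(
            f"row count fell from {previous['row_count']} to {current['row_count']}"
        )
    # ISO 8601 strings in the same format compare correctly as plain strings.
    if current["last_updated"] <= previous["last_updated"]:
        warnings.append("last_updated did not advance")
    return warnings
```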

💾 Data Sources

Dataset 01: Affordable Housing Production by Building

📚 Other Resources


📖 Appendix: Troubleshooting Marimo WASM Deployments

Problem Summary

After deploying a marimo notebook to Netlify using marimo export html-wasm --mode run, the deployed site showed only plain markdown text. CSS styling, images, interactive widgets, and visualizations were completely missing, despite no visible error messages in the deployment logs.

Root Causes Discovered

The issue had two distinct but related root causes:

1. Marimo Cell Output Pattern

Marimo displays only the last expression of a cell. The return statement merely makes variables available to other cells, so creating an object and returning it is NOT sufficient: the object must also appear as a standalone expression immediately before the return statement.

Incorrect pattern (doesn't display):

@app.cell
def _(mo):
    custom_css = mo.Html("""<style>...</style>""")
    return custom_css

Correct pattern (displays):

@app.cell
def _(mo):
    custom_css = mo.Html("""<style>...</style>""")
    custom_css  # ← Must include this line to display
    return custom_css

This same pattern applies to ALL marimo UI objects, including images:

Incorrect pattern for images (doesn't display):

@app.cell(hide_code=True)
def _(mo):
    nyc_flag = mo.vstack([
        mo.image(
            src="https://raw.githubusercontent.com/.../Flag_of_New_York_City.svg",
            width=400,
            height=300,
        ),
        mo.md("*Flag of the City of New York*"),
    ], align="start")
    return nyc_flag

Correct pattern for images (displays):

@app.cell(hide_code=True)
def _(mo):
    nyc_flag = mo.vstack([
        mo.image(
            src="https://raw.githubusercontent.com/.../Flag_of_New_York_City.svg",
            width=400,
            height=300,
        ),
        mo.md("*Flag of the City of New York*"),
    ], align="start")
    nyc_flag  # ← Must include this line to display
    return nyc_flag

And the same pattern for interactive widgets:

Incorrect pattern for widgets (doesn't display):

@app.cell(hide_code=True)
def _(mo):
    year_dropdown = mo.ui.dropdown(
        options=["2014", "2015", "2016", "2017"],
        value="2014",
        label="select year:"
    )
    return (year_dropdown,)

Correct pattern for widgets (displays):

@app.cell(hide_code=True)
def _(mo):
    year_dropdown = mo.ui.dropdown(
        options=["2014", "2015", "2016", "2017"],
        value="2014",
        label="select year:"
    )
    year_dropdown  # ← Must include this line to display
    return (year_dropdown,)

2. Pyodide Package Compatibility

A notebook exported with marimo export html-wasm executes in the browser using Pyodide (Python compiled to WebAssembly). Many Python packages are NOT available in Pyodide. Any import failure causes the entire cell to fail, which in turn prevents every cell that depends on its variables from executing.

Packages that FAILED in our deployment:

  • python-dotenv - Not available in Pyodide
  • sodapy - Not available in Pyodide
  • great_tables - Not available in Pyodide
  • plotly - Not available in Pyodide
  • seaborn - Not available in Pyodide

Packages that SUCCEEDED:

  • marimo - Core package, always available
  • numpy, pandas, matplotlib - Available in Pyodide
  • Standard library (os, json, datetime) - Always available

Diagnostic Process

Step 1: Check browser console

Open Chrome DevTools (F12) → Console tab. Look for Python errors:

[STDERR] Traceback (most recent call last):
  File "...", line X, in <module>
    from dotenv import load_dotenv
ModuleNotFoundError: No module named 'dotenv'

Step 2: Identify the failing cell

The error shows which cell crashed. When a cell crashes:

  • That cell's return values are undefined
  • ALL cells that depend on those values also fail
  • The failure cascades through the notebook

In our case, the imports cell failed, which meant mo (marimo) was never defined, causing every subsequent cell using mo to fail.
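
The cascade can be illustrated with a toy model of the notebook's dependency graph. This sketch does not use marimo's internals; the cell names and the defs/refs structure are hypothetical stand-ins for the variables each cell returns and receives.

```python
def failed_cells(cells: dict, broken: str) -> set:
    """Return every cell that fails when `broken` crashes.

    `cells` maps a cell name to {"defs": set of names it defines,
    "refs": set of names it needs from other cells}.
    """
    failed = {broken}
    missing = set(cells[broken]["defs"])  # names that never get defined
    changed = True
    while changed:
        changed = False
        for name, cell in cells.items():
            if name not in failed and cell["refs"] & missing:
                failed.add(name)
                missing |= cell["defs"]
                changed = True
    return failed
```

With an imports cell that defines mo and pd, a crash there marks every cell referencing mo or pd as failed, along with anything that depends on those cells in turn.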

Step 3: Check package availability

Cross-reference imported packages against the Pyodide package list: https://pyodide.org/en/stable/usage/packages-in-pyodide.html

Step 4: Comment out incompatible imports

Iteratively comment out packages until imports succeed:

@app.cell
def _():
    import marimo as mo
    import pandas as pd
    import matplotlib.pyplot as plt
    # from dotenv import load_dotenv  # ← Commented out
    # import seaborn as sns            # ← Commented out
    return mo, pd, plt

Step 5: Fix cell output patterns

Once imports succeed, ensure all UI cells follow the correct pattern:

@app.cell
def _(mo):
    widget = mo.ui.dropdown(...)
    widget  # ← Display it
    return widget

Solutions Applied

  1. Removed incompatible imports from the imports cell:

    • Commented out: dotenv, sodapy, great_tables, plotly, seaborn
    • Kept only: marimo, os, numpy, pandas, matplotlib
  2. Fixed cell output patterns for all UI cells:

    • CSS injection cell: Added custom_css line before return
    • Image cells (×3): Added variable name line before return
    • Widget cells: Already correct (kept as-is)
  3. Commented out dependent code:

    • API data loading cell (used dotenv and sodapy)
    • Data processing cells (depended on API data)
    • Seaborn visualization cells (used incompatible seaborn)
    • Seaborn theme configuration cell
  4. Updated netlify.toml to reference correct file path:

    command = "pip install marimo && marimo export html-wasm notebooks/citytracker.py -o dist --mode run"

Troubleshooting Checklist for Future Deployments

Use this checklist when debugging WASM deployment issues:

  • Check browser console (F12 → Console)

    • Look for [STDERR] messages
    • Look for ModuleNotFoundError or NameError
    • Note which cell is failing (shown in error)
  • Verify package compatibility

  • Check cell output patterns

    • Every cell that creates a UI object (mo.Html(), mo.image(), mo.ui.*()) must display it
    • Pattern: obj = mo.something(...), then obj on its own line, then return obj
    • Look at working widget cells as examples
  • Verify file paths in netlify.toml

    • Build command must reference correct notebook path
    • Example: notebooks/citytracker.py not just citytracker.py
  • Test locally before deploying

    marimo export html-wasm notebooks/your-notebook.py -o test-dist --mode run
    cd test-dist && python -m http.server 8000
    # Open http://localhost:8000 and check console for errors
  • Common cascading failures to watch for

    • If imports cell fails → mo undefined → all cells fail
    • If data loading fails → widgets may render but have no data
    • If seaborn imported → imports fail → nothing renders

Key Lessons

  1. Always check the browser console - Netlify build logs only show build-time errors, not runtime errors that happen when Python executes in the browser

  2. WASM mode has strict package limitations - Don't assume a package works just because pip install succeeded locally

  3. Marimo cell output is explicit - Unlike Jupyter, you can't just create an object; you must display it

  4. Failures cascade - One failed cell can break an entire notebook if other cells depend on its outputs

  5. Test locally first - Export to WASM and test locally before deploying to Netlify

When to Use WASM Export vs. Static HTML Export

Use marimo export html-wasm (WASM) when:

  • You want truly interactive notebooks (dropdowns, sliders work)
  • Data is small enough to bundle (<10 MB recommended)
  • You only need Pyodide-compatible packages
  • You want users to be able to modify code in the browser (use --mode edit; --mode run presents a read-only app)

Use marimo export html (static HTML) when:

  • You need packages not in Pyodide (seaborn, plotly, etc.)
  • You have large datasets
  • You just want to display pre-rendered output
  • Interactivity isn't critical
