
Server should Accept Recipe JSON #419

Draft
ascibisz wants to merge 24 commits into main from feature/server-passed-recipe-json

Conversation

@ascibisz ascibisz commented Oct 22, 2025

Problem

Currently, when a user packs an edited recipe via the cellpack client, we upload the edited recipe to Firebase, pass that reference to the server, and the server then retrieves the recipe from Firebase. To reduce Firebase calls and improve client efficiency, we're changing that flow to instead send the edited recipe JSON as the body of the packing request to the server.

We also wanted a way to check whether we have already packed a given recipe and, if so, return the existing result file rather than running the whole packing again. To do this, we calculate a hash for each recipe object before it is packed. The hash is uploaded to Firebase along with the recipe's result files, so we can query Firebase for a given hash to see whether a packing result already exists.
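The cache only works if two semantically identical recipes always hash the same, so the recipe JSON must be serialized canonically before hashing. Here is a minimal sketch of the idea; the actual `DataDoc.generate_hash` implementation may differ, and SHA-256 plus key-sorted serialization are assumptions:

```python
import hashlib
import json

def generate_recipe_hash(recipe: dict) -> str:
    """Hash a recipe dict deterministically.

    sort_keys=True and fixed separators make the serialization
    canonical, so semantically identical dicts produce the same
    hash regardless of key insertion order.
    """
    canonical = json.dumps(recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Key order does not change the hash
a = generate_recipe_hash({"name": "one_sphere", "version": "1.0.0"})
b = generate_recipe_hash({"version": "1.0.0", "name": "one_sphere"})
```

Because the hash is derived only from the recipe content, it doubles as a stable job identifier for deduplication.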

Key Server Improvements

1. Deduplication & Caching

  • BEFORE: Each request generated a unique UUID, no deduplication possible
  • AFTER: JSON recipes generate deterministic hash, enabling job deduplication

2. Input Flexibility & Backwards Compatibility

  • BEFORE: Only recipe file paths supported via query parameter
  • AFTER: Supports both recipe file paths AND direct JSON recipe objects in request body, plus optional config parameter

3. Smart Job Management

  • BEFORE: Generated a UUID for each job with no deduplication; every request created a new job regardless of content
  • AFTER: Uses a deterministic hash for JSON recipes, enabling job reuse for identical recipes

4. Firebase Request Reduction

  • BEFORE: Every edited recipe was uploaded to firebase by the client and downloaded from firebase by the server
  • AFTER: Edited recipes are passed in the body of the packing request, so no firebase uploads or downloads occur

5. Unified Results Upload

  • BEFORE: The Simularium result file was uploaded to S3 twice per job, once on its own and once as part of the full output-files upload
  • AFTER: The Simularium result file is uploaded only once; we track its path while uploading all output files
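The fix boils down to scanning the list of already-uploaded output URLs for the `.simularium` entry instead of uploading it a second time. A hedged sketch: `public_urls` mirrors the variable name in the server code, while the helper function itself is illustrative:

```python
def find_simularium_url(public_urls):
    """Return the .simularium result URL from a batch-upload listing.

    Tracking the path here lets the job_status entry reference the
    result file without a second, dedicated upload.
    """
    for url in public_urls:
        if url.endswith(".simularium"):
            return url
    return None

# Example listing, as might be returned by the outputs upload
urls = [
    "https://bucket.example/outputs/job123/figure.png",
    "https://bucket.example/outputs/job123/result.simularium",
]
```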

Technical Implementation

New Server Components:

  1. DataDoc.generate_hash() - Creates deterministic hash from recipe JSON
  2. job_exists() - Checks if job already completed in Firebase
  3. Enhanced request handling - Reads JSON from request body
  4. Smart job ID generation - Uses hash for JSON recipes, UUID for file paths

Request Flow Changes:

  1. Input validation now checks both query params and request body
  2. Hash-based deduplication for JSON recipes
  3. Backward compatibility maintained for file-based recipes
  4. Consistent job tracking with hash parameter
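The branching above can be condensed into one decision function. This is an illustrative sketch, not the server's actual code: `hash_fn` stands in for `DataDoc.generate_hash`, and `completed_hashes` stands in for the Firebase `job_status` lookup.

```python
import uuid

def resolve_job(recipe_path, recipe_body, hash_fn, completed_hashes):
    """Return (job_id, cached) for a packing request.

    - File-path recipes keep the legacy behavior: fresh UUID, no dedup.
    - JSON recipes get a deterministic hash as their job id, and
      `cached` is True when that hash already has a packing result.
    - No recipe at all maps to an HTTP 400 in the handler.
    """
    if recipe_path is not None:
        return str(uuid.uuid4()), False
    if recipe_body is not None:
        dedup_hash = hash_fn(recipe_body)
        return dedup_hash, dedup_hash in completed_hashes
    raise ValueError("no recipe provided")
```

Keeping this logic in a pure function makes the dedup behavior easy to unit-test without a running server or a Firebase connection.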

Benefits

  1. Reduced Server Load: Identical recipes don't reprocess
  2. Faster Client Response: Instant return for duplicate JSON requests
  3. Better Resource Utilization: No redundant compute for same recipes
  4. Improved API Design: JSON recipes easier for programmatic access
  5. Reduced Firebase Usage: Passing recipe directly instead of uploading to firebase

CellPACK Server Job Workflow Changes

BEFORE: Original Server Workflow

```mermaid
graph TD
    A[Client Request] --> B[POST /start-packing]
    B --> C{Check for recipe URL param}
    C -->|Missing| D[Return 400 Error]
    C -->|Present| E[Generate UUID for job_id]
    E --> F[Create Background Task]
    F --> G[Return job_id immediately]
    F --> I[Initiate packing]
    I --> J[Load recipe from firebase<br>using file path from<br>URL param]
    J --> K[Execute packing]
    K --> L{Packing succeeds?}
    L -->|Success| M[S3: Upload outputs to S3<br>Firebase: Update job status to SUCCEEDED]
    L -->|Failure| N[Firebase: Update job status to FAILED]

    style A fill:#e1f5fe
    style G fill:#c8e6c9
    style M fill:#fff3e0
    style N fill:#ffcdd2
```

AFTER: Enhanced Server Workflow with JSON Recipe Support

```mermaid
graph TD
    A[Client Request] --> B[POST /start-packing]
    B --> C{Check inputs}
    C -->|No recipe - no URL param<br>and no request body| D[Return 400 Error]
    C -->|Has recipe path URL param| E[Generate UUID for job_id]
    C -->|Has recipe JSON in request body| F[Generate hash from JSON]
    F --> G{Packing result exists<br>in firebase for this hash?}
    G -->|Yes| H[Return existing hash<br>as job_id]
    G -->|No| I[Use hash as job_id]
    E --> J[Create Background Task]
    I --> J
    J --> K[Return job_id immediately]
    J --> L[Initiate packing]
    L --> M{Input type?}
    M -->|Recipe path| N[Load recipe from firebase<br>using file path from<br>URL param]
    M -->|JSON body| O[Load recipe from JSON dict<br>from request body]
    N --> P[Execute packing]
    O --> P
    P --> Q{Packing succeeds?}
    Q -->|Success| R[S3: Upload outputs to S3<br>Firebase: Update job status to SUCCEEDED]
    Q -->|Failure| S[Firebase: Update job status to FAILED]

    style A fill:#e1f5fe
    style K fill:#c8e6c9
    style R fill:#fff3e0
    style S fill:#ffcdd2
    style G fill:#ffeb3b
    style H fill:#c8e6c9
```

@ascibisz ascibisz changed the title Feature/server passed recipe json Server should Accept Recipe JSON Oct 23, 2025
Before (docs/DOCKER.md):

```
3. Try hitting the test endpoint on the server, by navigating to `http://0.0.0.0:8443/hello` in your browser.
4. Try running a packing on the server, by hitting the `http://0.0.0.0:80/pack?recipe=firebase:recipes/one_sphere_v_1.0.0` in your browser.
```

After:

```
3. Try hitting the test endpoint on the server, by navigating to `http://0.0.0.0:80/hello` in your browser.
4. Try running a packing on the server, by hitting the `http://0.0.0.0:80/start-packing?recipe=firebase:recipes/one_sphere_v_1.0.0` in your browser.
```
Collaborator Author

These instructions were just slightly incorrect. This has nothing to do with the other code changes; I just ran into it when testing my code and wanted to fix it.

@github-actions (Contributor)

Packing analysis report

Analysis for packing results located at cellpack/tests/outputs/test_spheres/spheresSST

| Ingredient name | Encapsulating radius | Average number packed |
| --- | --- | --- |
| ext_A | 25 | 236.0 |

Packing image

Distance analysis

Expected minimum distance: 50.00
Actual minimum distance: 50.01

| Ingredient key | Pairwise distance distribution |
| --- | --- |
| ext_A | Distance distribution ext_A |

@ascibisz ascibisz marked this pull request as ready for review November 5, 2025 17:12
@ascibisz ascibisz marked this pull request as draft November 20, 2025 17:39
Base automatically changed from feature/client-upload-script to main December 1, 2025 18:11
@ascibisz ascibisz force-pushed the feature/server-passed-recipe-json branch from 63ca5ae to bd8ec42 Compare January 9, 2026 19:40

github-actions bot commented Jan 9, 2026

Packing analysis report

Analysis for packing results located at cellpack/tests/outputs/test_spheres/spheresSST

| Ingredient name | Encapsulating radius | Average number packed |
| --- | --- | --- |
| ext_A | 25 | 236.0 |

Packing image

Distance analysis

Expected minimum distance: 50.00
Actual minimum distance: 50.01

| Ingredient key | Pairwise distance distribution |
| --- | --- |
| ext_A | Distance distribution ext_A |

* remove os fetch for job_id

* use dedup_hash instead of job id

* proposal: get hash from recipe loader

* renaming and add TODOs

* format

* rename param to hash

* remove unused validate param and doc strings in pack

* simplify get_ dedup_hash

* refactor job_status update

* cleanup

* fix upload_job_status to handle awshandler

* pass dedup_pash to env for fetching across files

* add tests

* format1

* format test
* proposal: get hash from recipe loader

* simplify get_ dedup_hash

* only post simularium results file once for server job runs

* update code for rebase

* code cleanup

---------

Co-authored-by: Ruge Li <rugeli0605@gmail.com>
```python
config = request.rel_url.query.get("config")
job_id = str(uuid.uuid4())

dedup_hash = DataDoc.generate_hash(body)
```
Collaborator Author

check if this causes an error in the case that the recipe was passed in as a firebase path

```python
        return initialized_db
    return None

def job_exists(self, dedup_hash):
```
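A sketch of what `job_exists` might look like, assuming a Firestore-style client where `job_status` documents are keyed by the dedup hash. The collection name and the SUCCEEDED status come from this PR's description; everything else (the standalone signature, the injected `db` client) is illustrative:

```python
def job_exists(db, dedup_hash):
    """True if a packing keyed by dedup_hash has already succeeded.

    `db` is any Firestore-like client exposing
    collection(...).document(...).get() -> snapshot with .exists.
    """
    snapshot = db.collection("job_status").document(dedup_hash).get()
    if not snapshot.exists:
        return False
    # Only a finished job counts as a reusable cache hit
    return snapshot.to_dict().get("status") == "SUCCEEDED"
```

Filtering on status matters: a FAILED or still-running job under the same hash should not short-circuit a fresh packing.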
Collaborator Author

might be worth adding a comment describing this function

Collaborator Author

@ascibisz ascibisz Feb 18, 2026

Two major changes in this file:

  1. We now accept the recipe as either a firebase path in the recipe URL parameter (as we always have) OR as a JSON object in the body of the packing request. Either way, it is passed along to the pack function.
  2. Instead of generating a unique ID to use as the job ID, we calculate the dedup hash for the recipe and use that as our job ID. We check this dedup hash against those already in the job_status firebase collection to see whether this exact recipe has been packed before; if we find a match, we return it rather than running the packing again.

Collaborator Author

@ascibisz ascibisz Feb 18, 2026

Three notable changes here:

  1. The recipe argument can now be a dictionary representing the recipe, in addition to the file path or firebase path that were previously supported. If it's a dictionary, we initialize the recipe loader using RecipeLoader.from_json; everything else stays the same.
  2. We're now using dedup_hash instead of job_id. We had previously been storing job_id as an environment variable, which was kinda hacky, so now we just pass dedup_hash in directly and use it in place of job_id for packings initiated from server.py.
  3. validate was no longer being used, so we removed it from the argument list. This was an artifact from a separate PR that we just noticed here, so we're cleaning it up.

Collaborator Author

RecipeLoader now needs to accept a dictionary representing a JSON recipe, while maintaining the previous functionality of accepting an input_file_path. Fortunately it didn't take much to make this work! A few things of note:

  1. I wasn't sure how best to document that RecipeLoader can be initialized with either an input_file_path OR a json_recipe (not both, but you need one of them). To try to clarify this, I made the from_json class method the avenue for initializing a RecipeLoader with a JSON recipe, but I'm not sure if that's clearer or just more confusing. Open to feedback / suggestions on this.
  2. To make _read work for JSON recipes, we just had to add some default values and skip the initial recipe read when a json_recipe is already loaded.
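One common way to express the "exactly one of two inputs" constraint is a guarded `__init__` plus a named constructor, roughly along these lines. This is a simplified sketch, not the real class: the actual `RecipeLoader` takes additional arguments and does more in `_read`.

```python
import json

class RecipeLoader:
    """Load a recipe from a file path OR an in-memory JSON dict."""

    def __init__(self, input_file_path=None, json_recipe=None):
        # XOR guard: exactly one source of the recipe must be given
        if (input_file_path is None) == (json_recipe is None):
            raise ValueError(
                "provide exactly one of input_file_path or json_recipe"
            )
        self.input_file_path = input_file_path
        self.json_recipe = json_recipe

    @classmethod
    def from_json(cls, json_recipe):
        # Named constructor makes the JSON entry point explicit at call sites
        return cls(json_recipe=json_recipe)

    def _read(self):
        # Skip the file read when a JSON recipe is already loaded
        if self.json_recipe is not None:
            return self.json_recipe
        with open(self.input_file_path) as f:
            return json.load(f)
```

The named constructor does not prevent misuse on its own, but combined with the XOR guard it makes the invalid states (both inputs, or neither) fail fast with a clear message.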

Collaborator Author

Major changes here:

  1. We realized the results collection in firebase is not necessary. For cellpack studio purposes, it is redundant with job_status and we only look to job_status. For locally run packings, we can skip uploading the result path to firebase at all and just directly open the simularium file. This allowed us to remove store_metadata and simplify store_result_file
  2. We had been storing job_id as an environmental variable for packings initiated from server.py, which was kinda hacky. Now, we are directly passing in dedup_hash, which is used as our identifier for server packings

Collaborator Author

@ascibisz ascibisz Feb 18, 2026

Major changes here:

  1. We realized the results collection in firebase is not necessary. For cellpack studio purposes, it is redundant with job_status and we only look to job_status. For locally run packings, we can skip uploading the result path to firebase at all and just directly open the simularium file. This allowed us to remove the whole ResultDoc class, the DBMaintenance class, and remove upload_result_metadata
  2. We're now using dedup_hash instead of job_id as the id for server packings
  3. Refactored upload_job_status to handle the functionality of update_outputs_directory since they were basically the same, and removed update_outputs_directory

Collaborator Author

We realized the results collection in firebase is not necessary. For cellpack studio purposes, it is redundant with job_status and we only look to job_status. For locally run packings, we can skip uploading the result path to firebase at all and just directly open the simularium file. This allowed us to remove the cleanup code for the results collection (which was the only firebase cleanup we were doing in this repo anyways)

```python
simularium_url = None
for url in public_urls:
    if url.endswith(".simularium"):
        simularium_url = url
```
Collaborator Author

We had previously uploaded the result .simularium file twice for server-run packings! To avoid that, we now find it in the uploaded outputs directory and keep track of its path to reference specifically in the job_status entry.

Collaborator Author

We realized the results collection in firebase is not necessary. For cellpack studio purposes, it is redundant with job_status and we only look to job_status. For locally run packings, we can skip uploading the result path to firebase at all and just directly open the simularium file. This allowed us to remove the cleanup code for the results collection (which was the only firebase cleanup we were doing in this repo anyways)

Collaborator Author

We realized the results collection in firebase is not necessary. For cellpack studio purposes, it is redundant with job_status and we only look to job_status. For locally run packings, we can skip uploading the result path to firebase at all and just directly open the simularium file.

Copilot AI left a comment

Pull request overview

This PR updates the cellPACK server packing flow to accept edited recipe JSON directly in the request body (reducing Firebase reads), introduces deterministic hashing for job deduplication, and streamlines how packing outputs + job status are uploaded/tracked.

Changes:

  • Add JSON-body recipe support to /start-packing and compute a deterministic hash for dedup/job reuse.
  • Refactor result/job-status upload workflow to use a single outputs upload and store job status keyed by dedup hash.
  • Remove legacy Firebase “results” metadata + scheduled cleanup workflow.

Reviewed changes

Copilot reviewed 10 out of 13 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| docs/DOCKER.md | Updates local Docker test instructions and endpoints. |
| docker/server.py | Accepts recipe via query param or JSON body; adds dedup hash + Firebase job lookup. |
| docker/Dockerfile.ecs | Changes how dependencies are installed in the ECS image. |
| cellpack/tests/test_db_uploader.py | Adds tests for updated upload_job_status() behavior. |
| cellpack/bin/upload.py | Aligns result upload helper return value and DB setup URL usage. |
| cellpack/bin/pack.py | Supports dict-based recipe input; passes dedup hash through to env and upload flow. |
| cellpack/bin/cleanup_tasks.py | Removes Firebase cleanup task script. |
| cellpack/autopack/writers/__init__.py | Passes dedup hash into simularium post/upload helper. |
| cellpack/autopack/upy/simularium/simularium_helper.py | Simplifies result upload helper API; removes Firebase metadata write path. |
| cellpack/autopack/loaders/recipe_loader.py | Adds from_json() and allows loading recipes from dicts. |
| cellpack/autopack/interface_objects/default_values.py | Removes results from the default remote collections list. |
| cellpack/autopack/DBRecipeHandler.py | Adds hashing utility; refactors job status + upload workflow; removes maintenance/metadata code. |
| .github/workflows/cleanup-firebase.yml | Removes scheduled cleanup workflow. |


@ascibisz ascibisz force-pushed the feature/server-passed-recipe-json branch from 8127a76 to 5770826 Compare March 11, 2026 21:15