Feat/ota web #17
Merged
Added functionality to support pull-based OTA updates by fetching firmware from a manifest. The new `otaFromManifest` method allows the system to check for available updates and flash the firmware if necessary. This enhancement improves the update process for observer builds using the MQTT bridge, ensuring a more seamless firmware management experience.
Updated the otaFromManifest method to enforce HTTP/1.0 for better compatibility with CDNs and to handle empty manifest responses. This ensures that the JSON parser receives a complete body, preventing errors during firmware update checks.
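To illustrate why a chunked HTTP/1.1 body cannot be handed to a JSON parser as-is, here is a minimal sketch of the framing that HTTP/1.0 avoids. `decodeChunked` is a hypothetical helper written for this illustration and is not part of the PR, which sidesteps the problem by requesting HTTP/1.0 instead of decoding chunks.

```cpp
#include <cstddef>
#include <string>

// Illustrative only (hypothetical helper, not code from this PR): undo
// HTTP/1.1 chunked framing on a fully buffered body. Each chunk is
// "<hex size>\r\n<payload>\r\n"; a zero-size chunk ends the body.
// Returns false if the framing is malformed.
bool decodeChunked(const std::string& raw, std::string& out) {
    out.clear();
    std::size_t pos = 0;
    while (true) {
        const std::size_t eol = raw.find("\r\n", pos);
        if (eol == std::string::npos) return false;

        // The size field is hexadecimal; drop any chunk extension after ';'.
        std::string sizeField = raw.substr(pos, eol - pos);
        sizeField = sizeField.substr(0, sizeField.find(';'));

        std::size_t used = 0;
        unsigned long size = 0;
        try {
            size = std::stoul(sizeField, &used, 16);
        } catch (...) {
            return false;  // empty or non-hex size field
        }
        if (used != sizeField.size()) return false;

        pos = eol + 2;
        if (size == 0) return true;  // last chunk; trailers are ignored

        // Payload plus its trailing CRLF must fit in the buffer.
        const std::size_t remaining = raw.size() - pos;
        if (size > remaining || remaining - size < 2) return false;
        out.append(raw, pos, size);
        if (raw.compare(pos + size, 2, "\r\n") != 0) return false;
        pos += size + 2;
    }
}
```

The raw stream starts with a hex length line rather than `{`, which is what a parser reading it directly would trip over.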
Updated the otaFromManifest method to stream-parse the firmware manifest directly from the network, reducing peak RAM usage during OTA checks. This change enhances compatibility with slow TLS links by implementing a per-read timeout, ensuring a more efficient and reliable update process.
Updated the `startOTAUpdate` method to serve ElegantOTA on the station IP when the device is connected to a WiFi network, so the upload page is reachable without switching networks. If not connected, it falls back to the MeshCore-OTA SoftAP.
Added support for deferred OTA updates in the MyMesh class, allowing the system to schedule firmware updates to occur after a confirmation reply is sent. This change improves the user experience by ensuring that the update process does not block the main application loop, allowing for smoother operation during firmware updates.
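A minimal sketch of the deferral idea, assuming a one-shot timer that the main loop polls. `DeferredAction` and its method names are hypothetical and are not taken from `MyMesh`; the sketch only shows how a flash can be scheduled a fixed delay after the reply is queued.

```cpp
#include <cstdint>

// Hypothetical helper (not the PR's code): a one-shot timer polled from the
// application loop, using a millis()-style 32-bit millisecond clock.
class DeferredAction {
public:
    // Arm the timer to fire `delayMs` after `nowMs`.
    void schedule(uint32_t nowMs, uint32_t delayMs) {
        dueAt_ = nowMs + delayMs;  // unsigned addition wraps naturally
        armed_ = true;
    }

    // Returns true exactly once: on the first poll at or after the due time.
    bool poll(uint32_t nowMs) {
        if (!armed_) return false;
        // The signed difference stays correct across the 32-bit clock wrap.
        if (static_cast<int32_t>(nowMs - dueAt_) < 0) return false;
        armed_ = false;
        return true;
    }

    bool pending() const { return armed_; }

private:
    uint32_t dueAt_ = 0;
    bool armed_ = false;
};
```

In a loop this would look like `if (otaTimer.poll(millis())) { /* start the flash */ }`, with `schedule(millis(), 2500)` called right after the confirmation reply is queued.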
Summary
Adds a pull-based OTA path for observer (MQTT-bridge) builds: a node fetches its own
firmware from the web-flasher manifest and flashes itself over its existing WiFi
connection — no cable, no manual upload, triggerable remotely over the mesh.
Kept separate from `start ota` (the ElegantOTA web-upload SoftAP), which is unchanged, so nobody expecting to hand-upload a binary triggers a silent online update.
Commands (observer builds only)
| Command | Behavior |
| --- | --- |
| `ota check` | `current -> available` hash + partition-change status. No flash. Synchronous. |
| `ota update` | Replies `Beginning update...` immediately, then flashes and reboots. |
| `start ota` | Unchanged: ElegantOTA web upload. |

On non-observer builds, `ota check` / `ota update` return `ERR: online OTA not supported on this build`.

How it works
The observer already has everything needed — a live WiFi STA link (for MQTT), the embedded
root-CA bundle, dual OTA partitions, and ArduinoJson — so no new dependencies.
1. Fetch the manifest (`OTA_MANIFEST_URL`, baked in at compile time) over TLS verified against the embedded CA bundle.
2. Select the `flash-update` (app-only) `.bin` whose name matches this build's variant (`OTA_VARIANT`, the PlatformIO env name injected by `build.sh`).
3. Compare its hash with the running build (`FIRMWARE_VERSION`), by shared prefix so a 7-char CI hash and an 8-char local hash for the same commit aren't seen as different. Reports / skips if already up to date.
4. Flash via `HTTPUpdate` and reboot.

No operator-supplied URLs: host and target are resolved entirely from baked-in config plus the manifest.
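The shared-prefix comparison in step 3 could look roughly like the sketch below. `sameCommit` and the 7-character minimum are assumptions made for illustration; the PR does not show its actual comparison routine. Hashes are assumed to be lowercase hex.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not the PR's code): two abbreviated commit hashes are
// treated as the same build when the shorter one is a prefix of the longer.
// `minLen` is an assumed guard so empty or very short strings never match.
bool sameCommit(const std::string& a, const std::string& b,
                std::size_t minLen = 7) {
    const std::size_t n = std::min(a.size(), b.size());
    if (n < minLen) return false;
    return a.compare(0, n, b, 0, n) == 0;
}
```

With this rule, `454afec` (CI) and `454afec1` (local) count as the same commit, while `dec66838` and `454afec` do not.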
Robustness (the non-obvious parts, all hardware-driven)
Getting this reliable on a no-PSRAM ESP32-S3 (Heltec V3) surfaced several issues worth
flagging for review:
- The OTA operation overflowed the stack when reached via the deep mesh-receive call chain (canary panic). It now runs in a dedicated 24 KB-stack FreeRTOS task, spawned per-operation and freed after.
- The manifest is served as HTTP/1.1 with `Transfer-Encoding: chunked` and no `Content-Length`; the raw chunked stream corrupts the JSON parse. Fixed with `http.useHTTP10(true)` → unframed, `Connection: close` body, stream-parsed with an ArduinoJson filter so peak RAM is the kept subset (~12 KB), not the ~40 KB manifest.
- Running the fetch alongside the bridge's TLS sessions drives free heap to a few hundred bytes and truncates the read. Both `ota check` and `ota update` stop the MQTT bridge first (its TLS contexts + task) for headroom; the WiFi STA link survives `end()`.
- The confirmation reply can't go out inline: `ota update` schedules the flash ~2.5 s out (via the app loop) so the `Beginning update...` confirmation transmits over LoRa first.

Bridge restart fixes (in `MQTTBridge`)

Stopping/restarting the bridge for OTA exposed two pre-existing bugs in `initializeWiFiInTask()`, now fixed (these also benefit `set mqtt…` / `restartBridge`):

- Skip `WiFi.begin()` when already connected. `end()` leaves WiFi up, so the restart was forcing a needless disconnect/reconnect that also raced the MQTT task's first DNS lookup (`getaddrinfo() returns 202` / `esp-tls 0x8001`). Slot setup still fires because `_ntp_synced` persists across `end()`.
- Register the `WiFi.onEvent` handler once. It was re-registered on every restart and never removed, a handler leak that duplicated every connect/disconnect log line.
Result: after an `ota check`, the bridge comes back with only the MQTT sessions reconnecting: no WiFi flap, no NTP re-sync, no DNS errors, single log lines.
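The two bridge restart fixes amount to making WiFi init idempotent. The sketch below shows that pattern against a stand-in driver so it runs off-device; `FakeWiFi` and `WiFiInit` are hypothetical names, and the real `initializeWiFiInTask()` is not reproduced here.

```cpp
// Stand-in for the WiFi driver (hypothetical type, counts calls only).
struct FakeWiFi {
    bool connected = false;
    int beginCalls = 0;
    int handlerCount = 0;

    void begin() { ++beginCalls; connected = true; }
    void onEvent() { ++handlerCount; }
};

// Hypothetical init routine that is safe to call on every bridge restart.
class WiFiInit {
public:
    void run(FakeWiFi& wifi) {
        if (!handlerRegistered_) {   // register the event handler only once
            wifi.onEvent();
            handlerRegistered_ = true;
        }
        if (!wifi.connected) {       // leave an existing STA link untouched
            wifi.begin();
        }
    }

private:
    bool handlerRegistered_ = false; // must persist across stop/start cycles
};
```

Calling `run()` three times in a row results in one `begin()` and one registered handler, which matches the single log lines described in the result.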
Safety
- `ota update` is admin + ACL gated in the mesh receive path.
- TLS is verified (no `setInsecure()`) for both manifest and binary.
- Only the `flash-update` artifact is fetched, never `-merged.bin`.
- Builds marked `partition-change` in the manifest are refused (OTA can't rewrite the partition table; those still need a cable/erase flash).
- `HTTPUpdate`/`Update` writes only the inactive OTA slot and commits the boot pointer (`otadata`) only after a complete, validated write (size + MD5 + image-header check). A failed/truncated/wrong-chip download is rejected without rebooting and the bridge is resumed, so the node keeps booting the working partition. (ESP-IDF post-boot auto-rollback is not enabled: a build that passes validation but is functionally broken would not auto-revert; noted as future hardening.)
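The artifact and partition-change rules above can be sketched as a small decision function. `ManifestEntry`, its field names, and the order of the checks are assumptions for illustration; they are not the real manifest schema or the PR's control flow.

```cpp
#include <string>

enum class OtaDecision { UpToDate, RefusePartitionChange, Flash };

// Hypothetical manifest entry; field names are illustrative only.
struct ManifestEntry {
    std::string file;       // artifact file name
    bool sameVersion;       // result of the version comparison
    bool partitionChange;   // manifest flags a partition-table change
};

// True when `s` ends with `suffix`.
static bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Only app-only images are eligible; full "-merged.bin" images are never
// fetched.
bool isEligibleArtifact(const std::string& file) {
    return endsWith(file, ".bin") && !endsWith(file, "-merged.bin");
}

// Assumed ordering: nothing to do if already current, then refuse builds
// that need a partition-table rewrite, otherwise flash.
OtaDecision decide(const ManifestEntry& e) {
    if (e.sameVersion) return OtaDecision::UpToDate;
    if (e.partitionChange) return OtaDecision::RefusePartitionChange;
    return OtaDecision::Flash;
}
```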
Also in this PR
- `start ota` station IP: on a WiFi-connected device, ElegantOTA is now served on the station IP (reachable from the LAN) instead of always raising the SoftAP and reporting `192.168.4.1`.

Build config

- `build.sh` injects `-DOTA_VARIANT='"<env>"'` alongside `FIRMWARE_VERSION`.
- `-D OTA_MANIFEST_URL='"https://observer.gessaman.com/config.json"'` added to all 28 observer envs (1:1 with `WITH_MQTT_BRIDGE`).

Testing
Verified on hardware (Heltec V3 observer, no PSRAM):
- `ota check` → `update available: dec66838 -> 454afec`, reliably, with min free heap ~63 KB during the fetch and a clean bridge restart afterward (no WiFi flap / DNS errors).
- `ota update` → `Beginning update...` ack received over LoRa, then download → flash → reboot onto the new build.
- `start ota` → ElegantOTA SoftAP, now reporting the correct station IP.
- `heltec_v4` and `Heltec_v3` observer envs: partition dump confirms dual OTA slots.

Not yet exercised: no-op "already up to date" path, `partition-change` refusal, and a download-failure (rollback) path.
Files changed
OTA feature: `build.sh`, `src/MeshCore.h`, `src/helpers/ESP32Board.{cpp,h}`, `src/helpers/CommonCLI.{cpp,h}`, `examples/simple_repeater/MyMesh.{cpp,h}`, and `OTA_MANIFEST_URL` across 12 observer variant `platformio.ini` files.

Bridge restart fixes: `src/helpers/bridges/MQTTBridge.{cpp,h}`, slightly outside the OTA feature but required to make the bridge bounce clean; they improve every restart path.