user-guide: add gateway-failover documentation #259
Fredi-raspall wants to merge 2 commits into master
🚀 Deployed on https://preview-259--hedgehog-docs.netlify.app
Pull request overview
This PR extends the user guide to document gateway redundancy/fail-over behavior and integrates the new material into the navigation and existing gateway docs. It also slightly refines existing gateway-related titles to better reflect their scope.
Changes:
- Add a dedicated “Gateway fail-over and redundancy” user-guide page explaining gateway groups, traffic mapping, and fail-over behavior.
- Link the new page from the overview and the `.pages` navigation under a new “Gateway” section.
- Retitle the main gateway and gateway-add docs to “Gateway overview” and “Adding Gateways to the fabric” for clearer context.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| docs/user-guide/overview.md | Adds a TOC entry pointing to the new gateway-failover documentation so users can discover redundancy guidance. |
| docs/user-guide/gateway.md | Renames the main heading to “Gateway overview” to clarify that this page introduces gateway concepts now complemented by a separate fail-over page. |
| docs/user-guide/gateway-failover.md | Introduces detailed documentation for gateway redundancy, gateway groups, traffic mapping, and fail-over behavior, including configuration snippets and design rationale. |
| docs/user-guide/gateway-add.md | Updates the title to “Adding Gateways to the fabric” to align with a more general multi-gateway deployment story. |
| docs/user-guide/.pages | Groups gateway-related docs under a “Gateway” nav section and includes the new fail-over page, improving navigation around gateway topics. |
qmonnet left a comment
Great document!
I'm usually picky with the style in the docs (Logan knows something about it), so I've got tons of nitpicks, but nothing major.
I'd also wrap the text on 80-character lines as I find it easier to diff and work with smaller lines, although I'm not sure we have a consensus about that.
One comment would be to remain careful with the number of admonitions (`!!! note`) in the document. It's good to have a few to insert visual pauses in long sections, but having too many may break the flow. You have quite a number of notes, and I think some of them could be regular paragraphs, which would help with overall readability.
Hey Pau. I'm fine adding a diagram. However I am not sure if it will help too much if:
We could add some representation, but it will mostly need to be manual?
qmonnet left a comment
Looks OK from my side, thanks!
Frostman left a comment
It looks okay other than the primary gateway selection - that has to be fixed, but the docs could be updated later
Signed-off-by: Fredi Raspall <fredi@githedgehog.com>
Signed-off-by: Fredi Raspall <fredi@githedgehog.com>
Gateways implement services that are, in many cases, stateful. To correctly handle flows, the packets in the forward and reverse direction should be processed by the same gateway. The Hedgehog Fabric fail-over strategy is such that only one gateway handles a particular flow at any point in time. Gateway group priorities help to ensure that edge devices participating in a VPC peering select the same gateway. In future releases, it may be possible to balance the traffic of a single VPC peering over multiple gateways.

!!! note
    Since group membership priorities are specified in the gateways themselves (instead of the `GatewayGroup`s), with many groups and gateways, two or more gateways may end up being assigned the same priority in a given group. The fabric will not reject such a configuration: despite having the same priorities, only one of the gateways will be the preferred; the first when ordering the gateways within the group alphabetically by name. This tie-breaking criteria is implemented by all gateways so that only one gateway per group is selected consistently across the fabric.
I think we should clarify here that it applies to when prio are the same, but ok for me
I think we should clarify here that it applies to when prio are the same, but ok for me
Sorry @Frostman . I don't understand your point. Isn't it clear that we're talking about the case when you have two or more gateways with the same priority (and it is higher than the rest)?
... two or more gateways may end up being assigned the same priority in a given group...
... despite having the same priorities, only one of the gateways will be the preferred; the first when ordering the gateways within the group alphabetically by name
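The selection rule discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not the fabric's implementation: the `Member` type, the function name, and the gateway names are hypothetical, and the sketch assumes that a larger priority value is preferred (as the reply above suggests) and that "alphabetically" means plain string ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """A gateway's membership in one group (hypothetical model)."""
    name: str
    priority: int


def preferred_gateway(members):
    """Return the preferred member of a group, or None if the group is empty.

    Assumption: the largest priority wins; among equal priorities, the
    name that sorts first alphabetically wins.
    """
    if not members:
        return None
    # Negating the priority makes min() pick the highest priority first,
    # then the smallest name among ties.
    return min(members, key=lambda m: (-m.priority, m.name))


group = [Member("gw-b", 100), Member("gw-a", 100), Member("gw-c", 50)]
print(preferred_gateway(group).name)  # gw-a
```

Because the key depends only on each member's priority and name, every gateway evaluating the same group reaches the same answer regardless of the order in which it learns about the members.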
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
However:

!!! warning
    Currently, group sizes are limited to 10 members at the most. Such a limit may only affect in case you have more than 10 gateways deployed on the same fabric.
Minor grammar: “at the most” should be “at most”.
Suggested change:
- Currently, group sizes are limited to 10 members at the most. Such a limit may only affect in case you have more than 10 gateways deployed on the same fabric.
+ Currently, group sizes are limited to 10 members at most. Such a limit may only affect in case you have more than 10 gateways deployed on the same fabric.
Gateways implement services that are, in many cases, stateful. To correctly handle flows, the packets in the forward and reverse direction should be processed by the same gateway. The Hedgehog Fabric fail-over strategy is such that only one gateway handles a particular flow at any point in time. Gateway group priorities help to ensure that edge devices participating in a VPC peering select the same gateway. In future releases, it may be possible to balance the traffic of a single VPC peering over multiple gateways.

!!! note
    Since group membership priorities are specified in the gateways themselves (instead of the `GatewayGroup`s), with many groups and gateways, two or more gateways may end up being assigned the same priority in a given group. The fabric will not reject such a configuration: despite having the same priorities, only one of the gateways will be the preferred; the first when ordering the gateways within the group alphabetically by name. This tie-breaking criteria is implemented by all gateways so that only one gateway per group is selected consistently across the fabric.
Minor grammar: “This tie-breaking criteria is …” is ungrammatical (either use singular “criterion” or plural “criteria are”).
Suggested change:
- Since group membership priorities are specified in the gateways themselves (instead of the `GatewayGroup`s), with many groups and gateways, two or more gateways may end up being assigned the same priority in a given group. The fabric will not reject such a configuration: despite having the same priorities, only one of the gateways will be the preferred; the first when ordering the gateways within the group alphabetically by name. This tie-breaking criteria is implemented by all gateways so that only one gateway per group is selected consistently across the fabric.
+ Since group membership priorities are specified in the gateways themselves (instead of the `GatewayGroup`s), with many groups and gateways, two or more gateways may end up being assigned the same priority in a given group. The fabric will not reject such a configuration: despite having the same priorities, only one of the gateways will be the preferred; the first when ordering the gateways within the group alphabetically by name. This tie-breaking criterion is implemented by all gateways so that only one gateway per group is selected consistently across the fabric.
# Adding Gateways to the fabric

This section covers adding a gateway node to an existing Fabric. Gateway nodes provide advanced network services (NAT, PAT, firewalling) by
The new title is plural (“Adding Gateways…”), but the opening sentence still describes adding a single gateway node. Consider aligning the wording (either keep the title singular or update the intro to cover adding one or more gateways) to avoid confusing readers.
items:
- apiVersion: gateway.githedgehog.com/v1alpha1
  kind: GatewayGroup
  metadata:
    name: group-1
    namespace: default
  spec: {}
The GatewayGroup YAML example is not a valid single-object manifest: it uses a top-level items: list without a corresponding kind: List (and likely needs either kind: GatewayGroup directly, or apiVersion: v1 + kind: List, or --- multi-document YAML). As written, users copy/pasting this will get a validation/apply error.
Suggested change:
- items:
- - apiVersion: gateway.githedgehog.com/v1alpha1
-   kind: GatewayGroup
-   metadata:
-     name: group-1
-     namespace: default
-   spec: {}
+ kind: GatewayGroup
+ metadata:
+   name: group-1
+   namespace: default
+ spec: {}
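The point of the comment above can be illustrated with a rough pre-flight check. The sketch assumes the YAML has already been parsed into Python dictionaries by some YAML loader; the function name is hypothetical, and the check only looks for the top-level fields a single API object normally carries. It is not a substitute for real validation by the API server.

```python
REQUIRED_TOP_LEVEL = ("apiVersion", "kind", "metadata")


def is_single_object_manifest(doc):
    """Return True if a parsed YAML document looks like one API object.

    Only the presence of the usual top-level fields is checked.
    """
    return isinstance(doc, dict) and all(key in doc for key in REQUIRED_TOP_LEVEL)


# The snippet as quoted in the docs: a bare `items:` list with no `kind`.
as_pasted = {
    "items": [
        {
            "apiVersion": "gateway.githedgehog.com/v1alpha1",
            "kind": "GatewayGroup",
            "metadata": {"name": "group-1", "namespace": "default"},
            "spec": {},
        }
    ]
}
# The same object lifted out of the list.
single = as_pasted["items"][0]

print(is_single_object_manifest(as_pasted))  # False
print(is_single_object_manifest(single))     # True
```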
!!! note
    The priority assigned to a gateway in a group has no significance in absolute terms. Configuring three gateways in the same group with priorities 300, 200 and 100 has the same effect as configuring them with priorities 51, 29 and 3.
Several admonition bodies are indented with a tab character. MkDocs/Material admonitions require consistent space indentation; tabs can cause the admonition content to render as a code block or not be associated with the admonition at all. Replace the leading tab with 4 spaces in these blocks.
Suggested change:
- 	The priority assigned to a gateway in a group has no significance in absolute terms. Configuring three gateways in the same group with priorities 300, 200 and 100 has the same effect as configuring them with priorities 51, 29 and 3.
+     The priority assigned to a gateway in a group has no significance in absolute terms. Configuring three gateways in the same group with priorities 300, 200 and 100 has the same effect as configuring them with priorities 51, 29 and 3.
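Tab-indented admonition bodies like the one flagged above are easy to miss in review, so a small script can help find them. The following is a minimal sketch, not an existing tool: the function name is hypothetical, and it only recognizes top-level `!!! type` markers, ignoring collapsible (`???`) or nested admonitions.

```python
import re

# Matches a top-level admonition marker such as "!!! note" or "!!! warning".
ADMONITION = re.compile(r"^!!! \w+")


def tab_indented_admonition_lines(text):
    """Return 1-based line numbers of tab-indented lines inside admonitions."""
    hits = []
    in_admonition = False
    for number, line in enumerate(text.splitlines(), start=1):
        if ADMONITION.match(line):
            in_admonition = True
            continue
        if in_admonition:
            if line.startswith("\t"):
                hits.append(number)
            elif line.strip() and not line.startswith("    "):
                # A non-blank line without 4-space indentation ends the body.
                in_admonition = False
    return hits


sample = "!!! note\n\tTab-indented body.\n\n!!! tip\n    Space-indented body.\n"
print(tab_indented_admonition_lines(sample))  # [2]
```

Running it over `docs/user-guide/gateway-failover.md` (read with `open(...).read()`) would list the lines to re-indent with four spaces.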
One consequence of mapping a peering to a non-default `GatewayGroup` is that any gateway that is not a member of that group will not be used to serve the traffic for that peering, even if all gateways in that group become unavailable.

!!! tip
    Gateway groups and the peering mappings can be handy for other purposes. For instance, removing a gateway from a group allows pulling the traffic of all peerings mapped to that group out of that gateway. Or, by adjusting member priorities, traffic can be re-mapped without changing the peering mappings to groups.
These admonition bodies are also tab-indented; use spaces to ensure the content is rendered as part of the admonition (and not as a code block).
Suggested change:
- One consequence of mapping a peering to a non-default `GatewayGroup` is that any gateway that is not a member of that group will not be used to serve the traffic for that peering, even if all gateways in that group become unavailable.
- !!! tip
- 	Gateway groups and the peering mappings can be handy for other purposes. For instance, removing a gateway from a group allows pulling the traffic of all peerings mapped to that group out of that gateway. Or, by adjusting member priorities, traffic can be re-mapped without changing the peering mappings to groups.
+ One consequence of mapping a peering to a non-default `GatewayGroup` is that any gateway that is not a member of that group will not be used to serve the traffic for that peering, even if all gateways in that group become unavailable.
+ !!! tip
+     Gateway groups and the peering mappings can be handy for other purposes. For instance, removing a gateway from a group allows pulling the traffic of all peerings mapped to that group out of that gateway. Or, by adjusting member priorities, traffic can be re-mapped without changing the peering mappings to groups.
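The group-scoped behavior described in the quoted paragraph (gateways outside the mapped group never serve the peering) can be sketched as follows. This is an illustration under stated assumptions rather than the fabric's logic: the function and gateway names are hypothetical, a larger priority is assumed to be preferred with ties broken by name, and fail-over is assumed to move to the best remaining available member.

```python
def serving_gateway(group_members, available):
    """Pick the gateway that serves a peering mapped to one group.

    group_members: dict mapping gateway name -> priority within the group.
    available: set of names of gateways that are currently up, fabric-wide.
    Returns None when no member of the group is available.
    """
    candidates = [
        (-priority, name)
        for name, priority in group_members.items()
        if name in available  # gateways outside the group never qualify
    ]
    if not candidates:
        return None
    # Smallest tuple = highest priority, then first name alphabetically.
    return min(candidates)[1]


group_1 = {"gw-1": 300, "gw-2": 200}
print(serving_gateway(group_1, {"gw-1", "gw-2", "gw-3"}))  # gw-1
print(serving_gateway(group_1, {"gw-2", "gw-3"}))          # gw-2
print(serving_gateway(group_1, {"gw-3"}))                  # None
```

In the last call `gw-3` is up but is not a member of the group, so the peering is left without a gateway, which is the consequence the documentation warns about.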



Closes: #248
Unsure if this closes #249