Skip to content

Fix multiget socket corruption#1

Open
cornu-ammonis wants to merge 3 commits into
mainfrom
fix-multiget-socket-corruption
Open

Fix multiget socket corruption#1
cornu-ammonis wants to merge 3 commits into
mainfrom
fix-multiget-socket-corruption

Conversation

@cornu-ammonis
Copy link
Copy Markdown
Owner

No description provided.

Currently there is a race condition: if something happens between make_getkq_requests and finish_queries, the getkq request responses will be written to the socket, but the sockets @request_in_progress flag will not be set. This means that future gets may read corrupted/incorrect socket data.

Instead we should conceive of the "request" as starting as soon as we go to write the getkq to the servers. Therefore we should verify the op and mark start_request! prior to sending the getkq ops.
so we can start_request! before get_multi's write operations
@cornu-ammonis
Copy link
Copy Markdown
Owner Author

cornu-ammonis commented May 15, 2023

Hi @casperisfine - thanks so much for being willing to help out with this. I'd love to help in any way I can, but I don't feel super comfortable engaging on the upstream repo given the level of hostility/unpleasantness I've met there. I also doubt it would be helpful - I'm concerned that my direct involvement would decrease the chances that any PR is evaluated fairly and merged.

So I figured maybe I could work with you to help polish up an implementation, and then you could carry it over the finish line? Let me know what you think.

Tagging on this PR because it is in some ways similar to what you proposed. (This PR is not complete in that it only addresses it for multiget, the regular get implementation would need similar treatment.)

As you can see, there's an existing @request_in_progress state flag. The wrinkle is that it is set to false between writing an op to memcached and reading its response. Not entirely sure why it works that way - I did open this PR in the main repo also, but did not receive substantive feedback - so maybe it would be best to introduce a new variable.

Was trying to think where the best place to put the new variable and check it would be. Working backwards, maybe we could do a similar check here https://github.com/petergoldstein/dalli/blob/0483446acf82918636990c2839ca042e0e95c656/lib/dalli/protocol/connection_manager.rb#L101 for a pending read. Then we could set pending read to true after verifying state, similar to how start_request! is called in a few places. The problem then is that for multiget, we verify state halfway through - as we setup reading responses rather than setup writing for the keys. So I think like I did in this PR, that might have to move around a little bit.

Curious what you think, maybe there's a better place to put the logic that would require changing less existing code.

@casperisfine
Copy link
Copy Markdown

I submitted what I think could be an acceptable patch.

@cornu-ammonis
Copy link
Copy Markdown
Owner Author

cornu-ammonis commented May 16, 2023

Thanks @casperisfine - looks great mostly. I think there's one missing piece, it seems that moving start_request! before write_noop isn't sufficient to protect the multiget flow.

If the interruption happens after make_getkq_requests but before pipeline_response_setup (i.e. try raising the interrupt right here) the multiget values will still end up on the socket when doing subsequent gets, but @request_in_prgress will be false. Please let me know if I'm missing something 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants