Replace cursor RowLock with FOR SHARE NOWAIT #386
Conversation
Why? Mostly to allow enabling the super_read_only variable for source DBs. Since this brings the cursor closer to gh-ost implementation (https://github.com/github/gh-ost/blob/801ebabe343621fc06c92667fdb404f24bc605c0/go/sql/builder.go#L216-L218), I chose gh-ost's retry logic (60 times, 1s linear backoff) as a starting point. Had to introduce a 10ms sleep in the IntegrationTest setup to ensure it's more realistic.
ff5b2b6 to
e8ffb0f
Compare
| panic(err) | ||
| } | ||
|
|
||
| time.Sleep(10 * time.Millisecond) |
There was a problem hiding this comment.
Added this to make the DataWriter more realistic, some integration tests (example) start it with NumberOfWriters: 4 and that's too much for NOWAIT to work without sometimes taking way too long to obtain a lock.
I'm wondering if this is a signal to consider just a FOR SHARE lock 🤔
There was a problem hiding this comment.
FOR SHARE is equivalent to what we're doing right now, so it's not a step down, right?
I mean, we haven't seen problems with ghostferry holding on to locks so far, AFAIK, so it wouldn't be the worst thing to keep doing it. However, if the goal is the minimize production impact than not holding on to locks would be better.
There was a problem hiding this comment.
We could also start with FOR SHARE to unblock super_read_only and then investigate NOWAIT later.
There was a problem hiding this comment.
Yep, I agree that FOR SHARE is a safer option, but also don't have new ideas on how to de-risk FOR SHARE NOWAIT further. I think it's safe enough in a way that if the cursor fails to obtain a lock (which I doubt, the logic is equal to gh-ost's iterator now, which isn't even configurable), we can guarantee that any data will remain unaffected and ghostferry may just fail to run.
Do you have any objections against shipping with NOWAIT (or how we could test it further while shipping FOR SHARE first)?
There was a problem hiding this comment.
I think it's fine to go with NOWAIT. 👍
|
|
||
| if c.DBReadRetries == 0 { | ||
| c.DBReadRetries = 5 | ||
| c.DBReadRetries = 60 |
There was a problem hiding this comment.
And just double-checked that the config is currently @ default in core (components/shop_mover/app/utils/podding/shop_mover/ghostferry/config.rb)
|
|
||
| if c.RowLock { | ||
| selectBuilder = selectBuilder.Suffix("FOR UPDATE") | ||
| mySqlVersion, err := c.DB.QueryMySQLVersion() |
There was a problem hiding this comment.
Thought it's best to just ask the DB 🙃
Closes https://github.com/Shopify/db-mobility/issues/778, relates to #381
Allows enabling the
super_read_onlyvariable for source DBs. Since this brings the cursor closer togh-ostimplementation (https://github.com/github/gh-ost/blob/801ebabe343621fc06c92667fdb404f24bc605c0/go/sql/builder.go#L216-L218), I chose gh-ost's retry logic (60 times, 1s linear backoff) as a starting point. Had to introduce a 10ms sleep in theIntegrationTestsetup to ensure it's more realistic, which raises some questions about likelihood of erroring after retries are exhausted 🤔Feedback welcome, do not hesitate to share ideas on how to tophat better than with unit tests.