I’ve never run a big system like this, but like the lead character in the story, I always figured exponential backoff would be enough. Turns out there’s more.

  • RubberElectrons@lemmy.world
    7 hours ago

    All of what you’re saying seems correct. I think this is more of a meta discussion about how (in this case) retries, even with exponential backoff, aren’t a solution by themselves once you look at the system as a whole. Any common solution has interesting hidden caveats, and this is one I personally wasn’t aware of.

    Practically, adding a timeout budget so that the clients themselves just error out (forcing a manual refresh) accomplishes roughly the same thing as what you’re positing.
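
To make the timeout-budget idea concrete, here is a rough sketch of what I mean, not anything from the article. It retries with capped exponential backoff plus jitter, but gives up once an overall deadline would be exceeded. All the names (`call_with_budget`, `BudgetExceeded`, the parameters and their defaults) are made up for illustration, and catching every `Exception` is a simplification; a real client would retry only on errors it knows are transient.

```python
import random
import time


class BudgetExceeded(Exception):
    """Raised when the overall timeout budget runs out before a call succeeds."""


def call_with_budget(op, budget_s=10.0, base_delay_s=0.1, max_delay_s=2.0,
                     clock=time.monotonic, sleep=time.sleep, rng=random.random):
    """Call op() until it succeeds or the total time budget is spent.

    clock, sleep, and rng are injectable so the behavior can be tested
    without real waiting (hypothetical design choice for this sketch).
    """
    deadline = clock() + budget_s
    attempt = 0
    while True:
        try:
            return op()
        except Exception as exc:  # simplification: treats every error as retryable
            # Jittered, capped exponential backoff: wait a random fraction of
            # min(max_delay_s, base_delay_s * 2**attempt).
            delay = rng() * min(max_delay_s, base_delay_s * 2 ** attempt)
            # If the next wait would cross the deadline, stop retrying and
            # surface the error so the user can decide to refresh manually.
            if clock() + delay >= deadline:
                raise BudgetExceeded(
                    f"gave up after {attempt + 1} attempts"
                ) from exc
            sleep(delay)
            attempt += 1
```

The point of the budget is that a struggling server stops receiving retries from a given client after a bounded time, instead of every client retrying indefinitely at the maximum backoff interval.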