How does the exponential backoff configured in Google Pub/Sub's RetryPolicy work?

Source: stackoverflow

Date: 2024-02-27 21:09:25

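Question

The cloud.google.com/go/pubsub library recently released (v1.5.0, see https://github.com/googleapis/google-cloud-go/releases/tag/pubsub%2fv1.5.0) support for the new server-side RetryPolicy feature. The current documentation can be found at https://godoc.org/cloud.google.com/go/pubsub#retrypolicy.

I have read the Wikipedia article on exponential backoff, and although it describes exponential backoff in discrete time, I don't see how it relates specifically to the MinimumBackoff and MaximumBackoff parameters. For guidance I consulted the documentation of github.com/cenkalti/backoff, https://pkg.go.dev/github.com/cenkalti/backoff/v4?tab=doc#exponentialbackoff. That library defines ExponentialBackOff as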

type ExponentialBackOff struct {
    InitialInterval     time.Duration
    RandomizationFactor float64
    Multiplier          float64
    MaxInterval         time.Duration
    // After MaxElapsedTime the ExponentialBackOff returns Stop.
    // It never stops if MaxElapsedTime == 0.
    MaxElapsedTime time.Duration
    Stop           time.Duration
    Clock          Clock
    // contains filtered or unexported fields
}

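where each randomized interval is computed as

randomized interval =
    RetryInterval * (random value in range [1 - RandomizationFactor, 1 + RandomizationFactor])

where RetryInterval is the current retry interval which, as I understand it, starts at the value of InitialInterval and is capped at MaxInterval.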

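Am I right in understanding that MinimumBackoff and MaximumBackoff correspond to InitialInterval and MaxInterval in github.com/cenkalti/backoff? That is, MinimumBackoff is the initial wait period and MaximumBackoff is the longest time allowed between retries?

To test my theory, I wrote the following simplified program: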

package main

import (
    "context"
    "flag"
    "log"
    "time"

    "cloud.google.com/go/pubsub"
)

var (
    projectID                      string
    minimumBackoff, maximumBackoff time.Duration
)

const (
    topicName             = "test-topic"
    subName               = "test-subscription"
    defaultMinimumBackoff = 10 * time.Second
    defaultMaximumBackoff = 10 * time.Minute
)

func main() {
    flag.StringVar(&projectID, "projectID", "my-project", "Google Project ID")
    flag.DurationVar(&minimumBackoff, "minimumBackoff", 5*time.Second, "minimum backoff")
    flag.DurationVar(&maximumBackoff, "maximumBackoff", 60*time.Second, "maximum backoff")
    flag.Parse()
    log.Printf("Running with minumum backoff %v and maximum backoff %v...", minimumBackoff, maximumBackoff)

    // The retry policy is applied server-side to the subscription.
    retryPolicy := &pubsub.RetryPolicy{MinimumBackoff: minimumBackoff, MaximumBackoff: maximumBackoff}

    client, err := pubsub.NewClient(context.Background(), projectID)
    if err != nil {
        log.Fatalf("NewClient: %v", err)
    }

    topic, err := client.CreateTopic(context.Background(), topicName)
    if err != nil {
        log.Fatalf("CreateTopic: %v", err)
    }
    log.Printf("Created topic %q", topicName)
    defer func() {
        topic.Stop()
        if err := topic.Delete(context.Background()); err != nil {
            log.Fatalf("Delete topic: %v", err)
        }
        log.Printf("Deleted topic %s", topicName)
    }()

    sub, err := client.CreateSubscription(context.Background(), subName, pubsub.SubscriptionConfig{
        Topic:       topic,
        RetryPolicy: retryPolicy,
    })
    if err != nil {
        log.Fatalf("CreateSubscription: %v", err)
    }
    log.Printf("Created subscription %q", subName)
    defer func() {
        if err := sub.Delete(context.Background()); err != nil {
            log.Fatalf("Delete subscription: %v", err)
        }
        log.Printf("Deleted subscription %q", subName)
    }()

    // Nack every delivery so that the retry policy is triggered each time.
    go func() {
        sub.Receive(context.Background(), func(ctx context.Context, msg *pubsub.Message) {
            log.Printf("Nacking message: %s", msg.Data)
            msg.Nack()
        })
    }()

    topic.Publish(context.Background(), &pubsub.Message{Data: []byte("Hello, world!")})
    log.Println("Published message")
    // Observe redeliveries for one minute before the deferred cleanup runs.
    time.Sleep(60 * time.Second)
}

If I run it with the default MinimumBackoff and MaximumBackoff of 5 seconds and 60 seconds respectively, I get the following output:

> go run main.go
2020/07/29 18:49:32 Running with minumum backoff 5s and maximum backoff 1m0s...
2020/07/29 18:49:33 Created topic "test-topic"
2020/07/29 18:49:34 Created subscription "test-subscription"
2020/07/29 18:49:34 Published message
2020/07/29 18:49:36 Nacking message: Hello, world!
2020/07/29 18:49:45 Nacking message: Hello, world!
2020/07/29 18:49:56 Nacking message: Hello, world!
2020/07/29 18:50:06 Nacking message: Hello, world!
2020/07/29 18:50:17 Nacking message: Hello, world!
2020/07/29 18:50:30 Nacking message: Hello, world!
2020/07/29 18:50:35 Deleted subscription "test-subscription"
2020/07/29 18:50:35 Deleted topic test-topic

whereas if I run it with a MinimumBackoff and MaximumBackoff of 1 second and 2 seconds respectively, I get

> go run main.go --minimumBackoff=1s --maximumBackoff=2s
2020/07/29 18:50:42 Running with minumum backoff 1s and maximum backoff 2s...
2020/07/29 18:51:11 Created topic "test-topic"
2020/07/29 18:51:12 Created subscription "test-subscription"
2020/07/29 18:51:12 Published message
2020/07/29 18:51:15 Nacking message: Hello, world!
2020/07/29 18:51:18 Nacking message: Hello, world!
2020/07/29 18:51:21 Nacking message: Hello, world!
2020/07/29 18:51:25 Nacking message: Hello, world!
2020/07/29 18:51:28 Nacking message: Hello, world!
2020/07/29 18:51:31 Nacking message: Hello, world!
2020/07/29 18:51:35 Nacking message: Hello, world!
2020/07/29 18:51:38 Nacking message: Hello, world!
2020/07/29 18:51:40 Nacking message: Hello, world!
2020/07/29 18:51:44 Nacking message: Hello, world!
2020/07/29 18:51:47 Nacking message: Hello, world!
2020/07/29 18:51:50 Nacking message: Hello, world!
2020/07/29 18:51:52 Nacking message: Hello, world!
2020/07/29 18:51:54 Nacking message: Hello, world!
2020/07/29 18:51:57 Nacking message: Hello, world!
2020/07/29 18:52:00 Nacking message: Hello, world!
2020/07/29 18:52:03 Nacking message: Hello, world!
2020/07/29 18:52:06 Nacking message: Hello, world!
2020/07/29 18:52:09 Nacking message: Hello, world!
2020/07/29 18:52:12 Nacking message: Hello, world!
2020/07/29 18:52:13 Deleted subscription "test-subscription"
2020/07/29 18:52:13 Deleted topic test-topic

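In the latter example the time between nacks is very consistently ~3 seconds, which presumably represents a "best effort" to stay within the 2-second MaximumBackoff? What is still unclear to me is whether there is any randomization, whether there is a multiplier (judging from the first example, the time between retries does not seem to double each time), and whether there is an equivalent of MaxElapsedTime beyond which retries stop?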


Solution


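The retry policy's minimum backoff and maximum backoff fields are analogous to InitialInterval and MaxInterval in your example above. Cloud Pub/Sub uses a formula similar to the one you mentioned to calculate the exponential delay, and it includes randomization as well.

Once the delay reaches MaxInterval, each subsequent retry is delayed by MaxInterval. If you want to stop retrying after a certain number of attempts, we recommend using Dead Letter Queues.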
