Worker Pool in Go

How to use the worker pool concurrency pattern in Go

Apr 07, 2024

Go schedules goroutines without considering how much memory, CPU, or other resources you have left. Each new goroutine starts with a small stack (about 2KB) that grows and shrinks as needed, which makes goroutines very lightweight compared to threads in many other languages. That doesn't mean you can spawn billions of goroutines at the same time, though. Our server or computer has limited resources.

The worker pool is a concurrency pattern that allows us to limit the number of goroutines running at the same time. Here's the illustration [1] for the pattern:

When you use concurrency in production, consider using this pattern. In real life, we can't predict how many goroutines will be running at the same time: as your user base grows, that number grows too. So it's better to have a system that lets us define how many goroutines can run at the same time.

In this article, we'll see how to use the worker pool in Go with a basic program example.

Let’s say we have a basic example program to print a list of numbers:

func main() {
  nums := []int{1, 2, 3, 4, 5}

  var wg sync.WaitGroup
  for _, num := range nums { // range yields (index, value); we want the value
    wg.Add(1)
    go func(num int) {
      defer wg.Done()
      fmt.Println(num)
    }(num)
  }

  wg.Wait()
}

In this example, five goroutines run concurrently, one per number. Now, let's implement a worker pool pattern with a limit of two goroutines.

func main() {
  nums := []int{1, 2, 3, 4, 5}

  var wg sync.WaitGroup
  wg.Add(2)
  go printNums(nums[:3], &wg)
  go printNums(nums[3:], &wg)

  wg.Wait()
}

func printNums(nums []int, wg *sync.WaitGroup) {
  for _, num := range nums {
    fmt.Println(num)
  }

  wg.Done()
}

As you can see, we only start two goroutines, and the WaitGroup counter is set to 2 so that main waits for both of them. The first goroutine prints the first three elements, and the second one prints the rest.

This code gets the job done, but it becomes hard to maintain when we want to adjust the limit. Say we want to change the limit from 2 to 3: we would have to re-slice nums by hand, add another go statement, and update the WaitGroup count.
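To get a feel for that bookkeeping, here is a rough sketch of what an adjustable limit could look like without channels. The splitChunks helper is hypothetical (it is not part of the original program) and it assumes limit is greater than zero.

```go
package main

import (
	"fmt"
	"sync"
)

// splitChunks divides nums into at most `limit` contiguous chunks.
// Hypothetical helper for illustration; assumes limit > 0.
func splitChunks(nums []int, limit int) [][]int {
	var chunks [][]int
	size := (len(nums) + limit - 1) / limit // ceiling division
	for start := 0; start < len(nums); start += size {
		end := start + size
		if end > len(nums) {
			end = len(nums)
		}
		chunks = append(chunks, nums[start:end])
	}
	return chunks
}

func main() {
	nums := []int{1, 2, 3, 4, 5}

	var wg sync.WaitGroup
	// One goroutine per chunk, so at most 3 goroutines run here.
	for _, chunk := range splitChunks(nums, 3) {
		wg.Add(1)
		go func(chunk []int) {
			defer wg.Done()
			for _, num := range chunk {
				fmt.Println(num)
			}
		}(chunk)
	}
	wg.Wait()
}
```

It works, but the slice arithmetic is extra code we have to get right, and the work is divided up front, so a goroutine that finishes early can't help with another chunk.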

The other way is to use a channel.

A channel is Go's built-in type for communicating between goroutines.

Here's the basic usage of a channel:

func main() {
  message := make(chan string)

  go func() {
    message <- "ping"
  }()

  fmt.Println(<-message) // ping
}

The main function in Go runs in its own goroutine, which means every Go program always has at least one goroutine.

In this example, we have two goroutines: the main function and the anonymous function. The anonymous function sends the "ping" message to the channel, while the main goroutine creates the channel and prints the message.

As you can see, we don't need a WaitGroup to wait for the goroutine here. That's because a receive such as <-message blocks by default until a value is available on the channel.

This is the reason why we'll get a deadlock in this example:

func main() {
  message := make(chan string)

  fmt.Println(<-message) // fatal error: all goroutines are asleep - deadlock!
}

That's because main is waiting for a message, but there is no other goroutine to send one.

We can use “for range” to print all the messages:

func main() {
  messages := make(chan string)

  go func() {
    messages <- "ping"
    messages <- "pong"
    close(messages)
  }()

  for msg := range messages {
    fmt.Println(msg)
  }
}

ping
pong

But when we want to use "for range", we need to close the channel. Closing a channel tells Go that no more values will be sent on it. Sending to a closed channel causes a panic. Closing the channel also lets "for range" know when to exit: the loop ends once the channel is closed and all remaining values have been received.
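Here is a minimal sketch of both behaviors. Receiving from a closed channel returns the zero value with ok set to false, and sending to a closed channel is a runtime panic, which this sketch recovers from only so that it can report it. The function names are illustrative.

```go
package main

import "fmt"

// recvAfterClose receives from a channel that is already closed.
func recvAfterClose() (string, bool) {
	messages := make(chan string)
	close(messages)
	// Does not block: yields the zero value and ok == false.
	msg, ok := <-messages
	return msg, ok
}

// sendAfterClose sends on a closed channel and reports whether it panicked.
func sendAfterClose() (panicked bool) {
	messages := make(chan string)
	close(messages)
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	messages <- "ping" // panics: send on closed channel
	return false
}

func main() {
	msg, ok := recvAfterClose()
	fmt.Printf("%q %v\n", msg, ok) // "" false
	fmt.Println(sendAfterClose()) // true
}
```

In real code the usual rule is that only the sender closes the channel, so a send after close never happens in the first place.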

The "for range" also works when we reverse the roles, so that main sends the messages and another goroutine receives them:

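func main() {
  messages := make(chan string)
  done := make(chan struct{})

  go printer(messages, done)

  messages <- "ping"
  messages <- "pong"
  close(messages)

  <-done // wait for printer, or main may exit before "pong" is printed
}

func printer(messages chan string, done chan struct{}) {
  for msg := range messages {
    fmt.Println(msg)
  }
  close(done) // signal that every message has been printed
}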

ping
pong

Because a receiver always waits for a sender, the printer function acts like a listener in this example. The flow is: main sends a message, printer receives and prints it, and the loop in printer ends once main closes the channel.

Now, let's use a channel to write the worker pool.

We'll use the term "job" for a single task. Because we want to print 5 numbers, our program will have 5 jobs. A "worker" is a goroutine that processes jobs. Because our pool limit is two, we'll have two workers.

func main() {
  nums := []int{1, 2, 3, 4, 5}
  jobs := make(chan int)
  pool := 2

  var wg sync.WaitGroup
  wg.Add(pool)

  // Spawn workers
  for i := 1; i <= pool; i++ {
    go func() {
      for job := range jobs {
        fmt.Println(job)
      }
      wg.Done()
    }()
  }

  // Generating jobs
  for _, num := range nums {
    jobs <- num
  }
  close(jobs)

  wg.Wait()
}

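In this example, we have two workers that keep listening to the jobs channel. Whenever a new value is sent to the channel, one of the workers receives and processes it.

Go guarantees that each value sent on a channel is received by exactly one receiver, so two workers never process the same job. We can see this by printing the worker number with each job: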

// Spawn workers
for i := 1; i <= pool; i++ {
  go func(worker int) {
    for job := range jobs {
      fmt.Printf("Job %d is done by worker %d\n", job, worker)
    }
    wg.Done()
  }(i)
}

Job 1 is done by worker 2
Job 2 is done by worker 1
Job 4 is done by worker 1
Job 5 is done by worker 1
Job 3 is done by worker 2

Note that the order of the output, and which worker handles which job, can differ from run to run. Now our code is more maintainable: we only need to update the "pool" variable when we want to adjust the limit.

pool := 3

Job 1 is done by worker 3
Job 3 is done by worker 2
Job 5 is done by worker 2
Job 4 is done by worker 3
Job 2 is done by worker 1
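If you want to check the "no duplicates" behavior yourself, here is one possible sketch that counts how many times each job is handled. The runPool function, the counting map, and the mutex are additions for this check and are not part of the program above.

```go
package main

import (
	"fmt"
	"sync"
)

// runPool processes nums with `pool` workers and returns how many
// times each job was handled. Illustrative helper for this check.
func runPool(nums []int, pool int) map[int]int {
	jobs := make(chan int)
	handled := make(map[int]int)
	var mu sync.Mutex // guards handled, which every worker writes to

	var wg sync.WaitGroup
	wg.Add(pool)

	// Spawn workers
	for i := 1; i <= pool; i++ {
		go func() {
			defer wg.Done()
			for job := range jobs {
				mu.Lock()
				handled[job]++
				mu.Unlock()
			}
		}()
	}

	// Generating jobs
	for _, num := range nums {
		jobs <- num
	}
	close(jobs)

	wg.Wait()
	return handled
}

func main() {
	handled := runPool([]int{1, 2, 3, 4, 5}, 3)
	fmt.Println(len(handled), handled[1], handled[5]) // 5 1 1
}
```

Every job shows up with a count of 1 regardless of the pool size, because a value sent on the channel is handed to only one of the waiting workers.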

That's all for this article. We've seen how to use the worker pool pattern in Go. I hope you got something out of reading it. Happy hacking, and see you in the next one.

[1] The illustration is taken from this article.
