A collection of best practices, conventions, patterns, tips and gotchas around golang and building modules
- Completely new to golang?
- How-to write a proper golang module
- Unit testing
- Integration tests
- Go routines
- Go channels
- Synchronization techniques
If you are completely new to golang, please first read and work through GoByExample. It is the best introduction to golang syntax and concepts. Once you have worked through that, you are essentially able to write any program in go. But as always when learning a new language, applying the language features and concepts in the right context is difficult. This is what the following topics try to help with.
A good golang module is a small re-usable component with a clearly defined and documented API. Golang relies on certain conventions, and applying them naturally leads to the creation of high quality modules. Using conventions also facilitates testability and, above all, maintainability of the code.
The smallest unit that can be exposed as a go module is a package. A module may consist of multiple packages, but each of those could also be its own module. The package:
- must only expose types, functions, constants and variables which are supposed to be publicly visible, optimally as few entities as possible. This text refers to this as the API or package API
- must have a descriptive name that serves as part of the API flow. It should be short, avoid camelCase or under_score and not 'steal' popular variable names. More details
- must have a func New(...) constructor for all exposed package objects
- can have static functions
- must have tests
By convention a package exposes one or more objects that are created using New() constructors.
Example:
package dgraph

type Client struct{}

func New() *Client { return &Client{} }
A New() constructor can have any number of parameters and descriptive suffixes to allow different ways to initialize the object.
Good 👍
//If package only exposes single package object (ideal unit)
dgraphClient := dgraph.New()
//If package exposes multiple different package objects
dgraphClient := dgraph.NewClient()
dgraphMonitor := dgraph.NewMonitor()
//Dependency Injection
dgraphClient := dgraph.New(connection)
//Overloading - See Constructor patterns for better alternative
dgraphClient := dgraph.NewWithTLS(connection, tlsConfig)
//Allow error handling
dgraphClient, err := dgraph.New()
By convention, stutter (repeating the package name in identifiers) must be avoided.
Bad 👎
//Stutter: the package name is repeated in the function name
dgraphClient := dgraph.NewDgraphClient()
//Lots of parameters are always an incompatibility risk, see Constructor patterns for how to avoid this
dgraphClient := dgraph.NewClient(ip, port, domain, errorHandler, prometheusClient, snmpTrap)
//Not using New... convention
dgraphClient := dgraph.CreateClient()
//Not providing constructor function at all, leaving the API user to guess how to instantiate the object
dgraphClient := new(dgraph.Client)
dgraphClient := &dgraph.Client{}
When initializing objects using a New() constructor, you often need to customize the returned object. Using a list of parameters is discouraged, as golang does not allow overloading. Adding parameters will therefore break the API or require the creation of verbose NewWithXAndYAndZ() constructors, which can quickly become a permutation problem. Instead, you should use one of the two patterns described below to configure objects on initialization.
The default constructor pattern that allows non-breaking addition of parameters is using a configuration object. Note that the fields must be exported so that package users can actually set them:

type Config struct {
    IP   string
    Port int
}
The New() constructor usually has a single parameter, the config object, which allows additions in the future without breaking backwards compatibility:

dgraphClient := dgraph.New(&dgraph.Config{
    IP:   "localhost",
    Port: 9090,
})
For complex configuration objects, it is good practice to create a constructor function that returns a configuration object with proper defaults.
config := dgraph.NewDefaultConfig()
dgraphClient := dgraph.New(config)
The functional-style options pattern is a very nice to use, yet somewhat verbose to implement, alternative. For public facing APIs with lots of different options, it is a good choice. The New() constructor takes a variadic parameter of type connectionOption. The two provided implementations are WithIP(ip string) and WithPort(port int). The New(...) constructor first initializes the Connection object with default parameters and then iterates through the list of provided options to apply them to the Connection object before returning it.
package options_pattern

type Connection struct {
    ip   string
    port int
}

type connectionOption func(*Connection)

func WithIP(ip string) connectionOption {
    return func(connection *Connection) {
        connection.ip = ip
    }
}

func WithPort(port int) connectionOption {
    return func(connection *Connection) {
        connection.port = port
    }
}

func New(opts ...connectionOption) *Connection {
    conn := &Connection{
        ip:   "default",
        port: 0,
    }
    //Apply all options
    for idx := range opts {
        opts[idx](conn)
    }
    return conn
}
Using this pattern makes for very intuitive and descriptive API calls that allow backward-compatible option additions and deprecation:
func TestOptionsPattern(t *testing.T) {
    {
        connection := options_pattern.New(options_pattern.WithIP("localhost"))
        assert.Equal(t, "ip: localhost, port: 0", connection.ToString())
    }
    {
        connection := options_pattern.New(
            options_pattern.WithIP("localhost"),
            options_pattern.WithPort(90008))
        assert.Equal(t, "ip: localhost, port: 90008", connection.ToString())
    }
}
Sometimes you want to enforce that objects can only be created through the New() constructor that is offered. To do so, make the object type package private (lowercase):
package dgraph

type client struct{} //Nobody can initialize this struct now

func New() *client { return &client{} }
This comes with the huge downside that package users are unable to reference the type directly, which prevents variable declarations and can make working with the package cumbersome. It is therefore discouraged.
package persistence

var dgraphClient *dgraph.client //Does not compile, package user cannot declare variables of your type

func smh() {
    dgraphClient := dgraph.New() //No issues
}
Package functions are typically used to configure a package within the context of the application. An example would be providing a logger implementation if the package supports it or simply setting a log level.
dgraph.SetLogger(log.Logger)
dgraph.SetLogLevel(dgraph.DEBUG)
Another use case for package functions are utility operations. A good example of this is golang's strings package:
hasLa := strings.Contains("Blabla", "la")
helloWorld := strings.Join([]string{"hello", "world"}, ",")
💡 Bad practice: Utility packages
Do NOT EVER create a utility package. These attempts to reuse code WILL fail, because context-less utility packages will never be known to the next maintainer. They will not find the functions they are supposed to reuse and will duplicate them anyway. Utility functions should be provided in the package context where they are valuable. In most cases they can be replaced with a method on an object.
//Bad
util.ConvertXToY
//Better on the object. Has tool support (IDE auto-completion)
y := x.ConvertToY()
//Bad, nobody will ever know it exists
util.CalculateGeoCentroid(long, lat float32)
//Better in a geo package for context, a maintainer is more likely to look here for a geo centroid function
geo.CalculateCentroid(long, lat float32)
In 99.9% of cases, using fmt.Errorf() is the recommended way to return errors to your API users. It is used to create errors or add further context to downstream errors:
//Returning an error
return fmt.Errorf("login failed due to invalid credentials")
//Returning a parameterized error
return fmt.Errorf("login failed due to invalid credentials for user %s", username)
//Adding context to an existing error
return fmt.Errorf("errors connecting to postgres: %s", err.Error())
//Alternative: wrap an existing error
return fmt.Errorf("errors connecting to postgres: %w", err)
//Bad capitalized error message
return fmt.Errorf("Try again")
💡 Go convention forbids capitalized error messages unless the first word is a proper noun
But sometimes you want to enable API users to distinguish different error scenarios. There are essentially two ways to accomplish that:
- Package defined errors
- Custom error types
Package defined errors are your go-to solution if you have a function that can return multiple error conditions that the user wants to handle differently. An example would be a database call, where the driver connection could fail (retry) or the SQL could be corrupted (programming error, do not retry).
Note that package defined errors have a big disadvantage: they cannot be parameterized. As such, their use is limited.
Define a package error (by convention, error variables are named with an Err prefix):

package errorhandling

import "fmt"

var (
    //ErrConnection is a package defined error that allows users to react to different error conditions,
    //but cannot be parameterized
    ErrConnection = fmt.Errorf("connection failed")
)
Handling package defined errors as the API consumer:
func TestPreDefinedErrorHandling(t *testing.T) {
    err := errorhandling.ReturnPredefinedError()
    //You can use a switch
    switch err {
    case errorhandling.ErrConnection:
        assert.Equal(t, "connection failed", err.Error())
    default:
        t.Fatal("unexpected error")
    }
    //Or a simple comparison
    if err == errorhandling.ErrConnection {
        assert.Equal(t, "connection failed", err.Error())
    } else {
        t.Fatal("unexpected error")
    }
    //Best: using errors.Is(), which examines the entire error chain in case of wrapped errors
    if errors.Is(err, errorhandling.ErrConnection) {
        assert.Equal(t, "connection failed", err.Error())
    } else {
        t.Fatal("unexpected error")
    }
}
Use this approach ONLY if you can get away with non-parameterized errors on a function that can return errors which require individual handling.
Custom error types overcome the limitation of package defined errors: they can be extensively parameterized.
Create a type that implements func Error() string. This satisfies the error interface.
Example of defining and returning a custom error type
type CustomError struct {
    Status int
    Reason string
}

//Error satisfies the error interface. Be aware what you return if you implement with or without pointer receiver
func (e CustomError) Error() string {
    return fmt.Sprintf("failed with status %d: %s", e.Status, e.Reason)
}

func ReturnCustomError() error {
    //Returning CustomError value (not pointer) as Error() is implemented with value receiver
    return CustomError{
        Status: 22,
        Reason: "Just cause",
    }
}
Handling a custom error type as API caller:
func TestCustomErrorHandling(t *testing.T) {
    err := errorhandling.ReturnCustomError()
    //A type switch binds the concrete error type directly
    switch customError := err.(type) {
    case errorhandling.CustomError:
        assert.Equal(t, 22, customError.Status)
        assert.Equal(t, "Just cause", customError.Reason)
    default:
        t.Fatal("unexpected error")
    }
    //Preferred: using errors.As(...) which examines the entire error chain to find wrapped errorhandling.CustomError
    var ce errorhandling.CustomError
    if errors.As(err, &ce) {
        assert.Equal(t, 22, ce.Status)
        assert.Equal(t, "Just cause", ce.Reason)
    } else {
        t.Fatal("unexpected error")
    }
}
ONLY use custom error types if you have a function that returns errors which require individual handling. This should be used rarely!
Introduced in golang 1.13, errors can be wrapped into other errors. This can be used to preserve more than just the error text when context is added to an error. The newly added errors.Is() and errors.As() functions can be used to check for package defined or custom error types. These functions consider the entire error chain. In the following example, the errorhandling.CustomError is wrapped into a generic error using the format directive %w. errors.As(..) locates the errorhandling.CustomError in the error chain and deserializes it into the provided value for processing:
func TestCustomErrorChainHandling(t *testing.T) {
    err := errorhandling.ReturnCustomError()
    //Wrapping the original errorhandling.CustomError into another error using format directive '%w'
    wrapErr := fmt.Errorf("I caught an error %w", err)
    var ce errorhandling.CustomError
    //errors.As() will find the errorhandling.CustomError in the chain
    if errors.As(wrapErr, &ce) {
        assert.Equal(t, 22, ce.Status)
        assert.Equal(t, "Just cause", ce.Reason)
    } else {
        t.Fatal("unexpected error")
    }
}
💡 If you want your custom error types to support wrapping, you need to implement func Unwrap() error
You should use error wrapping only when adding context to a complex error that carries more information than just the error text. You want to preserve the original information for your package user to process.
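A minimal sketch of a custom error type that supports wrapping could look like this (WrapError is a hypothetical name, not part of the earlier examples):

type WrapError struct {
    Context string
    Inner   error
}

//Error adds context to the inner error's message
func (e WrapError) Error() string {
    return fmt.Sprintf("%s: %s", e.Context, e.Inner.Error())
}

//Unwrap exposes the inner error so errors.Is() and errors.As() can walk the chain
func (e WrapError) Unwrap() error {
    return e.Inner
}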
The best way to provide logging in your package is to define a logger interface and allow users to set their own logger.
When you define a logging interface, stay as simple as possible to permit compatibility with popular logging libraries:
package packagelog

//Logger is the module's logging interface. It is compatible with the standard library's log.Logger.
//It won't be great for a lot of popular logging libraries, which commonly use Infof and Errorf instead
type Logger interface {
    Printf(l string, args ...interface{})
    Fatalf(l string, args ...interface{})
    //Infof and Errorf give a lot of compatibility with existing logging libraries
    //Infof(l string, args ...interface{})
    //Errorf(l string, args ...interface{})
    //Warnf (sometimes Warningf) and Debugf are less common and should be avoided
    //Warnf(l string, args ...interface{})
    //Debugf(l string, args ...interface{})
}

//NoopLogger is the default provided logger
type NoopLogger struct{}

func (NoopLogger) Printf(l string, args ...interface{}) {}
func (NoopLogger) Fatalf(l string, args ...interface{}) {}

//moduleLogger is used for all logs of the module
var moduleLogger Logger = NoopLogger{}

//SetLogger allows the package user to provide their own implementation
func SetLogger(l Logger) {
    moduleLogger = l
}

func MyCoolFunction(name string, age int) {
    moduleLogger.Printf("Name: %s, Age: %d", name, age)
}
The package user can now provide their own logger, which ideally is simply their application logger:
func TestMyCoolFunction(t *testing.T) {
    packagelog.MyCoolFunction("Paul", 43) //Logs nothing
    packagelog.SetLogger(log.Default())   //Set standard library's log.Logger
    packagelog.MyCoolFunction("Jill", 84) //Logs using standard library logger
}
If the application logger does not satisfy your logging interface natively, the package user can still build an adapter that satisfies the interface and delegates to their application logger.
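As a sketch, assuming a hypothetical application logger that only offers Infof and Fatalf, such an adapter could look like this:

//appLogger stands in for an application logger that does not satisfy packagelog.Logger
type appLogger struct{}

func (appLogger) Infof(l string, args ...interface{})  { fmt.Printf(l+"\n", args...) }
func (appLogger) Fatalf(l string, args ...interface{}) { fmt.Printf(l+"\n", args...) }

//loggerAdapter satisfies packagelog.Logger by delegating to the application logger
type loggerAdapter struct {
    app appLogger
}

func (a loggerAdapter) Printf(l string, args ...interface{}) {
    a.app.Infof(l, args...)
}

func (a loggerAdapter) Fatalf(l string, args ...interface{}) {
    a.app.Fatalf(l, args...)
}

//Usage: packagelog.SetLogger(loggerAdapter{app: appLogger{}})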
💡 This example demonstrates the difference between interfaces in go and Java. In Java, a class has to explicitly implement an interface. In go, a type simply has to satisfy the interface by implementing all functions defined by it.
Enter your package directory on the command line and execute go mod init, which creates the go.mod file. You will be asked to run go mod tidy afterwards to add all required dependencies.

Congratulations, you created a reusable module which you can obtain in other projects by using go get ...
💡 When using a module that is part of (i.e.: a package in) your project, you can add the following line to your project's go.mod file:
replace github.com/Accedian/stitchIt/persistence => ./persistence
This directive makes it so that all imports of github.com/Accedian/stitchIt/persistence actually use the local package persistence. Note that go 1.18 introduces workspaces, which is a more advanced feature to deal with multi-module environments.
Unit testing is as important as the code you write. It assures that your
- code is working
- code is well-structured (as components or 'units')
- package API is well-defined and encapsulated
If you can't test your code with either unit or at least integration tests, refactor it until you can. Code that can't be tested is badly designed even if the logic is genius.
By default, you should use table-driven tests for the following benefits:
- Reduced repetition of code for similar test cases
- Easy setup and tear down without using the package global TestMain
- Better structure that is easier to maintain
- Fast addition of new test cases

The last point being key: table-driven tests allow the quick addition of new test cases if a bug is found.
package variadictests

import (
    "github.com/stretchr/testify/assert"
    "strings"
    "testing"
)

func TestToLowerCase(t *testing.T) {
    //Test 'suite' set up here
    // ... setup ...
    defer func() {
        //Optimal test 'suite' tear down here:
        // ... tear down ...
    }()
    var tests = []struct {
        Name           string
        Input          string
        ExpectedOutput string
    }{
        {
            Name:           "Uppercase",
            Input:          "UPPERCASE",
            ExpectedOutput: "uppercase",
        },
        {
            Name:           "CamelCase",
            Input:          "CamelCase",
            ExpectedOutput: "camelcase",
        },
    }
    for _, test := range tests {
        t.Run(test.Name, func(t *testing.T) {
            //per-test setup
            lower := strings.ToLower(test.Input)
            assert.Equal(t, test.ExpectedOutput, lower)
            //per-test tear down
        })
    }
    //Optional test 'suite' tear down here
    // ... tear down that may not execute on test failure ...
}
func TestToLowerCaseComplexObject(t *testing.T) {
    var tests = []struct {
        Name           string
        Input          func() []*string //func() provides a nice way to set up more complex inputs
        ExpectedOutput []string
    }{
        {
            Name: "Uppercase",
            Input: func() []*string {
                a := "UPPERCASE"
                return []*string{&a}
            },
            ExpectedOutput: []string{"uppercase"},
        },
    }
    for _, test := range tests {
        t.Run(test.Name, func(t *testing.T) {
            input := test.Input() //Use the input
            assert.Equal(t, 1, len(input))
        })
    }
}
The func() is typically used to allow short-hand for setting up
- time
- pointers to scalar values
- complex objects
Use provider mocking where you need to mock out dependencies because the surrounding code must be testable in a unit test and cannot be tested with an integration test instead. Usage of this should therefore be rare, since code around dependencies can usually either be broken down into testable units, or is so intertwined with the dependency code that an integration test is required.
Assume the following code: A lookup is made in a database and the resulting string is transformed to upper case. We want to test the upper case transformation, mocking out the database call.
To allow easy mocking, we implement the entire database access logic in a function var GetStringFromDatabase. This is a Provider:
package mocking

import "strings"

var GetStringFromDatabase = func(entityId string) string {
    //This is the production code accessing the database, reading a string by its ID and returning it
    return ""
}

func ToUpperCaseFromDatabase(input string) string {
    fromDB := GetStringFromDatabase(input) //mocked in test
    //code under test
    return strings.ToUpper(fromDB)
}
This allows us to exchange the implementation in our Provider in the test with a simple variable assignment:
func TestToUpperCaseFromDatabase(t *testing.T) {
    original := mocking.GetStringFromDatabase //Store original to restore behavior after test
    defer func() {
        mocking.GetStringFromDatabase = original //Make sure behavior is restored after test
    }()
    mocking.GetStringFromDatabase = func(string) string { //Mock
        return "mock"
    }
    result := mocking.ToUpperCaseFromDatabase("abc")
    assert.Equal(t, "MOCK", result)
}
- Pros
  - Quick and easy way to stub out dependencies with small, locally scoped mocks
  - Superior mock strategy compared to interface mocking or tool generated mocks
- Cons
  - Pollutes the package API. 💡 If you want to avoid pollution, you may choose to make the provider function private (var getStringFromDatabase = func(entityId string) string). Your tests cannot be in the _test package in this case.
  - Providers can feel convoluted to implement
  - Mock problem: garbage in, garbage out. Wrong assumptions regarding the mock implementation render the test useless
Interfaces are not the same in golang as they are in Java. In Java, interfaces define a contract to create abstraction and promote loose coupling, and they are heavily tied to inheritance and generics. In golang, interfaces are not as useful for abstraction purposes due to limited inheritance. Even with golang 1.18 and the introduction of generics, where interfaces serve as type constraints, they are rarely used for Java-style abstraction. Therefore, abstracting a package object with an interface is not commonly done as you'd see it in Java, and as a result, interface mocking a 3rd party library like a database client is often not an option. Interfaces are typically used to define a contract in your package API, for users to satisfy. Allowing package API users to set their own logger is a great example for this.
When mocking interfaces, keep the following in mind:
- Avoid mock tooling (3rd party generators and syntax libraries): They require ramp up, often use obscure verification syntax and can introduce instability into the build
- Interface mocking can be very verbose
When mocking interfaces, you can create a base mock and extend from it to reduce verbosity of the mock code
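The examples below reference a mocking.PersonInterface which is not shown here; presumably it looks along these lines:

package mocking

//PersonInterface is the contract mocked in the tests below
type PersonInterface interface {
    PrintName() string
    PrintLastName() string
}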
package mocking_test

import (
    "github.com/stretchr/testify/assert"
    "minimalgo/mocking"
    "testing"
)

type BasePersonMock struct {
}

func (m *BasePersonMock) PrintName() string {
    return "Fred"
}

func (m *BasePersonMock) PrintLastName() string {
    return "Smith"
}

//FrankMock only implements PrintName()
type FrankMock struct {
    BasePersonMock
}

func (m *FrankMock) PrintName() string {
    return "Frank"
}

func TestPerson_PrintNameMocked(t *testing.T) {
    var personInterface mocking.PersonInterface
    personInterface = &BasePersonMock{}
    assert.Equal(t, "Fred", personInterface.PrintName())
    assert.Equal(t, "Smith", personInterface.PrintLastName())
    personInterface = &FrankMock{}
    assert.Equal(t, "Frank", personInterface.PrintName())
    assert.Equal(t, "Smith", personInterface.PrintLastName()) //uses the method from BasePersonMock
}
💡 If your package API exposes a type instead of an interface, it is good practice to provide a noop implementation, by nil-checking the pointer receiver on each method and executing a default behavior. This prevents nil-pointers and promotes loose coupling, by essentially making the dependency on your package optional.
type Person struct {
    Name     string
    LastName string
}

func (m *Person) PrintName() string {
    if m == nil {
        return "" //Noop implementation
    }
    return m.Name
}

func (m *Person) PrintLastName() string {
    if m == nil {
        return ""
    }
    return m.LastName
}
func TestPerson_PrintName(t *testing.T) {
    var person *mocking.Person              //Nil
    assert.Equal(t, "", person.PrintName()) //No nil pointer!! Nil has a Noop implementation
    person2 := &mocking.Person{
        Name: "Paul",
    }
    assert.Equal(t, "Paul", person2.PrintName())
}

func TestPerson_PrintNamePanic(t *testing.T) {
    var personInterface mocking.PersonInterface
    personInterface.PrintName() //this will cause a panic because the interface is nil and there is no implementation to catch the call
}
We use a lot of REST interactions between microservices. An alternative to provider mocking for those interactions is to start a small server in your test and mock the REST response. An example function could look like below:
const mockPort = 55575

func startMockServer(t *testing.T, expectedRequest []byte, responseCode int, response []byte) (*httptest.Server, error) {
    server := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        data, _ := ioutil.ReadAll(r.Body)
        defer r.Body.Close()
        //Example: Verify the request body
        assert.Equal(t, expectedRequest, data, "Expected %s, Got %s", string(expectedRequest), string(data))
        //Mock desired response
        w.WriteHeader(responseCode)
        w.Write(response)
    }))
    l, err := net.Listen("tcp", fmt.Sprintf(":%d", mockPort))
    if err != nil {
        return nil, err
    }
    server.Listener = l
    server.Start()
    return server, nil
}
Calling startMockServer, we provide the expected request body for the server to verify, as well as the responseCode and response body we want to respond with.

Imagine a function DoPOST:
func DoPOST(url string, body string) error {
    request, err := http.NewRequest("POST", url, ioutil.NopCloser(bytes.NewReader([]byte(body))))
    if err != nil {
        return err
    }
    resp, err := http.DefaultClient.Do(request)
    if err != nil {
        return err
    }
    if resp.StatusCode != 200 {
        return fmt.Errorf("unexpected response code: %d", resp.StatusCode)
    }
    return nil
}
Now we could test the function using our mock server:
func TestDoPOST(t *testing.T) {
    server, err := startMockServer(t, []byte("Hello world"), 200, []byte(``))
    if err != nil {
        t.Fatal(err.Error())
    }
    defer server.Close() //Do not forget this
    //Send the request to the mocked endpoint
    err = mocking.DoPOST(fmt.Sprintf("http://localhost:%d", mockPort), "Hello world")
    assert.Nil(t, err)
}
This test fails if we change the response code to something other than 200, or the payload is different from Hello world. This method allows for testing request expectations as well as different response scenarios.
go test runs the tests of multiple packages in parallel by default unless you pass the -p 1 flag. If you use mock servers with fixed ports, executing tests in parallel can quickly end in port already in use errors. To prevent these errors, you need to either come up with a synchronized way to assign unique ports to each mock server instance, or run go test -p 1.
Integration tests are required wherever you interact with other services. The idea is to be as close to a production environment as possible, while keeping test execution times and resource utilization as low as possible. We also want to write the integration tests in go, and avoid using obscure frameworks that require ramp up.

The basis for integration tests is docker-compose. Compose is a community adopted, well-known technology to define services.
Start by declaring all your required dependencies, as well as your service-under-test in a docker-compose.yml:
version: "3.3"
services:
dgraph:
image: dgraph/standalone:v21.03.2
ports:
- "9080:9080"
- "8080"
stitchit:
build:
context: ../
dockerfile: Dockerfile
command: -mode dgraph -dgraph integration_tests_dgraph_1:9080
environment:
- LOGLEVEL=DEBUG
depends_on:
- dgraph
restart: on-failure
In this example we define stitchit, which is the service-under-test. The stitchit container is built every time we change some code, so we always test against the latest version. The compose file also defines the dependency dgraph.

Running docker-compose -f docker-compose.yml up -d --build will build the new stitchit container and start both the stitchit and dgraph services. To tear the whole thing down, including volumes, simply execute: docker-compose -f integration_tests/docker-compose.yml down -v.
Now that we have our service-under-test with minimal dependencies running, we just need to run our integration tests against it.
Our integration tests are standard golang unit tests, executed with go test. The advantage is that there is no learning curve for integration tests.
Example:
//go:build integration
// +build integration

package integration_tests

func TestREST(t *testing.T) {
    body := []byte(`{
        "data": {
            "attributes": {
                "name": "Example",
                "location": {
                    "latitude": 9.66654,
                    "longitude": -8.5554
                }
            },
            "type": "asset"
        }
    }`)
    request, err := http.NewRequest(http.MethodPost, "http://integration_tests_stitchit_1:8080/api/v1/stitchit/assets", ioutil.NopCloser(bytes.NewReader(body)))
    if err != nil {
        t.Fatal(err.Error())
    }
    request.Header.Set(XForwardedTenantId, testTenant)
    request.Header.Set("Content-Type", "application/json")
    response, err := httpClient.Do(request)
    if err != nil {
        t.Fatal(err.Error())
    }
    assert.Equal(t, 201, response.StatusCode)
}
This test executes a REST request against the service-under-test and verifies the response.
In order to interact with the service-under-test, our integration tests must have access to the same docker network. The trick is to execute go test inside a docker container, which is part of the appropriate docker network: We docker run an alpine image, mount the code, set the GOPATH and execute go test with the appropriate parameters. In this case we only run the tests in the integration_tests package that carry the build tag integration (-tags integration). The build tag allows integration tests to be ignored when executing unit tests (go test ./...)
docker run -it --rm --name stitchit_itest --network=integration_tests_default \
-e GOPATH=/root/go -v "$(GOPATH):/root/go" -v "$(PROJECT_BASE_PATH):/root/workingdir" \
-w "/root/workingdir" docker-go-sdk:0.42.0-alpine go test -v ./integration_tests/... -tags integration
Now we smash all this into a nice Makefile target for convenient one-line execution:
itest: dockerbin
	docker-compose -f integration_tests/docker-compose.yml down -v
	docker-compose -f integration_tests/docker-compose.yml up -d --build
	docker run -it --rm --name stitchit_itest --network=integration_tests_default \
		-e GOPATH=/root/go -v "$(GOPATH):/root/go" -v "$(PROJECT_BASE_PATH):/root/workingdir" \
		-w "/root/workingdir" docker-go-sdk:0.42.0-alpine go test -v ./integration_tests/... -tags integration
Using the presented approach we
- Test close to a production environment (containers in a docker network)
- Use a slim, well-known technology stack (docker-compose, golang and go tools)
- Easily define, start and tear down the test environment, including network and volumes
- Write integration tests in the same way we write unit tests
Go routines can be described as light-weight threads. Their scheduling is managed by the go runtime, which utilizes threads based on demand. It is entirely possible to have hundreds of go routines multiplexed on a single thread. This is why they are called light-weight. It is entirely feasible to run thousands or even hundreds of thousands of go routines in your program. Go routines have only a few kilobytes of memory overhead each and are very comfortable to use. Often they are employed in conjunction with go's other signature feature, go channels, to implement asynchronous or parallel logic.
A go routine is started with the keyword go:

//Anonymous go routine
go func() {
    //do stuff async
}()

func doIt() {
}

//Running an existing function in a go routine
go doIt()
Go has closures, so passing variables into go routines that are declared inline is very convenient:
h := "hello world"
go func() {
fmt.Println(h)
}()
The alternative to using a closure is using a function parameter:
h := "hello world"
go func(s string) {
fmt.Println(s)
}(h)
Inline declared go routines making use of closures is very common, but must be avoided in loops due to how closures work:
The following test iterates over 3 Customers and prints their names in go routines.
type Customer struct {
    Name string
}

func TestGoRoutineClosurePitfall(t *testing.T) {
    customers := []Customer{
        {Name: "Avid"},
        {Name: "Olav"},
        {Name: "Jarl Varg"},
    }
    wg := sync.WaitGroup{}
    wg.Add(3)
    for _, customer := range customers {
        go func() {
            fmt.Println(customer.Name)
            wg.Done()
        }()
    }
    wg.Wait()
}
Output:
Jarl Varg
Jarl Varg
Jarl Varg
This behavior is due to the go routine closures using a reference to customer, which gets a new value assigned on every loop iteration. By the time the routines are executed, the loop has finished, so all of them print Jarl Varg.

💡 Since go 1.22, loop variables are scoped per iteration, so this pitfall no longer occurs on current go versions. On older versions you need the fix below.
Fix
func TestGoRoutineClosurePitfall_Fixed(t *testing.T) {
    customers := []Customer{
        {Name: "Avid"},
        {Name: "Olav"},
        {Name: "Jarl Varg"},
    }
    wg := sync.WaitGroup{}
    wg.Add(3)
    for _, customer := range customers {
        go func(c Customer) {
            fmt.Println(c.Name)
            wg.Done()
        }(customer)
    }
    wg.Wait()
}
Go channels are golang's signature feature. They are a very powerful tool, but come with some caveats as well. As a newcomer it is often tricky to figure out when to effectively use go channels. This paragraph lists some of the most useful patterns, but first we start with some general advice and potential pitfalls:
Buffered channels are created with a specific buffer size. Writes to the channel will not block until the buffer is full:
buffered := make(chan string, 2)
buffered <- "a"
buffered <- "b"
buffered <- "c" //This blocks until another go routine consumes "a" from the channel
Buffered channels are almost never used. The reason being that whatever buffer size you choose, it will eventually be reached and cause a block. So in 99.999% of cases you will use unbuffered channels.
When using an unbuffered channel, writes to the channel block unless at least one other go routine consumes the channel. Conversely, any statement receiving from a channel blocks until there is something to receive. This is why unbuffered go channels are a synchronization mechanism.
unbuffered := make(chan string)
go func() {
    msg := <-unbuffered //This will block until there is something to receive
    fmt.Println(msg)
}()
unbuffered <- "hello"
Multiple go routines can write to a channel and multiple go routines can consume a channel.
When multiple routines consume the same channel, each message is guaranteed to be only received by one routine.
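A minimal sketch illustrating this: three consumers compete for the same channel, and each number is printed by exactly one of them.

func TestMultipleConsumers(t *testing.T) {
    c := make(chan int)
    wg := sync.WaitGroup{}
    wg.Add(3)
    for worker := 0; worker < 3; worker++ {
        go func(id int) {
            defer wg.Done()
            //range exits once the channel is closed
            for n := range c {
                fmt.Printf("worker %d received %d\n", id, n)
            }
        }(worker)
    }
    //Single producer: each value is delivered to exactly one consumer
    for i := 0; i < 9; i++ {
        c <- i
    }
    close(c)
    wg.Wait()
}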
Channels can be closed. Closing a channel serves as a signal to channel consumers that processing is done and receiving from the channel can stop. This is not required for garbage collection: an unclosed channel that is no longer referenced will be garbage collected.
Writing to a closed channel or attempting to close it again causes a panic and must be avoided. Consuming a closed channel is generally okay as the statement will not block, but may lead to infinite loops.
It is good practice to have a generator function that creates and returns the channel. The channel is asynchronously populated and closed when done.
func GenerateRandomNumbers(amount int) chan int {
    output := make(chan int) //Create the channel
    //Populate the channel in a go routine. This happens async, so the returned channel is ready to be consumed elsewhere while it is being populated
    go func() {
        for i := 0; i < amount; i++ {
            output <- rand.Int()
        }
        //Once we are done populating the channel, we close it. This will cause consumer loops to exit gracefully
        close(output)
    }()
    return output
}
Using the function:
func TestRoutine(t *testing.T) {
    c := go_routines.GenerateRandomNumbers(10)
    for n := range c {
        fmt.Println(n)
    }
    fmt.Println("Done")
}
The advantage of this pattern is that the entire channel lifecycle is defined in one place. The caller of GenerateRandomNumbers only has to worry about receiving from the channel and can rely on it being closed when no more data is sent.
Remember:
- Writing to a closed channel causes a panic
- Closing a closed channel causes a panic
- Reading from a closed channel succeeds immediately, yielding the zero value (with ok == false), which can cause infinite loops frying your CPU
- Spreading your channel lifecycle throughout the code might lead to death threats from the next maintainer
To write to a channel, simply use the <- operator:
c := make(chan int, 1)
c <- 5
💡 In this example, the channel write does not block, although we do not have a consumer, because the channel is buffered with size 1.
In some cases, you want to write to an unbuffered channel, but you don't care if there is a consumer on the other side, and you don't want the operation to block. In this case, you want to simply discard the element instead. You can achieve this with a select and a default case. In the following example we write to c, but there is no consumer and c is not a buffered channel. Without the select, the <- statement would block, but now it simply executes the default case. Once we have a consumer, the element is written to the channel.
Example: Write or discard
func TestOptionalWrite(t *testing.T) {
    c := make(chan int)
    select {
    case c <- 5:
        t.Fatal("should not have been executed")
    default:
        fmt.Println("Discarded message")
    }
    //Start consumer routine
    go func() {
        <-c //Read an element
    }()
    time.Sleep(time.Millisecond) //Make sure consumer routine is up
    select {
    case c <- 5:
        fmt.Println("Sent message")
    default:
        t.Fatal("should not have been executed")
    }
}
This is useful in cases where your package API exposes a channel but doesn't know whether the user is consuming it.
There are different ways to consume a channel:
- Direct assignment
- Looping over a channel with the range keyword
- Using select to read from multiple channels
You can directly assign the next element of the channel to a variable using the <- operator:

c := go_routines.GenerateRandomNumbers(10)
firstNumber := <-c
secondNumber := <-c
Most typically you will range over a channel
c := go_routines.GenerateRandomNumbers(10)
for n := range c {
fmt.Println(n)
}
💡 When using range to iterate over a channel, the loop is exited when the channel is closed.
Select allows you to consume from multiple channels at the same time. It is commonly used for:
- Cancellation
- Timeouts
- General multi-channel consumption

One can use a signal channel in conjunction with a select to indicate the end of input processing. The following example has three go routines: one to populate the input channel, one to send a cancellation after 5 seconds and one to consume from the input and cancellation channel using select:
func TestCancellation(t *testing.T) {
    //Create a channel to signal the end of processing
    cancel := make(chan struct{})
    //Write some random numbers
    c := make(chan int)
    go func() {
        for {
            c <- rand.Int()
        }
    }()
    //This routine will wait for 5 seconds, then send a `struct{}{}` into the cancel channel
    go func() {
        <-time.After(time.Second * 5)
        cancel <- struct{}{}
    }()
READ:
    for {
        select {
        case number := <-c:
            fmt.Println(number)
        case <-cancel:
            //After ~5 seconds, we will receive on the cancel channel and break out of the loop
            break READ
        }
    }
    fmt.Println("Done")
}
While perfectly sufficient for simple cases, use of the context package is preferred. It allows simultaneous cancellation of multiple go routines and supports other more complex use cases, like child contexts and deadlines.
func TestCancellationWithContext(t *testing.T) {
    c := make(chan int)
    //Using a context with cancel() function instead of a signal channel
    ctx, cancel := context.WithCancel(context.Background())
    go func() {
        for {
            c <- rand.Int()
        }
    }()
    go func() {
        <-time.After(time.Second * 5)
        cancel() //Invoke cancel() after 5 seconds
    }()
READ:
    for {
        select {
        case number := <-c:
            fmt.Println(number)
        case <-ctx.Done():
            break READ
        }
    }
    fmt.Println("Done")
}
Another common use of the select keyword is timeouts. Imagine a routine that reads from a kafka topic channel, closing if no message was received for 5 seconds:
func TestRefreshingTimeout(t *testing.T) {
    c := make(chan int)
    go func() {
        //Send only two numbers, then wait for the timeout
        c <- rand.Int()
        c <- rand.Int()
    }()
READ:
    for {
        select {
        case number := <-c:
            fmt.Println(number)
        case <-time.After(time.Second * 5):
            //This case is only executed if we do not receive anything on 'c' for more than 5 seconds,
            //because every loop iteration reinitializes the timer
            break READ
        }
    }
    fmt.Println("Done")
}
In order to have a fixed timeout, you have to instantiate the timeout channel outside the loop:
func TestFixedTimeout(t *testing.T) {
    c := make(chan int)
    timeout := time.After(time.Second * 5)
    go func() {
        for {
            c <- rand.Int()
        }
    }()
READ:
    for {
        select {
        case number := <-c:
            fmt.Println(number)
        case <-timeout: //This case is executed after 5 seconds, no matter what
            break READ
        }
    }
    fmt.Println("Done")
}
When you close a channel, you can still consume from it. Writing to a closed channel or closing it again causes a panic, but reading will return immediately. This can cause very tragic outcomes when a channel is consumed in a for loop without using range, for example in conjunction with select.
func TestClosingChannelPitfall(t *testing.T) {
    c := make(chan int)
    //Create a wait group so the test doesn't exit before the go routine is done
    wg := sync.WaitGroup{}
    wg.Add(1)
    go func() {
        for {
            select {
            case value := <-c:
                //Once the channel is closed, we will execute this case in an infinite loop with the default int value of `0`
                fmt.Println(value)
            }
        }
        wg.Done()
    }()
    close(c) //Closing the channel will *NOT* exit the for loop in this case
    wg.Wait()
}
To fix this you need to explicitly exit the loop
func TestClosingChannelPitfallFixed(t *testing.T) {
    c := make(chan int)
    //Create a wait group so the test doesn't exit before the go routine is done
    wg := sync.WaitGroup{}
    wg.Add(1)
    go func() {
    LOOP: //Label to break out of
        for {
            select {
            case value, ok := <-c:
                if !ok {
                    break LOOP //break out of the loop if the channel is closed
                }
                fmt.Println(value)
            }
        }
        wg.Done()
    }()
    close(c) //Closing the channel now *DOES* exit the for loop, via the ok check
    wg.Wait()
}
The go blog post Go Concurrency Patterns: Pipelines and cancellation is an absolute must-read!
It discusses
- Pipeline
- fan-in and fan-out
- cancellation techniques
Once these patterns are understood, usage of channels and routines feels a lot more natural and intuitive. Identifying the correct context to employ these features will be easier.
A pipeline is a way to route data through a series of channels, processing the data in between.
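A minimal two-stage pipeline sketch in the spirit of that blog post (generate and square are illustrative names):

//generate emits the given numbers and closes its output channel when done
func generate(nums ...int) chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for _, n := range nums {
            out <- n
        }
    }()
    return out
}

//square is a pipeline stage: it consumes one channel and produces another
func square(in chan int) chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for n := range in {
            out <- n * n
        }
    }()
    return out
}

//Usage: for n := range square(generate(1, 2, 3)) { fmt.Println(n) }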
Fan-out can be thought of as the classical worker pattern: a single input channel feeds multiple go routines that each do the same work in parallel. Each element on the channel is received by one routine only, making for a fairly even distribution across all routines consuming it.
Fan-In is the consolidation of multiple input channels into a single output channel. Each input channel is consumed by its own go routine. All go routines are writing to the same output channel.
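A fan-in sketch, merging multiple input channels into one (the merge name is illustrative):

//merge consolidates multiple input channels into a single output channel
func merge(inputs ...chan int) chan int {
    out := make(chan int)
    wg := sync.WaitGroup{}
    wg.Add(len(inputs))
    for _, in := range inputs {
        //One go routine per input channel, all writing to the same output
        go func(in chan int) {
            defer wg.Done()
            for n := range in {
                out <- n
            }
        }(in)
    }
    //Close the output channel once all inputs are drained
    go func() {
        wg.Wait()
        close(out)
    }()
    return out
}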
Be careful when exposing channels in your package APIs. It can be useful to provide an asynchronous interface to your users to consume data from, such as errors or events. But you must consider the channel lifecycle carefully. Remember that closing or writing to an already closed channel causes a panic.
If you want to allow the package API user to input data via a channel, do not create and return the channel to the user in your API. Instead, allow the user to pass in a channel. Now you are the consumer of the input channel while the user is in charge of its lifecycle.
If you output data to the API user via a channel, document the behavior (blocking, discarding) and lifecycle of your exposed channels well. The consumer should not be responsible for the channel's lifecycle at all. In cases where consumption of the channel is optional, use the select-default pattern to write.
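A sketch of the input-channel approach described above, with illustrative names: the package only consumes, while the caller owns the channel and signals completion by closing it.

//Consume processes work items from a caller-owned channel.
//The caller controls the lifecycle and ends processing by closing the channel.
func Consume(input <-chan string) {
    go func() {
        for item := range input {
            fmt.Println("processing", item)
        }
    }()
}

//Usage:
//work := make(chan string)
//mypackage.Consume(work)
//work <- "item"
//close(work) //the caller owns the lifecycle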
The most commonly used synchronization techniques are:
- The sync package: Mutex, RWMutex, WaitGroup and Once
- Channels
The sync package also offers a Pool, a thread-safe Map implementation, and a Cond. Pool is used for big objects that are temporarily un-used, to reduce the impact on garbage collection. A typed thread-safe map is often better implemented with a sync.RWMutex, though this may change with generics. Cond is a more fine-grained WaitGroup; check it out.
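A minimal sync.Pool sketch for reusing large buffers (names are illustrative):

var bufPool = sync.Pool{
    //New creates a fresh buffer when the pool is empty
    New: func() interface{} {
        return new(bytes.Buffer)
    },
}

func handleRequest() {
    buf := bufPool.Get().(*bytes.Buffer)
    defer func() {
        buf.Reset() //Reset before returning it to the pool
        bufPool.Put(buf)
    }()
    buf.WriteString("hello")
}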
In most cases the sync.RWMutex is your go-to tool for anything that needs thread-safe access. Before generics, a common use case was the creation of thread-safe collections, like in this example:
package synchronization

import "sync"

type ThreadSafeMap struct {
    sync.RWMutex
    m map[string]string
}

//New initializes the internal map; adding to a nil map would panic
func New() *ThreadSafeMap {
    return &ThreadSafeMap{m: map[string]string{}}
}

func (t *ThreadSafeMap) Add(key, value string) {
    t.Lock()
    defer t.Unlock()
    t.m[key] = value
}

func (t *ThreadSafeMap) Remove(key string) {
    t.Lock()
    defer t.Unlock()
    delete(t.m, key)
}

func (t *ThreadSafeMap) Get(key string) string {
    t.RLock()
    defer t.RUnlock()
    return t.m[key]
}
Always make sure your lock is released properly; there are few things worse to debug than locking issues. Using defer is a surefire way to do so.
The sync.WaitGroup is a great tool to synchronize go routines:
func TestWaitGroup(t *testing.T) {
    wg := sync.WaitGroup{}
    wg.Add(2) //Set counter to 2
    go func() {
        fmt.Println("Do something")
        <-time.After(time.Second)
        wg.Done() //Decreases counter
    }()
    go func() {
        fmt.Println("Do something else")
        <-time.After(time.Second * 2)
        wg.Done() //Decreases counter
    }()
    wg.Wait() //Wait for counter to become 0
    fmt.Println("All done")
}
Without the WaitGroup, this test would print "All done" and likely exit before both go routines are executed. The typical use case for WaitGroups are situations where you distribute work across multiple routines for parallel processing and need to wait for all of them to finish before proceeding with the main thread. As such it can be used to implement a simple fan-out/fan-in.
sync.Once is used whenever you have a piece of code that you want to be executed only once, in a thread-safe manner. This is rare, but it has its use cases.
func TestOnce(t *testing.T) {
    once := sync.Once{}
    var myFunc = func() {
        once.Do(func() {
            fmt.Println("See it only once")
        })
        fmt.Println("See it twice")
    }
    myFunc() //Only this call will execute the code inside once.Do()
    myFunc()
}
Prints:
=== RUN TestOnce
See it only once
See it twice
See it twice
--- PASS: TestOnce (0.00s)
PASS
The typical use case for Once are situations where lazy initialization of a global resource, like an http.Client instance for example, is met by potentially multiple routines hitting the initialization code at the same time.
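A sketch of that lazy initialization use case (names are illustrative):

var (
    clientOnce sync.Once
    httpClient *http.Client
)

//getHTTPClient initializes the shared client exactly once, even when called from many routines concurrently
func getHTTPClient() *http.Client {
    clientOnce.Do(func() {
        httpClient = &http.Client{Timeout: 10 * time.Second}
    })
    return httpClient
}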
Unbuffered channels can be used to synchronize go routines. It is not their primary intent, but it is used in examples already discussed, like the following:
func TestCancellation(t *testing.T) {
    //Create a channel to signal the end of processing
    cancel := make(chan struct{})
    //Write some random numbers
    c := make(chan int)
    go func() {
        for {
            c <- rand.Int()
        }
    }()
    //This routine will wait for 5 seconds, then send a `struct{}{}` into the cancel channel
    go func() {
        <-time.After(time.Second * 5)
        cancel <- struct{}{}
    }()
READ:
    for {
        select {
        case number := <-c:
            fmt.Println(number)
        case <-cancel:
            //After ~5 seconds, we will receive on the cancel channel and break out of the loop
            break READ
        }
    }
    fmt.Println("Done")
}
The typical use case for using channels to synchronize go routines, is when you want to actively signal something to the target routine. Either by sending something through the channel, or by closing it.
💡 Channels are also commonly used for timeout related synchronization, due to the convenience of <-time.After(duration) channels.