Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test worker priorization #45

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion worker/go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -5,5 +5,6 @@ go 1.14
require (
github.com/dadosjusbr/alba/git v0.0.0-20200925182652-7763ef421afb
github.com/dadosjusbr/alba/storage v0.0.0-20210315172958-12d46908cb02
github.com/dadosjusbr/executor v1.0.0 // indirect
github.com/dadosjusbr/executor v1.0.0
github.com/matryer/is v1.4.0
)
2 changes: 2 additions & 0 deletions worker/go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,8 @@ github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/markbates/oncer v0.0.0-20181203154359-bf2de49a0be2/go.mod h1:Ld9puTsIW75CHf65OeIOkyKbteujpZVXDpWK6YGZbxE=
github.com/markbates/safe v1.0.1/go.mod h1:nAqgmRi7cY2nqMc92/bSEeQA+R4OheNU2T1kNSCBdG0=
github.com/matryer/is v1.4.0 h1:sosSmIWwkYITGrxZ25ULNDeKiMNzFSr4V/eqBQP0PeE=
github.com/matryer/is v1.4.0/go.mod h1:8I/i5uYgLzgsgEloJE1U6xx5HkBQpAZvepWuujKwMRU=
github.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y=
github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
github.com/montanaflynn/stats v0.0.0-20171201202039-1bf9dbcd8cbe/go.mod h1:wL8QJuTMNUDYhXwkmfOly8iTdp5TEcJFWZD2D7SIkUc=
Expand Down
33 changes: 24 additions & 9 deletions worker/worker.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,13 @@ import (
"github.com/dadosjusbr/alba/storage"
)

type dbInterface interface {
GetPipelinesByDay(day int) ([]storage.Pipeline, error)
InsertExecution(e storage.Execution) error
GetLastExecutionsForAllPipelines() ([]storage.Execution, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Você provavelmente vai precisar passar algum tipo de quantidade aqui, não?

Outra coisa importante, a interface pode ser um bom lugar para documentar bem essas funções. Por exemplo, o método GetPipelinesByDay(int) retorna todos os pipelines do dia especificado, sempre no mês corrente.

GetLastExecutionsByPipelineID(limit, id int) ([]storage.Execution, error)
}

func main() {
var pipelines []storage.Pipeline
var finalPipelines []storage.Pipeline
Expand All @@ -22,6 +29,16 @@ func main() {
log.Fatal("error trying get environment variable: $MONGODB is empty")
}

errorLimitStr := os.Getenv("ERROR_LIMIT")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ERROR_LIMIT é a quantidade de vezes que a execução de um determinado mês pode falhar? Se sim, poderia chamar ela RETRY_LIMIT, por favor? Essa é a forma que ela se apresenta em outros programas.

Também poderia comentar essa variável? Vale super pena documentar esse parâmetro do programa.

if uri == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if uri == "" {
if errorLimitStr == "" {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lembrando que o strconv.Atoi falha quando a variável é vazia. Mas entendo que você quis dar uma mensagem mais bacana quando o erro mais recorrente.

log.Fatal("error trying get environment variable: $ERROR_LIMIT is empty")
}

errorLimit, err := strconv.Atoi(errorLimitStr)
if err != nil {
log.Fatalf("error trying convert variable $ERROR_LIMIT: %q", err)
}

dbClient, err := storage.NewDBClient(uri)
if err != nil {
log.Fatal(err)
Expand Down Expand Up @@ -55,16 +72,15 @@ func main() {
// if err != nil {
// log.Fatal(err)
// }
// TODO: merge pipelines a fim de não repetir os ids
// final_pipelines = append(final_pipelines, pipelines...)

// pipelines, err = getPipelinesForCompleteHistory(dbClient)
// if err != nil {
// log.Fatal(err)
// }
// final_pipelines = append(final_pipelines, pipelines...)

// Algoritmo: shuffle na lista + cap
toExecuteNow := prioritizeAndLimit(finalPipelines)
toExecuteNow := prioritizeAndLimit(dbClient, finalPipelines, errorLimit)

for _, p := range toExecuteNow {
err := run(emailSender, emailPassword, emailReceiver, p, dbClient)
Expand Down Expand Up @@ -143,13 +159,12 @@ func mergeEnv(defaultEnv, stageEnv map[string]string) map[string]string {
return env
}

func prioritizeAndLimit(list []storage.Pipeline) []storage.Pipeline {
// TODO: ordenar do mais recente para o mais antigo
// TODO: receber quantidade máxima de execuções via parâmetro e manter somente os x primeiros
func prioritizeAndLimit(db dbInterface, list []storage.Pipeline, limit int) []storage.Pipeline {

return list
}

func getPipelinesToExecuteToday(db *storage.DBClient) ([]storage.Pipeline, error) {
func getPipelinesToExecuteToday(db dbInterface) ([]storage.Pipeline, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eu acho arretada essa forma que programas em Go crescem em complexidade. Hoje em dia me soa bastante orgânico, natural? Nenhuma modição em extends ou implements! Note que você não precisou modificar em nada storage.DBClient!

results, err := db.GetPipelinesByDay(time.Now().Day())
if err != nil {
return nil, fmt.Errorf("error getting pipelines by day: %q", err)
Expand All @@ -160,14 +175,14 @@ func getPipelinesToExecuteToday(db *storage.DBClient) ([]storage.Pipeline, error

// Apenas as execuções que devem acontecer por causa do mecanismo de
// tolerância à falhas.
func getPipelinesThatFailed(db *storage.DBClient) []storage.Pipeline {
func getPipelinesThatFailed(db dbInterface) []storage.Pipeline {

return nil
}

// Apenas execuções de devem acontecer para completar o histórico. Devemos
// ignorar casos em que já houve tentativa de execução, quer seja sucesso ou falha.
func getPipelinesForCompleteHistory(db *storage.DBClient) []storage.Pipeline {
func getPipelinesForCompleteHistory(db dbInterface) []storage.Pipeline {

return nil
}
Expand Down
195 changes: 195 additions & 0 deletions worker/worker_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,195 @@
package main

import (
"testing"

"github.com/dadosjusbr/alba/storage"
"github.com/dadosjusbr/executor"
"github.com/matryer/is"
)

var pipelinesDB = []storage.Pipeline{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tu deve precisar criar vários desses a medida que o teste aumenta.

Sugestão: cria os storage.Pipeline baseado nas características que você quer, coloca nomes sugestivos neles (dada as características de erro ou algo simples ou complexo ou algo assim) e coloca eles na parte de baixo do programa.

Daí, cria o []storage.Pipeline dentro de cada método. A pessoa lendo esse arquivo ver os casos de teste e quando for ler um terminado método vai entender mais fácilmente o caso sendo testado.

{
Pipeline: executor.Pipeline{
Name: "Pipeline 01: Finalizado com sucesso",
},
ID: "pipeline1",
Entity: "Pipeline 01",
City: "João Pessoa",
FU: "PB",
Repo: "github.com/dadosjusbr/coletores",
Frequency: 30,
StartDay: 5,
LimitMonthBackward: 2,
LimitYearBackward: 2021,
},
{
Pipeline: executor.Pipeline{
Name: "Pipeline 02: Tolerância a falha",
},
ID: "pipeline2",
Entity: "Tribunal Regional do Trabalho 13ª Região",
City: "João Pessoa",
FU: "PB",
Repo: "github.com/dadosjusbr/coletores",
Frequency: 30,
StartDay: 5,
LimitMonthBackward: 2,
LimitYearBackward: 2021,
},
{
Pipeline: executor.Pipeline{
Name: "Pipeline 03: Histórico",
},
ID: "pipeline3",
Entity: "Pipeline 03",
City: "João Pessoa",
FU: "PB",
Repo: "github.com/dadosjusbr/coletores",
Frequency: 30,
StartDay: 5,
LimitMonthBackward: 2,
LimitYearBackward: 2021,
},
}

type fakeFinder struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fakeDB?

pipelines []storage.Pipeline
executions []storage.Execution
err error
}

func (fake fakeFinder) GetPipelinesByDay(day int) ([]storage.Pipeline, error) {
return fake.pipelines, fake.err
}
func (fake fakeFinder) InsertExecution(e storage.Execution) error {
return fake.err
}

// TODO: Retornar a última execução de cada pipeline
func (fake fakeFinder) GetLastExecutionsForAllPipelines() ([]storage.Execution, error) {
return fake.executions, fake.err
}

// TODO: Retornar as <limit> ultimas execuções daquele pipeline
func (fake fakeFinder) GetLastExecutionsByPipelineID(limit, id int) ([]storage.Execution, error) {
return fake.executions, fake.err
}

// PrioritizeAndLimit
// 1º descartar se o pipeline já tem uma execução finalizada com sucesso hoje
// 2º descartar se o número de execuções com erro for igual ou maior que o limite
// ...
// por último aplicar filtro de tamanho
func TestPrioritizeAndLimit_LimitTest(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isso está me parecendo um excelente caso para table driven tests (aqui e aqui).

cada caso de teste vai receber uma lista (pipeline), um limite a ser passaddo para função e o valor experado.

is := is.New(t)
list := pipelinesDB
limit := 2

finalPipelines := prioritizeAndLimit(fakeFinder{pipelines: pipelinesDB, err: nil}, list, limit)

is.True(len(finalPipelines) <= limit)
}

func TestPrioritizeAndLimit_PipelineWithMaxLimitOfFailure(t *testing.T) {
is := is.New(t)
list := pipelinesDB
limit := 3

executions := []storage.Execution{
{
PipelineResult: executor.PipelineResult{
Name: "Pipeline 02",
Status: "Setup Error",
},
ID: "pipeline2",
Entity: "Pipeline 02",
},
{
PipelineResult: executor.PipelineResult{
Name: "Pipeline 02",
Status: "Connection Error",
},
ID: "pipeline2",
Entity: "Pipeline 02",
},
{
PipelineResult: executor.PipelineResult{
Name: "Pipeline 02",
Status: "Run Error",
},
ID: "pipeline2",
Entity: "Pipeline 02",
},
}

finalPipelines := prioritizeAndLimit(fakeFinder{executions: executions, err: nil}, list, limit)

is.True(len(finalPipelines) == 2)
}

func TestPrioritizeAndLimit_PipelineWithTwoFailures(t *testing.T) {
is := is.New(t)
list := pipelinesDB
limit := 3

executions := []storage.Execution{
{
PipelineResult: executor.PipelineResult{
Name: "Pipeline 02",
Status: "Setup Error",
},
ID: "pipeline2",
Entity: "Pipeline 02",
},
{
PipelineResult: executor.PipelineResult{
Name: "Pipeline 02",
Status: "Connection Error",
},
ID: "pipeline2",
Entity: "Pipeline 02",
},
}

finalPipelines := prioritizeAndLimit(fakeFinder{executions: executions, err: nil}, list, limit)

is.True(len(finalPipelines) == 3)
}

func TestPrioritizeAndLimit_PipelineWithSuccessfulExecution(t *testing.T) {
is := is.New(t)
list := pipelinesDB
limit := 3

executions := []storage.Execution{
{
PipelineResult: executor.PipelineResult{
Name: "Pipeline 01",
Status: "OK",
},
ID: "pipeline1",
Entity: "Pipeline 01",
},
{
PipelineResult: executor.PipelineResult{
Name: "Pipeline 02",
Status: "Setup Error",
},
ID: "pipeline2",
Entity: "Pipeline 02",
},
{
PipelineResult: executor.PipelineResult{
Name: "Pipeline 03",
Status: "OK",
},
ID: "pipeline3",
Entity: "Pipeline 03",
},
}

finalPipelines := prioritizeAndLimit(fakeFinder{executions: executions, err: nil}, list, limit)

is.True(len(finalPipelines) == 1)
}