Backend Coding Assessment

How to Run the Solution

Install Go. You can find the instructions here.

Install GCC, which is needed for the tests.

Make sure to set your GOPATH directory. Instructions here.
Install Dep. You can find instructions here.

Dep is used for package management in this application.
Download the code into the correct directory.

Go is a little picky about where code lives. It expects the code to be inside your GOPATH directory, and for this project the code needs to be at the following path.

$GOPATH/src/github.com/Iukekini/backend-coding-assessment-Iukekini-1052
Build and test the project.

I set up a makefile that handles all the dependency loading, building, and testing.

make

If you want to run it manually, here are the commands it will run.
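```
# Sync the vendored dependencies with Gopkg.toml / Gopkg.lock
dep ensure
# Build the binary
go build -o podium-backend-assessment -v
# Run the tests in every package
go test -v ./...
```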
Run the application.

./podium-backend-assessment

Results Notes

The results are laid out in a table with the following columns:
- Probability - The probability that the classifier put the review in the right class (1-5).
- Rating - The class returned by the classifier.
- User - The user who authored the review.
- Visit Type - Service / Sales / Used.
- Score - The score the user gave in the review.
- Date - Date of the visit.
- Review - The title of the review. I didn't include the body as it was too long to display nicely.

If you want to see more reviews or pull more data (parse more pages), you can adjust that in the config.json file.

How I determined "Overly positive" reviews

To rank the reviews by their positivity, I set up a Bayes classifier. I used a set of Amazon reviews to train the classifier on what a positive review looks like. The classifier has 5 classes, one for each of the 5 stars of an Amazon review. After the classifier was trained, I ran each of the reviews I had parsed from the site through it. I then sorted the reviews by the result and picked the 3 highest rated reviews to show.
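
The approach can be sketched with a small hand-rolled multinomial naive Bayes classifier. This is a stand-in written against the standard library only, not goml's API and not the project's code. The tokenizer, the add-one smoothing, the toy training data, and the sort order (predicted rating first, then probability) are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
	"unicode"
)

const numClasses = 5 // star ratings 1-5

// NaiveBayes is a minimal multinomial naive Bayes text classifier.
type NaiveBayes struct {
	docs   [numClasses]float64            // training documents per class
	words  [numClasses]float64            // total tokens per class
	counts [numClasses]map[string]float64 // token counts per class
	vocab  map[string]bool
}

func NewNaiveBayes() *NaiveBayes {
	nb := &NaiveBayes{vocab: map[string]bool{}}
	for c := range nb.counts {
		nb.counts[c] = map[string]float64{}
	}
	return nb
}

// tokenize lowercases s and splits it on anything that is not a letter or digit.
func tokenize(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// Train adds one labelled document; stars must be in 1..5.
func (nb *NaiveBayes) Train(text string, stars int) {
	c := stars - 1
	nb.docs[c]++
	for _, w := range tokenize(text) {
		nb.counts[c][w]++
		nb.words[c]++
		nb.vocab[w] = true
	}
}

// Classify returns the most likely star rating and its normalised probability.
func (nb *NaiveBayes) Classify(text string) (int, float64) {
	var total float64
	for _, d := range nb.docs {
		total += d
	}
	tokens := tokenize(text)
	v := float64(len(nb.vocab)) + 1 // +1 reserves a slot for unseen words
	var logp [numClasses]float64
	best := 0
	for c := 0; c < numClasses; c++ {
		logp[c] = math.Log((nb.docs[c] + 1) / (total + numClasses)) // smoothed prior
		for _, w := range tokens {
			logp[c] += math.Log((nb.counts[c][w] + 1) / (nb.words[c] + v))
		}
		if logp[c] > logp[best] {
			best = c
		}
	}
	var sum float64 // normalise relative to the best class for stability
	for c := range logp {
		sum += math.Exp(logp[c] - logp[best])
	}
	return best + 1, 1 / sum
}

// Scored pairs a review title with the classifier's output.
type Scored struct {
	Title       string
	Rating      int
	Probability float64
}

// topReviews classifies every title and returns the n highest ranked,
// ordered by predicted rating and then by probability (an assumed ordering).
func topReviews(nb *NaiveBayes, titles []string, n int) []Scored {
	scored := make([]Scored, 0, len(titles))
	for _, t := range titles {
		r, p := nb.Classify(t)
		scored = append(scored, Scored{t, r, p})
	}
	sort.SliceStable(scored, func(i, j int) bool {
		if scored[i].Rating != scored[j].Rating {
			return scored[i].Rating > scored[j].Rating
		}
		return scored[i].Probability > scored[j].Probability
	})
	if len(scored) > n {
		scored = scored[:n]
	}
	return scored
}

func main() {
	nb := NewNaiveBayes()
	nb.Train("terrible awful broken", 1) // toy data, not the Amazon set
	nb.Train("okay average fine", 3)
	nb.Train("amazing wonderful perfect", 5)
	titles := []string{"terrible awful", "amazing wonderful", "okay fine", "perfect amazing wonderful"}
	for _, s := range topReviews(nb, titles, 3) {
		fmt.Printf("%d  %.3f  %s\n", s.Rating, s.Probability, s.Title)
	}
}
```

Sorting on the predicted class before the probability keeps a confident 3-star prediction from outranking a less confident 5-star one.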

Notes

The classifier training data was not perfect for this scenario. Since an Amazon review is more of a love / neutral / hate type of review, the classifier had a harder time distinguishing between a good review and an over-the-top review. This problem could be solved by creating a set of training data that better represents this problem.

Problems / Questions / Frustrations

Please feel free to open an issue.

Open Source References

goml for the classification algorithm

go-config for config loading and management

Testify - Add-ons for the Go test suite. Enables assert and panic checks.

Log15 for logging

goquery - Like jQuery, but for Go. Used for searching and parsing the web pages.

Training Data - I used the Amazon review CSV to train the classifier. I only used the first 4k rows.

Dep for package management
