In the second part of this project, we will create an API for the categorization model developed previously.
More specifically, you will develop a server that receives product data and returns the best category for each product using a pretrained model.
More info about the data can be found here.
Your API should be composed of the following components:
- Model Loading: Loads a pretrained model from the path specified in the environment variable MODEL_PATH.
- Categorization Endpoint: Exposes a POST endpoint /v1/categorize that receives a JSON with product data and returns a JSON with their predicted categories.
- Input Validation: Returns status 400 (Bad Request) in case of ill-formatted user input without killing the API.
- [BONUS] Contract Testing: Runs automated tests from a file test_api.py to validate the API responses according to different inputs.
NOTE: To test your API, you must provide a JSON file generated from the dataset test_producs.csv, containing a valid input for your API implementation. This file should be saved in the path available in the environment variable TEST_PRODUCTS_PATH.
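As an illustration of the note above, the JSON file could be produced with a small script like the following. This is only a sketch: the helper name build_test_payload and the limit parameter are hypothetical, and it assumes the CSV has a header row whose column names match the raw data fields (including title).

```python
import csv
import json
import os
from itertools import islice


def build_test_payload(csv_path: str, json_path: str, limit: int = 10) -> dict:
    """Write the first `limit` CSV rows to `json_path` using the API input schema."""
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        # Each row becomes one product; keys are the raw column names.
        rows = list(islice(csv.DictReader(csv_file), limit))
    payload = {"products": rows}
    with open(json_path, "w", encoding="utf-8") as json_file:
        # ensure_ascii=False keeps accented titles such as "Bebê" readable.
        json.dump(payload, json_file, ensure_ascii=False, indent=2)
    return payload
```

It might be called as build_test_payload("test_producs.csv", os.environ["TEST_PRODUCTS_PATH"]), where the location of the CSV file is an assumption about your setup.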
The server API should be implemented using the Flask library in a file named api.py.
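The Model Loading component listed earlier could look roughly like the sketch below. It assumes the model from the previous part was serialized with pickle, which this document does not state; if you saved it with another tool, the loading call changes accordingly. The function name load_model is hypothetical.

```python
import os
import pickle


def load_model(env_var: str = "MODEL_PATH"):
    """Load the pretrained model from the path stored in `env_var`."""
    model_path = os.environ.get(env_var)
    if not model_path:
        # Failing at startup is clearer than failing on the first request.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    with open(model_path, "rb") as model_file:
        # Assumption: the model was saved with pickle. Only unpickle trusted files.
        return pickle.load(model_file)
```

Loading the model once when the server starts, rather than on every request, keeps the endpoint fast and surfaces a missing or invalid path immediately.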
Use Python comments to document relevant details about your implementation. Remember that good documentation should focus on the why (e.g., why a specific type of model was chosen), since clean code should be enough to understand the how (e.g., how you selected a specific type of model).
The expected input for the server should follow this schema:
{
  "products": [
    {
      "title": "Lembrancinha"
    },
    {
      "title": "Carrinho de Bebê"
    }
  ]
}
You MAY expect to receive other fields besides the title to represent the products. Remember, however, that each key must be the name of the corresponding field in the raw data.
The expected output from the server should follow this schema:
{
  "categories": [
    "Lembrancinha",
    "Bebê"
  ]
}
You MUST NOT send other fields besides the category.
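To make the input and output schemas concrete, here is a minimal Flask sketch of the categorization endpoint with input validation. Several things are assumptions rather than requirements of this document: the create_app factory, the model exposing a predict method that accepts a list of titles (as a scikit-learn pipeline would), using only the title field as model input, and the shape of the error body returned with status 400.

```python
from flask import Flask, jsonify, request


def create_app(model) -> Flask:
    """Build the API around any object exposing `predict(list_of_titles)`."""
    app = Flask(__name__)

    @app.route("/v1/categorize", methods=["POST"])
    def categorize():
        # silent=True returns None instead of raising on malformed JSON,
        # so every kind of bad input is handled by the checks below.
        payload = request.get_json(silent=True)
        products = payload.get("products") if isinstance(payload, dict) else None
        if not isinstance(products, list):
            return jsonify({"error": "'products' must be a list"}), 400

        titles = []
        for product in products:
            if not isinstance(product, dict) or not isinstance(product.get("title"), str):
                return jsonify({"error": "every product needs a string 'title'"}), 400
            titles.append(product["title"])

        # An empty product list yields an empty result without calling the model.
        categories = [str(c) for c in model.predict(titles)] if titles else []
        # Only the "categories" field is returned, as the output schema requires.
        return jsonify({"categories": categories})

    return app
```

In api.py, the app could then be created once at startup from the model found at MODEL_PATH. Using a factory also lets tests pass in a stub model through Flask's test client.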
In this directory, we provide a containerized environment that uses docker and docker-compose to run the API. This should standardize the development environment and avoid compatibility problems.
To install docker and docker-compose, check their official documentation here and here. Both tools should be installable on Linux, macOS, and Windows.
To execute the API, just run the following command:
docker-compose up --build
Then open the link shown at the end of the output.
To install an OS package (Debian-based), add the name of the package to the file packages.txt. To install a Python package (Pip-based), add the name and version of the package to the file requirements.txt.
The evaluation will be based on four criteria:
- Correctness: If the solution runs without unexpected errors.
- Compliance: If the solution respects all specified behaviors, in particular concerning inputs and outputs.
- Code Quality: If the solution follows the principles of clean code and general good practices discussed in class.
- Documentation: If the solution documents relevant decisions in the right measure.