Metta's Movie API is a Python project that extracts movie data from an external API (TMDB), processes and filters it, and loads it into a PostgreSQL database. It exposes an API endpoint for accessing movie data and genres. The project uses Flask for the API server, PostgreSQL for database management, and Docker for containerization.
- Python
- Flask
- PostgreSQL
- Docker
- Clone the repository:
git clone git@github.com:MettaSurendhar/DataEngineeringProject.git
- Install dependencies:
pip install -r requirements.txt
- Set up the PostgreSQL database using Docker:
docker-compose up -d
- Create a .env file and add the required environment variables:
API_KEY=<your_api_key>
GENRE_LIST_API=<genre_list_api_endpoint>
MOVIE_LIST_API=<movie_list_api_endpoint>
DB_HOST=<database_host>
DB_NAME=<database_name>
DB_USER=<database_user>
DB_PASSWORD=<database_password>
DB_PORT=<database_port>
ENGINE_PASSWORD=<engine_password>
- Run the Python scripts separately to extract, filter, and load movie data into the database:
python dataExtraction.py
python dataTransformation.py
python dataLoad.py
- Run the Flask application to start the API server:
python app.py
- Alternatively, run the shell script to extract, filter, and load the data and start the Flask app in one step:
bash ./entryPoint.sh
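To illustrate the "filter" stage between extraction and loading, here is a minimal sketch of a transformation function. It is not the code in dataTransformation.py: the function name `transform_movies`, the `min_votes` threshold, and the choice of kept fields are assumptions made for illustration. The input keys (`id`, `title`, `release_date`, `genre_ids`, `vote_count`) follow the shape of movie records in TMDB list responses.

```python
from typing import Any


def transform_movies(raw_movies: list[dict[str, Any]], min_votes: int = 0) -> list[dict[str, Any]]:
    """Drop incomplete or duplicate records and keep only the fields to store.

    Hypothetical filtering rules; the project's real rules may differ.
    """
    seen_ids: set[int] = set()
    rows: list[dict[str, Any]] = []
    for movie in raw_movies:
        movie_id = movie.get("id")
        title = movie.get("title")
        # Skip records that lack an id or a non-empty title.
        if movie_id is None or not title:
            continue
        # Skip duplicates, which can appear when paging through results.
        if movie_id in seen_ids:
            continue
        # Skip movies below the (assumed) vote-count threshold.
        if (movie.get("vote_count") or 0) < min_votes:
            continue
        seen_ids.add(movie_id)
        rows.append({
            "id": movie_id,
            "title": title.strip(),
            # Store an empty release date as None so it can map to SQL NULL.
            "release_date": movie.get("release_date") or None,
            "genre_ids": list(movie.get("genre_ids", [])),
        })
    return rows


# Small made-up sample in the shape of TMDB movie records.
sample = [
    {"id": 1, "title": " Alpha ", "release_date": "2020-01-01", "genre_ids": [28], "vote_count": 50},
    {"id": 1, "title": "Alpha", "release_date": "2020-01-01", "genre_ids": [28], "vote_count": 50},
    {"id": 2, "title": "", "genre_ids": [], "vote_count": 10},
    {"id": 3, "title": "Gamma", "release_date": "", "genre_ids": [18, 35], "vote_count": 5},
]
rows = transform_movies(sample, min_votes=10)
print(rows)
```

Keeping the transformation a pure function over plain dictionaries means it can be tested without a TMDB key or a running database; the load step would then only need to insert the returned rows.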
- Fork the repository.
- Create a new branch (git checkout -b feature/your-feature-name).
- Make your changes.
- Test your changes thoroughly.
- Commit your changes (git commit -am 'Add new feature').
- Push to the branch (git push origin feature/your-feature-name).
- Create a new Pull Request.
- Ensure that the necessary environment variables are properly set up in the .env file for the project to function correctly.
- The project utilizes Docker for containerization, making it easy to set up the development environment.
- Review the requirements.txt file for all dependencies used in the project.
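One way to catch a misconfigured environment early is to check for the required variables before any script runs. The sketch below is an illustration, not part of the project: the helper names `missing_env_vars` and `check_env` are hypothetical, and it assumes the values from the .env file have already been loaded into the process environment (for example by Docker Compose or a dotenv loader).

```python
import os
from typing import Mapping

# Variable names taken from the .env template in the setup steps.
REQUIRED_VARS = (
    "API_KEY",
    "GENRE_LIST_API",
    "MOVIE_LIST_API",
    "DB_HOST",
    "DB_NAME",
    "DB_USER",
    "DB_PASSWORD",
    "DB_PORT",
    "ENGINE_PASSWORD",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def check_env(env: Mapping[str, str] = os.environ) -> None:
    """Raise a descriptive error if any required variable is missing."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))


# Example with an explicit mapping instead of the real environment.
example_env = {name: "placeholder" for name in REQUIRED_VARS}
example_env["DB_PORT"] = ""
print(missing_env_vars(example_env))
```

Calling `check_env()` at the top of each script would make a missing key fail with a clear message rather than a connection or authentication error further down.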