My name is Matthaios Letsios, and in recent years I have been working in the fields of Data Engineering and Data Science. My responsibilities in previous roles have been to:
- Maintain, develop and enhance Airflow pipelines, focusing on efficient data collection, transformation and storage.
- Develop Python services using technologies like gRPC, PostgreSQL, ClickHouse, Redis and Kafka.
- Develop and maintain NLP services.
- Develop machine learning models (mainly for anomaly detection).
- Ensure reliable software development through appropriate unit and regression tests, as well as automated linting as part of GitHub workflows.
In the past I have also worked as a researcher at Télécom ParisTech and Université Pierre et Marie Curie.
I'm also proud to be a co-author of the following publications:
- Finding Heaviest k-Subgraphs and Events in Social Media. ICDM Workshops 2016
- Scheduling under Uncertainty: A Query-based Approach. IJCAI 2018
Understats Scraping
understat.com is a website providing advanced data for football matches, e.g. xGoals, xAssists and xGChain. In this project I set up an Airflow instance to collect the data for Premier League matches, transform it, and finally visualize it in a Jupyter notebook. The motivation is the Fantasy Premier League game, where I use this data as a basis for my weekly player selection.
Book Recommender
This is a web app that uses sentence embeddings to find suitable books based on a description of what the reader is looking for. The web app was built using FastAPI.
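The idea of matching a free-text description against book descriptions can be sketched as follows. This is only an illustration, not the app's actual code: the `embed` function is a toy bag-of-words stand-in for a real sentence-embedding model, and the book titles, descriptions and function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence-embedding model (assumption):
    represents a text as a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(query: str, books: dict, top_k: int = 2) -> list:
    """Rank books by similarity between the query and each description."""
    q = embed(query)
    scored = [(cosine(q, embed(desc)), title) for title, desc in books.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_k]]


# Hypothetical catalogue used only for this example.
BOOKS = {
    "Book A": "a detective solves a murder in london",
    "Book B": "a young wizard attends a school of magic",
    "Book C": "a cookbook of simple italian pasta recipes",
}

print(recommend("a story about a wizard and magic", BOOKS, top_k=1))
```

With a real embedding model, only `embed` (and the vector type used in `cosine`) would change; the ranking step stays the same.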
In the past I have participated in many Google Hashcode competitions. Unfortunately the competition no longer exists, but I hope Google will bring it back in the future. Here is the code for some of my entries.
Google Hashcode 2021 - Qualification Round
Problem: Given a city plan and all car itineraries in that city, the goal is to schedule all traffic lights so that as many cars as possible reach their destination on time.
Approach: Our solution takes into account the in & out degree of each traffic light and assigns green time to each traffic light proportionally. We also found that giving every traffic light 1 second of green time produced good solutions in some instances. This solution allowed us to reach the top 20% of the leaderboard during the competition.
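The two strategies can be sketched roughly as below. This is a simplified illustration, not our competition code: it assumes the weight of a traffic light is the number of car routes that have to pass it, which may differ from the exact degree-based weighting we used, and the function names, the `max_green` cap and the sample data are all hypothetical.

```python
from collections import Counter


def street_loads(routes):
    """Count how many car routes must pass the light at the end of each street.
    The last street of a route is skipped: the car finishes there
    without crossing another light."""
    loads = Counter()
    for route in routes:
        loads.update(route[:-1])
    return loads


def proportional_schedule(incoming, loads, max_green=5):
    """Give each used incoming street green time proportional to its load,
    relative to the least loaded used street at the same intersection.
    max_green is an assumed cap to keep cycles short."""
    schedule = {}
    for node, streets in incoming.items():
        used = [s for s in streets if loads.get(s, 0) > 0]
        if not used:
            continue  # no car ever waits here, so no schedule is needed
        smallest = min(loads[s] for s in used)
        schedule[node] = [
            (s, min(max_green, max(1, round(loads[s] / smallest))))
            for s in used
        ]
    return schedule


def uniform_schedule(incoming, loads):
    """Baseline: every used incoming street gets 1 second of green time."""
    schedule = {}
    for node, streets in incoming.items():
        used = [(s, 1) for s in streets if loads.get(s, 0) > 0]
        if used:
            schedule[node] = used
    return schedule


# Hypothetical toy instance: streets "a" and "b" end at intersection 0.
routes = [["a", "c"], ["a", "c"], ["a", "c"], ["b", "c"], ["a", "d"]]
incoming = {0: ["a", "b"], 1: ["c"], 2: ["d"]}
loads = street_loads(routes)

print(proportional_schedule(incoming, loads))
print(uniform_schedule(incoming, loads))
```

In practice one would generate both schedules for each input file and submit whichever scores higher.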