Data analysis has captivated me from the very start, with its unique ability to transform raw information into actionable insights using applied mathematical methods. What excites me most about this field is the potential to support data-driven decisions, backed by clear visualizations, that can drive meaningful change. From building complex economic forecasting models to exploring new classification approaches, I’m driven by the challenge of making sense of data in different ways.
In this portfolio, I’m proud to present projects that showcase my skills and dedication to data science. You'll find a range of models I've developed, either by myself or with teams. Each project has helped me sharpen my technical expertise and push the boundaries of what data science can achieve. This portfolio is a reflection of my ongoing commitment to learning and my enthusiasm for driving real-world value through data.
The projects in my portfolio can be classified into four groups:
📄 Research: Universities in Hungary host a renowned scientific event called the Scientific Students’ Associations Conference (SSA, or TDK in Hungarian). It is an annual contest that offers an open platform for the work of students seeking academic knowledge beyond the regular curriculum. Participants write a research paper of academic quality and, if selected, present their work to a professional jury. Thanks to this opportunity, I was able to conduct more quantitative research, which also required coding.
🏫 Coursework: During my academic years, I took several courses that taught me how to analyze data and write scripts in various programming languages. I worked on these projects either by myself or with a group of other students, so they taught me not only hard skills but also communication and presentation skills.
🏆 Competitions: Participating in data science competitions has been a cornerstone of my learning journey. These challenged me to apply theoretical knowledge in real-world scenarios under tight deadlines. Through these experiences, I refined my coding, problem-solving, and teamwork skills. Each competition taught me new techniques, from algorithm optimization to innovative data visualization approaches, further enhancing my analytical toolkit.
💡 Personal: Outside formal settings, I enjoy exploring data science and modeling through personal projects. These often focus on topics that spark my curiosity as well as my creativity. These self-directed projects allow me to experiment with new methods and tools, further building my expertise in data science and quantitative analysis.
Description: This project focuses on calculating Value at Risk (VaR) for stock returns, a key risk management metric. VaR quantifies the potential loss on an investment within a specific timeframe and confidence level. Using historical stock data retrieved from Yahoo Finance, I implemented and compared three methods: Historical, Variance-Covariance, and Monte Carlo simulations. The project explores each method's assumptions, strengths, and limitations, presenting results with clear visualizations and explanations.
Methodologies: Risk Analysis, Statistical Modeling, Monte Carlo Simulation, Data Visualization
Languages: Python (Jupyter Notebook)
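The three VaR methods named above can be illustrated with a minimal sketch. This is not the project's notebook code: the function names, the 95% confidence level, the single-asset setup, and the synthetic normally distributed returns (used here in place of the Yahoo Finance data) are all assumptions made for illustration.

```python
import random
import statistics
from statistics import NormalDist


def historical_var(returns, confidence=0.95):
    """Historical VaR: read the loss quantile off the sorted past returns."""
    ordered = sorted(returns)
    # Position of the (1 - confidence) quantile in the sorted sample.
    idx = int((1 - confidence) * len(ordered))
    return -ordered[idx]


def parametric_var(returns, confidence=0.95):
    """Variance-covariance VaR for one asset, assuming normal returns."""
    mu = statistics.mean(returns)
    sigma = statistics.stdev(returns)
    z = NormalDist().inv_cdf(1 - confidence)  # negative for confidence > 0.5
    return -(mu + z * sigma)


def monte_carlo_var(returns, confidence=0.95, n_sims=20_000, seed=0):
    """Monte Carlo VaR: simulate normal returns, then take the loss quantile."""
    mu = statistics.mean(returns)
    sigma = statistics.stdev(returns)
    rng = random.Random(seed)
    simulated = [rng.gauss(mu, sigma) for _ in range(n_sims)]
    return historical_var(simulated, confidence)


# Hypothetical daily returns standing in for real price data.
data_rng = random.Random(42)
sample_returns = [data_rng.gauss(0.0005, 0.01) for _ in range(1000)]

results = {
    "historical": historical_var(sample_returns),
    "parametric": parametric_var(sample_returns),
    "monte_carlo": monte_carlo_var(sample_returns),
}
for name, value in results.items():
    print(f"{name:>12}: 1-day 95% VaR = {value:.4f}")
```

Each function returns VaR as a positive number, so a value of 0.016 reads as "a one-day loss above 1.6% is expected on about 5% of days". Because the sample data here is drawn from a normal distribution, the three estimates land close together; on real stock returns with fat tails, the historical estimate would typically diverge from the two normal-based ones.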
Description: In this project, I aimed to forecast the US Dollar / Japanese Yen (USD/JPY) exchange rate using a combination of econometric and machine learning models. This project investigates how effectively these approaches can predict exchange rates, given the inherent complexity and volatility of FX markets. The focus is on integrating macroeconomic indicators from both the US and Japan, performing detailed data exploration and visualization, and presenting results in a clear, insightful manner. Throughout the project, I explore the strengths and limitations of different modeling techniques in handling time series data and financial market dynamics.
Methodologies: Time Series Analysis, Econometric Modeling, Machine Learning, Histogram
Languages: Python
Do regional disparities within certain European countries exhibit significant and consistent patterns in specific directions? 🇪🇺
Description: In this ongoing project, I aim to visualize real GDP across regions in each European country at various levels. My focus is to explore whether there are distinct regional development patterns (for example, the western part of a country being more developed than the eastern part) and how these patterns change over time.
Methodologies: Data Visualization On Maps, Economic Analysis, Linear Regression
Languages: R
Description: In my thesis, I investigated the effects of interest rate increases on the Hungarian forint’s exchange rate amid supply-side inflation in Europe. Using an event study methodology, I analyzed whether these monetary policy measures significantly impacted the forint’s value relative to a basket of key currencies. My research hypothesis suggested that, during this unique inflationary period, interest rate hikes had minimal influence on exchange rates, potentially signaling the need for alternative approaches to managing inflation. The study aimed to provide insights into the limitations of traditional monetary policy tools in atypical economic conditions.
Methodologies: Event Study Design, Return Calculation Using Interest Rate Parity, Data Visualization and Analysis
Languages: R
Description: In this analysis, the task was to identify the characteristics of car insurance offers that are most likely to be converted into contracts for AXA Belgium. This involved analyzing a dataset of 44,928 observations and 22 variables related to customer, car, and broker characteristics. The goal was to develop a model that estimates the probability of conversion, providing insights to improve the offers and raise conversion rates through a better understanding of customer profiles and offer characteristics.
Methodologies: Machine Learning, Data Preprocessing, Model Selection and Evaluation (Logistic Regression, Decision Trees, Random Forests), Cross-Validation
Languages: Python
Description: The task involved supporting an American credit institution in assessing the climate risk of its agricultural investments by predicting crop yields in Minnesota. Our objective was to develop a model that forecasts potential losses in productivity due to climate change, focusing on three crops: corn, oats, and soybeans. We utilized historical crop yield and weather data, aggregated and cleaned the datasets, and built several linear multivariate OLS regression models of increasing complexity. The final model selected was a transformed linear regression, which provided reliable predictions of future crop yields, enabling the bank to make informed decisions regarding loan risk assessments.
Methodologies: Data Understanding and Preparation (crop yield and weather data), Geospatial Data, Multiple Linear Multivariate OLS Regression Models, Model Evaluation, Prediction and Forecasting
Languages: R
Description: My thesis investigates how monetary policy decisions by the National Bank of Hungary (MNB) influence the exchange rate of the Hungarian forint, especially in the short term. Following the end of the MNB’s extensive tightening cycle in 2022, the forint fell sharply, sparking public and media attention. This piqued my curiosity, motivating me to explore whether such monetary interventions have a significant effect on the exchange rate. My findings reveal that, in the short term, monetary policy decisions do not significantly impact the exchange rate, supporting my hypothesis. This conclusion aligns with existing literature, suggesting that factors beyond monetary policy also play a role in influencing the forint’s value.
Methodologies: Vector Autoregression (VAR) Models, Statistical Hypothesis Testing (t-tests), Monetary Transmission Mechanism
Languages: R
Description: In this study, we examined the effect of companies’ layoff announcements on the stock market using the event study methodology. We also analysed the differences in market reaction between tech and non-tech sectors. We found that, despite adverse market conditions, the stocks of the companies that laid off workers had positive cumulative average abnormal returns. We also found a statistically significant difference between the stock market reactions of the tech and non-tech sectors. We concluded that layoffs can result in positive abnormal returns for a company’s stock when they serve to improve cost efficiency, and that the reaction can vary depending on how much the company’s industry relies on human resources.
Methodologies: Event Study Analysis, Event and Estimation Windows, Abnormal Return Calculation, Statistical Testing
Languages: R
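The abnormal return calculation behind an event study can be sketched as follows. The original analysis was done in R and its expected-return model is not stated here, so this Python sketch assumes a market model (an OLS regression of stock returns on market returns over the estimation window). All function names and the toy return series are hypothetical.

```python
import statistics


def fit_market_model(stock, market):
    """Estimate alpha and beta by OLS over the estimation window."""
    mean_s = statistics.mean(stock)
    mean_m = statistics.mean(market)
    cov = sum((m - mean_m) * (s - mean_s) for m, s in zip(market, stock))
    var = sum((m - mean_m) ** 2 for m in market)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta


def abnormal_returns(stock, market, alpha, beta):
    """Actual return minus the return the market model predicts."""
    return [s - (alpha + beta * m) for s, m in zip(stock, market)]


# Estimation window: made-up returns where the stock tracks the market exactly.
est_market = [0.01, -0.005, 0.003, 0.007, -0.012, 0.004, -0.002, 0.009]
est_stock = [0.001 + 1.2 * m for m in est_market]

# Event window: the same relationship plus a 2% jump on the announcement day.
event_market = [0.002, -0.004, 0.006]
announcement_effect = [0.0, 0.02, 0.0]
event_stock = [
    0.001 + 1.2 * m + e for m, e in zip(event_market, announcement_effect)
]

alpha, beta = fit_market_model(est_stock, est_market)
ar = abnormal_returns(event_stock, event_market, alpha, beta)
car = sum(ar)  # cumulative abnormal return for this one firm

print(f"alpha={alpha:.4f}, beta={beta:.2f}, CAR={car:.4f}")
```

Averaging the abnormal returns across all firms in the sample for each event day, and then cumulating over the event window, gives the cumulative average abnormal return that the study reports. Testing whether that value differs from zero is the statistical testing step listed in the methodologies.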
Description: The task involved analyzing yacht and motorboat pricing using the "yacht_pricing" dataset. The goal was to predict the price in euros based on various characteristics, including boat type, year built, condition, dimensions (length, width, depth), displacement, number of cabins and beds, fuel capacity, engine hours, and recent views. This analysis aimed to determine which factors significantly impact yacht prices, adjust the model to address multicollinearity issues, and refine it through diagnostic tests and specification adjustments.
Methodologies: Exploratory Data Analysis, Ordinary Least Squares (OLS) Regression, Model Diagnostics and Specification Adjustments
Languages: R
Description: This task involved analyzing a dataset of Airbnb rentals in Vienna to explore the factors driving a rental's popularity. Using descriptive statistics and linear regression modeling in Stata, I assessed variables such as host characteristics, property features, and listing attributes to determine their impact on the number of reviews and ratings, which I defined as a measure of popularity. The goal was to build a model that explains what made certain listings more popular among guests.
Methodologies: Descriptive Statistics, Correlation Analysis, Linear Regression Modeling, Model Diagnostics and Validation, Comparative Analysis
Languages: Stata