Code and data from the master’s thesis “Decoding Spatial Semantics”. Analyzes and compares open-source LLMs and NMT systems in translating spatial prepositions from English to Brazilian Portuguese. Includes preprocessing scripts, datasets, and evaluation metrics.

rmaacario/LLMs-vs.NMT-spatial-semantics-translation

Decoding Spatial Semantics

This repository contains code and resources from the master’s thesis “Decoding Spatial Semantics: A Comparative Analysis of the Performance of Open-source LLMs against NMT Systems in Translating EN-PT-br”.

Overview

This study explores the challenges of translating spatial language with open-source Large Language Models (LLMs) and traditional Neural Machine Translation (NMT) systems. It focuses on how the spatial prepositions ACROSS, INTO, ONTO, and THROUGH are translated from English into Brazilian Portuguese (PT-br).
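
To illustrate the focus on these four prepositions, here is a minimal sketch of how English sentences containing them might be selected from a list of sentence pairs. It is not the repository's preprocessing script. The names `find_prepositions` and `filter_pairs` and the example sentences are hypothetical, chosen only for illustration.

```python
import re

# The four prepositions named in the overview.
SPATIAL_PREPOSITIONS = ("across", "into", "onto", "through")

# Whole-word, case-insensitive match, so "through" does not match "throughout".
_PATTERN = re.compile(
    r"\b(" + "|".join(SPATIAL_PREPOSITIONS) + r")\b", re.IGNORECASE
)


def find_prepositions(sentence):
    """Return the target prepositions found in a sentence, lowercased, in order."""
    return [match.group(1).lower() for match in _PATTERN.finditer(sentence)]


def filter_pairs(pairs):
    """Keep (english, portuguese) pairs whose English side has a target preposition."""
    kept = []
    for en, pt in pairs:
        found = find_prepositions(en)
        if found:
            kept.append({"en": en, "pt": pt, "prepositions": found})
    return kept


# Hypothetical example pairs, not taken from the thesis dataset.
pairs = [
    ("She walked across the bridge.", "Ela atravessou a ponte."),
    ("He spoke throughout the talk.", "Ele falou durante toda a palestra."),
    ("The cat jumped onto the table.", "O gato pulou para cima da mesa."),
]
selected = filter_pairs(pairs)
```

A plain string match like this cannot tell spatial uses from non-spatial ones (for example, "through hard work"), so a real pipeline would likely need manual or linguistic filtering on top of it.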

Contents

  • Code: Includes scripts for data preprocessing, running experiments, and evaluating results.
  • Datasets: Bilingual dataset of TED Talks subtitles focusing on spatial prepositions.
  • Evaluation Metrics: Scripts for computing BLEU, METEOR, BERTScore, COMET, and TER.
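
As a rough illustration of what an edit-based metric such as TER measures, the sketch below computes a word-level edit rate between a system translation and a reference. This is a simplification and not the repository's evaluation code: standard TER also counts block shifts, which are omitted here, and the function names and example sentences are hypothetical.

```python
def word_edit_distance(hyp_tokens, ref_tokens):
    """Levenshtein distance over tokens (insertions, deletions, substitutions)."""
    previous = list(range(len(ref_tokens) + 1))
    for i, hyp_word in enumerate(hyp_tokens, 1):
        current = [i]
        for j, ref_word in enumerate(ref_tokens, 1):
            cost = 0 if hyp_word == ref_word else 1
            current.append(
                min(
                    previous[j] + 1,         # delete a hypothesis word
                    current[j - 1] + 1,      # insert a reference word
                    previous[j - 1] + cost,  # match or substitute
                )
            )
        previous = current
    return previous[-1]


def edit_rate(hypothesis, reference):
    """Edits needed to turn the hypothesis into the reference, per reference word."""
    hyp_tokens = hypothesis.lower().split()
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hyp_tokens, ref_tokens) / len(ref_tokens)


# Hypothetical example sentences, not taken from the thesis dataset.
reference = "ela atravessou a ponte"
score_same = edit_rate("ela atravessou a ponte", reference)
score_diff = edit_rate("ela andou pela ponte", reference)
```

Lower values mean the output is closer to the reference. For reported results, an established implementation of each metric is preferable to a hand-written one like this.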
