Sequence to Sequence Learning with Attention Mechanism for Language Translation

This project focuses on implementing a neural machine translation model using PyTorch, with a specific emphasis on the attention mechanism.

General Information

This repository provides a detailed step-by-step guide to implementing a sequence-to-sequence model for language translation tasks. The model architecture is designed to capture dependencies between sequences and learn similarities using context vectors, facilitated by the attention mechanism.

Training Environment

The training for this model was conducted on High-Performance Computing (HPC) environments to ensure efficient computation and resource utilization.

Dataset

The training dataset used in this project is an English-to-French translation dataset. Overfitting was encountered during training (trying other datasets is recommended to mitigate this; treat this repository primarily as a comprehensive tutorial), but the main objective remains to showcase the attention mechanism's effectiveness in capturing sequence dependencies.
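A minimal loading sketch is shown below; the file name `eng-fra.txt`, the tab-separated line format, and the length filter are illustrative assumptions, not details confirmed by this repository.

```python
# Sketch: load and normalize English/French sentence pairs.
# The file name and tab-separated format are assumptions for illustration.
import re
import unicodedata

def normalize(s: str) -> str:
    # Lowercase, strip accents, and isolate punctuation so tokens split cleanly.
    s = "".join(c for c in unicodedata.normalize("NFD", s.lower().strip())
                if unicodedata.category(c) != "Mn")
    s = re.sub(r"([.!?])", r" \1", s)
    s = re.sub(r"[^a-zA-Z.!?]+", " ", s)
    return s.strip()

def load_pairs(path: str = "eng-fra.txt", max_len: int = 10):
    # Each line is assumed to hold "english sentence<TAB>french sentence".
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) < 2:
                continue
            src, tgt = normalize(parts[0]), normalize(parts[1])
            # Keep short sentences only; this also reduces the overfitting risk.
            if len(src.split()) <= max_len and len(tgt.split()) <= max_len:
                pairs.append((src, tgt))
    return pairs
```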

Model Overview

The sequence-to-sequence model architecture employed in this project consists of an encoder-decoder framework with an attention mechanism. The encoder processes the input sequence, while the decoder generates the output sequence. The attention mechanism allows the model to focus on relevant parts of the input sequence at each decoding step, thereby enhancing translation performance.
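The sketch below illustrates this encoder-decoder-with-attention structure in PyTorch. The single-layer GRUs, dot-product attention, and tensor shapes are illustrative assumptions rather than the exact architecture of the saved models.

```python
# Sketch of an encoder-decoder with attention; hyperparameters and layer
# choices are assumptions, not the repository's exact model definition.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, vocab_size: int, hidden_size: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, src):                              # src: (batch, src_len)
        embedded = self.embedding(src)                    # (batch, src_len, hidden)
        outputs, hidden = self.gru(embedded)              # keep every step for attention
        return outputs, hidden

class AttnDecoder(nn.Module):
    def __init__(self, vocab_size: int, hidden_size: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size * 2, vocab_size)

    def forward(self, token, hidden, encoder_outputs):
        # token: (batch, 1) previous word; hidden: (1, batch, hidden)
        embedded = self.embedding(token)                          # (batch, 1, hidden)
        output, hidden = self.gru(embedded, hidden)               # (batch, 1, hidden)
        # Dot-product attention: score every encoder step against the decoder state.
        scores = torch.bmm(output, encoder_outputs.transpose(1, 2))  # (batch, 1, src_len)
        weights = F.softmax(scores, dim=-1)
        context = torch.bmm(weights, encoder_outputs)              # (batch, 1, hidden)
        # Concatenate decoder state and context vector to predict the next word.
        logits = self.out(torch.cat([output, context], dim=-1))    # (batch, 1, vocab)
        return logits, hidden, weights
```

At each decoding step the attention weights produce a context vector over the encoder outputs, which is combined with the current decoder state before the next word is predicted.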

Usage

The repository includes pre-trained models (encoder and decoder) that can be readily applied for English to French translation tasks. Moreover, the framework is highly adaptable and can be extended to accommodate other datasets, languages, and tasks. Users can experiment with increasing model complexity by adding additional encoder and decoder layers or incorporating advanced techniques.
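A hedged usage sketch follows, assuming the checkpoints store complete encoder/decoder objects with interfaces like those in the architecture sketch above; the file names `encoder.pt`/`decoder.pt` and the special-token ids are placeholders to adapt to the actual saved models.

```python
# Sketch: load the saved encoder/decoder and translate one tokenized sentence
# with greedy decoding. File names and token ids are assumptions.
import torch

SOS_IDX, EOS_IDX = 0, 1          # assumed start/end-of-sentence token ids
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumes the checkpoints pickle full model objects; on recent PyTorch versions
# pass weights_only=False to torch.load for such checkpoints.
encoder = torch.load("encoder.pt", map_location=device).eval()
decoder = torch.load("decoder.pt", map_location=device).eval()

@torch.no_grad()
def translate(src_ids, max_len: int = 20):
    # src_ids: (1, src_len) tensor of source-vocabulary indices for one sentence.
    encoder_outputs, hidden = encoder(src_ids.to(device))
    token = torch.tensor([[SOS_IDX]], device=device)
    result = []
    for _ in range(max_len):
        logits, hidden, _ = decoder(token, hidden, encoder_outputs)
        token = logits.argmax(dim=-1)            # greedy choice of the next word
        if token.item() == EOS_IDX:
            break
        result.append(token.item())
    return result  # target-vocabulary indices to map back to French words
```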

Notes

While this project primarily focuses on English to French translation, the underlying principles and architecture can be generalized to a wide range of language translation tasks.

Saved models

The encoder is included in this repository, while the decoder must be downloaded separately due to file-size limits: https://www.mediafire.com/file/s2xa013nt1rrkac/decoder.zip/file

Dataset Source

The dataset used in this project is available here.

Reference

  • https://medium.com/@eugenesh4work/attention-mechanism-for-lstm-used-in-a-sequence-to-sequence-task-be1d54919876
  • https://www.kaggle.com/code/asemsaber/english2french-nmt-tf-seq2seq-attention

Requirements

  • Python (>=3.6)
  • PyTorch
