generated from alshedivat/al-folio
Commit
gozdesahin committed May 21, 2024
1 parent 7b06c85 commit a1e58d9
Showing 11 changed files with 64 additions and 26 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
--- | ||
layout: about | ||
inline: true | ||
title: Terminology-Grounded Translation | ||
description: "Bridging the Gap Between Wikipedians and Scientists with Terminology-Aware Translation: A Case Study in Turkish" | ||
|
||
img: assets/img/projects/alfalfas.webp | ||
img_contains_title: true | ||
publications: 'projects^=*terminology' | ||
|
||
profile: | ||
name: Terminology-Grounded Translation | ||
image: projects/alfalfas.webp | ||
align: left | ||
address: > | ||
--- | ||
|
||
## Bridging the Gap Between Wikipedians and Scientists with Terminology-Aware Translation: A Case Study in Turkish | ||
|
||
<br> This project addresses the gap between the growing volume of English-to-Turkish Wikipedia translations and the shortage of contributors, particularly in technical domains. Drawing on a collaborative terminology dictionary built by academics, we propose a pipeline to improve translation quality. We focus on bridging the academic and Wikipedia communities, creating datasets, and developing NLP models for terminology identification, terminology retrieval, and terminology-aware translation. The aim is to foster sustained contributions and improve the overall quality of Turkish Wikipedia articles. | ||
|
||
### Goals | ||
|
||
The project will focus on the following tasks: | ||
|
||
- **High-quality parallel corpora for terminology-aware translation:** We aim to create 3,000 English-Turkish parallel sentences, each containing: i) English text annotated with its technical terms, ii) links to the correct terminology entries in the database, and iii) edited translations that use the correct Turkish terms. | ||
|
||
- **Term Identification:** Build models that identify technical terms in a multilingual setting. | ||
|
||
- **Term Linking:** Build models that ground the identified terms in a terminology database where possible. When the database does not contain a term, a notification system will alert the domain experts. | ||
|
||
- **Terminology-Aware Translation:** We will build post-editing and translation systems that are constrained by the terminology database. | ||
|
||
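The corpus record and the term-linking step described in the goals can be pictured with a small sketch. This is only an illustration under stated assumptions: the class names, field names, and entry IDs are hypothetical, and exact lowercase dictionary lookup stands in for the learned linking models the project plans to build.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TermAnnotation:
    """One annotated technical term in the English text (hypothetical layout)."""
    start: int                # character offset where the term begins
    end: int                  # character offset one past the end of the term
    entry_id: Optional[str]   # ID of the terminology entry, or None if unlinked


@dataclass
class ParallelSentence:
    """One corpus record: English text, edited Turkish translation, term links."""
    english: str
    turkish: str
    terms: List[TermAnnotation] = field(default_factory=list)


def link_terms(
    english: str,
    candidate_spans: List[Tuple[int, int]],
    term_db: Dict[str, str],
) -> Tuple[List[TermAnnotation], List[str]]:
    """Ground candidate spans in term_db by exact lowercase match.

    Returns the annotations plus the surface forms missing from the database,
    which a notification system could forward to domain experts.
    """
    linked: List[TermAnnotation] = []
    missing: List[str] = []
    for start, end in candidate_spans:
        surface = english[start:end]
        entry_id = term_db.get(surface.lower())  # None when the term is absent
        linked.append(TermAnnotation(start, end, entry_id))
        if entry_id is None:
            missing.append(surface)
    return linked, missing


if __name__ == "__main__":
    # Hypothetical one-entry database; real entries would come from the dictionary.
    db = {"neural network": "T001"}
    text = "A neural network uses backpropagation."
    spans = []
    for term in ("neural network", "backpropagation"):
        begin = text.find(term)
        spans.append((begin, begin + len(term)))
    annotations, unknown = link_terms(text, spans, db)
    print(annotations)
    print("Terms to report to domain experts:", unknown)
```

Returning the unlinked surface forms separately keeps the "notify the experts" path independent of how the spans were found, so a learned identifier could replace the hand-written spans without changing the record layout.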
### Team | ||
|
||
- Asst. Prof. Gözde Gül Şahin | ||
- Ali Gebeşçe (Master's student) | ||
|
||
### Funding | ||
This project is funded by the Wikimedia Research Fund. Official URL for the funded project: [https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_Fund/Bridging_the_Gap_Between_Wikipedians_and_Scientists_with_Terminology-Aware_Translation:_A_Case_Study_in_Turkish](https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_Fund/Bridging_the_Gap_Between_Wikipedians_and_Scientists_with_Terminology-Aware_Translation:_A_Case_Study_in_Turkish) | ||
|