update members & news #20

Merged
merged 1 commit into from
Aug 16, 2024
25 changes: 25 additions & 0 deletions _members/alumni_kuzey.md
@@ -0,0 +1,25 @@
---
layout: about
inline: false
group: Former visitors, BSc/ MSc students, Interns
group_rank: 5
team_frontpage: false

title: Kuzey Kantarcıoğlu
description: Profile of Kuzey Kantarcıoğlu
lastname: Kantarcıoğlu
publications: 'author^=*Kantarcıoğlu'

teaser: >
**Previously**: Worked on Procedural Language Generation.
<br>
**Currently**: Undergraduate student at Stanford University studying computer science and linguistics.
profile:
name: Kuzey Kantarcıoğlu
align: right
image: mems/kantarcioglu-profile.webp
role: Intern
email: kuzeykantarcioglu@hotmail.com
---

I am currently an undergraduate student at Stanford University studying computer science and linguistics.
27 changes: 27 additions & 0 deletions _members/intern_ata.md
@@ -0,0 +1,27 @@
---
layout: about
inline: false
group: Undergrad Intern
group_rank: 4
team_frontpage: false

title: Ata Halıcıoğlu
description: Profile of Ata Halıcıoğlu, Bachelor Student at Koç University.
lastname: Halıcıoğlu
publications: 'author^=*Halıcıoğlu'

teaser: >
I am a senior in Electrical and Electronics Engineering, double majoring in Computer Engineering at Koç University.

profile:
name: Ata Halıcıoğlu
align: right
image: mems/halicioglu-profile.webp
role: Undergrad Intern
email: ahalicioglu20@ku.edu.tr

---

I am a senior in Electrical and Electronics Engineering, double majoring in Computer Engineering at Koç University.


4 changes: 1 addition & 3 deletions _news/2023-07-12-paper-INLG23-accepted.md
@@ -7,6 +7,4 @@ inline: true
Paper accepted to [INLG 2023](https://inlg2023.github.io/)!

***
Our paper entitled [Metric-Based In-context Learning: A Case Study in Text Simplification]({{ '/assets/pdf/2023.inlg-main.18.pdf' | relative_url }}) is accepted to [INLG 2023](https://inlg2023.github.io/) conference! Check the [repo](https://github.com/GGLAB-KU/metric-based-in-context-learning) for more details 📣

> Abstract: In-context learning (ICL) for large language models has proven to be a powerful approach for many natural language processing tasks. However, determining the best method to select examples for ICL is nontrivial as the results can vary greatly depending on the quality, quantity, and order of examples used. In this paper, we conduct a case study on text simplification (TS) to investigate how to select the best and most robust examples for ICL. We propose Metric-Based in-context Learning (MBL) method that utilizes commonly used TS metrics such as SARI, compression ratio, and BERT-Precision for selection. Through an extensive set of experiments with various-sized GPT models on standard TS benchmarks such as TurkCorpus and ASSET, we show that examples selected by the top SARI scores perform the best on larger models such as GPT-175B, while the compression ratio generally performs better on smaller models such as GPT-13B and GPT-6.7B. Furthermore, we demonstrate that MBL is generally robust to example orderings and out-of-domain test sets, and outperforms strong baselines and state-of-the-art finetuned language models. Finally, we show that the behaviour of large GPT models can be implicitly controlled by the chosen metric. Our research provides a new framework for selecting examples in ICL, and demonstrates its effectiveness in text simplification tasks, breaking new ground for more accurate and efficient NLG systems.
Our paper entitled [Metric-Based In-context Learning: A Case Study in Text Simplification]({{ '/assets/pdf/2023.inlg-main.18.pdf' | relative_url }}) is accepted to [INLG 2023](https://inlg2023.github.io/) conference! Check the [repo](https://github.com/GGLAB-KU/metric-based-in-context-learning) for more details 📣
6 changes: 1 addition & 5 deletions _news/2023-09-06-papers-AACL23-accepted.md
@@ -9,9 +9,5 @@ inline: true
***
Our paper entitled [Benchmarking Procedural Language Understanding for Low-Resource Languages: A Case Study on Turkish]({{ '/assets/pdf/2309.11346.pdf' | relative_url }}) is accepted to the main [IJCNLP-AACL 2023](http://www.ijcnlp-aacl2023.org/) conference! Check the [repo](https://github.com/GGLAB-KU/turkish-plu) for more details 📣

> Abstract: Understanding procedural natural language (e.g., step-by-step instructions) is a crucial step to execution and planning. However, while there are ample corpora and downstream tasks available in English, the field lacks such resources for most languages. To address this gap, we conduct a case study on Turkish procedural texts. We first expand the number of tutorials in Turkish wikiHow from 2,000 to 52,000 using automated translation tools, where the translation quality and loyalty to the original meaning are validated by a team of experts on a random set. Then, we generate several downstream tasks on the corpus, such as linking actions, goal inference, and summarization. To tackle these tasks, we implement strong baseline models via fine-tuning large language-specific models such as TR-BART and BERTurk, as well as multilingual models such as mBART, mT5, and XLM. We find that language-specific models consistently outperform their multilingual models by a significant margin across most procedural language understanding (PLU) tasks.

***
Another paper [GECTurk: Grammatical Error Correction and Detection Dataset for Turkish]({{ '/assets/pdf/2309.11346.pdf' | relative_url }}) is accepted to the [Findings of IJCNLP-AACL 2023](http://www.ijcnlp-aacl2023.org/) 📣

> Abstract: Grammatical Error Detection and Correction (GEC) tools have proven useful for native speakers and second language learners. Developing such tools requires a large amount of parallel, annotated data, which is unavailable for most languages. Synthetic data generation is a common practice to overcome the scarcity of such data. However, it is not straightforward for morphologically rich languages like Turkish due to complex writing rules that require phonological, morphological, and syntactic information. In this work, we present a flexible and extensible synthetic data generation pipeline for Turkish covering more than 20 expert-curated grammar and spelling rules (a.k.a., writing rules) implemented through complex transformation functions. Using the pipeline, we derive 130,000 high-quality parallel sentences from professionally edited articles. Additionally, we create a more realistic test set by manually annotating a set of movie reviews. We implement three baselines formulating the task as i) neural machine translation, ii) sequence tagging, and iii) few-shot learning with prefix tuning, achieving strong results. Then we perform a zero-shot evaluation of our pretrained models on the coarse-grained "BOUN -de/-da" and fine-grained expert annotated dataset. Our results suggest that our corpus, GECTurk, is high-quality and allows knowledge transfer for the out-of-domain setting. To encourage further research on Turkish GEC, we release our dataset, baseline models, and synthetic data generation pipeline with https://anonymous.4open.science/r/tr-gec-17D6/.
Another paper [GECTurk: Grammatical Error Correction and Detection Dataset for Turkish]({{ '/assets/pdf/2309.11346.pdf' | relative_url }}) is accepted to the [Findings of IJCNLP-AACL 2023](http://www.ijcnlp-aacl2023.org/) 📣
4 changes: 1 addition & 3 deletions _news/2024-01-18-paper-arxiv.md
@@ -7,6 +7,4 @@ inline: true
Our new paper is available on [arXiv](https://arxiv.org/)!

***
Our paper entitled [Quantifying Divergence for Human-AI Collaboration and Cognitive Trust]({{ '/assets/pdf/2312.08722.pdf' | relative_url }}) is available on [arXiv](https://arxiv.org/)! Check the [repo](https://github.com/gglab-ku/cogeval) for more details 📣

> Abstract: Predicting the collaboration likelihood and measuring cognitive trust to AI systems is more important than ever. To do that, previous research mostly focus solely on the model features (e.g., accuracy, confidence) and ignore the human factor. To address that, we propose several decision-making similarity measures based on divergence metrics (e.g., KL, JSD) calculated over the labels acquired from humans and a wide range of models. We conduct a user study on a textual entailment task, where the users are provided with soft labels from various models and asked to pick the closest option to them. The users are then shown the similarities/differences to their most similar model and are surveyed for their likelihood of collaboration and cognitive trust to the selected system. Finally, we qualitatively and quantitatively analyze the relation between the proposed decision-making similarity measures and the survey results. We find that people tend to collaborate with their most similar models -- measured via JSD -- yet this collaboration does not necessarily imply a similar level of cognitive trust. We release all resources related to the user study (e.g., design, outputs), models, and metrics at our repo.
Our paper entitled [Quantifying Divergence for Human-AI Collaboration and Cognitive Trust]({{ '/assets/pdf/2312.08722.pdf' | relative_url }}) is available on [arXiv](https://arxiv.org/)! Check the [repo](https://github.com/gglab-ku/cogeval) for more details 📣
4 changes: 1 addition & 3 deletions _news/2024-05-16-paper-ACL-findings.md
@@ -7,6 +7,4 @@ inline: true
Paper accepted to [Findings of ACL 2024](https://2024.aclweb.org/)!

***
Our paper entitled [PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset]({{ '/assets/pdf/2403.03167.pdf' | relative_url }}) is accepted to [Findings of ACL 2024](https://2024.aclweb.org/)! Check the [repo](https://github.com/GGLAB-KU/paradise) for more details 📣

> Abstract: Recently, there has been growing interest within the community regarding whether large language models are capable of planning or executing plans. However, most prior studies use LLMs to generate high-level plans for simplified scenarios lacking linguistic complexity and domain diversity, limiting analysis of their planning abilities. These setups constrain evaluation methods (e.g., predefined action space), architectural choices (e.g., only generative models), and overlook the linguistic nuances essential for realistic analysis. To tackle this, we present PARADISE, an abductive reasoning task using Q&A format on practical procedural text sourced from wikiHow. It involves warning and tip inference tasks directly associated with goals, excluding intermediary steps, with the aim of testing the ability of the models to infer implicit knowledge of the plan solely from the given goal. Our experiments, utilizing fine-tuned language models and zero-shot prompting, reveal the effectiveness of task-specific small models over large language models in most scenarios. Despite advancements, all models fall short of human performance. Notably, our analysis uncovers intriguing insights, such as variations in model behavior with dropped keywords, struggles of BERT-family and GPT-4 with physical and abstract goals, and the proposed tasks offering valuable prior knowledge for other unseen procedural tasks. The PARADISE dataset and associated resources are publicly available for further research exploration with this https URL.
Our paper entitled [PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset]({{ '/assets/pdf/2403.03167.pdf' | relative_url }}) is accepted to [Findings of ACL 2024](https://2024.aclweb.org/)! Check the [repo](https://github.com/GGLAB-KU/paradise) for more details 📣
12 changes: 12 additions & 0 deletions _news/TA-award-gurkan-abed.md
@@ -0,0 +1,12 @@
---
layout: post
date: 2024-08-13
inline: true
---

Congratulations to Abdulfattah and Gürkan on receiving the Distinguished TA Award! 🎉

***
We are proud to announce that Abdulfattah and Gürkan have received the Distinguished Teaching Assistant Award for their exceptional dedication to teaching. Congratulations to both of them! 🎉

> <img title="ta_award_abed_gurkan" alt="ta_award_abed" src="assets/img/news/award_abed_gurkan.webp" width="287" height="419">
11 changes: 11 additions & 0 deletions _news/google_cloud_gemini_10000.md
@@ -0,0 +1,11 @@
---
layout: post
date: 2024-08-13
inline: true
---

"Exploring and Improving the Capacity of Gemini Pro for Meta-Linguistic Reasoning" has been awarded a $10,000 cloud credit grant from Google.

***
We are excited to announce that our proposal, "Exploring and Improving the Capacity of Gemini Pro for Meta-Linguistic Reasoning," has been awarded a $10,000 cloud credit grant from Google. This generous support will enable us to further our research and development efforts in enhancing the capabilities of the Gemini Pro system for advanced meta-linguistic reasoning tasks.

Binary file added assets/img/mems/halicioglu-profile.webp
Binary file not shown.
Binary file added assets/img/mems/kantarcioglu-profile.webp
Binary file not shown.
Binary file added assets/img/news/award_abed_gurkan.webp
Binary file not shown.