Commit

Update
friedrichor committed Feb 27, 2024
1 parent 6245c3f commit 4747da4
Showing 6 changed files with 43 additions and 60 deletions.
103 changes: 43 additions & 60 deletions index.html
@@ -24,7 +24,7 @@
<meta name="viewport" content="width=device-width, initial-scale=1">


<title>StickerConv: Generating Multimodal Empathetic Responses from Scratch</title>
<title>STICKERCONV: Generating Multimodal Empathetic Responses from Scratch</title>
<link rel="icon" type="image/x-icon" href="static/images/favicon.ico">
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro"
rel="stylesheet">
@@ -52,7 +52,7 @@
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h1 class="title is-1 publication-title">StickerConv: Generating Multimodal Empathetic Responses from Scratch</h1>
<h1 class="title is-1 publication-title">S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>: Generating Multimodal Empathetic Responses from Scratch</h1>
<div class="is-size-5 publication-authors">
<!-- Paper authors -->
<span class="author-block">
@@ -135,16 +135,17 @@ <h1 class="title is-1 publication-title">StickerConv: Generating Multimodal Empa
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>
Stickers, while widely recognized for enhancing empathetic communication in online interactions, remain underexplored
in current empathetic dialogue research. In this paper, we introduce the Agent for StickerConv (Agent4SC), which uses
collaborative agent interactions to realistically simulate human behavior with sticker usage, thereby enhancing multimodal
empathetic communication. Building on this foundation, we develop a multimodal empathetic dialogue dataset, StickerConv,
which includes 12.9K dialogue sessions, 5.8K unique stickers, and 2K diverse conversational scenarios, specifically
designs to augment the generation of empathetic responses in a multimodal context. To leverage the richness of this dataset,
we propose <b>PE</b>rceive and <b>G</b>enerate <b>S</b>tickers (PEGS), a multimodal empathetic response model, complemented by a
comprehensive set of empathy evaluation metrics based on LLM. Our experiments demonstrate PEGS's effectiveness in
generating contextually relevant and emotionally resonant multimodal empathetic responses, contributing to the advancement
of more nuanced and engaging empathetic dialogue systems.
Stickers, while widely recognized for enhancing empathetic communication in online interactions, remain underexplored in
current empathetic dialogue research, notably due to the challenge of a lack of comprehensive datasets. In this paper, we
introduce the Agent for S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span> (Agent4SC),
which uses collaborative agent interactions to realistically simulate human behavior with sticker usage, thereby enhancing
multimodal empathetic communication. Building on this foundation, we develop a multimodal empathetic dialogue dataset,
S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>, comprising 12.9K dialogue sessions,
5.8K unique stickers, and 2K diverse conversational scenarios. This dataset serves as a benchmark for multimodal empathetic
generation. To advance further, we propose <b>PE</b>rceive and <b>G</b>enerate <b>S</b>tickers (PEGS), a multimodal empathetic
response generation framework, complemented by a comprehensive set of empathy evaluation metrics based on LLM. Our experiments
demonstrate PEGS's effectiveness in generating contextually relevant and emotionally resonant multimodal empathetic
responses, contributing to the advancement of more nuanced and engaging empathetic dialogue systems.
</div>
</div>
</div>
@@ -162,16 +163,20 @@ <h2 class="title is-2">Technical Description</h2>
<!-- Agent -->
<div class="columns is-centered">
<div class="column is-full-width">
<h4 class="title is-3">• Agent for StickerConv (Agent4SC)</h4>
<h4 class="title is-3">• Agent for S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span> (Agent4SC)</h4>

<div class="content has-text-justified">
<img class="columns is-centered has-text-centered" src="./static/images/Agent4SC.png"
alt="Teaser" width="95%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: center;">
<p style="text-align: justify;">
<font color="061E61">
<b>Figure 1:</b> The Overview of Agent4SC.
<b>Figure 1:</b> The overview of Agent4SC. Memory and Plan modules enable the agent to mimic human observation and
thought, overcoming LLMs' inability to grasp nuanced emotions. The Action module supports generating insights with
human-like emotional reactions. The Profile module gives each agent distinct reflections and actions. Furthermore,
Agent4SC uses stickers as a Tool for more natural conversation, allowing the agent to choose stickers like humans.
These modules streamline observation, reflection, and action, while the Manager Agent maintains performance and quality.
</font>
</p>
</figcaption>
@@ -182,16 +187,16 @@ <h4 class="title is-3">• Agent for StickerConv (Agent4SC)</h4>
<!-- Dataset -->
<div class="columns is-centered">
<div class="column is-full-width">
<h2 class="title is-3">StickerConv</h2>
<h2 class="title is-3">S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span></h2>

<div class="content has-text-justified">
<img class="columns is-centered has-text-centered" src="./static/images/StickerConv_example.png"
alt="Teaser" width="40%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: left;">
<p style="text-align: justify;">
<font color="061E61">
<b>Figure 2:</b> An example of multimodal conversation in our StickerConv dataset.
<b>Figure 2:</b> An example of multimodal conversation in our S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span> dataset.
Both parties can utilize the stickers to express their emotions, which enhances interactivity and expression.
Assistant can empathize with the user according to the conversation (<span style="color: rgb(0, 153, 0);">green</span> text).
</font>
@@ -203,9 +208,9 @@ <h2 class="title is-3">• StickerConv</h2>
alt="Teaser" width="50%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: center;">
<p style="text-align: justify;">
<font color="061E61">
<b>Figure 3:</b> The statistics of StickerConv.
<b>Figure 3:</b> The statistics of S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>.
</font>
</p>
</figcaption>
@@ -215,7 +220,7 @@ <h2 class="title is-3">• StickerConv</h2>
alt="Teaser" width="50%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: left;">
<p style="text-align: justify;">
<font color="061E61">
<b>Figure 4:</b> The chart of emotional distribution in the choice of stickers between users and the system.
This chart revealed a striking trend: users have a significant preference for stickers that convey negative emotions,
@@ -234,9 +239,9 @@ <h2 class="title is-3">• StickerConv</h2>
alt="Teaser" width="90%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: left;">
<p style="text-align: justify;">
<font color="061E61">
<b>Figure 5:</b> Emotion distribution of user profile in Agent for StickerConv.
<b>Figure 5:</b> Emotion distribution of user profile in Agent for S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>.
</font>
</p>
</figcaption>
@@ -247,9 +252,9 @@ <h2 class="title is-3">• StickerConv</h2>
alt="Teaser" width="90%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: left;">
<p style="text-align: justify;">
<font color="061E61">
<b>Figure 6:</b> The 200 most popular emotion-related words in StickerConv.
<b>Figure 6:</b> The 200 most popular emotion-related words in S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>.
</font>
</p>
</figcaption>
@@ -269,14 +274,12 @@ <h2 class="title is-3">• PEGS</h2>
alt="Teaser" width="80%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: left;">
<p style="text-align: justify;">
<font color="061E61">
<b>Figure 8:</b> The overview architecture of PEGS.
The input images are jointly encoded by image encoder, Q-Former and a linear layer.
Vicuna is used as the language model.
To empower the language model to generate images,
the output of the LLM is first mapped into the input space of the image decoder through a feature mapper,
and then the frozen image decoder is employed to generate images.
<b>Figure 7:</b> The architecture of PEGS framework includes various routing options, distinguished by colored connecting
lines. Input stickers undergo joint encoding by an image encoder, Q-Former, and a linear layer, with Vicuna serving as the
language model. The output of the LLM activates two sets of tokens differently across model versions: one for image retrieval
and the other as a textual condition. Subsequently, the frozen image decoder generates images.
</font>
</p>
</figcaption>
@@ -295,40 +298,21 @@ <h2 class="title is-3">• PEGS</h2>
<h2 class="title is-2">Results</h2>
<br>
</div>
<!-- Examples (Positive Emotion) -->
<!-- Conversation -->
<div class="columns is-centered">
<div class="column is-full-width">

<div class="content has-text-justified">
<img class="columns is-centered has-text-centered" src="static/images/results_case_positive.png"
<img class="columns is-centered has-text-centered" src="static/images/conversation.png"
alt="Teaser" width="95%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: left;">
<p style="text-align: justify;">
<font color="061E61">
<b>Figure 9:</b> Examples of conversations with positive emotions.
Users can chat with multimodal content (text and stickers) and will receive positive multimodal responses.
</font>
</p>
</figcaption>
</div>
</div>
</div>

<!-- Examples (Negative Emotion) -->
<div class="columns is-centered">
<div class="column is-full-width">

<div class="content has-text-justified">
<img class="columns is-centered has-text-centered" src="static/images/results_case_negative.png"
alt="Teaser" width="95%" style="margin:0 auto">
<br>
<figcaption>
<p style="text-align: left;">
<font color="061E61">
<b>Figure 10:</b> Examples of conversations with negative emotions.
Our model can empathize with users who are suffering from negative emotions, such as sadness (left) or anger (right) .
Users will be comforted and guided to positive sentiments.
<b>Figure 8:</b> Examples of conversations by users interacting with PEGS.
Users can chat with multimodal content (text and stickers) and will receive multimodal empathetic responses.
Left: a conversation characterized by positive emotion (happiness).
Right: a conversation characterized by negative emotion (sadness).
</font>
</p>
</figcaption>
@@ -345,10 +329,9 @@ <h2 class="title is-2">Results</h2>
<h2 class="title">BibTeX</h2>
<pre><code>
@article{zhang2024stickerconv,
title={StickerConv: Generating Multimodal Empathetic Responses from Scratch},
title={STICKERCONV: Generating Multimodal Empathetic Responses from Scratch},
author={Zhang, Yiqun and Kong, Fanheng and Wang, Peidong and Sun, Shuang and Wang, Lingshuai and Feng, Shi and Wang, Daling and Zhang, Yifei and Song, Kaisong},
journal={arXiv preprint arXiv:2402.01679},
url={https://arxiv.org/abs/2402.01679},
year={2024}
}
</code></pre>
Binary file modified static/images/PEGS.png
Binary file modified static/images/StickerConv_example.png
Binary file added static/images/conversation.png
Binary file removed static/images/results_case_negative.png
Binary file not shown.
Binary file removed static/images/results_case_positive.png
Binary file not shown.