Update

NEU-DataMining · Feb 27, 2024 · 4747da4
1 parent 6245c3f
commit 4747da4
Showing 6 changed files with 43 additions and 60 deletions.
diff --git a/index.html b/index.html
@@ -24,7 +24,7 @@
   <meta name="viewport" content="width=device-width, initial-scale=1">
 
 
-  <title>StickerConv: Generating Multimodal Empathetic Responses from Scratch</title>
+  <title>STICKERCONV: Generating Multimodal Empathetic Responses from Scratch</title>
   <link rel="icon" type="image/x-icon" href="static/images/favicon.ico">
   <link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro"
   rel="stylesheet">
@@ -52,7 +52,7 @@
       <div class="container is-max-desktop">
         <div class="columns is-centered">
           <div class="column has-text-centered">
-            <h1 class="title is-1 publication-title">StickerConv: Generating Multimodal Empathetic Responses from Scratch</h1>
+            <h1 class="title is-1 publication-title">S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>: Generating Multimodal Empathetic Responses from Scratch</h1>
             <div class="is-size-5 publication-authors">
               <!-- Paper authors -->
               <span class="author-block">
@@ -135,16 +135,17 @@ <h1 class="title is-1 publication-title">StickerConv: Generating Multimodal Empa
         <h2 class="title is-3">Abstract</h2>
         <div class="content has-text-justified">
           <p>
-            Stickers, while widely recognized for enhancing empathetic communication in online interactions, remain underexplored
-            in current empathetic dialogue research. In this paper, we introduce the Agent for StickerConv (Agent4SC), which uses
-            collaborative agent interactions to realistically simulate human behavior with sticker usage, thereby enhancing multimodal
-            empathetic communication. Building on this foundation, we develop a multimodal empathetic dialogue dataset, StickerConv,
-            which includes 12.9K dialogue sessions, 5.8K unique stickers, and 2K diverse conversational scenarios, specifically
-            designs to augment the generation of empathetic responses in a multimodal context. To leverage the richness of this dataset,
-            we propose <b>PE</b>rceive and <b>G</b>enerate <b>S</b>tickers (PEGS), a multimodal empathetic response model, complemented by a
-            comprehensive set of empathy evaluation metrics based on LLM. Our experiments demonstrate PEGS's effectiveness in
-            generating contextually relevant and emotionally resonant multimodal empathetic responses, contributing to the advancement
-            of more nuanced and engaging empathetic dialogue systems.
+            Stickers, while widely recognized for enhancing empathetic communication in online interactions, remain underexplored in
+            current empathetic dialogue research, notably due to the challenge of a lack of comprehensive datasets. In this paper, we
+            introduce the Agent for S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span> (Agent4SC),
+            which uses collaborative agent interactions to realistically simulate human behavior with sticker usage, thereby enhancing
+            multimodal empathetic communication. Building on this foundation, we develop a multimodal empathetic dialogue dataset,
+            S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>, comprising 12.9K dialogue sessions,
+            5.8K unique stickers, and 2K diverse conversational scenarios. This dataset serves as a benchmark for multimodal empathetic
+            generation. To advance further, we propose <b>PE</b>rceive and <b>G</b>enerate <b>S</b>tickers (PEGS), a multimodal empathetic
+            response generation framework, complemented by a comprehensive set of empathy evaluation metrics based on LLM. Our experiments
+            demonstrate PEGS's effectiveness in generating contextually relevant and emotionally resonant multimodal empathetic
+            responses, contributing to the advancement of more nuanced and engaging empathetic dialogue systems.
         </div>
       </div>
     </div>
@@ -162,16 +163,20 @@ <h2 class="title is-2">Technical Description</h2>
     <!-- Agent -->
     <div class="columns is-centered">
       <div class="column is-full-width">
-        <h4 class="title is-3">• Agent for StickerConv (Agent4SC)</h4>
+        <h4 class="title is-3">• Agent for S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span> (Agent4SC)</h4>
 
         <div class="content has-text-justified">
           <img class="columns is-centered has-text-centered" src="./static/images/Agent4SC.png"
                alt="Teaser" width="95%"  style="margin:0 auto">
           <br>
           <figcaption>
-            <p style="text-align: center;">
+            <p style="text-align: justify;">
               <font color="061E61">
-                <b>Figure 1:</b> The Overview of Agent4SC.
+                <b>Figure 1:</b> The overview of Agent4SC. Memory and Plan modules enable the agent to mimic human observation and
+                thought, overcoming LLMs' inability to grasp nuanced emotions. The Action module supports generating insights with
+                human-like emotional reactions. The Profile module gives each agent distinct reflections and actions. Furthermore,
+                Agent4SC uses stickers as a Tool for more natural conversation, allowing the agent to choose stickers like humans.
+                These modules streamline observation, reflection, and action, while the Manager Agent maintains performance and quality.
               </font>
             </p>
           </figcaption>
@@ -182,16 +187,16 @@ <h4 class="title is-3">• Agent for StickerConv (Agent4SC)</h4>
     <!-- Dataset -->
     <div class="columns is-centered">
       <div class="column is-full-width">
-        <h2 class="title is-3">• StickerConv</h2>
+        <h2 class="title is-3">• S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span></h2>
 
         <div class="content has-text-justified">
           <img class="columns is-centered has-text-centered" src="./static/images/StickerConv_example.png"
                alt="Teaser" width="40%"  style="margin:0 auto">
           <br>
           <figcaption>
-            <p style="text-align: left;">
+            <p style="text-align: justify;">
               <font color="061E61">
-                <b>Figure 2:</b> An example of multimodal conversation in our StickerConv dataset.
+                <b>Figure 2:</b> An example of multimodal conversation in our S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span> dataset.
                 Both parties can utilize the stickers to express their emotions, which enhances interactivity and expression.
                 Assistant can empathize with the user according to the conversation (<span style="color: rgb(0, 153, 0);">green</span> text).
               </font>
@@ -203,9 +208,9 @@ <h2 class="title is-3">• StickerConv</h2>
                alt="Teaser" width="50%"  style="margin:0 auto">
           <br>
           <figcaption>
-            <p style="text-align: center;">
+            <p style="text-align: justify;">
               <font color="061E61">
-                <b>Figure 3:</b> The statistics of StickerConv.
+                <b>Figure 3:</b> The statistics of S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>.
               </font>
             </p>
           </figcaption>
@@ -215,7 +220,7 @@ <h2 class="title is-3">• StickerConv</h2>
                alt="Teaser" width="50%"  style="margin:0 auto">
           <br>
           <figcaption>
-            <p style="text-align: left;">
+            <p style="text-align: justify;">
               <font color="061E61">
                 <b>Figure 4:</b> The chart of emotional distribution in the choice of stickers between users and the system.
                 This chart revealed a striking trend: users have a significant preference for stickers that convey negative emotions,
@@ -234,9 +239,9 @@ <h2 class="title is-3">• StickerConv</h2>
                  alt="Teaser" width="90%"  style="margin:0 auto">
             <br>
             <figcaption>
-              <p style="text-align: left;">
+              <p style="text-align: justify;">
                 <font color="061E61">
-                  <b>Figure 5:</b> Emotion distribution of user profile in Agent for StickerConv.
+                  <b>Figure 5:</b> Emotion distribution of user profile in Agent for S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>.
                 </font>
               </p>
             </figcaption>
@@ -247,9 +252,9 @@ <h2 class="title is-3">• StickerConv</h2>
                  alt="Teaser" width="90%"  style="margin:0 auto">
             <br>
             <figcaption>
-              <p style="text-align: left;">
+              <p style="text-align: justify;">
                 <font color="061E61">
-                  <b>Figure 6:</b> The 200 most popular emotion-related words in StickerConv.
+                  <b>Figure 6:</b> The 200 most popular emotion-related words in S<span style="font-size: smaller;">TICKER</span>C<span style="font-size: smaller;">ONV</span>.
                 </font>
               </p>
             </figcaption>
@@ -269,14 +274,12 @@ <h2 class="title is-3">• PEGS</h2>
                alt="Teaser" width="80%"  style="margin:0 auto">
           <br>
           <figcaption>
-            <p style="text-align: left;">
+            <p style="text-align: justify;">
               <font color="061E61">
-                <b>Figure 8:</b> The overview architecture of PEGS.
-                The input images are jointly encoded by image encoder, Q-Former and a linear layer.
-                Vicuna is used as the language model.
-                To empower the language model to generate images,
-                the output of the LLM is first mapped into the input space of the image decoder through a feature mapper,
-                and then the frozen image decoder is employed to generate images.
+                <b>Figure 7:</b> The architecture of PEGS framework includes various routing options, distinguished by colored connecting
+                lines. Input stickers undergo joint encoding by an image encoder, Q-Former, and a linear layer, with Vicuna serving as the
+                language model. The output of the LLM activates two sets of tokens differently across model versions: one for image retrieval
+                and the other as a textual condition. Subsequently, the frozen image decoder generates images.
               </font>
             </p>
           </figcaption>
@@ -295,40 +298,21 @@ <h2 class="title is-3">• PEGS</h2>
       <h2 class="title is-2">Results</h2>
       <br>
     </div>
-    <!-- Examples (Positive Emotion) -->
+    <!-- Conversation -->
     <div class="columns is-centered">
       <div class="column is-full-width">
 
         <div class="content has-text-justified">
-          <img class="columns is-centered has-text-centered" src="static/images/results_case_positive.png"
+          <img class="columns is-centered has-text-centered" src="static/images/conversation.png"
                alt="Teaser" width="95%"  style="margin:0 auto">
           <br>
           <figcaption>
-            <p style="text-align: left;">
+            <p style="text-align: justify;">
               <font color="061E61">
-                <b>Figure 9:</b> Examples of conversations with positive emotions.
-                Users can chat with multimodal content (text and stickers) and will receive positive multimodal responses.
-              </font>
-            </p>
-          </figcaption>
-        </div>
-      </div>
-    </div>
-
-    <!-- Examples (Negative Emotion) -->
-    <div class="columns is-centered">
-      <div class="column is-full-width">
-
-        <div class="content has-text-justified">
-          <img class="columns is-centered has-text-centered" src="static/images/results_case_negative.png"
-               alt="Teaser" width="95%"  style="margin:0 auto">
-          <br>
-          <figcaption>
-            <p style="text-align: left;">
-              <font color="061E61">
-                <b>Figure 10:</b> Examples of conversations with negative emotions.
-                Our model can empathize with users who are suffering from negative emotions, such as sadness (left) or anger (right) .
-                Users will be comforted and guided to positive sentiments.
+                <b>Figure 8:</b> Examples of conversations by users interacting with PEGS.
+                Users can chat with multimodal content (text and stickers) and will receive multimodal empathetic responses.
+                Left: a conversation characterized by positive emotion (happiness).
+                Right: a conversation characterized by negative emotion (sadness).
               </font>
             </p>
           </figcaption>
@@ -345,10 +329,9 @@ <h2 class="title is-2">Results</h2>
     <h2 class="title">BibTeX</h2>
     <pre><code>
 @article{zhang2024stickerconv,
-  title={StickerConv: Generating Multimodal Empathetic Responses from Scratch},
+  title={STICKERCONV: Generating Multimodal Empathetic Responses from Scratch},
   author={Zhang, Yiqun and Kong, Fanheng and Wang, Peidong and Sun, Shuang and Wang, Lingshuai and Feng, Shi and Wang, Daling and Zhang, Yifei and Song, Kaisong},
   journal={arXiv preprint arXiv:2402.01679},
-  url={https://arxiv.org/abs/2402.01679},
   year={2024}
 }
     </code></pre>

diff --git a/static/images/PEGS.png b/static/images/PEGS.png
diff --git a/static/images/StickerConv_example.png b/static/images/StickerConv_example.png
diff --git a/static/images/conversation.png b/static/images/conversation.png
diff --git a/static/images/results_case_negative.png b/static/images/results_case_negative.png
diff --git a/static/images/results_case_positive.png b/static/images/results_case_positive.png