diff --git a/docs/docs/technology/assets/hello_world/exclamation.png b/docs/docs/technology/assets/hello_world/exclamation.png
new file mode 100644
index 00000000..1f2e5221
Binary files /dev/null and b/docs/docs/technology/assets/hello_world/exclamation.png differ
diff --git a/docs/docs/technology/assets/hello_world/period.png b/docs/docs/technology/assets/hello_world/period.png
new file mode 100644
index 00000000..147b4357
Binary files /dev/null and b/docs/docs/technology/assets/hello_world/period.png differ
diff --git a/docs/docs/technology/assets/hello_world/question_mark.png b/docs/docs/technology/assets/hello_world/question_mark.png
new file mode 100644
index 00000000..90f08208
Binary files /dev/null and b/docs/docs/technology/assets/hello_world/question_mark.png differ
diff --git a/docs/docs/technology/assets/zurich/Zurich.png b/docs/docs/technology/assets/zurich/Zurich.png
new file mode 100644
index 00000000..323f628c
Binary files /dev/null and b/docs/docs/technology/assets/zurich/Zurich.png differ
diff --git "a/docs/docs/technology/assets/zurich/Z\303\274erich.png" "b/docs/docs/technology/assets/zurich/Z\303\274erich.png"
new file mode 100644
index 00000000..9d8e388d
Binary files /dev/null and "b/docs/docs/technology/assets/zurich/Z\303\274erich.png" differ
diff --git "a/docs/docs/technology/assets/zurich/Z\303\274rich.png" "b/docs/docs/technology/assets/zurich/Z\303\274rich.png"
new file mode 100644
index 00000000..47094182
Binary files /dev/null and "b/docs/docs/technology/assets/zurich/Z\303\274rich.png" differ
diff --git a/docs/docs/technology/introduction.md b/docs/docs/technology/introduction.md
index 1d9707da..cd069e21 100644
--- a/docs/docs/technology/introduction.md
+++ b/docs/docs/technology/introduction.md
@@ -15,24 +15,24 @@ Each node represents a different module or function in the pipeline, with a link
```mermaid
flowchart TD
- A0[Spoken Language Audio] --> A1(Spoken Language Text)
- A1[Spoken Language Text] --> B[Language Identification]
- A1 --> C(Normalized Text)
- B --> C
- C & B --> Q(Sentence Splitter)
- Q & B --> D(SignWriting)
- C -.-> M(Glosses)
- M -.-> E
- D --> E(Pose Sequence)
- D -.-> I(Illustration)
- N --> H(3D Avatar)
- N --> G(Skeleton Viewer)
- N --> F(Human GAN)
- H & G & F --> J(Video)
- J --> K(Share Translation)
- D -.-> L(Description)
- O --> N(Fluent Pose Sequence)
- E --> O(Pose Appearance Transfer)
+ A0[Spoken Language Audio] --> A1(Spoken Language Text)
+ A1[Spoken Language Text] --> B[Language Identification]
+ A1 --> C(Normalized Text)
+ B --> C
+ C & B --> Q(Sentence Splitter)
+ Q & B --> D(SignWriting)
+ C -.-> M(Glosses)
+ M -.-> E
+ D --> E(Pose Sequence)
+ D -.-> I(Illustration)
+ N --> H(3D Avatar)
+ N --> G(Skeleton Viewer)
+ N --> F(Human GAN)
+ H & G & F --> J(Video)
+ J --> K(Share Translation)
+ D -.-> L(Description)
+ O --> N(Fluent Pose Sequence)
+ E --> O(Pose Appearance Transfer)
linkStyle default stroke:green;
linkStyle 3,5,7 stroke:lightgreen;
@@ -53,7 +53,7 @@ The dictionary-based translation approach aims to simplify the translation but s
```mermaid
flowchart LR
- a[Spoken Language Text] --> b[Glosses] --> c[Pose Sequence] --> d[Video]
+ a[Spoken Language Text] --> b[Glosses] --> c[Pose Sequence] --> d[Video]
```
![Visualization of one example through the dictionary-based translation pipeline](./assets//dictionary-pipeline.png)
@@ -80,7 +80,7 @@ The machine translation approach aims to achieve similar translation quality to
```mermaid
flowchart LR
- a[Spoken Language Text] --> b[SignWriting] --> c[Pose Sequence] --> d[Video]
+ a[Spoken Language Text] --> b[SignWriting] --> c[Pose Sequence] --> d[Video]
```
![Visualization of one example through the SignWriting-based translation pipeline](./assets/sign-tube-example.png)
@@ -97,9 +97,33 @@ flowchart LR
By combining a relatively small dataset of transcribed single signs (~100k) with a relatively small dataset of segmented continuous signs, and leveraging large video/text sign language datasets, we can automatically transcribe the latter. This process will generate large synthesized datasets for both **text-to-SignWriting** and **SignWriting-to-pose** conversions.
-#### **Potential Quality:**
+#### **Potential Quality**
+
+The system aims to accurately represent sign language grammar and structure, allowing for a good translation of both lexical and non-lexical signs, expressions, and classifiers.
+Potentially, the system can be as good as a deaf human translator, given quality data.
+
+#### **Motivating Examples**
+
+##### Robustness to minor inconsequential changes
+
+Here is an example where a minor, inconsequential, and possibly even **wrong** modification to the spoken language text yields the same correct translation in SignWriting (the sign for the city of Zurich), while the dictionary yields different results.
+
+| Text | Machine Translation | Dictionary Translation |
+| ------------------------------------------------------------ | ---------------------------------------------------------------------------------- | ----------------------------------------------- |
+| [Zürich](https://sign.mt/?spl=de&sil=sgg&text=Z%C3%BCrich) | ![SignWriting for Zurich in Swiss-German Sign Language](assets/zurich/Zürich.png) | The sign for Zurich (correct) |
+| [Zurich](https://sign.mt/?spl=de&sil=sgg&text=Zurich)        | ![SignWriting for Zurich in Swiss-German Sign Language](assets/zurich/Zurich.png)   | Spelling the city name without the umlaut (strange) |
+| [Züerich](https://sign.mt/?spl=de&sil=sgg&text=Z%C3%BCerich) | ![SignWriting for Zurich in Swiss-German Sign Language](assets/zurich/Züerich.png)  | Spelling the misspelled city name (strange)     |
+
+##### Adaptability to minor but important changes
+
+Here is an example where a minor but important modification to the spoken language text (an exclamation mark) yields different, correct translations in SignWriting (reflecting the emotion), while the dictionary yields the same one.
+When the punctuation changes to a question mark, the face correctly becomes questioning (even though the SignWriting is not perfect).
-The system aims to accurately represent sign language grammar and structure, allowing for a good translation of both lexical and non-lexical signs, expressions, and classifiers. Potentially, the system can be as good as a deaf human translator, given quality data.
+| Text | Machine Translation | Dictionary Translation |
+| --------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | ------------------------------------------------- |
+| [Hello world.](https://sign.mt/?spl=en&sil=ase&text=Hello%20world.) | ![SignWriting for "Hello World." in American Sign Language](assets/hello_world/period.png) | The sign for Hello followed by the sign for World |
+| [Hello world!](https://sign.mt/?spl=en&sil=ase&text=Hello%20world!) | ![SignWriting for "Hello World!" in American Sign Language](assets/hello_world/exclamation.png) | The sign for Hello followed by the sign for World |
+| [Hello world?](https://sign.mt/?spl=en&sil=ase&text=Hello%20world%3F) | ![SignWriting for "Hello World?" in American Sign Language](assets/hello_world/question_mark.png) | The sign for Hello followed by the sign for World |
## Signed to Spoken Language Translation
@@ -107,16 +131,16 @@ Following, is a flowchart of the current translation pipeline from signed to spo
```mermaid
flowchart TD
- A0[Upload Sign Language Video] --> A3[Video]
- A1[Camera Sign Language Video] --> A3
- A3 --> B(Pose Estimation)
- B --> C(Segmentation)
- C & B --> D(SignWriting Transcription)
- A2[Language Selector] --> E(Spoken Language Text)
- D --> E
- E --> F(Spoken Language Audio)
- E --> G(Share Translation)
- C -.-> H(Sign Image)
+ A0[Upload Sign Language Video] --> A3[Video]
+ A1[Camera Sign Language Video] --> A3
+ A3 --> B(Pose Estimation)
+ B --> C(Segmentation)
+ C & B --> D(SignWriting Transcription)
+ A2[Language Selector] --> E(Spoken Language Text)
+ D --> E
+ E --> F(Spoken Language Audio)
+ E --> G(Share Translation)
+ C -.-> H(Sign Image)
linkStyle 1,2 stroke:orange;