Add Ubiquitous language (#53)

kampersanda · Oct 26, 2024 · bdc7f03
1 parent 87741e3
commit bdc7f03

Showing 2 changed files with 23 additions and 7 deletions.
diff --git a/elinor-cli/README.md b/elinor-cli/README.md
@@ -17,30 +17,37 @@ Simply use cargo to install from crates.io.
 cargo install elinor-cli
 ```
 
+## Ubiquitous language
+
+Elinor uses the following terms for convenience:
+
+- *True relevance score* means the relevance judgment provided by human assessors.
+- *Predicted relevance score* means the similarity score predicted by the system.
+
 ## elinor-evaluate
 
 elinor-evaluate evaluates the ranking metrics of the system.
 
 ### Input format
 
-elinor-evaluate requires two JSONL files: the true and predicted relevance scores.
+elinor-evaluate requires two JSONL files of true and predicted relevance scores.
 Each line in the JSONL file should be a JSON object with the following fields:
 
 - `query_id`: The ID of the query.
 - `doc_id`: The ID of the document.
 - `score`: The relevance score of the query-document pair.
-  - If it is true, the score should be a non-negative integer (e.g., 0, 1, 2).
-  - If it is predicted, the score can be a float (e.g., 0.1, 0.5, 1.0).
+  - If it is a true one, the score should be a non-negative integer (e.g., 0, 1, 2).
+  - If it is a predicted one, the score can be a float (e.g., 0.1, 0.5, 1.0).
 
-An example of the true JSONL file is:
+An example of the JSONL file for the true relevance scores is:
 
 ```jsonl
 {"query_id":"q_1","doc_id":"d_1","score":2}
 {"query_id":"q_1","doc_id":"d_7","score":0}
 {"query_id":"q_2","doc_id":"d_3","score":2}
 ```
 
-An example of the predicted JSONL file is:
+An example of the JSONL file for the predicted relevance scores is:
 
 ```jsonl
 {"query_id":"q_1","doc_id":"d_1","score":0.65}
@@ -53,6 +60,8 @@ The specifications are:
 - There is no need to sort the lines in the JSONL files.
 - The query-document pairs should be unique in each file.
 - The query IDs in the true and predicted files should be the same.
+- In binary metrics (e.g., Precision, Recall, F1),
+  true relevance scores more than 0 are considered relevant.
 
 Sample JSONL files are available in the [`test-data/sample`](../test-data/sample/) directory.
 

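The rule added to the README above (true relevance scores above 0 count as relevant in binary metrics) can be illustrated with a minimal sketch. The helper names `is_relevant` and `precision` are hypothetical and are not part of elinor's API. Treating documents without a judgment as non-relevant is also an assumption made only for this example.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not elinor's API): for binary metrics, a true
/// relevance score is considered relevant when it is greater than 0.
fn is_relevant(true_score: u32) -> bool {
    true_score > 0
}

/// Precision for a single query: the fraction of retrieved documents whose
/// true relevance score is greater than 0.
/// Assumption: documents missing from `trues` are treated as non-relevant.
fn precision(trues: &HashMap<&str, u32>, retrieved: &[&str]) -> f64 {
    if retrieved.is_empty() {
        return 0.0;
    }
    let hits = retrieved
        .iter()
        .filter(|doc_id| trues.get(*doc_id).copied().map_or(false, is_relevant))
        .count();
    hits as f64 / retrieved.len() as f64
}
```

With true scores `d_1 = 2` and `d_7 = 0`, retrieving both documents gives one relevant hit out of two, so `precision` returns one half. A graded score of 1 or 2 is binarized the same way, which is why graded judgments can be fed to Precision, Recall, and F1 without preprocessing.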
diff --git a/src/lib.rs b/src/lib.rs
@@ -16,10 +16,17 @@
 //!     Not only p-values but also other important statistics, such as effect sizes and confidence intervals, are provided for thorough reporting.
 //!     See the [`statistical_tests`] module for more details.
 //!
+//! # Ubiquitous language
+//!
+//! Elinor uses the following terms for convenience:
+//!
+//! * *True relevance score* means the relevance judgment provided by human assessors.
+//! * *Predicted relevance score* means the similarity score predicted by the system.
+//!
 //! # Basic usage in evaluating several metrics
 //!
-//! You first need to prepare true and predicted relevance scores through
-//! [`TrueRelStore`] and [`PredRelStore`], respectively.
+//! You first need to prepare true and predicted relevance scores for evaluation.
+//! These scores are stored in instances of [`TrueRelStore`] and [`PredRelStore`], respectively.
 //! You can build these instances using [`TrueRelStoreBuilder`] and [`PredRelStoreBuilder`].
 //!
 //! Then, you can evaluate the predicted relevance scores using the [`evaluate`] function and