Myklob edited this page Mar 19, 2023 · 3 revisions

Initially determined by semantic similarity metrics and machine learning algorithms, this score is also subjected to our pro/con argument evaluation process. Specific to equivalency, users can submit and vote on reasons why one belief is similar or superior to another. We track the performance of these arguments and their related up/down votes (or other measures of user approval) to increase confidence in the equivalency score, much as stock prices are tracked over time.

The Equivalency score is a metric used on a debate platform to determine the degree of similarity or overlap between two arguments. It is initially generated by combining semantic similarity metrics and machine learning algorithms, which compare the language and structure of the arguments. This initial score is called the Computer-generated Equivalency Score (CES).

Users can also submit and vote on pro/con reasons that argue two statements are essentially saying the same thing, which can adjust the Equivalency score over time. This user-generated score is called the User-generated Equivalency Score (UES).

The final Equivalency score is calculated by combining the CES and the UES, with the weighting of each determined by the performance of the Validity Comparison Argument (VCA). The VCA is an argument that evaluates whether the CES or the UES is better at identifying equivalency between two arguments.

The VCA is essentially an argument about the relative merits of the CES and the UES. It evaluates the performance of each score and determines the multiplier for the CES. The multiplier for the CES is the percentage of agreement within the VCA that the CES is more reliable and accurate than the UES.

In mathematical terms, the Equivalency score (ES) between two arguments A and B can be expressed as follows:

ES(A,B) = w_ces * CES(A,B) + w_ues * UES(A,B)

where w_ces and w_ues are the multipliers for the CES and UES, respectively, and are determined by the VCA. The CES is calculated using semantic similarity metrics and machine learning algorithms, while the UES is derived from user-generated pro/con reasons.
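As a concrete sketch of the weighting scheme above: if the CES multiplier is the VCA agreement percentage (per the paragraph above), a natural reading is that the UES multiplier is the remainder. The function name and that assumption are illustrative, not part of the platform's specification.

```python
def equivalency_score(ces, ues, vca_agreement):
    """
    Combine the Computer-generated Equivalency Score (CES) and the
    User-generated Equivalency Score (UES) into the final Equivalency
    score (ES).

    vca_agreement is the fraction (0.0-1.0) of the Validity Comparison
    Argument (VCA) vote holding that the CES is more reliable than the
    UES; it serves as w_ces, and the remainder serves as w_ues (an
    assumption: the page does not state w_ues explicitly).
    """
    w_ces = vca_agreement
    w_ues = 1.0 - vca_agreement
    return w_ces * ces + w_ues * ues
```

For example, with CES = 0.8, UES = 0.6, and a 50/50 VCA split, the final score is 0.7.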

Semantic Similarity Metric (SSM)

We'll use the spacy library to generate document vectors and calculate the cosine similarity between them.

import spacy

# Load the English model with word vectors
# (install it first with: python -m spacy download en_core_web_md)
nlp = spacy.load("en_core_web_md")

def ssm_score(statement1, statement2):
    """
    Calculates the semantic similarity score between two statements
    as the cosine similarity of their spacy document vectors.
    """
    doc1 = nlp(statement1)
    doc2 = nlp(statement2)

    return doc1.similarity(doc2)
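Under the hood, `doc.similarity` returns the cosine of the angle between the two document vectors. A minimal sketch of that calculation on plain Python lists, with no spacy dependency, might look like this:

```python
import math

def cosine_similarity(vec1, vec2):
    """
    Cosine similarity between two equal-length vectors: the dot
    product divided by the product of the vector magnitudes.
    Returns 0.0 if either vector has zero magnitude.
    """
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)
```

Identical vectors score 1.0, orthogonal vectors score 0.0, so the SSM (and hence the CES) naturally falls in a bounded range suitable for the weighted combination above.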