Calculate semantic similarity between two conversations — semantic_similarity

This function calculates the semantic similarity between two conversations using one of three approaches: TF-IDF, Word2Vec, or GloVe embeddings.

Usage

semantic_similarity(
  conversation1,
  conversation2,
  method = "tfidf",
  model_path = NULL,
  dim = 100,
  window = 5,
  iter = 5
)

Arguments

conversation1: A character string containing the first conversation
conversation2: A character string containing the second conversation
method: A character string specifying the method to use: "tfidf" (default), "word2vec", or "glove"
model_path: A character string specifying the path to a pre-trained GloVe file (required for the "glove" method; default: NULL)
dim: An integer specifying the dimensionality of the Word2Vec embeddings (default: 100)
window: An integer specifying the context window size for Word2Vec (default: 5)
iter: An integer specifying the number of training iterations for Word2Vec (default: 5)

Value

A numeric value representing the semantic similarity (between 0 and 1)

Examples

conv1 <- "The quick brown fox jumps over the lazy dog"
conv2 <- "A fast auburn canine leaps above an idle hound"
semantic_similarity(conv1, conv2, method = "tfidf")
#> Warning: The 'tfidf' method may not provide highly meaningful results for short conversations or those with little vocabulary overlap. Consider using 'word2vec' or 'glove' methods for more robust results.
#> [1] 0.5
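
The warning above is easier to interpret with a picture of how a TF-IDF comparison generally works: each text becomes a vector of weighted word counts, and the score is the cosine of the angle between the two vectors. The following is a minimal base R sketch of that general technique, not the package's implementation. The helper name `tfidf_cosine`, the tokenization rule, and the smoothed IDF formula are all assumptions made for illustration, so its values need not match those returned by `semantic_similarity()`.

```r
# Hypothetical helper (not part of the package): cosine similarity between
# TF-IDF vectors of two strings, using base R only.
tfidf_cosine <- function(text1, text2) {
  # Lowercase, turn every run of non-letters into a space, split on spaces
  tokenize <- function(x) {
    tokens <- strsplit(gsub("[^a-z]+", " ", tolower(x)), " ", fixed = TRUE)[[1]]
    tokens[nzchar(tokens)]
  }
  docs <- list(tokenize(text1), tokenize(text2))
  vocab <- sort(unique(unlist(docs)))
  if (length(vocab) == 0) return(0)

  # Term frequency: one column per text, one row per vocabulary word
  tf <- vapply(
    docs,
    function(d) as.numeric(table(factor(d, levels = vocab))),
    numeric(length(vocab))
  )
  tf <- matrix(tf, nrow = length(vocab))  # keep matrix shape for a 1-word vocabulary

  # Smoothed inverse document frequency (an assumed variant; others exist)
  doc_freq <- rowSums(tf > 0)
  idf <- log((1 + ncol(tf)) / (1 + doc_freq)) + 1

  weighted <- tf * idf  # idf is recycled down each column, so rows line up
  a <- weighted[, 1]
  b <- weighted[, 2]
  denom <- sqrt(sum(a^2)) * sqrt(sum(b^2))
  if (denom == 0) return(0)
  sum(a * b) / denom
}

# Texts that share some words receive a score strictly between 0 and 1
tfidf_cosine("the cat sat on the mat", "the cat lay on the rug")
```

Because word counts are never negative, this score always falls between 0 and 1. In this sketch, two texts with no words in common score exactly 0 even when they are paraphrases of each other, which is the limitation the warning describes and the reason embedding-based methods are suggested for short texts.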