Calculate semantic similarity between two conversations
Source: R/main_functions.R
semantic_similarity.Rd
This function calculates the semantic similarity between two conversations using one of three approaches: TF-IDF, Word2Vec embeddings, or GloVe embeddings.
Usage
semantic_similarity(
  conversation1,
  conversation2,
  method = "tfidf",
  model_path = NULL,
  dim = 100,
  window = 5,
  iter = 5
)
Arguments
- conversation1
A character string representing the first conversation
- conversation2
A character string representing the second conversation
- method
A character string specifying the method to use: "tfidf" (default), "word2vec", or "glove"
- model_path
A character string specifying the path to a pre-trained GloVe file (required for the "glove" method)
- dim
An integer specifying the dimensionality for Word2Vec embeddings (default: 100)
- window
An integer specifying the window size for Word2Vec (default: 5)
- iter
An integer specifying the number of iterations for Word2Vec (default: 5)
Examples
conv1 <- "The quick brown fox jumps over the lazy dog"
conv2 <- "A fast auburn canine leaps above an idle hound"
semantic_similarity(conv1, conv2, method = "tfidf")
#> Warning: The 'tfidf' method may not provide highly meaningful results for short conversations or those with little vocabulary overlap. Consider using 'word2vec' or 'glove' methods for more robust results.
#> [1] 0.5
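
The warning in the output above is easier to interpret with a picture of what a TF-IDF comparison measures. The following is a minimal base-R sketch, not this package's implementation: `tokenize()` and `tfidf_cosine()` are hypothetical helpers, the smoothed IDF formula is an assumption chosen so that terms shared by both texts keep a nonzero weight, and the scores it returns need not match those of `semantic_similarity()`.

```r
# Hypothetical helper: split text into lowercase word tokens.
tokenize <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z0-9]+")[[1]]
  tokens[nzchar(tokens)]
}

# Hypothetical helper: cosine similarity of two texts under smoothed TF-IDF.
# Assumes both texts contain at least one token.
tfidf_cosine <- function(text1, text2) {
  docs <- list(tokenize(text1), tokenize(text2))
  vocab <- sort(unique(unlist(docs)))

  # Term-frequency matrix: one row per document, one column per term.
  tf <- do.call(rbind, lapply(docs, function(d) {
    as.numeric(table(factor(d, levels = vocab)))
  }))

  # Smoothed inverse document frequency (an assumption of this sketch).
  df <- colSums(tf > 0)
  idf <- log((1 + length(docs)) / (1 + df)) + 1

  # Weight each term column by its IDF, then compare the two rows.
  weighted <- sweep(tf, 2, idf, `*`)
  sum(weighted[1, ] * weighted[2, ]) /
    (sqrt(sum(weighted[1, ]^2)) * sqrt(sum(weighted[2, ]^2)))
}

conv1 <- "The quick brown fox jumps over the lazy dog"
conv2 <- "A fast auburn canine leaps above an idle hound"
tfidf_cosine(conv1, conv2)  # 0: the two sentences share no words
```

In this sketch the two example sentences score 0 even though they mean nearly the same thing, because a bag-of-words comparison only sees shared terms. Embedding-based methods such as "word2vec" or "glove" are designed to capture that kind of synonym-level similarity.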