
This function calculates a sequence of similarity scores, one for each pair of consecutive windows in a conversation.

Usage

calc_sim_seq(conversation, window_size, similarity_func)

Arguments

conversation

A data frame containing the conversation, with a column named 'processed_text'.

window_size

An integer specifying the size of each window.

similarity_func

A function that calculates similarity between two text strings.
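
As an illustration of what could be passed as similarity_func, here is a minimal sketch of a word-overlap (Jaccard) similarity. The name jaccard_words is hypothetical and not part of this package; any function that takes two text strings and returns a single number should fit the description above.

```r
# Hypothetical helper (not part of the package): Jaccard similarity
# computed over the unique lower-cased words of two text strings.
jaccard_words <- function(x, y) {
  words_x <- unique(strsplit(tolower(x), "\\s+")[[1]])
  words_y <- unique(strsplit(tolower(y), "\\s+")[[1]])
  # Drop empty tokens produced by leading whitespace or empty input.
  words_x <- words_x[nzchar(words_x)]
  words_y <- words_y[nzchar(words_y)]
  union_size <- length(union(words_x, words_y))
  # Two empty strings share no words; report 0 rather than dividing by zero.
  if (union_size == 0) return(0)
  length(intersect(words_x, words_y)) / union_size
}

# One shared word ("world") out of three distinct words.
jaccard_words("hello world", "world how")
```

A set-based measure like this gives graded values between 0 and 1, which tends to be more informative than an exact string comparison when windows overlap only partially.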

Value

A list containing two elements:

sequence

A numeric vector of similarity scores, one for each pair of consecutive windows.

average

The mean of the similarity scores in sequence.
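
To make the shape of the return value concrete, the following is a rough sketch of how such a windowed comparison might be written. It is not the package's implementation. It assumes that windows slide forward one row at a time and that the rows of each window are joined with spaces before being compared; the names sim_seq_sketch and exact_match are hypothetical.

```r
# Hypothetical sketch (not the package's code). Assumptions: windows advance
# one row at a time, and each window's rows are pasted together with spaces.
sim_seq_sketch <- function(conversation, window_size, similarity_func) {
  texts <- as.character(conversation$processed_text)
  n_pairs <- length(texts) - window_size
  # Too few rows to form two windows: return an empty sequence.
  if (n_pairs < 1) {
    return(list(sequence = numeric(0), average = NA_real_))
  }
  sequence <- vapply(seq_len(n_pairs), function(i) {
    first  <- paste(texts[i:(i + window_size - 1)], collapse = " ")
    second <- paste(texts[(i + 1):(i + window_size)], collapse = " ")
    as.numeric(similarity_func(first, second))
  }, numeric(1))
  list(sequence = sequence, average = mean(sequence))
}

conversation <- data.frame(processed_text = c("hello", "world", "how", "are", "you"))
# Toy similarity: 1 if the two window strings are identical, 0 otherwise.
exact_match <- function(x, y) as.numeric(identical(x, y))
result <- sim_seq_sketch(conversation, 2, exact_match)
names(result)            # "sequence" "average"
length(result$sequence)  # 3 window pairs for 5 rows and window_size = 2
```

Under these assumptions, a conversation with n rows yields n - window_size scores; a different windowing scheme (for example, non-overlapping windows) would produce a different count.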

Examples

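# A short conversation with one pre-processed utterance per row
conversation <- data.frame(processed_text = c("hello", "world", "how", "are", "you"))
# Toy similarity: for two single strings this is 1 if they are identical, 0 otherwise
result <- calc_sim_seq(conversation, 2, function(x, y) sum(x == y) / max(length(x), length(y)))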