Extract n-grams from text — extract_ngrams • LBDiscover

Extracts n-grams (contiguous sequences of n words) from a character vector of texts and returns them with their frequencies.

Usage

extract_ngrams(text, n = 1, min_freq = 2)

Arguments

text: Character vector of texts to process
n: Integer specifying the n-gram size (1 for unigrams, 2 for bigrams, etc.). Defaults to 1.
min_freq: Minimum frequency an n-gram must have to be included in the result. Defaults to 2.

Value

A data frame containing n-grams and their frequencies

Examples

if (FALSE) { # \dontrun{
# `abstracts` is assumed to be a character vector of texts defined elsewhere
bigrams <- extract_ngrams(abstracts, n = 2)
} # }
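
To make the roles of `n` and `min_freq` concrete, here is a minimal base R sketch of one way n-gram counting can work. It is not LBDiscover's implementation: the helper name `sketch_ngrams`, the tokenization rule (lowercasing and splitting on non-alphanumeric characters), the output column names `ngram` and `frequency`, and the choice to treat `min_freq` as an inclusive threshold are all assumptions made for illustration.

```r
# Hypothetical helper for illustration only; not part of LBDiscover.
sketch_ngrams <- function(text, n = 1, min_freq = 2) {
  # Assumed tokenization: lowercase, then split on runs of non-alphanumerics
  tokens_per_text <- strsplit(tolower(text), "[^[:alnum:]]+")

  # Build n-grams within each text, so no n-gram spans two texts
  grams <- unlist(lapply(tokens_per_text, function(tokens) {
    tokens <- tokens[nzchar(tokens)]  # drop empty strings left by the split
    if (length(tokens) < n) return(character(0))
    starts <- seq_len(length(tokens) - n + 1)
    vapply(
      starts,
      function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
      character(1)
    )
  }))

  if (length(grams) == 0) {
    return(data.frame(ngram = character(0), frequency = integer(0),
                      stringsAsFactors = FALSE))
  }

  # Count occurrences and keep n-grams seen at least min_freq times (assumed inclusive)
  counts <- table(grams)
  ngram <- names(counts)
  frequency <- as.integer(counts)
  keep <- frequency >= min_freq
  ngram <- ngram[keep]
  frequency <- frequency[keep]

  # Most frequent first; ties broken alphabetically
  ord <- order(-frequency, ngram)
  data.frame(ngram = ngram[ord], frequency = frequency[ord],
             stringsAsFactors = FALSE)
}

texts <- c("gene expression in cancer", "gene expression in mice")
result <- sketch_ngrams(texts, n = 2, min_freq = 2)
print(result)
```

In this sketch, only "gene expression" and "expression in" occur in both texts, so they are the only bigrams that survive `min_freq = 2`. The actual `extract_ngrams()` may tokenize, filter, or name its columns differently.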