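Preprocesses article text for downstream analysis. Depending on the arguments, it removes stopwords, applies stemming, and drops words outside a given length range.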

Usage

preprocess_text(
  text_data,
  text_column = "abstract",
  remove_stopwords = TRUE,
  custom_stopwords = NULL,
  stem_words = FALSE,
  min_word_length = 3,
  max_word_length = 50
)

Arguments

text_data

A data frame containing article text data (title, abstract, etc.).

text_column

Name of the column in text_data containing the text to process. Defaults to "abstract".

remove_stopwords

Logical. If TRUE (the default), removes stopwords.

custom_stopwords

Character vector of additional stopwords to remove. Defaults to NULL (no additional stopwords).

stem_words

Logical. If TRUE, applies stemming to words. Defaults to FALSE.

min_word_length

Minimum word length to keep. Defaults to 3.

max_word_length

Maximum word length to keep. Defaults to 50.
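
This page does not show the function's internals. As a rough illustration of how the stopword and word-length filters might combine, here is a minimal base R sketch. The helper name `sketch_preprocess`, its tiny stopword list, the letters-only tokenizer, and the inclusive length bounds are all assumptions made for illustration; they are not the package's implementation, and stemming is left out.

```r
# Hypothetical helper (not part of the package): illustrates the kind of
# filtering the arguments above describe, using base R only.
sketch_preprocess <- function(text,
                              stopwords = c("the", "and", "for", "with", "are"),
                              min_word_length = 3,
                              max_word_length = 50) {
  # Lowercase, then split on any run of characters that are not a-z
  tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
  # strsplit can yield an empty first token; drop empty strings
  tokens <- tokens[nzchar(tokens)]
  # Drop stopwords (assumed list; a real list would be much longer)
  tokens <- tokens[!tokens %in% stopwords]
  # Keep words whose character count lies within the assumed inclusive bounds
  n <- nchar(tokens)
  tokens[n >= min_word_length & n <= max_word_length]
}

sketch_result <- sketch_preprocess(
  "The effects of migraine and magnesium are studied in 12 trials"
)
print(sketch_result)
```

In this sketch, short words such as "of" and "in" are removed by the length filter rather than by the stopword list, which shows why min_word_length and remove_stopwords can overlap in effect.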

Value

A data frame with processed text and extracted terms.

Examples

if (FALSE) { # \dontrun{
# article_data is assumed to be a data frame with an "abstract" column
processed_data <- preprocess_text(article_data, text_column = "abstract")
} # }