Preprocesses article text for further analysis: optionally removes stopwords, optionally stems words, and keeps only words within a given length range.
Usage
preprocess_text(
  text_data,
  text_column = "abstract",
  remove_stopwords = TRUE,
  custom_stopwords = NULL,
  stem_words = FALSE,
  min_word_length = 3,
  max_word_length = 50
)
Arguments
- text_data
A data frame containing article text data (title, abstract, etc.).
- text_column
Name of the column in text_data containing the text to process. Defaults to "abstract".
- remove_stopwords
Logical. If TRUE (the default), removes stopwords.
- custom_stopwords
Optional character vector of additional stopwords to remove. Defaults to NULL (none).
- stem_words
Logical. If TRUE, applies stemming to words. Defaults to FALSE.
- min_word_length
Minimum word length to keep. Defaults to 3.
- max_word_length
Maximum word length to keep. Defaults to 50.
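The following is a minimal base R sketch of how the arguments above might interact. It is not the package's implementation. The function name sketch_preprocess, the tiny stopword list, the lowercase-and-split tokenization, the inclusive length bounds, and the list-of-character-vectors return shape are all assumptions made for illustration. Stemming (stem_words) is left out because base R has no stemmer.

```r
# Hypothetical stand-in for illustration only; NOT the package's code.
sketch_preprocess <- function(text_data,
                              text_column = "abstract",
                              remove_stopwords = TRUE,
                              custom_stopwords = NULL,
                              min_word_length = 3,
                              max_word_length = 50) {
  # Assumed, deliberately tiny stopword list; a real one would be far longer.
  stopword_list <- c("the", "and", "for", "with", "are", "was", "this", "that")
  if (!is.null(custom_stopwords)) {
    stopword_list <- c(stopword_list, tolower(custom_stopwords))
  }

  lapply(text_data[[text_column]], function(txt) {
    # Lowercase, then split on any run of non-letter characters.
    words <- strsplit(tolower(txt), "[^a-z]+")[[1]]
    # Keep words whose length is within the (assumed inclusive) bounds.
    n <- nchar(words)
    words <- words[n >= min_word_length & n <= max_word_length]
    # Drop built-in and custom stopwords when requested.
    if (remove_stopwords) {
      words <- words[!words %in% stopword_list]
    }
    words
  })
}

articles <- data.frame(
  abstract = c("The migraine study was randomized.",
               "Magnesium is used for the treatment of migraine."),
  stringsAsFactors = FALSE
)

tokens <- sketch_preprocess(articles, custom_stopwords = "study")
print(tokens)
```

In this sketch, "is" and "of" are dropped by the length filter (fewer than 3 characters), while "the", "was", "for", and the custom stopword "study" are dropped by the stopword filter.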