Analyzes the effectiveness of individual search terms by counting, for each term, the articles that contain it and, when a gold standard is supplied, calculating its precision and coverage. This shows which terms are most effective at retrieving relevant articles.
Usage
term_effectiveness(
  terms,
  search_results,
  gold_standard = NULL,
  text_fields = c("title", "abstract")
)
Details
For each term, this function calculates:
Number of articles containing the term
Number of relevant articles containing the term (if gold_standard provided)
Precision (proportion of articles containing the term that are relevant)
Coverage (proportion of relevant articles that contain the term)
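The four quantities above can be sketched in base R for a single term. This is only an illustration, not the package's implementation: the helper name term_metrics_sketch is hypothetical, and the case-insensitive fixed-string matching, the NA precision for a term that matches nothing, and the use of length(gold_standard) as the coverage denominator are all assumptions.

```r
# Hypothetical helper (not part of the package): metrics for one term.
# Assumes case-insensitive, fixed-string matching over the text fields.
term_metrics_sketch <- function(term, search_results, gold_standard = NULL,
                                text_fields = c("title", "abstract")) {
  # Combine the selected text fields into one lower-case string per article
  fields <- unname(as.list(search_results[text_fields]))
  text <- tolower(do.call(paste, c(fields, sep = " ")))

  # TRUE for every article whose combined text contains the term
  has_term <- grepl(tolower(term), text, fixed = TRUE)
  n_with_term <- sum(has_term)

  out <- data.frame(term = term, articles_with_term = n_with_term)

  # Relevance-based metrics are only defined when a gold standard is given
  if (!is.null(gold_standard)) {
    is_relevant <- search_results$id %in% gold_standard
    n_relevant <- sum(has_term & is_relevant)
    out$relevant_with_term <- n_relevant
    # Assumption: precision is NA when no article contains the term
    out$precision <- if (n_with_term > 0) n_relevant / n_with_term else NA_real_
    # Assumption: coverage is measured against the full gold standard
    out$coverage <- n_relevant / length(gold_standard)
  }
  out
}

# Small illustrative data set (made up for this sketch)
articles <- data.frame(
  id = c("a1", "a2", "a3"),
  title = c("Diabetes treatment", "Clinical trial", "Insulin therapy"),
  abstract = c("Diabetes care.", "New treatments.", "Therapy for diabetes."),
  stringsAsFactors = FALSE
)
result <- term_metrics_sketch("diabetes", articles,
                              gold_standard = c("a1", "a2"))
print(result)
```

Applying such a helper to each element of terms and row-binding the results would give a table shaped like the one printed in the Examples section, though the actual function may match terms differently.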
Examples
# Create sample data
search_results <- data.frame(
  id = paste0("art", 1:10),
  title = c("Diabetes treatment", "Clinical trial", "Diabetes study",
            "Treatment options", "New therapy", "Glucose control",
            "Insulin therapy", "Management of diabetes", "Clinical study",
            "Therapy comparison"),
  abstract = c("This study examines diabetes treatments.",
               "A clinical trial on new treatments.",
               "Diabetes research findings.",
               "Comparison of treatment options.",
               "Novel therapy approach.",
               "Methods to control glucose levels.",
               "Insulin therapy effectiveness.",
               "Managing diabetes effectively.",
               "Clinical research protocols.",
               "Comparing therapy approaches.")
)
# Define search terms
terms <- c("diabetes", "treatment", "clinical", "therapy")
# Define gold standard (relevant articles)
gold_standard <- c("art1", "art3", "art7", "art8")
# Analyze term effectiveness
term_metrics <- term_effectiveness(terms, search_results, gold_standard)
print(term_metrics)
#> Term Effectiveness Analysis
#> ==========================
#> Search Results: 10 articles
#> Gold Standard: 4 relevant articles
#> Fields Analyzed: title, abstract
#>
#> term       articles_with_term  relevant_with_term  precision  coverage
#> diabetes                    3                   3      1.000     0.750
#> treatment                   3                   1      0.333     0.250
#> clinical                    2                   0      0.000     0.000
#> therapy                     3                   1      0.333     0.250