Calculates a balanced effectiveness score for individual search terms using the harmonic mean of precision and coverage. This provides a single metric to evaluate how well each term performs in retrieving relevant articles.
Details
The Term Effectiveness Score (TES) is calculated as: $$TES = 2 \times \frac{precision \times coverage}{precision + coverage}$$
Where:
Precision: Proportion of retrieved articles that are relevant
Coverage: Proportion of term-specific relevant articles that were retrieved
This differs from the traditional F1 score in that it uses coverage (term-specific relevance) rather than recall (overall strategy relevance).
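As a minimal sketch (assuming precision and coverage are already computed as proportions, and not part of the package API), the TES formula above is a one-line harmonic mean:

```r
# Harmonic mean of precision and coverage -- the TES formula above.
# Illustrative helper only; the package computes this via calc_tes().
tes <- function(precision, coverage) {
  if (precision + coverage == 0) return(0)  # avoid 0/0 when both are zero
  2 * precision * coverage / (precision + coverage)
}

tes(3 / 13, 3 / 5)  # precision/coverage of "clinical" in the example: 1/3
```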
Key Differences from F1 Score:
F1 Score: harmonic mean of precision and recall (strategy-level performance)
TES: harmonic mean of precision and coverage (term-level performance)
Recall: Relevant articles found / All relevant articles
Coverage: Relevant articles found / Term-specific relevant articles
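The distinction can be illustrated with hypothetical article IDs (not drawn from the package): the same set of hits divides by a different denominator for recall than for coverage.

```r
# Hypothetical ID sets illustrating recall vs. coverage for a single term.
all_relevant  <- paste0("art", c(1, 3, 5, 7, 9))  # gold standard for the whole strategy
term_relevant <- paste0("art", c(1, 3, 5))        # relevant articles containing this term
retrieved     <- paste0("art", c(1, 3, 5, 11))    # articles this term retrieved

hits <- intersect(retrieved, all_relevant)        # relevant articles the term found

recall   <- length(hits) / length(all_relevant)   # 3/5 = 0.6 (strategy-level)
coverage <- length(hits) / length(term_relevant)  # 3/3 = 1.0 (term-level)
```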
See also
term_effectiveness for calculating term precision and coverage
calc_precision_recall for strategy-level F1 scores
Examples
# Create sample term analysis
terms <- c("diabetes", "treatment", "clinical")
search_results <- data.frame(
  id = paste0("art", 1:20),
  title = paste("Study on", sample(terms, 20, replace = TRUE)),
  abstract = paste("Research about", sample(terms, 20, replace = TRUE))
)
# Note: sample() is random, so exact counts will vary without set.seed()
gold_standard <- paste0("art", c(1, 3, 5, 7, 9))
# Analyze term effectiveness
term_analysis <- term_effectiveness(terms, search_results, gold_standard)
# Calculate effectiveness scores
term_scores <- calc_tes(term_analysis)
print(term_scores[order(term_scores$tes, decreasing = TRUE), ])
#> Term Effectiveness Analysis
#> ==========================
#> Search Results: 20 articles
#> Gold Standard: 5 relevant articles
#> Fields Analyzed: title, abstract
#>
#> term articles_with_term relevant_with_term precision coverage tes
#> clinical 13 3 0.231 0.600 0.3333333
#> diabetes 9 2 0.222 0.400 0.2857143
#> treatment 10 2 0.200 0.400 0.2666667