Calculate Precision and Recall Metrics

Usage

calc_precision_recall(retrieved, relevant, total_relevant = NULL)

Arguments

retrieved

Vector of retrieved article IDs

relevant

Vector of relevant article IDs (gold standard)

total_relevant

Total number of relevant articles in the corpus (optional; defaults to NULL)

Details

Calculates standard information retrieval metrics:

  • Precision: TP/(TP+FP) - proportion of retrieved articles that are relevant

  • Recall: TP/(TP+FN) - proportion of relevant articles that were retrieved

  • F1 Score: 2 × Precision × Recall/(Precision + Recall) - harmonic mean of precision and recall

  • Number Needed to Read: 1/Precision - average number of retrieved articles that must be read to find one relevant article

where TP = True Positives, FP = False Positives, FN = False Negatives
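
The definitions above can be sketched in base R using set operations. This is only an illustration, not the package's implementation: the name sketch_precision_recall and the returned element names f1 and number_needed_to_read are assumptions, and the sketch ignores total_relevant because this page does not describe how that argument is used.

```r
# Hypothetical helper illustrating the formulas; NOT calc_precision_recall itself.
sketch_precision_recall <- function(retrieved, relevant) {
  # Treat both inputs as sets of IDs (assumption: duplicates are not meaningful).
  retrieved <- unique(retrieved)
  relevant <- unique(relevant)

  tp <- length(intersect(retrieved, relevant))  # retrieved and relevant
  fp <- length(setdiff(retrieved, relevant))    # retrieved but not relevant
  fn <- length(setdiff(relevant, retrieved))    # relevant but not retrieved

  # Return NA when a denominator would be zero.
  precision <- if (tp + fp > 0) tp / (tp + fp) else NA_real_
  recall <- if (tp + fn > 0) tp / (tp + fn) else NA_real_

  f1 <- if (!is.na(precision) && !is.na(recall) && precision + recall > 0) {
    2 * precision * recall / (precision + recall)
  } else {
    NA_real_
  }

  nnr <- if (!is.na(precision) && precision > 0) 1 / precision else NA_real_

  list(
    precision = precision,
    recall = recall,
    f1 = f1,
    number_needed_to_read = nnr
  )
}

m <- sketch_precision_recall(
  c("art1", "art2", "art3", "art4", "art5"),
  c("art1", "art3", "art6", "art7")
)
m$precision              # 0.4  (TP = 2, FP = 3)
m$recall                 # 0.5  (TP = 2, FN = 2)
m$number_needed_to_read  # 2.5
```

With five retrieved articles of which two are relevant, a reader would on average go through 2.5 retrieved articles per relevant one.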

References

Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.

Examples

retrieved_ids <- c("art1", "art2", "art3", "art4", "art5")
relevant_ids <- c("art1", "art3", "art6", "art7")
metrics <- calc_precision_recall(retrieved_ids, relevant_ids)
print(paste("Precision:", round(metrics$precision, 3)))
#> [1] "Precision: 0.4"
print(paste("Recall:", round(metrics$recall, 3)))
#> [1] "Recall: 0.5"