
Load biomedical dictionaries with improved error handling
Source: R/text_preprocessing.R
load_dictionary.Rd
Loads predefined biomedical dictionaries from a local source, or fetches terms from MeSH or UMLS.
Usage
load_dictionary(
  dictionary_type = NULL,
  custom_path = NULL,
  source = c("local", "mesh", "umls"),
  api_key = NULL,
  n_terms = 200,
  mesh_query = NULL,
  semantic_type_filter = NULL,
  sanitize = TRUE,
  extended_mesh = FALSE,
  mesh_queries = NULL
)
Arguments
- dictionary_type
Type of dictionary to load. With source = "local", must be one of "disease", "drug", or "gene". With source = "mesh" or "umls", additional semantic categories are accepted.
- custom_path
Optional path to a custom dictionary file.
- source
The source to fetch terms from: "local", "mesh", or "umls".
- api_key
UMLS API key for authentication (required if source = "umls").
- n_terms
Number of terms to fetch. Defaults to 200.
- mesh_query
Additional query to filter MeSH terms (only if source = "mesh").
- semantic_type_filter
Filter by semantic type (used mainly with UMLS).
- sanitize
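Logical. If TRUE (the default), sanitizes the dictionary terms. In the example output below, this step removes terms that do not match their claimed entity type.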
- extended_mesh
Logical. If TRUE and source = "mesh", also uses a PubMed search to find additional terms. Defaults to FALSE.
- mesh_queries
Named list of MeSH queries for different categories (only if extended_mesh = TRUE).
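The argument rules above can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual implementation: check_dictionary_args is a hypothetical helper, and the n_terms check is an assumption that the documentation does not state. It shows how a default of c("local", "mesh", "umls") resolves to "local" through match.arg(), and how the api_key requirement for source = "umls" might be enforced.

```r
# Hypothetical helper, not part of the documented package: it mirrors the
# argument rules described above using base R only.
check_dictionary_args <- function(source = c("local", "mesh", "umls"),
                                  api_key = NULL,
                                  n_terms = 200) {
  # match.arg() returns the first choice ("local") when the default
  # vector is left untouched, and rejects values outside the choices.
  source <- match.arg(source)

  # The documentation states that api_key is required if source = "umls".
  if (source == "umls" && is.null(api_key)) {
    stop("api_key is required when source = \"umls\"")
  }

  # Assumption (not stated in the documentation): n_terms is a single
  # positive number.
  if (!is.numeric(n_terms) || length(n_terms) != 1 || n_terms < 1) {
    stop("n_terms must be a single positive number")
  }

  list(source = source, api_key = api_key, n_terms = n_terms)
}

# Default call: source resolves to the first choice.
args_ok <- check_dictionary_args()
args_ok$source

# UMLS without a key: capture the error message instead of stopping.
umls_error <- tryCatch(
  check_dictionary_args(source = "umls"),
  error = function(e) conditionMessage(e)
)
umls_error
```

Checking arguments before any network request means a missing UMLS key is reported immediately rather than as a failed API call.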
Examples
# Load a disease dictionary from local source
disease_dict <- load_dictionary("disease", source = "local")
#> Package not installed or dictionary not found. Using example dictionaries.
#> Creating dummy dictionary for disease
#> Sanitizing dictionary with 8 terms...
#> Removed 6 terms that did not match their claimed entity types
#> Sanitization complete. 2 terms remaining (25% of original)
head(disease_dict)
#>       term        id    type source
#> 1 migraine DISEASE_1 disease  dummy
#> 6   cancer DISEASE_6 disease  dummy
# Build a custom dictionary as a data frame
custom_dict <- data.frame(
  term = c("migraine", "headache", "photophobia"),
  type = c("disease", "symptom", "symptom"),
  id = c("D001", "D002", "D003"),
  source = rep("custom", 3)
)
print(custom_dict)
#>          term    type   id source
#> 1    migraine disease D001 custom
#> 2    headache symptom D002 custom
#> 3 photophobia symptom D003 custom
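The custom data frame above has the same four columns as the dictionary returned by load_dictionary(), but in a different order (term, type, id, source rather than term, id, type, source). A minimal sketch of checking and aligning the columns follows. It assumes that matching the returned layout is useful when combining dictionaries; align_dictionary is a hypothetical helper, not a function from the package, and the file format expected by custom_path is not stated here, so no file is written.

```r
# Column order taken from the head(disease_dict) output shown above.
required_cols <- c("term", "id", "type", "source")

# Hypothetical helper: verifies that the required columns exist and
# returns them in the order used by the loaded dictionary.
align_dictionary <- function(dict, cols = required_cols) {
  missing_cols <- setdiff(cols, names(dict))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  dict[, cols, drop = FALSE]
}

custom_dict <- data.frame(
  term = c("migraine", "headache", "photophobia"),
  type = c("disease", "symptom", "symptom"),
  id = c("D001", "D002", "D003"),
  source = rep("custom", 3)
)

aligned <- align_dictionary(custom_dict)
names(aligned)
```

With a shared column order, rbind() can stack the custom terms with a loaded dictionary without relying on name matching.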