PubMed Unified REtrieval for Multi-Output Exploration. An R package that provides a single interface for accessing a range of NLM/PubMed databases, including:
PubMed abstract records,
iCite bibliometric data,
PubTator3 named entity annotations, and
full-text entries from PubMed Central (PMC).
This unified interface simplifies the data retrieval process, allowing users to interact with multiple PubMed services/APIs/output formats through a single R function.
The package also includes MeSH thesaurus resources as simple data frames: Descriptor Terms, Descriptor Tree Structures, Supplementary Concept Terms, and Pharmacological Actions. It also includes descriptor-level word embeddings (Noh & Kavuluru 2021). These resources are made available via the mesh-resources library.
Get the released version from CRAN:
install.packages('puremoe')
Or the development version from GitHub with:
remotes::install_github("jaytimm/puremoe")
The package has two basic functions: search_pubmed and get_records. The former fetches PMIDs from the PubMed API based on a user search; the latter scrapes PMID records from a user-specified PubMed endpoint: pubmed_abstracts, pubmed_affiliations, pubtations, icites, or pmc_fulltext.
Search syntax is the same as that implemented in standard PubMed search.
pmids <- puremoe::search_pubmed('("political ideology"[TiAb])',
                                use_pub_years = FALSE)

# pmids <- puremoe::search_pubmed('immunity',
#                                 use_pub_years = TRUE,
#                                 start_year = 2022,
#                                 end_year = 2024)
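Because the query is an ordinary PubMed search string, it can also be assembled from R vectors before being passed to search_pubmed. The following is a minimal sketch; build_query is a hypothetical helper written for illustration and is not part of puremoe.

```r
# Illustrative helper (not part of puremoe): build a PubMed query string
# from a vector of terms, a field tag, and a boolean operator.
build_query <- function(terms, field = "TiAb", op = "OR") {
  # Quote each term and attach the PubMed field tag, e.g. "term"[TiAb]
  tagged <- sprintf('"%s"[%s]', terms, field)
  # Join the tagged terms with the operator and wrap them in parentheses
  paste0("(", paste(tagged, collapse = paste0(" ", op, " ")), ")")
}

query <- build_query(c("political ideology", "partisanship"))
cat(query, "\n")

# The string can then be used like the literal query above (needs network access):
# pmids <- puremoe::search_pubmed(query, use_pub_years = FALSE)
```

Keeping the query construction separate from the search call makes it easy to inspect the exact string before sending it to PubMed.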
pubmed <- pmids |>
  puremoe::get_records(endpoint = 'pubmed_abstracts',
                       cores = 3,
                       sleep = 1)

affiliations <- pmids |>
  puremoe::get_records(endpoint = 'pubmed_affiliations',
                       cores = 1,
                       sleep = 0.5)

icites <- pmids |>
  puremoe::get_records(endpoint = 'icites',
                       cores = 3,
                       sleep = 0.25)

pubtations <- pmids |>
  puremoe::get_records(endpoint = 'pubtations',
                       cores = 2)
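Since the calls above differ only in their endpoint and tuning arguments, they can be looped over. The sketch below is one possible way to do that; fetch_endpoints and stub_fetch are hypothetical names introduced here, and the sketch assumes the fetch function accepts a PMID vector plus the endpoint, cores, and sleep arguments used above. The stub lets the example run offline.

```r
# Hypothetical wrapper (not part of puremoe): call a fetch function once per
# endpoint and return a named list of results, one element per endpoint.
fetch_endpoints <- function(pmids, endpoints, fetch, cores = 1, sleep = 1) {
  out <- lapply(endpoints, function(ep) {
    tryCatch(
      fetch(pmids, endpoint = ep, cores = cores, sleep = sleep),
      error = function(e) {
        # Report the failing endpoint and carry on with the others
        warning("endpoint '", ep, "' failed: ", conditionMessage(e))
        NULL
      }
    )
  })
  stats::setNames(out, endpoints)
}

# Offline stand-in for the real fetch function, for demonstration only
stub_fetch <- function(pmids, endpoint, cores, sleep) {
  data.frame(pmid = pmids, endpoint = endpoint, stringsAsFactors = FALSE)
}

res <- fetch_endpoints(c("111", "222"),
                       c("pubmed_abstracts", "icites"),
                       stub_fetch)
names(res)

# With the package installed and network access, assuming the signature above:
# res <- fetch_endpoints(pmids, c("pubmed_abstracts", "icites"),
#                        puremoe::get_records)
```

A shared cores and sleep setting is a simplification; the individual calls above use different values per endpoint, which may matter for rate limits.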
When the endpoint is PMC, the get_records() function takes a vector of file paths (from the PMC Open Access list) instead of PMIDs.
pmclist <- puremoe::data_pmc_list(use_persistent_storage = TRUE)
pmc_pmids <- pmclist[PMID %in% pmids]

pmc_fulltext <- pmc_pmids$fpath[1:5] |>
  puremoe::get_records(endpoint = 'pmc_fulltext', cores = 1)
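The PMID-to-file-path lookup can be illustrated offline with a toy table. This is a sketch only: toy_pmclist is a made-up stand-in for the real PMC Open Access list, its values are placeholders, and only the PMID and fpath column names are taken from the example above. Base R subsetting is used here so the sketch has no dependencies.

```r
# Toy stand-in for the PMC Open Access list (values are placeholders)
toy_pmclist <- data.frame(
  PMID  = c("101", "102", "103"),
  fpath = c("path/a.tar.gz", "path/b.tar.gz", "path/c.tar.gz"),
  stringsAsFactors = FALSE
)

# PMIDs as they might come back from a search
toy_pmids <- c("102", "103", "999")

# Keep only search hits that have an open-access full-text file
hits <- toy_pmclist[toy_pmclist$PMID %in% toy_pmids, ]
fpaths <- hits$fpath

# PMIDs with no entry in the list, i.e. no full text to retrieve
missing <- setdiff(toy_pmids, toy_pmclist$PMID)

print(fpaths)
print(missing)
```

Checking which PMIDs are missing from the list before requesting full text shows how much of a search result is actually covered by the open-access subset.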