Scrapes book details from Goodreads for the book IDs listed in a text file.

Usage

scrape_books(book_ids_path, use_parallel = FALSE, num_cores = 4)

Arguments

book_ids_path

Path to a text file containing Goodreads book IDs, one ID per line.

use_parallel

Logical indicating whether to scrape in parallel (default is FALSE).

num_cores

Number of CPU cores to use for parallel scraping (default is 4).
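
A minimal sketch of preparing these arguments. It assumes the ID file holds one numeric ID per line, as in the example further down; the source does not state the format explicitly. The helper `write_clean_ids` is hypothetical and not part of the package, and capping `num_cores` at the machine's core count is a suggestion rather than documented behaviour.

```r
# Hypothetical helper (not part of the package): write a cleaned ID file.
# Assumes one Goodreads book ID per line, digits only.
write_clean_ids <- function(ids, path = tempfile(fileext = ".txt")) {
  ids <- trimws(as.character(ids))
  ids <- unique(ids[nzchar(ids)])  # drop blank entries and duplicates
  bad <- ids[!grepl("^[0-9]+$", ids)]
  if (length(bad) > 0) {
    stop("Non-numeric book IDs: ", paste(bad, collapse = ", "))
  }
  writeLines(ids, path)
  path
}

ids_path <- write_clean_ids(c("1420", " 2767052", "", "10210", "1420"))

# Cap the worker count at the available cores, leaving one core free.
available <- parallel::detectCores()
if (is.na(available)) available <- 1L
num_cores <- max(1L, min(4L, available - 1L))

# The call needs network access, so it is shown here but not run:
# result <- scrape_books(ids_path, use_parallel = TRUE, num_cores = num_cores)
```

Validating the IDs before the call keeps malformed lines from turning into failed requests partway through a long scrape.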

Value

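A data frame (printed as a tibble in the example below) with one row per book. In that example it has 13 character columns: book_id, book_title, book_details, format, publication_info, authorlink, author, num_pages, genres, num_ratings, num_reviews, average_rating, and rating_distribution.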
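
The printed example shows every column as character (`<chr>`), so numeric fields likely need converting before analysis. The sketch below uses a mock data frame with invented values rather than real scraped data. It assumes the counts may contain thousands separators, which the source does not confirm. `to_number` is a hypothetical helper.

```r
# Mock result with a few of the character columns from the printed example;
# the values are invented for illustration, not real scraped data.
result <- data.frame(
  book_id = c("1", "2"),
  num_pages = c("289", "374"),
  num_ratings = c("1,234", "56"),
  average_rating = c("4.03", "4.35"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip thousands separators (an assumption about the
# scraped format) and convert the text to numeric.
to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))

numeric_cols <- c("num_pages", "num_ratings", "average_rating")
result[numeric_cols] <- lapply(result[numeric_cols], to_number)

str(result)
```

If a real column carries extra text, such as units, `as.numeric()` returns `NA` with a warning, which makes values needing further cleaning easy to spot.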

Examples

# \donttest{
# Create a temporary file with sample book IDs
temp_file <- tempfile(fileext = ".txt")
writeLines(c("1420", "2767052", "10210"), temp_file)

# Run the function (with a small delay to avoid overwhelming the server)
result <- scrape_books(temp_file, use_parallel = FALSE)
print(head(result))
#> # A tibble: 3 × 13
#>   book_id book_title      book_details format publication_info authorlink author
#>   <chr>   <chr>           <chr>        <chr>  <chr>            <chr>      <chr> 
#> 1 1420    Hamlet          "Among Shak… 289 p… First published… https://w… Willi…
#> 2 2767052 The Hunger Gam… "Could you … 374 p… First published… https://w… Suzan…
#> 3 10210   Jane Eyre       "Alternate … 532 p… First published… https://w… Charl…
#> # ℹ 6 more variables: num_pages <chr>, genres <chr>, num_ratings <chr>,
#> #   num_reviews <chr>, average_rating <chr>, rating_distribution <chr>
# Clean up: remove the temporary file
file.remove(temp_file)
#> [1] TRUE
# }