rstudio - Subset error after importing with readr ("arguments imply differing number of rows") in Quarto R chu

I can't subset rows after importing a CSV with readr::read_csv in a .qmd file.

Data example

Create a file called BSI_test8.csv with these two lines (a header and one data row):

id,data_collection_date,data_collection,reporting_anisation_code,specimen_date,specimen_time,week_no,icu_admission_date,icu_admission_time
478574,01/01/1900,ICU BSI,ABC1,01/01/1900,12:00,1,31/12/1899,23:00

Code

  1. Create a .qmd file
  2. Insert a new R code chunk
  3. Paste the following code in that chunk:
linelist_pbc_raw <-
    readr::read_csv("./BSI_test8.csv", show_col_types = FALSE)

linelist_pbc_raw[linelist_pbc_raw$specimen_date != linelist_pbc_raw$data_collection_date, ]
  4. Run the chunk.

Error

I get this error:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 0, 1

I don't understand what this error means, as both columns have the same number of rows (in this example, 1).
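
A quick way to check the claim about row counts, and to separate the subsetting step from the automatic printing of its result, is sketched below. It assumes readr is installed; the temporary file path and the names `csv_path`, `keep` and `result` are illustrative and not part of the original code. If the assignment succeeds and only displaying `result` fails, the error would come from how the chunk output is rendered rather than from the subsetting itself, but that is a hypothesis to test, not a confirmed diagnosis.

```r
# Recreate the one-row example in a temporary file (illustrative path).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,data_collection_date,data_collection,reporting_anisation_code,specimen_date,specimen_time,week_no,icu_admission_date,icu_admission_time",
  "478574,01/01/1900,ICU BSI,ABC1,01/01/1900,12:00,1,31/12/1899,23:00"
), csv_path)

linelist_pbc_raw <- readr::read_csv(csv_path, show_col_types = FALSE)

# Both columns should contain exactly one element.
length(linelist_pbc_raw$specimen_date)
length(linelist_pbc_raw$data_collection_date)

# The row filter on its own: a logical vector with one element.
# The two dates are identical in the example, so no row is kept.
keep <- linelist_pbc_raw$specimen_date != linelist_pbc_raw$data_collection_date
keep

# Assign the subset instead of letting the chunk auto-print it,
# then inspect its dimensions without rendering the table.
result <- linelist_pbc_raw[keep, ]
nrow(result)
```

Note that the expected result here has zero rows, which matches the "0" in the error message "differing number of rows: 0, 1".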

What works

I have found these workarounds:

  1. Running the exact same subsetting code directly from the console

  2. From an R chunk, removing some other date columns from the dataset before subsetting (in this MRE, just the time columns):

linelist_pbc_raw |>
    subset(specimen_date != data_collection_date,
           select = -c(specimen_time, icu_admission_time))
  3. From an R chunk, using data.table::fread to import the data before subsetting:
linelist_pbc_raw_datatable <-
    data.table::fread("./BSI_test8.csv")

linelist_pbc_raw_datatable[linelist_pbc_raw_datatable$specimen_date != linelist_pbc_raw_datatable$data_collection_date, ]

I imagine this might be a bug in the way that readr imports data without a final empty line and the way RStudio runs the R chunks within a qmd file. Any idea why this might happen?
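
Since dropping the time columns makes the error go away, one thing worth checking is which classes readr assigns to each column, and whether reading everything as plain character avoids the problem. The following is a minimal sketch under that assumption; it does not confirm the cause. The names `csv_path`, `col_classes` and `linelist_chr` are illustrative, and the temporary file stands in for BSI_test8.csv.

```r
# Recreate the one-row example in a temporary file (illustrative path).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,data_collection_date,data_collection,reporting_anisation_code,specimen_date,specimen_time,week_no,icu_admission_date,icu_admission_time",
  "478574,01/01/1900,ICU BSI,ABC1,01/01/1900,12:00,1,31/12/1899,23:00"
), csv_path)

# Default import: readr guesses a type for each column.
linelist_pbc_raw <- readr::read_csv(csv_path, show_col_types = FALSE)

# First class of each column; the time columns are the ones to look at,
# since removing them was one of the workarounds.
col_classes <- vapply(linelist_pbc_raw, function(x) class(x)[1], character(1))
col_classes

# Alternative import for comparison: force every column to character,
# so no special time class is involved.
linelist_chr <- readr::read_csv(
  csv_path,
  col_types = readr::cols(.default = readr::col_character())
)

# Same subsetting as in the question, on the all-character data.
linelist_chr[linelist_chr$specimen_date != linelist_chr$data_collection_date, ]
```

If the all-character version prints without error inside the chunk while the default import does not, that would point at the column classes rather than at a missing final newline in the file.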

R version 4.4.1
RStudio version 2024.9.0.375
OS: Windows 10 x64 (build 19045)
