Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openxlsx: bad interactions between writing data tables and images #463

Open
Arthfael opened this issue Feb 20, 2024 · 0 comments
Open

openxlsx: bad interactions between writing data tables and images #463

Arthfael opened this issue Feb 20, 2024 · 0 comments

Comments

@Arthfael
Copy link

Arthfael commented Feb 20, 2024

Background: I am slowly starting to use openxlsx2 instead of openxlsx, because openxlsx does not support some features I need (specifically, multiple text colors in a single cell). For practical reasons, and because styling is so easy in openxlsx (but so %@~#!$£?!!! in openxlsx2), I was hoping that I could perform some steps in openxlsx, save the workbook then re-open in openxlsx2. This has caused some issues with data tables and embedded images, which I had written into the wb using openxlsx. As I have investigated this, I have realized that these are issues with how openxlsx writes data tables and images: in some cases, an image will not be written even though it should be allowed at the location by Excel, if a data table is in the same workbook. This does not happen for all cells, the issue tends to only affect cells to the right (and below? not sure...) of the table. Also, adding the offending table after, not before the image, removes the image. See reprex below (openxlsx2 parts included because the errors and warnings might reveal some info about the issues):

remotes::install_github("ycphs/openxlsx")
remotes::install_github("JanMarvin/openxlsx2")
packageVersion("openxlsx")
packageVersion("openxlsx2")

# openxlsx phase
require(openxlsx)
df <- data.frame(A = 1:10, B = paste0("B", 1:10))
wb <- createWorkbook()
addWorksheet(wb, "Test")
writeDataTable(wb, "Test", df)
fl <- openxlsx::temp_xlsx(name = "temp_xlsx")
saveWorkbook(wb, fl, overwrite = TRUE)
openXL(fl)
#
# openxlsx2 phase
require(openxlsx2)
img <- system.file("extdata", "einstein.jpg", package = "openxlsx2")
wb2 <- wb_load(fl)
fl2 <- openxlsx2::temp_xlsx(name = "temp_xlsx", macros = FALSE)
wb_save(wb2, fl2)
# First issue:
# Warning message:
#  In wb$clone()$save(file = file, overwrite = overwrite, path = path) :
#  [worksheets] file expected to be in output is missing: ../drawings/vmlDrawing1.vml
#
dms <- wb_dims(1, 1)
wb2 <- wb_add_image(wb2, "Test", dims = dms, img)
# This throws error "Error in self$drawings[[sheet_drawing]] : subscript out of bounds"
dms <- wb_dims(15, 15)
wb2 <- wb_add_image(wb2, "Test", dims = dms, img)
# This too - this is not an issue with overlap with the table
wb2 <- wb_add_worksheet(wb2, "Test2")
dms <- wb_dims(1, 1)
wb2 <- wb_add_image(wb2, "Test2", dims = dms, img)
# Now the above works!
dms <- wb_dims(2, 2)
wb2 <- wb_add_image(wb2, "Test2", dims = dms, img)
# (and this too)
wb_save(wb2, fl2)
xl_open(fl2)
#
# First I thought openxlsx2 had a problem, but actually it seems it is openxlsx-related:
insertImage(wb, "Test", img) # No error, but...
saveWorkbook(wb, fl, overwrite = TRUE)
openXL(fl) # ... no image!
insertImage(wb, "Test", img, startRow = 15)
saveWorkbook(wb, fl, overwrite = TRUE)
openXL(fl) # ... no image!
insertImage(wb, "Test", img, startCol = 15)
saveWorkbook(wb, fl, overwrite = TRUE)
openXL(fl) # ... no image!
# In the past with openxlsx only I did not notice the problem, but I was embedding images first, then data tables:
removeWorksheet(wb, "Test") # Just remove the tabs, no need to start from a fresh workbook
addWorksheet(wb, "Test")
writeDataTable(wb, "Test", df, 10)
writeDataTable(wb, "Test", df, 15)
insertImage(wb, "Test", img)
insertImage(wb, "Test", img, startRow = 15)
insertImage(wb, "Test", img, startCol = 25)
saveWorkbook(wb, fl, overwrite = TRUE)
openXL(fl) # My first 2 images are here, not the 3rd
#
# I cannot show every combination, but at this stage it seems that it's not the order in which I embed
# images/data tables which matters, but whether images are in a space allowed.
# As long as there are tables in the workflow, any images overlapping with them or to their right (how far?)
# seem to be disallowed or removed.
#
# This limitation is not present in Excel.
#
# What about trying the same thing with openxlsx2?
wb2 <- wb_workbook()$
  add_worksheet()$
  add_data_table(x = df, start_col = 10)$
  add_data_table(x = df, start_col = 15)$
  add_image(file = img)$
  add_image(file = img, start_row = 15)$
  add_image(file = img, start_col = 25)$
  save(fl2)
xl_open(fl2)
# This works better than openxlsx but not perfectly:
# The 3 images are added, but all 3 are stacked in A1!
wb2 <- wb_workbook()$
  add_worksheet()$
  add_data_table(x = df, start_col = 10)$
  add_data_table(x = df, start_col = 15)$
  add_image(file = img)$
  add_image(file = img, dims = "A15")$
  add_image(file = img, dims = "Y1")$
  save(fl2)
xl_open(fl2)
# Now my images are where I want them.
# Changing the order works too:
wb2 <- wb_workbook()$
  add_worksheet()$
  add_image(file = img)$
  add_image(file = img, dims = "A15")$
  add_image(file = img, dims = "Y1")$
  add_data_table(x = df, start_col = 10)$
  add_data_table(x = df, start_col = 15)$
  save(fl2)
xl_open(fl2)
# Conclusions:
# The main issue is with openxlsx, though there are some issues with start_row/start_col being ignored in openxlsx2.
# The way I read this, openxlsx seems to introduce small issues in the excel sheet, which are ignored by Excel,
# but cause issues when the worksheet is loaded into openxlsx2.

Details about my session:

> sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2022 x64 (build 20348)

Matrix products: default


locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8   
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.utf8    

time zone: Europe/Vienna
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] openxlsx2_1.4       openxlsx_4.2.5.9001

loaded via a namespace (and not attached):
 [1] desc_1.4.3        R6_2.5.1          magrittr_2.0.3    remotes_2.4.2.1   ps_1.7.5          cli_3.6.2        
 [7] zip_2.3.1         processx_3.8.3    callr_3.7.3       compiler_4.3.2    rstudioapi_0.15.0 tools_4.3.2      
[13] pkgbuild_1.4.3    curl_5.2.0        Rcpp_1.0.12       stringi_1.8.3    
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant