tracker issue : CF-4204270

select a category, or use search below
(searches all categories and all time range)
Title:

cfspreadsheet crashes/locked up server with (238kb) file

| View in Tracker

Status/Resolution/Reason: To Track//PRNeedInfo

Reporter/Name(from Bugbase): Chad S. / ()

Created: 04/25/2019

Components: Document Management, Office Integration

Versions: 2016,2018

Failure Type: Crash

Found In Build/Fixed In Build: 2018,0,02,313961 /

Priority/Frequency: Normal / All users will encounter

Locale/System: English / Win 2012 Server x64

Vote Count: 0

Problem Description:
When trying to read certain excel documents we've run into an issue where it will crash our server. There are no errors, but Coldfusion continues to take up CPU and memory until it reaches it's limits, or the servers limits. 

You will notice that this spreadsheet has almost a million rows in it, except there is NO data. Something about how cfspreadsheet is reading this document is bugged as it's attempting to add blank rows. 

The problem is... we can't even read the file to tell if there is a problem prior to "fully reading" the file. 

Steps to Reproduce:
Use cfspreadsheet to try an open it. 

Actual Result:
Crashes or Locks up server

Expected Result:
Reads the file

Any Workarounds:
None at this time.

Attachments:

Comments:

Chad, The excel file you've shared has a hidden sheet called "page". Col "I" in that sheet has formulae that follow pattern similar to the following: =INDEX(C:C,MATCH(G2,A:A)) Here the address space C:C and A:A is quite exhaustive. CF internally uses POI library for spreadsheet manipulation, and it's running into memory outage when reading cell with such formulae. You can avoid the issue by using a more restricted search space such as A1:A1500. Let us know if this works to solve the issue.
Comment by Piyush K.
30710 | May 10, 2019 11:12:20 AM GMT
No, it doesn’t. There is no documented way to restrict search like that with cfspreadsheet. Suggestions: Add that capability to cfspreadsheet tag OR Create a new method that pulls high level information about a spreadsheet. Items like Total sheets, Total hidden sheets, total records per sheet. That information would allow any developer to quickly scan documents prior to a full read.
Comment by Chad S.
30745 | May 15, 2019 09:52:37 PM GMT
Piyush, first let me thank you for looking at this issue. I've had some more thoughts but let me state my goals here as your team might find alternative solutions to meet those goals When using cfspreadsheet: 1) Make sure files like these can't crash ColdFusion servers (or seriously load them down) 2) Add the capability to identify a host of format/template/etc issues prior to reading the entire document to memory. After looking at POI documentation the following should be possible and would be AMAZING to developers. It would make a newsworthy impact for ColdFusion developers with relatively little effort. Here's the link I used: https://poi.apache.org/apidocs/dev/org/apache/poi/ss/usermodel/Workbook.html Item #1 Create New Method: ReadTopLevel (or whatever you want to call it) This creates a Struct/Array with the following information: (Top Level) - Total Sheets/Tabs [ getNumberOfSheets() ] - Total Hidden Sheets/Tabs [ isSheetHidden() AND isSheetVeryHidden() ] - Total Hidden Columns in Sheets/Tabs (Top Level Sheet Data) Sheet/Tab - Name [ getSheetName() ] - Visibility [ getSheetVisibility() ] - Total Rows [ getPhysicalNumberOfRows() ] - Total Columns [ read first row only, getRow(0).getPhysicalNumberOfCells() ] - Column Names [ cell.getSheet().getRow(0).getCell(currentcellIndex) .getRichStringCellValue().toString() ] Item #2 Add functionality to cfspreadsheet would allow us to: - Read from only visible Sheets/Tabs - Read from only the first Sheet/Tab - Read from only Sheet/Tab with name "X" - Read only the top X rows (developers understand java starts with 0). Thank you so much for working with this issue.
Comment by Chad S.
30752 | May 16, 2019 05:34:31 PM GMT