tracker issue : CF-4203132

Title:

[cielen] cfhtmltopdfitem - error when used with special characters

Status/Resolution/Reason: To Fix/Withdrawn/HaveNewInfo

Reporter/Name(from Bugbase): Philipp Cielen / ()

Created: 07/24/2018

Components: Document Management, PDF Generation (CFHTML2PDF)

Versions: 13.0

Failure Type: Crash

Found In Build/Fixed In Build: CF 2018 /

Priority/Frequency: Normal / All users will encounter

Locale/System: German / Windows 10 64 bit

Vote Count: 0

This one was quite tricky to break down and isolate. cfhtmltopdfitem seems to throw an error as soon as more than three special characters are used anywhere in the code - even in comments!

Try this code:
 
<cfhtmltopdf>
<cfhtmltopdfitem type="header" >
	<cfoutput>
		ü
	</cfoutput>
</cfhtmltopdfitem>
<html><head><body><cfoutput>test ä</cfoutput></body></head></html>
<!--- öä --->
</cfhtmltopdf>

It throws the error "Error in handling header footer related attributes."

As soon as at least one of the four German special characters (öäü) on the page is removed, the error will no longer occur. 

Found in the release version of CF 2018

Attachments:

Comments:

Add -Dfile.encoding=UTF8 to the JVM arguments in the ColdFusion Administrator's server settings.
Comment by Ajay Kumar Rai
29448 | August 03, 2018 05:57:00 AM GMT
Hi Philipp, can you try adding the encoding flag -Dfile.encoding=UTF8 to the JVM arguments in the jvm.config file?
Comment by Kailash Bihani
29449 | August 09, 2018 10:19:07 AM GMT
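Editorial sketch: one way to confirm whether the -Dfile.encoding=UTF8 flag suggested above actually reached the JVM is to print the effective default charset. The class name DefaultCharsetCheck is hypothetical and not part of ColdFusion. The sketch assumes a JDK older than 18, where the default charset is derived from the file.encoding property at startup.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of ColdFusion: reports the encoding the JVM is using.
class DefaultCharsetCheck {

    // Returns the raw system property and the charset the JVM resolved from it.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run once without and once with -Dfile.encoding=UTF8 and compare the output.
        System.out.println(describe());
    }
}
```

Running it as `java -Dfile.encoding=UTF8 DefaultCharsetCheck` and again without the flag shows whether the setting changes the default on a given machine.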
The default encoder "ISO-8859-1" is inconsistent when generating byte streams for these German characters because they are not supported by the ISO-8859-1 encoding. When the Acrobat content reader does not receive the correct stream of bytes, it throws exceptions. These characters are supported by UTF-8; to use UTF-8, add "-Dfile.encoding=UTF8" to the JVM arguments in the ColdFusion Administrator's server settings, as demonstrated in the image below. !CF-4203132.JPG|thumbnail!
Comment by Ajay Kumar Rai
29450 | August 09, 2018 10:21:47 AM GMT
The default encoder "ISO-8859-1" is inconsistent when generating byte streams for these German characters because they are not supported by the ISO-8859-1 encoding. When the Acrobat content reader does not receive the correct stream of bytes, it throws exceptions. These characters are supported by UTF-8; to use UTF-8, add "-Dfile.encoding=UTF8" to the JVM arguments in the ColdFusion Administrator's server settings, as demonstrated in the image below. !image-2018-08-09-16-11-38-277.png!
Comment by Kailash Bihani
29451 | August 09, 2018 10:41:41 AM GMT
Kailash, thanks for the feedback. These characters (äöüßÄÖÜ) definitely are included in ISO-8859-1 as characters 00e4, 00f6, 00fc, 00df, 00c4, 00d6, 00dc. See https://en.wikipedia.org/wiki/ISO/IEC_8859-1#ISO/IEC_8859-1 - the included special characters are listed on the right-hand side. So from my point of view this definitely is a bug and not something that should be handled by changing the encoding via JVM settings. Also, the behavior of the tag is absolutely erratic: it DOES work when only very few special characters are included and only starts to break when there are more than three of them on the same page, which makes debugging this issue a nightmare. Did you test the code sample that I provided and get the same results as I did? Try removing just one of the special characters and everything should work fine. Please reconsider reopening and fixing this bug. The very least you could do would be to catch the error and provide a meaningful error message. Beyond that, I would expect to be able to change the encoding via tag settings and not via the JVM. But I really think that this should be fixed, as the tag does not work as intended. P.S.: I cannot see any of the images that you have attached to your comments. Maybe there is a setting for internal/external visibility of images that you need to change in the bug tracker?
Comment by Philipp Cielen
29617 | August 25, 2018 03:10:50 PM GMT
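Editorial sketch: the code points cited in the comment above can be checked with the standard Java charset API. The class and method names (Latin1Check, encodableInLatin1, latin1Hex) are hypothetical. The sketch only confirms that ISO-8859-1 can represent these characters; it does not reproduce or explain the cfhtmltopdfitem error itself.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: checks whether ISO-8859-1 can represent the German characters.
class Latin1Check {

    // The seven characters from the comment, written as Unicode escapes so the
    // result does not depend on the encoding of this source file.
    static final String GERMAN = "\u00e4\u00f6\u00fc\u00df\u00c4\u00d6\u00dc";

    // True if every character in s has a representation in ISO-8859-1.
    static boolean encodableInLatin1(String s) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(s);
    }

    // Encodes s as ISO-8859-1 and returns the bytes as space-separated hex.
    static String latin1Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.ISO_8859_1)) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(encodableInLatin1(GERMAN)); // prints "true"
        System.out.println(latin1Hex(GERMAN));         // prints "e4 f6 fc df c4 d6 dc"
    }
}
```

Each character encodes to a single byte whose value equals its code point, which matches the list 00e4, 00f6, 00fc, 00df, 00c4, 00d6, 00dc given in the comment.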
One more thing to note: There is no problem with the special characters included in the page itself, the problem only occurs when these characters are used within the header text (cfhtmltopdfitem type="header")
Comment by Philipp Cielen
29618 | August 25, 2018 03:14:59 PM GMT
Hi Philipp, On further investigation, we found that this is an issue with a library we are using. We have raised a bug with them. I have reopened this bug and will keep you posted on its progress. Thanks, Kailash
Comment by Kailash Bihani
29659 | September 04, 2018 07:47:26 AM GMT
Hi Kailash, thanks for following this up and reopening the bug! This issue is affecting our production applications, so a fix would really be appreciated. Note: I did not get an e-mail notification about your message this time (I re-checked my spam folder). Thanks, Philipp
Comment by Philipp Cielen
29703 | September 16, 2018 10:29:57 PM GMT