XML Special Character errors/handling #82

Open
deydist opened this issue Jul 20, 2023 · 1 comment
Labels
bug Something isn't working

deydist commented Jul 20, 2023

  • itoolkit version: 1.7.2
  • Python version: 3.8.10
  • OS Name and Version: Ubuntu 20.04.6
  • IBM i version: V7R4M0
  • XMLSERVICE version:

Describe the bug

How does itoolkit handle special characters in XML? The example below returns *BADPARSE. Our understanding is that CDATA should allow the use of special characters. I think in this case it's a period (.), based on these two lines of the trace:

3e3c215b43444154415b4b796c651a73 ><![CDATA[Kyle.s
2076616e5d5d3e3c2f646174613e0a3c  van]]></data>.<

# install packages:
#   itoolkit
#   pyodbc

import pyodbc

from itoolkit import iToolKit, iCmd, iData, iPgm
from itoolkit.transport import DatabaseTransport

text_key = 12021270
# `settings` is the reporter's own configuration object (not shown in the report)
iseries_connection = pyodbc.connect(settings.ISERIES_CONNECTION_DSN, timeout=1)

itransport = DatabaseTransport(iseries_connection)
itool = iToolKit()

itool.add(iCmd('chglibl',
               'CHGLIBL LIBL(DD1492BFDD DD1492BS XX1492BP DD1492BP U492BP U492AP U491AP U490AP)'))


# parameter types: 10a, 10a, 15p0, 3a, 10a, 7000a, 2a
itool.add(iPgm('CSBR926B', 'CSBR926B')
          .addParm(iData('IN_TextType', '10a', '*SALORDLIN'))
          .addParm(iData('IN_DocumentType', '10a', '*ALL'))
          .addParm(iData('IN_TextKey', '15p0', text_key))
          .addParm(iData('IN_Empty', '3a', ""))
          .addParm(iData('IN_User', '10a', "james"))
          .addParm(iData('OUT_TEXT', '7000a', ""))
          .addParm(iData('OUT_Error', '2a', ""))
          )
itool.call(itransport)

results = itool.dict_out('CSBR926B')
print(results)

XML_IN:

<?xml version='1.0'?>
<xmlservice><cmd exec="cmd" error="fast" var="chglibl"><![CDATA[CHGLIBL LIBL(DD1492BFDD DD1492BS XX1492BP DD1492BP U492BP U492AP U491AP U490AP)]]></cmd><pgm name="CSBR926B" error="fast" var="CSBR926B"><parm io="both" var="p1"><data type="10a" var="IN_TextType"><![CDATA[*SALORDLIN]]></data></parm><parm io="both" var="p2"><data type="10a" var="IN_DocumentType"><![CDATA[*ALL]]></data></parm><parm io="both" var="p3"><data type="15p0" var="IN_TextKey"><![CDATA[12021270]]></data></parm><parm io="both" var="p4"><data type="3a" var="IN_Empty"/></parm><parm io="both" var="p5"><data type="10a" var="IN_User"><![CDATA[james]]></data></parm><parm io="both" var="p6"><data type="7000a" var="OUT_TEXT"/></parm><parm io="both" var="p7"><data type="2a" var="OUT_Error"/></parm></pgm></xmlservice>

XML_OUT:

<?xml version="1.0" ?><xmlservice>
<error>*BADPARSE</error>
<error>

<![CDATA[ ?xml version='1.0'? xmlservice cmd exec="cmd" error="fast" var="chglibl" success 

![CDATA[+++ success CHGLIBL LIBL(DD1492BFDD DD1492BS XX1492BP DD1492BP U492BP U492AP U491AP U490AP)]] /success /cmd pgm name="CSBR926B" error="fast" var="CSBR926B" parm io="both" var="p1" data type="10a" var="IN_TextType" 

![CDATA[*SALORDLIN]] /data /parm parm io="both" var="p2" data type="10a" var="IN_DocumentType" 

![CDATA[*ALL]] /data /parm parm io="both" var="p3" data type="15p0" var="IN_TextKey" 

![CDATA[12021270]] /data /parm parm io="both" var="p4" data type="3a" var="IN_Empty" 

![CDATA[]] /data /parm parm io="both" var="p5" data type="10a" var="IN_User" 

![CDATA[james]] /data /parm parm io="both" var="p6" data type="7000a" var="OUT_TEXT" 
![CDATA[Kyle s van]] /data /parm parm io="both" var="p7" data type="2a" var="OUT_Error" 

![CDATA[]] /data /parm success 

![CDATA[+++ success CSBR926B]] /success /pgm /xmlservice ]]></error>
</xmlservice>

Trace

control Thu Jul 20 12:54:52 2023
 ipc(*na) ctl(*here *cdata) proc(QXMLSERV.iPLUGR512K)
input Thu Jul 20 12:54:52 2023
<?xml version='1.0'?>
<xmlservice><cmd exec="cmd" error="fast" var="chglibl"><![CDATA[CHGLIBL LIBL(DD1492BFDD DD1492BS XX1492BP DD1492BP U492BP U492AP U491AP U490AP)]]></cmd><pgm name="CSBR926B" error="fast" var="CSBR926B"><parm io="both" var="p1"><data type="10a" var="IN_TextType"><![CDATA[*SALORDLIN]]></data></parm><parm io="both" var="p2"><data type="10a" var="IN_DocumentType"><![CDATA[*ALL]]></data></parm><parm io="both" var="p3"><data type="15p0" var="IN_TextKey"><![CDATA[12021270]]></data></parm><parm io="both" var="p4"><data type="3a" var="IN_Empty"/></parm><parm io="both" var="p5"><data type="10a" var="IN_User"><![CDATA[james]]></data></parm><parm io="both" var="p6"><data type="7000a" var="OUT_TEXT"/></parm><parm io="both" var="p7"><data type="2a" var="OUT_Error"/></parm></pgm></xmlservice>

parse (fail) Thu Jul 20 12:54:52 2023
3c3f786d6c2076657273696f6e3d2731 <?xml version='1
2e30273f3e0a3c786d6c736572766963 .0'?>.<xmlservic
653e3c636d6420657865633d22636d64 e><cmd exec="cmd
22206572726f723d2266617374222076 " error="fast" v
61723d226368676c69626c223e3c7375 ar="chglibl"><su
63636573733e3c215b43444154415b2b ccess><![CDATA[+
2b2b2073756363657373204348474c49 ++ success CHGLI
424c204c49424c284444313439324246 BL LIBL(DD1492BF
44442044443134393242532058583134 DD DD1492BS XX14
39324250204444313439324250205534 92BP DD1492BP U4
39324250205534393241502055343931 92BP U492AP U491
415020553439304150295d5d3e3c2f73 AP U490AP)]]></s
7563636573733e0a3c2f636d643e0a3c uccess>.</cmd>.<
70676d206e616d653d22435342523932 pgm name="CSBR92
364222206572726f723d226661737422 6B" error="fast"
207661723d224353425239323642223e  var="CSBR926B">
0a3c7061726d20696f3d22626f746822 .<parm io="both"
207661723d227031223e0a3c64617461  var="p1">.<data
20747970653d2231306122207661723d  type="10a" var=
22494e5f5465787454797065223e3c21 "IN_TextType"><!
5b43444154415b2a53414c4f52444c49 [CDATA[*SALORDLI
4e5d5d3e3c2f646174613e0a3c2f7061 N]]></data>.</pa
726d3e0a3c7061726d20696f3d22626f rm>.<parm io="bo
746822207661723d227032223e0a3c64 th" var="p2">.<d
61746120747970653d22313061222076 ata type="10a" v
61723d22494e5f446f63756d656e7454 ar="IN_DocumentT
797065223e3c215b43444154415b2a41 ype"><![CDATA[*A
4c4c5d5d3e3c2f646174613e0a3c2f70 LL]]></data>.</p
61726d3e0a3c7061726d20696f3d2262 arm>.<parm io="b
6f746822207661723d227033223e0a3c oth" var="p3">.<
6461746120747970653d223135703022 data type="15p0"
207661723d22494e5f546578744b6579  var="IN_TextKey
223e3c215b43444154415b3132303231 "><![CDATA[12021
3237305d5d3e3c2f646174613e0a3c2f 270]]></data>.</
7061726d3e0a3c7061726d20696f3d22 parm>.<parm io="
626f746822207661723d227034223e0a both" var="p4">.
3c6461746120747970653d2233612220 <data type="3a" 
7661723d22494e5f456d707479223e3c var="IN_Empty"><
215b43444154415b5d5d3e3c2f646174 ![CDATA[]]></dat
613e0a3c2f7061726d3e0a3c7061726d a>.</parm>.<parm
20696f3d22626f746822207661723d22  io="both" var="
7035223e0a3c6461746120747970653d p5">.<data type=
2231306122207661723d22494e5f5573 "10a" var="IN_Us
6572223e3c215b43444154415b6a616d er"><![CDATA[jam
65735d5d3e3c2f646174613e0a3c2f70 es]]></data>.</p
61726d3e0a3c7061726d20696f3d2262 arm>.<parm io="b
6f746822207661723d227036223e0a3c oth" var="p6">.<
6461746120747970653d223730303061 data type="7000a
22207661723d224f55545f5445585422 " var="OUT_TEXT"
3e3c215b43444154415b4b796c651a73 ><![CDATA[Kyle.s
2076616e5d5d3e3c2f646174613e0a3c  van]]></data>.<
2f7061726d3e0a3c7061726d20696f3d /parm>.<parm io=
22626f746822207661723d227037223e "both" var="p7">
0a3c6461746120747970653d22326122 .<data type="2a"
207661723d224f55545f4572726f7222  var="OUT_Error"
3e3c215b43444154415b5d5d3e3c2f64 ><![CDATA[]]></d
6174613e0a3c2f7061726d3e0a3c7375 ata>.</parm>.<su
63636573733e3c215b43444154415b2b ccess><![CDATA[+
2b2b2073756363657373202043534252 ++ success  CSBR
393236425d5d3e3c2f73756363657373 926B]]></success
3e0a3c2f70676d3e0a3c2f786d6c7365 >.</pgm>.</xmlse
72766963653e rvice>
parse step: 2 (1-ok, 2-*BADPARSE, 3-*NOPARSE)
{'error': {'error': '*BADPARSE', 'error1': ' ?xml version=\'1.0\'? xmlservice cmd exec="cmd" error="fast" var="chglibl" success ![CDATA[+++ success CHGLIBL LIBL(DD1492BFDD DD1492BS XX1492BP DD1492BP U492BP U492AP U491AP U490AP)]] /success /cmd pgm name="CSBR926B" error="fast" var="CSBR926B" parm io="both" var="p1" data type="10a" var="IN_TextType" ![CDATA[*SALORDLIN]] /data /parm parm io="both" var="p2" data type="10a" var="IN_DocumentType" ![CDATA[*ALL]] /data /parm parm io="both" var="p3" data type="15p0" var="IN_TextKey" ![CDATA[12021270]] /data /parm parm io="both" var="p4" data type="3a" var="IN_Empty" ![CDATA[]] /data /parm parm io="both" var="p5" data type="10a" var="IN_User" ![CDATA[james]] /data /parm parm io="both" var="p6" data type="7000a" var="OUT_TEXT" ![CDATA[Kyle s van]] /data /parm parm io="both" var="p7" data type="2a" var="OUT_Error" ![CDATA[]] /data /parm success ![CDATA[+++ success CSBR926B]] /success /pgm /xmlservice ', 'CSBR926B': {...}}}
@deydist deydist added the bug Something isn't working label Jul 20, 2023

kadler (Member) commented Oct 19, 2023

It appears that your output contains the string "Kyle?s van", where ? is a substitute character (0x1A in UTF-8). Because XML does not allow control characters other than tab, LF, and CR, this causes a parsing error.
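
The byte in question can be located directly from the trace. The sketch below decodes the two hex lines that cover OUT_TEXT, scans them for control characters that XML 1.0 rejects, and checks that Python's own XML parser refuses the same content. It uses only the standard library; the helper names `find_illegal_xml_chars` and `parses` are made up for illustration and are not part of itoolkit or XMLSERVICE.

```python
import re
import xml.etree.ElementTree as ET

# C0 control characters that XML 1.0 forbids: everything below 0x20
# except tab (0x09), LF (0x0A) and CR (0x0D).
# (Surrogates and U+FFFE/U+FFFF are also forbidden; omitted for brevity.)
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_illegal_xml_chars(text):
    """Return (offset, code point) pairs for control characters XML 1.0 rejects."""
    return [(m.start(), ord(m.group())) for m in ILLEGAL_XML_CHARS.finditer(text)]


def parses(xml_text):
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Two consecutive hex lines copied from the "parse (fail)" trace.
raw = bytes.fromhex(
    "3e3c215b43444154415b4b796c651a73"
    "2076616e5d5d3e3c2f646174613e0a3c"
)
fragment = raw.decode("latin-1")  # one byte per character, so offsets match the dump
hits = find_illegal_xml_chars(fragment)

print(hits)
print(parses("<data><![CDATA[Kyle\x1as van]]></data>"))
print(parses("<data><![CDATA[Kyle's van]]></data>"))
```

The hex dump prints every non-printable byte as a period, which is why the 0x1A looks like "Kyle.s" in the right-hand column of the trace.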

We could try to work around this by replacing any substitute characters with the U+FFFD Unicode replacement character, which is allowed. I don't know if this is what you want, though, since your data will still have "garbage" in it. It's probably better to investigate why the output is getting converted improperly in the first place. Looking at the output, I'm guessing the data contains some sort of smart quote/apostrophe, e.g. Kyle’s van instead of Kyle's van.
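
Both points can be sketched with the standard library. The first helper shows what the workaround would amount to on the client side: swapping U+001A for U+FFFD so the document parses. The second shows that a right single quotation mark (U+2019) has no ISO-8859-1 encoding, which is consistent with the smart-quote guess, although the trace alone does not confirm which character was lost. `replace_substitutes` and `encodable_in_latin1` are hypothetical helpers, not existing itoolkit behavior.

```python
import xml.etree.ElementTree as ET

SUBSTITUTE = "\x1a"     # what the failed conversion left behind
REPLACEMENT = "\ufffd"  # Unicode replacement character, legal in XML


def replace_substitutes(xml_text):
    """Swap substitute characters for U+FFFD so the XML becomes parseable."""
    return xml_text.replace(SUBSTITUTE, REPLACEMENT)


def encodable_in_latin1(text):
    """Return True if every character of text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


damaged = "<data><![CDATA[Kyle\x1as van]]></data>"
repaired = ET.fromstring(replace_substitutes(damaged))

print(repr(repaired.text))
print(encodable_in_latin1("Kyle's van"))       # plain ASCII apostrophe
print(encodable_in_latin1("Kyle\u2019s van"))  # right single quotation mark
```

The repaired value still does not contain the original character, so this only hides the symptom; fixing the conversion is what recovers the data.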

Note that the IBM i Access ODBC driver uses the locale's encoding for character conversions. By default, Unix applications start up in the "C" locale, which uses CCSID 819 in the driver (ISO-8859-1). Likely what you want is to use a UTF-8 locale set by your environment variables. You can do this by calling setlocale from your main application in Python, as in the example shown here: https://docs.python.org/3/library/locale.html#locale.setlocale
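
A minimal sketch of that call, assuming the environment already names a UTF-8 locale (for example LANG=en_US.UTF-8, which is only an illustration) and that the call is made before the ODBC connection is opened. The wrapper `use_environment_locale` is a hypothetical helper; `locale.setlocale` and `locale.getpreferredencoding` are the standard-library calls.

```python
import locale


def use_environment_locale():
    """Adopt the locale named by LANG/LC_* and return its name, or None on failure."""
    try:
        # An empty string means "use whatever the environment variables specify".
        return locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale that is not installed on this machine.
        return None


active = use_environment_locale()
encoding = locale.getpreferredencoding(False)

print("active locale:", active)
print("preferred encoding:", encoding)
```

If the printed encoding is not UTF-8, the environment variables are probably still pointing at the "C" locale or another non-UTF-8 one.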

Alternatively, if you don't want (or can't) set the locale, you can set CCSID=1208 in your connection string or DSN to override the locale's encoding with UTF-8.
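
As a sketch of the connection-string variant: the snippet below only builds the string and does not connect. The host, user, and password are placeholders, `build_connection_string` is a hypothetical helper, and the driver name assumes the IBM i Access ODBC Driver is registered under that name on the client. With a DSN, as in the original report, the same CCSID=1208 setting would go in the DSN definition instead.

```python
def build_connection_string(system, user, password, ccsid=1208):
    """Assemble an ODBC connection string that forces the given CCSID."""
    parts = [
        "DRIVER={IBM i Access ODBC Driver}",  # assumed registered driver name
        f"SYSTEM={system}",
        f"UID={user}",
        f"PWD={password}",
        f"CCSID={ccsid}",  # 1208 = UTF-8, overriding the locale's encoding
    ]
    return ";".join(parts)


# Placeholder values for illustration only.
conn_str = build_connection_string("myibmi.example.com", "MYUSER", "MYPASSWORD")
print(conn_str)

# The string would then be passed to pyodbc, e.g.:
#   connection = pyodbc.connect(conn_str, timeout=1)
```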
