NetTalk Central

Topic: Unescaping (?) HTML-special-chars from a string back to normal ASCII

Wolfgang Orth

Hello all!

My SOAPServer receives data from the SOAPClient that sometimes contains characters like &, which get transformed to &amp; for transport.

<Company>Dunburry &amp; Brooks</Company>

The Server, however, needs the original string, not that converted character:

<Company>Dunburry & Brooks</Company>

With StringTheory's .HtmlEntityToDec() I can swap things like &copy; (the copyright symbol ©) to &#169;, but this is not exactly what I want <g>.

Is there something to convert it back, like .HTMLtoASCII() or similar?

Thanks in advance,
Wolfgang

Bruce

« Reply #1 on: October 05, 2014, 10:46:58 PM »
Hi Wolfgang,

This sort of encoding is part of the XML, and would usually get decoded for you by xFiles. But for the individual fields we're just using Between, so that's not happening.

I'll add some code to the template, but in the meantime you can do something like:

whatever = xml.decode(whatever)

xml is the name of any xFiles object. If one isn't declared, then declare one:
xml   xFileXml

whatever is the string containing the encoded chars.

cheers
Bruce

Wolfgang Orth

« Reply #2 on: October 05, 2014, 11:50:25 PM »
Thanks for your quick reply, Bruce, but I suspect this ain't gonna work.

The received XML sits in p_web.GetValue('xml'), but the template immediately puts it into a StringTheory object with strxml.SetValue(p_web.GetValue('xml')), so I have no chance to access the contents in time.


PrimeParameters   routine
! Start of "Start of PrimeParameters Routine"
! [Priority 5000]

!    dbgView(p_web.GetValue('xml')) ! this is the incoming REQUEST
   
! End of "Start of PrimeParameters Routine"
  If p_web.xml = 0   ! incoming parameters are just url encoded, either in the URL, or as post data, or as a cookie.
    .....
  Else   ! incoming parameters are in an xml structure
    strxml.SetValue(p_web.GetValue('xml'))
    Clear(myVariable)
    myVariable= strxml.between(p_web.Nocolon('<myVariable>',Net:SingleUnderscore+Net:NoSpaces),p_web.Nocolon('</myVariable>',Net:SingleUnderscore+Net:NoSpaces) , 1, 0, NET:NoCase)    ! that needs to be added also
   ...


The German umlauts get displayed correctly; it's the ampersand, < and > that cause the grief.

I was brave and looked into xFiles.CLW. There I found xFileXML.DecodeAmpersand  Procedure(String pStr,*String dStr).

I was even braver, took heart, and added a STOP() both before and inside the CASE construct.

Nothing happened... the earth did not tremble and did not open up and swallow me, but nothing else happened either: no STOP() fired.

I also tried with < and >; they too do not get converted from &lt; or &gt;.

Perhaps with .DecodeAmpersand we have the culprit?

Until later then,
Wolfgang


« Last Edit: October 05, 2014, 11:56:58 PM by Wolfgang Orth »

Bruce

« Reply #3 on: October 05, 2014, 11:54:40 PM »
>> Perhaps with .DecodeAmpersand we have the culprit?

No, that's a utility method, not used by the class. And in any event you're not using xFiles to parse the XML, so it can't be the cause.

The code I was suggesting should go after the call to StringTheory.Between.
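
A minimal sketch of that placement, assuming the generated PrimeParameters code quoted in reply #2, the field name myVariable from that code, and an xFiles object named xml declared as described in reply #1. The only added line is the call to decode; exactly where it can live (an embed point after the generated code, or the template itself) is not settled in this thread.

```clarion
xml   xFileXml     ! data section: any xFiles object will do (name is an assumption)

  ! ... in the Else branch of PrimeParameters (incoming parameters are in an xml structure)
    strxml.SetValue(p_web.GetValue('xml'))
    Clear(myVariable)
    myVariable= strxml.between(p_web.Nocolon('<myVariable>',Net:SingleUnderscore+Net:NoSpaces),p_web.Nocolon('</myVariable>',Net:SingleUnderscore+Net:NoSpaces) , 1, 0, NET:NoCase)
    myVariable = xml.decode(myVariable)   ! added: undo the XML escaping on this one field
```

With this in place, a field received as Dunburry &amp;amp; Brooks should end up in myVariable as Dunburry & Brooks.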

cheers
Bruce