Recently, after getting into XML parsing with PHP and realising how hard most of the functions were to use, I decided to put it down and require PHP 5 for all my projects. Great, I thought: SOAP! PHP's got a SoapClient class!
Personally, I didn't like the SOAP class. I'm happy to hardcode the values I send to the server, but I want to read the returned XML easily.
I looked around and found SimpleXML, and I like it! It worked well with all the snippets of XML I gave it. Well, until I actually used it on some live data!
Suddenly SimpleXML wouldn't give me anything usable from the SOAP reply…
Here's an example of XML that SimpleXML didn't like:
[sourcecode language="xml"]<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>[/sourcecode]
See any problem? Looks valid to me!
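The cause? Every colon in a tag or attribute name marks a namespace prefix, and SimpleXML's plain property access ($xml->Body and friends) only sees unprefixed names, so the prefixed elements and attributes look as if they aren't there. Colons inside the value of a tag or attribute are fine, though.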
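For comparison, here is a minimal sketch of the namespace-aware route, which leaves the XML untouched. It assumes PHP 5.2 or later (for the second argument to children()) and reuses the sample reply from above; the variable names are only for illustration.

[sourcecode language="php"]<?php
$in = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>
XML;

$xml = simplexml_load_string($in);

// Plain property access only looks at unprefixed children,
// so the prefixed soap:Body is not found this way.
$plain = count($xml->Body);

// Asking for the children that use the "soap" prefix does reach it.
$body = $xml->children('soap', true)->Body;

// GetProviders lives in the default namespace declared on the element itself.
$providers = $body->children('http://hostname/')->GetProviders;

echo $plain, "\n";
echo $body->getName(), "\n";
echo $providers->getName(), "\n";[/sourcecode]

That works, but it means carrying prefixes and namespace URIs around in the reading code, which is more ceremony than I wanted.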
So, what can I do? A mass-replace of all colons? That would potentially destroy my source data, since values such as the namespace URLs contain colons too.
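A quick sketch of the damage, using str_replace on a single line of the reply (the variable names are made up for the example):

[sourcecode language="php"]<?php
$line = '<GetProviders xmlns="http://hostname/" />';

// Blindly removing every colon also hits the URL inside the attribute value.
$naive = str_replace(':', '', $line);

echo $naive; // <GetProviders xmlns="http//hostname/" />[/sourcecode]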
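I came up with this short snippet of PHP regular expressions to strip the colons out of tag and attribute names only:
[sourcecode language="php"]// Tag names: <soap:Body> and </soap:Body> become <soapBody> and </soapBody>
$out = preg_replace('|<([/\w]+)(:)|m', '<$1', $in);
// Attribute names: xmlns:xsi="..." becomes xmlnsxsi="..."
$out = preg_replace('|(\w+)(:)(\w+=")|m', '$1$3', $out);[/sourcecode]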
The result after running both replacements:
[sourcecode language="xml"]<?xml version="1.0" encoding="utf-8"?>
<soapEnvelope xmlnsxsi="http://www.w3.org/2001/XMLSchema-instance" xmlnsxsd="http://www.w3.org/2001/XMLSchema" xmlnssoap="http://schemas.xmlsoap.org/soap/envelope/">
<soapBody>
<GetProviders xmlns="http://hostname/" />
</soapBody>
</soapEnvelope>
[/sourcecode]
As you can see, the general gist of the document is there, and it's parsable, just without the colons in the tag and attribute names that SimpleXML was tripping over.
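Putting it together, a small round trip might look like the sketch below. The function name strip_xml_colons() is my own invention for this example, and the reply is the sample from earlier, not real live data.

[sourcecode language="php"]<?php
// Hypothetical helper: removes the colon from prefixed tag and attribute names.
function strip_xml_colons($in)
{
    // <soap:Body> / </soap:Body>  ->  <soapBody> / </soapBody>
    $out = preg_replace('|<([/\w]+):|', '<$1', $in);
    // xmlns:xsi="..."  ->  xmlnsxsi="..."
    return preg_replace('|(\w+):(\w+=")|', '$1$2', $out);
}

$in = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>
XML;

$xml = simplexml_load_string(strip_xml_colons($in));

// With the prefixes gone, plain property access reaches every element.
$root = $xml->getName();
$leaf = $xml->soapBody->GetProviders->getName();

echo $root, "\n"; // soapEnvelope
echo $leaf, "\n"; // GetProviders[/sourcecode]

Bear in mind that the second pattern would also rewrite text content that happens to look like name:attr=", so this is a shortcut for replies you know the shape of and not a general-purpose XML fix.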