PHP: SimpleXML + SOAP
Recently, after getting into XML parsing with PHP and realising how hard most of the functions were to use, I decided to give up on them and require PHP5 for all my projects. Great, I thought: SOAP, PHP's got a SoapClient class!
Personally I didn't like the Soap class. I'm happy to hardcode the values I send to the server, but I want to read the returned XML easily.
I looked around and found SimpleXML, and I like it! It worked well with all the snippets of XML I gave it. Well, until I actually used it on some live data!
Suddenly SimpleXML refuses to parse the SOAP reply…
Here's an example of XML SimpleXML didn't like:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetProviders xmlns="http://hostname/" />
  </soap:Body>
</soap:Envelope>
See any problem? Looks valid to me!
The cause of it not working, though? SimpleXML doesn't cope well with colons (:) in tag names or attribute names. Those colons mark XML namespace prefixes, and prefixed elements don't show up through SimpleXML's normal property access. If the colon is contained within the value of the tag or attribute, it's OK though.
So, what can I do? A mass-replace of all colons? That'd potentially destroy my source data.
I came up with this short snippet of PHP regular expressions to strip out any colons in the tags/attributes:
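A small sketch of the symptom, assuming a trimmed-down copy of the reply above (the variable names are mine). The colons are namespace prefixes, and as far as I can tell the usual property access simply doesn't see elements that carry a prefix:

```php
<?php
// Trimmed copy of the SOAP reply (the unused xsi/xsd declarations are left out).
$in = '<?xml version="1.0" encoding="utf-8"?>'
    . '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    . '<soap:Body><GetProviders xmlns="http://hostname/" /></soap:Body>'
    . '</soap:Envelope>';

$xml = simplexml_load_string($in);

// getName() returns the local name of the root, without the "soap:" prefix.
$rootName = $xml->getName();

// Plain property access skips children that carry a namespace prefix,
// so the <soap:Body> element is not found this way.
$bodyCount = count($xml->Body);

echo "Root: $rootName, Body elements seen: $bodyCount\n";
```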
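// Drop the colon from prefixed tag names: <soap:Body> and </soap:Body> become <soapBody> and </soapBody>
$out = preg_replace('|<([/\w]+)(:)|m','<$1',$in);
// Drop the colon from prefixed attribute names: xmlns:xsi="..." becomes xmlnsxsi="..."
$out = preg_replace('|(\w+)(:)(\w+=\")|m','$1$3',$out);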
The result after this has been done:
<?xml version="1.0" encoding="utf-8"?>
<soapEnvelope xmlnsxsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlnsxsd="http://www.w3.org/2001/XMLSchema"
              xmlnssoap="http://schemas.xmlsoap.org/soap/envelope/">
  <soapBody>
    <GetProviders xmlns="http://hostname/" />
  </soapBody>
</soapEnvelope>
As you can see, the general gist of the document is there, and it's parsable, just without the colons where SimpleXML can't handle them.
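Putting it together, here is a minimal sketch of how the two replacements might be wrapped up and fed to SimpleXML. The function name strip_xml_prefix_colons and the $reply variable are made up for this example, and the reply is a trimmed copy of the one above:

```php
<?php
// Hypothetical helper: applies the two replacements from the post.
function strip_xml_prefix_colons($in)
{
    // <soap:Body> and </soap:Body> become <soapBody> and </soapBody>
    $out = preg_replace('|<([/\w]+)(:)|m', '<$1', $in);
    // xmlns:soap="..." becomes xmlnssoap="..."
    $out = preg_replace('|(\w+)(:)(\w+=\")|m', '$1$3', $out);
    return $out;
}

// Trimmed copy of the SOAP reply.
$reply = '<?xml version="1.0" encoding="utf-8"?>'
       . '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
       . '<soap:Body><GetProviders xmlns="http://hostname/" /></soap:Body>'
       . '</soap:Envelope>';

$clean = strip_xml_prefix_colons($reply);
$xml   = simplexml_load_string($clean);

// With the prefixes gone, ordinary property access reaches the body.
$providers = $xml->soapBody->GetProviders;
echo $xml->getName(), " contains ", count($providers), " GetProviders element(s)\n";
```

If rewriting the document feels too blunt, SimpleXML can also be asked for prefixed elements directly by passing the namespace URI to children(), for example $xml->children('http://schemas.xmlsoap.org/soap/envelope/')->Body on the untouched reply. I haven't used that here, so treat it as a pointer rather than a tested recipe.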