PHP: SimpleXML + SOAP

Recently, after getting into XML parsing with PHP and realising how hard most of the functions were to use, I decided to put it down and require PHP 5 for all my projects. Great, I thought: SOAP! PHP’s got a SoapClient class!

Personally, I didn’t like the SOAP class. I’m happy to hardcode the values I send to the server, but I want to read the returned XML easily.

I looked around and found SimpleXML, and I like it! It worked well with all the snippets of XML I gave it... well, until I actually used it on some live data!

Suddenly SimpleXML refuses to parse the SOAP reply…

Here’s an example of XML that SimpleXML didn’t like:

[sourcecode language="xml"]<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>[/sourcecode]

See any problem? Looks valid to me!

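The cause of it not parsing, though? SimpleXML doesn’t like colons (:) in tag names or attribute names! The colons mark XML namespace prefixes, and plain -> property access doesn’t reach prefixed elements. If a colon is contained within the value of a tag or attribute, it’s OK though.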
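To make the symptom concrete, here is a minimal sketch using the reply from above. It assumes the problem shows up as plain -> access finding nothing under the prefixed elements. The children() call with the SOAP envelope namespace URI is included only for contrast; it is not the route the rest of this post takes. The variable names are mine, purely for illustration.

```php
<?php
// The SOAP reply from above, kept verbatim in a nowdoc string.
$soap = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>
XML;

$xml = simplexml_load_string($soap);

// Plain property access skips elements that carry a namespace prefix,
// so <soap:Body> is not found this way.
$plainCount = count($xml->Body);

// For contrast: asking for the children in the SOAP envelope namespace
// does reach the Body element.
$nsCount = count($xml->children('http://schemas.xmlsoap.org/soap/envelope/')->Body);

echo "plain access: $plainCount, namespace-aware access: $nsCount\n";
```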

So, what can I do? A mass-replace of all colons? That’d potentially destroy my source data.
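A quick sketch of why the blunt approach is risky: removing every colon with str_replace() also hits colons inside attribute values, such as the namespace URL. The sample string is just a shortened piece of the reply above.

```php
<?php
// A shortened piece of the SOAP reply, for illustration only.
$in = '<soap:Body><GetProviders xmlns="http://hostname/" /></soap:Body>';

// Naive approach: remove every colon, wherever it appears.
$naive = str_replace(':', '', $in);

// The tag names are fixed, but the URL in the attribute value has lost its colon too.
echo $naive, "\n";
```

The tag names come out the way I want, but "http://hostname/" turns into "http//hostname/", which is exactly the kind of data damage I’d like to avoid.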

I came up with this short snippet of PHP regular expressions to strip out any colons in the tags/attributes:

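[sourcecode language="php"]// Strip the colon from prefixed tag names, e.g. <soap:Body> becomes <soapBody>
$out = preg_replace('|<([/\w]+)(:)|m', '<$1', $in);
// Strip the colon from prefixed attribute names, e.g. xmlns:xsi="..." becomes xmlnsxsi="..."
$out = preg_replace('|(\w+)(:)(\w+=\")|m', '$1$3', $out);[/sourcecode]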

The result after this has been done:

[sourcecode language="xml"]<?xml version="1.0" encoding="utf-8"?>
<soapEnvelope xmlnsxsi="http://www.w3.org/2001/XMLSchema-instance" xmlnsxsd="http://www.w3.org/2001/XMLSchema" xmlnssoap="http://schemas.xmlsoap.org/soap/envelope/">
<soapBody>
<GetProviders xmlns="http://hostname/" />
</soapBody>
</soapEnvelope>
[/sourcecode]

As you can see, the general gist of the document is there, and it’s parsable, just without the colons where SimpleXML can’t handle them.
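Putting it together, here is a rough sketch of how the cleaned-up reply can be read with SimpleXML. The helper name strip_tag_colons() is made up for this example; it only wraps the two preg_replace() calls from above. Keep in mind the patterns are simple, so text content that happens to look like a prefixed attribute (something like a:b=") would be altered as well.

```php
<?php
// Hypothetical helper name; it only wraps the two preg_replace() calls from the post.
function strip_tag_colons($in)
{
    // Drop the colon from prefixed tag names: <soap:Body> becomes <soapBody>
    $out = preg_replace('|<([/\w]+)(:)|m', '<$1', $in);
    // Drop the colon from prefixed attribute names: xmlns:xsi="..." becomes xmlnsxsi="..."
    return preg_replace('|(\w+)(:)(\w+=")|m', '$1$3', $out);
}

// The SOAP reply from the post, kept verbatim in a nowdoc string.
$reply = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>
XML;

$xml = simplexml_load_string(strip_tag_colons($reply));

// With the prefixes gone, ordinary property access works.
$rootName  = $xml->getName();
$bodyCount = count($xml->soapBody);
$providers = count($xml->soapBody->GetProviders);

echo "$rootName: $bodyCount body element(s), $providers GetProviders element(s)\n";
```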