Recently, after getting into XML parsing with PHP and realising how hard most of the functions were to use, I decided to give up on them and require PHP5 for all my projects. Great, I thought: SOAP! PHP’s got a SoapClient class!

Personally, I didn’t like the Soap class. I’m happy to hardcode the values I send to the server, but I want to read the returned XML easily.

I looked around and found SimpleXML, and I like it! It worked well with all the snippets of XML I gave it. Well, until I actually used it on some live data!

Suddenly SimpleXML refused to parse the SOAP reply…

Here’s an example of XML that SimpleXML didn’t like:

[sourcecode language="xml"]<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="" xmlns:xsd="" xmlns:soap="">
<GetProviders xmlns="http://hostname/" />[/sourcecode]

See any problem? Looks valid to me!

The cause of it not parsing, though? As far as I could tell, SimpleXML doesn’t like colons (:), i.e. namespace prefixes, in tag names or attribute names! If a colon is contained within the value of a tag or attribute, it’s OK though.
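If you want to check what the parser itself objects to rather than guess, a minimal sketch like the one below collects libxml’s error messages. The helper name `xml_parse_errors` and the `$soapReply` sample are made up for illustration (the closing envelope tag is added so the sample is complete), and the messages you get back will depend on your libxml version.

[sourcecode language="php"]<?php
// Hypothetical helper: returns libxml's messages for a string,
// or an empty array if the string parsed without complaints.
function xml_parse_errors($xmlString)
{
    // Collect errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    simplexml_load_string($xmlString);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
    }

    // Restore the previous error handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Assumed sample, based on the reply shown above.
$soapReply = '<?xml version="1.0" encoding="utf-8"?>'
    . '<soap:Envelope xmlns:xsi="" xmlns:xsd="" xmlns:soap="">'
    . '<GetProviders xmlns="http://hostname/" />'
    . '</soap:Envelope>';

foreach (xml_parse_errors($soapReply) as $message) {
    echo $message, "\n";
}[/sourcecode]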

So, what can I do? A mass-replace of all colons? That’d potentially destroy my source data…

I came up with this short snippet of PHP regular expressions to strip the colons out of tag and attribute names:

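[sourcecode language="php"]// Tag names: <soap:Envelope> becomes <soapEnvelope>
$out = preg_replace('|<([/\w]+)(:)|m', '<$1', $in);
// Attribute names: xmlns:xsi="" becomes xmlnsxsi=""
$out = preg_replace('|(\w+)(:)(\w+=\")|m', '$1$3', $out);[/sourcecode]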

The result after this has been done:

[sourcecode language="xml"]<?xml version="1.0" encoding="utf-8"?>
<soapEnvelope xmlnsxsi="" xmlnsxsd="" xmlnssoap="">
<GetProviders xmlns="http://hostname/" />[/sourcecode]

As you can see, the general gist of the document is there, and it’s parsable, just without the colons where SimpleXML can’t handle them.
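Putting it together, here is a rough sketch of how the two replacements could be wrapped up and fed to SimpleXML. The function name `strip_xml_colons` and the `$reply` sample are made up for illustration, and the closing envelope tag is added so the sample is a complete document.

[sourcecode language="php"]<?php
// Hypothetical wrapper around the two replacements.
function strip_xml_colons($in)
{
    // Tag names: <soap:Envelope> becomes <soapEnvelope>
    $out = preg_replace('|<([/\w]+)(:)|m', '<$1', $in);
    // Attribute names: xmlns:xsi="" becomes xmlnsxsi=""
    return preg_replace('|(\w+)(:)(\w+=\")|m', '$1$3', $out);
}

// Assumed sample reply, based on the example in this post.
$reply = '<?xml version="1.0" encoding="utf-8"?>'
    . '<soap:Envelope xmlns:xsi="" xmlns:xsd="" xmlns:soap="">'
    . '<GetProviders xmlns="http://hostname/" />'
    . '</soap:Envelope>';

$stripped = strip_xml_colons($reply);
$xml = simplexml_load_string($stripped);

if ($xml === false) {
    echo "Still not parsable\n";
} else {
    // The root element is now reachable under its colon-free name.
    echo $xml->getName(), "\n";
}[/sourcecode]

Keep in mind that the second pattern looks for anything shaped like `name:name="`, so text content that happens to contain that shape would lose its colon too.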