PHP: SimpleXML + SOAP

Recently, after getting into XML parsing with PHP and realising how hard most of the functions were to use, I decided to put it down and require PHP5 for all my projects. Great, I thought: SOAP, and PHP's got a SoapClient class!

Personally I didn't like the Soap class. I'm happy to hardcode the values I send to the server, but I want to read the returned XML easily.

I looked around and found SimpleXML, and I like it! It worked well with all the snippets of XML I gave it. Well, until I actually used it on some live data!

Suddenly SimpleXML refuses to parse the SOAP reply…

Here's an example of XML that SimpleXML didn't like:

[sourcecode language="xml"]<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>[/sourcecode]

See any problem? Looks valid to me!

The cause of it not parsing, though? As far as I could tell, SimpleXML doesn't like any colons (:) in tag names or attribute names! If the colon is contained within the value of a tag or attribute, it's OK though.

So, what can I do? A mass-replace of all colons? That'd potentially destroy my source data.

I came up with this short snippet of PHP regular expressions to strip out any colons in the tag and attribute names:

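[sourcecode language="php"]// Strip the colon from prefixed tag names: <soap:Body> becomes <soapBody>
$out = preg_replace('|<([/\w]+)(:)|m', '<$1', $in);
// Strip the colon from prefixed attribute names: xmlns:xsi="..." becomes xmlnsxsi="..."
$out = preg_replace('|(\w+)(:)(\w+=\")|m', '$1$3', $out);[/sourcecode]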

The result after this has been done:

[sourcecode language="xml"]<?xml version="1.0" encoding="utf-8"?>
<soapEnvelope xmlnsxsi="http://www.w3.org/2001/XMLSchema-instance" xmlnsxsd="http://www.w3.org/2001/XMLSchema" xmlnssoap="http://schemas.xmlsoap.org/soap/envelope/">
<soapBody>
<GetProviders xmlns="http://hostname/" />
</soapBody>
</soapEnvelope>
[/sourcecode]

As you can see, the general gist of the document is there, and it's parsable, just without the colons where SimpleXML can't handle them.
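
For anyone who wants to try this end to end, here's a minimal sketch that wraps the two regular expressions from this post in a function and hands the result to simplexml_load_string(). The function name strip_colons() and the variable names are made up for illustration; the sample XML is the reply shown earlier.

```php
<?php
// Hypothetical helper (the name is an assumption, not part of any library):
// removes the colon from prefixed tag names and attribute names.
function strip_colons($in)
{
    // <soap:Body> and </soap:Body> become <soapBody> and </soapBody>
    $out = preg_replace('|<([/\w]+)(:)|m', '<$1', $in);
    // xmlns:xsi="..." becomes xmlnsxsi="..."
    return preg_replace('|(\w+)(:)(\w+=\")|m', '$1$3', $out);
}

$in = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>
XML;

$xml = simplexml_load_string(strip_colons($in));

echo $xml->getName(), "\n";           // soapEnvelope
echo $xml->soapBody->getName(), "\n"; // soapBody
```

Colons inside attribute values (the URLs, for example) are left alone, because neither pattern matches a colon that is followed by a slash.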

11 thoughts on “PHP: SimpleXML + SOAP”

  1. hey there. i just ran across the same issue today. i found a more simple method after trial & error: simply replace <soap:Body> and </soap:Body> with an empty string. SimpleXML will parse the XML with no problems after that happens.

    i’m not sure why it chokes on namespaces like that, but there might be some settings we’re overlooking to make it work properly? though it does seem like a bug more than anything.

  2. Possibly a setting, I guess, but I never found anything.

    Some SOAP responses have data outside the <soap:Body> element that is needed, however, such as in the soap:Header.

  3. Actually, having done some further investigation, I suspect that SimpleXML is parsing your SOAP response correctly, but it doesn’t provide you easy access to the nodes which aren’t in the default namespace. You can check if it’s parsing the document correctly by doing echo $xml_obj->asXML();. If you get the same output as input it’s being parsed.

    Now to access the nodes not in the default namespace (probably most of them) have a look here:

    http://devzone.zend.com/node/view/id/688

    It’s a bit of a pain, and you might still like your method better, but this is, I guess, how SimpleXML is intended to be used when namespaces are involved.

  4. Hm, you're right, Simon. I guess that is how they've designed it to work (I still can't manage to navigate the structure it's made, however).

    I remember I was also getting fatal errors from SimpleXML with certain SOAP headers. I can't find an example of that right now though.

  5. Thanks for this. I just spent an hour trying to figure out what was happening inside a simplexml object and getting nowhere. So annoying that you cannot interrogate object values with print_r.

    Tip: I ended up using unserialize to parse it into an array and verify that the XML was parsable. Seeing that it was led me to search the web and find the SOAP problem!

  6. I had some trouble with SOAP and, being new blood to programming, hoped I could find the answer out there. Through reading a few different blogs I have successfully parsed a SOAP-header-structured XML file. One big problem is definitely the colons, on which other people have given far more expanded opinions that are over my head, to be honest. As I say, I am a noob!

    The other factor may be (I am just spitballing) that the structure is not what SimpleXML is expecting. Changing the pointer seems to work, so instead of SimpleXML only seeing two nodes and passing through foreach twice, it sees all the children.
    For me the pointer was: $xml->soapBody->PostAdvert->Adverts

    If anyone wants to expand on this and make sense of it, feel free. Code below:

    $src = fopen("original_file.txt", "r");
    if (!$src) {
        die("Failed to open the source file");
    }
    $dest = fopen("write_to_file.xml", "w");
    if (!$dest) {
        die("Failed to open the destination file");
    }
    while (($data = fgets($src)) !== false) {
        echo "Starting preg_replace";
        // The pattern was partly eaten by the comment form; this is the likely original.
        // It drops the colon from prefixed tag names, e.g. <soap:Body> becomes <soapBody>.
        $xmlString = preg_replace('/(<\/?)(\w+):([^>]*>)/', '$1$2$3', $data);

        fwrite($dest, $xmlString);
    }
    fclose($src);
    fclose($dest);

    // Load the preg_replaced file into SimpleXML
    $xml = simplexml_load_file("write_to_file.xml");

    // Set the level at which the XML is read (I think)
    $result = $xml->soapBody->Start_node->Whatever_it_may_be;

    // Extract children
    foreach ($result->children() as $child) {
        echo $child->child_node;
    }

  7. Hi, great solution! Thanks!

    The only thing I had to add was:

    $response = str_replace(”, ”, $response);

    But your solution works!

    Thanks!
    Nik
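
The namespace-aware route suggested in comment 3 can be sketched roughly as follows, without touching the colons at all. This assumes the sample reply from the post; the variable names are made up for illustration, and the two namespace URIs are copied from the xmlns declarations in that sample.

```php
<?php
$in = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetProviders xmlns="http://hostname/" />
</soap:Body>
</soap:Envelope>
XML;

// Parse the reply unmodified; the prefixes are declared, so the XML is well-formed.
$xml = simplexml_load_string($in);

// Namespace URIs taken from the xmlns declarations in the sample reply.
$soapNs = 'http://schemas.xmlsoap.org/soap/envelope/';
$appNs  = 'http://hostname/';

// children($uri) selects the child elements that live in that namespace.
$body = $xml->children($soapNs)->Body;
$call = $body->children($appNs)->GetProviders;

echo $xml->getName(), "\n";  // Envelope
echo $body->getName(), "\n"; // Body
echo $call->getName(), "\n"; // GetProviders
```

Note that getName() returns the local name without the soap: prefix, which is part of why the structure can look empty when you navigate it with plain property access.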
