Sample code used in this posting is ColdFusion-specific, but the XPath syntax will likely apply to many XPath implementations in other languages.
I was looking for a specific set of elements in a SOAP response and kept getting an empty array back from XMLSearch. It turned out to be related to how the namespaces in the response were defined and assigned to certain elements (see this Talking Tree posting).
The information in the Talking Tree posting helped me identify my problem, but in my case it wasn't about noname namespaces. In fact, what if you want to find all elements of a given name and don't care about the namespace, or even where the element lies in the hierarchy? In the example below, I want to quickly retrieve all of the Response elements. Note that one Response element is a child of Other, while the other two are direct children of ResponseList.
<ResponseList xmlns="urn:shama.lama.dingdong.net">
<Response>
<ns1:success xmlns:ns1="urn:core.shama.lama.dingdong.net">true</ns1:success>
<baseRef internalId="1234" xmlns:ns2="urn:core.shama.lama.dingdong.net"/>
</Response>
<Response>
<ns3:status xmlns:ns3="urn:core.shama.lama.dingdong.net">
<ns3:success>false</ns3:success>
<ns3:statusDetail type="ERROR">
<ns3:code>USER_ERROR</ns3:code>
<ns3:message>That record does not exist.</ns3:message>
</ns3:statusDetail>
</ns3:status>
<baseRef internalId="4421" xmlns:ns4="urn:core.shama.lama.dingdong.net"/>
</Response>
<Other>
<Response>
<ns3:status xmlns:ns3="urn:core.shama.lama.dingdong.net">
<ns3:success>false</ns3:success>
<ns3:statusDetail type="ERROR">
<ns3:code>RECORDNOTFOUND_ERROR</ns3:code>
<ns3:message>That record does not exist.</ns3:message>
</ns3:statusDetail>
</ns3:status>
</Response>
<warning>Import timed out briefly and process was restarted. No further errors were reported.</warning>
</Other>
</ResponseList>
Now if it weren't for the namespaces, I could simply use the following:
<cfset MyArray = XMLSearch(MyXMLDoc, "//Response")>
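...but in this case that would return an empty array. The Response elements are in the default namespace declared on ResponseList, and a plain name test in XPath 1.0 only matches elements that are in no namespace at all.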
Ok, but what about the noname namespace syntax:
<cfset MyArray = XMLSearch(MyXMLDoc, "//:Response")>
That's fine if a noname namespace is declared directly on that element, but here the xmlns declaration sits on the ResponseList root rather than on Response itself.
After trying countless variations of XPath syntax, I still couldn't get anything other than a blank array returned. I became desperate and quickly wrote a function that strips all of the namespace definitions and labels out of the XML, then searched on the results of that. But I felt this was rather convoluted and added too much overhead. Surely there was a better way. I kept searching, and lo and behold, came across the local-name function. If you want to ignore namespaces and hierarchical context completely, you can search by the local name of the element:
<cfset MyArray = XMLSearch(MyXMLDoc, "//*[local-name()='Response']")>
And now you have your array containing the three Response elements.
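For anyone who wants to try this without a live SOAP response, here is a minimal, self-contained sketch. The trimmed-down XML and the variable names (SampleXML, SampleDoc, PlainMatches, LocalMatches, StrictMatches) are placeholders made up for illustration, not part of the original response. The last search adds namespace-uri(), a standard XPath 1.0 function, as an optional variation for when you want to ignore prefixes but still limit matches to one namespace. I'm assuming your CF version's XPath engine handles it the same way it handles local-name().

```cfml
<!--- Hypothetical sample data: a cut-down version of the response above --->
<cfsavecontent variable="SampleXML">
<ResponseList xmlns="urn:shama.lama.dingdong.net">
  <Response><baseRef internalId="1234"/></Response>
  <Other>
    <Response><baseRef internalId="4421"/></Response>
  </Other>
</ResponseList>
</cfsavecontent>

<!--- Trim the leading whitespace that cfsavecontent captures before parsing --->
<cfset SampleDoc = XmlParse(Trim(SampleXML))>

<!--- Plain name test: only looks for Response elements in no namespace --->
<cfset PlainMatches = XmlSearch(SampleDoc, "//Response")>

<!--- Match on the local name only, ignoring namespace and depth --->
<cfset LocalMatches = XmlSearch(SampleDoc, "//*[local-name()='Response']")>

<!--- Optional variation (assumption): same local name, restricted to one namespace URI --->
<cfset StrictMatches = XmlSearch(SampleDoc, "//*[local-name()='Response' and namespace-uri()='urn:shama.lama.dingdong.net']")>

<!--- Print how many nodes each expression returned --->
<cfoutput>
Plain name test: #ArrayLen(PlainMatches)#<br>
local-name(): #ArrayLen(LocalMatches)#<br>
local-name() plus namespace-uri(): #ArrayLen(StrictMatches)#<br>
</cfoutput>
```

Comparing the three counts on your own server is a quick way to confirm how your CF version treats each expression before pointing the search at a large document.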
March 11, 2008 UPDATE: Ryan commented that he had tried the syntax below with similar success. I have not tried this myself, but give it a shot:
<cfset MyArray = XMLSearch(MyXMLDoc, "//*:Response")>

7 comments:
Excellent solution.
Hey,
I too came to this solution after a lot of head scratching, but now that my source XML is getting quite large I'm finding XmlSearch really slow. Have you experienced any performance problems?
Colin, actually I have run into that. Perhaps there are XML parsing performance improvements between MX 6 and 7 (we only just started to use 7 in production in the last few months where I work), but my experience with extremely large xml docs in 6 was similar to what you describe. Unfortunately I'm not aware of an elegant solution. I'm sure doing searches for explicit paths (as opposed to wildcard searches) would help quite a bit, but sometimes that can hurt the extensibility of the code.
Great stuff - I also thought namespace stripping was the answer, but you've opened my eyes to a more elegant solution. Thank you!
We also had this problem, we are on CF8 now, so it may be a little different, but the cleanest syntax we found was a namespace of '*'.
<cfset MyArray = XMLSearch(MyXMLDoc, "//*:Response")>
Thanks, Ryan! We haven't moved to 8 yet, but I will definitely keep that in mind.
Just in case someone else comes across this ... the solution above did not work for me in CF8. (I did not try it in any other version.) Instead, I had to specify no namespace identifier.
For example, getting Google Picasa XML and performing an XPath search for "entry": "/:feed/:entry".
To get the "media" namespace and the group element (media:group), use "/:feed/:entry/media:group"