I’m looking for a way to retrieve an XML document from a URL and then extract a node value from it, preferably identifying the node by XPath.
The context for using this is either in the Content Type Properties (as a Pre-Processing exit to populate a field from the XML) or in a Velocity template (to get a value from XML for inclusion in the assembled page).
I’ve gotten part of the way there. In the Velocity template, I used $rx.doc.getDocument to hit the URL and get the XML document in a variable. But I’m not sure what methods or functions to use to parse out the XML node I’m looking for.
In the Content Type Pre-Processing, I’m not sure how to get the XML document or parse it. I have tried using the sys_xdDomToText extension, passing it the URL for the XML document, but it’s unclear what the other arguments do, or where the result goes (and I want to extract a node from the result anyway).
Has anyone got any experience in making this work?
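For reference, the general task in the question (read an XML document, pull one value out by XPath) can be sketched in plain Java with the JAXP classes that ship with the JDK. This is only a sketch under assumptions: the class name XmlNodeValueSketch, the method nodeValue and the sample feed are made up for illustration, and nothing here is Rhythmyx-specific or taken from the posts in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of any Rhythmyx API.
final class XmlNodeValueSketch {

    /** Parses XML from a stream and returns the string value of the first node the XPath matches. */
    static String nodeValue(InputStream xml, String xpathExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(xml);
            // The two-argument evaluate() returns a String ("" when nothing matches).
            return XPathFactory.newInstance().newXPath().evaluate(xpathExpr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + xpathExpr, e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document; a real caller could pass the stream from
        // new java.net.URL(feedUrl).openStream() instead (feedUrl is an assumption).
        String sample = "<video><title>Intro</title><length>42</length></video>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(nodeValue(in, "/video/title"));
    }
}
```

Taking an InputStream rather than a URL keeps the helper easy to try out with an in-memory string before pointing it at a live feed.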
I haven’t tried this before… but you should have access to the JDOM classes that the Rhythmyx server uses. If so, you won’t need to use $rx.doc.getDocument().
To get to those classes without developing a Velocity Extension for it, you can use code similar to the following (untested):
I tried this, both in the Velocity code, and as template bindings, and in both cases, I got java.lang.ClassNotFoundException: org.jdom.input.SAXBuilder. This class is not available in that context in my environment (we are running Rhythmyx 6.5.2 - not sure if that makes a difference).
Blast… I was assuming that since jdom.jar was in my 6.5 folder (AppServer\server\rx\deploy\rxapp.ear\rxapp.war\WEB-INF\lib), it would be available for use. I’ll have to test to make sure, but in the meantime, check whether it’s in your folder. If not, download a copy from www.jdom.org.
Normally, I’d prefer to use SAXON, but jdom’s interface is much simpler for what you’re looking to accomplish.
In the meantime, I worked around this by using JSON instead of XML, but I would still like to know how to do this. I’m in the midst of planning our upgrade from Rhythmyx 6.5.2 to CM System 6.7, and once we’re on 6.7, I expect to get back to figuring this out. This is in the context of our plans for integration with an online video platform, which provides both XML and JSON integration options. I’d like to make both work with Rhythmyx, so I have a choice in different situations. I’ll try to remember to come back and post my results here.
Have a look at PSXMLDomUtil (It is in the RX Public API)…that might just be what you are looking for once you have the xml doc… I think I had used it for something before, but I can’t remember if I settled for using that vs. org.w3c.dom.* classes…
Gotchas to watch out for: if the read or select method calls return null, then the variable won’t be set… use your favorite means of error checking to prevent unpredictable results.
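One way to sidestep the null gotcha is to route the lookup through a helper that never returns null, so the calling template always gets a value to assign. The sketch below uses the JDK's JAXP classes rather than the Rhythmyx utilities mentioned above; the class SafeSelectSketch, the method selectOrDefault and the fallback parameter are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any Rhythmyx API.
final class SafeSelectSketch {

    /** Returns the text of the first matching node, or fallback if parsing or selection fails. */
    static String selectOrDefault(String xml, String xpathExpr, String fallback) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // With XPathConstants.NODE, evaluate() returns null when nothing matches.
            Node hit = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(xpathExpr, doc, XPathConstants.NODE);
            return hit == null ? fallback : hit.getTextContent();
        } catch (Exception e) {
            // Malformed XML, a null input or a bad expression all end up here.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(selectOrDefault("<a><b>hi</b></a>", "/a/c", "(none)"));
    }
}
```

Swallowing every exception is a deliberate simplification here; a production version would probably log the failure before returning the fallback.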
Jitendra, I’m not familiar with using the w3c classes to do this… do they provide an implementation-independent interface to do all this?
Ah…it comes back to me…Long story short, we needed a way to parse a document (say the body of a particular item) and create new content items on the fly based on what was found and also remove certain divs with ids.
...
String xmlString = itemBodyField.getValue().getValueAsString();
PSXmlDocumentBuilder psdb = new PSXmlDocumentBuilder();
Document doc = psdb.createXmlDocument(new StringReader(xmlString), false);
Element docEle = doc.getDocumentElement();
HashSet removeElements = new HashSet();
NodeList divNodes = docEle.getElementsByTagName("div");
// Now iterate through divNodes, check their contents, and see if we need to create a new item
...
Rushing, w3c provides the interface and Percussion has implemented it in that particular class. I believe I just settled on using PSXmlDocumentBuilder to write the XML doc back and I didn’t need to use PSXMLDomUtil to find / select nodes. I believe if you go through Perc’s implemented classes, then it should be platform independent and you don’t have to rely on a separate jar installation. However, I don’t think it implements the latest version of w3c.dom, as I do recall a method (I don’t recall which one) not existing that should have…
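To illustrate the "remove certain divs with ids" step using only the org.w3c.dom interfaces, here is a rough sketch. It swaps in the JDK's DocumentBuilderFactory for PSXmlDocumentBuilder so that it runs outside a Rhythmyx server, and the class DivFilterSketch, its method names and the sample ids are assumptions made for the example, not code from the original extension.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any Rhythmyx API.
final class DivFilterSketch {

    /** Parses a well-formed XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** Removes every div whose id is in idsToRemove and returns how many were removed. */
    static int removeDivsById(Document doc, Set<String> idsToRemove) {
        NodeList divs = doc.getDocumentElement().getElementsByTagName("div");
        // The NodeList is live, so collect the matches first and remove them afterwards.
        List<Element> matches = new ArrayList<Element>();
        for (int i = 0; i < divs.getLength(); i++) {
            Element div = (Element) divs.item(i);
            if (idsToRemove.contains(div.getAttribute("id"))) {
                matches.add(div);
            }
        }
        for (Element div : matches) {
            if (div.getParentNode() != null) {
                div.getParentNode().removeChild(div);
            }
        }
        return matches.size();
    }

    public static void main(String[] args) {
        Document doc = parse("<body><div id=\"teaser\">x</div><div id=\"keep\">y</div></body>");
        Set<String> ids = new HashSet<String>();
        ids.add("teaser");
        System.out.println(removeDivsById(doc, ids));
    }
}
```

Collecting the matches before removing them matters because removing a node while looping over a live NodeList shifts the remaining indexes and can skip elements.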
Caveat: XPath evaluations through javax.xml.xpath.XPath.evaluate(expression, source) always return a string, so result nodes are converted to strings before they are returned, which prevents customized traversal of subtrees.
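If subtree traversal is needed, the standard javax.xml.xpath API also has a three-argument evaluate that takes a return type such as XPathConstants.NODESET and hands back DOM nodes instead of a string. The sketch below shows that overload in plain Java; whether it can be called conveniently from a Velocity macro is not something this thread establishes, and the class NodeSetSketch, the method childNames and the sample feed are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any Rhythmyx API.
final class NodeSetSketch {

    /** Returns the names of the child elements of every node the XPath matches. */
    static List<String> childNames(String xml, String xpathExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Asking for NODESET yields a NodeList of real DOM nodes.
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpathExpr, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<String>();
            for (int i = 0; i < hits.getLength(); i++) {
                NodeList kids = hits.item(i).getChildNodes();
                for (int j = 0; j < kids.getLength(); j++) {
                    // Skip text and comment nodes; keep element children only.
                    if (kids.item(j).getNodeType() == Node.ELEMENT_NODE) {
                        names.add(kids.item(j).getNodeName());
                    }
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpathExpr, e);
        }
    }

    public static void main(String[] args) {
        String feed = "<feed><video><title>A</title><url>u</url></video></feed>";
        System.out.println(childNames(feed, "/feed/video"));
    }
}
```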
Your macro would solve a major problem for me, but I can’t get it to work. We have installed the AtcToolkit.
I edited the macro to return each of the variables set so I could see where it broke down.
$domBuilder (returns org.apache.xerces.jaxp.DocumentBuilderImpl@1a9dd0e) and $xpath (returns com.sun.org.apache.xpath.internal.jaxp.XPathImpl@efda94) both appear to be set correctly.
$url fails, even after removing your parameters, but I can set that using #set($url = $BASE_URL) if that is acceptable.
But I can’t get $source or anything dependent on $source to work.