我想使用Java中的XPath读取XML数据,因此对于我所收集的信息,我无法根据我的要求解析XML。
这就是我想做的:
从网上通过其URL获取XML文件,然后使用XPath来解析它,我想在其中创建两个方法。一种情况是,我输入一个特定的节点属性id,结果得到所有的子节点,另一种情况是,假设我只想得到一个特定的子节点值
<?xml version="1.0"?>
<howto>
<topic name="Java">
<url>http://www.rgagnonjavahowto.htm</url>
<car>taxi</car>
</topic>
<topic name="PowerBuilder">
<url>http://www.rgagnon/pbhowto.htm</url>
<url>http://www.rgagnon/pbhowtonew.htm</url>
</topic>
<topic name="Javascript">
<url>http://www.rgagnon/jshowto.htm</url>
</topic>
<topic name="VBScript">
<url>http://www.rgagnon/vbshowto.htm</url>
</topic>
</howto>
在上面的例子中,我想读取所有的元素,如果我通过@name搜索,还有一个函数,我只是想从@name 'Javascript'的url只返回一个节点元素。
下面是一个用vtd-xml处理xpath的例子…对于繁重的XML处理,它是首屈一指的。这是最近的一篇关于用Java处理XML -性能基准的论文
import com.ximpleware.*;
public class changeAttrVal {
public static void main(String s[]) throws VTDException,java.io.UnsupportedEncodingException,java.io.IOException{
VTDGen vg = new VTDGen();
if (!vg.parseFile("input.xml", false))
return;
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot(vn);
XMLModifier xm = new XMLModifier(vn);
ap.selectXPath("/*/place[@id=\"p14\" and @initialMarking=\"2\"]/@initialMarking");
int i=0;
while((i=ap.evalXPath())!=-1){
xm.updateToken(i+1, "499");// change initial marking from 2 to 499
}
xm.output("new.xml");
}
}
下面是一个用vtd-xml处理xpath的例子…对于繁重的XML处理,它是首屈一指的。这是最近的一篇关于用Java处理XML -性能基准的论文
import com.ximpleware.*;
public class changeAttrVal {
public static void main(String s[]) throws VTDException,java.io.UnsupportedEncodingException,java.io.IOException{
VTDGen vg = new VTDGen();
if (!vg.parseFile("input.xml", false))
return;
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot(vn);
XMLModifier xm = new XMLModifier(vn);
ap.selectXPath("/*/place[@id=\"p14\" and @initialMarking=\"2\"]/@initialMarking");
int i=0;
while((i=ap.evalXPath())!=-1){
xm.updateToken(i+1, "499");// change initial marking from 2 to 499
}
xm.output("new.xml");
}
}
你需要这样的东西:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(<uri_as_string>);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile(<xpath_expression>);
然后调用express .evaluate(),传递在该代码中定义的文档和您期望的返回类型,并将结果转换为结果的对象类型。
如果您需要关于特定XPath表达式的帮助,您可能应该将其作为单独的问题提出(除非这是您首先要问的问题—我理解您的问题是如何在Java中使用API)。
编辑:(回复评论):这个XPath表达式将为您提供PowerBuilder下第一个URL元素的文本:
/howto/topic[@name='PowerBuilder']/url/text()
这将得到第二个:
/howto/topic[@name='PowerBuilder']/url[2]/text()
你可以通过下面的代码得到:
expr.evaluate(doc, XPathConstants.STRING);
如果你不知道给定节点中有多少url,那么你应该这样做:
XPathExpression expr = xpath.compile("/howto/topic[@name='PowerBuilder']/url");
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
然后循环遍历NodeList。
如果你有一个像下面这样的xml
<e:Envelope
xmlns:d = "http://www.w3.org/2001/XMLSchema"
xmlns:e = "http://schemas.xmlsoap.org/soap/envelope/"
xmlns:wn0 = "http://systinet.com/xsd/SchemaTypes/"
xmlns:i = "http://www.w3.org/2001/XMLSchema-instance">
<e:Header>
<Friends>
<friend>
<Name>Testabc</Name>
<Age>12121</Age>
<Phone>Testpqr</Phone>
</friend>
</Friends>
</e:Header>
<e:Body>
<n0:ForAnsiHeaderOperResponse xmlns:n0 = "http://systinet.com/wsdl/com/magicsoftware/ibolt/localhost/ForAnsiHeader/ForAnsiHeaderImpl#ForAnsiHeaderOper?KExqYXZhL2xhbmcvU3RyaW5nOylMamF2YS9sYW5nL1N0cmluZzs=">
<response i:type = "d:string">12--abc--pqr</response>
</n0:ForAnsiHeaderOperResponse>
</e:Body>
</e:Envelope>
并希望提取下面的XML
<e:Header>
<Friends>
<friend>
<Name>Testabc</Name>
<Age>12121</Age>
<Phone>Testpqr</Phone>
</friend>
</Friends>
</e:Header>
下面的代码有助于实现同样的目标
public static void main(String[] args) {
File fXmlFile = new File("C://Users//abhijitb//Desktop//Test.xml");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document document;
Node result = null;
try {
document = dbf.newDocumentBuilder().parse(fXmlFile);
XPath xPath = XPathFactory.newInstance().newXPath();
String xpathStr = "//Envelope//Header";
result = (Node) xPath.evaluate(xpathStr, document, XPathConstants.NODE);
System.out.println(nodeToString(result));
} catch (SAXException | IOException | ParserConfigurationException | XPathExpressionException
| TransformerException e) {
e.printStackTrace();
}
}
private static String nodeToString(Node node) throws TransformerException {
StringWriter buf = new StringWriter();
Transformer xform = TransformerFactory.newInstance().newTransformer();
xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
xform.transform(new DOMSource(node), new StreamResult(buf));
return (buf.toString());
}
现在,如果您只需要如下所示的xml
<Friends>
<friend>
<Name>Testabc</Name>
<Age>12121</Age>
<Phone>Testpqr</Phone>
</friend>
</Friends>
你需要改变
String xpathStr = "//信封//头";to String xpathStr = "//信封//头/*";
在@bluish和@Yishai的精彩回答上展开,下面是如何使nodelist和节点属性支持迭代器,即for(node n: nodelist)接口。
像这样使用它:
NodeList nl = ...
for(Node n : XmlUtil.asList(nl))
{...}
and
Node n = ...
for(Node attr : XmlUtil.asList(n.getAttributes())
{...}
代码:
/**
* Converts NodeList to an iterable construct.
* From: https://stackoverflow.com/a/19591302/779521
*/
public final class XmlUtil {
private XmlUtil() {}
public static List<Node> asList(NodeList n) {
return n.getLength() == 0 ? Collections.<Node>emptyList() : new NodeListWrapper(n);
}
static final class NodeListWrapper extends AbstractList<Node> implements RandomAccess {
private final NodeList list;
NodeListWrapper(NodeList l) {
this.list = l;
}
public Node get(int index) {
return this.list.item(index);
}
public int size() {
return this.list.getLength();
}
}
public static List<Node> asList(NamedNodeMap n) {
return n.getLength() == 0 ? Collections.<Node>emptyList() : new NodeMapWrapper(n);
}
static final class NodeMapWrapper extends AbstractList<Node> implements RandomAccess {
private final NamedNodeMap list;
NodeMapWrapper(NamedNodeMap l) {
this.list = l;
}
public Node get(int index) {
return this.list.item(index);
}
public int size() {
return this.list.getLength();
}
}
}
入门示例:
xml文件:
<inventory>
<book year="2000">
<title>Snow Crash</title>
<author>Neal Stephenson</author>
<publisher>Spectra</publisher>
<isbn>0553380958</isbn>
<price>14.95</price>
</book>
<book year="2005">
<title>Burning Tower</title>
<author>Larry Niven</author>
<author>Jerry Pournelle</author>
<publisher>Pocket</publisher>
<isbn>0743416910</isbn>
<price>5.99</price>
</book>
<book year="1995">
<title>Zodiac</title>
<author>Neal Stephenson</author>
<publisher>Spectra</publisher>
<isbn>0553573862</isbn>
<price>7.50</price>
</book>
<!-- more books... -->
</inventory>
Java代码:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
try {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse (new File("c:\\tmp\\my.xml"));
// normalize text representation
doc.getDocumentElement().normalize();
System.out.println ("Root element of the doc is " + doc.getDocumentElement().getNodeName());
NodeList listOfBooks = doc.getElementsByTagName("book");
int totalBooks = listOfBooks.getLength();
System.out.println("Total no of books : " + totalBooks);
for(int i=0; i<listOfBooks.getLength() ; i++) {
Node firstBookNode = listOfBooks.item(i);
if(firstBookNode.getNodeType() == Node.ELEMENT_NODE) {
Element firstElement = (Element)firstBookNode;
System.out.println("Year :"+firstElement.getAttribute("year"));
//-------
NodeList firstNameList = firstElement.getElementsByTagName("title");
Element firstNameElement = (Element)firstNameList.item(0);
NodeList textFNList = firstNameElement.getChildNodes();
System.out.println("title : " + ((Node)textFNList.item(0)).getNodeValue().trim());
}
}//end of for loop with s var
} catch (SAXParseException err) {
System.out.println ("** Parsing error" + ", line " + err.getLineNumber () + ", uri " + err.getSystemId ());
System.out.println(" " + err.getMessage ());
} catch (SAXException e) {
Exception x = e.getException ();
((x == null) ? e : x).printStackTrace ();
} catch (Throwable t) {
t.printStackTrace ();
}