I have a small problem with XPath's contains function when using dom4j.
Let's say my XML is
<Home>
<Addr>
<Street>ABC</Street>
<Number>5</Number>
<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>
</Addr>
</Home>
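Let's say I want to find all the nodes that have ABC in their text, given the root element.
So the XPath I would need to write is
//*[contains(text(),'ABC')]
However, this is not what dom4j returns. Is this a dom4j problem, or a gap in my understanding of how XPath works? That query returns only the Street element and not the Comment element.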
The DOM makes the Comment element a composite element with four child nodes (two text nodes and two <br/> elements):
[Text = 'BLAH BLAH BLAH '][BR][BR][Text = 'ABC']
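I would assume that the query should still return the element, since it should find the element and run contains on it, but it doesn't.
The following query returns the element, but it returns far more than just the element: it also returns the parent elements, which is undesirable for this problem.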
//*[contains(.,'ABC')]
Does anyone know an XPath query that would return just the <Street/> and <Comment/> elements?
The <Comment> tag contains two text nodes and two <br> nodes as children.
Your XPath expression was
//*[contains(text(),'ABC')]
To break this down,
* is a selector that matches any element (i.e. tag) -- it returns a node-set.
The [] are a conditional that operates on each individual node in that node set. It matches if any of the individual nodes it operates on match the conditions inside the brackets.
text() is a selector that matches all of the text nodes that are children of the context node -- it returns a node set.
contains is a function that operates on a string. If it is passed a node set, the node set is converted into a string by returning the string-value of the node in the node-set that is first in document order. Hence, it can match only the first text node in your <Comment> element -- namely BLAH BLAH BLAH. Since that doesn't match, you don't get a <Comment> in your results.
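The node-set-to-string conversion described above can be observed directly. The following is a minimal sketch that uses the JDK's built-in javax.xml.xpath engine rather than dom4j; it assumes dom4j's XPath 1.0 evaluation behaves the same way on this point. The class name FirstTextNodeDemo and the helpers evalString and matchNames are hypothetical names chosen for illustration, and the sample XML is written without indentation so the text nodes are exactly the ones discussed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FirstTextNodeDemo {
    // Sample document from the question, without inter-element whitespace.
    static final String XML =
        "<Home><Addr>"
        + "<Street>ABC</Street>"
        + "<Number>5</Number>"
        + "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
        + "</Addr></Home>";

    // Evaluates an XPath expression against the sample XML and returns its string result.
    static String evalString(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    // Evaluates an XPath expression and returns the names of the selected nodes.
    static List<String> matchNames(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        // string() of a node-set yields the string-value of its first node only.
        System.out.println("[" + evalString("string(//Comment/text())") + "]");
        // So contains(text(), 'ABC') never sees the second text node of <Comment>.
        System.out.println(matchNames("//*[contains(text(),'ABC')]"));
    }
}
```

With this engine, the first line printed shows only the leading text node of <Comment>, and the second shows that <Street> is the sole match.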
You need to change this to
//*[text()[contains(.,'ABC')]]
* is a selector that matches any element (i.e. tag) -- it returns a node-set.
The outer [] are a conditional that operates on each individual node in that node set -- here it operates on each element in the document.
text() is a selector that matches all of the text nodes that are children of the context node -- it returns a node set.
The inner [] are a conditional that operates on each node in that node set -- here each individual text node. Each individual text node is the starting point for any path in the brackets, and can also be referred to explicitly as . within the brackets. It matches if any of the individual nodes it operates on match the conditions inside the brackets.
contains is a function that operates on a string. Here it is passed an individual text node (.). Since it is passed the second text node in the <Comment> tag individually, it will see the 'ABC' string and be able to match it.
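To compare the per-text-node predicate with the alternatives, here is a small sketch that runs three expressions against the sample document. It uses the JDK's javax.xml.xpath engine as a stand-in for dom4j, on the assumption that both follow XPath 1.0 semantics here; ContainsQueryDemo and matchNames are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContainsQueryDemo {
    // Sample document from the question, without inter-element whitespace.
    static final String XML =
        "<Home><Addr>"
        + "<Street>ABC</Street>"
        + "<Number>5</Number>"
        + "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
        + "</Addr></Home>";

    // Evaluates an XPath expression and returns the names of the selected nodes.
    static List<String> matchNames(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        // Tests only the first child text node of each element.
        System.out.println(matchNames("//*[contains(text(),'ABC')]"));
        // Tests every child text node individually.
        System.out.println(matchNames("//*[text()[contains(.,'ABC')]]"));
        // Tests the element's full string-value, which includes descendant text,
        // so ancestors of a matching element match as well.
        System.out.println(matchNames("//*[contains(.,'ABC')]"));
    }
}
```

The second expression is the one that selects only <Street> and <Comment>. With dom4j, the same expression string would presumably be passed to selectNodes on the document or root element; that call is not exercised in this sketch.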