I have a small question about XPath contains with dom4j...

Let's say my XML is

<Home>
    <Addr>
        <Street>ABC</Street>
        <Number>5</Number>
        <Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>
    </Addr>
</Home>

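Let's say I want to find all the nodes that have ABC in their text, given the root element...

So the XPath I would need to write is

//*[contains(text(),'ABC')]

However, this is not what dom4j returns... Is this a dom4j problem, or a gap in my understanding of how XPath works, since that query returns only the Street element and not the Comment element?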

The DOM makes the Comment element a composite element with four child nodes, two of which are text nodes:

[Text = 'XYZ'][BR][BR][Text = 'ABC'] 

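I would assume the query should still return the element, since it should find the element and run contains on it, but it doesn't...

The following query does return the element, but it returns far more than just the element: it also returns the parent elements, which is undesirable for this problem.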

//*[contains(text(),'ABC')]

Does anyone know an XPath query that would return just the elements <Street/> and <Comment/>?


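Current answer

Here is an alternative way to match nodes that contain a given text string. First query the text nodes themselves, then take the parent of each: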

//text()[contains(., "ABC")]/..

To me, this is easy to read and understand.
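As a rough check of the expression above, here is a minimal sketch that evaluates it against the sample document from the question. It uses the JDK's built-in javax.xml.xpath engine (XPath 1.0) rather than dom4j, so treat it as an approximation of what dom4j should return; the class name ParentOfTextDemo and the helper matchedNames are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParentOfTextDemo {
    // The sample document from the question, written without indentation.
    static final String XML =
        "<Home><Addr>"
        + "<Street>ABC</Street>"
        + "<Number>5</Number>"
        + "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
        + "</Addr></Home>";

    // Hypothetical helper: evaluates an XPath 1.0 expression with the JDK
    // engine (not dom4j) and returns the names of the matched nodes.
    static List<String> matchedNames(String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Select every text node containing "ABC", then step up to its parent.
        System.out.println(matchedNames("//text()[contains(., 'ABC')]/.."));
    }
}
```

On this input the result should contain only the Street and Comment elements, since Home and Addr have no text-node child containing ABC.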

Other answers

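The accepted answer will return all the parent nodes too. To get only the actual text nodes with ABC, even if the string comes after the <br/> elements: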

//*[text()[contains(.,'ABC')]]/text()[contains(.,"ABC")]

Alternatively, an exact match on any child text node:

//*[text()='ABC']

returns

<Street>ABC</Street>
<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>
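To make the difference between these two expressions concrete, here is a small sketch that evaluates both against the sample document and labels each result as a text node or an element. It relies on the JDK's javax.xml.xpath engine (XPath 1.0) instead of dom4j, which is an assumption about equivalence, and the names ExactTextMatchDemo, describe, and evaluate are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ExactTextMatchDemo {
    // The sample document from the question, written without indentation.
    static final String XML =
        "<Home><Addr>"
        + "<Street>ABC</Street>"
        + "<Number>5</Number>"
        + "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
        + "</Addr></Home>";

    // Labels text nodes by their content and other nodes by their name.
    static String describe(Node node) {
        return node.getNodeType() == Node.TEXT_NODE
            ? "text:" + node.getNodeValue()
            : "element:" + node.getNodeName();
    }

    // Hypothetical helper: evaluates an XPath 1.0 expression with the JDK
    // engine (not dom4j) and describes every node in the result.
    static List<String> evaluate(String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(describe(nodes.item(i)));
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Selects the matching text nodes themselves, not their elements.
        System.out.println(evaluate("//*[text()[contains(.,'ABC')]]/text()[contains(.,'ABC')]"));
        // Selects elements that have some child text node equal to 'ABC'.
        System.out.println(evaluate("//*[text()='ABC']"));
    }
}
```

Note that the first expression yields text nodes, so if the goal is the Street and Comment elements themselves, the second expression or a trailing /.. step is closer to what the question asks for.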

The <Comment> tag contains two text nodes and two <br> nodes as children.

Your XPath expression was

//*[contains(text(),'ABC')]

To deconstruct this:

1. * is a selector that matches any element (i.e. tag) -- it returns a node-set.
2. The [] are a conditional that operates on each individual node in that node set. It matches if any of the individual nodes it operates on match the conditions inside the brackets.
3. text() is a selector that matches all of the text nodes that are children of the context node -- it returns a node set.
4. contains is a function that operates on a string. If it is passed a node set, the node set is converted into a string by returning the string-value of the node in the node-set that is first in document order. Hence, it can match only the first text node in your <Comment> element -- namely BLAH BLAH BLAH. Since that doesn't match, you don't get a <Comment> in your results.

You need to change this to

//*[text()[contains(.,'ABC')]]

1. * is a selector that matches any element (i.e. tag) -- it returns a node-set.
2. The outer [] are a conditional that operates on each individual node in that node set -- here it operates on each element in the document.
3. text() is a selector that matches all of the text nodes that are children of the context node -- it returns a node set.
4. The inner [] are a conditional that operates on each node in that node set -- here each individual text node. Each individual text node is the starting point for any path in the brackets, and can also be referred to explicitly as . within the brackets. It matches if any of the individual nodes it operates on match the conditions inside the brackets.
5. contains is a function that operates on a string. Here it is passed an individual text node (.). Since it is passed the second text node in the <Comment> tag individually, it will see the 'ABC' string and be able to match it.
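The behaviour described above can be checked with a short sketch that runs three variants side by side on the sample document. This uses the JDK's javax.xml.xpath engine, which implements XPath 1.0; dom4j is assumed to behave the same way but is not used here. The class name ContainsTextDemo and the helper matchedNames are illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContainsTextDemo {
    // The sample document from the question, written without indentation.
    static final String XML =
        "<Home><Addr>"
        + "<Street>ABC</Street>"
        + "<Number>5</Number>"
        + "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
        + "</Addr></Home>";

    // Hypothetical helper: evaluates an XPath 1.0 expression with the JDK
    // engine (not dom4j) and returns the matched element names.
    static List<String> matchedNames(String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // contains() sees only the first text child of each element.
        System.out.println(matchedNames("//*[contains(text(),'ABC')]"));
        // The inner predicate tests every text child individually.
        System.out.println(matchedNames("//*[text()[contains(.,'ABC')]]"));
        // '.' is the element's full string-value, so ancestors match too.
        System.out.println(matchedNames("//*[contains(.,'ABC')]"));
    }
}
```

With this input, the first query should match only Street, the second Street and Comment, and the third Home, Addr, Street, and Comment, which is the "returns the parents too" effect mentioned in the question.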

[contains(text(),'')] only returns true or false. It won't return any element results.