XPath contains(text(),'some string') doesn't work when used with a node that has more than one text child node

The <Comment> tag contains two text nodes and two <br> nodes as children.

Your XPath expression was

//*[contains(text(),'ABC')]

To deconstruct this:

1. * is a selector that matches any element (i.e. tag); it returns a node-set.
2. The [] are a conditional that operates on each individual node in that node-set. It matches if any of the individual nodes it operates on match the conditions inside the brackets.
3. text() is a selector that matches all of the text nodes that are children of the context node; it returns a node-set.
4. contains is a function that operates on a string. If it is passed a node-set, the node-set is converted into a string by returning the string-value of the node in the node-set that is first in document order. Hence, it can match only the first text node in your <Comment> element, namely BLAH BLAH BLAH. Since that doesn't match, you don't get a <Comment> in your results.

You need to change this to

//*[text()[contains(.,'ABC')]]

1. * is a selector that matches any element (i.e. tag); it returns a node-set.
2. The outer [] are a conditional that operates on each individual node in that node-set; here it operates on each element in the document.
3. text() is a selector that matches all of the text nodes that are children of the context node; it returns a node-set.
4. The inner [] are a conditional that operates on each node in that node-set; here, each individual text node. Each individual text node is the starting point for any path in the brackets, and can also be referred to explicitly as . within the brackets. It matches if any of the individual nodes it operates on match the conditions inside the brackets.
5. contains is a function that operates on a string. Here it is passed an individual text node (.). Since it is passed the second text node in the <Comment> tag individually, it will see the 'ABC' string and be able to match it.
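
The difference between the two predicates can be sketched in Python. This is only an illustration under stated assumptions: the XML string is a guess reconstructed from the element names mentioned in this thread, and because the standard library's xml.dom.minidom has no XPath engine, the helper functions (whose names are hypothetical) emulate the XPath 1.0 rules by hand instead of evaluating the expressions.

```python
from xml.dom import minidom

# Assumed sample document, reconstructed from the element names in this thread.
XML = (
    "<Home><Addr>"
    "<Street>ABC</Street>"
    "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
    "</Addr></Home>"
)


def child_text_nodes(element):
    """Text-node children of an element, like the XPath step text()."""
    return [n for n in element.childNodes if n.nodeType == n.TEXT_NODE]


def first_text_contains(element, needle):
    """Emulate contains(text(), needle) in XPath 1.0: only the first text node counts."""
    texts = child_text_nodes(element)
    first = texts[0].data if texts else ""  # an empty node-set converts to ""
    return needle in first


def any_text_contains(element, needle):
    """Emulate text()[contains(., needle)]: every text node is tested on its own."""
    return any(needle in t.data for t in child_text_nodes(element))


doc = minidom.parseString(XML)
elements = doc.getElementsByTagName("*")  # every element, in document order

broken = [e.tagName for e in elements if first_text_contains(e, "ABC")]
fixed = [e.tagName for e in elements if any_text_contains(e, "ABC")]
print(broken)  # ['Street']
print(fixed)   # ['Street', 'Comment']
```

With this sample, the first emulation keeps only Street because the first text node of Comment is "BLAH BLAH BLAH ", while the second emulation also keeps Comment.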

2010-09-07 03:36:37

The [contains(text(),'')] predicate only returns true or false. It won't return any element results.

2016-12-24 23:08:10

//*[text()='ABC']

returns

<street>ABC</street>
<comment>BLAH BLAH BLAH <br><br>ABC</comment>

This works because, in XPath 1.0, comparing a node-set to a string with = is true if any node in the set has that string-value. Note that it requires a text node that is exactly 'ABC', not one that merely contains it.

2020-06-16 14:12:29

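The accepted answer returns the parent elements. To select only the text nodes that actually contain ABC, even when the string comes after the <br> elements: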

//*[text()[contains(.,'ABC')]]/text()[contains(.,"ABC")]
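
As a rough illustration of what that expression selects, the sketch below collects the matching text nodes themselves rather than their parent elements. It assumes a sample document reconstructed from the element names in this thread, and it walks the tree with the standard library's xml.dom.minidom instead of running a real XPath engine.

```python
from xml.dom import minidom

# Assumed sample document, reconstructed from the element names in this thread.
XML = (
    "<Home><Addr>"
    "<Street>ABC</Street>"
    "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
    "</Addr></Home>"
)

doc = minidom.parseString(XML)

# For every element, keep only the text-node children whose own text contains 'ABC'.
matches = [
    node
    for element in doc.getElementsByTagName("*")
    for node in element.childNodes
    if node.nodeType == node.TEXT_NODE and "ABC" in node.data
]

for node in matches:
    # Show the parent element name next to the text node that matched.
    print(node.parentNode.tagName, repr(node.data))
```

For Comment, only the second text node is collected; the leading "BLAH BLAH BLAH " text node is left out.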

2020-03-06 09:43:31

A modern answer that covers both XPath 1.0 and XPath 2.0+ behavior...

This XPath,

//*[contains(text(),'ABC')]

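behaves differently in XPath 1.0 and in later versions of XPath (2.0+).

Common behavior

//* selects all elements in the document. [] filters those elements according to the predicate expressed inside it. contains(string, substring) within the predicate filters those elements down to the ones for which substring is a substring of string.

XPath 1.0 behavior

contains(string, substring) converts a node-set to a string by taking the string-value of the first node in the node-set. For //*[contains(text(),'ABC')], that node-set is all the child text nodes of each element in the document. Since only the first text node child is used, the expectation that all child text nodes are tested for containing the 'ABC' substring is violated. This leads to counter-intuitive results for anyone unfamiliar with the conversion rule above.

An XPath 1.0 online example shows only one 'ABC' being selected.

XPath 2.0+ behavior

It is an error to call contains(string, substring) with a sequence of more than one item as the first argument. This corrects the counter-intuitive XPath 1.0 behavior described above.

An XPath 2.0 online example shows a typical error message caused by the conversion error particular to XPath 2.0+.

Common solutions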

1. If you wish to include descendant elements (beyond children), test against the string-value of an element as a single string, rather than the individual string values of its child text nodes. This XPath,

   //*[contains(.,'ABC')]

   selects your targeted Street and Comment elements and also their Addr and Home ancestor elements, because those too have 'ABC' as a substring of their string-values. An online example shows the ancestors being selected too.

2. If you wish to exclude descendant elements (beyond children), this XPath,

   //*[text()[contains(.,'ABC')]]

   selects only your targeted Street and Comment elements, because only those elements have text node children whose string values contain the 'ABC' substring. This holds for all versions of XPath. An online example shows only Street and Comment being selected.
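
To make the string-value option concrete, here is a minimal sketch that computes an element's string-value by hand and compares the two filters. The sample XML is an assumption based on the element names in this thread, string_value is a hypothetical helper, and the code uses the standard library's xml.dom.minidom to emulate the rules rather than evaluate XPath. CDATA sections are ignored in this sketch.

```python
from xml.dom import minidom

# Assumed sample document, reconstructed from the element names in this thread.
XML = (
    "<Home><Addr>"
    "<Street>ABC</Street>"
    "<Comment>BLAH BLAH BLAH <br/><br/>ABC</Comment>"
    "</Addr></Home>"
)


def string_value(node):
    """String-value of a node: all descendant text concatenated in document order."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


doc = minidom.parseString(XML)
elements = doc.getElementsByTagName("*")  # every element, in document order

# Emulates //*[contains(.,'ABC')]: the whole string-value of each element is tested.
by_string_value = [e.tagName for e in elements if "ABC" in string_value(e)]

# Emulates //*[text()[contains(.,'ABC')]]: only direct text-node children are tested.
by_text_child = [
    e.tagName
    for e in elements
    if any(c.nodeType == c.TEXT_NODE and "ABC" in c.data for c in e.childNodes)
]

print(by_string_value)  # ['Home', 'Addr', 'Street', 'Comment']
print(by_text_child)    # ['Street', 'Comment']
```

The ancestors Home and Addr appear in the first list only because their string-values include the text of their descendants.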

2022-02-24 16:57:55
