Published 2015-08-27 16:32:18 | Source: collected from the web
The CssSelector component converts CSS selectors to XPath expressions.
You can install the component in 2 different ways:
Install it via Composer (symfony/css-selector on Packagist);
Use the official Git repository.
When you’re parsing an HTML or an XML document, by far the most powerful method is XPath.
XPath expressions are incredibly flexible, so there is almost always an XPath expression that will find the element you need. Unfortunately, they can also become very complicated, and the learning curve is steep. Even common operations (such as finding an element with a particular class) can require long and unwieldy expressions.
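To illustrate why matching a class needs such a long expression, here is a minimal sketch that uses only PHP's built-in DOM extension. The sample markup is hypothetical and not part of the original text. It compares three ways of selecting `div` elements with the class `item`:

```php
<?php
// Hypothetical sample markup: only the first and third <div> carry the class "item".
$html = '<div class="item">a</div>'
      . '<div class="items">b</div>'
      . '<div class="featured item">c</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Exact attribute comparison misses elements that have several classes.
$exact = $xpath->query('//div[@class="item"]')->length;

// Plain substring matching also catches the unrelated class "items".
$substring = $xpath->query('//div[contains(@class, "item")]')->length;

// Padding the normalized class list with spaces matches whole class names only.
$token = $xpath->query(
    '//div[contains(concat(" ", normalize-space(@class), " "), " item ")]'
)->length;

printf("exact=%d substring=%d token=%d\n", $exact, $substring, $token);
// prints "exact=1 substring=3 token=2"
```

Only the last form selects exactly the two intended elements, which is why a correct class test in XPath ends up so verbose.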
Many developers – particularly web developers – are more comfortable using CSS selectors to find elements. As well as working in stylesheets, CSS selectors are used in JavaScript with the querySelectorAll function and in popular JavaScript libraries such as jQuery, Prototype and MooTools.
CSS selectors are less powerful than XPath, but far easier to write, read and understand. Since they are less powerful, almost all CSS selectors can be converted to an XPath equivalent. This XPath expression can then be used with other functions and classes that use XPath to find elements in a document.
The component’s only goal is to convert CSS selectors to their XPath equivalents:
use Symfony\Component\CssSelector\CssSelector;

print CssSelector::toXPath('div.item > h4 > a');
This gives the following output:
descendant-or-self::div[@class and contains(concat(' ',normalize-space(@class), ' '), ' item ')]/h4/a
You can use this expression with, for instance, DOMXPath or SimpleXMLElement to find elements in a document.
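As a rough sketch of that step, the snippet below feeds the expression shown above to both DOMXPath and SimpleXMLElement. It relies only on PHP's built-in XML extensions, so the expression is pasted in as a string rather than generated by the component. The sample document and variable names are assumptions made for illustration.

```php
<?php
// Expression copied from the output shown above.
$expression = "descendant-or-self::div[@class and contains(concat(' ',normalize-space(@class), ' '), ' item ')]/h4/a";

// Hypothetical sample document, not taken from the original text.
$markup = <<<'XML'
<root>
  <div class="item first"><h4><a href="/one">One</a></h4></div>
  <div class="other"><h4><a href="/two">Two</a></h4></div>
  <div class="item"><h4><a href="/three">Three</a></h4></div>
</root>
XML;

// DOMXPath: with no context node given, the query starts at the document root.
$doc = new DOMDocument();
$doc->loadXML($markup);
$xpath = new DOMXPath($doc);
$domLinks = [];
foreach ($xpath->query($expression) as $a) {
    $domLinks[] = $a->getAttribute('href');
}

// SimpleXMLElement: xpath() evaluates relative to the element it is called on.
$xml = new SimpleXMLElement($markup);
$simpleLinks = [];
foreach ($xml->xpath($expression) as $a) {
    $simpleLinks[] = (string) $a['href'];
}

print implode(', ', $domLinks) . "\n";    // prints "/one, /three"
print implode(', ', $simpleLinks) . "\n"; // prints "/one, /three"
```

Both APIs return the same two links. The `div` with class `other` is skipped, and `item first` still matches because the expression tests whole class names.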
Tip
The Crawler::filter() method uses the CssSelector component to find elements based on a CSS selector string. See The DomCrawler Component for more details.
Not all CSS selectors can be converted to XPath equivalents.
There are several CSS selectors that only make sense in the context of a web browser:
link-state selectors: :link, :visited, :target
selectors based on user action: :hover, :focus, :active
UI-state selectors: :invalid, :indeterminate (however, :enabled, :disabled, :checked and :unchecked are available)
Pseudo-elements (:before, :after, :first-line, :first-letter) are not supported because they select portions of text rather than elements.
Several pseudo-classes are not yet supported:
*:first-of-type, *:last-of-type, *:nth-of-type, *:nth-last-of-type, *:only-of-type. (These work with an element name (e.g. li:first-of-type) but not with *.)
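If you need the behaviour of `*:first-of-type`, one possible workaround is to do the per-element check in PHP instead of in a single XPath expression. The following is only a sketch, not something the component provides: the sample document is hypothetical, and the approach assumes element names without namespace prefixes.

```php
<?php
// Hypothetical sample document, used only for illustration.
$doc = new DOMDocument();
$doc->loadXML('<ul><li>a</li><li>b</li><p>c</p><li>d</li><p>e</p></ul>');
$xpath = new DOMXPath($doc);

// Emulate "ul > *:first-of-type": keep each child that has no preceding
// sibling with the same element name.
$firstOfType = [];
foreach ($xpath->query('//ul/*') as $node) {
    $sameNameBefore = (int) $xpath->evaluate(
        'count(preceding-sibling::' . $node->nodeName . ')',
        $node
    );
    if ($sameNameBefore === 0) {
        $firstOfType[] = $node->nodeName . ':' . $node->textContent;
    }
}

print implode(', ', $firstOfType) . "\n"; // prints "li:a, p:c"
```

The loop runs one extra XPath evaluation per candidate element, so it suits small documents better than large ones.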