Getting Started - HTML Agility Pack

// From File
var doc = new HtmlDocument();
doc.Load(filePath);

// From String
var doc = new HtmlDocument();
doc.LoadHtml(html);

// From Web
var url = "http://html-agility-pack.net/";
var web = new HtmlWeb();
var doc = web.Load(url);

 

HtmlAgilityPack uses XPath syntax. Many people consider it poorly documented, but I had no trouble using it with the help of this XPath documentation: www.w3schools.com/ /xpath_syntax.asp

To parse this HTML:

<h2>
  <a href="">Jack</a>
</h2>
<ul>
  <li class="tel">
    <a href="">81 75 53 60</a>
  </li>
</ul>
<h2>
  <a href="">Roy</a>
</h2>
<ul>
  <li class="tel">
    <a href="">44 52 16 87</a>
  </li>
</ul>

I did this:

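// Requires: using System.Collections.Generic; using HtmlAgilityPack;
string url = "http://website.com";
var Webget = new HtmlWeb();
var doc = Webget.Load(url);

// Result lists; names[i] is assumed to pair with phones[i].
var names = new List<string>();
var phones = new List<string>();

// Note: SelectNodes returns null when nothing matches the XPath.
// Every <a> under an <h2> holds a name.
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//h2//a"))
{
  names.Add(node.ChildNodes[0].InnerHtml);
}
// Every <a> under <li class="tel"> holds a phone number.
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//li[@class='tel']//a"))
{
  phones.Add(node.ChildNodes[0].InnerHtml);
}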
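
The loops above throw a NullReferenceException if either XPath matches nothing, because SelectNodes returns null rather than an empty collection. Here is a minimal, self-contained sketch of a null-safe variant. It loads the sample markup from a string instead of a URL, and it assumes the i-th name belongs to the i-th phone number. The class name PhoneBookSketch and the inline HTML (written with single-quoted attributes for readability) are illustrative choices, not part of the original code.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

class PhoneBookSketch
{
    static void Main()
    {
        // Inline copy of the sample markup, so no network access is needed.
        const string html =
            "<h2><a href=''>Jack</a></h2>" +
            "<ul><li class='tel'><a href=''>81 75 53 60</a></li></ul>" +
            "<h2><a href=''>Roy</a></h2>" +
            "<ul><li class='tel'><a href=''>44 52 16 87</a></li></ul>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nameNodes = doc.DocumentNode.SelectNodes("//h2//a");
        var phoneNodes = doc.DocumentNode.SelectNodes("//li[@class='tel']//a");
        if (nameNodes == null || phoneNodes == null)
        {
            Console.WriteLine("No matching nodes found.");
            return;
        }

        // Assumption: names and phone numbers appear in the same order,
        // so entries are paired by position.
        var count = Math.Min(nameNodes.Count, phoneNodes.Count);
        for (var i = 0; i < count; i++)
        {
            var name = nameNodes[i].InnerText.Trim();
            var phone = phoneNodes[i].InnerText.Trim();
            Console.WriteLine($"{name}: {phone}");
        }
    }
}
```

With the sample markup this should print one "name: phone" line each for Jack and Roy. Pairing by position is fragile when an entry lacks a phone number; in that case, selecting each h2 and then its following ul would be a safer approach.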