HtmlAgilityPack and Separating on <br>
I have some HTML in which the values are separated by <br> tags, e.g.: Jack Janson <br> 309 123 456 <br> My Special Street 43. What is the easiest way to retrieve the information?
Solution 1:
In pure XPath over XML, you would use an XPath expression such as //preceding-sibling::br or //following-sibling::br (see a reference on XPath axes for details).
However, the XPath-over-HTML implementation in Html Agility Pack does not support selecting pure text nodes or attribute nodes in XPath expressions; //br/text() and //br/@blah do not work, for example. Note that both do work inside filters, so //br[text()='blah'] and //br[@att='blah'] are fine.
So, back to the question: you need to combine XPath and code, something like this:
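HtmlDocument doc = new HtmlDocument();
doc.Load(myHtmlFile);
// SelectNodes returns null when nothing matches, so check before iterating.
HtmlNodeCollection breaks = doc.DocumentNode.SelectNodes("//br");
if (breaks == null) return;
foreach (HtmlNode br in breaks)
{
    // The text preceding each <br> is its previous sibling node.
    if (br.PreviousSibling != null)
        Console.WriteLine(br.PreviousSibling.InnerText.Trim());
}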
That will output:
Jack Janson
309 123 456
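As a possible variation on the same idea of combining the parsed tree with code, the text nodes can also be walked directly instead of going through each br element's previous sibling. The following is only a minimal sketch: the class name BrSplitter, the method ExtractLines, and the sample markup in Main are hypothetical and not taken from the original question, whose exact HTML is not shown here. It assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class BrSplitter
{
    // Collects the trimmed text of every non-empty text node in the document.
    // Because <br> elements contain no text, each run of text between two
    // <br> tags ends up as its own entry.
    public static List<string> ExtractLines(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var lines = new List<string>();
        foreach (HtmlNode node in doc.DocumentNode.Descendants())
        {
            // Skip elements and comments; only text nodes carry the values.
            if (node.NodeType != HtmlNodeType.Text)
                continue;

            // Decode entities such as &amp; and drop whitespace-only nodes.
            string text = HtmlEntity.DeEntitize(node.InnerText).Trim();
            if (text.Length > 0)
                lines.Add(text);
        }
        return lines;
    }

    public static void Main()
    {
        // Hypothetical markup, assumed for illustration only.
        string html = "<div>Jack Janson<br/>309 123 456<br/>My Special Street 43<br/></div>";

        foreach (string line in ExtractLines(html))
            Console.WriteLine(line);
    }
}
```

One difference from the loop above is that this version also picks up text that follows the last br, and it does not depend on every br having a previous sibling. The trade-off is that it returns all text in the document, so in a real page you would probably start from a narrower node than DocumentNode.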