C# - Parsing results from HtmlAgilityPack

- July 15, 2013

I'm trying to parse a Yahoo Finance page to get a list of stock symbols and company names. The URL I'm using is: http://uk.finance.yahoo.com/q/cp?s=%5eftse

The code I'm using is:

    HtmlAgilityPack.HtmlDocument page = new HtmlWeb().Load("http://uk.finance.yahoo.com/q/cp?s=%5eftse");

    // Returns every td node that has the class yfnc_tabledata1.
    var titles = page.DocumentNode.SelectNodes("//td[@class='yfnc_tabledata1']");

    foreach (var title in titles)
    {
        txtLog.AppendText(title.InnerHtml + System.Environment.NewLine);
    }

The txtLog.AppendText line is just me testing. The code correctly gets each td node that has the class yfnc_tabledata1. Inside the foreach loop, I need to parse title to grab the symbol and the company name from the following HTML:

<b><a href="/q?s=glen.l">glen.l</a></b> glencore xstrat <b>343.95</b> <nobr><small>3 may 16:35</small></nobr> <img width="10" height="14" style="margin-right:-2px;" border="0" src="http://l.yimg.com/os/mit/media/m/base/images/transparent-1093278.png" class="pos_arrow" alt="up"> <b style="color:#008800;">12.80</b> <b style="color:#008800;"> (3.87%)</b> 68,086,160

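Is it possible to parse the results of an already parsed document? I'm a little unsure where to start.

You just need to continue the XPath extraction work from where you are. There are many possibilities. The difficulty is that all the yfnc_tabledata1 nodes are at the same level, so the class alone does not tell a symbol cell from a company cell. Here is how you can do it (a console app example that dumps the list of symbols and companies):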

    HtmlAgilityPack.HtmlDocument page = new HtmlWeb().Load("http://uk.finance.yahoo.com/q/cp?s=%5eftse");

    // Get the symbols directly: under each matching td element,
    // recursively search for an a element that has an href attribute.
    var symbols = page.DocumentNode.SelectNodes("//td[@class='yfnc_tabledata1']//a[@href]");

    foreach (var symbol in symbols)
    {
        // From the current a element, go up two levels (to its td) and take the next td element.
        var company = symbol.SelectSingleNode("../../following-sibling::td").InnerText.Trim();
        Console.WriteLine(symbol.InnerText + ": " + company);
    }
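
To try the same XPath without depending on the live page, here is a minimal sketch that runs it against a hard-coded snippet. The markup in SampleHtml, the second row ("ABC.L", "Example Co") and the names SymbolTableDemo and Extract are all made up for illustration. The snippet assumes each row puts the symbol link in one td and the company name in the next td, which is what the XPath above relies on. It also guards against SelectNodes returning null, which HtmlAgilityPack does when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class SymbolTableDemo
{
    // Hypothetical sample markup modelled on the row in the question;
    // the real page layout is an assumption here.
    public const string SampleHtml = @"
<table>
  <tr>
    <td class='yfnc_tabledata1'><b><a href='/q?s=GLEN.L'>GLEN.L</a></b></td>
    <td class='yfnc_tabledata1'>Glencore Xstrat</td>
    <td class='yfnc_tabledata1'><b>343.95</b></td>
  </tr>
  <tr>
    <td class='yfnc_tabledata1'><b><a href='/q?s=ABC.L'>ABC.L</a></b></td>
    <td class='yfnc_tabledata1'>Example Co</td>
    <td class='yfnc_tabledata1'><b>100.00</b></td>
  </tr>
</table>";

    // Returns (symbol, company) pairs found in the given HTML.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<KeyValuePair<string, string>>();

        // Same expression as above: links nested anywhere inside a matching td.
        var links = doc.DocumentNode.SelectNodes("//td[@class='yfnc_tabledata1']//a[@href]");
        if (links == null)
        {
            return results; // no match: SelectNodes returns null, not an empty list
        }

        foreach (var link in links)
        {
            // a -> b -> td, then the first following sibling td holds the company name.
            var companyCell = link.SelectSingleNode("../../following-sibling::td");
            if (companyCell == null)
            {
                continue; // skip rows that do not have the expected shape
            }

            results.Add(new KeyValuePair<string, string>(
                link.InnerText.Trim(),
                companyCell.InnerText.Trim()));
        }

        return results;
    }

    public static void Main()
    {
        foreach (var pair in Extract(SampleHtml))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

To run it against the real page, pass the HTML loaded with HtmlWeb instead of SampleHtml. If the page layout differs from the assumption, the null checks mean rows are skipped rather than an exception being thrown.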
