By Bob Rudis (@hrbrmstr)
Thu 09 July 2015 | tags: blog, r, rstats, xml, xslt, webscraping

Sometimes you just need the salient text from a web site, often as a first step towards natural language processing (NLP) or classification. There are many ways to achieve this, but XSLT (eXtensible Stylesheet Language Transformations) was purpose-built for slicing, dicing and transforming XML (and, hence, HTML), so it can make ...