By Bob Rudis (@hrbrmstr)
Thu 09 July 2015 | tags: blog, r, rstats, xml, xslt, webscraping

Sometimes you just need the salient text from a web site, often as a first step towards natural language processing (NLP) or classification. There are many ways to achieve this, but XSLT (eXtensible Stylesheet Language Transformations) was purpose-built for slicing, dicing and transforming XML (and, hence, HTML), so it can make ...


By Bob Rudis (@hrbrmstr)
Tue 29 April 2014 | tags: data munging, xml, R, rstats, scraping

NOTE: Qualys allows automated access to their SSL Server Test site in their T&Cs, and the R function/script provided here does its best to adhere to their guidelines. However, if you launch multiple scripts at once and catch their attention you will, no doubt, be banned ...