I’m finding R more and more useful for just dragging data out of things. RSS is a touchy subject with some, but I still use it a lot, and I built Curatic to get me the stories I want to read rather than lists of stories I don’t. Anyway, that’s not the point of this post.
R and XPath, good friends. Pull an RSS feed and get the titles, descriptions and publication dates quickly.
> library(XML)
> library(RCurl)
> xml.url <- "https://dataissexy.wordpress.com/feed/"
> rssdoc <- xmlParse(getURL(xml.url))
> rsstitle <- xpathSApply(rssdoc, '//item/title', xmlValue)
> rssdesc <- xpathSApply(rssdoc, '//item/description', xmlValue)
> rssdate <- xpathSApply(rssdoc, '//item/pubDate', xmlValue)
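If you want the three vectors in one place, a data frame is the obvious next step. This is only a rough sketch, not part of the original snippet: the inline feed text, the `item_field` helper and the `feed` data frame are made up for illustration so it runs without a network call, and the date format assumes the usual RFC 822 style `pubDate` with an English locale for the day and month names.

```r
library(XML)

# Hypothetical inline feed, standing in for the result of getURL(xml.url)
rss.text <- '<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example feed</title>
<item><title>First post</title><description>About R</description><pubDate>Mon, 06 Sep 2010 10:00:00 +0000</pubDate></item>
<item><title>Second post</title><description>About XPath</description><pubDate>Tue, 07 Sep 2010 12:30:00 +0000</pubDate></item>
</channel></rss>'

rssdoc <- xmlParse(rss.text, asText = TRUE)

# Illustrative helper (not an XML package function): text of every node matching an XPath
item_field <- function(doc, path) as.character(xpathSApply(doc, path, xmlValue))

# One row per <item>, one column per field
feed <- data.frame(
  title       = item_field(rssdoc, "//item/title"),
  description = item_field(rssdoc, "//item/description"),
  pubDate     = item_field(rssdoc, "//item/pubDate"),
  stringsAsFactors = FALSE
)

# Assumption: pubDate looks like "Mon, 06 Sep 2010 10:00:00 +0000";
# %a and %b only match English names, so other locales may yield NA
feed$published <- as.POSIXct(
  strptime(feed$pubDate, "%a, %d %b %Y %H:%M:%S %z", tz = "UTC")
)

# Newest stories first
feed <- feed[order(feed$published, decreasing = TRUE), ]
```

With a real feed you would swap `rss.text` for `getURL(xml.url)` and drop `asText = TRUE` only if you pass a file or URL instead of a string.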
Done :)