10 Second RSS Parsing in R and XPath

I’m finding R more and more useful for just dragging data out of things. RSS is a touchy subject with some, but I still use it a lot, and I built Curatic to get me the stories I want to read, not lists of stories I don’t. Anyway, that’s not the point of this post.

R and XPath, good friends. Pull an RSS feed and get the titles, descriptions and publication dates quickly.

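> library(XML)    # xmlParse, xpathSApply, xmlValue
> library(RCurl)  # getURL
> xml.url <- "https://dataissexy.wordpress.com/feed/"
> # Fetch the feed as text, then parse it into an XML document
> rssdoc <- xmlParse(getURL(xml.url))
> # One XPath query per field; xmlValue returns the text of each matched node
> rsstitle <- xpathSApply(rssdoc, '//item/title', xmlValue)
> rssdesc <- xpathSApply(rssdoc, '//item/description', xmlValue)
> rssdate <- xpathSApply(rssdoc, '//item/pubDate', xmlValue)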

Done 🙂
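If you want the three vectors in one place, a rough sketch of a next step is to bind them into a data frame and turn the pubDate strings into real date-times. The inline feed_text, the feed_items name and the published column below are all made up for illustration (the inline feed just saves a network call), and the date format assumes the usual RFC 822 style that RSS 2.0 feeds use.

```r
library(XML)

# Hypothetical two-item feed, inlined so the sketch runs without a network call
feed_text <- '<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <description>Summary of the first post</description>
      <pubDate>Mon, 06 Jan 2014 09:30:00 +0000</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <description>Summary of the second post</description>
      <pubDate>Tue, 07 Jan 2014 18:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>'

# asText = TRUE tells xmlParse the argument is XML content, not a file name
rssdoc <- xmlParse(feed_text, asText = TRUE)

# Same three XPath queries as above, collected into one data frame
feed_items <- data.frame(
  title       = xpathSApply(rssdoc, "//item/title", xmlValue),
  description = xpathSApply(rssdoc, "//item/description", xmlValue),
  pubDate     = xpathSApply(rssdoc, "//item/pubDate", xmlValue),
  stringsAsFactors = FALSE
)

# Assumed RFC 822 style dates; %a and %b need an English locale to match
feed_items$published <- as.POSIXct(
  strptime(feed_items$pubDate, "%a, %d %b %Y %H:%M:%S %z", tz = "UTC")
)
```

One thing to watch: each query only returns the nodes that exist, so if some item in a real feed has no description, the vectors come back with different lengths and data.frame() will complain.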

 
