Goose - Article Extractor now open source, in the style of Flipboard/Instapaper
December 21, 2010
Today I'm releasing a project I worked on for http://gravity.com. It's an HTML article extractor à la Flipboard / Instapaper. It takes an article page, runs some calculations on it, and gives you back an Article object containing the extracted article text and the main image that we think is relevant to the article.
https://github.com/jiminoc/goose/wiki
The goal is to create an open source article extractor for use with open source applications, crawlers or academic NLP processing initiatives.
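To make the article-in, Article-out flow described above a little more concrete, here is a minimal sketch in Python. It is not Goose's code or its algorithm: the `Article` dataclass, the `extract_article` function, and the `min_words` threshold are all hypothetical names invented for illustration, and the "calculation" is a deliberately naive stand-in (keep paragraphs with enough words, take the first image on the page).

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional


@dataclass
class Article:
    # Hypothetical result type for this sketch; not Goose's actual Article class.
    text: str
    top_image: Optional[str]


class _ParagraphCollector(HTMLParser):
    """Collects the text of each <p> element and the src of each <img>."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: List[str] = []
        self.images: List[str] = []
        self._buffer: Optional[List[str]] = None  # None while outside a <p>

    def _flush(self) -> None:
        # Store the paragraph gathered so far, with whitespace collapsed.
        if self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # tolerate a previous <p> that was never closed
            self._buffer = []
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()  # keep a trailing unclosed paragraph


def extract_article(html: str, min_words: int = 10) -> Article:
    """Naive heuristic (an assumption of this sketch, not Goose's method):
    keep paragraphs with at least `min_words` words, pick the first image."""
    collector = _ParagraphCollector()
    collector.feed(html)
    collector.close()
    body = [p for p in collector.paragraphs if len(p.split()) >= min_words]
    top_image = collector.images[0] if collector.images else None
    return Article(text="\n\n".join(body), top_image=top_image)
```

Calling `extract_article(page_html)` on a fetched page returns an `Article` whose `text` holds the longer paragraphs and whose `top_image` holds the first image URL, or `None` if the page has no images. A real extractor has to do far more than this to cope with navigation, ads, comments, and pages with many images.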
See more at the new blog: http://jimplush.com