01 Jul 2020

Jornalia, a news aggregator API

After working on cvd19.news, which used NewsAPI as its data source, I looked into the other news APIs available on the market. The Google News API had been deprecated, and few alternatives offered data from a broad range of Argentine media, so a friend and I decided to build a news aggregator that exposes a RESTful interface.

Combining web crawling and RSS feed consumption, we developed Jornalia, a JSON API for news from Argentine media. Jornalia lets you search across a large number of outlets and filter the resulting articles by category and publication date. The stack consists of an API built with Express, a Python web crawler built with Scrapy, and MongoDB to store all the data.
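To make the RSS side of this more concrete, here is a minimal sketch of the general idea: turn feed items into uniform article records, then filter them by category and publication date. It is not Jornalia's actual code. The sample feed, the field names (`source`, `title`, `url`, `category`, `published_at`) and the helper functions are all assumptions made up for illustration, and it uses only the Python standard library rather than Scrapy or MongoDB.

```python
import json
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from xml.etree import ElementTree

# Hypothetical RSS 2.0 feed, invented for this example.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Daily</title>
    <item>
      <title>Economy headline</title>
      <link>https://example.com/economy/1</link>
      <category>economy</category>
      <pubDate>Wed, 01 Jul 2020 12:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Sports headline</title>
      <link>https://example.com/sports/2</link>
      <category>sports</category>
      <pubDate>Mon, 29 Jun 2020 09:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Convert the <item> elements of an RSS feed into article dicts."""
    root = ElementTree.fromstring(xml_text)
    source = root.findtext("channel/title", default="")
    articles = []
    for item in root.iter("item"):
        articles.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "category": item.findtext("category", default=""),
            # RSS dates use the RFC 822 format, which email.utils can parse.
            "published_at": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return articles


def filter_articles(articles, category=None, since=None, until=None):
    """Keep articles matching the category and the inclusive date range."""
    result = []
    for article in articles:
        if category is not None and article["category"] != category:
            continue
        if since is not None and article["published_at"] < since:
            continue
        if until is not None and article["published_at"] > until:
            continue
        result.append(article)
    return result


def to_json(articles):
    """Serialize articles to JSON, writing dates as ISO 8601 strings."""
    serializable = [
        {**article, "published_at": article["published_at"].isoformat()}
        for article in articles
    ]
    return json.dumps(serializable, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    articles = parse_feed(SAMPLE_RSS)
    recent = filter_articles(
        articles, since=datetime(2020, 6, 30, tzinfo=timezone.utc)
    )
    print(to_json(recent))
```

In a real aggregator the parsed records would be stored in a database and the filters would become query parameters on the API, but the shape of the problem is the same: normalize many sources into one schema, then query over it.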

Jornalia is still in active development, and we have already released a beta version, available at https://jornalia.net. We have many plans for what comes next, including a news portal that lets you create personalized feeds and the integration of media from other countries.

nodejs
express
mongodb
web crawling
python
aws