December 24th, 2014 |
Data Aggregation, Web Crawling, Web Scraping
Free Web Data Platform . Free Web Scraping Tool
A lot is happening all the time, and many new platforms have come over the horizon. Some claim they can do everything for you with a simple setup; others say they offer the best tools in the web scraping / extraction field and will minimize your time and expense. Still, developer-based web scraping remains far more in demand than those automated tools.
One of our new clients recently sent me a message saying, “I’ve attempted to download and use at least 12 different programs and none of them do what is expected. I appreciate your help.” We discussed his requirements and delivered the solutions his business urgently needed, of course within a reasonable cost and time. We hear from unhappy webmasters like this almost every week, and before long we put smiles back on their faces.
Before writing this post, our team members created accounts on several of those platforms and tested them in various ways. The experience was a huge waste of time: we got wrong results, and we had no way to tell how many of the results were wrong. So this is simply not for serious business and professional users. If you need to depend on your data, you need accurate data, and if you have somebody responsible for that, you can stay relaxed.
Please feel free to communicate with us to discuss any web scraping project.
March 16th, 2012 |
Data Aggregation, Data Extraction, Data Parsing, Web Scraping
Drupal . RSS Aggregator . RSS Parser . WordPress
We developed a complex RSS parser and aggregator module for Drupal that can scrape given feeds and create nodes with proper versioning. It doesn’t just merge RSS feeds; it can also handle duplicate items according to the settings in the backend. The module is fully manageable from the backend. Later we developed a similar plugin for WordPress, where each item is added as a post. In both cases, you have the ability to add and manage custom feeds and their data. We also prepared and deployed a CRON version for both.
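The core merge-and-deduplicate idea can be sketched as follows. This is only an illustrative Python sketch, not the actual Drupal (PHP) module: the dedup key (the item `guid`, falling back to `link`) and the sample feeds are assumptions for demonstration.

```python
import xml.etree.ElementTree as ET

def parse_items(rss_xml):
    """Extract items from an RSS 2.0 document string."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        guid = item.findtext("guid") or link  # dedup key: guid, else link
        items.append({"key": guid, "title": title, "link": link})
    return items

def merge_feeds(feeds):
    """Merge items from several feeds, keeping the first copy of each key."""
    seen, merged = set(), []
    for rss_xml in feeds:
        for item in parse_items(rss_xml):
            if item["key"] in seen:
                continue  # duplicate item: skip, per the backend setting
            seen.add(item["key"])
            merged.append(item)
    return merged

feed_a = """<rss><channel>
<item><title>Post 1</title><link>http://example.com/1</link><guid>1</guid></item>
<item><title>Post 2</title><link>http://example.com/2</link><guid>2</guid></item>
</channel></rss>"""
feed_b = """<rss><channel>
<item><title>Post 2 (copy)</title><link>http://example.com/2b</link><guid>2</guid></item>
<item><title>Post 3</title><link>http://example.com/3</link><guid>3</guid></item>
</channel></rss>"""

merged = merge_feeds([feed_a, feed_b])
print([i["title"] for i in merged])  # ['Post 1', 'Post 2', 'Post 3']
```

In the real module each merged item becomes a Drupal node (or a WordPress post in the plugin version), and the duplicate-handling policy is configurable from the backend rather than hard-coded.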
Besides the above, we worked for a UK-based property (MLS) website with a number of real estate agents feeding data into the site in various formats, including RMv3, BLM files, XML feeds, etc. The website is built on Drupal. Our role was to build a module that processes the agent feeds and some websites and parses them into the Drupal system. We also managed their frontend website to deploy and display the processed data properly.
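The multi-format feed processing above boils down to one pattern: a parser per format, each normalizing into a common property record before import. Here is a hedged Python sketch of that idea; the field names (`ref`, `price`), the sample feeds, and the format keys are invented for illustration — real RMv3/BLM files have their own field definitions.

```python
import csv
import io
import xml.etree.ElementTree as ET

def parse_xml_feed(text):
    """Parse a (hypothetical) XML property feed into common records."""
    root = ET.fromstring(text)
    return [{"ref": p.findtext("ref"), "price": int(p.findtext("price"))}
            for p in root.iter("property")]

def parse_csv_feed(text):
    """Parse a (hypothetical) delimited property feed into common records."""
    return [{"ref": row["ref"], "price": int(row["price"])}
            for row in csv.DictReader(io.StringIO(text))]

# One parser per agent feed format; all emit the same record shape.
PARSERS = {"xml": parse_xml_feed, "csv": parse_csv_feed}

def normalize(fmt, text):
    return PARSERS[fmt](text)  # common records ready for the Drupal import

xml_feed = ("<properties><property><ref>A1</ref>"
            "<price>250000</price></property></properties>")
csv_feed = "ref,price\nB2,199950\n"

records = normalize("xml", xml_feed) + normalize("csv", csv_feed)
print(records)  # [{'ref': 'A1', 'price': 250000}, {'ref': 'B2', 'price': 199950}]
```

Keeping the format-specific parsing behind one dispatch table is what lets a new agent format (such as another BLM variant) be added without touching the Drupal import side.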