Building a Python-Powered Content Aggregator for a Multimedia Startup

Some stress the elegance of the Python programming language, others marvel at its modularity and extensive standard library, while we at Agiliway tip our hat to Python for the productivity gains it delivers.

Once again Python played to its strengths in the hands of our engineers, and this time an American multimedia startup received a perfect tool for conducting market research easily and effectively.

Challenges for the Startup in the Digital Age

Digital data comes thick and fast every day, leaving readers overwhelmed and huddled over their laptops for hours on end. Every day about 4 million hours of content are uploaded to YouTube, an average of 3.6 billion Google searches are conducted, and more than 2 million articles are published on the web. The Washington Post alone publishes around 1,200 staff-produced articles, wire stories, graphics, and videos a day, which amounts to a story every couple of minutes. Who would have thought that 600 new page edits are published on Wikipedia every minute? Highlights of last year's reports on the world's data generation are enough to make anyone scratch their head in confusion.

Given the all-time high pace at which events hit headlines, news goes viral, and updates are published, our client, the multimedia startup, urgently needed its routine daily work expedited and leveled up. The company serves the enterprise software industry by delivering detailed industry analyses, raising awareness of information technologies, creating webcasts, preparing articles, and researching programs and white papers. Heavily reliant on the Internet for the latest industry news, updates, and analyses, the client could spend hours collecting relevant data on a single topic. At times the search proved ineffective for lack of the necessary skills, time, or resources. With the advent of digital media came continuous access to more incoming information than any person could spot, process, and absorb.

Getting a Handle on Information Overload

To keep its business alive and thriving, the company could not afford to ignore the glut of information arriving fast and furious from a variety of sources. Only after data on a topic have been collected, filtered, classified, and checked for accuracy and reliability does the resulting analysis merit notice.

The solution was to build a third-party app, a Python-powered content aggregator, to do all the dirty work of searching for and collecting niche-relevant content from multiple sources. It crawls the Internet (social media platforms, RSS feeds, news and company websites, online editions of journals and newspapers, forums, blogs, etc.), checks for all sorts of updates, filters media outlets against set criteria, and automatically uploads the results to a repository that an admin can access at any time for further work.
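To make the crawling-and-filtering idea concrete, here is a minimal sketch of one step of such an aggregator: parsing an RSS feed and keeping only the items that match configured keywords. The feed text and the `KEYWORDS` set are illustrative stand-ins; a real crawler would fetch many feeds over HTTP on a schedule before filtering.

```python
# Sketch: filter RSS items by keyword, as a content aggregator might.
import xml.etree.ElementTree as ET

KEYWORDS = {"python", "enterprise"}  # hypothetical source configuration

SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <item><title>Python tops enterprise software surveys</title>
        <link>https://example.com/a</link></item>
  <item><title>Celebrity gossip roundup</title>
        <link>https://example.com/b</link></item>
</channel></rss>"""

def matching_items(feed_xml, keywords):
    """Return (title, link) pairs whose title mentions any keyword."""
    root = ET.fromstring(feed_xml)
    results = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if any(kw in title.lower() for kw in keywords):
            results.append((title, link))
    return results

print(matching_items(SAMPLE_FEED, KEYWORDS))
# keeps only the enterprise-software item, drops the gossip one
```

In the full system this filtering stage would feed the matched items into the repository for the admin to review, rather than printing them.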

The implementation of the Python-powered content aggregator brought the following advantages:

  • relevant content is pulled from all corners of the Internet without human intervention
  • information is received automatically
  • the value of updates is preserved through timely delivery
  • the admin can scan information quickly without visiting each source site individually
  • the Python-based platform is easy to use, with powerful filtering and source-configuration capabilities
  • highly customizable, the aggregator grabs exactly the information the webmaster configures it for
  • new posts can be categorized by subject, easily sorted, added, deleted, and commented on
  • analysis has higher intrinsic value when based on a rich, informative channel

The client expected full control: clicking only through items of interest and spending time on relevant posts. To meet these expectations, Agiliway engineers developed an aggregation strategy and turned an informative channel into a hub of community information with editorial comments. The process included a few steps:

  • creating a web crawler in Python to parse source sites and fetch data
  • recording the data in a non-relational MongoDB database for fast and convenient processing
  • entering the crawler's metadata in a MySQL database for better and faster indexing
  • creating a WordPress plugin to configure sources, keywords, and categories for WordPress posts
  • having the crawler categorize and filter content and automatically create posts in WordPress as well as in other CMSs
  • designing the user interface with a popular CMS as a prototype; interactive and user-friendly, it lets users with limited expertise review, edit, modify, or delete content from a website
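The publishing step above (crawler output becoming CMS posts) can be sketched as follows. This is an assumption-laden illustration, not the client's actual code: the keyword-to-category rules and category ids are hypothetical, and in the real setup they would come from the plugin's MySQL metadata rather than a hard-coded dict. The payload shape follows the WordPress REST API's `POST /wp/v2/posts` endpoint.

```python
# Sketch: categorize a crawled item and shape it into a WordPress
# REST API post payload, created as a draft for admin review.
CATEGORY_RULES = {        # keyword -> WordPress category id (illustrative)
    "security": 12,
    "cloud": 7,
}
DEFAULT_CATEGORY = 1      # WordPress's "Uncategorized"

def categorize(title):
    """Pick category ids whose keyword appears in the title."""
    ids = [cid for kw, cid in CATEGORY_RULES.items() if kw in title.lower()]
    return ids or [DEFAULT_CATEGORY]

def to_wp_payload(item):
    """Shape a crawled item into a WordPress post payload."""
    return {
        "title": item["title"],
        "content": f'{item["summary"]}\n\nSource: {item["link"]}',
        "categories": categorize(item["title"]),
        "status": "draft",   # an admin reviews before publishing
    }

crawled = {"title": "Cloud security trends",
           "summary": "A roundup of enterprise reports.",
           "link": "https://example.com/report"}
payload = to_wp_payload(crawled)
print(payload["categories"], payload["status"])  # [12, 7] draft
```

An HTTP client would then send this payload to the site's REST endpoint with appropriate authentication; keeping new posts as drafts preserves the editorial-review step the client asked for.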

Less time spent searching for relevant content means more time to analyze it, draw conclusions, and prepare data-driven reports and high-powered analytics. With the content aggregator written in high-level, dynamic Python and backed by full-featured MongoDB, parsing, searching, and grouping take only a moment.

Whether or not you decide to implement one, a content aggregator's applications are diverse (news, reviews, analysis, research, price comparisons, etc.), and most likely your competitors are already making the most of one.