Big Data Solution for a Media Company 

The concept of big data has been around for years and is gradually becoming a mainstream IT approach to complex business problems. Data is considered “big” when it is so large and complex that traditional data-processing software cannot handle it. Still, the value of big data lies not in the volume of collected and stored information as such, but in business intelligence (BI): the mining, analysis and structuring of this data in ways that have a profound effect on the business. Agiliway has delivered such a business intelligence and big data solution, which has become the core of our client’s business.

Background 

Our client operates in the media market and wanted to publish news about internet technologies online. To meet this goal, the client needed an effective business intelligence system that would (1) analyze a constant flow of information from multiple public sources, (2) check the collected information for relevance, and (3) structure this data and produce blog posts based on it.

This task posed a number of technological challenges. In particular:

  • processing a large number of data sources required a crawler with complex multi-threaded logic;
  • determining the “relevance” of the pages found required complicated decision-making logic and the application of some AI practices;
  • an effective storage had to be selected for such a large amount of data;
  • quick and accurate management of requests, keywords, weights, excluded keywords, etc. called for a convenient admin panel that would be comprehensible and user-friendly for the client’s content managers.

Provided Solution  

Agiliway used custom Python code and MongoDB databases to provide the necessary big data solution. Namely, we created a crawler, written in Python, that searches social media, news platforms and blogs for information related to the specified keywords. Together with the news items themselves, the Python code collects information about users’ feedback on them, including the number of likes and comments.
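To make the idea of a multi-threaded keyword crawler more concrete, here is a minimal sketch using only the Python standard library. It is not the client’s actual crawler: the names `fetch_page` and `crawl`, the thread-pool design, and the simple substring match are all assumptions made for illustration, and the collection of likes and comments is left out because it depends on each source’s API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Default fetcher (hypothetical helper): download one page as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(urls, keywords, fetch=fetch_page, max_workers=8):
    """Fetch all URLs in parallel and keep pages mentioning any keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    hits = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each pending download back to the URL it belongs to.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                text = future.result()
            except Exception:
                # One failing source should not stop the whole crawl.
                continue
            haystack = text.lower()
            matched = [keyword for keyword in lowered if keyword in haystack]
            if matched:
                hits.append({"url": url, "text": text, "keywords": matched})
    # Sort so the result does not depend on thread completion order.
    return sorted(hits, key=lambda hit: hit["url"])
```

Passing the fetcher in as a parameter keeps the threading logic separate from the network code, so a different downloader (or a stub in tests) can be swapped in without touching `crawl`.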

As the next step in the big data solution, the collected data is checked for relevance. If the relevance is confirmed, the data is stored in MongoDB, a free open-source document-oriented database that supports dynamic queries and allows fast updates and scalability. In MongoDB, the data is structured under specific categories, which correspond to particular keywords. This data is then passed to the MySQL database of WordPress, the platform of the client’s website, and becomes the basis of new posts that are created automatically.
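The relevance check and the per-category structuring can be illustrated with a small sketch. The article does not describe the real decision-making logic, so everything here is an assumption: the `CategoryRule` class, the weighted keyword sum, the rule that any excluded keyword rejects an item, and the `threshold` value are hypothetical stand-ins for it.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CategoryRule:
    """Hypothetical admin-panel settings for one category."""
    name: str
    weights: dict                      # keyword -> weight
    excluded: set = field(default_factory=set)
    threshold: float = 1.0             # minimum score to count as relevant


def score(text, rule):
    """Sum the weights of matching words; excluded words veto the item."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if any(word in rule.excluded for word in words):
        return 0.0
    return sum(rule.weights.get(word, 0.0) for word in words)


def categorize(item, rules):
    """Return a category document for a relevant item, otherwise None."""
    if not rules:
        return None
    best = max(rules, key=lambda rule: score(item["text"], rule))
    best_score = score(item["text"], best)
    if best_score < best.threshold:
        return None
    # A plain dict like this is the shape a document database would store.
    return {
        "category": best.name,
        "score": best_score,
        "url": item["url"],
        "text": item["text"],
    }
```

A document returned by `categorize` could then be written to a MongoDB collection, for example with pymongo’s `insert_one`; that step is omitted here to keep the sketch free of external dependencies.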

Value Delivered  

The big data solution developed by Agiliway provided the client with a full product creation cycle. All the complex processes of gathering, analyzing, storing and structuring the data, and then putting it to use, happen automatically thanks to well-implemented business intelligence processes.

Our BI team has extensive experience with relational databases like MS SQL and Oracle DB, non-relational databases like MongoDB, and data visualization and BI tools such as Tableau and JasperReports. We successfully leverage the expertise gained in a variety of projects to design scalable data stores, migrate data, execute ETL processes, and build intelligent solutions with comprehensive visualization and reporting systems for our clients. Feel free to contact us to find out more about how big data can bring a competitive advantage to your business.