An online library fully equipped with a distributed analytics cluster
MySQL, MongoDB, Python, React, NodeJS, Bash, Hadoop, AWS
About the Project:
Liberate is a full-stack web app that stores book information and reviews. It has an automated scaling mechanism for its database layer, which uses both MongoDB and MySQL to store book information and reviews. The system also employs Hadoop and PySpark for analytics tasks. All components are instantiated and hosted on AWS EC2 instances.
Application main screen
Features:
Automatic spin-up and teardown of the entire system, and automatic scaling of the Hadoop analytics cluster. The automation is implemented with Python and Bash scripts that run during instance instantiation.
The web application lets users filter and browse the available books, add new books, and post reviews.
The analytics system, built on PySpark and Hadoop, periodically computes TF-IDF over the reviews and the Pearson correlation between book price and average review length.
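To make the two analytics outputs concrete, here is a minimal plain-Python sketch of what TF-IDF and the Pearson correlation compute. It is not the project's PySpark job: the function names `tf_idf` and `pearson`, the whitespace tokenizer, and the unsmoothed `log(N / df)` weighting are all assumptions made for illustration, and a Spark implementation may tokenize and weight terms differently.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document (hypothetical helper).

    Assumes whitespace tokenization and the unsmoothed weighting
    tf * log(N / df); the project's PySpark job may differ.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

With made-up inputs, `tf_idf(["great book", "great plot twist", "boring"])` gives "great" a lower weight than "book" in the first review because "great" appears in two of the three reviews, and `pearson(prices, avg_review_lengths)` returns a value between -1 and 1 describing how linearly the two lists move together.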