Anthems, a media portal from the USA
Anthems, a media portal from the USA, required our help with infrastructure optimization. They have a PHP app running on Google Cloud that uses a Google Cloud SQL database to store user content and data. The customer wanted us to audit the existing infrastructure (including load testing) and suggest ways, based on DevOps best practices, to ensure that the platform can handle millions of simultaneous users.
Location: The United States
Industry: Social Media Network
Partnership period: June 2019 – ongoing
Team size: 2-3 people
Team location: Kharkiv, Ukraine
Services: Cloud infrastructure design and implementation, database optimization, web development, load testing
Expertise delivered: GCP cloud administration, DigitalOcean cloud, DevOps services, PHP app architecture, SQL and NoSQL database management, cloud monitoring
Technology stack: GCP services, JMeter, DigitalOcean, Ansible, MySQL, NoSQL, PHP
IT Svit had to do the following:
- Perform load testing to ensure that the platform can handle millions of users.
- Test the current API to see how many concurrent users it can handle.
- Leverage monitoring tools to identify whether scalability bottlenecks stem from inefficient PHP code, slow MySQL queries, poor MySQL indexes, or suboptimal server configuration.
- Design and implement solutions for the problems found.
- Conceptualize other ways to increase scalability and performance (e.g. caching).
- Design and develop the infrastructure using best practices.
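The load-testing tasks above come down to measuring throughput and latency percentiles at a given load. The following is a minimal Python sketch of how raw latency samples could be summarized; it is not the JMeter setup used on the project. The function names, the nearest-rank percentile method, and the sample values are illustrative assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms, duration_s):
    """Summarize one load-test run: request count, throughput, p90 latency."""
    return {
        "requests": len(latencies_ms),
        "rps": len(latencies_ms) / duration_s,
        "p90_ms": percentile(latencies_ms, 90),
    }


# Hypothetical latency samples (ms) collected over a 2-second window.
latencies = [120, 95, 250, 180, 310, 140, 160, 130, 220, 105]
report = summarize(latencies, duration_s=2.0)
print(report)
```

Reporting a percentile rather than the average keeps a few very slow requests from being hidden by many fast ones.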
Challenges and solutions
While working on the project, we encountered the following challenges:
- All user data and content was stored in MySQL, which is far from the fastest database option.
- Every user generated very resource-intensive queries against the shared database.
The following solutions were proposed and implemented:
- We suggested introducing separate tables for every user to speed up database operations. However, this was only a partial solution: to support the millions of users the customer wants, the app architecture must be rewritten.
- As an alternative, we suggested storing all user data in a NoSQL database and keeping only a link to it in the common MySQL database. As part of this solution, we can also integrate Elasticsearch to implement proper text search and make system operations several times faster.
- IT Svit enabled and correctly configured Google CDN caching, which allowed the system to process 2,000 rps with a 90th-percentile response time of 250 ms.
- We wrote a detailed description of the infrastructure and workflows in place.
- We created an Ansible playbook for automated creation of the load-testing infrastructure on DigitalOcean.
- We selected the most secure caching layer for the system. We compared Cloudflare with Google CDN + Google Cloud Armor on several parameters: DDoS protection and resilience to SQL injection, cross-site scripting and cross-site request forgery. We decided to stick with Cloudflare, because Google Cloud Armor had been released only recently and many of its features were still in beta.
- We started logging SQL query caching to verify that it actually speeds up system operations.
- We configured system monitoring and alerting using Google Stackdriver.
- We secured the system by placing the instances in a private subnet, behind Cloudflare CDN and a bastion host, so that they cannot be accessed directly from the Internet.
- We created auto-scaling groups for the API and PHP endpoints so the system can scale easily under heavy workloads.
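The NoSQL alternative described above (user data in a document store, with only a link kept in the relational database) can be sketched as follows. This is an illustration under stated assumptions, not the customer's implementation: an in-memory SQLite database stands in for MySQL, a plain dictionary stands in for the NoSQL store, and the table, column, and function names are hypothetical.

```python
import json
import sqlite3

# Stand-in for a NoSQL document store (assumption: any key-value or
# document database could play this role).
document_store = {}


def save_user(conn, user_id, profile):
    """Write the full profile to the document store and keep only its key
    in the relational table."""
    doc_key = f"user:{user_id}"
    document_store[doc_key] = json.dumps(profile)
    conn.execute(
        "INSERT OR REPLACE INTO users (id, doc_key) VALUES (?, ?)",
        (user_id, doc_key),
    )


def load_user(conn, user_id):
    """Look up the link in the relational table, then fetch the document."""
    row = conn.execute(
        "SELECT doc_key FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    return json.loads(document_store[row[0]])


# SQLite in memory stands in for the common MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, doc_key TEXT NOT NULL)")

save_user(conn, 42, {"name": "example", "posts": ["hello"]})
print(load_user(conn, 42))
```

With this layout, the relational table stays narrow and cheap to index, while the heavy per-user content is read with a single key lookup.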
At the moment, all the DevOps tasks are nearing completion, and we are discussing the PHP and database architecture improvements required to continue the project. Once the customer approves the scope and roadmap, the project will move on.