
High load Elgg solution

367 days ago by Michał Zacher

Problem

 

The PHP/MySQL tandem isn't the best choice for building highly scalable systems for a large user base. Moreover, Elgg's EAV database model is very resource-intensive. Our solution was to minimize PHP usage by serving cached data through more robust technologies and by performing as much processing as possible on the client side.

Solution

 

We decided to limit the use of PHP by implementing a cache at the framework level. Database usage and data processing in Elgg were reduced by implementing controllers. We limited data transfer between server and browser by generating HTML with JavaScript in the browser. This ensures that the user downloads only data from the server, without the HTML structure.

After spending over 1200 work hours on these modifications, we improved Elgg's performance 28-fold. Page loading times dropped 16-fold. This performance is similar to that of Wikipedia. We plan to improve it further: with some additional optimizations it should be possible to get results twice as good.

Architecture

 

Our technology builds each page from blocks called controllers and from templates, which form their presentation layer. Controllers are aware of their data. They retrieve and process data only when it is required, and act only as a skeleton when their data is not being used. They are reusable across the whole website. Separating them from the view layer allows us to use a single instance of a given controller to display the same data in different views. They are partly based on the HMVC architecture.
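
As a rough illustration of the idea, here is a minimal sketch of a controller that loads its data lazily and is shared by two views. All names (Controller, loadData, getData, listView, countView) are hypothetical and are not taken from the actual implementation.

```javascript
// Hypothetical sketch: class and function names are illustrative only.
class Controller {
  constructor(name, loadData) {
    this.name = name;
    this.loadData = loadData; // function that fetches the raw data
    this.data = null;
    this.loaded = false;
    this.loadCount = 0; // how many times the data was actually fetched
  }

  // Fetch data only on first use; until then the controller is a skeleton.
  getData() {
    if (!this.loaded) {
      this.data = this.loadData();
      this.loaded = true;
      this.loadCount += 1;
    }
    return this.data;
  }
}

// Two views (presentation layer) sharing one controller instance.
const listView = (c) => c.getData().map((u) => `<li>${u}</li>`).join("");
const countView = (c) => `<span>${c.getData().length} users</span>`;

const users = new Controller("users", () => ["alice", "bob"]);
const unused = new Controller("unused", () => ["never", "loaded"]);

const listHtml = listView(users);
const countHtml = countView(users);
```

Both views read from the same instance, so the data is fetched once, and a controller that no view touches never fetches anything.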

When a user requests a page for the first time, they download the HTML templates for the whole website, encoded in JavaScript. These templates weigh approx. 6 KB for 40 pages, less than the standard HTML for a single page. They are downloaded only once.
With each page refresh, the user downloads raw data for the given page. Unlike in API-based systems, all data is gathered with a single request. Based on this data, the HTML code is generated by JavaScript in the user's browser.
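
The browser-side step could look roughly like the sketch below. The render function is a simplified stand-in for a Mustache-style renderer, not the real Mustache library or our modified version, and the template and data contents are made up for illustration.

```javascript
// Simplified stand-in for a Mustache-style renderer: replaces {{key}} with
// the HTML-escaped value from the data object. Not the real Mustache library.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function render(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    key in data ? escapeHtml(data[key]) : ""
  );
}

// Templates for the whole site, downloaded once (hypothetical content).
const templates = {
  profile: "<h1>{{name}}</h1><p>{{about}}</p>",
};

// Raw data for one page, as it might arrive in a single response.
const pageData = { name: "Ann", about: "Likes <b>Elgg</b>" };

const html = render(templates.profile, pageData);
```

Only pageData would travel over the network on each refresh; the templates object is already in the browser.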

Raw data is taken directly from Elgg only once per controller. Afterwards, it is taken from the cache, without using PHP. When the data changes, the controller is notified and the data in the cache is refreshed. The flow of data between the user, browser and server is presented in Figure 1.
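
The caching behaviour described above can be sketched as follows. This is an assumption-laden illustration: an in-memory Map stands in for the cache storage, and fetchFromBackend is a hypothetical placeholder for the expensive call into Elgg.

```javascript
// In-memory Map used as a stand-in for the cache storage; fetchFromBackend
// is a hypothetical placeholder for the expensive call into Elgg.
class ControllerCache {
  constructor(fetchFromBackend) {
    this.fetchFromBackend = fetchFromBackend;
    this.store = new Map();
    this.backendCalls = 0; // counts how often the backend was hit
  }

  // Serve from cache when possible; hit the backend only on a miss.
  get(controllerId) {
    if (!this.store.has(controllerId)) {
      this.backendCalls += 1;
      this.store.set(controllerId, this.fetchFromBackend(controllerId));
    }
    return this.store.get(controllerId);
  }

  // Called when the underlying data changes: refresh the cached entry.
  notifyChanged(controllerId) {
    this.backendCalls += 1;
    this.store.set(controllerId, this.fetchFromBackend(controllerId));
  }
}

let version = 1;
const cache = new ControllerCache((id) => `${id}-v${version}`);

const first = cache.get("blog");  // miss: goes to the backend
const second = cache.get("blog"); // hit: served from the cache
version = 2;
cache.notifyChanged("blog");      // data changed: entry is refreshed
const third = cache.get("blog");  // hit: returns the refreshed entry
```

Three reads and one change cost two backend calls in this sketch; every other read is answered from the cache.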

 

 

Performance details

 

We compare performance in two categories:

  • capacity - how many requests can be processed per second
  • latency - how much time it takes to process a single request under heavy load
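
To make the two metrics concrete, here is a small sketch of how they could be computed from recorded request timings. The sample format and the summarize function are assumptions made for illustration; they do not describe the tool used for the measurements in this article.

```javascript
// Each sample is a hypothetical record of one request: start and end in ms.
function summarize(samples) {
  const firstStart = Math.min(...samples.map((s) => s.start));
  const lastEnd = Math.max(...samples.map((s) => s.end));
  const totalLatency = samples.reduce((sum, s) => sum + (s.end - s.start), 0);
  return {
    // capacity: completed requests per second over the measurement window
    requestsPerSec: samples.length / ((lastEnd - firstStart) / 1000),
    // latency: mean time to process a single request
    meanLatencyMs: totalLatency / samples.length,
  };
}

const stats = summarize([
  { start: 0, end: 50 },
  { start: 10, end: 70 },
  { start: 400, end: 470 },
  { start: 440, end: 500 },
]);
```

Here four requests complete within a 500 ms window, with individual processing times of 50, 60, 70 and 60 ms.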

The comparison was made between an Elgg page running with and without our cache system, and a page presenting similar data served via the Symfony 2.0 framework running on an nginx server with cache enabled. The tests were run on an Amazon Web Services server instance of type 'Small', which, according to Amazon's specs, has the following configuration:

  • 1.7 GB memory
  • 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
  • 160 GB instance storage

 

 

                           Our cache system   Pure Elgg   Symfony
Capacity [requests/sec]    400                7           80
Latency [ms]               60                 1000        200



It is worth mentioning that these tests were made with a prototype of our cache server, and there is a good chance of improving these results in the near future. These times don't include the loading of static external resources such as images, so the final time after which the user sees the finished output in the browser will be higher in all three cases. These numbers show how fast and how often the server can serve a response.

Please note: rendering a page in the browser also takes some time. This time is longer in our solution, although it is still a small percentage of the whole request. Rendering the page on the browser's side doesn't burden the server and doesn't affect how many users you can support.

Advantages and disadvantages


Advantages:

  • much better performance (latency and throughput)
  • object-oriented, clean and easy-to-support architecture
  • full separation of presentation and business logic through the templating system
  • all new Vazco plugins work both with pure Elgg and with our solution. This allows you to use third-party plugins and our plugins at the same time, both on pure Elgg and on our solution
  • object-oriented code allows us to use some advanced mechanisms, automated tests being the most important

Disadvantages:

  • rendering a page on the browser's side takes a little longer (usually approx. 50 milliseconds)
  • third-party plugins don't use our technology and won't see a performance increase. They will still work with our solution.
  • your hosting server needs to support Node.js. You also have to be able to set up a Redis server
  • learning the new architecture takes a developer some time (approx. a week)

Technologies used

 

  • Node.js – as cache server
  • Redis – as cache storage
  • Mustache – our modified version of the Mustache library, for rendering views
