LiTWol @ Oleg Terenchuk


High throughput web architecture with drupal - *For authenticated users*

Submitted by litwol on Tue, 09/22/2009 - 03:47

Attached is a visual representation of my research. I've been studying how to optimize Drupal for authenticated users (unlike every other solution out there, which focuses on delivering content to anonymous users; that task is easy).

I approached the problem from afar, looking at both the application and the server. More importantly, I considered how the application interacts with the server to deliver content to the end user.

The idea is that each generated page consists partly of generic content that can be served to every user who views that page, and partly of user-specific content that can also be served to that one user on other pages.

I carefully study the page structure and break it up into smaller chunks; let's call them blocks or grids for the sake of this topic. Several grids may share a caching strategy, or a grid may have a strategy unique to it. In both cases we have to look at context. @todo: elaborate on that here.
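One way to picture "a caching strategy per grid" is as a rule for which context values go into that grid's cache key. The following is a minimal sketch in Python rather than Drupal code; the scope names (`all`, `page`, `role`, `user`), the `frag:` prefix, and the function name `fragment_key` are assumptions made for illustration, not part of the design described here.

```python
def fragment_key(block, scope, page=None, role=None, uid=None):
    """Build a cache key whose granularity depends on the block's scope.

    Hypothetical scopes: "all" (one copy for everyone), "page" (one copy
    per path), "role" (one copy per role), "user" (one copy per user).
    """
    context = {
        "all": (),
        "page": (page,),
        "role": (role,),
        "user": (uid,),
    }
    if scope not in context:
        raise ValueError(f"unknown scope: {scope}")
    parts = context[scope]
    if any(part is None for part in parts):
        raise ValueError(f"scope {scope!r} needs its context value")
    return ":".join(["frag", block, scope, *map(str, parts)])


footer_key = fragment_key("footer", "all")
inbox_key = fragment_key("inbox", "user", uid=42)
```

Two grids with the same scope share a strategy; a grid that needs a finer or coarser key simply declares a different scope.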

As shown in the diagram, the application never serves content directly to the user. The application only places cached content chunks into memcache. The server (in this case nginx, which has a module for direct memcache access) reads directly from memcache to construct the page contents.

This makes for a very high-throughput application and web architecture, because much of the content is read directly from RAM without ever invoking a PHP thread, much less doing a full Drupal bootstrap.
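The read path described above can be sketched as follows. This is a Python illustration under stated assumptions, not nginx configuration or Drupal code: a plain dict stands in for memcache, the `<!--include:KEY-->` placeholder syntax is invented for the example, and `render_fallback` stands in for the expensive call into the application.

```python
import re

# Hypothetical placeholder syntax for this sketch: <!--include:KEY-->
INCLUDE = re.compile(r"<!--include:(.+?)-->")


def assemble(template, store, render_fallback):
    """Replace each placeholder with the cached fragment for its key."""
    def resolve(match):
        key = match.group(1)
        if key not in store:
            # Cache miss: the only case where the application is invoked.
            store[key] = render_fallback(key)
        return store[key]
    return INCLUDE.sub(resolve, template)


backend_calls = []


def render_fallback(key):
    """Stand-in for the application; records every time it is hit."""
    backend_calls.append(key)
    return f"<div>{key}</div>"


store = {"frag:menu:all": "<ul>menu</ul>"}
template = ("<body><!--include:frag:menu:all-->"
            "<!--include:frag:inbox:user:42--></body>")

first = assemble(template, store, render_fallback)
second = assemble(template, store, render_fallback)
```

After the first request warms the store, the second request is assembled entirely from memory and the stand-in application is not called again, which is where the throughput gain comes from.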

The downside of this architecture is that it must be fine-tuned for each application that uses it, but I am sure that comes as no surprise.

@todo: Describe the high-level technical requirements to get this system up and running at both the web server (nginx) and application (Drupal) levels.

@todo: Add a link to the repository with the nginx config files and the custom Drupal core needed to make this work.

@todo: Write a step-by-step tutorial that describes getting this system running from the ground up.

@todo: Elaborate on the statistics: a $20 VPS from linode.com running Apache and bare-bones Drupal core was able to serve 40 requests/second using an authenticated session. The same VPS package running nginx (an asynchronous web server/proxy) was able to serve 90 requests/second with the same Drupal configuration. The same $20 VPS package running a "demo" version of my architecture design maxed out at 770 requests/second for 5500 requests, at which point my misconfigured Linux system reached its socket limit and stalled. I am sure this architecture can reach higher numbers in the hands of a more experienced sysadmin who can tune the kernel to deal properly with such a high request count.
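For a rough sense of scale, the ratios implied by those three figures can be worked out directly. The short Python calculation below uses only the numbers quoted above; the "seconds to stall" value assumes the demo sustained its peak rate for all 5500 requests, which the measurements do not confirm.

```python
# Figures quoted in the post (requests/second on the same $20 VPS).
apache_rps = 40
nginx_rps = 90
demo_rps = 770
demo_requests = 5500

nginx_vs_apache = nginx_rps / apache_rps      # 2.25
demo_vs_apache = demo_rps / apache_rps        # 19.25
demo_vs_nginx = demo_rps / nginx_rps          # about 8.56

# Assumption: the peak rate held for the whole run before the stall.
seconds_to_stall = demo_requests / demo_rps   # about 7.1

print(f"nginx vs Apache: {nginx_vs_apache:.2f}x")
print(f"demo vs Apache:  {demo_vs_apache:.2f}x")
print(f"demo vs nginx:   {demo_vs_nginx:.2f}x")
print(f"time to stall:   {seconds_to_stall:.1f} s")
```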

Attachment                                        Size       Hits  Last download
High throughput web architecture with drupal.jpg  108.95 KB  0     Not yet downloaded

1 response to "High throughput web architecture with drupal - *For authenticated users*"

1. Panels

Submitted by sdboyer (not verified) on Sat, 10/31/2009 - 18:06.

I carefuly study page structure and break it up into smaller chunks, lets call them blocks or grids for the sakes of the topic. Each grid may have similar caching strategy or it may have very unique caching strategy. In both cases we have to look at context. @todo: elaborate on that here.

Just dropping the note, per your request, that Panels' architecture inherently obviates the need to reverse-engineer the theme system (or any other approach), as it acts as a centralized, abstract controller for page rendering. I'm kinda of the opinion that trying to create advanced, granular caching strategies without some such kind of controller is rather like swimming upstream. Against a very fast current. Here's our IRC convo, for reference/public record:

[10:51:12] litwol|mac: i've been talking with folks quite a bit about this recently - if the need is to break up the page into granular sections wrt the caching methodology...
[10:51:17] -*- sdboyer takes a deep breath
[10:51:20] litwol|mac: panels :)
[10:54:16] sdboyer: context?
[10:54:35] to make sure im on the same page..
[10:54:50] litwol|mac: your discussion with david eight hours ago, and your post here: http://litwol.com/content/high-throughput-web-architecture-drupal-authen...
[10:54:58] ah ok
[10:55:00] yes
[10:55:08] "I carefuly study page structure and break it up into smaller chunks, lets call them blocks or grids for the sakes of the topic. Each grid may have similar caching strategy or it may have very unique caching strategy. In both cases we have to look at context. @todo: elaborate on that here."
[10:55:51] I think it is possible to get a "satisfactory" core implementation but it will take much effort and will be very hackish for sure. even with d7 the only way to accomplish _mature_ implementation without wasting too much effort is to utilize views+panels (dont shoot me plz)
[10:55:51] the standard drupal approach to page rendering - hand it off to whatever callback wants it and let it make all the decisions - is horribly difficult to design granular caching strategies around, period
[10:55:59] i disagree
[10:56:03] for ^^ reason
[10:56:14] :)
[10:56:17] reason being, they have best implemented and handle contexts. and to cache things right we need them.
[10:56:22] yup
[10:56:24] but it's not just that
[10:56:33] it's granular control over specific page elements
[10:56:56] which you otherwise need to reverse-engineer
[10:57:08] hmm ok
[10:57:16] give me an example please
[10:57:41] sec lemme pull a nice random one off the net
[10:58:21] sdboyer: if we track context change (cache context as needed, perhaps per to-be-cached-region) then we can check if context was changed between caches
[10:58:55] sdboyer: it doesn't really matter what happens to produce the page element
[10:59:07] as long as we know the input and output of that page element process
[11:01:13] litwol|mac: how do you decide which chunks of html are being cached, which are being changed, etc.?
[11:01:27] where do those chunks stop/start?
[11:01:50] sdboyer: easy. cache_set(theme('$page_elemnt', $page['my_element']));
[11:01:54] that's the question that demands more granular control over the rendering process
[11:02:01] yeah, that's not easy
[11:02:26] a) it assumes that there's a 1-to1 relationship between the page element and a theme function
[11:02:33] admittedly i did have to hack core to accomplish it
[11:02:52] also my assumtpions are based on d6
[11:04:07] b) for more common theme functions, it means a lot more context checking than is strictly necessary in order to determine a caching strategy
[11:04:14] more php execution time--
[11:04:49] c) it means a whole lot of a PITA to actually figure out what all the theme functions are in the first place
[11:04:56] you dont check caching strategy on reads, only on writes
[11:05:09] cache to write is separate from cache to read
[11:05:36] the only connection point is ESI/SSI uri constructed that is verbose enough to get the right data
[11:05:41] it'd never be sufficient to do theme('block') only on reads
[11:05:43] (for example)
[11:06:01] you dont do that
[11:06:02] yes, right
[11:06:15] you wrap around theme('block')
[11:06:20] instead of print $output
[11:06:29] you do cache_set($cid, $output);
[11:06:33] you need to have an ESI/SSI which is specific enough, on reads, to make sure the frontend cache grabs the right one
[11:06:35] yeah, i got that
[11:06:39] print "$esi_include$cid"
[11:06:40] k
[11:06:40] my point wasn't calling theme('block')
[11:06:56] k
[11:07:06] my point was that you need to have a whole variety of caching strategies for blocks
[11:07:25] that are sensitive to both the block being shown, and context
[11:07:31] permutation of (user|role|page|all) ??
[11:07:52] or more.
[11:07:59] yep
[11:08:20] details of which i've not figured out yet ;-p.
[11:08:23] -*- sdboyer grins
[11:08:47] we could, for example, offer an extensive admin UI with all possible context per region where a clueless admin can select stuff
[11:08:54] ...like a panel :P
[11:09:14] but then again, if you are going at lengths building varnish+esi+drupal+foobar then first of all you will not have those extra modules enabled
[11:09:33] you may, but in some weird setup to cutback on performance drain
[11:09:42] ahh right i forgot
[11:09:46] hmm that sounded deragotory
[11:09:47] i didnt mean it
[11:09:52] i haven't finished+published my panels results
[11:09:57] to dispell the illusion that panels is a performance drain
[11:09:58] oh! do tell
[11:10:05] -*- litwol|mac drools
[11:10:07] hurry
[11:10:08] lol
[11:10:09] :)
[11:10:43] actually panels with proper caching setup is rather fast
[11:10:47] short version is that the overall performance hit is not especially significant for very simple cases - 10-15%, as of the tests i ran quite a while ago
[11:10:47] yup
[11:11:02] i used it on one site and it improved perofrmance because it was all managed by one tool
[11:11:07] but with more sophisticated, granularized caching strategies it's just completely non-comparable
[11:11:08] exactly
[11:11:35] i'm really not arguing that the approach you're taking wouldn't work, per se - just that there's a degree of complexity that arises with configuring more and more complex pieces of content
[11:12:00] sdboyer: if ou dont mind, please add a comment to my writeup listing (as many as you can think of) caching strategies/dependencies
[11:12:02] and that the questions you're asking when you design a panel are fundamentally the same ones that you have to figure out in order to take the theme() approach
[11:12:24] and given that panels has a pluggable caching backend... :) that's my only real point
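The write path discussed in the log above (store the rendered element under a cache ID, emit an include directive instead of the markup, and re-render only when the element's context changes) can be sketched as follows. This is a Python illustration, not the Drupal 6 core hack mentioned in the conversation; the function names, the `seen` bookkeeping dict, and the `/fragment/` URI prefix are assumptions made for the example. The directive uses nginx SSI include syntax.

```python
import hashlib
import json


def context_hash(context):
    """Stable digest of a context dict, independent of key order."""
    payload = json.dumps(context, sort_keys=True).encode()
    return hashlib.sha1(payload).hexdigest()


def write_block(cid, context, render, store, seen):
    """Cache the rendered block under cid and return an include directive.

    The block is re-rendered only if its context differs from the context
    recorded at the previous write (tracked in the hypothetical `seen` dict).
    """
    digest = context_hash(context)
    if seen.get(cid) != digest:
        store[cid] = render(context)
        seen[cid] = digest
    # Hypothetical URI layout; the front end resolves it against the cache.
    return f'<!--# include virtual="/fragment/{cid}" -->'


store, seen, renders = {}, {}, []


def render_inbox(context):
    """Stand-in for an expensive theme/render call."""
    renders.append(context["unread"])
    return f"<p>{context['unread']} unread</p>"


m1 = write_block("inbox-42", {"uid": 42, "unread": 3}, render_inbox, store, seen)
m2 = write_block("inbox-42", {"unread": 3, "uid": 42}, render_inbox, store, seen)
m3 = write_block("inbox-42", {"uid": 42, "unread": 4}, render_inbox, store, seen)
```

The second call reuses the cached markup because the context is unchanged; the third re-renders because the unread count moved. The strategy check happens only on writes, while reads depend solely on the directive being specific enough to select the right fragment.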


LiTWoL © Oleg Terenchuk - Hosted on Linode.com 360
