Sunday, June 23, 2013

Setting up Dagger on Eclipse for Google App Engine

Dagger is a dependency injection framework similar to Guice, but code can be generated at compile time to wire together dependencies instead of using reflection at run time.  It's written by some of the same guys who created Guice and had problems with its speed when it runs on Android.
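To make the difference concrete, here is a minimal sketch in plain Java of the two wiring styles. It does not use Dagger or Guice at all: `Heater`, `CoffeeMaker`, `ReflectiveInjector` and `CoffeeMakerFactory` are hypothetical names made up for illustration, and the factory is hand-written to show the kind of code a compile-time generator could produce, not what Dagger actually emits.

```java
import java.lang.reflect.Constructor;

// Hypothetical classes for illustration only; none of these names come from Dagger or Guice.
class Heater {
    String describe() { return "heater"; }
}

class CoffeeMaker {
    private final Heater heater;
    CoffeeMaker(Heater heater) { this.heater = heater; }
    String describe() { return "coffee maker with " + heater.describe(); }
}

// Run-time style: look up the constructor reflectively and build each argument recursively.
class ReflectiveInjector {
    static <T> T create(Class<T> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructors()[0];
            Class<?>[] paramTypes = ctor.getParameterTypes();
            Object[] args = new Object[paramTypes.length];
            for (int i = 0; i < paramTypes.length; i++) {
                args[i] = create(paramTypes[i]); // resolve each dependency the same way
            }
            ctor.setAccessible(true);
            return type.cast(ctor.newInstance(args));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not create " + type.getName(), e);
        }
    }
}

// Compile-time style: ordinary constructor calls, with no reflection when the app starts.
class CoffeeMakerFactory {
    static CoffeeMaker create() {
        return new CoffeeMaker(new Heater());
    }
}

class WiringDemo {
    public static void main(String[] args) {
        // Both calls build the same object graph; only the wiring mechanism differs.
        System.out.println(ReflectiveInjector.create(CoffeeMaker.class).describe());
        System.out.println(CoffeeMakerFactory.create().describe());
    }
}
```

The reflective version has to inspect classes every time the app starts, while the factory version is just normal method calls that the compiler has already checked.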

I also have problems with the start-up time of Guice on Google App Engine where "loading requests" can keep a user waiting for your app to get itself up and running.

Dagger promises to shave some precious seconds off this app startup time and probably also make each request faster.  Reflective calls on App Engine can be slow - I don't know how slow but will eventually run some tests to see how much difference Dagger makes.

To get it running in Eclipse you need to modify your project's settings under Java Compiler > Annotation Processing > Factory Path and add the following 4 jar files: dagger-x.x.x.jar, dagger-compiler-x.x.x.jar, javax.inject.jar and javawriter-x.x.x.jar.

You also need to tick "Enable project specific settings" in the parent configuration page for these settings to take effect.

Tuesday, January 15, 2013

What's the real uptime of Google App Engine?

One of the principal reasons for running your website on Google App Engine is its high reliability.  Their engineers carry the pagers for you so when something goes wrong at 4am you can stay in bed while they scramble to sort it out.  That is great in theory but how well do they do their jobs?

The Service Level Agreement promises 99.95% "uptime" and defines compensation if that level is not met.  Downtime is defined by Google as more than 10% errors for the datastore or the serving infrastructure.  Most of the other services, such as the task queue, email, Blobstore and memcache, are not covered at all.  They could go down taking your app with them but this is not considered downtime by the SLA.  Also, when the system is running slowly your sites get penalised by Google's ranking algorithm, but slow responses are not covered in the SLA either.

Last night my application running on GAE had another outage.  These seem to be a lot more frequent than I expected.  Each time something happens they analyse the problem, post a message apologising and describe how they fixed something so it won't happen again ... but then something else happens to take the site off-line.

I monitor the site with Pingdom so I can look at the historical "real uptime" of my app.  This is reported as 99.87% uptime over the past 60 days.  Over this time the app has not been offline due to application failures - only infrastructure failures.

That is slightly below the 99.95% their SLA promises.

Pingdom also allows me to download the historical data as a CSV and analyse it myself.  In the last 60 days they pinged my site 86400 times and it was down 196 times.  That is 0.23% downtime or 99.77% uptime, a little below the figure Pingdom reports.

But what if we add requests that took an unacceptable amount of time to return?  The average request time is about 800ms from the Pingdom servers and anything over 10 seconds is seriously slow.  There were also 270 pings that took longer than 10000ms but were logged as successful.  Counting those as failures would bring the share of "bad requests" to 0.54%, equivalent to 99.46% uptime.  Requests over 5 seconds were more than twice this amount.  Luckily for Google, they do not promise your site will be fast!
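The arithmetic can be reproduced with a few lines of Java. This is only a sketch using the counts quoted above (86400 checks, 196 failures, 270 slow responses). The class and method names are made up for illustration. The "allowed" figure assumes the SLA's 99.95% can be applied directly to the number of checks, which is my simplification rather than how Google measures it.

```java
import java.util.Locale;

// Hypothetical helper for illustration; the counts come from the figures quoted in the post.
class UptimeStats {
    // Share of checks that count as "bad", as a percentage.
    static double badPercent(long badChecks, long totalChecks) {
        return 100.0 * badChecks / totalChecks;
    }

    // Uptime is whatever is left once the bad checks are removed.
    static double uptimePercent(long badChecks, long totalChecks) {
        return 100.0 - badPercent(badChecks, totalChecks);
    }

    // Assumption: apply the SLA percentage directly to the number of checks.
    static double allowedBadChecks(long totalChecks, double slaPercent) {
        return totalChecks * (100.0 - slaPercent) / 100.0;
    }

    static void report(String label, long badChecks, long totalChecks) {
        System.out.printf(Locale.ROOT, "%-22s %.2f%% bad, %.2f%% uptime%n",
                label, badPercent(badChecks, totalChecks), uptimePercent(badChecks, totalChecks));
    }

    public static void main(String[] args) {
        long total = 86400; // checks over 60 days
        long down = 196;    // checks logged as down
        long slow = 270;    // checks over 10000ms that were logged as successful

        report("down only:", down, total);
        report("down + slow (>10s):", down + slow, total);
        System.out.printf(Locale.ROOT, "bad checks allowed at 99.95%%: %.1f%n",
                allowedBadChecks(total, 99.95));
    }
}
```

Changing the slow threshold only means changing the `slow` count, so the same helper works for the 5 second case once you have counted those rows in the CSV.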

All in all, I'm pretty happy with that uptime considering that I do not need to worry about my own infrastructure and software security.  I also have faith that big G will continue to improve and refine their systems or they will lose a lot of business.  I'm probably less sensitive to downtime than a lot of their customers.