Building a scalable application? So am I. One of the mental conflicts that I keep having is that all of the layers that we add to an application for scalability -- especially a Web application -- hurt the performance of the application!

An increasingly common pattern is to have the client talk to a presentation layer, which talks to a business logic layer, which talks to a data access layer, which talks to the database. This makes sense. And for the sake of interoperability and reusability of each layer for future growth, we end up doing it through XML at the very least, or SOAP, or something similar (anyone remember CORBA?). It all makes sense -- so let's add some more layers to this. After all, if the app ever gets enough users, we are going to need some sort of state sharing within each of those layers, so let's add a clustering technology at each layer. With all of these concurrent requests flying around, we need to be able to maintain state, session-level and application-level locks, and so on between servers.

Let's face it, if we are thinking about a system big enough to have to worry about being able to add business logic processing servers without having to add presentation logic servers (they will scale at different rates), then it is an application big enough and important enough to need some redundancy, failover, load balancing, and more. Which means clustering with shared "stuff".

So now we have this monstrosity of an architecture, and tracing an error involves two or three XML transactions, three (at a minimum) logical servers (they may all be on the same hardware), and a big dose of oversight software.

What a mess. It is so easy to say: "Hey, let's just have the application logic and presentation logic and data access logic all in the same layer, and if the box goes down, oh well, sessions will be lost". But we live in a world where customers get sold contracts with "five nines" and SLAs and performance metrics and the works. So our applications need to be fast, scalable, and robust. And "scalable" and "robust" run counter to "fast."

The disaster of Java frameworks is a great example. Build an app on a popular Java application server, and start piling on the thousand and one different frameworks out there: Spring, Struts, Faces, and all of those other Java words. Now, do something dumb like cause a NullPointerException on purpose, and examine the stack trace. That null value went through 80+ layers before it got tossed as an exception. I am not picking on where the exception occurred; what I am pointing out is just how much overhead went into handling that variable. Do the same thing in a little Perl or PHP or Ruby script, and the variable goes through only a few functions before it hits the runtime exception level.
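To make the stack-depth point concrete, here is a minimal sketch in Python rather than Java. It does not use any real framework: `handler`, `wrap`, `build_stack`, and `frames_at_failure` are hypothetical names, and each "layer" is just a pass-through function standing in for a framework that forwards the call. The sketch counts how many frames appear in the traceback when the innermost function fails on a null value.

```python
import traceback


def handler(value):
    # Fails when value is None, much like dereferencing a null reference.
    return value.upper()


def wrap(inner):
    # One hypothetical framework layer that only forwards the call.
    def layer(value):
        return inner(value)
    return layer


def build_stack(layers):
    # Pile the requested number of pass-through layers on top of the handler.
    fn = handler
    for _ in range(layers):
        fn = wrap(fn)
    return fn


def frames_at_failure(layers):
    # Trigger the failure and count the frames recorded in the traceback.
    try:
        build_stack(layers)(None)
    except AttributeError as exc:
        return len(traceback.extract_tb(exc.__traceback__))
    return 0
```

Comparing `frames_at_failure(80)` with `frames_at_failure(0)` shows 80 extra frames, one per layer, even though none of the layers did any work on the value.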

This kind of thing is why .NET and Java will never be as fast as native code: the native code writer will rarely have more than a few layers between the OS and the application. These massive frameworks are so abstracted in their "kitchen sink" effort that they have 50 to 150 copies of the data sitting in the stack before it ever gets to where it is going.

This has been a bit of a digression, but it illustrates my point.

You see the exact same thing in an n-tiered Web application built for scalability, robustness, and interoperability plus future expansion. Let's follow an imaginary client request to insert a new record into the database:

1. The load balanced HTTP server receives the request and parses it just enough to decide which application server cluster it needs to go to.

2. The application cluster receives it, and then starts to process it at a presentation layer level. It "sees" the request to insert a record, and creates a SOAP request to the business logic layer to handle it.

3. The business logic cluster gets the request, and needs to check to make sure that there are no other requests trying to insert a duplicate row (maybe a double click on the "Submit" button?). So it polls itself and all of the other cluster members with some shared memory system to see if they are doing the same thing. If it does not see that happening, it sticks a note into the shared memory for "heads up, I'm doing XYZ right now, so please do not do it." The business logic layer creates a SOAP request to the data access layer to insert the row.

4. The data access layer gets the request, opens it up, figures out which stored procedure on the database layer to use, and calls the stored procedure over a TCP/IP pipe.

5. The database gets the request, runs the stored procedure, and reports success to the caller.

6. The data access layer receives the status, and reports success via a response to the SOAP request.

7. The business logic layer receives a success notification, reports to the shared memory that it is done, maybe does a few more things, and returns a success message to presentation.

8. Presentation selects an "A-OK!" message, and passes it back out.

And that is a really simplified version that leaves out message queuing and such!
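The steps above can be sketched in a few dozen lines of Python. This is an illustration under loud assumptions, not a real deployment: everything runs in one process, JSON stands in for SOAP, an in-memory set guarded by a lock stands in for the cluster's shared memory, and a list stands in for the database. All of the names (`InFlightRegistry`, `Trace`, `handle_request`, and so on) are hypothetical. The point is only to count how many times one insert crosses a layer boundary.

```python
import json
import threading


class InFlightRegistry:
    """Hypothetical stand-in for the shared memory consulted in step 3."""

    def __init__(self):
        self._lock = threading.Lock()
        self._claimed = set()

    def claim(self, key):
        # Returns True only for the first caller that claims this key.
        with self._lock:
            if key in self._claimed:
                return False
            self._claimed.add(key)
            return True

    def release(self, key):
        with self._lock:
            self._claimed.discard(key)


class Trace:
    """Records every layer boundary a message crosses."""

    def __init__(self):
        self.hops = []

    def send(self, boundary, message):
        # Serialize and re-parse to mimic a SOAP/XML hop (JSON is a stand-in).
        self.hops.append(boundary)
        return json.loads(json.dumps(message))


def database(table, call):
    # Step 5: run the "stored procedure" and report success.
    table.append(call["args"])
    return {"status": "ok"}


def data_access(trace, table, request):
    # Step 4: choose the stored procedure and call the database.
    call = {"proc": "insert_record", "args": request["record"]}
    result = database(table, trace.send("data access -> database", call))
    # Step 6: pass the status back up.
    return trace.send("database -> data access", result)


def business_logic(trace, registry, table, request):
    # Step 3: refuse to run if an identical insert is already in flight.
    key = json.dumps(request["record"], sort_keys=True)
    if not registry.claim(key):
        return {"status": "duplicate"}
    try:
        request = trace.send("business logic -> data access", request)
        reply = data_access(trace, table, request)
        # Step 7: receive the result from data access.
        return trace.send("data access -> business logic", reply)
    finally:
        registry.release(key)


def presentation(trace, registry, table, form):
    # Step 2: turn the form into a request for the business logic layer.
    request = trace.send("presentation -> business logic",
                         {"action": "insert", "record": form})
    reply = business_logic(trace, registry, table, request)
    reply = trace.send("business logic -> presentation", reply)
    # Step 8: pick the message for the user.
    return "A-OK!" if reply["status"] == "ok" else "Already in progress"


def handle_request(form):
    trace, registry, table = Trace(), InFlightRegistry(), []
    # Step 1: the load balancer routes the request to an application server.
    form = trace.send("load balancer -> presentation", form)
    return presentation(trace, registry, table, form), trace.hops, table
```

Calling `handle_request({"name": "Ada"})` returns the user-facing message, the list of hops, and the table contents. In this sketch a single insert crosses seven boundaries, each with its own serialize-and-parse round trip, and that is before any real network latency is involved.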

You can see how all of that scalability and robustness kills performance. After all, it takes eight steps to insert a row. This is a far cry from an old ASP, JSP, PHP, or Perl script that just works directly with the database, all within the same process, in the same thread of the same server. That script is going to consume little memory, keep a rather small call stack, and have no worries about interserver communications, network latency, and so on. The n-tiered version will have a huge spaghetti network of bytes and bits.
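For contrast, here is a rough sketch of the single-process style, written in Python with the standard `sqlite3` module instead of ASP, JSP, PHP, or Perl. The table name, the `UNIQUE` constraint, and the function names are assumptions made for illustration. Presentation, business logic, and data access all live in one function, and the duplicate check is left to the database itself.

```python
import sqlite3


def open_database():
    # An in-memory database keeps the sketch self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    return conn


def insert_record(conn, name):
    # One in-process call: no SOAP, no cluster, no shared-memory note.
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO records (name) VALUES (?)", (name,))
    except sqlite3.IntegrityError:
        # The UNIQUE constraint catches the double-click duplicate.
        return "Duplicate"
    return "A-OK!"
```

The trade-off is exactly the one described above: this version has almost nothing between the request and the row, but if the box goes down, everything goes down with it.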

But I will say this: when they are done right, n-tiered apps do scale well in terms of being able to add resources only where needed, they are robust, and they are capable of "five nines". But it all comes at the expense of greatly increased complexity and a large amount of resources per request. The only other option out there, as far as I can see, is a mainframe environment where you keep it all on the same monster box instead.

Choose your poison carefully.

This was published in Interpreting Java; check every Tuesday for more stories.

Comments

1

Marian - 28/03/08

It should be said that that's the price paid for smooth load balancing and, no less importantly, the failover capabilities. Without all that slow, complex, standards-following, but easy-to-program processing, the cost of development in both time and money becomes high. Buying more machines for production often turns out to be much cheaper and less risky. It's a strange world in the eyes of a good computer programmer, trained to think of effective and fast-working solutions, but that's what programming is turning into.
