Author: Kapil Viren Ahuja

Don’t Like Throttling?

You don’t have a choice – the underlying system (the JVM here) will do it for you.

I still recall the summer of 2013, when I was running a project and a single URL in my whole application brought the servers down. The problem was simple – a bot decided to index our site at a very high rate, and it was creating millions of URL combinations that bypassed my entire caching layer and hit my application servers directly. We normally had a very high cache hit rate (95% or so), and the application server layer was not designed for a high load (it was Adobe AEM 5.6, and the logic to run searches and build pages was computationally very heavy). Earlier that year we had wanted to handle the dog-pile effect, and we had spoken about having some sort of throttling in place. At the start of that conversation everyone (except 2 people) frowned upon the idea of throttling.

In the fall of 2012 Ravi Pal had suggested having error handling in place so that a system does not just fall on its head but degrades gracefully. I only realized the gravity of his suggestion when we hit this problem in 2013.

Now I am working on yet another platform, and the minute I bring up the idea of throttling, it’s being frowned upon again. One guy actually laughed at me in a meeting. Another person suggested that we handle the scenario with “auto-scale” instead of throttling. We have our infrastructure on the AWS cloud, and I am no expert, but the experts tell me a server can be replicated as-is in around 10 minutes (we will be benchmarking this very soon).

I was an ambitious architect who thought I controlled the traffic coming to my site. I no longer live in that illusion.

This may become a series of posts, but today I start off by showing that you do not have a choice: whether you like it or not, the system will throttle your traffic for you.

The Benchmark Overview

  • A simple Web application built using Spring Boot
  • A Spring MVC REST controller that accepts HTTP requests and sends back an OK response after an induced delay
  • JMeter to simulate the load
  • A custom plugin (a big shoutout to these guys for the plugin) to generate stepped load and capture custom enhanced graphs
  • Tomcat 8.x to host the web site – launched in memory using Spring Boot. No customizations done

First Group – The Good One

Test Plan

This thread group simulates a consistent stream of requests to our application server, a typical scenario that happens very often.

[Image: Throttling - thread group - the good one]

Server Performance

As Expected? Yes.

As you can see below, the chart shows the application server behaving normally. All the requests over a period of 15 minutes are consistent with a “single user model”, aka a 1 second response time.

[Image: throttle - the good one - TPS - Scenario 1]

Second Group – The Sudden High Traffic

Test Plan

This test plan is a stepped approach and it’s trying to simulate a scenario where a campaign will start hitting a certain page (or set of pages) for a short duration. This is a use case we see most often in the industry where our websites are open to the whole world to hit.

This thread group is not available out of the box (OOTB); I downloaded a plugin for it.

[Image: throttle - the high traffic one - test plan]

Server Performance

So what do we expect to happen? Depending on how much juice my server has (threads, CPU cycles, etc.), it may or may not be able to handle the requests. Given that I am running everything on my local laptop, it will be interesting to see if my local box can handle 600 threads.

[Image: throttle - the high traffic one - TPS - Scenario 1]

And we see that my laptop can’t really handle 600 threads. So what does Tomcat do?

It Throttles
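
To make the idea concrete, here is a minimal sketch of the same behaviour using a plain java.util.concurrent thread pool. It is an analogy, not Tomcat’s actual code: Tomcat has a bounded pool of worker threads and a bounded backlog for waiting connections (the maxThreads and acceptCount connector settings, which default to 200 and 100 in Tomcat 8 as far as I know), and once those are exhausted new requests wait or fail. The class name, pool sizes and request counts below are made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: a bounded worker pool standing in for a server's request threads.
class BoundedPoolDemo {

    /** Fires `requests` slow tasks at once and returns how many the pool refused. */
    static int countRejected(int workers, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize)); // bounded queue, default AbortPolicy
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                // Every "request" stays busy until we release the latch.
                pool.execute(() -> awaitQuietly(release));
            } catch (RejectedExecutionException e) {
                rejected++; // no free worker and no room in the queue
            }
        }
        release.countDown(); // let the accepted requests finish
        pool.shutdown();
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // 2 workers + 2 queue slots, 10 simultaneous requests
        System.out.println("rejected = " + countRejected(2, 2, 10)); // prints "rejected = 6"
    }
}
```

With 2 workers and a queue of 2, only 4 of the 10 simultaneous requests get in and the other 6 are turned away. Nobody configured any throttling here; the bounded resources simply did it.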

How the Good One’s Behavior Changes

Test Plan

I run the 1st test plan and follow it up with the high traffic plan (introducing a 30 second delay).

Impact

The following image shows how the Good One has been impacted. While the traffic for The Good One has not changed a bit, it has still been impacted because something else introduced a spike.

Please go and tell the JVM that you do not like throttling

[Image: throttle - the good one - TPS - Scenario 2]

So What’s Next

You really have 3 choices (we will look into the details of each in separate posts):

  1. Auto-scale the application servers and hope that the new servers are ready in time to handle the load or;
  2. Do something about throttling and control your destiny – what if the high traffic is not a revenue generator and the Good One is?
  3. Continue to frown upon Throttling
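
As a rough illustration of choice 2, explicit throttling can be as small as a semaphore placed in front of the expensive work. This is only a sketch under assumptions: the Throttle class, the permit count and the response strings are hypothetical, and a real setup would more likely live in a servlet filter or at the load balancer.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper: caps how many callers may run a piece of work at the same time.
class Throttle {
    private final Semaphore permits;

    Throttle(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the work if a permit is free; otherwise answers with the fallback right away. */
    <T> T call(Supplier<T> work, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get(); // shed the load instead of queueing it
        }
        try {
            return work.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        Throttle campaignPages = new Throttle(1);

        // A lone request gets through.
        System.out.println(campaignPages.call(() -> "200 OK", () -> "429 Too Many Requests"));

        // A second request arriving while the first still holds the only permit is shed.
        Supplier<String> second = () -> campaignPages.call(() -> "200 OK", () -> "429 Too Many Requests");
        System.out.println(campaignPages.call(second, () -> "429 Too Many Requests"));
    }
}
```

The point of the design is that you decide which traffic gets shed, for example by giving the campaign pages their own small Throttle so that they cannot starve the Good One.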

Are annotations bad?

I eased into this topic with my principles in my post about Spring XML vs. Annotations the other day. That easy inlet was also my way of not complicating things too much for my team, which is currently writing this new app that will probably have a production life-span of 3-5 years (if we do it right, and hope the world of technology does not turn on its head).

I have been working with Spring since the 1.1 days, so yes, I have a level of comfort working with very large and complex XMLs. I know how to write them and, more importantly, I know how to read them. Since then Spring has made it easier for developers to understand them – Spring STS with its Beans Explorer/Graph. Developers no longer need to worry about looking at multiple XML files; those tools do the job for them, even writing and managing beans for them.

We sacrifice the art of writing good and performant code for the short term gains of improving developer productivity

Ever since I saw Spring 3.x introduce this notion of annotation-based configuration, the hype train for using annotations instead of XML has been huge, for at least 7 years (if I remember correctly). I have not been able to make peace with this change in direction. I am not saying it’s bad, but the point is that this feature has been nothing but abused by the community to its core, and Spring has been guilty of promoting the abuse. Any Spring documentation today talks about annotation-style coding first, only to follow with the “classic XML way” of doing things.

While people say it’s easier to read the code and easier to debug it with annotations in the mix, they forget that it’s not just code in code anymore – they have embedded configuration in code. And as far as I remember, configuration was supposed to be externalized. The problem is more severe in cases where we use ORM frameworks like Hibernate and JPA.

Even with the original Spring design, even with XML, I feel that how we set up Spring applications is not what Spring was designed for. It’s time for me to go find out what Rod Johnson had in mind when he designed Spring (I know a bit, but I need to find some details and get into depth). But that’s for another day.

So let’s look at this blog post that explains using JPA with Spring, or read this Stack Overflow thread. They both explain how to use it, but very soon we realize that by using these so-called rich annotation-based configurations in code we have diluted the overall meaning of what code/design is supposed to be. This style of programming is great when I want to get a personal pet project off the ground quickly – I can just write a class, type a few annotations and boom, I am ready to do CRUD. But does this really work in enterprise-level applications, and especially, how do we manage this in production?

These articles are nothing but a bunch of marketing/sales pitches that want us to go use these frameworks and new features, but they hardly put in context the complex situations we have to deal with in big production systems

In 2007, we used Hibernate extensively on our project (with Spring 2.x and XML-based configuration), and we realized very soon that we had taken the ORM framework beyond its limits. We had complex queries which we were trying to retrofit into Hibernate, and things that were possible to write in MS-SQL as optimized procedures and just fire away were now becoming major bottlenecks. I was new to the framework but, more importantly, I had a push from my technical leadership to use Hibernate to its fullest. Those people had access to articles like the ones I shared earlier, and this looked like the way to go, but the articles were nothing but marketing material selling a feature that Hibernate and ORM brought to the table. When the rubber hit the road, I had to go back, refactor the code and follow the good old ways of writing queries.

90% of the time these frameworks that use annotations work well, but the other 10%, where you need your system to perform under stress, is EXACTLY when they fail

Backtracking to Spring and annotations now – why do I not like them? Simply because they make me write code like I am a college student who is learning something. They force me away from what used to be good practices in the golden old days. Yes, it used to take time to set up a bunch of classes, and it used to take time to write the SQL queries, but I had the right stuff in the right places. And yes, it took time before we gathered momentum, but once we had those basics set up tight, not only could we develop at speed, we had also done things the right way.

And yes, no one can force us, but the average Joe developer or the average Jim architect does not have the time or inclination to form these POVs; they do a Google search, and when they see 5 articles saying the same thing, they presume it’s the right thing to do and proceed happily. And many of our senior technologists, who also read these articles, support those designs and many a time challenge the POV I am trying to put forward here.

TL;DR

Think about it, and please do not use annotations to configure your applications. Configurations were never meant to be part of code – that is the reason they are called configurations. So let’s let those be. A small gain in the short term won’t go a long way, especially when a client asks for a change in a table or a value and you tell him that it will need 5 days of development, testing and deployment.

Making Thread Dumps Intelligent

A long time back I learnt about something called Log MDC, and I became a big fan of it. I was suddenly able to make sense of anything that happened in the log files, pin-point a specific log entry and find what was right or wrong with it, especially when debugging a bug in production.

In 2013 I was commissioned to work on a project that was running through some troubled waters (a combination of several things), and almost every week I had to go through several Java thread dumps trying to make sense of what was happening in the application to make it stop. There were also times when I had profilers like AppDynamics, JProfiler and JConsole all hooked up to the application, trying to find what the issue was and, more importantly, what was triggering it. jstack was one of the most helpful tools I had worked with, but the thread dumps, being dumps, had no contextual information that I could work with. I was stuck looking at tens of dumps with stack traces showing which classes were causing the block, but with no information about which call and which inputs were causing the issues, and it got frustrating very fast. Eventually we found the issues, but mostly after several rounds of deep debugging of the code with a variety of data sets.

Once I was done with that project I swore that I would never find myself in that situation again. I explored ways in which I could use something similar to Log4j’s NDC, but on threads, so that my dumps would mean something. And I found that I could change the thread name. On my next project I used that very effectively. I recently came across an article that explains the concept very well. I am not going to rewrite everything they said, so here is a link to their blog post.

So last week I started a new project, and as I get into coding the framework (using Spring 4.1 and Spring Boot), this is the first class I am writing for the application, ensuring that the filter gets into the code ASAP. It not only helps us post-production but also makes my development logs meaningful.


A copy of the code for both Log4j NDC, and setting up a ThreadName is below.
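
A minimal sketch of the thread-naming idea could look like the following. It is an illustration under assumptions rather than the actual filter: the class name ThreadContext, the helper runWithContext and the context string are all hypothetical, and the Log4j NDC/MDC half is left out because it needs the logging library on the classpath.

```java
// Hypothetical helper: tags the current thread with request context for the duration of a task.
class ThreadContext {

    /** Appends the context to the thread name, runs the task, then restores the old name. */
    static void runWithContext(String context, Runnable task) {
        Thread current = Thread.currentThread();
        String originalName = current.getName();
        current.setName(originalName + " | " + context);
        try {
            task.run();
        } finally {
            // Pooled threads are reused, so always put the original name back.
            current.setName(originalName);
        }
    }

    public static void main(String[] args) {
        runWithContext("GET /products?q=shoes", () ->
                System.out.println("Handling on: " + Thread.currentThread().getName()));
        System.out.println("Afterwards:  " + Thread.currentThread().getName());
    }
}
```

In a web application the same wrapping would typically sit in a servlet filter around the rest of the filter chain, so that a thread dump taken mid-request shows which URL each busy thread was serving.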

Spring Framework – XML vs. Annotations

This question has been around for many years, ever since Spring started to move heavily towards annotation-based configuration (if I recall right, the idea is called convention over configuration). Annotation-based configuration spread across the industry like wildfire, and very soon it was the norm. But the question “XML vs. Annotations” has always existed.

I for one have been around in the Spring world since version 1.1, when annotations weren’t a thing, and I know what it is to write that XML and the power it gives to configure an application to suit my needs. Since then, whenever I have gone about writing an application in Spring I have asked myself this question, and I never really had a good answer until recently. While you will find tons of posts when you search Google for this, only a few really give you an unbiased opinion.

I started to work on an application that needs some very flexible configuration options and before I dive into that I had to yet again make this decision and this time I wanted to keep things simple and my rationale was…

Use annotations for anything that is core to the application and defines its core structure. Anything that would need a code change anyway is okay to sit as an annotation.

Use XML based configurations when you know you may have a need to alter the behavior of an application without the need of compiling and deploying the code all over again.

This is how simple I kept it for my team. Once this principle is defined, the job is only half done. But we will get there soon.
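
The split between the two rules can be sketched in plain Java. To keep the example self-contained it swaps Spring XML for a java.util.Properties source, so it shows the principle rather than a Spring setup; the SearchSettings class and the search.pageSize and search.cacheEnabled keys are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical settings class: structure lives in code, tunable behaviour comes from outside.
class SearchSettings {
    // Core structure: changing this needs a code change anyway, so it can sit in code.
    static final String SEARCH_PATH = "/search";

    // Behaviour we may want to alter in production without compiling and deploying again.
    final int pageSize;
    final boolean cacheEnabled;

    SearchSettings(Properties external) {
        this.pageSize = Integer.parseInt(external.getProperty("search.pageSize", "20"));
        this.cacheEnabled = Boolean.parseBoolean(external.getProperty("search.cacheEnabled", "true"));
    }

    /** Parses externalized settings; in real life the text would come from a file outside the jar. */
    static SearchSettings parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read settings", e);
        }
        return new SearchSettings(props);
    }

    public static void main(String[] args) {
        SearchSettings s = parse("search.pageSize=50\nsearch.cacheEnabled=false\n");
        System.out.println(SEARCH_PATH + " pageSize=" + s.pageSize + " cache=" + s.cacheEnabled);
    }
}
```

Changing the page size here is an edit to a text file and a restart, while changing the search path is a code change, which matches the rule of thumb above.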

It’s all about laying blame

I heard this in House MD

Good things happen often, Bad things happen sometimes.

If you are talented and skilled you will try things that others are afraid to try, and because you are going to live dangerously there will be times when you will fail. What you did and what the outcome is are 2 different things. You can fail or succeed, but that will have no bearing on whether you are right or wrong. So next time you have to make a tough decision where the chances of failing are high, don’t be afraid – do the right thing.

There will be some who will try to lay blame but eventually you will see – people will realize that your process works because you succeed and disrupting your process makes you ineffective and no one wants to do that.

Motivated or Skilled?

It’s interesting how, when we have to select a bunch of people to be on our projects, we immediately latch onto the skilled ones whether or not they are motivated, and the average guys are the first ones to go even if they have a vested interest in our success (as they succeed along with us).

Next time you have to make a choice about keeping someone on the team, pick the one who will do anything to make you successful, and you will be surprised what that person will do to gather all the skills needed. They will do everything and a lot more to make themselves successful, and in the process you will see success beyond your wildest imagination. You just need to provide them some mentorship – that’s all.

Alternatively, keep finding ways to motivate people but don’t forget The Paradox of Rising Expectations.

Just Write away…

I have been meaning to write a few articles for “let’s just say someone”, but that publication will mean a bunch of reviews and approvals. I get the fact that I am getting published under another brand and they need to make sure the article fits the bill (if not represents it). Having said that, the person reviewing my article may have a very different opinion on the topic, and that difference of opinion will impact how my article is reviewed and whether it gets selected or not.

This article is my POV and what I want to tell the readers, not what the reviewer thinks and what he wants to get out. The fact that this is not just an “edit” of punctuation and language but of my thoughts is my biggest deterrent to writing for any other publication, and I realized that those 2 articles have been sitting in draft mode for 18 months now. It just gets “uuggghhh…”

So my advice to myself here and to all of you out there – “just write away.. find a way where you just get to state what you think // don’t get stuck in that review trap”.

Unit Testing in AEM (thinking loud)

This is not a recommendation of any sort but a culmination of ideas and a few options that are available to us if we want to do unit testing within AEM. I had done some research for a client some time back and this article is largely influenced by that work, but a lot of contextual stuff has been pulled out. I have still tried my best to ensure that the article holds its essence. I will try to do a follow-up soon with a lot more details.

Option 1: Use Sling tools and test in-container

Apache Sling has released a set of tools http://sling.apache.org/documentation/development/sling-testing-tools.html which can assist unit testing in the application. These tools offer several ways of testing: a) good old JUnit tests where there are no external dependencies, b) mocks – Sling provides ready-made mocks which reduce the effort, or c) deploying the test cases in a CQ box (or Sling) and running them using OSGi references.

The approach I am recommending here is to deploy JUnit tests in an already hosted CQ instance and invoke the test cases remotely. I understand that this is not “old school unit testing”, as I am not abstracting any dependencies and my units include their dependencies, but I have a reason for doing that. As a matter of fact, if you have been following my writings on unit testing you would know that I am not a big fan of mocking and am actually very happy to do unit testing against real dependencies if I can set them up.

 To do this we need a few things to happen as follows:

  1. We will need to have a hosted CQ instance that can be used as a container for running test cases
    1. We can use embedded systems but then we will have to spend additional effort creating content and what not. Also the embedded container will be sling and not CQ and we would like to keep the environment as close to what we use as possible
  2. The CQ instance should have a pre-populated set of products and images (this setup uses the AEM eCommerce module, and PIM and DAM have been integrated with external systems), which acts as ready-made test data for us. This can be achieved using our backend integrations. We can choose to do it independently or automatically (automation of these things can also happen over time, to allow us to start quickly)
  3. For interactions with any backend services (like Order Management, Pricing, Account information), we would need to have a backend service instance running (as I said, I prefer systems over mocks if possible) with all the variables and pieces set up. This instance should also have various data set up, like user accounts, product instances, availability, prices, etc., to ensure our use cases work. There are obvious challenges in setting up independent backend services, and we can explore one of the following 2 options:
    1. Capture all requests and responses for a certain request type and serialize those into a test-data store. It can be a huge XML that we can store in a key-value pair sort of a system – can be a database like mongo (even SQL would do) or we can serialize on file system or;
    2. We can use an already existing backend system
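
The capture-and-replay option (the first of the 2 sub-options above) can be sketched as a tiny key-value store. This is a minimal in-memory illustration under assumptions: the RecordReplayStore class, the key format and the sample XML payloads are hypothetical, and persisting the map to MongoDB, SQL or the file system is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical test-data store: records real backend responses once, replays them in tests.
class RecordReplayStore {
    private final Map<String, String> recorded = new LinkedHashMap<>();

    private static String key(String requestType, String payload) {
        return requestType + "::" + payload;
    }

    /** Record mode: call the real backend and remember what it answered. */
    String record(String requestType, String payload, Function<String, String> backend) {
        String response = backend.apply(payload);
        recorded.put(key(requestType, payload), response);
        return response;
    }

    /** Replay mode: serve the stored response without touching the backend. */
    String replay(String requestType, String payload) {
        String response = recorded.get(key(requestType, payload));
        if (response == null) {
            throw new IllegalStateException("No recording for " + key(requestType, payload));
        }
        return response;
    }

    public static void main(String[] args) {
        RecordReplayStore store = new RecordReplayStore();
        // Stand-in for a real pricing service call.
        store.record("pricing", "<sku>123</sku>", request -> "<price>9.99</price>");
        System.out.println(store.replay("pricing", "<sku>123</sku>"));
    }
}
```

Failing loudly on a missing recording is deliberate: a test that silently falls through to a live backend would defeat the purpose of having stable test data.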

Option 2: Use Selenium as the functional testing tool

In this approach I am recommending not to use JUnit tests at all. The idea is to use the philosophy of system testing, which can exercise all of the units in your code. This is a big departure from the traditional way of unit testing, where all dependencies are mocked out and we can run several tests quickly. While Option 1 is to the same effect, in this approach we go a step further and leverage our system test suites. The idea is not to do this for every single use case, but to pick business-critical functions like checkout, order management and account management and automate those. The Selenium scripts can then be driven by a JUnit runner, so that they can be run from Eclipse or Maven and hence integrated with CI tools. This saves us the time of writing those JUnit tests and managing a whole suite independently. This approach also needs a hosted CQ instance with product data, some content and backend integrations set up, just like in Option 1.

Of course this is a bit tricky and not really unit testing, but it has some huge upsides if done right.

Don’t be afraid to course correct

When driving a car, if you sense a flat tyre, do you stop and fix it, or do you just keep driving at a sluggish pace, which will eventually lead to an accident?

I guess we all know what we will do! So when we see a problem on a project why don’t we take the time and fix it? Why do we get afraid of telling stakeholders that something has gone wrong and we need to course correct which might lead to an increased cost or a bit of a delay in schedule?

I am sure the counter argument would be – we do fix it or inform it. I am sure if you introspect enough you will realize that we do one of the following:

  1. It’s trivial and we can deal with it ourselves – this leads to no communication, and eventually enough trivial(s) stack up that cost or schedule goes wrong or;
  2. It’s an engine gone wrong, the cost is already huge, and I will be blamed or it will taint my reputation as someone unable to handle it well, so let me fix it myself

Either way you will find yourself screwed compared to if you had just told the stakeholders, who will more often than not be able to share some ideas on how to fix the problem faster. You may still end up bearing the cost (which you do anyway), but your journey towards delivering the project will be so much more relaxed, and I can almost guarantee your relationships with the stakeholders will be a lot more trusting.

Start by getting familiar

The level of confidence a team can have about the number of functional defects a client will find in UAT is directly related to how much the team knows of the application they are building. You can write any number of scripts (manual or automated), but unless you put yourself in the position of a user of the site you will never know if what you are testing will meet the client’s end requirements.

So at some point, get to know the application upfront, even before you write a single line of code or test script, or find someone who does. It’s no different from how our mind works when we travel from home to office or to a new place – either we know where we are going and our eyes and brain give us immediate feedback if we go wrong, or we need something like Google Maps to tell us (even though it is a bit delayed).

This is a step that you need to take to be able to deliver quality on any project.