Category Archive: Engineering

React and Angular do not fit everywhere

The last few years have been fascinating, with several front-end frameworks coming up. I am not going to compare them, nor talk about the good or the bad in them. The buzzwords and the new technologies are so shiny that almost everyone I speak to today wants to work on the new stuff. They also tend to believe that you can solve everything by just writing a front-end application, making some API calls, and calling it done. The focus of this post is that React, Angular, Vue.js, or anything similar is not a one-size-fits-all framework, and server-side code still has a place in this world.


Cracking the "Laravel vs. CakePHP" debate

I started off this thread to compare two frameworks for my next project, which would have a team of 20 developers. Team experience was a bit skewed towards CakePHP, but experience outside the organization was tilted towards Laravel. The community buzz was definitely biased towards Laravel, but I wasn't sure. Hence, I started to jot down some key points, and here is the showdown.

Neither of the two frameworks is the definitive choice. I realized that both do the job and will fit very well in a certain given scenario.

I decided to start working on Laravel 5.4


The false security of Notifications

The Problem

On one of my recent projects, I was added to a distribution list that receives alerts from our monitoring system. For the first few days I tried to read some of those notifications, but then one day I opened my email and it was flooded with about 500 or so messages. Some of them were more or less the same message coming in every few seconds or minutes. Over the course of the last few weeks, I have been getting 100+ messages every day, most of them while I am asleep.

The most interesting things that came to my notice are:

  • Most of these alerts, while warnings, are not going to bring any of our services down anytime soon. Some of them don't even get actioned upon and they self-recover.
  • The team really only acts on a couple of these; everything else is more of an FYI.
  • Inboxes get flooded during the night when our core support team is asleep, and there is no way for them to know if something is going to fail soon.

It's like jumping into my car and seeing every light on the dashboard brightly lit up every single time – to the point that one day I stop caring. Eventually, something will fail – I just hope it's not on a day when I am driving somewhere in an emergency.

The Symptom

When I reached out to the team and articulated the issue I have with our notification strategy, the prompt response I received was to create a new DL, which I believe will become the go-to list where all notifications go. Yes, I will be receiving fewer emails, and maybe none. And it solves nothing.

This is a big symptom: if you see it in your organization, you should ask whether the team is on top of knowing when something is really going to fail, or whether you are relying on a system that sends everything it sees wrong as a notification and lets a bunch of humans decide what to act upon. Also, you can't avoid the fact that many of these notifications go over a channel that has no way to "push" notify a user of an issue.

Think of a car dashboard with all those lights sitting not in front of the driver, but in the glove box. Someone would have to open the glove box to see whether a light is on or not. The light may be on for hours before someone realizes something has gone wrong.

The Solution

I don't have a technical solution in place for my project yet – it is something I am going to speak to my team about – but the analogy I will leave you all with is this: think of what a notification/alerting system should be like (a rough sketch of the idea follows the list below).

  • Have your car's dashboard light up with a soft green telling you something has happened (like an indicator has been switched on and it's blinking)
    • Green and soft clicking sounds – eventually the driver will see it and turn it off (as was the case with cars in the 1990s with low or no sensitivity indicators). You don't want to alarm the driver – it's not detrimental
  • Have the car's dashboard light up in yellow, like a warning. My car lights up a fuel warning as soon as the level is dangerously low. I can still drive 80–100 km depending on how I drive, but it's more than enough for me to eventually see it and get to a refueling station
  • Have the car's dashboard flash a bright red – like doors open. Well, you won't want to drive your car with the doors or hood open, hence a bright red, sometimes accompanied by a few sounds
  • Or have a sound beep every few seconds – I like how my car alerts me every few seconds when I don't have my seat belt on, or when I drive over 120 km/h. It's like being reminded every 10 seconds that something fatal is going wrong and I can die from it
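
To make the analogy a little more concrete, here is a minimal sketch – the severity levels and channels are hypothetical, not from the project above – of routing alerts by severity instead of emailing everything to a single list:

[code language="java"]
// Hypothetical sketch: route alerts by severity instead of emailing everything to one DL.
public class AlertRouter {

    public enum Severity { INFO, WARNING, CRITICAL }

    public void route(String message, Severity severity) {
        switch (severity) {
            case INFO:
                log(message);             // "green" – record it, wake nobody up
                break;
            case WARNING:
                sendToDashboard(message); // "yellow" – visible soon, but not urgent
                break;
            case CRITICAL:
                pageOnCall(message);      // "bright red / beeping" – push to a human right now
                break;
        }
    }

    private void log(String message) {
        System.out.println("[INFO] " + message);
    }

    private void sendToDashboard(String message) {
        // e.g. post to a metrics/alerting dashboard (implementation omitted)
    }

    private void pageOnCall(String message) {
        // e.g. trigger a pager/SMS/push notification (implementation omitted)
    }
}
[/code]

The point is not the code – it is that the decision of who gets interrupted, and when, is made explicitly rather than left to whoever happens to read the inbox.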

 

How this will translate for me and my project team is something I don't know yet. But as we go about fixing this, I will post it here. What have you done to address your strategy, or is it still all dashboard lights flashing all the time?

 

 

Make Sense out of your Stack Traces

I recently came across this cool tool, http://www.stackifier.com/, which allows you to make so much more sense of your stack traces. Once I have locked onto a stack trace I want to look at, this tool makes it so much easier.

You can also read about it on their blog post – http://blog.takipi.com/stackifier-make-sense-of-your-stack-trace/

Don’t Like Throttling?

You don't have a choice – the underlying system (the JVM here) will do it for you.

I still recall the summer of 2013 when I was running a project and it was one URL in my whole application that brought the servers down. The problem was simple – a bot decided to index our site at a very high rate, creating millions of URL combinations that bypassed our entire caching layer and hit the application servers directly. We had a very high cache-hit rate in the application (95% or so), and the application server layer was not designed for high load (it was Adobe AEM 5.6, and the logic to do searches and build pages was computationally very heavy). Earlier that year we had wanted to handle the dog-pile effect and had spoken about putting some sort of throttling in place. At the start of that conversation everyone frowned upon the idea of throttling (except two people).

In the fall of 2012, Ravi Pal had suggested having error handling in place such that a system should not just fall on its head but degrade gracefully. I only realized the gravity of his suggestion when we hit this problem in 2013.

Now I am here working on yet another platform, and the minute I bring up the idea of throttling, it is frowned upon again. One person actually laughed at me in a meeting. Another suggested that we handle the scenario by "auto-scaling" instead of throttling. We have our infrastructure on AWS, and I am no expert, but the experts tell me a server can be replicated as-is in around 10 minutes (we will be benchmarking this very soon).

I was an ambitious architect who thought I controlled the traffic coming to my site. I no longer live in that illusion.

This may become a series of posts, but today I start off by showing that you do not have a choice: whether you like it or not, the system will throttle your traffic for you.

The Benchmark Overview

  • A simple web application built using Spring Boot
  • A Spring MVC REST controller that accepts HTTP requests and sends back an OK response after an induced delay (a minimal sketch of such a controller follows this list)
  • JMeter to simulate load
  • A custom plugin (a big shoutout to these guys for the plugin) to generate stepped load and capture enhanced graphs
  • Tomcat 8.x to host the web application – launched in-memory by Spring Boot, with no customizations
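
For reference, a controller along these lines is all the benchmark needs. This is a minimal sketch, not the exact code used in the tests; the endpoint path, parameter name and default delay are illustrative:

[code language="java"]
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Minimal Spring Boot app with a deliberately slow endpoint.
@SpringBootApplication
@RestController
public class ThrottleDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(ThrottleDemoApplication.class, args);
    }

    // Sends back 200 OK after an induced delay, holding a Tomcat worker thread the whole time.
    @RequestMapping("/ping")
    public ResponseEntity<String> ping(
            @RequestParam(value = "delayMillis", defaultValue = "1000") long delayMillis)
            throws InterruptedException {
        Thread.sleep(delayMillis);
        return ResponseEntity.ok("OK");
    }
}
[/code]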

First Group – The Good One

Test Plan

This thread group simulates a consistent stream of requests to our application server – a typical scenario that happens very often.

Throttling - thread group - the good one

Server Performance

As Expected? Yes.

As you see below, the chart shows that the application server is behaving normally. All the requests over a time period of 15 minutes are consistent with a "single user model", i.e. a 1-second request-response time.

throttle - the good one - TPS - Scenario 1

Second Group – The Sudden High Traffic

Test Plan

This test plan takes a stepped approach and tries to simulate a scenario where a campaign starts hitting a certain page (or set of pages) for a short duration. This is a use case we see very often in the industry, where our websites are open to the whole world.

This thread group is not OOTB; I downloaded a plugin for it.

throttle - the high traffic one - test plan

Server Performance

So what do we expect to happen? Depending on how much juice my server has (threads, CPU cycles, etc.), it may or may not be able to handle the requests. Given that I am running everything on my local laptop, it will be interesting to see if my local box can handle 600 threads.

throttle - the high traffic one - TPS - Scenario 1

And we see that my laptop can't really handle 600 threads. So what does Tomcat do?

It Throttles
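
What "it throttles" means in practice is that Tomcat's connector only runs a bounded worker pool (200 threads by default) plus a small accept queue; anything beyond that waits or gets turned away. If you wanted to set those limits explicitly in a Spring Boot 1.x application like the one above, a sketch would look roughly like this – the values shown are simply the usual defaults, not tuning advice:

[code language="java"]
import org.springframework.boot.context.embedded.EmbeddedServletContainerCustomizer;
import org.springframework.boot.context.embedded.tomcat.TomcatEmbeddedServletContainerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Sketch: make Tomcat's implicit limits explicit (Spring Boot 1.x style).
@Configuration
public class TomcatLimitsConfig {

    @Bean
    public EmbeddedServletContainerCustomizer tomcatLimits() {
        return container -> {
            if (container instanceof TomcatEmbeddedServletContainerFactory) {
                ((TomcatEmbeddedServletContainerFactory) container).addConnectorCustomizers(connector -> {
                    connector.setProperty("maxThreads", "200");   // concurrent worker threads
                    connector.setProperty("acceptCount", "100");  // queue once all workers are busy
                });
            }
        };
    }
}
[/code]

Whether you configure these limits or not, they exist – which is exactly the throttling nobody asked for.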

How the Good One's behavior changes

Test Plan

I run the first test plan and follow it up with the high-traffic plan (introducing a 30-second delay).

Impact

The following image shows how the Good One has been impacted. While the traffic for the Good One has not changed a bit, it has still been affected because something else introduced a spike.

Please go and tell the JVM that you do not like throttling

throttle - the good one - TPS - Scenario 2

So What’s Next

You really have three choices (we will look into the details of each in separate posts):

  1. Auto-scale the application servers and hope that the new servers are ready in time to handle the load; or
  2. Do something about throttling and control your destiny – what if the high-traffic resource is not a revenue-generating one and the Good One was? (a minimal sketch of what such a throttle could look like follows this list); or
  3. Continue to frown upon throttling
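
As a teaser for choice #2, here is a minimal sketch of what an application-level throttle could look like – the class name and the limit are hypothetical, and this is not the implementation from any project mentioned here. It is a servlet filter that fails fast with a 503 once a concurrency budget is used up, instead of letting everything queue and slow down together:

[code language="java"]
import java.io.IOException;
import java.util.concurrent.Semaphore;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

// Hypothetical application-level throttle: shed load instead of queueing it.
public class ThrottlingFilter implements Filter {

    private static final int MAX_CONCURRENT_REQUESTS = 50; // illustrative budget
    private final Semaphore permits = new Semaphore(MAX_CONCURRENT_REQUESTS);

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        if (permits.tryAcquire()) {
            try {
                chain.doFilter(request, response); // within budget: serve normally
            } finally {
                permits.release();
            }
        } else {
            // Over budget: fail fast so the Good One keeps its response times.
            ((HttpServletResponse) response).sendError(503, "Too busy, please try again later");
        }
    }

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void destroy() {
    }
}
[/code]

A rejected request costs almost nothing; a queued one holds a thread and drags everyone else down with it.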

Spring Framework – XML vs. Annotations

This question has been around for many years, ever since Spring started to move heavily towards annotation-based configuration (if I recall right, it's called convention over configuration). Annotation-based configuration spread like a jungle fire across the industry, and very soon it was the norm. But this question – "XML vs. annotations" – has always existed.

I, for one, have been around in the Spring world since version 1.1, when annotations weren't a thing, and I know what it is to write those XML files and the power they give you to configure an application to suit your needs. Since then, whenever I have gone about writing an application in Spring, I have asked myself this question, and I never really had a good answer until recently. While you will find tons of posts when you search for this on Google, only a few really give you an unbiased opinion.

I started to work on an application that needs some very flexible configuration options, and before I dove into it I had to make this decision yet again. This time I wanted to keep things simple, and my rationale was:

Use annotations for anything that is core to the application and defines its core structure. Anything that would need a code change anyway is fine to sit as an annotation.

Use XML-based configuration when you know you may need to alter the behavior of the application without compiling and deploying the code all over again. (A sketch of how the two sit together follows below.)
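
To make that split concrete, here is a minimal sketch under this principle – the class names, the strategy interface and the XML file name are illustrative, not from my project. The service is core structure, so it is annotated; the strategy it delegates to is declared in an external XML file, so changing the behavior means editing XML, not Java:

[code language="java"]
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.ImportResource;
import org.springframework.stereotype.Service;

// Core structure: part of the application's skeleton, so annotations are fine here.
@Service
public class PricingService {

    // Which DiscountStrategy implementation is used is decided in XML, not in code.
    @Autowired
    private DiscountStrategy discountStrategy;

    public double price(double basePrice) {
        return discountStrategy.apply(basePrice);
    }
}

interface DiscountStrategy {
    double apply(double basePrice);
}

// Pulls in an XML file (e.g. beans-override.xml) that declares the DiscountStrategy bean;
// swapping the implementation is a configuration change, not a code change.
@Configuration
@ImportResource("classpath:beans-override.xml")
class BehaviorOverrideConfig {
}
[/code]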

This is how simple I kept it for my team. Once this principle is defined, the job is only half done. But we will get there soon.

Unit Testing in AEM (thinking loud)

This is not a recommendation of any sort, but a culmination of ideas and a few options available to us if we want to do unit testing within AEM. I had done some research for a client some time back, and this article is largely influenced by that work, though a lot of the contextual stuff has been pulled out. I have still tried my best to ensure that the article holds its essence. I will try to do a follow-up soon with a lot more detail.

Option 1: Use Sling tools and test in-container

Apache Sling has released a set of tools (http://sling.apache.org/documentation/development/sling-testing-tools.html) which can assist with unit testing in the application. These tools offer several ways of testing: a) good old JUnit tests where there are no external dependencies; b) use of mocks – Sling provides ready-made mocks which reduce the effort; or c) deploying the test cases into a CQ (or Sling) instance and running them using OSGi references.

The approach I am recommending here is to deploy JUnit tests into an already hosted CQ instance and invoke the test cases remotely (a minimal sketch of such a test follows the list below). I understand this is not "old school unit testing", since I am not abstracting away any dependencies and my units include dependencies, but I have a reason for that. As a matter of fact, if you have been following my writing on unit testing, you would know that I am not a big fan of mocking and am very happy to unit test against real dependencies if I can set them up.

To do this, we need a few things to happen:

  1. We need a hosted CQ instance that can be used as a container for running the test cases
    1. We could use an embedded instance, but then we would have to spend additional effort creating content and so on. Also, the embedded container would be Sling and not CQ, and we would like to keep the environment as close as possible to what we actually use
  2. The CQ instance should have a pre-populated set of products and images (this setup uses the AEM eCommerce module, with PIM and DAM integrated with external systems), which acts as ready-made test data. This can be achieved using our backend integrations, either independently or automatically (automation of these things can also happen over time, allowing us to start quickly)
  3. For interactions with any backend services (like order management, pricing, account information), we need a backend service instance running (as I said, I prefer real systems over mocks if possible) with all the variables and pieces set up. This instance should also have data such as user accounts, product instances, availability, prices, etc. to ensure our use cases work. There are obvious challenges in setting up independent backend services, and we can explore one of the following two options:
    1. Capture all requests and responses for a certain request type and serialize them into a test-data store. It can be a huge XML that we store in a key-value system – a database like Mongo (even SQL would do) – or we can serialize to the file system; or
    2. We can use an already existing backend system
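
For illustration, a server-side test deployed into that hosted instance could look roughly like the sketch below, using the Sling JUnit runner that ships with the testing tools linked above; the test class and the assertion are hypothetical placeholders:

[code language="java"]
import static org.junit.Assert.assertNotNull;

import org.apache.sling.api.resource.ResourceResolverFactory;
import org.apache.sling.junit.annotations.SlingAnnotationsTestRunner;
import org.apache.sling.junit.annotations.TestReference;
import org.junit.Test;
import org.junit.runner.RunWith;

// Runs inside the hosted Sling/CQ instance, so real OSGi services and real
// repository content are available instead of mocks.
@RunWith(SlingAnnotationsTestRunner.class)
public class ProductLookupServerSideTest {

    // Injected from the running OSGi container by the Sling JUnit runner.
    @TestReference
    private ResourceResolverFactory resolverFactory;

    @Test
    public void resolverFactoryIsAvailable() {
        // Trivial check; real tests would resolve the pre-populated product/DAM content
        // (paths depend on the project) and assert on it.
        assertNotNull(resolverFactory);
    }
}
[/code]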

Option 2: Use selenium as the functional testing tool

In this approach I am recommending not using JUnit-style unit tests at all. The idea is to use the philosophy of system testing, which can exercise all of the units in the code. This is a big departure from the traditional way of unit testing, where all dependencies are mocked out and we can run several tests quickly. While Option 1 has the same effect, in this approach we go a step further and leverage our system test suites. The idea is not to do this for every single use case, but to pick business-critical functions like checkout, order management and account management and automate those. The Selenium scripts can then be wrapped in a JUnit runner, which means they can be run from Eclipse or Maven and hence integrated with CI. This saves us the time of writing those JUnit tests and lets us manage a whole suite independently. This approach also needs a hosted CQ instance with product data, some content, and backend integrations set up, just like Option 1. A sketch of what such a wrapped script might look like follows.
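
As an illustration of a Selenium script wrapped in a JUnit runner, here is a minimal sketch – the URL, the element locators and the assertion are hypothetical and would come from the real site:

[code language="java"]
import static org.junit.Assert.assertTrue;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

// Hypothetical business-critical flow (product page -> add to cart) automated with
// Selenium but packaged as a plain JUnit test so Maven/CI can run it.
public class CheckoutCriticalPathIT {

    private WebDriver driver;

    @Before
    public void setUp() {
        driver = new FirefoxDriver();
    }

    @Test
    public void addToCartShowsCartCount() {
        driver.get("http://localhost:4502/content/mysite/en/products/sample-product.html");
        driver.findElement(By.id("add-to-cart")).click();
        String cartCount = driver.findElement(By.id("cart-count")).getText();
        assertTrue("Cart should contain at least one item", Integer.parseInt(cartCount) >= 1);
    }

    @After
    public void tearDown() {
        driver.quit();
    }
}
[/code]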

Of course this is a bit tricky and not really unit testing, but it has some huge upsides if done right.

High availability design

If you have ever travelled on Indian Railways, you will have noticed that the capacity the train is supposed to handle holds no meaning, because the number of people it ends up carrying is just way over it. That's how the passenger load and platforms all across the country are managed. The method mostly works fine, but from time to time there are breakdowns; trains are delayed, sometimes cancelled, but life goes on, because people expect that this will happen with Indian Railways.

When we design and build those big platforms, something similar happens, but the biggest difference is that the customers/clients who have paid for the software don't like those downtimes (cancellations) and slowness (delays). Over the last two years or so I have had many conversations where two key NFRs intersect – performance and availability. I have started to realize that while these two end up joined at the hip and need to work closely together, they still mean a world of difference, and each needs to be addressed differently.

Of course, eventually, with many fixes, you will eradicate a lot of the cases that led to failures, but it will have taken you so long that the reputation the brand holds so dear is already damaged.

 

The Start 

Most projects (almost all) in today's world have some non-functional requirements, and the three that take top priority are performance, availability and security. Some numbers that most often get thrown around are:

  • Pages should open in under 1 second, and so forth
  • There should be an uptime of 99.99% – which works out to about 1.01 minutes of downtime per week (10,080 minutes in a week × 0.01%)

And that's about it. Of course there are more, but 95% of the conversations revolve around these two. Then we go about designing solutions to meet those numbers.

 

Just before Go-Live

Things are all good while we are in implementation, and we do everything to make sure those two or so numbers are met by our design. We do performance modeling, then we execute those performance models and prove that the traffic model/simulation – what we understand as the client's use cases – works fine. Having done that, we say with confidence that our system will meet the performance needs. In this model we also take a capacity increase of 40% or so, based on analytics and some future growth, bake those numbers into our calculations, and then we are even more sure that our system will be able to handle a bit more if it comes to that.

Now that we are so sure that performance is all good for the traffic we expect, we believe our software will continue to work fine, because everything will operate within the same constraints in which we tested it. And because those constraints are well defined, there should really be no problem meeting our availability numbers.

 

We are in flight Houston

Then the next cool thing happens – we go live, and the system we have built with so much pride and caution, and tested so much, is live, and so many people start to use it. It certainly is an exhilarating feeling to see your sweat and hard work go live, with people seeing and interacting with what you have built.

Until the time comes when the system goes down. You will be sleeping in the middle of the night, you will get a call, and someone will tell you to get up, switch on your laptop and get ready to debug why the system went down and get it up and running quickly. It takes you a while to think – what the hell just happened? We did everything we had to do. There were even reviews, and everything was looking good. How can it go down?

Well, there is something called the Universe that has a different set of plans for you, and those plans just went into motion.

So what happens?

Indian Railways happens 🙂 You realize that a traffic pattern you did not know about has suddenly come and hit your servers. A Chinese search engine has started to crawl all over your site, and the site is not even targeted at China. Well, the internet is a free ride and anyone can get on. Why shouldn't people sitting in another country see a website that is not targeted at them – but we did not have those specifications. We put in all the checks, but we never accounted for all those search engines and bots and the crawling they would do.

What do we do?

We fix the problem and either put in a block to stop that traffic pattern, or throttle it, or add servers to handle it. And then we go about thinking: okay, now I am good; this is done and dusted and won't happen again.

Universe has other plans

Next time, someone will add some bad servers into WIP and the cluster will fail; the time after that, someone will delete the database and it will crash again; and the next, and the next, and the next…

In the whole software development process we miss this key step – designing for this game and for how we will be set up to get the system back up and running in that time frame.

 

Fixing begins

Of course we have to do something, but what we do is start looking at our solution and check why performance is bad. Performance – really? We do everything we can to fix the performance problem, but we spend no time on the availability aspect.

Going back to the Indian Railways analogy I drew up front: a train – engine and bogies – is built to handle a certain load, and that was agreed. It can't be more precise: there are seats, and there are tickets that need to be bought to get onto that train. As long as the number of people who get in stays within those constraints, our problems will be much smaller. Everything around our software (our train) needs to work in tandem. But it is difficult to control. The internet is much wider than Indian Railways, and who comes, when they come and how many come is just not predictable. It becomes important to acknowledge that, no matter what you do, there will always be a traffic pattern that comes to visit you and takes your system outside of its known boundaries, and more often than not, once our system is operating outside of those boundaries it is bound to fail at some point. This is where the availability and resilience perspectives need to be brought into the picture.

 

Next Time do something else too

Availability and Resilience

At their core, these perspectives ask you to set some designs and practices and, most importantly, an expectation with your clients as to what you are dealing with. We all know that in the last decade how we run our business and how we deal with internet-hosted sites has become very different from how we used to build systems in the past. I paraphrase from an article – "If your site is down, your business will suffer" – and yet everyone wants 24×7 uptime, while we sold NFRs of 99.99% (about a minute of downtime a week) or maybe 99.9999% (0.605 seconds a week) thinking it's okay. If the expectation is 24×7, why would we even start with something less?

We then need to look at the next two most important metrics, which we miss all along and never plan, design or test for – it's like we take them for granted. These are the Recovery Time Objective (RTO) and the Recovery Point Objective (RPO). As we speak about uptime and outages: whenever we have an unplanned outage and we have promised a certain uptime, we need the operational ability to do whatever is needed to get the system up and running. If we designed for 99.99%, we need methods in place to get the system back within that roughly one-minute-a-week budget – it feels like Minute to Win It. In the whole software development process we miss this key step – designing for this game and for how we will be set up to get the system back up and running in that time frame.

The Operational Viewpoint

The operational viewpoint is a key architectural concern that we omit to design for when building software systems and platforms. How we run software has completely changed in the last decade with cloud hosting. Because the cloud makes things so easy to provision and host (AWS), we believe everything should be easy. So where in the past we used to focus on availability design a lot more, we now almost take it for granted. This viewpoint should become our bread and butter during the implementation phase, with a dedicated team looking at operational processes and tools and providing methods to make recovery possible in the time it is expected to happen.

Categorize and Prioritize

This is where it becomes critical to have a conversation with our clients and understand how various parts of the system can be identified and broken down into services. A classification of sorts – "Platinum", "Gold", "Bronze" – starts to make sense, to get the business to prioritize which services should get top priority in case of an unplanned outage. The operational design and implementation team then needs to focus on how to get those services up and running quickly. This is a key input for the implementation phase, because unless those services are known, there is no way they will be coded accordingly.

Recover and not debug

When these unplanned outages happen, the team responsible for managing the system more often than not starts with the wrong mindset. They are like cops who have reached a crime scene after a crime has happened: they start by looking around for evidence and analyzing the scene to work out what happened. The idea is to look for evidence, solve the crime, hopefully find the criminal and put them behind bars. Well, we all know how long it takes to get there. With cops, I can see the point – you can't have cameras in every home and everywhere, so you have to do post-mortems. But this is a software system, and we need to be firefighters. The idea has to be to put the fire out, and to do it quickly, before it takes the whole block away. The idea has to be to accept that some damage has been done – that's a lost cause – and see how we can save what's left.

In the software we write and the platforms we host, we need something I refer to as the "last known good state", and when an outage happens, what we need to do is get ourselves back to that state. But what do we do when the state or behavior is not under our control? Going back to Indian Railways: what do you do when you can't control the number of people coming onto the platform and onto your train – they just keep coming in; even if you find a way to replace the train, they will keep coming. The other way is, of course, to start adding more trains to the platform. With the cloud you can do that and keep adding servers until all the traffic is dealt with. This is where we move seamlessly into the performance and scalability perspective.

This is where we lose sight of the problem and try to fix something else in modern software.

So what should we do if we cannot control the traffic? We need an effective mechanism on our train that won't allow everyone to get in. We need the ability to know who can get in. We have ID cards and turnstiles on the platform; if our platforms don't give us those, why can we not put them on our trains? It may not stop all malicious traffic, but it will certainly stop a lot of it. Most importantly, you can go back and authorize your Platinum users to get in while you block everyone else.

So in the software world, you need a cutover switch that will stop all traffic and only allow what is key for the business. Unless everything coming in is Platinum, which is rarely the case, you will be able to recover your most important services easily. Of course there is a degradation of other services, but that is something you would already have set expectations about with your clients and the business. They will be mad, but less mad. (A minimal sketch of such a switch follows below.)
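
For illustration only – the class name, the toggle and the way "Platinum" traffic is identified are all hypothetical – such a cutover switch could be as small as a servlet filter guarded by a flag:

[code language="java"]
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical "cutover switch": when enabled, only Platinum services are let through;
// everything else gets a polite 503 while the system recovers.
public class PlatinumOnlyFilter implements Filter {

    // In a real system this flag would be flipped by ops tooling (JMX, config, feature-flag service).
    private static final AtomicBoolean PLATINUM_ONLY_MODE = new AtomicBoolean(false);

    public static void setPlatinumOnlyMode(boolean enabled) {
        PLATINUM_ONLY_MODE.set(enabled);
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        if (PLATINUM_ONLY_MODE.get() && !isPlatinum(request)) {
            ((HttpServletResponse) res).sendError(503, "Service temporarily reduced, please retry later");
            return;
        }
        chain.doFilter(req, res);
    }

    // Illustrative classification: say checkout and account paths are "Platinum".
    private boolean isPlatinum(HttpServletRequest request) {
        String path = request.getRequestURI();
        return path.startsWith("/checkout") || path.startsWith("/account");
    }

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void destroy() {
    }
}
[/code]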

The 300

If you have not seen 300, you have got to see it and learn what handling your enemy (the traffic) through a funnel can do for your systems and your ability to recover. You will last longer and you will get a lot more time to fight the enemy. On top of that, the pressure and stress the business creates when their Platinum services are not available will also be reduced. You can then go about debugging once you have contained the problem.

 

Nothing is for Free

Of course we can do much more, but the more we do, the more it will cost. Back to Indian Railways: we can either choose to save some money on building the train by not installing those ID-card turnstiles, or we can invest that money and ensure we have continuity. The turnstiles will get more and more complex and will need a lot more fine-tuning to handle all scenarios. So if you also need to handle the case where you don't want the turnstile itself to fail, then you need to install two of them on every bogie, which costs more, and so the setup goes on.

The point I am trying to make is that when we go from 99% to 99.99% to 99.999%, we don't look at the cost drivers and what they will do to the project. We may think it just means a few more rounds of performance testing and we should be done. We know now what's actually going to happen.

If you fail to articulate to the clients what these numbers mean to them in terms of cost, you won't ever get them to accept the reality of the internet and the universe. More often than not, you will find that the business realizes there are services they can live without. Of course, eventually, with many fixes, you will eradicate a lot of the cases that led to failures, but it will have taken you so long that the reputation the brand holds so dear is already damaged. If you think of this as "risk money", and consider how you are investing to guard your reputation, it will justify the cost every time – that much I can assure you.

AEM Development Workflow – Part 3 (Coding Old School)

In this series I have been trying to define the various development workflows (that have existed or will arise in the near future) and what sort of problems I see with each of them. We start off with the very first use case, one that has probably existed for years now: "the old school" way of coding in CQ.

Requirements


Image 1: Component Breakdown

It is always beneficial to define what the end result has to be, so that we can ensure we have achieved what we set out to do. I had finally received the HTML for a carousel from my site-development friend (download here), and our job is to take that carousel and a simple page and make them content-managed in Adobe AEM. For this purpose I am going to use AEM 6.x, as we need to use Sightly in later phases. As you see in Image 1, that is how the carousel should look. I have broken the page down into two parts/components that we will have to develop:

  1. Carousel component – highlighted in the blue block and marked #4
  2. Title component – for the sake of this use case, and to explore how we can reuse one component for various displays, I have classified #1, #2 and #3 into a "Text" component

Implementation

CQ Project Structure Setup & Tools


Image 2: Project Structure

We are going to use the default CQ structure, which allows us to handle components, templates, client libs, etc. I had also set up Eclipse for AEM development, but I was unable to use it to its full extent because of the lack of integration, and I had to fall back on CRXDE Lite several times. Following are the details for each of the folders:

  • aem6scratchpad: the project application folder, at the root of everything we will be coding
  • components: this folder holds all the components we will create in this series
  • components/page: this folder holds the implementations of the templates
  • install: the folder into which any .jar files get installed
  • templates: holds the templates which will be used to create pages

 

Task: Move the static files into AEM


Image 3: Final Output in AEM

Image 3 shows that I was able to achieve what I set out to do as the very first step: take the HTML that I received and just convert the files into a set of files/components in CQ, so that while the content would not yet be managed, it would look the same. If you go back to where we started this series, this step is what makes or breaks how the development workflow looks.

Approach: Like a Novice Developer

Coming into this exercise, I did everything that I see happen between a site developer and a CQ developer. I went a bit further and worked the way a CQ developer typically would – just dive into the coding and don't worry about any standards or unit testing (JIT coding, as I have come to call it). I have not coded in CQ from scratch for a long time. I have been reviewing code for best practices, but a lot of my time over the last couple of years has been spent designing solutions. I have looked at code from time to time, but it has been a long while since I coded a set of components and templates from scratch. For this exercise, I believe being in that situation was a very good thing, because that is what our developers go through. Some of them are new to this domain, but even when they are well versed with coding, the approach taken is more or less what I ended up doing.

 

Journey: Painful

The journey was nothing but painful all along the way. It took me 5 hours to do what should have been a few minutes' job. The site developer had the code up and running in an HTML file in a browser, and all I had to do was make it work "as is" within CQ. It seemed like the forces of nature were working against me, and everything I did had a problem in it. I finally got it up and running (the designs still don't match exactly as is), but it was excruciatingly painful. Here is a list of things that went wrong (and don't be surprised if you find it familiar, because it happens at the start of every project):

The set of tools which I thought would do a wonderful job did not cut it

I was so excited to work with the AEM development tools for Eclipse. I got them up and running after a bit of a struggle, but working in Eclipse only helped with the coding itself. A lot of the CQ-specific things, like creating components, templates and dialogs, weren't doable in Eclipse; I had to move into CRXDE Lite to do all of that. The good part was that I didn't have to worry about vault (vlt): I could easily synchronize between the hosted CQ and Eclipse, which made it just a wee bit easier.

I had to write repetitive code into components

Image 4: CQ Includes

Recall that famous set of lines we need in every component, which, if they don't get copied, means something will stop working. If you have forgotten them, have a look at Image 4 or download this gist. Well, I tried the shortcut – the ultra-famous shortcut that Adobe also recommends we take (they like to call it adapting an existing component) – which is supposed to be easy.

In the CRXDE Lite, create a new component folder in /apps/<myProject>/components/<myComponent> by copying an existing component, such as the Text component, and renaming it.

Well, I did that, but it seemed I kept picking the wrong component, because something or the other didn't work. There were times when the template would not come up when I wanted to create the page, or, once I opened the page, the component did not show up in the sidekick. It seemed one or another property was always wrong.

Where do the client libs go and where do they come from

This was probably the biggest one. The site developer had picked up whatever version of the client libraries he wanted to work with. This was by design – I intentionally didn't tell him what to pick. Now, when I had to get the JS and CSS into CQ along with the assets, I had to make sure that what he had picked worked for me. No surprises there: once I got the carousel into CQ, the first thing I saw was initialization errors, which, once solved, led to the next problem – the right JS and CSS were not being picked up. I finally, "as a CQ developer", ended up fixing those and got my carousel to work just as it was working in plain HTML. Yes! All of the interactions were working.

 

End Result: Not production-ready code

I am going to paste snapshots of the files I ended up creating, with links to them on GitHub. This is not final code and does not have a tenth of the quality standards I would normally deliver. But the code here clearly highlights how and where the problem starts.

contentpage.jsp

Image 5: Content Page Template Code

carousel.jsp

Image 6: Carousel Code

 

In Near Future: This is beyond repair

At this point we have seen how the code from site developers starts to get into CQ and what shape it takes. When you take the above and give it to 10-odd developers on a project, before you know it you will have 10 different standards (or maybe 5) in the project. All of this happens very fast, because all one has to do is drop in code and make it work. Then we realize the gravity of the problem when we start doing code reviews and see that this needs more than just refactoring – more often than not, it now needs to be fixed via a rewrite. I have not known many projects that have the time and budget to rewrite code. The project releases (thanks to Agile) get workable demos in front of clients, so it all gets tested, done and dusted. With no automated test cases (unit, integration or functional), the risk of breaking something as we make changes is so high that many will just leave it and hope that the next component is fixed to a standard.

 

Fix Something

Absolutely! That's the intent. But do we really need to fix the technology? Do we really think the technology was at fault here? If we had a different set of technologies, would this problem be solved for us? Or can we solve it in another way?

 

Previous > AEM Development Workflow > The problem statement

Next > AEM Development Workflow > Coding Old School with Standards (TBD)

TIP: Make Sling Testing Framework work

We have been trying to find the right mix of (automated) unit testing in our project, and I have been looking at the various options Sling has to offer. This was done for development of AEM-based projects. I tried to follow a few articles to help me get started, and each one of them had some issue or another. I hope this article saves you the 10 hours I spent only to find out that there are tweaks needed to make it work.

  1. http://docs.adobe.com/docs/en/dev-tools/aem-eclipse.html – this is Adobe's newly released AEM Dev Tools for Eclipse. The documentation targets starting new projects, and there is also documentation for moving existing projects into Eclipse, but while it has depth, it is pretty weak. When I create a new project using archetype version 7 and run the tests, either via Maven or Eclipse, two errors come up:
    1. The dependency for slf4j is missing, so the tests don't run
    2. Once you add that dependency, the tests still don't run. I have tried running via the Eclipse JUnit plugin and the Maven test command. I added an assert statement that should fail, and the test cases did not fail
    3. Still open – I don't know what is wrong here and why these do not work. Still trying to unravel this mystery
  2. http://labs.sixdimensions.com/blog/2013-06-05/creating-integration-tests-apache-sling/ – first of all, this article works if you do exactly what it says. However, if you have the password set to anything but "admin", it will not work. It fails in the step where it checks whether the bundles have been installed in the (hosted) Sling runtime.
    1. The defect is pretty silly, and I have opened it in Sling's bug-tracking system
    2. Also, the test cases will not work if you run them via the Eclipse plugin for JUnit; it just doesn't work

 

Tips to get going quickly

If you are looking to work with server-side tests for Sling, I strongly recommend that you start with Dan Klco's article on Sling's integration tests and use it as is. But if you want to use a hosted server runtime, then you have to make some changes to the POM.xml in the project you download, as follows:

Additional properties needed for a hosted server

[code language="xml"]
<sling.additional.bundle.2>jstl</sling.additional.bundle.2>
<launchpad.http.server.url>http://54.179.160.9:4502</launchpad.http.server.url>
<test.server.username>admin</test.server.username>
<test.server.password>admin</test.server.password> <!-- this password has to be admin to work because of the defect (https://issues.apache.org/jira/browse/SLING-3873) in Sling's framework -->
[/code]

 

The following change to the server-ready path is needed if you are using AEM 6.x:

[code language="xml"]
<server.ready.path.1>/projects.html:src="/libs/cq/gui/components/common/wcm/clientlibs/wcm.js"</server.ready.path.1>
[/code]

 

Closing thoughts

The Sling testing framework looks to have potential, but the documentation is so bleak that it makes adoption tough. I interact with a lot of CQ (Sling) developers every day, and almost every one of them has some reason not to do unit testing – it seems so chaotic and unpleasant that it feels like a burden. While developers would like to do it, they feel they are spending much more time writing test cases than writing code, and they do not like that. This is just one example of how the CQ/Sling tooling for unit testing is so primitive and so poorly advertised that it makes things much more difficult for us. If only Sling/Adobe would improve this: not only would they get adoption, but people like me would not have to spend several hours just getting it up and running.