Category Archive : Engineering

AEM Development Workflow – Part 2 (finding the problem)

While this post will be a self sufficient one, I will recommend that you go back and read up on the context and the background on the AEM Development Workflow to be able to fit this on in the bigger picture.

I was planning to do a grand component to be able to prove my point but it turns our that my HTML and CSS3 skills are not that great and I had to wait up on a Site Developer to be able to provide me with a HTML, CSS, JS for a Carousel – and that can take a while. In the interest of me moving forward, I adapted and now I am going to work with the OOTB List component that AEM offers. Why I picked this component because it a very good use case of what we do in AEM a lot of the time in component development. We build lists of everything, actually from how i see every component can inherit from a list component and render either 1 or more; but more about some other time.

Let’s set the background

Let’s open the OOTB List Component and it looks like having a bunch of files that so several things like render lists in a different way, do pagination, show a feed of list, analytics, has a dialog box and a bunch of other things. To how how the component looks like, i created a new page and then dropped the OOTB List Component. I the configured to pick a list of child pages and show me a list of those. The following Image is shows the authoring dialog as well as the output – ignore the Hello World for now (this is me being lazy). We can clearly see that the OOTB component works well and it does what it’s supposed to do.

spad - adwflow part 2 - list component

List (foundation) component authoring configuration and rendering

 

List component on Geometrixx

List component on Geometrixx

This is all good, but now the real project requirements come into play and the first thing we will notice that we have to show this same list in a different design; a design that was created by designers. Let’s look at the GeoMetrixx Outdoors and let’s open a Product Listing Page (PLP) where we are sure to find ‘a list component’. The image on the right depicts the component and it’s authoring configuration. You will notice a few key differences in both the consumer facing and authoring requirements:

  1. On the Authoring side the most significant is that now the Author is selecting a “Display As”. This property is used by the component logic to apply different styles. These different styles or views as I call it are how the component can be rendered on a page
  2. On the Consumer Side is going to be look an feel like those boxes around a list element (T-Shirt) in this case, of the price and its Currency, etc.

When I open up the code I see the following resource type definition

[code language=”java”]

sling:resourceType = geometrixx-outdoors/components/nav_products

[/code]

as we go deeper into the code we find that the code for rendering this View (read design) is actually sitting in a localized component of the Geometrixx Outdoors site.

So now let’s look at the code of this rendition from the list component in foundation as well as in the geometrixx (you can get these from Gist), but the lines I want to highlight are as follows

Code from libs/foundation/list

[code language=”java”]

<li>
<p>
<a href="<%= listItem.getPath() %>.html" title="<%= StringEscapeUtils.escapeHtml4(listItem.getTitle()) %>"
onclick="CQ_Analytics.record({event: ‘listItemClicked’, values: { listItemPath: ‘<%= listItem.getPath() %>’ }, collect: false, options: { obj: this }, componentPath: ‘<%=resource.getResourceType()%>’})"><%= StringEscapeUtils.escapeHtml4(listItem.getTitle()) %>
</p>
</li>

[/code]

Code from geometrixx-outdoors (list)

[code language=”java”]

<li>
<p>
<a href="<%= listItem.getPath() %>.html" title="<%= StringEscapeUtils.escapeHtml4(listItem.getTitle()) %>"
onclick="CQ_Analytics.record({event: ‘listItemClicked’, values: { listItemPath: ‘<%= listItem.getPath() %>’ }, collect: false, options: { obj: this }, componentPath: ‘<%=resource.getResourceType()%>’})"><%= StringEscapeUtils.escapeHtml4(listItem.getTitle()) %>
</p>
</li>

[/code]

If you have not already spotted, let me highlight the singular fact that codes in their rendering logic include presentation attributes like width, height (one of them has nothing to it will render like plain old HTML as browser does).

Identification of the problem

My Site Development team delivers to me this great looking markup that works very well in my browsers. I just received my Carousel Code from a Site Developer colleague (Thanks a lot Nitin Suri) and I see this great HTML which works fine (download the whole working copy from my Github). This is the where it all goes haywire. This guy Nitin, does not know CQ while is an awesome Site Developer; one of the best I have worked with. And I am a guy who knows little bit of HTML 5, but very little CSS3. JS i do okay. I now have to take up this great looking markup from these static files and create the following in CQ:

  1. A Component called “Carousel” this does everything that the Site Developers have delivered
    1. Identify the set of authoring components in this component; I see some key content areas in there like Main Heading, Sub Heading, Description and some Slides where I presume authors can include images, Text, Rich Text, Videos, Flash (really Flash!!) – for now I will presume only images
    2. Authors can chose how many slides they want, should there be an auto rotate and what should be the wait interval; what sort of effects does this carousel needs to have
  2. A Template called “Landing Page” where this carousel will rendered among other things that I don’t have yet

I don’t see this as a problem, while this is quoted to be the problem that everyone wants to fix and hence as i read it this classic model of coding in CQ does not work. Remember my post earlier here where I spoke about how even Adobe has come up with sightly as the templating language to fix this inefficiency between this handover.

The problem is how we code this component. Rewind to the way the list component is coded, and we see all of the presentation logic, the business logic – all of it is stuffed into this single JSPs. Well if you are going to stuff shit up while coding, you are going to take time to find the meat from the shit. It’s like mining that takes so long to get to the real stones.

So the question is why do we have to code the way we code in classic methods. Why can we not go back and write good code and make everyone’s lives easier. The AEM workflow problem is not really an inefficiency in the handover of HTMLs to CQ developers but how we should have been writing the code to begin with. We start here by seeing where the problem starts and how the code has been written. Unfortunately, we do see the OOTB Code in AEM as provided by Adobe itself are not coded to solve the problem. When I speak with Adobe they make it clear that these are reference sites and are to used as “Self-learning” but little did they know at the time that people will take this a practice and convert this into a culture.

 

Previous > 

Next > AEM Development Workflow – Part 3 Coding Old School

AEM Development Workflow – Part 1 (introduction)

If you have followed my blogs recently, you would know that I have debated the Development Workflow for a CMS based application being developed on top of Adobe AEM. If you missed those posts you can read here and here. To summarize and set context

The challenge we have seen is too much time spent to integrate the HTML/JS/CSS being delivered by the Site Developers into the AEM’s components and templates. And the solution that seems to be cropping up is not to fix the process but to pick technologies that change the workflow fundamentally. More specifically, I have seen some suggestions that try to solve many other things and lose sight of this simplest of the problems.

Technology should really be an enabler and in this case if there is a technology that can enable us to do something better and faster why not. I am thinking through is if there is a universal solution to this problem and can we apply this solution pattern or technology to all solutions CMS or not; AEM or Tridion, etc.; or does there needs to be a more scientific way of arriving at the solution.

aem-dev-workflowReferring to a few of the slides from Adobe’s presentation presentation here; I want to highlight two specific slides (merged here into one) where the problem now stands in how the components get build and how much time it takes (the problem itself). If you have read the entire presentation, you would already know that Adobe’s answer to fixing the problem is sightly.

What I want to do here is to compare 3 workflows and see what each one has to offer and what’s the best possible way to remove this inefficiency or improve productivity.

  1. Follow the current set of technologies JSP-Java but change the way of working aka different set of tools, trainings and processes
  2. Use Sightly ~ the new templating language pushed by AEM
  3. Use other templating languages like handlebars or angular which are more platform agnostic and goes beyond just CMS and AEM (old school application development also fits)

 

There will be a set of posts in coming week as we go through each of those 3 ways of writing the application. For the sake of this exercise, I am going to take on one of the most common components ever used in the CMS world – the carousel. For each of these activities I will elaborate upon the steps which entail to reach from a design to end product aka – final working part for a consumer. The solution will be built in Adobe AEM (not going to touch any other set of CMS tools for now; maybe later). For me to be objectively look at the end results, I will use the following parameters for comparison in the methods:

  1. Level of Effort (LOE) spent by a Site Developer in converting the design into a working front-end code
  2. LOE spent by a AEM developer to take up the front-end code and convert it into a CQ Component
  3. LOE spent in any sort of integrations and bug fixes between the two paradigms

 

As we progress, I will post sample code, screenshots of what I am trying to do. I will also publish a product (Code, Configuration, Content) that you should be able to run on an instance of AEM if you have access to one handy. I will also publish the tools I would use along the way – including instructions or setting it up on your boxes. If you like this subscribe to this URL and you can directly see posts only applicable to this topic.

 

Next > Finding the Problem

(Unit Testing) – Cost vs. Benefit

downtrendI have been a big fan of unit testing for a very long time; my blog is ridden with posts about it. If I look at at how automated unit testing has progressed in my projects over the last decade I am noticing that the inclination to do unit testing is just not there. Recently, I came across this Podcast series between Martin Fowler, Kent Beck and DHH which was in response to DHH’s post TDD is dead, long live Testing. The Podcast did’t reveal something new to me but it helped enlighten me. I spent the last evening introspecting and thinking how I have progressed TDD/Unit Testing in general over my projects and the trend I see is that the ones where I have been more involved as a developer or a senior developer or even an architect the effort to make unit testing has been there, BUT unit testing (not TDD) didnt work very well in last big project. And since I have taken up the responsibilities of a Solution Architect and as with my responsibilities I have moved away from Code a bit, I see the next layer not doing unit testing very much.

As I spent the time thinking what could be the causes, this is where the Podcast series helped a lot. The reasons that I can attribute to Unit Testing (not TDD) just not happening on projects recently

  1. Writing Automated Unit Tests is just not as fun as Coding is
  2. Writing Automated Unit Tests is just not as fun as Coding is

You get the point! I compare the cases when I first did unit testing back in 2004 (my first project with an IT Consultancy Organization) and then the last one in 2012 and I see why is that was the case. The environment in which I was doing unit testing back in 2004 and then in 2012 (8 years apart) were very different. Back in the days, I was an Enterprise/Core Java/.Net developer, and my life was to deal with things like making Database calls, writing services – business logic, validations, yada yada yada. The part which involved writing automated test cases for Web layer was minimal because the code that went into Web Controller (MVC frameworks we used varied) was only view. And we used Functional/System testing approach to test the UI. The responsibility of creating those selinium test cases were not with the QA Function but still with me as the developer. So the whole testing was “what makes sense” – we did TDD on services code and then functional testing (post coding) on Front-end. But the most important part was that as a developer I was supposed to make sure that nothing breaks when I committed the code. The CI tools were not that advanced back in the days, but they were there. We used Ant for build and there was a task in there which was responsible for running test cases else failed the build (I even coded the ant build script). We were not able to integrate Selenium test runner ant script so that was a manual step, but we had to run a whole suite to ensure everything works. The phenomena “Self Testing Code” existed and we followed it to the core. The were build break fines place – 50 INR as fine of we failed a break; 3 consecutive breaks and it went upto 500 INR; 20 INR for any regression defects introduced (QA had their own suits to keep us in control). When I coded and when I committed the code, I knew that at night no one is going to call me to tell me that other members of my team is stuck. I was working in India and I had a team working from Boston and i knew that when I left and when they came in they wont be gated because i fucked up a compilation or broke something – they can just go about doing their work and be productive. Everyone did so and we all were happy bunch of developers/team. We did not have Continuous Delivery, but once story was done, a couple of rounds of QA testing and it was demo able to the clients. Out test coverage was 80%+ and we had a solid test suite. The suite however took 2 hours to run (I will talk about is very soon as that was the part which make Unit Testing a pleasure).

That was the last time, I saw the workflow work so very well. Surprisingly, that was also the first time I was doing it.

Fast forward 2007-08; I was the architect on a 40 people team (Developers, QA, Site Developers) and we were working on a web development – some heavy web development. It was using Backbase, JSF on JBoss, talking to MySQL via hibernate, everything wrapped around Spring. We started with the project and like before I wanted to do full blown TDD (not just Unit Testing). We started, but very soon (Sprint 1) there was feedback from developers that unit testing was too tough, it takes more time than it takes time to code. I was not the architect and I had produced a Golden Copy as a reference point which the team was using; so the feedback team was giving me made no sense. I let the team struggle on it for another sprint thinking they just need time to come around it, but when it started to hit our velocity points, I was forced to see what was going on in that Unit Testing Departments and what I saw shocked the soul out of me. The automated tests were there for a formality – the tests did test something, they provided some coverage but they were monstrous. They were ridden with mocks (almost) everywhere. No wonder it took so much time for the team to write test cases. For everything that needed to test they had to create a mock. A bit of it was lack of experience as well where team did not create helpers classes to provide for those common mocks, but trying to mock things like HTTPSession, HTTPRequest, Data Adapters for MySQL logic – aka every collaborator was in there. Well they did the right thing (per Unit Testing definition) – test the unit itself, so they mocked even the classes making calls to databases. It make so much harder to test a functionality. We quickly turned it around and set some ground rules

  1. We will not use mocks
  2. If we find a scenario where mocks could not be avoided we will not write automated unit tests for it

What this meant was a) this was not Unit Testing in it’s purest form (I don’t even know what Unit Testing is in it’s purest form, but what definitions and my seniors told me) and b) we had a bit of less coverage. But, I was okay with those consequences. The outcome was that team was not needed to write stuff that was used for just testing; it made our whole test suite a bit slow because it now made calls to the DB, but DB was local so it was not so slow – the team started to spend less time on unit testing, we wrote lesser tests and achieved better coverage to code. This wasn’t unit testing, but we did achieve Self Testing Code. Our CI build was now once again telling us if a commit broke something else. The fun was coming back into the game and we started to make progress. Then in Jan 2008, we decided to switch out Backbase and move over to Flex and this needed a redesign not just in our front-end but also in our back-end services. When we put together a team to redesign and reimplement the services they were given a solid test suite of 300 odd tests. The implementation was thrown away a few interfaces were changed. We fixed the test suite only (We made a decision to let our QA team fix the test suite – which need a QA person to code/script in Java which caused a lot of pushback, but I won that battle) the interface part and then started writing those services again and all we had to do was to make those 300 test cases pass. It was a breeze; we didn’t even need a QA team to test our services. Our test suite was good enough to do it.

We never achieved perfection, but then I was able to meet the goals

Fast Forward 2012-2013; I was the architect on Adobe CQ 5.5 project and the frameworks had changed. There were no services we were writing. The traditional MVC framework was all not there at the developers disposal. I saw the code and there was monstrosity of JSP having business logic in scriplets, single tiered Java classes which did everything from getting data, manipulating them, validating inputs, sending outputs in very fluid formats like Maps (ValueMaps to be specific to Sling implementation). Amongst all this we made some hard design decisions like no scriptlets or business logic in JSPs. Hence there was now going to be Java Code, but there were numerous sling and JCR dependencies that we had to deal with. I once again wanted to do what I had done before – No MOCKS/STUBS. But, the frameworks to be able to do that were’t at my disposal back then. They existed but the community around then was thin, support was less and whatever existed was not going to work. And I ended up deciding to do Mocks. 3 months and it was exactly what was in 2008 – it wasn’t working.

The mocks were so bad that we could change the logic and tests would still pass.

What had happened that developers just mangle everything and tests were not even testing state or behavior. They just went for coverage. The tools were better this time and we had sonar. The idea was to write a test case and make it pass and show an increased covered in sonar. We had used mockit and it seemed like a good idea at the time, but just didn’t work. I understand mocks but i just don’t get them. We setup some state and we check for that state – all makes sense. But when everything is coming from Collaborators and the object you are testing is just aggregating and assembling data into a ValueObject (Map in our case), testing that assembly makes no sense. It’s like saying a developer can code the following wrong.

[code language=”java”]

outputData.put("athletes", athleteList);

[/code]

Well, if a developer can’t get a bunch of basic java statements right, well they have no business coding; let’s send them into a training and save ourselves from a world of pain.

Fast Forward 2014, I am not a solution architect and not focusing a whole lot of coding per say. I love to code, but it’s not my day job (i would like coding to be my day job some day). But, now I see teams struggle a lot to get unit testing even off of the ground. The environment is similar to what I had back in 2012, but now I do have a solution and I think there is a way of doing it without Mocks. Yet, the struggle is there, developers dont see this as fun any more – it looks like a burden.

The Cost vs. Benefit argument is a valid one in every scenario, and in case of our software development lifecycle where getting any sort of confidence is just too tough, as a developer the need to have something to fall back on and knowing that we are not breaking a system with my next commit is huge. In my development lifecycle, we can probably still live with some regression, but as we reach application maturity, as we start to reach stabilization not having a Self Testing Code makes us like Mammoths (not elephants) who move ever so slowly. In the current world of marketing, where we have clients who want to run campaigns in next 2 weeks, we can’t be slow in how soon we release code. You can only be relevant in the industry if you can move quickly, achieve the Continuous Delivery or at least reach a point when you can get releases out in production with a reasonable speed. Those days where releases used to happen once every 6 months are gone; or at least gone in the environment/market I am operating.

We have to have Self Testing Code, We have to Unit Test, We have to have feedback loops and We have to have feedback loop as soon as possible. There is no avoidance; let’s understand and embrace Or be extinct in a few years.

 

Unit Testing – Why not?

For JUnit implementation in our project, we see a great challenge in having them implemented as we are already running behind for Sprint 2 and Sprint 3. Team was provided with all the knowledge walkthrough of the JUnit implementation and example test case. I do not see the team would be able to meet the JUnit coverage as expected. We need to first come to track in terms of delivery and can plan to write JUnits as team gets bandwidth. Please let us know your thoughts.

Really – you are going to give me that shit yet again. Does the behind-ness on your project does not tell you something, does it not even rake your mind that if developers weren’t testing their code there might actually be lesser problems and maybe the project would be on track.

This is how I would have responded a week back. But today, I am willing to have a reasonable conversation to understand what are the pain points which makes it difficult for the team to do “automated” Unit Testing (automated being the key here). Let’s go back to the very core of unit testing and even before that – Self Testing Code. Quoting Martin Fowler …

These kinds of benefits are often talked about with respect to TestDrivenDevelopment (TDD), but it’s useful to separate the concepts of TDD and self-testing code. I think of TDD as a particular practice whose benefits include producing self-testing code. It’s a great way to do it, and TDD is a technique I’m a big fan of. But you can also produce self-testing code by writing tests after writing code – although you can’t consider your work to be done until you have the tests (and they pass). The important point of self-testing code is that you have the tests, not how you got to them.

Read this para again and again until you imprint the 2 key messages a) you cant consider your work to be done until you have the tests and they pass and b) important point is that you have the tests, not how you got to them. Now that you understand this full (hopefully), let me ask you once this simple question – Are the developers doing any unit testing (again not automated)? If the answer is No then I have a problem and I am outraged.

I am big fan of Unit Testing and my first experience was on my first job and it was with TDD and yes it was a adrenaline rush a dose of dopamine (as David H says in his podcast). I saw it’s effectiveness when at a stage we decided to throw away the implementation because we realized that it would just not scale. We were 6 months into the development and we had a lot of LOC and along side a solid test suite of 400+ test cases. When we decided we needed to do something fundamentally different we chose to keep the test suite (we used NUnits back in the day). Over the course of next 2 months as we wrote the new implementation those test cases were our feedback loop. Our job (team of 4 developers) was to ensure all of those 400+ test cases turned Green. The day we saw all of them to be Green we knew we were exactly where (functional delivery) when we had been when threw the code/implementation out. The speed at which we redid the implementation was because the confidence we had in the new code we were writing – we were being told almost immediately if we did something right or wrong (they were not unit tests as per definition – but that is for later). Then a few years later in 2008 we did the same thing again and it was a 9 month x 25 people implementation which we redid and it took us 3 month x 20 people to catch up and meet same functional delivery.

In this situation I read a project is behind track, and they need to go faster with confidence to catch up and meet delivery timelines. Not doing Unit Testing (again not saying “automated”), is like someone telling me – Kapil, you are driving to go to your wedding and you are late and now you have to drive faster. But, instead of putting in a few more airbags and giving you better set of wheels, better brakes; We are going to take the 1 Air Bag you have today and also replace your wheels with an older set. Now, go drive else you will not get married. What do you think I would do – Drive faster and risk my life or start driving even slower because I hope that my would be wife loves enough to know that I had no option but to drive slowly. Well you get the point – While I may get married, She is going to stay mad at me for a very long time for ruining her perfect day.

I have not yet broken down this problem if delay on this project (which I have to), but I can bet my ridiculous salary that it’s because of 2 things – a) “Regression” that developer keeps introducing and b) sending incomplete functional code to QA where defects are found and then it all comes back to them and in some cases these defects are found even in client layer of testing. In my experience this is how my ideal day as a developer used to look like

  1. Day 1
    1. Understand requirements – 1 hour
    2. Design and Implement and unit test- 6 hours
    3. Test (integration + functional/system) – 1 hour
  2. Day 2
    1. Repeat Day 1

And when I delivered code to QA I hardly got back any defects to fix and I would continue this cycle day in and day out. The times would stretch if someone did wrong estimates but the activities remained same always and time spent in each of these always increased in some proportion. And I have seen developers do that. But, more recently, I see developers, this is how their day looks like:

  1. Day 1
    1. Implement new story – 12 hours
  2. Day 2
    1. Fix Defects from Day 1 – 4 hours
    2. Implement new story – 12 hours

Well you already see that on Day 2 the developer is fixing their technical debt and they are struggling to catch up. This endless cycle lets call it Cyclone of technical debt” is unrecoverable. The project will deliver one day and people will get burnt. Developers blame estimations, and Architects blame skills, training, indiscipline. Well let’s call the team failed.

As a developer I ask you – Why would you not want to get High? Why would you not want to be in a state of confidence where you know that your code commits is not going to break something else? Why would you not Unit Test?

Squares pegs in Round holes

This is a part 2 of a series of articles I have just started to write. I spoke about Think Clients (SPA) and CMS and what sort of problems do we have.

World Wide Web had a boost back in 2000s and then more recently there has been a huge surge on web frameworks and more importantly the kind of a user experience we have to deliver to our Square Peg in a Round Hole_0565consumers. The web is no longer a way we push some static content; it’s more about telling a compelling story, making people come to you and buy stuff. MyVessyl is one of the cool things that many of us would like to buy, and why we would want to buy is going to me some of because of it’s utility in daily life, but more so how this has been marketed to us. As you watch the site, it’s just so cool how it looks (watch the Video with sound on).

Let’s start decomposing the home page down:

  1. Rich content including Videos, Images, Animated Graphics and of course text
  2. There are parallax affects on the web site, making it a very rich user experience
  3. There are Call To Action (CTA) on the changing the behavior of the site on the page itself and some take you away
  4. Dynamic Capability like “placing order” and “My Account”

If this was a typical project we would start by ideating on the concept, which then gets passed along to the creative team to wireframe it up and apply designs, and then it reaches the implementation team where it the developers pick it up to convert it to code into whatever technology stack that has been picked up and it all end up with everyone then trying to bridge the gap and make it all come together. This process is just so complex and it has literally no collaboration until we hit the last step when we need to get things up and running.

 

blog-square pegs

 

This was not a very big problem a few years back when we had people who were T-Shaped, but this skill development has changed dramatically to a point when we have started to have Specialists in jobs. The job profiles and need to have specialists has landed us into this situation where at some point we thought it would help us improve productivity, but we find ourselves at a problem that handshakes and getting things work holistically is causing us more time than it did in past. Well who knew this would happen – side effects of evolution I guess.

Here is our Square Peg

 

In my last post around Thick Client and CMS, the way we are moving is to make our web client ‘Think’ and most common rationale I have heard is that we want to solve the handshake problem. It is true that if we allow a site developer to work amongst their tools they will be more productive and our backend engineers will be more productive doing what they do best (java, .net, php, or whatever). So the middle ground I hear we can achieve is to make our engineers expose content as a service and then let the front-end developer consume and showcase the data. If you are interested in seeing such an implementation, you can hop over to SapientNitro’s newly launched site. SapientNitro’s site is a good example of using similar technologies and making a SPA in middle of a Content Managed System and my delight Adobe’s AEM.

Going back, the most common argument I am hearing in space today is that we have to go to SP applications because it helps us solve the developer productivity problem, which I don’t understand.

Here is our Round Hole

How I see  this – SPA does not seems like a logical next step in this evolutionary cycle. It looks like the easiest way to get there which will lead to several other problems in a CMS world. SapientNitro was redesigned to give our viewers a phenomenal experience – it had to be what SapientNitro stands for and hence mobile first design and for that matter every design decision made was to meet that experience. The technology and architectural choices from there on was just to enable the experience. Now that to me makes sense.

Back to the dilemma – I was not sure if SPA was the way to go because it helped us solve a developer handshake problem,  and no I am confident that this is not a good way. We will simply be trying to push a wrong solution for a problem.

 

Next to come – How do we solve this in AEM

Thick Clients and CMS

I originally started this blog off with a once very famous thick client (Adobe Flex); and after half a decade I am seeing a surge of the thick clients once again – this time it’s more where the standards and adoption lies – HTML5 and JavaScript. (I am not going to define thick clients here as there are numerous definitions available out there). My 2nd innings with thick clients come at a point when I am not working on building typical Web application – the classic CRUD operations. For the last 2 years I have been engaged in building Content Management Platform using products like Adobe AEM 5.x (6.x), SDL Tridion, etc.

If you have worked with any of the systems, you would know that the typical design and implementation lifecycle of these projects has been and remains to be:

  1. Site Developers write the HTML
  2. Handover the HTML to the CMS developers
  3. CMS developers then use the HTML and integrate with backend code (JSPs in AEM)
  4. Take assistance from Site Developers to fix integration issues mostly JS/CSS

This is nothing new; we have been doing this for years, but as the technologies have progressed I have started to see that the time we now need to take the HTMLs and put them into AEM is increasing significantly compared to when I used to do SpringMVC/Struts based applications. The integrations weren’t issues, but only tasks that the developers would do over the course of story implementation. Some of these did need SMEs to go and fix the finer details but thats about it.

Current State

A typical CMS based site is such where we push content out for consumption of our readers, the content could be simple (text, articles, etc.) or rich (images, videos, etc.). However, in the current world the nature of consumer based sites is changing rapidly and they are getting to a point when we need a dynamic element to the content as well. This dynamic element can be in several shapes and forms as in Omni-channel (responsive is one way of doing it), feeds from external systems (social networks or internal feeds on fan engagement sites like NASCAR.com for Live Leaderboard). Barring a few areas where we have to introduce dynamic content, not a lot of content management websites need to be built on principles of a thick client. Good looking old school models of server taking care of presentation tier should work fine and it works very well as well. Those few dynamic areas can be handled very well using the jQuery (AJAX) sort of frameworks that bring together the two worlds and provide a final view to our consumers. So where does the problem lies or why dow e have to change anything if all this works so very well for us?

The Problem

This is not a technical problem that we have to solve for; as I said that technical integrations are well addressed in these products as they provide their templating languages and ways to integrate with them. If we look at Adobe they had a good way of making this happen (I am not going to talk about the patterns implemented), but their component architecture is actually nice that allow you write templates and what not very effectively. The problem we have on our hands is that of a process as I defined earlier. As we have moved to this product philosophy and as the technology landscape has expanded we have reached a point where I believe we have felt a need to have specializations in our domains. Once where there were no boundaries, now there are BOLD lines drawn as to who is going to be responsible of doing what job in a story implementation. A Site Developer will only work on HTML/JS/CSS (and there can be further divisions based on what skills we need), while a CQ developer will only be responsible writing Java/JSP, Dialogs (and again there are further divisions). While we have strived to achieve specializations in technology domains, we have not progressed on the process side of the things – we still want to do it that exact same way as we were doing it several years ago.

The Impact

As we move to achieve specialization on technology (which is a different matter); we are creating a big Gap between how the 2 worlds come together. The world of Agency where we have HTML/JS/CSS being coded doesn’t understand what the server side world is and vice versa. The marketing world starts conceptualizing the idea and the code even before the backend technology comes into play and by the time it reached the backend it’s already too late as the agency will do whatever they would find fit. This results in that problem – where we now need a mammoth effort to make the client work in a product like AEM. The developers to begin with have little clue what HTML is and how to fragment this into components that we will have to build as part of CMS; but the bigger problem is that HTML and JS to begin with is not conducive to such development. In cases/projects where we have the knowledge of backend systems available to us, we do little justice to that knowledge as the specialization pushed our site developers not to care about the server side technology. Some take care, but there are hardly anything that comes together as a symphony. As a result – we see increasing efforts to bring the two world together and making the projects more expensive than they ever need to be.

The Solution / The Dilemma

Recently there has been some flirtation around the idea of using thick clients within the mix of a CMS world. The idea of using technologies like Angular, Handlebar, Backbone, etc. to build a client which knows nothing about the backend solution. The content should be then exposed as “services” (CaaS) as XML or JSON feeds that then gets consumed by these clients to show the final presentation to the user. When I started to search about this, I was actually surprised that people have already tried this HippoCMS is one such example (at least 6 months old).

Other things that came up while reading on this topic is http://wisdomofganesh.blogspot.in/2007/10/life-above-service-tier.html where they have explained the challenged as described above and a specification to help solve for it. Without diving into merits and demerits of the solution, the idea we have been flirting around is something very similar – of course the technologies being looked upon are different (I am looking at Angular to fit in this space). What I don’t know if CMS world is really the world to apply this pattern. This pattern will absolutely fix the problem if people are willing to unlearn and learn about this new methodology, but the real challenge I have with is CMS and thick clients and if that technology is really something that should be used.

 

At this point – I don’t know if that is the answer, but it seems a wrong variable is being used to solve for the problem we have on hand.

Caching Architecture (Adobe AEM)

Cache (as defined by Wikipedia) is a component that transparently stores data such that future requests for data can be faster. I hereby presume that you understand cache as a component and any architectural patterns around caching and thereby with this presumption I will not go into depth of caching in this article. This article will cover some of the very basics of fundamentals of caching (wherever relevant) and then will take a deep dive into the point-of-view on the caching architecture with respect to a Content Management Plan in context to Adobe’s AEM implementation.

  Read More

A POV on Slice (a Adobe CQ Framework)

I have worked in Adobe CQ for last 18 months now; a large part of my not being active on this blog is because i was busy but the most important reason was that I wanted to understand the underlying technology before I go about writing it – best practices, what works and what does not and what not. Last I did about this was back in 2008 when I wrote about another Adobe product Flex 3.x.

Adobe CQ (Sling) background

If you don’t know what Adobe CQ is, you should go and read the documentation here and you will learn something new about a CMS system. I am looking at this framework today is Slice. I start by analysing the problems getting solved and I read the section they explain Why use it? I agree to all of these pain points in the Adobe CQ development system (actually let me clarify – i dont think any of this is Adobe CQ issue, this is a programming construct I see being used in underlying frameworks Sling). Refer to this document and you will notice that the problem arises where a lot of code sits naturally in a JSP (scripting language). This does not work well when you got to build enterprise applications and platforms on top of a product – the cost of managing the code is too high in the long run. The only way out is to fall back on enterprise standards (or best practices) of writing java and jee applications.

First of all, I have to appreciate that fact that people are thinking of solving for problems that we face during sling development. If you see the documentation, the model presented there is not ideal for a platform/enterprise version. I am not saying that is recommended from Adobe, and I am sure it has it’s place to be used. But, good to see others trying to solve for cases (Someday soon I will write more about the issues in current programming model).

 

What problems do we really need to solve for?

So my assuming you understand the construct and do understand the architectural principles of Sling (CQ) development, let me dive into my POV as if should I be using this framework or not. In CQ (Sling) I have been dealing with the following use cases.  All the following use cases are about one thing i.e. reading data from the underlying data-store. Persisting the the data in CQ and how to render that (using JSP or other scripting language) is out of scope of this discussion primarily because this framework does not solves for those. Back to use cases, to list them (without a lot of detail)

  • Read data stored on the same node where the component lives (stored via Dialog box)
  • Read data stored in another node
  • Aggregate data from several other nodes
  • Search data using default (for now) APIs built into CQ (sling)

 

Does Slice addresses these issues

Use case 1: Read data stored on the same node where the component lives (stored via Dialog box)

If I read the basics of this framework here, use for the very first first use case does not make any sense; that is something we get from the framework OOTB. Sling framework puts all of the properties from the current node into a HashMap and exposes the same to us in JSP simply. I like to call this Content-in-Content-out approach of managing contents. So, why use a model in a JSP for something that already exists!! My bet is that this is not the case for which the framework was written and this was just the starting point like a building block to offer something more.

 

Use case 2: Read data stored in another node

Where things get interesting is the 2nd use case (read this advanced capability here). However, what this does is basically allows me to instead of making a call to an underlying JCR API to read the node, i can now refer to my Model to load the same stuff via a POJO. This makes things interesting, but do not interest me too much because of one primary reason this will make me model all of my content in to a model, but not string enough to reject this fully.

In all of my enterprise world we have been taught to work with DAOs and VOs basically define a model of our objects and then refer to them everywhere. This is a powerful construct that we all have been using it for ages and i wont even debate why we should not use it. But, in case of CQ, where the underlying framework converts the entire node into a HashMap. This programming model does not gives you a structure at times, but if the models are not complex and nested it is actually a pretty powerful way of dealing with data. CQ default is a Page based CMS and converting that to a content-based CMS kind of makes things tricky anyways.

Most of the components will invariably represent either a content-type or meta-data to fetch content can’t be modelled into POJO, there are so many of those (and lets not forget that we get all of that free or cost in properties). This will be POJO hell in no time. This leaves us with the use case of industry domains like in sports we might have an Athlete or a Venue that you can model into a POJO. You got to ask yourself the question, if you want to manage via POJOs or just use HashMap which makes the programming flexible. Work with properties – simple key value pairs or keep managing POJOs. I think at some point, I might use this framework for modelling the underlying content-structures into POJOs. However, i’d like to following certain principles in mind that i dont understand yet with Slice as to how they play out. Like we are injecting PageManagers which are core Day CQ APIs. What worries me is that if I have so much of underlying APIs strapped into my code, if anything was to change, I will end up re-writing so much of this. Maybe for a project a few months (maybe a quarter) I will give this a shot – saves me time to manage content. But, if i was writing a platform that would like several years and might actually see a major CQ upgrade, I really will think twice (maybe thrice) to use this framework.

 

Use case 3 and 4

The last two use cases are either built on top of this a Model and search is not catered by this use case. So basically if you’d ask me this framework provides a small coverage of a use cases and with the increased debt of managing another framework like Guice etc, if something was to change in CQ I might just be taking a lot of risk in refactoring later. Something that doesn’t makes me feel very comfortable.

 

Verdict

Put simply in plain english

  1. For anything that is enterprise or platform(ish), I won’t go for this. The technical overhead that this framework and what it solves for is not worth the ROI of managing another framework
  2. If I’d come across strongly types content-types, I would consider to use this. The ultimate tie breaker would be how how many of the content-types need to be displayed “as-is”. If all i had to show are several compositions of data (search or what have you) maybe not. But, definitely something to consider
  3. I checked the roadmap, and I see very little activity – not something that will make me commit to using this long term

Not a lot of rope to hang off of; very little use cases to use and not exciting me much.

Modeling Content in CQ54

CQ54 is not a a typical RDBMS where I can model a set of relationships in table and soon a pretty picture starts to present itself. CQ54 stores everything in its content repository (CRX) as nodes which follow an entirely different data model i.e. Hierarchical Structure. My experience with hierarchical databases has been with day to day applications like MS Windows File Explorer, outlook Folder structure and in application development directory services like LDAP. So, I am going to start off by listing down what I understand of hierarchical database before I ponder down to my set of questions.

A hierarchical database model means that my data is arranged into a structure that is similar to tree (organization chart). This resides on the premise of a 1:N relationship where a child can have only one parent, where in a parent can have multiple child records. It has characteristics that differ a lot from a relational database. To list a few:

1. Every node is a record
2. Data is stored as properties on the node
3. Every node can be of a different data type – a hierarchical model does not mandate to have same record types under a same parent
4. A child node can be a child to one and only one parent

Hierarchical databases have their advantages:

Performance: Navigating records in a hierarchical model is faster because the references are basically pointers to the nodes/records directly. I don’t have to search in a index or a set of indexes. This however, is true in a case only when my data model does not have a lot of references. If i am working off with a content-model that includes multi-level references, performance will head south
Easy to understand: It is a simple hierarchy; and it represents something that is “non-technical”. It naturally represents what exists.
And Hierarchical databases have their limitations:

Unable to draw complex relationships between various child nodes – Given the premise that a child node will have only one parent, they are identified only by their parents. We have the capability like XPath to navigate directly to a node, which may be faster. If we do not know the exact path, we will have to navigate the tree (up to a parent, maybe the root) and then down to all nodes before we find what we are looking for. Some questions that I am asking myself:
1. What qualifies as a reference for an object?
2. Should speed at which the data can be fetched a driver to defining a reference?
3. What are the best practices that I should be aware off, when I am modeling my domain?
4. When do i decide I need a network model instead of hierarchical model?
Difficult of maintain – hierarchical models also mean that I do not have a command like ALTER TABLE. This essentially means then if I later decide to add another property to a specific node type I will have to write code to update all the nodes
1. Is there a way where I can update a node-type thus updating all the objects which are of that node type?
2. Is there a way to avoid such situations (apart from saying that lets get it right in Release 1.0 and pray to God client will not ask for a change request :))
Lack of Flexibility – In this article, Scott Ambler quotes – “Hierarchical databases fell out of favor with the advent of relational databases due to their lack of flexibility because it wouldn’t easily support data access outside the original design of the data structure. For example, in the customer-order schema you could only access an order through a customer, you couldn’t easily find all the orders that included the sale of a widget because the schema isn’t designed to all that.”. This is a typical case of where reporting is a must and it might be in many systems.
1. Are there other scenarios?
With all the context set of Hierarchical, it is now important we look at CQ54’s content repository – CRX. While CRX is a hierarchical repository it should not be confused with a hierarchical database. CRX provides us with JCR node types which allow us to force structure. We also have the capability of creating custom nodes, but should do it with care. The principle is not to go overboard with structure.

Question remains – “how do I manage content in CQ54”. I do not have a “go-to” answer, but what I have described below is how I am going to think when I start the process.

Content modeling: Look at the requirements i.e. wireframes, creative design assets and identify various content types, structures and relationships between content types. We can take the object-oriented approach and define everything as an object or keep similar content types together. There are several things that should be considered when taking one approach over the other:

What is the business process for crating an object type. Do the content types follow same workflow?
1. Steps that are required to activate a content. An article, a blog, a discussion forum entry may have the same process flow of an author and a reviewer then there is a case of having a single abstract content type
2. However, if an article needs a legal review and can be used in several other business process than just a simple article we may want to bring article out as its own content type
Reuse
1. What kind properties do they share
2. Modeling content for an education system where we have content types like a college or a school where we see a lot of similarities there is a case we can build on creating an abstract content
How does the content author wants to look at the content
1. If we have a set of users who want to manage their content as structured content like books, movies etc we should look to provide those content types very specifically
2. In another scenario if we have authors who do not worry a lot about specific objects i.e. Page-centric content creation then we can decide to club content types together

 

Managing Relationships: In CQ, given it has a hierarchy based data storage model which complies with JCR specifications, we do not have a way to create strict rule-based relationships. We can create relationships using one of the following ways:

Path based references: We can do this by creating properties on objects that hold a “path” or a “list of path” to which the content has relationships with
1. They are semantic
2. Not bound to an “obscure IDs”
3. Do not enforce integrity constraints which may create troubles in extensibility later
4. Being REST-ful they allow us to navigate directly to the node, thus making navigations very quick
5. Being REST-ful, they allow author to visualize their content relationships well thus providing them a business view of the content
Taxonomy based references: CQ uses tags to represent a taxonomy. However,  we can not extend tags to hold various profile information. So, you will need to have a mapping system that maps a tag to a content in CRX
1. Taxonomy is the foundation on which the IA stands. Taxonomy allows us a classification system and how the users will view the content on the site.
2. Allow us to clearly identify where in the system the content type resides
3. Is a conceptual framework allowing customers and their customer to locate what they need easily
4. It is hierarchical
Relational Database
1. Can be used in case we reach a point where relationships are too complex
2. Transactional Data should be kept out of CMS and placed in a relational database (or similar)
3. If we do not have to manage the lifecycle of the content
4. Please note that this  will make architecture complex, but if this is needed that it is

Java EE 6 vs. Spring Framework: A technology decision making process

I came across this article which I just want to share with others – I found this a good read.

Java EE 6 vs. Spring Framework: A technology decision making process.