Category Archive : Adobe Aem

Unit Testing in AEM (thinking loud)

This is not a recommendation of any sorts but a culmination of ideas and a few options that are available for us to use if we want to do unit testing within AEM. I had done some research for a client some time back and this article is largely influenced by that work but a lot of contextual stuff has been pulled out. I have still tried my best to ensure that the article here holds it’s essence. I will try to do a follow-up soon with a lot more details.

Option 1: Use Sling tools and test in-container

Apache sling has released a set of tools http://sling.apache.org/documentation/development/sling-testing-tools.html which can assist unit testing in the application. There tools offer several ways of doing the testing like a) good old JUnits where there are no external dependencies or b) Use of mocks – sling provides readymade mocks which reduce the effort or c) we can deploy the test cases in a CQ box (or sling) and run using OSGi references.

The approach I am recommending here is where we will deploy JUnits in an already hosted CQ instances and invoke the test cases remotely. I understand that this is not “old school unit testing as i am not abstracting any dependencies and my units include dependencies” but i have a reason for doing that. As a matter of fact if you have been following up my writings on unit testing you would know that I am not a big fan of mocking and actually am very happy to do any unit testing against dependencies if i can set it up.

 To do this we need a few things to happen as follows:

  1. We will need to have a hosted CQ instance that can be used as a container for running test cases
    1. We can use embedded systems but then we will have to spend additional effort creating content and what not. Also the embedded container will be sling and not CQ and we would like to keep the environment as close to what we use as possible
  2. The CQ instance should have a pre-populated set of products and images (this setup does uses AEM eCommerce module and PIM and DAM have been integrated with external systems) and that acts for us as readymade test data. These can be achieved using our backend integrations. We can chose to do it independently or can do it automatically (automation of these things can also happen over time to allow us to start quickly)
  3. For interactions with any backend services (like Order Management, Pricing, Account information), we would need to have a backend service instance running (as i said i prefer systems over mocks if possible) with all the variables and pieces setup. This instance should also have various data setup like user accounts, products instances, availability, prices, etc to ensure our use cases work. There are obvious challenges setting up independent backend services and we can explore one of the following 2 options
    1. Capture all requests and responses for a certain request type and serialize those into a test-data store. It can be a huge XML that we can store in a key-value pair sort of a system – can be a database like mongo (even SQL would do) or we can serialize on file system or;
    2. We can use an already existing backend system

Option 2: Use selenium as the functional testing tool

In this approach I am recommending not to use JUnits at all. The idea is to use the philosophy of system testing which can test all of your units in the code. This is a big departure from the traditional way of unit testing where all dependencies are mocked out, and we can run several tests quickly. While Option 1 is also to the same effect, in this approach we go a step further and leverage our system test suits. The idea is not to do this for every single use case, but pick up business critical functions like checkouts, order management, account management and automate those. The selenium scripts can then be integrated with a JUnit runner where we can then integrate it with CI tools and can run it from Eclipse or Maven and hence can be integrated with CI itself. This saves us the time to write those JUnits and manages a whole suite independently. This approach also needs a hosted CQ instance with product data setup, some content setups, and backend integrations just like in Option 1.

Of course this is bit tricky and not really unit testing but it has some huge upsides if done right.

AEM Development Workflow – Part 3 (Coding Old School)

In this series I have been trying to define the various development workflows (that have existed or will arise in near future) and what sort of problems do I see with each of those. We start off by seeing the very first use case that has probably existed for years now “the old school” way of coding in CQ. 

Requirements

carousel-authoring-requirements

Image 1: Component Breakdown

It is always beneficial to define what the end result has to be so that we can ensure that we have achieved what we set out to do. I had finally received a HTML for a carousel from my Site Development friend (download here) and our job is to take that carousel and a simple page and make it Content Managed in Adobe AEM. For this purpose I am going to use AEM 6.x as we need to use sightly in later phases. As we you see in the Image 1, you will how the carousel should look like. I have broken the page down to 2 parts/components that we will have to develop:

  1. Carousel Component – this is highlighted in blue block and marked #4
  2. Title Component – For the same of this use case and trying to handle how we can reuse 1 component for various displays, I have classified #1, #2 and #3 into “Text” component

Implementation

CQ Project Structure Setup & Tools

Image 2: Project Structure

Image 2: Project Structure

We are going to use the default CQ structure that allows us to handle components, templates, client libs, etc. I had also setup Eclipse for AEM for development purposed but I was unable to use it to it’s full extend because of the lack of integration and I had to fall back on CRXDE-Lite several times. Following are the details for each of the folders defined:

  • aem6scratchpad: This is the project application folder and is at the root of  everything we would be coding 
  • components: this folder holds all the components that we will create in this series
  • components/page: this folder holds the implementation of the templates
  • install: this is the folder where any .jar files will get installed
  • templates: holds the templates which will be used to create pages

 

Task: Move Static files in AEM

Image 3: Final Output in AEM

Image 3: Final Output in AEM

Image 3, shows us that I was able to achieve what I started to do as the very first step. I set out to take the HTML that I received and just convert the files into a set of files/components in CQ so that while the content would not be managed, but it would look the same. If you go back to where we started this, this Step is going to make or break how the Development Workflow looks like.

Approach: Like a Novice Developer

Coming into this exercise, I did everything that I see happen between a Site Developer and a CQ Developer. I went ahead a bit and did how a CQ developer would work – just dive into the coding and don’t worry about any standards, unit testing (JIT coding as I have come to call it). I have not coded in CQ for a long time from scratch. I have been reviewing code for best practices but a lot of my time last couple of years has been spent into designing solutions. I have looked at code from time to time, but it was a long time since I had coded a set of components and templates from scratch. For this exercise I believe that me in that situation was a very good thing, because that how our developers go though. Some of them are new to this domain, but even when they are well versed with coding, the approach that is taken is more or less what I ended up doing. 

 

Journey: Painful

The journey was nothing but painful all along the way. It took me 5 hours to do what should have been a few minutes job. The Site developer had the code up and running in a HTML file in a browser and all I had to do was to make it work “as is” within CQ. It seemed like the Force of Nature were working against me and everything I did, had a problem in it. I finally got it up and running (the designs done match off as is still), but it was excruciating pain. Here are a list of things that went wrong (and don’t be surprised if you find this the same as it happens at the start of every project):

The set of tools which I thought would do a wonderful job did not cut it out

I was so excited to work with AEM development tools for Eclispe. I got it up and running after a bit of struggle, but working in it only helped in coding a bit. A lot of things that had to be CQ specific like creating components, templates, dialogs weren’t doable in Ecliple. I ad to move into CRXDE-Lite to do all of that. The good part was that I didn’t have to worry about vault (vlt). I could easily synchronize between hosted CQ and Eclipse easy. It made it just a wee-bit easy

I had to write repetitive code into components

Image 4: CQ IncludesRecall those famous set of lines that we need in every component which if don’t get copied something will stop working. If you have forgotten those you can have a look at Image 4 or download this gist. Well, i tried the shortcut – the ultra famous shortcut that Adobe also recommend us to take (they like to call is Adapting it from existing component) which is supposed to be easy.

In the CRXDE Lite, create a new component folder in /apps/<myProject>/components/<myComponent> by copying an existing component, such as the Text component, and renaming it.

Well I did that, but seemed I kept picking up a wrong component because something or the other didn’t happen. There were times, when the template will not come up when I wanted to create the page or once i opened the page, the component did not show up in the side kick. Well, it seemed, one or the other properties were wrong. 

Where do the client libs go and where they come from

This is probably was the biggest one. The Site developer has picked up whatever version of client lib he wanted to work with. This was by design – I didnt intentionally tell him what to pick it up. Now when I had to get the JS and CSS into CQ along with the assets I had to ensure what he had picked up worked for me. No surprises there that once i got the carousel in the CQ, the first thing I saw was initialization errors which once solved led into next problem as to not the right JS and CSS were picked up. I finally “as a CQ developer” ended up fixing those and got my Carousel to work just as it was working on HTML. Yes!! all of the interactions were working.

 

End Result: Not a Production Ready code

I am going to paste snapshots of the files I ended up creating with links to them on GitHub. This code is not final code and does not have 1/10th of the quality standards that I would deliver. But, the code here will clearly highlight how and where the problem starts.

contentpage.jsp

Image 5: Content Page Template Code

carousel.jsp

Image 6: Carousel Code

 

In Near Future: This is beyond repair

At this point in time, we have seen how the code from Site Developers starts to get into the CQ and what shape it takes. When you take the above and give it to 10 odd developers on a project before you know you will have 10 different standards (or maybe 5) in a project. All of this happens very fast because all one has to do is to drop in a code and make it work. And then we identify the gravity of the problem when we start doing code reviews and realize that this needs more than just refactoring – more often than not this now needs to be fixed only via rewrite. I have not known many projects who have the time and budget to rewrite code. The project releases (thanks to Agile) come to a point when we get workable demos in front of clients – so it all gets tested and done and dusted. With no automated test cases (unit or integration or functional) the risk of breaking something as we make changes it so high that many will just drop and hope that the next component is fixed per a standard.

 

Fix Something

Absolutely!! Thats the intent. But, do we really need to fix the technology. Do we really see that technology that was at fault here? If we had a different set of technologies would this problem be solved for us? Or can we solve this in another way? 

 

Previous > AEM Development Workflow > The problem statement

Next > AEM Development Workflow > Coding Old School with Standards (TBD)

AEM Development Workflow – Part 2 (finding the problem)

While this post will be a self sufficient one, I will recommend that you go back and read up on the context and the background on the AEM Development Workflow to be able to fit this on in the bigger picture.

I was planning to do a grand component to be able to prove my point but it turns our that my HTML and CSS3 skills are not that great and I had to wait up on a Site Developer to be able to provide me with a HTML, CSS, JS for a Carousel – and that can take a while. In the interest of me moving forward, I adapted and now I am going to work with the OOTB List component that AEM offers. Why I picked this component because it a very good use case of what we do in AEM a lot of the time in component development. We build lists of everything, actually from how i see every component can inherit from a list component and render either 1 or more; but more about some other time.

Let’s set the background

Let’s open the OOTB List Component and it looks like having a bunch of files that so several things like render lists in a different way, do pagination, show a feed of list, analytics, has a dialog box and a bunch of other things. To how how the component looks like, i created a new page and then dropped the OOTB List Component. I the configured to pick a list of child pages and show me a list of those. The following Image is shows the authoring dialog as well as the output – ignore the Hello World for now (this is me being lazy). We can clearly see that the OOTB component works well and it does what it’s supposed to do.

spad - adwflow part 2 - list component

List (foundation) component authoring configuration and rendering

 

List component on Geometrixx

List component on Geometrixx

This is all good, but now the real project requirements come into play and the first thing we will notice that we have to show this same list in a different design; a design that was created by designers. Let’s look at the GeoMetrixx Outdoors and let’s open a Product Listing Page (PLP) where we are sure to find ‘a list component’. The image on the right depicts the component and it’s authoring configuration. You will notice a few key differences in both the consumer facing and authoring requirements:

  1. On the Authoring side the most significant is that now the Author is selecting a “Display As”. This property is used by the component logic to apply different styles. These different styles or views as I call it are how the component can be rendered on a page
  2. On the Consumer Side is going to be look an feel like those boxes around a list element (T-Shirt) in this case, of the price and its Currency, etc.

When I open up the code I see the following resource type definition

[code language=”java”]

sling:resourceType = geometrixx-outdoors/components/nav_products

[/code]

as we go deeper into the code we find that the code for rendering this View (read design) is actually sitting in a localized component of the Geometrixx Outdoors site.

So now let’s look at the code of this rendition from the list component in foundation as well as in the geometrixx (you can get these from Gist), but the lines I want to highlight are as follows

Code from libs/foundation/list

[code language=”java”]

&amp;lt;li&amp;gt;
&amp;lt;p&amp;gt;
&amp;lt;a href=&amp;quot;&amp;lt;%= listItem.getPath() %&amp;gt;.html&amp;quot; title=&amp;quot;&amp;lt;%= StringEscapeUtils.escapeHtml4(listItem.getTitle()) %&amp;gt;&amp;quot;
onclick=&amp;quot;CQ_Analytics.record({event: ‘listItemClicked’, values: { listItemPath: ‘&amp;lt;%= listItem.getPath() %&amp;gt;’ }, collect: false, options: { obj: this }, componentPath: ‘&amp;lt;%=resource.getResourceType()%&amp;gt;’})&amp;quot;&amp;gt;&amp;lt;%= StringEscapeUtils.escapeHtml4(listItem.getTitle()) %&amp;gt;
&amp;lt;/p&amp;gt;
&amp;lt;/li&amp;gt;

[/code]

Code from geometrixx-outdoors (list)

[code language=”java”]

&amp;lt;li&amp;gt;
&amp;lt;p&amp;gt;
&amp;lt;a href=&amp;quot;&amp;lt;%= listItem.getPath() %&amp;gt;.html&amp;quot; title=&amp;quot;&amp;lt;%= StringEscapeUtils.escapeHtml4(listItem.getTitle()) %&amp;gt;&amp;quot;
onclick=&amp;quot;CQ_Analytics.record({event: ‘listItemClicked’, values: { listItemPath: ‘&amp;lt;%= listItem.getPath() %&amp;gt;’ }, collect: false, options: { obj: this }, componentPath: ‘&amp;lt;%=resource.getResourceType()%&amp;gt;’})&amp;quot;&amp;gt;&amp;lt;%= StringEscapeUtils.escapeHtml4(listItem.getTitle()) %&amp;gt;
&amp;lt;/p&amp;gt;
&amp;lt;/li&amp;gt;

[/code]

If you have not already spotted, let me highlight the singular fact that codes in their rendering logic include presentation attributes like width, height (one of them has nothing to it will render like plain old HTML as browser does).

Identification of the problem

My Site Development team delivers to me this great looking markup that works very well in my browsers. I just received my Carousel Code from a Site Developer colleague (Thanks a lot Nitin Suri) and I see this great HTML which works fine (download the whole working copy from my Github). This is the where it all goes haywire. This guy Nitin, does not know CQ while is an awesome Site Developer; one of the best I have worked with. And I am a guy who knows little bit of HTML 5, but very little CSS3. JS i do okay. I now have to take up this great looking markup from these static files and create the following in CQ:

  1. A Component called “Carousel” this does everything that the Site Developers have delivered
    1. Identify the set of authoring components in this component; I see some key content areas in there like Main Heading, Sub Heading, Description and some Slides where I presume authors can include images, Text, Rich Text, Videos, Flash (really Flash!!) – for now I will presume only images
    2. Authors can chose how many slides they want, should there be an auto rotate and what should be the wait interval; what sort of effects does this carousel needs to have
  2. A Template called “Landing Page” where this carousel will rendered among other things that I don’t have yet

I don’t see this as a problem, while this is quoted to be the problem that everyone wants to fix and hence as i read it this classic model of coding in CQ does not work. Remember my post earlier here where I spoke about how even Adobe has come up with sightly as the templating language to fix this inefficiency between this handover.

The problem is how we code this component. Rewind to the way the list component is coded, and we see all of the presentation logic, the business logic – all of it is stuffed into this single JSPs. Well if you are going to stuff shit up while coding, you are going to take time to find the meat from the shit. It’s like mining that takes so long to get to the real stones.

So the question is why do we have to code the way we code in classic methods. Why can we not go back and write good code and make everyone’s lives easier. The AEM workflow problem is not really an inefficiency in the handover of HTMLs to CQ developers but how we should have been writing the code to begin with. We start here by seeing where the problem starts and how the code has been written. Unfortunately, we do see the OOTB Code in AEM as provided by Adobe itself are not coded to solve the problem. When I speak with Adobe they make it clear that these are reference sites and are to used as “Self-learning” but little did they know at the time that people will take this a practice and convert this into a culture.

 

Previous > 

Next > AEM Development Workflow – Part 3 Coding Old School

AEM Development Workflow – Part 1 (introduction)

If you have followed my blogs recently, you would know that I have debated the Development Workflow for a CMS based application being developed on top of Adobe AEM. If you missed those posts you can read here and here. To summarize and set context

The challenge we have seen is too much time spent to integrate the HTML/JS/CSS being delivered by the Site Developers into the AEM’s components and templates. And the solution that seems to be cropping up is not to fix the process but to pick technologies that change the workflow fundamentally. More specifically, I have seen some suggestions that try to solve many other things and lose sight of this simplest of the problems.

Technology should really be an enabler and in this case if there is a technology that can enable us to do something better and faster why not. I am thinking through is if there is a universal solution to this problem and can we apply this solution pattern or technology to all solutions CMS or not; AEM or Tridion, etc.; or does there needs to be a more scientific way of arriving at the solution.

aem-dev-workflowReferring to a few of the slides from Adobe’s presentation presentation here; I want to highlight two specific slides (merged here into one) where the problem now stands in how the components get build and how much time it takes (the problem itself). If you have read the entire presentation, you would already know that Adobe’s answer to fixing the problem is sightly.

What I want to do here is to compare 3 workflows and see what each one has to offer and what’s the best possible way to remove this inefficiency or improve productivity.

  1. Follow the current set of technologies JSP-Java but change the way of working aka different set of tools, trainings and processes
  2. Use Sightly ~ the new templating language pushed by AEM
  3. Use other templating languages like handlebars or angular which are more platform agnostic and goes beyond just CMS and AEM (old school application development also fits)

 

There will be a set of posts in coming week as we go through each of those 3 ways of writing the application. For the sake of this exercise, I am going to take on one of the most common components ever used in the CMS world – the carousel. For each of these activities I will elaborate upon the steps which entail to reach from a design to end product aka – final working part for a consumer. The solution will be built in Adobe AEM (not going to touch any other set of CMS tools for now; maybe later). For me to be objectively look at the end results, I will use the following parameters for comparison in the methods:

  1. Level of Effort (LOE) spent by a Site Developer in converting the design into a working front-end code
  2. LOE spent by a AEM developer to take up the front-end code and convert it into a CQ Component
  3. LOE spent in any sort of integrations and bug fixes between the two paradigms

 

As we progress, I will post sample code, screenshots of what I am trying to do. I will also publish a product (Code, Configuration, Content) that you should be able to run on an instance of AEM if you have access to one handy. I will also publish the tools I would use along the way – including instructions or setting it up on your boxes. If you like this subscribe to this URL and you can directly see posts only applicable to this topic.

 

Next > Finding the Problem

Caching Architecture (Adobe AEM)

Cache (as defined by Wikipedia) is a component that transparently stores data such that future requests for data can be faster. I hereby presume that you understand cache as a component and any architectural patterns around caching and thereby with this presumption I will not go into depth of caching in this article. This article will cover some of the very basics of fundamentals of caching (wherever relevant) and then will take a deep dive into the point-of-view on the caching architecture with respect to a Content Management Plan in context to Adobe’s AEM implementation.

  Read More

A POV on Slice (a Adobe CQ Framework)

I have worked in Adobe CQ for last 18 months now; a large part of my not being active on this blog is because i was busy but the most important reason was that I wanted to understand the underlying technology before I go about writing it – best practices, what works and what does not and what not. Last I did about this was back in 2008 when I wrote about another Adobe product Flex 3.x.

Adobe CQ (Sling) background

If you don’t know what Adobe CQ is, you should go and read the documentation here and you will learn something new about a CMS system. I am looking at this framework today is Slice. I start by analysing the problems getting solved and I read the section they explain Why use it? I agree to all of these pain points in the Adobe CQ development system (actually let me clarify – i dont think any of this is Adobe CQ issue, this is a programming construct I see being used in underlying frameworks Sling). Refer to this document and you will notice that the problem arises where a lot of code sits naturally in a JSP (scripting language). This does not work well when you got to build enterprise applications and platforms on top of a product – the cost of managing the code is too high in the long run. The only way out is to fall back on enterprise standards (or best practices) of writing java and jee applications.

First of all, I have to appreciate that fact that people are thinking of solving for problems that we face during sling development. If you see the documentation, the model presented there is not ideal for a platform/enterprise version. I am not saying that is recommended from Adobe, and I am sure it has it’s place to be used. But, good to see others trying to solve for cases (Someday soon I will write more about the issues in current programming model).

 

What problems do we really need to solve for?

So my assuming you understand the construct and do understand the architectural principles of Sling (CQ) development, let me dive into my POV as if should I be using this framework or not. In CQ (Sling) I have been dealing with the following use cases.  All the following use cases are about one thing i.e. reading data from the underlying data-store. Persisting the the data in CQ and how to render that (using JSP or other scripting language) is out of scope of this discussion primarily because this framework does not solves for those. Back to use cases, to list them (without a lot of detail)

  • Read data stored on the same node where the component lives (stored via Dialog box)
  • Read data stored in another node
  • Aggregate data from several other nodes
  • Search data using default (for now) APIs built into CQ (sling)

 

Does Slice addresses these issues

Use case 1: Read data stored on the same node where the component lives (stored via Dialog box)

If I read the basics of this framework here, use for the very first first use case does not make any sense; that is something we get from the framework OOTB. Sling framework puts all of the properties from the current node into a HashMap and exposes the same to us in JSP simply. I like to call this Content-in-Content-out approach of managing contents. So, why use a model in a JSP for something that already exists!! My bet is that this is not the case for which the framework was written and this was just the starting point like a building block to offer something more.

 

Use case 2: Read data stored in another node

Where things get interesting is the 2nd use case (read this advanced capability here). However, what this does is basically allows me to instead of making a call to an underlying JCR API to read the node, i can now refer to my Model to load the same stuff via a POJO. This makes things interesting, but do not interest me too much because of one primary reason this will make me model all of my content in to a model, but not string enough to reject this fully.

In all of my enterprise world we have been taught to work with DAOs and VOs basically define a model of our objects and then refer to them everywhere. This is a powerful construct that we all have been using it for ages and i wont even debate why we should not use it. But, in case of CQ, where the underlying framework converts the entire node into a HashMap. This programming model does not gives you a structure at times, but if the models are not complex and nested it is actually a pretty powerful way of dealing with data. CQ default is a Page based CMS and converting that to a content-based CMS kind of makes things tricky anyways.

Most of the components will invariably represent either a content-type or meta-data to fetch content can’t be modelled into POJO, there are so many of those (and lets not forget that we get all of that free or cost in properties). This will be POJO hell in no time. This leaves us with the use case of industry domains like in sports we might have an Athlete or a Venue that you can model into a POJO. You got to ask yourself the question, if you want to manage via POJOs or just use HashMap which makes the programming flexible. Work with properties – simple key value pairs or keep managing POJOs. I think at some point, I might use this framework for modelling the underlying content-structures into POJOs. However, i’d like to following certain principles in mind that i dont understand yet with Slice as to how they play out. Like we are injecting PageManagers which are core Day CQ APIs. What worries me is that if I have so much of underlying APIs strapped into my code, if anything was to change, I will end up re-writing so much of this. Maybe for a project a few months (maybe a quarter) I will give this a shot – saves me time to manage content. But, if i was writing a platform that would like several years and might actually see a major CQ upgrade, I really will think twice (maybe thrice) to use this framework.

 

Use case 3 and 4

The last two use cases are either built on top of this a Model and search is not catered by this use case. So basically if you’d ask me this framework provides a small coverage of a use cases and with the increased debt of managing another framework like Guice etc, if something was to change in CQ I might just be taking a lot of risk in refactoring later. Something that doesn’t makes me feel very comfortable.

 

Verdict

Put simply in plain english

  1. For anything that is enterprise or platform(ish), I won’t go for this. The technical overhead that this framework and what it solves for is not worth the ROI of managing another framework
  2. If I’d come across strongly types content-types, I would consider to use this. The ultimate tie breaker would be how how many of the content-types need to be displayed “as-is”. If all i had to show are several compositions of data (search or what have you) maybe not. But, definitely something to consider
  3. I checked the roadmap, and I see very little activity – not something that will make me commit to using this long term

Not a lot of rope to hang off of; very little use cases to use and not exciting me much.

Modeling Content in CQ54

CQ54 is not a a typical RDBMS where I can model a set of relationships in table and soon a pretty picture starts to present itself. CQ54 stores everything in its content repository (CRX) as nodes which follow an entirely different data model i.e. Hierarchical Structure. My experience with hierarchical databases has been with day to day applications like MS Windows File Explorer, outlook Folder structure and in application development directory services like LDAP. So, I am going to start off by listing down what I understand of hierarchical database before I ponder down to my set of questions.

A hierarchical database model means that my data is arranged into a structure that is similar to tree (organization chart). This resides on the premise of a 1:N relationship where a child can have only one parent, where in a parent can have multiple child records. It has characteristics that differ a lot from a relational database. To list a few:

1. Every node is a record
2. Data is stored as properties on the node
3. Every node can be of a different data type – a hierarchical model does not mandate to have same record types under a same parent
4. A child node can be a child to one and only one parent

Hierarchical databases have their advantages:

Performance: Navigating records in a hierarchical model is faster because the references are basically pointers to the nodes/records directly. I don’t have to search in a index or a set of indexes. This however, is true in a case only when my data model does not have a lot of references. If i am working off with a content-model that includes multi-level references, performance will head south
Easy to understand: It is a simple hierarchy; and it represents something that is “non-technical”. It naturally represents what exists.
And Hierarchical databases have their limitations:

Unable to draw complex relationships between various child nodes – Given the premise that a child node will have only one parent, they are identified only by their parents. We have the capability like XPath to navigate directly to a node, which may be faster. If we do not know the exact path, we will have to navigate the tree (up to a parent, maybe the root) and then down to all nodes before we find what we are looking for. Some questions that I am asking myself:
1. What qualifies as a reference for an object?
2. Should speed at which the data can be fetched a driver to defining a reference?
3. What are the best practices that I should be aware off, when I am modeling my domain?
4. When do i decide I need a network model instead of hierarchical model?
Difficult of maintain – hierarchical models also mean that I do not have a command like ALTER TABLE. This essentially means then if I later decide to add another property to a specific node type I will have to write code to update all the nodes
1. Is there a way where I can update a node-type thus updating all the objects which are of that node type?
2. Is there a way to avoid such situations (apart from saying that lets get it right in Release 1.0 and pray to God client will not ask for a change request :))
Lack of Flexibility – In this article, Scott Ambler quotes – “Hierarchical databases fell out of favor with the advent of relational databases due to their lack of flexibility because it wouldn’t easily support data access outside the original design of the data structure. For example, in the customer-order schema you could only access an order through a customer, you couldn’t easily find all the orders that included the sale of a widget because the schema isn’t designed to all that.”. This is a typical case of where reporting is a must and it might be in many systems.
1. Are there other scenarios?
With all the context set of Hierarchical, it is now important we look at CQ54’s content repository – CRX. While CRX is a hierarchical repository it should not be confused with a hierarchical database. CRX provides us with JCR node types which allow us to force structure. We also have the capability of creating custom nodes, but should do it with care. The principle is not to go overboard with structure.

Question remains – “how do I manage content in CQ54”. I do not have a “go-to” answer, but what I have described below is how I am going to think when I start the process.

Content modeling: Look at the requirements i.e. wireframes, creative design assets and identify various content types, structures and relationships between content types. We can take the object-oriented approach and define everything as an object or keep similar content types together. There are several things that should be considered when taking one approach over the other:

What is the business process for crating an object type. Do the content types follow same workflow?
1. Steps that are required to activate a content. An article, a blog, a discussion forum entry may have the same process flow of an author and a reviewer then there is a case of having a single abstract content type
2. However, if an article needs a legal review and can be used in several other business process than just a simple article we may want to bring article out as its own content type
Reuse
1. What kind properties do they share
2. Modeling content for an education system where we have content types like a college or a school where we see a lot of similarities there is a case we can build on creating an abstract content
How does the content author wants to look at the content
1. If we have a set of users who want to manage their content as structured content like books, movies etc we should look to provide those content types very specifically
2. In another scenario if we have authors who do not worry a lot about specific objects i.e. Page-centric content creation then we can decide to club content types together

 

Managing Relationships: In CQ, given it has a hierarchy based data storage model which complies with JCR specifications, we do not have a way to create strict rule-based relationships. We can create relationships using one of the following ways:

Path based references: We can do this by creating properties on objects that hold a “path” or a “list of path” to which the content has relationships with
1. They are semantic
2. Not bound to an “obscure IDs”
3. Do not enforce integrity constraints which may create troubles in extensibility later
4. Being REST-ful they allow us to navigate directly to the node, thus making navigations very quick
5. Being REST-ful, they allow author to visualize their content relationships well thus providing them a business view of the content
Taxonomy based references: CQ uses tags to represent a taxonomy. However,  we can not extend tags to hold various profile information. So, you will need to have a mapping system that maps a tag to a content in CRX
1. Taxonomy is the foundation on which the IA stands. Taxonomy allows us a classification system and how the users will view the content on the site.
2. Allow us to clearly identify where in the system the content type resides
3. Is a conceptual framework allowing customers and their customer to locate what they need easily
4. It is hierarchical
Relational Database
1. Can be used in case we reach a point where relationships are too complex
2. Transactional Data should be kept out of CMS and placed in a relational database (or similar)
3. If we do not have to manage the lifecycle of the content
4. Please note that this  will make architecture complex, but if this is needed that it is

OSGI: The new Toy

I heard about OSGI sometime early last year, but I did not care about it – it meant start thinking about a new way of development and deployment (thats what I heard from my friends) and I did not want to learn something else when Spring worked great for me. And, my colleagues who spoke about OSGI did not do a good job of advocating OSGI. Last month, I came across Adobe Day as a potential platform for a project implementation. Day was amongst some of the CMS platforms that I was evaluating, namely – SDL Tridion, Oracle Stellant and Interwoven.

It was Adobe Day that introduced me to OSGI and during one of the webinars, I was introduced to OSGI using a simple yet powerful graphic (see below).

OSGI - Intro

OSGI - Intro

And from that moment on, I was simply hooked on. I have been a big fan of Java and its deployment frameworks like Maven – but essentially using those frameworks and tools also meant that sooner or later I would be dealing with various versions of same library (ehCache, Log4j being the most common) and when I wen to deployment on JBoss or Apache Tomcat, more often than not I would have to tweak my project dependencies to or servers to ensure that those “dependencies” are resolved appropriately.

Suddenly, OSGI seems to be the answer to everything (or almost everything). I would be able to create and deploy components and then choose to include them as needed; I did not have to worry about backward compatibility or which service to deploy – opportunities were endless.

So today, I am going to start off my journey to read more about OSGI and explore some common available containers like Equinox (used by Eclipse) and Apache Felix (used by Adobe Day). And I hope that while I learn more I am able to share my thoughts and some of the best practices around the same.