Category Archive : Design

Concurrency Pattern: Producer and Consumer

In my 15-year career, I have come across the Producer and Consumer problem only a few times. In most programming cases, we perform functions synchronously, and the JVM or the web container handles the complexities of multi-threading on its own. However, certain kinds of use cases do need it. Last week, I came across one such use case that took me back 3 years, to when I last did it. However, the way it was done last time was very different.

When I first heard the problem statement, I knew instantly what was needed. However, my approach this time was going to be different from last time, simply because of how I view technology in my life today. I will not go into the non-technical side and will jump straight into the problem and its solution. I started to look at what already existed and came across a couple of posts that helped channel my thoughts in the right direction.

Problem Statement

We need a solution for a batch migration. We are migrating data from System 1 to System 2, and in the process we need to do three tasks:

  • Load data from Database based on groups
  • Process the data
  • Update the records loaded in step#1 with modifications

We have to handle hundreds of groups, and each group will have around 40K records. You can imagine the amount of time it would take if we were to perform this exercise in a synchronous fashion. The image here explains the problem effectively.

Producer Consumer: The Problem


Producer and Consumer Pattern

Let us take a look at the Producer Consumer pattern to begin with. If you refer to the problem statement above and look at the image, we see many entities that are ready with their part of the data, but not enough workers to process it all. As the producers continue to line up, the queue just keeps growing, and the system starts to hog threads and take a lot of time.

Intermediate Solution

Producer Consumer: The Intermediate approach

We do have an intermediate solution. Refer to the image and you will immediately notice that the producers are piling up their work in a filing cabinet and the worker continues to pick it up as they get done with the previous task. However, this approach does have some glaring shortcomings:

  1. There is still one worker who has to do all the work. The external systems may be happy, but the task will still continue to exist until the worker has completed all of the tasks
  2. The producers will pile up their data in a queue and it needs resources to hold the same. Just as in this example the cabinet can fill up, the same can happen with the JVM resources too. We need to be careful how much data we are going to place in memory and in some cases it may not be much.

The Solution

Producer Consumer: The Solution


The solution is something we see every day in many places, such as the cinema hall queue or petrol pumps. Many people come in to book a ticket, and the more people come in, the more counters are opened to issue tickets. Refer to the image here and you will notice that the producers keep adding their jobs to the cabinet and we have more workers to handle the workload.

Java provides the concurrency package (java.util.concurrent) to solve this issue. Until now, I had always worked on threading at a much lower level, and this was the first time I was going to work with this package. As I explored the web and read what fellow bloggers had to say, I came across one very good article. It helped me understand the use of BlockingQueue very effectively. However, the solution provided by Dhruba would not have helped me achieve the high throughput that is needed. So, I started to explore the use of ArrayBlockingQueue for the same.

The Controller

This is the first class, where the contract between the producers and consumers is managed. The controller will set up 1 thread for the producer and 2 threads for the consumers. Based on our needs we can create as many threads as we need, and can even read the count from a properties file or do some dynamic magic. For now, we will keep this simple.

package com.kapil.techieforever.producerconsumer;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TestProducerConsumer
{
    public static void main(String[] args)
    {
        // One thread for the producer and two for the consumers.
        ExecutorService threadPool = Executors.newFixedThreadPool(3);

        try
        {
            Broker broker = new Broker();

            threadPool.execute(new Consumer("1", broker));
            threadPool.execute(new Consumer("2", broker));
            Future<?> producerStatus = threadPool.submit(new Producer(broker));

            // this will wait for the producer to finish its execution.
            producerStatus.get();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
        finally
        {
            // Stop accepting new tasks; the consumers still drain the queue and then exit.
            threadPool.shutdown();
        }
    }
}

I am using ExecutorService to create a thread pool and manage it. This is more effective than using the basic Thread implementation, as the pool handles creating, reusing and terminating the threads as needed. You will also notice that I am using the Future class to get the status of the producer thread. Calling get() on it blocks my program from further execution until the producer has finished, which makes it a nice replacement for the “.join” method on threads. Note: I am not using Future very effectively in this example, so you may want to try a few things as you see fit.
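To make the comparison with “.join” concrete, here is a minimal sketch of waiting on a Future that also carries a result. The class name FutureWaitSketch, the helper runAndWait and the returned value 5 are my own assumptions for illustration and are not part of the project code.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureWaitSketch
{
    // Hypothetical helper: runs one task on a pool and returns its result once it is done.
    static int runAndWait()
    {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try
        {
            // submit() accepts a Callable, so the task can hand back a value.
            Future<Integer> status = pool.submit(() -> {
                Thread.sleep(50); // simulate some work
                return 5;         // e.g. the number of items produced
            });

            // get() blocks the calling thread until the task completes, much like join(),
            // but it also returns the result or reports the task's failure.
            return status.get();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        catch (ExecutionException e)
        {
            throw new IllegalStateException("task failed", e.getCause());
        }
        finally
        {
            pool.shutdown(); // stop accepting new tasks
        }
    }

    public static void main(String[] args)
    {
        System.out.println("Produced: " + runAndWait());
    }
}
```

Unlike join(), the Future tells you how the task ended: a normal return value, or an ExecutionException wrapping whatever the task threw.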

Also, you should note the Broker class which is being used as filing cabinet between the producers and consumers. We will see its implementation in just a little while.

The Producer

This class is responsible for producing the data that needs to be worked upon.

package com.kapil.techieforever.producerconsumer;

public class Producer implements Runnable
{
    private final Broker broker;

    public Producer(Broker broker)
    {
        this.broker = broker;
    }

    @Override
    public void run()
    {
        try
        {
            for (int i = 1; i <= 5; i++)
            {
                System.out.println("Producer produced: " + i);
                Thread.sleep(100); // simulate a slow producer
                broker.put(i);
            }

            System.out.println("Producer finished its job; terminating.");
        }
        catch (InterruptedException ex)
        {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            ex.printStackTrace();
        }
        finally
        {
            // Tell the consumers that no more data is coming, even if we were interrupted.
            this.broker.continueProducing = false;
        }
    }
}

This class does the simplest thing it can: it adds an integer to the broker. Some key areas to note are:
1. There is a property on Broker which the producer updates at the end, when it is done producing. It plays the role of what is also known as the “final” or “poison” entry: the consumers use it to know that no more data is coming
2. I have used Thread.sleep to simulate that some producers may take more time to produce the data. You can tweak this value and see how the consumers react
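For comparison, the “poison” entry mentioned in point 1 can also be an actual item placed on the queue instead of a flag on the broker. The following is only a sketch of that variant, not the code used in this post; the class PoisonPillSketch, the POISON sentinel value and the helper produceAndConsume are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class PoisonPillSketch
{
    // Hypothetical sentinel: a value the producer never emits as real data.
    static final Integer POISON = Integer.MIN_VALUE;

    // Produces 1..items on the calling thread and consumes them on a second thread.
    static List<Integer> produceAndConsume(int items)
    {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        List<Integer> processed = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try
            {
                while (true)
                {
                    Integer data = queue.take(); // blocks until something arrives
                    if (data.equals(POISON))
                    {
                        break;                   // the producer is done
                    }
                    processed.add(data);
                }
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try
        {
            for (int i = 1; i <= items; i++)
            {
                queue.put(i);
            }
            queue.put(POISON);  // the last entry tells the consumer to stop
            consumer.join();    // after join() the consumer's writes are visible here
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        return processed;
    }

    public static void main(String[] args)
    {
        System.out.println(produceAndConsume(5));
    }
}
```

With more than one consumer, the producer would have to enqueue one poison entry per consumer, which is one reason a shared flag can be simpler.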

The Consumer

This class is responsible for reading the data from the broker and doing its job

package com.kapil.techieforever.producerconsumer;

public class Consumer implements Runnable
{
    private final String name;
    private final Broker broker;

    public Consumer(String name, Broker broker)
    {
        this.name = name;
        this.broker = broker;
    }

    @Override
    public void run()
    {
        try
        {
            Integer data = broker.get();

            // Keep going while the producer is still active or data is still arriving.
            while (broker.continueProducing || data != null)
            {
                if (data != null) // null only means the poll timed out
                {
                    Thread.sleep(1000); // simulate slow processing
                    System.out.println("Consumer " + this.name + " processed data from broker: " + data);
                }

                data = broker.get();
            }

            System.out.println("Consumer " + this.name + " finished its job; terminating.");
        }
        catch (InterruptedException ex)
        {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            ex.printStackTrace();
        }
    }
}

This is again a simple class that reads the Integer and prints it on the console. However, key points to note are:
1. The loop that processes data keeps running as long as either of two conditions holds: the producer is still producing, or the broker still has some data
2. Again, Thread.sleep is used to create different scenarios

The Broker

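package com.kapil.techieforever.producerconsumer;

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class Broker
{
    // Bounded queue: put() blocks while 100 items are already waiting.
    public final ArrayBlockingQueue<Integer> queue = new ArrayBlockingQueue<Integer>(100);

    // volatile so that the producer's update is visible to the consumer threads.
    public volatile boolean continueProducing = true;

    public void put(Integer data) throws InterruptedException
    {
        this.queue.put(data);
    }

    public Integer get() throws InterruptedException
    {
        // Wait at most one second; returns null if nothing arrived in that time.
        return this.queue.poll(1, TimeUnit.SECONDS);
    }
}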

The very first thing to note is that we are using ArrayBlockingQueue as the data holder. I am not going to describe everything it does, but I urge you to read about it in the JavaDocs here. However, I will explain that the producers place data in the queue and the consumers fetch it from the queue in FIFO order. If the producers are slow, the consumers will wait for data to come in, and if the queue is full, the producers will wait for space to free up.

Also, note that I am using the ‘poll’ method of the queue instead of ‘take’. This ensures that the consumers do not keep waiting forever; the wait times out after one second. This helps with inter-thread communication and lets us stop the consumers when all the data is processed. (Note: try replacing poll with take and you will see some interesting outputs).
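The difference between the two calls can be sketched as below. This is an illustration under my own assumptions (the class PollVersusTakeSketch and both helper methods are hypothetical), not part of the project code: a timed poll gives up and returns null, while take keeps its thread blocked until data arrives or the thread is interrupted.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class PollVersusTakeSketch
{
    // A timed poll on an empty queue returns null once the timeout elapses.
    static Integer pollEmptyQueue()
    {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1);
        try
        {
            return queue.poll(100, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // take() on an empty queue never times out; the thread stays blocked until interrupted.
    static boolean takeStillBlockedAfter(long millis)
    {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1);
        Thread taker = new Thread(() -> {
            try
            {
                queue.take();
            }
            catch (InterruptedException e)
            {
                // released by the interrupt below
            }
        });
        taker.start();
        try
        {
            taker.join(millis);               // give it some time to (not) finish
            boolean blocked = taker.isAlive();
            taker.interrupt();                // release the blocked thread
            taker.join();
            return blocked;
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            taker.interrupt();
            return false;
        }
    }

    public static void main(String[] args)
    {
        System.out.println("poll on empty queue: " + pollEmptyQueue());
        System.out.println("take still blocked: " + takeStillBlockedAfter(200));
    }
}
```

A consumer stuck in take() never gets back to its loop condition, so it cannot notice that the producer has finished.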

Code

I have the code sitting on Google project hosting. Feel free to go across and download it from there. It is essentially an Eclipse (Spring STS) project. Depending on when you download it, you may also get additional packages and classes. Feel free to look into those too and share your comments.
– You can browse the source code on the SVN browser or;
– You can download it from the project itself

A side step solution

Initially, I posted this solution in the middle, but then I realized that it did not belong there, so I took it out of the main content and placed it at the end. Another variant of the final solution is that the workers/consumers do not take one job at a time, but pick up multiple jobs together and finish them before going for the next set. This approach can generate similar results, but when jobs do not take the same time to finish, some workers will end sooner than others, creating a bottleneck. And if the jobs are allocated beforehand, meaning all the consumers hold all their jobs before they start processing (which is not the producer-consumer pattern), the problem can grow even more and add further delays to the processing logic.
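The “pick up multiple jobs together” variant can be sketched with BlockingQueue.drainTo, which moves up to a given number of waiting items into a collection in one call. The class BatchDrainSketch and the method drainInBatches are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BatchDrainSketch
{
    // Empties the queue in batches of at most batchSize items.
    static List<List<Integer>> drainInBatches(BlockingQueue<Integer> queue, int batchSize)
    {
        List<List<Integer>> batches = new ArrayList<>();
        while (true)
        {
            List<Integer> batch = new ArrayList<>();
            queue.drainTo(batch, batchSize); // does not block; takes what is available
            if (batch.isEmpty())
            {
                break;                       // nothing left to pick up
            }
            batches.add(batch);              // a real worker would process the batch here
        }
        return batches;
    }

    public static void main(String[] args)
    {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        for (int i = 1; i <= 5; i++)
        {
            queue.add(i);
        }
        System.out.println(drainInBatches(queue, 2));
    }
}
```

A worker holding a batch of slow jobs keeps them to itself while other workers sit idle, which is the bottleneck described above.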

The Architect’s Eye – Communicating Errors

In many of my projects, I have found architects guilty of preparing a design that leaves error messages out of the picture. And now I come across an article that shows 35 creative designs for a 404 page (http://www.onextrapixel.com/2011/03/09/the-secret-of-a-successful-error-page-with-35-amazing-404-page-designs/). As I was browsing some of these designs, I recalled a designer I worked with, Beth. I learnt so much from her about design and especially Information Architecture. I have always found her looking at things differently, be it work or a status update she posted.

Coming back to error messages: in most applications I have noticed that error messages are cryptic, like “An error has occurred, please check again and get back to System Administrator”. The user is then required to log a report with the call center and describe what they were doing. Many a time a user will simply not bother, because it takes time, and they decide to come back and try another time (if it is not urgent, of course). I see two issues here:

1. A user has been asked to do something that the application designer, the Architect, could have taken care of by designing the system right

2. The Application team has lost an opportunity to know where their application failed, because the user chose not to report it. They lost an opportunity to fix something proactively.
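One way a design could address both issues is to let the application record the failure itself and give the user only a short reference. The sketch below is just one possible shape for that idea; the class ErrorReportingSketch, the method report and the message wording are all my own assumptions.

```java
import java.util.UUID;
import java.util.logging.Level;
import java.util.logging.Logger;

class ErrorReportingSketch
{
    private static final Logger LOG = Logger.getLogger(ErrorReportingSketch.class.getName());

    // Logs the failure with a unique reference and returns a message that is safe to show the user.
    static String report(String userAction, Exception failure)
    {
        String reference = UUID.randomUUID().toString();

        // The application team gets the action and the stack trace without the user doing anything.
        LOG.log(Level.SEVERE, "ref=" + reference + " action=" + userAction, failure);

        // The user gets a plain explanation and a reference they can quote if they wish.
        return "We could not complete \"" + userAction + "\". Our team has been notified (reference "
                + reference + ").";
    }

    public static void main(String[] args)
    {
        System.out.println(report("save profile", new IllegalStateException("database unavailable")));
    }
}
```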

 

Performance Unpredictability

I was listening to a talk by Joshua Bloch on Performance Anxiety, and my key observations from the discussion were:

  1. It has become impossible to estimate performance
  2. Performance is becoming more abstract
  3. Measure and use statistics with measures

Joshua talks about various aspects of current systems and how they have led to a situation where we cannot predict or estimate the performance of a section of code. This is simply because we have so many layers: code, libraries, patterns, app servers, the JVM and what not. The same code can perform differently on a different machine just because of how the JVM there interprets the code.

He also speaks very clearly about things we currently take as known facts and how they may simply be myths, for example around profilers and app servers. He mentions studies and papers that address this area of uncertainty.

He summarizes:

Our results are disturbing because they indicate that profiler incorrectness is pervasive—occurring in most of our seven benchmarks and in two production JVM—-and significant—all four of the state-of-the-art profilers produce incorrect profiles. Incorrect profiles can easily cause a performance analyst to spend time optimizing cold methods that will have minimal effect on performance. We show that a proof-of-concept profiler that does not use yield points for sampling does not suffer from the above problems.
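The third observation above, measuring and then using statistics on the measurements, can be sketched as follows. This is a naive illustration with hypothetical names (TimingStatsSketch, measureMillis, mean, stdDev); it ignores JIT warm-up and other effects that a dedicated benchmarking harness would deal with.

```java
class TimingStatsSketch
{
    // Runs the task several times and records each duration in milliseconds.
    static double[] measureMillis(Runnable task, int runs)
    {
        double[] samples = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            long start = System.nanoTime();
            task.run();
            samples[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return samples;
    }

    static double mean(double[] samples)
    {
        double sum = 0;
        for (double s : samples)
        {
            sum += s;
        }
        return sum / samples.length;
    }

    // Sample standard deviation (divides by n - 1); needs at least two samples.
    static double stdDev(double[] samples)
    {
        double m = mean(samples);
        double squares = 0;
        for (double s : samples)
        {
            squares += (s - m) * (s - m);
        }
        return Math.sqrt(squares / (samples.length - 1));
    }

    public static void main(String[] args)
    {
        double[] samples = measureMillis(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++)
            {
                sb.append(i);
            }
        }, 20);
        System.out.printf("mean=%.3f ms, stddev=%.3f ms%n", mean(samples), stdDev(samples));
    }
}
```

A single timing says little; the spread across runs is what shows how unpredictable the same code can be.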

My Perspective

I chose to blog about this because for many years now I have been working on making applications performant in many ways: databases, application layers, web clients and servers. In my experience, I could not agree more with Joshua. Many times I have been baffled when the performance models I used to measure performance in a QA or staging environment turned out differently in production, because of the various constraints I had with the hardware. Many times I had to go back to production and see what was happening in the real world. One of the biggest factors that I simply could not replicate in any of the previous systems was the “user base”. I could never stop measuring the performance of the systems and analyzing the data, only to find myself tuning certain areas of the application.

I strongly recommend that you all go through his talk.

‘The Null’ Nuisance

While working on enhancements to a project already in production, I had a very interesting conversation. Let me give a brief background: the core architecture is all in place and we need to build new functionality. Of course, refactoring is being done along the way. In a specific scenario, I got into a conversation with a fellow architect on the usage of “nulls” and “null checks”. The theme of the conversation was “Should a method return a null or an initialized instance of the class?”. Let me take an example:

There is a service method that connects to a database and loads records for all users in the system. In the DAO we load the recordset from the database and convert it to an ArrayList of DTOs (ValueObjects). Sample code to map a DTO generally looks like this:

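// Assumes recordSet is a java.sql.ResultSet positioned before the first row.
List<User> users = null;
while (recordSet.next())
{
    if (users == null)
    {
        users = new ArrayList<User>();
    }

    User user = new User();
    user.setFirstName(recordSet.getString("firstName"));
    user.setMiddleName(recordSet.getString("middleName"));
    user.setLastName(recordSet.getString("lastName"));
    users.add(user);
}

// users is still null when the query returned no rows
return users;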

I had an objection to this style of coding. The simple reason being that, on the front-end, I had to put in a check for null, which was unnecessary. Hence, the other classes that consumed the results had to write the following code:

List<User> users = loadAll();
if(users != null)
{
    for(User user : users)
    {
        if(user.getMiddleName() != null)
        {
            // show the middle name
        }
        else
        {
            // do not show the middle name
        }
    }
}
else
{
    // do nothing
}

Now, consider a scenario with complex objects having lists all down the hierarchy. It means that before we access any property, we will have to add a null check. Soon, this “do nothing” null check becomes a headache. Someone has coded a null propagation somewhere and we cannot trace it, so the easiest way out seems to be to put in a null check. In my example, I would have my JSP strewn with null checks cluttering my code.

Unfortunately, this will not solve the real problem. A simple solution is to identify the code where a null reference can be introduced and handle it there. The rest will be happy about it.

More importantly, let us pause for a minute and ask ourselves: is there something that the application can do with an object referring to nothing? Let us go back to my example and see how the application is going to use the user list. We need the list of users to display a report for the users listed. If no users are returned, the user should see “No users exist”. The UI is not sure what represents “no users”: a null object, an initialized object with 0 size, or an exception. This means that the developer consuming the method has to write these multiple conditions for a simple check.

We can do one of the following:

1. Throwing a business exception when a business rule is violated can be an effective strategy. However, it largely depends on how you use exceptions in your application. Remember, raising an exception is an expensive operation.

2. Alternatively, you can provide an Empty implementation of the object that can do something useful, like logging an info or an error message to the log system.
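A minimal sketch of the second option, assuming the “empty implementation” is simply an empty list plus a log entry. The class EmptyResultSketch and the methods loadAllNames and render are hypothetical and stand in for a DAO method and a UI respectively.

```java
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;

class EmptyResultSketch
{
    private static final Logger LOG = Logger.getLogger(EmptyResultSketch.class.getName());

    // Hypothetical DAO-style method: 'rows' stands in for whatever the query returned.
    static List<String> loadAllNames(List<String> rows)
    {
        if (rows == null || rows.isEmpty())
        {
            LOG.info("No users found; returning an empty list instead of null");
            return Collections.emptyList();
        }
        return rows;
    }

    // The caller has a single condition to check and never needs a null check.
    static String render(List<String> names)
    {
        return names.isEmpty() ? "No users exist" : String.join(", ", names);
    }

    public static void main(String[] args)
    {
        System.out.println(render(loadAllNames(null)));
    }
}
```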

I am not a huge fan of throwing an Exception, partly because it is expensive, so I am exploring the second option. This changes my code to:

// Start with an initialized, empty list so callers never see null.
List<User> users = new ArrayList<User>();
while (recordSet.next())   // assumes recordSet is a java.sql.ResultSet
{
    User user = new User();
    user.setFirstName(recordSet.getString("firstName"));

    String middleName = recordSet.getString("middleName");
    // You can also use StringUtils from apache.lang
    user.setMiddleName(middleName == null ? "" : middleName);

    user.setLastName(recordSet.getString("lastName"));
    users.add(user);
}

// An empty list, not null, now represents "no users".
return users;

This will change the UI code to:

List<User> users = loadAll();
// code to show the middle name – if it does not exist, it will show up as blank.


The most evident benefit is: no more if statements for null checks on the UI. The check is pushed down in the call hierarchy, so multiple callers of the same method do not have to worry about nulls.

The most important question is “Is this approach safe?” Nothing ever is, and nothing prevents someone from coding incorrectly. Of course, we cannot rely on external libraries never to return null references, but when you write your own code, following this approach can lead to a less cluttered application and better control over the source code.

Remember: this approach is not always necessary; just ensure that a null reference cannot be catastrophic.

Need for 3-tier Architecture

Last week, I was working on defining an architecture for an existing application. When I walked into the room with the proposal, the Senior Delivery Manager asked me “Why do we need an architecture? Why can we not use what we already have?” His concern was logical: this shift was going to push him behind schedule. While I spent the next 20 minutes explaining to him the importance of and need for a 3-tier architecture, it dawned upon me that I had done this several times. If only I had this documented on paper, it would save me a lot of time.

What is a Layer?

A layer refers to a logical separation of code. In the J2EE world, this generally means an independent Java project that holds the logic. A layer is responsible for speaking to other layers in the application, providing or extracting information. An example is the Presentation Layer, which is responsible for showing data to the user but has to extract that information from various other layers.

Two-Tier Architecture

A two-tier architecture is one where all the code for extracting data from the database and the presentation logic (i.e. showing data to the user) resides in the web layer itself. A definite advantage of this approach is that it is handy and allows rapid development. However, it has some obvious disadvantages:

  • Putting all the code in the web layer makes it difficult to maintain. 80% of the application life cycle is spent on maintenance and support. Having unmanageable code only makes matters worse
  • Code reuse is not possible. Many a time, with changing needs, organizations decide to change the application front-end. At times, they decide to add other add-ons. With the code sitting in the web layer, this becomes impossible. Hence, the application cannot be scaled
  • Relying on data source (JDBC) controls makes things more complex.

We solve this problem by introducing a 3-tier architecture, which separates the code into logical groupings, i.e. Data Access, Business Logic and Presentation Logic. This could be a slow process to start with, but it has many advantages in the long run.
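As a rough sketch of that separation, the three groupings could look like the classes below. All the names (UserDao, InMemoryUserDao, UserService, UserReportView) are hypothetical, and the in-memory DAO only stands in for a real JDBC-backed one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Data Access tier: the only place that knows where users are stored.
interface UserDao
{
    List<String> findAllNames();
}

// Hypothetical in-memory stand-in for a JDBC-backed DAO.
class InMemoryUserDao implements UserDao
{
    private final List<String> names;

    InMemoryUserDao(List<String> names)
    {
        this.names = new ArrayList<>(names);
    }

    @Override
    public List<String> findAllNames()
    {
        return new ArrayList<>(names); // hand out a copy, keep our own list private
    }
}

// Business Logic tier: applies the rules and knows nothing about JDBC or HTML.
class UserService
{
    private final UserDao dao;

    UserService(UserDao dao)
    {
        this.dao = dao;
    }

    List<String> namesInReportOrder()
    {
        List<String> sorted = dao.findAllNames();
        Collections.sort(sorted);
        return sorted;
    }
}

// Presentation tier: only formats what the service returns.
class UserReportView
{
    static String render(UserService service)
    {
        List<String> names = service.namesInReportOrder();
        return names.isEmpty() ? "No users exist" : String.join(", ", names);
    }
}
```

Swapping the front-end means replacing only UserReportView, and swapping the database means providing another UserDao; the service in the middle stays untouched.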

Hope this helps.  Soon, I will post about the 3-tier architecture and talk about its benefits.

Cairngorm – A glamorous pitfall, Is it? – Part I

When I started using Flex a few months back, the very first thing that struck me was the fan following for Cairngorm. It was amazing how popular a framework could be, and it claimed to solve all the problems. It seemed a little too good to be true, but was worth trying. Over the last two months, I have realized that Cairngorm has quite a few shortcomings, which make it an overly glorified framework that cannot scale for any Enterprise Application without changes. “Without any changes” is the key here. With this article, I start a series that is my attempt to explain why I feel some things in Cairngorm need to change.

“Cairngorm is an implementation of design patterns that the consultants at Adobe Consulting have successfully taken from enterprise software development (with technologies including J2EE and .NET) and applied to rich Internet application development using Adobe Flex.” – Quoted by Adobe Labs

The benefits of the Cairngorm architecture are realized when developing complex RIA applications with multiple use-cases and views, with a team of developers, and with a multi-disciplinary development team that includes designers as well as creative and technical developers. – Quoted by Adobe

Steven Webster follows up with a series of articles explaining in detail how to use Cairngorm. The goals that Cairngorm helps us achieve are:

  1. Keeping State on the Client
  2. Architecting the View
  3. Feature-driven Development
  4. Server-side Integration

In this article, I will talk about the “Keeping State on the Client” aspect of Cairngorm, and I will refer to the “Cairngorm Store Web 2.1” for any examples/references.

Steven introduces us to the Value Object pattern and the Model Locator pattern. As he quotes:

Many developers are familiar with the concept of MVC or Model/View/Control in application development, and wonder where state fits into this discussion. Quite simply, the state is the model.

In this case the ValueObject Pattern is one where you can use objects on the client (Flex) and pass the same information to the back-end. If you are using LCDS/BlazeDS, you get the benefit of using the same object in both tiers. This is a great value add, as developers do not have to carry the overhead of heavy object conversions. Also, there are tools available, like XDoclet, that allow you to generate ActionScript objects from your Java POJOs and hence achieve speedy results.

The Model Locator pattern is the one that I find to be the glorified pitfall. As Steven quotes again:

The Model Locator pattern is unique because it is not a pattern we borrowed from the Core J2EE Pattern catalog. Instead, we created this pattern particularly for Flex application development. Our motivation was to have a single place where the application state is held in a Flex application and where view components are able to “locate” the client-side model that they wish to render.

Great thinking, but the implementation is not what it should have been. A few of my observations:

1. Adobe has provided a marker interface, “ModelLocator”, and I do not understand why it is needed. The only driver I can think of is following the OO practice of developing against interfaces and not implementations. Even if you would like to have your state/model divided in many ways, this does not make sense. A marker interface is useful if you want to detect things via reflection, but there is no such usage here, so it is redundant. The model/state could very well have lived without the marker interface

2. The use of singletons is where I believe things break. Steven quotes:

Having all the attributes on the Model Locator pattern as static attributes ensures that the Model Locator pattern is a simple implementation of a singleton. You ensure, for instance, that one and only one instance of a ShoppingCart exists per user.

Again, the intent is good, but the example they took up is a very bad one. Let us take the example of the Cairngorm Store, where they have used bindings to bind the single instance of the data to the view. This simply couples the View with the State/Model, and re-usability goes out of the window. <Kaboom/>. Code:

<details:ProductDetails
        id="productDetailsComp"
        width="100%" height="325"
        currencyFormatter="{ ModelLocator.currencyFormatter }"
        selectedItem="{ ModelLocator.selectedItem }" 
        addProduct="addProductToShoppingCart( event )" />

Consider a scenario where I want to use the “ProductDetails” component elsewhere in my application. As this component is now bound to the model/state, if I change the model, it affects the data everywhere else. Try adding the capability to compare products, where you would like to re-use the same component to show the details in a comparative view. The component is useless there: you can never get two different data sets bound to the same view.
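The same coupling problem can be shown outside Flex. The following Java analogue is only an illustration and uses no Cairngorm API; Product, GlobalModel, CoupledDetailsView and InjectedDetailsView are hypothetical names. A view that reads a static model can only ever show one thing, while a view that is handed its data can be reused side by side.

```java
// Hypothetical stand-in for a product value object.
class Product
{
    final String name;

    Product(String name)
    {
        this.name = name;
    }
}

// Stand-in for a locator that keeps state in static fields.
class GlobalModel
{
    static Product selectedItem;
}

// Coupled: every instance renders whatever the single global slot holds.
class CoupledDetailsView
{
    String render()
    {
        return "Details: " + GlobalModel.selectedItem.name;
    }
}

// Reusable: the owner passes the product in, so two views can show different data.
class InjectedDetailsView
{
    private final Product item;

    InjectedDetailsView(Product item)
    {
        this.item = item;
    }

    String render()
    {
        return "Details: " + item.name;
    }
}

class ModelCouplingSketch
{
    public static void main(String[] args)
    {
        GlobalModel.selectedItem = new Product("Camera");
        // Both coupled views are forced to show the same product.
        System.out.println(new CoupledDetailsView().render() + " | " + new CoupledDetailsView().render());

        // A comparison screen needs two different products side by side.
        System.out.println(new InjectedDetailsView(new Product("Camera")).render()
                + " | " + new InjectedDetailsView(new Product("Phone")).render());
    }
}
```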

But do not just write off this pattern; it is still very useful. All you folks who have developed for Web 1.0 know that a user’s session state was something we had to manage in the HttpSession, and that was known as the state. You can use this pattern very well to do similar things, like:

  1. Manage the user’s session state and use it to check for authorizations;
  2. Show the user’s navigation in the application. Almost all applications have breadcrumbs, which should have just 1 state/model;
  3. Track the current view/navigation state of the user, i.e. ViewStack indexes/TabView indexes

I am sure that there are more. I find this pattern very useful in many ways; we just have to be careful, as overusing it would lead to some serious trouble.

Singleton – Boon or a Sin

Over the last few weeks, I have faced quite a few issues with Cairngorm’s Singleton pattern, and I decided to put together a post that should help in making some decisions. I found a few articles about Singletons, but not even one that talks about how we should use them.

Ask any programmer, and they will instantly discourage you from using global data (objects). But many a time you find that you need some objects to be available globally with a single point of access, for example a user’s state or permissions, which have to be retrievable across the user session. That is when you use the Singleton design pattern.

When is a class really a Singleton?

Ask yourself:

  • Will every application use this class exactly the same way?
  • Will every application ever need only one instance of this class?
  • Should the clients be unaware of the application that this class is part of?

If you can answer Yes to all of the above, then you have found a singleton. Remember your Logger / Logging classes? That is what you need as a singleton.

Now, let us switch focus to Flex and how we can make use of a Singleton. As we all know, Flex is all about Events, and that is how any two components interact with each other. Singletons become very handy here, as you can dispatch an event on a Singleton Event Dispatcher, and the component that needs to listen to it can simply listen on the Singleton. Solves many of our problems. Huh!! Does it? I am putting down some code for the Buttons (which I am treating as components). Try executing this code, and you will find that when you click on any one of the buttons, both buttons handle the event and the alert is shown twice.
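The effect described above can be reproduced with a rough Java analogue. This is not the Flex code for the buttons; SingletonDispatcher and ButtonSketch are hypothetical classes written only to show why a shared dispatcher makes every listener react to every click.

```java
import java.util.ArrayList;
import java.util.List;

// A single, globally shared dispatcher.
class SingletonDispatcher
{
    private static final SingletonDispatcher INSTANCE = new SingletonDispatcher();
    private final List<Runnable> listeners = new ArrayList<>();

    private SingletonDispatcher()
    {
    }

    static SingletonDispatcher getInstance()
    {
        return INSTANCE;
    }

    void addListener(Runnable listener)
    {
        listeners.add(listener);
    }

    // Every registered listener is notified, whoever triggered the event.
    void dispatch()
    {
        for (Runnable listener : listeners)
        {
            listener.run();
        }
    }

    void clear()
    {
        listeners.clear();
    }
}

// Each button both listens on and dispatches through the singleton.
class ButtonSketch
{
    int alerts; // how many times this button showed its alert

    ButtonSketch()
    {
        SingletonDispatcher.getInstance().addListener(() -> alerts++);
    }

    void click()
    {
        SingletonDispatcher.getInstance().dispatch();
    }
}

class SingletonEventSketch
{
    public static void main(String[] args)
    {
        ButtonSketch first = new ButtonSketch();
        ButtonSketch second = new ButtonSketch();
        first.click(); // only the first button is clicked
        System.out.println("first=" + first.alerts + ", second=" + second.alerts);
    }
}
```

Because the dispatcher has no idea which button a listener belongs to, a click on one button is indistinguishable from a click on the other.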

In my next article, I will explore the issues of Singleton with Cairngorm.

Code Coverage for Flex

It has arrived!!

Flex Cover 0.10 has just been launched and has me excited. Over the next few days I will spend some time seeing how it works with Builder and Ant. Will keep you all posted.