Month: August 2011

Publishing Project using Maven

Most of us use Maven to build our projects and package a JAR, WAR or EAR for deployment on a server. However, during the development life-cycle we often do not realize that Maven can serve a much higher purpose: “Create a project site to show all of the project’s information”. In the many years that I have been using Maven, this is something I never thought of using to my advantage; I guess I never needed to do something like that.

Last week, when I launched my project as an Open Source library, I realized that I needed an easy and quick way to make many artifacts available for my project. I have used Google for code hosting and issue tracking; however, there are other documents like JavaDocs, the list of dependencies and the XRef source that provide a lot of additional information, which would be useful to whoever decides to use the library.

Necessity is the mother of invention

In my case, it was not an invention, but it sure was a learning experience as I started exploring maven-site-plugin. This plugin generates various reports for a Maven project, including those that have been configured in the pom.xml file. However, it was not a straightforward implementation because I had some additional requirements, like placing Google Analytics code and custom project information on my site. In this post, I will attempt to explain some of the things I did to generate my site.

To understand what I did, you should have a quick look at the site that I have deployed. I will now take one section at a time and explain what changes I made to my POM for that section to work. I have also attached the POM file to this post for your quick reference.

POM - Main

The image on the right explains the main sections of the POM.XML. This section configures the Project Information and Project Team pages. I have added only the few details that I needed, but you can add a lot more information as you need.

The next section holds the License information. I am using the Apache License 2.0 and hence I am providing the URL for the license from Apache. This section generates the Project License page for the site. The next two sections define the Project Repository and Issue Management (Defect Management) settings.

Everything else is standard POM stuff, i.e. dependencies and build plugins to run test cases etc. I am not going to take a deep dive into those because there are so many tutorials on the same. However, what is important are the sections on JavaDocs and Source Code Reference. To generate these two sections as a part of the Maven site, you need to configure the reporting plugins as follows:


 
   
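<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <configuration>
        <links>
          <link>http://download.oracle.com/javase/6/docs/api/</link>
        </links>
      </configuration>
      <version>2.8</version>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jxr-plugin</artifactId>
      <version>2.3</version>
    </plugin>
  </plugins>
</reporting>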
   
 

And lastly, to set up the left-navigation menu, you need to add a site.xml file into the “src/site” folder of your project and provide the menu details and their links to the pages.

In this file you can also add the code for Google Analytics. You will notice in the image below that I have added the Google Analytics code to the header section, which then copies the code over to each of the static HTML pages and allows me to track them.

Site.XML

I hope you find this useful and are able to use it to publish your own site too.

iFramework: Updates

I just released a few more framework/utility classes for this Framework. The three new features that have been added are:

 

Exception Handling

This simple yet powerful implementation provides some of the most desirable features in any Exception Handling framework:

  • Unchecked Exceptions for Business and System errors
  • Message Externalization
  • Message Localization
  • Configurable Message Data Sources

 

Message Readers

This package provides a factory to fetch various types of Message Readers. The current implementation only supports Resource Bundle based message stores, but it can be easily extended to other data stores.

 

Concurrent Processing

This package provides base classes that allow you to run your functions in a producer and consumer fashion. By using this framework, you only have to worry about writing your business logic for a Producer and a Consumer, and not about multi-threading and how these two entities interact. This framework currently has only one implementation of a Broker, i.e. AsyncQueueBroker, but more will be added in the coming months.

A new OSS project: iFramework

If you have been following me for a long time, you would know that I keep talking about releasing some of my work to the community; today I am finally proud to announce that I am releasing my first project: http://scratchpad101.com/2011/08/24/i-framework/. Today this project is just a set of two or three utility classes, but over the next few months I plan to add a lot more to it.

Current Features:

  • Improved Loggers
  • Database Setup Utilities

So, please go visit the project and have a look at what I have. Also, please let me know if you are looking for something special.

Concurrency Pattern: Producer and Consumer

In my career spanning 15 years, the problem of Producer and Consumer is one that I have come across only a few times. In most programming cases, we perform functions in a synchronous fashion, where the JVM or the web container handles the complexities of multi-threading on its own. However, there are certain kinds of use cases where we need to handle it ourselves. Last week, I came across one such use case that took me back 3 years, to when I last did it. However, the way it was done last time was very different.

When I first heard the problem statement, I knew instantly what was needed. However, my approach to doing it this time was going to be different from last time. It had simply to do with how I am viewing technology in my life today. I will not go into any non-technical side and will jump straight into the problem and its solution. I started to look at what existed in the market and did come across a couple of posts that helped me in channelizing my thoughts in the right way.

Problem Statement

We need a solution for a batch migration. We are migrating data from System 1 to System 2 and in the process we need to do three tasks:

  • Load data from Database based on groups
  • Process the data
  • Update the records loaded in step#1 with modifications

We have to handle 100s of groups and each group will have around 40K records. You can imagine the amount of time it would take if we were to perform this exercise in a synchronous fashion. The image here explains this problem in an effective way.

Producer Consumer: The Problem

Producer and Consumer Pattern

Let us take a look at the Producer Consumer pattern to begin with. If you refer to the problem statement above and look at the image, we see that there are many entities that are ready with their part of the data. However, there are not enough workers to process all of it. Hence, as the producers continue to line up, the queue just continues to grow, and the systems start to hog threads and take a lot of time.

Intermediate Solution

Producer Consumer: The Intermediate approach

We do have an intermediate solution. Refer to the image and you will immediately notice that the producers are piling up their work in a filing cabinet and the worker continues to pick it up as they get done with the previous task. However, this approach does have some glaring shortcomings:

  1. There is still one worker who has to do all the work. The external systems may be happy, but the task will still continue to exist until the worker has completed all of the tasks
  2. The producers will pile up their data in a queue and it needs resources to hold the same. Just as in this example the cabinet can fill up, the same can happen with the JVM resources too. We need to be careful how much data we are going to place in memory and in some cases it may not be much.

The Solution

Producer Consumer: The Solution

The solution is what we see every day in many places, like the cinema hall queue, petrol pumps etc. Many people come in to book a ticket, and the more people come in, the more counters are opened to issue tickets. Essentially, refer to the image here and you will notice that the Producers keep adding their jobs to the cabinet and we have more workers to handle the work load.

Java provides the java.util.concurrent package to solve this issue. Till now, I have always worked on threading at a much lower level, and this was the first time I was going to work with this package. As I started to explore the web and read what fellow bloggers have to say, I came across one very good article. It helped in understanding the use of BlockingQueue in a very effective manner. However, the solution provided by Dhruba would not have helped me in achieving the high throughput that is needed. So, I started to explore the use of ArrayBlockingQueue for the same.

The Controller

This is the first class, where the contract between the producers and consumers is managed. The controller sets up 1 thread for the Producer and 2 threads for the Consumers. Based on our needs we can create as many threads as required; we can even read the number from a properties file or do some dynamic magic. For now, we will keep this simple.

package com.kapil.techieforever.producerconsumer;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TestProducerConsumer
{

    public static void main(String args[])
    {
        try
        {
            Broker broker = new Broker();

            // one thread for the producer and two for the consumers
            ExecutorService threadPool = Executors.newFixedThreadPool(3);

            threadPool.execute(new Consumer("1", broker));
            threadPool.execute(new Consumer("2", broker));
            Future<?> producerStatus = threadPool.submit(new Producer(broker));

            // this will wait for the producer to finish its execution.
            producerStatus.get();

            // no new tasks are accepted; the running consumers finish their work
            threadPool.shutdown();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}

I am using ExecutorService to create a thread pool and manage it. This is more effective than using the basic Thread implementation, as it handles the exiting and restarting of threads as needed. You will also notice that I am using the Future class to get the status of the producer thread. Calling get() on it blocks my program from further execution until the producer has finished, which is a nice way of replacing the “.join” method on the threads. Note: I am not using Future very effectively in this example; so you may have to try a few things as you see fit.
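
To make the comparison with “.join” concrete, here is a minimal, stand-alone sketch of waiting on a Future. The class and method names (FutureWaitDemo, runAndWait) are made up for this illustration and are not part of the project code; the task simply counts, where the real program would run the Producer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class FutureWaitDemo
{
    // Submits a task that counts up to 'steps' and waits for it through Future.get().
    public static int runAndWait(final int steps) throws Exception
    {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        final AtomicInteger counter = new AtomicInteger();
        try
        {
            Future<?> status = pool.submit(() -> {
                for (int i = 0; i < steps; i++)
                {
                    counter.incrementAndGet();
                }
            });

            // Blocks the calling thread until the task completes, much like Thread.join().
            // For a Runnable, get() returns null; a failure inside the task would
            // surface here as an ExecutionException.
            status.get();
            return counter.get();
        }
        finally
        {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println("Task finished after " + runAndWait(5) + " steps");
    }
}
```

One difference from join() worth noting: get() also reports an exception thrown by the task, which a plain join() would silently ignore.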

Also, you should note the Broker class, which is being used as the filing cabinet between the producers and consumers. We will see its implementation in just a little while.

The Producer

This class is responsible for producing the data that needs to be worked upon.

package com.kapil.techieforever.producerconsumer;

public class Producer implements Runnable
{
    private Broker broker;

    public Producer(Broker broker)
    {
        this.broker = broker;
    }

    @Override
    public void run()
    {
        try
        {
            // produce the integers 1 to 5
            for (Integer i = 1; i < 5 + 1; ++i)
            {
                System.out.println("Producer produced: " + i);
                Thread.sleep(100);
                broker.put(i);
            }

            // signal the consumers that no more data is coming
            this.broker.continueProducing = Boolean.FALSE;
            System.out.println("Producer finished its job; terminating.");
        }
        catch (InterruptedException ex)
        {
            ex.printStackTrace();
        }

    }
}

This class does the simplest thing it can: adding an integer to the broker. Some key areas to note are:
1. There is a property on Broker which is updated by the producer at the end, when it is done producing. It plays the role of what is also known as the “final” or “poison” entry, and the consumers use it to know that no more data is coming
2. I have used Thread.sleep to simulate that some producers may take more time to produce the data. You can tweak this value and watch how the consumers react
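
For comparison, the classic form of the “poison” entry puts a special marker into the queue itself instead of keeping a flag on the Broker. The following is only a sketch of that alternative, not the approach used in this post; the class name PoisonPillDemo and the POISON value are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PoisonPillDemo
{
    // Hypothetical marker value; it must never occur as real data.
    private static final Integer POISON = Integer.MIN_VALUE;

    // Produces 1..items, then the marker; returns what the consumer processed.
    public static List<Integer> run(int items) throws InterruptedException
    {
        final BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        final List<Integer> consumed = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try
            {
                while (true)
                {
                    Integer data = queue.take(); // blocks until something arrives
                    if (data.equals(POISON))
                    {
                        break; // the marker means no more data will follow
                    }
                    consumed.add(data);
                }
            }
            catch (InterruptedException ex)
            {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        for (int i = 1; i <= items; i++)
        {
            queue.put(i);
        }
        queue.put(POISON);

        consumer.join(); // after join(), the consumer's writes are visible here
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException
    {
        System.out.println("Consumed: " + run(3));
    }
}
```

With several consumers, one marker per consumer would be needed, which is one reason a shared flag combined with a timed poll can be simpler to manage.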

The Consumer

This class is responsible for reading the data from the broker and doing its job

package com.kapil.techieforever.producerconsumer;

public class Consumer implements Runnable
{

    private String name;
    private Broker broker;

    public Consumer(String name, Broker broker)
    {
        this.name = name;
        this.broker = broker;
    }

    @Override
    public void run()
    {
        try
        {
            Integer data = broker.get();

            // keep going while the producer is active or data is still arriving
            while (broker.continueProducing || data != null)
            {
                // a timed-out poll returns null; skip it and poll again
                if (data != null)
                {
                    Thread.sleep(1000);
                    System.out.println("Consumer " + this.name + " processed data from broker: " + data);
                }

                data = broker.get();
            }

            System.out.println("Consumer " + this.name + " finished its job; terminating.");
        }
        catch (InterruptedException ex)
        {
            ex.printStackTrace();
        }
    }

}

This is again a simple class that reads the Integer and prints it on the console. However, key points to note are:
1. The loop that processes data keeps running as long as either of two conditions holds: the producer is still producing, or there is some data with the broker
2. Again, Thread.sleep is used to create different scenarios

The Broker

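package com.kapil.techieforever.producerconsumer;

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class Broker
{
    // bounded FIFO queue shared by the producers and consumers
    public ArrayBlockingQueue<Integer> queue = new ArrayBlockingQueue<Integer>(100);
    // volatile so that the producer's update is visible to the consumer threads
    public volatile Boolean continueProducing = Boolean.TRUE;

    public void put(Integer data) throws InterruptedException
    {
        // blocks while the queue is full
        this.queue.put(data);
    }

    public Integer get() throws InterruptedException
    {
        // waits up to one second; returns null if nothing arrived
        return this.queue.poll(1, TimeUnit.SECONDS);
    }
}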

The very first thing to note is that we are using ArrayBlockingQueue as the data holder. I am not going to explain what this does, but I urge you to read about it in the JavaDocs here. However, I will explain that the producers place the data in the queue and the consumers fetch from the queue in FIFO order. If the producers are slow, the consumers will wait for data to come in, and if the array is full, the producers will wait for space to free up.
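
The FIFO and “full queue” behaviour can be seen in isolation with a tiny queue. This is just a sketch with made-up names (BoundedQueueDemo and its two methods); it uses a timed offer so that the demo gives up instead of blocking the way put() would.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo
{
    // Tries to add a third element to a queue that only holds two.
    public static boolean offerToFullQueue() throws InterruptedException
    {
        ArrayBlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);
        queue.put(1);
        queue.put(2);

        // The queue is full and nobody is consuming: the timed offer waits
        // for 100 ms and then returns false. A put() here would block
        // until a consumer frees a slot.
        return queue.offer(3, 100, TimeUnit.MILLISECONDS);
    }

    // Shows that elements leave the queue in the order they were added.
    public static int firstOut() throws InterruptedException
    {
        ArrayBlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);
        queue.put(10);
        queue.put(20);
        return queue.take();
    }

    public static void main(String[] args) throws InterruptedException
    {
        System.out.println("Third element accepted: " + offerToFullQueue());
        System.out.println("First element out: " + firstOut());
    }
}
```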

Also, note that I am using the ‘poll’ function of the queue instead of the blocking ‘take’. This ensures that the consumers will not keep waiting forever; the wait times out after one second. This helps us in inter-communication and lets us end the consumers when all the data is processed. (Note: try replacing poll with take and you will see some interesting outputs).
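
A small sketch of the timed poll on its own, with hypothetical names (PollTimeoutDemo, pollEmpty, pollAfterPut): on an empty queue it returns null once the timeout expires, which is what lets a consumer re-check whether it should stop.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class PollTimeoutDemo
{
    // Polls an empty queue; returns null once the timeout has elapsed.
    public static Integer pollEmpty(long timeoutMillis) throws InterruptedException
    {
        ArrayBlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1);
        return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Polls a queue that already holds a value; returns it without waiting.
    public static Integer pollAfterPut(int value) throws InterruptedException
    {
        ArrayBlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1);
        queue.put(value);
        return queue.poll(1, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException
    {
        System.out.println("Empty queue gave: " + pollEmpty(50));
        System.out.println("Filled queue gave: " + pollAfterPut(7));
    }
}
```

A blocking take() on the empty queue would never return in this demo, because nothing else is adding data.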

Code

I have the code sitting on Google project hosting. Feel free to go across and download it from there. It is essentially an Eclipse (Spring STS) project. Depending on when you download it, you may also get additional packages and classes. Feel free to look into those too and share your comments.
– You can browse the source code on the SVN browser; or
– You can download it from the project itself

A side step solution

Initially, I posted this solution in the middle, but then I realized that this was not the way to do things, so I took it out of the main content and placed it at the end. Another variant of the final solution would be that the workers/consumers do not take one job at a time to process, but pick up multiple jobs together and finish them before going to the next set. This approach can generate similar results, but where jobs do not take the same time to finish, some workers will end sooner than others, creating a bottleneck. And if the jobs are allocated beforehand, which means that all the consumers have all their jobs before they start processing (not the producer-consumer pattern), this problem can add up even more and lead to more delays in the processing logic.
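
One way the “pick up multiple jobs together” variant could be sketched is with BlockingQueue.drainTo, which moves up to a given number of queued elements into a collection in one call. The class and method names below (BatchConsumerDemo, nextBatch) are invented for this illustration and are not part of the downloadable project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BatchConsumerDemo
{
    // Takes up to maxBatch jobs that are currently queued, without blocking.
    // An empty list means there was nothing waiting at that moment.
    public static List<Integer> nextBatch(BlockingQueue<Integer> queue, int maxBatch)
    {
        List<Integer> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        return batch;
    }

    public static void main(String[] args) throws InterruptedException
    {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        for (int i = 1; i <= 5; i++)
        {
            queue.put(i);
        }

        // With five jobs queued and a batch size of three, the first call
        // takes three jobs and the second takes the remaining two.
        System.out.println("First batch: " + nextBatch(queue, 3));
        System.out.println("Second batch: " + nextBatch(queue, 3));
    }
}
```

Because drainTo does not wait, a real consumer would probably combine it with a timed poll for the first element of each batch.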

TestNG or JUnit

For many years now, I have always found myself going back to TestNG whenever it comes to unit testing Java code. Every time I picked up TestNG, people asked me why I go over to TestNG, especially when JUnit is provided by default in development environments like Eclipse or Maven. Continuing the same battle, yesterday I started to look into Spring’s testing support. It is also built on top of JUnit. However, within a few minutes of using it, I was searching for a feature that I have always found missing in JUnit: TestNG provides Parameterized Testing using DataProviders. Given that I was once again asking myself a familiar question, TestNG or JUnit, I decided to document this so that next time I am sure which one and why.

Essentially the same

If you are just going to do some basic unit testing, both the frameworks are basically the same. Both allow you to test the code in a quick and effective manner. They have had tool support in Eclipse and other IDEs. They have also had support in build frameworks like Ant and Maven. For starters, JUnit has always been the choice because it was the first framework for unit testing and has always been available. Many people I talk to have not heard about TestNG until we talk about it.

Flexibility

Let us look at a very simple test case for each of the two.

package com.kapil.itrader;

import static org.junit.Assert.assertEquals;

import org.junit.Assert;
import org.junit.BeforeClass;
import org.junit.Test;

public class FibonacciTest
{
    private Integer input;
    private Integer expected;

    @BeforeClass
    public static void beforeClass()
    {
        // do some initialization
    }

    @Test
    public void testFibonacci()
    {
        System.out.println("Input: " + input + ". Expected: " + expected);
        // class-qualified call, as it had to be written before static imports
        Assert.assertEquals(expected, Fibonacci.compute(input));
        // the same assertion through the static import
        assertEquals(expected, Fibonacci.compute(input));
    }
}

Well, this example showcases that I am using version 4.x+ and am making use of annotations. Prior to release 4.0, JUnit did not support annotations, and that was a major advantage that TestNG had over its competitor; but JUnit quickly adapted. You can notice that JUnit also supports static imports, and we can do away with the more cumbersome code of previous versions.

package com.kapil.framework.core;

import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.testng.Assert;
import org.testng.annotations.BeforeSuite;
import org.testng.annotations.Test;

public class BaseTestCase
{
    protected static final ClassPathXmlApplicationContext context;

    static
    {
        context = new ClassPathXmlApplicationContext("rootTestContext.xml");
        context.registerShutdownHook();
    }

    // TestNG does not require configuration methods to be public or static
    @BeforeSuite
    private void beforeSetup()
    {
       // Do initialization
    }

    @Test
    public void testTrue()
    {
        Assert.assertTrue(true);
    }
}

At first look, the two pieces of code appear pretty much the same. However, those who have done enough unit testing will agree with me that TestNG allows for more flexibility. JUnit requires me to declare my initialization method as static; consequently, anything that I write in that method has to be static too. JUnit also requires my initialization method to be public, but TestNG does not. I can use best practices from OOP in my testing classes as well. TestNG also allows me to declare Test Suites, Groups and Methods, and to use annotations like @BeforeSuite, @BeforeMethod and @BeforeGroups in addition to @BeforeClass. This is very helpful when it comes to writing any level of integration testing, or unit test cases that need to access common data sets.

Test Isolations and Dependency Testing

JUnit is very effective when it comes to testing in isolation. It essentially means that you cannot control the order of execution of tests. Hence, if you have two tests that you want to run in a specific order because of any kind of dependency, you cannot do that using JUnit. However, TestNG allows you to do this very effectively. In JUnit you can work around this problem, but it is neither neat nor easy.

Parameter based Testing

A very powerful feature that TestNG offers is “Parameterized Testing”. JUnit has added some support for this in 4.5+ versions, but it is very basic and not as effective as TestNG. If you have worked with FIT, you would know what I am talking about. I have modified my previous test case to include parameterized testing.

package com.kapil.itrader;

import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.Collection;

import org.junit.Assert;
import org.junit.BeforeClass;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class FibonacciTest
{
    private Integer input;
    private Integer expected;

    // each inner array is one { input, expected } pair passed to the constructor
    @Parameters
    public static Collection<Object[]> data()
    {
        return Arrays.asList(new Object[][] { { 0, 0 }, { 1, 1 }, { 2, 1 }, { 3, 2 }, { 4, 3 }, { 5, 5 }, { 6, 8 } });
    }

    @BeforeClass
    public static void beforeClass()
    {
        System.out.println("Before");
    }

    public FibonacciTest(Integer input, Integer expected)
    {
        this.input = input;
        this.expected = expected;
    }

    // runs once for every pair returned by data()
    @Test
    public void testFibonacci()
    {
        System.out.println("Input: " + input + ". Expected: " + expected);
        Assert.assertEquals(expected, Fibonacci.compute(input));
        assertEquals(expected, Fibonacci.compute(input));
    }

}

You will notice that I have used the @RunWith annotation to allow my test case to be parameterized. In this case, the inline method data(), which has been annotated with @Parameters, is used to provide data to the class. However, the biggest issue is that the data is passed to the class constructor. This allows me to code only logically bound test cases in this class, and I will end up having multiple test cases for one service, because the various methods in the service will require different data sets. The good thing is that there are various open source frameworks which have extended this approach and added their own “RunWith” implementations to allow integration with external entities like CSV, HTML or Excel files.

TestNG provides this support out of the box: not support for reading from CSV or external files, but for Data Providers.

package com.kapil.itrader.core.managers.admin;

import org.testng.Assert;
import org.testng.annotations.Test;

import com.uhc.simple.common.BaseTestCase;
import com.uhc.simple.core.admin.manager.ILookupManager;
import com.uhc.simple.core.admin.service.ILookupService;
import com.uhc.simple.dataprovider.admin.LookupValueDataProvider;
import com.uhc.simple.dto.admin.LookupValueRequest;
import com.uhc.simple.dto.admin.LookupValueResponse;

/**
 * Test cases to test {@link ILookupService}.
 */
public class LookupServiceTests extends BaseTestCase
{

    @Test(dataProvider = "LookupValueProvider", dataProviderClass = LookupValueDataProvider.class)
    public void testGetAllLookupValues(String row, LookupValueRequest request, LookupValueResponse expectedResponse)
    {
        ILookupManager manager = super.getLookupManager();
        LookupValueResponse actualResponse = manager.getLookupValues(request);
        Assert.assertEquals(actualResponse.getStatus(), expectedResponse.getStatus());
    }
}

The code snippet above showcases that I have used dataProvider as an attribute of the annotation and provided a class which is responsible for creating the data that is supplied to the method at the time of invocation. Using this mechanism, I can write test cases and their data providers in a de-coupled fashion and use them very effectively.

Why I choose TestNG

For me, Parameterized Testing is the biggest reason why I choose TestNG over JUnit. However, everything that I have listed above is the reason why I always want to spend a few minutes setting up TestNG in a new Eclipse setup or Maven project. TestNG is very useful when it comes to running big test suites. For a small project or a training exercise JUnit is fine, because anyone can start with it very quickly; but not for projects where we need 1000s of test cases, most of which have various scenarios to cover.

http://kapilvirenahuja.com/tech/2011/08/07/testng-or-junit/

NTLM Authentication in Java

In one of my previous lives, I used to work at Microsoft, and there this word, NTLM (NT LAN Manager), was something that came up whenever we worked on applications. Microsoft operating systems have always provided an inbuilt security system that can be effectively used to offer authentication (and even authorization) to web applications.

Many years back, I moved over into the Java world, and when I was asked to carry out my very first security implementation, I realized that there was no easy way to do this and that many clients would actually want us to use LDAP for authentication and authorization. For many years, I continued to use that. Then one day, in a discussion with a client, we were asked to offer an SSO implementation, and the client did not have an existing setup like SiteMinder. I started to think about whether we could use NTLM-based authentication. That was possible because the application we were asked to build was to be used within the organization itself, and all the people were required to log in to a domain.

After some research, I was able to find a way we could do this. We did a POC and showed it to the client, and they were happy with it. What we did is explained below:

  • Wrote a Servlet which was the first one to be loaded (like an Authentication Interceptor). This servlet was responsible for reading the header attributes and identifying the user’s Domain and NTID
  • Once we had the details, we sent a request to our database to see if that user was registered under the same Domain/NTID
  • If the user was found in our user database, we allowed him to pass through
  • The roles and authorization for the user were then loaded

Basically, we bypassed the “Login Screen” where the user would enter a password, and used the Domain information instead. Please note that this was possible for us because the client guaranteed that the domain always existed and that all users had unique NTIDs. It was also their responsibility to shield the application from any external entry points where someone might impersonate the Domain/ID.

If you are interested, you can refer to the code below:

<%@ page import="sun.misc.BASE64Encoder" %>
<%
String auth = request.getHeader("Authorization");
String s = "";

//no auth, request NTLM
if (auth == null)
{
        response.setStatus(response.SC_UNAUTHORIZED);
        response.setHeader("WWW-Authenticate", "NTLM");
        return;
}

//check what client sent
if (auth.startsWith("NTLM "))
{
        byte[] msg =
           new sun.misc.BASE64Decoder().decodeBuffer(auth.substring(5));
        int off = 0, length, offset;

        if (msg[8] == 1) {
            off = 18;

            byte z = 0;
            byte[] msg1 =
                {(byte)'N', (byte)'T', (byte)'L', (byte)'M', (byte)'S',(byte)'S', (byte)'P',
                z,(byte)2, z, z, z, z, z, z, z,
                (byte)40, z, z, z, (byte)1, (byte)130, z, z,
                z, (byte)2, (byte)2, (byte)2, z, z, z, z, //
                z, z, z, z, z, z, z, z};
            // send ntlm type2 msg

            response.setStatus(response.SC_UNAUTHORIZED);
            response.setHeader("WWW-Authenticate", "NTLM "
               + new sun.misc.BASE64Encoder().encodeBuffer(msg1).trim());

               return;
        }
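        else if (msg[8] == 3) {
                off = 30;
                // bytes are signed in Java, so mask with 0xFF before combining
                length = (msg[off+17] & 0xFF)*256 + (msg[off+16] & 0xFF);
                offset = (msg[off+19] & 0xFF)*256 + (msg[off+18] & 0xFF);
                s = new String(msg, offset, length);
                // print computer name // out.println(s + " ");
        }
        else
        return;

        // domain: length and offset are stored as little-endian values
        length = (msg[off+1] & 0xFF)*256 + (msg[off] & 0xFF);
        offset = (msg[off+3] & 0xFF)*256 + (msg[off+2] & 0xFF);
        s = new String(msg, offset, length);
        //domain//out.println(s + " ");
        // user name
        length = (msg[off+9] & 0xFF)*256 + (msg[off+8] & 0xFF);
        offset = (msg[off+11] & 0xFF)*256 + (msg[off+10] & 0xFF);

        s = new String(msg, offset, length);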
        out.println("Hello  "); out.println(s + "");
}
%>
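
The byte arithmetic in the page above can be easier to follow as a small helper. The sketch below is plain Java and rests on some assumptions: that the Type 3 message follows the publicly documented NTLM layout (domain descriptor at byte 28, user name descriptor at byte 36, each holding a 2-byte length, a 2-byte allocated size and a 4-byte offset, all little-endian), and that the client sends its strings as UTF-16LE because Unicode was negotiated. The class NtlmFieldReader, its methods and the sample values are made up for this illustration, and the code only reads names; it does not verify any credentials.

```java
import java.nio.charset.StandardCharsets;

public class NtlmFieldReader
{
    // Assumed positions of the descriptors inside a Type 3 message.
    public static final int DOMAIN_BUFFER = 28;
    public static final int USER_BUFFER = 36;
    private static final int HEADER_SIZE = 52;

    // Reads an unsigned 16-bit little-endian value.
    public static int readUInt16LE(byte[] msg, int pos)
    {
        return (msg[pos] & 0xFF) | ((msg[pos + 1] & 0xFF) << 8);
    }

    // Reads the string that the descriptor at bufferPos points to.
    public static String readField(byte[] msg, int bufferPos)
    {
        int length = readUInt16LE(msg, bufferPos);
        // the offset is 4 bytes wide; combine the low and high halves
        int offset = readUInt16LE(msg, bufferPos + 4) | (readUInt16LE(msg, bufferPos + 6) << 16);
        return new String(msg, offset, length, StandardCharsets.UTF_16LE);
    }

    // Builds a minimal fake Type 3 message for the demo (no responses, no flags).
    public static byte[] buildSample(String domain, String user)
    {
        byte[] d = domain.getBytes(StandardCharsets.UTF_16LE);
        byte[] u = user.getBytes(StandardCharsets.UTF_16LE);
        byte[] msg = new byte[HEADER_SIZE + d.length + u.length];
        byte[] signature = "NTLMSSP\0".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(signature, 0, msg, 0, signature.length);
        msg[8] = 3; // message type
        writeBuffer(msg, DOMAIN_BUFFER, d.length, HEADER_SIZE);
        writeBuffer(msg, USER_BUFFER, u.length, HEADER_SIZE + d.length);
        System.arraycopy(d, 0, msg, HEADER_SIZE, d.length);
        System.arraycopy(u, 0, msg, HEADER_SIZE + d.length, u.length);
        return msg;
    }

    // Writes length, allocated size and offset in little-endian order.
    private static void writeBuffer(byte[] msg, int pos, int length, int offset)
    {
        msg[pos] = (byte) length;
        msg[pos + 1] = (byte) (length >> 8);
        msg[pos + 2] = msg[pos];
        msg[pos + 3] = msg[pos + 1];
        msg[pos + 4] = (byte) offset;
        msg[pos + 5] = (byte) (offset >> 8);
        msg[pos + 6] = (byte) (offset >> 16);
        msg[pos + 7] = (byte) (offset >> 24);
    }

    public static void main(String[] args)
    {
        byte[] msg = buildSample("CORP", "kapil");
        System.out.println(readField(msg, DOMAIN_BUFFER) + "\\" + readField(msg, USER_BUFFER));
    }
}
```

In a real request, the byte array would come from Base64-decoding the part of the Authorization header that follows "NTLM ".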