Saturday 27 July 2013

Java Automated Build Process With Hudson, Nexus, Git and Maven

Automated Build Process



In the Java world, if you are a programmer who frequently works on server-side applications, an automated build process is essential. In this post I will explain the workflow of the automated build system we currently follow in our company. Automated builds of Java applications became very popular with the release of Maven; without Maven it is very tedious to set up an automated build. Maven simplified the entire process. Maven has its own life cycle that builds an application phase by phase. It starts with:

Validate - validate that the project is correct and all necessary information is available
Compile - compile the source code of the project
Test - test the compiled source code using a suitable unit testing framework. Test code is kept separate from the source code; it is not packaged or deployed with it.
Package - take the compiled source code and package it into a distributable format, such as a jar/war
Integration-test - process and deploy the package, if necessary, into an environment where integration tests can be run
Verify - run checks to verify that the package is valid
Install - install the package into the local repository
Deploy - done in the release environment; copy the package to a remote repository for sharing with other developers and projects

I am going to explain the workflow in the picture above step by step, then we will look at each technology in detail.



1. The developer has Git installed on his local system. Let's assume he has completed his task; he issues the git commit command to record his current changes in his local repository.

2. He then issues the git push command to send his changes to the remote repository, a centralized server where the versioned source code is maintained. Clients connect to this central version control server to clone, pull, commit and push their changes. To learn more about the Git commands, please visit http://git-scm.com/documentation. A typical commit-and-push sequence is shown below.
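For example (the commit message and branch name here are only illustrative):

   git add .
   git commit -m "Implement feature X"
   git push origin master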

3. Hudson is the Continuous Integration (CI) server. CI's main focus is to notify the users who break the build with erroneous code changes. Developers sometimes commit and push their changes without proper testing, so the CI server connects to our central code repository, periodically picks up the latest changes and verifies them by running the tests against them. It can be configured to notify all developers in case the build breaks. As shown in the diagram, it connects to the central repository to fetch the latest changes.

4. It picks up the latest changes and validates, compiles, builds, packages and installs into the local repository, or notifies the developers in case of any failure.

5. If the build and release phases are configured in the pom.xml file, it performs a release and deploys the new version to the Nexus repository management server, as sketched below.
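A minimal sketch of the pom.xml section that tells Maven where releases should be deployed; the id and URL are placeholders for your own Nexus instance:

   <distributionManagement>
     <repository>
       <id>nexus-releases</id>
       <url>http://your-nexus-host:8081/nexus/content/repositories/releases</url>
     </repository>
   </distributionManagement>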

6. This step automates the deployment of the newly released version to our web application server. Here we create a script file and configure it in pom.xml to execute after a successful release. The script's main job is to connect to the Nexus repository and download the latest release; a sketch of such a script follows step 7.

7. Once the download is complete, the script automatically deploys the distribution war to our web application server (GlassFish open source edition).
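A minimal sketch of such a script, assuming Nexus's artifact redirect service and GlassFish's asadmin tool; the host, group id, artifact id and application name are placeholders:

   #!/bin/sh
   # Ask Nexus for the latest released war
   wget -O /tmp/myapp.war "http://your-nexus-host:8081/nexus/service/local/artifact/maven/redirect?r=releases&g=com.example&a=myapp&v=LATEST&e=war"
   # Replace the running application on GlassFish
   # (the undeploy will fail harmlessly on the very first deployment)
   asadmin undeploy myapp
   asadmin deploy --name myapp /tmp/myapp.war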









pom.xml is the heart of any Maven project. It is the file where we configure the build phases, what to include or exclude in a phase, how to filter out unwanted dependencies and, most importantly, how the dependencies are managed. Managing dependency versions is one of the most important aspects of Maven. In the early days, developers had to take care of dependency jar versions manually, but Maven eliminated that need and provided a framework to handle dependency versions automatically. If we put "LATEST" as the version under the "<version>" tag, Maven automatically fetches the latest version of that dependency from the Maven central repository. It's that simple. For example:
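Here the group and artifact ids are only illustrative; the point is the version value:

   <dependency>
     <groupId>com.example</groupId>
     <artifactId>example-lib</artifactId>
     <version>LATEST</version>
   </dependency>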

Sonatype Nexus: Nexus is a repository manager. It lets developers publish their build artifacts and share them with other projects.

Hudson: Continuous Integration (CI) is a technique to automate the build process. A CI server periodically checks for code changes in the source code repository using its SCM settings, then validates, compiles, tests, builds and installs the distribution package into the local repository. It alerts the developers in case of any failure in the build cycle.

Git: Git is a distributed version control and source code management (SCM) system.

Sunday 21 July 2013

JVM Tuning

The JVM is the heart of any Java based application. Most of the issues we face in our applications and servers are due to incorrect JVM settings or memory leaks. For memory leaks we have no option but to sit and debug the code, but the former is an entirely different task: tuning a JVM is a dark art. You need complete knowledge of the JVM architecture, and tuning a JVM for specific hardware is another headache. I struggled a lot to configure our GlassFish server on Ubuntu with 16GB of RAM. The default GlassFish JVM settings are not suitable for production use and required some alteration. Now, after much struggle, our server is running fine, so I thought this post would be helpful for people who, like me, are struggling to configure JVM settings for different application servers. First we will learn about each segment of the JVM in detail, and in my next post I will show the GlassFish JVM settings for production use.






-Xms: This JVM argument requests the operating system to allocate the given minimum amount of heap memory at start-up, e.g. -Xms1g allocates 1GB as the initial heap memory. It accepts m and g as units.

-Xmx: Sets the maximum amount of JVM heap memory. An OutOfMemoryError is thrown when memory demand grows beyond this level, whether due to memory leaks or an overload condition.


-XX:NewRatio: Defines the ratio between the old (tenured) and new (young) generation sizes. For example, -XX:NewRatio=2 allocates a 2:1 ratio of old to new generation. If our maximum heap is -Xmx9g, the calculation goes like this:

   2/3 * 9 = 6GB for the old generation
   1/3 * 9 = 3GB for the new generation

-XX:PermSize: Sets the initial size of the permanent generation (non-heap) space


-XX:MaxPermSize: Sets the maximum size of the permanent generation (non-heap) space


-XX:SurvivorRatio: This parameter controls the size of the two survivor spaces, e.g. -XX:SurvivorRatio=6 sets the ratio between each survivor space and eden to 1:6, so each survivor space will be one-eighth of the young generation.

Adaptive sizing must be disabled with -XX:-UseAdaptiveSizePolicy, otherwise the JVM ignores this parameter. If the survivor spaces are too small, the copying collection overflows directly into the tenured space; if they are too large, they will sit mostly empty.

-server: The JVM can be tuned for a client or a server machine. For example, if you download the GlassFish app server, by default it is configured for a client machine using the -client parameter, but for production we have to change this parameter to -server for better performance. -client and -server change the default garbage collection algorithms: -client uses the serial collector and -server uses the parallel collector. We will look at the GC algorithms in a short while.
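Putting the options above together, a start-up line might look like this; the sizes are purely illustrative, not recommendations:

   java -server -Xms2g -Xmx2g -XX:NewRatio=2 -XX:SurvivorRatio=6 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:-UseAdaptiveSizePolicy -jar myapp.jar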


How Garbage Collection works?

    All GC algorithms follow the same basic principle to deliver better memory utilization. Their implementations differ, and some algorithms favour better throughput while others favour reduced latency, but their core principle is the same. When a new instance/object is created by the application, it is allocated in the eden/young generation space; from there the algorithm detects whether it is a short-lived or a long-lived object. If it is long-lived, it is moved on to the survivor spaces and then on to the tenured or old generation.
Finally, when a full GC occurs, such an old object is cleaned up if it no longer has any references. Short-lived objects stay in eden or the survivor spaces and are cleaned up by the short-pause (minor) GCs.

Saturday 20 July 2013

Install UML in Netbeans 7.3

Netbeans supported UML up to the 6.7 release, but in later versions I couldn't find any of the UML plugins. Does anybody know the reason behind that? Fortunately, I figured out a way to integrate UML into later versions of Netbeans. I tried this approach in Netbeans 7.3 and it worked for me.
But, unfortunately, Netbeans 7.3 was unable to open the diagrams which I had created earlier. I created a project and a use case diagram successfully, saved it and closed the project. After some time I reopened the project and tried to open the use case diagram, but Netbeans couldn't open the diagram at all. For now I have to fall back to Netbeans 6.9 to continue with UML.
If anyone is looking for UML in the latest versions of Netbeans, please follow the steps below, though given the problem above it is of limited use.

1) Start Netbeans

2) Go to Tools -> Plugins -> Settings (tab),
     click on the Add button, enter the URL below and click OK

     http://dlc.sun.com.edgesuite.net/netbeans/updates/6.9/uc/m1/dev/catalog.xml

3) Go to Tools -> Plugins -> Available Plugins (tab)
     and search for "UML" in the search text field at the top right corner

4) Select the UML plugin, click on Install and restart Netbeans


Monday 15 July 2013

Cron Jobs Schedule Pattern

Cron job patterns. Note that these are six-field expressions with a leading seconds field (second minute hour day-of-month month day-of-week), as used by Quartz and Spring's @Scheduled, not the five-field Unix crontab format:

To run a job every hour of every day

   0 0 * * * *

To run a job every 15 seconds

  */15 * * * * *

To run a job every 30 minutes

   0 0/30 * * * *

To run a job at 6, 7 and 8 o'clock every day

   0 0 6-8 * * *

To run a job at 6, 6:30, 7, 7:30, 8 and 8:30 every day

   0 0/30 6-8 * * *

To run at 6, 7 and 8 o'clock on weekdays only

   0 0 6-8 * * MON-FRI

To run every Christmas day at midnight

   0 0 0 25 12 ?

To run at midnight every day

   0 0 0 * * ?

Saturday 6 July 2013

Spring Task Execution And Scheduling

If your application needs a single task or a batch of tasks to be executed in an asynchronous manner at some point in the future, then what you are ultimately looking for are the TaskExecutor implementations and the TaskScheduler of the Spring framework.
TaskExecutor allows us to execute tasks in an asynchronous manner.
TaskScheduler allows us to schedule a task to run at some point in the future.
To learn more about TaskExecutor and TaskScheduler, visit http://static.springsource.org/spring/docs/3.0.x/reference/scheduling.html

Here we can see how to declare the task executor and task scheduler in the Spring configuration file,
and how to create a bean whose method carries the cron-based @Scheduled annotation that controls the date and time of execution.

Spring configuration file
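A minimal sketch of such a configuration, assuming the Spring 3.0 schema locations; keep the base-package placeholder until you substitute your own package:

   <?xml version="1.0" encoding="UTF-8"?>
   <beans xmlns="http://www.springframework.org/schema/beans"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:context="http://www.springframework.org/schema/context"
          xmlns:task="http://www.springframework.org/schema/task"
          xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
              http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.0.xsd
              http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task-3.0.xsd">

       <context:component-scan base-package="xxxxxxxxxxx" />

       <task:annotation-driven executor="taskExecutor" scheduler="taskScheduler"/>
       <task:executor id="taskExecutor" pool-size="1" queue-capacity="2"/>
       <task:scheduler id="taskScheduler" pool-size="1"/>
   </beans>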

Now, to schedule a job to run every 20 minutes:
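A sketch of the annotated bean; the method body is only illustrative:

   import org.springframework.scheduling.annotation.Scheduled;
   import org.springframework.stereotype.Service;

   @Service
   public class ScheduleService {

       // Fires at minutes 0, 20 and 40 of every hour
       @Scheduled(cron = "0 0/20 * * * ?")
       public void runScheduledTask() {
           System.out.println("Scheduled task executed");
       }
   }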

Make sure to replace the xxxxxxxxxxx in <context:component-scan base-package="xxxxxxxxxxx" /> with the package name of your ScheduleService class.

Now, let's analyse the attributes of the task executor and scheduler.


<task:scheduler id="taskScheduler" pool-size="1"/>

pool-size="1": Here the value "1" indicates that only one task can be scheduled at a time. If you want one or more task to be scheduled in an asynchronous manner, increase its value. For example, to schedule 5 tasks at a time ( <task:scheduler id="taskScheduler" pool-size="5"/> )

<task:executor id="taskExecutor" pool-size="1" queue-capacity="2"/>

As in the case of the scheduler, a pool-size of "1" indicates that only one task can be executed at a time. If you set the pool-size of both the scheduler and the executor to 1, they behave in a synchronous manner, i.e. only one task can be scheduled and only one task can be executed at a time. If you want to execute more tasks at a time, just increase the value of pool-size. For example, to execute 5 tasks at a time:
<task:executor id="taskExecutor" pool-size="5" queue-capacity="2"/>

<task:executor id="taskExecutor" pool-size="1" queue-capacity="2"/>

queue-capacity="2": The capacity of the TaskExecutor's queue. Before a task is handed to the pool, the executor checks the pool's current capacity; if the pool is completely full and busy, the task is pushed onto the queue, and it remains there until the executor finishes running a task. Here the value "2" means that only 2 tasks can wait in the queue.


Tuesday 2 July 2013

Avoid Duplicates in List

How to avoid duplicates in List 

There is no List implementation in Java that rejects duplicates directly. Use a Set, for example a HashSet or LinkedHashSet, together with an ArrayList to remove the duplicates, as in the sketch below.
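A minimal sketch; a LinkedHashSet is used here so the original insertion order is preserved (a plain HashSet would also remove duplicates, but without any order guarantee):

   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.LinkedHashSet;
   import java.util.List;

   public class AvoidDuplicates {
       public static void main(String[] args) {
           List<String> withDuplicates = new ArrayList<String>(Arrays.asList("a", "b", "a", "c", "b"));
           // Copying through a LinkedHashSet drops the duplicates
           List<String> withoutDuplicates = new ArrayList<String>(new LinkedHashSet<String>(withDuplicates));
           System.out.println(withoutDuplicates); // prints [a, b, c]
       }
   }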

Monday 1 July 2013

/proc/kcore and rinetd occupies the entire hard disk space

We had a strange problem on an Ubuntu server. It happened without any warning, and I still don't know the reason for it. I logged in to the system and couldn't do anything; I kept getting strange errors. Then I looked at the hard disk space: 0, no space left. It was shocking. We have around 2TB of disk space, so I checked the folders one by one, and there was one folder named "proc" that appeared to occupy almost 1.32TB, with the rest taken up by the /var/log folder.
I googled many times, and the answer I got is that "/proc/kcore" is a virtual file: it does not occupy any real disk space, it is protected, and you cannot delete it either. So I restarted the system. After a quick reboot the system was normal and everything was working fine, but an hour or so later I checked the disk space again and, boom, almost 50% appeared to be occupied by the "/proc/kcore" file again. I couldn't find an answer for this, so I upgraded Ubuntu 10.04 to 12.x.
After a successful upgrade everything seemed to be normal. After 3-4 days I checked the space again: boom, 35% of the space was gone. I checked the "/proc/kcore" file this time; fortunately it was not the culprit. Then I checked the /var/log folder: too many logs.
After searching for a very long time, I found that a process called rinetd was creating too many logs.

I still couldn't find an answer for my first problem.

But I managed to solve the issue with the rinetd process. To disable its logs, just perform the steps below.

1) Stop the rinetd process first

       service rinetd stop

2) Delete the rinetd log files from the log folder

       cd /var/log
       rm rinetd.log*

3) Open the rinetd config file and disable the log output

     nano /etc/rinetd.conf

     To disable logging, make sure the logfile line is commented out (with a leading #):

      # Remove the comment from the line below to enable logging option.
      # logfile /var/log/rinetd.log

4) Restart the service again

      service rinetd restart