Rajiv Srivastava's blog: 2011

Monday, September 12, 2011

Integrating Hibernate with Maven

I had been trying to integrate Hibernate with Maven for quite a long time but gave up every time because of the long list of dependencies that have to be configured when working with Hibernate. Moreover, no website or blog clearly said where to find an up-to-date repository from which all the dependencies could be downloaded. I would have expected the Hibernate guys to maintain the dependencies so that any user could add a single <dependency> element to the POM file and let that do the magic, because that's what Maven is for: resolving transitive dependencies.
Lately the JBoss guys have started maintaining the Hibernate dependencies at https://repository.jboss.org/nexus/content/groups/public. Using this repository, you don't have to add multiple dependencies like hibernate-core, hibernate-commons, hibernate-annotations... and so on.
Adding the following dependency does most of the magic.
<dependency>
       <groupId>org.hibernate</groupId>
       <artifactId>hibernate-entitymanager</artifactId>
       <scope>compile</scope>
       <version>3.4.0.GA</version>
</dependency>
But you still have to configure the logging dependency. In this case Hibernate requires slf4j-log4j12. So you might just add a dependency on slf4j-log4j12 with the current GA version, as below:
<dependency>
       <groupId>org.slf4j</groupId>
       <artifactId>slf4j-log4j12</artifactId>
       <type>jar</type>
       <scope>runtime</scope>
       <version>1.5.2</version>
</dependency>

But as soon as you compile, there's a boooom... an exception occurs.
So what went wrong? The slf4j-log4j12 version that you just used is not compatible with the Hibernate version you are using. Why can't the JBoss guys maintain this in the repo and do away with it?

If you hit this, you can find out which slf4j version Hibernate actually pulls in by running the command mvn dependency:tree

In this case, I had to use version 1.4.2 of slf4j-log4j12.
Pre-requisites: Maven is downloaded and configured on your machine.

Tuesday, August 30, 2011

Comparable Interface : Problem while adding object to TreeSet

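import java.util.Set;
import java.util.TreeSet;

public class X {
    public static void main(String[] arg) {
        Set s = new TreeSet();
        s.add(new Person(20));
        s.add(new Person(10)); // throws ClassCastException
        System.out.println(s);
    }
}

class Person {
    int i;

    Person(int i) {
        this.i = i; // store the constructor argument in the field
    }
}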
Can anyone tell me why this code fragment throws a ClassCastException? I am using the Java 1.5 compiler.
In the above code
Set s=new TreeSet();
the set s is a TreeSet, which accepts any object. When the developer adds the first object, new Person(20), the code does not fail. When the developer adds the second object, new Person(10), the code fails.
Now replace the code with the following:
Set s=new TreeSet();
s.add(new Integer(10)); // replace person object with Integer object
s.add(new Integer(20)); // replace person object with Integer object
System.out.println(s);
The above code will not fail.
Now let's dig into the concept.
TreeSet is a class that implements the Set interface; being a set, it does not accept duplicates. The first insertion goes fine. When we try to add the second object, there must be some criterion by which the two objects can be compared. In the above code there is no such criterion: to order its elements and detect duplicates, the set internally uses the Comparable interface (the compareTo method), which the Person class does not implement. So in this case the developer has to implement the Comparable interface in the Person class.
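The failure described above can be reproduced in a few lines. This is only a sketch: PlainPerson is a hypothetical stand-in for the Person class from the question, and the try/catch is there just to make the exception visible. Note that on Java 7 and later the first add() already throws, because TreeSet type-checks the very first element as well.

```java
import java.util.Set;
import java.util.TreeSet;

class NotComparableDemo {

    // Hypothetical stand-in for Person: it does not implement Comparable.
    static class PlainPerson {
        final int i;

        PlainPerson(int i) {
            this.i = i;
        }
    }

    // Returns true if adding two PlainPerson objects to a TreeSet fails.
    static boolean addingFails() {
        Set<Object> s = new TreeSet<Object>();
        try {
            s.add(new PlainPerson(20));
            s.add(new PlainPerson(10));
            return false;
        } catch (ClassCastException e) {
            // TreeSet tried to cast the element to Comparable and could not.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addingFails()); // prints true
    }
}
```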
There are fourteen classes in the Java 2 SDK, version 1.2, that implement the Comparable interface.
BigDecimal, BigInteger, Byte, Double, Float, Integer, Long, Short, Character, CollationKey, Date, File, ObjectStreamField, String
So when the developer adds objects of the above classes to a TreeSet, they won’t cause the problem the developer is facing in the above scenario.
The code below for the Person class will solve the developer’s problem.
public class Person implements Comparable {
    int i;

    Person(int i) {
        this.i = i; // store the constructor argument in the field
    }

    // Orders Person objects by the value of i.
    public int compareTo(Object o1) {
        Person other = (Person) o1;
        if (this.i == other.i)
            return 0;
        else if (this.i > other.i)
            return 1;
        else
            return -1;
    }
}
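Since generics are available from Java 5 onwards, the same idea can also be written without the cast. The following is a minimal sketch, not the only way to do it: ComparablePerson, its age field, and TreeSetDemo are hypothetical names used for illustration, and Integer.compare assumes Java 7 or later.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical generic variant of the Person class above.
class ComparablePerson implements Comparable<ComparablePerson> {
    final int age;

    ComparablePerson(int age) {
        this.age = age;
    }

    @Override
    public int compareTo(ComparablePerson other) {
        // Negative, zero, or positive, as TreeSet expects.
        return Integer.compare(this.age, other.age);
    }

    @Override
    public String toString() {
        return "Person(" + age + ")";
    }
}

class TreeSetDemo {
    static Set<ComparablePerson> build() {
        Set<ComparablePerson> s = new TreeSet<ComparablePerson>();
        s.add(new ComparablePerson(20));
        s.add(new ComparablePerson(10));
        s.add(new ComparablePerson(20)); // compareTo returns 0, so this is a duplicate
        return s;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints [Person(10), Person(20)]
    }
}
```

The set is both sorted and free of duplicates because TreeSet relies on compareTo for ordering and for deciding whether two elements are the same.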

Tuesday, August 9, 2011

What is the difference between mvc1 and mvc2?

In the MVC-1 architecture (also referred to as Model 1), the request is first handled by a JSP that interacts with a Bean. Here the JSP page may contain part of the processing logic, although the bulk of it may be handled by the beans, which may interact with the database. In this case the JSP, in addition to being responsible for the 'View' of MVC, also takes on the responsibility of 'Controller', and the beans act as the Model. For small applications that do not have complex processing, this model may be fine, but for bigger applications where a lot of processing and decision making is required (authentication, logging, conditional redirection, database interactions, network connections) it is not the best option.

In such cases MVC-2, or Model 2, architecture is the better option. It has a Controller Servlet that handles all the incoming requests (you may refer to the Front Controller pattern) and acts as the 'Controller'. It determines what comes next, a View (JSP) or further processing by the Model (beans doing all the complex tasks), and decides which view displays the results from the Model.
The links on the JSP pages for next view may also pass through the controller servlet to determine the next view, unlike in MVC-1 where the links on a JSP page will directly point to another JSP page.
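The routing idea behind Model 2 can be sketched in plain Java, without the Servlet API. Everything here is a hypothetical illustration: the FrontControllerSketch class, the Action interface, the "/login" path and the view names are made up, and in a real application the handle method would be the controller servlet's request-handling method.

```java
import java.util.HashMap;
import java.util.Map;

// Model 2 sketch: one controller receives every request, lets the model
// do the work, and picks the view. No JSP links to another JSP directly.
class FrontControllerSketch {

    // Stand-in for a model bean; returns the logical name of the next view.
    interface Action {
        String execute(Map<String, String> params);
    }

    private final Map<String, Action> actions = new HashMap<String, Action>();

    FrontControllerSketch() {
        actions.put("/login", new Action() {
            public String execute(Map<String, String> params) {
                // Conditional redirection is decided here, not inside a JSP.
                return "secret".equals(params.get("password")) ? "home.jsp" : "login.jsp";
            }
        });
    }

    // Single entry point for all requests.
    String handle(String path, Map<String, String> params) {
        Action action = actions.get(path);
        if (action == null) {
            return "notFound.jsp";
        }
        return action.execute(params);
    }

    public static void main(String[] args) {
        FrontControllerSketch controller = new FrontControllerSketch();
        Map<String, String> params = new HashMap<String, String>();
        params.put("password", "secret");
        System.out.println(controller.handle("/login", params)); // prints home.jsp
    }
}
```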

Saturday, July 30, 2011

Good article about JVM - Heap memory, Garbage collection and other Java related articles.

10 points about Java Heap Space or Java Heap Memory

http://javarevisited.blogspot.com/2011/05/java-heap-space-memory-size-jvm.html#.TjMNBluRoRQ

Wednesday, July 27, 2011

Optimizing SQL Queries

I have found a nice article on Oracle query optimization:

D.1 Optimizing Single-Table Queries

To improve the performance of a query that selects rows of a table based on a specific column value, create an index on that column. For example, the following query performs better if the NAME column of the EMP table has an index.

SELECT * 
FROM EMP 
WHERE NAME = 'Smith';

D.2 Optimizing Join Queries

The following can improve the performance of a join query (a query with more than one table reference in the FROM clause).

D.2.1 Create an Index on the Join Column(s) of the Inner Table

In the following example, the inner table of the join query is DEPT and the join column of DEPT is DEPT#. An index on DEPT.DEPT# improves the performance of the query. In this example, since DEPT# is the primary key of DEPT, an index is implicitly created for it. The optimizer will detect the presence of the index and decide to use DEPT as the inner table. If there is also an index on the EMP.WORKS_IN column, the optimizer evaluates the cost of both orders of execution, DEPT followed by EMP (where EMP is the inner table) and EMP followed by DEPT (where DEPT is the inner table), and picks the least expensive execution plan.

SELECT e.SS#, e.NAME, d.BUDGET
FROM EMP e, DEPT d 
WHERE e.WORKS_IN = d.DEPT# 
AND e.JOB_TITLE = 'Manager';

D.2.2 Bypassing the Query Optimizer

Normally the optimizer picks the best execution plan, that is, an optimal order of tables to be joined. If the optimizer is not producing a good execution plan, you can control the order of execution using the HINTS feature of SQL. For more information see the Oracle9i Lite SQL Reference.
For example, if you want to select the name of each department along with the name of its manager, you can write the query in one of two ways. In the first example which follows, the hint /++ordered++/ says to do the join in the order the tables appear in the FROM clause without attempting to optimize the join order.

SELECT /++ordered++/ d.NAME, e.NAME
FROM DEPT d, EMP e
WHERE d.MGR = e.SS#

or:

SELECT /++ordered++/ d.NAME, e.NAME 
FROM EMP e, DEPT d 
WHERE d.MGR = e.SS#

Suppose that there are 10 departments and 1000 employees, and that the inner table in each query has an index on the join column. In the first query, the first table produces 10 qualifying rows (in this case, the whole table). In the second query, the first table produces 1000 qualifying rows. The first query will access the EMP table 10 times and scan the DEPT table once. The second query will scan the EMP table once but will access the DEPT table 1000 times. Therefore the first query will perform much better. As a rule of thumb, tables should be arranged from the smallest effective number of rows to the largest effective number of rows. The effective row size of a table in a query is obtained by applying the logical conditions that are resolved entirely on that table.
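The arithmetic above can be checked with a small counting simulation. This is only a sketch of an index nested-loop join, not how the Oracle optimizer works internally: a HashMap stands in for the index on the inner table, the row counts come from the example, and the rule that department d is managed by the employee with SS# d * 100 is a made-up assumption.

```java
import java.util.HashMap;
import java.util.Map;

class JoinOrderSketch {
    static final int DEPARTMENTS = 10;
    static final int EMPLOYEES = 1000;

    // Hypothetical data: department d is managed by employee d * 100.
    static int managerOf(int dept) {
        return dept * 100;
    }

    // DEPT is the outer table: scan it once, probe the EMP index per row.
    static int probesWithDeptFirst() {
        Map<Integer, String> empBySs = new HashMap<Integer, String>(); // index on EMP.SS#
        for (int ss = 0; ss < EMPLOYEES; ss++) {
            empBySs.put(ss, "emp" + ss);
        }
        int probes = 0;
        for (int d = 0; d < DEPARTMENTS; d++) {
            empBySs.get(managerOf(d)); // one indexed access to EMP
            probes++;
        }
        return probes;
    }

    // EMP is the outer table: scan it once, probe the DEPT index per row.
    static int probesWithEmpFirst() {
        Map<Integer, String> deptByMgr = new HashMap<Integer, String>(); // index on DEPT.MGR
        for (int d = 0; d < DEPARTMENTS; d++) {
            deptByMgr.put(managerOf(d), "dept" + d);
        }
        int probes = 0;
        for (int ss = 0; ss < EMPLOYEES; ss++) {
            deptByMgr.get(ss); // one indexed access to DEPT
            probes++;
        }
        return probes;
    }

    public static void main(String[] args) {
        System.out.println(probesWithDeptFirst() + " vs " + probesWithEmpFirst()); // prints 10 vs 1000
    }
}
```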

In another example, consider a query to retrieve the social security numbers and names of employees in a given location, such as New York. According to the sample schema, the query would have three table references in the FROM clause. The three tables could be ordered in six different ways. Although the result is the same regardless of which order you choose, the performance could be quite different.

Suppose the effective row size of the LOCATION table is small, for example select count(*) from LOCATION where LOC_NAME = 'New York' is a small set. Based on the above rules, the LOCATION table should be the first table in the FROM clause. There should be an index on LOCATION.LOC_NAME. Since LOCATION must be joined with DEPT, DEPT should be the second table and there should be an index on the LOC column of DEPT. Similarly, the third table should be EMP and there should be an index on EMP#. You could write this query as:

SELECT /++ordered++/ e.SS#, e.NAME 
FROM LOCATION l, DEPT d, EMP e 
WHERE l.LOC_NAME = 'New York' AND 
l.LOC# = d.LOC AND 
d.DEPT# = e.WORKS_IN;

D.3 Optimizing with Order By and Group By Clauses

Various performance improvements have been made so that SELECT statements run faster and consume less memory cache. Group by and Order by clauses attempt to avoid sorting if a suitable index is available.

D.3.1 IN subquery conversion

Converts IN subquery to a join when the select list in the subquery is uniquely indexed.
For example, the following IN subquery statement is converted to its corresponding join statement. This assumes that c1 is the primary key of table t2:

SELECT c2 FROM t1 WHERE 
c2 IN (SELECT c1 FROM t2);

becomes:

SELECT c2 FROM t1, t2 WHERE t1.c2 = t2.c1;

D.3.2 ORDER BY optimization with no GROUP BY

This eliminates the sorting step for an ORDER BY clause in a select statement if ALL of the following conditions are met:

All ORDER BY columns are in ascending order or in descending order.
Only columns appear in the ORDER BY clause. That is, no expressions are used in the ORDER BY clause.
ORDER BY columns are a prefix of some base table index.
The cost of accessing by the index is less than sorting the result set.

D.3.3 GROUP BY optimization with no ORDER BY

This eliminates the sorting step for the grouping operation if GROUP BY columns are the prefix of some base table index.

D.3.4 ORDER BY optimization with GROUP BY

When ORDER BY columns are the prefix of GROUP BY columns, and all columns are sorted in either ascending or in descending order, the sorting step for the query result is eliminated. If GROUP BY columns are the prefix of a base table index, the sorting step in the grouping operation is also eliminated.

D.3.5 Cache subquery results

If the optimizer determines that the number of rows returned by a subquery is small and the query is non-correlated, then the query result will be cached in memory for better performance. Currently the number of rows is set at 2000. For example:

select * from t1 where 
t1.c1 = (select sum(salary) 
from t2 where t2.deptno = 100);

Monday, July 25, 2011

Why ConcurrentHashMap is better than Hashtable and HashMap ???

ConcurrentHashMap is an often ignored class. It offers a very robust and fast (comparatively, we all know Java concurrency isn’t the fastest) way of synchronizing a Map collection.

There is no way you can compare Hashtable and HashMap: one offers synchronized methods to access a map while the other offers no synchronization whatsoever. What most of us fail to notice is that while our applications, web applications especially, work fine during development and testing, they usually fall over under heavy (or even moderately heavy) load in production. This is because we expect our HashMaps to behave a certain way, but under concurrent load they usually misbehave.

Hashtables offer thread-safe access to their entries, with a small caveat: the entire map is locked to perform any sort of operation. While this overhead is ignorable in a web application under normal load, under heavy load it can lead to delayed response times and overtaxing of your server for no good reason.

This is where ConcurrentHashMaps step in. They offer all the features of Hashtable with performance almost as good as a HashMap. ConcurrentHashMap accomplishes this with a very simple mechanism. Instead of a map-wide lock, the collection maintains a list of 16 locks by default, each of which guards (or locks on) one segment of the map, that is, a group of buckets. This effectively means that 16 threads can modify the collection at the same time, as long as they are all working on different segments. In fact, ordinary put, get and remove operations never lock the entire map. The concurrency level of the collection, the number of threads that can modify it at the same time without blocking, can be increased. However, a higher number means more overhead in maintaining this list of locks.
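As a rough illustration, the sketch below has several threads write to one ConcurrentHashMap with no external synchronization. The class and method names are hypothetical. The three-argument constructor (initial capacity, load factor, concurrency level) is part of the JDK, but note that from Java 8 onwards the implementation no longer uses a fixed list of segment locks and treats the concurrency level only as a sizing hint.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentPutDemo {

    // Starts the given number of threads, each inserting its own range of keys,
    // and returns the final size of the map.
    static int fill(final int threads, final int perThread) {
        // 16 is the default concurrency level mentioned above.
        final Map<Integer, Integer> map =
                new ConcurrentHashMap<Integer, Integer>(16, 0.75f, 16);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int k = 0; k < perThread; k++) {
                        map.put(base + k, k); // no synchronized block needed
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 1000)); // prints 4000
    }
}
```

Running the same loop against a plain HashMap gives no such guarantee: entries can be lost or the map can be corrupted.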

Retrieval operations on a ConcurrentHashMap generally do not block. The exception (in the Java 5 and 6 implementation) is when an entry is found but its value reads as null. In that case the map locks the segment and reads the value again, just in case the entry was being put or removed at the same moment as the get.
Removal operations do require a bit of overhead. All removal operations require the chain of elements before and after to be cloned and joined without the removed element. Since the value of the map key is volatile (not really, the value of the inner Entry class is volatile) if a thread already traversing the bucket from which a value is removed reaches the removed element, it automatically sees a null value and knows to ignore such a value.
Traversal in a ConcurrentHashMap does not synchronize on the entire map either. In fact, traversal does not synchronize at all except under one condition. The internal linked-list implementation is aware of changes to the underlying collection. If it detects any such change during traversal, it synchronizes on the bucket it is traversing and then tries to re-read the values. This ensures that the values received are fresh while keeping locking to a minimum, if there is any at all.
Iteration over a ConcurrentHashMap is a little different from that offered by other collections. The iterators are not fail-fast, in the sense that they do not throw a ConcurrentModificationException. They do not guarantee that elements added after the iterator was created will be listed. Instead the iterators are weakly consistent: they reflect the state of the map at some point at or since their creation, so later updates or removals may or may not be visible. They also guarantee that no element will be returned more than once during traversal.
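The difference in iterator behaviour is easy to see side by side. In this sketch (IteratorDemo and failsFast are hypothetical names) the same loop modifies the map while iterating over it: a HashMap throws ConcurrentModificationException, a ConcurrentHashMap does not.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IteratorDemo {

    // Returns true if adding a key during iteration raised
    // ConcurrentModificationException on the given (empty) map.
    static boolean failsFast(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Structural modification while an iterator is active.
                map.put("added-while-iterating", 99);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsFast(new HashMap<String, Integer>()));           // prints true
        System.out.println(failsFast(new ConcurrentHashMap<String, Integer>())); // prints false
    }
}
```

Whether the ConcurrentHashMap iterator actually visits the newly added key is not guaranteed either way.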
In conclusion, give it a try: replace some Hashtables in your application with ConcurrentHashMap and see how they perform under load. Both implement the Map interface, so it shouldn’t be hard to update your app.