Tuesday, May 29, 2012

Java 7: Multi-catch block


I wrote a blog about using a multiple-exception catch statement long before Java 7 was released. At that time, the multi-catch block and improved rethrow were part of Project Coin, the set of small language enhancements. Now that Java 7 is out, we can see it in action. This is how it works.

First of all, to get multi-catch working, we need to pass '-source 7' as a command line parameter during compilation.
We can write multi-catch statement as -

        try {
            newCatchBlock.methodWithMultipleExceptions();
        } catch (MyException1 | MyException2 exception) {
            ...
        }

This can be useful in a few cases and can save a few lines of code. The type of the caught exception will be the closest common base class. If there is no other common class, the type of the object will be Exception, the base class for all exceptions. That means you will not be able to invoke methods specific to MyException1 or MyException2, as the object will be of type Exception. The parameter in the catch block is final. Trying to reassign a value to it will result in the error "Multi catch exception may not be assigned".
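A minimal runnable sketch of this behaviour (the exception types and class name below are hypothetical, not from the original post):

```java
// Hypothetical demo: two unrelated exception types handled by one
// multi-catch block. Requires compilation with -source 7 or later.
public class MultiCatchDemo {
    static class MyException1 extends RuntimeException {
        MyException1(String m) { super(m); }
    }

    static class MyException2 extends RuntimeException {
        MyException2(String m) { super(m); }
    }

    static void mayFail(int mode) {
        if (mode == 1) throw new MyException1("first");
        if (mode == 2) throw new MyException2("second");
    }

    // The caught 'exception' is effectively final; its static type is the
    // closest common base class of the listed types (here RuntimeException).
    public static String handle(int mode) {
        try {
            mayFail(mode);
            return "no exception";
        } catch (MyException1 | MyException2 exception) {
            return "handled: " + exception.getMessage();
        }
    }
}
```

Both exception types land in the same block, and only methods of the common base class are available on the parameter.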

Another important point to note is that the exceptions in a multi-catch statement cannot have a parent-child relationship. You cannot have a catch block like (IOException | FileNotFoundException exception) because FileNotFoundException is a subclass of IOException.

When I looked at how this feature is implemented in the language, I found that there is no magic here. The generated bytecode will just have a plain old catch block which catches the exception of the closest common base class. That also explains the type of the caught exception.

This feature adds a bit of syntactic sugar. As I mentioned in my previous blog on this topic, I don't see it being very useful in day-to-day programming. I don't remember the last time I wanted to catch multiple exceptions and handle them with the same code. However, it does make Java a little bit better than before.

Thursday, July 15, 2010

Rails best practices

I come from a Java/C# background and am working on my first Rails project. I heard in the beginning that Rails has a specific style. Now, after spending a few months with Rails, I see that Rails, like other MVC frameworks, has the same gotchas and similar best practices to address them.
I have observed a few best practices that we follow on our project. Here is my attempt to consolidate them.

  1. Fat models and skinny controllers is one of the most talked-about best practices in the Rails world. It is aligned with the MVC philosophy of having minimal code in the controllers. Putting business logic in the model makes the code easily testable and reusable. Apart from the render block, actions in controllers should get, create or update the models and then invoke at most one method on them. All other business logic should be pushed into the models.

    That being said, we should be careful that our models do not put on too much fat. The User model in my current project is 450 lines, and its spec is 1100 lines long! This clearly indicates that the model is doing too many things. What can we do with such an overweight model? There are multiple ways to handle this -
    • First, try to identify whether it is possible to extract another model. However, it makes sense to do so only if some code clearly belongs to another model. Otherwise, we would be creating an extra model object which does not exist in the domain, and we would end up confusing other devs.
    • Extract a group of related functionality into mixins. This makes the models a little more maintainable and also segregates the specs.
    • Use ActiveRecord::Observer to perform certain actions when a model object is created or deleted. This is a great way to separate out code that isn't the model's core responsibility.
      In my project, we want to send an email when a new model is created. Sending email is not the model's responsibility, so the code should not live in the model class. In such situations, it's better to use observers. Also, an observer can observe more than one model. One problem with this approach is that the model class does not hold any reference to the observer, so by looking at the code in the model, it's not evident that an observer is observing the class.
    • Make sure that the model does not have presentation logic. If the model has code which modifies or combines model properties for the view, move that code to view helpers.
    • Use the concerned_with plugin, which helps move code into separate files. It segregates concerns into different files by opening the model class multiple times. Personally, I don't see it as much different from extracting the code into modules.
  2. Keep your views lightweight. Views should contain only HTML and minimal rendering code. Move code from views to controllers if it accesses the database, to models if it has business logic that belongs to a specific model, and to view helpers if it is about formatting data for the UI. Make sure that views are not unknowingly making database calls by accessing associations. Try to avoid putting CSS and JavaScript in ERBs; they should go in separate files.
  3. Use view helpers effectively. Views, models and controllers should not have presentation logic. Moving the presentation logic to view helpers makes it easily testable.
  4. In modules, avoid accessing instance variables of the including class. If the module is included in a controller, avoid referencing the params hash. Instead of using instance variables or methods from the class, pass the required data to the module's methods. This reduces coupling between the class and the included modules, and makes the module code reusable and easy to test.
  5. If you are using an ActiveRecord finder with the same conditions in multiple places, move it to the model as a named_scope. Named scopes let you give a descriptive name to the finder, and you can chain multiple named_scopes together. Move all ActiveRecord finders from controllers to models.
  6. Understand the callbacks available on ActiveRecord models. Consider using ActiveRecord observers.
  7. If you have common code across multiple actions in a controller, extract that code into filters. Use before_filter, around_filter and after_filter to keep your controllers DRY.
  8. Ideally, a controller should have only action methods. If you have a method in a controller which is not an action, see if you can move it to a model or a view helper.
  9. Instead of creating the model and then building it from params in the controller, pass the hash to the model's initialize. Instead of having conditional logic in controllers, move it downstream into models or services.
  10. Understand REST principles. Create a controller for each identifiable resource/domain entity. Do not update the database in the index/show actions of a controller. Use the right HTTP methods.
  11. Consider creating service classes to represent operations which do not have state, e.g. code which talks to an external service can go in a service class.
  12. And most importantly, write tests for all your code !
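The mixin extraction from point 1 can be sketched in plain Ruby (the module, class and method names below are hypothetical, not from our project):

```ruby
# Hypothetical sketch: pulling a group of related methods out of a fat
# User model into a mixin. The model stays focused, and the mixin's spec
# can live in its own file.
module Notifiable
  def notification_address
    "#{name} <#{email}>"
  end
end

class User
  include Notifiable

  attr_reader :name, :email

  def initialize(name, email)
    @name = name
    @email = email
  end
end
```

The same segregation works for ActiveRecord models: the mixin only relies on methods the including class provides, which keeps it reusable and independently testable.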

Monday, June 21, 2010

Loading a readonly ActiveRecord object

I am working on a story, with my pair Jaju, which generates reports from the database. The database has questions and answers, and we want to generate a report with all questions and answers in a specific range. All we wanted to do was load some data in readonly mode; we had no intention of updating the database. It turned out to be not that straightforward, though.

The Question model has_many :answers and the Answer model belongs_to :question.

To begin with, the simplest thing to do is -

questions = Question.all(:include => :answers , :conditions => ["answer.created_date > ?", Date.today.to_formatted_s(:db)])

All well and good. One query which joins the question and answer tables is fired, and we get questions with answers created after the given date. But now there is a new problem. Our BA (Business Analyst) says: "Hey, this is not what I want. I want all the questions in the report; only the answers should get filtered out if they are not created after the given date. So even if a question doesn't have an answer in the given time range, I need the question in the report without the answer."

Ohh... the code that we have written filters out questions that do not have answers created after the given date, but we can change it. Here is the next version of the code.

1| results = Question.all(:include => :answers)
2| results.each { |question|
3|   question.answers = question.answers.select { |answer|
4|       answer.created_date > Time.parse(Date.today.to_formatted_s(:db))
5|   }
6| }
Alright! This is exactly what we want. We get all the questions, with answers created after Date.today. But wait, the test log has the following query -

UPDATE `answer` SET question_id = NULL WHERE (question_id = 1747 AND answer_id IN (1099))

Damn!! ActiveRecord is setting question_id to null in the answer table. This is dangerous!

We realised that this is happening because we are filtering the answers in the question object and reassigning them to question.answers (line 3 in the above code). question.answers is an AssociationProxy object, and when we reassign question.answers on line 3, it updates the association in the database.

We tried to find out whether we can load the complete ActiveRecord object in readonly mode. This can be done by -

results = Question.all(:include => :answers, :readonly => true)

But this doesn't fix the issue; it still updates the answer table. The :readonly flag is only applicable to the parent object, so changes to the question object will not be persisted, but changes to question.answers are still reflected in the database.

The root of the problem is that we are assigning something to an association proxy object, so we need to avoid that.
While looking at the ActiveRecord code, we realised that ActiveRecord maintains a hash of all the attributes, and it is possible to add a new attribute to that hash (thanks to my pair). So we changed the code as below.

1| results = Question.all(:include => :answers)
2| results.each { |question|
3|   question["filtered_answers"] = question.answers.select { |answer|
4|       answer.created_date > Time.parse(Date.today.to_formatted_s(:db))
5|   }
6| }

So rather than changing the association proxy object, we are adding a new entry to the attributes hash. Obviously, it no longer updates the database either. With the above code, I can refer to the filtered answers as question.filtered_answers.
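The difference between the two versions can be sketched with a toy stand-in (plain Ruby, not real ActiveRecord; all names below are hypothetical): assigning through the association-style accessor "persists" a change, while adding a plain hash entry does not.

```ruby
# Toy stand-in for the behaviour we observed: assigning through the
# association accessor triggers a write, while adding a new hash entry
# leaves the database untouched.
class ToyQuestion
  attr_reader :writes

  def initialize(answers)
    @attributes = { "answers" => answers }
    @writes = 0
  end

  def answers
    @attributes["answers"]
  end

  # Mimics reassigning an AssociationProxy: the change is "persisted"
  # immediately, like the UPDATE we saw in the test log.
  def answers=(new_answers)
    @writes += 1
    @attributes["answers"] = new_answers
  end

  def [](key)
    @attributes[key]
  end

  def []=(key, value)
    @attributes[key] = value
  end
end
```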

This is the workaround that we used because we couldn't figure out a way to load the complete ActiveRecord object hierarchy in readonly mode, or to 'detach' the ActiveRecord object from the session. But anyway, our code works and we got what we wanted without unnecessary database updates.

Thursday, June 17, 2010

Public defender methods - yet another Java 7 proposal

Support for lambdas and closures is the most important language feature proposed to ship with Java 7. With lambdas, it would be possible to add some interesting methods to the old collection classes. Methods like find, select, reject, forEach and include could take a lambda and perform the relevant operation on the collection object. However, adding these methods to the existing collection interfaces would be a problem, as it would break existing implementations. To address this issue, the defender method proposal was suggested.

The inherently static nature of Java restricts developers from extending functionality without affecting existing usages. The current design of interfaces doesn't allow us to add a new method to an interface without breaking all its implementations. Sometimes there is a strong case for adding new behaviour to an interface so that it is applicable to all implementations. Currently, the only way to do this is to change the interface into an abstract class and change the code to extend this abstract class. However, that means changing every place where the interface is implemented, and the change is not always possible anyway, because a class might already have a superclass. The public defender method proposal suggests a solution.

This proposal allows a default implementation to be provided for interface methods. The default implementation is a static method on a helper class, and it is invoked if the class implementing the interface does not provide a custom implementation.

The proposed syntax is -

public interface List {
    extension public List select(#boolean(Object)) default ListHelper.select
    // other list methods
}

Here, the select method on the List interface takes a lambda (which is again a proposal) as an argument. The 'extension' keyword signifies an extension method. It must have a default implementation, which is provided by the static select method on the ListHelper class. This method has the same signature as the extension method and receives the extended interface object (the list) as its first argument. If the class implementing the List interface does not provide an implementation, the select method from the ListHelper class is used. The helper-class method that provides the default implementation is called a public defender method, as it defends the classes that do not implement the extension method.

Obviously, specific implementations of the interface can provide an optimised implementation of an extension method.

As mentioned in Baptiste Wicht's blog, with this approach it is possible to pull the static methods of the Collections class into the List interface. This would allow us to invoke methods like reverse() on the list object itself, rather than passing the object to a static method on Collections. It would also allow us to override these methods.

In general, this proposal will allow us to add the relevant methods to interfaces instead of depending on classes full of static utility methods.
So we will be able to do -

list.select(..)

instead of

ListHelper.select(list, ..)

The helper class will still hold the implementation, but callers need not be aware of it.
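The current helper-class style that the proposal would make unnecessary can be sketched as follows (ListHelper and Predicate here are hypothetical names, pre-lambda Java):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of today's static-utility style: the list is passed
// explicitly as the first argument, exactly what a defender method would
// receive behind the scenes.
public class ListHelper {
    public interface Predicate {
        boolean test(Object o);
    }

    // The helper-class equivalent of list.select(..): callers must know
    // about ListHelper and pass the list themselves.
    public static List<Object> select(List<Object> list, Predicate predicate) {
        List<Object> result = new ArrayList<Object>();
        for (Object o : list) {
            if (predicate.test(o)) {
                result.add(o);
            }
        }
        return result;
    }
}
```

With defender methods, this same body would back a select method declared on the interface itself.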

Along with the ability to extend existing interfaces, this proposal would have a more profound effect on programming style. It would be possible to create an interface in which all the methods are extension methods with default implementations. Such an interface could then be used as a 'mixin' to achieve the effect of multiple inheritance. Also, abstract classes might no longer be required, as the same functionality could be achieved by interfaces with a few defender methods.

We can compare this proposal with the extension methods introduced in C# 3.0. Although not exactly the same, they are conceptually similar. C# extension methods allow us to add methods to existing classes by defining static methods in another static class. However, the visibility of C# extension methods is much more controlled, as those methods are available only in classes that import the static class containing the extension method. This is different from defender methods, where extension methods will be available to all implementations of the interface.

At the moment, defender methods are only a proposal for Java 7 and, just like the other proposals, we are not sure whether they will make it into Java 7. Nevertheless, this interesting language feature would improve the extensibility of the dying Java language and allow us to extend the existing libraries.

Thursday, September 24, 2009

Spring AOP: Avoid JDK proxies for Struts applications

Spring AOP uses both JDK proxies and cglib to create dynamic proxies. The main purpose of the proxy classes is to intercept method invocations on the target object and execute the code from the applicable advice. If the class implements an interface, Spring uses JDK proxies by default; for all other classes it uses cglib to generate proxies.

Proxy classes generated by JDK proxies provide implementations for all methods in all the interfaces implemented by the target class. So the generated class will not have methods that are defined by the target class but are not part of any interface. All implementations of an interface can share the same generated class, which results in fewer dynamically generated proxy classes.
On the other hand, proxies generated by cglib extend the target class, so the generated proxy has all the methods defined by the class. This results in more proxy classes, as every class, even different implementations of the same interface, gets its own dynamically generated proxy. However, the generated proxy classes are cached, so new classes are not created on every invocation.
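The interface-only nature of JDK proxies is easy to demonstrate with plain java.lang.reflect.Proxy, the same mechanism Spring uses under the hood (the Greeter types below are hypothetical):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    public interface Greeter {
        String greet();
    }

    public static class FriendlyGreeter implements Greeter {
        public String greet() { return "hello"; }

        // Not declared on the interface, so a JDK proxy won't expose it.
        public String wave() { return "waving"; }
    }

    // Builds a pass-through JDK proxy over the target, implementing only
    // the Greeter interface.
    public static Greeter proxyOf(final Greeter target) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, java.lang.reflect.Method method,
                                 Object[] args) throws Throwable {
                return method.invoke(target, args);
            }
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }
}
```

Calling greet() through the proxy works, but the proxy's Class object has no wave() method, which is exactly the problem the Struts interceptors run into below.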

In a web application which uses Struts, it's not possible to use JDK proxies for Spring AOP. Consider the following scenario.

I have an action class -

package com.amey.labs.web.action;
// imports

public class HomeAction implements ModelDriven {

    private String name;
    private PersonService personService;
    private HomePage homePage = new HomePage();

    public HomeAction() {
    }

    public HomeAction(PersonService personService) {
        this.personService = personService;
    }

    @Action(value = "/home", results = {@Result(name = SUCCESS, location = "home.jsp")})
    public String execute() {
        homePage.setPerson(personService.getDetails(name));
        return SUCCESS;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public HomePage getModel() {
        return homePage;
    }
}

I have configured an around advice for this action to log the entry and exit of the execute method.

<aop:config>
    <aop:aspect ref="loggerAspect">
        <aop:around method="doLogging"
                    pointcut="execution(public * com.amey.labs.web.action..*.execute(..))"/>
    </aop:aspect>
</aop:config>

HomeAction implements the ModelDriven interface, so Spring will use a JDK proxy to create the proxy class. The created proxy class implements all the methods from the ModelDriven interface. However, it won't have the execute() method, as that is not defined in the ModelDriven interface. If I try to execute this action, I get a NoSuchMethodException in AnnotationValidationInterceptor. This interceptor reads the method name carrying the @Action annotation and then looks up the method object with the same name on the action's Class object. However, the Class object belongs to the proxy class generated by the JDK proxy instead of the actual action class, so it does not have the execute() method.

To fix this issue, I have to implement the com.opensymphony.xwork2.Action interface (or extend ActionSupport). Once I do this, the created proxy class implements the Action interface and has the execute() method.

Now, when I invoke the action from the UI, it can run execute() in the action. However, the generated proxy class doesn't have the methods for setting individual attributes of the action class (name in HomeAction). So there is no way for the parameters interceptor to find out that the name-value pair passed in the request (either GET or POST) is actually an attribute of the action, and it just ignores it. The CompoundRootAccessor class, which is invoked from the interceptor, generates a warning that the setter for "name" is not present in the action class or the model class; the warning is logged if Struts is running in dev mode, otherwise the parameter is considered junk and silently ignored.

So the action class cannot have any attribute which should be set from the UI. I need to move the name attribute from the action to the model class. If the attribute is part of the model, it gets populated properly and I can use it in my action. However, an attribute cannot always be a logical part of the model object.

To force Spring to use cglib proxies, we just have to set the proxy-target-class attribute to true.

<aop:config proxy-target-class="true">
    <!-- aop aspects -->
</aop:config>

In Struts actions where we have a property that we want to set from the UI, JDK proxies cannot be used, so we have to use proxies generated by cglib. Even in other cases, cglib proxies perform better than JDK proxies. With JDK proxies, the proxy class intercepts all method invocations on the target object, and all methods are invoked by reflection, even if they are not advised. With cglib, the proxy class extends the target class and intercepts only the advised methods; all other methods are invoked directly. So using cglib proxies is considerably faster. Obviously, cglib has its own set of problems, but it's a better alternative to JDK proxies.

Tuesday, May 5, 2009

Do you really want moist tests?

Moist tests are test cases that do not follow the DRY (Don't Repeat Yourself) principle. According to this philosophy, the setup method and any other private method in the test class are a smell. It advocates that having all the test code in the test method itself improves readability and makes the tests easy to maintain. I recently joined a project team which follows this concept, and it was pretty surprising for me.

A test acts as documentation for the object it tests. If I have to understand the job of a class, I go to its test and read the test methods. And thanks to the concept of moist tests, I end up reading five lines of code, duplicated in every test, which create the object of the system under test, create mock objects, set expectations, inject dependencies and do other setup work. The reader has to skip a chunk of code to get to the crux of the test. This is definitely not improving the readability of the test.

The readability and maintainability of the test is more important than deciding whether to be moist or DRY in your tests. I agree that we should not be obsessive about strictly adhering to DRY in the test code, but in the same way, we should not be intentionally 'wet' and duplicate the code everywhere.

Let's look at the reasons people give for writing 'moist' tests. I have taken most of these from Jay Fields's and Shane Harvie's posts.

- When a test fails, I want to fix it quickly
How will duplicating the code in all the test methods allow you to fix it quickly? Suppose you add one more parameter to the constructor of your object: changing it in one place in the setup method will be much quicker than changing it in every test method.

Obviously, if you want to initialize your object differently, you have to instantiate it in your test method. But that's a special case; maybe one of ten test methods will need a different object, and that method will instantiate it. The other methods in the test class will continue using the object instantiated in the setup method. If all the test methods need different objects, then you don't need the instance variable at all, and every test will have its own local variable. By doing so you are breaking the DRY principle, but it is necessary to make sure that your test is not forced to execute unnecessary code from the setup method.

- I have to search around a large test class for extracted methods, or instance variables that may be modified throughout the class; this costs me time
Why would you have a large test class? You will have a large test class only if the object you're testing has multiple jobs and you have to test all that behavior. If you follow the single responsibility principle and write classes that have only one job, you will have only one behavior to test (with its alternate flows) and you will end up with small test classes.
So, if you have a large test class, it's a smell that your object is doing too many things. You need to fix that first by segregating the responsibilities into separate classes. Once you move the responsibilities to different classes, you move the tests to separate test classes as well, and you end up with small test classes. If doing so is impossible for some reason, you can extract the test cases into separate files, each testing a part of the object's behavior.

Still, even if your test class is large, I do not understand how searching for a method takes much time (in most IDEs, you just hit Ctrl+B or a similar shortcut to go to the method definition). The purpose of extracting methods is not only to group a block of code together; the more important purpose is to create an abstraction. By giving the extracted method a descriptive name and using it in your test, you create an abstraction over a block of code and thereby increase the readability of the test.

In a few tests, an instance variable initialized in the setup method will be modified by the test. However, this should not affect other tests. Ideally, you create the instance of the system under test (the object which you are testing) in the setup method and then use it in your test. Changes made by other tests to the instance variable will not affect your test, as the object is initialized afresh in the setup method before every test.

- Some developers don't know that the setup method gets executed / they may not notice that a setup method exists in the test
This is probably the lamest excuse. Even a developer who learned JUnit today knows that the setup method runs before every test. And if your developers don't have this basic understanding, you should stop writing code and educate them first. Their ignorance can't be the reason you don't want setup methods in your test classes.

- Separating the test code into setup and test methods decreases readability
Writing code in the setup method reduces readability only if you are not using the setup method properly, i.e. if you have test-specific code in the setup method. However, if the test environment setup code is in setup and the test and assertion code is in the test method, it follows the natural flow of the test and should increase readability. Though it depends on the reader as well: if the person is used to reading code with small modular methods, reading tests with setup methods should come naturally to them.

- Tests should be independent, so why should all of them use a common setup (or any other common) method?
When we say that tests should be independent, it doesn't mean that they should be lexically separate. It means that we should be able to run tests individually and the outcome of one test should not impact other tests. The setup method is executed before every test, so the objects are initialized freshly for every test. If the setup method is too big and contains code that is not required by all the tests, then you should extract the test-specific code into your test methods or private methods.

- If you have to write a lot of code to set up your test, you have to rethink your object model instead of putting all the code in the setup method
This is definitely true. Big setup code is a smell that your class has too many dependencies. In that case, we should definitely try to minimize it by breaking the class into different classes. But duplicating the chunk of code in all the tests is not the solution.


The ultimate aim is the readability of the test. Readability of the test is more important than applying DRY (this holds almost as true for production code as well). We should not apply the DRY principle stringently to test code, but that doesn't mean we should go entirely the other way, write duplicate code and make our tests moist on purpose.

IMHO, a test method should have the code to invoke the method under test and do the state/behavior verification. It may also have code to set expectations on mock objects. The code for initializing the system under test (probably by injecting some mock objects) should go in the setup method, which is the right place for it. Obviously, if all the test methods need different setups, then a common setup method doesn't make sense and we have to duplicate some code in the test methods. If you notice that changes required in a common extracted method keep affecting other tests, it's time to get rid of the extracted method and copy the code into the tests. So you may end up with a test method which has all the code for setup, test and assertion. It's OK to violate the DRY principle in your test code if required, but this should be an exception. Naturally, we should try to keep our tests DRY and introduce 'moistness' only when required; we shouldn't intentionally make them 'wet' by duplicating code everywhere.

Wednesday, April 22, 2009

Using field level access for hibernate entities

Hibernate allows you to configure the access mechanism for attributes of an entity object: it can access the attributes directly or through the accessors. Field or property access is decided by Hibernate based on the location of @Id or @EmbeddedId.

@Entity
public class Item {
    @Id
    public String id;
    public long itemNumber;
}

@Entity
public class FoodItem {
    private String id;
    private long foodId;

    @Id
    public String getId() {
        return id;
    }

    public long getFoodId() {
        return foodId;
    }
}

In the above code, Item class uses field access and FoodItem class uses property access.

The property access mechanism allows a class to hide its internal data structure from the external world and enables the class to perform validations while setting or accessing a value. It is useful when we want to perform some calculation before returning a result or setting a value.

However, we mostly end up directly accessing the variable from the getter and setter methods. In that case, property access is overkill: it makes us write a lot of getters and setters which are not used anywhere else in the code. It also exposes setters even when we don't want anyone to set the value of fields like 'id'. We can mark those accessors private or protected to avoid this problem, but we will still have the code.

IMHO, while creating an entity we should start with field-level access. Doing so reduces the unnecessary code that is used only by Hibernate. We should keep the fields private and provide accessor methods only when required, so our entity object is modeled by the needs of the application, not by Hibernate. In the future, if we need to validate or manipulate the values set on the object (which mostly won't happen), we can switch to property access, but we shouldn't start with it.