Sunday, 22 January 2012

Google App Engine, with a comparison to Cloud solutions

A few weeks ago we had an internal seminar at Syncron. You may find my slides on Google App Engine (including a comparison to other cloud solutions) interesting. Please note that most of the information is written in the speaker notes.

Wednesday, 15 April 2009

Windows Tools Useful for Developers

Besides the IDE and the browser, I use a lot of simple and cheap (often free) software that helps me a lot with everyday developer tasks. I don't like spending time on repetitive (and therefore boring) tasks - the tools below help me keep that time to a minimum.

BareTail - This is a log file viewer. It opens a file and can follow the end of it, so it always displays the last lines even when new content is being appended all the time. It works like 'tail -f' in Unix. There are a few features that make this program very handy:
- a customizable font makes it easy to see a lot of log lines on the screen
- BareTail doesn't lock files on the file system, so restarts of monitored servers aren't a problem - it keeps following the file even if it's deleted and recreated
- remote files on Windows shares are opened as if they were local files
- BareTail doesn't load files into memory, so handling very large files isn't a problem
- coloring makes it easy to track errors in your application - just assign the red color to errors to see them easily
I use the free version (without the find option) and it works fine for me.

Beyond Compare - A file and directory comparator. It adds a compare option to the standard Windows context menu for every file and directory, so it's easily accessible. Its strengths are ease of use and performance. There are buttons and keyboard shortcuts for common actions: expanding all directories, selecting all files, comparing selected files, etc. It has a nice option to refresh the compare view and the ability to compare archives (ZIPs, EARs, JARs, etc.) as normal directories. It costs about $30.

Process Explorer - This tool works great as a replacement for the MS Windows Task Manager. The feature I like best is the tree view of processes. It helps to track the command-line arguments passed to sub-processes, for example when Ant invokes Java. It's free to download from Microsoft.

Active Ports - I have one simple use case for this tool: finding the process that occupies a port I need. Often port 1099 is taken by some process and my JBoss can't start. I use Active Ports to find the process and sometimes kill it. It's free.

Total Commander - I've never managed to get used to MS Windows Explorer. I prefer the two-window interface of Total Commander. Total Commander makes it easy to handle archives (e.g. ZIP or JAR) - when Ctrl-PgDown is pressed an archive is opened like a directory. I often use the simple FTP client built into it. It costs about $30.

WinSCP, PuTTY - I use them to connect to remote systems. They are free.

Let me know if you have tools you can't live without!

Tuesday, 7 October 2008

Logging on application servers – java.util.logging with log4j together

Logging seems quite an easy task. However, from my experience I know that not all Java programmers fully understand it. There are a lot of different libraries on the market to choose from – log4j, java.util.logging (JUL, or Jdk14 logging), commons-logging, slf4j and some others. Although a large choice of libraries generally seems a good thing, it is a pain in the case of logging. I'll explain why.

I got a task at work to choose a logging framework for our application. My first idea was to take the framework that was already used by the third-party libraries that are part of our application. Thanks to this all logs could be aggregated together and we would have the logging configuration in one place. This would mean, for instance, that our application's logs would be written together with the SQL queries logged by Hibernate. This would enable easy debugging. The opposite situation, when application logs are in a different file than Hibernate logs, isn't very convenient. It quickly turned out that it wasn't possible to use the same logging framework as our libraries did, because the libraries used different logging frameworks. Log4j is used by default by JBoss and one of our in-house modules, JUL by JSF-RI, Facelets, JAXB and CXF. Quartz, Spring and RichFaces use commons-logging. Hibernate switched to slf4j. Luckily there was a solution – routing log entries from one logging framework to another.

Logging on JBoss, making JUL work

By default JUL writes all logs to System.err (see JRE_HOME/lib/logging.properties). It means that if you use a Logger object created by JUL your log entries will be just printed to the console. Luckily JBoss intercepts System.err and dumps it through its internal logging system to the server.log file. JBoss by default uses log4j and is configured through jboss-log4j.xml. Thanks to this all logs produced by JBoss and the deployed applications are put to the same place. The only problem is that all JUL logs are treated by JBoss's log4j as plain text:

12:15:01,820 ERROR [STDERR] 04-Jan-2008 12:15:01 com.sun.facelets.compiler.TagLibraryConfig loadImplicit
INFO: Added Library from: jar:file:...

The first part is produced by log4j and the rest by JUL. As you see it's impossible to use jboss-log4j.xml to filter JUL logs based on level (priority) or category. That's because JBoss treats all data printed to System.err as an ERROR in the STDERR category. This means we can't make the proper use of the original level (INFO) and category (com.sun.facelets.compiler.TagLibraryConfig).

JUL can only be configured globally, at the JVM level. There is no per-classloader configuration like there is for log4j. You can override the default JUL configuration either by specifying a file accessible through the file system or programmatically. File configuration isn't appropriate for the Java EE environment. The best solution to configure JUL is to use the JBoss service as described by Shrubbery. Shrubbery also provides a ready-to-use handler that redirects JUL to log4j. In this solution the JUL redirecting handler is registered when the application EAR starts and unregistered when the EAR stops. Unregistering is needed to remove the class dependency from JUL to the log4j appenders loaded from the EAR. As long as the handler class is registered in JUL the garbage collector cannot clean it up.
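To make the register/unregister idea concrete, here is a minimal sketch of a forwarding JUL handler. It is not Shrubbery's code: LogSink, ForwardingHandler and JulRedirect are hypothetical names, and LogSink merely stands in for log4j so that the sketch needs nothing beyond the JDK.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical stand-in for the target logging framework (log4j in this post). */
interface LogSink {
    void write(String category, Level level, String message);
}

/** Forwards every JUL record to the sink, keeping its level and category. */
class ForwardingHandler extends Handler {
    private final LogSink sink;

    ForwardingHandler(LogSink sink) {
        this.sink = sink;
        setLevel(Level.ALL); // leave filtering to the target framework
    }

    @Override
    public void publish(LogRecord record) {
        if (isLoggable(record)) {
            sink.write(record.getLoggerName(), record.getLevel(), record.getMessage());
        }
    }

    @Override
    public void flush() {
        // nothing is buffered
    }

    @Override
    public void close() {
        // nothing to release
    }
}

class JulRedirect {
    /** Call when the application starts: adds the handler to the JUL root logger. */
    static Handler install(LogSink sink) {
        Handler handler = new ForwardingHandler(sink);
        Logger.getLogger("").addHandler(handler);
        return handler;
    }

    /** Call when the application stops, so JUL no longer references our classes. */
    static void uninstall(Handler handler) {
        Logger.getLogger("").removeHandler(handler);
    }
}
```

In a real deployment install and uninstall would be tied to the application's start and stop callbacks, and the sink would translate the JUL level and call the log4j logger named after the category.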

After applying Shrubbery's handler the logs look right, with levels and categories preserved:

14:18:03,490 INFO [faces.compiler] Added Library from: jar:file:…

With redirecting in place there is no need to touch JUL configuration. All setup can be done in one place – jboss-log4j.xml. This applies to configuration of appenders, priorities, etc. Because jboss-log4j.xml is kept out of the application EAR each developer may have their own configuration to debug specific parts of the system.

Once all logs ended up in the same place we decided that for our project it didn't make much difference which logging API we used. In that case let's use JUL because it's a JDK standard. There were no important functional differences that justified choosing something non-standard. Adhering to the standard reduces the number of libraries we depend on.

Logging on WebSphere, making log4j work

WebSphere 6 internally uses JUL. Thanks to this all logs reported with the JUL API are stored properly in SystemOut.log. The correct level and category are preserved. This is also the case for commons-logging. It seems that WebSphere automatically reconfigures commons-logging to use JUL. This happens even if log4j is on the classpath, which normally switches commons-logging to log4j. Slf4j can easily be set up to use JUL, which is very reasonable for WebSphere. The only framework that needs attention is log4j.

Routing log4j to JUL is easier than routing the other way round. That's because it's easy to have a per-classloader configuration of log4j. To configure the redirection you need to add log4j.properties to the EAR's classpath and write a log4j appender.

The log4j.properties that work for me are here:
log4j.appender.JUL=se.sync.util.logging.JulLog4jAppender
log4j.logger.se.sync=ALL,JUL
log4j.additivity.se.sync=false
I tried to make it work by reconfiguring the rootLogger instead of the se.sync logger but it didn't work. I don't know exactly why but it seems that WebSphere registers a root appender when log4j is used and this causes duplicate log entries. To eliminate this both the appender and additivity have to be set on a non-root logger. Here is the code for the log4j appender.
package se.sync.util.logging;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.spi.LoggingEvent;
import org.apache.log4j.spi.ThrowableInformation;

/**
* @author karol.bienkowski on 30 Sep 2008
*/
public class JulLog4jAppender extends AppenderSkeleton {

    private static final Logger log = Logger.getLogger(JulLog4jAppender.class.getName());

    // Maps log4j levels to their closest JUL counterparts
    private static final Map<org.apache.log4j.Level, Level> LEVEL_MAP =
            new HashMap<org.apache.log4j.Level, Level>();

    static {
        LEVEL_MAP.put(org.apache.log4j.Level.OFF, Level.OFF);
        LEVEL_MAP.put(org.apache.log4j.Level.FATAL, Level.SEVERE);
        LEVEL_MAP.put(org.apache.log4j.Level.ERROR, Level.SEVERE);
        LEVEL_MAP.put(org.apache.log4j.Level.WARN, Level.WARNING);
        LEVEL_MAP.put(org.apache.log4j.Level.INFO, Level.INFO);
        LEVEL_MAP.put(org.apache.log4j.Level.DEBUG, Level.FINE);
        LEVEL_MAP.put(org.apache.log4j.Level.TRACE, Level.FINER);
        LEVEL_MAP.put(org.apache.log4j.Level.ALL, Level.ALL);
    }

    // Converts the log4j event into a JUL record, keeping the original category and timestamp
    @Override
    protected void append(LoggingEvent event) {
        LogRecord record = new LogRecord(extractLevel(event), extractMessage(event));
        record.setLoggerName(event.getLoggerName());
        record.setMillis(event.timeStamp);
        record.setThrown(extractThrown(event));
        log.log(record);
    }

    private Level extractLevel(LoggingEvent event) {
        org.apache.log4j.Level level = event.getLevel();
        Level ret = LEVEL_MAP.get(level);
        return ret != null ? ret : Level.ALL;
    }

    private String extractMessage(LoggingEvent event) {
        Object message = event.getMessage();
        return message != null ? message.toString() : null;
    }

    private Throwable extractThrown(LoggingEvent event) {
        ThrowableInformation info = event.getThrowableInformation();
        return info != null ? info.getThrowable() : null;
    }

    public void close() {
        // nothing to do
    }

    public boolean requiresLayout() {
        return false;
    }

}
The other trick is to put the log4j configuration into a separate JAR. An EAR's classpath specified in the MANIFEST.MF file can consist of JARs only, so I packed my log4j.properties into a one-file log4j-config.jar.

To illustrate how it works take a look at a sample of misconfigured logs on WebSphere:

[9/30/08 10:33:07:848 CEST] 0000003f SystemOut O 10391 [SoapConnectorThreadPool : 18] INFO org.hibernate.cfg.Environment - Hibernate 3.3.1.GA

and then after setting it correctly (slf4j with JUL):

[10/1/08 9:31:45:167 CEST] 0000008b Environment I org.hibernate.cfg.Environment Hibernate 3.3.1.GA



Saturday, 13 September 2008

JavaScript image changer

I'm definitely not a JavaScript programmer. I just wanted to have a simple script that would change the background image on a site automatically in a loop. I couldn't find a ready out-of-the-box solution so I created my own. It's based on YUI. The only reason for choosing YUI over Prototype or jQuery was that YUI is used by LightFlow. The target site of my image loop changer was already using LightFlow so the JavaScript could be shared.

Here is my code. This code is definitely not mature but still maybe you'll find it helpful. The following things should be applied to your HTML:

1. Load the YUI library
<script type="text/javascript"
src="http://yui.yahooapis.com/combo?2.5.2/build/yahoo-dom-event/yahoo-dom-event.js">
</script>

2. Create a DIV with the original image background (the one loaded initially)
<div id="imageLoopDiv" style="background: url(img1.jpg) no-repeat;"></div>

3. Initialize JavaScript: provide the list of images (loopImages), pre-load the images so that they don't blink when changed, and register the function that automatically switches backgrounds
<script type="text/javascript">

var loopImages=new Array('img1.jpg','img2.jpg','img3.jpg');

// Finds the image currently shown and switches the background to the next one in the list
var showNextImage=function(){
  var currentImage=YAHOO.util.Dom.getStyle('imageLoopDiv','background');
  var currentIdx=0;
  for(var i=0;i<loopImages.length;i++){
    if(currentImage.indexOf(loopImages[i])!=-1){currentIdx=i; break;}
  }
  var newImage=loopImages[(currentIdx+1)%loopImages.length];
  YAHOO.util.Dom.setStyle('imageLoopDiv','background','url('+newImage+') no-repeat');
};

YAHOO.util.Event.addListener(window,'load',function(){
  // Pre-load the images so that they don't blink when changed
  for(var i=0;i<loopImages.length;i++){
    var img=new Image(); img.src='../img/'+loopImages[i];
  }
  // Pass the function itself (not the result of calling it) so that it runs every 10 seconds
  YAHOO.lang.later(10000,null,showNextImage,null,true);
});

</script>

A running example of this script is oodesign.eu. The scripts are script.js and yui-utilities.js.

Sunday, 7 September 2008

Java links

This is a list of Java related links I've gathered over time.

2007-12-22: An introduction to REST

2007-12-05: Web framework comparison again. Matt Raible: the first PDF (classic Java frameworks) and the second PDF (not so pure Java).

2007-12-05: J2EE news. I used to read TheServerSide but recently InfoQ seems to be much more interesting.

2007-11-10: SOFEA - the future(?) of web frameworks. By reading this discussion (actual paper - pdf) you'll learn what the web framework should be and why Java web frameworks are anti-frameworks.

2007-07-19: catch (InterruptedException e). Have you ever wondered what to do with InterruptedException? Here is a tip.

2007-01-05: Design Driven Development. A good book about architecture and design, without any technology dependencies, at InfoQ

2006-02-17: AOP Article. A good article on AOP – at developerWorks (not only on AOP: design patterns, frameworks, QA, etc. are also addressed). Before this I was rather skeptical about AOP. Now I'm rather excited :-) I'd like to see a really OO system, with design patterns, and some crosscutting concerns designed as Aspects.

2005-04-25: Annotation. When to annotate - read a Bill Burke blog entry

Web Frameworks. See comparison by Matt Raible [PDF]

Micro Benchmarks. On testing small aspects of the application - at developerWorks

Exceptions. J2EE Exception handling strategy - a JavaWorld article

Sunday, 31 August 2008

Books

Here is a list of the important software books I've read over the last few years. Thanks to this list you can learn something about me, as these books affect the way I write code and have shaped my attitude to software projects. I'll be updating this list.

Remote GUI - DTOs and other problems

Here is a description of our adventurous process of developing an application with a remote GUI. Our application consists of two parts that can be deployed separately - the GUI and the server. Only the server has access to the database. The GUI connects to the database and other systems through the server.

It was not a technical decision to develop the remote GUI. From the sales perspective it was crucial to be able to run the server and the GUI on separate machines. We (Syncron) make a product so we have to accommodate the requirements of many potential customers. Another sales requirement was to be able to switch the transport protocol that is used between the GUI and the server – one of the customers wants to use JMS for this. What's more, web services and SOA are still hot terms so we just had to have these technologies in place even if this was not technically justified.

We ended up with the following remote GUI architecture:
  • the GUI is a JSF web application,
  • GUI-server communication goes through web services; CXF together with JAXB is used for this because CXF supports different transport protocols, including JMS,
  • the server is a Java application using JPA.

At first I thought that remoting between the GUI and the server would add just a few little complications to our development (like performance) and would be generally transparent. It hasn't been so easy.

To DTO or not to DTO

The first decision to make is about having a layer of DTOs. It's tempting to skip it. Having DTOs results in duplication of code because for each entity you need to create its DTO counterpart.

That's why our first approach was to skip DTOs and use entities directly in the GUI. By entities I mean classes that are mapped to database tables and annotated with JPA metadata. This worked like this:
  • the entity is read from the database (by Toplink at that time) and sent over the wire to the GUI to be displayed,
  • the GUI updates the copy of the entity and sends it back to the server,
  • the server applies the changes made in the GUI back to the database.
The procedure of applying the changes to the database isn't straightforward but I'll concentrate on this later. It's important to be aware that GUI operates on a copy of the entity. This copy is constructed within the web service framework (CXF in our case) in the process of XML deserialization.

Such an approach has many positive aspects. There is a common object model used throughout the application - entities. We call the set of entities the Domain Model. In the case of Syncron's business the domain model consists of entities like Order, Supplier or Warehouse. The common model used in both the server and the GUI ties them together strongly. We consider such tight coupling of the domain model to the GUI a good thing. Remember that the domain model is not a purely technical thing. It is created first by analysts in Word documents and UML diagrams. It describes how the customer's people see the business our software operates on. One of the goals of our architecture is to let non-programmers create GUI pages. It works well when the domain model is represented one-to-one in the objects the page templates operate on. This way the web developers can use the domain model specification as a reference when creating pages.

Domain Model vs. GUI
Often the GUI puts different requirements on the objects than the domain model does. This makes it hard to use domain model entities directly in the GUI and may suggest using DTOs.

The simplest example of such a discrepancy is data conversion and validation. For example, in the domain model we see the deliveryDate property as a Date. In the GUI it is always displayed and read as a String (e.g. "2008-08-22"). Such discrepancies are addressed on the fly by JSF converters and validators.
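The conversion itself is small. The sketch below shows the two directions a converter has to handle, using only the JDK; DeliveryDateConverter is a hypothetical name and the class deliberately does not implement the JSF Converter interface, so it illustrates the logic rather than our actual converter.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

/** Hypothetical helper: converts between the GUI string and the domain Date. */
class DeliveryDateConverter {
    private static final String PATTERN = "yyyy-MM-dd";

    // SimpleDateFormat is not thread-safe, so a fresh instance is created per call
    private static SimpleDateFormat newFormat() {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setLenient(false); // reject impossible dates such as month 13
        return format;
    }

    /** GUI to domain: parses the submitted text, which also validates it. */
    static Date toDate(String text) {
        try {
            return newFormat().parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Expected a date like 2008-08-22 but got: " + text, e);
        }
    }

    /** Domain to GUI: formats the Date for display. */
    static String toText(Date date) {
        return newFormat().format(date);
    }
}
```

A JSF converter would do the same work in its getAsObject and getAsString methods and report bad input as a conversion error instead of an IllegalArgumentException.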

There are discrepancies caused by the different level of granularity between the GUI and the domain logic. Take a wizard. It has a few screens that you use to fill in information about a user but in the domain model it ends up as a single createUser operation. It's not a good idea to pollute the domain model with the GUI logic of wizard steps (I mean having methods like createUserStep1 and createUserStep2 in the domain model). The JSF solution for this is managed beans. In our application we have managed beans only when needed and generally view templates operate directly on entities. It's the JSF Expression Language (EL) that makes it easy. With EL you can navigate through the domain model. Even if you change your domain model by applying refactoring you can easily adapt the view templates to the changes by updating the EL expressions. There is no need to flatten the model for views.

Now, the hard things. There are at least three serious problems caused by exposing entities to the GUI. Problem number one is large collections, problem number two is business methods and number three is cycles.
  1. Sometimes a to-many relation points to a large collection of objects. For example, a history collection of LoginAttempts associated with a given User can have thousands of elements. It would be time-consuming and pointless to send such a collection in full over the wire from the server to the GUI. Our solution for this issue is to remove all collection fields from the entities. This is a big limitation. To work around it the GUI can make a separate server request to fetch the collection contents. The collection may be fetched page by page, e.g. the first ten LoginAttempts. Making two requests instead of one hurts performance and adds complexity to the GUI application.
  2. The domain model needs business methods. An example business method may be loginUser, which denies access to users whose last three LoginAttempts failed because of a wrong password. Business methods access the database (e.g. to query for the last LoginAttempts), contact external systems and use third-party libraries (for encryption, statistics, etc). The best place for them is the domain model. The problem is that if you put a business method into an entity you can no longer share this entity with the GUI, because the GUI doesn't have the fancy server-side libraries. The compilation of the GUI would just fail. We had to give up and move business methods out of entities to a new services layer.
  3. Entities usually contain back links. From User you have a link to LoginAttempts and from LoginAttempts to User. You can usually write code without back links but it adds complexity to your business methods. The problem with back links is that they form cycles of dependencies. Such cycles cause trouble when serializing to XML for web service transport. When trying to serialize a cycle you will end up with endlessly nested XML elements. The JAXB specification doesn't address this so we ended up with a workaround from the Sun reference implementation – the CycleRecoverable interface.
Although there are workarounds for all these problems we decided that it was better to introduce DTOs to save our domain model. Without DTOs the model would become anemic – business methods would be taken out of entities to a new layer of services, and we would lose the possibility of having arbitrary collection fields and the ease of creating a complex structure of associations. We didn't want to lose all the advantages of having the domain model exposed to the GUI; a common object model for the whole application is the number one here. Therefore we made a few architectural decisions that helped to minimize the overhead of introducing the DTO layer.
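The page-by-page fetch mentioned in the first problem can be sketched as below. LoginAttemptPager is a hypothetical name, and the in-memory list only stands in for what would really be a paged database query behind a separate server request.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical server-side helper that returns one page of a large collection. */
class LoginAttemptPager {
    /**
     * Returns the elements of the given zero-based page, or an empty list
     * when the page starts past the end of the collection.
     */
    static <T> List<T> page(List<T> all, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize; // long arithmetic avoids int overflow
        if (from >= all.size()) {
            return Collections.emptyList();
        }
        long to = Math.min(from + pageSize, (long) all.size());
        // Copy the slice so only the requested page is serialized and sent
        return new ArrayList<T>(all.subList((int) from, (int) to));
    }
}
```

The GUI would ask for page 0 with a page size of 10 to show the first ten LoginAttempts and request further pages only when the user navigates to them.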

Working on a copy

Our DTOs are as similar to the entities as possible. Class names are the same (but with the "DTO" suffix), property names are the same and types of fields are usually the same. The DTOs are generally copies of the entities. The differences are that DTOs don't have big collection fields, their methods are only getters and setters, and the structure of associations is simpler than in entities.
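As an illustration, an entity and its DTO might look like the sketch below. The User fields, the isLockedOut method and the UserMapper class are hypothetical, not our production classes, and the copying is written by hand only to keep the sketch free of dependencies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical entity: owns a large collection and a business method. */
class User {
    Long id;
    String login;
    String email;
    /** Can grow to thousands of elements, so it never leaves the server. */
    final List<String> loginAttempts = new ArrayList<String>();

    /** Business method: true when the last three attempts all failed. */
    boolean isLockedOut() {
        int n = loginAttempts.size();
        if (n < 3) {
            return false;
        }
        for (String outcome : loginAttempts.subList(n - 3, n)) {
            if (!"FAILED".equals(outcome)) {
                return false;
            }
        }
        return true;
    }
}

/** Hypothetical DTO: same property names, getters and setters only, no collection. */
class UserDTO {
    private Long id;
    private String login;
    private String email;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getLogin() { return login; }
    public void setLogin(String login) { this.login = login; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
}

/** Hand-written copy, standing in for what a mapping framework would do. */
class UserMapper {
    static UserDTO toDto(User user) {
        UserDTO dto = new UserDTO();
        dto.setId(user.id);
        dto.setLogin(user.login);
        dto.setEmail(user.email);
        return dto;
    }
}
```

Because the property names match, a mapping framework can usually copy such a pair without any per-class configuration.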

Instead of copying data manually between entities and DTOs we use Dozer which does this job perfectly. We thought of making entities subclasses of DTOs but it doesn't solve any issue and adds complexity. In the DTOs we place only the properties really required by the GUI. The large collections and business methods are in place only in the entities so the business methods can operate on them. Thanks to skipping the large collections in DTOs there is no overhead on transporting them. We have the possibility of having different DTOs per one entity if different views require different subsets of data (e.g. UserDTO and BasicUserDTO). Nice and easy? Not exactly.

When working on a copy of the entity you can't take full advantage of many performance optimizations provided by the persistence layer. This includes Hibernate's lazy loading and the Open Session in View pattern. The creator of the DTO has to decide a priori which entity properties will be needed for a view. All lazy collections are loaded when the DTO is constructed by Dozer. For example, there can be a page that has either a basic or an advanced view of the User entity. In order to have less data loaded when the basic view is used you need separate DTO classes, e.g. BasicUserDTO and UserDTO. When working not on a copy but on the original entity you don't have to worry about this, because thanks to Open Session in View only the required data is loaded.

The update operation also suffers when the GUI doesn't operate directly on the entities. When a remote GUI is used the update operation results in constructing a DTO in the GUI. This DTO is sent to the server. The server can't just pass this DTO to the JPA layer because JPA has no idea about DTOs, so a manual (or Dozer) transformation to an entity is needed. This is a process of a few steps:
  • the original entity is read from the database through JPA based on identifier received in the DTO,
  • the changes from the DTOs are applied to the entity,
  • the entity is persisted.
The process of applying changes is recursive. In our application it is performed by Dozer. We have extended Dozer to recursively fetch the original entities from JPA. There are lots of cases to take into account here: updates of collections, updates of associated objects, recursive creation of new entities, etc. It is now up and running for us but it took some time to implement this. Without DTOs applying changes would be much simpler because we could use techniques like Hibernate's (or Toplink's) detached objects. With DTOs it's impossible to use detached objects as DTOs don't contain the complete information to recreate the entity. DTOs have only a subset of the entity fields.
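For a single flat entity the three steps can be sketched as follows. All names here are hypothetical: SupplierStore is an in-memory stand-in for the JPA layer, internalRating is an invented field that the DTO does not carry, and the recursion over associations that Dozer handles for us is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical entity; internalRating is server-only and absent from the DTO. */
class Supplier {
    Long id;
    String name;
    String phone;
    String internalRating;
}

/** Hypothetical DTO carrying only the fields the GUI can edit. */
class SupplierDTO {
    Long id;
    String name;
    String phone;
}

/** In-memory stand-in for the JPA layer. */
class SupplierStore {
    private final Map<Long, Supplier> rows = new HashMap<Long, Supplier>();

    Supplier find(Long id) { return rows.get(id); }

    void save(Supplier supplier) { rows.put(supplier.id, supplier); }
}

class SupplierUpdater {
    static Supplier applyUpdate(SupplierStore store, SupplierDTO dto) {
        // 1. read the original entity based on the identifier received in the DTO
        Supplier entity = store.find(dto.id);
        if (entity == null) {
            throw new IllegalArgumentException("Unknown supplier id: " + dto.id);
        }
        // 2. apply the changes from the DTO; fields missing from the DTO stay untouched
        entity.name = dto.name;
        entity.phone = dto.phone;
        // 3. persist the entity
        store.save(entity);
        return entity;
    }
}
```

The sketch also shows why the original has to be read first: the DTO alone cannot recreate the entity, because it has no value for internalRating.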

Third-party libraries

Using a new technology requires you to depend on many third-party libraries. I'd like to quickly remind you that this doesn't come for free.

Our web service implementation (CXF) has quite a few dependencies. CXF.jar itself is almost 4 MB and dependent jars add a few MB more. The weight that slows down deployment is not the only problem. We had a lot of versioning issues with the libraries provided by the application servers. JAXB was the biggest troublemaker here. We ended up bundling a lot of JARs into our EAR so that we have control over the exact library versions.

There is no bug-free code and this applies to the libraries we use. Some of the issues with Dozer hit us so seriously that we use a patched version of it. For a long time we've been working on the nightly snapshot of CXF because the released version didn't work for us. We require the latest JAXB implementation as the one bundled in JBoss doesn't work properly. BTW, we can't use the most recent JBoss as it has a bug in the JSF implementation that kills our application.

We have a Java-first approach to building web services. We don't pay much attention to what the resulting XML looks like. In theory the XML serialization should be transparent for us but in practice we need a few JAXB annotations to make it work. @XmlSeeAlso is the most annoying of them. We also need custom JAXB serializers because many constructs that are natural in Java aren't easily expressible in XML, e.g. this affects fields of type Object.

Performance

So far we haven't measured the performance of our application. Performance seems to be the most obvious obstacle to having the remote application behave like the local one. Because of the lack of tests we still don't know if it really is.

We only used intuition when deciding that it was not a good idea to send large collections over the wire. Our intuition told us that it's better to reduce the number of remote method invocations needed to render a screen. I can't write anything more on this subject as we don't have spreadsheets showing the actual impact. I hope we will create them some time when profiling our application.

Conclusion

Having a remote GUI for a system is a hard task. Don't do it unless you really need it. If you just need to structure your applications into layers - do it using local interfaces, not remote ones. If you are really forced to have a remote GUI take the following things into consideration:
  • You'll need a DTO layer and some framework to map entities to DTOs (e.g. Dozer). Without DTOs you can't have a rich domain model with business methods, large collections and complex associations.
  • You'll lose a lot of features of the persistence framework (e.g. Hibernate). Lazy loading, the open session in view and the detached objects technique can't be freely used.
  • You'll have to deal with many buggy external libraries and JAR versioning problems when running on the application server.
  • You'll need to worry about the performance.