Thursday, November 23, 2006

Automating Database Connection Management

One of the most frustrating problems in software development is running out of database connections because connections are not closed after use. Things get worse when we do not discover the problem in the development/testing phase (as we may not run the development/testing systems continuously for long periods), yet frustratingly enough, it will show up on the production site.

What can we do about this? One way is to thoroughly test the application for unclosed connections. This requires developer concentration and can be time consuming without proper tools. A more effective solution is to use a database access platform that automatically takes care of closing opened database connections. But what if our simple project requires direct database access technologies (such as pure JDBC or pure ADO.NET)?

Let’s look at a small code fragment showing a very simple data access framework that takes care of this problem. In short, we write a base database action class that provides and manages database connections for its subclasses, and we subclass it to write the data access code (data actions) we need. Below is an example of the base action class written in VB.NET.

Public MustInherit Class ActionBase(Of T)

    ' Subclasses implement their data access logic here
    Protected MustOverride Function Body(ByVal conn As DbConnection) As T

    Public Function Execute() As T
        ' Create a database connection
        Dim conn As DbConnection = ConnectionProvider.GetConnection()
        Try
            ' Call the body method, providing the connection
            Return Body(conn)
        Finally
            ' Close the connection even if Body throws
            conn.Close()
        End Try
    End Function

End Class

All subclasses need to implement the abstract method “Body(ByVal conn As DbConnection)”. A connection instance is passed in as the method parameter, and the subclass uses it to perform its data access logic. The base class is generic, so subclasses supply the type parameter, which is the expected return type of the action. For example, a data access action that retrieves a ‘User’ instance would look like this:

Friend Class RetrieveUserAction
    Inherits ActionBase(Of User)

    Private _userId As String

    Public Sub New(ByVal userId As String)
        Me._userId = userId
    End Sub

    Protected Overrides Function Body(ByVal conn As DbConnection) As User
        Dim user As New User()
        ' Use the connection to fetch user info and populate the 'user' instance
        Return user
    End Function

End Class

As you can see, we write subclasses to execute operations on the database. Developers do not have to look for database connections; one is provided through the “Body()” method parameter. They are also freed from the burden of closing connections, as the framework itself takes care of it.

Let’s look at how one can use the above class to retrieve the user with id=12.

Dim action As New RetrieveUserAction("12")

Dim user As User = action.Execute()

Even though this simple framework is quite useful for managing connections, it has some limitations. Developers cannot spawn threads and pass the connection into them for further processing. Also, the results are always disconnected; you can’t pass result sets to upper layers for processing (though I think this enforces a good practice).

One other major thing to consider is whether database resources are actually released by closing the connections. With Oracle over JDBC, closing the connection does not release database resources; you need to close the ‘Statement’ instance to release them. In such situations we can amend the framework to provide statement objects instead of connections.
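Here is a minimal sketch of that variation on the ADO.NET side, where the analogue of a statement is the DbCommand. It reuses the same hypothetical ConnectionProvider as above and disposes both the command and the connection when the action completes:

Public MustInherit Class CommandActionBase(Of T)

    ' Subclasses implement their data access logic against the command
    Protected MustOverride Function Body(ByVal cmd As DbCommand) As T

    Public Function Execute() As T
        Dim conn As DbConnection = ConnectionProvider.GetConnection()
        Dim cmd As DbCommand = conn.CreateCommand()
        Try
            Return Body(cmd)
        Finally
            cmd.Dispose()   ' releases statement-level resources
            conn.Close()
        End Try
    End Function

End Class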

Sunday, November 19, 2006

Estimating Estimations

During the last few days I have been researching how to perform effective estimations, and I learnt a few interesting points along the way. One good theory I came across was the “Cone of Uncertainty”.

The Cone of Uncertainty depicts how the level of uncertainty changes over the project life cycle: uncertainty shrinks as the project progresses. This may not be news to most of us, but how many of us actually use this instinctive knowledge in our estimation practices? For example, if we provide an estimate at the project-agreement stage, the actual effort can deviate from the estimate by as much as a factor of four. “The later you estimate, the higher the accuracy will be.”

Also, if we look at project failure (or overrun) rates, market research shows that the probability of success is much higher for small projects.

Project Size (lines of code)   Early   On Time   Late   Failed
1,000 LOC                      11%     81%       6%     2%
10,000 LOC                     6%      75%       12%    7%
100,000 LOC                    1%      61%       18%    20%
1,000,000 LOC                  <1%     28%       24%    48%
10,000,000 LOC                 0%      14%       21%    65%

One reason for this is that a large project is much harder to estimate than a small one (there are, of course, many other reasons as well). If you have a large system to develop, try to break it into several small projects. “The smaller the project, the higher the success probability.”

Another common mistake we make is a bias towards so-called ‘expert judgment’. How many of us believe our own experience and knowledge is the best source for estimations? Probably many of us do, but research has shown the converse: expert judgment is one of the worst methodologies for deriving an estimate. So what would give us a better estimate? Often historical data provides a better estimate than any other method. Historical data comes in three forms:

  • Industry average data
  • Organization average data
  • Project local data

Needless to say, accuracy is higher with project-local data than with industry average data. We should build an estimation process that uses historical data, and it is better to combine several industry-accepted estimation methodologies (calibrated with historical data) than to rely on a single one. Simple models with fewer control knobs have proven to deliver effective results. “Never trust expert judgment; use count-and-compute methods empowered with historical data.” A tiny sketch of such a count-and-compute calculation follows.
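As an illustration, here is a minimal count-and-compute sketch in VB.NET. The size and productivity figures are invented for the example; in practice the productivity number would come from your own project history.

Module HistoricalEstimate

    Sub Main()
        ' Counted or estimated size of the new system (illustrative figure)
        Dim estimatedLoc As Double = 25000
        ' Productivity measured from comparable past projects (illustrative figure)
        Dim locPerPersonMonth As Double = 2500
        ' Compute, don't judge: the effort follows from the two measured inputs
        Dim effort As Double = estimatedLoc / locPerPersonMonth
        Console.WriteLine("Estimated effort: {0:F1} person-months", effort)
    End Sub

End Module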

Another common mistake is forgetting many of the required software activities at estimation time. It helps to keep a checklist of commonly omitted activities and walk through it while estimating. Also, let several estimators work independently and then converge their figures into the final estimate. The derived estimate should not be a single value but should be presented with a variance; three-point estimation techniques are a good place to start working with estimation probabilities. A sketch of the standard three-point formula follows.
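For reference, here is a minimal sketch of the standard PERT three-point formula; the 10/15/30 figures in the usage are invented for the example.

Module ThreePointEstimate

    ' Expected effort: weighted mean of the optimistic, most likely and pessimistic values
    Function Expected(ByVal optimistic As Double, ByVal likely As Double, ByVal pessimistic As Double) As Double
        Return (optimistic + 4 * likely + pessimistic) / 6
    End Function

    ' Standard deviation: one sixth of the optimistic-to-pessimistic range
    Function StdDev(ByVal optimistic As Double, ByVal pessimistic As Double) As Double
        Return (pessimistic - optimistic) / 6
    End Function

    Sub Main()
        ' E.g. a task estimated at 10 / 15 / 30 person-days
        Console.WriteLine("Expected: {0:F1} days, std. dev.: {1:F1} days", Expected(10, 15, 30), StdDev(10, 30))
    End Sub

End Module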

I hope these hints will help you build a proper estimation process for your organization. You will probably need a customized process that matches your organization’s practices. Try to re-estimate throughout the project life cycle, at the end of every phase, and make sure to assess and improve the estimation process itself. The cost of overrunning a project several times over will definitely be higher than your estimation effort. “Invest in estimations. It always pays back.”

References:
Steve McConnell, Software Estimation: Demystifying the Black Art
Linda M. Laird and M. Carol Brennan, Software Measurement and Estimation: A Practical Approach
The Standish Group, CHAOS Report

Wednesday, May 24, 2006

SQLite as an Embeddable Database

These days I'm involved in designing a ‘smart client application’ architecture for one of our customers. These smart client applications require a fair amount of offline capability, fast data access and low network utilization. To achieve those objectives, most of the data is cached and persisted on the client side, which required us to look at data manipulation, persistence and synchronization mechanisms on the client.

The first alternative we considered was maintaining XML-serialized objects on the client side to cache the data. This had problems such as difficult data querying, security issues, thread-safe access issues, high memory consumption and implementation difficulties.

Next we decided to use an embeddable database for storing the data. Our first candidate was the MS SQL Express database, as our project is mainly based on VB.NET and MS technologies. Even though it has some limitations, SQL Express is a very feature-rich database that can be used in much the same way as a full SQL Server database (with SPs, views, functions, full ADO.NET support, etc.). This was a major plus point for us, as we could reuse the same data access layer components on the client side. It also supports database replication, so the cached data could be synced even at the database level.

But the problem was the heavy installation requirements of SQL Express. Configuration of the database also seemed fairly complex for our needs.

The next alternative we looked at was the open source database SQLite. It has an incredibly small footprint of less than 400 KB, and its performance is impressive for a small single-user database. An ADO.NET 2.0 data provider is also available for SQLite and can be integrated into Visual Studio 2005 easily. It implements most of the SQL92 standard, with a few exceptions. SQLite is a true zero-configuration, single-file database that runs in-process with your application.
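To give a feel for it, here is a minimal sketch using the System.Data.SQLite ADO.NET provider; the class names follow that provider as I understand it, and the database file name is illustrative.

Imports System.Data.SQLite

Module SQLiteDemo

    Sub Main()
        ' Opening a connection creates the single database file if it is missing
        Using conn As New SQLiteConnection("Data Source=cache.db")
            conn.Open()
            Using cmd As SQLiteCommand = conn.CreateCommand()
                cmd.CommandText = "CREATE TABLE IF NOT EXISTS Users (Id TEXT PRIMARY KEY, Name TEXT)"
                cmd.ExecuteNonQuery()
            End Using
        End Using
    End Sub

End Module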

Having covered the plus points, it is time to look at the limitations. One main problem is that the only locking level SQLite supports is database-level locking, which can be a problem when multiple processes access the database simultaneously. Since smart clients are generally used by a single user, this shouldn’t be a big issue for an embedded database; in our case, however, the smart client application might be deployed on a Citrix server with many users connecting over consoles.

Having been honored with the 2005 Open Source Award from Google and O'Reilly, SQLite is, I think, well worth serious consideration as an embedded database for our smart client architectures.

Friday, May 12, 2006

Maintaining Database-Application Versioning


One major challenge in product development is managing and tracking database changes and versioning them. There are many things to consider when developing a proper strategy for database version control:
  • The application should be able to identify the database version it is using.
  • If the database is not up to date, the application should handle the problem gracefully without surprising users (either exit gracefully or upgrade the database).
  • The database should also be upgraded if an ‘online update’ strategy (such as ‘no-touch deployment’) is used in the application.
  • If the application can upgrade the database, it should preferably be able to upgrade databases at various versions.
  • The system should be able to detect improper database changes (e.g. uncontrolled manual changes by the customer).

After a little thought I came up with a model to solve the issue. I haven’t yet been able to try it in a real-world project, but here is an outline of the model:
  • Obviously the database should contain a metadata table holding its current version. This version is unique for a particular state of the database (schema + system data).
  • Any version change (db upgrade) should increment the database version number and should be performed through a set of scripts. That is, every pair of adjacent versions is associated with a set of scripts that performs that upgrade.
  • For example, bringing a database from version 2.3.5 to version 2.3.7 requires running two sets of scripts: one to upgrade from 2.3.5 to 2.3.6 and another to upgrade from 2.3.6 to 2.3.7. At the end, each script should update the stored db version to reflect the new state.
  • We then need a database-level trigger that alters the db version number if any uncontrolled change is made to the database schema in an improper way. The system data tables can be protected with the same trigger mechanism, changing the db version if any uncontrolled change happens to them.
  • This trigger would set the db version to a state such as “NOT-DEFINED”, so that the application can inform the user about the uncontrolled database state.
  • Additionally, if we want automatic db upgrades to happen along with application upgrades, we can build a small class library that checks the database version at application startup and, if the database needs an upgrade, runs the necessary upgrade scripts.
  • For this we can use the ‘Command Pattern’, with individual command classes that each upgrade the database from one version to the next adjacent version. For example, to upgrade a database from version 2.3.5 to version 2.3.7, the application would run two command objects: one responsible for upgrading from 2.3.5 to 2.3.6 and another for 2.3.6 to 2.3.7 (see the sketch after this list).
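Here is a hypothetical sketch of that command chain in VB.NET; the interface, class and member names are all illustrative, not from an existing library.

Imports System.Data.Common
Imports System.Collections.Generic

' Each command upgrades the database between two adjacent versions
Public Interface IUpgradeCommand
    ReadOnly Property FromVersion() As String
    ReadOnly Property ToVersion() As String
    Sub Execute(ByVal conn As DbConnection)
End Interface

Public Class DatabaseUpgrader

    Private _commands As New List(Of IUpgradeCommand)

    Public Sub Register(ByVal command As IUpgradeCommand)
        _commands.Add(command)
    End Sub

    ' Walks the chain of commands from the current version up to the target version
    Public Sub UpgradeTo(ByVal conn As DbConnection, ByVal currentVersion As String, ByVal targetVersion As String)
        Dim version As String = currentVersion
        While version <> targetVersion
            Dim command As IUpgradeCommand = Nothing
            For Each c As IUpgradeCommand In _commands
                If c.FromVersion = version Then
                    command = c
                    Exit For
                End If
            Next
            If command Is Nothing Then
                Throw New InvalidOperationException("No upgrade path from version " & version)
            End If
            command.Execute(conn)     ' run this step's upgrade scripts
            version = command.ToVersion
        End While
    End Sub

End Class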
This may not be the ultimate model for managing database versions, but I feel it would do well if implemented and controlled consciously.

Tuesday, February 14, 2006

Composite UI Application Block

Another Microsoft application block is now available for download. This new ‘Composite UI Application Block’ provides a long-awaited smart client application framework for .NET: a pluggable component architecture for smart client applications. This pluggability is needed in many applications to provide features such as:

  • Providing multiple versions of the application with different feature sets (e.g. enterprise version, standard version, etc.)
  • Creating the application as a composition of loosely coupled, reusable modules
  • Providing personalization capability for the application
  • Easily adding, updating and configuring sub-modules of the application (e.g. online updating and versioning) — a small taste of the module model follows this list
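Here is a hedged sketch of a module entry point, based on my reading of the block’s ModuleInit concept; the exact namespaces and member names may differ between releases. Modules are then listed in a profile catalog file so the shell knows which assemblies to load.

Imports Microsoft.Practices.CompositeUI

' Hedged sketch: the entry point the shell invokes when it loads this module assembly
Public Class CustomerModuleInit
    Inherits ModuleInit

    Public Overrides Sub Load()
        ' Register this module's services, WorkItems and views here
    End Sub

End Class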

A few months ago Microsoft released a portal framework with the new ASP.NET 2.0, enabling web applications to enjoy many of the above features. Several Java implementations provide similar frameworks for smart client development: the Eclipse framework is one of the major competitors in this area, with a very powerful plug-in engine, and Sun’s NetBeans application platform also provides a development framework with a pluggable architecture.

For some time I have been waiting for a similar framework to appear in the .NET arena. A few of our customers had this kind of requirement for their smart client applications, and I was wondering how to provide plug-in capability on the .NET platform. So I’m really excited about the ‘Composite UI Application Block’ and hope it is the first step towards a feature-rich .NET plug-in framework for smart client applications.

Sunday, January 29, 2006

Impressing Customers on their own ground

These days I’m back in Norway for some pre-project activities. It is a really different experience being here in winter (last time it was summer). We had some interesting skiing sessions, and as a result I now have a broken knee :D.
Alright… now back to business… Over the last few days we had very interesting sessions with a few of our prospective customers. It was a challenging task to impress them on their own ground (especially as they are software companies that know their business and their technology very well). Also, being a technical guy, it was really difficult to set the technology aside and concentrate on the business requirements. Anyhow, during this visit I learned many tactics for impressing customers on their own ground.
The most important thing is to make them feel that we understand their business very well. This is a challenge, as these customers have been in the business for many years and hold knowledge they consider obvious (implicit knowledge), which is anything but obvious to a third party like us. Here are a few guidelines we learned and used to capture business domain knowledge during the meetings we had.
  • Let them describe the business domain they work in. This is not about their company’s processes but about the industry as a whole. Here we need to understand where our customer stands in the industry (e.g. market share).

  • Take the company’s main business process and discuss it with the customer. We have to identify where IT fits into the process; activity diagrams proved very useful in our discussions. Also ask them to walk through a few common variations of the process, if any exist (this is especially effective if the customer is a vendor of a product used by several of their own customers).

  • It is really important to keep a balance between automated and manual processes, as most companies add value through manual processing (the use of human intelligence in the process). These manual steps may give them an edge over competing firms, so we need to be careful when proposing IT automation of process components.
  • Discuss the importance of IT to their organization. Guide the discussion in a way that reveals non-functional requirements such as performance, robustness, error handling and scalability of the proposed system.
  • Talk about the current IT infrastructure. Discuss the strengths and weaknesses of the current system.

  • Talk about the competitors. This generally leads to a really interesting discussion and reveals many of the customer’s expectations. Discuss the IT infrastructure the competitors own, which processes they perform better, and what can be learned from them.

  • Discuss the company’s income breakdown: which parts of the business bring in the most income, which parts are weaker, and how IT can help them grow. One thing to remember is that, at the end of the day, it is all about profit and income (cool, nice features that bring no profit to the company are of no interest).
  • Most importantly, discuss what makes their business grow. For example, attracting more suppliers, attracting more customers and making internal processes efficient may be the growth factors of a business. The proposed system needs to address those factors to make the business grow.
Working with prospects is quite different from, and more difficult than, a traditional requirements-gathering session. To win the project we need to focus well and impress them with our capabilities and our understanding of their business. Trust building is the key to winning a project: good concentration, focused questions, professional answers and well-thought-out improvement proposals all help build this trust. And never forget to talk about your dog and their cat to build a good personal relationship…


Thursday, January 05, 2006

War between typed datasets and custom entities

At Eurocenter we are currently preparing for a new project kickoff based on the .NET platform. Among many other design decisions, we are considering an implementation technology for our domain model: we have to decide between "typed datasets" and "custom entities". Both methods have their own pros and cons. I did a little research on this, and I thought of sharing the findings in this blog post.

Here are some of the major reasons why I think datasets are better in some situations:
  • Database integration of the domain objects is very simple. DataAdapter classes take care of most of the database operations on a dataset. A dataset can also remember its original values, making dirty-object identification very easy.

  • Easy serialization of data entities to XML

  • Easy binding to most UI components

  • Easy integration with other tools and components (e.g. BizTalk Server)

  • Very good documentation and community support
Anyway, having said all those plus points of datasets, there are very strong arguments for why and where we should not use datasets but go with custom objects instead:
  • Datasets make the system less object oriented (as business methods are separated from the domain model)

  • The large amount of generated code behind typed datasets causes code duplication and reduces maintainability.

  • Exposing datasets through service interfaces (SOA) can make the services less interoperable.

  • The custom class approach is much cleaner and clearer.

  • Unit testing is much easier with the custom class approach, as data is directly bound to the operations.

  • The combination of DataReaders, SPs and custom entities seems to perform better in terms of memory usage and data fetch time (see the sketch after this list).
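As an illustration of that last combination, here is a minimal sketch of populating a custom entity through a DataReader; the User class, the Users table and the connection string are invented for the example.

Imports System.Data.SqlClient

Public Class User
    Public Id As String
    Public Name As String
End Class

Public Class UserRepository

    Public Function GetUser(ByVal id As String) As User
        Using conn As New SqlConnection("...connection string...")
            conn.Open()
            Using cmd As New SqlCommand("SELECT Id, Name FROM Users WHERE Id = @Id", conn)
                cmd.Parameters.AddWithValue("@Id", id)
                Using reader As SqlDataReader = cmd.ExecuteReader()
                    If reader.Read() Then
                        ' Map the row onto a custom entity and discard the reader
                        Dim user As New User()
                        user.Id = reader.GetString(0)
                        user.Name = reader.GetString(1)
                        Return user
                    End If
                End Using
            End Using
        End Using
        Return Nothing
    End Function

End Class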

In my opinion, we need to make sure we use the right tool in the right scenario. In a complex application (with lots of integrations) I would prefer custom entities. If the application is data driven, or is a disconnected, data-centric desktop application, I would consider using datasets.