Thursday, November 23, 2006

Automating Database Connection Management

One of the most frustrating problems in software development is exhausting the database connection pool because connections are never closed. Things get worse when we do not discover the problem during the development/testing phase (since we rarely keep development/testing systems running for very long periods), and frustratingly enough the problem then shows up on the production site.

What can we do about this? One option is to thoroughly test the application for unclosed connections. This demands developer concentration and can be time consuming without proper tools. A more effective solution is to use a database access platform that automatically takes care of closing the connections it opens. But what if our simple project requires direct database access technologies (such as pure JDBC or pure ADO.NET)?

Let’s look at a small code fragment that uses a very simple data access framework to take care of this problem. In short, we write a base database action class that provides and manages the database connection for its sub classes, and then write the data access code we need (data actions) as sub classes. Below is an example of the base action class written in VB.NET.

Imports System.Data.Common

Public MustInherit Class ActionBase(Of T)

    ' Sub classes implement their data access logic here
    Protected MustOverride Function Body(ByVal conn As DbConnection) As T

    Public Function Execute() As T
        ' Create a database connection
        Dim conn As DbConnection = ConnectionProvider.GetConnection()
        Try
            ' Call the body method, providing the connection, and return its result
            Return Body(conn)
        Finally
            ' Close the connection even if Body() throws
            conn.Close()
        End Try
    End Function

End Class

All sub classes need to implement the abstract method “Body(ByVal conn As DbConnection)”. A connection instance is passed in as the method parameter, and the sub class can use it to perform its data access logic. The base class is written as a generic class, so each sub class must supply the type parameter, which is the object type the action is expected to return. For example, a data access action that retrieves a ‘User’ instance will look like the following:

Friend Class RetrieveUserAction
    Inherits ActionBase(Of User)

    Private _userId As String

    Public Sub New(ByVal userId As String)
        Me._userId = userId
    End Sub

    Protected Overrides Function Body(ByVal conn As DbConnection) As User
        Dim user As New User()
        ' Use the connection (and Me._userId) to fetch the user info and populate the 'user' instance
        Return user
    End Function

End Class

As you can see, we write sub classes to execute operations on the database. Developers do not have to look for database connections; one is provided through the “Body()” method parameter. They are also freed from the burden of closing connections, as the framework itself takes care of that.

Let’s look at how one can use the class written above to retrieve the user with id=12.

Dim action As New RetrieveUserAction("12")
Dim user As User = action.Execute()

Even though this simple framework is quite useful for managing connections, it has some limitations too. Developers cannot spawn a thread and pass the connection into it for further processing. Also, the results will always be disconnected; you can’t pass result sets to upper layers for processing (but I think this enforces a good practice).
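For example, an action can load its results into a DataTable and hand that back, so no live reader or connection ever leaves the framework. The ‘users’ table and its columns in the sketch below are purely hypothetical:

Imports System.Data
Imports System.Data.Common

' A minimal sketch of an action that returns disconnected results
Friend Class ListUsersAction
    Inherits ActionBase(Of DataTable)

    Protected Overrides Function Body(ByVal conn As DbConnection) As DataTable
        Dim table As New DataTable()
        Using cmd As DbCommand = conn.CreateCommand()
            cmd.CommandText = "SELECT id, name FROM users"   ' hypothetical table
            ' Copy the rows into the DataTable so the caller never touches a live reader
            Using reader As DbDataReader = cmd.ExecuteReader()
                table.Load(reader)
            End Using
        End Using
        Return table
    End Function

End Class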

One other major thing to consider is whether database resources are actually released when the connection is closed. With Oracle over JDBC, for example, closing the connection will not release database resources; you also need to close the ‘Statement’ instances. In such situations we can amend the framework to provide statement objects instead of connections.
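As a rough sketch of that idea (the name CommandActionBase is just illustrative, and ConnectionProvider is assumed from the earlier example), the amended base class could hand each action a DbCommand, the ADO.NET counterpart of a JDBC Statement, and dispose of both the command and the connection itself:

' A minimal sketch: the framework now owns both the command and the connection
Public MustInherit Class CommandActionBase(Of T)

    ' Sub classes receive a ready-to-use command instead of a raw connection
    Protected MustOverride Function Body(ByVal cmd As DbCommand) As T

    Public Function Execute() As T
        Dim conn As DbConnection = ConnectionProvider.GetConnection()
        Dim cmd As DbCommand = conn.CreateCommand()
        Try
            Return Body(cmd)
        Finally
            ' Release the statement/command resources first, then close the connection
            cmd.Dispose()
            conn.Close()
        End Try
    End Function

End Class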

Sunday, November 19, 2006

Estimating Estimations

During the last few days I have been researching how to perform effective estimations. Here are a few interesting points I learnt along the way. One good concept I came across is the “Cone of Uncertainty”.

The Cone of Uncertainty depicts how the level of uncertainty changes during the project life cycle: the uncertainty reduces as the project progresses. This may not be news to most of us, but how many of us use this instinctive knowledge in our estimation practices? For example, if we provide an estimate at the initial project agreement stage, the actual effort can deviate from the estimate by as much as a factor of four. “The later you estimate, the higher the accuracy will be.”

Also, if we look at project failure (or overrun) rates, market research shows that the probability of success is much higher for small projects:

Project Size (lines of code)   Early   On Time   Late   Failed
1,000 LOC                      11%     81%       6%     2%
10,000 LOC                     6%      75%       12%    7%
100,000 LOC                    1%      61%       18%    20%
1,000,000 LOC                  <1%     28%       24%    48%
10,000,000 LOC                 0%      14%       21%    65%

One reason for this is that a large project is more difficult to estimate than a small one (of course there are many other reasons as well). If you have a large system to develop, try to break it into several small projects. “The smaller the project, the higher the probability of success.”

Another common mistake we make is a bias towards so-called ‘expert judgment’. How many of us believe our own experience and knowledge is the best source of estimates? Probably many of us do. But research has shown the opposite: expert judgment is one of the worst methodologies for deriving an estimate. So what would give us a better estimate? Historical data often provides a better estimate than any other method. Historical data can come in three forms:

  • Industry average data
  • Organization average data
  • Project local data

Needless to say, project-local data is more accurate than industry average data. We should build an estimation process that uses historical data. It is better to use several industry-accepted estimation methodologies (calibrated with historical data) than a single one. Simple models with fewer control knobs have proven to deliver effective results. “Never trust expert judgment; use count-and-compute methods empowered with historical data.”
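As a purely illustrative sketch of a count-and-compute estimate (the counts and productivity figures below are hypothetical, not taken from any of the cited sources), you count the units of work and multiply by the effort per unit observed in your own historical data:

' Hypothetical illustration: count the units of work, multiply by historical effort per unit
Module CountAndComputeExample

    Sub Main()
        ' Counted from the requirements (hypothetical numbers)
        Dim screenCount As Integer = 24
        Dim reportCount As Integer = 10

        ' Averages taken from our own project history (hypothetical figures)
        Dim staffDaysPerScreen As Double = 3.5
        Dim staffDaysPerReport As Double = 2.0

        Dim effort As Double = (screenCount * staffDaysPerScreen) + (reportCount * staffDaysPerReport)
        Console.WriteLine("Estimated effort: {0} staff-days", effort)   ' 104 staff-days
    End Sub

End Module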

One other common mistake is forgetting many of the required software activities at estimation time. It is well worth keeping a checklist of commonly omitted activities and going through it while estimating. Also, let several estimators work independently and then converge their results into the final estimate. The derived estimate should not be a single value; it should be presented with a variance. Three-point estimation techniques are a good place to start working on estimation probability.
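For instance, a simple three-point (PERT-style) calculation combines an optimistic, a most likely and a pessimistic figure into an expected value and a spread; the figures in the sketch below are made up for illustration:

' Three-point (PERT) estimate: expected = (O + 4M + P) / 6, spread roughly (P - O) / 6
Module ThreePointEstimate

    Function Expected(ByVal optimistic As Double, ByVal mostLikely As Double, ByVal pessimistic As Double) As Double
        Return (optimistic + 4 * mostLikely + pessimistic) / 6
    End Function

    Function StandardDeviation(ByVal optimistic As Double, ByVal pessimistic As Double) As Double
        Return (pessimistic - optimistic) / 6
    End Function

    Sub Main()
        ' Hypothetical figures for a single activity, in staff-days
        Console.WriteLine("Expected: {0} staff-days", Expected(8, 12, 22))          ' 13
        Console.WriteLine("Spread: +/- {0} staff-days", StandardDeviation(8, 22))   ' about 2.3
    End Sub

End Module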

I hope these hints will help you build a proper estimation process for your organization. You will probably need a customized process that matches your organization’s practices. Try to re-estimate throughout the project life cycle, at the end of every phase. Also make sure to assess and improve the estimation process itself. The cost of overrunning a project several times over will certainly be higher than your estimation effort. “Invest in estimation. It always pays back.”

References:
Software Estimation: Demystifying the Black Art
Software Measurement and Estimation: A Practical Approach
CHAOS Report