Wednesday, May 24, 2006

SQLite as an Embeddable Database

These days I'm involved in designing a ‘Smart Client Application’ architecture for one of our customers. These smart client applications require fair amount of offline capabilities, fast data access and low network utilization. To achieve the objectives most of the data is cached and persisted on the client side. This required us to look at data manipulation, persisting and synchronization mechanisms at the client side.

The first alternative we had was to maintain XML serialized objects on the client side to cache data. This had the problems like Data querying difficulties, Security issues, Thread safe access issues, High memory consumption and Implementation difficulties.

Next we have decided to use an embeddable database for storing data. Our first candidate was MS SQL Express database, as our project is mainly based on VB.NET with MS technologies. Even though it has some limitations, SQL Express is a very feature rich database which can be invoked in the similar way you invoke SQL Server database (with SPs, Views, Functions, full ADO.NET support, etc…). This was a major plus point for us as we could reuse the same data access layer components on the client side. It also supported database replication so that the cached data syncing could be done even at the database level.

But the problem was the high installation requirements of the SQL Express database. Also configuration of the database seemed to be fairly complex for our requirements.

Next alternative we looked at was the open source database SQLite. It has an incredibly small foot print of less than 400Kb and performance was impressive as a small single user database. ADO.NET 2.0 data provider was also available for SQLite and can be integrated to Visual Studio 2005 easily. It has implemented most of the SQL92 standard with few exceptions. SQLite is really a zero configuration single-file database which runs in-process of your application.

Having said the plus points, it is now the time to look at the limitations also. One main problem is that, the only locking level it supports is at database level. This can be a problem when multiple processes accessing the database simultaneously. Since smart clients are generally used by a single user, this shouldn’t be a big problem as an embedded database. But in our case, the smart client application might be deployed on a Cytrix Server and may have many users connecting over consoles.

Having honored by “2005 Open Source Award from Google and O'Reilly”, I think we should seriously consider using SQLite as an embedded database for our smart client architectures.

Friday, May 12, 2006

Maintaining Database-Application Versioning


Maintaining Database-Application Versioning


One major challenge in product development is managing/tracking the database changes and versioning it. There are many things to consider developing a proper strategy for database version controlling.
  • Application should be able to identify the database version it is using
  • If the database is not up-to-date, the application should handle the problem gracefully with out bringing surprises to the users (either gracefully exit or upgrade the database).
  • Database also should be upgraded if the ‘Online Update’ (such as in ‘no-touch deployment’) strategy is used in the application.
  • If application has capabilities to upgrade the database then preferably it should be able to update the databases in various versions.
  • System should be able to track any improper database changes (e.g. uncontrolled manual changes by the customer)

After a little thought I came up with a model to solve the issue. Still I couldn’t experiment this in a real world project. Here is an outline of the model:
  • Obviously the database should contain a Metadata Table keeping the current version of it. This version is unique for a particular state of the (database schema + system data) in the database
  • Any version change (db upgrade) should increment the database version number and should be performed through a set of scripts. That is for any two adjacent versions are associated with a set of scripts performing that version upgrade.
  • For example bringing database from version 2.3.5 to version 2.3.7 requires running two sets of scripts, one to upgrade from 2.3.5 to 2.3.6 and another to upgrade from 2.3.6 to 2.3.7. At the end of this script, it should properly change the db version to indicate the current db version.
  • Now we need to have a database level trigger to alter the db version number if any uncontrolled change is done to the database schema in an improper way. Additionally system data tables can also be protected with the same trigger mechanism which changes the db version if any uncontrolled change is happened to the system tables.
  • This trigger will change the db version to a state such as “NOT-DEFINED” so that the application may inform the user regarding the uncontrolled database state.
  • Additionally if we need to make automatic db upgrades to happen with the application upgrades, then we can build a small class library which will check the database version at the application startup and if the database needs an upgrade run the necessary upgrade scripts.
  • For this we can use the ‘Command Pattern’ to have individual command classes which can upgrade the database from one version to the next adjacent version. For an example to upgrade database from version 2.3.5 to version 2.3.7, application will run two command objects one responsible for upgrading from 2.3.5 to 2.3.6 and another for upgrading from 2.3.6 to 2.3.7.
This may not be the ultimate model for managing database versions. But I feel this would do well if implemented and controlled consciously.