Embedded Database: Java DB
Vinod Kumar Bobba
Dr. Billy Lim
ITK 478
Table of Contents
Evaluation Report
1. Overview & History
2. About Java DB
3. Why Java DB?
4. Architecture
5. Java DB tools
6. Java DB is now part of Sun's JDK
7. Compare Java DB and HSQLDB
Hands-on work: Documentation
1. Database connectivity in Java Studio Creator
2. Address Book demonstration
3. Accessing Embedded database without SQL
Overview:
Java DB is Sun Microsystems' branding of the Apache Derby database. It is a pure Java, small-footprint, easy-to-use relational database engine based on the Java programming language and SQL. Java DB is a commercial release of Derby, the Apache Software Foundation's open source relational database project, and it includes all of Derby's functionality without any modification. The difference is that technical support for Java DB is available for purchase through Sun. Put simply, Java DB is Sun's version of the open-source Apache Derby database. That is the major relation and the minor difference between them; even the Java DB documentation refers to the core functionality as Derby.
Before getting more specific about Java DB, a few words about Sun and its database technology in general. Sun has a Database Technology Group that is responsible for database technology development. This group not only evaluates databases for internal Sun use, but also evaluates and tests other databases. In short, it analyzes, evaluates and tests new database technologies. The results of this group's work are HADB (Clustra) and Java DB (Derby).
Java DB came into existence when Sun decided to support the distribution of Derby. Derby is open source, pure Java, easy to use, small in footprint, standards based, secure, and a complete relational engine; these are the features that led Sun to choose it. The outcome is three organizations (Apache, Sun, IBM), three brands (Apache Derby, Java DB, Cloudscape) and one product behind them all.
In short, the history of Java DB is as follows:
Year / Description
1996 / Cloudscape founded
1997 / JBMS released
1999 / Cloudscape acquired by Informix
2001 / IBM acquired the database part of Informix
2004 / IBM donated Cloudscape to Apache (Derby)
July 2005 / Derby graduated from the Apache Incubator
December 2005 / Sun announced Java DB
Right now / Current release 10.1.3.1
Table 1 [1].
About Java DB:
Java DB is written entirely in Java and so takes advantage of Java's write once, run anywhere promise (any hardware, any OS, any vendor). Java's buzz words therefore apply here: it provides a robust, small-footprint Java database management system that is cost effective and simple to deploy. Because Java is portable, Java DB can be used across multiple platforms, and its on-disk database format is platform independent. It is fully transactional, secure, easy to use and standards based (SQL, JDBC API, and Java EE), yet small, at only 2MB.
Java DB adheres to database standards such as JDBC and ANSI SQL, so it offers the functionality expected of a relational database with SQL syntax, including concurrency control, transaction management and triggers. Its SQL syntax is based on SQL92, SQL99 and SQL2003. This also means that it is easy to migrate an application from Java DB to other standards-based databases, such as Oracle and DB2.
As an embedded database, the Java DB engine can run in the application's own virtual machine, which requires no additional process. Database requests are just method calls within the JVM, and startup and shutdown of the database are controlled by the application. All the application needs is a library (a single jar file). The underlying administration details are invisible to the user, so Java DB is easy to use with zero maintenance.
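The embedded lifecycle described above can be sketched in plain JDBC roughly as follows. This is a minimal sketch, not the report's demo code: the class name EmbeddedLifecycle and the database name "addressBook" are hypothetical, and the start and shutdown methods assume derby.jar is on the classpath. The jdbc:derby: URL form, the create=true attribute, and the shutdown=true attribute are standard Derby connection attributes.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Sketch of embedded-mode startup and shutdown (hypothetical helper class). */
class EmbeddedLifecycle {

    /** Builds an embedded URL; ";create=true" creates the database if it is missing. */
    static String embeddedUrl(String dbName, boolean create) {
        return "jdbc:derby:" + dbName + (create ? ";create=true" : "");
    }

    /**
     * Boots the engine inside this JVM and opens a connection.
     * Assumes derby.jar is on the classpath; on pre-JDBC 4.0 runtimes the
     * driver class org.apache.derby.jdbc.EmbeddedDriver must be loaded first.
     */
    static Connection start(String dbName) throws SQLException {
        return DriverManager.getConnection(embeddedUrl(dbName, true));
    }

    /**
     * Asks the engine to shut down. Derby reports a successful system
     * shutdown by throwing an SQLException with SQLState XJ015.
     */
    static boolean shutdown() {
        try {
            DriverManager.getConnection("jdbc:derby:;shutdown=true");
            return false; // no exception means the engine did not confirm shutdown
        } catch (SQLException e) {
            return "XJ015".equals(e.getSQLState());
        }
    }
}
```

An application would typically call start once when it launches and shutdown when it exits, which is what "startup and shutdown controlled by the application" amounts to in practice.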
It is a complete relational engine with full support for tables, indexes, views, triggers, joins, procedures (Java), functions (Java), temporary tables, foreign keys, constraints, cursors, transactions, isolation levels and ACID. All of these apply in a multi-user setting, with deadlock detection, crash recovery, backup and restore, data caching, statement caching, logging and group commit. Java DB thus supports multiple databases per system, or multiple systems per read-only database, making life easier for application developers.
Why Java DB?
Java DB is considered ideal for Java application development and testing because it is easy to use, fits on anything from a laptop to a mainframe, and is available for free under the Apache license. It is also a good fit for Java client-server applications that need a database available 24 x 7, based on standards and on transactional SQL features that protect against data corruption and system crashes, with minimal database administrator skills.
Embedding the database keeps life simple, particularly for small applications. Java DB is likewise well suited as a local data store for on-line or off-line Web applications. Embedding provides simplicity: neither the developer nor the end user needs to buy or download, install, and administer the database separately from the application or IDE. Java DB also has the flexibility to support both embedded and client-server mode. In embedded mode, Java DB runs in the same JVM as the application, and users may not even be aware that they are accessing a relational database. It requires no administration if embedded and little if used in client-server mode. Some of these features are discussed further below and in the demo.
Security has to be considered in every aspect of today's application and web development. Here Java DB provides a number of security mechanisms, including database file encryption, authentication (for example through an external LDAP directory) and authorization. Java DB can be used within browser-based Web 2.0 applications, where it offers easy distribution, one-click install, secure local data storage, and data persistence if the Internet connection is lost or the application is used off-line. Java DB is also apt for applications running in a J2ME CDC environment (e.g., a PDA) that need a small size (2MB) without sacrificing functionality such as full SQL support, transaction management, stored procedures, triggers, concurrency, and backups.
Architecture:
The architecture of Java DB is considered solid and built on state-of-the-art technology. It is a modular architecture that uses ARIES for recovery and B-trees for indexes. (ARIES is a recovery algorithm designed to work with a no-force, steal database approach; it is a popular algorithm also used by IBM DB2 and Microsoft SQL Server.) There is no separate SQL virtual machine to deal with SQL queries. The following diagram describes the procedure: SQL statements are compiled into Java byte-code, which runs on a standard JVM. Everything is thus converted into byte-code and handled by a single VM, a reminder that Java DB is 100% Java based.
This approach has its own pros and cons. Compiling SQL to byte-code makes execution faster, because the HotSpot compiler or another Just-In-Time (JIT) compiler can compile the interpreted byte-code to native machine code. On the other hand, the extra compilation and class loading is a burden on the virtual machine, which is also acting as the database engine. To offset this, the other option to consider is the pluggable storage architecture, which is both flexible and modular. It provides a standard set of server, drivers, tools, management, and support services that are leveraged across all the underlying storage engines. Its pluggable storage layer makes it possible to mix and match storage in memory, on the file system, or in a jar file (read only), or to use just what is needed for an efficient, optimized footprint.
Figure [1].
As discussed above, Java DB provides all the features of a complete relational engine, so we can have multiple systems for a single read-only database or multiple databases for a single system. This brings us to the embedded and client-server modes in which Java DB operates. The flexibility to support both modes allows Java DB to adapt to diverse deployment scenarios.
Figure Embedded [1].
Figure Client/Server [1].
In embedded mode, Java DB runs in the same JVM as the application, and users may not even be aware that they are accessing a relational database; the database is accessible from that single JVM only. There can, however, be multiple applications per JVM, as in an application server. This mode is easy to use, fast, and requires no administrator skills. A further option adds flexibility: embedding the network server. This also requires no administrator skills and no change to the application code; it is enabled simply by setting the corresponding properties. It provides access to the database from outside the application's VM through the standard DRDA protocol, and it adds database reporting and debugging capabilities to stand-alone applications.
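As an illustration of enabling the embedded network server through properties, the sketch below sets the Derby properties derby.drda.startNetworkServer and derby.drda.portNumber as JVM system properties. The class and method names are hypothetical, and the sketch assumes the properties are applied before the embedded engine boots and that derbynet.jar is on the classpath alongside derby.jar.

```java
import java.util.Properties;

/** Sketch: turn on the network server inside an embedded application (hypothetical helper). */
class EmbeddedServerConfig {

    /** Collects the properties that ask Derby to start its network server at boot. */
    static Properties networkServerProperties(int port) {
        Properties p = new Properties();
        p.setProperty("derby.drda.startNetworkServer", "true");
        p.setProperty("derby.drda.portNumber", Integer.toString(port));
        return p;
    }

    /** Copies the properties into the JVM; call this before the first Derby connection. */
    static void apply(Properties p) {
        for (String name : p.stringPropertyNames()) {
            System.setProperty(name, p.getProperty(name));
        }
    }
}
```

The same settings could equally be passed as -D options on the command line, which is why the application code itself does not need to change.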
In client/server mode, many applications share one database. The Network Server itself uses the embedded driver against Derby, and clients reach it through DRDA, the industry-standard protocol for database access interoperability, so drivers from other vendors may be used. Scripts are provided to start and stop the Network Server, so this mode does require a little administration.
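From the application's point of view, the main difference in client/server mode is the connection URL. The sketch below assumes a Network Server listening on Derby's default port 1527 and derbyclient.jar on the client's classpath; the class name, host and database name are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Sketch: connecting to a Derby Network Server (hypothetical helper class). */
class ClientUrl {

    /** Default port of the Derby Network Server. */
    static final int DEFAULT_PORT = 1527;

    /** Builds a network client URL of the form jdbc:derby://host:port/database. */
    static String clientUrl(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName;
    }

    /** Opens a connection through the network client driver (needs derbyclient.jar). */
    static Connection connect(String host, int port, String dbName) throws SQLException {
        return DriverManager.getConnection(clientUrl(host, port, dbName));
    }
}
```

Compared with the embedded form jdbc:derby:databaseName, only the URL changes; the rest of the JDBC code stays the same.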
Java DB tools:
ij : This is a SQL scripting tool, which is JDBC neutral and can be used against other JDBC drivers. It is a simple utility for running scripts against a Derby database. You can also use it interactively to run ad hoc queries. ij provides several commands for ease in accessing a variety of JDBC features. ij can be used in an embedded or a client/server environment.
sysinfo : sysinfo provides information about your version of Derby and your environment.
dblook : schema extraction tool for Derby. dblook is Derby's Data Definition Language (DDL) Generation Utility, also called a schema dump tool. It is a simple utility for dumping the DDL of a user-specified database to either a console or a file. The generated DDL can then be used for such things as recreating all or parts of a database, viewing a subset of a database's objects (for example, those which pertain to specific tables and schemas), or documenting a database's schema.
JVM and classpath for Derby tools
ij, sysinfo, and dblook are tools that can be used in an embedded or a client/server environment.
Java 2 Platform, Standard Edition, Version 1.3
All Derby tools require Java 2 Platform, Standard Edition, Version 1.3 or later.
Derby class path requirements:
• To use ij, you must have derbytools.jar in your classpath. If you are using the embedded driver, you must also include derby.jar.
• To use sysinfo, either derby.jar or derbytools.jar must be in your classpath.
• To use Derby tools from a client with the Derby Network Server, you must have derbyclient.jar and derbytools.jar in your classpath.
Some popular tools also support Java DB: NetBeans and Java Studio Creator, both of which we will use in our demo applications.
Java DB Scaling:
In general, Java DB has no architectural constraints that limit how far it scales. Sun has already tested Java DB with databases of up to 300GB and with up to 100 active connections. There is no built-in support for horizontal scaling or high availability, but both can be achieved by making Java DB interact with other technologies. As for memory, Java DB caches data in memory. Durability comes at a cost: there is a performance penalty as more data is persisted to disk, while persisting only at shutdown is very risky. Where durability is not required, it is possible to run Java DB with reduced durability by setting the following property: -Dderby.system.durability=test. Finally, as part of its SQL RDBMS functionality, Java DB uses B-trees, which were designed for efficient disk storage and retrieval, and it provides atomic transactions (ACID).
Java DB is now part of Sun's JDK:
Finally, it's official: Java developers will have the convenience of a fully functional, 100% Java database shipping with the Sun JDK. Java DB 10.2 will be available with Mustang (Sun's JDK for Java SE 6) as part of the JDK bundles. This is great news for Java developers: we get a database to build and test against that implements the latest version of JDBC and that we can, if we so choose, deploy with our applications free of charge. It can be found under the db directory of the JDK install. As a reminder, this is Sun's redistribution of Apache Derby; the community is very active, and Derby developers and users respond quickly. It's free, it's open source, and now it's part of the JDK.
As already mentioned, to give a great out-of-the-box development experience with database applications, the final Java SE 6 development kit (though not the Java Runtime Environment, JRE) will co-bundle Java DB, the all-Java JDBC database based on Apache Derby. Developers will get the updated JDBC 4.0, a well-used API that focuses on ease of use and adds many features, such as special support for XML as an SQL datatype and better integration of Binary Large OBjects (BLOBs) and Character Large OBjects (CLOBs) into the APIs. Additional ease-of-use features in the draft specification include annotations that let SQL strings embed better into a JDBC application, for example decorating a getAllUsers() method with an @Query(sql="select * from user") annotation, and that being all that is needed. This allows Java developers to build applications even more rapidly and easily by having access to a Java database that implements many features from the latest JDBC 4 API specification, directly out of the JDK.
Java DB is not like the XML parser situation. The XML parser was included in the Java core, and its classes were loaded automatically when the VM started up, so using another XML parser was a real problem. That is not true of Java DB: you have to explicitly put derby.jar in your classpath if you want to use it. The advantage is that there is now a database in Sun's JDK that allows people to exercise the JDBC APIs. Tutorials can refer to it. Demos can use it. It helps people get comfortable with JDBC. It provides early access to the new JDBC APIs. All these things, I think, have a lot of value to a certain class of developers.
Java DB is opt in, not opt out: you have to say explicitly that you want to use it, and nobody is forced to download some huge database that requires 20 steps to install. It's a couple of megabytes, it is a completely silent install, and it just sits there doing nothing until you decide to use it. The features specific to Java DB 10.2, which is bundled with JDK 6, are scrollable updatable result sets, JDBC 4, Grant/Revoke, online backup, stronger network authentication, and XML.
Turning to performance, Sun has studied and compared the performance of Derby, MySQL and PostgreSQL. I would like to present those results as snapshots (graphical statistics).
Figure [1]. In-memory DB (DB 10MB, buffer 50MB, 400 branches)
Figure [1]. Disk-bound DB (DB 10GB, buffer 64MB, 400 branches)
These are some of the hints they came up with to improve the performance of Java DB.
- Use (and reuse) prepared statements
- Put DB log and data on separate disks
- Tune page cache size (default 4 MB) using derby.storage.pageCacheSize
- Use indexes to avoid table scans by checking query plans using derby.language.logQueryPlan=true
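Two of these hints can be sketched in code. The example below is an illustration under stated assumptions, not Sun's benchmark code: the class name, the contacts table and its name column are hypothetical, and the Connection is assumed to be an already open JDBC connection. The property names derby.storage.pageCacheSize and derby.language.logQueryPlan are the ones listed above; the cache size is assumed here to be given as a number of pages.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import java.util.Properties;

/** Sketch of the prepared-statement and tuning hints (hypothetical helper class). */
class PerformanceHints {

    /** Hypothetical table and column, used only for illustration. */
    static final String INSERT_SQL = "INSERT INTO contacts (name) VALUES (?)";

    /** Prepares the statement once and reuses it for every row, so it is compiled only once. */
    static int insertAll(Connection conn, List<String> names) throws SQLException {
        int rows = 0;
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String name : names) {
                ps.setString(1, name);
                rows += ps.executeUpdate();
            }
        }
        return rows;
    }

    /** Tuning properties from the list above; set them before the engine boots. */
    static Properties tuning(int cachePages) {
        Properties p = new Properties();
        p.setProperty("derby.storage.pageCacheSize", Integer.toString(cachePages));
        p.setProperty("derby.language.logQueryPlan", "true"); // log query plans to spot table scans
        return p;
    }
}
```

Reusing one PreparedStatement matters particularly for Java DB because each new SQL text has to be compiled to byte-code before it runs.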
Comparing Java DB with HSQLDB:
I would like to compare Java DB with HSQLDB. HSQLDB is a relational database management system which is also written in Java. It is based on Thomas Mueller's discontinued Hypersonic SQL Project. The software is available under a BSD License.
It has a JDBC driver and supports a rich subset of SQL-92, SQL-99, and SQL 2003 standards. It offers a fast, small (less than 100k in one version) database engine which offers both in-memory and disk-based tables. Embedded and server modes are available.
Most of its functionality is much the same as Java DB's. Additionally, it includes tools such as a minimal web server and in-memory query and management tools (which can be run as applets). HSQLDB is currently being used as a database and persistence engine in many open source software projects, such as OpenOffice Base, as well as in commercial projects and products, such as InstallShield or InstallAnywhere (starting with version 8.0).
HSQLDB is best known for its small size, ability to execute completely in memory, and its speed. It can also run on free Java runtimes such as Kaffe (virtual machine).