Unit testing database code: notes from the JUnit mailing list

The JUnit mailing list at Yahoo! occasionally carries discussions on database unit testing. On this page I've collected together some of the threads that I've found useful.

See also: my notes on unit testing databases.

Updated: $Revision: 1.2 $ $Date: 2003/08/04 08:13:09 $ $Author: richard $

26 - 31 Jul 2001: Options for testing

testing database operation (Simon Evans)
no autoCommit (Neil Swingler)
setUp() and tearDown() (David Roussel)
List of options (Mathieu Gervais)
Comments on the list of options (Richard Dallaway)
In an EJB environment (Neil Swingler)

31 Aug - 4 Sep 2001: More autocommit discussion, using JTA/JTS

Can you use an outer transactions to avoid committing DB changes? (Richard Dallaway)
No (Yang Wang)
Custom transactions (Eric Vought)
Maybe you can use subtrasactions (Bruce Vinchon)
Subtransactions are different (Steven Morgan)
Nested transactions are hard to find (Oracle savepoints may help) (Bruce Vinchon)

25 Sep 2001: JDBC testing with mock objects

Why we decided to drop mock objects (Johannes Link)
Why testing without a db (Steve Freeman)

From: Simon Evans [mailto:sevans@...]
Sent: 26 July 2001 19:29
To: junit@yahoogroups.com
Subject: [junit] testing database operation

i am trying to test objects that use the database. in order to leave the
database in the same state after the test, one test method has to do a
store, load and delete. is this normal? or is there a way to test each
method (store, load, delete) separately, maybe using the setUp() and
tearDown() methods? is there a standard pattern for this?

also the assert methods do not throw anything...do they throw a
RuntimeException or somethnig when they fail? because if i want to do
some
clean up in a certain test method using a try-catch-finally block, do i
have
to catch a specific runtime exception? or should all cleanup for all
methods
be done in the tearDown() method?

thanks
simon


   Date: Fri, 27 Jul 2001 09:00:48 +0100
   From: Neil Swingler 
Subject: RE: testing database operation


The easiest way to clean up the database is to ensure nothing ever gets
commited. I aquire my connection in setUp (no autoCommit) and rollback in
tearDown().

   Date: Fri, 27 Jul 2001 08:52:45 +0100
   From: "David Roussel" 
Subject: RE: testing database operation

I use setUp() to establish a DB connection, log the test user into the
database.  My test methods create, manipulate and assert on the data
layer objects.  I then use the tearDown() method to delete all objects
added.  If fact sometimes I also put clean up code in setUp() just case
the last test failed.

If you want to clean up a after a RunTimeException try use try/finally
block...

try
{
  assert( thisMightFail );
}
finally
{
  doCleanUp();
}

   Date: Fri, 27 Jul 2001 14:54:48 -0400
   From: "Mathieu Gervais" 
Subject: Re: testing database operation

>This technique, however, is dependant on your locking model.

I think you are right, altought I'm not expert in "locking models".

Another important point is that you can have code that insist on commiting,
for example an entity EJB that you test in the real container environment.
Or code that do multi-transaction, so need to commit in between.

In that case, you can not do the real neat trick of autoCommit false.

I once dreamed a system where every affected row would be rolled back.

I investigated many options :

Glossary:
-"stable data" means data you no is ok and that should be the baseline for
the tests, i.e. before the tests are runned, the DB contains that baseline
stable data. Note that if your tables are all emtpy (i.e. no baseline data)
before runnning the tests, everything is much simpler, you just do a
truncate/delete on all tables after every test and you don't need to read on
:)

Option 1 - Data refresh:  refresh data after/before every test.
...from a copy of the baseline data somewhere.
Really impossible due to performance. If you think not, let me know how.

Option 2 - Data Re-initialize: every test case clean up it's data in
tearDown()
Pointed out by David Roussel also.
That's one of the strategy we use. Works well but you got to be carefull
writing your cleanup code.
Impossible to use if you are changing stable data rows: only work if just
just add rows.

Option 3 - use mock objects to fake the DB.
Dicussed in the mailing list archives.
When you want to test DB code, (SQL, transactions, etc), does not really
apply since you fake the part you want to test.
Only applies if the actual DB is outside the "unit" under test, which isnot
always the case.

Option 4 - Data Reset: reverse the changes via some generic black-magic
after every test.
The dream! Fast, efficient, generic! :)
I've spent some time trying to find a neat DB trick do to that one. Never
really got the time to find a complete solution and implement it.

4a) Journalling
Use some sort of journalling provided by the database to replay in backward
direction all the changes made. Is this even possible/supported?

4b) DB tricks, triggers and al.
You need a DB that is a copy of your baseline stable data. (Might be in the
same DB, but with table names prefixed with 'STBL'(stable)).
(You could use 'create table like' to ease making baseline table with the
same structure.)

Every table as a flag column name BASELINE to keep track of what happens to
the line. For your baseline data, set the flag to 'Y'(es).
Default on insert for this column in 'N' (New, Not part of baseline). A
trigger on update changes the flag to 'U'(updated).
A trigger on delete (only effective on a row marked 'Y' or 'U')  marks the
corresponding row in the baseline copy with 'D'(deleted)

After a test:
-drop all lines with 'N',
-if a line is U, refresh it from the baseline copy. (resetting the flag to
'Y').
-Add all the rows marked 'D' in the baseline copy to the working copy.

Daily or weekly, refresh from scratch your working copy from the baseline
copy. Do this also if a test case fail and it should really worked and you
suspects the code that revert changes to have somehow failed.

Beware of autoincrement columns, they will cause re-adding a deleted row to
break the system because the column(possibly a key) will change.
Any trigger used in the system needs to be analysed to see if it would work
with this system (e.g. a trigger that sets a column to the current time will
make re-adding a delete row fail, i.e. the time will have changed. This
might be ok or not).
Special treatment should be done for views also. (Don't run the reversing
code on them)
(You might also have to drop the integrity constraints while doing this if
you don't want to have to do it in the right order respecting integrity
constraints).

Option 5 - Data Roll forward:  don't do anything!
Just start the next test without cleaning the junk of the previous one.
Might work in some case (i.e. obvious read-only DB case) but is not really a
"solution".

Option 6 - Auto commit = false ...
...and rollback in tearDown().
Pointed out by Neil Swingler also.
Very good solution if applicable (see top of this email for reasons why it
would not always apply).
Your code should allow this, and your DB also ("locking model"), see email
below from David Roussel.

If you ever implement 4b, or have any comments or solution, please let me
know! :)

mathieu


   Date: Tue, 31 Jul 2001 09:35:23 +0100
   From: Richard Dallaway 
Subject: Re: testing database operation


Mathieu - thanks for putting together that list of options.  I think
there's some good stuff in there to think about.


>Option 1 - Data refresh:  refresh data after/before every test.
>...from a copy of the baseline data somewhere.
>Really impossible due to performance. If you think not, let me know how.


If...
  - you need a lot of data for unit tests (and some codes does); or
  - your DB machine is on the wrong side of the planet; or
  - your DB machine/DB product is under-powered; or
  - you have so many DB tests that you've reached a limit...
...then yes, performance is an issue with this approach.  For an
ever-growing project it's something you're going to run into sooner or
later, I guess.


On the other hand, there are some things you can do to make it
workable.  For example:


- if a subsection of tests only touch a couple of tables, then you don't
need to wipe and recreate the whole database for those tests.  I tend to
write helper methods to set up database scenarios and call them as required
(rather than touch every table every time in setUp and tearDown).


- how much data do you need for unit tests?  If you craft your tests
carefully, you may be able to get away with just a hand full of records.


- as you point out, truncate is very fast for getting rid of data.  And for
some tests, that's all you need to do.


- if your tests are all in one test class, then it's going to be a pain to
run that class.  You may get better mileage from option 1 if you can break
your testing into a number of smaller classes.  During a coding frenzy you
can run the test class covering just the area you're working on... and run
all tests a little less frequently.


I guess the viability of option 1 depends on your aims and circumstances.



>Option 4 - Data Reset: reverse the changes via some generic black-magic
>after every test.
>The dream! Fast, efficient, generic! :)


Oh yes, this is what we want.



>4b) DB tricks, triggers and al.
>You need a DB that is a copy of your baseline stable data. (Might be in the
>same DB, but with table names prefixed with 'STBL'(stable)).


The only downside I see to this is the complexity it brings, and I'm not
sure I want to add much more complexity to my testing environment.   I
guess it's a trade off: extra complexity for time gained from something
like option 1.


Regards
Richard

Message: 5
   Date: Tue, 31 Jul 2001 09:40:11 +0100
   From: Neil Swingler 
Subject: RE: testing database operation

Well I have just finished my first EJB project and this is what I did.

DB layer
    rollback after each test in tearDown

Entity EJBs
    manage transactions using UserTransaction so that I know when ejbLoad
and ejbStore are going to happen
    delete created objects in tearDown, also in setUp in case a previous
test bombed out
    test getters before committing and after commiting to check correct
function of
    ejbStore and ejbLoad

Session EJBs
    manage transactions using UserTransaction but start the transaction in
setUp and rollback
    in tearDown

I didn't use any mock objects and this makes the tests expensive to set up
and slow to
execute.

For the DB layer I think I would always want to test against a real database
to
verify that the SQL is correct.

For the EJB layer, next time I would not test the EJBs in the container and
I would mock out
the DB layer.

--
Neil Swingler

> -----Original Message-----
> From: Lindanger Trond Kjetil
> [mailto:trond-kjetil.lindanger@...]
> Sent: 31 July 2001 09:50
> To: 'junit@yahoogroups.com'
> Subject: RE: [junit] testing database operation
>
>
> Interesting to see what you did Neil.
> Did you ever have problems with tests breaking down (e.g. uncaught
> NullPointerException)
> never reaching tearDown for rollback?
>

If you never commit it doesn't matter. That's why you set autoCommit to
false when testing the db layer.

In the EJB layer, if you use UserTransaction, a commit won't happen unless
you call commit().

In both cases the explicit rollBack() is not really needed
but will presumably free up resources sooner.

--
Neil

   Date: Fri, 31 Aug 2001 18:33:30 +0100
   From: Richard Dallaway 
Subject: database testing with JTA/JTS

A while ago I had an email conversation with Bruce Vinchon regarding
database testing. I said I'd mention something we discussed on this mailing
list so here goes (and apologies to Bruce if I've misrepresented his thoughts).

Bruce mentioned using the "auto commit off" trick to make database testing
easier.   I.e., you do something like turn auto commit off during setUp()
and do an rollback/abort in tearDown(), so you can write to a test database
and not have to worry about cleaning it up. Something like that.

I replied talking about the problems when you have methods which are
themselves transactions (i.e., methods which
start transactions, write to various tables, then commit).  How can you use
the "auto commit off" trick in those situations?

Bruce suggested that you may be able to get something to work with nested
transactions, perhaps using a transaction service such as JTA/JTS.  Sounds
good to me, but I haven't had the opportunity to investigate this idea or
the Java Transaction APIs.

Can anyone comment on this approach?

Regards
Richard

   Date: Fri, 31 Aug 2001 10:48:20 -0700
   From: "Yang Wang" 
Subject: RE: database testing with JTA/JTS

I do not think JTA support nested transactions.  In the case of nested
transactions, it will suspend the outer transaction, begin the inner
transaction, commit it and then resume the outer transaction. Maybe new
JTA version supports nested transaction.

Yang

   Date: Fri, 31 Aug 2001 14:10:20 -0400 (EDT)
   From: Eric Vought 
Subject: Re: database testing with JTA/JTS

In the past, I've designed my database interaction classes to take an
application specific Transaction object.  The Transaction had
Commit/Rollback methods which contained application specific handling
(like resetting local state in event of a database rollback) as well as
performing the real commit/rollback . By substituting a special purpose
Transaction object I could simulate nested transactions or
dummy out the commits for testing purposes.

   Date: Mon, 3 Sep 2001 11:46:32 +0200
   From: "Bruce Vinchon" 
Subject: Re: database testing with JTA/JTS

hi all,

As richard said, there could be several approach to test a database :

1) I just want to make some test and rollback (autoCommit(false) in
setup() and rollback() in tearDown()). It allows to insert, update, delete
in my database during my tests and rollback all operations to let the
database clean. If you do a lot of tests (testSuite), you could just have
independant tests (as it should be). If you want to chain tests that uses
the same connection and to rollback all operations at the end, you
probably should developed a transaction object. Eric describes it. You can
control autoCommit and rollback and inhibit their standard behavior. a
testSuite can control commit and rollback through other methods in the
transaction class (globalCommit() and globalRollback() called in the body
of your testSuite ?) as each test methods call inactive autoCommit(),
commit() and rollback() methods.

2) you don't want to borrow about it and want to manage nested
transactions. well... JTS specification allows nested transaction (spec
1.0 says : "not required" but you can find JTS compliant platforms that
support it) :

"A JTS transaction manager must support flat transactions; support of
nested transactions is optional. If a client begins a transaction, and
within that transaction begins another transaction, the latter operation
will throw a NotSupportedException if the JTS implementation does not
support nested transactions. Keep in mind that even if the JTS
implementation supports nested transactions, this transaction
manager-level support does not guarantee support for nested transactions
in an application. For example, the EJB 1.1 specification does not support
nested transactions."... [find on jGuru]

but EJB 2.0 supports it !
(http://www.wap-shareware.com/sjug/ejb/index.shtml - see point 6).

so where is the problem ? you should find a database that supports it and
a transaction server that is JTS compliant. DB2 and SQL Server seems not
to support nested transactions (perhaps directly ?). Oracle 8i does...
with a little trick : the save points.

3) With save points, you could simulate nested transactions. Oracle 8i
allows it in PL/SQL procedure that you could call through JDBC (not good
for simple tests !) (see
http://www.inquiry.com/techtips/oracle_pro/10min/10min1100.asp). [Informix
Universe 9.4 supports the same thing : see
http://www.inactsys.com/universe.htm].

Where to find some transaction application server that allows it ? Some
products like PowerTiers provides setSavePoints() methods (you can read
here for an explanation :
http://www.persistence.com/powertier/devpit/documentation/programming/directt5.htm).
But you should use it with an appropriate database.

Conclusion : as it seems not quite easy to find a cool platform where
nested transactions are automatically supported, the first solution seems
the best approach actually.

Any comments or ideas ?

regards,

Bruce Vinchon

   Date: Tue, 4 Sep 2001 06:13:42 -0700 (PDT)
   From: Steven Morgan 
Subject: subtransactions are very different

> 2) you don't want to borrow about it and want to manage nested
> transactions. well... JTS specification allows nested transaction (spec
> 1.0 says : "not required" but you can find JTS compliant platforms that
> support it) :

> Bruce suggested that you may be able to get something to work with
nested
> transactions, perhaps using a transaction service such as JTA/JTS.
Sounds
> good to me, but I haven't had the opportunity to investigate this idea
or
> the Java Transaction APIs.
>
> Can anyone comment on this approach?

OTS subtransactions are very different from top-level transactions.
Resources are not consulted and no two-phase commit occurs.
Subtransactions simply promote registered resources if they commit and
discard them if they rollback.  You will need to use subtransaction-aware
resources (which are not resources, go figure) to interact with the
transaction manager for a subtransaction.  And that is far more than you
wanted to learn that much about the OMG Transaction Spec, I bet.

-Steve  :)

   Date: Tue, 4 Sep 2001 17:24:59 +0200
   From: "Bruce Vinchon" 
Subject: Re: bruce's sub-trance actions :)

> OTS subtransactions are very different from top-level transactions.
> Resources are not consulted and no two-phase commit occurs.
> Subtransactions simply promote registered resources if they commit and
> discard them if they rollback.  You will need to use
subtransaction-aware
> resources (which are not resources, go figure) to interact with the
> transaction manager for a subtransaction.  And that is far more than you
> wanted to learn that much about the OMG Transaction Spec, I bet.

you're totally right ! As i've dived into several specification documents
(not comments about them), i've discovered that OTS doesn't support
directly nested transaction but only subtransaction (as you said). And JTS
is based upon OTS. So it explains the "nested transactions not required".
About EJB 2.0 spec, it is written : "no nested transactions supported".
Authors explain that they want EJB spec to be implemented on any database
system. They add that EJB spec could support nested transactions if
databases support them [today, Oracle and Informix begin to do it via
SavePoints].

So a lot of my mail was wrong (shame on me :() but the conclusion :))
[Long live simple JDBC ! :)]

Without enough "normative oxygen" to stay in spec documentation :) i've
discovered some research works that explains how to manage nested
transaction with OMG database
[http://citeseer.nj.nec.com/tesch97global.html (ODMG but works for OMG
too)]. but it's another story...

thanks Steven !

regards,

bruce.


   Date: Tue, 25 Sep 2001 10:25:37 +0200 (MEST)
   From: Johannes Link 
Subject: Re: Test-first paper


> Here's something I've been working on. It's a paper that talks about some
> techniques for developing a database application test-first. The example
> code is in the cvs respository on sourceforge, accessible via
> http://www.mockobjects.com.
>
>  http://www.mockobjects.com/papers/jdbc_testfirst.html
>
> Comments, etc?
>
> Steve

Steve,
I have tried to write a similar chapter for the test-first book I'm writing
at the moment. What evetually made me drop the mock approach for JDBC code
is:
- The setup of the mocks was more and more of a nightmare. If you really
want to test the correct usage of JDBC, you have to check state-dependent
behaviour across objects. For instance, you have to check that a statement won't be
reused if its connection has been closed, or that your statement has not
been reused for another query to execute before you have read all the results;
and so on and on.
- The test code got too brittle for my taste. During refactoring I
constantly changed the way how I queried the database and handed on result sets etc.
So I found myself changing the test code again and again. I think your example
is too stable to show this disadvantage.
- Testing that the SQL is right requires some kind of SQL-Parser. It's not a
problem for static SQL strings but it is one when SQL code is somehow
assembled dynamically - which was the case in the example I had chosen and which
seemed to be natural when striving for the simples thing.

The problem might be that JDBC itself was designed without test first and
testability in mind. One might probably come up with an API that is much better
suited for testing.

Johannes

   Date: Wed, 26 Sep 2001 21:16:20 +0100
   From: "Steve Freeman" 
Subject: Re: Testing SQL: was Test-first paper

> Surely the only way to test SQL is to use a database. Imagine testing
> java code by parsing it rather than running it. Maybe CMP EJBs make this
> a lot easier.

but there are two things going on here: checking that the right things go into the generated SQL and checking that the SQL is in sync with the schema. In fact, code generation is an interesting example because the unit tests for the Big Ben tools (which generate test suites) just does some simple string matching -- there are higher level tests to make sure the thing runs.

In my experience, most dynamic SQL is some static stuff with values, and maybe an extra clause, stuffed in. It may not be worth testing the static stuff during SQL generation, but it is worth gaining enough confidence that the stuffing is working -- and often the minimum working test can be a substring match. I've seen effective unit testing of HTML done with regular expressions.

Now the corollary of this is that at some point you have to be sure that it all hangs together (as against hanging seperately...). My suspicion is that most applications of any size end up with a data model layer, which can be sanity checked by a smaller cross-section of integration (or whatever we're calling them now) tests. It sounds like your approach is to concentrate on more thorough integration tests.

What's the point of the extra tests?
- we test everything that can break and some things, like forcing exceptions, are very hard to test against a real server. I find that testing with mocks pushes me towards thinking harder about things like error handling.
- those of us who use them regularly find that the discipline of writing Mock Objects has a good effect on the shape of our code. I'd be loath to give that up.
- as we discover new classes and factor them out, we can see the tests that go with them and move them too.

Steve