`dbstreams`: Classes

This section describes the dbstream classes. It's meant as an introduction, not a reference manual, more of a what-and-why than a how-to. Detailed descriptions can be found in the reference manual. (TODO: write reference manual.) We begin with the helper classes because they're part of a dbstream, and end with dbstream itself.

Primary Classes

Applications interact primarily with a dbstream, which behaves largely like a std::iostream.
SQL text is kept in a query object. Inserting a query into the stream prepares the query and, unless it contains parameters (placeholders), executes it.
A provider class (there are several) wraps a native database client library (e.g., ODBC) in a uniform API for use by dbstream. It implements a set of pure virtual functions.

Other Classes (abbreviated)

login holds user authentication information.
metadata describes a column or parameter.
cell describes a fetched value or parameter.
dbstatus holds stream state information and library/server messages.

All classes live in the dbstreams namespace. Among other things, this avoids conflict with the std namespace while permitting use of some of the same names.

`provider`

provider is an abstract class. It defines, through pure virtual functions, how dbstreams will access a native library. Each native library (and hence server) supported by dbstreams has its own provider.

The application never uses the provider directly. Understanding that it exists and the role it plays helps in understanding the design and use of dbstreams.

`query`

The fundamental reason for the existence of the query object is to allow the dbstream to distinguish an inserted query from ordinary data.

A query object is basically a std::string. You can assign a std::string to it. The operator<<(const std::string&) is also supported because it makes query building easier.
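The idea can be sketched as follows. This is not the library's definition: `query_sketch`, `classify`, and `build_example` are hypothetical names, and the only point is that a distinct type derived from std::string lets overload resolution tell a query apart from plain string data.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for dbstreams::query; the real class may differ.
struct query_sketch : std::string {
    query_sketch() {}
    query_sketch(const std::string& s) : std::string(s) {}
    query_sketch(const char* s) : std::string(s) {}

    // Append text and return *this so that insertions can be chained.
    query_sketch& operator<<(const std::string& s) {
        append(s);
        return *this;
    }
};

// Two overloads show how a stream could treat a query differently from data.
inline const char* classify(const query_sketch&) { return "query"; }
inline const char* classify(const std::string&) { return "data"; }

// Builds a query piece by piece with operator<<.
inline query_sketch build_example() {
    query_sketch q("select * from T");
    q << " where id = " << std::to_string(42);
    return q;
}
```

Because the derived type is an exact match for its own overload, a stream written this way never mistakes a query for a string value to be inserted.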

`login`

A login object holds user credentials, whatever that might mean. When one dbstream is copied to another, its credentials are what's actually copied.

`metadata`

As soon as the query is executed (that is, after the query object is inserted into the dbstream), the first row of results is available. The provider gathers the metadata of each column: name, number, type, size, and nullability.

`cell`

A cell is an intermediate object, seldom used by applications. It is used when there's a need to hold a typeless value, normally something returned from the database whose destination (and hence type) is not yet known. Operators are defined to assign/extract the value to/from built-in types.

Because cell is derived from metadata, a cell knows its name, ordinal position in the row, and datatype.

A cell does not convert from one type to another. If an integer is assigned to it and a string is read, the result is an exception. [TODO: throw exceptions.]
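A minimal sketch of the intended no-conversion rule, assuming a holder limited to integers and strings. `cell_sketch` and `mismatch_throws` are hypothetical names, and the exception type is an assumption, since the library itself does not throw yet.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical typeless value holder; not the dbstreams cell itself.
class cell_sketch {
public:
    enum kind { empty, integer, text };

    cell_sketch() : kind_(empty), int_(0) {}

    // Assignment records both the value and its type.
    cell_sketch& operator=(int v) { kind_ = integer; int_ = v; return *this; }
    cell_sketch& operator=(const std::string& v) { kind_ = text; text_ = v; return *this; }

    // Extraction succeeds only for the stored type; nothing is converted.
    const cell_sketch& operator>>(int& out) const {
        if (kind_ != integer) throw std::runtime_error("cell does not hold an integer");
        out = int_;
        return *this;
    }
    const cell_sketch& operator>>(std::string& out) const {
        if (kind_ != text) throw std::runtime_error("cell does not hold a string");
        out = text_;
        return *this;
    }

private:
    kind kind_;
    int int_;
    std::string text_;
};

// Returns true if reading a string from an integer cell throws.
inline bool mismatch_throws() {
    cell_sketch c;
    c = 7;
    std::string s;
    try { c >> s; } catch (const std::runtime_error&) { return true; }
    return false;
}
```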

`dbstatus`

dbstatus is quite complicated. It's rarely examined directly, but it's returned by many dbstream operations. Because lots of status information originates within the provider, the provider's status object (typically dbstatus or a derivative) is a template argument to the provider. For example, for the SQLite provider:

	template <typename STATUS>
		class SQLite : private provider_data

dbstatus is actually what's tested in constructs such as

	if( db )

which amounts to something like

	if( dbstream<dbstatus>::dbstatus::operator void*() )

although it's quite a bit less typing.
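For readers who have not met the idiom, here is a sketch of how a conversion to a pointer makes an object testable in an `if`. `status_flag` and `usable` are hypothetical names; the real dbstatus carries far more state than one flag.

```cpp
#include <cassert>

// Hypothetical one-flag status object, to show the conversion only.
class status_flag {
    bool fail_;
public:
    explicit status_flag(bool fail) : fail_(fail) {}

    // Non-null while the object is usable, null once it has failed.
    operator const void*() const {
        return fail_ ? static_cast<const void*>(0)
                     : static_cast<const void*>(this);
    }
};

// `if (s)` converts s to a pointer, and the pointer to a boolean.
inline bool usable(const status_flag& s) {
    if (s) return true;
    return false;
}
```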

Stream State

	enum iostate { goodbit, badbit, eofbit, failbit = 4  };
	bool good() const;
	bool bad() const;
	bool fail() const;
	bool eof() const;
	iostate rdstate();
	iostate setstate( iostate state );
	iostate clear( iostate state = goodbit );

dbstatus defines the state status bits and the functions that get/set them. It is modelled on std::ios and uses the same names for the status bits and functions. If you know those, you're good to go. If you don't, they're good to know.
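For those who don't know them, this short example uses only the standard library (not dbstreams) to show how the same-named bits behave on a std::istringstream. The helper names `final_state`, `drain`, and `clear_restores_good` are made up for the illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Final condition of a stream after every readable int has been consumed.
struct final_state { int count; bool eof; bool fail; bool bad; };

// Reads ints from `text` until an extraction fails, then reports the state.
inline final_state drain(const std::string& text) {
    std::istringstream is(text);
    int n = 0;
    int value = 0;
    while (is >> value) ++n;
    final_state s = { n, is.eof(), is.fail(), is.bad() };
    return s;
}

// clear() with its default argument resets the state to goodbit.
inline bool clear_restores_good() {
    std::istringstream is("x");
    int v = 0;
    is >> v;                       // not a number: sets failbit
    bool failed = is.fail();
    is.clear();
    return failed && is.good();
}
```

Draining "1 2 3" ends with both eof and fail set, while "1 x" ends with fail alone, because the stream stopped at bad input rather than at the end of the data.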

One of the challenges of the library writer is to make things as simple as possible, but no simpler. It's one thing to say “a database is like a file”; it's something else to reduce a connection's states to those of a stream. It's not easy but it is possible. It's also one of the reasons dbstreams is easy to use.

Besides state, dbstatus holds the native error number and message when an error occurs. The provider may also choose to include the file and line number where the error occurred, especially helpful for exceptions. Oh, and dbstatus is sometimes thrown, as you can see from the above example.

dbstatus has two functions for managing messages from the server.

notify: This is called whenever a server delivers an error message, before any associated data are delivered. The default action is to write the message to std::cerr. It can of course be overridden by deriving a new class.
quit: Nominally, this is called by the provider to determine whether or not to proceed with the current operation. So far, it hasn't been needed.
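Overriding notify might look something like the sketch below. The signature of notify is not given in this document, so the `(int, const std::string&)` parameters are an assumption, as are the names `status_sketch`, `collecting_status`, and `deliver`.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical base class; the real dbstatus::notify signature may differ.
struct status_sketch {
    virtual ~status_sketch() {}

    // Default action: write the server message to std::cerr.
    virtual void notify(int msgno, const std::string& text) {
        std::cerr << "server message " << msgno << ": " << text << '\n';
    }
};

// Derived status that collects messages instead of printing them.
struct collecting_status : status_sketch {
    std::vector<std::string> messages;

    void notify(int, const std::string& text) override {
        messages.push_back(text);
    }
};

// Stand-in for a provider reporting a message through the base interface.
inline void deliver(status_sketch& st, int msgno, const std::string& text) {
    st.notify(msgno, text);
}
```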

`dbstream`

	template <typename PROVIDER, typename STATUS>
	class dbstream : private dbstream_data

Construction and Destruction

	dbstream();
	dbstream( const dbstream& that );
	dbstream( const std::string& username, const std::string& password );

Like a std::iostream, there's a default constructor. It's initialized to dbstatus::goodbit because that's how iostreams work.

There are differences, too. One constructor accepts a username and password that will be used when opening connections. And the copy constructor has defined behavior: the login credentials are copied, and a new connection is formed to the same server and database.

There is no constructor that takes a servername because it's too confusing. Forming a connection can require up to four (and sometimes more) strings: username, password, servername, database. There's no logical order to them and no way for the constructor to distinguish among them by type. So we limit construction to authentication and relegate database information to the open methods.

	~dbstream();

The destructor frees any resources and closes the connection. It may throw an exception if the provider detects an error from the native library.

Open and Close

	const dbstatus& open( const std::string& server, 
			      const std::string& dbname, 
			      const std::string& tablename = std::string() );
	const dbstatus& open( const std::string& tablename );

In the first form, a connection is formed to the database. The returned dbstatus object should be tested before proceeding; it will not throw an error. The optional tablename argument opens a table by calling the second form.

The second form “opens a table”, meaning it readies the stream to write data to the table via the operator<< and write methods.

	void close();

As with an iostream, no error is returned. If the provider detects a “can't happen” condition, it throws an exception. This guards against silently discarding data in an open transaction, and encourages discovery of such impossible situations.

Stream Status

	const dbstatus& status() const;
	operator const void*() const;
	bool eof() const;
	bool error() const;

If desired, the stream's provider's status object — normally dbstatus or a derivation — can be retrieved and dealt with explicitly.

Normally, that's not necessary except to handle errors. For go/no-go decisions, the question is whether or not the stream has more data available for extraction. For that purpose,

	if( db )

and

	if( db.eof() )

normally suffice.

The eof test departs slightly from iostreams to support record-oriented operations. A query may produce several result sets, and an application wants to distinguish between them. A dbstream signals the end of a resultset with dbstatus::eofbit but not failbit. failbit is set only if there are no data pending, no further results to be read.

The void* operator (a fancy boolean) tests for failbit and not for eofbit. In an iostream, that's fine, because the stream will set eofbit and failbit whenever end-of-file is reached. A dbstream, in contrast, will exhibit transient eof status as each resultset is read.

The usage pattern for a dbstream differs slightly from that of an iostream. With an iostream, one might say:

`iostream` pattern

	ifstream is("f");
	int i;
	// test the stream after each extraction, so a failed read is never printed
	while( is >> i ) {
		cout << i << endl;
	}

whereas a dbstream has two loops and checks for eof:

`dbstream` pattern

	db.open("server", "database");
	query sql("select * from A; select * from B;");
	// while any results are pending
	for ( db << sql; db; db++ ) {
		// process each resultset
		for( ; ! db.eof(); db++ ) {
			int i;
			db >> i;
			cout << i << endl;
		}
	}

In the real world, there would obviously be different things to do depending on which resultset was being read.

The dbstream has record semantics: operator>> extracts a column value to a variable and advances to the next column. operator++ advances to the next row. Attempts to extract beyond end-of-row result in an exception being thrown.

Note that operator++ is called at the end of each loop. This is a deliberate choice, to simplify the caller's life. The sequence is:

	next, row N-1, good
	next, row N, good
	next, row N+1, eof	// next results pending
	next, row 1, good	// start of next result
	...
	next, row N, good
	next, row N+1, eof + fail // no more results

If the stream were not incremented in the outer loop, something else would have to clear the eof condition, else the inner loop would never fetch the second resultset. Incrementing seemed the most natural way.
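The state sequence above can be reproduced with a small mock. This is a sketch under simplifying assumptions, not the dbstream implementation: `mock_stream` and `collect` are hypothetical, every row has a single integer column, and extraction does not advance a column counter.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical mock: each result set is a list of one-column integer rows.
class mock_stream {
    std::vector<std::vector<int> > sets_;
    std::size_t set_, row_;
    bool eof_, fail_;

    // Recompute the state bits for the current position.
    void update() {
        if (set_ >= sets_.size()) { eof_ = fail_ = true; return; }
        eof_ = row_ >= sets_[set_].size();
        fail_ = eof_ && set_ + 1 >= sets_.size();  // no further results
    }

public:
    explicit mock_stream(const std::vector<std::vector<int> >& sets)
        : sets_(sets), set_(0), row_(0), eof_(false), fail_(false) { update(); }

    bool eof() const { return eof_; }
    explicit operator bool() const { return !fail_; }

    // Advance one row; from eof, step to the first row of the next result set.
    void operator++(int) {
        if (fail_) return;
        if (eof_) { ++set_; row_ = 0; } else { ++row_; }
        update();
    }

    mock_stream& operator>>(int& out) {
        if (eof_) throw std::out_of_range("no current row");
        out = sets_[set_][row_];
        return *this;
    }
};

// The two-loop pattern: outer loop per result set, inner loop per row.
inline std::vector<int> collect(mock_stream db) {
    std::vector<int> out;
    for (; db; db++) {
        for (; !db.eof(); db++) {
            int i = 0;
            db >> i;
            out.push_back(i);
        }
    }
    return out;
}
```

In the mock, the increment in the outer loop is what clears eof and moves to the next result set; once fail is set, a further increment does nothing and the outer test ends the loop.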

Logging

	std::ostream* log( std::ostream * pos );

For debugging purposes, the application may open a std::ostream and pass a pointer to it to the dbstream for logging. What appears in the log is up to the provider. Normally the log output is not interesting to the application programmer, but it can be very helpful to the provider author.

Execute Queries

The query object was discussed above. For simple queries, execution is simply a matter of inserting the query into the stream.

	dbstream<dbstatus> db(provider, username, password);
	db.open( servername, database );	

	query q = "select * from T";

	db << q;

Parameterized queries are beyond the scope of this document. Briefly, the application constructs a parameter_type for each parameter and inserts that into the stream after the query. End-of-parameter-data is indicated by inserting dbstreams::endl into the stream.

Metadata

	const metadata& meta(int c) const;
	int columns() const;
	int rows() const;

Metadata are available immediately after executing a query. If a resultset was produced, the column count is also immediately available. (For most providers, this is true even if the resultset has no rows.)

As each row is fetched (or sent) the row counter is updated by the provider. The application can keep count itself, of course, and it can also interrogate the counter with the rows method.

Insert and Extract

	template <typename D>
		dbstream<PROVIDER>& 
			operator<<( const D& datum );

Insertion, as described earlier, directly mimics iostreams. There is even a dbstream manipulator, dbstreams::endl, to signal end-of-row.

		db.table( "T", bcpmode );
		db << 4    << 4.4  << "four"  << endl
		   << null << 5.5  << "five"  << endl
		   << 6    << null << "six"   << endl;

Different providers have different capabilities. Some providers build an INSERT statement from the insertion sequence. Others have ways to accept data other than via SQL, ways that can be faster. Sybase, for example, has a bulk-copy mode that lets the client send the server data in much the same way the server sends the client data. For providers with such a feature, there is another manipulator, eob, to signify end-of-batch.
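How a provider might turn an insertion sequence into INSERT statements can be sketched as below. Everything here is hypothetical (`insert_builder`, `null_value`, `end_row`); it mimics the manipulator style only and says nothing about how any actual provider builds or sends its SQL.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct null_t {};      // hypothetical marker for a NULL value
struct end_row_t {};   // hypothetical end-of-row manipulator
const null_t null_value = {};
const end_row_t end_row = {};

// Collects inserted values and emits one INSERT statement per row.
class insert_builder {
    std::string table_;
    std::vector<std::string> fields_;
    std::vector<std::string> statements_;

    // Quote a string literal, doubling any embedded single quotes.
    static std::string quote(const std::string& s) {
        std::string out = "'";
        for (std::size_t i = 0; i < s.size(); ++i) {
            if (s[i] == '\'') out += '\'';
            out += s[i];
        }
        return out + "'";
    }

public:
    explicit insert_builder(const std::string& table) : table_(table) {}

    insert_builder& operator<<(int v) { fields_.push_back(std::to_string(v)); return *this; }
    insert_builder& operator<<(double v) {
        std::ostringstream os;
        os << v;
        fields_.push_back(os.str());
        return *this;
    }
    insert_builder& operator<<(const char* v) { fields_.push_back(quote(v)); return *this; }
    insert_builder& operator<<(null_t) { fields_.push_back("NULL"); return *this; }

    // End of row: assemble the statement and start a new row.
    insert_builder& operator<<(end_row_t) {
        std::string sql = "INSERT INTO " + table_ + " VALUES (";
        for (std::size_t i = 0; i < fields_.size(); ++i) {
            if (i) sql += ", ";
            sql += fields_[i];
        }
        statements_.push_back(sql + ")");
        fields_.clear();
        return *this;
    }

    const std::vector<std::string>& statements() const { return statements_; }
};
```

A production provider would more likely bind values as parameters than paste them into SQL text; the string building here is only to make the row boundaries visible.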

	struct c
	{
		std::string colname;
		c( const std::string& colname );
	};

	dbstream<PROVIDER>&  operator>>( const dbstreams::c& c );

	template <typename D>
		dbstream<PROVIDER>& 
			operator>>( D& datum );

Data can be read from a row in column order, just as with any stream. The c structure, when “extracted” to, simply sets the stream's current column number according to the struct's colname. In that way:

	db >> c("phone") >> phone;

sets db:icol to N, where N is the column whose name is "phone". That operation returns a reference to the stream. Next the data are extracted from the stream into the variable phone. The stream uses its current column number, still N. N is incremented after the extraction.
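The mechanism can be imitated with a single in-memory row. This is a sketch, with hypothetical names (`col`, `row_sketch`, `current_column`) and string-only values; throwing std::out_of_range for an unknown column or a read past the end of the row is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical counterpart of dbstreams::c: names the column to read next.
struct col {
    std::string name;
    explicit col(const std::string& n) : name(n) {}
};

// One row of string values with named columns.
class row_sketch {
    std::vector<std::string> names_, values_;
    std::size_t icol_;

public:
    row_sketch(const std::vector<std::string>& names,
               const std::vector<std::string>& values)
        : names_(names), values_(values), icol_(0) {}

    // "Extracting" to a col only repositions the current column.
    row_sketch& operator>>(const col& c) {
        std::vector<std::string>::const_iterator it =
            std::find(names_.begin(), names_.end(), c.name);
        if (it == names_.end()) throw std::out_of_range("no such column: " + c.name);
        icol_ = static_cast<std::size_t>(it - names_.begin());
        return *this;
    }

    // Extracting a value reads the current column, then advances it.
    row_sketch& operator>>(std::string& out) {
        if (icol_ >= values_.size()) throw std::out_of_range("read past end of row");
        out = values_[icol_++];
        return *this;
    }

    std::size_t current_column() const { return icol_; }
};
```

Because the column counter keeps advancing after a named read, `row >> col("phone") >> phone >> city` reads phone and then whatever column follows it.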

Read and Write

	class a_container
	{
	public:
		template <typename OS>
			OS& read( OS& os );
		template <typename OS>
			OS& write( OS& os );
	};
	
	template <typename CONTAINER>
		dbstream<PROVIDER>& 
			read( CONTAINER& container );

To read a value requires an operator, but to read a whole resultset requires a function. And a place to put the data.

dbstream::read requires a standard STL container (or something very similar). It calls the element's read method repeatedly, incrementing the stream each time, until eof. It is up to the container's element class to define what is to be done with the stream, i.e. which columns to assign to which member variables.
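A sketch of that loop, written as a free function over a stand-in stream. `row_source`, `employee`, and `read_all` are hypothetical; the real dbstream::read is a member function and its stream has a richer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a stream positioned on a resultset of (int, string) rows.
struct row_source {
    std::vector<std::pair<int, std::string> > rows;
    std::size_t irow;

    row_source() : irow(0) {}
    bool eof() const { return irow >= rows.size(); }
    void next() { ++irow; }
    row_source& operator>>(int& out) { out = rows.at(irow).first; return *this; }
    row_source& operator>>(std::string& out) { out = rows.at(irow).second; return *this; }
};

// The element class decides which columns go to which members.
struct employee {
    int id;
    std::string name;

    employee() : id(0) {}

    template <typename S>
    S& read(S& s) {
        s >> id >> name;
        return s;
    }
};

// Fill the container: one element per row, advancing the stream each time.
template <typename Stream, typename Container>
Stream& read_all(Stream& s, Container& c) {
    for (; !s.eof(); s.next()) {
        typename Container::value_type element;
        element.read(s);
        c.push_back(element);
    }
    return s;
}
```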

	template <typename CONTAINER>
		dbstream<PROVIDER>& 
			write( CONTAINER& container );

dbstream::write has the same standard STL container requirements. It iterates over the container, calling the element's write method for each one. If dbstream::write encounters a stream error, it throws a std::runtime_error exception.
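The writing side can be sketched the same way. `sink`, `point`, and `write_all` are hypothetical names, the sink records text lines rather than talking to a database, and checking the stream after each element is one plausible reading of "encounters a stream error".

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical output stream stand-in that records one line per element.
struct sink {
    std::vector<std::string> lines;
    bool failed;

    sink() : failed(false) {}
    bool good() const { return !failed; }
};

// The element class decides how it writes itself to the stream.
struct point {
    int x, y;

    point(int x_, int y_) : x(x_), y(y_) {}

    template <typename S>
    S& write(S& s) const {
        s.lines.push_back(std::to_string(x) + "," + std::to_string(y));
        return s;
    }
};

// Write every element; a stream error becomes a std::runtime_error.
template <typename Stream, typename Container>
Stream& write_all(Stream& s, const Container& c) {
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it) {
        it->write(s);
        if (!s.good()) throw std::runtime_error("stream error while writing");
    }
    return s;
}
```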

Observe: the loops are already written for you. You don't declare iterators, worry about off-by-one errors, dereference pointers, nothing. Just define how your container elements read and write themselves to a dbstream, and call the appropriate dbstream function. What could be easier?

$Id: classes.desc.html,v 1.7 2008/04/05 22:56:44 jklowden Exp $

Comments, questions, and encouraging words are welcome. Please email the author, James K. Lowden.
