Welcome to Schema Mania

Schema Mania is a place for people who like (or need, or are just good at) database designs. It's completely non-profit, dependent on the enthusiasm of its visitors and the talent of its contributors.

Purpose

Schema Mania was conceived as a repository of database designs. You'd be able to come here, browse for a database design in your "problem space". With luck, you'd find something at least similar to what you had in mind. You'd download it, adapt it to your needs, and be happy. www.schemamania.org would be a web of database designs, if you will.

But, a funny thing happened on the way to the forum. Much of the technology that Schema Mania needs is not ready for general use. What's available is nascent; the rest is missing. However valuable the concept might be, Schema Mania lacks both software and standards. It thus became part of Schema Mania's goal to bring together people of various disciplines, to help them find each other and create better tools.

Table of Contents

Subpurpose: A Technical Detour

Blob o' Dots

Given the state of technology and standards today, what could it mean to "browse database designs"? What sort of file would you view and/or download, and what would you like to be able to do with it?

For viewing purposes, most people like a picture, a diagram of the database tables and their relationships. Pictures are, as Confucius never said1, worth a thousand words in this regard. A picture of a database is very helpful for human beings who want to understand the design. For implementation, though, we need a way to communicate the picture to the database. Later on (or, sometimes, earlier on), it's nice to have a way to convert existing database designs into pictures. We need, in other words, an intelligent picture, one that can be created from words and converted to words. We need it, but it doesn't exist. Not yet, not quite.

It's hardly news that file formats for pictures abound on the web. JPEG, PNG, you name it, the web does graphics. What these formats have in common is that they treat the picture as a block of dots. To the extent that they can be said to contain information, that information is only useful to the program that displays the picture. They are to intelligent diagrams as a fax is to email: Understandable and useful to the human being, but intractable, just a blob of dots.

Enter Dia

The great thing about standards, as some wag said, is that there are so many to pick from. Pictures have not stopped at the blob-of-dots dead end. The W3C developed Scalar Vector Graphics (SVG), an XML derivative. And there's a little-heralded part of the Gnome Office called Dia. Dia is a diagram editor and, like the rest of the Gnome Office, it stores its files in XML format. Dia shapes can be described with SVG, and the whole Dia diagram is stored as XML according to a freely available DTD. In a market defined as "free software to create diagrams stored in a published XML format", Dia is alone. It has no competition.

Dia is short for "Diagram". It might as well stand for "Data In Action". A blob of dots it's certainly not. Dia makes intelligent pictures.

Dia++

It didn't take long (nothing does, in this environment) before someone came up with a way to convert a Dia diagram to SQL. Someone else created a way to convert Perl and C++ to UML Dia diagrams. As of this writing, work is underway to create links in Dia files, to connect them together.

The efforts to date to connect Dia diagrams to database servers have (as their authors acknowledge) been quick hacks. They focus on a subset if one or another flavor of SQL, to get a job done. No criticism here: these programs are Good Things. But their authors are working in isolation without a framework and without any way to build on one another's work.

To be a full-fledged database diagramming tool, Dia needs:

Dia++ is an idea, not a separate project, to make Dia a better database diagramming tool. That means improvements to Dia itself, and to the surrounding tools and glue.

Current Interim Subpurpose of Schema Mania

For Schema Mania to become what it was conceived as, Dia (or something like it, but there is nothing else like it) is a sine qua non. But Dia needs more database intelligence, and it needs more and better conversion tools. In a bootstrappy way, it became Schema Mania's immediate purpose to do what it can to facilitate these events, to convert hope into reality.

Schema Mania aims to coordinate links and information about the disparate efforts to bring Dia to databases. If you know of such an effort, send me a link and a note about what you think about it. I'll add it to the collection.

Meta Architecture

I never meta architecture I didn't like?

What does Schema Mania need to fulfill its purpose? Free tools and free thinking.

Except for the first item, software already exists to do some part of each of these things. But what is the first item doing there, and why isn't it just a Dia file?

By the way, "databases" means relational databases. Maybe object-oriented databases or some other kind of database could be represented in Dia and within the meta-architecture described here, but the author doesn't have any experience with such things and consequently can't say much about them. At this point in the history of technology, relational databases are widespread, well-understood, entrenched, and unchallenged. The envisioned population of Schema Maniacs have relational designs to use and share. No offense intended to enthusiasts of more enlightened database architectures.

Schema in XML

Database designs have no accepted XML DTD representation2, and Schema Mania needs one. Why not just use Dia's DTD and go home? Dia diagrams of necessity contain a lot of non-database information (color and location of objects, for example) and don't have any place for database rules (cardinality of a relationship, for example). After all, Dia is a general purpose diagramming tool. There's no assurance that an arbitrary Dia file contains anything resembling a database diagram. It could as easily be a circuit diagram or organization chart.

The would-be tool writer, who'd like to apply a Dia diagram to, say, a MySQL database, has to master the Dia DTD, apply a whole bunch of extra-Dia sanity checks to make sure the diagram objects are even vaguely database objects, before he can begin transforming the diagram to SQL. He has another hurdle, too: each target database vendor has a different SQL flavor.

The Dia-to-SQL tool thus has to solve at least two problems:

  1. Extract the Dia objects and validate them according to the strictures of SQL.
  2. Convert the extracted objects to a particular flavor of SQL
  3. (Optional) Using the target database server's communication library, connect to the server and apply the generated SQL.

Now suppose there were an accepted XML DTD for relational database schema. If you're familiar with XML, you know that there's a standard in the XML framework for converting one XML document to another: XLST. All the rules of step #1 could be embedded in an XLST and processed with any program that understands XLSTs and DTDs. The operation is not really specific to SQL or even Dia. The challenge moves from parsing Dia files to defining rules in XLST. From code to data, in other words. A lot of people would say that's a Good Thing.

Step #1 is technology independent; it doesn't matter whose database server you use, there are still tables and keys and such. Step #2 involves a particular flavor of SQL, as defined by a particular version of a database server, and as such will have many implementations. Step #2 is obviously much easier to implement if step #1 is complete. That's the primary argument, in the Dia-To-SQL context, favoring an XML DTD for relational database designs.

This two step approach is not just a good idea; it's also been implemented once as a student project. Dominique Rohard wrote XSLT files and converted this diagram to this SQL using a generic XSLT processor. Dominique is French, which has fettered our communication, but it's still a wonderful demonstration.

There is a standard that might be applicable: XMI. XMI is a meta-model interchange format being promoted by IBM and others as a way to exchange UML and MOF diagrams. It employs some extensions to XML (as this writer understands it). Because XMI can hold information about object models generally, it's a bit heavyweight for holding simple relational models. But it might suit our purpose.

Current Tools

What have we got to work with, today? Below is a list of tools and projects I know about. If you know of others, please send them along

First of all, there's Dia herself
Dia can be used to draw many different kinds of diagrams. It currently has special objects to help draw entity relationship diagrams, UML diagrams, flowcharts, network diagrams, and simple circuits.
Dia SQL export plug-in
Currently supports only UML class diagrams, and can generate only plain SQL CREATE TABLE statements.
dia2code
Generates C++ and Java code from an UML Dia Diagram.
AutoDia
Creates Dia UML Diagrams from source code. Does not read SQL; included because it demonstrates possibilities.
Alzabo
Alzabo is a dual-purpose project. Its first goal is to be a data modeller, written in Perl. Its second goal is to use the data model created as the basis on an OO-RDBMS mapping.

Resources

Ronald Bourret
is a freelance programmer, writer, and XML researcher, specializing in databases and schemas. His article Mapping DTDs to Databases is an especially good guide to the whole subject.
The XMI specification and appendix
The XML Metadata Interchange Format (XMI) specifies a meta-data DTD.

Footnotes

  1. The ancient proverb, "A picture is worth a thousand words" is neither ancient nor proverbial. It was an advertising slogan for baking powder in the early 20th century. Many Chinese streetcars carried this version of the ad.
  2. To my knowledge, there's no XML standard for relational database schema representation. There are many standards for representing relational data, and there are such things as XML Schema to define rules for data in XML documents. If there's a DTD out there being used by more than a few people that represents a database design (independent of database or tool), I'd very much like to know about it.

About the author. Not much to say, really. If you know me, you might find my helter-skelter home page handy (or not). By day I work as a quantitative analyst; Schema Mania is strictly a hobby. Well, it's also a passion, because hobbies should be passions, after all.

Technical notes

I advocate the use and credo of Free Software.
The web site you have come to runs on NetBSD,
was edited with Bluefish and Nedit,
and receives mail with the venerable sendmail system.

Flawless DSL service provided by Speakeasy.net, an ISP that, unlike most, provides support for customers who want to run their own web sites.

top