DataDraw - datadraw.sourceforge.net

DataDraw3.0

Unleashed Speed and Power...

"Beer is proof that God loves us, and wants us to be happy" - Ben Franklin

Revving up C Applications with DataDraw Databases

DataDraw is an ultra-fast persistent database for high-performance programs written in C. It's so fast that many programs keep all their data in a DataDraw database, even data manipulated in the inner loops of compute-intensive applications. Unlike slow SQL databases, DataDraw databases are compiled and link directly into your C programs. DataDraw databases are resident in memory, and manipulating their data is even faster than if it were stored in native C data structures (really). Further, they can automatically support infinite undo/redo, greatly simplifying many applications.
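To make the undo/redo idea concrete, here is a minimal sketch of one common way to get it: journal every write with its old and new value, then replay the journal backwards or forwards. This page does not describe how DataDraw implements undo/redo, so every name here (`Journal`, `journalWrite`, `journalUndo`, `journalRedo`) is hypothetical, and the history is "infinite" only in the sense that it is bounded by available memory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch of a write journal; not DataDraw's implementation. */
typedef struct {
    uint32_t index;     /* which element was written */
    int32_t oldValue;   /* value before the write */
    int32_t newValue;   /* value after the write */
} Change;

typedef struct {
    int32_t *data;        /* the field array being edited (owned by caller) */
    Change *log;          /* change records, oldest first */
    size_t logUsed;       /* records currently applied */
    size_t logTop;        /* valid records; exceeds logUsed after an undo */
    size_t logAllocated;  /* capacity of log */
} Journal;

void journalInit(Journal *j, int32_t *data) {
    j->data = data;
    j->log = NULL;
    j->logUsed = j->logTop = j->logAllocated = 0;
}

void journalFree(Journal *j) {
    free(j->log);
    j->log = NULL;
    j->logUsed = j->logTop = j->logAllocated = 0;
}

/* Record and apply a write. Returns 0 on success, -1 if out of memory. */
int journalWrite(Journal *j, uint32_t index, int32_t value) {
    if (j->logUsed == j->logAllocated) {
        size_t newSize = j->logAllocated ? 2 * j->logAllocated : 16;
        Change *grown = realloc(j->log, newSize * sizeof *grown);
        if (grown == NULL) return -1;
        j->log = grown;
        j->logAllocated = newSize;
    }
    Change *c = &j->log[j->logUsed];
    c->index = index;
    c->oldValue = j->data[index];
    c->newValue = value;
    j->data[index] = value;
    j->logUsed++;
    j->logTop = j->logUsed;   /* a new write discards any redo tail */
    return 0;
}

/* Revert the most recent applied write. Returns 1 if something was undone. */
int journalUndo(Journal *j) {
    if (j->logUsed == 0) return 0;
    j->logUsed--;
    j->data[j->log[j->logUsed].index] = j->log[j->logUsed].oldValue;
    return 1;
}

/* Reapply the most recently undone write. Returns 1 if something was redone. */
int journalRedo(Journal *j) {
    if (j->logUsed == j->logTop) return 0;
    j->data[j->log[j->logUsed].index] = j->log[j->logUsed].newValue;
    j->logUsed++;
    return 1;
}
```

The application only ever calls `journalWrite`; undo and redo then come for free, which is the kind of simplification the paragraph above refers to.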

DataDraw databases can be persistent. Modifications to persistent data are written to disk as they are made, which of course dramatically slows write times. However, DataDraw databases can also be volatile. Volatile databases exist only in memory, and only for as long as your program needs them. Volatile databases can be directly manipulated faster than C structures, since data is better organized in memory to optimize cache performance.

DataDraw supports modular design. An application can have one or more common persistent databases, and multiple volatile databases to support various tools' data structures. Classes in a tool's database can extend classes in the common database.

DataDraw is also 64-bit optimized, allowing programs to run much faster and in less memory than standard C programs using 64-bit pointers. This is because DataDraw databases support over 4 billion objects of a given class with 32-bit object references.
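The idea behind 32-bit object references can be sketched in plain C. This is an illustration only, not the code DataDraw generates, and every name in it (`NodeRef`, `NodeTable`, `nodeCreate`, `nodeListSum`) is hypothetical. Each field of a class lives in its own array, and an object is just a 32-bit index into those arrays, so a link costs 4 bytes instead of the 8 bytes of a 64-bit pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout: this shows the layout idea only. */
typedef uint32_t NodeRef;      /* 32-bit object reference (an array index) */
#define NODE_NULL UINT32_MAX   /* reserved "no object" value */

/* One array per field; node i is described by next[i] and value[i]. */
typedef struct {
    NodeRef *next;       /* successor of each node */
    int32_t *value;      /* payload of each node */
    uint32_t used;       /* number of nodes created so far */
    uint32_t allocated;  /* capacity of both arrays */
} NodeTable;

/* Allocate room for `capacity` nodes. Returns 0 on success, -1 on failure. */
int nodeTableInit(NodeTable *t, uint32_t capacity) {
    t->next = malloc(capacity * sizeof *t->next);
    t->value = malloc(capacity * sizeof *t->value);
    t->used = 0;
    t->allocated = capacity;
    if (t->next == NULL || t->value == NULL) {
        free(t->next);
        free(t->value);
        t->next = NULL;
        t->value = NULL;
        t->allocated = 0;
        return -1;
    }
    return 0;
}

void nodeTableFree(NodeTable *t) {
    free(t->next);
    free(t->value);
    t->next = NULL;
    t->value = NULL;
    t->used = t->allocated = 0;
}

/* Create a node; returns NODE_NULL when the fixed-size table is full. */
NodeRef nodeCreate(NodeTable *t, int32_t value, NodeRef next) {
    if (t->used == t->allocated) return NODE_NULL;
    NodeRef n = t->used++;
    t->value[n] = value;
    t->next[n] = next;
    return n;
}

/* Walk a linked list by following 32-bit indices instead of pointers. */
int64_t nodeListSum(const NodeTable *t, NodeRef head) {
    int64_t sum = 0;
    for (NodeRef n = head; n != NODE_NULL; n = t->next[n]) {
        sum += t->value[n];
    }
    return sum;
}
```

In this sketch a node costs 8 bytes (a 4-byte reference plus a 4-byte value), whereas the equivalent pointer-linked struct typically costs 16 bytes on a 64-bit platform once alignment padding is counted. Keeping each field in its own contiguous array also means a loop that touches only one field pulls less unrelated data into the cache.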

DataDraw is released under the GNU Library General Public License, Version 2. It costs you nothing to use, and does not restrict your application in any way. Only the DataDraw program itself is covered by the license. There is detailed documentation on DataDraw3.0, in OpenOffice open-document format. For those who don't have a recent OpenOffice installation, here's a PDF version. If you have questions, feel free to contact Bill Cox, or post to the mailing list.

Use Cases

If your application is 99% GUI, and 1% data manipulation, don't use DataDraw, because that 1% isn't worth automating. If you need to write a CGI application for the Apache web server with a MySQL back-end, don't use DataDraw, because the speed DataDraw gives your application will be wasted. If you don't use data structures more complex than a tree, don't use DataDraw, because there will be little for DataDraw to automate. Use DataDraw when you need speed, efficiency, and/or rich data structures. Use it for the simplicity it brings to your project, and for its automated debugging, persistence, and undo/redo capabilities.

DataDraw is extensively used in EDA tool development, where speed is critical and data structures are complex. It has, for example, been used in technology mappers, circuit simulators (both analog and digital), placers, and routers. DataDraw has been in use in EDA since 1992, and has matured greatly over that time. DataDraw has also been used in compiler development.

Internet servers also benefit from DataDraw. A DataDraw-backed application can process 100X to 1000X more transactions per second than a LAMP-based application. This makes DataDraw a good choice for SIP servers, BitTorrent, and other applications supporting thousands of simultaneous connections. Embedded web servers could also benefit from DataDraw's small memory footprint, power-efficient data manipulation, and ultra-high speed. Telephony applications and other CPU-intensive tasks are potentially a good fit. Editors of all kinds are a good fit with DataDraw, because of its infinite undo/redo automation.

When to use DataDraw vs MySQL and PHP

LAMP is a very powerful combination for creating web applications: Linux, Apache, MySQL, and PHP. Apache provides an incredibly powerful framework built around a world-class web server. PHP provides a powerful language for developing web applications rapidly. MySQL provides a way for these web applications to manage data. DataDraw is not meant to replace any of this. However, Apache is bloated, PHP is a slow interpreted language, and MySQL interprets ASCII commands that it reads through sockets that communicate with PHP. All this slows the system down 100-1000X, relative to plain old C code. Most applications don't care: if I'm just trying to sell stuff over the Internet, being able to process even one transaction per second is probably fine.

DataDraw is for demanding applications for which LAMP is too slow and/or bloated. While running, a DataDraw application owns the database, and does not share it with others. That makes it well suited for implementing some tasks, and not others. For example, it is well suited for building SQL servers, BitTorrent trackers, and embedded servers, but not well suited for Apache modules. For the tasks it does suit, consider embedding both DataDraw and a free, fast, tiny HTTP server, such as the MiniWeb HTTP server, directly in your application. This will allow you to serve many times more requests per second, in far less memory.

Benchmarks vs. C++/STL

DataDraw has been extensively benchmarked against the #1 rival for EDA software development: C++ using the Standard Template Library. In short, DataDraw smokes C++/STL. In simple depth-first graph traversal, the graph_benchmark example shows DataDraw-based code runs over 15X faster than the C++/STL version. It also uses less than half the memory when compiled in 64-bit mode. Check out the examples directory for current benchmarks. The difference in runtime is mostly due to L2 data cache miss rates, which are 16.6X lower for the DataDraw version.

The DataDraw code also runs 7X faster than the C version!

	L1 Miss Rate	L2 Miss Rate	Run Time	Memory
DataDraw	3.8%	1.3%	7.83s	432MB
C++/STL	32.9%	21.7%	124.6s	1.1GB
Raw C	34.4%	20.4%	55.2s	702MB
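The headline multipliers follow directly from the table. As a quick check, the small sketch below copies the run times and L2 miss rates from the table and divides them; the struct and function names (`BenchRow`, `ratio`) are hypothetical helpers, not part of DataDraw or its examples directory.

```c
#include <assert.h>

/* Figures copied from the benchmark table above. */
typedef struct {
    const char *name;
    double l2MissPercent;  /* L2 miss rate, in percent */
    double runSeconds;     /* run time, in seconds */
} BenchRow;

const BenchRow kDataDraw = {"DataDraw", 1.3, 7.83};
const BenchRow kCppStl   = {"C++/STL", 21.7, 124.6};
const BenchRow kRawC     = {"Raw C", 20.4, 55.2};

/* How many times larger `other` is than `base`. */
double ratio(double other, double base) {
    assert(base > 0.0);
    return other / base;
}
```

With these figures, `ratio(kCppStl.runSeconds, kDataDraw.runSeconds)` is about 15.9, `ratio(kRawC.runSeconds, kDataDraw.runSeconds)` is about 7.05, and `ratio(kCppStl.l2MissPercent, kDataDraw.l2MissPercent)` is about 16.7, which matches the "over 15X", "7X", and roughly 16.6X figures quoted in the text.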

Installation

DataDraw3.0 is under heavy development, so it is wise to download and compile it directly from source. Check it out with Subversion like this:

$ svn co https://datadraw.svn.sourceforge.net/svnroot/datadraw/trunk datadraw

Alternatively, you are welcome to download a recent source tarball from here. Once you have a datadraw directory, cd into it and type:

$ ./autogen
$ ./configure
$ make
$ su
$ make install

This should create a 'datadraw' executable, and install it in /usr/local/bin. You can specify another destination with the --prefix=<dir> option to configure.

Open-Source Projects Using DataDraw

There are a few fairly well developed open-source applications using DataDraw, in addition to at least three commercial EDA companies using it for high-performance EDA tool development. The stable open-source projects are:

DataDraw (this project)
gnetman - a netlist translation tool, compatible with gEDA
BTSlave - a BitTorrent client
NetFS - a BitTorrent replacement

Credits

DataDraw has a long history, dating back to 1992, when Bill Cox wrote version 1.0 and placed it into the copyleft domain. Since then, others have contributed to version 2.0, including Bill Falk, Mukesh Lulla, Brad Pirtle, and John Maushammer.

Copyleft 2006 All rights approved.