The Standard Mp3 Database Project - Documentation

Tom von Schwerdtner

Revision History
Revision 0.012001-08-01Revised by: TGvS
Initial release.

This document is copyrighted (c) 2001 Tom von Schwerdtner and is distributed under the GNU Free Documentation License, stated below.

No liability for the contents of this document can be accepted. Use the concepts, examples and other content at your own risk. As this is a new edition of this document, there may be errors and inaccuracies, that may of course be damaging to your system. Proceed with caution, and although this is highly unlikely, the author(s) do not take any responsibility for that.

All copyrights are held by their respective owners, unless specifically noted otherwise. Use of a term in this document should not be regarded as affecting the validity of any trademark or service mark.

Naming of particular products or brands should not be seen as endorsements.

You are strongly recommended to take a backup of your system before major installation and backups at regular intervals.

Table of Contents
1. Introduction
2. The SMDP Database
2.1. Database Introduction
2.2. The Design of the SMDP Database
2.2.1. Central Tableset
2.2.2. The feature and feature_type Tables
2.2.3. Levels of Data Storage
2.3. Database Specific Variations
3.1. API Introduction
3.2. The Design of the SMDP API
3.3. Using the C API
3.4. Using the C++ API
3.5. Using the Java API
3.6. Using the PHP API
3.7. Using the Perl API
A.2. Optional ERD
B. feature_type information
C. Correlation Identifiers
D. Installing the SMDP
D.1. Setting up the Database
D.1.1. Installing a release
D.1.2. Installing from CVS
E. Utility Programs
E.1. dia2sql
E.1.1. Description
E.1.2. Requirements
E.1.3. Usage


A few years ago I went to one of those "Computer Expo and Sale" things that you always hear about on TV (the ones with the monster truck esque "sunday Sunday SUNDAY!" announcer that really speaks volumes about what the advertising people think of the target audience). I had never been to one of these shows and my only reason for going this time was to pick up a very large hard drive for a very small amount of money.

After a few hours of very careful comparison shopping, I settled my eyes on a 20 gig drive priced at a modest $150 or so (roughly the exact same price I would have payed for it anywhere else). The size of the drive was adequate for my purposes, since all I wanted to do was to pack it full of mp3's for my new "home-stereo" Linux box I had just setup, and you can fit somewhere around 10,000 mp3's on a 20 gig drive without any problems, so I was fairly pleased with purchase.

At any rate, I took the drive home and quickly began ripping every single CD I owned onto it, and for a month or two it served me well as a fairly large repository of mp3's. Then, one sad day while I was trying out some "experimental" software (namely the BeOS demo for Linux), the entire drive was hopelessly corrupted beyond the reach of fsck or any other diagnostic software I could get my hands on. Needless to say, I was not happy. I ranted and raved for a while. I posted to newsgroups and mailing lists trying to find ways to blame anyone but myself for using software that was very clearly labeled "alpha" and "100% totally and completely unsupported, use at your own risk". Then, I just sat there, and after a few hours, I went on with my life and totally forgot about the drive.

About a year passed, maybe more, maybe less, and one faithful day a few months ago I randomly ran fsck on the drive for kicks, and because I had upgraded things quite a few times since I had last tried anything on the drive. Well, to my total shock, fsck started running normally on the drive and, where previously it had very slowly spat out I/O errorrs, it quickly spat out less serious but still not-kosher errors about inodes and last-mod times, and after about 20 minutes of checking, it finished up and I was able to mount the drive. Things were still a mess on the drive since the directory strucute was trashed and all the files had archaic names in the lost+found directory, but after a little reorganizing, I had access to my beloved mp3 collection again.

Since I suddenly had a reasonable collection of mp3's again, I really wanted a good way to use and maintain them. I did a little research on mp3 players and collection managers and came to the conclusion that one of the SQL based mp3 players was the best solution. I downloaded a few of these apps and after trying them out I became somewhat irritated at the amount of effort it took to setup a database, index it and load it up into an app just to find out that I really didn't like the app all that much because it lacked some feature I wanted. I was not really suprised since most of the non-SQL mp3 players out there don't do everything I want them to do, but since they all interact directly with the mp3 files, there is no problem with using one app to create playslits, another to listen to them, another to tag them, and a another to rename the files based on the tags. The problem with the SQL based apps is that they all used thier own custom datbase designs, so one app could not use the database created by a seperate app. So, my little brain started to churn out some smoke and I decided that it would be very cool if there was some sort of standard database for storing mp3's that all these apps used, hence the creation of the Standard Mp3 Database Proejct.

At any rate, the SMDP wont change the world, in fact, most people will never even know it exists, but it has so far been serving to entertain a small handful of coders around the world. All I really wanted out of this project was a little fun, and with some luck a useful end product, and while the use of the ned product has yet to be determined, I have certainly enjoyed the experience to date. So, I hope you like it.

- Tom

Chapter 1. Introduction

The Standard Mp3 Database Project (SMDP) aims to make mp3 management easier for both the developer and the end-user. This document servers to assist developers in taking full advantage of the SMDP database and API, while cluing in the users as to the extent of the possibilities --FIXME--

Chapter 2. The SMDP Database

2.1. Database Introduction

In this section we will discuss some of the the ideas and reasoning that went into the design of the SMDP database. We will also discuss the varios diffrences you will encounter when using diffrent RDBMS's with the SMDP.

This section is not meant to be a tutorial or HOWTO on using databases, as some initial knowledge is required to understand the concepts discussed.

2.2. The Design of the SMDP Database

The SMDP database is designed to allow for a very dynamic set of information relating to mp3 and ogg files. The basic structure covers the usual entities in a traditional relational model. This is then complemented by a set of tables essentially composing a directory-like storage structure, geared toward both specific attribute classes as well as a completely "free" relations table - ie where the type of the objects are specified on both sides of the relation.

This provides - in addition to the usual information such as artist name, song title, album name and so on - for space to define unique information as well as to establish relationships betewwen any two objects in the database.

Anyways, enough of the hot on for more information.

2.2.1. Central Tableset

The central tableset is composed of the song, artist, collection (album), file, and media (filesystem) tables. Aside from the feature tables (Section 2.2.2), these will be the most frequently queried tables in the database.

2.2.2. The feature and feature_type Tables

The feature tables serve to store specific information relating to the central tableset. In turn, the feature tables all point to 'tag' information stored in the feature_type table.

For each of the main objects in the database, there exists a companion 'feature' table. The 'feature' tables serves to further describe the records in the object table.

See Appendix B for information on translating Id3 tags and Ogg comments into SMDP records.

2.2.3. Levels of Data Storage

To allow some felxability in the size and complexity of the database, there will be a number of storage options available. The details have not been hashed out yet, but there will most likely be option-specific settings as well as more generalized settings that take care of the specific options. Some of the options may include:

store_binary. [true|false] Determines if binary information from tags is actually stored in the database or if the location and type of information is stored (but not the binary data)

use_musicbrainz. [true|false] Weather or not to store MusicBrainz information in the database.

use_freedb. [true|false] Weather or not to store FreeDB information in the database.

Chapter 3. The SMDP API

3.1. API Introduction

API Introduction goes here

3.2. The Design of the SMDP API

API Introduction goes here

3.3. Using the C API

API Introduction goes here

3.4. Using the C++ API

API Introduction goes here

3.5. Using the Java API

You can find the javadoc files for the java api at

3.6. Using the PHP API

API Introduction goes here

3.7. Using the Perl API

API Introduction goes here

Appendix A. SMDP ERD




Figure A-1. SMDP ERD

ERD of the core SMDP Database

A.2. Optional ERD


Figure A-2. Optional ERD

ERD of the optional SMDP Database

Appendix B. feature_type information

Up-to-date ID3 Frame Identfiers can be found at the official id3 website.

Table B-1. feature_type information

IdFIXME-labelDescriptionSMDP Record
 ID3FN_TEXTENCText encoding (unicode or ASCII) 
 ID3FN_TEXTText field 
 ID3FN_DATAData field 
 ID3FN_DESCRIPTIONDescription field 
 ID3FN_OWNEROwner field 
 ID3FN_EMAILEmail field 
 ID3FN_RATINGRating field 
 ID3FN_FILENAMEFilename field 
 ID3FN_LANGUAGELanguage field 
 ID3FN_PICTURETYPEPicture type field 
 ID3FN_IMAGEFORMATImage format field 
 ID3FN_MIMETYPEMimetype field 
 ID3FN_COUNTERCounter field 
 ID3FN_IDIdentifier/Symbol field 
 ID3FN_VOLUMEADJVolume adjustment field 
 ID3FN_NUMBITSNumber of bits field 
 ID3FN_VOLCHGRIGHTVolume chage on the right channel 
 ID3FN_VOLCHGLEFTVolume chage on the left channel 
 ID3FN_PEAKVOLRIGHTPeak volume on the right channel 
 ID3FN_PEAKVOLLEFTPeak volume on the left channel 
 ID3FN_LASTFIELDIDLast field placeholder 
 ID3FID_NOFRAMENo known frame 
 ID3FID_AUDIOCRYPTOAudio encryption 
 ID3FID_PICTUREAttached picture 
 ID3FID_COMMERCIALCommercial frame 
 ID3FID_CRYPTOREGEncryption method registration 
 ID3FID_EVENTTIMINGEvent timing codes 
 ID3FID_GENERALOBJECTGeneral encapsulated object 
 ID3FID_GROUPINGREGGroup identification registration 
 ID3FID_INVOLVEDPEOPLEInvolved people list 
 ID3FID_LINKEDINFOLinked information 
 ID3FID_CDIDMusic CD identifier 
 ID3FID_MPEGLOOKUPMPEG location lookup table 
 ID3FID_OWNERSHIPOwnership frame 
 ID3FID_PRIVATEPrivate frame 
 ID3FID_POSITIONSYNCPosition synchronisation frame 
 ID3FID_BUFFERSIZERecommended buffer size 
 ID3FID_VOLUMEADJRelative volume adjustment 
 ID3FID_SYNCEDLYRICSSynchronized lyric/text 
 ID3FID_SYNCEDTEMPOSynchronized tempo codes 
 ID3FID_ALBUMAlbum/Movie/Show title 
 ID3FID_BPMBPM (beats per minute) 
 ID3FID_COPYRIGHTCopyright message 
 ID3FID_LYRICISTLyricist/Text writer 
 ID3FID_CONTENTGROUPContent group description 
 ID3FID_TITLETitle/songname/content description 
 ID3FID_SUBTITLESubtitle/Description refinement 
 ID3FID_ORIGALBUMOriginal album/movie/show title 
 ID3FID_ORIGFILENAMEOriginal filename 
 ID3FID_ORIGLYRICISTOriginal lyricist(s)/text writer(s) 
 ID3FID_ORIGARTISTOriginal artist(s)/performer(s) 
 ID3FID_ORIGYEAROriginal release year 
 ID3FID_FILEOWNERFile owner/licensee 
 ID3FID_LEADARTISTLead performer(s)/Soloist(s) 
 ID3FID_CONDUCTORConductor/performer refinement 
 ID3FID_MIXARTISTInterpreted, remixed, or otherwise modified by 
 ID3FID_PARTINSETPart of a set 
 ID3FID_TRACKNUMTrack number/Position in set 
 ID3FID_NETRADIOSTATIONInternet radio station name 
 ID3FID_NETRADIOOWNERInternet radio station owner 
 ID3FID_ISRCISRC (international standard recording code) 
 ID3FID_ENCODERSETTINGSSoftware/Hardware and settings used for encoding 
 ID3FID_USERTEXTUser defined text information 
 ID3FID_UNIQUEFILEIDUnique file identifier 
 ID3FID_UNSYNCEDLYRICSUnsynchronized lyric/text transcription 
 ID3FID_WWWCOMMERCIALINFOCommercial information 
 ID3FID_WWWCOPYRIGHTCopyright/Legal infromation 
 ID3FID_WWWAUDIOFILEOfficial audio file webpage 
 ID3FID_WWWARTISTOfficial artist/performer webpage 
 ID3FID_WWWAUDIOSOURCEOfficial audio source webpage 
 ID3FID_WWWRADIOPAGEOfficial internet radio station homepage 
 ID3FID_WWWPUBLISHEROfficial publisher webpage 
 ID3FID_WWWUSERUser defined URL link 
 ID3FID_METACRYPTOEncrypted meta frame (id3v2.2.x) 
 ID3FID_METACOMPRESSIONCompressed meta frame (id3v2.2.1) 
IdFIXME-labelDescriptionSMDP Record

Appendix C. Correlation Identifiers

If you wish to add identifiers, please contact the SMDP developers first to prevent overlapping values.

Table C-1. Correlation Identifiers


Appendix D. Installing the SMDP

FIXME - some introductory nonsense

D.1. Setting up the Database

Untill there are packages available, you will have to install the schema from the released tarballs or from a CVS checkout.

D.1.1. Installing a release

Coming very soon

D.1.2. Installing from CVS

 $ cd /path/to/schema/dir/
 $ autoconf
 $ ./configure
 loading cache ./config.cache
 checking whether make sets ${MAKE}... (cached) yes
 checking for perl... (cached) /usr/bin/perl
 checking for mysql... (cached) /usr/bin/mysql
 checking for psql... no
 checking for cat... (cached) /bin/cat
 creating ./config.status
 creating Makefile

 $ make

   make schema          Build sql-scripts for production-systems
   make schema-devel    Build sql-scripts for developement
   make csv             Build csv files
   make db-init         Initialize a database
   make db-init-devel   Initialize a database with devel-mode

 $ make db-init

Appendix E. Utility Programs


E.1. dia2sql

E.1.1. Description

dia2sql generates sql scripts from native Dia [1] files.

E.1.2. Requirements

In order to use dia2sql, you need:

  • Perl: any version greater 5.00x will work.

  • The expat library, you can find it here:

  • XML::Parser from CPAN, you can install it via the CPAN shell (at least on *nix systems you can :)

Technically, you dont have to have Dia, you just need a Dia file to parse. Make sure, to save the dia-files uncompressed.

E.1.3. Usage v1.2 smdp-pl7 - \$Revision: 1.2 $ 
original author: 2001 by Alexander Troppmann
heavily patched by the SMDP-Group

Usage: [options] infile.dia [outfile.sql]
Options may be:   
  -h                    print this help.
  --mysql               create SQL for mysql
  --postgresql          create SQL for postgresql
  --mssql               create SQL for ms-sql-server
  --debug               enter debug-mode
  --with-comments       write comments
  --with-create         write create statements for tables
  --with-html-output    writes print-statements with html-stuff
                        into sql-script. Only useful for 
                        database-parsers that will display that
                        like isql.exe from MS-SQL Server.
  --absolute-clean      write drop/create statements for the 
                        database and all tables. use it wisely!
  --create-indices      create indeces (postgres-only)
  --convert             convert datatypes to fit target db
                        name of the database to be used for the 
                        create-statement if not given, defaults
                        to smdp.
  --datatype=<file>     file needs to hold the mapping of the
                        datatypes. (defaults to datatypes.txt)
  --input=<input-type>  Datatypes in input belong to 
                        valid values:   mysql   (for mysql)
                                        mssql   (for ms-sql)
                                        pgsql   (for postgresql)

Converts xml data input from Dia to sql statements. If outfile.sql is not
specified the sql statements will be printed to STDOUT.

Edit and change the configuration at top of the Perl script.
Make sure you have defined the right database (MySQL or PostgreSQL) for
SQL output.



Dia is a gtk+ based diagram creation program released under the GPL license. See for more information.