This is Part II of an essay about my water collection, specifically the process of putting it on the Web. Part I describes the whys and hows of the collection. Part III is a summary, provides opportunities for future study, and links to the source code.
Making the water available on the Web
As a librarian, I am interested in providing access to my collection(s). As a librarian who knows how to exploit computers, I am especially interested in putting my collection(s) on the Web. Unfortunately, that process is not as easy as the collecting itself, and the implementation has gone through a number of iterations along the way. When I was really into HyperCard I created a “stack” complete with pictures of my water, short descriptions, and an automatic slide-show feature that played the sound of running water in the background. (If somebody asks, I will dig up this dinosaur and make it available.) Later I created a FileMaker Pro database of the collection, but it wasn’t as cool as the HyperCard implementation.
The current implementation is more modern. It takes advantage of quite a number of technologies, including:
- a relational database
- a set of PHP scripts that do input/output against the database
- an image processor to create thumbnail images
- an XSL processor to generate a browsable Web presence
- the Google Maps API to display content on a world map
The use of each of these technologies is described in the following sections.
Relational database
Since 2002 I have been adding newly acquired waters to a relational (MySQL) database and maintaining them there. (Someday I hope to get the older waters out of those cardboard boxes and add them to the database too. Someday.) The database itself is rather simple, consisting of four tables: one for the waters, one for the collectors, a join table denoting who collected what, and a metadata table consisting of a single record describing the collection as a whole. The entity-relationship diagram illustrates the structure of the database in greater detail.
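Based on the field names that appear in the report-writing code later in this post, the schema can be sketched roughly as follows. This is only a sketch: the column types, and the columns of the metadata table in particular, are my assumptions, not the actual DDL.

```sql
-- a sketch of the four tables; column names come from the PHP code,
-- column types are assumptions
CREATE TABLE waters (
  water_id    INT AUTO_INCREMENT PRIMARY KEY,
  name        VARCHAR(255),
  lat         DOUBLE,
  lng         DOUBLE,
  year        SMALLINT,
  month       TINYINT,
  day         TINYINT,
  description TEXT,
  image       MEDIUMBLOB   -- the actual JPEG bytes live in the row
);

CREATE TABLE collectors (
  collector_id INT AUTO_INCREMENT PRIMARY KEY,
  first_name   VARCHAR(64),
  last_name    VARCHAR(64)
);

-- the join table: who collected what
CREATE TABLE items_for_collectors (
  water_id     INT,
  collector_id INT
);

-- a single record describing the collection as a whole (columns guessed)
CREATE TABLE metadata (
  title       VARCHAR(255),
  description TEXT
);
```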
Probably the most interesting technical characteristic of the database is the image field of type mediumblob in the waters table. When it comes to digital libraries and database design, one of the perennial choices is where to save your content. Saving it outside your database keeps the database small, but it forces you to maintain links to the file system or the Internet where the actual content resides. This can be an ongoing maintenance nightmare, and it side-steps rather than solves the preservation issues. On the other hand, inserting your content into the database keeps everything in one place, while “marrying” the content to your database application. Putting the content in the database also allows you to do raw database dumps, making the content more portable and easier to back up. I’ve designed digital library systems both ways, and each approach has its own strengths and weaknesses. This is one of the rarer times I’ve put the content into the database itself. Never have I solely relied on maintaining links to off-site content. Too risky. Instead I’ve more often mirrored content locally and maintained two links in the database: one to the local cache and another to the canonical website.
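The content-in-the-database strategy can be demonstrated in miniature with any relational database. Here is a hypothetical sketch using Python and an in-memory SQLite database rather than the MySQL/PHP of the actual application; the table and values are stand-ins for illustration only.

```python
import sqlite3

# a throw-away, in-memory database standing in for the waters table
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE waters (water_id INTEGER PRIMARY KEY, name TEXT, image BLOB)")

# the "content inside the database" strategy: the image bytes live in the row
image_bytes = b"\xff\xd8\xff\xe0 fake JPEG data"
db.execute("INSERT INTO waters (name, image) VALUES (?, ?)",
           ("Copenhagen canal", image_bytes))

# a raw dump of the table now carries the content with it; there are
# no external links to rot, which is the point of the trade-off above
name, image = db.execute("SELECT name, image FROM waters").fetchone()
assert image == image_bytes
print(name, len(image))
```

A database dump of this table is a complete backup of both metadata and content, which is exactly the portability argument made above.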
PHP scripts for database input/output
Sets of PHP scripts are used to create, maintain, and report against the waters database. Creating and maintaining database records is tedious but not difficult as long as you keep in mind that there are really only four things you need to do with any database: 1) create records, 2) find records, 3) edit records, and 4) delete records. All that is required is to implement each of these processes against each of the fields in each of the tables. Since PHP was designed for the Web, each of these processes is implemented as a Web page only accessible to myself. The following screen shots illustrate the appearance and functionality of the database maintenance process.
High-level menus on the right. Sub-menus and data-entry forms in the middle. Simple. One of the nice things about writing applications for oneself is the fact that you don’t have to worry about usability, just functionality.
The really exciting stuff happens when the reports are written against the database. Both of them are XML files. The first is essentially a database dump — water.xml — complete with the collection’s over-arching metadata record, each of the waters and their metadata, and a list of collectors. The heart of the report-writing process includes:
1. finding all of the records in the database
2. converting and saving each water’s image as a thumbnail
3. initializing the water record
4. finding all of the water’s collectors
5. adding each collector to the record
6. going to Step #5 for each collector
7. finishing the record
8. going to Step #2 for each water
9. saving the resulting XML to the file system
There are two hard parts to this process. The first, “MOGRIFY”, is a shelled-out hack: it calls an ImageMagick utility from the operating system to convert the content of the image field into a thumbnail image. Without this utility, saving the image from the database to the file system would be problematic. Second, the SELECT statement used to find all the collectors associated with a particular water is a bit tricky; not really too difficult, just a typical SQL join, and good practice for learning relational database design. Below is a code snippet illustrating the heart of this report-writing process:
# process every found row
while ($r = mysql_fetch_array($rows)) {

    # get, define, save, and convert the image -- needs error checking
    $image     = stripslashes($r['image']);
    $leafname  = explode(' ', $r['name']);
    $leafname  = $leafname[0] . '-' . $r['water_id'] . '.jpg';
    $original  = ORIGINALS  . '/' . $leafname;
    $thumbnail = THUMBNAILS . '/' . $leafname;
    writeReport($original, $image);
    copy($original, $thumbnail);
    system(MOGRIFY . $thumbnail);

    # initialize and build a water record
    $report .= '<water>';
    $report .= "<name water_id='$r[water_id]' lat='$r[lat]' lng='$r[lng]'>" .
               prepareString($r['name']) . '</name>';
    $report .= '<date_collected>';
    $report .= "<year>$r[year]</year>";
    $report .= "<month>$r[month]</month>";
    $report .= "<day>$r[day]</day>";
    $report .= '</date_collected>';

    # find all the COLLECTORS associated with this water, and...
    $sql = "SELECT c.*
            FROM waters AS w, collectors AS c, items_for_collectors AS i
            WHERE w.water_id = i.water_id
            AND c.collector_id = i.collector_id
            AND w.water_id = $r[water_id]
            ORDER BY c.last_name, c.first_name";
    $all_collectors = mysql_db_query($gDatabase, $sql);
    checkResults();

    # ...process each one of them
    $report .= '<collectors>';
    while ($c = mysql_fetch_array($all_collectors)) {
        $report .= "<collector collector_id='$c[collector_id]'>" .
                   "<first_name>$c[first_name]</first_name>" .
                   "<last_name>$c[last_name]</last_name></collector>";
    }
    $report .= '</collectors>';

    # finish the record
    $report .= '<description>' . stripslashes($r['description']) .
               '</description></water>';
}
The result is the following “WaterML” XML content — a complete description of a water, in this case water from Copenhagen:
<water>
<name water_id='87' lat='55.6889' lng='12.5951'>Canal
surrounding Kastellet, Copenhagen, Denmark
</name>
<date_collected>
<year>2007</year>
<month>8</month>
<day>31</day>
</date_collected>
<collectors>
<collector collector_id='5'>
<first_name>Eric</first_name>
<last_name>Morgan</last_name>
</collector>
</collectors>
<description>I had the opportunity to participate in the
Ticer Digital Library School in Tilburg, The Netherlands.
While I was there I also had the opportunity to visit the
folks at
<a href="http://indexdata.com">Index Data</a>, a company
that writes and supports open source software for libraries.
After my visit I toured around Copenhagen very quickly. I
made it to the castle (Kastellet), but my camera had run out
of batteries. The entire Tilburg, Copenhagen, Amsterdam
adventure was quite informative.
</description>
</water>
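Because WaterML is plain XML, other programs can consume it easily. As a hypothetical illustration (written in Python rather than the PHP of the application itself), the latitude, longitude, and collection date can be pulled out of the record above like so:

```python
import xml.etree.ElementTree as ET

# an abbreviated copy of the WaterML record shown above
waterml = """
<water>
  <name water_id='87' lat='55.6889' lng='12.5951'>Canal
  surrounding Kastellet, Copenhagen, Denmark</name>
  <date_collected><year>2007</year><month>8</month><day>31</day></date_collected>
</water>
"""

water = ET.fromstring(waterml)

# the geographic coordinates are attributes of the name element
name = water.find("name")
lat, lng = float(name.get("lat")), float(name.get("lng"))

# the collection date is a small sub-tree of its own
year = water.findtext("date_collected/year")

print(lat, lng, year)
```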
When I first created this version of the water collection, RSS was just coming online. Consequently I wrote an RSS feed for the waters, but then I got realistic. How many people want an RSS feed of my water? Crazy!
XSL processing
Now that the XML file has been created and the images are saved to the file system, the next step is to make a browser-based interface. This is done through an XSLT style sheet and an XSL processor called Apache2::TomKit.
Apache2::TomKit is probably the most eclectic component of my online water collection application. Designed to be a replacement for another XSL processor called AxKit, Apache2::TomKit enables the developer to create CGI-like applications, complete with HTTP GET parameters, in the form of XML/XSLT combinations. Specify the location of your XML files. Denote what XSLT files to use. Configure what XSLT processor to use. (I use LibXSLT.) Define an optional cache location. Done. The result is on-the-fly XSL transformations that work just like CGI scripts. The hard part is writing the XSLT.
The logic of my XSLT style sheet — waters.xsl — goes like this:
- Get input – There are two inputs: cmd and id. Cmd is used to denote the desired display function; id is used to denote which water to display.
- Initialize output – This is pretty standard stuff. Display XHTML head elements and start the body.
- Branch – Depending on the value of cmd, display the home page, a collectors page, all the images, all the waters, or a specific water.
- Display the content – This is done with the thorough use of XPath expressions.
- Done – Complete the XHTML with a standard footer.
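A skeletal version of that branching logic might look like the fragment below. To be clear, this is my reconstruction, not the actual waters.xsl; the parameter names come from the cmd and id inputs described above, but the template structure and XPath expressions are guesses.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- the two HTTP GET parameters, passed in by the processor -->
  <xsl:param name="cmd" select="'home'"/>
  <xsl:param name="id"/>

  <xsl:template match="/">
    <html>
      <head><title>My Water Collection</title></head>
      <body>
        <!-- branch on the value of cmd -->
        <xsl:choose>
          <xsl:when test="$cmd = 'collectors'">
            <xsl:apply-templates select="//collector"/>
          </xsl:when>
          <xsl:when test="$cmd = 'water'">
            <!-- display a single water, selected by id -->
            <xsl:apply-templates select="//water[name/@water_id = $id]"/>
          </xsl:when>
          <xsl:otherwise>
            <!-- the home page: list every water -->
            <xsl:apply-templates select="//water"/>
          </xsl:otherwise>
        </xsl:choose>
        <!-- standard footer goes here -->
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```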
Of all the XSLT style sheets I’ve written in my career, waters.xsl is definitely the most declarative in nature. This is probably because the waters.xml file is truly data-driven, as opposed to mixed content. The XSLT file is very elegant but challenging for the typical Perl or PHP hacker to quickly grasp.
Once the XML file, the XSLT style sheet, and Apache2::TomKit were integrated, I was able to design URLs such as the following:
Okay. So it’s not very REST-ful; the URLs are not very “cool”. Sue me. I originally designed this in 2002.
Waters and Google Maps
In 2006 I used my water collection to create my first mash-up. It combined latitudes and longitudes with the Google Maps API.
Inserting maps into your Web pages via the Google API is a three-step process: 1) create an XML file containing latitudes and longitudes, 2) insert a call to the Google Maps JavaScript into the head of your HTML, and 3) call the JavaScript from within the body of your HTML.
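Steps #2 and #3 can be sketched with a minimal page like the one below. This is an illustration based on the 2006-era version 2 of the Google Maps API (GMap2, GLatLng, GMarker); the API key is a placeholder, the coordinates are those of the Copenhagen water above, and a real page would load the points from the XML file of Step #1 rather than hard-coding one marker.

```html
<html>
  <head>
    <!-- step 2: load the Google Maps JavaScript (API key is a placeholder) -->
    <script src="http://maps.google.com/maps?file=api&amp;v=2&amp;key=YOUR_KEY"
            type="text/javascript"></script>
    <script type="text/javascript">
      // step 3: center a map and drop a marker for one water
      function load() {
        var map = new GMap2(document.getElementById("map"));
        map.setCenter(new GLatLng(55.6889, 12.5951), 13);  // Kastellet, Copenhagen
        map.addOverlay(new GMarker(new GLatLng(55.6889, 12.5951)));
      }
    </script>
  </head>
  <body onload="load()">
    <div id="map" style="width: 500px; height: 300px"></div>
  </body>
</html>
```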
For me, all I had to do was: 1) create new fields in my database for latitudes and longitudes, 2) go through each record in the database doing latitude and longitude data-entry, 3) write a WaterML file, 4) write an XSLT file transforming the WaterML into the XML format expected by Google Maps, 5) write a CGI script that takes latitudes and longitudes as input, 6) display a map, and 7) create links from my browser-based interface to the maps.
It may sound like a lot of steps, but it is all very logical, and taken bit by bit it is relatively easy. Consequently, I am able to display a world map complete with pointers to all of my water. Conversely, I am able to display a water record and link its location to a map. (I try to get as close to the actual collection point as possible.) The following two screen dumps illustrate the idea:
Read all the posts in this series:
- The whys and hows of the water collection
- This post
- A summary, future directions, and source code
Visit the water collection.