Searchability

SEARCHABILITY MEANS DIRECT INFORMATION ACCESS

Provide an index.

The largest information systems must include search features.

ADVANTAGES

Search features help overcome the disadvantages of the purely browsable system. Searchability has a number of distinct advantages. It:

Creates alternative logical classifications
Simplifies location of known items
Works independently of collection size

As described in the previous section, your conception of the information universe is not necessarily the same as your reader's. While you try to group things in the most logical manner, your reader's "logic" will be different from yours. Searchability can help overcome this discrepancy by allowing the reader to create their own set of logically similar items.

Searchability readily lends itself to locating known items, sparing the reader from browsing down a number of menus to get what they want. Similarly, a reader may have used one of your items before without adding it to their hotlist. If you have searching mechanisms in place, then it may be easier for the reader to find the item again.

Searchability works independently of your collection's size. Browsable systems begin to break down after the collection becomes too large. The effectiveness of searching an information system is not directly impeded by the size of the system.

DISADVANTAGES

Yet, purely searchable systems are not perfect either. Users:

Must know searching syntax
Must have a preconceived idea, phrase, or term
Must know the structure of the data

In order to effectively search an information system, the reader must know the query language of the search engine. This may include Boolean logic or Unix regular expressions. They may have to know the meaning of right-hand truncation and the symbol for its use. While this sort of knowledge is necessary to use the system, it is irrelevant to the information itself and is seen by the reader as an impediment.
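
To make the burden concrete, here is a minimal sketch of a query language offering Boolean AND/OR and right-hand truncation. The three-document collection, the "*" truncation symbol, and the function names are assumptions made for illustration; real search engines each define their own syntax, which is exactly what the reader is forced to learn.

```python
import re

# Hypothetical three-document collection, keyed by document id.
DOCUMENTS = {
    1: "Indexing library catalogs",
    2: "Libraries and the Internet",
    3: "Internet gardening tips",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match_term(term, words):
    """Match one query term; a trailing '*' means right-hand truncation."""
    if term.endswith("*"):
        stem = term[:-1]
        return any(word.startswith(stem) for word in words)
    return term in words

def search(query):
    """Return sorted ids of documents satisfying a Boolean AND/OR query."""
    # OR binds more loosely than AND, so split on OR first.
    alternatives = [
        [term.strip().lower() for term in re.split(r"\s+AND\s+", alt)]
        for alt in re.split(r"\s+OR\s+", query)
    ]
    hits = []
    for doc_id, text in DOCUMENTS.items():
        words = set(tokenize(text))
        if any(all(match_term(t, words) for t in terms) for terms in alternatives):
            hits.append(doc_id)
    return sorted(hits)
```

With this sketch, search("librar*") finds both "library" and "libraries", while search("library") finds only the exact word. A reader who does not know the truncation symbol misses half of the relevant items.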

The ability to search an information system assumes your readership has a preconceived idea describing what they need. This is a classic example of the chicken-and-egg problem.

How are you supposed to find information about a particular topic if you don't know about that topic in the first place? In other words, the reader must come to the information with some terms or phrases describing what they want to find. Many times those terms or phrases will not be found in the collection, but synonyms will be found. It is difficult to think of many synonyms and it is difficult to "guess" at the controlled vocabulary used by the collection.
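
One way a system might soften the synonym problem is to map the reader's terms onto the collection's controlled vocabulary. The sketch below assumes a small hand-built thesaurus; the terms and the function names are hypothetical and only illustrate the idea.

```python
# Hypothetical controlled vocabulary: preferred term -> entry terms (synonyms).
THESAURUS = {
    "automobiles": ["cars", "autos", "motor vehicles"],
    "motion pictures": ["movies", "films", "cinema"],
}

def build_lookup(thesaurus):
    """Map every synonym, and each preferred term itself, to the preferred term."""
    lookup = {}
    for preferred, synonyms in thesaurus.items():
        lookup[preferred] = preferred
        for synonym in synonyms:
            lookup[synonym] = preferred
    return lookup

LOOKUP = build_lookup(THESAURUS)

def to_preferred(term):
    """Return the collection's preferred term, or None if the term is unknown."""
    return LOOKUP.get(term.strip().lower())
```

A reader who types "Movies" is quietly led to "motion pictures", the term the collection actually uses. A term that is not in the thesaurus returns None, which is the moment to send the reader to the help texts or back to the browsable collection.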

Finally, totally searchable systems require the searcher to know the data structure of the indexed collection. Is the data divided into fields? If so, what are those fields, and how does the searcher specify them in a query?
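
The difference a field makes can be sketched as follows. The records, the field names, and the "field:term" syntax are all assumptions chosen for illustration; every system defines its own fields and its own way of naming them in a query.

```python
# Hypothetical fields and records, chosen only for illustration.
FIELDS = ("title", "author", "subject")
RECORDS = [
    {"title": "Gardening for Beginners", "author": "Smith", "subject": "horticulture"},
    {"title": "The Smith Family History", "author": "Jones", "subject": "genealogy"},
]

def parse_query(query):
    """Split 'field:term' into (field, term); field is None without a known prefix."""
    field, sep, term = query.partition(":")
    if sep and field.strip().lower() in FIELDS:
        return field.strip().lower(), term.strip().lower()
    return None, query.strip().lower()

def search(query):
    """Return titles of records whose chosen field (or any field) contains the term."""
    field, term = parse_query(query)
    results = []
    for record in RECORDS:
        values = [record[field]] if field else record.values()
        if any(term in value.lower() for value in values):
            results.append(record["title"])
    return results
```

Here search("smith") returns both records, because one is written by Smith and the other is about the Smith family. Only a reader who knows that an author field exists, and how to name it, can type "author:smith" and get the single item they wanted.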

GUIDELINES

What to do? Here are some guidelines for creating searchable systems.

Include help texts
Map located items to similar items
Provide simple as well as "power user" search mechanisms

Include help texts
Just as "about" texts are necessary in a browsable system, help texts are necessary for searchable systems. Help texts describe the features and limitations of the system. They list the system's data structure, including the fields available for searching and the contents of those fields. Help texts also list plenty of example searches and explain what the end-user should do if they encounter too many or too few results.
Map located items to similar items
After items are located with the search mechanism, there should be links to similar items. This answers the perennial question, "Can you find me more items like this one?" These links should go directly back to your browsable collection where the end-user can freely "wander". From there the end-user will have the ability to see terms that can be applied to more searches. This is where you provide the end-user with the vocabulary terms of your system, in case they are unfamiliar with your system of information organization.
Provide simple as well as "power user" search mechanisms
Simple search mechanisms will be most useful for the first-time or casual end-user. Unfortunately, these same mechanisms often return too many or too few "hits". Providing power user search mechanisms like field searching, truncation, Boolean qualifiers, and number-of-term limitations can compensate for the simple searches. Unfortunately, the cost of these services is the need to describe the more powerful searching mechanisms effectively to the end-user. Again, readability comes into play.
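
The guideline on mapping located items to similar items can be sketched in a few lines. The example assumes each item already carries subject terms from your browsable collection; the items, the terms, and the ranking by number of shared terms are hypothetical choices, not the only way to define "similar".

```python
# Hypothetical items, each tagged with subject terms from the browsable collection.
ITEMS = {
    "Intro to HTML": {"html", "web publishing"},
    "Writing CGI Scripts": {"cgi", "web publishing", "programming"},
    "Perl Basics": {"perl", "programming"},
    "Cooking Pasta": {"cooking"},
}

def similar_items(title, limit=3):
    """Rank other items by how many subject terms they share with the given item."""
    subjects = ITEMS[title]
    scored = []
    for other, other_subjects in ITEMS.items():
        shared = subjects & other_subjects
        if other != title and shared:
            scored.append((len(shared), other, sorted(shared)))
    # Most shared terms first; ties broken alphabetically by title.
    scored.sort(key=lambda entry: (-entry[0], entry[1]))
    return [(other, shared) for _, other, shared in scored[:limit]]
```

Returning the shared terms alongside each similar item is deliberate. Displaying them shows the end-user the vocabulary of your system, which they can then apply to further searches.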

SOFTWARE

Readability is achieved by exploiting HTML. Browsability is most effectively and efficiently achieved via database applications. Searchability is acquired through indexing HTML files or directly searching the contents of a database. Below is a non-exhaustive list of software to help accomplish the goals of browsability and searchability:

Unix

database - Mini SQL with W3-mSQL
indexing - Harvest or freewais-sf

Windows

database - FileMaker Pro
indexing - WebIndex, a port of SWISH that comes with WebSite

Macintosh

database - FileMaker Pro with or without Lasso Lite
indexing - Apple e.g. (if you can find it) or Search Server by Social Engineering
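
To give a rough sense of what indexing HTML files involves, here is a minimal sketch using only Python's standard library. It is not how any of the packages listed above work internally; the two sample pages and the function names are hypothetical, and a real indexer would read its pages from files on disk.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect the text content of an HTML page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def extract_words(html):
    """Return the lowercase words found in the text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower())

def build_index(pages):
    """Build an inverted index mapping each word to the set of pages containing it."""
    index = defaultdict(set)
    for name, html in pages.items():
        for word in extract_words(html):
            index[word].add(name)
    return index

# Hypothetical pages standing in for files on disk.
PAGES = {
    "maps.html": "<html><title>Office Maps</title><body><p>Campus map</p></body></html>",
    "hours.html": "<html><title>Office Hours</title><body><p>Open daily</p></body></html>",
}
INDEX = build_index(PAGES)
```

Once the index is built, answering a one-word query is a single lookup such as INDEX.get("office"), rather than a scan of every page.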

SEE ALSO

"Web Server Search for Windows" - "WSS is a CGI back-end for Windows based Web servers that allows your clients to conduct simple queries on html files in an unlimited number of directories. The output is a listing of links containing the title, heading, or file name of files that contain the search string. You simply modify the search.ini file for the directories you want users to search, and insert a form into your page that includes the number of directories to search, a reference to these directores and a submit button. WSS takes care of the rest." <URL:http://wgg.com/wgg/best/search.htm>

Chuck Shotton, "Using FileMaker Pro with MacHTTP" - An archive with sample forms and CGI that shows how to hook MacHTTP to FMPro. <URL:http://www.biap.com/machttp/examples/fmpro.sit.hqx>

Chuck Shotton, "Writing Search Engines for MacHTTP" - This points to an archive containing C source code for a sample application that performs searches in conjunction with MacHTTP using the "srch" AppleEvent. <URL:http://www.biap.com/machttp/ftp/search_ex.sit.hqx>

Glimpse Working Group, "Glimpse" - "Glimpse is a very powerful indexing and query system that allows you to search through all your files very quickly. It can be used by individuals for their personal file systems as well as by organizations for large data collections. Glimpse is the default search engine in Harvest." <URL:http://glimpse.cs.arizona.edu/>

Kevin Hughes, "SWISH Documentation" - "SWISH stands for Simple Web Indexing System for Humans. With it, you can index directories of files and search the generated indexes. For an example of swish can do, try searching for the words "office and map" at EIT. All of the search databases you see there were indexed by swish. When you do a search, it's the swish program that's doing the actual searching." <URL:http://www.eit.com/software/swish/swish.html>

Mic Bowman, et al., "Harvest Information Discovery and Access System" - "Harvest is an integrated set of tools to gather, extract, organize, search, cache, and replicate relevant information across the Internet. With modest effort users can tailor Harvest to digest information in many different formats from many different machines, and offer custom search services on the web." <URL:http://harvest.transarc.com/>

Yahoo!, "Gateways (Yahoo!)" - A collection of search gateway scripts, as well as a number of CGI examples, is found here. <URL:http://www.yahoo.com/Computers/World_Wide_Web/Gateways/>

Version: 1.5
Last updated: 2004/12/23. See the release notes.
Author: Eric Lease Morgan (eric_morgan@infomotions.com)
URL: http://infomotions.com/musings/waves/