The other…

DROID, and PRONOM while we’re at it.


Missive from Droid

Occasionally this may happen, an experimental use of devices. In the interest of adding a little spice to IRLS674 (it’s spring 2010).

collected, federated (unit 10)

My explorations of federated collections available via OAI-PMH led me to three sites I liked. I should be more critical, perhaps; there are small notes of negativity below. But overall, I am impressed with many projects, even distracted by them, blinded by the science and beauty of what has already been achieved in these, The Early Days.

Gatherings of unlike things are still gatherings of unlike things, and until voice recognition and 3D rendering are incorporated and linked, and even then, we are going to travel step-wise, not immediately, to the Thing we want. These federations all act slightly differently as repositories, meaning they offer up metadata, to greater and lesser degrees, and the thing itself, to greater and lesser degrees, and related information, to greater and lesser degrees. They “own” their holdings at different depths. I discovered I could have spent a very very long time as I discovered more angles to analyze. Time drives on.
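For the curious, here is a rough sketch of what "offering up metadata" over OAI-PMH can look like from the harvester's side. It builds a `ListRecords` request URL and pulls simple Dublin Core fields out of a response. The base URL, the helper names, and the sample record are all invented for illustration; none of the three sites below is assumed to expose exactly this.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 protocol and simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL (base_url is a hypothetical endpoint)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return one dict per record: its identifier plus its Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        ident = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        fields = {}
        if dc is not None:
            for el in dc:
                name = el.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
                fields.setdefault(name, []).append((el.text or "").strip())
        records.append({"identifier": ident, "fields": fields})
    return records


# A made-up response fragment, trimmed to the parts the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>River basin survey photograph</dc:title>
          <dc:type>Image</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    for record in parse_records(SAMPLE):
        print(record["identifier"], record["fields"])
```

How much of a record survives this trip (title only, or the full set of fields, or a link to the thing itself) is exactly where federations differ.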

The features I look for in a good federation include: the ability to build a good and useful search across (large) collections; the ability to sort or refine search results; multiple paths into collections via guides and indexes; clarity about sources; and some kind of quantification of the resources available in specific areas (know how much you’re in for). As well as the clearest home page in the galaxy.

The three I reviewed all scored well with me, and here they are:

Digital Library for Earth System Education (DLESE)

Funded originally by the National Science Foundation. Contributors include NASA as well as educators; resources are suggested as well as directly contributed by creators. Good FAQ. Nice graphic representation of quantity available for each subject area covered. All resources are categorized by grade level. Unfortunately search results for simple searches are presented in unknown order in nearly full-record form, so scrolling is very necessary until you add facets or additional terms to cut down the hits list. Search is available by resource type, very useful, including by subtype of classroom activity. Excellent ability to build a useful set of results. Includes ability to select reviewed sources. The home page is frankly confusing, being a small jumble of radio buttons, links, text boxes, and drop-down lists smooshed against the left margin–but there isn’t so much that it can’t be figured out. It isn’t DSpace or another recognizable repository. Not my favorite, not well-designed enough, but nice content for educators.

Western Waters Digital Library

Originally funded by IMLS, now representing collections of 32 universities. Includes mission statement and collection policy, as well as a collection guide and very nice interactive collection guide map organized by river basins, unfinished but nonetheless interesting. Results of searches link out to owning repositories, so only core Dublin Core is offered within the WWDL. Thumbnails are offered where they can be, but that’s all you get within WWDL–full images only at home repositories. The home page is clean, with slideshow images.


Calisphere

(While not listed on either of our lists of service providers, my background checking shows that its contributions are required to be OAI-PMH-friendly, and its parent, CDL, is harvestable.) A “free public gateway to a world of primary sources” and a project of the California Digital Library, this repository offers collections targeting educators, including collections “themed” by era. There is a browsable subject index; keyword searches show results grouped by type, and hits are shown in context for textual results. There is no advanced search; one is invited to search within a given set of results. Within Calisphere full Dublin Core is offered; linking out to the owning repository is the last and least option. Its entry page is beautiful, well laid out with a few choice images representing its breadth.

I found myself taking WWDL and Calisphere head to head and tried to find the same item in both collections to compare apples and apples and came quite close. The downside of leaving WWDL immediately upon clicking on a thumbnail is leaving your result set behind; the downside of Calisphere could be seen as losing context by not easily getting to items in their native repository. I can’t tell which might be the larger federation, and they both may grow daily or at least weekly (I’d bet on it with Calisphere, unsure about WWDL). Large sets of data are good; they’re all in one place, and you can scan-search them all. On the other hand, fineness of searching has to give in at some point with huge federations; they just can’t spend the resources to offer the level of detail that a more focused collection can. They probably cannot govern their contributors to the same fine degree, either, so scattered detail in, scattered detail out.

Calisphere wins by a nose.

nix on the prefab (unit 12)

From a learning and pedagogical perspective, I would not, not, not want to entertain the notion of downloading a prepackaged virtual machine. The most frustrating thing I can think of, at least in this IRLS672/675 realm, would be a black box experience. Of course I wouldn’t have known any better. But tech life is full of black boxes for the curious and uninformed these days. It’s Just More Fun Doing It Myself. Doing with my own ten fingers, eyes glued to screen, with held breath, is how I learn, best.

These courses are, IMHO, all about understanding how to take the lid off the black box and at least burn the memory of having once seen the contents revealed, if not how to remove them and replace them each nut and bolt by nut and bolt in perfect reverse sequence. At least being able to talk to the doers, if not be the doer. I like the idea of getting in there from the beginning and building the thing, setting the stage for tweaking and downloading and unpacking to come. Given that there are always ways to retrace much further back, back to zeroes and ones in the end, so our VM is really itself an artificial starting point, but. It’s hard these days to get much further back than the web; that’s one of the reasons this program so appealed to me. And from a digital library standpoint, I can see saying to Customer A, Hey, I can show you what I mean, unpacking the laptop and firing up the VM. Possible without having built it, but probably not so I could explain to Customer A what a virtual machine is. Not that Customer A asked. That was Customer B who had some programming experience and questioned me about this DigIn experience listed on my résumé.

Then there’s the bounceback effect. The virtual machine is a lovely, harmless way of learning how the rather more easily harmed home machine works…can I count the ways in which I’ve learned how my Mac works from running the VM through paces? No I cannot. But they are many. Not just because of the Linux/Unix heritage. From the start, the idea that this is not my Mac, this is another entity cohabiting the aluminum casing, that would have been hard to get across to me without all the startup; by now I might have slipped into thinking of it as an application. Mystified by my inability to open a Word doc inside it.

Perhaps these are not generalized courses in computer science…but as archivist, librarian, information professionista, I would prefer to be able to say that I built a VM than that I’ve seen a VM. No doubt there would have been time for other things had we skipped straight in (that’s assuming the large file issue of a prefab was not an issue–got a feeling there’d always be something to troubleshoot, though). Maybe the time freed up (which I’m not persuaded would have been much) could have been spent more carefully preparing a collection for use and reuse across IR software so that we’d have, I don’t know, more experience with different file formats or import/construction of sturdy metadata. I wouldn’t trade it in.

Then again: I’ve had a pretty easy time with the VMs. Let those speak who have not.

The current D-Lib article on DataStaR meshes with something we’re thinking about at my workplace. My topic is In Between Repositories (IBRs), where data bits may live, waiting, either forever or on their way to becoming something else. Granted, DataStaR, and much IR work generally, emanates from scholarly need, need to share living work in stages, let work evolve groupwise; but really it’s all about beginning to use an IR as a memory stick. A little.

We’re embarking on a trial of an intentionally dark archive–a part of our institutional digital repository, up till now only for materials publicly described, that will take things fresh off the truck, electronic material (disks, CDs, DVDs, floppies…) we may not have appraised yet that needs someplace to live other than in a literal box. Accessioned, but not cataloged. Just dashed off into a form we know will be there when we, or they of the future, get to it. We’ll park electronic files of potentially unknown content in this corner of the repository until we have time to sort them (and the rest of their associated collections) out. A dumping ground with a preservation angle. (Though I’m not entirely sure what the repository manager has agreed to. Migration?) Brings to mind the preservation track of IRs that we’ve been considering in school. It figures that shades of grey would begin to emerge in the business of digital holdings; once storage is cheap enough…no reason not to.

But I have a little concern about just starting technologically, which, even as I speak, we are, without really gaming this out. In a sense it’s no different than my usual practice of leaving the disks in place next to the folders in a Paige box that goes to remote storage…as long as we know where to find the preserved unwashed digital bits. But it’s kind of like the difference between the physical envelopes of mail piling up and your inbox filling with megs and megs…you might just toss everything out in either case, but in my opinion you’re more likely to empty the inbox without examination than to toss all the paper without a glance. Intentional dark e-storage isn’t quite preservation and it isn’t access. It is sure to be more expensive than storage in a very stable, dry, cool box; we’ll be outsourcing conversion of things we can’t even read, no floppy drives for miles around, sometimes without knowing a thing about the contents. And we don’t yet have a plan for metadatification of the dark content.
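One possible first step toward metadatification, offered only as a sketch and not as anything we have agreed to: record a bare-bones manifest (relative path, size, checksum) for each batch of files as it goes into the dark corner, so that whoever opens it later can at least tell nothing has changed. The function names and the accession identifier format are hypothetical.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large disk images don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root_dir, accession_id):
    """Walk root_dir and return one row per file, in a stable order."""
    rows = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # makes the walk order repeatable
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rows.append({
                "accession": accession_id,  # hypothetical identifier scheme
                "path": os.path.relpath(full, root_dir),
                "bytes": os.path.getsize(full),
                "sha256": sha256_of(full),
            })
    return rows
```

Calling `build_manifest` on a folder of converted files returns a list of rows that could be saved next to the files; it says nothing about what the content is, only that it is still what arrived.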

I’m not arguing in favor of dry, cool boxes for everything…just musing before embarking. I will need to bring up some points and try to seem credible beyond my actual place in the foodchain.

the space that is d

Well, DSpace is friendly. The web self of it is friendly…and happily, all the rest is command line, happy! No sniping, really happy. Prefer it, nearly, possibly. Though functionality is not high or many or multifarious, yet…I know there are command line secrets.