AWB Plus: the Multimedia Annotator’s Workbench

A Prototype for a General Temporal Workbench

Donald Byrd

School of Informatics and Jacobs School of Music

Indiana University, Bloomington

Early May 2010

AWB Plus vs. AWB

The EVIA Annotator’s Workbench (“AWB” for short) is a program for assembling, segmenting, and annotating video; it was designed for presenting ethnographic field work, particularly ethnomusicology. Work on the GTW project has extended AWB 4.0, adding facilities for attaching independent audio files and still images to the timeline and for displaying user-specified “bands” along with the timeline. The resulting program, “AWB Plus”, is one step towards a real General Temporal Workbench and a demonstration of some of the GTW’s possibilities; the GTW is intended to go much, much further. Note also that AWB Plus is a rough draft, so to speak, even within the limits of the extensions it has. For example, interactive editing of the way independent audio files and still images are attached to the timeline (i.e., what files are attached and where they’re attached) is very important for many applications, but it’s not possible yet: attachments can be described only via XML in an AWB “project file”. And you won’t hear audio files when you play the video; you can hear them only within the audio-file editing dialog.

This document is a quick first attempt at a user’s guide for AWB Plus, based on material from grant proposals, etc. It should be enough to get the first brave users started, but only in conjunction with knowledge of the standard AWB, so I’ll present AWB Plus in comparison to the standard program. For more information about AWB, see:

• Annotators Workbench User’s Manual (AnnotatorsGuideUsersManualFINAL_2009_06_04.docx)

• AWB Preview, a 6-min. movie showing how to get started (AWB Preview.mov)

• EVIADA Developer Notes (EVIADeveloperNotes.doc)

“EVIA” (formerly called “EVIADA”) is the recently-concluded Ethnographic Video for Instruction and Analysis project; the public EVIA website is . AWB is a standalone program written in Java, using QuickTime for Java to handle audiovisual media, and it’s the major piece of software developed by the EVIA project, though not the only one; see for information on it and other EVIA software. The current version of the standard AWB, version 4.0.x, is scheduled for release in open source form in the near future. Note that the EVIA project focused narrowly on ethnography, where typical users start with multiple pre-recorded video files and nothing else, and of course both its design and its documentation reflect that. “GTW” is my own General Temporal Workbench project. It aims to develop a framework and toolkit capable of handling any media related to any temporal phenomena in any combination; but it’s still in the early stages. There’s a two-page introduction to the GTW project, with pointers to more, at .

Figure 1 shows AWB 4.0.x in an application of the type it was designed for, presenting ethnographic field work. For our purposes, its main features are:

1. Architecture. Standalone desktop application.

2. Media. AWB handles only video (with its built-in audio).

3. Timeline. Its timeline consists of a single video file, which it builds from specified excerpts from pre-existing video, so at any point there's exactly one video. Temporal resolution is fixed at 1 millisecond, and for practical purposes, the maximum duration might be 20 hours or so.

4. Structure over time. The combined video file can be subdivided, with three levels of subdivisions, into segments. Subdivision is hierarchic, but any subtree of the hierarchy can be incomplete, i.e., it can include one or more isolated non-overlapping chunks of the time occupied by the level above.

5. Attachments to timeline. Other than notes on technical problems with the recorded material, the only things that can be attached to the timeline are textual transcriptions: these are intended literally as a transcription of the corresponding action in the video. Transcriptions are not limited by the video segment hierarchy: they can cross segment boundaries arbitrarily. They can even overlap each other, but the user interface limits practical use.

6. User interface. AWB displays a timeline showing the hierarchic segmentation, with a band for the transcriptions superimposed. Since they're in one band, if there's much overlap among transcriptions, it's difficult or impossible to choose the one you want. There's a dialog with controls designed to make it easy to align transcriptions with the section of the timeline they correspond to.

7. Extensibility, including communication with external programs. Nothing specific.

Figure 2 demonstrates AWB Plus in a rather different application from standard AWB use, namely adding music to and documenting a fireworks show; Figure 3 shows an application that’s radically different, visualizing the “molecular motor” Myosin V crawling along a thread of actin. AWB Plus extends AWB 4.0.x as follows:

1. Architecture. No change.

2. Media. Besides video, AWB Plus handles images (at a minimum, still photos, diagrams/drawings, and images of music score pages are useful) and audio files (or excerpts).

3. Timeline. No change.

4. Structure over time. No change.

5. Attachments to timeline. Images and audio files can also be attached to the timeline. Each has a start time and an end time; they can cross video segment boundaries, and instances of each can overlap instances of the same or the other type. Audio files also have start and end times within the file, specifying what part of the file is to be played.

6. User interface. (a) Images appear as small red triangles below the time axis at the proper start time position; they display in a separate window from that time to the specified end time. If two or more are to be displayed at the same time (as in Figure 6), the program shows only one but lets the user switch whenever they like; no other interaction is supported yet. (b) In addition to the display of segments and transcriptions, the timeline has any number of additional “bands”. In the demo version shown in Figure 6, the topmost of these displays audio files. There is a dialog for editing them similar to the transcription-editing dialog; it plays specified excerpts from them and has controls for aligning them with the video (not fully functional). (c) The extra bands may be used for any desired purpose; see the next paragraph for more about them. Images and audio files are described in AWB’s XML project file, and they can’t be added, removed, or modified via the interactive UI.

7. Extensibility, including communication with external programs. Each extra band is handled by its own Java object. The current AWB Plus has objects that fill the band with an image, with a piecewise-linear curve described by data read from a file, or simply with a solid color, and that optionally respond to a click by changing the color and displaying the timeline time of the click. Also, we’ve written an interface for two-way communication between AWB Plus and any ActionScript program (using the open-source package Merapi). There is a new, free, Web-based music-notation program, Noteflight (www.noteflight.com), written in Flash’s ActionScript language, that can display music in another window; with cooperation from the developers of Noteflight, we have a demo in which we control it from AWB Plus. In many cases controlling an external music-notation program is much better than just displaying images of music pages, since the notation program can do things like displaying a moving cursor on the music, scrolling or turning pages automatically, playing a synthesized version of the notation for comparison, etc.

Note that the AWB Plus objects that fill the extra bands, mentioned under Extensibility, could be made far more powerful with relatively little effort by taking advantage of existing free projects: for the “image” version, e.g., ImageJ; for the “curve from data” version, e.g., JFreeChart.

AWB Plus Changes to Project and Configuration Files

Project (“.awx”) files

AWB and AWB Plus use an XML project file with the extension “.awx”. To support AWB Plus’ new features, the file format has three new pairs of tags:

  • <bandInfos>: contains at least one <bandInfo>; each describes the background of an entire extra band. A <bandInfo> gives a band number plus a filename and type, either “data” (line segments) or “image” (still image), for the background for that band. NB: as of this writing (15 March 2010), the top band (#1) is always considered type “data” and any other bands are always considered type “image”.
  • <audios>: contains at least one <audio>, i.e., parallel audio file.
  • <images>: contains at least one <image>, i.e., still image file.

Here are some examples of all three to show the details.

<bandInfos>

<bandInfo number="1" type="data" url="file:/Users/donbyrd/Documents/MusicInformatics/stuff/steppingcoords_DLIM10.txt"/>

<bandInfo number="2" type="image" url="file:/Users/donbyrd/Documents/MusicInformatics/stuff/Band2Image.png"/>

</bandInfos>

<audios>

<audio from="0:01:30:001" until="0:02:45:002">

<creator>Marina Band</creator>

<filepath>/Users/donbyrd/Documents/MusicInformatics/junk/Star Spangled Banner.mp3</filepath>

<excerpt start="0:00:00:000" end="0:00:55:555"/>

</audio>

<audio from="0:07:30:001" until="0:08:00:002">

<creator>Siegfried Bassoon</creator>

<filepath>/Users/donbyrd/Documents/junk/viola.wav</filepath>

<excerpt start="0:00:00:000" end="0:00:02:160"/>

</audio>

</audios>

<images>

<image from="0:01:30:001" until="0:02:45:002" url="file:/Users/donbyrd/Documents/MusicInformatics/EVIA_AWB/AWBPlus/pix/StarSpangledBanner.png"/>

<image from="0:13:50:647" until="0:13:50:647" url="file:/Users/donbyrd/Documents/MusicInformatics/EVIA_AWB/AWBPlus/pix/2007FW/IMG_6595_09_59_42.JPG"/>

</images>
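To make the attribute layout above concrete, here is a minimal Python sketch that reads <audio> and <image> elements and converts their times to milliseconds. It is not part of AWB Plus. It assumes the time fields mean hours:minutes:seconds:milliseconds (consistent with the 1-millisecond timeline resolution, but not confirmed here), that the elements carry no XML namespace, and it wraps the sample in a hypothetical <project> element, since a real .awx file contains much more than is shown in this document.

```python
import xml.etree.ElementTree as ET


def parse_timecode(text):
    """Convert 'h:mm:ss:mmm' to milliseconds (field meaning is an assumption)."""
    hours, minutes, seconds, millis = (int(part) for part in text.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def read_attachments(root):
    """Collect the audio and image attachments found anywhere under root."""
    audios = []
    for audio in root.iter("audio"):
        excerpt = audio.find("excerpt")
        audios.append({
            "from_ms": parse_timecode(audio.get("from")),
            "until_ms": parse_timecode(audio.get("until")),
            "creator": audio.findtext("creator"),
            "filepath": audio.findtext("filepath"),
            "excerpt_start_ms": parse_timecode(excerpt.get("start")),
            "excerpt_end_ms": parse_timecode(excerpt.get("end")),
        })
    images = [
        {
            "from_ms": parse_timecode(image.get("from")),
            "until_ms": parse_timecode(image.get("until")),
            "url": image.get("url"),
        }
        for image in root.iter("image")
    ]
    return audios, images


# Hypothetical wrapper element; the inner elements are copied from the examples above.
SAMPLE = """
<project>
  <audios>
    <audio from="0:01:30:001" until="0:02:45:002">
      <creator>Marina Band</creator>
      <filepath>/Users/donbyrd/Documents/MusicInformatics/junk/Star Spangled Banner.mp3</filepath>
      <excerpt start="0:00:00:000" end="0:00:55:555"/>
    </audio>
  </audios>
  <images>
    <image from="0:13:50:647" until="0:13:50:647" url="file:/Users/donbyrd/Documents/MusicInformatics/EVIA_AWB/AWBPlus/pix/2007FW/IMG_6595_09_59_42.JPG"/>
  </images>
</project>
"""

audios, images = read_attachments(ET.fromstring(SAMPLE))
for item in audios + images:
    print(item)
```

A script like this can serve as a quick sanity check on a hand-edited project file, for example to spot an attachment whose “until” time precedes its “from” time.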

About images

• When image display times overlap, all that are viewable are shown in a tabbed pane, with the most recent in front. If the user stops playback, they can bring others to the front to compare them.

• The current AWB Plus displays all images from 25 millisec. before the <from> time until 100 millisec. after the <until> time; this is so that, (1) given images that just adjoin, the previous one will be available to compare for a few video frames; but (2) given images with blank periods between, the display times won't be too different from what was requested.

• Images that have a display duration of zero (that is, <until> time is the same as <from> time) will have their duration changed to a default value, currently 2.5 sec., by increasing the <until> time.
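The timing rules above can be sketched in a few lines of Python. This is an illustration only, not the AWB Plus code: the function name is hypothetical, times are in milliseconds, and two details are assumptions, namely that the zero-duration default is applied before the 25/100 millisecond padding and that the start of the window is never earlier than time zero.

```python
LEAD_IN_MS = 25              # images appear this long before their "from" time
LEAD_OUT_MS = 100            # and stay this long after their "until" time
DEFAULT_DURATION_MS = 2500   # substituted when "until" equals "from"


def display_window(from_ms, until_ms):
    """Return the (start, end) of the period an image is shown, in milliseconds."""
    if until_ms == from_ms:
        # Zero-duration image: extend "until" by the default duration.
        until_ms = from_ms + DEFAULT_DURATION_MS
    start = max(0, from_ms - LEAD_IN_MS)   # clamping at zero is an assumption
    return start, until_ms + LEAD_OUT_MS


print(display_window(90001, 165002))    # image with explicit from/until times
print(display_window(830647, 830647))   # zero-duration image
```

Under these assumptions, two images that just adjoin (one ending at the moment the next begins) are both on display for 125 milliseconds around the boundary.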

Extra bands and configuration files

Extra bands. The number of extra bands and the height of each are set in a configuration file, eviadaconfig.cfx, like this:

<parameters>

<termMappingFile>conf/cvtermmapping.props</termMappingFile>

<volume>1.0</volume>

...

<layoutFiles>

<file>conf/editing.xml</file>

<file>conf/playback.xml</file>

<file>conf/standby.xml</file>

</layoutFiles>

<extraBandHeight>30</extraBandHeight>

<numExtraBands>3</numExtraBands>

</parameters>

(etc., etc.)

extraBandHeight must be from 10 to 500 inclusive; numExtraBands must be from 0 to 3 inclusive. Note the reference to three .xml files; these are separate configuration files for AWB’s major modes, editing, playback, and standby. Unfortunately, you must also specify the height of the entire timeline area in the editing.xml and (in the unlikely case you use playback mode) playback.xml files, like this:

<componentBlock component="TIMELINE">

<position>

<direction>BOTTOM</direction>

<minimumPixels>260</minimumPixels>

<maximumPixels>260</maximumPixels>

(etc., etc.)

With the standard AWB, the number is always 150. In AWB Plus, it’s 170 (since the audio band takes 20 pixels) plus whatever the extra bands need. To accommodate the three extra bands of our example, each 30 pixels high, the BOTTOM’s <minimumPixels> and <maximumPixels> should both be 170+(3*30) = 170+90 = 260.
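The same arithmetic, together with the permitted ranges for the two parameters, as a small Python helper. The function and constant names are hypothetical; only the numbers come from the description above.

```python
BASE_TIMELINE_PIXELS = 170   # 150 for the standard AWB timeline + 20 for the audio band


def timeline_pixels(num_extra_bands, extra_band_height):
    """Value to use for both <minimumPixels> and <maximumPixels> of the TIMELINE block."""
    if not 0 <= num_extra_bands <= 3:
        raise ValueError("numExtraBands must be from 0 to 3 inclusive")
    if not 10 <= extra_band_height <= 500:
        raise ValueError("extraBandHeight must be from 10 to 500 inclusive")
    return BASE_TIMELINE_PIXELS + num_extra_bands * extra_band_height


print(timeline_pixels(3, 30))   # the example configuration above
```

The result has to be entered by hand in editing.xml (and playback.xml, if used) whenever extraBandHeight or numExtraBands changes.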

Issues

The current AWB Plus has many loose ends and issues. Some of the major ones, roughly in decreasing order of importance:

• The project files contain absolute paths for the pictures, band backgrounds, audio files and video files. AWB Plus (like plain AWB) will notice that the video files aren't where it expects the first time you open a project and it will prompt you to locate them; but, for now, you'll have to hand edit the paths to the other files.

• When you play the video, attached audio files aren’t played; they can be played only from the Edit Parallel Audio File dialog. This likely makes them useless for many purposes.

• Setting the height or number of extra bands is quite awkward, since besides eviadaconfig.cfx it requires changing both conf/editing.xml and conf/playback.xml. It should be handled by just a single file, and ideally in project files.

• AWB Plus ignores the <bandInfo>’s type; the top band is assumed to be type “data” and any other band is assumed to be type “image”.

• For extra bands of type “data”, the maximum number of line segments is 1000, which is inadequate for some purposes.

• When a second project is opened without quitting and relaunching AWB Plus, the extra bands from the previous project are used, which is very confusing.

• When a second project is opened without quitting and relaunching AWB Plus, the Image Viewer initially displays its contents from the previous one, which is very confusing.
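Until the first issue above is fixed, hand editing the absolute paths could be replaced by a small script along the following lines. This is a sketch under assumptions, not a tested tool for real project files: it handles only the three element types shown in this document (url attributes of <bandInfo> and <image>, and <filepath> inside <audio>), it assumes the elements carry no XML namespace, the <project> wrapper and the new location /Volumes/Media are hypothetical, and the function name is invented for illustration.

```python
import xml.etree.ElementTree as ET


def rebase_paths(root, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix in attachment paths; return how many changed."""
    changed = 0
    old_url = "file:" + old_prefix
    for tag in ("bandInfo", "image"):
        for elem in root.iter(tag):
            url = elem.get("url")
            if url and url.startswith(old_url):
                elem.set("url", "file:" + new_prefix + url[len(old_url):])
                changed += 1
    for audio in root.iter("audio"):
        path = audio.find("filepath")
        if path is not None and path.text and path.text.startswith(old_prefix):
            path.text = new_prefix + path.text[len(old_prefix):]
            changed += 1
    return changed


# Hypothetical wrapper element; the inner elements follow the examples in this document.
SAMPLE = """
<project>
  <bandInfos>
    <bandInfo number="2" type="image" url="file:/Users/donbyrd/Documents/MusicInformatics/stuff/Band2Image.png"/>
  </bandInfos>
  <audios>
    <audio from="0:07:30:001" until="0:08:00:002">
      <creator>Siegfried Bassoon</creator>
      <filepath>/Users/donbyrd/Documents/junk/viola.wav</filepath>
      <excerpt start="0:00:00:000" end="0:00:02:160"/>
    </audio>
  </audios>
  <images>
    <image from="0:01:30:001" until="0:02:45:002" url="file:/Users/donbyrd/Documents/MusicInformatics/EVIA_AWB/AWBPlus/pix/StarSpangledBanner.png"/>
  </images>
</project>
"""

root = ET.fromstring(SAMPLE)
count = rebase_paths(root, "/Users/donbyrd/Documents", "/Volumes/Media")
print(count, "paths rewritten")
print(ET.tostring(root, encoding="unicode"))
```

Work on a copy of the project file: writing XML back out with ElementTree may change whitespace and formatting, and whether AWB Plus accepts the rewritten file has not been verified here.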

Figures

Figure 1. Ethnomusicology project in AWB 4.0.x.

Figure 2. Fireworks show with music in AWB Plus.

Figure 3. Nanotechnology visualization: Myosin V “motor” in AWB Plus.
