Thoughts about Memex

Shlomo Dubnov

Abstract

These notes describe the ideas and algorithms used to create “Memex”, a computer composition for violin. The work consists of a recombination of phrases by Bach, Mozart and Beethoven using ideas from universal coding and machine learning, as explained in the notes. It also represents an approach to modeling music as an information source, which opens new possibilities for style learning, mixing, and experimenting with various music-listener relations based on memories, expectations and surprises.

Memex is a computer artifact, a composition resulting from mathematical operations on a database of musical works that was designed and created by the author. The name of the piece comes from a 1945 article by Vannevar Bush [1], where he described a futuristic device “in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility... When the user is building a trail, he names it, inserts the name in his code book, and taps it out on his keyboard. Before him are the two items to be joined, projected onto adjacent viewing positions. At the bottom of each there are a number of blank code spaces, and a pointer is set to indicate one of these on each item. The user taps a single key, and the items are permanently joined”.

The idea of building memory trails, joining information and deriving new meanings is designed into the composition Memex in a formal and algorithmically precise manner. What we hear is new music in which every note belongs to one of the great masters: Bach, Mozart or Beethoven. Works by these composers were analyzed using the information processing algorithms described below, creating an automaton that can travel across the web of musical associations, leaving a trail of memories, expectations and surprises. The work is provocative, intended to leave the listener perplexed conceptually, aesthetically, and maybe emotionally.

Experiments with music models using information-theoretic (IT) methods (usually called musical style learning) are now almost a decade old. In many respects, these works build upon a long musical tradition of statistical modeling that began in the 1950s with Hiller and Isaacson's "Illiac Suite" [2] and with the French composer, mathematician and architect Xenakis [3], who used Markov chains and stochastic processes. My experiments with machine learning of musical style began with a simple mutual-source algorithm suggested by El-Yaniv et al. [4] that was made to jump between different musical sources looking for the longest matching suffix, effectively creating a new source that is closest in terms of cross-entropy to the original musical sequences. The next step, taken with Assayag et al. [5], was learning musical works using compression algorithms, specifically the Lempel-Ziv [6] incremental parsing (IP) algorithm for creating a context dictionary, with probability assignment as suggested by the universal prediction of Feder and Merhav [7]. New music was then generated by performing a random walk on the phrase dictionary with appropriate probabilities for continuations.
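The dictionary-based approach can be illustrated with a minimal sketch. It is not the code used in [5]; the function names, the count-based weighting of continuations, and the max_context limit are assumptions made here for illustration. Incremental parsing splits the sequence into phrases, each the shortest prefix of the remaining input that has not been seen before; continuation counts are then collected for every phrase prefix, and a random walk picks the next symbol using the longest suffix of the output that is a known context.

```python
import random
from collections import defaultdict

def incremental_parse(seq):
    """LZ78-style incremental parsing: each phrase is the shortest
    prefix of the remaining input that is not yet in the dictionary."""
    seen, phrases, current = set(), [], ()
    for symbol in seq:
        current += (symbol,)
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ()
    return phrases  # an unfinished trailing phrase is dropped

def continuation_counts(phrases):
    """For every context (phrase prefix), count how many phrases
    continue it with each symbol. Assumed weighting, for illustration."""
    counts = defaultdict(lambda: defaultdict(int))
    for phrase in phrases:
        for i, symbol in enumerate(phrase):
            counts[phrase[:i]][symbol] += 1
    return {ctx: dict(nxt) for ctx, nxt in counts.items()}

def generate(counts, length, rng, max_context=8):
    """Random walk: condition each new symbol on the longest suffix
    of the output so far that appears as a context in the dictionary."""
    out = []
    for _ in range(length):
        for k in range(min(max_context, len(out)), -1, -1):
            ctx = tuple(out[len(out) - k:])
            if ctx in counts:
                break  # k == 0 gives the empty context, always present
        symbols = list(counts[ctx])
        weights = [counts[ctx][s] for s in symbols]
        out.append(rng.choices(symbols, weights=weights)[0])
    return out

phrases = incremental_parse("abababc")
counts = continuation_counts(phrases)
melody = generate(counts, 16, random.Random(0))
```

In this toy example the symbols are characters; in a musical setting they would be note or chord events, and any hashable value works.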

These works achieved surprisingly credible musical results in terms of style imitation. Some informal testing suggested that people could not distinguish between real and computer improvisation for about 30-40 seconds. This was important for showing that major aspects of music can be captured without explicit coding of musical rules or knowledge. Additional experiments were done using the Probabilistic Suffix Tree (PST) machine learning method of Ron et al. [8], trying to improve the "generalization" capabilities of the statistical models at the cost of some extra "false notes" resulting from "lossy compression".

Memex presents a new approach [9] that uses the Factor Oracle (FO) of Allauzen et al. [10] to generate new music from examples. FO is an automaton that serves much the same function as a suffix tree, but with far fewer nodes. Compared with IP and PST trees, which discard substrings, FO is preferred because it can be built quickly and, like the suffix tree, it encodes all substrings of the sequence. One of the main properties of FO is that it indexes the sequence in such a way that at every point along the data it builds a pointer to future continuations of the most recent suffixes that appeared in that place. By "recent suffixes" we mean suffixes that occur for the first time when a new symbol is observed. Since FO is constructed in an online manner, all "previously seen" suffixes are detected earlier in the sequence. So, at every point along the sequence FO provides pointers to continuations of the most recent suffixes, and a pointer back to the longest repeating suffix. This way, we can either jump into the “future” based on the most recent past, or go back to an earlier past to look for continuations of previously encountered suffixes (i.e. suffixes of shorter prefixes), and so on. So, instead of choosing the best context and "gambling" on the next note under log-loss, the new method operates by "forgetting" and by selectively choosing historical precedents for deciding about the future.
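A minimal sketch of the online FO construction may help make the forward pointers and suffix links concrete. It follows the standard construction described in [10]; the function names and the list-of-dictionaries representation are choices made here for illustration. State i corresponds to the first i symbols of the sequence, trans[i] holds the forward links leaving state i, and sfx[i] is the suffix link (with -1 for the initial state, which has none).

```python
def build_factor_oracle(seq):
    """Online construction of a factor oracle over seq.
    Returns (trans, sfx): forward transitions and suffix links."""
    n = len(seq)
    trans = [dict() for _ in range(n + 1)]
    sfx = [-1] * (n + 1)
    for i, symbol in enumerate(seq):
        new = i + 1
        trans[i][symbol] = new            # link along the original sequence
        k = sfx[i]
        # Walk back along suffix links, adding forward links to the
        # new state wherever this symbol has not yet been seen.
        while k > -1 and symbol not in trans[k]:
            trans[k][symbol] = new
            k = sfx[k]
        # Suffix link of the new state.
        sfx[new] = 0 if k == -1 else trans[k][symbol]
    return trans, sfx

def accepts(trans, word):
    """True if word can be read from the initial state."""
    state = 0
    for symbol in word:
        if symbol not in trans[state]:
            return False
        state = trans[state][symbol]
    return True

trans, sfx = build_factor_oracle("abbbaab")
```

Every substring of the input is accepted by the resulting automaton, which has exactly one state per symbol plus the initial state.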

The piece Memex for violin is created by such a "random walk" over an FO that was constructed from a collection of works by Bach, Mozart and Beethoven. Prior to construction of the FO, the musical material was analyzed over short time spans to construct a set of events (individual or simultaneous notes and chords become symbols in a new sequence). This is needed to represent polyphony (to account for simultaneous notes) and to deal with invariance and possible symmetries. At the generation step the algorithm randomly chooses either to continue to the next state (advance along the original sequence; in this piece with probability 0.87) or to jump back along the suffix link (with probability 0.13) and follow any forward link from there. As explained above, this procedure effectively uses the longest repeating suffix of the sequence to perform transitions to a new place where a continuation of this suffix can be found.
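The generation step can be sketched as follows. This is an illustration under stated assumptions, not the program used for the piece: the function names are hypothetical, the forward link after a jump is chosen uniformly at random, and the walk simply advances when no suffix link exists (at the initial state) and jumps when it reaches the end of the sequence. Only the probabilities 0.87 and 0.13 come from the description above.

```python
import random

def build_factor_oracle(seq):
    """Standard online factor oracle construction (see [10])."""
    trans = [dict() for _ in range(len(seq) + 1)]
    sfx = [-1] * (len(seq) + 1)
    for i, symbol in enumerate(seq):
        trans[i][symbol] = i + 1
        k = sfx[i]
        while k > -1 and symbol not in trans[k]:
            trans[k][symbol] = i + 1
            k = sfx[k]
        sfx[i + 1] = 0 if k == -1 else trans[k][symbol]
    return trans, sfx

def random_walk(seq, length, p_continue=0.87, rng=None):
    """Generate `length` symbols by walking the oracle built from seq."""
    if not seq:
        return []
    rng = rng or random.Random()
    trans, sfx = build_factor_oracle(seq)
    state, out = 0, []
    while len(out) < length:
        can_advance = state < len(seq)
        can_jump = sfx[state] >= 0
        if can_advance and (not can_jump or rng.random() < p_continue):
            # Continue along the original sequence.
            out.append(seq[state])
            state += 1
        else:
            # Jump back along the suffix link, then take any forward
            # link from there (uniform choice is an assumption).
            links = list(trans[sfx[state]].items())
            symbol, target = rng.choice(links)
            out.append(symbol)
            state = target
    return out

# Symbols can be any hashable events, e.g. tuples of simultaneous pitches.
events = [(60,), (62,), (60, 64), (62,), (60,), (62,), (67,)]
walk = random_walk(events, 24, rng=random.Random(1))
```

With p_continue set to 1.0 the walk reproduces the source up to its end; lowering it yields more frequent recombination at points that share a repeated suffix.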

Music, in its pure form, is devoid of symbolism, denotation or concrete meanings, which makes it a powerful “probe” into higher functions of our mind. In terms of information-theoretic modeling, this research goes beyond modeling and recreating the source entropy. Considering music-listener relations as an information channel opens new ways of defining musical anticipation and memory and their relations to human cognitive responses. In this sense, Memex can be used as a tool for seeking new insights into music theory and musical perception, raising some interesting thoughts about what composing and listening actually mean: What is the style of the piece? What is its form, its story, its meaning? If “controlling” the automaton amounts to varying anticipations and memories, does this lead to new insights about the play of cognition, creativity, or new avenues for art making? How is listener experience related to previous training on related musical examples? Where does the free will of the composer / artist / creator end and the self-reproduction of culture begin?

Richard Moore, a computer music professor at UCSD, wrote about the piece:

"Have you ever had a lucid dream? While not exactly common, lucid dreams are ones in which the dreamer somehow becomes aware that the experience-in-progress is a dream. Once you know you’re dreaming (I have occasionally had this experience), you can relax. Sometimes, lucid dreamers just wake up. However, they can sometimes elect to continue the dream, exercising various levels of influence over what is going on. One can elect to fly, to fulfill sexual fantasies, to explore death, or life in other dimensions. Fantasy becomes the ruler of experience. Exactly what many people want out of life.

If one were to elect to hear music in a lucid dream, what would it sound like? Clearly, any such music would not be constrained by rules, such as those of radio stations, music theory, gravity, or social convention. Whatever such fantastic music might be based on, it is hard to imagine any sense in which it would not be based on memory. If necessity is the mother of invention, then memory is its father, for how could anything appear in the mind that is not the product of (possibly rearranged) memory?

Besides memory, there is an additional source of creativity, described by many people, perhaps most famously by Leonardo da Vinci. He is reputed to have used a technique of staring at stains on walls, or patterns in mud, or splatters of paint, to see what they might suggest. Any child who has found rabbits or ships in the sky while staring at clouds has done the same thing. Japanese artists suggest tigers and rivers and billowing drapes with but a few brushstrokes. The human mind has a powerful penchant for inference. Mostly, this capacity is used to make “sense” of the sensorial world: we see, hear, feel, taste, or smell, and almost immediately interpret. Once we’ve inferred the rabbit in the cloud, it becomes difficult not to see it there, even when we remember that it’s “just a cloud.” Such inference is very fast, faster than the speed of thought, especially logical thought. It is not hard to imagine how those of our predecessors who quickly inferred the saber-toothed tiger behind the bush from a few flashes of light would have more likely survived to become our ancestors.

No one yet knows what sleep is, nor why we do it, nor why we dream, but I have a theory about the last, which others have corroborated. Whatever else happens during sleep, the body shuts down in certain ways. In particular, sensory input seems to be greatly attenuated, though not entirely shut off (thus, we can still be awakened by a sudden crash of thunder). The brain, freed from most sensory input, doggedly continues to interpret what is going on. That which is interpreted is somewhat unclear, but it seems that it is chaotic (that is, greatly affected in unpredictable ways by tiny changes in both external and internal stimuli). The information that comes into the brain during sleep seems both random and complex, which allows it to be characterized stochastically, as with the heat-dance of molecules in a warm fluid (from which Einstein established the existence of atoms). Even random information is subject to the brain’s “interpreter,” which apparently never sleeps. The result is dreams, which (according to my theory) are the brain’s interpretations of chaotically appearing snippets of memories combined with nearly nonexistent, random sensory inputs. Technically, a random signal is noise. Thus, the food of dreams is memory spiced with noise.

Could we explore the world of music that might be intentfully invoked in lucid dreams? One way would be to enhance our ability to dream lucidly. Some people “practice” lucid dreaming by various methods, and report varying degrees of success. Others, apparently, never dream lucidly. Your mileage may vary.

A computer scientist might use another method. Compared with brains, computers are fairly primitive devices. Even to the limited extents that we understand them, the memory and processing capacities of computers and brains still differ by many orders of magnitude (though some researchers have pointed out that computers are growing in capacity at a rate much greater than human brains). The most capable current supercomputers have capacities measured in impressive units like teraflops and petabytes. Might it be possible to explore musical memory in a way similar to lucid dreaming on a computer by assembling fleeting “snippets” taken from one or more sections of the vast domain of musical literature according to stochastic (i.e., random) methods?
The answer is yes. Without going into technical details, this is the essence of a method used in, what? assembling? composing? extricating? snippetizing? dreaming? music for violin solo by Shlomo Dubnov, a music professor with a background in computer science at UCSD. Dubnov’s recent composition Memex, performed recently by UCSD violinist János Négyesy, is based on recollections of detailed musical moments taken from the violin literature of Bach, Mozart and Beethoven (and presumably, by extension, anyone). The music retains a familiar quality, even though it is obviously previously unheard. It is not like music composed by a student attempting to imitate the style of one of these composers. It is the original music, presented in a way completely unheard-as-yet. It would never be mistaken for Bach, or Mozart, or Beethoven, yet every note was, in some ultimate sense, written by these composers.

A related technique has been used by another music professor with a background in computer science: David Cope at UCSC has produced a CD entitled Bach by Design, in which Bach’s music is used as a database for “deriving” additional music “by” Bach (even though Bach never wrote it). Cope also based other “derived” music on the works of other composers, with varying degrees of verisimilitude. His stated motivation for such work is the desire to hear more music from composers of the past that he has known and loved—more, even than they wrote!

Cope is clearly attempting to capture the essence of the musical style of various composers of the past, while Dubnov is attempting something different. Dubnov’s “lucid music” touches on something essential about the musical nature of mind, of intelligence, of consciousness itself. It is not about producing more violin pieces by past composers. It is a musical tool for the exploration of mind, and its boundless ability to fail to interpret."

Acknowledgment

The piece is written for and dedicated to János Négyesy, whose enthusiasm for experimental art is unceasing and whose intellectual curiosity inspired this work.

References

[1] Bush, V., “As We May Think”, The Atlantic Monthly, July 1945.

[2] Hiller, L. A. and L. M. Isaacson, “Experimental Music: Composition with an Electronic Computer”, New York: McGraw-Hill, 1959.

[3] Xenakis, I., “Formalized Music: Thought and Mathematics in Composition”, Indiana University Press, 1971.

[4] El-Yaniv, R., S. Fine, and N. Tishby, “Agnostic Classification of Markovian Sequences”, in Advances in Neural Information Processing Systems, vol. 10, 1998.

[5] Dubnov, S., G. Assayag, O. Lartillot, and G. Bejerano, “Using Machine-Learning Methods for Musical Style Modeling”, IEEE Computer, vol. 36, no. 10, Oct. 2003, pp. 73-80.

[6] Ziv, J. and A. Lempel, “Compression of Individual Sequences via Variable Rate Coding”, IEEE Trans. Information Theory, vol. 24, no. 5, 1978, pp. 530-536.

[7] Feder, M., N. Merhav, and M. Gutman, “Universal Prediction of Individual Sequences”, IEEE Trans. Information Theory, vol. 38, 1992, pp. 1258-1270.

[8] Ron, D., Y. Singer, and N. Tishby, “The Power of Amnesia: Learning Probabilistic Automata with Variable Memory Length”, Machine Learning, vol. 25, 1996, pp. 117-149.

[9] Assayag, G. and S. Dubnov, “Using Factor Oracles for Machine Improvisation”, Soft Computing, vol. 8, ISSN 1432-7643, September 2004.

[10] Allauzen, C., M. Crochemore, and M. Raffinot, “Factor Oracle: A New Structure for Pattern Matching”, in Proceedings of SOFSEM’99, Theory and Practice of Informatics, J. Pavelka, G. Tel and M. Bartosek, eds., Milovy, Czech Republic, Lecture Notes in Computer Science, pp. 291-306, Springer-Verlag, Berlin.