Table of Contents:
Introduction
Approach
Program Development
Project Data and Results
Conclusions
Source Code for C application (Appendix A)
Matlab Source Code (Appendix B)
Introduction (Quick Story):
The record industry constantly offers the general public a barrage of new music (as well as old music disguised as new) for consumption. Whether it is a new song by a popular artist or an old classic that the listener has never heard before, it is often difficult to tell whether it is worth the listener's valuable time and money.
Music in the more traditional sense has been on the decline for the past century. The first part of the 20th century saw a growing interest in dance music and thus the “Big Band Era.” Big bands, arguably for the first time in musical history, became the “POPular Music.” Famous ones such as Glenn Miller, Benny Goodman, and the Dorsey Brothers (to name only a few) succeeded in selling records in large numbers. At no time since this period (the 1920's to the 1950's) have single bands so dominantly controlled the entire music industry for extended periods.
And so we now enter the modern “Rock” era, which came to replace the big bands. Rock was able to gain a firm foothold in popular music because of its raunchier and more rebellious nature, in which a new generation found liberation. Many blamed artists such as Woody Herman and Stan Kenton, the last of the big bands, for promoting singers in their bands whose popularity eventually started to overshadow the musicians. The rock era introduced music with a much more poetically cerebral quality, since bands featured singers who used LYRICS to express messages to listeners.
The prominence of lyrics in modern music brings music into the realm relevant to this project. Lyrics give music, a highly subjective entity, a new aspect that can be studied and analyzed to form objective measurements. Using characteristics derived from these lyrics, an MLP (multilayer perceptron) can be trained to an individual's preferences and used to judge whether a song suits the listener's ear.
Basic Approach:
It is up to the listener to decide what music is worth purchasing and what is not worth the time and money. Assuming that the listener is intelligent and brave enough to have his or her own opinion (with or without succumbing to peer pressure), this project attempts to provide useful tips that help the would-be music consumer decide whether a given song by a particular group is worth listening to.
This program primarily focuses on breaking down the lyrics of a song. The lyrics are where an artist conveys his/her emotions, intelligence, and deep underlying messages. I understand that the way the artist sings the lyrics (stressing of certain notes, facial expressions, etc.) also conveys these things, but the lyrics are the only part of a song that can be analyzed and quantified easily within the scope of this class.
Quantified Data will include:
•% of unique words – bad artists often end up repeating the same lyrics over and over, which often suggests the artist has nothing really intelligent or worthwhile to say.
•The number of unique words itself, which may also help in developing the MLP.
•Count of key words – I find that bad pop artists abuse and overuse particular words such as LOVE, BABY, HEART, FEEL, etc.
•Word length – Sophisticated, intelligent artists will use bigger, multi-syllable words. This may point towards more meaningful lyrics.
The MLP classifier assigns a song to one of several categories ranging from BAD to GOOD. Many possibilities were explored. Eventually, I settled on an MLP with 5 input features and only 2 classes (Good, Bad). The results and conclusions of the research/experiments are contained in the following sections.
How many categories to place in between (e.g. OK, MARGINAL) depended on the performance of the MLP during the developmental stages of the project.
I trained and tested with the cross-validation techniques used in some of the class assignments. The baseline achieved using the k-nearest neighbor algorithm was an error rate of roughly 25 to 30%, i.e. a classification rate of 70 to 75%: a decent figure to shoot for.
Program Development
The project specification required me to develop an application that takes in raw data in the form of lyrics (plain text files) and processes them into the required data.
This C application was developed and tested using Microsoft's .NET Development Environment.
The steps that had to be performed on the data were:
•Traversal of the current directory structure in search of proper lyrics files.
•Parsing and insertion of lyrics into an array data structure
•Filtering of extraneous characters that skew data (punctuation).
•Analysis of the data structure for uniqueness, avg length, and so forth.
•Output to a data file in a format compatible with the MLP MATLAB program.
Details of development are included in the remarks of the source code attached in the appendix. Next, the MLP program provided by the class website was adapted to fit this project's specifications. The neural network's inputs were automated with variables defined in the upper portion of the code, and the entire structure was looped for efficiency.
Initial experiments were run with a 3-class data set. The input features were defined as:
1. Number of lyrics (words)
2. Number of unique lyrics
3. % of lyrics that are unique
4. Avg length of lyrics
5. Total number of characters
This program is intended to be trained by the consumer to his/her taste in music, so a bit of subjectivity is required to classify the training data. Out of the vast universe of lyrics I could have chosen from, I decided to pit what is, in my opinion, good music against really bad music. These songs come from 2 different eras: 60's/70's Rock and late 90's “Pop Music.” I collected lyrics of numerous songs by the classic artists I knew: The Beatles, Pink Floyd, and Jimi Hendrix. For the pop music, I decided to go with the current sensations: N'Sync, Backstreet Boys, and Britney Spears. At http://www.sing365.com/, a free lyrics web site covering many popular artists, I obtained data on the 2 groups of music and ran them through my compiled C application. (The data and training files are available in the appendix.)
The data in the MATLAB program was normalized, since the features' means, variances, and min/max values differed greatly.
Training Data / mean / variance / min / max
Feature Vector 1 / 177.8125 / 4607 / 39 / 325
Feature Vector 2 / 79.3125 / 1358 / 26 / 138
Feature Vector 3 / 697.7500 / 64832 / 209 / 1240
Feature Vector 4 / 3.9975 / 0.1731 / 3.61 / 5.3600
Feature Vector 5 / 27.4294 / 270.4976 / 0.73 / 53.8500
The first results obtained were quite poor, to say the least. The classification rates on the training data were in an acceptable range, up to about 90%, but the classification rates on the test data were unacceptably low:
Failed convergence
Variable number of hidden layers w/ α = 0.1 and 5 neurons/layer
# of Hidden Layers / 2 / 3 / 4 / 5 / 6 / 7 / 8 / 9 / 10
Training classification rate (%) / 90.0000 / 84.2857 / 88.5714 / 84.2857 / 82.8571 / 82.8571 / 82.8571 / 82.8571 / 82.8571
Testing classification rate (%) / 37.5000 / 68.7500 / 37.5000 / 56.2500 / 31.2500 / 43.7500 / 31.2500 / 31.2000 / 31.2000
Variable number of neurons/hidden layer w/ α = 0.1 and 3 hidden layers
# of neurons/layer / 1 / 2 / 3 / 4 / 5 / 6 / 7 / 8 / 9 / 10
Training classification rate (%) / 82.8571 / 85.7143 / 87.1429 / 88.5714 / 91.4286 / 95.7143 / 95.7143 / 92.8571 / 91.4286 / 91.4286
Testing classification rate (%) / 31.2500 / 31.2500 / 31.2500 / 31.2500 / 43.7500 / 31.25 / 56.2500 / 50.0000 / 50.0000 / 56.2500
The best test classification rate achieved in the 1st experiment was 68.75%, and most were below 50%, making these results inadequate for the task. Similarly, in the 2nd run, where the number of neurons/layer was varied, the best test classification rate achieved was only ~56%. I analyzed the results against the corresponding data and found that the main inconsistency was with the middle classification: the OK class, between Good and Bad. Thus I reevaluated which parameters of the specification I could change to improve the classification rate, and decided to remove the mid-rating and go with either Good or Bad.
As I looked through the feature vectors in the data file, I found 2 features that correlated well with my ratings (classifications): the character length and the % unique lyrics. The character length tended to be lower, and the % unique higher, for lyrics that were given a higher rating. Therefore, I decided that removing the ambiguities introduced by using 3 classes instead of 2 was a good way of improving my MLP performance. After altering the original C application to output only 2 classifications, the results were much better:
Variable number of hidden layers w/ α = 0.1 and 5 neurons/layer
# of Hidden Layers / 2 / 3 / 4 / 5 / 6 / 7 / 8 / 9 / 10
Training classification rate (%) / 100 / 100 / 97.1429 / 95.7143 / 95.7143 / 95.7143 / 95.7143 / 95.7143 / 80
Testing classification rate (%) / 87.0968 / 90.3226 / 96.7742 / 93.5484 / 93.5484 / 93.8454 / 61.2903 / 87.0968 / 61.2903
Variable number of neurons/hidden layer w/ α = 0.1 and 3 hidden layers
# of neurons/layer / 1 / 2 / 3 / 4 / 5 / 6 / 7 / 8 / 9 / 10
Training classification rate (%) / 88.5714 / 95.7143 / 100 / 97.1429 / 100 / 100 / 100 / 100 / 100 / 100
Testing classification rate (%) / 80.6452 / 93.5484 / 93.5484 / 90.3226 / 96.7742 / 93.55 / 93.5484 / 93.5484 / 90.3226 / 87.0698
# of neurons/layer / 11 / 12 / 13 / 14 / 15
Training classification rate (%) / 100 / 100 / 100 / 100 / 100
Testing classification rate (%) / 93.5484 / 90.3226 / 74.1935 / 96.7742 / 93.5484
Conclusions:
In the end, I found that the optimal settings for the MLP were:
•alpha = 0.1
•momentum = 0.8
•hidden layers = 3
•neurons / layer = 5
•test classification rate = 96.7742%
These settings gave the best results, with near convergence during training and a test classification rate nearly as high as the training classification rate. The classification rate achieved exceeded the baseline set by the k-nearest neighbor algorithm.
Appendix A-1
/*******************************************************************************
ECE 539 PROJECT
Author: Koji Yabumoto
Name: ece539prj.c current: version 5.3
History: 1.0 2003 - 11 - 03 Started
1.1~1.4 2003 - 11 - 04~06 Visits every file in directory
2.0~2.1 2003 - 11 - 15 Detects lyrics files only.
3.0~3.4 2003 - 11 - 15 Calculates number of chars,
lyrics and avg length.
4.0 2003 - 11 - 16 Calculates number of unique
words.
4.1 2003 - 11 - 22 Calculates % of unique words
5.0~5.2 2003 - 12 - 01 prints out results to file
5.3 2003 - 12 - 06 Input rating from user
5.4 2003 - 12 - 15 Classification down to 2
Description:
This program is part of the ECE 539 project that
takes lyrics from songs in a particular directory and
analyzes them for certain characteristics and then
formats them for the MLP matlab program to use.
Notes: This program file was compiled and developed in Windows XP
using the:
-- Microsoft Development Environment 2002
ver. 7.0.9492
Copyright 1987-2001 Microsoft Corp.
Microsoft .NET Framework 1.0 ver. 1.0.3705
Copyright 1998-2001 Microsoft Corp.
The executable that results is thus meant to be run in a
command prompt in the Windows environment.
*******************************************************************************/
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <windows.h>
void prjError(int e);
int chew_lyrics(char *filenamebuf,char *pLastName);
int process_file(char *filenamebuf, char *pLastName);
int detect_lyr(char* filenamebuf);
int load_lyrics(char *filenamebuf, char *pLastName);
int unique_count();
int parse_load(char* ibuf);
int number = 0;
int number_u = 0;
int rating1 = -1;
FILE *DataFile;
char version_number[14] = " 5.3";
typedef struct {
char data[30];
//data[0] = '\0';
}aWord;
aWord words[5000];
aWord not_unique[5000];
char* write_file = "lyric_data.txt";
int main (int argc, char* argv[])
{
    char filenamebuf[MAX_PATH];
    char* pLastName;
    HANDLE hfind;
    WIN32_FIND_DATA fdata;
    int failed = 0;

    /* Results are appended so that repeated runs accumulate data. */
    DataFile = fopen(write_file,"a");
    if (DataFile == NULL){
        perror(write_file);
        return(1);
    }
    /* && short-circuits, so argv[1] is only read when it exists. */
    if(argc == 2 && strcmp(argv[1],"/?") == 0){
        prjError(2);
        fclose(DataFile);
        return(0);
    }else if (argc > 1){
        prjError(1);
        fclose(DataFile);
        return(1);
    }//else if
    printf("note: 1st valid character (either 1,2,or3) will be used.\n");
    /* Build the search pattern "<current directory>\*". */
    if(GetFullPathName("*",
                       sizeof(filenamebuf),
                       filenamebuf,
                       &pLastName)){
        hfind = FindFirstFile(filenamebuf, &fdata);
        if (hfind != INVALID_HANDLE_VALUE){
            do{
                /* Skip subdirectories. */
                if (fdata.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
                    continue;
                strcpy(pLastName, fdata.cFileName);
                if (process_file(filenamebuf, pLastName)){
                    failed = 1;
                    break;
                }
            }while(FindNextFile(hfind, &fdata));
            FindClose(hfind);
        }
    }//if
    fclose(DataFile);
    if (failed)
        return(1);
    printf("DONE\n");
    return(0);
}//main
int process_file(char *filenamebuf, char *pLastName)
{
int i= 0;
//printf("filename: %s", pLastName);
if (!detect_lyr(filenamebuf)){
//printf("note: 1st valid character (either 1 or 2) will be used.\n");
printf("%s Rating(1=BAD, 2=GOOD):",pLastName);
//i = scanf("%d", &rating1);
i = getchar();