Converting Shoutcast W3C files into sound exchange format using Excel

By Bill Fawcett - WMRA

PRELIMINARY PROCESSING

FILE/OPEN/ALL FILES

select the W3c.log file of interest

(*) delimited

NEXT

[*] SPACE (the Shoutcast file is SPACE delimited)

FINISH

Trim Spreadsheet to about 16 days (this is not final trim)

At this point it may be helpful to move the column headers over the correct columns.

Move all to the left 1 column.

Delete Columns:

B C-dns

H sc-bytes

J avg bandwidth

Copy Column G x-duration before

Column E c-status

Delete Columns H & I

FILE/SAVE AS/TEXT (Tab Delimited) temp.txt

click past warnings

Close file

TRIM REFERRER CODE

FILE/OPEN/ALL FILES

select temp.txt

(*) delimited

NEXT

[*] TAB and [*] OTHER %

FINISH

Delete all columns H to end

STREAM CODE CLEANUP

If your stream code looks like this:

/stream?title=Unknown

Highlight Fields D(end) to D1

(click on D(end), hit shift key, pull slider bar up to D1. click on D1).

Find and Replace

FIND: /stream?title=Unknown

REPLACE: your callsign or stream ID

REPLACE ALL / CLOSE

TIME ZONE MANIPULATION

Insert two columns after column C

Dates in Excel are stored as a number of days since 1/1/1900

Time are a fraction of the day- so 12 noon, for instance, is .500000

The Date is in column B.

It will look like this: 4/9/2013

The time goes in Column C

It will look like this: 7:01:53

In our example, the data starts in Row 6, and we are EDT so we will add 4 hours.

Column D:

=sum(B6+C6)+(4/24) Substitute your offset for “4”

the solution returned will be

4/9/13 11:01 DEPENDING ON HOW THAT COLUM IS FORMATTED

Now we need to break it down into a separate Date and Time field.

Simple, copy column D to column E

=sum(D1)

Now format Column D to DATE (Locale=English UK) (2001-03-14) yyyy:mm:dd

and format Column E to TIME (Locale=English UK) (13:30:55) hh:mm:ss

(note 2 options, choose the FIRST one with leading zeros)

Now copy the formula in D6 and E6 to all active rows in the spreadsheet.

Again, using the SHIFT and slider trick will speed things up on huge files.

Now, you need to FIX the data in the new columns before deleting the original columns;

select the entire columns D & E, copy, past special - VALUES. Do this for the new Day and Time column.

Then delete the original time and date columns.

REMOVAL OF SUPERFLUOUS ROWS

Select the entire active block

(use SHIFT and slider trick)

Sort using column A, on Values A to Z

Check for any extra stuff at top and bottom, delete all rows.

You could easily have 500 rows of garbage starting with #

Select Active Block Again

Sort by Column B, on Values, Oldest to newest AND

Column C, on Values, smallest to largest

(Note- you could leave those columns in GENERAL format and just sort by one column)

Trim dates down to a 14 day period starting at Day 1 00:00:00 and ending at Day 1 +13 24:00:00.

EXPORT TO TAB DELIMITED FORMAT

Select Active Block again

Ctrl-C (copy)

Open new spreadsheet FILE/NEW/BLANK

With cursor in A1 PASTE - CNTRL-V

(This step eliminates any extra columns or rows)

FILE/SAVE AS/TEXT (Tab Delimited) Wxxx Q2 201x.txt

Click past warnings

DONE!

COMMON PROBLEMS

1. Trying to sort/process on ginormous whole sheet. Select active section only

2. Not using UK date and time codes

3. Not properly eliminating chaff from file which includes stuff like this:

#Software: SHOUTcast

#Version: 1.9.8

#Fields: c-ip c-dns date time cs-uri-stem c-status cs(User-Agent) sc-bytes x-duration avgbandwidth

4. Neglecting <copy/paste special VALUE> before trimming original date columns

5. Trying to span a time period which is part STANDARD and part DAYLIGHT.

wdfver 7-8-2013 1045