RecordAudio Walkthrough: C#

Recording an Audio Stream and Monitoring Direction

About This Walkthrough: In the Kinect™ for Windows® Software Development Kit (SDK) Beta, RecordAudio is a C# console application that demonstrates how to record an audio stream from the microphone array of the Kinect for Xbox 360® sensor and monitor the direction of the audio source. This document is a walkthrough of the RecordAudio application that is provided with the beta SDK.

Resources: For a complete list of documentation for the Kinect for Windows SDK Beta, plus related reference and links to the online forums, see the beta SDK website at:

Contents

Introduction

Program Basics

Create and Configure an Audio Source Object

Record the Audio Stream

Monitor the Beam Direction

License: The Kinect for Windows SDK Beta is licensed for non-commercial use only. By installing, copying, or otherwise using the beta SDK, you agree to be bound by the terms of its license. Read the license.

Disclaimer: This document is provided “as-is”. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

© 2011 Microsoft Corporation. All rights reserved.

Microsoft, DirectX, Kinect, MSDN, and Windows are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.

Introduction

The audio component of the Kinect™ for Xbox 360® sensor is a four-element microphone array. An array provides some significant advantages over a single microphone, including more sophisticated acoustic echo cancellation and noise suppression. By using beamforming algorithms, applications can use a microphone array as a directional microphone and focus on a particular audio source.

RecordAudio is a C# console application that demonstrates how to record an audio stream from the Kinect sensor’s microphone array and monitor the source direction. This document is a walkthrough of the RecordAudio application, which is provided with the Kinect for Windows® Software Development Kit (SDK) Beta.

For an example of how to implement a managed application to capture an audio stream from the Kinect sensor’s microphone array, see RecordAudio. For examples of how to implement a C++ application to capture an audio stream from the Kinect sensor’s microphone array, see “MicArrayEchoCancellation Walkthrough,” “AudioCaptureRaw Walkthrough,” and “MFAudioFilter Walkthrough” on the beta SDK website.

Program Basics

RecordAudio is installed with the Kinect for Windows Software Development Kit (SDK) Beta samples in %KINECTSDK_DIR%\Samples\KinectSDKSamples.zip. RecordAudio is a C# console application that is implemented in a single file, Program.cs.

Important: RecordAudio targets the x86 platform.

The basic program flow is as follows:

1. Create an object to represent the Kinect sensor’s microphone array.

2. Capture the audio stream and write it to a file.

3. Monitor the source direction.

To use RecordAudio:

1. Build the application.

2. Press Ctrl+F5 to run the application.

3. Speak while you move from side to side in front of the sensor.

The following shows some sample output:

Recording for 20 seconds

Beam direction changed (radians): 0

Beam direction changed (radians): -0.175

Sound source position (radians): -0.217024366288511 Beam: -0.175

Beam direction changed (radians): -0.349

Sound source position (radians): -0.340237945622282 Beam: -0.349

Beam direction changed (radians): -0.175

Sound source position (radians): -0.14727806217808 Beam: -0.175

The remainder of this document walks you through the application.

Note: This document includes code excerpts, most of which have been edited for brevity and readability. In particular, most routine error-correction code has been removed. For the complete code, see the RecordAudio sample. Hyperlinks in this walkthrough refer to content on the Microsoft® Developer Network (MSDN®) website.

Create and Configure an Audio Source Object

The KinectAudioSource object represents the Kinect sensor’s microphone array. Behind the scenes, it uses the MSRKinectAudio Microsoft DirectX® Media Object (DMO), as described in detail in “MicArrayEchoCancellation Walkthrough” on the beta SDK website.

Most of the sample is implemented in Main. The first step is to create and configure KinectAudioSource, as follows:

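static void Main(string[] args)
{
    var buffer = new byte[4096];
    const int recordTime = 20; // seconds
    // Bytes to record: seconds x 2 bytes per sample x 16,000 samples per second.
    const int recordingLength = recordTime * 2 * 16000;
    const string outputFileName = "out.wav";

    // Run at the highest priority to avoid dropped samples.
    Thread.CurrentThread.Priority = ThreadPriority.Highest;

    using (var source = new KinectAudioSource())
    {
        // Adaptive beamforming without acoustic echo cancellation.
        source.SystemMode = SystemMode.OptibeamArrayOnly;
        source.BeamChanged += source_BeamChanged;
        ...
    }
    ...
}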

RecordAudio first defines two constants that control the recording process:

  • The recording time, which is set to 20 seconds.
  • The recording length, in bytes, which is set to the product of the recording time (20 seconds), the sample size (2 bytes), and the sampling rate (16,000 samples per second).
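
The arithmetic behind the recording length can be checked with a short sketch. It assumes the capture format implied by the constants in the excerpt (2-byte mono samples at 16,000 samples per second); the RecordingLength class and its members are hypothetical names used only for illustration and are not part of the sample.

```csharp
using System;

// Hypothetical helper for illustration; not part of the RecordAudio sample.
public static class RecordingLength
{
    // Assumed capture format: 16,000 samples per second, 2 bytes per sample, mono.
    public const int SamplesPerSecond = 16000;
    public const int BytesPerSample = 2;

    // Returns the number of audio data bytes produced in the given number of seconds.
    public static int InBytes(int seconds)
    {
        return seconds * BytesPerSample * SamplesPerSecond;
    }

    // Returns the duration, in seconds, of audio held by a buffer of the given size.
    public static double SecondsInBuffer(int bufferBytes)
    {
        return (double)bufferBytes / (BytesPerSample * SamplesPerSecond);
    }
}
```

Under these assumptions, RecordingLength.InBytes(20) returns 640000, and RecordingLength.SecondsInBuffer(4096) returns 0.128, so each 4,096-byte buffer holds 128 milliseconds of audio.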

To avoid dropped samples, RecordAudio sets the thread priority to ThreadPriority.Highest.

RecordAudio next creates and configures a KinectAudioSource object, which represents the microphone array. You configure KinectAudioSource by setting various properties, which map directly to the MSRKinectAudio DMO’s property keys. For details, see the API reference.

The RecordAudio application configures the KinectAudioSource object’s system mode as an adaptive beam without acoustic echo cancellation (AEC). Otherwise, RecordAudio uses default settings.

KinectAudioSource handles beamforming internally and provides the results to the application. To use beamforming, you must set KinectAudioSource.MicArrayMode to one of the following MicArrayMode values, which differ in how they direct KinectAudioSource to choose among multiple audio sources:

  • MicArrayFixedBeam uses the center beam.
  • MicArrayExternalBeam uses the beam that the application specifies.
  • MicArrayAdaptiveBeam uses the beam that is closest to the direction that is specified by an internal source localization algorithm. This mode is enabled by default if you specify either Optibeam system mode.

RecordAudio uses the default MicArrayAdaptiveBeam mode.

Finally, RecordAudio subscribes to the KinectAudioSource.BeamChanged event, which is raised when the beam direction changes.

Record the Audio Stream

RecordAudio starts the audio stream, records it for 20 seconds, and writes the recorded stream to a .wav file, as follows:

static void Main(string[] args)
{
    ...
    using (var source = new KinectAudioSource())
    {
        ...
        using (var fileStream = new FileStream(outputFileName, FileMode.Create))
        {
            WriteWavHeader(fileStream, recordingLength);

            // Start capturing; Start returns the audio stream.
            using (var audioStream = source.Start())
            {
                int count, totalCount = 0;

                // Read until the stream ends or the recording length is reached.
                while ((count = audioStream.Read(buffer, 0, buffer.Length)) > 0
                       && totalCount < recordingLength)
                {
                    fileStream.Write(buffer, 0, count);
                    totalCount += count;

                    // Report directions only when the localization confidence is high.
                    if (source.SoundSourcePositionConfidence > 0.9)
                        Console.Write("Sound source position (radians): {0}\t\tBeam: {1}\r",
                            source.SoundSourcePosition, source.MicArrayBeamAngle);
                }
            }
        }
    }
}

Before starting the recording process, RecordAudio creates a FileStream object to represent the output file and calls the private WriteWavHeader method to write the file’s .wav header. For details, see the sample.
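
This walkthrough does not show the body of WriteWavHeader. As a rough illustration of what such a method has to produce, the following sketch writes a canonical 44-byte PCM .wav header. It is not the sample's implementation: the WavHeaderSketch class is a hypothetical name, and the format values (mono, 16 bits per sample, 16,000 samples per second) are assumptions based on the constants in the recording-length calculation.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for the sample's private WriteWavHeader method.
public static class WavHeaderSketch
{
    // Writes a 44-byte RIFF/WAVE header for PCM audio, followed by no data.
    // dataLength is the number of audio data bytes that will follow the header.
    public static void Write(Stream stream, int dataLength)
    {
        // Assumed format: 16 kHz, 16-bit, mono PCM.
        const int sampleRate = 16000;
        const short channels = 1;
        const short bitsPerSample = 16;
        const short blockAlign = (short)(channels * bitsPerSample / 8); // bytes per sample frame
        const int byteRate = sampleRate * blockAlign;                   // bytes per second

        // BinaryWriter writes integers in little-endian order, as RIFF requires.
        // The writer is flushed but not disposed, so the stream stays open.
        var writer = new BinaryWriter(stream);
        writer.Write(Encoding.ASCII.GetBytes("RIFF"));
        writer.Write(36 + dataLength);                 // size of everything after this field
        writer.Write(Encoding.ASCII.GetBytes("WAVE"));
        writer.Write(Encoding.ASCII.GetBytes("fmt "));
        writer.Write(16);                              // size of the PCM format chunk
        writer.Write((short)1);                        // format tag 1 = uncompressed PCM
        writer.Write(channels);
        writer.Write(sampleRate);
        writer.Write(byteRate);
        writer.Write(blockAlign);
        writer.Write(bitsPerSample);
        writer.Write(Encoding.ASCII.GetBytes("data"));
        writer.Write(dataLength);                      // size of the audio data that follows
        writer.Flush();
    }
}
```

Because the header records the data length before any audio is captured, the length passed in must match the number of bytes that are eventually written, or players may report the wrong duration.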

RecordAudio then calls KinectAudioSource.Start, which starts the audio stream and returns the associated Stream object. The recording process is handled by the while loop, which calls Stream.Read to read the stream buffer by buffer and FileStream.Write to write each buffer to the output file. The loop then prints the source and beam directions if the source location’s confidence value is greater than 0.9. The loop terminates when the stream ends or when the total number of recorded bytes reaches the specified recording length.
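
The read-and-write pattern can be tried without a Kinect sensor by substituting any Stream for the audio stream. The following sketch does that; the StreamRecorderSketch class and its CopyUpTo method are hypothetical names, not part of the sample or the SDK. It differs from the excerpt in one deliberate way: it clamps each read to the bytes that remain, so the output never exceeds the limit. The loop in the excerpt writes whole buffers, so if every read returns a full 4,096-byte buffer, it can write up to 643,072 bytes against a limit of 640,000.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the RecordAudio sample.
public static class StreamRecorderSketch
{
    // Copies bytes from input to output until input ends or maxBytes
    // have been written. Returns the number of bytes written.
    public static int CopyUpTo(Stream input, Stream output, int maxBytes)
    {
        var buffer = new byte[4096];
        int totalCount = 0;

        while (totalCount < maxBytes)
        {
            // Never request more than the bytes that remain under the limit.
            int toRead = Math.Min(buffer.Length, maxBytes - totalCount);

            // Read may return fewer bytes than requested; 0 means end of stream.
            int count = input.Read(buffer, 0, toRead);
            if (count <= 0)
            {
                break;
            }

            output.Write(buffer, 0, count);
            totalCount += count;
        }

        return totalCount;
    }
}
```

For example, copying from a MemoryStream that holds 10,000 bytes with a limit of 5,000 writes exactly 5,000 bytes, and copying from a 1,000-byte stream with the same limit writes 1,000 bytes and stops at the end of the input.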

Monitor the Beam Direction

KinectAudioSource raises a BeamChanged event when the adaptive beamforming algorithm switches beams. RecordAudio handles the event and prints the new beam direction, as follows:

static void source_BeamChanged(object sender, BeamChangedEventArgs e)
{
    Console.WriteLine("Beam direction changed (radians): {0}", e.Angle);
}

The BeamChangedEventArgs object contains the current beam angle, in radians. From the perspective of a user facing the Kinect sensor, you interpret the angle as follows:

  • 0: The beam is directly in front of the sensor.
  • Positive angle: The beam is right of center.
  • Negative angle: The beam is left of center.
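
Angles in radians can be hard to read at a glance, so a handler might convert them to degrees and label the side before printing. The following is a minimal sketch of that idea; the BeamAngleSketch class and its methods are hypothetical names and are not part of the sample or the SDK. The left and right labels follow the sign convention listed above.

```csharp
using System;

// Hypothetical helper for illustration; not part of the RecordAudio sample.
public static class BeamAngleSketch
{
    // Converts an angle from radians to degrees.
    public static double ToDegrees(double radians)
    {
        return radians * 180.0 / Math.PI;
    }

    // Describes the beam direction by using the sign convention in this walkthrough:
    // 0 is directly in front of the sensor, positive is right of center,
    // and negative is left of center.
    public static string Describe(double radians)
    {
        if (radians > 0)
        {
            return "right of center";
        }
        if (radians < 0)
        {
            return "left of center";
        }
        return "center";
    }
}
```

Applied to the sample output shown earlier, a beam angle of -0.175 radians is about -10 degrees and -0.349 radians is about -20 degrees, both left of center.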

For More Information

For more information about implementing audio and related samples, see the Programming Guide page on the Kinect for Windows SDK Beta website at: