Interconference Room Video Communications System

Project Plan

Team

May 03-13

Date of Submission

Tuesday, September 24th, 2002

Client

Senior Design

Faculty Advisor

S.S. Venkata

ECpE Department Chair

Team Members

Noah Korba

Brian Marshall

Nick McInerney

Jalal Saidi

Melissa Weverka


Table of Contents

1 Introductory Materials

1.1 Abstract

1.2 Definition of Terms

2 Project Plan

2.1 Introduction

2.1.1 General Background

2.1.2 Technical Problem

2.1.3 Operating Environment

2.1.4 Intended Users and Uses

2.1.5 Assumptions and Limitations

2.2 Design Requirements

2.2.1 Design Objectives

2.2.2 Functional Requirements

2.2.3 Design Constraints

2.2.4 Measurable Milestones

2.3 End-Product Description

2.4 Approach and Design

2.4.1 Technical Approaches

2.4.2 Technical Design

2.4.3 Testing Description

2.4.4 Risks/Risk Management

2.5 Financial Budget

2.6 Personal Effort Budget

2.7 Project Schedule

3 Closure Material

3.1 Project Team Information

3.2 Summary

List of Figures

Figure 1: A diagram of our system

Figure 2: Project Schedule


List of Tables

Table 1: Financial Budget

Table 2: Personal Effort Budget

1 Introductory Materials

1.1 Abstract

Many large organizations are incorporating telecommunications into their everyday routine. To make this technology available to smaller institutions, a low cost multi-conference room video-communications system will be created. To accomplish this, the system will be Internet-based, and provide two-way audio and video streaming. This design allows information to be presented around the world.

1.2 Definition of Terms

Asymmetric Key Encryption – AKE is usually known as Public/Private Key Encryption. The most famous algorithm, RSA, is widely regarded as one of the world's strongest overall encryption methods. AKE involves choosing a key size (“bit strength”) and producing a pair of keys, which are then owned by the key set owner. The public key is made available to anyone who wishes to send the owner encrypted data. The sender encrypts the data with the public key and transmits it to the key set owner, who decrypts it with the private key, which is available only to the owner. The only way to decrypt the data is to own (or forcefully generate) the private key. For comparison, a 56-bit key, such as the one used in the RSA Laboratories RC5-56 challenge (RC5 is a symmetric cipher, not RSA itself), has about 72 quadrillion possible values, and cracking it took a distributed, parallel effort on the order of months.
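As a rough illustration of the public/private key idea described above, the following Python sketch builds a toy RSA key pair. The primes 61 and 53, the exponent 17, and the function names are assumptions chosen only to keep the numbers readable; keys this small offer no security, and a real system would rely on a vetted cryptographic library.

```python
# Toy RSA key pair with deliberately tiny primes (illustration only, NOT secure).
p, q = 61, 53                  # assumed small primes; real keys use far larger ones
n = p * q                      # public modulus, shared by both keys
phi = (p - 1) * (q - 1)        # Euler's totient of n
e = 17                         # public exponent, chosen coprime with phi
d = pow(e, -1, phi)            # private exponent: modular inverse of e (Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt integer m (0 <= m < n) with the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt integer c with the private key (d, n)."""
    return pow(c, d, n)


message = 65                   # hypothetical one-number "message"
cipher = encrypt(message)      # anyone holding the public key can do this
recovered = decrypt(cipher)    # only the private key holder can undo it
```

Anyone may call encrypt, but only the holder of d can recover the message, which is the property the definition relies on.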

Audio Capture Device – A device attached to the computer, usually plugged into the PCI bus, which converts sound patterns into Raw Audio Data. Usually called a “Sound Card”.

CODEC – See both Compression and Decompression

Compression – The “CO” in CODEC, compression is the method of representing a piece of data in a smaller size. In the early days of computing, compression was limited to general algorithms designed to shrink the size of raw, arbitrary data. In recent years, compression has become specialized, from audio/video compression (such as DivX ;-) or MP3) to text-based compression. The newer audio and video methods combine pattern recognition across frames with removal of less noticeable data to shrink the data with little perceptible loss of quality. Most compression utilities can be configured to favor a smaller size over data quality, and vice versa.

Decompression – The “DEC” in CODEC, decompression is the method of recreating raw data from a compressed piece of data. The primary job of the decompression portion is to detect the style of compression used on a file and produce raw data in the best form possible. Many decompression algorithms employ best-guess techniques to predict and reproduce data as close to the original as possible.
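The size-versus-effort trade-off and the compress/decompress round trip can be sketched with Python's standard zlib module. Note that zlib is a general-purpose lossless compressor, used here only as a stand-in; the audio and video CODECs discussed above are typically lossy, and the sample data is invented for the example.

```python
import zlib

# Stand-in for repetitive raw data (hypothetical sample, not real video).
raw = b"frame-data " * 200

fast = zlib.compress(raw, 1)       # level 1 favors speed over size
small = zlib.compress(raw, 9)      # level 9 favors size over speed

restored = zlib.decompress(small)  # decompression recreates the raw data

print(len(raw), len(fast), len(small))
```

Because zlib is lossless, the restored bytes match the original exactly; a lossy CODEC would return only an approximation.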

Decryption – The process of revealing concealed (“encrypted”) data with the proper authorization.

Demultiplexing – The process of taking a multiplexed stream of data, and splitting it back into its original individual parts.

Encryption – The process of concealing data in an unreadable form, unless proper authorization is possessed to view (“decrypt”) the data.

MCP – Acronym for motion controlled platform

Multiplexing – The process of taking individual data streams and combining them into one single data stream. The result is called a multiplexed stream.
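One simple way to multiplex and demultiplex is to prefix each chunk with a small header naming its stream. The sketch below assumes a hypothetical 3-byte header (1-byte stream id, 2-byte length); the header layout, the stream ids, and the function names are illustrative choices, not part of the project design.

```python
import struct

# Hypothetical header: stream id (1 byte) + payload length (2 bytes), network order.
HEADER = struct.Struct("!BH")
AUDIO, VIDEO = 1, 2                # assumed stream identifiers


def multiplex(chunks):
    """Combine (stream_id, payload) pairs into one byte stream."""
    out = bytearray()
    for stream_id, payload in chunks:
        out += HEADER.pack(stream_id, len(payload))
        out += payload
    return bytes(out)


def demultiplex(data):
    """Split a multiplexed byte stream back into per-stream payloads."""
    streams = {}
    offset = 0
    while offset < len(data):
        stream_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        streams.setdefault(stream_id, bytearray()).extend(data[offset:offset + length])
        offset += length
    return {sid: bytes(buf) for sid, buf in streams.items()}


muxed = multiplex([(VIDEO, b"v0"), (AUDIO, b"a0"), (VIDEO, b"v1"), (AUDIO, b"a1")])
parts = demultiplex(muxed)
```

Here parts maps each stream id back to its original bytes, in the order they were sent.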

Raw Audio Data – The format of data to which standard audio generation can be applied to produce audible sounds through the user's output device. For example, compressed audio data (such as MP3, i.e. MPEG-1 Audio Layer III compression) cannot be heard unless it is decompressed into Raw Audio Data. Only then can it be played on the user's audio output device.

Raw Video Data – The format of data to which standard graphical rendering can be applied to produce full-motion video on the user's computer screen. For example, compressed video data (such as DivX ;-), an MPEG-4 compression) cannot be displayed unless it is decompressed into Raw Video Data. Only then can it be rendered on the user's video output device.

RPC – Remote Participant Control

Socket – The term socket usually refers to Berkeley Sockets, a programming interface for communications over the Internet. Implementations of Berkeley Sockets are used on virtually all operating systems, from UNIX (IO-Socket libraries) to Windows (WinSock). They provide a consistent way for programmers to create network applications. Older UNIX implementations exposed three socket types: Raw (directly on top of IP), TCP/IP and UDP/IP. Most applications now use only UDP/IP and TCP/IP sockets. Sockets allow one to create both clients and servers for network communications.

Stream, Streaming – The word “stream” is used in many different contexts. Most notably, socket communications over the Internet are called “streaming” since they move data from one point to another continuously, like a river. “Stream” can also refer to raw, unprocessed data, which usually comes directly from a socket buffer.

Symmetric Key Encryption – SKE involves “secret key” encryption and decryption. Unlike AKE, where everyone can know your public key, SKE keys must be held only by those involved in the data exchange, and the same key both encrypts and decrypts. An SKE key is often shared one-to-one between two parties. One of the most basic SKE schemes is attributed to Julius Caesar. This cipher (known as the Caesar Cipher) is based on the equation E = (N + K) modulus 26, where N is the numerical representation of a letter (A=0, B=1, ..., Z=25) and K is the key (the shift). The modulus 26 ensures the result stays between 0 and 25 (A and Z).
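The Caesar Cipher equation can be sketched in a few lines of Python. This sketch numbers the letters from zero (A=0 ... Z=25) so that the modulus 26 maps directly back onto the alphabet; the function name and the sample text are illustrative assumptions.

```python
import string

ALPHABET = string.ascii_uppercase      # "A".."Z", so A=0 ... Z=25


def caesar(text: str, key: int) -> str:
    """Shift each letter by key positions: E = (N + K) mod 26."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            n = ALPHABET.index(ch)               # numeric value N of the letter
            out.append(ALPHABET[(n + key) % 26])
        else:
            out.append(ch)                       # leave spaces and punctuation alone
    return "".join(out)


secret = caesar("HELLO", 3)    # -> "KHOOR"
plain = caesar(secret, -3)     # the same key, negated, reverses the shift
```

Because the same key is used in both directions, both parties must keep it secret, which is the defining property of SKE.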

Synchronized Multimedia Stream – A synchronized multimedia stream is exactly what the name implies: a stream of data in which video is matched up to its corresponding audio. This stream will eventually be split apart again at the receiving end. For example, if the multimedia stream were not synchronized, video of a person talking might not match up with the corresponding audio. In Hollywood, this is called “bad dubbing”.
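One possible way to match video to audio is to stamp every frame and audio chunk with a capture time and pair the closest ones. The sketch below is an assumption about how this might be done, not the project's chosen method; the 20 ms tolerance, the timestamps, and the function name are hypothetical.

```python
def synchronize(video, audio, tolerance_ms=20):
    """Pair each video frame with the audio chunk closest to it in time.

    video and audio are lists of (timestamp_ms, payload) tuples. Frames with
    no audio chunk within tolerance_ms are dropped from the result.
    """
    pairs = []
    for v_ts, frame in video:
        a_ts, chunk = min(audio, key=lambda item: abs(item[0] - v_ts))
        if abs(a_ts - v_ts) <= tolerance_ms:
            pairs.append((v_ts, frame, chunk))
    return pairs


# Hypothetical capture timestamps in milliseconds.
video = [(0, "frame0"), (66, "frame1"), (133, "frame2")]
audio = [(0, "audio0"), (64, "audio1"), (200, "audio2")]

synced = synchronize(video, audio)
```

In this example the third frame has no audio within the tolerance, so only the first two frames are paired.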

TCP – TCP stands for Transmission Control Protocol. One of the core protocols of the Internet, TCP was established in the 1970s to provide reliable communications over packet-switched networks. Unlike its sibling UDP, TCP uses a three-way handshake to establish a connection between two computers. TCP also guards against corruption by carrying a 16-bit checksum in each segment header, which lets the receiver determine whether the segment was corrupted in transit. If so, the segment is discarded, and TCP's built-in acknowledgement and retransmission, together with flow control, ensure the data arrives intact. If the physical connection prevents packets from being delivered (a break in the line, for instance), TCP will “time out” on the connection and return an error to the user. Most application layer protocols (such as Telnet, FTP, HTTP and SSH) use TCP as their primary transport method.
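The integrity check TCP performs is, in practice, a 16-bit one's-complement checksum rather than a CRC. The sketch below shows that style of checksum over a payload only; real TCP also covers the segment header and a pseudo-header, which are omitted here, and the function name and sample data are illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style used by TCP, UDP and IP."""
    if len(data) % 2:
        data += b"\x00"                           # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits
    return ~total & 0xFFFF                        # one's complement of the folded sum


sent = b"conference data"
checksum = internet_checksum(sent)

received_ok = b"conference data"
received_bad = b"conferenCe data"                 # one byte corrupted in transit

ok_matches = internet_checksum(received_ok) == checksum
bad_matches = internet_checksum(received_bad) == checksum
```

The receiver recomputes the checksum and discards any segment that does not match, leaving retransmission to recover the data.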

UDP – UDP stands for User Datagram Protocol. Sibling to TCP, UDP provides connectionless communications, with no guarantee of data delivery or ordering. Because UDP has no connection, it has no built-in “timeout”; it is the sole responsibility of the programmer to decide when enough time has passed to call a peer “dead”. UDP is most often used when the overhead of a TCP connection is too great for the application, for example in streaming audio and video, where error correction and packet retransmission are detriments to overall performance. A handful of application-layer protocols (DNS, older versions of RealAudio and Shoutcast) use UDP as their transport method. Many programs that need UDP-style delivery for “streaming” implementations are now moving to IP multicast.
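Because UDP itself never reports a dead peer, the application has to impose its own deadline. The following sketch, which assumes the loopback interface is available and uses arbitrary timeout values, sends one datagram between two local sockets and then treats a quiet period as a lost peer.

```python
import socket

# Receiver bound to an OS-chosen port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(1.0)                  # assumed deadline for the first datagram
address = receiver.getsockname()

# Sender needs no connection: it simply addresses each datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame 1", address)

data, _ = receiver.recvfrom(2048)         # the one datagram that was sent

# Nothing else is coming, so the application-level timeout fires.
receiver.settimeout(0.2)                  # assumed "peer is dead" threshold
try:
    receiver.recvfrom(2048)
    peer_alive = True
except socket.timeout:
    peer_alive = False
finally:
    sender.close()
    receiver.close()
```

The threshold is a design choice: too short and brief network stalls look like a dropped conference, too long and users wait on a dead link.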

USB – Acronym for Universal Serial Bus

Video Capture Device – A device attached to the computer, usually through a USB (Universal Serial Bus) port, that converts light patterns into Raw Video Data.

Visual Aid – Any form of graphic that illustrates what a speaker is attempting to communicate, e.g. whiteboard drawings or notes on a piece of paper.

2 Project Plan

2.1 Introduction

2.1.1 General Background

The objective of the Interconference Room Video Communications System is to create a low cost alternative to current telecommunication technology. To keep costs down, some traditional features were modified while maintaining a relatively high level of quality. For example, the Internet will be used as the communication medium, since it is already established and adds no per-call cost. Also, the rooms will be limited in size to reduce the cost of equipment. Finally, the system will be portable so it can be shared among many people in the institution.

2.1.2 Technical Problem

To make the work efficient, the project will be split into two segments: the hardware segment and the software segment.

  • Hardware

The hardware segment consists mainly of creating, buying and testing the hardware that will interface with the rest of the system. It will also involve creating drivers to interface those hardware devices with the software portion of the project.

  • Software

The software segment consists of writing a Graphical User Interface, implementing CODECs for audio, video and data compression, creating an encryption and decryption engine for the system, and providing seamless integration with the ever-changing technology existing on the Internet.

2.1.3 Operating Environment

The software will be written for the Microsoft Windows family of operating systems. The hardware will be used indoors, but will need to be reasonably durable, since it is portable.

2.1.4 Intended Users and Uses

The intended users of the multimedia conference system are those who need to conduct meetings with others who are not in the immediate area, or cannot attend a meeting in person. It is designed for both the novice and expert computer user - providing an easy to understand interface for the beginner, along with the ability to control virtually all aspects of the system for the computer guru.

2.1.5 Assumptions and Limitations

Several assumptions have been made for this project, and the project is also subject to several limitations. Both are listed below:

Assumptions

  • Conference rooms will be small in size
  • There will be only two rooms in communication with each other
  • Each room will contain a maximum of five participants

Limitations

  • Internet Bandwidth
  • Maximum CPU processing power
  • Compression strength

2.2 Design Requirements

2.2.1 Design Objectives

The project has been broken down into the objectives listed below.

  • Display multimedia on both participants' screens

Both conference rooms will be able to see the remote video, as well as hear remote audio.

  • Compressed and Encrypted video and audio streams

The system will use audio, video and data compression techniques to condense the audio/video streams and conserve both participants' bandwidth. The system shall also offer optional medium-strength Asymmetric Key encryption to provide data security, so that only authorized participants are able to view the audio/video stream.

  • Motion Controlled Platform (MCP)

A platform will be created on which a camera can be mounted. This will allow the camera to be moved both up and down (tilt) and left and right (pan).

Figure 1: A diagram of our system

2.2.2 Functional Requirements

The following defines the functional requirements the end product will meet.

  • All participants can operate the system

The system will be designed for ease of both setup and use, so that participants with widely varying levels of computer knowledge can operate the system.

  • Participants can choose to enable encryption and/or compression

The system shall allow the participants to turn on or off both compression and encryption, to increase transmission performance, or save on CPU cycles.

  • Local Camera Control

The system shall give each participant local control over their camera, to focus the camera on a specific person, or all of the participants.

  • Portability

The system can be easily moved and set up in different conference rooms.

2.2.3 Design Constraints

This section defines constraints considered during the design and implementation of the project.

  • Cost

The system must be low cost while maintaining its functionality.

  • Local Internet bandwidth

Since video will be streaming over the Internet, there must be sufficient bandwidth to view real time video at all client computers.

  • CPU processing power

The server must composite and stream real time audio and video, as well as process multiple incoming audio streams. This will require significant CPU power.
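A back-of-the-envelope calculation shows why the bandwidth constraint forces compression. Every figure below (frame size, frame rate, audio format, link speed) is an assumption picked for illustration; the project plan does not specify them.

```python
# Assumed capture settings (hypothetical, not taken from the project plan).
WIDTH, HEIGHT = 320, 240        # pixels per frame
BITS_PER_PIXEL = 24             # uncompressed RGB
FRAMES_PER_SECOND = 15

SAMPLE_RATE = 44_100            # audio samples per second
BITS_PER_SAMPLE = 16
CHANNELS = 1                    # mono

LINK_BPS = 1_500_000            # assumed usable Internet bandwidth, bits per second

raw_video_bps = WIDTH * HEIGHT * BITS_PER_PIXEL * FRAMES_PER_SECOND
raw_audio_bps = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS

# How much the combined stream must shrink to fit through the link.
required_ratio = (raw_video_bps + raw_audio_bps) / LINK_BPS

print(f"raw video: {raw_video_bps / 1e6:.1f} Mbit/s")
print(f"raw audio: {raw_audio_bps / 1e6:.2f} Mbit/s")
print(f"compression needed: about {required_ratio:.0f}:1")
```

Under these assumptions the raw streams exceed the link by roughly a factor of nineteen, which is why the CODEC choice also drives the CPU constraint.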

2.2.4 Measurable Milestones

The following lists the measurable milestones of the project.

  • Project scope and intended features defined (5%)

Features for the completed system are defined that would create a complete and usable product.

  • All subsystems’ functionalities and interfaces designed (20%)

All subsystems can be implemented using the documented subsystem description and interfaces.

  • All subsystems function properly under controlled conditions (30%)

Using test inputs, all subsystems pass limited tests and produce the expected output for all features, even if only under controlled inputs.

  • All subsystems function properly under all conditions (20%)

Using test inputs, all subsystems will work under all foreseeable conditions and produce expected outputs.

  • Complete system operates under controlled conditions (15%)

Using completed subsystems, the final product can be assembled, and produce expected outputs under controlled conditions.

  • Complete system operates under all conditions (10%)

The completed system will operate in a usable state, with no known errors.
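The milestone weights above sum to 100%, so overall progress can be read as a running total. The short sketch below encodes the six weights from this section; the function name and the assumption that milestones are completed in the order listed are illustrative.

```python
# Milestones and weights as listed in this section.
MILESTONES = [
    ("Project scope and intended features defined", 5),
    ("All subsystems' functionalities and interfaces designed", 20),
    ("All subsystems function properly under controlled conditions", 30),
    ("All subsystems function properly under all conditions", 20),
    ("Complete system operates under controlled conditions", 15),
    ("Complete system operates under all conditions", 10),
]


def progress(completed_count: int) -> int:
    """Percent complete, assuming milestones are finished in listed order."""
    return sum(weight for _, weight in MILESTONES[:completed_count])


total_weight = progress(len(MILESTONES))   # sanity check: should be 100
after_three = progress(3)                  # scope + design + controlled subsystem tests
```

Reaching the third milestone therefore represents just over half of the project's measured effort.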

2.3 End-Product Description

The system will connect two small conference rooms with a small number of participants per room. It will broadcast both audio and video from one room to another using the Internet. The cameras will be mounted on a motion-controlled platform, which allows them to be focused on individual speakers, as well as the entire group. The system will be easily portable from one room to another, allowing many people to utilize one system.

2.4 Approach and Design

2.4.1 Technical Approaches

This system will be composed of a software subsystem and a hardware subsystem. The following lists the different approaches that will be considered for the design phase of the project.

Software

The software segment of this system can be approached in several ways. The software components can be built as self-contained modules that communicate through an external mechanism such as TCP/IP or pipes, or as a single program divided into components such as DLLs or classes.

The network communication can likewise be approached in one of two ways: as a client-server model or as a peer-to-peer model.

Hardware

  • Motion Controlled Platform (MCP)

There are several ways in which the MCP could be designed. It can be controlled by X-Y axis positioning with a joystick or similar device. It could also be preprogrammed to move to specific positions in the conference room, focusing on a specific area rather than a panoramic view of the entire room.

  • Loud Speakers

Self-amplified computer speakers could be used, or stereo speakers driven through a receiver could be implemented.
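The two MCP control approaches described above, joystick-style X-Y movement and preprogrammed positions, can coexist in one small controller. The sketch below is only a software model under assumed travel limits; the class name, the angle ranges, and the preset name are hypothetical, and no real motor interface is shown.

```python
from dataclasses import dataclass, field

# Assumed mechanical travel limits in degrees (hypothetical values).
PAN_RANGE = (-90.0, 90.0)
TILT_RANGE = (-30.0, 30.0)


def clamp(value: float, low: float, high: float) -> float:
    """Keep value inside [low, high] so the platform never overshoots its limits."""
    return max(low, min(high, value))


@dataclass
class Platform:
    pan: float = 0.0
    tilt: float = 0.0
    presets: dict = field(default_factory=dict)

    def nudge(self, d_pan: float, d_tilt: float) -> None:
        """Joystick-style relative move along the X (pan) and Y (tilt) axes."""
        self.pan = clamp(self.pan + d_pan, *PAN_RANGE)
        self.tilt = clamp(self.tilt + d_tilt, *TILT_RANGE)

    def save_preset(self, name: str) -> None:
        """Remember the current position under a name."""
        self.presets[name] = (self.pan, self.tilt)

    def go_to(self, name: str) -> None:
        """Jump to a previously saved position."""
        self.pan, self.tilt = self.presets[name]


mcp = Platform()
mcp.nudge(40, 10)               # aim at a speaker with the joystick
mcp.save_preset("speaker")
mcp.nudge(100, -100)            # an oversized move is clamped to the limits
limit_position = (mcp.pan, mcp.tilt)
mcp.go_to("speaker")            # return to the saved position
```

Keeping presets as plain (pan, tilt) pairs means the same movement code serves both control styles.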

2.4.2 Technical Design

Software

Server model

The two conference rooms will each have a computer that will run the audio and video feeds. These computers can talk to each other as peers, or one can establish itself as a server to set conference properties such as encryption strength and compression type.

Compression/Encryption

The conference can be encrypted and compressed. There are numerous different algorithms to accomplish these tasks. Either one algorithm for each category will be used, or several will be available to the user to choose. The user will have the option to turn off encryption, since it will produce additional CPU strain.
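The optional compression and encryption stages described above can be modeled as a small pipeline with two switches. In this sketch zlib stands in for the CODEC, and a trivial XOR routine stands in for the encryption engine; the XOR step is a placeholder with no real security, and the settings class, key, and function names are assumptions made for illustration.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ConferenceSettings:
    compress: bool = True
    encrypt: bool = False
    key: bytes = b"demo-key"     # hypothetical shared key


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher (NOT secure); applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def prepare(payload: bytes, settings: ConferenceSettings) -> bytes:
    """Sender side: compress first, then encrypt, each only if enabled."""
    if settings.compress:
        payload = zlib.compress(payload)
    if settings.encrypt:
        payload = xor_cipher(payload, settings.key)
    return payload


def restore(payload: bytes, settings: ConferenceSettings) -> bytes:
    """Receiver side: undo the stages in reverse order."""
    if settings.encrypt:
        payload = xor_cipher(payload, settings.key)
    if settings.compress:
        payload = zlib.decompress(payload)
    return payload


settings = ConferenceSettings(compress=True, encrypt=True)
original = b"audio/video payload " * 50
wire = prepare(original, settings)
received = restore(wire, settings)
```

Compression runs before encryption because well-encrypted data looks random and compresses poorly; both rooms must also agree on the same settings, which is one reason to let a single machine set the conference properties.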

Video Capture

The video from the cameras will need to be captured into a raw digital video format. Depending on the camera used, this can be done by writing an interface driver for a digital camera, or writing an interface driver for a video capture card.

Audio Capture

The audio for each conference room will be input to the computer through the sound card, so the raw audio will need to be extracted either from the operating system, or the sound card itself.