SRI RAMAKRISHNA ENGINEERING COLLEGE

COIMBATORE 641 022

AN ELECTRONIC EYE FOR THE VISUALLY IMPAIRED AND A HOMELAND SECURITY APPLICATION USING NANOTECHNOLOGY

Submitted by

B. SUGANYA

P. RAJALAKSHMI ABIRAMI

III Year CSE

ABSTRACT

This paper proposes two applications of nanotechnology: a method for detecting frontal pedestrian crossings from image data obtained with a single camera, as a travel aid for the visually challenged, and the use of nanotechnology for national security in the future.

In the electronic eye application, the device, mounted on a pair of glasses, will be capable of detecting the existence and location of a pedestrian crossing, measuring the width of the road, and detecting the color of the traffic lights. Detecting a crossing is a pre-process, followed by detecting the state of the traffic lights. It is important for the visually challenged to know whether or not a frontal area is a crossing.

The existence of a crossing is detected in two steps. In the first step, edge detection and pattern detection are employed to identify the crossing. In the second step, the existence of a crossing is detected by checking the periodicity of white lines on the road using projective invariants. Then the traffic light detector is used to check the pedestrian light and the time display. The calculated time is then compared with the average time needed for a blind person to cross.

The observations are relayed through voice signals using the voice vision technology. Thus, this effective technology aids mobility for the visually impaired throughout the globe.

In the security application, the defense industries are launching major initiatives and planning around nanotechnology, from shape-shifting armor to fabric that can turn away microbes as well as bullets to new power sources.

INTRODUCTION

Blindness is one of the most feared of all human ailments. Crossing busy roads can be a challenge even for people with good vision; for blind people, it is a perilous activity. Our electronic eye aims at helping millions of blind and visually impaired people lead more independent lives.

The electronic eye can be adapted to help the blind or visually impaired get around without a walking stick or seeing-eye dog. Canes and other travel aids with sonar or lasers can alert the user to approaching objects. Global Positioning System (GPS) receivers can tell what streets, restaurants, parks and other landmarks the user is passing. Devices like these are very good at giving locations and directions, but the limitations of GPS technology mean that they cannot pin down the location of a curb or crosswalk, and they frequently fail in areas with many tall buildings and heavy traffic. None of these devices can specifically identify a crosswalk, nor do they have the potential to determine the state of the traffic signals. An effective navigation system would improve the mobility of millions of blind people all over the world. Our new “eye” will allow blind people to cross busy roads safely for the first time. Our “electronic eye”, mounted on a pair of glasses, will be capable of detecting the existence and location of a pedestrian crossing, measuring the width of the road to the nearest step, and detecting the color of the traffic lights.

AN OVERVIEW OF OUR ELECTRONIC EYE

We have developed a system that is able to detect the existence of a pedestrian crossing in front of a blind person using a single camera. By measuring the width of the road and reading the color of the traffic lights, this single camera can give the blind all the information they need to cross a road in safety. The camera would be mounted at eye level and connected to a tiny computer. It will relay information through a voice speech system, giving vocal commands through a small speaker placed near the ear.

FUNCTIONING OF THE SYSTEM

The system performs three functions:

1 – Tells the user whether a pedestrian crossing is present

2 – Tells the user whether the traffic signal is favorable or not

3 – Tells the user the time taken to cross the road

The style of crosswalk commonly used in India is known as the zebra crossing; it features a series of thick white bands that run in the same direction as the vehicle traffic.

To detect the presence of a zebra crossing we use the “projective invariant”, which takes the distance between the white lines and a set of linear points on the edges of the white lines. This gives an accurate way of deciding whether a crossing is present in a given image.


The length of a pedestrian crossing is measured by projective geometry. The camera makes an image of the white lines painted on the road, and then the actual distances are determined using the properties of geometric shapes as seen in the image.

The traffic light detector checks images for symmetrical shapes and compares them to a list of road signs. If the pedestrian light is ON, the voice speech system instructs the user to cross the road.

The timer unit calculates the average time required by the visually challenged person to cross the road and ‘tells’ it to the user via the voice speech system.

High-level scene interpretation applied to the processed images will produce a symbolic description of the scene. The symbolic description is then converted into verbal instructions appropriate to the needs of the user by using voice speech software.

IMAGE ANALYZER

The image analyzer receives the bitmap image, which has to be processed to detect the presence of a zebra crossing. Given an X-bit-per-pixel image, slicing the image into its constituent bit-planes plays an important role in image processing.
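As a minimal sketch of this step (the function name and the 8-bit assumption are ours, not the paper's), bit-plane slicing can be written in a few lines of Python:

    import numpy as np

    def bit_planes(image):
        """Split an 8-bit grayscale image into its 8 bit-planes.
        planes[7] is the most significant plane, which carries most of
        the coarse structure used by later processing stages."""
        image = np.asarray(image, dtype=np.uint8)
        return [(image >> k) & 1 for k in range(8)]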

EDGE DETECTION

One way to detect edges or variations within a region of an image is by using the gradient operator. There are several well-known gradient filters. In this experiment we use the Sobel gradients, which are obtained by convolving the image with two kernels, one for each direction.
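A minimal sketch of the Sobel step, assuming NumPy and SciPy (the kernels are the standard 3 x 3 Sobel operators; the threshold value is an arbitrary illustration):

    import numpy as np
    from scipy.ndimage import convolve

    # Standard 3x3 Sobel kernels, one per direction.
    SOBEL_X = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)
    SOBEL_Y = SOBEL_X.T

    def sobel_edges(gray, threshold=100.0):
        """Return a binary edge map of a grayscale image by thresholding
        the gradient magnitude obtained from the two Sobel convolutions."""
        gx = convolve(gray.astype(float), SOBEL_X)
        gy = convolve(gray.astype(float), SOBEL_Y)
        return np.hypot(gx, gy) > threshold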

CROSSROAD PATTERN DETECTION

The zebra crossing has alternate white bands running across the width of the road. This pattern has to be recognized to confirm the presence of a crossing. To detect basic shapes within the image, we make use of the Hough transform. At its simplest the Hough transform can be used to detect straight lines from edges detected in an earlier processing step.

If the detected pixels fall on a straight line, they can be expressed by the equation:

y = mx + c

The basis of the Hough transform is to translate the points in (x, y) space into (m,c) space using the equation:

c = (-x)m + y

Thus each point in (x, y) space (i.e. the image) represents a line in (m, c) space. Where three or more of these lines intersect, values can be found for the gradient (m) and intercept (c) of the line that connects the (x, y) space points.
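The voting scheme can be sketched as follows (a simplified illustration of the (m, c) accumulator described above; the bin counts and parameter ranges are assumptions, and practical systems often prefer the (rho, theta) form because m is unbounded for vertical lines):

    import numpy as np

    def hough_lines_mc(edge_points, m_range=(-5.0, 5.0), c_range=(-500.0, 500.0),
                       m_bins=200, c_bins=200):
        """Accumulate votes in (m, c) space for edge pixels (x, y).
        Each edge point votes along the line c = (-x)m + y; peaks in the
        accumulator correspond to image lines y = mx + c supported by
        three or more edge points."""
        acc = np.zeros((m_bins, c_bins), dtype=int)
        ms = np.linspace(m_range[0], m_range[1], m_bins)
        for x, y in edge_points:
            cs = -x * ms + y
            idx = np.round((cs - c_range[0]) / (c_range[1] - c_range[0])
                           * (c_bins - 1)).astype(int)
            valid = (idx >= 0) & (idx < c_bins)
            acc[np.flatnonzero(valid), idx[valid]] += 1
        return acc, ms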

CALCULATION OF THE WIDTH OF THE ROAD AND TIME REQUIRED TO CROSS IT

Calculation of the width of the road is based on the concept of projective invariants. This requires us to define the term cross ratio.

[Figure: four collinear points P1–P4 and four lines L1–L4 drawn through them]

The cross ratio can be defined for four collinear points as,

t(P1, P2, P3, P4) = (P1P3/P2P3)/(P1P4/P2P4)

where P1P2 is the distance from P1 to P2. The cross ratio of the four lines is given by,

t(L1, L2, L3, L4) = (sin θ13 / sin θ23) / (sin θ14 / sin θ24)

where θij is the angle between Li and Lj.
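As a small illustration of the point cross ratio (the coordinates are ours; distances are measured along the common line):

    import numpy as np

    def cross_ratio(p1, p2, p3, p4):
        """t(P1, P2, P3, P4) = (P1P3/P2P3) / (P1P4/P2P4) for four
        collinear points given as 2-D image coordinates."""
        d = lambda a, b: float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
        return (d(p1, p3) / d(p2, p3)) / (d(p1, p4) / d(p2, p4))

    # Equally spaced collinear points always give (2/1)/(3/2) = 4/3.
    print(cross_ratio((0, 0), (1, 0), (2, 0), (3, 0)))  # 1.333...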

[Figure: the dual construction, with the points P1–P4 lying on the lines L1–L4]

In the first figure, lines are constructed from the collinear points, and in the second figure a line is formed by joining the points on the lines L1, L2, L3, L4.

A useful fact is that the cross-ratio of the original four points is equal to the cross-ratio of the constructed lines.

The system effectively draws a virtual line out into the road. If a crosswalk is present, the edges of the painted white lines will form a predictable series of points along the virtual line.

Let M and N be two distinct points of the projective space. Here we take M and N as points on the edges of the line formed on the image. The projective line between M and N consists of all points A of the form

A = λM + μN

Here (λ, μ) are the coordinates of A in the 2-D linear subspace spanned by the coordinate vectors M and N. N is represented by (0, 1) and serves as the "origin", while M is represented by (1, 0) and serves as the "point at infinity".

For an arbitrary point (λ, μ) ≠ (1, 0) we can rescale (λ, μ) to μ = 1 and represent A by its "affine coordinates", (λ, 1), or just λ for short. Since we have mapped M to infinity, this is just the linear distance along the line from N.
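Putting the pieces together, the periodicity check on the virtual line can be sketched as below (a hypothetical helper built on the cross_ratio sketch above; the tolerance and the assumption that bands and gaps have equal width are ours):

    # Edge points of equally spaced zebra bands, taken four at a time
    # along the virtual line, should give a cross ratio near 4/3 no
    # matter how the camera's perspective foreshortens the road.
    EXPECTED = 4.0 / 3.0

    def looks_like_zebra(points_on_line, tol=0.1):
        """points_on_line: ordered image points where the virtual line
        meets successive white-band edges."""
        if len(points_on_line) < 4:
            return False
        ratios = [cross_ratio(*points_on_line[i:i + 4])
                  for i in range(len(points_on_line) - 3)]
        return all(abs(r - EXPECTED) < tol for r in ratios)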

The time required to cross the road is calculated on the assumption that the user covers, on average, a distance of one foot in a minute. The time required to cover the calculated distance then follows from simple arithmetic.

Generally, the time taken, T, to cross a road of width D can be found as

T = D / v

where v is the user's average walking pace stated above.
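In code, the computation and the later comparison with the signal timer are one-liners (the pace defaults to the paper's stated figure; the function names are illustrative):

    def time_to_cross(width_ft, pace_ft_per_min=1.0):
        """Time T (in minutes) to cross a road of the given width."""
        return width_ft / pace_ft_per_min

    def safe_to_cross(width_ft, signal_time_min, pace_ft_per_min=1.0):
        """The timing unit's test: cross only if T < T1."""
        return time_to_cross(width_ft, pace_ft_per_min) < signal_time_min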

TRAFFIC LIGHT DETECTOR

The function of the traffic light detector is to recognize whether the pedestrian light is on so that the user may cross the road. If the user happens to reach the road when the pedestrian light is already on, the time indicated by the timer display in the traffic light must be detected and compared with the time required by the user to cross the road. If the user can cross the road safely, the voice speech system will instruct him to cross. This can be done effectively by keeping an image database in the system and comparing images obtained from the camera against it to determine whether the pedestrian light is on and how much time is left to cross the road.

We have a large number of images and wish to select those that are similar to a certain image (for example, the image of the pedestrian light). So we need a content-based image database system, which accepts an image as input and retrieves all similar images using image properties such as color, texture, shape and keywords.

Every image is processed to recover the boundary contour, which is then represented by three global shape parameters and the maxima of the curvature zero-crossing contours in its Curvature Scale Space image.

Curvature Scale Space Computation and Matching

The CSS image is a multi-scale organization of the inflection points (or curvature zero-crossing points) of the contour as it evolves. Intuitively, curvature is a local measure of how fast a planar contour is turning. Contour evolution is achieved by first parametrizing the contour by arclength. This involves sampling the contour at equal intervals and recording the 2-D coordinates of each sampled point. The result is a pair of coordinate functions of arclength, which are then convolved with a Gaussian filter of increasing width, or standard deviation. The next step is to compute the curvature of each smoothed contour.

As a result, curvature zero-crossing points can be recovered and mapped to the CSS image in which the horizontal axis represents the arclength parameter on the original contour, and the vertical axis represents the standard deviation of the Gaussian filter.
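A minimal sketch of the CSS computation, assuming NumPy and SciPy (the sigma schedule and the equal-interval resampling are left to the caller; wrap mode handles the closed contour):

    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    def css_zero_crossings(contour, sigmas):
        """contour: (N, 2) array of equally spaced points on a closed
        boundary. Returns (arclength index, sigma) pairs where the
        curvature of the smoothed contour crosses zero, the raw
        material of the CSS image."""
        x = contour[:, 0].astype(float)
        y = contour[:, 1].astype(float)
        points = []
        for sigma in sigmas:
            xs = gaussian_filter1d(x, sigma, mode='wrap')
            ys = gaussian_filter1d(y, sigma, mode='wrap')
            dx, dy = np.gradient(xs), np.gradient(ys)
            ddx, ddy = np.gradient(dx), np.gradient(dy)
            # Curvature of a planar curve (epsilon avoids divide-by-zero).
            kappa = (dx * ddy - dy * ddx) / ((dx**2 + dy**2) ** 1.5 + 1e-12)
            for u in np.flatnonzero(np.diff(np.sign(kappa))):
                points.append((int(u), float(sigma)))
        return points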

The features recovered from a CSS image for matching are the maxima of its zero-crossing contours. Matching two CSS images consists of finding the optimal horizontal shift of the maxima in one image that yields the best possible overlap with the maxima of the other. The matching cost is then defined as the sum of pairwise distances (in the CSS) between corresponding pairs of maxima.
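The shift-and-overlap search can be sketched as follows (a simplified version: it pairs each shifted maximum greedily with its nearest counterpart, whereas the full method also penalizes unmatched maxima):

    import numpy as np

    def css_match_cost(maxima_a, maxima_b, n_samples, n_shifts=64):
        """maxima_*: lists of (arclength index, sigma) maxima from two
        CSS images; n_samples is the contour sampling length. Returns
        the lowest total pairwise distance over trial circular shifts
        of the arclength axis."""
        a = np.asarray(maxima_a, dtype=float)
        b = np.asarray(maxima_b, dtype=float)
        best = np.inf
        for shift in np.linspace(0.0, n_samples, n_shifts, endpoint=False):
            shifted = a.copy()
            shifted[:, 0] = (shifted[:, 0] + shift) % n_samples
            cost = sum(np.linalg.norm(p - b, axis=1).min() for p in shifted)
            best = min(best, cost)
        return best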

So, if an image of a pedestrian light in the image database matches an image from the camera, the pedestrian can cross the road. The time in seconds left on the signal is likewise read by matching the images of digits in the database.

TIMING UNIT

The timing unit compares the calculated value T, the time required by the user to cross the road, with the time left to cross, T1, as identified from the image of the traffic signal timer. If T < T1, the system instructs the user to cross the road; otherwise it asks the user to wait until it is safe to cross.

VOICE SPEECH SYSTEM

AUDITORY IMAGE REPRESENTATION

The images captured by the camera are swept from left to right at a little less than one image per second. The pixels in each column generate a particular sound pattern, consisting of a combination of frequencies based on that specific set of pixels. The result is an auditory signature, effectively an inverse spectrogram, that characterizes the particular image.


VOICE VISION

The VOICE VISION technology for the totally blind offers the experience of live camera views through sophisticated image-to-sound renderings. If we have a 64 × 64 image with 16 gray tones, the 64-channel sound synthesis maps the image into an exponentially distributed frequency interval, producing a one-second visual sound. The VOICE mapping: vertical positions of points in a visual sound are represented by pitch, horizontal positions by time-after-click, and brightness by loudness. In this manner, pixels become... voicels.
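A minimal sketch of this mapping (the sample rate and frequency interval are assumptions; the real system's parameters differ):

    import numpy as np

    def image_to_sound(image, duration=1.0, rate=22050, f_lo=500.0, f_hi=5000.0):
        """Render a 64x64 grayscale image (values 0..15) as a one-second
        sound: columns sweep left to right over time, rows map to
        exponentially spaced pitches, brightness maps to loudness."""
        n_rows, n_cols = image.shape
        # Exponentially distributed frequencies; top row = highest pitch.
        freqs = f_lo * (f_hi / f_lo) ** (np.arange(n_rows) / (n_rows - 1))
        freqs = freqs[::-1]
        samples_per_col = int(duration * rate / n_cols)
        t = np.arange(samples_per_col) / rate
        columns = []
        for col in range(n_cols):
            amps = image[:, col] / 15.0  # brightness -> loudness
            tone = (amps[:, None] * np.sin(2 * np.pi * freqs[:, None] * t)).sum(axis=0)
            columns.append(tone)
        return np.concatenate(columns)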

FUNCTIONAL BLOCK

[Flowchart: functional block diagram of the system, beginning at START]