MessageLabs Is a Software Development Company That Specialises in Internet Application Technology

Abstract

Over the past 5 years the Internet has changed vastly from its academic origins to become the lifeblood of global business communications. Unfortunately, along with all the obvious benefits associated with such connectivity there is a downside. This paper aims to investigate the carrier class virus phenomenon and propose solutions.

Historically, computer viruses spread primarily via booting from infected disks or executing infected files or documents. In every case, a human element has factored in a virus's ability to pollinate. However, this situation has recently changed and with it so have some important factors regarding virus protection.

Email aware viruses such as Melissa, Happy99 or ExploreZIP are able to pollinate themselves, instantly and with great efficiency. Worse still, the powerful scripting languages present in today's email clients and office suites make creating such viruses a comparatively easy task. We know that statistically, 1 in every 1,500 emails on average will contain a virus.

In order to help counter this trend towards email capable viruses, ISPs must take more responsibility for the email they forward and provide an effective front line of defense for their customers. In this presentation, MessageLabs will introduce supporting data to highlight important new issues and trends in virus protection and detail how ISPs can efficiently and easily integrate virus scanning into their networks.

The Internet is at the core of the problem so it is logical that it should be at the core of the solution.

Contents:

Emerging Virus Trends

Why Are Viruses So Prolific?

2.1 Networked Plumbing

2.2 Too Much Functionality

2.3 Common Platforms

Current Solutions are Outdated

Evolving a New Approach: Scanning For Viruses at the Internet Level

4.1 How Does the Virus Scanning System work?

4.2 How is the Virus Scanning System Deployed?

4.3 Skeptic™

Mail Encryption

Making Use of Live Statistics

Section 1. Emerging Virus Trends

Recent developments in the industry have seen virus writers incorporating email capabilities into viruses with devastating effects. Older techniques used to increase the impact or longevity of a virus, such as stealth and polymorphism, have become old hat. By using email as a method of distribution, it is possible to write a virus capable of infecting thousands of computers in a matter of minutes. Worse still, creating such viruses has become easy due to the immense programming capabilities contained within today’s powerful office suites.

Since the first PC virus appeared a little over 12 years ago, the number of known viruses has grown to approximately 40,000-50,000. However, whilst this figure may sound dramatic, it is not an indication of the overall threat; at any point in time only a very small proportion of viruses are actually in the wild causing damage. The current number of viruses in the wild is estimated to be around 400. Although this figure excludes variants, it still seems small when set against the number of documented incidents.

Traditionally, virus incidents are hard to track and obtain accurate data on, as this information is often kept “under wraps”. In December 1999, Dell Computer Corporation were publicly exposed as being hit by the Funlove virus.

Section 2. Why are Viruses So Prolific?

It is easy to see that the number of incidents has risen sharply over the past 18 months. In order to see what conclusions and possible preventative steps can be drawn, let's first take a closer look at the three main contributory elements.

2.1 Networked Plumbing

We’re all connected! The corporate world has just spent the last 5 years feverishly gluing itself together. The major driving force behind this is email.

Five years ago, the chances of the author of this paper being able to email you, the reader, were probably around 50/50. Nowadays, having an email address at work is taken as a given. Aside from the corporate world, domestic connectivity is catching up fast. According to a recent Email Marketing Report (February 2000), there were 409 million email accounts worldwide in 1999, up from 234 million the previous year, an increase of roughly 75%. With current estimates projecting 700M+ mail accounts worldwide by the year 2005, email will be truly woven into the fabric of society.

Email is a type of plumbing that links us all together, globally. It does not take a rocket scientist to see that viruses engineered to exploit this global connectivity have the potential to wreak havoc worldwide. This is what we are seeing right now. According to a recent study published by ICSA Labs, The Computer Virus Prevalence Survey 2000 (Nov 2000), there have been very significant changes in virus distribution methods: in 1996, 9% of viruses were distributed by email, whereas in 2000 email was the main method of virus distribution at 87%.

2.2 Too Much Functionality?

Consider the Scenario:

Date: 25th December 1978, Christmas Day morning.

Venue: My parents’ house.

It’s Christmas Day and I’ve just opened my largest Christmas present. It’s a chemistry set and I’m a very happy 8-year-old boy! I can’t believe my luck! I can clearly remember taking out and ogling nearly 100 test tubes full of various potions and powders.

I can also remember my parents returning to the house after having been with friends for just an hour. They found me standing in the center of the living room, looking very guilty. The living room was proudly sporting a new look: a fairly uniform dark purple splatter. Having been handed all the essential tools (chemicals, a large test tube, methylated spirit and a wick), it was just a matter of time before I tried my hand at some form of explosive.

All this brings me to the powerful programming languages now incorporated as standard in today’s Office Application Suites and browsers. A while ago, writing a virus was just like writing any program and required a bit of savvy on the part of the author. The rapid development tools we now take for granted simply did not exist in the past, and this fact kept many would-be virus writers at bay. Putting the obvious malicious aspect to one side for a moment, this was a job for men, not boys. However, the proliferation of the Internet has accelerated the need for new development tools and environments that can exploit this connectivity.

Visual Basic for Applications (VBA) and Visual Basic Script (VBS) both present obvious advantages for power users who want better integration between applications. They also present virus authors with a perfect toolkit for quickly developing viruses based around Office applications. More recently, the ability to easily gain access to just about any conceivable email function has been responsible for the sharp increase in the number of worm type macro viruses now propagating in the wild. This is a growing trend, which is visible on just about every virus prevalence table available.

Given the right tools, it is only a matter of time before a certain combination will cause havoc.

2.3 Common Platforms: Microsoft Outlook

Consider the Scenario:

Hybridization was the term given to a crude form of genetic engineering back in the 60’s. The idea was simple, cross the biggest, hardiest, fastest-growing varieties of corn (or any other crop) until you end up with SUPERCORN. SUPERCORN is so much better than regular corn. It yields more bushels per acre and is more resistant to disease. Soon there are millions of acres of SUPERCORN that are not only the same variety, but since it is derived from a single master plant it is genetically identical. And that's the dark side of hybridization, because if a new disease comes along that does bother SUPERCORN, it affects the entire crop. There is always the risk that the entire crop will die all at once.

This brings us on to the Love Bug, the latest in a recent spate of computer virus furor, which caused billions of dollars in damage. Hybridization comes into this drama because the I Love You worm isn't just a computer virus or a PC virus or a Windows virus or even an e-mail virus. I Love You is specifically a Microsoft Outlook/Visual Basic virus. It takes advantage of features in this SUPERCORN of e-mail programs to cause damage to the greatest possible number of users.

Our networks, desktops and office suites are literally littered with SUPERCORN like Outlook. This is a strong argument for genetic diversity in software because what made the Love Bug so costly was Microsoft's success at getting people to use its software.

In summary:

We are all totally connected via the Internet and email.

Writing viruses is easier than ever before.

Once flaws are discovered in common software platforms, virus writers will exploit them, impacting a massive installed base.

Section 3. Current Solutions are Outdated

An important conclusion that can be drawn is that anti-virus solutions have been slow in evolving to meet the new virus threat. Anti-virus solutions in the main are still focused on the desktop or local gateway. A desktop strategy should certainly always exist as people will always continue to use suspect floppy disks and CDs from a magazine. Such solutions are however inadequate at dealing with a global outbreak such as the Love Bug.

To help illustrate this, suppose we could step into a time machine and travel back four years…

Four years in the past, somewhere in the Philippines, our bad guy sits hunched over his keyboard in a darkened room illuminated only by the phosphor glow of his monitor. He is about to release a virus into a CompuServe forum. However, back then, the virus he is about to release will not become a potential threat to the “computer user” over in the UK for weeks, if not months or years.

The essential ingredient for virus pollination at this time is people. The speed at which people can exchange floppy disks, .exe files and office documents is a very limiting factor; at this point, we are not all glued together. The anti-virus strategies around at this time were adequate for dealing with the virus problem.

Our time machine now whizzes forward to the present day. This time, our bad guy sits at his flat screen and posts his email aware virus into a newsgroup. It’s a simple bit of Visual Basic Script (VBS) and, thanks to COM, it is easily able to locate and exploit Microsoft Outlook, the dominant mail client.

Depending on whose address book contains my email details, this virus threat could arrive within minutes. My desktop AV software does not know about this new virus, nor can it catch it heuristically. It takes, on average, 6 hours for my AV vendor to obtain a sample and make a signature available. As everyone is trying to obtain the signature at the same time from my AV vendor's public web site, the FTP server is unavailable. In effect, a pseudo denial of service attack is being performed.

Working in this way places a huge burden on network managers and dramatically increases the amount of anti-virus activity necessary to provide effective protection. Tasks such as maintaining multiple scanners and updating virus signatures several times a day are often impractical for network administrators.

Section 4. Evolving a New Approach: Scanning for Viruses at the Internet Level

In order to provide better virus protection we need to implement scanning and detection systems higher up the food chain of mail delivery where economies of scale make a more sophisticated approach possible. The logic lies in scanning for email viruses at the Internet level.

MessageLabs has been providing a virus scanning service at the ISP level for 2 years. The system took just under a year to develop and essentially integrated several commercial virus scanners into an ISP's mail infrastructure, plus a rules-based system for emergency outbreaks (such as Lovebug). The greatest hurdle to overcome with ISP virus scanning is scalability.
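A rules-based check for an emergency outbreak might look something like the following minimal sketch. It is not MessageLabs' actual rule engine: the rule lists and the function name `matches_outbreak_rule` are assumptions made for illustration, based on the widely reported Lovebug characteristics (an "ILOVEYOU" subject line and a `.vbs` attachment).

```python
from email import message_from_string

# Hypothetical emergency rules, chosen to match the publicly known Lovebug traits.
BLOCKED_SUBJECTS = ("ILOVEYOU",)
BLOCKED_EXTENSIONS = (".vbs",)


def matches_outbreak_rule(raw_message: str) -> bool:
    """Return True if the message matches any emergency outbreak rule."""
    msg = message_from_string(raw_message)

    # Rule 1: exact match on a known outbreak subject line.
    subject = str(msg.get("Subject") or "").strip().upper()
    if subject in BLOCKED_SUBJECTS:
        return True

    # Rule 2: any attachment whose name ends with a blocked extension.
    for part in msg.walk():
        filename = part.get_filename()
        if filename and filename.lower().endswith(BLOCKED_EXTENSIONS):
            return True
    return False
```

The attraction of a rule like this is that it can be deployed within minutes of an outbreak, before any scanner vendor has published a signature; the cost is a higher risk of false positives, so such rules would normally be temporary.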

From the first live date the Virus Scanning System started to intercept approximately 40 viruses each day, instantly vindicating the approach. Today the system has evolved into a carrier class solution and regularly intercepts in excess of 1000 viruses per day. As an important aside the system also generates valuable data about what viruses are actually in the wild in real-time and the effectiveness of many commercial scanners against our ‘live list’.

The chart below details the progression of MessageLabs virus scanning technology and effectiveness over the last year. Some of the major lessons learned along the way were that more than one virus scanner is needed and that the overriding threat to computer users lay within macro viruses.

We settled on 3 scanners, ultimately finding that this number gave the most effective results without reducing performance. Significant re-coding of the mail platform was necessary in order to provide the self-tuning required to ensure that latency remained low.

Failure to use more than one scanner resulted in an average 3% non-detection rate. Given that a single virus could prove devastating, this was thought to be unacceptable.

The average figure for virus detection across all email is 1 virus in every 1,500 emails. From this we are able to estimate that an average of roughly 66 viruses will pass through an ISP’s network for every 100,000 messages handled. This formula makes it easy for any ISP to estimate the potential effectiveness of implementing virus scanning technology.

Analysis over a long period of time has shown that a greater percentage of viruses come from ‘free’ mail accounts than from general private domains. Further investigation revealed that the average rate of viruses in mail from popular free mail accounts soars to 1 in 500. From this information we are able to broadly estimate that the average number of viruses passing through an ISP increases to 200 in every 100,000 free mail messages handled. We suspect webmail vastly increases the promiscuous use of multiple computers for handling documents and hence increases the chance of infection.
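The two estimates above follow from the same simple rate calculation, which an ISP could apply to its own message volumes. The sketch below is illustrative only; the function name `infected_per` is an assumption, and the rates are the 1-in-1,500 and 1-in-500 figures quoted in the text.

```python
def infected_per(messages: int, one_in: int) -> float:
    """Expected number of infected emails among `messages` at a 1-in-`one_in` rate."""
    if messages < 0 or one_in <= 0:
        raise ValueError("messages must be >= 0 and one_in must be > 0")
    return messages / one_in


overall = infected_per(100_000, 1500)   # all email
free_mail = infected_per(100_000, 500)  # popular free mail accounts

print(f"overall: {overall:.1f}, free mail: {free_mail:.1f}")
# prints "overall: 66.7, free mail: 200.0"
```

On these figures, free mail traffic carries three times as many viruses per message as email in general.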

4.1 How does the Virus Scanning System Work?

The system, known as the MessageLabs Virus Control Center (VCC), runs on a scalable architecture comprising a cluster of towers densely populated with high performance servers, Cisco Catalysts and load distributors, which host McAfee, F-Secure and Vfind virus scanning software. We have a rolling program to locate tower clusters at major peering locations around the world. Currently, installations are deployed in London, Amsterdam and New York. Additional towers will be deployed on a global basis during Q2 and Q3 2001.

Each tower comprises 2 Cisco load balancers, 2 Cisco high performance 100 Mbit/s switches, 24 industrial PCs each running the MessageLabs proprietary mail engine, and temperature and fan monitors linked to all PCs. A management system (PC) ensures that load is not only distributed but also tuned dynamically. If an individual mail server's queue becomes excessive, the management system lowers the delivery priority of the affected system. Excessively large emails are handled by a separate “Big-email server” to permit a more even flow. Should either the management server or the big-email server fail, an election is forced and another system takes over the vacant role. In order to perform the scan, we intercept all mail as it passes through our system so that it can be processed and examined for viruses before being allowed to continue to its ultimate destination.
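The dynamic tuning described above can be pictured with a small sketch. This is not the MessageLabs management system: the thresholds, the weighting formula and the names `MailServer`, `delivery_weight` and `pick_server` are all assumptions chosen to illustrate the idea of lowering priority for a server with an excessive queue and diverting oversized mail.

```python
from dataclasses import dataclass

QUEUE_THRESHOLD = 200            # assumed: queue length considered "excessive"
BIG_EMAIL_BYTES = 5 * 1024 ** 2  # assumed: size above which mail is diverted


@dataclass
class MailServer:
    name: str
    queue_length: int
    alive: bool = True


def delivery_weight(server: MailServer) -> float:
    """Full priority up to the threshold; priority shrinks as the queue grows past it."""
    if not server.alive:
        return 0.0
    if server.queue_length <= QUEUE_THRESHOLD:
        return 1.0
    return QUEUE_THRESHOLD / server.queue_length


def pick_server(servers, message_size, big_email_server):
    """Divert oversized mail; otherwise choose the live server with the best weight."""
    if message_size > BIG_EMAIL_BYTES and big_email_server.alive:
        return big_email_server
    candidates = [s for s in servers if delivery_weight(s) > 0]
    if not candidates:
        raise RuntimeError("no scanning server available")
    # Ties on weight are broken in favour of the shorter queue.
    return max(candidates, key=lambda s: (delivery_weight(s), -s.queue_length))
```

A server whose queue is twice the threshold receives half the priority of an unloaded one, so the backlog drains without the server being removed from service entirely.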

After being delivered to a tower, the SMTP session is distributed to one of the scanning mail servers. The session is authenticated against the customer database to ensure that it is either coming from or going to a known customer. If authentication succeeds, the mail and its file attachments are decoded using open standard formats (nested and combined encodings are also decoded) and passed to the binary queue for scanning. If an abnormal file structure is unpacked during this process, such as a “Zip of Death” expanding to several terabytes, the file is rejected and an error message is sent. Otherwise, the file is passed through three commercial virus scanners and Skeptic™, MessageLabs' own proprietary heuristics and rules-based scanner, which is used to detect the very latest viruses for which no signature is available.
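The unpacking guard and the multi-scanner step can be sketched as follows. This is an illustration under stated assumptions, not the VCC implementation: it handles only zip archives, the limits `MAX_TOTAL_BYTES` and `MAX_DEPTH` are invented for the example, and the scanners are stand-in callables rather than real McAfee, F-Secure or Vfind interfaces.

```python
import io
import zipfile

MAX_TOTAL_BYTES = 100 * 1024 ** 2  # assumed cap on total unpacked bytes per message
MAX_DEPTH = 5                      # assumed cap on archive nesting


class SuspiciousArchive(Exception):
    """Raised when an attachment unpacks to an abnormal size or depth."""


def unpack(data, depth=0, budget=None):
    """Yield (name, content) for each leaf file, recursing into nested zips."""
    if budget is None:
        budget = [MAX_TOTAL_BYTES]  # shared, mutable byte allowance
    if depth > MAX_DEPTH:
        raise SuspiciousArchive("archive nested too deeply")
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Reject on the declared size first, then cap the actual read,
            # because the declared size in a hostile archive may be false.
            if info.file_size > budget[0]:
                raise SuspiciousArchive(f"{info.filename} exceeds size budget")
            with archive.open(info) as member:
                content = member.read(budget[0] + 1)
            if len(content) > budget[0]:
                raise SuspiciousArchive(f"{info.filename} exceeds size budget")
            budget[0] -= len(content)
            if zipfile.is_zipfile(io.BytesIO(content)):
                yield from unpack(content, depth + 1, budget)
            else:
                yield info.filename, content


def flagged_by(content, scanners):
    """Return the names of all scanners that report the content as infected."""
    return [name for name, scan in scanners.items() if scan(content)]
```

A message would be quarantined if `flagged_by` returns a non-empty list for any unpacked file, which is what makes running several scanners worthwhile: a virus missed by one may still be caught by another.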

If all is well the message finally passes to the Processed queue where an optional corporate “Scan successful” message can be added. If a virus is detected the email is moved to a quarantine area where it will remain retrievable for a period of 10 days. Once 10 days have elapsed the email is destroyed.
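The 10-day quarantine policy amounts to a simple retention check. The sketch below assumes quarantined mail is tracked as a mapping from a message identifier to the time it was quarantined; that data structure and the name `purge_expired` are assumptions for illustration only.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=10)  # retention period stated in the text


def purge_expired(quarantine: dict, now: datetime) -> dict:
    """Return only the entries that are still retrievable at `now`.

    `quarantine` maps a message id to the datetime it was quarantined.
    An entry is dropped once a full 10 days have elapsed.
    """
    return {
        message_id: quarantined_at
        for message_id, quarantined_at in quarantine.items()
        if now - quarantined_at < RETENTION
    }
```

Run periodically, a check like this keeps the quarantine area from growing without bound while leaving customers a fixed window in which to retrieve a wrongly held message.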

A separate Health monitor process constantly monitors the status of the mail server to ensure that all processes are running smoothly, security trip wires have not activated and that adequate disk space always remains. Health reports are then fed back to the management system, which is monitoring the health of the tower as a whole. These reports are in turn fed back to our central monitoring system which keeps an eye on our worldwide network.
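The reporting chain described above (server, then tower, then central monitoring) can be sketched as a per-server report that is rolled up at tower level. The field names, the 10% free-disk threshold and the functions `check_disk` and `tower_status` are assumptions for illustration; only `shutil.disk_usage` is a real standard-library call.

```python
import shutil
from dataclasses import dataclass

MIN_FREE_FRACTION = 0.10  # assumed: minimum fraction of disk that must stay free


@dataclass
class HealthReport:
    server: str
    processes_ok: bool  # all mail processes running
    tripwires_ok: bool  # no security trip wire has activated
    disk_ok: bool       # adequate disk space remains

    @property
    def healthy(self) -> bool:
        return self.processes_ok and self.tripwires_ok and self.disk_ok


def check_disk(path: str = ".", min_free: float = MIN_FREE_FRACTION) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    usage = shutil.disk_usage(path)
    return usage.total > 0 and usage.free / usage.total >= min_free


def tower_status(reports) -> dict:
    """Roll individual server reports up into a single tower-level summary."""
    unhealthy = [r.server for r in reports if not r.healthy]
    return {"healthy": not unhealthy, "unhealthy_servers": unhealthy}
```

The tower-level summary is what would be forwarded to central monitoring, so the central system sees one status per tower plus the names of any servers needing attention.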