Telephone Fraud Monitoring and Detection System

Jeremy Batozynski, Matthew Jaskula, Chad Larkin, and Zlatko Vetrov

Abstract— Team Fraudix, a four-member team participating in the Software Engineering Senior Project at the Rochester Institute of Technology, took on the task of providing a Telephone Fraud Monitoring and Detection System to PAETEC Communications. The six-month project, ending in May 2004, provided PAETEC with a solid requirements specification and design documents that will facilitate the company’s progress towards a full-scale product.

Index Terms - Batch processing systems, Initiation and scope definition, Process implementation and change, Schedule and organizational issues.

——————————  ——————————

1 Introduction

PAETEC Communications (PAETEC), based in Victor, NY, is an Incumbent Communications Provider (ICP) concerned with telephone fraud committed against its customers and the company itself. PAETEC provides local and long distance telephone service, as well as internet services, to customers across the United States. Telephone fraud is defined as the unauthorized use of a customer’s telephone service, usually to make long distance or otherwise expensive calls. Since the customer is contractually obligated to cover the cost of these fraudulent calls, PAETEC provides telephone fraud detection as a value-added service that protects its customers from these costs.

The problem of fraud monitoring lends itself to a software solution. The monitoring process centers on a set of rules that are applied to call data. A broken rule indicates a high likelihood of fraud. A fraud analyst, in this case a PAETEC employee, then determines whether the broken rule is actually a case of fraud. In the end it is always a person, not the software, who decides whether a particular call is fraudulent. For PAETEC, fraud monitoring involves processing about 15 rules against each of the 5 million calls its customers make each day. This processing must also happen in real time. This volume of data processing calls for a database-intensive, software-based solution.

Currently, telephone fraud is detected using a commercial software product that meets most of PAETEC’s needs. PAETEC would like to create a product that replicates the features of this existing product while adding two key features. The first is a change from a standalone application to an enterprise application. The second is the ability to partition the information used in the fraud detection process between PAETEC and any interested customers, which will allow customers to perform their own fraud monitoring, independent of PAETEC’s monitoring. These two additions drive PAETEC’s motivation for this project.

We, team Fraudix, took on this project, the Telephone Fraud Monitoring and Detection System (the System), as our Senior Project in the Software Engineering program at the Rochester Institute of Technology. The success of the project centered on our ability to provide PAETEC with a solid starting point from which its engineers could develop a complete working system. Our primary deliverables are a complete software requirements specification (SRS) and a high-level architecture and design document.

2 Project Requirements

2.1 Functional Requirements

The functional requirements for the System center on its two main areas of functionality. The first is the automated processing of call data for fraud detection; the second is the user interface for system setup and information browsing. The automated processing is performed by a background process triggered by events such as incoming call data or time of day. The setup and information browsing sections of the System rely on user-generated events to control the system behavior.

The System receives call data in batches of call data records (CDRs) approximately every 15 minutes. Each CDR contains pertinent information about a single call, including the originating and terminating phone numbers and the length of the call. This information is parsed by the System and stored in its database. As each CDR is processed, the System must check it against each of the rules to determine the probability of fraud. If a rule is broken, the System may take further action, which includes notifying a fraud analyst that the rule was broken. Fig. 1 shows this process.
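
The batch flow described above can be sketched roughly as follows. This is an illustration only, not the System's actual design: the CDR field names, the Rule type, the process_batch function, and the sample rule with its two-hour threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class CDR:
    """Hypothetical call data record; the field names are assumptions."""
    origin: str        # originating phone number
    destination: str   # terminating phone number
    duration_s: int    # length of the call in seconds


# A rule is modeled here as a named Boolean predicate over a single CDR.
Rule = Tuple[str, Callable[[CDR], bool]]


def process_batch(batch: List[CDR], rules: List[Rule]) -> List[Tuple[CDR, str]]:
    """Check every CDR against every rule and collect (cdr, rule name) per broken rule."""
    broken = []
    for cdr in batch:
        for name, predicate in rules:
            if predicate(cdr):
                broken.append((cdr, name))
    return broken


# Illustrative rule with an invented threshold: calls longer than two hours.
rules = [("long_call", lambda c: c.duration_s > 2 * 60 * 60)]
batch = [
    CDR("5855550100", "442055550199", 9000),
    CDR("5855550101", "5855550102", 120),
]
hits = process_batch(batch, rules)
```

Each entry in hits would then be a candidate for further action, such as an alert to a fraud analyst.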

In addition, the System receives information about PAETEC customers each day. This information includes the customer’s phone number, identified by its automatic number identification (ANI), as well as name and billing information. This information is parsed by the System and stored in its database.

We have defined a rule to be a Boolean statement that can be applied to one or more CDRs and that indicates a likelihood of fraud. A typical rule may state that ‘a call with a duration of X minutes to region Y indicates fraud’, or perhaps ‘more than X calls from the same ANI to region Y within a period of Z minutes indicates fraud’. Rules like these, with constant values (X, Y and Z), are not very useful given how widely calling habits vary among PAETEC customers. For this reason, the System keeps a profile of normal usage for each of PAETEC’s customers. Using these profiled values, rules can be written that will be broken when a customer’s usage steps outside their normal calling patterns, such as ‘a call to region X whose duration is greater than that customer’s average call duration when calling region X indicates fraud’. Here the System uses a combination of averages and standard deviation to determine whether a call is actually longer than the customer’s ‘average’ call duration. As the example shows, these profiled values pertain to behavior when calling certain regions. In addition, the System will profile call behavior for different times of the day. This allows the fraud analyst to create rules that reference a customer’s calling habits, thereby defining fraud in a more general sense.
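
One way to read the profile-based rule above is as a threshold of the form mean plus some multiple of the standard deviation. The sketch below assumes that form; the source does not state the exact formula, so the function name exceeds_profile, the multiplier k, and the sample history are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def exceeds_profile(duration: float, history: Sequence[float], k: float = 2.0) -> bool:
    """Return True when `duration` lies more than k sample standard deviations
    above the mean of `history`.

    `history` holds a customer's past call durations to one region.
    With fewer than two samples no deviation can be computed, so the
    call is not flagged.
    """
    if len(history) < 2:
        return False
    return duration > mean(history) + k * stdev(history)


# Invented sample profile: past call durations in minutes to one region.
history = [4.0, 6.0, 5.0, 5.0, 6.0, 4.0]

flag_long = exceeds_profile(45.0, history)   # far above the usual range
flag_usual = exceeds_profile(6.0, history)   # within the usual range
```

A larger k makes the rule more tolerant of variation; the right value would depend on how many false alerts the fraud analysts can absorb.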

The System uses the concept of a region to define a conceptual calling area. Each region is composed of any number of area codes, country codes, and other regions. Regions can therefore describe geographical areas such as ‘Europe’ or ‘the Middle East’, which makes rules easier to understand conceptually.
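
Because a region may contain other regions, resolving one into concrete dialing codes is naturally recursive. The following is a minimal sketch under assumed names: the REGIONS table, its contents, and the expand function are hypothetical, and the cycle check is our own addition rather than a stated requirement.

```python
from typing import Dict, Set

# Each hypothetical region maps to its members: dialing codes (digit
# strings) or the names of other regions.
REGIONS: Dict[str, Set[str]] = {
    "UK": {"44"},
    "Germany": {"49"},
    "Europe": {"UK", "Germany", "33"},
}


def expand(region: str, regions: Dict[str, Set[str]],
           _seen: frozenset = frozenset()) -> Set[str]:
    """Flatten a region into the set of dialing codes it covers."""
    if region in _seen:
        raise ValueError(f"region cycle at {region!r}")
    codes: Set[str] = set()
    for member in regions[region]:
        if member in regions:
            # Nested region: expand it, remembering the path taken so far.
            codes |= expand(member, regions, _seen | {region})
        else:
            codes.add(member)
    return codes


europe = expand("Europe", REGIONS)
```

Here europe holds the codes 44, 49 and 33, so a rule written against ‘Europe’ would cover all three.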

Each rule in the System has the ability to send an alert to a fraud analyst, modify the CDR’s score, or both. A CDR’s score represents the likelihood that the CDR is a case of fraud: the higher the score, the more likely the CDR is fraudulent. Each ANI has a score as well, which is the sum of the scores of the CDRs that originate from that ANI.
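
The relationship between CDR scores and ANI scores can be shown in a few lines. The function name ani_scores and the sample values are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def ani_scores(scored_cdrs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum CDR scores per originating ANI.

    Each input item is (originating ANI, CDR score).
    """
    totals: Dict[str, int] = defaultdict(int)
    for ani, score in scored_cdrs:
        totals[ani] += score
    return dict(totals)


# Invented sample data: two scored calls from one ANI, one from another.
scored = [("5855550100", 10), ("5855550100", 25), ("5855550101", 5)]
totals = ani_scores(scored)
```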

The System sends alerts to fraud analysts using an external system. Currently PAETEC uses a system developed in-house, called CONES, to handle alert escalation and notification. CONES maintains a schedule containing alert recipients and methods of contact so that, based on the time of day and day of week, it can select the appropriate recipient and notification method. The System will send the alert message to CONES in the body of an email message, which CONES will forward to the appropriate recipient. Therefore the System only needs to be concerned with the fact that an alert needs to be sent, not with the recipient or the means by which the message is conveyed. The alert body itself contains information about the call(s) that broke the rule and the rule that was broken.
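
As a rough illustration of an alert carried in an email body, the sketch below builds (but does not send) a message with Python's standard email library. The layout of the body, the build_alert function, and both addresses are assumptions; the format CONES actually expects is not described here.

```python
from email.message import EmailMessage
from typing import List


def build_alert(rule_name: str, ani: str, call_summaries: List[str],
                sender: str, cones_address: str) -> EmailMessage:
    """Build an alert email naming the broken rule and the offending calls."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = cones_address
    msg["Subject"] = f"Fraud alert: {rule_name}"
    lines = [f"Rule broken: {rule_name}", f"ANI: {ani}", "Calls:"]
    lines += [f"  {summary}" for summary in call_summaries]
    msg.set_content("\n".join(lines))
    return msg


# Placeholder addresses; the real CONES address is not given in this paper.
alert = build_alert(
    "long_call", "5855550100", ["to 442055550199, 150 min"],
    sender="fraud-system@example.invalid",
    cones_address="cones@example.invalid",
)
```

Note that the message names no analyst: choosing the recipient and the contact method stays entirely within CONES.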

The System provides an interface for the user to specify the previously mentioned rules and alerts. Rules are made up of an identifying name and a definition along with an optional alert type and optional score modifier. Alerts are defined in a similar way, with a name and definition. An alert definition indicates what kind of information is sent to the fraud analyst; this could include information about the customer as well as the rule that was broken.

The System provides a means for viewing a list of ANIs that have accumulated a score according to some criteria. These lists, called views, allow the fraud analysts to investigate ANIs that have not generated alerts but may still be cases of fraud. Typical view criteria could be ‘ANIs with scores over X’ or ‘ANIs that have broken rule Y’. Selecting a specific view in the interface allows the user to see a list of the ANIs that meet that view’s criteria. Selecting a specific ANI from the resulting list allows the user to view all of the call data for that ANI in the System. With these views and the customer information, a fraud analyst has at their disposal all of the information needed to determine which calls are cases of fraud.
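
A view can be thought of as a named filter over per-ANI summaries. The sketch below assumes that reading; the AniSummary type, the two view definitions, and the run_view function are hypothetical, and a real system would more likely express views as database queries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class AniSummary:
    """Hypothetical per-ANI summary: accumulated score and rules broken."""
    ani: str
    score: int
    broken_rules: Set[str] = field(default_factory=set)


View = Callable[[AniSummary], bool]

# Two example views mirroring the criteria named in the text.
views: Dict[str, View] = {
    "score over 30": lambda a: a.score > 30,
    "broke long_call": lambda a: "long_call" in a.broken_rules,
}


def run_view(name: str, summaries: List[AniSummary]) -> List[str]:
    """Return the ANIs that meet the named view's criteria."""
    return [a.ani for a in summaries if views[name](a)]


summaries = [
    AniSummary("5855550100", 35, {"long_call"}),
    AniSummary("5855550101", 5),
    AniSummary("5855550102", 12, {"long_call"}),
]
high_score = run_view("score over 30", summaries)
broke_rule = run_view("broke long_call", summaries)
```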

The fraud analyst has a method of tracking an investigation of possible fraud originating from a specific ANI, which is called a case. Each case consists of a set of offending CDRs originating from a single ANI. The fraud analyst can edit the case by adding and removing CDRs, and can track it by adding text-based notes that describe the case as it progresses towards resolution. Each case has a status associated with it; a case’s status is ‘unknown’ until it is resolved as either ‘fraudulent’ or ‘not fraudulent’.
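
The case described above maps onto a small data structure. This is a sketch with assumed names (Case, CaseStatus, and the integer CDR identifiers are all hypothetical); only the three status values come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class CaseStatus(Enum):
    UNKNOWN = "unknown"
    FRAUDULENT = "fraudulent"
    NOT_FRAUDULENT = "not fraudulent"


@dataclass
class Case:
    """An investigation of possible fraud originating from one ANI."""
    ani: str
    cdr_ids: Set[int] = field(default_factory=set)   # offending CDRs
    notes: List[str] = field(default_factory=list)   # analyst's text notes
    status: CaseStatus = CaseStatus.UNKNOWN

    def resolve(self, fraudulent: bool) -> None:
        """Move the case from 'unknown' to one of the two resolved states."""
        self.status = (CaseStatus.FRAUDULENT if fraudulent
                       else CaseStatus.NOT_FRAUDULENT)


case = Case(ani="5855550100")
case.cdr_ids.update({101, 102, 103})
case.cdr_ids.discard(102)                 # analyst removes an unrelated call
case.notes.append("Customer confirms the calls were not authorized.")
case.resolve(fraudulent=True)
```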

The System also provides to the fraud analyst a means for viewing the imported customer information. This information is supplemented with notes made by a fraud analyst about a particular customer.

Finally, an administrator of the System can manage user accounts and customer partitions. Managing user accounts includes the addition and deletion of users as well as the control of a user’s access to system features and customer partitions.

Customer partitions give each partition its own exclusive set of fraud detection resources. A partition is represented as a subset of PAETEC customers whose fraud detection is managed by a third party. Each partition will have access to only the CDRs of customers in its partition, while PAETEC will have access to all CDRs. The fraud detection resources managed by, and exclusive to, a partition include rules, alerts, views, cases, case notes and customer notes. Partitions will share CDR information with the PAETEC partition, but will not share CDR scores.
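
The visibility rules for partitions can be illustrated as follows. The sketch assumes a simple in-memory model: the partition name AcmeSecurity, the data tables, and the functions visible_cdrs and score_for are all invented, and keying scores by partition is only one possible way to keep them private.

```python
from typing import Dict, List, Set, Tuple

PAETEC = "PAETEC"  # the partition that has access to every CDR

# Hypothetical data: customer ANIs belonging to each third-party partition.
partition_customers: Dict[str, Set[str]] = {"AcmeSecurity": {"5855550100"}}

# CDR information is shared; each entry is (cdr_id, originating ANI).
cdrs: List[Tuple[int, str]] = [(1, "5855550100"), (2, "5855550101")]

# Scores are not shared, so they are keyed by (partition, cdr_id).
scores: Dict[Tuple[str, int], int] = {
    (PAETEC, 1): 40,
    ("AcmeSecurity", 1): 15,
}


def visible_cdrs(partition: str) -> List[int]:
    """Return the ids of the CDRs a partition is allowed to see."""
    if partition == PAETEC:
        return [cdr_id for cdr_id, _ in cdrs]
    allowed = partition_customers.get(partition, set())
    return [cdr_id for cdr_id, ani in cdrs if ani in allowed]


def score_for(partition: str, cdr_id: int) -> int:
    """Return the score this partition's own rules gave the CDR (0 if none)."""
    return scores.get((partition, cdr_id), 0)
```

In this model the same CDR carries a different score in each partition, because each partition applies its own rules.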

These requirements we have described represent a high-level overview of the full requirements of the System.

2.2 Non-Functional Requirements

One of our main concerns is the System’s ability to perform the necessary tasks on each imported CDR faster than the CDRs themselves arrive. Since PAETEC customers make approximately 5 million long distance calls per day, the System will receive batches of 20,000 CDRs, on average, every 15 minutes. For each CDR the System must process every rule, update the CDR score, ANI score, and thresholds, and send any necessary alerts. While this is happening the System must also handle user queries for information. Meeting these performance goals therefore required considerable care in the design of the System.
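
To make the performance requirement concrete, the per-batch figures quoted above can be turned into a rough processing budget. The calculation uses only the average batch size, the batch interval, and the approximate rule count of 15 from the introduction; it ignores user queries and any variation in batch size, so it is a lower bound on the required capacity rather than a design target.

```python
BATCH_INTERVAL_S = 15 * 60   # one batch roughly every 15 minutes
CDRS_PER_BATCH = 20_000      # average batch size quoted in the text
RULES_PER_CDR = 15           # approximate rule count from the introduction

# Rule evaluations needed for one average batch.
rule_checks_per_batch = CDRS_PER_BATCH * RULES_PER_CDR

# Minimum sustained rate needed to finish before the next batch arrives.
min_cdrs_per_second = CDRS_PER_BATCH / BATCH_INTERVAL_S

# Time available per CDR, including all rule checks and score updates.
max_ms_per_cdr = 1000 * BATCH_INTERVAL_S / CDRS_PER_BATCH

print(rule_checks_per_batch, round(min_cdrs_per_second, 1), max_ms_per_cdr)
# prints: 300000 22.2 45.0
```

In other words, an average batch implies about 300,000 rule evaluations and leaves at most 45 ms per CDR before the System starts to fall behind.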

The System was also designed to meet requirements for flexibility in methods for CDR importation and parsing. The System is able to change the scheme for CDR importation with no effect on other areas of the system. Possible changes to the CDR importation system include changes to improve importation performance as well as changes made to support new CDR types.
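
One common way to achieve this kind of flexibility is to hide each CDR format behind a shared parser interface. The sketch below assumes that approach; it is not taken from the project's design document. The CdrParser protocol, both parser classes, and the two record layouts are invented for illustration (typing.Protocol requires Python 3.8 or later).

```python
import csv
import io
from typing import Dict, Iterator, List, Protocol


class CdrParser(Protocol):
    """Anything that can turn raw batch text into CDR field dictionaries."""
    def parse(self, raw: str) -> Iterator[Dict[str, str]]: ...


class CsvCdrParser:
    """Parses a made-up comma-separated layout: origin,destination,duration_s."""
    def parse(self, raw: str) -> Iterator[Dict[str, str]]:
        for row in csv.reader(io.StringIO(raw)):
            if row:  # skip blank lines
                origin, destination, duration = row
                yield {"origin": origin, "destination": destination,
                       "duration_s": duration}


class PipeCdrParser:
    """Parses a second made-up layout using '|' as the separator."""
    def parse(self, raw: str) -> Iterator[Dict[str, str]]:
        for line in raw.splitlines():
            if line:
                origin, destination, duration = line.split("|")
                yield {"origin": origin, "destination": destination,
                       "duration_s": duration}


def import_batch(raw: str, parser: CdrParser) -> List[Dict[str, str]]:
    """The rest of the system depends only on this call, not on the format."""
    return list(parser.parse(raw))


from_csv = import_batch("5855550100,442055550199,9000\n", CsvCdrParser())
from_pipe = import_batch("5855550100|442055550199|9000\n", PipeCdrParser())
```

Both calls yield the same records, so supporting a new CDR type would mean adding a parser class without touching rule processing.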

Finally, the System must be flexible in the area of alert sending. The System will allow the CONES system to be replaced with an alternate system with little change to other parts of the System. Proposed changes include alternate protocols for communicating with the alerting system.

3 Development Process and Plan

We developed the System with process in mind. The sponsor mentioned the Team Software Process℠ (TSP℠) and Personal Software Process℠ (PSP℠) but did not require the use of a specific model. With this in mind, we initially settled on a loose process that seemed suited to our situation. Unfortunately, that process proved impractical and was modified throughout the project.

A software development schedule for any project is derived from three elements: scope, effort, and time. The scope of a project is the amount of functionality and process activity that must be completed. The effort is the rate at which work on the scope can be completed; it is affected by the number of people on the project team, their familiarity with the project, communication, and similar factors. The available time is the difference between the starting date and the delivery date. In our project, we ran into a major obstacle: we discovered late that the scope was very large. The desired product was too large and too complex for the available effort and time, as the large number of requirements in Project Requirements (Section 2) illustrates. We therefore needed to alter our schedule. Our project had a fixed deadline at the end of the semester, and we had a fixed team with fixed available effort given other coursework. Our only option was to complete a smaller portion of the project scope.

At the project’s proposal and initial meetings, we constructed a process (see Fig. 2a). It contained the software activities common to many processes: we intended to have ‘launch’, ‘requirements’, ‘design’, ‘implementation’, ‘testing’, and ‘deployment’ phases. We had learned the benefits of performing small, iterative cycles and included such a model in our approach. We planned a single launch, followed by a single, fully detailed requirements phase. Obtaining all of the requirements at once seemed like a good way to front-load the primary customer-interaction activities. The sponsor contacts have other tasks outside of this project, and we wanted to impact these as little as possible. After specifying the requirements, we intended to perform three cycles of design, implementation and testing. Doing so would provide the sponsor with increasingly functional portions of the System. Because performance testing and tuning were an important aspect of the System, each cycle would allow early evaluation, with redesign and refactoring of problem areas carried out in the next cycle. After the three cycles, we envisioned a single deployment phase to create any further documentation and to assist with installation on PAETEC’s systems.

To focus these activities, we created roles that we planned to follow throughout the project. Matt was given the role of ‘leader’ and would act as the main communications contact for the sponsor. He would also manage the requirements phase. Jeremy would manage the design and testing phases, as well as focus on quality assurance. Chad was assigned to manage the configuration of files and to lead the team in planning. Zlatko’s role was to manage the implementation phase and to track documentation control.

During elicitation, as we discovered the project’s immense scope, we modified the process. We determined that the requirements phase would grow because more time would be needed to gather, analyze, and specify requirements. We decided to perform only one cycle of design, implementation and testing instead of three. This essentially turned our process into one similar to the Waterfall process, in which each activity is visited once without returning to previous activities (see Fig. 2b). We knew that we would still design the entire project, but that implementation might be limited and that testing and deployment would probably take less time because of the reduced implementation. These shorter phases would help to alleviate the scope issue.

After finishing the requirements specification and heading into design, we were aware that the requirements phase had taken longer than expected, and that design probably would as well. These problems stemmed from further project complexities and underestimation of the scope. With a hard deadline, there would not be enough time for a full implementation and testing. We made a further process modification, creating a ‘truncated waterfall’ in which we would perform a very limited implementation phase and not do any testing or physical deployment (see Fig. 2c). With this decision, the primary deliverable shifted from the proof-of-concept product to the SRS and the design document. These documents would provide PAETEC with the means to continue development of the project at a later time with another team.