Retrieval of hourly cost revenue and traffic
Emin Gabrielyan
2013-07-23
1. Table of Contents
1. Table of Contents 1
2. ODBC 1
3. Excel Data connections 1
4. Data importing 1
5. Construction of the hourly data 1
6. Preparing the data for visualization 1
7. Constructing the chart 1
8. Future works 1
9. References 1
2. ODBC
In the ODBC Data Source Administrator I created a user DSN to the replication server.
The data source name is rep-db-1.switzernet.com
3. Excel Data connections
In Excel worksheet two data connections are created
One of the data connections links to the CDR_Vendors table containing the records of answered calls
Its SQL command text is the following.
select CLI, CLD, setup_time, connect_time, disconnect_time, revenue, charged_amount, account_id, charged_quantity from `porta-billing`.`CDR_Vendors` where disconnect_time > subtime(now(),'2 12:0:0.0') order by disconnect_time desc limit 99000
The corresponding MySQL command line and its output looks as follows:
mysql> select CLI, CLD, setup_time, connect_time, disconnect_time, revenue, charged_amount, account_id, charged_quantity from CDR_Vendors where disconnect_time > subtime(now(),'0 0:1:0.0') order by disconnect_time desc limit 99000;
+------+------+------+------+------+------+------+------+------+
| CLI | CLD | setup_time | connect_time | disconnect_time | revenue | charged_amount | account_id | charged_quantity |
+------+------+------+------+------+------+------+------+------+
| 41215505329 | 41779130974 | 16000 | 2013-07-23 19:15:24 | 2013-07-23 19:18:12 | 0.30800 | 0.80080 | 41215505329 | 168 |
| 41225500694 | 33134684253 | 26000 | 2013-07-23 19:17:55 | 2013-07-23 19:18:04 | 0.00000 | 0.00068 | 41225500694 | 9 |
| 0797548325 | 41215504015 | 21000 | 2013-07-23 19:14:20 | 2013-07-23 19:17:58 | 0.01817 | 0.00000 | 212.249.15.6 | 218 |
| 41215502593 | 41767334356 | 13000 | 2013-07-23 19:17:24 | 2013-07-23 19:17:55 | 0.05683 | 0.05683 | 41215502593 | 31 |
| 41215503301 | 13479424115 | 17000 | 2013-07-23 18:59:52 | 2013-07-23 19:17:51 | 0.00000 | 0.10970 | 41215503301 | 1079 |
| 0014076168477 | 41215507336 | 9000 | 2013-07-23 19:10:05 | 2013-07-23 19:17:47 | 0.03850 | 0.00000 | 212.249.15.6 | 462 |
| 004961518008982 | 41225502514 | 13000 | 2013-07-23 19:01:57 | 2013-07-23 19:17:46 | 0.07908 | 0.00000 | 212.249.15.6 | 949 |
| 41225503964 | 41795058177 | 16000 | 2013-07-23 19:02:41 | 2013-07-23 19:17:45 | 1.65733 | 1.34545 | 41225503964 | 904 |
| 41215505681 | 41223487750 | 19000 | 2013-07-23 18:50:53 | 2013-07-23 19:17:38 | 0.00000 | 0.17923 | 41215505681 | 1605 |
| 41215041326 | 41215350938 | 26000 | 2013-07-23 19:16:32 | 2013-07-23 19:17:37 | 0.00000 | 0.00726 | 41215041326 | 65 |
| 0786750156 | 41215094195 | 8000 | 2013-07-23 19:11:24 | 2013-07-23 19:17:33 | 0.03075 | 0.00000 | 212.249.15.6 | 369 |
| 41225500126 | 33950293220 | 15000 | 2013-07-23 19:17:08 | 2013-07-23 19:17:29 | 0.00000 | 0.00102 | 41225500126 | 21 |
| 41215506332 | 41789359056 | 18000 | 2013-07-23 19:16:12 | 2013-07-23 19:17:25 | 0.13383 | 0.13712 | 41215506332 | 73 |
| 41215500462 | 14182643365 | 16000 | 2013-07-23 18:57:13 | 2013-07-23 19:17:14 | 0.00000 | 0.04404 | 41215500462 | 1201 |
+------+------+------+------+------+------+------+------+------+
14 rows in set (0.05 sec)
mysql>
Note that the 1 day and 12 hour period of the Excel connection is changed to 1 minute period in the MySQL command line version.
The other Excel connection links to the CDR_Accounts_Failed containing the failed call attempts.
Its default SQL text is changed as well
The command text used for this connection is the following
select CLI, CLD, setup_time, connect_time, disconnect_cause, account_id from `porta-billing`.`CDR_Accounts_Failed` where connect_time > subtime(now(),'2 12:0:0.0') order by connect_time desc limit 99000;
Its execution from the MySQL command line prompt gives the following result.
mysql> select CLI, CLD, setup_time, connect_time, disconnect_cause, account_id from CDR_Accounts_Failed where connect_time > subtime(now(),'0 0:1:0
.0') order by connect_time desc limit 99000;
+------+------+------+------+------+------+
| CLI | CLD | setup_time | connect_time | disconnect_cause | account_id |
+------+------+------+------+------+------+
| 41225501942 | 212522988785 | 0 | 2013-07-23 19:22:36 | 16 | 41225501942 |
| 41413677373 | 41245042380 | 0 | 2013-07-23 19:22:31 | 16 | 212.249.15.6 |
| 41215503967 | 80097334226 | 0 | 2013-07-23 19:22:29 | 16 | 41215503967 |
| 41815091225 | 3901119379334 | 0 | 2013-07-23 19:22:28 | 16 | 41815091225 |
| 41215509642 | 492112907552 | 0 | 2013-07-23 19:22:26 | 40 | 41215509642 |
| 41225501379 | 4133450756972 | 0 | 2013-07-23 19:22:26 | 40 | 41225501379 |
+------+------+------+------+------+------+
6 rows in set (1.54 sec)
mysql>
Note that here also the 60h period is changed to 1m.
4. Data importing
The data is imported into the Excel sheet with a desired frequency using the Existing Connection menu
The statistics of the answered calls retrieved from the replication server are computed in the following table
You can see the formulas behind the above values in the capture below
Take a note of the Excel columns of the imported data table
The connection of the failed calls is responsible for importation of the second table, as shown below
The statistics for this table are computed as well
Below you can see the formulas behind the values shown above
5. Construction of the hourly data
The raw data available in both tables is grouped in an aggregate table of the hourly data.
See the formulas behind these values in the following screenshots.
Here are the formulas for computing the Start and the Stop of each hourly interval. The first index is the number of hours behind the current time. The second value is the beginning of the corresponding interval, the third value (under the column AA) is the end of the current interval. The formula under the header name ‘answered’ computes the number of answered calls having a duration more than 0 seconds.
The following formulas compute the PDD value for each hourly interval. The PDD stands for post dial delay. It is the time in seconds between sending the call request and the moment when the call is answered.
The revenue and cost correspond to the amounts in CHF charged to our customers versus the amount charged to us by our vendors. Refer to the previous section for the correspondence between the excel header letters and the imported data.
The formulas under the header [seconds], give us the number of seconds of conversation for each interval. The last column entitled [failed] computes the number of failed calls the list of which is retrieved from the other Excel database connection link of failed calls.
6. Preparing the data for visualization
Additional data processing prepares the data for visualization.
Here is the final table supplying our Excel chart.
And the formulas corresponding to these final values are shown below.
As you see the revenue values are replaces by margins, the difference between the revenue and the cost.
The following parameters are defined for the visualization purposes.
The call factor value, equal in the below screenshot to 3, is the introduced to scale the curves of failed and answered calls with the curve of the minutes. For the same purpose there is a factor of 100 introduced for the values of PDD.
The formula in front of the [label] briefs on the last version of the data chunk being retrieved from the server by the two Excel connections.
=TEXT(M17,"#'##0")&" answered from "&TEXT(M4,"yy-mm-dd hh:mm")&" to "&TEXT(M3,"yy-mm-dd hh:mm")&" and "&TEXT(W5,"#'##0")&" failed from "&TEXT(W4,"yy-mm-dd hh:mm")&" to "&TEXT(W3,"yy-mm-dd hh:mm")
7. Constructing the chart
Our chart looks as follows.
The cost and revenue histograms are associated to the primary axis shown on the left. As this example shows, the values of the primary axis can be negative. The reason is the negative margin during certain periods of time, meaning that the cost is above our revenue.
The values of the cost are shown only if they are above CHF70 and they are marked in red if their value exceeds the level of CHF100.
A filter is set also for the values of the margin:
All curves (PDD x100, failed x3, answered x3, and minutes) are associated to the secondary axis located on the right side of the chart.
The red and green tick curves correspond to the number of failed and answered calls respectively. If the red curve goes well above the green one, it is bad and indicates on a connection problem with a vendor or an outage with one of our servers.
The pink curve, corresponding to PDD, shall indicate a high CPU load or a vendor connection problem when it goes above 20 seconds corresponding to the value of 2000 on the chart (as we show PDD x100).
The hourly values of minutes are shown in blue.
Note that the hours are computed from the current time back hour-by-hour. It means that the grid of hours is continuously floating across the call records and even though the data of the past does not change the chart will evolve over the time.
8. Future works
It remains to change the connection properties to refresh the data periodically, for example every 5 minutes. Once the Excel chart is up and running on a monitoring computer, screenshots must be taken periodically and uploaded on a web server for our outsourced team.
9. References
Connecting Excel to a remote MySQL server: http://www.switzernet.com/3/public/130715-excel-to-remote-mysql/
Properties of Excel to remote MySQL connection: http://www.switzernet.com/3/public/130715-excel-mysql-connections/
Creating a vendor cost on-line monitoring chart: http://switzernet.com/3/public/130716-vendor-cost-monitor/
[boxcutter] is a command line application for taking screenshots http://keepnote.org/boxcutter/
The excel file used in this document: [130723_cost_revenue_and_traffic.xlsx]
Copyright © 2013 by switzernet.com
End of this document