Modeling Complex System Architectures
(Model located at demo/performance/UAV/UAV_Data_Link.xml)
Models:
Click here to view and execute the Top-Level Model of the UAV in VisualSim
Click here to view the Network of Sensors and Processor Array as a VisualSim Model
Click here to view the VisualSim Resource Model of a Sensor System
Click here to view the Datalink Model in VisualSim
Early performance analysis and virtual system prototyping provide a methodology and platform for evaluating architectures, processing requirements and module functionality within the Unmanned Aerial Vehicle (UAV). This methodology allows the system engineer to start with an abstract concept and increase model fidelity and accuracy through successive levels of decomposition. Establishing a simulation platform allows for quick and accurate trade studies and spiral engineering enhancements over the program’s lifecycle. It also allows investigations into interoperability, which lowers the total overall cost of the electronic systems. This is especially important for electronics because they are the system element that faces obsolescence or enhancement replacement soonest.
The example below uses VisualSim software to conduct trade studies on the architecture of the processing Datalinks and to select the best bus backplane. The system prototype combines existing components from the VisualSim model library to assemble the sensors, on-board multi-blade processing units, wireless channels and the operation of a ground vehicle. The processing Datalinks and the wireless channels are connected over a 1553B bus. Each Datalink processes messages from a number of sensors and transmits the results across the 1553B backbone through a common set of transmitters to ground vehicles.
Based on analysis of a model of the proposed architecture under the different use scenarios, the optimal architecture was identified as a Datalink with 6 processor boards running at 30 MHz, a 66 MHz shared cache and a 1 Mbps, 4-link downlink.
Output Results
The simulation model evaluates the maximum handling capacity of the processing units, impact of channel errors and speed on the latency, and performance of the 1553B bus. The following metrics are evaluated:
1. End-to-end latency from the sensors to the ground vehicles. This is the time taken to retrieve data from the sensors, process the data, transmit it across the 1553B bus and over the wireless channel, and deliver it to the ground vehicle (DCGS).
2. 1553B Bus Latency Histogram shows the variations of the latency from Datalink Remote Terminals (RT2 and RT3) to the Wireless Transmitter Remote Terminal (RT).
3. Packet Histogram displays packet sizes transmitted across the 1553B bus.
4. 1553B Bus Throughput displays the Peak and Mean throughput on the 1553B bus.
5. Display Rejects plots the times when sensor messages were dropped at the Datalink because of buffer overflow or lack of processing power.
6. Datalink Statistics captures the buffer occupancy, utilization and processing time for each processor, buffer, flash memory and disk.
7. 1553B Bus Statistics captures (1) the buffer occupancy and waiting time at the RT and (2) the utilization of the bus controller.
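As an illustration of how the peak and mean throughput figures in metric 4 could be derived from a packet trace, here is a minimal Python sketch. It is not VisualSim's implementation: the function name `bus_throughput`, the record layout (timestamp in seconds, size in bits), the fixed-window definition of "peak" and the sample trace are all assumptions made for this example.

```python
from collections import defaultdict


def bus_throughput(packets, window_s):
    """Return (mean_bps, peak_bps) for a list of (timestamp_s, size_bits).

    Assumed definitions (not taken from VisualSim):
      mean = total bits divided by the observed time span
      peak = bits in the busiest fixed window of length window_s
    """
    if not packets:
        return 0.0, 0.0
    start = min(t for t, _ in packets)
    end = max(t for t, _ in packets)
    # Guard against a zero-length span when all packets share one timestamp.
    span = max(end - start, window_s)
    total_bits = sum(bits for _, bits in packets)

    # Bucket packets into consecutive fixed windows starting at `start`.
    per_window = defaultdict(int)
    for t, bits in packets:
        per_window[int((t - start) // window_s)] += bits

    mean_bps = total_bits / span
    peak_bps = max(per_window.values()) / window_s
    return mean_bps, peak_bps


# Hypothetical trace, for illustration only.
trace = [(0.0, 100), (0.5, 300), (1.2, 200), (3.0, 400)]
mean_bps, peak_bps = bus_throughput(trace, window_s=1.0)
print(f"mean = {mean_bps:.1f} bit/s, peak = {peak_bps:.1f} bit/s")
```

A sliding window would give a slightly different peak than the fixed windows used here; the choice only matters when comparing numbers across tools.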
Architecture
The system model consists of sensor generators, Datalinks, 1553B RTs, a 1553B bus controller, a wireless channel, error checking and the DCGS ground vehicles. All of these are connected on a single 1553B bus with a single controller.
Figure 1 Top-level VisualSim block diagram of the Unmanned Aerial Vehicle
Sensor Generator: This emulates the sensor data, including acquisition rate, size, header information and sensor-to-Datalink distance. A sensor generator template was created, and each sensor was given different parameter values. The parameters being modified include size, inter-arrival time, processing cycles, delayed start and distance to the Datalink.
Datalink: One Datalink handles many sensors. In this model, a network of 4 sensors feeds a single Datalink. The Datalink contains an RTOS that feeds both a processor array and a flash memory in parallel. The flash memory is a temporary buffer that writes into an archiving system. The processor array contains a number of processors running at a fixed speed. The results from the processors are written into a shared cache, and the RTOS transmits the resulting messages to the RT.
Wireless Channel: The wireless channel is modeled as a multi-link channel with variable error probability. The channel also has an Ack-Nack mechanism to support retransmission.
DCGS: The ground vehicle is a sink that receives messages and computes the latency.
1553B Bus Remote Terminal (RT): This models the queuing, the request for the bus resource, cable propagation delay, response time and the ability to broadcast.
1553B Bus Controller: This performs a simple arbitration according to the 1553B bus standard. The latency across the controller is also specified here.
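The sensor generator template described above can be pictured as one parameter record reused for every sensor. The following Python sketch is only an analogy for the VisualSim block, not its actual interface: the class `SensorConfig`, its field names, the uniform jitter distribution and all values other than 0.0008 and 30% (quoted in the Performance Evaluation section) are assumptions. The source does not state the time unit, so none is assumed here.

```python
import random
from dataclasses import dataclass


@dataclass
class SensorConfig:
    """Hypothetical per-sensor parameters mirroring the template fields."""
    name: str
    message_size_bytes: int      # size
    mean_interarrival: float     # inter-arrival time (unit not stated in source)
    jitter_fraction: float       # e.g. 0.30 for +/- 30%
    processing_cycles: int       # processing cycles per message
    start_delay: float           # delayed start
    distance_to_datalink_m: float


def arrival_times(cfg, count, rng):
    """Return `count` arrival timestamps for one sensor.

    Assumption: each gap is drawn uniformly from
    mean * (1 - jitter) .. mean * (1 + jitter).
    """
    low = cfg.mean_interarrival * (1 - cfg.jitter_fraction)
    high = cfg.mean_interarrival * (1 + cfg.jitter_fraction)
    t = cfg.start_delay
    times = []
    for _ in range(count):
        t += rng.uniform(low, high)
        times.append(t)
    return times


# Four sensors built from the same template; only a few fields differ.
sensors = [
    SensorConfig(f"sensor_{i}", 64, 0.0008, 0.30, 1000, 0.001 * i, 2.0 + i)
    for i in range(4)
]
rng = random.Random(1)  # fixed seed so runs are repeatable
for cfg in sensors:
    print(cfg.name, arrival_times(cfg, 3, rng))
```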
Figure 2 Network of Sensors and the Processing Array on the UAV
Analysis
The initial analysis characterizes a rough architecture that will support the processing requirements for a fixed arrival rate of messages from the sensors. The Datalink architecture is then validated and adjusted for various arrival rates of the sensor traffic.
Figure 3 Analysis Plots captured from the UAV Simulation
Performance Evaluation
The inter-arrival time of messages is set to an initial range of 0.0008 +/- 30% for each sensor. The rough architecture has 4 Sun SPARC processor boards running at 20 MHz, a 66 MHz cache and a 1 Mbps channel. The simulation shows that the Datalink starts dropping messages after 1.37 seconds of simulated operation and continues rejecting messages until the end of the simulation. The end-to-end delay also varies widely, from 1.25 seconds to 2.57 seconds. The cache statistics indicate that there is no buffer overflow and that utilization is quite low, so the cache is not a bottleneck. The individual statistics for the 4 processor boards show a buffer overflow, indicating that the processing speed is too low. The sensor messages are unevenly distributed across the processor boards, with usage ranging from 23% to 100%. In addition, because messages are rejected at the processor boards, the 1553B bus is underutilized.
To refine this architecture, a number of alternatives exist: increase the number of processors, speed up the processors, modify the scheduling algorithm, increase the cache speed, or pipeline the 4 processor boards instead of running them in parallel. In this model, we tried the following: a larger number of faster processors, a larger number of processors at the same speed, a faster cache and a faster channel.
Case 1: We shall first increase the processor speed from 20 MHz to 50 MHz. This is done by changing a single parameter at the top level of the model, which is linked to all 4 processor boards. The rest of the parameters are kept the same. The new latency histogram shows a narrower range of latency values at 1.25, 1.42 and 1.67 seconds. All the sensor messages are processed and transmitted across the 1553B bus without any message being rejected. The processor and cache statistics indicate no buffer overflow. The processor utilization is now uniform across all the processor boards at around 43%. The 1553B controller utilization has increased from 10% to 36%. There is a small amount of buffering at the Remote Terminal-2 (RT), indicating that data is arriving at a faster rate than the 1553B controller can handle. The buffers at the RT prevent any loss of data but add some latency. The mean 1553B bus throughput has more than doubled for the same traffic, from 0.16 Mbps to 0.41 Mbps. The peak throughput on the 1553B bus with this architecture is 0.72 Mbps out of an available 1 Mbps. The shortfall from 1 Mbps is due to protocol overhead and controller latency.
Case 2: The next experiment is to increase the number of processors to 6 and reduce the speed to 30 MHz. The results indicate no significant performance improvement over Case 1. On the other hand, there is a small increase in buffering at the cache, because the shared cache is receiving data at a higher rate. The average processor utilization is slightly higher than in Case 1 but peaks at 50%. The same volume of data is received at the 1553B bus controller, and the mean utilization remains at 36%.
Case 3: Now let us reduce the number of processors back to 4, with each processor running at 30 MHz. There is still a buffer overflow, and the peak latency increases to 3.8 seconds. This is not a viable option.
Case 4: Additional experiments can be performed by simply changing the parameters at the top level of this model. Experiments include (1) increasing the channel speed, (2) increasing the number of channel links and (3) changing the cache speed from 66 MHz to 133/288 MHz. Simulations show that these changes do not reduce the end-to-end latency. At this point the bottleneck is the 1553B bus and not the processing architecture.
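The bus figures reported for Case 1 can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text (0.16, 0.41 and 0.72 Mbps, and the 1 Mbps bus rate); the variable names are illustrative.

```python
# Values quoted in the Case 1 discussion.
mean_before_mbps = 0.16   # rough initial architecture
mean_after_mbps = 0.41    # Case 1
peak_mbps = 0.72          # Case 1 peak
bus_rate_mbps = 1.0       # available 1553B bus rate

# Ratio of mean throughput after and before the processor speed-up.
throughput_gain = mean_after_mbps / mean_before_mbps

# Share of the raw bus rate reached at the peak and on average.
peak_fraction = peak_mbps / bus_rate_mbps
mean_fraction = mean_after_mbps / bus_rate_mbps

print(f"mean throughput gain: {throughput_gain:.2f}x")
print(f"peak share of bus rate: {peak_fraction:.0%}")
print(f"mean share of bus rate: {mean_fraction:.0%}")
```

The remaining share of the bus rate at peak is what the text attributes to protocol overhead and controller latency.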
Cost Comparison
The cost of a processor board is a function of the processor speed. For the sake of this analysis, we shall assume that the price is $1 per MHz. Table 1 shows the comparison:
Case # / # of Processors / Processor Speed / Cost / Viability
Rough initial Architecture / 4 / 20 MHz / $20 * 4 = $80 / No
1 / 4 / 50 MHz / $50 * 4 = $200 / Yes
2 / 6 / 30 MHz / $30 * 6 = $180 / Yes/cost-effective
3 / 4 / 30 MHz / $30 * 4 = $120 / No
Table 1 Cost Comparison of different architectures
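The costs in Table 1 follow directly from the assumed price of $1 per MHz per board. A short Python sketch reproduces the table and picks the cheapest viable option; the viability flags are copied from the table, and the names `PRICE_PER_MHZ`, `board_cost` and `configs` are introduced here for illustration.

```python
PRICE_PER_MHZ = 1.0  # dollars per MHz per board, the assumption stated in the text

# (label, number of processor boards, speed in MHz, viable per Table 1)
configs = [
    ("Rough initial architecture", 4, 20, False),
    ("Case 1", 4, 50, True),
    ("Case 2", 6, 30, True),
    ("Case 3", 4, 30, False),
]


def board_cost(boards, speed_mhz, price_per_mhz=PRICE_PER_MHZ):
    """Total processor cost: boards * speed * price per MHz."""
    return boards * speed_mhz * price_per_mhz


for label, boards, mhz, viable in configs:
    print(f"{label}: ${board_cost(boards, mhz):.0f}, viable={viable}")

# Cheapest option among the configurations that met the requirements.
viable_costs = [(label, board_cost(b, m)) for label, b, m, ok in configs if ok]
cheapest = min(viable_costs, key=lambda item: item[1])
print("cheapest viable:", cheapest)
```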
Functional Analysis
Running this complex system produced a very interesting observation. The bursty nature of the traffic from the sensors to the RTOS increases the buffering on the processor boards without increasing the utilization of the processors. Even though the processors do not reach 100% utilization, a number of messages are still rejected. Several factors affect the overall performance, including the RTOS scheduling, the distribution of work between the parallel processors and bursty data arrival. There are periods of inactivity followed by a burst of traffic that fills the buffer and then causes it to overflow. This behavior can be changed by altering the sensor acquisition mechanism, which was beyond the scope of this evaluation but could easily be added to this experiment as a future extension.
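The effect described above, messages rejected while the processor is far from fully utilized, can be reproduced with a toy queue. The sketch below is a deliberately simplified single-server model in integer ticks, not the VisualSim Datalink: the function `simulate`, the buffer size, the service time and both arrival patterns are hypothetical values chosen to make the contrast visible.

```python
def simulate(arrivals, service_ticks, buffer_size, horizon):
    """Single server with a finite FIFO buffer, stepped in integer ticks.

    Returns (dropped_messages, utilization). A message arriving when the
    server is busy and the buffer is full is dropped.
    """
    pending = sorted(arrivals)
    idx = 0
    queue = 0        # messages waiting in the buffer
    remaining = 0    # ticks left on the message in service
    busy_ticks = 0
    dropped = 0
    for tick in range(horizon):
        # Admit every message that arrives on this tick.
        while idx < len(pending) and pending[idx] == tick:
            if remaining == 0 and queue == 0:
                remaining = service_ticks   # server idle: start immediately
            elif queue < buffer_size:
                queue += 1                  # wait in the buffer
            else:
                dropped += 1                # buffer overflow
            idx += 1
        # Start the next waiting message if the server is free.
        if remaining == 0 and queue > 0:
            queue -= 1
            remaining = service_ticks
        # Do one tick of work.
        if remaining > 0:
            busy_ticks += 1
            remaining -= 1
    return dropped, busy_ticks / horizon


# Same number of messages (20) over the same horizon, different spacing.
smooth = list(range(0, 100, 5))      # one message every 5 ticks
bursty = [0] * 10 + [50] * 10        # two bursts of 10 messages

print("smooth:", simulate(smooth, service_ticks=2, buffer_size=4, horizon=100))
print("bursty:", simulate(bursty, service_ticks=2, buffer_size=4, horizon=100))
```

With these assumed values the evenly spaced traffic is fully served, while the bursty traffic overflows the buffer even though the server is idle for most of the run, because dropped messages never contribute to utilization.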
Summary
Trade studies of this kind can help in analyzing system performance under a variety of operating conditions. Early understanding of optimal performance can save significant prototype testing and deliver much more robust system operation. The model for this presentation was built in a few hours using existing VisualSim library models. A more advanced model could consider the effects of redundant processing and the consolidation of functional modules on a single processing board. Trade-offs in partitioning the functional nodes from the UML diagram onto different boards, or across separate 1553B or VME bus structures, can also be evaluated. Finally, channel interference and jamming can easily be included as refinements to further explore operational effects on performance.
The model has been constructed using highly modular components creating a design platform. The bus in this model can be easily replaced with the VME bus architecture to evaluate the performance on a different backplane. The channel could be modified to try a more unreliable channel or the use of a cellular standard. Future spiral engineering possibilities can be tried and quickly determined to be feasible or not.
Case Study: Designing Complex Defense Systems Page 1 of 5 7/4/2005
Mirabilis Design Inc.; www.mirabilisdesign.com;