
Chapter 3 Power and Performance Analysis

Filters And Datasets

In this chapter, we first discuss the ParaView rendering pipeline. Then we discuss the filters and the data sets that we used for the CPU and integrated GPU power and performance comparison and analysis. Finally, we discuss the results.

The ParaView Rendering Pipeline

The basic steps for visualizing data are reading, filtering, and rendering. Figure 3.1 presents the ParaView visualization pipeline. First, data is read into ParaView; next, a number of filters are applied, which process the data to generate, extract, or derive features from it. Finally, a viewable image is rendered from the data. Below is a description of the visualization pipeline stages that we use.
Sources: The ways to get data into ParaView are reading from a data file or generating data with a built-in source object. For the MPAS and Particulate Ensemble data sets we read the data from the respective data files, and for the Cone and Sphere objects we use the ParaView built-in sources.
Filter Operations: After data is read, multiple filter operations can be applied to discover much more about the data. These filters are functional units that process the input data in order to generate, extract, or derive features from the data. There are many filters available in ParaView; the ones we investigate are described in the section on filters.
Rendering: Rendering, or image synthesis, is the process of generating an image from a 2D or 3D model. ParaView supports data representation with multiple techniques such as surface rendering and volume rendering. In this thesis, we use volume rendering for the built-in data sets and the Particulate Ensemble data set, and surface rendering for the MPAS data set. With volume rendering, a solid mesh is rendered as a translucent solid with the scalar field determining the color and density at every point, which allows features to be seen all the way through the volume. The final output of the rendering operation is a digital image or raster graphics image file.
Writers: The final output files are written to disk by writers. Writers provide attributes such as the file format (ASCII or binary) to write.

The ParaView Filters

Filters connect to form the ParaView visualization pipeline. They process the input data in order to generate, extract, or derive features from the data [8]. Below is a description of the filters that we investigate.
Clip: The Clip filter operates on all types of input data sets and cuts away a portion of the input data set using an implicit plane, without reducing the data set dimensionality. The output data type produced by this filter is an unstructured grid.
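The implicit-plane test at the heart of clipping can be illustrated with a small sketch. This is a toy example on hypothetical point data, not ParaView's implementation (which also rebuilds the cells that straddle the plane into an unstructured grid): a point is kept when its signed distance to the plane is non-negative.

```python
# Toy sketch of the implicit-plane test behind a Clip-style filter.
# Hypothetical data; not ParaView's actual implementation.

def signed_distance(point, plane_origin, plane_normal):
    """Signed distance from a point to the plane (positive on the normal side)."""
    return sum(n * (p - o) for p, o, n in zip(point, plane_origin, plane_normal))

def clip_points(points, plane_origin, plane_normal):
    """Keep only points on the non-negative side of the implicit plane."""
    return [p for p in points
            if signed_distance(p, plane_origin, plane_normal) >= 0.0]

points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (-1.0, 0.5, 0.0)]
# Clip with the y-z plane through the origin, keeping the +x side.
kept = clip_points(points, plane_origin=(0.0, 0.0, 0.0),
                   plane_normal=(1.0, 0.0, 0.0))
```

Points exactly on the plane (distance zero) are kept, matching the convention that the clip surface itself belongs to the retained region.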

Contour: The Contour filter operates on any type of input data set and generates isolines or isosurfaces using a selected point-centered scalar array. This filter requires the input data to have at least one point-centered scalar array. The output data type produced by this filter is polygonal.
Glyph: The Glyph filter operates on any type of input data set and generates a glyph at each point in the input data set. Examples of glyphs are an arrow, cone, cube, cylinder, line, sphere, or 2D glyph; they can be oriented and scaled by the input point-centered scalars and vectors. The output data type produced by this filter is polygonal.
Shrink: The Shrink filter operates on any type of input data set and shrinks each input cell, pulling it away from its neighbors and thus causing the individual cells of a data set to break apart from each other. Each point in a cell moves toward the centroid of all the points in the cell. The output data type produced by this filter is an unstructured grid. The main input to this filter is the shrink factor, which determines how far the points move toward the centroid: a value of 1 keeps the points at their original positions, and a value of 0 positions the points at the centroid of the cell. We use the ParaView defaults.
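The effect of the shrink factor can be sketched as linear interpolation between each point and the cell centroid. This is a toy illustration on a single hypothetical cell, not ParaView's code:

```python
# Toy illustration of the Shrink filter's shrink factor: each point of a
# cell moves toward the cell centroid. A factor of 1.0 keeps points in
# place; 0.0 collapses them all onto the centroid.

def shrink_cell(cell_points, factor):
    """Interpolate each point of one cell toward the cell centroid."""
    n = len(cell_points)
    dim = len(cell_points[0])
    centroid = tuple(sum(p[i] for p in cell_points) / n for i in range(dim))
    return [
        tuple(c + factor * (x - c) for x, c in zip(point, centroid))
        for point in cell_points
    ]

square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
half = shrink_cell(square, 0.5)  # points move halfway toward centroid (1.0, 1.0)
```

Applied independently to every cell of a mesh, this is what makes neighboring cells visibly separate from each other.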
Slice: The Slice filter takes any data set as input and slices it with a plane. Similar to a contour, it creates lines from surfaces and surfaces from volumes, extracting the portion of the input data set lying along the slicing plane. The output data type produced by this filter is polygonal.
StreamTracer: The StreamTracer filter operates on any type of input data set having point-centered vectors and generates streamlines in a vector field from a collection of seed points. Production of a streamline terminates when it crosses the exterior boundary of the input data set, or when the maximum number of steps, terminal speed, or maximum propagation limit is reached. The output of this filter is polygonal data containing polylines.
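The termination logic can be sketched with a minimal forward-Euler tracer. This is a simplification: ParaView uses higher-order integrators and also checks terminal speed and maximum propagation length, and the uniform vector field and unit-square domain below are hypothetical stand-ins.

```python
# Minimal streamline tracer: Euler steps from a seed point, terminating
# when the trace leaves the domain or the maximum step count is reached.
# (ParaView's StreamTracer additionally checks terminal speed and maximum
# propagation length, and uses higher-order integrators.)

def trace_streamline(seed, velocity, inside, step=0.1, max_steps=100):
    point = seed
    line = [point]
    for _ in range(max_steps):
        v = velocity(point)
        point = tuple(p + step * vi for p, vi in zip(point, v))
        if not inside(point):      # crossed the exterior boundary
            break
        line.append(point)
    return line

# Hypothetical uniform field flowing in +x, inside the unit square.
field = lambda p: (1.0, 0.0)
domain = lambda p: 0.0 <= p[0] <= 1.0 and 0.0 <= p[1] <= 1.0
line = trace_streamline((0.0, 0.5), field, domain)
```

The polyline accumulated in `line` is the analogue of one polyline in the filter's polygonal output; tracing one such line per seed point yields the full streamline set.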
Warp By Vector: The Warp By Vector filter operates on polygonal, curvilinear, and unstructured grid input data sets and translates the points of the data set along a selected vector by a distance determined by the specified scale factor. The output data set type for this filter is the same as the input, as this operation changes only the positions of the input points.


The Simulation Data sets

The data sets that we use for our analysis are described below. Cone and Sphere are ParaView built-in data sets; Model for Prediction Across Scales-Ocean (MPAS-O) and Particulate Ensemble are real-world data sets.
Built-In Data Sets: We use the ParaView built-in three-dimensional sphere and cone data sets. Figure 3.2 shows different ParaView filters applied to the sphere data set.
Figure 3.2: Different ParaView filters applied to the sphere data set

MPAS Ocean: MPAS is a collaborative project between the climate modeling groups at Los Alamos National Laboratory and the National Center for Atmospheric Research. It develops atmosphere, ocean, and other earth-system simulation components for use in climate, regional climate, and weather studies. MPAS-Ocean can simulate the ocean system on spatial scales from below 1 km to global circulations, and on time scales of months to millennia. Many MPAS model components use unstructured Voronoi meshes and C-grid discretization as their basis. We use a scaled-down, 1.00*1.00 data set for our analysis. Figure 3.3 shows the glyph filter applied to the MPAS-O data set.
Particulate Ensemble: This data set is taken from the Scientific Visualization Contest 2016 and corresponds to the simulation of a solid body of salt dissolving in a cylindrical flow domain that contains water. The salt sits at the top of the cylinder, and the top is modeled by a corresponding boundary condition. Due to the transient nature of the solution process, the simulation proceeds in multiple time steps. For our analysis, we take the data set corresponding to a single run of the simulation.


We measure the impact of techniques such as dynamic voltage and frequency scaling (DVFS) and integrated GPU usage on the greenness of certain visualization tasks, shown in Table 3.1. Greenness may refer to power, energy, or the energy-delay product (EDP).
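These three greenness metrics follow directly from measured power and execution time: energy is power times time, and EDP weights energy by the delay. A small sketch with illustrative numbers (not measurements from this thesis):

```python
# Greenness metrics from measured power (W) and execution time (s):
#   energy E = P * t           (joules)
#   EDP    = E * t = P * t**2  (joule-seconds)
# The numbers below are illustrative placeholders, not thesis measurements.

def energy(power_w, time_s):
    return power_w * time_s

def edp(power_w, time_s):
    return power_w * time_s ** 2

cpu_p, cpu_t = 40.0, 2.0   # higher power, faster
gpu_p, gpu_t = 15.0, 4.0   # lower power, slower

cpu_edp = edp(cpu_p, cpu_t)   # 40 * 4  = 160.0 J*s
gpu_edp = edp(gpu_p, gpu_t)   # 15 * 16 = 240.0 J*s
```

Because EDP squares the execution time, it penalizes slow-but-frugal devices; that is why a device with lower power draw can still lose on EDP, which is the trade-off examined in the results below.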


Table 3.2 presents the execution time, dynamic power, dynamic energy, and EDP for the integrated GPU for various visualization tasks and data sets. The values presented in this table are normalized with respect to the CPU (e.g., normalized execution time = execution time on GPU / execution time on CPU).
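The normalization used in Table 3.2 can be sketched as a simple ratio against the CPU baseline (the raw values here are placeholders, not the thesis's measurements):

```python
# Normalizing GPU measurements against the CPU baseline, in the style of
# Table 3.2: normalized value = GPU value / CPU value, so a normalized
# value below 1.0 favors the GPU. Placeholder numbers, not measurements.

def normalize(gpu_values, cpu_values):
    return {metric: gpu_values[metric] / cpu_values[metric]
            for metric in gpu_values}

cpu = {"time_s": 2.0, "power_w": 40.0}   # baseline device
gpu = {"time_s": 4.0, "power_w": 15.0}

norm = normalize(gpu, cpu)
# norm["time_s"]  -> 2.0   (GPU twice as slow)
# norm["power_w"] -> 0.375 (GPU draws ~a third of the power)
```

Reporting ratios rather than raw values makes tasks of very different sizes directly comparable in one table.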
We can clearly see that there exists a trade-off between performance and power for the two devices. While the CPU exhibits better performance for nearly all of our visualization tasks, the GPU consistently consumes less power. This trade-off results in widely varying energy and EDP characteristics for the two devices. To shed light on the nature of the greenness characteristics of the two devices, we summarize the EDP by data set and visualization task in Fig. 3.4 and Fig. 3.5, respectively.
Data Set Results: Figure 3.4 shows the EDP for the CPU and GPU for different data sets. Note that the data sets are arranged in increasing order of data size. The cone and sphere are on the order of kilobytes (KB), while the PE and MPAS-O data sets consume megabytes (MB) and gigabytes (GB), respectively. The EDP increases with data size, but the rate of increase differs for the two devices. The cone and sphere show similar EDP profiles, that is, the CPU achieves a significantly lower EDP in both cases, as they are similar in size. However, as the data size increases, the percentage difference in EDP shrinks significantly, as in the case of PE, and when the data size exceeds a gigabyte (MPAS-O), the EDP of the GPU becomes lower (i.e., better) than that of the CPU.
Visualization Task Results: Figure 3.5 presents the mean EDP for the CPU and the GPU for different filters. We find that the modern CPU is well optimized for visual processing tasks and delivers a better EDP for nearly all the tasks. The average EDP for the CPU is 21% better than that of the GPU. One major exception is the glyph filter, where the integrated GPU does better, with an EDP that is only 89% of the CPU's EDP. This is because the glyph operation is computationally more intensive than the other operations.


Power Analysis and Prediction for Heterogeneous Computation
