
Category-wise Feature Analysis in SVM-based Hot Method Prediction

Sandra Johnson, Valli Shanmugam

Abstract


The optimizing compiler makes critical decisions about which optimizations to perform and which program segments to target. Frequently executed program segments, called hot methods, are important targets for optimization. Machine-learned prediction of these target segments relieves the virtual machine of profiling and its associated overhead. Because the efficiency of machine-learning-based prediction approaches depends on the core issue of feature selection, this study focuses on the impact of feature categories on the performance of a Support Vector Machine-based hot method predictive model. The standard sequential forward selection algorithm is used for the category-wise analysis and for identifying the most influential feature category and combination of categories. The results show that category combinations have a larger impact on the performance of the predictive model than individual feature categories, and that the model attains its highest prediction accuracy of 67% when trained with a combination of the method size and control flow graph categories.
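The category-wise selection procedure can be sketched in a few lines. The following Python fragment is a minimal, hypothetical illustration, assuming scikit-learn: it wraps an SVM classifier in a standard sequential forward selection loop that operates on whole feature categories rather than on individual features. The category names, feature dimensions, and randomly generated training data are placeholders, not the actual feature set or data used in the study.

```python
# Minimal sketch of category-wise sequential forward selection (SFS) around
# an SVM. All data and category names below are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_methods = 200

# Hypothetical feature categories: each maps to a block of per-method features.
categories = {
    "method_size":        rng.normal(size=(n_methods, 3)),
    "control_flow_graph": rng.normal(size=(n_methods, 4)),
    "loop_structure":     rng.normal(size=(n_methods, 2)),
}
# Hypothetical labels: 1 = hot method, 0 = cold method.
y = rng.integers(0, 2, size=n_methods)

def cv_accuracy(selected):
    """Cross-validated accuracy of an SVM trained on the chosen categories."""
    X = np.hstack([categories[c] for c in selected])
    return cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()

# Sequential forward selection over categories: greedily add the category
# that most improves accuracy, stopping when no addition helps.
selected, best_acc = [], 0.0
remaining = set(categories)
while remaining:
    cand, acc = max(((c, cv_accuracy(selected + [c])) for c in remaining),
                    key=lambda t: t[1])
    if acc <= best_acc:
        break
    selected.append(cand)
    remaining.remove(cand)
    best_acc = acc

print(f"selected categories: {selected}, accuracy: {best_acc:.2f}")
```

Operating on whole categories rather than single features keeps the greedy search cheap: with k categories, the loop trains at most k(k+1)/2 models before terminating.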

Compared with the default optimization heuristics of the Low Level Virtual Machine (LLVM), the predictive model trained with method size and control flow graph features achieves modest speedups of 4% for inlining and 1% for intra-procedural optimizations such as constant propagation and loop unrolling on the MediaBench programs, whereas the results on the UTDSP benchmarks are mixed. Overall, the approach has the advantage of speeding up program execution in virtual machines.

Keywords


Feature Category, Feature Selection, Hot Methods, Machine Learning, Prediction, Virtual Machines.


