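The DaCapo benchmark suite is used frequently in software analysis, especially dynamic analysis, but I never understood it well, so starting today I want to study it in depth. In an earlier blog post I cited a post by Eric Bodden that mainly covers using TamiFlex and Soot to run static analysis on the DaCapo benchmarks, but I still knew little about the suite itself.
The following papers all use the DaCapo benchmarks: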
E. Bodden, "Efficient hybrid typestate analysis by determining continuation-equivalent states," in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, Cape Town, South Africa, 2010, pp. 5-14.
M. Gabel and Z. Su, "Online inference and enforcement of temporal properties," in Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, Cape Town, South Africa, 2010, pp. 15-24.
M. Pradel and T. R. Gross, "Detecting anomalies in the order of equally-typed method arguments," in Proceedings of the 2011 International Symposium on Software Testing and Analysis, Toronto, Ontario, Canada, 2011, pp. 232-242.
The DaCapo benchmarks were first published and introduced at OOPSLA '06:
S. M. Blackburn, R. Garner, C. Hoffmann, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann, "The DaCapo benchmarks: Java benchmarking development and analysis," in Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications, Portland, Oregon, USA, 2006, pp. 169-190.
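I took some time this afternoon to skim the paper above. In the Introduction, the authors explain that their main contribution is a general-purpose, freely available Java benchmark suite drawn from real-world applications. The paper also recommends methods for selecting and comparing benchmarks, for example using time-series measurements and using PCA (principal component analysis) to assess how the benchmarks differ from one another.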

DaCapo home page: http://www.dacapobench.org/
DaCapo downloads: http://sourceforge.net/projects/dacapobench/files/ The latest version at the time of writing is 9.12, released at the end of 2009. This version contains 14 benchmarks:
avrora
simulates a number of programs run on a grid of AVR microcontrollers
batik
produces a number of Scalable Vector Graphics (SVG) images based on the unit tests in Apache Batik
eclipse
executes some of the (non-gui) jdt performance tests for the Eclipse IDE
fop
takes an XSL-FO file, parses it and formats it, generating a PDF file.
h2
executes a JDBCbench-like in-memory benchmark, executing a number of transactions against a model of a banking application, replacing the hsqldb benchmark
jython
interprets the pybench Python benchmark
luindex
Uses Lucene to index a set of documents: the works of Shakespeare and the King James Bible
lusearch
Uses Lucene to do a text search of keywords over a corpus of data comprising the works of Shakespeare and the King James Bible
pmd
analyzes a set of Java classes for a range of source code problems
sunflow
renders a set of images using ray tracing
tomcat
runs a set of queries against a Tomcat server retrieving and verifying the resulting webpages
tradebeans
runs the DayTrader benchmark via Java Beans to a Geronimo backend with an in-memory h2 as the underlying database
tradesoap
runs the DayTrader benchmark via SOAP to a Geronimo backend with an in-memory h2 as the underlying database
xalan
transforms XML documents into HTML
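In practice, however, the 14 benchmarks correspond to only 12 software projects, as shown in the figure below:

DayTrader is split into two benchmarks, tradebeans and tradesoap, and Lucene is split into luindex and lusearch.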
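To double-check the 14-versus-12 count for myself, here is a minimal Python sketch that maps each benchmark name from the list above to the project it comes from. The project labels (for example "Apache FOP" or "Eclipse JDT") are informal names I chose for this sketch, not identifiers defined by DaCapo.

```python
# Benchmark name -> underlying software project, following the 9.12 list above.
# The project labels are informal names chosen for this sketch.
BENCHMARK_TO_PROJECT = {
    "avrora": "Avrora",
    "batik": "Apache Batik",
    "eclipse": "Eclipse JDT",
    "fop": "Apache FOP",
    "h2": "H2",
    "jython": "Jython",
    "luindex": "Lucene",
    "lusearch": "Lucene",
    "pmd": "PMD",
    "sunflow": "Sunflow",
    "tomcat": "Tomcat",
    "tradebeans": "DayTrader",
    "tradesoap": "DayTrader",
    "xalan": "Xalan",
}


def benchmarks_by_project(mapping):
    """Group benchmark names under the project they belong to."""
    grouped = {}
    for benchmark, project in mapping.items():
        grouped.setdefault(project, []).append(benchmark)
    return grouped


if __name__ == "__main__":
    grouped = benchmarks_by_project(BENCHMARK_TO_PROJECT)
    print(len(BENCHMARK_TO_PROJECT), "benchmarks,", len(grouped), "projects")
    # Show only the projects that contribute more than one benchmark.
    for project, names in grouped.items():
        if len(names) > 1:
            print(project, "->", ", ".join(names))
```

Grouping by project makes it easy to see that only DayTrader and Lucene contribute more than one benchmark.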
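However, I am not yet clear on how to actually use the DaCapo benchmarks. The official DaCapo site is also quite terse on this point and only says to use the command: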
java -jar dacapo-9.12-bach.jar
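As a starting point for experimenting, here is a small Python sketch that builds the command line for one benchmark. It rests on an assumption that goes beyond the bare command above: that the harness takes the benchmark name as a positional argument after the jar (for example `avrora`). I have not verified this here, so it should be checked against the harness's own usage output. The helper names `build_dacapo_command` and `run_benchmark`, and the default jar path, are hypothetical and exist only for this sketch.

```python
import subprocess

# The 14 benchmark names listed for DaCapo 9.12.
BENCHMARKS = (
    "avrora", "batik", "eclipse", "fop", "h2", "jython", "luindex",
    "lusearch", "pmd", "sunflow", "tomcat", "tradebeans", "tradesoap",
    "xalan",
)


def build_dacapo_command(benchmark, jar_path="dacapo-9.12-bach.jar"):
    """Return the argument list for running a single benchmark.

    Assumption: the benchmark name is passed as a positional argument
    after the jar. The default jar path is a placeholder for wherever
    the downloaded jar actually lives.
    """
    if benchmark not in BENCHMARKS:
        raise ValueError("unknown DaCapo 9.12 benchmark: " + benchmark)
    return ["java", "-jar", jar_path, benchmark]


def run_benchmark(benchmark, jar_path="dacapo-9.12-bach.jar"):
    """Run one benchmark and return the exit code of the java process."""
    completed = subprocess.run(build_dacapo_command(benchmark, jar_path))
    return completed.returncode


if __name__ == "__main__":
    # Only print the command; call run_benchmark("avrora") to execute it.
    print(" ".join(build_dacapo_command("avrora")))
```

Validating the name against the known list before launching the JVM catches typos early, which is handy when looping over all 14 benchmarks.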