/**********************************************
 Notes on thesis work
 Qingfeng Duan
**********************************************/

******************************************************************
Table of Contents:
* 1. SimpleScalar-ppc installation and use
*   1.1 Installation (Linux/x86, AIX/PowerPC, Solaris/UltraSparc)
*   1.2 Use the simulator
*   1.3 Learn more (tutorial v4)
*   1.4 Modifications (add live/dead analysis)
* 2. Jikes RVM Installation
*   2.1 On Intel x86/Linux
*   2.2 On PowerPC/AIX
*     2.2.1 jvm98
*     2.2.2 pseudojbb
*   2.3 Jikes RVM - UMASS snapshot
*     2.3.1 Installation on PowerPC/AIX
*   2.4 Cross-build Jikes RVM IA32/linux->PowerPC/linux
*   2.5 Cross-build Jikes RVM UMASS IA32/linux->PowerPC/linux
* 3. Spec2000
*   3.1 Compile individual benchmark by hand
*       on PowerPC/AIX
*   3.2 Run on PowerPC/AIX
*       and on top of SimpleScalar-ppc
*   3.3 Measurements of dead ratio on Solaris/UltraSparc using
*       modified ss3ppc (on sampi.cs.unm.edu)
* 4. DynamicSimpleScalar-PPC and JikesRVM-2.0.3
*   4.1 Sources
*     4.1.1 DynamicSimpleScalar
*     4.1.2 JikesRVM-2.0.3
*   4.2 Build
*     4.2.1 DynamicSimpleScalar
*       1) on UltraSparc/Solaris
*       2) on PowerPC/AIX
*     4.2.2 JikesRVM-2.0.3
*       Cross Build - Intel/x86/Linux -> PowerPC/AIX
*   4.3 Modifications of DSS-PPC (add dead/live times)
*   4.4 Run Jikes on top of DynamicSimpleScalar
*     4.4.1 HelloWorld - pseudojbb - jvm98
*     4.4.2 SPEC2000
*   4.5 Tracing-enabled version of Jikes RVM and DSS
* 5. Results
*   5.1 pseudojbb/jvm98 (JikesRVM2.0.3/DSS)
*   5.2 spec2000 C benchmarks (DSS)
*   5.3 various heap sizes for java benchmarks
*
******************************************************************

1. SimpleScalar installation and use

Goal: build SimpleScalar targeted to PowerPC/AIX.
Building SimpleScalar directly on the native PowerPC/AIX host (vista)
fails with "virtual memory exhausted!". Ask the system administrator to
raise the virtual memory limit?
So, build SimpleScalar on Intel x86/Linux instead.
-------------------------------------------------------
1.1 installation
------------------------------------------------------
Host: Intel x86/Linux (rcde42.ahpcc.unm.edu)
Target architecture: PowerPC/AIX (vista.ahpcc.unm.edu)

*Download:
    http://www.simplescalar.com/v4test.html
    or,
    (http://www.cs.utexas.edu/users/cart/code/ss-ppc-big.tgz)
    http://www.cs.utexas.edu/users/cart/code/ss-ppc-little.tgz

*Unpacking:
    zcat ss-ppc-little.tgz | tar xvf -
    => (directory ./ss3ppc) or directory ./ss3ppc-linux

*Read 'README.ppc' and modify Makefile
    This is the PowerPC port of SimpleScalar 3.0
    To compile:
        make clean
        make config-ppc
        make sim-fast

1.1.1 Linux version on x86/Linux
    before making:
        rm -f machine.def
        ln -s target-ppc/powerpc-nonnative.def machine.def
    make will build the entire toolset

    In Makefile: replace
        MFLAGS = `./sysprobe -flags`
    with
        MFLAGS = `~/ss3ppc-linux/sysprobe -flags`

    cd libcheetah; in line 42 of libcheetah.c, B is redeclared as int,
    so change all occurrences of B to _B.
    Make the same change in sacopt.c, dmvl.c, util.c, faclru.c and saclru.c.

1.1.2 AIX/PowerPC
    Note: Building the ss3ppc-aix version on vista with gcc fails with
    'virtual memory exhausted' during 'make outorder'.
    The same version does compile on a UMASS machine, so the version
    running on vista was actually precompiled on a UMASS machine.

1.1.3 Solaris/UltraSparc
    sigma.cs.unm.edu (Solaris 9 / UltraSPARC)
    In Makefile:
        MFLAGS = `./sysprobe -flags` -DSOLARIS
        remove -mpowerpc from OFLAGS
    after make config-ppc:
        rm -f machine.def
        ln -s target-ppc/powerpc-nonnative.def ./machine.def
        make sim-outorder

    ar: command not found
        add /usr/ccs/bin to the path

    linker errors:
        Undefined                       first referenced
         symbol                             in file
        faccessx                            syscall.o
        fclear                              syscall.o
        accessx                             syscall.o
        ld: fatal: Symbol referencing errors. No output written to sim-fast
    As Karu suggested by email, just comment out the lines that call
    faccessx, fclear, and accessx.
--------------------------------------------------------------
1.2: use
--------------------------------------------------------------
Timing:

pi.c:
----
/* Description: This program calculates the value of Pi using
   (pi/4)=arctan(1) and arctan(x) = x-x^3/3+x^5/5-x^7/7+.... */
#include <stdio.h>
#include <sys/time.h>

static int COUNT = 100000000;
static int i;
static float pi = 0;
static double before, after;

void wtime(double* t);

int main()
{
    wtime(&before);
    /* alternating series: odd terms are added, even terms subtracted */
    for (i = 1; i <= COUNT; i++) {
        if (i % 2)
            pi = pi + (1.0 / (2.0 * i - 1));
        else
            pi = pi - (1.0 / (2.0 * i - 1));
    }
    pi=pi*4;
    printf("The approximate value of pi = %f\n", pi);
    wtime(&after);
    printf("Execution time = %f Seconds\n", after-before);
    return 0;
}

/* gettimeofday() variant (disabled):
void wtime(double* t)
{
    struct timeval tb;
    int iret = gettimeofday(&tb,NULL);
    *t = (double) ((double)(tb.tv_sec) + (double)(tb.tv_usec)*1.0e-6);
}
*/

/* AIX variant: wall-clock time in seconds via gettimer() */
void wtime(double *t)
{
    struct timestruc_t tb;
    int iret;
    iret = gettimer(TIMEOFDAY,&tb);
    *t = (double) ((double)(tb.tv_sec) + (double)(tb.tv_nsec)*1.0e-9);
}

/////////
//Results:
/////////
ssh vista.ahpcc.unm.edu
cd csthesis/PowerPCISA
bash-2.05$ gcc -o pi pi.c
bash-2.05$ pi
The approximate value of pi = 3.141597
Execution time = 21.158037 Seconds

Using SimpleScalar:
1) compile on PowerPC/AIX
    bash-2.05$ gcc -static -o pi.sim pi.c
2) run simulator on Intel x86/Linux
    ss3ppc-linux/sim-fast -redir:sim ./csthesis/PowerPCISA/simpi.out -redir:prog ./csthesis/PowerPCISA/simpip.out ./csthesis/PowerPCISA/pi.sim

**********
Here is a test for the UMASS-compiled ss3ppc-aix on vista
see ~qfduan/csthesis/PowerPCISA for details
    bash-2.05$ cc -bnso -bI:/usr/lib/syscalls.exp -o pi.simcc pi.c
    sim-safe -redir:sim sim-safe.sim.out -redir:prog pi.simcc
    sim-fast -redir:sim sim-fast.sim.out -redir:prog pi.simcc
    sim-cache $PPC_SS_CACHE -redir:sim sim-cache.sim.out pi.simcc
    sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim-outorder.sim.out pi.simcc

-------------------------------------------------------------
1.3: learn more
-------------------------------------------------------------
Get tutorial: http://www.simplescalar.com/docs/simple_tutorial_v4.pdf

1) Simulator Software Architecture
    - Target software (apps and OS) runs on simulator
    - Performance model tracks time
        Perf core implements machine
        Standard modules speed coding
    - Simulation kernel provides event simulation services
    - Target ISA emulation support
        PISA, Alpha, StrongARM, PPC, x86
    - Target I/O support
        Syscalls, devices, I/O traces

        Target Application and OS
        =========================
        Hardware Model: Fetch Pipeline Predictor Caches   Perf Core
        =========================
        Simulation Kernel
        =========================
        Target ISA      Target I/O Interface
        =========================
        Host Platform

2) Standard Models

    Sim-Fast     Sim-Safe     Sim-Profile   Sim-Cache/Sim-Cheetah   Sim-Outorder
    420 lines    350 lines    ~1000 lines   900 lines               3900 lines
    no timing    no timing    functional    no timing               performance
    4+ MIPS      w/ checks    cache stats   lot of stats            OoO issue
                                                                    cache

3) pp47: ss-ppc
    SimpleScalar Simulation of the PowerPC Instruction Set Architecture
    www.cs.utexas.edu/users/cart

Learn sim-cache:

Usage: sim-cache {-options} executable {arguments}

sim-cache: This simulator implements a functional cache simulator. Cache
statistics are generated for a user-selected cache and TLB configuration,
which may include up to two levels of instruction and data cache (with any
levels unified), and one level of instruction and data TLBs. No timing
information is generated.
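
A worked example may help when reading the cache configurations used throughout these notes. The sketch below is plain Python and not part of SimpleScalar; `parse_cache_config` and `capacity_bytes` are names made up here. It assumes a config string has five colon-separated fields (name, number of sets, block size in bytes, associativity, replacement policy) and that capacity is sets x block size x associativity.

```python
from typing import NamedTuple


class CacheConfig(NamedTuple):
    name: str
    nsets: int   # number of sets
    bsize: int   # block size in bytes
    assoc: int   # associativity (ways per set)
    repl: str    # 'l' = LRU, 'f' = FIFO, 'r' = random


def parse_cache_config(text: str) -> CacheConfig:
    """Split a 'name:nsets:bsize:assoc:repl' string into typed fields."""
    name, nsets, bsize, assoc, repl = text.split(":")
    if repl not in ("l", "f", "r"):
        raise ValueError(f"unknown replacement policy: {repl!r}")
    return CacheConfig(name, int(nsets), int(bsize), int(assoc), repl)


def capacity_bytes(cfg: CacheConfig) -> int:
    """Total data capacity implied by the geometry."""
    return cfg.nsets * cfg.bsize * cfg.assoc


if __name__ == "__main__":
    for text in ("dl1:256:32:1:l", "ul2:1024:64:4:l", "dl1:128:32:8:l"):
        cfg = parse_cache_config(text)
        print(f"{text:18s} {capacity_bytes(cfg) // 1024} KB, {cfg.assoc}-way")
```

Under these assumptions dl1:256:32:1:l is an 8 KB direct-mapped cache, ul2:1024:64:4:l is a 256 KB 4-way cache, and the dl1:128:32:8:l used in the sim-outorder setup later in this section is a 32 KB 8-way cache.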
-cache:dl1  <string>  # dl1:256:32:1:l   # l1 data cache config, i.e., {<config>|none}
-cache:dl2  <string>  # ul2:1024:64:4:l  # l2 data cache config, i.e., {<config>|none}
-cache:il1  <string>  # il1:256:32:1:l   # l1 inst cache config, i.e., {<config>|dl1|dl2|none}
-cache:il2  <string>  # dl2              # l2 instruction cache config, i.e., {<config>|dl2|none}
-tlb:itlb   <string>  # itlb:16:4096:4:l # instruction TLB config, i.e., {<config>|none}
-tlb:dtlb   <string>  # dtlb:32:4096:4:l # data TLB config, i.e., {<config>|none}

The cache config parameter <config> has the following format:

    <name>:<nsets>:<bsize>:<assoc>:<repl>

    <name>   - name of the cache being defined
    <nsets>  - number of sets in the cache
    <bsize>  - block size of the cache
    <assoc>  - associativity of the cache
    <repl>   - block replacement strategy, 'l'-LRU, 'f'-FIFO, 'r'-random

    Examples:   -cache:dl1 dl1:4096:32:1:l
                -dtlb dtlb:128:4096:32:r

Cache levels can be unified by pointing a level of the instruction cache
hierarchy at the data cache hierarchy using the "dl1" and "dl2" cache
configuration arguments. Most sensible combinations are supported, e.g.,

    A unified l2 cache (il2 is pointed at dl2):
      -cache:il1 il1:128:64:1:l -cache:il2 dl2
      -cache:dl1 dl1:256:32:1:l -cache:dl2 ul2:1024:64:2:l

    Or, a fully unified cache hierarchy (il1 pointed at dl1):
      -cache:il1 dl1
      -cache:dl1 ul1:256:32:1:l -cache:dl2 ul2:1024:64:2:l

Darko's setups for sim-outorder:

export POWERPC_SIMPLESCALAR_DESCRIPTION="-fetch:ifqsize 16 -fetch:mplat 6 \
-bpred:bimod 2048 -bpred:ras 8 \
-decode:width 4 -decode:width 4 -issue:width 4 -issue:inorder false \
-issue:wrongpath true -commit:width 4 -ruu:size 16 -lsq:size 4 \
-cache:dl1 dl1:128:32:8:l -cache:dl1lat 2 \
-cache:il1 il1:128:32:8:l -cache:il1lat 2 \
-cache:dl2 ul2:512:64:8:r -cache:il2 dl2 -cache:dl2lat 9 -cache:il2lat 13 \
-tlb:dtlb dtlb:64:4096:2:l -tlb:itlb itlb:64:4096:2:l -tlb:lat 30 \
-mem:lat 37 2 -mem:width 8 \
-res:ialu 4 -res:imult 1 -res:fpalu 1 -res:fpmult 1 -res:memport 2"

-----------------------------------------------------------------------
1.4 Modifications
-----------------------------------------------------------------------
Objective: to obtain the total live and dead times of cache lines
over the entire program run.
See cache.[ch] (for example on sampi.cs.unm.edu) for details.

======================================================================
2. Jikes RVM Installation

2.1 on Intel/Linux
    - Succeeded
        rvm PPCDisassembler

2.2 on PowerPC/AIX
    - IBM JDK 1.3.1 has been installed on vista (by Chris)
    - install jikes compiler 1.15 in home directory
        ./configure CXX=/usr/bin/xlC CXXFLAGS="-qlanglvl=ansi -qnotempinc -+ -qinlglue" --prefix=/home/qfduan/jikes-1.15
        make
            in long.cpp: line 477 complains "parameter list error for operator '?:'"
            solution: change it to if ... else ... (see line 465)
        make install
    - build jikes RVM:
        - set environment variables
        - edit ./rvmRoot/rvm/config/powerpc-ibm-aix4.3.3.0
          Download jazzlib-binary-0.04-juz.jar into ./rvm/support/lib from
          http://people.debian.org/~jewel//jikes/jikesport/jazzlib/
          and RENAME it to jazzlib.jar
        - ./jconfigure BaseBaseSemispace
        - ./jbuild
            jbuild.compile: (classes compiled) (jksvm.jar built)
            jbuild.linkImage: (bootimage cleaned) (primordials updated)
            make: *** [/home/qfduan/rvmBuild/RVM.scratch/libyuck.jar] Error 127
            Solution: unzip needs to be installed
        - ./jconfigure OptOptSemispace
        - ./jbuild
            ...
            jbuild.compile: (classes compiled) (jksvm.jar built)
            jbuild.linkImage: (bootimage cleaned) (primordials updated)
            VM_BootImageCompiler: init (opt compiler)
            stackpointer=2ff20bb8
            Writing Java core file ....
            Written Java core to /home/qfduan/rvmRoot/rvm/src/tools/bootImageWriter/javacore12212.1035924710.txt
            make: *** [/home/qfduan/rvmBuild/RVM.image] Segmentation fault

2.2.1 jvm98
    on vista machine: ~/jvm98
    How to run:
        cd ~/jvm98
        rvm SpecApplication _201_compress
        rvm SpecApplication _213_javac
    Some of these require a lot of memory, so you really have to say:
        rvm -X:h=200 SpecApplication _213_javac
    which means that JikesRVM will use 200 MB of heap.
    benchmarks: (see index.html in ./jvm98 for details)
        _200_check
        _201_compress
        _202_jess
        _205_raytrace
        _209_db
        _213_javac
        _222_mpegaudio
        _227_mtrt
        _228_jack
        _999_checkit

2.2.2 pseudojbb
    on vista machine: ~/pseudojbb
    in csh,
        rvm -X:h=100 -classpath ./jbb.jar:./jbb_no_precompile.jar:./check.jar:./reporter.jar:./src spec.jbb.JBBmain -propfile SPECjbb.props |& cat
    Important: You have to put |& cat at the end; otherwise the program
    hangs during the run.

2.3 Jikes RVM - UMASS
2.3.1 installation on PowerPC/AIX (vista)
    #-------------------------------------------------------------------------
    # Jikes 2.0.3 UMASS
    #-------------------------------------------------------------------------
    setenv RVM_ROOT $HOME/jikes2.0.3-umass # <--define your working directory
    setenv RVM_BUILD $HOME/jikes2.0.3-umass-build
    setenv PATH $RVM_ROOT/rvm/bin:$PATH
    setenv RVM_HOST_CONFIG $RVM_ROOT/rvm/config/powerpc-ibm-aix4.3.3.0
    setenv RVM_TARGET_CONFIG $RVM_ROOT/rvm/config/powerpc-ibm-aix4.3.3.0
    # unzip
    setenv PATH $HOME/unzip-aix:$PATH
    # classpath for jikes compiler 1.15
    setenv PATH $HOME/jikes-1.15/bin:$PATH

    jconfigure GCTkAppelOptOpt

    jbuild.compile: (classes compiled)
    jbuild.linkImage: (bootimage cleaned) (primordials updated) (built workaround zip)
    VM_BootImageCompiler: init (opt compiler)(bootimage linked)
    jbuild.linkBooter: (booter cleaned) (booter linked)ld: 0711-317 ERROR: Undefined symbol: .pthread_self
    ld: 0711-317 ERROR: Undefined symbol: .pthread_mutex_init
    ld: 0711-317 ERROR: Undefined symbol: .pthread_mutex_lock
    ld: 0711-317 ERROR: Undefined symbol: .pthread_join
    ld: 0711-317 ERROR: Undefined symbol: .pthread_mutex_unlock
    ld: 0711-317 ERROR: Undefined symbol: .pthread_mutex_destroy
    ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.
    collect2: ld returned 8 exit status
    ar: creating an archive file libjni.a
    ar: No such file or directory
    ar: 0707-117 The fopen system call failed on file libjni.so.
    cp: /home/qfduan/jikes2.0.3-umass-build/RVM.scratch/libjni.a: No such file or directory
    (JNI libraries linked)

    Solution: in jbuild.linkBooter, at line 55 add:
        CCLIBS="-lpthread -lm"

2.3.2 jvm98
2.3.3 pseudojbb

2.4 cross-build Jikes RVM IA32(kappa/linux)->PPC(digamma/linux)
    - host/target i686-pc-linux-gnu built on kappa successfully.
      in configuration file, HOST_JAVA_HOME = "/usr/java/jdk1.3.1_02"
    - cross build
      Also need to set HOST_JAVA_HOME = "/usr/java/jdk1.3.1_02" in
      powerpc-unknown-linux-gnu, which means that all Java sources are
      built on the host machine.
      Then you can see "please run me on Linux PowerPC".
        ssh digamma.cs.unm.edu
        ./jbuild.linkBooter
        % rvm Hello   <- good

2.4.1 jvm98
    Runs OK:
        _201_compress
        _202_jess
        _209_db
        _213_javac
        _228_jack
    Runs Bad:
        _200_check

    [qfduan@digamma ~/jvm98]$ rvm -X:h=200 SpecApplication _213_javac
    Caching Off
    Speed = 100
    ======= _213_javac Starting =======
    Run 0 start. Total memory=262144000 free memory=103940096
    Javac benchmark starting...
    Javac benchmark starting...
    Javac benchmark starting...
    Javac benchmark starting...
    File output byte count = 3909160 checksum = -40392
    #### IO Statistics for this Run ####
    ## IO time : 0.597 seconds
    ## No. of File opens : 448
    ## No. of Byte Reads from cache : 0
    ## No. of Byte Reads from File : 8459868
    ## No. of Byte Reads from Url : 0
    #### Cumulative Cache Stats: N 0, B 0, H 0, M 448
    #### No. of HTTP retries : 0
    Run 0 end. Total memory=262144000 free memory=101777408
    ======= _213_javac Finished in 71.251 secs

2.4.2 pseudojbb
    UNKNOWN ERRORS!

2.5 Cross-build Jikes RVM - UMASS IA32->PPC
    host: i686-pc-linux-gnu
    target: powerpc-unknown-linux-gnu
    For both: HOST_JAVA_HOME = "/usr/java/jdk1.3.1_02"
    - ON kappa
        jconfigure GCTkAppelOptOpt
        ./jbuild
    - ssh digamma.cs.unm.edu
        ./jbuild -booter
    Succeeded!

    [qfduan@digamma ~/rvmTest]$ rvm -X:h=100 Hello
    small heap = 104857600, large heap = 26214400
    vm: booting
    VM_RuntimeCompiler: boot (opt compiler)
    Hello World.
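
The byte counts in the logs above can be checked against the -X:h value with a little arithmetic. The sketch below is only an illustration: `heap_bytes` is a name made up here, and the rule that the large heap is one quarter of the small heap is an assumption inferred from the -X:h=100 log, not something taken from the Jikes RVM sources.

```python
from typing import Tuple

MB = 1024 * 1024


def heap_bytes(h_mb: int) -> Tuple[int, int, int]:
    """Return (small heap, large heap, total) in bytes for -X:h=<h_mb>.

    Assumption: large heap = small heap / 4, which matches the
    'small heap = 104857600, large heap = 26214400' line for -X:h=100.
    """
    small = h_mb * MB
    large = small // 4
    return small, large, small + large


if __name__ == "__main__":
    print(heap_bytes(100))
    print(heap_bytes(200))
```

With -X:h=200 the same rule gives a total of 262144000 bytes, which is the "Total memory" value in the _213_javac log of section 2.4.1, although that log comes from a different build and the agreement may be a coincidence.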
2.5.1 jvm98
    Runs OK:
        _200_check
        _202_jess
        _205_raytrace
        _209_db
        _213_javac
        _222_mpegaudio
        _227_mtrt
        _228_jack
        _201_compress
    Bad:
        _999_checkit   got some exceptions

2.5.2 pseudojbb
    GcTK: out of memory

======================================================================
3. Spec2000

3.1 Compile individual benchmark by hand on PowerPC/AIX
    reading: (./spec2000/docs/)
        readme1st.txt
        execution_without_SPEC_tools.txt
        install_guide_unix.txt

1) ./benchspec/CFP2000
   a) 177.mesa
      see Makefile

        CC = cc
        CXX = xlC
        CFLAGS = -O3 -qarch=ppc
        LIBS = -bnso -bI:/usr/lib/syscalls.exp
        MATH_LIBS = -lm
        PROG = mesa.ppcaix
        OBJS = accum.o alpha.o alphabuf.o api1.o api2.o attrib.o \
        SOURCES = accum.c alpha.c alphabuf.c api1.c api2.c attrib.c \

        $(PROG): $(OBJS)
                $(CC) -o $(PROG) $(OBJS) $(MATH_LIBS) $(LIBS)
        $(OBJS): $(SOURCES)
                $(CC) -c $(CFLAGS) $(SOURCES)
        clean:
                /bin/rm *.o

      in ./data/test/input (done)
         $ ../../../src/mesa.ppcaix -frames 10 -meshfile mesa.in -ppmfile mesa.ppm
         bash-2.05$ diff mesa.ppm ../output/mesa.ppm
      in ./data/train/input (done)
         $ ../../../src/mesa.ppcaix -frames 500 -meshfile mesa.in -ppmfile mesa.ppm
      in SimpleScalar
         $ sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.mesa.out mesa.ppcaix -frames 500 -meshfile mesa.in -ppmfile mesa.ppm
         Finish loading
         Segmentation fault

   b) 179.art
      test/input: (done)
         ../../../src/art.ppcaix -scanfile c756hel.in -trainfile1 a10.img -stride 2 -startx 134 -starty 220 -endx 139 -endy 225 -objects 1 > test.out
      train/input: (done)
         ../../../src/art.ppcaix -scanfile c756hel.in -trainfile1 a10.img -stride 2 -startx 134 -starty 220 -endx 184 -endy 240 -objects 3 > train.out
      ssppc-aix: (done)
         $ sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim-outorder.art.out art.ppcaix -scanfile c756hel.in -trainfile1 a10.img -stride 2 -startx 134 -starty 220 -endx 184 -endy 240 -objects 3 > train.out

2) ./benchspec/CINT2000
   A) 176.gcc
      test/input: (done)
         ../../../src/gcc.ppcaix cccp.i -o cccp.s
      train/input: (done)
         ../../../src/gcc.ppcaix cp-decl.i -o cp-decl.s
      ssppc-aix: (done)
         bash-2.05$ sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim-outorder.gcc.out gcc.ppcaix cp-decl.i -o cp-decl.s

   B) 255.vortex
      test/input: (done)
         ../../../src/vortex.ppcaix bendian.raw -o vortex.out
      train/input: (done)
         ../../../src/vortex.ppcaix bendian.raw -o vortex.out
      ssppc-aix: (done)
         sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim-outorder.vortex.out vortex.ppcaix bendian.raw -o vortex.out

   C) 300.twolf
      test/input: (done)
         ../../../src/twolf.ppcaix test > test.stdout
      train/input: (done)
         ../../../src/twolf.ppcaix train > train.stdout
      ssppc-aix: (done)
         sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim-outorder.twolf.out twolf.ppcaix train > train.stdout

   D) 164.gzip
      test/input: (done)
         ../../../src/gzip.ppcaix input.compressed 2 > input.compressed.out
      train/input: (done)
         ../../../src/gzip.ppcaix input.combined 32 > input.combined.out
      ssppc-aix:
         sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim-outorder.gzip.out gzip.ppcaix input.combined 32 -o input.combined.out
         Finished loading
         spec_init
         Loading Input Data
         Duplicating 3121844 bytes
         Duplicating 6243688 bytes
         Duplicating 12487376 bytes
         Duplicating 8579680 bytes
         Input data 33554432 bytes in length
         Compressing Input Data, level 1
         Compressed data 26375537 bytes in length
         Uncompressing Data
         Segmentation fault (core dumped)

   ---------------------------
   A) 254.gap
      test/input:
         ../../../src/gap.ppcaix -l ./ -q -m 64M -i test.in -o test.out
      train/input:
         ../../../src/gap.ppcaix -l ./ -q -m 128M -i train.in -o train.out

   B) 197.parser
      test/input: (pretty slow)
         ../../../src/parser.ppcaix 2.1.dict -batch -i test.in -o test.out -e test.err
      train/input:
         ../../../src/parser.ppcaix 2.1.dict -batch -i train.in -o train.out

-----------------------------
3.3 Measurements of dead ratio on Solaris/Sparc using ss3ppc
    (./ss3ppc-spec2k on sampi.cs.unm.edu)
    Notes:
      - Using ref inputs
      - Sources precompiled on PowerPC/AIX (on vista)

3.3.1 cint2000
  (1) 164.gzip (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.gzip.source.out -fastfwd 1000000000 \
          -max:inst 500000000 gzip.ppcaix input.source 60 -o input.source.out
              input.log 60 -o input.log.out
              input.graphic 60 -o input.graphic.out
              input.random 60 -o input.random.out
              input.program 60

  (2) 175.vpr (cannot open arch.in?)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.vpr.place.out -fastfwd 1000000000 -max:inst 500000000 vpr.ppcaix net.in arch.in place.out dum.out -nodisp -place_only -init_t 5 -exit_t 0.005 -alpha_t 0.9412 -inner_num 2
              net.in arch.in place.in route.out -nodisp -route_only -route_chan_width 15 -pres_fac_mult 2 -acc_fac 1 -first_iter_pres_fac 4 -initial_pres_fac 8

  (3) 176.gcc (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.gcc.source.out -fastfwd 1000000000 -max:inst 500000000 gcc.ppcaix 166.i -o 166.s
              200.i -o 200.s
              expr.i -o expr.s
              integrate.i -o integrate.s
              scilab.i -o scilab.s

  (4) 181.mcf (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.mcf.inp.out -fastfwd 1000000000 -max:inst 500000000 mcf.ppcaix inp.in

  (5) 186.crafty
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.crafty.out -fastfwd 1000000000 -max:inst 500000000 crafty.ppcaix < crafty.in
      on vista: ~/data/ref/input/
          ../../../src/crafty.ppcaix < crafty.in

  (6) 197.parser (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.parser.ref.out -fastfwd 1000000000 -max:inst 500000000 parser.ppcaix 2.1.dict -batch < ref.in

  (7) 253.perlbmk (done) (fatal: out of virtual memory)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.perlbmk.diffmail.out -fastfwd 1000000000 -max:inst 500000000 perlbmk.ppcaix -I./lib diffmail.pl 2 550 15 24 23 100
              -I. \
              -I./lib makerand.pl
              -I./lib perfect.pl b 3 m 4
              -I./lib splitmail.pl 850 5 19 18 1500
              -I./lib splitmail.pl 704 12 26 16 836
              -I./lib splitmail.pl 535 13 25 24 1091
              -I./lib splitmail.pl 957 12 23 26 1014

  (8) 254.gap (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.gap.ref.out -fastfwd 1000000000 -max:inst 500000000 gap.ppcaix -l ./ -q -m 192M < ref.in

  (9) 255.vortex (not enough insts)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.vortex.bendian1.out -fastfwd 1000000000 -max:inst 500000000 vortex.ppcaix bendian1.raw
              bendian2.raw
              bendian3.raw

  (10) 256.bzip2 (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.bzip2.source.out -fastfwd 1000000000 -max:inst 500000000 bzip2.ppcaix input.source 58
              input.graphic 58
              input.program 58

  (11) 300.twolf (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.twolf.ref.out -fastfwd 1000000000 -max:inst 500000000 twolf.ppcaix ref

3.3.2 cfp2000
  (1) 168.wupwise (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.wupwise.ref.out -fastfwd 0 -max:inst 500000000 wupwise.ppcaix

  (2) 171.swim (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.swim.ref.out -fastfwd 0 -max:inst 500000000 swim.ppcaix < swim.in

  (3) 172.mgrid (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.mgrid.ref.out -fastfwd 0 -max:inst 500000000 mgrid.ppcaix < mgrid.in

  (4) 173.applu (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.applu.ref.out -fastfwd 0 -max:inst 500000000 applu.ppcaix < applu.in

  (5) 177.mesa (done *)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.mesa.ref.out -fastfwd 1000000000 -max:inst 500000000 mesa.ppcaix -frames 1000 -meshfile mesa.in -ppmfile mesa.ppm

  (6) 178.galgel (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.galgel.ref.out -fastfwd 0 -max:inst 500000000 galgel.ppcaix < galgel.in

  (7) 179.art (done *)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.art.ref.out -fastfwd 1000000000 -max:inst 500000000 art.ppcaix -scanfile c756hel.in -trainfile1 a10.img -trainfile2 hc.img -stride 2 -startx 110 -starty 200 -endx \
          160 -endy 240 -objects 10
              -scanfile c756hel.in -trainfile1 a10.img -trainfile2 hc.img -stride 2 -startx 470 -starty 140 -endx 520 -endy 180 -objects 10

  (8) 183.equake (done *)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.equake.ref.out -fastfwd 1000000000 -max:inst 500000000 equake.ppcaix < inp.in

  (9) 187.facerec (done)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.facerec.ref.out -fastfwd 0 -max:inst 500000000 facerec.ppcaix < ref.in

  (10) 188.ammp (done *)
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.ammp.ref.out -fastfwd 1000000000 -max:inst 500000000 ammp.ppcaix < ammp.in

  (11) 189.lucas
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.lucas.ref.out -fastfwd 1000000000 -max:inst 500000000 lucas.ppcaix < lucas2.in

  (12) 200.sixtrack
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.sixtrack.ref.out -fastfwd 1000000000 -max:inst 500000000 sixtrack.ppcaix < inp.in

  (13) 301.apsi
      sim-outorder $PPC_SS_DESCRIPTION -redir:sim sim.apsi.ref.out -fastfwd 1000000000 -max:inst 500000000 apsi.ppcaix

========================================================================
4.
DynamicSimpleScalar-PPC and JikesRVM-2.0.3

4.1 Sources
  4.1.1 DynamicSimpleScalar
      /nfs/kappa/common/SimpleScalar/dssmem
      or, /nfs/home/qfduan/DynamicSimpleScalar/dss3ppc-UltraSparc
  4.1.2 JikesRVM-2.0.3
      /nfs/epsilon/darko/active/DynamicSimpleScalar/JikesRVM-2.0.3
      or, /nfs/home/qfduan/DynamicSimpleScalar/JikesRVM-2.0.3

4.2 Build
  4.2.1 DynamicSimpleScalar
    4.2.1.1 UltraSparc/Solaris
        The source is already compiled for sim-outorder
    4.2.1.2 PowerPC/AIX
  4.2.2 JikesRVM-2.0.3
      Cross build: host Intel x86/Linux (kappa) -> target PowerPC/AIX
      Environment:
        setenv RVM_ROOT $HOME/DynamicSimpleScalar/JikesRVM-2.0.3
        setenv PATH $RVM_ROOT/rvm/bin:$PATH
        setenv RVM_HOST_CONFIG $RVM_ROOT/rvm/config/i686-pc-linux-gnu.ibmjdk
        setenv RVM_TARGET_CONFIG $RVM_ROOT/rvm/config/powerpc-ibm-aix4.3.3.0.static
        setenv CONFIGURATIONNAME GCTkAppelOLWBOptOptFastTimingDSS
        setenv RVM_BUILD $RVM_ROOT/build/PowerPC32-AIX/$CONFIGURATIONNAME

      jconfigure $CONFIGURATIONNAME
      cd $RVM_BUILD
      ./jbuild
      => please run me on Aix.

4.3 Modifications of DSS-PPC
    Add dead/live timing analysis to DSS-PPC. Most of the work is done
    in cache.[ch].
    This version of DSS-PPC is compiled on a Solaris/UltraSparc machine
    (e.g. sampi.cs) and will be used in section 4.4 for the measurements.
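
The notes do not show the cache.[ch] changes themselves, so the following is only a sketch of the idea, in Python rather than the simulator's C. It assumes the common definition: a line's live time runs from its fill to its last access, its dead time runs from the last access to its eviction, and the dead ratio is total dead time over total live plus dead time. The function name `dead_ratio`, the direct-mapped organization, and the choice to close out still-resident lines at the end of the run are all assumptions made for this illustration.

```python
from typing import Dict, Iterable, List, Tuple


def dead_ratio(accesses: Iterable[Tuple[int, int]], nsets: int, end_cycle: int) -> float:
    """Dead ratio of a direct-mapped cache for a trace of (cycle, block) pairs."""
    # set index -> [block address, fill cycle, last-use cycle]
    resident: Dict[int, List[int]] = {}
    live = dead = 0

    for cycle, block in accesses:
        idx = block % nsets
        line = resident.get(idx)
        if line is not None and line[0] == block:
            line[2] = cycle                     # hit: extends the live time
            continue
        if line is not None:                    # miss that evicts a line
            live += line[2] - line[1]           # fill -> last use
            dead += cycle - line[2]             # last use -> eviction
        resident[idx] = [block, cycle, cycle]   # fill the new line

    # Assumption: lines still resident at the end are closed out at end_cycle.
    for _, fill, last_use in resident.values():
        live += last_use - fill
        dead += end_cycle - last_use

    total = live + dead
    return dead / total if total else 0.0


if __name__ == "__main__":
    trace = [(0, 0), (10, 0), (20, 2), (30, 1)]
    print(dead_ratio(trace, nsets=2, end_cycle=40))
```

In the toy trace, block 0 is live for 10 cycles and dead for 10 before block 2 evicts it, and the two lines left at cycle 40 contribute 30 more dead cycles, so 40 of 50 line-cycles are dead.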
4.4 Run Jikes on top of DSS

4.4.1 on UltraSparc/Solaris (sampi.cs.unm.edu)

1) HelloWorld - pseudojbb - jvm98(_213_javac)
   Environment:
   #---------------------------------------------------------------------------
   # Run JikesRVM on top of DynamicSimpleScalar (on sampi)
   #---------------------------------------------------------------------------
   setenv DSSPPC_ROOT $HOME/DynamicSimpleScalar/dss3ppc-UltraSparc
   setenv RVM_ROOT $HOME/DynamicSimpleScalar/JikesRVM-2.0.3
   setenv BENCHMARKS_ROOT $HOME/DynamicSimpleScalar/benchmarks
   setenv RVM_BOOTER $RVM_ROOT/build/PowerPC32-AIX/Booter
   setenv CONFIGURATIONNAME GCTkAppelOLWBOptOptFastTimingDSS
   setenv RVM_BUILD $RVM_ROOT/build/PowerPC32-AIX/$CONFIGURATIONNAME

   # HelloWorld
   #setenv BENCHMARK HelloWorld
   #setenv BENCHMARK_DIR HelloWorld
   #setenv BENCHMARK_PARAMS "-cp . HelloWorld"
   #setenv GC_PARAMS "-X:h=50"

   # pseudojbb benchmark
   #setenv BENCHMARK pseudojbb
   #setenv BENCHMARK_DIR spec/pseudojbb
   #setenv BENCHMARK_PARAMS "-classpath src spec.jbb.JBBmain -propfile SPECjbb.props"
   #setenv GC_PARAMS "-X:h=74"

   # specjvm98 benchmark
   setenv BENCHMARK jvm98
   setenv BENCHMARK_DIR spec/jvm98
   setenv BENCHMARK_PARAMS "-classpath .
SpecApplication _213_javac"
   setenv GC_PARAMS "-X:h=200"

   setenv SIMULATOR sim-outorder
   setenv SIMLOGFILE "sim-log-DSSNMM-mem-PPC-XiangLong-$SIMULATOR-JikesRVM-$CONFIGURATIONNAME-$\BENCHMARK-$GC_PARAMS"
   setenv SIMPARAMS "-redir:sim $SIMLOGFILE -config $HOME/DynamicSimpleScalar/mem-PPC-XiangLong\.cfg"

   #cd $BENCHMARKS_ROOT/$BENCHMARK_DIR
   #$DSSPPC_ROOT/$SIMULATOR $SIMPARAMS $RVM_BOOTER/JikesRVM
   #-X:i=$RVM_BUILD/RVM.image -X:vmClasses=$RVM_BUILD/RVM.classes $GC_PARAMS
   #$BENCHMARK_PARAMS

2) spec2000 - CINT2000
   See section 3.3.1 for how to invoke each benchmark on top of sim-outorder

   # spec2000int benchmark
   setenv BENCHMARK gzip.ppcaix
   setenv BENCHMARK_DIR spec2000/cint2000/164.gzip
   setenv BENCHMARK_PARAMS "input.source 60 -o input.source.out"

   setenv SIMULATOR sim-outorder
   setenv SIMLOGFILE "sim-log-DSSNMM-mem-PPC-XiangLong-$SIMULATOR-JikesRVM-$CONFIGURATIONNAME-$\BENCHMARK-$GC_PARAMS"
   setenv SIMPARAMS "-redir:sim $SIMLOGFILE -config $HOME/DynamicSimpleScalar/mem-PPC-XiangLong.cfg -max:inst 500000000"

   # for spec2000 C benchmarks
   #cd $BENCHMARKS_ROOT/$BENCHMARK_DIR
   #$DSSPPC_ROOT/$SIMULATOR $SIMPARAMS $BENCHMARK $BENCHMARK_PARAMS

3) results
   -----------------------
   Dead ratio of the level 1 data cache after executing
   the 1st 1 billion instructions.
   Measured using DSS.
   -----------------------
   Benchmark          Dead ratio
   pseudojbb          0.7775
   _201_compress      0.7811
   _202_jess          0.7717
   _205_raytrace      0.7834
   _222_mpegaudio     0.7839
   _200_check         0.7986 (died after around 550M insts)
   164.gzip           0.7051
   256.bzip2          0.8701
   254.gap            seg fault
   300.twolf          fatal error
   197.parser         bus error core dumped

   -------------------------
   Skip the 1st 1 billion, then measure 500M instructions.
   SPEC C benchmark results measured using normal SS.
   -------------------------
   Benchmark          Dead ratio
   gzip               0.8120
   gcc                0.9982
   mcf                0.9038
   parser             0.6731
   perlbmk            0.9985
   gap                0.9971
   bzip2              0.8594
   SPEC CFP2K
   mesa               0.5356
   art                0.9953
   equake             0.7973
   ammp               0.7807

4.5 Tracing version of Jikes RVM and DSS
    see ./DynamicSimpleScalar-TOOL

4.5.1 Build Jikes RVM
    see configuration in ./rvm/config/build
        GCTkAppelOLWBOptOptFastTimingDSSEXPINTERLEAVED

    #---------------------------------------------------------------------------
    # JikesRVM-2.0.3 - special version for DynamicSimpleScalar
    #                - From Zhenlin Wang at UMASS
    # cross build from epsilon (Intel/x86/Linux -> PowerPC/AIX)
    #---------------------------------------------------------------------------
    setenv RVM_ROOT $HOME/DynamicSimpleScalar-TOOL/JikesRVM-2.0.3
    setenv PATH $RVM_ROOT/rvm/bin:$PATH
    setenv RVM_HOST_CONFIG $RVM_ROOT/rvm/config/i686-pc-linux-gnu.ibmjdk
    setenv RVM_TARGET_CONFIG $RVM_ROOT/rvm/config/powerpc-ibm-aix4.3.3.0.static
    setenv CONFIGURATIONNAME GCTkAppelOLWBOptOptFastTimingDSSEXPINTERLEAVED
    #setenv CONFIGURATIONNAME GCTkEXPOFOLWBOptOptFastTimingDSS
    setenv RVM_BUILD $RVM_ROOT/build/PowerPC32-AIX/$CONFIGURATIONNAME

4.5.2 DSS
    sim-fast
    sim-cache

4.5.3 Run
    cd DynamicSimpleScalar-TOOL/benchmarks/HelloWorld

    /nfs/home/qfduan/DynamicSimpleScalar-TOOL/dssppc/sim-fast $RVM_ROOT/Booter/JikesRVM -X:i=$RVM_BUILD/RVM.image -X:vmClasses=$RVM_BUILD/RVM.classes -X:h=200 -cp . HelloWorld

    /nfs/home/qfduan/DynamicSimpleScalar-TOOL/dssppc/sim-cache $PPC_SS_CACHE -output . $RVM_ROOT/Booter/JikesRVM -X:i=$RVM_BUILD/RVM.image -X:vmClasses=$RVM_BUILD/RVM.classes -X:h=200 -cp . \
        HelloWorld
    -redir:sim /nfs/sampi/dss-jikes/HelloWorld0406/sim.out

    /nfs/home/qfduan/DynamicSimpleScalar-TOOL/dssppc/sim-cache $PPC_SS_CACHE -output /nfs/sampi/dss-jikes/ $RVM_ROOT/Booter/JikesRVM -X:i=$RVM_BUILD/RVM.image -X:vmClasses=$RVM_BUILD/RVM.classes -X:h=200 -classpath . SpecApplication -s1 _202_jess _213_javac _228_jack

    /nfs/home/qfduan/DynamicSimpleScalar-TOOL/dssppc/sim-cache $PPC_SS_CACHE -output /nfs/sampi/dss-jikes/ $RVM_ROOT/Booter/JikesRVM -X:i=$RVM_BUILD/RVM.image -X:vmClasses=$RVM_BUILD/RVM.classes -X:h=200 -classpath src spec.jbb.JBBmain -propfile SPECjbb.props

===============================================================
5. Results

5.1 pseudojbb/jvm98 (JikesRVM2.0.3/DSS)
    see kapp.cs.unm.edu
        ./DynamicSimpleScalar/benchmarks/spec/jvm98/Duan.sim-outorder.1223

5.2 spec2000 C benchmarks (DSS)
5.2.1 CINT2000
    Test Inputs:
      164.gzip:    input.compressed 2 (Good)
      175.vpr:     net.in arch.in place.out dum.out -nodisp -place_only -init_t 5 -exit_t 0.005 -alpha_t 0.9412 -inner_num 2
      176.gcc:     cccp.i -o cccp.s (Good)
      181.mcf:     inp.in
      186.crafty:  < crafty.in (Good)
      197.parser:  2.1.dict -batch < test.in (Good)
      253.perlbmk: -I.
                   -I./lib test.pl
      254.gap:     -l ./ -q -m 64M < test.in
      255.vortex:  bendian.raw
      256.bzip2:   input.random 58 (Good)
      300.twolf:   test
    REF inputs: see 3.3.1

5.2.2 CFP2000
    Test Inputs:
      177.mesa: (good)
          mesa.ppcaix -frames 10 -meshfile mesa.in -ppmfile mesa.ppm
      179.art:
          art.ppcaix -scanfile c756hel.in -trainfile1 a10.img -stride 2 -startx 134 -starty 220 -endx 139 -endy 225 -objects 1 > test.out
      183.equake:
          equake.ppcaix < inp.in
      188.ammp:
          ammp.ppcaix < ammp.in (good)

5.3 various heap sizes for java benchmarks
5.3.1 Find out the MIN heap size for each java benchmark
  1) build JikesRVM2.0.3 UMASS version on (PowerPC/AIX) vista.ahpcc
        jconfigure GCTkAppelOLWBOptOptFastTiming
        ./jbuild
     Notes:
       - more configurations are available in
         /nfs/eta/darko2/JikesRVM-2.0.3/rvm/config/build on eta.cs.unm.edu
       - you might get an ld error
         solution: in jbuild.linkBooter, at line 55 add:
             CCLIBS="-lpthread -lm"

  2) MIN heap sizes

     BENCHMARKS        (PowerPC/Linux)   (PowerPC/AIX)
     pseudojbb              59                74
     _200_check                               26
     _201_compress          19                19
     _202_jess              12                12
     _205_raytrace          15                15
     _209_db                22                22
     _213_javac             28                26
     _222_mpegaudio         10                10
     _227_mtrt              21                21
     _228_jack              14                13

     Notes: (IN csh)
       - run pseudojbb
           rvm -X:h=74 -classpath ./jbb.jar:./jbb_no_precompile.jar:./check.jar:./reporter.jar:./src spec.jbb.JBBmain -propfile SPECjbb.props |& cat
       - run jvm98
           rvm -X:h=26 SpecApplication _200_check |& cat
           rvm -X:h=19 SpecApplication _201_compress |& cat
           rvm -X:h=12 SpecApplication _202_jess |& cat
           rvm -X:h=15 SpecApplication _205_raytrace |& cat
           rvm -X:h=22 SpecApplication _209_db |& cat
           rvm -X:h=26 SpecApplication _213_javac |& cat
           rvm -X:h=10 SpecApplication _222_mpegaudio |& cat
           rvm -X:h=21 SpecApplication _227_mtrt |& cat
           rvm -X:h=13 SpecApplication _228_jack |& cat
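
Since section 5.3 is about running the Java benchmarks at various heap sizes, the per-benchmark commands above could be generated from the table instead of typed by hand. The sketch below is one hypothetical way to do it in Python: the helper names and the heap multipliers (1x, 1.5x, 2x, 3x of the minimum) are illustrative assumptions and do not come from the notes; only the minimum heap sizes and the command shape are copied from the jvm98 commands listed here.

```python
import math
from typing import Iterable, List

# Minimum heap sizes in MB, copied from the jvm98 run commands in these notes.
MIN_HEAP_AIX_MB = {
    "_200_check": 26,
    "_201_compress": 19,
    "_202_jess": 12,
    "_205_raytrace": 15,
    "_209_db": 22,
    "_213_javac": 26,
    "_222_mpegaudio": 10,
    "_227_mtrt": 21,
    "_228_jack": 13,
}


def jvm98_command(bench: str, heap_mb: int) -> str:
    """csh command line for one jvm98 benchmark at a given heap size."""
    return f"rvm -X:h={heap_mb} SpecApplication {bench} |& cat"


def heap_sweep(min_mb: int, factors: Iterable[float] = (1.0, 1.5, 2.0, 3.0)) -> List[int]:
    """Heap sizes (MB, rounded up) at hypothetical multiples of the minimum."""
    return [math.ceil(min_mb * f) for f in factors]


if __name__ == "__main__":
    for bench, min_mb in MIN_HEAP_AIX_MB.items():
        for heap_mb in heap_sweep(min_mb):
            print(jvm98_command(bench, heap_mb))
```

pseudojbb is left out because it uses a different command line (classpath and -propfile arguments), but it could be added with a second formatter in the same style.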