[hercules-390] Hercules vs z/PDT

w.f.j.mueller@gsi.de [hercules-390]

2018-05-26 12:49:58 UTC

Hello,

Some time ago I received results from a handful of s370_perf https://github.com/wfjm/s370-perf/blob/master/README.md runs done on a z/PDT V1.7 system. It took some time to analyze them; the data and the full analysis are now available under

https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md

Performance testing of z/PDT is explicitly disallowed by the z/PDT license conditions, and s370_perf would not be the proper tool for it anyway. The page cited above merely gives some observations on general features of z/PDT; the key ones are repeated here:
- z/PDT is based on on-the-fly binary translation, see section background https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-back.
- z/PDT does an optimizing compilation, see section code optimization https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-opt.
- the same code is sometimes compiled, sometimes not, see section to compile or not to compile https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-comp.
- if code is compiled, compilation can happen with a substantial delay, see section compilation delay https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-compdel.
- performance in plain interpretive mode seems similar to Hercules, see section interpreter mode performance https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-inter.
- RR instructions are the easy part, see section RR instructions https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-rr.
- RX instructions are the hard part, see section RX instructions https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-rx.
- z/PDT vs Hercules comparisons are difficult to interpret, see section z/PDT vs Hercules https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-sum.
- there is little gain for floating point arithmetic, see section floating point performance https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-float.
- z/PDT is a bit faster for packed decimal arithmetic, see section decimal packed performance https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-dec.
- EX is apparently always interpreted, see section EX instruction https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-ex.
- z/PDT performs well for some instructions which are slow on Hercules, see sections CLCL https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-clcl, MVCIN https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-mvcin, and TRT https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-trt.
- the overall JIT gain depends heavily on the workload, see bottom line https://github.com/wfjm/s370-perf/blob/master/narr/2018-03-10_zpdt.md#user-content-obs-bline.
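
To illustrate why the overall JIT gain depends so much on the workload, here is a small Amdahl-style estimate. It is only a sketch and not taken from the analysis page: the function name, the 10x gain for compiled code, and the compiled-time fractions are all hypothetical illustration values, not measurements of z/PDT or Hercules.

```python
def overall_speedup(compiled_fraction, compiled_speedup):
    """Amdahl-style estimate of the overall gain from a JIT.

    compiled_fraction: share of the original (interpreted) run time spent
                       in code that ends up being compiled (0.0 .. 1.0).
    compiled_speedup:  hypothetical factor by which compiled code runs
                       faster than interpreted code (must be > 0).
    """
    if not 0.0 <= compiled_fraction <= 1.0:
        raise ValueError("compiled_fraction must be between 0 and 1")
    if compiled_speedup <= 0.0:
        raise ValueError("compiled_speedup must be positive")
    # The interpreted part keeps its run time, the compiled part shrinks.
    remaining_time = (1.0 - compiled_fraction) + compiled_fraction / compiled_speedup
    return 1.0 / remaining_time


if __name__ == "__main__":
    # Hypothetical workloads: same JIT, different share of compiled code.
    for fraction in (0.1, 0.5, 0.9):
        gain = overall_speedup(fraction, 10.0)
        print(f"{fraction:.0%} of time in compiled code -> {gain:.2f}x overall")
```

Under these assumptions, a workload that spends most of its time in code that stays interpreted (for example code dominated by EX) sees almost no overall gain, even if the compiled parts run much faster.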

For further background and details, follow the links.
Any remarks and comments are very welcome.

With best regards, Walter