w.f.j.mueller@gsi.de [hercules-390]
2018-12-02 13:34:45 UTC
Hi,
Michael Short was kind enough to run a version of s370_perf https://github.com/wfjm/s370-perf/blob/master/doc/s370_perf.md ported to MUSIC/SP https://en.wikipedia.org/wiki/MUSIC/SP on his Sim390 http://www.canpub.com/teammpg/de/sim390/ emulator-based system. The different OS should have no sizable impact on the measured instruction timings, since SVC and privileged instructions, which depend on system response times, aren't covered. A reference run with Hercules 4.0 on the same host CPU is available too. The data and full analysis are under
https://github.com/wfjm/s370-perf/blob/master/narr/2018-07-30_ms1-herc40.md
https://github.com/wfjm/s370-perf/blob/master/narr/2018-07-30_ms1-sim390.md
The key findings are
- Sim390 is, on identical host hardware, a factor 6.5 slower than Hercules 4.0, based on the lmark https://github.com/wfjm/s370-perf/blob/master/narr/README_narr.md#user-content-lmark MIPS ratio of 6.39 to 41.54.
- simple instructions, like LR R,R, are about a factor 9 slower, see section LR timing https://github.com/wfjm/s370-perf/blob/master/narr/2018-07-30_ms1-sim390.md#user-content-find-lr.
- branch timing on Sim390 does not depend on whether the target is in the same or a different page, see section branch timing https://github.com/wfjm/s370-perf/blob/master/narr/2018-07-30_ms1-sim390.md#user-content-find-btime.
- CLCL and TRT are much faster on Sim390, see section CLCL+TRT performance https://github.com/wfjm/s370-perf/blob/master/narr/2018-07-30_ms1-sim390.md#user-content-find-clcl-trt.
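
As a quick check of the first finding, the factor 6.5 follows directly from the two lmark MIPS values quoted above. This is only a minimal sketch of the arithmetic; the function and variable names are mine and not part of s370_perf.

```python
def slowdown(reference_mips: float, measured_mips: float) -> float:
    """Return how many times slower the measured system is than the reference."""
    return reference_mips / measured_mips


# lmark MIPS values as quoted in the first key finding
hercules_lmark_mips = 41.54
sim390_lmark_mips = 6.39

factor = slowdown(hercules_lmark_mips, sim390_lmark_mips)
print(f"Sim390 is a factor {factor:.1f} slower than Hercules 4.0")
```

The same ratio can be formed per instruction from the tables in the two narratives, e.g. for LR, to reproduce the "about a factor 9" statement.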
The overall summary is that Hercules handles instruction fetch and decoding, as well as virtual to real address mapping, much more efficiently.
Any remarks and comments are very welcome.
With best regards, Walter