04-24-2014, 09:46 PM -
TL;DR: you can skip the first part below if you just want to see my ideas.
First, I want to say that I have been a long-time observer of this emulator; I have dug through the source quite a bit and noticed many issues that could be fixed. Speed also looks like a problem for the future, since single-die chips will not provide the power needed. I have followed these forums for many months and have seen progress, but I think some of these problems could be addressed by creating separate builds and a co-design plan that caters to higher-end machines able to handle the degree of parallelism the PS3 needs.
Here are some of my ideas (more to be added) on how RPCS3 could benefit:
1. JIT or JITIL emulation:
A JIT compiles the PowerPC code into x86 code (or whatever target is wanted), copies it into a cache, and then executes it. Once a block is in the cache, later executions are faster because the translation cost is paid only once. JITIL is more experimental, but it can work better: it does basically the same thing, except it compiles the PS3's machine code down to an intermediate language before emitting native code, and it can cache some of that as well. For some of the PS3's FPU instructions, caching and lowering the code through an IL would benefit certain game code. At the very least it should be considered, since Dolphin uses this approach and it works wonders for running some games, giving a real performance boost.
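To make that concrete, here is a rough sketch of the block-cache idea. All the types and names here are mine, not RPCS3's or Dolphin's actual code, and the "compiler" is just a placeholder:

[code]
// Minimal sketch of a JIT block cache (hypothetical names, not RPCS3 code):
// translate a guest PPU block once, cache the host code, reuse it afterwards.
#include <cstdint>
#include <functional>
#include <unordered_map>

// In a real JIT this would be a pointer to emitted x86 code; std::function
// stands in for it so the sketch stays self-contained.
using HostBlock = std::function<uint32_t()>;   // returns the next guest PC

class BlockCache {
public:
    uint32_t execute(uint32_t guest_pc) {
        auto it = blocks_.find(guest_pc);
        if (it == blocks_.end()) {
            // Cache miss: translate the PowerPC block (optionally through an
            // IL first, as in Dolphin's JITIL) and remember the result.
            it = blocks_.emplace(guest_pc, compile(guest_pc)).first;
        }
        return it->second();                   // cache hit: run the host code
    }

private:
    HostBlock compile(uint32_t guest_pc) {
        // Placeholder "compiler": a real one would decode PPU instructions
        // starting at guest_pc and emit native code. Here it just pretends
        // the block falls through to the next instruction.
        return [guest_pc]() { return guest_pc + 4; };
    }

    std::unordered_map<uint32_t, HostBlock> blocks_;
};
[/code]

The point is that translation happens once per block; every later visit is just a hash lookup plus a jump into already-native code.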
2. Supercomputer optimization:
Though it is not entirely practical, adding a dedicated option that spreads RSX and Cell PPU work across a small supercomputing-style platform could greatly benefit performance. I am not talking about ten-thousand-dollar supercomputers; I mean a lighter cluster that could take advantage of MPI (Message Passing Interface, a system designed for parallelism, typically used in supercomputers), plus multi-GPU and GPGPU work (set up a build that works with Mantle and some low-level shaders, implement some GPGPU to offload the main CPU, and use one GPU for drawing to the screen while another handles computation). A Linux release could benefit from Open MPI, as could Mac/Unix and so on. Such a "supercomputer" could be an array of several i7s connected over MPI plus multiple GPUs ($2000-$3000), with tasks split up and processed by many machines at once, producing the desired output faster. This may not seem worthwhile, but it is not a total waste and could be promising if done right. In the long run, any PS3 game should run perfectly on such a setup, and offering it as an optional execution core in rpcs3 alongside the existing ones might suffice.
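To be clear, this is only an illustration of how MPI splits work across processes, not a claim about how RPCS3's internals would map onto it: each rank processes its own slice of some independent workload and rank 0 gathers the result.

[code]
// Rough MPI sketch (not RPCS3 code). Build with an MPI toolchain, e.g.:
//   mpicxx mpi_sketch.cpp -o mpi_sketch && mpirun -n 4 ./mpi_sketch
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    // which process am I?
    MPI_Comm_size(MPI_COMM_WORLD, &size);    // how many processes in total?

    // Pretend there are 1,000,000 independent work items (e.g. chunks of a
    // compute job offloaded from the main emulation thread), split evenly.
    const long total_items = 1000000;
    const long begin = rank * total_items / size;
    const long end   = (rank + 1) * total_items / size;

    double local_result = 0.0;
    for (long i = begin; i < end; ++i)
        local_result += i * 0.000001;        // stand-in for real per-item work

    double global_result = 0.0;
    MPI_Reduce(&local_result, &global_result, 1, MPI_DOUBLE, MPI_SUM,
               0, MPI_COMM_WORLD);           // combine every rank's partial sum

    if (rank == 0)
        std::printf("combined result: %f\n", global_result);

    MPI_Finalize();
    return 0;
}
[/code]

The MPI plumbing itself is the easy part; the hard part is whether Cell/RSX emulation actually decomposes into independent chunks like this without the synchronization cost eating the gains.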
3. Take advantage of lower-level code:
Some parts of the code can be optimized much better in assembly for the target platform. Doing so reduces C/C++ overhead, even if you think your compiler can beat hand-written assembly every time. Doing this for a single subroutine is hardly worth it, but if you optimize many parts of execution (subroutines, loops, inner iterations, etc.) with lower-level, faster implementations, the whole program benefits. Some high-level C++ code is better written in the target assembly, taking advantage of features C++ cannot express. It also partially avoids the C++ runtime, replacing it with assembly that uses fewer instructions to get the same amount of work done, sometimes saving millions of clock cycles depending on the optimization.
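To illustrate the kind of win I mean, here is a small example (not taken from RPCS3) of a hot loop written with SSE intrinsics, which map almost one-to-one onto the assembly you would otherwise write by hand; I am using intrinsics rather than raw assembly only to keep the snippet portable across compilers:

[code]
// Summing a float buffer: plain C++ versus an SSE version that processes
// four floats per iteration. Illustrative only, not RPCS3 code.
#include <xmmintrin.h>   // SSE intrinsics
#include <cstddef>

// Plain C++ reference version.
float sum_scalar(const float* data, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        s += data[i];
    return s;
}

// SSE version. To keep the sketch short, n is assumed to be a multiple of 4
// and data is assumed to be 16-byte aligned.
float sum_sse(const float* data, std::size_t n) {
    __m128 acc = _mm_setzero_ps();
    for (std::size_t i = 0; i < n; i += 4)
        acc = _mm_add_ps(acc, _mm_load_ps(data + i));  // add 4 lanes at once

    // Horizontal add of the four accumulator lanes.
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
[/code]

Whether this beats what the compiler auto-vectorizes is something to measure per hotspot, which is exactly why I would limit it to profiled hot paths rather than sprinkling assembly everywhere.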
If this thread is still open, I will come back and add a few more ideas, but right now I have to go somewhere and do something.