Alphakill
Started by Inue




34 posts in this topic
Inue
Unregistered


 
02-07-2016, 07:45 AM -
#21
(02-07-2016, 06:55 AM)tambre Wrote:
(02-06-2016, 10:31 PM)Inue Wrote: In my opinion an emulator doesn't need to be cycle-accurate to be LLE, because HLE and LLE are just methods, so no matter how inaccurate zsnes is, it's still LLE.
What do you mean by modules and working out of the box? If you mean the BIOS, then they can sell without it. LLE emulators of older consoles don't really use any modules, since they don't emulate any libraries, just the hardware by simulating registers, according to a former PS1 GPU plugin developer. From what I read, consoles like the Saturn and PS1 didn't really have a standardized library; unlike on PS3, every company used its own, so HLE emulation of these consoles would probably have to be done on a per-game basis.

Per game basis? What? The library is included with the game, so you just execute the instructions of the library on your emulator, if the game calls a function of their library.
I think vlj explained the HLE/LLE for PS3 quite well.
HLE in RPCS3 is simply replacing system calls with our own implemented functions.
LLE in RPCS3 is simply loading the firmware files and executing the instructions of the function, when they're called.

I understand this is how it works in rpcs3, however it seems old consoles were emulated differently. That's not my opinion; the guys who worked on PS1 plugins said HLE for PS1 would have to be done on a per-game basis, which is why nobody has done it yet. N64 libraries are much more standardized than PS1's, yet to this day there are still problems with HLE emulation of unique or rare RSP microcodes.
According to this, N64 HLE simply bypasses direct RSP emulation and emulates the microcodes instead.

http://gliden64.blogspot.com/2014/11/a-w...r-hle.html
http://gliden64.blogspot.com/2014/11/lle-is-here.html

HLE vs. LLE in N64 emulation

'''For a long time HLE was key feature of N64 emulation. All N64 games use a library for render audio and graphics. The library runs on powerful multimedia Reality Co-Processor, RCP. RCP actually consist of two parts: programmable Reality Signal Processor, RSP, which executes graphics and audio tasks, and Reality Display Processor, RDP, which draws the graphics in the frame buffer. The library, known as microcode, is loaded into RSP and controls its work. That is it defines how graphics and audio tasks must be performed. HLE emulators implement the microcode and thus bypass emulation of very powerful module, which can perform many scalar and vector operations per cycle.'''
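
To make the two methods concrete, here is a minimal sketch for one imagined library routine. Everything in it (the toy instruction set, `FIRMWARE_ADD`, `hle_add`, `call_guest_function`) is hypothetical and only illustrates the idea; none of it is taken from rpcs3 or any N64 plugin.

```python
# Toy guest routine stored as guest instructions (made-up instruction set).
FIRMWARE_ADD = [
    ("load", "r0", "a"),
    ("load", "r1", "b"),
    ("add", "r0", "r1"),
    ("ret", "r0"),
]


def run_lle(program, args):
    """LLE: interpret the routine's own instructions one by one."""
    regs = {}
    for instr in program:
        op = instr[0]
        if op == "load":
            regs[instr[1]] = args[instr[2]]
        elif op == "add":
            # 32-bit wraparound, as a guest register would behave
            regs[instr[1]] = (regs[instr[1]] + regs[instr[2]]) & 0xFFFFFFFF
        elif op == "ret":
            return regs[instr[1]]
    raise RuntimeError("routine did not return")


def hle_add(args):
    """HLE: a host reimplementation of what the routine is known to do."""
    return (args["a"] + args["b"]) & 0xFFFFFFFF


# Routines the emulator author has reimplemented by hand.
HLE_TABLE = {"add": hle_add}


def call_guest_function(name, args, firmware, use_hle):
    """Use the host replacement if allowed and available, else run guest code."""
    if use_hle and name in HLE_TABLE:
        return HLE_TABLE[name](args)
    return run_lle(firmware[name], args)
```

Both paths give the same result only as long as the host replacement matches what the guest code really does, which is why an unknown or rare microcode is a problem for HLE but not for LLE.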

(02-07-2016, 12:15 AM)vlj Wrote:
(02-06-2016, 10:31 PM)Inue Wrote: You mean bsnes, because zsnes, while LLE, is far from accurate.

Yes sorry

Quote:I noticed everyone has a different definition of HLE and LLE.
In my opinion an emulator doesn't need to be cycle-accurate to be LLE, because HLE and LLE are just methods, so no matter how inaccurate zsnes is, it's still LLE.
What do you mean by modules and working out of the box?

Some PSX-era emulators needed a BIOS dump in order to launch any game. Same for PCSX2, which doesn't run without a BIOS (but I haven't tested lately).
The BIOS is copyrighted material, as are the system libraries (what rpcs3 calls "LLE Module"). You can dump them from your own console (and only if your country doesn't have a law that forbids it, which is not the case everywhere), but you can't redistribute them.
This means you can't sell an emulated PS2 game on Steam or GPlay unless you have Sony's approval, even if you own the rights to the game.

Of course, as you mentioned, older consoles didn't have standardised libs, so if you HLE the BIOS you can sell the game. That's what Console Classics does; they use PCSX-R since it doesn't require a BIOS AFAIK.
In order to sell an emulated PS3 game, the underlying emulator must not use "LLE Modules" but provide a custom implementation instead.

Quote:Definitely, if the GPU isn't documented/reverse engineered enough, so that the functions of its registers aren't known and in effect aren't directly emulated, then this is probably HLE in my opinion (or inaccurate LLE if only some are emulated).
Are RSX/Cell registers documented at all, and do you emulate their functions?

By the way, what do you think is harder: emulating fixed functions with shaders, or emulating shaders with, well, shaders? I heard opinions that older consoles are harder to emulate than the PS3, for instance because the PS3's GPU is more similar to PC architecture.

RSX registers are mostly documented, either by previous reverse engineering attempts or by the nouveau project (since RSX is very close to G70).

I don't think it's harder or easier to emulate a fixed-function or a shader-based GPU. It's a different kind of work.
I never emulated a fixed-function pipeline, but I guess most of the work goes into figuring out the influence of a given register on the final result; you need a good "uber shader" or shader generator that covers all cases correctly, down to the floating point precision.
With a shader-based GPU you need to correctly understand the shader opcodes and how they differ from the IEEE standard. Since RSX is also not completely shader-based, there's also influence from external registers (two-sided lighting, transform program input/output, vertex attributes...) that needs to be properly sorted out too.
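
For the fixed-function case, the "shader generator" idea can be sketched roughly like this. The state keys (`texture_enabled`, `alpha_test`, `alpha_ref`, `fog_enabled`) and the GLSL variable names are invented for the illustration and do not correspond to real RSX or PS2 registers; the output is only the body of a fragment shader, without its declarations.

```python
def generate_fragment_shader(state):
    """Build a GLSL-like fragment shader body from fixed-function state.

    `state` is a plain dict of hypothetical register-derived settings.
    Only the stages that are enabled end up in the generated source.
    """
    body = ["    vec4 color = vertex_color;"]
    if state.get("texture_enabled"):
        body.append("    color *= texture(tex0, uv);")
    if state.get("alpha_test"):
        # Reference value is baked into the source as a constant.
        body.append(f"    if (color.a < {state['alpha_ref']:.4f}) discard;")
    if state.get("fog_enabled"):
        body.append("    color.rgb = mix(fog_color.rgb, color.rgb, fog_factor);")
    body.append("    frag_color = color;")
    return "void main() {\n" + "\n".join(body) + "\n}\n"
```

An "uber shader" would instead keep every branch in one shader and select them at run time through uniforms; the generator above trades that for one specialised shader per state combination.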

Being close to PC architecture doesn't mean it's easier to emulate. The first Xbox was the closest to PC architecture of the 6th generation, and yet it's still far from being correctly emulated. And I guess it will be the same for PS4/Xbox One, especially with the HSA functionality, which currently has no equivalent (except on Linux with (some) AMD APUs).

Do you know how many mipmap levels RSX textures can use at most? I know the PS2 can use 7 levels, though it can't do multitexturing. Another important question: how many cores can rpcs3 use? Would it benefit from a 10-core Broadwell-E?
tambre
Unregistered


 
02-07-2016, 08:10 AM -
#22
(02-07-2016, 07:45 AM)Inue Wrote:
Do you know how many mipmap levels RSX textures can use at most? I know the PS2 can use 7 levels. Another important question: how many cores can rpcs3 use? Would it benefit from a 10-core Broadwell-E?

It can benefit from as many cores as the game uses. The PS3 has a total of 8 "cores" (1 PPU, 7 SPUs), so if you have more cores, it might help. Most games use very few SPU cores though, in my experience.
vlj
Unregistered


 
02-10-2016, 01:03 AM -
#23
The PS3 doesn't have a limit on mipmap count other than the base level texture size (4096x4096), so this makes 12 levels.
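
As a quick check of that arithmetic, here is a small sketch that counts the levels of a full mip chain by halving until 1x1. The function name is made up, and whether RSX really exposes every level is not confirmed anywhere in this thread.

```python
def mip_levels(width, height):
    """Number of levels in a full mip chain, counting the base level."""
    levels = 1
    while width > 1 or height > 1:
        # Each level halves both dimensions, never going below 1 texel.
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels += 1
    return levels


# 4096x4096 halves 12 times before reaching 1x1.
full_chain = mip_levels(4096, 4096)
```

For a 4096x4096 base this gives 13 when the base level itself is counted, so the figure of 12 corresponds to the number of reduced levels below the base.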
The more cores (real or logical) the better. For instance, in Disgaea 3 (without LLE modules) fps more than doubles between an i5 and a Core i7 with hyperthreading, despite both having the same core count.
Cell is composed of a PPU core with 7 SPUs. The PPU has an SMT architecture (the generic name for hyper-threading-like tech), which means PS3 games are likely to run at least 2 threads on it, and every SPU can run a single thread. There's also the gsbackend thread, which often consumes a whole core, and optionally one for the PPU LLVM recompiler. This means that something like 10 or 12 cores should be "optimal" to run rpcs3: on a real PS3 a game has to divide all its threads among the 8 Cell cores and the PPU's logical one.
Additionally, Intel Core IPC is higher than that of a PPC from 2005 (which focused more on frequency than on IPC). I don't know about the SPUs, but it's likely the case too.
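
The thread estimate above is simple arithmetic and can be written out as a sketch. The function and parameter names are invented here, and the counts are the ones given in the post, not measurements.

```python
def estimated_host_threads(ppu_threads=2, spu_threads=7,
                           rsx_thread=True, llvm_thread=False):
    """Rough count of busy host threads for one emulated game.

    Defaults follow the post: 2 PPU hardware threads, 7 SPUs,
    one graphics backend thread, and an optional recompiler thread.
    """
    return ppu_threads + spu_threads + int(rsx_thread) + int(llvm_thread)


baseline = estimated_host_threads()                      # PPU + SPU + graphics
with_recompiler = estimated_host_threads(llvm_thread=True)
```

With these numbers the baseline is 10 and the recompiler brings it to 11, which is in line with the "10 or 12 cores" ballpark; games that use fewer SPUs would need fewer.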
Inue
Unregistered


 
02-14-2016, 06:05 PM -
#24
More than doubling with just HT is impressive, because HT is supposed to increase performance by 30% at best, so how big must the increase be with actual physical cores? I wonder whether running rpcs3/Xenia on a CPU without SMT will cause contention for resources or some inaccuracy, because the PPE and Xenon are dual-threaded.
Good thing that Intel finally started increasing core counts in their E-series processors. The PPE is around Pentium 4's IPC, but surprisingly it's in-order, when even the GC's Gekko was out-of-order; same for the SPEs, but they at least had far higher floating point performance. I read now that the SPEs/SPUs aren't IEEE 754 compliant; is that a problem when emulating them on modern CPUs? I mean, are any hacks necessary? The PS2's FPU and Vector Units aren't IEEE 754 compliant either, and PCSX2 uses hacks for this, with multiple clamping and rounding modes.
Do you know if RSX has an accumulation buffer, since it's deprecated in modern OpenGL, and whether some PS3 games/engines use quads or it's purely triangle-based rendering?
vlj
Unregistered


 
02-18-2016, 05:05 PM -
#25
I guess the doubled perf is due to some bug/implementation bottleneck in the SPU management code.
There is no accuracy issue when using a CPU with SMT vs a CPU without SMT. RPCS3 executes PPU threads in separate host CPU threads; they're not aware of "what" runs them, what matters is that they can use some synchronisation mechanism.
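
A minimal sketch of that idea, assuming nothing about rpcs3's real threading code: each emulated thread gets its own host thread, and correctness comes from the lock rather than from which core or logical core the host scheduler picks. The names `run_guest_threads` and `guest_thread_body` are made up.

```python
import threading


def run_guest_threads(thread_count, iterations):
    """Run each emulated guest thread on its own host thread."""
    shared = {"counter": 0}
    lock = threading.Lock()  # stands in for a guest synchronisation primitive

    def guest_thread_body():
        for _ in range(iterations):
            # The result is the same whether the host has SMT or not;
            # only the lock makes the update safe.
            with lock:
                shared["counter"] += 1

    hosts = [
        threading.Thread(target=guest_thread_body, name=f"PPU[{i}]")
        for i in range(thread_count)
    ]
    for t in hosts:
        t.start()
    for t in hosts:
        t.join()
    return shared["counter"]
```

Host SMT can still change how fast the threads progress, but not the values they compute.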

IEEE 754 compliance can be an issue; extra instructions are required to make a compliant CPU (like x64) behave like an "uncompliant" one. BTW RSX doesn't follow the IEEE 754 standard and that's the root of some gfx bugs.
There's no dedicated accumulation buffer in RSX. I'm not really sure what the feature brought to the table though, since you can use an offscreen framebuffer and blending to achieve a similar effect.
RSX supports quads.
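
On the IEEE 754 point, the "extra instructions" usually amount to post-processing each result. The sketch below shows the general clamping idea mentioned for PCSX2 earlier in the thread: no infinities or NaNs, and denormals flushed to zero. The exact rules are assumptions chosen for illustration, and `console_style_f32` is a made-up name, not code from rpcs3 or PCSX2.

```python
import math
import struct

# Largest finite and smallest normal single-precision values.
FLT_MAX = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]
FLT_MIN_NORMAL = struct.unpack("<f", struct.pack("<I", 0x00800000))[0]


def console_style_f32(x):
    """Round to single precision, then apply non-IEEE clamping rules."""
    # Assumed rule: infinities, NaNs and overflow become the largest finite value.
    if math.isnan(x) or abs(x) >= FLT_MAX:
        return math.copysign(FLT_MAX, x)
    # Round the Python double to the nearest 32-bit float.
    y = struct.unpack("<f", struct.pack("<f", x))[0]
    # Assumed rule: denormal results are flushed to (signed) zero.
    if abs(y) < FLT_MIN_NORMAL:
        return math.copysign(0.0, y)
    return y
```

Every emulated floating point operation that needs this treatment pays for the extra checks, which is where the cost on a compliant host CPU comes from.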
Inue
Unregistered


 
02-19-2016, 07:42 PM -
#26
Can new instructions like AVX2, AVX512 and FMA3/4 help with compliancy?
So now that Vulkan is out, how does it compare to DX12? Does it allow better control over the rendering pipeline than D3D12, and did it expose any new hardware functionality?
DX12 has shader model 5.1; what about Vulkan? It seems to still only have SM 5.0. And does Vulkan support read-write textures and buffers?
flashmozzg
Unregistered


 
02-19-2016, 08:13 PM -
#27
(02-19-2016, 07:42 PM)Inue Wrote: Can new instructions like AVX2, AVX512 and FMA3/4 help with compliancy?

It's the PS3 that has a problem with following standards, not modern CPUs/GPUs.
kd-11
RPCS3 Developer


02-25-2016, 08:12 PM -
#28
(02-19-2016, 07:42 PM)Inue Wrote:
Can new instructions like AVX2, AVX512 and FMA3/4 help with compliancy?
So now that Vulkan is out, how does it compare to DX12? Does it allow better control over the rendering pipeline than D3D12, and did it expose any new hardware functionality?
DX12 has shader model 5.1; what about Vulkan? It seems to still only have SM 5.0. And does Vulkan support read-write textures and buffers?

Read/write buffers are a PS3 thing, not a PC/GPU thing; thus neither DX12 nor Vulkan supports read/write buffers the way the PS3 does. Both DX12 and Vulkan provide finer-grained control over the inner workings of an application, but neither adds any new features to existing hardware over traditional GL or DX. I honestly can't say either one is better as of yet. Read/write buffers in OpenGL are not held back by API limitations; they're just not implemented properly yet. I may get around to it eventually when I have more time.
Inue
Unregistered


 
03-01-2016, 12:29 AM -
#29
I heard that Nvidia added a vendor OpenGL extension which adds R/W buffer support on 2nd-generation Maxwell; could you tell me the name of that extension?
DX12 did add some new rendering features and shader model 5.1; what about Vulkan, and does it support conservative rasterization?

https://msdn.microsoft.com/en-us/library...s.85).aspx
https://msdn.microsoft.com/en-us/library...s.85).aspx
DX12 r/w buffers seem to be mentioned here:
https://msdn.microsoft.com/en-us/library...s.85).aspx
vlj
Unregistered


 
03-02-2016, 02:11 AM -
#30
You're confusing read/write buffers with rasterizer ordered views, which are a DX12 feature (used to synchronise pixel shader execution on the same pixel to implement correct transparency).

Read/write buffers refers to the synchronisation of data between the RSX emulation thread and the rest of the emulator. Emulators often have an option like "frame buffer location", or the EFB option in Dolphin.

On PC the GPU draws into memory that is separate from application memory. On console this is not the case: the GPU is able to draw anywhere in memory, and Cell is able to read RSX memory (albeit slowly).
This means that for accurate emulation rpcs3 would have to issue a lot of image transfers from opaque GPU memory to visible memory, and this is obviously costly (no game would run at more than single-digit fps).
The read/write toggles allow bypassing such copies. Since almost none of the data produced by RSX is ever used by Cell, this doesn't break a lot of games at the moment.
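
The trade-off can be shown with a toy model. The class and attribute names (`RenderTargetCache`, `write_color_buffers`) are hypothetical, guest memory is just a `bytearray`, and a real emulator would go through a graphics API readback rather than a Python slice copy.

```python
class RenderTargetCache:
    """Toy model of host GPU surfaces versus guest-visible memory."""

    def __init__(self, guest_memory, write_color_buffers):
        self.guest_memory = guest_memory            # what the emulated CPU can read
        self.write_color_buffers = write_color_buffers
        self.gpu_surfaces = {}                      # guest address -> host GPU pixels
        self.copies = 0                             # number of costly readbacks

    def draw(self, address, pixels):
        """Render into host GPU memory; optionally copy back to guest memory."""
        self.gpu_surfaces[address] = bytes(pixels)
        if self.write_color_buffers:
            # Accurate but slow path: make the result visible to the guest CPU.
            self.guest_memory[address:address + len(pixels)] = pixels
            self.copies += 1
```

With the toggle off, a game that never reads its render targets back from the CPU side runs unchanged and pays for no copies; a game that does read them sees stale memory.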

On the other hand, Vulkan might (but that's completely optional and depends on the driver) support a feature that is likely to help with an RSX memory management corner case: linear tiling for render targets and sampled textures. DX12 doesn't support this at all (actually DX12 only exposes the common subset of features that a Vulkan-capable card can support) and it's quite difficult to work around.

