Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash during resolution change in Windows XP with Banshee #470

Open
Vort opened this issue Feb 1, 2025 · 6 comments
Open

Crash during resolution change in Windows XP with Banshee #470

Vort opened this issue Feb 1, 2025 · 6 comments

Comments

@Vort
Copy link
Contributor

Vort commented Feb 1, 2025

When resolution is changed in Windows XP with Banshee, crash have chance to appear.
This crash appears frequently enough to be annoying, but not frequently enough to have precise reproduction steps.

It happens inside of bx_banshee_c::blt_rectangle_fill function, during rop_fn execution:

BLT.rop_fn[rop](dst_ptr1, BLT.fgcolor, dpitch, dpxsize, dpxsize, 1);

When dst_base = 0x00f69c00, dst_w = 0x0280 and dst_h = 0x0fff, access outside of fbi.ram bounds occurs (0x0280 * 0x0fff * 2 + 0x00f69c00 = 0x01469700, which is larger than array size 0x01000000).

I don't know why exactly driver tries to clear memory outside of 16 MB, maybe because of some bug, but I think bx_banshee_c::blt_rectangle_fill needs check preventing such crash no matter what reason causes such behaviour.

Image

Version: 700bd7f

@Vort
Copy link
Contributor Author

Vort commented Feb 1, 2025

Here are log lines before crash: banshee_crash_crop.txt.
Most important parts:
16913894351i[VVGA ] switched to 800 x 600 x 32 @ 59 Hz
...
16913897609i[VOODOO] CMDFIFO #0 now enabled
16913897628d[VOODOO] banshee write to offset 0xe4: value = 0x00010400 len=4 (vidDesktopStartAddr)
16913897628d[VOODOO] 2D write register 0x034 (srcBaseAddr) value = 0x00010400
16913897660d[VOODOO] 2D write register 0x010 (dstBaseAddr) value = 0x00010400
...
16913905686d[VOODOO] 2D write register 0x008 (clip0Min) value = 0x00000000
16913905702d[VOODOO] 2D write register 0x00c (clip0Max) value = 0x0fff0fff
16913905702d[VOODOO] 2D write register 0x064 (colorFore) value = 0x00000000
16913905702d[VOODOO] 2D write register 0x068 (dstSize) value = 0x0fff0320
16913905702d[VOODOO] 2D write register 0x06c (dstXY) value = 0x00000000
16913905702d[VOODOO] 2D write register 0x070 (command) value = 0xcc002105
...
16913979911d[VOODOO] banshee write to offset 0xe4: value = 0x00e2a680 len=4 (vidDesktopStartAddr)
16913979911d[VOODOO] 2D write register 0x034 (srcBaseAddr) value = 0x00e2a680
16913979911d[VOODOO] 2D write register 0x010 (dstBaseAddr) value = 0x00e2a680
16913985614d[VOODOO] Rectangle fill: 800 x 4095 ROP0 CC

Rectangle fill command starts with correct address 0x00010400, but then, before it is finished, address gets changed to 0x00e2a680.
I wonder if real hardware allows to change command parameters while command is executing.

@Vort
Copy link
Contributor Author

Vort commented Feb 1, 2025

To improve probability of reproduction, same method as in #405 can be used:

Hack to make crashes appear more frequently
diff --git a/bochs/iodev/display/banshee.cc b/bochs/iodev/display/banshee.cc
index 9da498f97..5a3e87ed8 100644
--- a/bochs/iodev/display/banshee.cc
+++ b/bochs/iodev/display/banshee.cc
@@ -2481,6 +2481,7 @@ void bx_banshee_c::blt_rectangle_fill()
   dy = BLT.dst_y;
   w = BLT.dst_w;
   h = BLT.dst_h;
+  volatile int z = 0; for (; z < 1000000; z++) {}
   BX_DEBUG(("Rectangle fill: %d x %d  ROP0 %02X", w, h, BLT.rop[0]));
   if (!blt_apply_clipwindow(NULL, NULL, &dx, &dy, &w, &h)) {
     BLT.busy = 0; 

Not sure if my formula is correct, but I see no crashes with such change:

diff --git a/bochs/iodev/display/banshee.cc b/bochs/iodev/display/banshee.cc
index 9da498f97..1197679d9 100644
--- a/bochs/iodev/display/banshee.cc
+++ b/bochs/iodev/display/banshee.cc
@@ -2471,6 +2471,7 @@ Bit32u bx_banshee_c::blt_yuv_conversion(Bit8u *ptr, Bit16u xc, Bit16u yc,
 void bx_banshee_c::blt_rectangle_fill()
 {
   Bit32u dpitch = BLT.dst_pitch;
+  Bit32u dbase = BLT.dst_base;
   Bit8u dpxsize = (BLT.dst_fmt > 1) ? (BLT.dst_fmt - 1) : 1;
   Bit8u *dst_ptr, *dst_ptr1;
   Bit8u colorkey_en = BLT.reg[blt_commandExtra] & 3;
@@ -2486,8 +2487,13 @@ void bx_banshee_c::blt_rectangle_fill()
     BLT.busy = 0;
     return;
   }
+  if (dbase + (dy + h - 1) * dpitch + (dx + w - 1) * dpxsize > v->fbi.mask) {
+    BX_ERROR(("skip address wrap during blt_rectangle_fill()"));
+    BLT.busy = 0;
+    return;
+  }
   BX_LOCK(render_mutex);
-  dst_ptr = &v->fbi.ram[BLT.dst_base + dy * dpitch + dx * dpxsize];
+  dst_ptr = &v->fbi.ram[dbase + dy * dpitch + dx * dpxsize];
   for (y = 0; y < h; y++) {
     dst_ptr1 = dst_ptr;
     for (x = 0; x < w; x++) { 

@vruppert
Copy link
Contributor

vruppert commented Feb 2, 2025

I don't know if this is related or not: the Cirrus card has double-buffered BitBLT registers. With real hardware you can start a command and immediately after that you can prepare the next command. Since Bochs doesn't use threads in the Cirrus emulation, it doesn't matter that the double buffered registers are not implemented. I don't know whether or not the Banshee / Voodoo3 cards are similar. You could try to change dst_base to dst_base[2] and write to index 0, copy index 0 to index 1 at BitBLT start and use index 1 in the execution methods.

@Vort
Copy link
Contributor Author

Vort commented Feb 2, 2025

I thought about double buffering as well.
But I can't find information about it neither in Banshee documentation, nor in CSIM code.
Here is how functions are called in CSIM: csimStore32 -> sstgGo -> rectFill -> doPixel -> csimWritePixel -> writePixel -> writeMem16 (for 16 bpp case).
writePixel function contains call to csimPixelAddress, which accesses dstBaseAddr register and only warns about out of bounds result.
When such address appear in writeMem16 function, it calls fault macro, which intentionally produces crash.
I assume real hardware won't crash or hang and something else will happen, but I doubt it uses completely different approaches than CSIM for rectFill execution.
In such situation, where implementation of double buffering with high chance will make emulation less correct than it is now, I think it is safer to just prevent crashing and come back to this problem once more information about real hardware will be available.
Also you can look at CSIM code yourself, it may be that I'm just missing double buffering there.

vruppert added a commit that referenced this issue Feb 4, 2025
@vruppert
Copy link
Contributor

vruppert commented Feb 4, 2025

I think real hardware simply continues at video memory address 0 in case of an address wrap. Since most of the Bochs graphics code uses pointers for optimization, implementing address wrap in this code would cause a slowdown. I have now applied your code to skip rectangle fill in case of an address wrap.

@Vort
Copy link
Contributor Author

Vort commented Feb 4, 2025

I think real hardware simply continues at video memory address 0 in case of an address wrap.

I suspect there may be important data at lower addresses, like cursor for example.
If address jumps during execution and then wraps, it may cause noticeable damage.
So either my suspicion is wrong and there are no important data there or something else is going on with real hardware (for example, operation stops when it reaches top address, or undocumented double buffering happens somewhere and address jump is ignored).

I have now applied your code to skip rectangle fill in case of an address wrap.

Thanks.

So what to do next? I think it's ok to close this issue because crash is fixed.
But if some side effects appear, then new issue can be created instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants