Garbage collection appears to cause a memory error #49

Open
notmgsk opened this issue Oct 5, 2023 · 6 comments

Comments

@notmgsk
Member

notmgsk commented Oct 5, 2023

When SBCL's garbage collector is invoked, it seems to mess with memory in such a way that my shared library fails (with some fairly obscure errors).

I've created a repo on GitHub with a minimal example project that reproduces the error (on Linux; I haven't tried other OSes). In this example, the error happens when parsing a ~100KB JSON file after a call to (sb-ext:gc), although the same class of error occurs if the GC is triggered indirectly. Smaller files can trigger the error if the dynamic space size is reduced.

The project has a dependency on quilc. I assume the error can be reproduced without quilc, but, in my limited testing, including quilc as a dependency causes the error 100% of the time.

I don't know enough about SBCL and its garbage collector to be sure, but it seems as though the GC is affecting the pointer received by my parse function so that it points to invalid memory.

This is the error I am seeing currently, though it has been different in the past:

gcc test-parse.c -o test-parse -lsbcl -lparsejson -L. -I.
;
; compilation unit aborted
;   caught 1 fatal ERROR condition
ERROR The value
  0
is not of type
  SB-C::CONSTRAINT
error: The value
  0
is not of type
  SB-C::CONSTRAINT
@kartik-s
Contributor

kartik-s commented Oct 5, 2023

Got a memory fault with SBCL master:

$ LD_LIBRARY_PATH=$HOME/sbcl/src/runtime:.:$CONDA_PREFIX/lib ./test-parse


debugger invoked on a TYPE-ERROR @5333028F in thread
#<FOREIGN-THREAD tid=939163 "callback" RUNNING {10040B0023}>:
  The value
CORRUPTION WARNING in SBCL pid 939163 tid 939163:
Memory fault at 0x55 (pc=0x546c78c2 [code 0x546c7810+0xB2 ID 0x43d], fp=0x7ffe404cf200, sp=0x7ffe404cf1e8) tid 939163
The integrity of this image is possibly compromised.
Continuing with fingers crossed.
(A SB-SYS:MEMORY-FAULT-ERROR was caught when trying to print *DEBUG-CONDITION*
when entering the debugger. Printing was aborted and the
SB-SYS:MEMORY-FAULT-ERROR was stored in SB-DEBUG::*NESTED-DEBUG-CONDITION*.)

The current thread is not at the foreground,
SB-THREAD:RELEASE-FOREGROUND has to be called in #<SB-THREAD:THREAD tid=939163 "main thread" RUNNING {10041300E3}>
for this thread to enter the debugger.

The crash does not occur using SBCL 2.2.4.

We were seeing crashes internally when trying to use SBCL 2.3.0 with sbcl-librarian, so we bisected and found sbcl/sbcl@53d80ca as the potentially breaking change for us. You may be encountering the same issue.

@kartik-s
Contributor

kartik-s commented Oct 5, 2023

Crashed with the same error on the commit before sbcl/sbcl@53d80ca

@kartik-s
Contributor

kartik-s commented Oct 5, 2023

On SBCL master with this patch to the minimal example:

diff --git a/src/build-image.lisp b/src/build-image.lisp
index 8b8f8cf..d702ea7 100644
--- a/src/build-image.lisp
+++ b/src/build-image.lisp
@@ -9,5 +9,7 @@
   parsejson-api
   sbcl-librarian:handles)

+(push (lambda () (sb-thread:release-foreground)) sb-ext:*init-hooks*)
+
 (sbcl-librarian:build-bindings parsejson "." :initialize-lisp-args `("--dynamic-space-size" ,(format nil "~a" (expt 2 13))))
 (sbcl-librarian:build-core-and-die parsejson ".")

I get this:

$ ./test-parse
;
; compilation unit aborted
;   caught 1 fatal ERROR condition
ERROR Invalid index 2684355118 for (SIMPLE-VECTOR 23), should be a non-negative integer below 23.
error: AWIAVIAUAATL%

or a hang

@kartik-s
Contributor

kartik-s commented Oct 5, 2023

$ ./test-parse
^C;
; compilation unit aborted
;   caught 1 fatal ERROR condition
ERROR The value
  NIL
is not of type
  SB-THREAD:THREAD
error: The value
  NIL
is not of type
  SB-THREAD:THREAD

@karlosz
Collaborator

karlosz commented Oct 6, 2023

The thing to do here is to minimize the reproducer as much as possible and submit a bug upstream with a carefully bisected commit attached.
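
A wrapper along these lines could make the reproducer usable with git bisect run. It is only a sketch under assumptions: classify_run is a hypothetical helper name, timeout is the GNU coreutils command, and the reproducer is assumed to exit nonzero when it fails. Since the failure in this thread shows up variously as a Lisp error, a memory fault, or a hang, the wrapper collapses all of those into a single "bad" status, and reports "skip" (125) when the binary could not be run at all. Exit codes above 127 would otherwise abort the bisect.

```shell
#!/bin/sh
# classify_run LIMIT CMD [ARGS...]
# Hypothetical helper: run CMD under a time limit and translate its exit
# status into the codes that `git bisect run` understands:
#   0   -> good
#   1   -> bad  (any failure, a timeout, or death by signal)
#   125 -> skip (the command could not be run, so this commit is untestable)
classify_run() {
    limit="$1"
    shift
    status=0
    # GNU timeout exits 124 when the limit is hit, 125 if timeout itself
    # fails, 126 if CMD cannot be invoked, and 127 if CMD cannot be found.
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    case "$status" in
        0)           return 0 ;;
        125|126|127) return 125 ;;
        *)           return 1 ;;
    esac
}
```

From an SBCL checkout, after marking one known-good and one known-bad revision with git bisect good and git bisect bad, a script that rebuilds the runtime and the library and then ends with something like `classify_run 60 ./test-parse` could be handed to git bisect run. The 60-second limit is an arbitrary choice, not something taken from this issue.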

@kartik-s
Contributor

Update: I tried building this against SBCL 2.4.4 and the error doesn't seem to appear anymore

3 participants