chat.exe will instantly exit with no text or error msg #90

Open
yigalnavon opened this issue Mar 21, 2023 · 14 comments

Comments

@yigalnavon

chat.exe prints a blank line with no text and then exits.
On Windows 10, compiled with CMake.

Please help.

@technoqz

D:\ALPACA\alpaca-win>chat.exe
main: seed = 1679458006
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.34 MB
llama_model_load: memory_size =  2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size =  4017.27 MB / num tensors = 291

system_info: n_threads = 4 / 4 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000



D:\ALPACA\alpaca-win>

Same problem. I downloaded the compiled exe from Releases. It shows a blank line after the model loads and then exits. I also tried running the same exe file on another PC (with a newer CPU, but the same OS, Windows 10) - there is no such error there.
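
If the newer CPU is the only difference, one plausible explanation is that the release binary was built with AVX2/F16C enabled and dies with an illegal instruction on a CPU that lacks those extensions (the system_info line only reflects compile-time flags, not what the CPU can actually run). A quick check of what your CPU supports - a hypothetical helper, not part of alpaca.cpp, assuming GCC/MinGW is available:

/* cpucheck.c - hypothetical helper, assumes GCC/MinGW.
 * Uses __builtin_cpu_supports to report which SIMD extensions the CPU
 * provides; compare against the system_info line, which shows what the
 * binary was compiled with. */
#include <stdio.h>

int main(void) {
    printf("AVX:  %d\n", __builtin_cpu_supports("avx"));
    printf("AVX2: %d\n", __builtin_cpu_supports("avx2"));
    printf("FMA:  %d\n", __builtin_cpu_supports("fma"));
    printf("F16C: %d\n", __builtin_cpu_supports("f16c"));
    printf("SSE3: %d\n", __builtin_cpu_supports("sse3"));
    return 0;
}

Build with gcc cpucheck.c -o cpucheck and run it; if AVX2 or F16C comes back 0 while the binary was built with them, a silent exit right after loading would be consistent with this.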

@andrenaP

andrenaP commented Mar 22, 2023

I have the same problem:

F:\lama\alpaca-win>.\chat.exe -m ggml-alpaca-7b-q4.bin
main: seed = 1679476791
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.34 MB
llama_model_load: memory_size =  2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size =  4017.27 MB / num tensors = 291

system_info: n_threads = 4 / 4 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000



F:\lama\alpaca-win>.\chat.exe -m ggml-alpaca-7b-q4.bin

@linonetwo

Mine quits after accepting my question:

PS E:\repo\langchain-alpaca\dist\binary>  ./chat.exe --model "e:\repo\langchain-alpaca\model\ggml-alpaca-7b-q4.bin"  --threads 6
main: seed = 1679490011
llama_model_load: loading model from 'e:\repo\langchain-alpaca\model\ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.34 MB
llama_model_load: memory_size =  2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from 'e:\repo\langchain-alpaca\model\ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size =  4017.27 MB / num tensors = 291

system_info: n_threads = 6 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000


== Running in chat mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMA.
 - If you want to submit another line, end your input in '\'.

> Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. harrison went to harvard ankush went to princeton Question Where did harrison go to college Helpful Answer

I have enough memory for it, so it is not an OOM.

@aofalcao

I have exactly the same issue. Compiled with CMake and VS2019. I get the info about the parameters, it waits about 10 seconds, and then it exits without producing any error.

@aofalcao

I have exactly the same issue. Compiled with CMake and VS2019. I get the info about the parameters, it waits about 10 seconds, and then it exits without producing any error.

I think I found part of the problem - not necessarily the real cause, and not even close to a solution, but this may at least help the maintainers.
In the function llama_eval, this call:

ggml_graph_compute (ctx0, &gf);

is the one that never finishes, and the program aborts.
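
To check whether ggml_graph_compute itself is what dies on a given machine (rather than model loading), a minimal standalone test might help. This is a hypothetical sketch, assuming the old ggml API that alpaca.cpp ships (ggml_build_forward, per-graph n_threads), built against the repo's ggml.c and ggml.h:

// graph_test.c - hypothetical minimal test, not part of alpaca.cpp
#include <stdio.h>
#include "ggml.h"

int main(void) {
    // small scratch arena for a toy graph
    struct ggml_init_params params = {
        .mem_size   = 16 * 1024 * 1024,
        .mem_buffer = NULL,
    };
    struct ggml_context *ctx0 = ggml_init(params);

    // y = a + b on two small f32 vectors
    struct ggml_tensor *a = ggml_new_tensor_1d(ctx0, GGML_TYPE_F32, 8);
    struct ggml_tensor *b = ggml_new_tensor_1d(ctx0, GGML_TYPE_F32, 8);
    ggml_set_f32(a, 1.0f);
    ggml_set_f32(b, 2.0f);
    struct ggml_tensor *y = ggml_add(ctx0, a, b);

    struct ggml_cgraph gf = ggml_build_forward(y);
    gf.n_threads = 4;
    ggml_graph_compute(ctx0, &gf);  // the call that reportedly never returns

    printf("y[0] = %f (expected 3.0)\n", ggml_get_f32_1d(y, 0));
    ggml_free(ctx0);
    return 0;
}

If even this tiny graph hangs or aborts, the failure is in the compute kernels (most likely the instruction set they were compiled for) rather than in anything model-specific.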

@rangedreign

Same issue, with even less text than the others.

E:\AI-Chat\alpaca-win>chat -m ggml-alpaca-13b-q4.bin
main: seed = 1679516224
llama_model_load: loading model from 'ggml-alpaca-13b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB

E:\AI-Chat\alpaca-win>

Clicking on chat.exe does not load anything either.

@fancellu

Me too. Windows 10, 32 GB RAM. "chat.exe has stopped working". Same for 7B and 13B.

I don't even get to ask it a question

@StanDaMan0505

Same here

D:\StableDiffusion\Alpaca>chat.exe -i -m ggml-alpaca-13b-q4.bin -t 1
main: seed = 1679581051
llama_model_load: loading model from 'ggml-alpaca-13b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB
llama_model_load: memory_size =  3200.00 MB, n_mem = 81920
llama_model_load: loading model part 1/1 from 'ggml-alpaca-13b-q4.bin'
llama_model_load: ............................................. done
llama_model_load: model size =  7759.39 MB / num tensors = 363

system_info: n_threads = 1 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000

16 GB RAM, of which about 12 GB is used during "startup".

@SiemensSchuckert

Building with MINGW might help

  1. Install MSYS2 (www.msys2.org)

  2. Place sources into C:\msys64\home\USERNAME\alpaca.cpp
    and apply this patch: Add support for building on native Windows via MINGW (#84)

  3. Inside UCRT64 terminal run:
    pacman -S mingw-w64-ucrt-x86_64-gcc
    pacman -S make
    cd alpaca.cpp
    make

chat.exe should then appear at C:\msys64\home\USERNAME\alpaca.cpp\chat.exe

@fancellu

Building with MINGW might help

  1. Install MSYS2 (www.msys2.org)
  2. Place sources into C:\msys64\home\USERNAME\alpaca.cpp
    and apply this patch: Add support for building on native Windows via MINGW (#84)
  3. Inside UCRT64 terminal run:
    pacman -S mingw-w64-ucrt-x86_64-gcc
    pacman -S make
    cd alpaca.cpp
    make

chat.exe should then appear at C:\msys64\home\USERNAME\alpaca.cpp\chat.exe

Doesn't work for me.

When I "make"

D:/msys64/ucrt64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/f16cintrin.h:52:1: error: inlining failed in call to 'always_inline' '_mm256_cvtph_ps': target specific option mismatch
   52 | _mm256_cvtph_ps (__m128i __A)
      | ^~~~~~~~~~~~~~~
ggml.c:911:33: note: called from here
  911 | #define GGML_F32Cx8_LOAD(x)     _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:911:33: note: in definition of macro 'GGML_F32Cx8_LOAD'
  911 | #define GGML_F32Cx8_LOAD(x)     _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
      |                                 ^~~~~~~~~~~~~~~
ggml.c:1274:21: note: in expansion of macro 'GGML_F16_VEC_LOAD'
 1274 |             ay[j] = GGML_F16_VEC_LOAD(y + i + j*GGML_F16_EPR, j);
      |                     ^~~~~~~~~~~~~~~~~
D:/msys64/ucrt64/lib/gcc/x86_64-w64-mingw32/12.2.0/include/f16cintrin.h:52:1: error: inlining failed in call to 'always_inline' '_mm256_cvtph_ps': target specific option mismatch
   52 | _mm256_cvtph_ps (__m128i __A)

etc

If I don't include the patch, it compiles, but chat.exe gives me an illegal instruction.
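
That "target specific option mismatch" error usually means ggml.c is compiled with code paths that use the F16C intrinsic _mm256_cvtph_ps while -mf16c is not among the compiler flags. A guess at a workaround (not verified against the #84 patch, and only worth trying if your CPU actually has F16C) is to rebuild with F16C enabled alongside the AVX flags, for example:

# hypothetical CFLAGS override from the UCRT64 shell; adjust to whatever
# flags your Makefile already sets - the point is adding -mf16c
make clean
make CFLAGS="-I. -O3 -DNDEBUG -std=c11 -fPIC -mavx -mavx2 -mfma -mf16c"

If the unpatched build instead dies with an illegal instruction, the CPU may simply lack one of the enabled extensions, in which case removing -mavx2/-mf16c (and accepting slower code) is the direction to try rather than adding them.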

@SiemensSchuckert

Try this version
https://github.com/SiemensSchuckert/alpaca.cpp

@fancellu

Try this version https://github.com/SiemensSchuckert/alpaca.cpp

Thanks, that works fine. Yaaaay!

@sarfraznawaz2005

Same problem.

@sarfraznawaz2005

@SiemensSchuckert that worked, thanks a lot :)
