18b katago weights can't be loaded. #45

Open
luosonggu opened this issue Aug 26, 2024 · 16 comments

@luosonggu

I have tried both the CUDA and the OpenCL engine; both produce this message.
[2024-08-26 19:24:07.457] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:07.463] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:07.466] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:15.135] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.137] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.139] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.141] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.143] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.146] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.148] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.151] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.153] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.155] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.158] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.161] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.163] [multi_sink] [error] setting boardsize failed

@popojan
Owner

popojan commented Aug 26, 2024

Thank you for reporting. The error message is not very informative indeed.
I can reproduce this error when katago cannot find the weights file. The weights file is not distributed along with the application.
Take, for example, this sample (disabled) config:

    {
      "name": "Katago #kata9x9 b18",
      "path": "./engine/katago",
      "command": "katago",
      "parameters": "gtp -model ./engine/katago/kata9x9-b18c384nbt-20231025.bin.gz -config ./engine/katago/default_gtp.cfg",
      "enabled": 0,
      "kibitz": 0,
      "messages": [
        {
          "regex": "^:\\s+T.*--\\s*([A-Z0-9]+)",
          "output": "$1",
          "var": "$primaryMove"
        },
        {
          "regex": "^$primaryMove.*(W\\s+[^\\s]+).*\\(\\s*([^\\s]+\\s+L)",
          "output": "$1 $2"
        },
        {
          "regex": "Controller:",
          "output": " "
        }
      ]
    },

The file kata9x9-b18c384nbt-20231025.bin.gz is expected in the same directory as the katago executable.
You can freely edit this config file, as briefly described in wiki#bots.
You might also want to edit default_gtp.cfg to enable logging to stderr, because goban parses those logs and shows the katago ranking using the predefined regexes.
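To illustrate what the predefined regexes in the sample config do, here is a minimal Python sketch. The two sample lines are made up to resemble KataGo search output (they are not taken from a real log), and replacing `$primaryMove` with the move captured by the first pattern is an assumption about how goban treats the `var` field. The function name is hypothetical.

```python
import re

# Patterns copied from the sample engine config, with JSON escaping removed.
PRIMARY_RE = r"^:\s+T.*--\s*([A-Z0-9]+)"
STATS_TEMPLATE = r"^$primaryMove.*(W\s+[^\s]+).*\(\s*([^\s]+\s+L)"


def parse_search_info(lines):
    """Return (primary_move, stats) extracted from engine stderr lines.

    Assumption: $primaryMove in the second pattern is replaced by the move
    captured by the first pattern before matching.
    """
    primary = None
    stats = None
    for line in lines:
        m = re.search(PRIMARY_RE, line)
        if m:
            primary = m.group(1)  # corresponds to "output": "$1"
            continue
        if primary is not None:
            pattern = STATS_TEMPLATE.replace("$primaryMove", re.escape(primary))
            m = re.search(pattern, line)
            if m:
                stats = f"{m.group(1)} {m.group(2)}"  # "output": "$1 $2"
    return primary, stats


# Hypothetical lines shaped like KataGo search info; not from a real log.
SAMPLE = [
    ": T -5.51c W -5.48c S -0.03c ( -0.5 L -0.5) N 100 -- Q16 D4",
    "Q16 : T -5.51c W -5.48c S -0.03c ( -0.5 L -0.5) N 60",
]
```

If katago does not log to stderr, none of these patterns has anything to match, which is why the logging setting matters for the ranking display.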

If all this seems correct, please try to run katago as a standalone app using the same parameters as in the config file, or copy & paste the engine configuration here. Thank you.
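As a rough aid for that check, the following Python sketch builds the standalone command line from one engine entry and reports which of the `-model` and `-config` files are missing. It assumes relative paths are resolved against the directory goban is started from, which is what the "running child" log line suggests but is not confirmed here. The helper name `standalone_command` is hypothetical.

```python
import shlex
from pathlib import Path


def standalone_command(engine, base_dir="."):
    """Build argv for one engine entry and list missing -model/-config files.

    Assumption: relative paths in "parameters" are resolved against
    base_dir, i.e. the directory goban is started from.
    Note: shlex.split uses POSIX rules, so use forward slashes in paths.
    """
    args = shlex.split(engine["parameters"])
    argv = [str(Path(engine["path"]) / engine["command"])] + args
    missing = []
    for flag in ("-model", "-config"):
        if flag in args:
            idx = args.index(flag)
            if idx + 1 < len(args):
                target = Path(base_dir) / args[idx + 1]
                if not target.is_file():
                    missing.append(str(target))
    return argv, missing
```

An empty `missing` list only shows that the files exist; it does not prove that katago can load them.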

@luosonggu
Author

I've found the cause. My graphics card is an NVIDIA GT 730 and the driver version is 474. I tested another machine with a 1080 Ti card, and goban runs fine there. Could you check the code and make goban support old cards? In other GUIs such as Sabaki, the GT 730 can load 18b weights.

@popojan
Owner

popojan commented Aug 27, 2024

I see, this might be a bug. Could you please clarify:

  • is it a regression? (i.e. it worked in the 2022-02-03 goban and does not in 2024-08-25)
  • can katago load the weights and genmove when run standalone, i.e. from the command line: katago.exe gtp -model <weights>
  • possibly attach the last katago log file (it should be created even when run via goban, unless logging only to stderr)
  • your katago version and weights file

I may need to reproduce the problem.

@luosonggu
Author

To reproduce the problem, you need an old NVIDIA card (driver version 474 or below). I have tried many 18b weights and some weights trained by lionfenfen, so I'm sure that 9x9 weights are not supported by goban. Other people have encountered the same problem, and someone opened an issue on lightvector's KataGo tracker and got no answer.

@luosonggu
Author

lightvector/KataGo#924

@luosonggu
Author

Every version of goban has the same problem. I guess the problem is caused by the NVIDIA driver. Old graphics cards can't upgrade to the 5xx driver series. On my 1080 Ti with driver version 5xx, all is fine.

@luosonggu
Author

But with the NVIDIA 474 driver, Sabaki can load 18b weights smoothly. So the problem may be that the new 9x9 weights need new GUI code to work with the old NVIDIA driver.

@popojan
Owner

popojan commented Aug 27, 2024

We need to address the errors reported by katago, if any.
To me it seems it's lightvector who did not get answers in the linked issue.

You may run goban.exe -v debug to get more information in the last_run.log. If katago is configured to log into stderr, the stderr output should be included as well.

I am sorry I cannot test with the hardware mentioned in the near future.
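For going through a long last_run.log, a small Python filter along these lines may help. It assumes every record follows the `[time] [sink] [level] message` layout seen in the logs posted in this thread, and it keeps warnings, errors, and the `gtp err` lines that carry katago's stderr. The function name is hypothetical.

```python
import re

# One log record: [timestamp] [sink] [level] message
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<sink>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$"
)


def interesting_lines(lines, levels=("warning", "error")):
    """Keep records at the given levels plus forwarded engine stderr."""
    kept = []
    for line in lines:
        m = LOG_LINE.match(line.rstrip())
        if not m:
            continue  # blank or continuation line
        if m["level"] in levels or m["msg"].startswith("gtp err"):
            kept.append((m["time"], m["level"], m["msg"]))
    return kept


# Usage sketch (file name as mentioned above):
# with open("last_run.log", encoding="utf-8", errors="replace") as f:
#     for record in interesting_lines(f):
#         print(*record)
```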

@lj739

lj739 commented Aug 28, 2024

[2024-08-28 08:41:27.559] [multi_sink] [debug] Loaded font face Lacuna Regular Regular (from byte stream).
[2024-08-28 08:41:27.587] [multi_sink] [debug] Loaded font face Lacuna Italic Regular (from byte stream).
[2024-08-28 08:41:27.597] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-28 08:41:27.605] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-28 08:41:27.612] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-28 08:41:27.617] [multi_sink] [info] Loading font file [./data/fonts/Delicious-Roman.otf]
[2024-08-28 08:41:27.622] [multi_sink] [debug] Loaded font face Delicious Roman (from ./data/fonts/Delicious-Roman.otf).
[2024-08-28 08:41:27.625] [multi_sink] [info] Loading font file [./data/fonts/Delicious-Bold.otf]
[2024-08-28 08:41:27.630] [multi_sink] [debug] Loaded font face Delicious Bold (from ./data/fonts/Delicious-Bold.otf).
[2024-08-28 08:41:27.653] [multi_sink] [info] Preloading sounds...
[2024-08-28 08:41:27.708] [multi_sink] [info] Loading font file [./data/fonts/default-font.ttf]
[2024-08-28 08:41:27.712] [multi_sink] [debug] Creating overlay buffer[0]
[2024-08-28 08:41:27.713] [multi_sink] [debug] Adding text glyphs[0]
[2024-08-28 08:41:27.718] [multi_sink] [debug] gid 19: endpoints 38; err 50; tex fetch 3.2; mem 2.9kb
[2024-08-28 08:41:27.719] [multi_sink] [debug] gid 20: endpoints 7; err 0; tex fetch 1.6; mem 1.2kb
[2024-08-28 08:41:27.721] [multi_sink] [debug] gid 21: endpoints 30; err 95; tex fetch 2.8; mem 2.6kb
[2024-08-28 08:41:27.725] [multi_sink] [debug] gid 22: endpoints 50; err 76; tex fetch 3.7; mem 3.4kb
[2024-08-28 08:41:27.727] [multi_sink] [debug] gid 23: endpoints 18; err 0; tex fetch 1.9; mem 2.2kb
[2024-08-28 08:41:27.733] [multi_sink] [debug] gid 24: endpoints 38; err 85; tex fetch 3.3; mem 2.9kb
[2024-08-28 08:41:27.740] [multi_sink] [debug] gid 25: endpoints 46; err 69; tex fetch 3.6; mem 3.3kb
[2024-08-28 08:41:27.745] [multi_sink] [debug] gid 26: endpoints 16; err 61; tex fetch 2.3; mem 2.3kb
[2024-08-28 08:41:27.753] [multi_sink] [debug] gid 27: endpoints 52; err 99; tex fetch 3.9; mem 3.4kb
[2024-08-28 08:41:27.760] [multi_sink] [debug] gid 28: endpoints 47; err 51; tex fetch 3.6; mem 3.2kb
[2024-08-28 08:41:27.778] [multi_sink] [debug] gid 59: endpoints 13; err 0; tex fetch 1.8; mem 2.6kb
[2024-08-28 08:41:27.778] [multi_sink] [debug] Creating overlay buffer[1]
[2024-08-28 08:41:27.779] [multi_sink] [debug] Adding text glyphs[1]
[2024-08-28 08:41:27.780] [multi_sink] [debug] Creating overlay buffer[2]
[2024-08-28 08:41:27.780] [multi_sink] [debug] Adding text glyphs[2]
[2024-08-28 08:41:27.780] [multi_sink] [debug] 11 glyphs; avg num endpoints 32.27; avg error 53.1;avg tex fetch 2.89; avg 2.73kb per glyph
[2024-08-28 08:41:27.781] [multi_sink] [debug] sound ./data/sound/collision.wav frame count 22050
[2024-08-28 08:41:27.782] [multi_sink] [debug] sound ./data/sound/stone.wav frame count 5762
[2024-08-28 08:41:28.191] [multi_sink] [debug] setting gamma = 1.0
[2024-08-28 08:41:28.192] [multi_sink] [debug] setting contrast = 0.0
[2024-08-28 08:41:28.193] [multi_sink] [info] Starting GTP client [./engine/gnugo/gnugo]
[2024-08-28 08:41:28.252] [multi_sink] [info] About to run GTP engine [./engine/gnugo/gnugo.exe]
[2024-08-28 08:41:28.253] [multi_sink] [info] running child [./engine/gnugo/gnugo.exe --mode gtp --japanese-rules]
[2024-08-28 08:41:28.264] [multi_sink] [info] Setting [GNU Go 3.8] engine as coach and referee.
[2024-08-28 08:41:28.264] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = false]
[2024-08-28 08:41:28.265] [multi_sink] [info] Starting GTP client [./engine/katago/katago]
[2024-08-28 08:41:28.325] [multi_sink] [info] About to run GTP engine [./engine/katago/katago.exe]
[2024-08-28 08:41:28.326] [multi_sink] [info] running child [./engine/katago/katago.exe gtp -model ./engine/katago/b18c384nbt-optimisticv13-s5971M.bin.gz -config ./engine/katago/default_gtp.cfg]
[2024-08-28 08:41:28.336] [multi_sink] [info] Setting [Katago #kata9x9 b18] engine as trusted kibitz.
[2024-08-28 08:41:28.337] [multi_sink] [debug] Player[1] newType = [human = false, computer = true] newRole = [black = false, white = false]
[2024-08-28 08:41:28.338] [multi_sink] [info] gnugo << boardsize 19
[2024-08-28 08:41:28.339] [multi_sink] [info] getting response...
[2024-08-28 08:41:28.343] [multi_sink] [info] gnugo >> =
[2024-08-28 08:41:28.343] [multi_sink] [info] gnugo >>
[2024-08-28 08:41:28.344] [multi_sink] [info] gnugo << clear_board
[2024-08-28 08:41:28.345] [multi_sink] [info] getting response...
[2024-08-28 08:41:28.346] [multi_sink] [info] gnugo >> =
[2024-08-28 08:41:28.347] [multi_sink] [info] gnugo >>
[2024-08-28 08:41:28.348] [multi_sink] [info] katago << boardsize 19
[2024-08-28 08:41:28.349] [multi_sink] [info] getting response...
[2024-08-28 08:41:28.665] [multi_sink] [debug] gtp err = KataGo v1.13.0

[2024-08-28 08:41:28.666] [multi_sink] [debug] gtp err = Using TrompTaylor rules initially, unless GTP/GUI overrides this

[2024-08-28 08:41:35.005] [multi_sink] [info] katago << clear_board
[2024-08-28 08:41:35.006] [multi_sink] [info] getting response...
[2024-08-28 08:41:35.007] [multi_sink] [debug] Player[2] newType = [human = true, computer = false] newRole = [black = false, white = false]
[2024-08-28 08:41:35.008] [multi_sink] [debug] Player[2] newType = [human = true, computer = false] newRole = [black = true, white = false]
[2024-08-28 08:41:35.009] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = false]
[2024-08-28 08:41:35.010] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = true]
[2024-08-28 08:41:35.064] [multi_sink] [debug] gtp err =
[2024-08-28 08:41:35.326] [multi_sink] [debug] Load
[2024-08-28 08:41:35.330] [multi_sink] [debug] Player[2] newType = [human = true, computer = false] newRole = [black = false, white = false]
[2024-08-28 08:41:35.330] [multi_sink] [debug] Player[2] newType = [human = true, computer = false] newRole = [black = true, white = false]
[2024-08-28 08:41:35.332] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = false]
[2024-08-28 08:41:35.332] [multi_sink] [debug] Player[0] newType = [human = false, computer = true] newRole = [black = false, white = true]
[2024-08-28 08:41:35.340] [multi_sink] [info] switching shader to #0
[2024-08-28 08:41:35.387] [multi_sink] [debug] FPS: 0.0
[2024-08-28 08:41:35.390] [multi_sink] [info] gnugo << boardsize 19
[2024-08-28 08:41:35.390] [multi_sink] [info] getting response...
[2024-08-28 08:41:35.391] [multi_sink] [info] gnugo >> =
[2024-08-28 08:41:35.392] [multi_sink] [info] gnugo >>
[2024-08-28 08:41:35.393] [multi_sink] [info] gnugo << clear_board
[2024-08-28 08:41:35.394] [multi_sink] [info] getting response...
[2024-08-28 08:41:35.395] [multi_sink] [info] gnugo >> =
[2024-08-28 08:41:35.396] [multi_sink] [info] gnugo >>
[2024-08-28 08:41:35.397] [multi_sink] [info] katago << boardsize 19
[2024-08-28 08:41:35.398] [multi_sink] [info] getting response...

@popojan
Owner

popojan commented Aug 28, 2024

@lj739 Thank you. And that's all? Then goban quits or seemingly hangs?
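One way to see where such a log stops is to pair each `<<` command with a `>>` reply. The Python sketch below does that under two assumptions: the log uses the `engine << command` and `engine >> reply` layout shown above, and a reply is any `>>` line whose payload starts with `=` or `?`, as GTP responses do. The function name is hypothetical.

```python
import re
from collections import deque

SENT = re.compile(r"\] (?P<engine>\S+) << (?P<cmd>.+)$")
RECV = re.compile(r"\] (?P<engine>\S+) >> [=?]")


def unanswered_commands(lines):
    """Return (engine, command) pairs that never received a GTP reply."""
    pending = {}  # engine name -> queue of commands still waiting
    for raw in lines:
        line = raw.rstrip()
        sent = SENT.search(line)
        if sent:
            pending.setdefault(sent["engine"], deque()).append(sent["cmd"])
            continue
        recv = RECV.search(line)
        if recv and pending.get(recv["engine"]):
            pending[recv["engine"]].popleft()  # oldest command is answered
    return [(eng, cmd) for eng, queue in pending.items() for cmd in queue]
```

On the debug log above, this would list the katago commands only, since every gnugo command is followed by a `=` reply.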

@lj739

lj739 commented Aug 28, 2024

goban quits, and these messages appear in last_run.log:
[2024-08-26 19:24:07.457] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:07.463] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:07.466] [multi_sink] [warning] Syntax error parsing property declaration 'margin-top: 1em;' in : 34.
[2024-08-26 19:24:15.135] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.137] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.139] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.141] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.143] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.146] [multi_sink] [error] setting boardsize failed
[2024-08-26 19:24:15.148] [multi_sink] [error] setting boardsize failed

@popojan
Owner

popojan commented Aug 28, 2024

The first one for goban.exe -v debug looks incomplete or truncated, compared to the second one with default verbosity.

I could investigate further if you'd be so kind as to run katago standalone, i.e.

katago.exe gtp -model b18c384nbt-optimisticv13-s5971M.bin.gz -config default_gtp.cfg

let it genmove, and attach the logged output.
Thank you in advance.
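If typing GTP commands by hand is inconvenient, the same test can be scripted. The Python sketch below starts any GTP engine, sends commands one at a time, and collects each reply up to the blank line that ends a GTP response. The helper name `run_gtp` is hypothetical, reads block without a timeout, and the engine's stderr is left attached to the terminal so the katago log stays visible.

```python
import subprocess


def run_gtp(argv, commands):
    """Send GTP commands to an engine process and return its replies."""
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    replies = []
    try:
        for cmd in commands:
            proc.stdin.write(cmd + "\n")
            proc.stdin.flush()
            lines = []
            while True:
                line = proc.stdout.readline()
                if line == "":
                    raise RuntimeError("engine closed its output before replying")
                if line.strip():
                    lines.append(line.rstrip())
                elif lines:
                    break  # blank line terminates a GTP response
            replies.append("\n".join(lines))
    finally:
        try:
            proc.stdin.close()
        except OSError:
            pass  # engine already gone
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()
    return replies


# Usage sketch with the command line from this comment:
# run_gtp(
#     ["katago.exe", "gtp", "-model", "b18c384nbt-optimisticv13-s5971M.bin.gz",
#      "-config", "default_gtp.cfg"],
#     ["boardsize 19", "clear_board", "genmove B", "quit"],
# )
```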

@luosonggu
Author

2024-08-28 22:36:16+0800: Running with following config:
allowResignation = true
defaultBoardSize = 19
lagBuffer = 1.0
logAllGTPCommunication = true
logDir = gtp_logs
logSearchInfo = true
logSearchInfoForChosenMove = true
logToStderr = true
maxTimePondering = 60.0
maxVisits = 100
numSearchThreads = 6
ponderingEnabled = false
resignConsecTurns = 3
resignThreshold = -0.999
rules = tromp-taylor
searchFactorAfterOnePass = 0.50
searchFactorAfterTwoPass = 0.25
searchFactorWhenWinning = 0.40
searchFactorWhenWinningThreshold = 0.95

2024-08-28 22:36:16+0800: GTP Engine starting...
2024-08-28 22:36:16+0800: KataGo v1.15.2
2024-08-28 22:36:16+0800: Using TrompTaylor rules initially, unless GTP/GUI overrides this
2024-08-28 22:36:16+0800: Using 6 CPU thread(s) for search
2024-08-28 22:36:16+0800: nnRandSeed0 = 8178244719051708548
2024-08-28 22:36:16+0800: After dedups: nnModelFile0 = kata1-b18c384nbt-s9996604416-d4316597426.bin.gz useFP16 auto useNHWC auto
2024-08-28 22:36:16+0800: Initializing neural net buffer to be size 19 * 19 exactly
2024-08-28 22:36:20+0800: Found OpenCL Platform 0: NVIDIA CUDA (NVIDIA Corporation) (OpenCL 3.0 CUDA 11.4.309)
2024-08-28 22:36:20+0800: Found 1 device(s) on platform 0 with type CPU or GPU or Accelerator
2024-08-28 22:36:20+0800: Found OpenCL Device 0: NVIDIA GeForce GT 730 (NVIDIA Corporation) (score 11000300)
2024-08-28 22:36:20+0800: Creating context for OpenCL Platform: NVIDIA CUDA (NVIDIA Corporation) (OpenCL 3.0 CUDA 11.4.309)
2024-08-28 22:36:20+0800: Using OpenCL Device 0: NVIDIA GeForce GT 730 (NVIDIA Corporation) OpenCL 3.0 CUDA (Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_device_uuid cl_khr_pci_bus_info)
2024-08-28 22:36:20+0800: Loaded tuning parameters from: D:\Goban2024a\engine\katagopencl/KataGoData/opencltuning/tune11_gpuNVIDIAGeForceGT730_x19_y19_c384_mv14.txt
2024-08-28 22:36:20+0800: OpenCL backend thread 0: Model version 14
2024-08-28 22:36:20+0800: OpenCL backend thread 0: Model name: kata1-b18c384nbt-s9996604416-d4316597426
2024-08-28 22:36:27+0800: OpenCL backend thread 0: FP16Storage false FP16Compute false FP16TensorCores false FP16TensorCoresFor1x1 false
2024-08-28 22:36:27+0800: Loaded neural net with nnXLen 19 nnYLen 19
2024-08-28 22:36:27+0800: Initializing board with boardXSize 19 boardYSize 19
2024-08-28 22:36:27+0800: Loaded config default_gtp.cfg
2024-08-28 22:36:27+0800: Loaded model kata1-b18c384nbt-s9996604416-d4316597426.bin.gz
2024-08-28 22:36:27+0800: Model name: kata1-b18c384nbt-s9996604416-d4316597426
2024-08-28 22:36:27+0800: GTP ready, beginning main protocol loop

@luosonggu
Author

PS D:\Goban2024\engine\katagocuda> .\katago.exe gtp -model kata1-b18c384nbt-s9996604416-d4316597426.bin.gz -config default_gtp.cfg
2024-08-28 22:46:29+0800: Running with following config:
allowResignation = true
cudaUseFP16 = false
cudaUseNHWC = false
lagBuffer = 1.0
logAllGTPCommunication = true
logDir = gtp_logs
logSearchInfo = true
logToStderr = true
maxTimePondering = 60.0
maxVisits = 50
numSearchThreads = 6
ponderingEnabled = false
resignConsecTurns = 3
resignThreshold = -0.999
rules = tromp-taylor
searchFactorAfterOnePass = 0.50
searchFactorAfterTwoPass = 0.25
searchFactorWhenWinning = 0.40
searchFactorWhenWinningThreshold = 0.95

2024-08-28 22:46:29+0800: GTP Engine starting...
2024-08-28 22:46:29+0800: KataGo v1.13.0
2024-08-28 22:46:29+0800: Using TrompTaylor rules initially, unless GTP/GUI overrides this
2024-08-28 22:46:29+0800: Using 6 CPU thread(s) for search
2024-08-28 22:46:30+0800: nnRandSeed0 = 16223772968998217411
2024-08-28 22:46:30+0800: After dedups: nnModelFile0 = kata1-b18c384nbt-s9996604416-d4316597426.bin.gz useFP16 false useNHWC false
2024-08-28 22:46:30+0800: Initializing neural net buffer to be size 19 * 19 exactly
2024-08-28 22:46:33+0800: Cuda backend thread 0: Found GPU NVIDIA GeForce GT 730 memory 1073741824 compute capability major 3 minor 5
2024-08-28 22:46:33+0800: Cuda backend thread 0: Model version 14 useFP16 = false useNHWC = false
2024-08-28 22:46:33+0800: Cuda backend thread 0: Model name: kata1-b18c384nbt-s9996604416-d4316597426
2024-08-28 22:46:46+0800: Loaded neural net with nnXLen 19 nnYLen 19
2024-08-28 22:46:46+0800: Initializing board with boardXSize 19 boardYSize 19
2024-08-28 22:46:46+0800: Loaded config default_gtp.cfg
2024-08-28 22:46:46+0800: Loaded model kata1-b18c384nbt-s9996604416-d4316597426.bin.gz
2024-08-28 22:46:46+0800: Model name: kata1-b18c384nbt-s9996604416-d4316597426
2024-08-28 22:46:46+0800: GTP ready, beginning main protocol loop
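The two standalone runs above come from different builds (KataGo v1.15.2 with OpenCL and v1.13.0 with CUDA), so it can help to reduce each startup log to a few comparable fields. The Python sketch below is one way to do that; the patterns are written against the exact wording of the log lines posted here and may not match other KataGo versions. The function name is hypothetical.

```python
import re

# Patterns follow the wording of the startup lines posted in this thread.
FIELDS = {
    "version": r"KataGo v(\S+)",
    "model": r"Model name: (\S+)",
    "opencl_device": r"Using OpenCL Device \d+: (.+?) \(",
    "cuda_gpu": r"Found GPU (.+?) memory",
    "compute_capability": r"compute capability major (\d+) minor (\d+)",
}


def summarize_startup(log_text):
    """Extract version, model and GPU details from KataGo startup output."""
    summary = {}
    for name, pattern in FIELDS.items():
        m = re.search(pattern, log_text)
        if m:
            # Multi-group matches (major, minor) are joined as "major.minor".
            summary[name] = ".".join(m.groups())
    return summary
```

Running it on both logs and comparing the dictionaries makes the version and backend differences easy to spot.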

@luosonggu
Author

I don't know how to issue genmove in PowerShell.

@popojan
Owner

popojan commented Aug 28, 2024

@luosonggu Once GTP is ready, type the GTP command genmove B and press Enter. It is clear that it would work, though, so these logs are enough for now. Thank you.
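For reference, a GTP reply starts with `=` on success or `?` on failure, optionally followed by a command id, and ends with a blank line. The Python sketch below interprets such a reply; the move `Q16` in the usage comment is only an illustration, not output from this setup, and the function name is hypothetical.

```python
import re


def parse_gtp_response(text):
    """Split a GTP reply into (ok, payload).

    ok is True for "=" replies and False for "?" replies; an optional
    numeric command id after the status character is ignored.
    """
    m = re.match(r"^([=?])(\d*)\s*(.*)", text.strip(), re.DOTALL)
    if not m:
        raise ValueError("not a GTP response: %r" % text)
    return m.group(1) == "=", m.group(3)


# After typing "genmove B", a successful reply might look like "= Q16"
# followed by an empty line; parse_gtp_response would return (True, "Q16").
```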


3 participants