
Releases: LeelaChessZero/lc0

v0.26.3

10 Oct 19:44

Starting with this release, we are distributing two packages for Windows with Nvidia GPUs: the cuda package and the cudnn package. The cudnn package is what we have distributed so far (but we called it cuda), and comes with the same versions of the CUDA and cuDNN DLLs we have been using for the last few months. The new cuda package comes with CUDA 11.1 DLLs, requires at least version 456.38 of the Windows Nvidia drivers, and should give better performance on RTX cards, in particular the new RTX 30XX cards.

Notes:

  1. The cudnn package will work as-is in existing setups, but for the cuda package you may have to replace cudnn with cuda (or cuda-auto or cuda-fp16) as the backend, if one is specified; this will certainly be necessary for multi-GPU setups. See the example after this list.
  2. Some testing indicates that cuda 11.1 may be slower for GTX 10XX cards, so owners of older cards may want to stay with the cudnn package. If your testing shows otherwise, do let us know.
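
For example, to point an existing single-GPU setup at the new package you can pass the backend explicitly; cuda-fp16 below is only an illustration, and cuda or cuda-auto can be substituted as noted above:

    lc0 --backend=cuda-fp16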

v0.26.3-rc2

03 Oct 13:48
Pre-release
  • Fix for an uninitialized variable that led to crashes with the cudnn backend.
  • Correct Windows support for systems with more than 64 threads.
  • A new package is built for the cuda backend with CUDA 11.1. The old cuda package is renamed to cudnn.

Note: The cuda package requires Nvidia driver 456.38 or newer.

v0.26.3-rc1

28 Sep 11:23
Pre-release
  • Residual block fusion optimization for the cudnn backend, which depends on custom_winograd=true. It is enabled by default only for networks with up to 384 filters in fp16 mode, and never in fp32 mode. The default can be overridden with --backend-opts=res_block_fusing=false to disable (or =true to enable).
  • New experimental cuda backend without a cudnn dependency (cuda-auto, cuda and cuda-fp16 are available).

v0.26.2

02 Sep 12:53

No changes since rc1. Enjoy!

v0.26.2-rc1

28 Aug 15:33
Pre-release
  • Repetitions in the search tree are marked as draws, to explore more promising lines. Enabled by default (except in selfplay mode); use --two-fold-draws=false to disable.
  • Syzygy tablebase files can now be used in selfplay. We still need to add adjudication support before we can consider using this for training.
  • Default net updated to 703810.
  • Fix for opening books with CR/LF line endings.
  • Updated the Eigen wrap to use the new download link.

If you build from source, note that old versions of meson cannot download from the new Eigen download link. You will either have to update meson or build with -Dblas=false.
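
As a rough sketch, assuming you configure with meson directly rather than through the provided build scripts (the build directory name and build type below are arbitrary choices):

    meson build --buildtype=release -Dblas=false
    ninja -C build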

v0.26.1

15 Jul 13:14
  • Fixed an issue where an incorrectly specified openings-pgn path would be ignored rather than cause a failure.
  • Added support for compressed opening books.
  • Windows builds include v29 of the lc0-training-client.

v0.26.0

02 Jul 21:50

No changes since rc1. Enjoy!

v0.26.0-rc1

29 Jun 06:45
Pre-release
  • Verbose move stats now includes a line for the root node itself.
  • Added optional alphazero time manager type for fixed fraction of
    remaining time per move.
  • The WL score is now tracked with double precision to improve accuracy
    during very long searches.
  • Fix for a performance bug when playing from a tablebase position with
    tablebases enabled while the PV move was changing frequently.
  • Illegal searchmove restrictions will now be ignored rather than crash.
  • Policy is cleared for terminal losses to encourage better quality MLH
    estimates, by reducing the number of visits received by a move that will
    not be selected (unless all other options are equally bad).
  • Smart pruning will now cause Leela to play immediately once a mate score
    has been declared.
  • Fix an issue where sometimes the pv reported wouldn't match the move that
    would be selected at that moment.
  • Improved the logic for when to disable the custom_winograd optimization, to
    avoid running out of video RAM.
  • --show-hidden can now be specified after --help and still work.
  • Performance tuning for populating the policy into nodes after nn eval
    completes.
  • Enable custom optimized SE paths for nets with 384 filters when using the
    custom_winograd=false path.
  • Updates to zlib/gtest/eigen when included via meson wrap.
  • Added build option to build python bindings to the lc0 engine.
  • Only show the git hash in uci name if not a release tag build.
  • Add --nps-limit option to artificially reduce nps, to make for an easier
    opponent or for whatever other reason you want (see the example after this
    list).
  • Fixed a bug where search tree shape could be affected even when the
    --smart-pruning-factor setting was 0.
  • Changed the search logic to find the lc0.config file if left on the default
    value.
  • Changed the search logic to find network files in autodiscover mode.
  • Changed the logic to determine the default location for training games
    generated by selfplay in training mode.
  • Changed the logic to decide where to look for the opencl backend tuning
    settings file.
  • Android binaries published by Appveyor are now stripped.
  • The build can now use a system-installed Eigen if available.
  • When nodes in the tree get proven terminal, parents are updated as if they
    had always been terminal. This allows for faster convergence on more
    accurate MLH estimates amongst other details.
  • Removed shortsightedness and logit-q options that have not found a reliable
    use case.
  • Fixed a bug where m_effect calculated as part of S in verbose move stats was
    not consistent with the value used in search itself.
  • Added 'pro' mode as an alternative to --show-hidden for UCI hosts that do
    not support command line arguments. Simply rename the lc0 binary to include
    'pro' in order to enable.
  • backendbench now has a --clippy option that tries to automatically suggest
    a good batch size.
  • The demux backend now splits the batch into equal sizes based on the number
    of threads that demux is using rather than the number of backends. By
    default this changes nothing, as there is usually one thread per backend,
    but it makes it easier to use demux with a blas backend, sending one chunk
    per core.
  • Added support for new training input variants canonical_hectoplies and
    canonical_hectoplies_armageddon.
  • Fixed a bug where, if the network search paths for autodiscover contained
    files which lc0 cannot open, it would error out rather than continuing on
    to other files.
  • Blas backends no longer have a blas_cores option, as it never seemed useful
    compared to running more threads at a higher level.
  • --help-md option removed as it was deemed not very useful.
  • Updated to the latest version of dnnl for the dnnl build.
  • Selfplay mode now supports per-color settings in addition to per-player
    settings. Per-player settings have higher priority if there is a conflict.
    This will be used as part of armageddon training.
  • Added a new experimental backend type: recordreplay. This allows recording
    the output of a backend under a particular search and replaying it back
    again later. Theoretically this lets you simulate a CPU bottlenecked
    environment but still use a search tree that matches what might be a GPU
    bottlenecked environment. In practice there are a lot of corner cases where
    replay is not yet reliable. At a minimum you must disable prefetch.
  • During search the node tree is occasionally compacted to reduce cache misses
    during the search tree walk. The new option --solid-tree-threshold can be
    used to adjust how aggressive this optimization is. Note that very small
    values can cause very large growth in RAM usage and are not a good idea.
    The default value is a little conservative; if you have plenty of spare RAM
    it can be good to decrease it a bit.
  • Small performance optimization for windows build with MLH enabled.
  • Meson configuration changed to build with LTO by default. Note that meson
    does not always configure Visual Studio project files to apply this
    correctly on Windows.
  • The included net in Appveyor builds is now 703350. This network supports
    MLH, although the default MLH threshold is still 1.0, which means it will
    not trigger without parameter adjustment.
  • New backend option to explicitly override the net details and force MLH
    disabled. If you weren't going to use MLH anyway, this may give a tiny nps
    increase.
  • New flag --show-movesleft (or UCI_ShowMovesLeft for UCI hosts that
    support it) will cause movesleft (in moves) to be reported in the UCI info
    messages. It only works with networks that have MLH enabled; see the
    example after this list.
  • More sensible default values for MLH are in. Note that threshold is still
    1.0 by default, so that will still need to be configured to enable it.
  • The smooth-experimental time manager has been renamed to smooth, and
    support was added to increase search time whenever the move with the best N
    (most visits) does not correspond to the move with the best utility
    estimate. legacy remains the default for now, as smooth has only been tuned
    for short time controls and evidence suggests it doesn't scale with these
    defaults.
  • Selfplay mode now supports a logfile parameter just like normal mode.
  • Reinstated the 4 billion visit limit on search to avoid overflowing counters
    and causing very strange behavior to occur.
  • Performance optimization to make the tree walk faster by ensuring that node
    edges are always sorted by policy. This has some very small side effects,
    since tiebreaks in search are no longer always dominated by move generation
    order.
  • Appveyor built blas and Android binaries now default to minibatch size 1
    and prefetch 0, which should be much better than the normal GPU optimized
    defaults. Note this only affects Appveyor built binaries.
  • The included client in Windows Appveyor releases is now v27 and is named
    lc0-training-client.exe instead of client.exe.
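
As an illustration of some of the options above, movesleft reporting and an nps cap can be combined on a single command line; the values shown are placeholders, not recommendations:

    lc0 --show-movesleft=true --nps-limit=10000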

v0.25.1

30 Apr 08:54
  • Fixed some issues with the cudnn backend on the GTX 16xx models, and also
    for low-memory devices with large network files where the new optimizations
    could result in out-of-memory errors.
  • Added a workaround for a cutechess issue where reporting depth 0 during
    instamoves causes it to ignore our info message.

v0.25.0

28 Apr 11:23

A few small updates since RC2. Lots of new stuff in this release; take a look at the RC1/RC2 release notes for details.

  • Relax strictness for complete standard FENs in UCI and opening books. The
    FEN must still be standard, but default values will be substituted for
    sections that are missing.
  • Restore some backwards compatibility in cudnn backends that was lost with
    the addition of the new convolution implementation. It is also on by
    default for more scenarios, although still off for fp16 on RTX GPUs.
  • Small logic fix for nps smoothing in the new optional experimental time
    manager.