24 Sep 23:37

Ray-2.37.0

Ray Libraries

Ray Data

💫 Enhancements:

Simplify custom metadata provider API (#47575)
Change counts of metrics to rates of metrics (#47236)
Throw exception for non-streaming HF datasets with "override_num_blocks" argument (#47559)
Refactor custom optimizer rules (#47605)

🔨 Fixes:

Remove ineffective retry code in plan_read_op (#47456)
Fix incorrect pending task size if outputs are empty (#47604)

Ray Train

💫 Enhancements:

Update run status and add stack trace to TrainRunInfo (#46875)

Ray Serve

💫 Enhancements:

Allow control of some serve configuration via env vars (#47533)
[serve] Faster detection of dead replicas (#47237)

🔨 Fixes:

[Serve] fix component id logging field (#47609)

RLlib

💫 Enhancements:

New API stack:
- Add restart-failed-env option to EnvRunners. (#47608)
- Offline RL: Store episodes in state form. (#47294)
- Offline RL: Replace GAE in MARWILOfflinePreLearner with GeneralAdvantageEstimation connector in learner pipeline. (#47532)
- Off-policy algos: Add episode sampling to EpisodeReplayBuffer. (#47500)
- RLModule APIs: Add SelfSupervisedLossAPI for RLModules that bring their own loss and InferenceOnlyAPI. (#47581, #47572)

Ray Core

💫 Enhancements:

[aDAG] Allow custom NCCL group for aDAG (#47141)
[aDAG] support buffered input (#47272)
[aDAG] Support multi node multi reader (#47480)
[Core] Make is_gpu, is_actor, root_detached_id fields late bind to workers. (#47212)
[Core] Reconstruct actor to run lineage reconstruction triggered actor task (#47396)
[Core] Optimize GetAllJobInfo API for performance (#47530)

🔨 Fixes:

[aDAG] Fix ranks ordering for custom NCCL group (#47594)

Ray Clusters

📖 Documentation:

[KubeRay] add a guide for deploying vLLM with RayService (#47038)

Thanks

Many thanks to all those who contributed to this release!
@ruisearch42, @andrewsykim, @timkpaine, @rkooo567, @WeichenXu123, @GeneDer, @sword865, @simonsays1980, @angelinalg, @sven1977, @jjyao, @woshiyyya, @aslonnie, @zcin, @omatthew98, @rueian, @khluu, @justinvyu, @bveeramani, @nikitavemuri, @chris-ray-zhang, @liuxsh9, @xingyu-long, @peytondmurray, @rynewang

23 Sep 18:47

khluu

ray-2.36.1

999f766

Ray-2.36.1

Ray Core

🔨 Fixes:

Fix broken dashboard cluster page when there are dead nodes (#47701)
Fix broken dashboard worker page (#47714)

17 Sep 18:30

GeneDer

ray-2.36.0

85d98e1

Ray-2.36.0

Ray Libraries

Ray Data

💫 Enhancements:

Remove limit on number of tasks launched per scheduling step (#47393)
Allow user-defined Exception to be caught. (#47339)

🔨 Fixes:

Display pending actors separately in the progress bar and not count them towards running resources (#46384)
Fix bug where arrow_parquet_args aren't used (#47161)
Skip empty JSON files in read_json() (#47378)
Remove remote call for initializing Datasource in read_datasource() (#47467)
Remove dead from_*_operator modules (#47457)
Release test fixes:
- Add AWS ACCESS_DENIED as retryable exception for multi-node Data+Train benchmarks (#47232)
- Get AWS credentials with boto (#47352)
- Use worker node instead of head node for read_images_comparison_microbenchmark_single_node release test (#47228)

📖 Documentation:

Add docstring to explain Dataset.deserialize_lineage (#47203)
Add a comment explaining the bundling behavior for map_batches with default batch_size (#47433)

Ray Train

💫 Enhancements:

Decouple device-related modules and add Huawei NPU support to Ray Train (#44086)

🔨 Fixes:

Update TORCH_NCCL_ASYNC_ERROR_HANDLING env var (#47292)

📖 Documentation:

Add missing Train public API reference (#47134)

Ray Tune

📖 Documentation:

Add missing Tune public API references (#47138)

Ray Serve

💫 Enhancements:

Mark proxy as unready when its routers are aware of zero replicas (#47002)
Setup default serve logger (#47229)

🔨 Fixes:

Allow get_serve_logs_dir to run outside of Ray's context (#47224)
Use serve logger name for logs in serve (#47205)

📖 Documentation:

[HPU] [Serve] [experimental] Add vllm HPU support in vllm example (#45893)

🏗 Architecture refactoring:

Remove support for nested DeploymentResponses (#47209)

RLlib

🎉 New Features:

New API stack: Add CQL algorithm. (#47000, #47402)
New API stack: Enable GPU and multi-GPU support for DQN/SAC/CQL. (#47179)

💫 Enhancements:

New API stack: Offline RL enhancements: #47195, #47359
Enhance new API stack stability: #46324, #47196, #47245, #47279
Fix large batch size for synchronous algos (e.g. PPO) after EnvRunner failures. (#47356)
Add torch.compile config options to old API stack. (#47340)
Add kwargs to torch.nn.parallel.DistributedDataParallel (#47276)
Enhanced CI stability: #47197, #47249

📖 Documentation:

New API stack example scripts:
- Float16 training example script. (#47362)
- Mixed precision training example script (#47116)
- ModelV2 -> RLModule wrapper for migrating to new API stack. (#47425)
Remove "new API stack experimental" hint from docs. (#47301)

🏗 Architecture refactoring:

Remove 2nd Learner ConnectorV2 pass from PPO (#47401)
Add separate learning rates for policy and alpha to SAC. (#47078)

🔨 Fixes:

Various bug fixes: #47401, #47194, #47259, #47271, #47277, #47382

Ray Core

💫 Enhancements:

[ADAG] Raise proper error message for nccl within the same actor (#47250)
[ADAG] Support multi-read of the same shm channel (#47311)
Log why core worker is not idle during HandleExit (#47300)
Add PREPARED state for placement groups in GCS for better fault tolerance. (#46858)

🔨 Fixes:

Fix ray_unintentional_worker_failures_total to only count unintentional worker failures (#47368)
Fix runtime env race condition when uploading the same package concurrently (#47482)

Dashboard

🔨 Fixes:

Performance optimizations for dashboard backend logic (#47392) (#47367) (#47160) (#47213)
Refactor to simplify dashboard backend logic (#47324)

Docs

💫 Enhancements:

Add sphinx-autobuild and documentation for make local (#47275): speeds up local docs builds with make local.
Add Algolia search to docs (#46477)
Update PyTorch Mnist Training doc for KubeRay 1.2.0 (#47321)
Life-cycle of documentation policy of Ray APIs

Thanks

Many thanks to all those who contributed to this release!
@GeneDer, @Bye-legumes, @nikitavemuri, @kevin85421, @MortalHappiness, @LeoLiao123, @saihaj, @rmcsqrd, @bveeramani, @zcin, @matthewdeng, @raulchen, @mattip, @jjyao, @ruisearch42, @scottjlee, @can-anyscale, @khluu, @aslonnie, @rynewang, @edoakes, @zhanluxianshen, @venkatram-dev, @c21, @allenyin55, @alexeykudinkin, @snehakottapalli, @BitPhinix, @hongchaodeng, @dengwxn, @liuxsh9, @simonsays1980, @peytondmurray, @KepingYan, @bryant1410, @woshiyyya, @sven1977

28 Aug 00:11

khluu

ray-2.35.0

c5d536d

Ray-2.35.0

Notice: Starting from this release, pip install ray[all] will not include ray[cpp], and will not install the respective ray-cpp package. To install everything that includes ray-cpp, one can use pip install ray[cpp-all] instead.

Ray Libraries

Ray Data

🎉 New Features:

Upgrade supported Arrow version from 16 to 17 (#47034)
Add support for reading from Iceberg (#46889)

💫 Enhancements:

Various Progress Bar UX improvements (#46816, #46801, #46826, #46692, #46699, #46974, #46928, #47029, #46924, #47120, #47095, #47106)
Try get size_bytes from metadata and consolidate metadata methods (#46862)
Improve warning message when read task is large (#46942)
Extend API to enable passing sample weights via ray.data.Dataset.to_tf (#45701)
Add a parameter to allow overriding LanceDB scanner options (#46975)
Add failure retry logic for read_lance (#46976)
Clarify warning for reading old Parquet data (#47049)
Move datasource implementations to _internal subpackage (#46825)
Handle logs from tensor extensions (#46943)

🔨 Fixes:

Change type of DataContext.retried_io_errors from tuple to list (#46884)
Make Parquet tests more robust and expose Parquet logic (#46944)
Change pickling log level from warning to debug (#47032)
Add validation for shuffle arg (#47055)
Fix validation bug when size=0 in ActorPoolStrategy (#47072)
Fix exception in async map (#47110)
Fix wrong metrics group for Object Store Memory metrics on Ray Data Dashboard (#47170)
Handle errors in SplitCoordinator when generating a new epoch (#47176)

📖 Documentation:

Auto-gen GroupedData api (#46925)
Fix signature of Rule.plan (#47094)

Ray Train

💫 Enhancements:

[train] Updates to support xgboost==2.1.0 (#46667)
[train] Add hardware stats (#46719)

Ray Tune

🔨 Fixes:

[RLlib; Tune] Fix WandB metric overlap after restore from checkpoint. (#46897)

Ray Serve

💫 Enhancements:

Improved handling of replica death and replica unavailability in deployment handle routers before controller restarts replica (#47008)
Eagerly create routers in proxy for better GCS fault tolerance (#47031)
Immediately send ping in router when receiving new replica set (#47053)

🏗 Architecture refactoring:

Deprecate passing arguments that contain DeploymentResponses in nested objects to downstream deployment handle calls (#46806)

RLlib

🎉 New Features:

Offline RL on the new API stack:
- Record offline data (#46818, #47046, #47133, #47155) and support reading directly from episodes. (#46865)
- RLUnplugged example. (#46792)
- Progress on BC/MARWIL migration: #44970, #47154, #46799
- Progress on CQL migration: #46969, #47105

💫 Enhancements:

Add ObservationPreprocessor (ConnectorV2). (#47077)

🔨 Fixes:

New API stack: Fix IMPALA/APPO + LSTM for single- and multi-GPU. (#47132, #47158)
Various bug fixes: #46898, #47047, #46963, #47021, #46897
Add more control to Algorithm.add_module/policy methods. (#46932, #46836)

📖 Documentation:

Example scripts for new API stack:
- Curiosity (inverse dynamics model-based) RLModule example. (#46841)
- Add example script for Env with protobuf observation space. (#47071)
New API stack documentation:
- Cleanup old API stack docs (rllib-dev.rst). (#47172)
- Episodes (SingleAgentEpisode). (#46985)
- Redo rllib-algorithms.rst page. (#46916)

🏗 Architecture refactoring:

Rename MultiAgent...RLModule... into MultiRL...Module for more generality. (#46840)
Add learner_only flag to RLModuleConfig/Spec and simplify creation of RLModule specs from algo-config. (#46900)

Ray Core

💫 Enhancements:

Emit total lineage bytes metrics (#46725)
Adding accelerator type H100 (#46823)
More structured logging in core worker (#46906)
Change all callbacks to use move semantics to save copies. (#46971)
Add ray[adag] option to pip install (#47009)

🔨 Fixes:

Fix dashboard process reporting on windows (#45578)
Fix Ray-on-Spark cluster crashing bug when user cancels cell execution (#46899)
Fix PinExistingReturnObject segfault by passing owner_address (#46973)
Fix raylet CHECK failure from runtime env creation failure. (#46991)
Fix typo in memray command (#47006)
[ADAG] Fix for asyncio outputs (#46845)

📖 Documentation:

Clarify behavior of placement_group_capture_child_tasks in docs (#46885)
Update ray.available_resources() docstring (#47018)

🏗 Architecture refactoring:

Async APIs for the New GcsClient. (#46788)
Replace GCS stubs in the dashboard to use NewGcsAioClient. (#46846)

Dashboard

💫 Enhancements:

Polish and minor improvements to the Serve page (#46811)

🔨 Fixes:

Fix CPU/GPU/RAM not being reported correctly on Windows (#44578)

Docs

💫 Enhancements:

Add more information about developer tooling for docs contributions (#46636), including esbonio section

🔨 Fixes:

Use PyData Sphinx theme version switcher (#46936)

Thanks

Many thanks to all those who contributed to this release!
@simonsays1980, @bveeramani, @tungh2, @zcin, @xingyu-long, @WeichenXu123, @aslonnie, @MaxVanDijck, @can-anyscale, @galenhwang, @omatthew98, @matthewdeng, @raulchen, @sven1977, @shrekris-anyscale, @deepyaman, @alexeykudinkin, @stephanie-wang, @kevin85421, @ruisearch42, @hongchaodeng, @khluu, @alanwguo, @hongpeng-guo, @saihaj, @Superskyyy, @tespent, @slfan1989, @justinvyu, @rynewang, @nikitavemuri, @amogkam, @mattip, @dev-goyal, @ryanaoleary, @peytondmurray, @edoakes, @venkatajagannath, @jjyao, @cristianjd, @scottjlee, @Bye-legumes

31 Jul 18:02

can-anyscale

ray-2.34.0

fc87217

Release 2.34.0 Notes

Ray Libraries

Ray Data

💫 Enhancements:

Add better support for UDF returns from list of datetime objects (#46762)

🔨 Fixes:

Remove read task warning if size bytes not set in metadata (#46765)

📖 Documentation:

Fix read_tfrecords() docstring to display tfx-bsl tip (#46717)
Update Dataset.zip() docs (#46757)

Ray Train

🔨 Fixes:

Sort workers by node ID rather than by node IP (#46163)

🏗 Architecture refactoring:

Remove dead RayDatasetSpec (#46764)

RLlib

🎉 New Features:

Offline RL support on new API stack:
- Initial design for Ray-Data based offline RL Algos (on new API stack). (#44969)
- Add user-defined schemas for data loading. (#46738)
- Make data pipeline better configurable and tuneable for users. (#46777)

💫 Enhancements:

Move DQN into the TargetNetworkAPI (and deprecate RLModuleWithTargetNetworksInterface). (#46752)

🔨 Fixes:

Numpy version fix: Rename all np.product usage to np.prod (#46317)

📖 Documentation:

Examples for new API stack: Add 2 (count-based) curiosity examples. (#46737)
Remove RLlib CLI from docs (soon to be deprecated and replaced by python API). (#46724)

🏗 Architecture refactoring:

Cleanup, rename, clarify: Algorithm.workers/evaluation_workers, local_worker(), etc. (#46726)

Ray Core

🏗 Architecture refactoring:

New python GcsClient binding (#46186)

Many thanks to all those who contributed to this release! @KyleKoon, @ruisearch42, @rynewang, @sven1977, @saihaj, @aslonnie, @bveeramani, @akshay-anyscale, @kevin85421, @omatthew98, @anyscalesam, @MaxVanDijck, @justinvyu, @simonsays1980, @can-anyscale, @peytondmurray, @scottjlee

25 Jul 20:28

jjyao

ray-2.33.0

914af09

Ray-2.33.0

Ray Libraries

Ray Core

💫 Enhancements:

Add "last exception" to error message when GCS connection fails in ray.init() (#46516)

🔨 Fixes:

Add object back to memory store when object recovery is skipped (#46460)
Task status should start with PENDING_ARGS_AVAIL when retry (#46494)
Fix ObjectFetchTimedOutError (#46562)
Make working_dir support files created before 1980 (#46634)
Allow full path in conda runtime env. (#45550)
Fix worker launch time formatting in state api (#43516)
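
The pre-1980 working_dir fix above touches a limit of the ZIP format: ZIP timestamps cannot represent dates before 1980, and Python's zipfile module raises ValueError for such files by default. The sketch below reproduces that limit with the standard library alone and shows the strict_timestamps=False option, which clamps old timestamps to 1980-01-01. It illustrates the constraint only and does not claim to be how #46634 was implemented; the helper name zip_old_file is hypothetical.

```python
import os
import tempfile
import zipfile


def zip_old_file(strict: bool) -> tuple:
    """Zip a file whose mtime is in 1971 and return the stored date_time."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "old.txt")
        with open(src, "w") as f:
            f.write("legacy data")
        # Set access/modification time to about one year after the Unix epoch.
        one_year = 365 * 24 * 3600
        os.utime(src, (one_year, one_year))

        archive = os.path.join(tmp, "pkg.zip")
        with zipfile.ZipFile(archive, "w", strict_timestamps=strict) as zf:
            zf.write(src, arcname="old.txt")
        with zipfile.ZipFile(archive) as zf:
            return zf.getinfo("old.txt").date_time


# Default (strict) behavior: zipfile refuses the pre-1980 timestamp.
try:
    zip_old_file(strict=True)
    strict_failed = False
except ValueError:
    strict_failed = True

# Relaxed behavior: the timestamp is clamped instead of rejected.
clamped = zip_old_file(strict=False)
print(strict_failed, clamped)
```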

Ray Data

🎉 New Features:

Deprecate Dataset.get_internal_block_refs() (#46455)
Add read API for reading Databricks table with Delta Sharing (#46072)
Add support for objects to Arrow blocks (#45272)

💫 Enhancements:

Change offsets to int64 and change to LargeList for ArrowTensorArray (#45352)
Prevent from_pandas from combining input blocks (#46363)
Update Dataset.count() to avoid unnecessarily keeping BlockRefs in-memory (#46369)
Use Set to fix inefficient iteration over Arrow table columns (#46541)
Add AWS Error UNKNOWN to list of retried write errors (#46646)
Always print traceback for internal exceptions (#46647)
Allow unknown estimate of operator output bundles and ProgressBar totals (#46601)
Improve filesystem retry coverage (#46685)

🔨 Fixes:

Replace lambda mutable default arguments (#46493)

📖 Documentation:

Auto-generate Dataset API documentation (#46557)
Update outdated ExecutionPlan docstring (#46638)

Ray Train

💫 Enhancements:

Update run status and actor status for train runs. (#46395)

🔨 Fixes:

Replace lambda default arguments (#46576)

📖 Documentation:

Add MNIST training using KubeRay doc page (#46123)
Add example of pre-training Llama model on Intel Gaudi (#45459)
Fix tensorflow example by using ScalingConfig (#46565)

Ray Tune

🔨 Fixes:

Replace lambda default arguments (#46596)

Ray Serve

🎉 New Features:

Fully deprecate target_num_ongoing_requests_per_replica and max_concurrent_queries, replaced by target_ongoing_requests and max_ongoing_requests, respectively (#46392 and #46427)
Configure the task launched by the controller to build an application with Serve’s logging config (#46347)
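
As a migration aid for the renamed Serve options, the following sketch rewrites a deployment config dictionary so that max_concurrent_queries becomes max_ongoing_requests and target_num_ongoing_requests_per_replica becomes target_ongoing_requests. The helper migrate_serve_options is hypothetical (it is not part of Ray), and the sketch assumes the first option sits at the deployment level while the second sits inside autoscaling_config.

```python
import copy

# Deprecated name -> replacement.
DEPLOYMENT_RENAMES = {"max_concurrent_queries": "max_ongoing_requests"}
AUTOSCALING_RENAMES = {
    "target_num_ongoing_requests_per_replica": "target_ongoing_requests"
}


def _rename(options: dict, renames: dict) -> None:
    """Rename keys in place; an explicitly set new name wins over the old one."""
    for old, new in renames.items():
        if old in options:
            value = options.pop(old)
            options.setdefault(new, value)


def migrate_serve_options(deployment: dict) -> dict:
    """Return a copy of a deployment config that uses the new option names."""
    migrated = copy.deepcopy(deployment)
    _rename(migrated, DEPLOYMENT_RENAMES)
    autoscaling = migrated.get("autoscaling_config")
    if isinstance(autoscaling, dict):
        _rename(autoscaling, AUTOSCALING_RENAMES)
    return migrated


legacy = {
    "name": "Summarizer",  # hypothetical deployment name
    "max_concurrent_queries": 8,
    "autoscaling_config": {
        "min_replicas": 1,
        "max_replicas": 4,
        "target_num_ongoing_requests_per_replica": 2,
    },
}
migrated = migrate_serve_options(legacy)
print(migrated)
```

The input dictionary is left untouched, so the old and new configs can be diffed before rollout.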

RLlib

💫 Enhancements:

Moving sampling coordination for batch_mode=complete_episodes to synchronous_parallel_sample. (#46321)
Enable complex action spaces with stateful modules. (#46468)

🏗 Architecture refactoring:

Enable multi-learner setup for hybrid stack BC. (#46436)
Introduce Checkpointable API for RLlib components and subcomponents. (#46376)

🔨 Fixes:

Replace Mapping typehint with Dict: #46474

📖 Documentation:

More example scripts for new API stack: two separate optimizers with different learning rates (#46540) and a custom loss function (#46445).

Dashboard

🔨 Fixes:

Task end time showing the incorrect time (#46439)
Events Table rows having really bad spacing (#46701)
UI bugs in the serve dashboard page (#46599)

Thanks

Many thanks to all those who contributed to this release!

@alanwguo, @hongchaodeng, @anyscalesam, @brucebismarck, @bt2513, @woshiyyya, @terraflops1048576, @lorenzoritter, @omrishiv, @davidxia, @cchen777, @nono-Sang, @jackhumphries, @aslonnie, @JoshKarpel, @zjregee, @bveeramani, @khluu, @Superskyyy, @liuxsh9, @jjyao, @ruisearch42, @sven1977, @harborn, @saihaj, @zcin, @can-anyscale, @veekaybee, @chungen04, @WeichenXu123, @GeneDer, @sergey-serebryakov, @Bye-legumes, @scottjlee, @rynewang, @kevin85421, @cristianjd, @peytondmurray, @MortalHappiness, @MaxVanDijck, @simonsays1980, @mjovanovic9999

10 Jul 16:40

aslonnie

ray-2.32.0

607f2f3

Ray-2.32.0

Highlight: aDAG Developer Preview

Ray accelerated DAGs (aDAGs) are a new feature specific to Ray Core.

aDAGs give you a Ray Core-like API that can additionally pre-compile execution paths across pre-allocated resources on a Ray cluster, which may improve throughput and latency. Some practical examples include:
- Up to 10x lower task execution time on single-node.
- Native support for GPU-GPU communication, via NCCL.
This is still very early, but please reach out on #ray-core on Ray Slack to learn more!

Ray Libraries

Ray Data

💫 Enhancements:

Support async callable classes in map_batches() (#46129)
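
To make "async callable class" concrete, here is a minimal stand-alone sketch of such a class: its __call__ is a coroutine that handles the rows of one batch concurrently. The class name AsyncLengthUDF, the "text" column, and the dict-of-lists batch layout are assumptions for illustration. The example is driven with asyncio directly and does not show how Ray Data schedules the coroutine.

```python
import asyncio


class AsyncLengthUDF:
    """Callable class whose __call__ is a coroutine operating on one batch."""

    async def _length(self, text: str) -> int:
        # Stand-in for an awaitable call such as a request to a model server.
        await asyncio.sleep(0)
        return len(text)

    async def __call__(self, batch: dict) -> dict:
        # Process all rows of the batch concurrently.
        lengths = await asyncio.gather(*(self._length(t) for t in batch["text"]))
        return {"text": batch["text"], "length": list(lengths)}


# Driven directly with asyncio here; no Ray cluster is involved.
result = asyncio.run(AsyncLengthUDF()({"text": ["ray", "data"]}))
print(result)
```

With Ray Data, a class of this shape would be passed to map_batches() in place of a synchronous callable class.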

🔨 Fixes:

Ensure InputDataBuffer doesn't free block references (#46191)
MapOperator.num_active_tasks should exclude pending actors (#46364)
Fix progress bars being displayed as partially completed in Jupyter notebooks (#46289)

📖 Documentation:

Fix docs: read_api.py docstring (#45690)
Correct API annotation for tfrecords_datasource (#46171)
Fix broken links in README and in ray.data.Dataset (#45345)

Ray Train

📖 Documentation:

Update PyTorch Data Ingestion User Guide (#45421)

Ray Serve

💫 Enhancements:

Optimize ServeController.get_app_config() (#45878)
Change default for max and target ongoing requests (#45943)
Integrate with Ray structured logging (#46215)
Allow configuring handle cache size and controller max concurrency (#46278)
Optimize DeploymentDetails.deployment_route_prefix_not_set() (#46305)

RLlib

🎉 New Features:

APPO on new API stack (w/ EnvRunners). (#46216)

💫 Enhancements:

Stability: APPO, SAC, and DQN activate multi-agent learning tests (#45542, #46299)
Make Tune trial ID available in EnvRunners (and callbacks). (#46294)
Add env- and agent_steps to custom evaluation function. (#45652)
Remove default-metrics from Algorithm (tune does NOT error anymore if any stop-metric is missing). (#46200)

🔨 Fixes:

Various bug fixes: #45542

📖 Documentation:

Example for new API stack: Offline RL (BC) training on single-agent, while evaluating w/ multi-agent setup. (#46251)
Example for new API stack: Custom RLModule with an LSTM. (#46276)

Ray Core

🎉 New Features:

aDAG Developer Preview.

💫 Enhancements:

Allow env setup logger encoding (#46242)
ray list tasks filter state and name on GCS side (#46270)
Log ray version and ray commit during GCS start (#46341)

🔨 Fixes:

Decrement lineage ref count of an actor when the actor task return object reference is deleted (#46230)
Fix negative ALIVE actors metric and introduce IDLE state (#45718)
psutil process attr num_fds is not available on Windows (#46329)

Dashboard

🎉 New Features:

Added customizable refresh frequency for metrics on Ray Dashboard (#44037)

💫 Enhancements:

Upgraded to MUIv5 and React 18 (#45789)

🔨 Fixes:

Fix for multi-line log items breaking log viewer rendering (#46391)
Fix for UI inconsistency when a job submission creates more than one Ray job. (#46267)
Fix filtering by job id for tasks API not filtering correctly. (#45017)

Docs

🔨 Fixes:

Re-enabled automatic cross-reference link checking for Ray documentation, with Sphinx nitpicky mode (#46279)
Enforced naming conventions for public and private APIs to maintain accuracy, starting with Ray Data API documentation (#46261)

📖 Documentation:

Upgrade Python 3.12 support to alpha: the Ray wheel is now released to PyPI, and the most critical tests have passed a sanity check.

Thanks

Many thanks to all those who contributed to this release!

@stephanie-wang, @MortalHappiness, @aslonnie, @ryanaoleary, @jjyao, @jackhumphries, @nikitavemuri, @woshiyyya, @JoshKarpel, @ruisearch42, @sven1977, @alanwguo, @GeneDer, @saihaj, @raulchen, @liuxsh9, @khluu, @cristianjd, @scottjlee, @bveeramani, @zcin, @simonsays1980, @SumanthRH, @davidxia, @can-anyscale, @peytondmurray, @kevin85421

26 Jun 22:06

khluu

ray-2.31.0

1240d3f

Ray-2.31.0

Ray Libraries

Ray Data

🔨 Fixes:

Fixed bug where preserve_order doesn’t work with file reads (#46135)

📖 Documentation:

Added documentation for dataset.Schema (#46170)

Ray Train

💫 Enhancements:

Add API for Ray Train run stats (#45711)

Ray Tune

💫 Enhancements:

Missing stopping criterion should not error (just warn). (#45613)

📖 Documentation:

Fix broken references in Ray Tune documentation (#45233)

Ray Serve

WARNING: the following default values will change in Ray 2.32:

Default for max_ongoing_requests will change from 100 to 5.
Default for target_ongoing_requests will change from 1 to 2.
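
Deployments that depend on the current defaults can write them out explicitly before upgrading. A minimal sketch, assuming deployment options are held in a plain dictionary (for example, one loaded from a Serve config file): pin_current_defaults is a hypothetical helper, not a Ray API, and it assumes target_ongoing_requests lives under autoscaling_config.

```python
import copy

# Defaults in effect before the change announced above.
CURRENT_MAX_ONGOING_REQUESTS = 100
CURRENT_TARGET_ONGOING_REQUESTS = 1


def pin_current_defaults(deployment: dict) -> dict:
    """Return a copy with the current defaults written out where unset."""
    pinned = copy.deepcopy(deployment)
    pinned.setdefault("max_ongoing_requests", CURRENT_MAX_ONGOING_REQUESTS)
    autoscaling = pinned.get("autoscaling_config")
    if isinstance(autoscaling, dict):
        # Only relevant when autoscaling is configured for the deployment.
        autoscaling.setdefault(
            "target_ongoing_requests", CURRENT_TARGET_ONGOING_REQUESTS
        )
    return pinned


deployment = {
    "name": "Translator",  # hypothetical deployment name
    "autoscaling_config": {"min_replicas": 1, "max_replicas": 3},
}
pinned = pin_current_defaults(deployment)
print(pinned)
```

Values that are already set explicitly are kept, so the helper only affects options that would otherwise pick up the new defaults.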

💫 Enhancements:

Optimize DeploymentStateManager.get_deployment_statuses (#45872)

🔨 Fixes:

Fix logging error on passing traceback object into exc_info (#46105)
Run __del__ even if constructor is still in-progress (#45882)
Spread replicas with custom resources in torch tune serve release test (#46093)
[1k release test] don't run replicas on head node (#46130)

📖 Documentation:

Remove todo since issue is fixed (#45941)

RLlib

🎉 New Features:

IMPALA runs on the new API stack (with EnvRunners and ConnectorV2s). (#42085)
SAC/DQN: Prioritized multi-agent episode replay buffer. (#45576)

💫 Enhancements:

New API stack stability: Add systematic CI learning tests for all possible combinations of: [PPO|IMPALA] + [1CPU|2CPU|1GPU|2GPU] + [single-agent|multi-agent]. (#46162, #46161)

📖 Documentation:

New API stack: Example script for action masking (#46146)
New API stack: PyFlight example script cleanup (#45956)
Old API stack: Enhanced ONNX example (+LSTM). (#43592)

Ray Core and Ray Clusters

Ray Core

💫 Enhancements:

[runtime-env] automatically infer worker path when starting worker in container (#42304)

🔨 Fixes:

On GCS restart, destroy unused workers instead of forgetting them, which fixes PG leaks. (#45854)
Cancel lease requests before returning a PG bundle (#45919)
Fix boost fiber stack overflow (#46133)

Thanks

Many thanks to all those who contributed to this release!

@jjyao, @kevin85421, @vincent-pli, @khluu, @simonsays1980, @sven1977, @rynewang, @can-anyscale, @richardsliu, @jackhumphries, @alexeykudinkin, @bveeramani, @ruisearch42, @shrekris-anyscale, @stephanie-wang, @matthewdeng, @zcin, @hongchaodeng, @ryanaoleary, @liuxsh9, @GeneDer, @aslonnie, @peytondmurray, @Bye-legumes, @woshiyyya, @scottjlee, @JoshKarpel

20 Jun 23:08

can-anyscale

ray-2.30.0

97c3729

Ray-2.30.0

Ray Libraries

Ray Data

💫 Enhancements:

Improve fractional CPU/GPU formatting (#45673)
Use sampled fragments to estimate Parquet reader batch size (#45749)
Refactoring ParquetDatasource and metadata fetching logic (#45728, #45727, #45733, #45734, #45767)
Refactor planner.py (#45706)

Ray Tune

💫 Enhancements:

Change the behavior of a missing stopping criterion metric to warn instead of raising an error. This enables the use case of reporting different sets of metrics on different iterations (ex: a separate set of training and validation metrics). (#45613)
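
The behavior change can be pictured with a small stand-alone sketch. This is not Tune's implementation: should_stop is a hypothetical function, and it assumes a stop condition of the form {metric: threshold} that triggers once the reported value reaches the threshold.

```python
import warnings


def should_stop(result: dict, stop: dict) -> bool:
    """Return True if any stop metric present in `result` reached its threshold.

    A metric missing from `result` produces a warning instead of an error,
    so different iterations may report different sets of metrics.
    """
    for metric, threshold in stop.items():
        if metric not in result:
            warnings.warn(f"Stop metric {metric!r} missing from result; skipping it.")
            continue
        if result[metric] >= threshold:
            return True
    return False


stop = {"val_accuracy": 0.9}

# Training-only iteration: the validation metric is absent, so only a warning is issued.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    stopped_on_train = should_stop({"train_loss": 0.42}, stop)

# Validation iteration: the metric is present and has reached the threshold.
stopped_on_val = should_stop({"val_accuracy": 0.93}, stop)
print(stopped_on_train, len(caught), stopped_on_val)
```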

Ray Serve

💫 Enhancements:

Create internal request id to track request objects (#45761)

RLlib

💫 Enhancements:

Stability: DreamerV3 weekly release test (#45654); Add "official" benchmark script for Atari PPO benchmarks. (#45697)
Enhance env-rendering callback (#45682)

🔨 Fixes:

Bug fix in new MetricsLogger API: EMA stats w/o window would lead to infinite list mem-leak. (#45752)
Various other bug fixes: (#45819, #45820, #45683, #45651, #45753)
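
The MetricsLogger fix listed above concerns EMA statistics without a window, whose value list grew without bound. An exponential moving average needs only its running value, as the stand-alone sketch below shows. EmaStat is a hypothetical class, not RLlib's implementation, and the update rule ema = coeff * value + (1 - coeff) * ema is one common convention assumed here.

```python
class EmaStat:
    """Exponential moving average that stores one float, not a history list."""

    def __init__(self, coeff: float = 0.01):
        if not 0.0 < coeff <= 1.0:
            raise ValueError("coeff must be in (0, 1]")
        self.coeff = coeff
        self.value = None  # no observations yet

    def push(self, x: float) -> float:
        # The first observation initializes the average; later ones blend in.
        if self.value is None:
            self.value = float(x)
        else:
            self.value = self.coeff * x + (1.0 - self.coeff) * self.value
        return self.value


stat = EmaStat(coeff=0.5)
for reward in [1.0, 3.0, 5.0]:
    stat.push(reward)
print(stat.value)
```

Memory use stays constant no matter how many values are pushed, which is the property a windowless EMA statistic should have.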

📖 Documentation:

Re-do examples overview page (new API stack): #45382
- PyFlyt QuadX WayPoints example #44758, #45956
- RLModule inference on new API stack (#45831, #45845)
- How to resume a tune.Tuner.fit() experiment from checkpoint. (#45681)
- Custom RLModule (tiny CNN): #45774
- Connector examples docstrings (#45864)
Old API stack examples: #43592, #45829

Ray Core

🎉 New Features:

Alpha release of job-level logging configuration: users can now configure user logging to use the logfmt format with logging context attached. (#45344)

💫 Enhancements:

Integrate amdsmi in AMDAcceleratorManager (#44572)

🔨 Fixes:

Fix the C++ GcsClient Del not respecting del_by_prefix (#45604)
Fix exit handling of FiberState threads (#45834)

Dashboard

💫 Enhancements:

Parse out json logs (#45853)

Many thanks to all those who contributed to this release: @liuxsh9, @peytondmurray, @pcmoritz, @GeneDer, @saihaj, @khluu, @aslonnie, @yucai, @vickytsang, @can-anyscale, @bthananjeyan, @raulchen, @hongchaodeng, @x13n, @simonsays1980, @peterghaddad, @kevin85421, @rynewang, @angelinalg, @jjyao, @BenWilson2, @jackhumphries, @zcin, @chris-ray-zhang, @c21, @shrekris-anyscale, @alanwguo, @stephanie-wang, @Bye-legumes, @sven1977, @WeichenXu123, @bveeramani, @nikitavemuri

06 Jun 18:16

khluu

ray-2.24.0

cfea8b2

Ray-2.24.0

Ray Libraries

Ray Data

🎉 New Features:

Allow user to configure timeout for actor pool (#45508)
Add override_num_blocks to from_pandas and perform auto-partition (#44937)
Upgrade Arrow version to 16 in CI (#45565)

💫 Enhancements:

Clarify that num_rows_per_file isn't strict (#45529)
Record more telemetry for newly added datasources (#45647)
Avoid pickling LanceFragment when creating read tasks for Lance (#45392)

Ray Train

📖 Documentation:

[HPU] Add example of Stable Diffusion fine-tuning and serving on Intel Gaudi (#45217)
[HPU] Add example of Llama-2 fine-tuning on Intel Gaudi (#44667)

Ray Tune

🏗 Architecture refactoring:

Improve excessive syncing warning and deprecate TUNE_RESULT_DIR, RAY_AIR_LOCAL_CACHE_DIR, local_dir (#45210)

Ray Serve

💫 Enhancements:

Clean up Serve proxy files (#45486)

📖 Documentation:

vllm example to serve llm models (#45430)

RLlib

💫 Enhancements:

DreamerV3 on tf: Bug fix, so it can run again with tf==2.11.1 (2.11.0 is not available anymore) (#45419); Added weekly release test for DreamerV3.
Added support for multi-agent off-policy algorithms (DQN and SAC) in the new API stack (#45182)
Config option for APPO/IMPALA to change number of GPU-loader threads (#45467)

🔨 Fixes:

Various MetricsLogger bug fixes (#45543, #45585, #45575)
Other fixes: #45588, #45617, #45517, #45465

📖 Documentation:

Example script for new API stack: How-to restore 1 of n agents from a checkpoint. (#45462)
Example script for new API stack: Autoregressive action module. (#45525)

Ray Core

💫 Enhancements:

Improve node death observability (#45320, #45357, #45533, #45644, #45497)
Ray C++ backend structured logging (#44468)

🔨 Fixes:

Fix worker crash when getting actor name from runtime context (#45194)
Log deduplication should not deduplicate number-only lines (#45385)

📖 Documentation:

Improve doc for --object-store-memory to describe how the default value is set (#45301)

Dashboard

🔨 Fixes:

Move Job package uploading to another thread to unblock the event loop. (#45282)

Many thanks to all those who contributed to this release: @maxliuofficial, @simonsays1980, @GeneDer, @dudeperf3ct, @khluu, @justinvyu, @andrewsykim, @Catch-Bull, @zcin, @bveeramani, @rynewang, @angelinalg, @matthewdeng, @jjyao, @kira-lin, @harborn, @hongchaodeng, @peytondmurray, @aslonnie, @timkpaine, @982945902, @maxpumperla, @stephanie-wang, @ruisearch42, @alanwguo, @can-anyscale, @c21, @Atry, @KamenShah, @sven1977, @raulchen

Releases: ray-project/ray
