Dynamo Release v0.2.0 · ai-dynamo/dynamo

Dynamo is an open source project released under the Apache 2.0 license. It is distributed primarily as pip wheels with minimal binary size. The ai-dynamo GitHub org hosts two repos: dynamo and NIXL. Dynamo is designed as a next-generation inference server that builds on the foundations of the Triton Inference Server. Triton focuses on single-node inference deployments, and we are committed to integrating its robust single-node capabilities into Dynamo within the next several months. We will maintain ongoing support for Triton while ensuring a seamless migration path for existing users to Dynamo once feature parity is achieved. As a vendor-agnostic serving framework, Dynamo supports multiple LLM inference engines, including TRT-LLM, vLLM, and SGLang, with varying degrees of maturity and support.

Dynamo v0.2.0 features:

GB200 support with ARM builds (Note: currently requires a container build)
Planner - new experimental support for spinning workers up and down based on load
Improved K8s deployment workflow
- Installation wizard to enable easy configuration of Dynamo on your Kubernetes cluster
- CLI to manage your operator-based deployments
- Consolidated Custom Resources for Dynamo Deployments
- Documentation improvements (including Minikube guide to installing Dynamo Platform)

Future plans

Dynamo Roadmap

Known Issues

Benchmark guides are still being validated on public cloud instances (GCP / AWS)
Benchmarks on internal clusters show a 15% degradation relative to the results displayed in the summary graphs for multi-node 70B; this is being investigated.
TensorRT-LLM examples do not currently work in this release; fixes are in progress on main.

What's Changed

fix: fix max_local_prefill_length not being printed out in disagg router log by @tedzhouhk in #628
docs: Add instructions to install git lfs by @tanmayv25 in #627
fix: add DYNAMO_HOME env var to vLLM docker image by @nv-anants in #629
fix: Account for Metrics.decode() changes by @rmccorm4 in #619
fix: Update test_report by @pvijayakrish in #641
fix: serviceArgs in config was not getting set for workers by @mohammedabdulwahhab in #640
fix: adding conversion to string for notif id comparison by @nnshah1 in #638
docs: Add documentation for UCX KV cache transfer in TRTLLM by @tanmayv25 in #639
build: Define UCX env var to use NVLink when available by @tanmayv25 in #631
feat: ETCD prefix watcher + python binding + runtime reconfiguration for router and disagg router by @tedzhouhk in #581
fix: dynamo build should work with link syntax by @mohammedabdulwahhab in #646
fix: change trtllm kv_router default block_size to 32 by @ziqif-nv in #642
fix: signal handlers to clean up zombie vllm processes by @ishandhanani in #545
feat: add .devcontainer based off images in container/ by @alec-flowers in #497
fix: devcontainer mounts and vllm c api by @alec-flowers in #663
fix: deploy command should support passing config by @mohammedabdulwahhab in #626
feat(dynamo-run): improve available engines list in --help by @XueSongTap in #664
feat: add dynamoDeployment CR finalizer by @julienmancuso in #623
fix: set correct parent_hash for each kv block when publish kv events by @ziqif-nv in #671
docs: Use the same term for dynamo base image across code snippets and text by @hutm in #670
docs: move deploy docs to docs/guides by @hhzhang16 in #674
fix: frontend and http server signal handling by @alec-flowers in #677
fix: check for resource in pipeline helm chart by @julienmancuso in #687
fix: ensure VLLM_LOGGING_LEVEL=xyz follows DYN_LOG=xyz by @ishandhanani in #692
feat: replace dynamo server with dynamo cloud by @hhzhang16 in #696
feat: base Dynamo docker image improvements and fixes by @hhzhang16 in #658
fix: fix pipeline helm chart by @julienmancuso in #698
docs: Benchmarking guide updates by @kthui in #678
feat: bump vLLM version to v0.8.4 by @ptarasiewiczNV in #690
chore: Replace TRD->Dynamo in llmctl help output by @rmccorm4 in #710
fix: allow for an empty dynamo config file by @hhzhang16 in #712
fix: cli version by @ishandhanani in #716
docs: Remove outdated python-wheels directory reference by @rmccorm4 in #719
fix: direct clients vs dependencies by @ishandhanani in #704
feat: adding dynamo-tokens crate by @ryanolson in #718
fix: bump GAP to r25.03 by @tedzhouhk in #724
feat: make ingress configurable in operator by @julienmancuso in #717
feat: configure logger with detail info by @tlipoca9 in #654
feat: Add disagg skeleton example by @kylehh in #683
fix: dynamo deploy helm chart cleanup by @mohammedabdulwahhab in #727
docs: add dedicated minikube guide by @mohammedabdulwahhab in #735
feat(dynamo-engine-vllm): vllm 0.8.X support by @grahamking in #728
feat: gracefully shutdown endpoint by revoking etcd lease + python binding by @tedzhouhk in #730
fix: Add missing deps for '--framework none' build by @rmccorm4 in #738
chore: Remove TRT-LLM C++ engine in favor of Python one by @grahamking in #747
docs: Support matrix post release. by @pvijayakrish in #736
docs: add aggregated deployment guide for multi-node sized model by @GuanLuo in #713
feat: make the model name to be the same as the HF repo name for dynamo-run by @AndyDai-nv in #749
feat: add additional packages to log filters by @abrarshivani in #752
chore(dynamo-run): Fix echo_core for EOS tokens by @grahamking in #759
feat: add custom lease to worker components by @ishandhanani in #748
chore: Add roadmap to main README.md by @harryskim in #763
feat: MLA disaggregation support to vLLM patch by @ptarasiewiczNV in #745
fix: Fix cancellation flow in python component graph by @pankajroark in #765
fix: give the user ownership permissions of /opt/dynamo/venv by @hhzhang16 in #767
docs: deployment docs improvements by @hhzhang16 in #753
feat: add option to configure separate docker registry for pipelines docker images by @julienmancuso in #744
chore: Update bug report to use dynamo env for collecting environment information by @nv-tusharma in #558
docs: R1 disaggregation guide by @GuanLuo in #720
feat: allow to CRUD dynamo pipelines by @julienmancuso in #761
docs: Custom Backend/Worker Guide by @rmccorm4 in #608
chore: fix arg name in example by @CormickKneey in #770
build: add rust binaries in manylinux image by @nv-anants in #783
feat: remove bento/yatai references by @julienmancuso in #782
docs: add note to use release branch examples by @nv-anants in #793
feat: Add log verbosity level flag to dynamo-run cli by @abrarshivani in #780
feat: rename operator CRDs by @julienmancuso in #795
feat: Add linux aarch64 support to dynamo-run build by @rmccorm4 in #802
fix: Update TRTLLM version and fix disagg workflow by @tanmayv25 in #804
chore: Increase sleep times from 2s -> 30s for startup logs by @rmccorm4 in #807
feat: Warm-up mistral.rs engine to reduce latency on subsequent requests by @abrarshivani in #796
feat: improve dynamo deployment CLI by @hhzhang16 in #798
feat: Add unified x86 / aarch64 (ARM) build for TRTLLM image by @rmccorm4 in #803
feat: remove old bento images by @julienmancuso in #801
refactor: transition CLI to use typer for UX and testing by @ishandhanani in #703
docs: Update README.md by @alec-flowers in #821
feat: remove proxy side car by @julienmancuso in #822
refactor: refactor dynamo serve part-1/N by @biswapanda in #788
chore: Publish Model Deployment Card to NATS by @grahamking in #799
fix: remove dynamo cloud login by @mohammedabdulwahhab in #824
fix: Change default vLLM router to round-robin by @piotrm-nvidia in #597
build: update cudarc dependency to crate version by @nv-anants in #815
feat: add network configuration wizard during platform install by @julienmancuso in #820
fix: add VLLM_KV_CAPI_PATH to vllm dockerfile to make kv routing working by @ziqif-nv in #832
chore: update vllm wheel dependency version by @nv-anants in #828
feat: misc changes while deploying by @hhzhang16 in #831
fix: wrong lease_id by @alec-flowers in #833
chore: bump NIXL version and package versions by @saturley-hall in #836
feat: local planner for 0.2.0 release by @tedzhouhk in #398
feat: Add unified x86 / aarch64 (ARM) build for VLLM image (#839) by @rmccorm4 in #871
refactor: change trtllm example kv routing use python bindings | deal with trtllm partial blocks | trtllm event change (#866) by @ziqif-nv in #877
chore: bump nixl commit to 0.2.0 rc1 by @saturley-hall in #878
fix: manylinux tag in ai-dynamo-vllm wheel (#884) by @nv-anants in #887
fix: add fastapi dependency in pyproject.toml (cherry-pick of #888) by @saturley-hall in #898
chore: update support matrix by @saturley-hall in #880
docs: cherry pick docs fixes for dynamo deploy by @mohammedabdulwahhab in #907
fix: cherry pick fix for VLLM_KV_CAPI_PATH by @nnshah1 in #906
docs: update pythonpath for starting planner (cherry-pick #890) by @saturley-hall in #908

New Contributors

@XueSongTap made their first contribution in #664
@kylehh made their first contribution in #683
@pankajroark made their first contribution in #765
@CormickKneey made their first contribution in #770

Full Changelog: v0.1.1...v0.2.0
