diff --git a/.github/workflows/scala-steward.yml b/.github/workflows/scala-steward.yml
index 813b265b270..d9b77266b62 100644
--- a/.github/workflows/scala-steward.yml
+++ b/.github/workflows/scala-steward.yml
@@ -19,6 +19,6 @@ jobs:
       java-version: '17'
       distribution: 'temurin'
-      - uses: scala-steward-org/scala-steward-action@v2.71.0
+      - uses: scala-steward-org/scala-steward-action@v2.72.0
       with:
         mill-version: 0.12.1
diff --git a/blog/antora.yml b/blog/antora.yml
new file mode 100644
index 00000000000..fa746cc85ef
--- /dev/null
+++ b/blog/antora.yml
@@ -0,0 +1,8 @@
+name: blog
+title: Mill Blog
+version: ~
+nav:
+  - modules/ROOT/nav.adoc
+asciidoc:
+  attributes:
+    mill-version: dummy-mill-version
diff --git a/blog/modules/ROOT/nav.adoc b/blog/modules/ROOT/nav.adoc
new file mode 100644
index 00000000000..c090b81378c
--- /dev/null
+++ b/blog/modules/ROOT/nav.adoc
@@ -0,0 +1,3 @@
+
+* xref:2-monorepo-build-tool.adoc[]
+* xref:1-java-compile.adoc[]
diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/blog/modules/ROOT/pages/1-java-compile.adoc
similarity index 95%
rename from docs/modules/ROOT/pages/comparisons/java-compile.adoc
rename to blog/modules/ROOT/pages/1-java-compile.adoc
index 1ca666f7dda..578fbdb7f81 100644
--- a/docs/modules/ROOT/pages/comparisons/java-compile.adoc
+++ b/blog/modules/ROOT/pages/1-java-compile.adoc
@@ -1,6 +1,13 @@
+// tag::header[]
+
 # How Fast Does Java Compile?
+:page-aliases: xref:mill:ROOT:comparisons/java-compile.adoc
+
+:author: Li Haoyi
+:revdate: 29 November 2024
+_{author}, {revdate}_

-include::partial$gtag-config.adoc[]
+include::mill:ROOT:partial$gtag-config.adoc[]

 Java compiles have the reputation for being slow, but that reputation does not
 match today's reality. Nowadays the Java compiler can compile "typical" Java code at over
@@ -24,6 +31,8 @@ all build tools fall short of how fast compiling Java _should_ be. This post explores how
 these numbers were arrived at, and what that means in un-tapped potential for
 Java build tooling to become truly great.

+// end::header[]
+
 ## Mockito Core

 To begin to understand the problem, let's consider the codebase of the popular Mockito project:
@@ -291,8 +300,8 @@ Tabulating this all together gives us the table we saw at the start of this page
 | Gradle | 4.41s | 9,400 | 15.2x
 | Maven | 4.89s | 6,100 | 16.9x
 |===
-We explore the comparison between xref:comparisons/gradle.adoc[Gradle vs Mill]
-or xref:comparisons/maven.adoc[Maven vs Mill] in more detail on their own dedicated pages.
+We explore the comparison between xref:mill:ROOT:comparisons/gradle.adoc[Gradle vs Mill]
+or xref:mill:ROOT:comparisons/maven.adoc[Maven vs Mill] in more detail on their own dedicated pages.
 For this article, the important thing is not comparing the build tools
 against each other, but comparing the build tools against how fast they _could_ be
 if they just used the `javac` Java compiler directly. And it's clear that compared to the actual work
@@ -307,8 +316,8 @@ we explored above only deals with compiling a single small module.
But a similar *Clean Compile* benchmark which compiles the entire Mockito and Netty projects
on a single core shows similar numbers for the various build tools:

-* xref:comparisons/gradle.adoc#_sequential_clean_compile_all[Gradle compiling 100,000 lines of Java at ~5,600 lines/s]
-* xref:comparisons/maven.adoc#_sequential_clean_compile_all[Maven compiling 500,000 lines of Java at ~5,100 lines/s]
+* xref:mill:ROOT:comparisons/gradle.adoc#_sequential_clean_compile_all[Gradle compiling 100,000 lines of Java at ~5,600 lines/s]
+* xref:mill:ROOT:comparisons/maven.adoc#_sequential_clean_compile_all[Maven compiling 500,000 lines of Java at ~5,100 lines/s]
 * Mill compiling at ~25,000 lines/s on both the above whole-project benchmarks

All of these are far below the 100,000 lines/s that we should expect from Java compilation,
@@ -342,7 +351,7 @@ compiling Java_ is pure overhead. Checking for cache invalidation _shouldn't_ take
 as long as actually compiling your code. I mean it obviously does _today_, but it _shouldn't_!

 The Mill build tool goes to great lengths to try and minimize overhead, and already gets
-xref:comparisons/why-mill.adoc#_performance[~4x faster builds] than Maven or Gradle on
+xref:mill:ROOT:comparisons/why-mill.adoc#_performance[~4x faster builds] than Maven or Gradle on
 real-world projects like Mockito or Netty. But there still is a long way to go to give Java
 developers the fast, snappy experience that the underlying Java platform can provide.
 If Java build and compile times are things you find important, you should try out Mill on
diff --git a/blog/modules/ROOT/pages/2-monorepo-build-tool.adoc b/blog/modules/ROOT/pages/2-monorepo-build-tool.adoc
new file mode 100644
index 00000000000..899c41e8ed4
--- /dev/null
+++ b/blog/modules/ROOT/pages/2-monorepo-build-tool.adoc
@@ -0,0 +1,281 @@
+// tag::header[]
+
+# Why Use a Monorepo Build Tool?
+
+
+:author: Li Haoyi
+:revdate: 17 December 2024
+_{author}, {revdate}_
+
+include::mill:ROOT:partial$gtag-config.adoc[]
+
+
+Software build tools mostly fall into two categories:
+
+1. Single-language build tools, e.g.
+   https://maven.apache.org/[Maven] (Java), https://python-poetry.org/[Poetry] (Python),
+   https://doc.rust-lang.org/cargo/[Cargo] (Rust)
+
+2. Monorepo build tools targeting large codebases, e.g. https://bazel.build/[Bazel],
+   https://www.pantsbuild.org/[Pants], https://buck.build/[Buck], and https://mill-build.org/[Mill]
+
+One question that comes up constantly is: why do people use monorepo build tools? Tools
+like Bazel are orders of magnitude more complicated and harder to use than tools
+like Poetry or Cargo, so why do people use them at all?
+https://knowyourmeme.com/memes/is-he-stupid-is-she-smart-are-they-stupid[Are they stupid?]
+
+It turns out that monorepo build tools like Bazel or Mill do a lot of non-obvious things that
+other build tools don't, things that become important in larger codebases (100-10,000 active developers).
+These features are generally irrelevant for smaller projects, which explains why most people
+do not miss them. But past a certain size of codebase and engineering organization these
+features become critical. We'll explore some of the core features of "Monorepo Build Tools"
+below, from the perspective of Bazel (which I am familiar with) and Mill (which this
+technical blog is about).
+
+// end::header[]
+
+## Support for Multiple Languages
+
+While small software projects usually start in one programming language, larger ones
+inevitably grow more heterogeneous over time.
For example, you may be building a Go binary
+and a Rust library which are both used in a
+Python executable, which is then tested using a
+Bash script and deployed as part of a Java backend server. The Java backend server may also
+serve a front-end web interface compiled from Typescript, and the whole deployment again
+tested using Selenium in Python or Playwright in Javascript.
+
+The reality of working in any large codebase and organization is that such Rube Goldberg
+code paths _do_ happen on a regular basis, and any monorepo build tool has to accommodate them.
+If the build tool does not accommodate multiple languages, then what ends up happening is that
+you have lots of small build tools wired together. Taking the example above,
+you may have:
+
+- A simple Maven build for your backend server
+- A simple Webpack build for the Web frontend
+- A simple Poetry build for your Python executable
+- A simple Cargo build for your Rust library
+- A simple Go build for your Go binary
+
+Although each tool does its job, none of those tools are sufficient to build/test/deploy
+your project! Instead, you end up with a `bin/` or `build/` folder full of `.sh` scripts
+that wire up these simpler per-language build tools in ad-hoc ways. And while the individual
+language-specific build tools may be clean and simple, the rat's nest of shell scripts that
+you also need usually ends up being a mess that is impossible to work with.
+
+That is why monorepo build tools like Bazel and Mill try to be _language agnostic_.
+Although they may come with some built-in functionality (e.g. Bazel comes with C/C++/Java
+support built in, Mill comes with Java/Scala/Kotlin), monorepo build tools must make
+extending them to support additional languages _easy_: Bazel via its ecosystem
+of `rules_*` libraries, Mill via its extensibility APIs, which make it easy to
+implement your own support for additional languages like
+xref:mill:ROOT:extending/example-python-support.adoc[Python] or
+xref:mill:ROOT:extending/example-typescript-support.adoc[Typescript]. That means that when
+the built-in language support runs out - which is inevitable in large growing monorepos -
+the user can smoothly extend the build tool to keep going rather than falling back to
+ad-hoc shell scripts.
+
+## Support for Custom Build Tasks
+
+As projects get large, they also get more unique. Every hello-world Java or Python or
+Javascript project looks about the same, but larger projects start having unusual
+requirements that no one else does, for example:
+
+- Invoking a bespoke code-generator to integrate with your company's internal RPC system
+
+- Generating custom deployment artifact formats to support that one legacy datacenter you
+  need to get your code running in
+
+- Downloading third-party dependency sources, patching them, and building them from source
+  to work around issues that you have fixed but not yet succeeded in upstreaming
+
+- Compiling the compiler you need to compile the rest of your codebase, again perhaps
+  to make use of compiler bugfixes that you have not yet managed to get into an upstream release.
+
+The happy paths in build tools are usually great, and the slightly-off-the-beaten-path
+workflows usually have third-party plugins supporting them: things like linting, generating
+docker containers, and so on. But any growing software organization quickly finds itself
+with build-time use cases that nobody else in the world has.
At that point the paved paths
+have ended, and the build engineers will need to implement the custom build tasks themselves.
+
+Every build tool allows some degree of customization, but how easy and flexible that
+customization is differs from tool to tool. e.g. a build tool like Maven requires its plugins to fit into
+a very restricted Build Lifecycle (https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html[link]):
+this is good when compiling Java source code is all you need to do, but can be problematic when
+you need to do something more far afield. The alternative is the aforementioned rat's nest
+of shell scripts - either wrapping or wrapped by the traditional build tools - that implement
+the custom build tasks you require.
+
+That is why monorepo build tools like Bazel and Mill make it easy to write custom tasks. In
+Bazel a custom task is just a https://bazel.build/reference/be/general#genrule[genrule()], in Mill
+just `def foo = Task { ... }` with a block of code doing what you need,
+and you can even xref:mill:ROOT:extending/import-ivy-plugins.adoc[use any third-party JVM library]
+you are already familiar with as part of your custom task. This helps ensure your custom
+tasks are written in concise type-checked code with automatic caching and parallelism,
+which are all things that are lacking if you start implementing your logic outside of
+your build tool in ad-hoc scripts.
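+
+For a sense of what this looks like in Mill, here is a minimal sketch of such a custom
+code-generation task. The module and task names (`backend`, `generateRpcStubs`) and the
+`./rpc-gen` binary are hypothetical, invented for illustration; `Task`, `Task.dest`,
+`generatedSources`, and `PathRef` are real Mill APIs:
+
+```scala
+// build.mill (sketch, assuming a hypothetical ./rpc-gen stub generator)
+package build
+import mill._, javalib._
+
+object backend extends JavaModule {
+  /** Custom task invoking a bespoke RPC stub generator */
+  def generateRpcStubs = Task {
+    // Task.dest is this task's private output directory under out/
+    os.proc("./rpc-gen", "--out", Task.dest.toString).call()
+    PathRef(Task.dest)
+  }
+
+  /** Wire the generated stubs in as additional sources to compile */
+  override def generatedSources = Task {
+    super.generatedSources() ++ Seq(generateRpcStubs())
+  }
+}
+```
+
+Because this is an ordinary Mill task, it gets cached and parallelized just like the
+built-in ones, with no extra effort from the author.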
+
+## Automatically Caching and Parallelizing Everything
+
+In most build tools, caching is opt-in, so the core build/compile tasks usually end up getting
+cached but everything else is not, and ends up being wastefully recomputed all the time. In
+monorepo build tools like Bazel or Mill, everything is cached and everything is parallelized.
+Even tests are cached, so if you run a test twice on the same code and inputs (transitively),
+the second time it is skipped.
+
+The importance of caching and parallelism grows together with the codebase:
+
+- For smaller codebases, you do not need to cache or parallelize at all: compilation and
+  testing are fast enough that you can just run them every time from a clean slate
+  without inconvenience
+
+- For medium-sized codebases, caching and parallelizing the slowest tasks (e.g. compilation
+  or testing) is usually enough. Most build tools have some support for manually opting-in to
+  some kind of caching or parallelization framework, and although you will likely miss out
+  on many "ad-hoc" build tasks that still run un-cached and sequentially, those are few
+  enough not to matter
+
+- For large codebases, you want everything to be cached and parallelized. Caching just the
+  "core" build tasks is no longer enough, and any non-cached or non-parallel build task
+  results in noticeable slowness and inconvenience.
+
+Take ad-hoc source code generation as an example: a small codebase may not have any. A
+medium-sized codebase may have some, but little enough that it doesn't matter if it runs
+sequentially un-cached. But a large codebase may have multiple RPC IDL
+code generators (e.g. https://protobuf.dev/[protobuf], https://thrift.apache.org/[thrift]),
+static resource pre-processors, and other custom tasks, and not caching and parallelizing
+these causes visible slowdowns and inconvenience.
+
+In monorepo build tools like Mill or Bazel, caching and parallelism are automatic and
+enabled by default. That means that it doesn't matter what you are running - whether
+it's the core compilation workflows or some ad-hoc custom tasks - you always get the
+benefits of caching and parallelization to keep your build system fast and responsive.
+
+## Seamless Remote Caching
+
+"Remote caching" means I can compile something on my laptop, and you can download it to your
+laptop for use. "Seamless" means I don't need to do anything to get this behavior - no manual
+commands to upload and download - so the distribution of build outputs from my laptop to
+yours happens completely automatically.
+
+This also applies to tests: e.g. if `TestFoo` was run in CI on master, and I then pull
+master and run all tests without changing the `Foo` code, `TestFoo` is skipped and uses the
+CI result.
+
+Bazel, Pants, and many other monorepo build tools support this out of the box, with
+open source back-end servers such as https://github.com/buchgr/bazel-remote[Bazel Remote].
+The clients and servers all conform to a https://github.com/bazelbuild/remote-apis[standardized
+protocol], so you can easily drop in a new server or new build client and have it work
+with all your existing infrastructure. Mill does not yet support remote caching, but there
+are some https://github.com/com-lihaoyi/mill/pull/2777[prototypes] and
+https://github.com/com-lihaoyi/mill/pull/4065[work in progress] that will hopefully
+add support in the not-too-distant future.
+
+## Remote Execution
+
+"Remote execution" means that I can run "compile" on my laptop and have it automatically
+happen in the cloud on 96 core machines, or I run a lot of tests (e.g. after a big refactor)
+on my laptop and it seamlessly gets farmed out to run 1024x parallel on a large
+compute cluster.
+
+Remote execution is valuable for two reasons:
+
+1. *Better Parallelism*:
+   The largest cloud machines you can get are typically around 96 cores, whereas if you farm
+   out the execution to a cluster you can easily run on 1024 or more cores in parallel
+
+2. *Better Utilization*: e.g. if you
+   give every individual a 96-core devbox, most of the time when they are not actively running
+   anything (e.g. they are thinking, typing, talking to someone, etc.) those 96 cores are
+   completely idle. It's not unusual for utilization on devboxes to be <1%, while you are still
+   paying for the other 99% of idle CPU time. In contrast, an auto-scaling remote execution
+   cluster can spin down machines that are not in use, and achieve >50% utilization rates
+
+One surprising thing is that remote execution can be both faster _and_ cheaper than running
+things locally on a laptop or devbox! Running 256 cores for 1 minute doesn't cause any more
+cloud spending than running 16 cores for 16 minutes, even though the former finishes 16x
+faster! And due to the improved utilization from remote execution clusters, the total savings
+can be significant.
+
+Monorepo build tools like Bazel, Pants, and Buck all support remote execution out of the box.
+Mill does not support it, which means it might not be suitable for the largest monorepos
+with >10,000 active developers.
+
+## Dependency-Based Test Selection
+
+When using Bazel to build a large project, you can use `bazel query` to determine the possible
+targets and tests affected by a code change, allowing you to easily set up pull-request validation
+to only run tests downstream of a PR diff and skip unrelated ones. The Mill build tool also supports
+this, as xref:mill:ROOT:large/selective-execution.adoc[Selective Execution], letting you snapshot
+your code before and after a code change and only run tasks that are downstream of those changes.
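+
+Concretely, this is a two-step workflow: snapshot the task inputs on the base revision, then
+after applying the code change, ask the build tool to run only what was invalidated. A sketch
+of what that looks like with Mill's selective execution commands:
+
+```bash
+# On the base revision: record a snapshot of all task inputs
+> ./mill selective.prepare __.test
+
+# ...apply the code change under test, e.g. check out the PR branch...
+
+# Run only the test tasks downstream of the inputs that actually changed
+> ./mill selective.run __.test
+```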
+
+Fundamentally, running "all tests" in CI is wasteful when you know from the build tool
+that only some tests are relevant to the code change being tested. If every pull request always
+runs every single test in a monorepo, then it's natural for PR validation times to grow unbounded
+as the monorepo grows. Sooner or later this will start causing issues.
+
+Any large codebase that doesn't use a monorepo build tool ends up re-inventing this manually, e.g.
+consider this code in apache/spark that re-implements this in a Python script that wraps
+`mvn` or `sbt` (https://github.com/apache/spark/blob/290b4b31bae2e02b648d2c5ef61183f337b18f8f/dev/sparktestsupport/modules.py#L108-L126[link]).
+With a proper monorepo build tool, such functionality comes for free out-of-the-box with better
+precision and correctness than anything you could hack together manually.
+
+## Build Task Sandboxing
+
+There are two kinds of sandboxing that monorepo build tools like Bazel do:
+
+1. *Semantic sandboxing*: this ensures your build tasks do not make use of undeclared files,
+   or write to places on disk that can affect other tasks. In most build tools, this
+   kind of mistake results in confusing nondeterministic parallelism and cache invalidation
+   problems down the road, where e.g. your build step may rely on a file on disk but not realize
+   it needs to re-compute when the file changes. In Bazel, these misconfigurations result in a
+   deterministic error up front, enforced via a https://bazel.build/docs/sandboxing[variety of mechanisms]
+   (e.g. https://en.wikipedia.org/wiki/Cgroups[CGroups] on Linux,
+   https://www.chromium.org/developers/design-documents/sandbox/osx-sandboxing-design/[Seatbelt Sandboxes] on macOS).
+
+2. *Resource sandboxing*: Bazel also has the ability to limit CPU/Memory usage
+   (https://github.com/bazelbuild/bazel/pull/21322[link]), which eliminates the noisy neighbour
+   problem and ensures a build step or test gets the same compute footprint whether run alone
+   during development or 96x parallel on a CI worker.
+   Otherwise it's common for tests to pass when run alone during manual development, then time out
+   or OOM when run in CI under resource pressure from other tests hogging the CPU or RAM.
+
+Both kinds of sandboxing have the same goal: to make sure your build tasks behave the same
+way no matter whether they are run sequentially or in parallel with one another. Even Bazel's
+sandboxes aren't 100% hermetic, but they are hermetic enough to catch most of these mistakes
+in practice.
+
+xref:mill:ROOT:depth/sandboxing.adoc[The Mill build tool's sandboxing] is less powerful
+than Bazel's CGroup/Seatbelt sandboxes, and simply runs tasks and subprocesses in
+sandbox directories to try and limit cross-task interference. But it has the same goal
+of adding best-effort guardrails to mitigate race conditions and non-determinism.
+
+## Who Needs Monorepo Build Tools?
+
+Most small projects never need the features listed above: small projects build quickly
+without any optimizations, use a single language toolchain without customization, and
+any bugs related to non-determinism or resource footprint can usually be investigated
+and dealt with manually. Any missing build-tool features can be papered over with shell
+scripts.
+
+That is how every small project starts, and as most small projects never grow big you
+can go quite a distance without needing anything more. While the features above would be
+nice to have, they are _wants_ rather than _needs_.
+
+But once in a while, a project _does_ grow large. Sometimes the rocket-ship really _does_
+take off! In such cases, as the number of developers grows from 1 to 10 to 1,000,
+you will inevitably start feeling pain:
+
+1. Local build times slowing to a crawl on your laptop, using 1 out of 16 available CPUs
+2. Pull-request validation taking 4 hours to run mostly-unnecessary tests with a 50% flake rate
+3. An unmaintainable multi-layer jungle of shell, Python, and Make scripts layered on
+   top of your classic build tools like Maven/Poetry/Cargo, that everyone knows should be
+   cleaned up but nobody knows how.
+
+Monorepo build tools bring performance optimizations to
+bring down CI times, sandboxing improvements to reduce flakiness, and a structured way
+of replacing the ubiquitous folder-full-of-bash-scripts. It is these features that really
+let a codebase _scale_, allowing you to grow your developer team from 100 to 1,000 developers
+and beyond without everything grinding to a halt. That is why people use "monorepo build tools"
+like Mill (most suitable for projects with 10-1,000 active developers) or Bazel
+(most suitable for larger projects with 100-10,000 active developers).
+
diff --git a/blog/modules/ROOT/pages/index.adoc b/blog/modules/ROOT/pages/index.adoc
new file mode 100644
index 00000000000..42571d4fd2f
--- /dev/null
+++ b/blog/modules/ROOT/pages/index.adoc
@@ -0,0 +1,17 @@
+# Mill Build Tool Engineering Blog
+
+include::mill:ROOT:partial$gtag-config.adoc[]
+
+
+Welcome to the Mill development blog! This page contains posts and articles on
+technical topics related to the development of the Mill build tool. These cover
+JVM languages and monorepo build tooling.
+
+include::2-monorepo-build-tool.adoc[tag=header,leveloffset=1]
+
+xref:2-monorepo-build-tool.adoc[Read More...]
+
+include::1-java-compile.adoc[tag=header,leveloffset=1]
+
+xref:1-java-compile.adoc[Read More...]
+
diff --git a/docs/modules/ROOT/nav.adoc b/docs/modules/ROOT/nav.adoc
index 61d65edf34b..095c8975a0b 100644
--- a/docs/modules/ROOT/nav.adoc
+++ b/docs/modules/ROOT/nav.adoc
@@ -41,7 +41,6 @@
 ** xref:comparisons/gradle.adoc[]
 ** xref:comparisons/sbt.adoc[]
 ** xref:comparisons/unique.adoc[]
-** xref:comparisons/java-compile.adoc[]
 * The Mill CLI
 ** xref:cli/installation-ide.adoc[]
 ** xref:cli/flags.adoc[]
@@ -102,7 +101,8 @@
 * Mill In Depth
 ** xref:depth/sandboxing.adoc[]
-** xref:depth/evaluation-model.adoc[]
+** xref:depth/execution-model.adoc[]
+** xref:depth/process-architecture.adoc[]
 ** xref:depth/design-principles.adoc[]
 ** xref:depth/why-scala.adoc[]
 // Reference pages that a typical user would not typically read top-to-bottom,
diff --git a/docs/modules/ROOT/pages/comparisons/gradle.adoc b/docs/modules/ROOT/pages/comparisons/gradle.adoc
index 9458b1bf9f9..e65794657d4 100644
--- a/docs/modules/ROOT/pages/comparisons/gradle.adoc
+++ b/docs/modules/ROOT/pages/comparisons/gradle.adoc
@@ -18,7 +18,7 @@ does so with much less fixed overhead. This means less time waiting for your build
 tool, and more time for the things that really matter to your project.

 * **Mill enforces best practices by default**.
-xref:depth/evaluation-model.adoc#_caching_at_each_layer_of_the_evaluation_model[All parts of a Mill build are cached and incremental by default].
+xref:depth/execution-model.adoc#_caching_in_mill[All parts of a Mill build are cached and incremental by default]. All Mill tasks write their output to xref:fundamentals/out-dir.adoc[a standard place]. All task inter-dependencies are automatically captured without manual annotation. Where Gradle requires considerable effort and expertise to understand your build and set it up in the right way, Mill's diff --git a/docs/modules/ROOT/pages/comparisons/unique.adoc b/docs/modules/ROOT/pages/comparisons/unique.adoc index b174510ba67..a760cd9c2c9 100644 --- a/docs/modules/ROOT/pages/comparisons/unique.adoc +++ b/docs/modules/ROOT/pages/comparisons/unique.adoc @@ -480,7 +480,7 @@ functionality, but at 10% the complexity: * Bazel itself is not getting any simpler over time - instead is getting more complex with additional features and functionality, as tends to happen to projects over time -Mill provides many of the same things Bazel does: automatic xref:depth/evaluation-model.adoc[caching], +Mill provides many of the same things Bazel does: automatic xref:depth/execution-model.adoc[caching], parallelization, xref:depth/sandboxing.adoc[sandboxing], xref:extending/import-ivy-plugins.adoc[extensibility]. Mill can already work with a wide variety of programming languages, diff --git a/docs/modules/ROOT/pages/depth/evaluation-model.adoc b/docs/modules/ROOT/pages/depth/evaluation-model.adoc deleted file mode 100644 index 4919b8bb802..00000000000 --- a/docs/modules/ROOT/pages/depth/evaluation-model.adoc +++ /dev/null @@ -1,147 +0,0 @@ -= The Mill Evaluation Model -:page-aliases: The_Mill_Evaluation_Model.adoc - -include::partial$gtag-config.adoc[] - -Evaluating a Mill task typically goes through the following phases: - -1. *Compilation*: Mill compiles the `build.mill` to classfiles, following the -<<_the_mill_bootstrapping_process>> to eventually produce a `RootModule` object - -2. *Resolution*: Mill resolves the list of xref:fundamentals/tasks.adoc[] given from the command line, - e.g. `resolve _` or `foo.compile` or `{bar,qux}.__.test`, to a list of - concrete `Task` objects nested on xref:fundamentals/modules.adoc[] within the `RootModule` along - with their transitive dependencies - - * In the process, the relevant Mill ``Module``s are lazily instantiated - -3. *Evaluation*: Mill evaluates the gathered ``Task``s in dependency-order, - either serially or in parallel - -== Limitations of the Mill Evaluation Model - -This three-phase evaluation model has consequences for how you structure your -build. For example: - -1. You can have arbitrary code outside of ``Task``s that helps - set up your task graph and module hierarchy, e.g. computing what keys exist - in a `Cross` module, or specifying your `def moduleDeps` - -2. You can have arbitrary code inside of ``Task``s, to perform your build - actions - -3. *But* your code inside of ``Task``s cannot influence the shape of the task - graph or module hierarchy, as all *Resolving* and *Planning* happens first - *before* any ``Task``s are evaluated. - -This should not be a problem for most builds, but it is something to be aware -of. In general, we have found that having "two places" to put code - outside of -``Task``s to run during *Planning* or inside of ``Task``s to run during -*Evaluation* - is generally enough flexibility for most use cases. 
- -The hard boundary between these two phases is what lets users easily query -and visualize their module hierarchy and task graph without running them: using -xref:cli/builtin-commands.adoc#_inspect[inspect], xref:cli/builtin-commands.adoc#_plan[plan], -xref:cli/builtin-commands.adoc#_visualize[visualize], etc.. This helps keep your -Mill build discoverable even as the `build.mill` codebase grows. - -== Caching at Each Layer of the Evaluation Model - -Apart from fine-grained caching of ``Task``s during *Evaluation*, Mill also -performs incremental evaluation of the other phases. This helps ensure -the overall workflow remains fast even for large projects: - -1. *Compilation*: - - * Done on-demand and incrementally using the Scala - incremental compiler https://github.com/sbt/zinc[Zinc]. - - * If some of the files `build.mill` imported changed but not others, only the - changed files are re-compiled before the `RootModule` is re-instantiated - - * In the common case where `build.mill` was not changed at all, this step is - skipped entirely and the `RootModule` object simply re-used from the last - run. - -2. *Planning*: - - * If the `RootModule` was re-used, then all - previously-instantiated modules are simply-re-used - -3. *Evaluation*: - - * ``Task``s are evaluated in dependency order - - * xref:fundamentals/tasks.adoc#_cached_tasks[Cached Task]s only re-evaluate if their input ``Task``s - change. - - * xref:fundamentals/tasks.adoc#_persistent_tasks[Task.Persistent]s preserve the `Task.dest` folder on disk between runs, - allowing for finer-grained caching than Mill's default task-by-task - caching and invalidation - - * xref:fundamentals/tasks.adoc#_workers[Task.Worker]s are kept in-memory between runs where possible, and only - invalidated if their input ``Task``s change as well. - - * ``Task``s in general are invalidated if the code they depend on changes, - at a method-level granularity via callgraph reachability analysis. See - https://github.com/com-lihaoyi/mill/pull/2417[#2417] for more details - -This approach to caching does assume a certain programming style inside your -Mill build: we may-or-may-not re-instantiate the modules in your -`build.mill` and we may-or-may-not re-execute any particular task depending on caching, -but your code needs to work either way. Furthermore, task ``def``s and module `object`s in your -build are instantiated lazily on-demand, and your code needs to work regardless -of which order they are executed in. For code written in a typical Scala style, -which tends to avoid side effects, this is not a problem at all. - -One thing to note is for code that runs during *Resolution*: any reading of -external mutable state needs to be wrapped in an `interp.watchValue{...}` -wrapper. This ensures that Mill knows where these external reads are, so that -it can check if their value changed and if so re-instantiate `RootModule` with -the new value. - -== The Mill Bootstrapping Process - -Mill's bootstrapping proceeds roughly in the following phases: - -1. If using the bootstrap script, it first checks if the right version of Mill -is already present, and if not it downloads it to `~/.mill/download` - -2. It instantiates an in-memory `MillBuildRootModule.BootstrapModule`, -which is a hard-coded `build.mill` used for bootstrapping Mill - -3. If there is a meta-build present `mill-build/build.mill`, it processes that -first and uses the `MillBuildRootModule` returned for the next steps. -Otherwise it uses the `MillBuildRootModule.BootstrapModule` directly - -4. 
Mill evaluates the `MillBuildRootModule` to parse the `build.mill`, generate
-a list of `ivyDeps` as well as appropriately wrapped Scala code that we can
-compile, and compiles it to classfiles
-
-5. Mill loads the compiled classfiles of the `build.mill` into a
-`java.lang.ClassLoader` to access it's `RootModule`
-
-Everything earlier in the doc applies to each level of meta-builds in the
-Mill bootstrapping process as well.
-
-In general, `.sc` files, `import $file`, and `import $ivy` can be thought of as
-a short-hand for configuring the meta-build living in `mill-build/build.mill`:
-
-1. `.sc` and `import $file` are a shorthand for specifying the `.scala` files
-   living in `mill-build/src/`
-
-2. `import $ivy` is a short-hand for configuring the `def ivyDeps` in
-   `mill-build/build.mill`
-
-Most builds would not need the flexibility of a meta-build's
-`mill-build/build.mill`, but it is there if necessary.
-
-Mill supports multiple levels of meta-builds for bootstrapping:
-
-- Just `build.mill`
-- One level of meta-builds: `mill-build/build.mill` and `build.mill`
-- Two level of meta-builds: `mill-build/mill-build/build.mill`,
-  `mill-build/build.mill` and `build.mill`
-
-xref:extending/meta-build.adoc[The Mill Meta Build] works through a simple use case
-and example for meta-builds.
\ No newline at end of file
diff --git a/docs/modules/ROOT/pages/depth/execution-model.adoc b/docs/modules/ROOT/pages/depth/execution-model.adoc
new file mode 100644
index 00000000000..fb67118b27f
--- /dev/null
+++ b/docs/modules/ROOT/pages/depth/execution-model.adoc
@@ -0,0 +1,426 @@
+= The Mill Execution Model
:page-aliases: The_Mill_Evaluation_Model.adoc, depth/evaluation-model.adoc

include::partial$gtag-config.adoc[]

This page does a deep dive on how Mill evaluates your build tasks, so you can better understand
what Mill is doing behind the scenes when building your project.

## Example Project

For the purposes of this article, we will be using the following example build
as the basis for discussion:

```scala
// build.mill
package build
import mill._, javalib._

object foo extends JavaModule {}

object bar extends JavaModule {
  def moduleDeps = Seq(foo)

  /** Total number of lines in module source files */
  def lineCount = Task {
    allSourceFiles().map(f => os.read.lines(f.path).size).sum
  }

  /** Generate resources using lineCount of sources */
  override def resources = Task {
    os.write(Task.dest / "line-count.txt", "" + lineCount())
    Seq(PathRef(Task.dest))
  }
}
```

This is a simple two-module build with two ``JavaModule``s, one depending on the other.
There is a custom task `bar.lineCount`, whose output is used by the overridden `bar.resources`
to replace the default `resources/` folder with a generated resource file for use at runtime,
as a simple example of xref:javalib/intro.adoc#_custom_build_logic[Custom Build Logic].

This expects the source layout:

```
foo/
  src/
    *.java files
  package.mill (optional)
bar/
  src/
    *.java files
  package.mill (optional)
build.mill
```

You can operate on this build via commands such as

```bash
> ./mill bar.compile

> ./mill foo.run

> ./mill _.assembly # evaluates both foo.assembly and bar.assembly
```

In the rest of this article, we will consider what happens when you run
`./mill _.assembly` on the above example codebase.
## Primary Phases

### Compilation

Initial `.mill` build files:

```bash
bar/
  package.mill #optional
foo/
  package.mill #optional
build.mill
```

This stage involves compiling your `build.mill` and any
xref:large/multi-file-builds.adoc[subfolder package.mill files] into JVM classfiles.
Mill build files are xref:depth/why-scala.adoc[written in Scala], so this is done
using the normal Mill Scala compilation toolchain (`mill.scalalib.ScalaModule`), with
some minor pre-processing to turn `.mill` files into valid `.scala` files.

Compilation of your build is _global_ but _incremental_: running any `./mill` command
requires that you compile all `build.mill` and `package.mill` files in your entire
project, which can take some time the first time you run a `./mill` command in a project.
However, once that is done, updates to any `.mill` file are re-compiled incrementally,
such that updates can happen relatively quickly even in large projects.

After compilation, the `.mill` files are converted into JVM classfiles as shown below:

```bash
bar/
  package.class
foo/
  package.class
build.class
```

These classfiles are dynamically loaded into the Mill process and instantiated into
a concrete Mill `RootModule` object, which is then used in the subsequent phases below:

### Resolution

Resolution converts the Mill xref:cli/query-syntax.adoc[task selector] ``_.assembly`` given
from the command line into the list of xref:fundamentals/tasks.adoc[] to execute. This explores
the `build` and `package` files generated in the <<Compilation>> step above, instantiates the
xref:fundamentals/modules.adoc[Modules] and xref:fundamentals/tasks.adoc[Tasks] as necessary,
and returns the list of final tasks that were selected by the selector:

```graphviz
digraph G {
  node [shape=box width=0 height=0 style=filled fillcolor=white]
  bgcolor=transparent
  newrank=true;
  build -> foo -> "foo.assembly"
  build -> bar -> "bar.assembly"
}
```

Mill starts from the `RootModule` instantiated after <<Compilation>>, and uses
Java reflection to walk the tree of modules and tasks to find the tasks that match
your given selector.

Task and module resolution is _lazy_, so only modules that are required by the given
selector `_.assembly` are instantiated. This can help keep task resolution fast even
when working within a large codebase by avoiding instantiation of modules that are
unrelated to the selector you are running.
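
You can inspect the result of this phase yourself using Mill's builtin `resolve` command,
which performs resolution without executing anything; for the example build above, the
output would look something like this:

```bash
> ./mill resolve _.assembly
foo.assembly
bar.assembly
```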
### Planning

Planning is the step of turning the tasks selected during <<Resolution>> into a full
build graph that includes all transitive upstream dependencies. This is done by
traversing the graph of task dependencies, and generates a (simplified) task graph
as shown below:

```graphviz
digraph G {
  rankdir=LR
  node [shape=box width=0 height=0 style=filled fillcolor=white]
  bgcolor=transparent
  newrank=true;
  subgraph cluster_0 {
    style=dashed
    node [shape=box width=0 height=0 style=filled fillcolor=white]
    label = "foo";

    "foo.sources" -> "foo.compile" -> "foo.classPath" -> "foo.assembly"
    "foo.resources" -> "foo.assembly"
    "foo.classPath"
  }
  subgraph cluster_1 {
    style=dashed
    node [shape=box width=0 height=0 style=filled fillcolor=white]
    label = "bar";

    "bar.sources" -> "bar.compile" -> "bar.classPath" -> "bar.assembly"

    "bar.sources" -> "bar.lineCount" -> "bar.resources" -> "bar.assembly"
  }
  "foo.classPath" -> "bar.compile" [constraint=false]
  "foo.classPath" -> "bar.classPath"
}
```

In this graph, we can see that even though <<Resolution>> only selected `foo.assembly`
and `bar.assembly`, their upstream task graph requires tasks such as `foo.compile`,
`bar.compile`, as well as our custom task `bar.lineCount` and our override of `bar.resources`.

### Evaluation

The last phase is evaluation. Evaluation depends not only on the tasks you selected at the
command line and those discovered during <<Planning>>, but also on what input files changed
on disk. Tasks that were not affected by input
changes may have their value loaded from cache (if already evaluated earlier) or skipped entirely
(e.g. due to xref:large/selective-execution.adoc[Selective Execution]).

For example, a change to `foo/src/*.java` would affect the `foo.sources` task, which
would invalidate and cause evaluation of the tasks highlighted in red below:

```graphviz
digraph G {
  rankdir=LR
  node [shape=box width=0 height=0 style=filled fillcolor=white]
  bgcolor=transparent
  newrank=true;
  subgraph cluster_0 {
    style=dashed
    node [shape=box width=0 height=0 style=filled fillcolor=white]
    label = "foo";

    "foo.sources" -> "foo.compile" -> "foo.classPath" -> "foo.assembly" [color=red, penwidth=2]
    "foo.resources" -> "foo.assembly"
    "foo.classPath"
    "foo.sources" [color=red, penwidth=2]

    "foo.assembly" [color=red, penwidth=2]
    "foo.compile" [color=red, penwidth=2]
    "foo.classPath" [color=red, penwidth=2]
  }
  subgraph cluster_1 {
    style=dashed
    node [shape=box width=0 height=0 style=filled fillcolor=white]
    label = "bar";

    "bar.sources" -> "bar.compile" -> "bar.classPath"
    "bar.classPath" -> "bar.assembly" [color=red, penwidth=2]

    "bar.classPath" [color=red, penwidth=2]
    "bar.assembly" [color=red, penwidth=2]
    "bar.sources" -> "bar.lineCount" -> "bar.resources" -> "bar.assembly"
  }
  "foo.classPath" -> "bar.compile" [constraint=false]
  "foo.classPath" -> "bar.classPath" [color=red, penwidth=2]
}
```

On the other hand, a change to `bar/src/*.java` would affect the `bar.sources` task, which
would invalidate and cause evaluation of the tasks highlighted in red below:

```graphviz
digraph G {
  rankdir=LR
  node [shape=box width=0 height=0 style=filled fillcolor=white]
  bgcolor=transparent
  newrank=true;
  subgraph cluster_0 {
    style=dashed
    node [shape=box width=0 height=0 style=filled fillcolor=white]
    label = "foo";

    "foo.sources" -> "foo.compile" -> "foo.classPath" -> "foo.assembly"
    "foo.resources" -> "foo.assembly"
    "foo.classPath"
  }
  subgraph cluster_1 {
    style=dashed
    node [shape=box width=0 height=0 style=filled fillcolor=white]
    label = "bar";

    "bar.sources" -> "bar.compile" -> "bar.classPath" -> "bar.assembly" [color=red, penwidth=2]

    "bar.sources" [color=red, penwidth=2]
    "bar.lineCount" [color=red, penwidth=2]
    "bar.resources" [color=red, penwidth=2]
    "bar.assembly" [color=red, penwidth=2]
    "bar.compile" [color=red, penwidth=2]
    "bar.classPath" [color=red, penwidth=2]
    "bar.sources" -> "bar.lineCount" -> "bar.resources" -> "bar.assembly" [color=red, penwidth=2]
  }
  "foo.classPath" -> "bar.compile" [constraint=false]
  "foo.classPath" -> "bar.classPath"
}
```

In the example changing `bar/src/*.java`, Mill may also take the opportunity to parallelize
things:

- `bar.compile` and `bar.classPath` can run on a separate thread from `bar.lineCount` and `bar.resources`

- `bar.assembly` must wait for both `bar.classPath` and `bar.resources` to complete before proceeding.

This parallelization is automatically done by Mill, and requires no effort from the user to enable.
The exact parallelism may depend on the number of CPU cores available and exactly when each task
starts and how long it takes to run, but Mill will generally parallelize things where possible
to minimize the time taken to execute your tasks.

Some other things to note:

- Tasks have their metadata cached to xref:fundamentals/out-dir.adoc#_task_json[.json] files
  in the xref:fundamentals/out-dir.adoc[out/ folder], with any files created by the task cached
  in xref:fundamentals/out-dir.adoc#_task_dest[.dest/] folders. These file paths are all
  automatically assigned by Mill.

- Mill treats builtin tasks (e.g. `compile`) and user-defined tasks (e.g. `lineCount`) exactly the same.
  Both get automatically cached or skipped when not needed, and parallelized where possible.
  This happens without the task author needing to do anything to enable caching or parallelization.

- Mill evaluation does not care about the _module_ structure of `foo` and `bar`. Mill modules are
  simply a way to define and re-use parts of the task graph, but it is the task graph that matters
  during evaluation.

## Bootstrapping

One part of the Mill execution model that is skimmed over above is what happens before
*Compilation*: how does Mill actually get everything necessary to compile your `build.mill`
and `package.mill` files? This is called bootstrapping, and proceeds roughly in the following phases:

1. Mill's xref:cli/installation-ide.adoc#_bootstrap_scripts[bootstrap script] first checks
   if the right version of Mill is already present, and if not it downloads the assembly jar
   to `~/.mill/download`

2. Mill instantiates an in-memory `MillBuildRootModule.BootstrapModule`,
   which is a hard-coded `build.mill` used for bootstrapping Mill

3. If there is a xref:extending/meta-build.adoc[meta-build] present at `mill-build/build.mill`, Mill processes that
   first and uses the `MillBuildRootModule` returned for the next steps.
   Otherwise it uses the `MillBuildRootModule.BootstrapModule` directly

4. Mill evaluates the `MillBuildRootModule` to parse the `build.mill`, generate
   a list of `ivyDeps` as well as appropriately wrapped Scala code that we can
   compile, and compiles it to classfiles (<<Compilation>> above)

For most users, you do not need to care about the details of the Mill bootstrapping
process, except to know that you only need a JVM installed to begin with and
Mill will download everything necessary from the standard Maven Central package repository
starting from just the bootstrap script (available as `./mill` for Linux/Mac and `./mill.bat`
for Windows).
The documentation for xref:extending/meta-build.adoc[The Mill Meta Build]
goes into more detail on how you can configure and make use of it.

== Consequences of the Mill Execution Model

This four-phase execution model has consequences for how you structure your
build. For example:

1. You can have arbitrary code outside of ``Task``s that helps
   set up your task graph and module hierarchy, e.g. computing what keys exist
   in a `Cross` module, or specifying your `def moduleDeps`. This code runs
   during <<Resolution>>

2. You can have arbitrary code inside of ``Task``s, to perform your build
   actions. This code runs during <<Evaluation>>

3. *But* your code inside of ``Task``s cannot influence the shape of the task
   graph or module hierarchy, as all <<Resolution>> and <<Planning>> logic happens first
   *before* any <<Evaluation>> of the ``Task`` bodies.

This should not be a problem for most builds, but it is something to be aware
of. In general, we have found that having "two places" to put code - outside of
``Task``s to run during <<Resolution>> or inside of ``Task``s to run during
<<Evaluation>> - is generally enough flexibility for most use cases. You
can generally just write the "direct style" business logic you need - in the example
above, counting the lines in `allSourceFiles` - and Mill handles all the caching,
invalidation, and parallelism for you without any additional work.

The hard boundary between these two phases is what lets users easily query
and visualize their module hierarchy and task graph without running them: using
xref:cli/builtin-commands.adoc#_inspect[inspect], xref:cli/builtin-commands.adoc#_plan[plan],
xref:cli/builtin-commands.adoc#_visualize[visualize], etc. This helps keep your
Mill build discoverable even as the `build.mill` codebase grows.
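
To make the two phases concrete, here is a minimal sketch of which code runs when. The
module name, task name, and version list are hypothetical, invented for this example;
`Cross`, `CrossScalaModule`, and `Task` are real Mill APIs:

```scala
// build.mill (sketch) - module-shaping code vs. task bodies
package build
import mill._, scalalib._

// Runs during Resolution: ordinary code that shapes the module hierarchy,
// here computing the keys of a `Cross` module
val scalaVersions = Seq("2.13.15", "3.5.2")

object foo extends Cross[FooModule](scalaVersions)
trait FooModule extends CrossScalaModule {
  // The body of this task runs during Evaluation, and only when needed
  def lineCount = Task {
    allSourceFiles().map(f => os.read.lines(f.path).size).sum
  }
}
```

The `scalaVersions` list can be computed by arbitrary code, but it is fixed before any
task runs; conversely, `lineCount` can do arbitrary work, but it cannot add or remove modules.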
See + https://github.com/com-lihaoyi/mill/pull/2417[#2417] for more details + +This approach to caching does assume a certain programming style inside your +Mill build: + +- Mill may-or-may-not instantiate the modules in your `build.mill` the first time + you run something (due to laziness) + +- Mill may-or-may-not *re*-instantiate the modules in your `build.mill` in subsequent runs + (due to caching) + +- Mill may-or-may-not re-execute any particular task depending on caching, + but your code needs to work either way. + +- Execution of any task may-or-may-not happen in parallel with other unrelated + tasks, and may happen in arbitrary order + +Your build code code needs to work regardless of which order they are executed in. +However, for code written in a typical Scala style (which tends to avoid side effects), +and limits filesystem operations to the `Task.dest` folder, this is not a problem at all. + +One thing to note is for code that runs during *Resolution*: any reading of +external mutable state needs to be wrapped in an `interp.watchValue{...}` +wrapper. This ensures that Mill knows where these external reads are, so that +it can check if their value changed and if so re-instantiate `RootModule` with +the new value. diff --git a/docs/modules/ROOT/pages/depth/process-architecture.adoc b/docs/modules/ROOT/pages/depth/process-architecture.adoc new file mode 100644 index 00000000000..436be147912 --- /dev/null +++ b/docs/modules/ROOT/pages/depth/process-architecture.adoc @@ -0,0 +1,167 @@ += The Mill Process Architecture + +include::partial$gtag-config.adoc[] + +This page goes into detail of how the Mill process and application is structured. +At a high-level, a simplified version of the main components and data-flows within +a running Mill process is shown below: + +```graphviz +digraph G { + rankdir=LR + node [shape=box width=0 height=0 style=filled fillcolor=white] + bgcolor=transparent + + "client-stdin" [penwidth=0] + "client-stdout" [penwidth=0] + "client-stderr" [penwidth=0] + "client-exit" [penwidth=0] + "client-args" [penwidth=0] + subgraph cluster_client { + label = "mill client"; + "Socket" + "MillClientMain" + } + "client-stdin" -> "Socket" + "client-stderr" -> "Socket" [dir=back] + "client-stdout" -> "Socket" [dir=back] + "client-args" -> "MillClientMain" + "client-exit" -> "MillClientMain" [dir=back] + "MillClientMain" -> "runArgs" + subgraph cluster_out { + label = "out/"; + + + subgraph cluster_mill_server_folder { + label = "mill-server/"; + "socketPort" [penwidth=0] + "exitCode" [penwidth=0] + "runArgs" [penwidth=0] + } + subgraph cluster_out_foo_folder { + label = "foo/"; + "compile.json" [penwidth=0] + "compile.dest" [penwidth=0] + "assembly.json" [penwidth=0] + "assembly.dest" [penwidth=0] + + } + } + + + subgraph cluster_server { + label = "mill server"; + "PromptLogger" + "MillServerMain" + "Evaluator" + "ServerSocket" + + "server-stdout" [penwidth=0] + "server-stderr" [penwidth=0] + subgraph cluster_classloder { + label = "URLClassLoader"; + subgraph cluster_build { + style=dashed + label = "build"; + subgraph cluster_foo { + style=dashed + label = "foo"; + + "foo.sources" -> "foo.compile" -> "foo.classPath" -> "foo.assembly" + "foo.resources" -> "foo.assembly" + "foo.classPath" + } + } + + } + } + + + "runArgs" -> "MillServerMain" + "MillServerMain" -> "Evaluator" [dir=both] + "ServerSocket" -> "PromptLogger" [dir=back] + "exitCode" -> "MillServerMain" [dir=back] + "MillClientMain" -> "exitCode" [dir=back] + "Socket" -> "socketPort" [dir=both] + "socketPort" -> 
"ServerSocket" [dir=both] + + "PromptLogger" -> "server-stderr" [dir=back] + "PromptLogger" -> "server-stdout" [dir=back] + "compile.dest" -> "foo.compile" [dir=both] + "compile.json" -> "foo.compile" [dir=both] + + "assembly.dest" -> "foo.assembly" [dir=both] + "assembly.json" -> "foo.assembly" [dir=both] +} +``` + + +== The Mill Client + +The Mill client is a small Java application that is responsible for launching +and delegating work to the Mill server, a long-lived process. Each `./mill` +command spawns a new Mill client, but generally re-uses the same Mill server where +possible in order to reduce startup overhead and to allow the Mill server +process to warm up and provide good performance + +* The Mill client takes all the inputs of a typical command-line application - +stdin and command-line arguments - and proxies them to the long-lived Mill +server process. + +* It then takes the outputs from the Mill server - stdout, stderr, +and finally the exitcode - and proxies those back to the calling process or terminal. + +In this way, the Mill client acts and behaves for most all intents and purposes +as a normal CLI application, except it is really a thin wrapper around logic that +is actually running in the long-lived Mill server. + +The Mill server sometimes is shut down and needs to be restarted, e.g. if Mill +version changed, or the user used `Ctrl-C` to interrupt the ongoing computation. +In such a scenario, the Mill client will automatically restart the server the next +time it is run, so apart from a slight performance penalty from starting a "cold" +Mill server such shutdowns and restarts should be mostly invisibl to the user. + +== The Mill Server + +The Mill server is a long-lived process that the Mill client spawns. +Only one Mill server should be running in a codebase at a time, and each server +takes a filelock at startup time to enforce this mutual exclusion. + +The Mill server compiles your `build.mill` and `package.mill`, spawns a +`URLClassLoader` containing the compiled classfiles, and uses that to instantiate +the variousxref:fundamentals/modules.adoc[] and xref:fundamentals/tasks.adoc[] +dynamically in-memory. These are then used by the `Evaluator`, which resolves, +plans, and executes the tasks specified by the given `runArgs` + +During execution, both standard output +and standard error are captured during evaluation and forwarded to the `PromptLogger`. +`PromptLogger` annotates the output stream with the line-prefixes, prompt, and ANSI +terminal commands necessary to generate the dynamic prompt, and then forwards both +streams multi-plexed over a single socket stream back to the Mill client. The client +then de-multiplexes the combined stream to split it back into output and error, which +are then both forwarded to the process or terminal that invoked the Mill client. + +Lastly, when the Mill server completes its tasks, it writes the `exitCode` to a file +that is then propagated back to the Mill client. The Mill client terminates with this +exit code, but the Mill server remains alive and ready to serve to the next Mill +client that connects to it + +For a more detailed discussion of what exactly goes into "execution", see +xref:depth/execution-model.adoc[]. + + +== The Out Folder + +The `out/` directory is where most of Mill's state lives on disk, both build-task state +such as the `foo/compile.json` metadata cache for `foo.compile`, or the `foo/compile.dest` +which stores any generated files or binaries. 

More documentation on what the `out/` directory contains and how to make use of it can be
found at xref:fundamentals/out-dir.adoc[].
diff --git a/docs/modules/ROOT/pages/index.adoc b/docs/modules/ROOT/pages/index.adoc
index c34ad0f7da1..9a5326f8eb2 100644
--- a/docs/modules/ROOT/pages/index.adoc
+++ b/docs/modules/ROOT/pages/index.adoc
@@ -34,7 +34,7 @@ offer a better alternative, letting your build system take full advantage of
 the Java platform's performance and usability:

 * *Performance*: Mill automatically
-xref:depth/evaluation-model.adoc#_caching_at_each_layer_of_the_evaluation_model[caches]
+xref:depth/execution-model.adoc#_caching_in_mill[caches]
 and xref:cli/flags.adoc#_jobs_j[parallelizes] build tasks to keep local development fast,
 and avoids the long configuration times seen in other tools like Gradle or SBT.
 xref:large/selective-execution.adoc[Selective execution] keeps
diff --git a/docs/modules/ROOT/pages/large/selective-execution.adoc b/docs/modules/ROOT/pages/large/selective-execution.adoc
index 5b4d5e3ce8d..665e2a323ba 100644
--- a/docs/modules/ROOT/pages/large/selective-execution.adoc
+++ b/docs/modules/ROOT/pages/large/selective-execution.adoc
@@ -52,4 +52,27 @@ def myProjectVersion: T[String] = Task.Input {
 * Look at xref:fundamentals/out-dir.adoc#_mill_invalidation_tree_json[out/mill-invalidation-tree.json],
   whether on disk locally or printing it out (e.g. via `cat`) on your CI machines to diagnose issues
   there. This would give you a richer view of what source tasks or inputs are the ones that actually
-  triggered the invalidation, and what tasks were just invalidated due to being downstream of them.
\ No newline at end of file
+  triggered the invalidation, and what tasks were just invalidated due to being downstream of them.
+
+
+== Limitations
+
+* *Selective execution can only work at the Mill Task granularity*. e.g. When working with
+  Java/Scala/Kotlin modules and test modules, the granularity of selection is at the level of
+  entire modules. That means that if your modules are individually large, selective execution
+  may not be able to significantly narrow down the set of tests that need to run.
+
+* *Selective execution usually cannot narrow down the set of integration tests to run*. Integration
+  tests by their nature depend on the entire application or system, and run test cases that
+  exercise different parts of it. But selective execution works at the task level and can only
+  see that every integration test depends on the entire codebase, and so any change in the
+  entire codebase could potentially affect any integration test, so selective execution will
+  select all of them.
+
+* *Selective execution is coarser-grained than runtime task caching*. e.g. If you add a newline
+  to a `foo/src/Foo.java` file and run `foo.testCached`, selective testing only knows that
+  `foo.sources` changed and `foo.testCached` is downstream of it, but it cannot know that
+  when you run `foo.compile` on the changed sources, the compilation output is unchanged, and
+  so `.testCached` can be skipped.
This is inherent in the nature of selective execution, which + does its analysis without evaluation-time information and thus will always be more conservative + than the task skipping and cache-reuse that is done during evaluation. diff --git a/docs/package.mill b/docs/package.mill index 4b087de3674..9760d5a7559 100644 --- a/docs/package.mill +++ b/docs/package.mill @@ -200,6 +200,7 @@ object `package` extends RootModule { os.write.over(dest / "antora.yml", (lines ++ newLines).mkString("\n")) } + def blogFolder = Task.Source(build.millSourcePath / "blog") def githubPagesPlaybookText(authorMode: Boolean) = T.task { extraSources: Seq[os.Path] => val taggedSources = for (path <- extraSources) yield { s""" - url: ${build.baseDir} @@ -226,6 +227,9 @@ object `package` extends RootModule { | | - url: ${build.baseDir} | start_path: ${devAntoraSources().path.relativeTo(build.baseDir)} + | + | - url: ${build.baseDir} + | start_path: ${blogFolder().path.relativeTo(build.baseDir)} |ui: | bundle: | url: https://gitlab.com/antora/antora-ui-default/-/jobs/artifacts/master/raw/build/ui-bundle.zip?job=bundle-stable diff --git a/docs/supplemental-ui/partials/header-content.hbs b/docs/supplemental-ui/partials/header-content.hbs index e3a6630f666..953b8867ea2 100644 --- a/docs/supplemental-ui/partials/header-content.hbs +++ b/docs/supplemental-ui/partials/header-content.hbs @@ -16,7 +16,8 @@