From b47c26eb282cb3587a610fd7f631d49d035c6a43 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Thu, 31 Oct 2024 08:35:20 -0400 Subject: [PATCH 01/13] Update Execution Docs --- docs/API.md | 10 +----- docs/Execution.md | 83 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 84 insertions(+), 9 deletions(-) create mode 100644 docs/Execution.md diff --git a/docs/API.md b/docs/API.md index 7b270b7fcc..535898cdc4 100644 --- a/docs/API.md +++ b/docs/API.md @@ -67,15 +67,7 @@ The **PATCH** version **only changes** when a servicing fix is made to an existi ## Execution Mode -In general, MsQuic uses a callback model for all asynchronous events up to the app. This includes things like connection state changes, new streams being created, stream data being received, and stream sends completing. All these events are indicated to the app via the callback handler, on a thread owned by MsQuic. - -Generally, MsQuic creates multiple threads to parallelize work, and therefore will make parallel/overlapping upcalls to the application, but not for the same connection. All upcalls to the app for a single connection and all child streams are always delivered serially. This is not to say, though, it will always be on the same thread. MsQuic does support the ability to shuffle connections around to better balance the load. - -Apps are expected to keep any execution time in the callback **to a minimum**. MsQuic does not use separate threads for the protocol execution and upcalls to the app. Therefore, any significant delays on the callback **will delay the protocol**. Any significant time or work needed to be completed by the app must happen on its own thread. - -This doesn't mean the app isn't allowed to do any work in the callback. In fact, many things are expressly designed to be most efficient when the app does them on the callback. For instance, closing a handle to a connection or stream is ideally implemented in the "shutdown complete" indications. 
- -One important aspect of this design is that all blocking calls invoked on a callback always happen inline (to prevent deadlocks), and will supercede any calls in progress or queued from a separate thread. +MsQuic has a very different execution model than classic BSD-style sockets. Please see [Execution](./Execution.md) for more details on how threads and upcalls are handled inside of MsQuic. ## Settings and Configuration diff --git a/docs/Execution.md b/docs/Execution.md new file mode 100644 index 0000000000..de79bbd320 --- /dev/null +++ b/docs/Execution.md @@ -0,0 +1,83 @@ +Execution +====== + +The MsQuic API takes a very difference stance when it comes to its execution model compared to BSD-style sockets (and most other networking libraries built on top of them). +The sections below detail the designs MsQuic uses, with some of the details as to why these design choices were made. + +## Event Model + +In the MsQuic API, all state changes and other notifications are indicated directly to the application via a callback. +This includes things like connection state changes, new streams being created, stream data being received, and stream sends completing. + +```c +typedef struct QUIC_LISTENER_EVENT { + QUIC_LISTENER_EVENT_TYPE Type; + union { + struct { ... } NEW_CONNECTION; + struct { ... } STOP_COMPLETE; + }; +} QUIC_LISTENER_EVENT; + +typedef +_IRQL_requires_max_(PASSIVE_LEVEL) +_Function_class_(QUIC_LISTENER_CALLBACK) +QUIC_STATUS +(QUIC_API QUIC_LISTENER_CALLBACK)( + _In_ HQUIC Listener, + _In_opt_ void* Context, + _Inout_ QUIC_LISTENER_EVENT* Event + ); +``` + +Above is an example of the type of callback delivered to the listener interface. +The application is requires to register a callback handler that should handle all the events MsQuic may indicate, returning a status for if it was successfully handled or not. 
+ +This is very different from BSD sockets which required the application to make a call (e.g., `send` or `recv`) in order to determine something happened. +This difference was made for several reasons: + +- The MsQuic API **runs in-proc**, and therefore doesn't require a kernel to user mode boundary switch to indicate something to the application layer. This allows for the callback-based design which is not as practical for BSD sockets. + +- MsQuic, by virtue of the QUIC protocol itself, has a lot of different types of events. Just considering streams, the app maybe have hundreds of objects at once which may have some state change. By leveraging the callback model, the application doesn't have to manage having pending calls on each object. + +- Experience has shown it to be very difficult to write correct, performant code on top of the BSD-style interface. By leveraging callbacks (that happen at the correct time, on the correct thread/processor), it allows MsQuic to abstract a lot of complexity away from applications and make things "just work" out of the box. + +- It simplifies much of the logic in MsQuic, because it eliminates the need for a queue or cached state that needs to be indicated to the application. In the BSD model, the networking stack must wait for the top-down call from the application before it can indicate the completion. This adds increased code size, complexity and memory usage. + +### Writing Event Handlers + +Event handlers are **required** for all objects (that have them), because much of the MsQuic API happens through these callbacks. +Additionally, important events, such as "shutdown complete" events provide crucial information to the application to function properly. +Without these events, the application cannot not know when it is safe to clean up objects. + +Applicationss are expected to keep any execution time in the callbacks **to a minimum**. 
+MsQuic does not use separate threads for the protocol execution and upcalls to the application. +Therefore, any significant delays on the callback **will delay the protocol**. +Any significant time or work needed to be completed by the application must happen on its own thread. + +This doesn't mean the application isn't allowed to do any work in the callback handler. +In fact, many things are expressly designed to be most efficient when the application does them on the callback. +For instance, closing a handle to a connection or stream is ideally implemented in the "shutdown complete" indications. + +One important aspect of this design is that all blocking API (down) calls invoked on a callback always happen inline (to prevent deadlocks), and will supercede any calls in progress or queued from a separate thread. + +## Threading + +By default, MsQuic creates its own threads to manage execution its logic. +The nature of the number and configuration of these threads depends on the configuration the apps passes to [RegistrationOpen](api/RegistrationOpen.md) or `QUIC_PARAM_GLOBAL_EXECUTION_CONFIG`. + +The default behavior is to create dedicated, per-processor threads that are hard affinitized to a given NUMA-node, and soft-affinitized (set 'ideal processor') to a given processor. +These threads are then used to drive both the datapath (i.e. UDP) and QUIC layers. +Great care it taken to (try to) align the MsQuic processing logic with the rest of the networking stack (including hardware RSS) so that all processing stays at least on the same NUMA node, but ideally the same processor. + +The complexity required to achieve alignment in processing across various threads and processors is why MsQuic takes the stance of managing all its own threading by default. +The goal is to abstract all this complexity from the many applications that build on top, so every app doesn't have to build the necessary logic to do this itself. +Things should 'just work' out of the box. 
+ +Each of these threads manage the execution of one or more connections. +All connections are spread across the various threads based on their RSS alignment, which generally should evenly spread the traffic based on the different UDP tuples used. +Each connection and all derived state (i.e., streams) are managed and executed by a single thread at a time; but may move across threads to align with any RSS changes. +This means that each connection and its streams is effectively single-threaded, including all upcalls to the application layer. +MsQuic will **never** make upcalls for a single connection or any of its streams in parallel. + +For listeners, the application callback will be called in parallel for new connections. +This allows server applications to scale efficiently with the number of processors. From 8d83395ad833fedb421ad39bb20f4cd981b30d8a Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Thu, 31 Oct 2024 08:45:48 -0400 Subject: [PATCH 02/13] Copilot suggestions/fixes --- docs/Execution.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/Execution.md b/docs/Execution.md index de79bbd320..c2a975254d 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -1,7 +1,7 @@ Execution ====== -The MsQuic API takes a very difference stance when it comes to its execution model compared to BSD-style sockets (and most other networking libraries built on top of them). +The MsQuic API takes a very different stance when it comes to its execution model compared to BSD-style sockets (and most other networking libraries built on top of them). The sections below detail the designs MsQuic uses, with some of the details as to why these design choices were made. ## Event Model @@ -30,14 +30,14 @@ QUIC_STATUS ``` Above is an example of the type of callback delivered to the listener interface. 
-The application is requires to register a callback handler that should handle all the events MsQuic may indicate, returning a status for if it was successfully handled or not. +The application is required to register a callback handler that should handle all the events MsQuic may indicate, returning a status for if it was successfully handled or not. This is very different from BSD sockets which required the application to make a call (e.g., `send` or `recv`) in order to determine something happened. This difference was made for several reasons: - The MsQuic API **runs in-proc**, and therefore doesn't require a kernel to user mode boundary switch to indicate something to the application layer. This allows for the callback-based design which is not as practical for BSD sockets. -- MsQuic, by virtue of the QUIC protocol itself, has a lot of different types of events. Just considering streams, the app maybe have hundreds of objects at once which may have some state change. By leveraging the callback model, the application doesn't have to manage having pending calls on each object. +- MsQuic, by virtue of the QUIC protocol itself, has a lot of different types of events. Just considering streams, the app may have hundreds of objects at once which may have some state change. By leveraging the callback model, the application doesn't have to manage having pending calls on each object. - Experience has shown it to be very difficult to write correct, performant code on top of the BSD-style interface. By leveraging callbacks (that happen at the correct time, on the correct thread/processor), it allows MsQuic to abstract a lot of complexity away from applications and make things "just work" out of the box. @@ -49,7 +49,7 @@ Event handlers are **required** for all objects (that have them), because much o Additionally, important events, such as "shutdown complete" events provide crucial information to the application to function properly. 
Without these events, the application cannot not know when it is safe to clean up objects. -Applicationss are expected to keep any execution time in the callbacks **to a minimum**. +Applications are expected to keep any execution time in the callbacks **to a minimum**. MsQuic does not use separate threads for the protocol execution and upcalls to the application. Therefore, any significant delays on the callback **will delay the protocol**. Any significant time or work needed to be completed by the application must happen on its own thread. @@ -58,16 +58,16 @@ This doesn't mean the application isn't allowed to do any work in the callback h In fact, many things are expressly designed to be most efficient when the application does them on the callback. For instance, closing a handle to a connection or stream is ideally implemented in the "shutdown complete" indications. -One important aspect of this design is that all blocking API (down) calls invoked on a callback always happen inline (to prevent deadlocks), and will supercede any calls in progress or queued from a separate thread. +One important aspect of this design is that all blocking API (down) calls invoked on a callback always happen inline (to prevent deadlocks), and will supersede any calls in progress or queued from a separate thread. ## Threading -By default, MsQuic creates its own threads to manage execution its logic. +By default, MsQuic creates its own threads to manage the execution of its logic. The nature of the number and configuration of these threads depends on the configuration the apps passes to [RegistrationOpen](api/RegistrationOpen.md) or `QUIC_PARAM_GLOBAL_EXECUTION_CONFIG`. The default behavior is to create dedicated, per-processor threads that are hard affinitized to a given NUMA-node, and soft-affinitized (set 'ideal processor') to a given processor. These threads are then used to drive both the datapath (i.e. UDP) and QUIC layers. 
-Great care it taken to (try to) align the MsQuic processing logic with the rest of the networking stack (including hardware RSS) so that all processing stays at least on the same NUMA node, but ideally the same processor. +Great care is taken to (try to) align the MsQuic processing logic with the rest of the networking stack (including hardware RSS) so that all processing stays at least on the same NUMA node, but ideally the same processor. The complexity required to achieve alignment in processing across various threads and processors is why MsQuic takes the stance of managing all its own threading by default. The goal is to abstract all this complexity from the many applications that build on top, so every app doesn't have to build the necessary logic to do this itself. From 7c3e5b73e9c8467e5c9f74910a9faa2251633f89 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Thu, 31 Oct 2024 08:50:38 -0400 Subject: [PATCH 03/13] More Copilot improvements --- docs/Execution.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/Execution.md b/docs/Execution.md index c2a975254d..dc8f797fa8 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -1,13 +1,13 @@ Execution ====== -The MsQuic API takes a very different stance when it comes to its execution model compared to BSD-style sockets (and most other networking libraries built on top of them). -The sections below detail the designs MsQuic uses, with some of the details as to why these design choices were made. +The MsQuic API uses a different execution model compared to BSD-style sockets and most other networking libraries built on them. +The sections below detail the designs MsQuic uses and the reasons behind these choices. ## Event Model -In the MsQuic API, all state changes and other notifications are indicated directly to the application via a callback. -This includes things like connection state changes, new streams being created, stream data being received, and stream sends completing. 
+In the MsQuic API, all state changes and notifications are indicated directly to the application via a callback. +This includes connection state changes, new streams being created, stream data being received, and stream sends completing. ```c typedef struct QUIC_LISTENER_EVENT { From 37d477e0197ac4cf3216c70c907e366afa9f08d3 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Thu, 31 Oct 2024 08:55:26 -0400 Subject: [PATCH 04/13] More Copilot --- docs/Execution.md | 62 +++++++++++++++++++++++------------------------ 1 file changed, 30 insertions(+), 32 deletions(-) diff --git a/docs/Execution.md b/docs/Execution.md index dc8f797fa8..9794e8a9f4 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -29,55 +29,53 @@ QUIC_STATUS ); ``` -Above is an example of the type of callback delivered to the listener interface. -The application is required to register a callback handler that should handle all the events MsQuic may indicate, returning a status for if it was successfully handled or not. +Above is an example of a callback delivered to the listener interface. +The application must register a callback handler to manage all events MsQuic may indicate, returning a status to show if it was successfully handled or not. -This is very different from BSD sockets which required the application to make a call (e.g., `send` or `recv`) in order to determine something happened. -This difference was made for several reasons: +This approach differs significantly from BSD sockets, where the application must make a call (e.g., `send` or `recv`) to determine if something happened. +This design choice was made for several reasons: -- The MsQuic API **runs in-proc**, and therefore doesn't require a kernel to user mode boundary switch to indicate something to the application layer. This allows for the callback-based design which is not as practical for BSD sockets. 
+- The MsQuic API **runs in-process**, eliminating the need for a kernel to user mode boundary switch to notify the application layer. This makes the callback-based design more practical compared to BSD sockets. -- MsQuic, by virtue of the QUIC protocol itself, has a lot of different types of events. Just considering streams, the app may have hundreds of objects at once which may have some state change. By leveraging the callback model, the application doesn't have to manage having pending calls on each object. +- MsQuic, due to the QUIC protocol, has numerous event types. Applications may have hundreds of objects with potential state changes. The callback model allows the application to avoid managing pending calls on each object. -- Experience has shown it to be very difficult to write correct, performant code on top of the BSD-style interface. By leveraging callbacks (that happen at the correct time, on the correct thread/processor), it allows MsQuic to abstract a lot of complexity away from applications and make things "just work" out of the box. +- Writing correct, performant code on top of the BSD-style interface has proven challenging. Callbacks, executed at the correct time and on the correct thread/processor, enable MsQuic to abstract much complexity from applications, making things "just work" out of the box. -- It simplifies much of the logic in MsQuic, because it eliminates the need for a queue or cached state that needs to be indicated to the application. In the BSD model, the networking stack must wait for the top-down call from the application before it can indicate the completion. This adds increased code size, complexity and memory usage. +- It simplifies MsQuic's logic by eliminating the need for a queue or cached state to indicate to the application. In the BSD model, the networking stack must wait for a top-down call from the application before indicating completion, increasing code size, complexity, and memory usage. 
### Writing Event Handlers -Event handlers are **required** for all objects (that have them), because much of the MsQuic API happens through these callbacks. -Additionally, important events, such as "shutdown complete" events provide crucial information to the application to function properly. -Without these events, the application cannot not know when it is safe to clean up objects. +Event handlers are **essential** for all objects that support them, as much of the MsQuic API operates through these callbacks. +Critical events, such as "shutdown complete" notifications, provide vital information necessary for the application to function correctly. +Without these events, the application cannot determine when it is safe to clean up objects. -Applications are expected to keep any execution time in the callbacks **to a minimum**. -MsQuic does not use separate threads for the protocol execution and upcalls to the application. -Therefore, any significant delays on the callback **will delay the protocol**. -Any significant time or work needed to be completed by the application must happen on its own thread. +Applications should keep the execution time within callbacks **to a minimum**. +MsQuic does not use separate threads for protocol execution and upcalls to the application. +Therefore, any significant delays in the callback **will delay the protocol**. +Any substantial work required by the application should be performed on its own thread. -This doesn't mean the application isn't allowed to do any work in the callback handler. -In fact, many things are expressly designed to be most efficient when the application does them on the callback. -For instance, closing a handle to a connection or stream is ideally implemented in the "shutdown complete" indications. +This does not mean the application cannot perform any work in the callback handler. +In fact, many operations are designed to be most efficient when executed within the callback. 
+For example, closing a handle to a connection or stream is ideally done during the "shutdown complete" indication. -One important aspect of this design is that all blocking API (down) calls invoked on a callback always happen inline (to prevent deadlocks), and will supersede any calls in progress or queued from a separate thread. +A crucial aspect of this design is that all blocking API (down) calls invoked within a callback always occur inline (to prevent deadlocks) and will take precedence over any calls in progress or queued from a separate thread. ## Threading By default, MsQuic creates its own threads to manage the execution of its logic. -The nature of the number and configuration of these threads depends on the configuration the apps passes to [RegistrationOpen](api/RegistrationOpen.md) or `QUIC_PARAM_GLOBAL_EXECUTION_CONFIG`. +The number and configuration of these threads depend on the settings passed to [RegistrationOpen](api/RegistrationOpen.md) or `QUIC_PARAM_GLOBAL_EXECUTION_CONFIG`. -The default behavior is to create dedicated, per-processor threads that are hard affinitized to a given NUMA-node, and soft-affinitized (set 'ideal processor') to a given processor. -These threads are then used to drive both the datapath (i.e. UDP) and QUIC layers. -Great care is taken to (try to) align the MsQuic processing logic with the rest of the networking stack (including hardware RSS) so that all processing stays at least on the same NUMA node, but ideally the same processor. +Typically, MsQuic creates dedicated threads for each processor, which are hard-affinitized to a specific NUMA node and soft-affinitized (set as 'ideal processor') to a specific processor. +These threads handle both the datapath (i.e., UDP) and QUIC layers. +MsQuic aligns its processing logic with the rest of the networking stack (including hardware RSS) to ensure that all processing stays on the same NUMA node, and ideally, the same processor. 
-The complexity required to achieve alignment in processing across various threads and processors is why MsQuic takes the stance of managing all its own threading by default. -The goal is to abstract all this complexity from the many applications that build on top, so every app doesn't have to build the necessary logic to do this itself. -Things should 'just work' out of the box. +The complexity of aligning processing across various threads and processors is why MsQuic manages its own threading by default. +This abstraction simplifies the development process for applications built on top of MsQuic, ensuring that things "just work" out of the box. -Each of these threads manage the execution of one or more connections. -All connections are spread across the various threads based on their RSS alignment, which generally should evenly spread the traffic based on the different UDP tuples used. -Each connection and all derived state (i.e., streams) are managed and executed by a single thread at a time; but may move across threads to align with any RSS changes. -This means that each connection and its streams is effectively single-threaded, including all upcalls to the application layer. +Each thread manages the execution of one or more connections. +Connections are distributed across threads based on their RSS alignment, which should evenly distribute traffic based on different UDP tuples. +Each connection and its derived state (i.e., streams) are managed and executed by a single thread at a time, but may move across threads to align with any RSS changes. +This ensures that each connection and its streams are effectively single-threaded, including all upcalls to the application layer. MsQuic will **never** make upcalls for a single connection or any of its streams in parallel. -For listeners, the application callback will be called in parallel for new connections. -This allows server applications to scale efficiently with the number of processors. 
+For listeners, the application callback will be called in parallel for new connections, allowing server applications to scale efficiently with the number of processors. From 2f5d607c7590ef3022599089a5084206f3eee426 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Thu, 31 Oct 2024 10:12:16 -0400 Subject: [PATCH 05/13] Add a diagram --- docs/Execution.md | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/docs/Execution.md b/docs/Execution.md index 9794e8a9f4..622b87bc9d 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -79,3 +79,34 @@ This ensures that each connection and its streams are effectively single-threade MsQuic will **never** make upcalls for a single connection or any of its streams in parallel. For listeners, the application callback will be called in parallel for new connections, allowing server applications to scale efficiently with the number of processors. + +```mermaid +graph TD + subgraph NIC + RSS1 + RSS2 + RSS3 + end + RSS1 -->|Receive| Processor1 + RSS2 -->|Receive| Processor2 + RSS3 -->|Receive| Processor3 + subgraph Processor1 + Thread1 + Thread1 -->|Manages| Connection1 + Thread1 -->|Manages| Connection2 + Connection1 -->|Delivers Event| ApplicationCallback1 + Connection2 -->|Delivers Event| ApplicationCallback2 + end + subgraph Processor2 + Thread2 + Thread2 -->|Manages| Connection3 + Connection3 -->|Delivers Event| ApplicationCallback3 + end + subgraph Processor3 + Thread3 + Thread3 -->|Manages| Connection4 + Thread3 -->|Manages| Connection5 + Connection4 -->|Delivers Event| ApplicationCallback4 + Connection5 -->|Delivers Event| ApplicationCallback5 + end +``` From 7bcdfd9f472666173e71f0fad6070433c46f56c6 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Thu, 31 Oct 2024 10:58:43 -0400 Subject: [PATCH 06/13] Update diagram --- docs/Execution.md | 45 ++++++++++++++++++++------------------------- 1 file changed, 20 insertions(+), 25 deletions(-) diff --git a/docs/Execution.md b/docs/Execution.md index 
622b87bc9d..53267b0776 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -82,31 +82,26 @@ For listeners, the application callback will be called in parallel for new conne ```mermaid graph TD - subgraph NIC - RSS1 - RSS2 - RSS3 + subgraph Kernel + NIC-Queue1[NIC Queue] + NIC-Queue2[NIC Queue] + NIC-Queue1 -->|RSS Receive| UDP1[IP/UDP] + NIC-Queue2 -->|RSS Receive| UDP2[IP/UDP] end - RSS1 -->|Receive| Processor1 - RSS2 -->|Receive| Processor2 - RSS3 -->|Receive| Processor3 - subgraph Processor1 - Thread1 - Thread1 -->|Manages| Connection1 - Thread1 -->|Manages| Connection2 - Connection1 -->|Delivers Event| ApplicationCallback1 - Connection2 -->|Delivers Event| ApplicationCallback2 - end - subgraph Processor2 - Thread2 - Thread2 -->|Manages| Connection3 - Connection3 -->|Delivers Event| ApplicationCallback3 - end - subgraph Processor3 - Thread3 - Thread3 -->|Manages| Connection4 - Thread3 -->|Manages| Connection5 - Connection4 -->|Delivers Event| ApplicationCallback4 - Connection5 -->|Delivers Event| ApplicationCallback5 + subgraph MsQuic Process + UDP1 -.-> Processor1 + UDP2 -.-> Processor2 + subgraph Processor1[Processor 0] + Thread1[Thread] + Thread1 -->|Manages| Connection1[Connection 1] + Thread1 -->|Manages| Connection2[Connection 2] + Connection1 -->|Delivers Event| ApplicationCallback1[App Callback] + Connection2 -->|Delivers Event| ApplicationCallback2[App Callback] + end + subgraph Processor2[Processor 1] + Thread2[Thread] + Thread2 -->|Manages| Connection3[Connection 3] + Connection3 -->|Delivers Event| ApplicationCallback3[App Callback] + end end ``` From cfe49553e52b7bff68bcb00f51b46f9f40989810 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Mon, 4 Nov 2024 16:13:10 +0000 Subject: [PATCH 07/13] Few tweaks of language --- docs/Execution.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/Execution.md b/docs/Execution.md index 53267b0776..6ddb0bd133 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -32,16 +32,16 
@@ QUIC_STATUS Above is an example of a callback delivered to the listener interface. The application must register a callback handler to manage all events MsQuic may indicate, returning a status to show if it was successfully handled or not. -This approach differs significantly from BSD sockets, where the application must make a call (e.g., `send` or `recv`) to determine if something happened. +This approach differs significantly from sockets and most networking libraries, where the application must make a call (e.g., `send` or `recv`) to determine if something happened. This design choice was made for several reasons: -- The MsQuic API **runs in-process**, eliminating the need for a kernel to user mode boundary switch to notify the application layer. This makes the callback-based design more practical compared to BSD sockets. +- The MsQuic API **runs in-process**, eliminating the need for a kernel to user mode boundary switch to notify the application layer. This makes the callback-based design more practical compared to sockets. - MsQuic, due to the QUIC protocol, has numerous event types. Applications may have hundreds of objects with potential state changes. The callback model allows the application to avoid managing pending calls on each object. -- Writing correct, performant code on top of the BSD-style interface has proven challenging. Callbacks, executed at the correct time and on the correct thread/processor, enable MsQuic to abstract much complexity from applications, making things "just work" out of the box. +- Writing correct, scalable code on top of the socket interfaces has proven challenging. Callbacks, executed at the correct time and on the correct thread/processor, enable MsQuic to abstract much complexity from applications, making things "just work" out of the box. -- It simplifies MsQuic's logic by eliminating the need for a queue or cached state to indicate to the application. 
In the BSD model, the networking stack must wait for a top-down call from the application before indicating completion, increasing code size, complexity, and memory usage. +- It simplifies MsQuic's logic by eliminating the need for a queue or cached state to indicate to the application. In the socket model, the networking stack must wait for a top-down call from the application before indicating completion, increasing code size, complexity, and memory usage. ### Writing Event Handlers From 2714ab163cc26f767ed259d76e468ddad786f601 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Mon, 4 Nov 2024 16:34:33 +0000 Subject: [PATCH 08/13] wordsmithing --- docs/Execution.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Execution.md b/docs/Execution.md index 6ddb0bd133..fb008def2b 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -39,7 +39,7 @@ This design choice was made for several reasons: - MsQuic, due to the QUIC protocol, has numerous event types. Applications may have hundreds of objects with potential state changes. The callback model allows the application to avoid managing pending calls on each object. -- Writing correct, scalable code on top of the socket interfaces has proven challenging. Callbacks, executed at the correct time and on the correct thread/processor, enable MsQuic to abstract much complexity from applications, making things "just work" out of the box. +- Writing correct, scalable code on top of the socket interfaces has proven challenging. By offloading the threading to MsQuic it enables MsQuic to abstract much complexity from applications, making things "just work" out of the box. - It simplifies MsQuic's logic by eliminating the need for a queue or cached state to indicate to the application. In the socket model, the networking stack must wait for a top-down call from the application before indicating completion, increasing code size, complexity, and memory usage. 
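As an illustration of the listener callback shape that `docs/Execution.md` shows in the patches above, here is a minimal sketch of an event handler that dispatches on the event type, does only a small amount of work inline, and returns a status. It is a sketch under stated assumptions, not MsQuic code: every `DEMO_`/`Demo` name below is a hypothetical stand-in defined locally so the example compiles without `msquic.h`, and the union members and status values are placeholders rather than the real MsQuic definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins that only mirror the *shape* of the listener
 * event and callback in the doc; they are NOT the real MsQuic types. */
typedef void* DEMO_HANDLE;
typedef uint32_t DEMO_STATUS;
#define DEMO_STATUS_SUCCESS       0u /* placeholder value */
#define DEMO_STATUS_NOT_SUPPORTED 1u /* placeholder value */

typedef enum DEMO_LISTENER_EVENT_TYPE {
    DEMO_LISTENER_EVENT_NEW_CONNECTION = 0,
    DEMO_LISTENER_EVENT_STOP_COMPLETE  = 1
} DEMO_LISTENER_EVENT_TYPE;

typedef struct DEMO_LISTENER_EVENT {
    DEMO_LISTENER_EVENT_TYPE Type;
    union {
        struct { DEMO_HANDLE Connection; } NEW_CONNECTION; /* assumed field */
        struct { uint8_t Reserved; } STOP_COMPLETE;        /* assumed field */
    };
} DEMO_LISTENER_EVENT;

/* Application state passed back to the handler through the context pointer. */
typedef struct DEMO_APP_CONTEXT {
    unsigned AcceptedCount;
    int ListenerStopped;
} DEMO_APP_CONTEXT;

/* Handler sketch: switch on the event type, keep the inline work short,
 * and report whether the event was handled. */
static DEMO_STATUS
DemoListenerCallback(DEMO_HANDLE Listener, void* Context, DEMO_LISTENER_EVENT* Event)
{
    DEMO_APP_CONTEXT* App = (DEMO_APP_CONTEXT*)Context;
    (void)Listener; /* unused in this sketch */
    assert(App != NULL && Event != NULL);

    switch (Event->Type) {
    case DEMO_LISTENER_EVENT_NEW_CONNECTION:
        /* Record the connection; any heavy work belongs on an app thread. */
        App->AcceptedCount++;
        return DEMO_STATUS_SUCCESS;
    case DEMO_LISTENER_EVENT_STOP_COMPLETE:
        /* From here on the app knows the listener has finished stopping. */
        App->ListenerStopped = 1;
        return DEMO_STATUS_SUCCESS;
    default:
        /* Unknown event types are reported as not handled. */
        return DEMO_STATUS_NOT_SUPPORTED;
    }
}
```

The handler only updates counters and flags, which matches the guidance in the doc to keep time spent in callbacks to a minimum. A real application would hand any substantial work to its own thread.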
From 5b3db5675835451f28bf9737ba7cdaf084e2dc03 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Mon, 4 Nov 2024 16:46:50 +0000 Subject: [PATCH 09/13] Some text about recursive callbacks --- docs/Execution.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/Execution.md b/docs/Execution.md index fb008def2b..f1bbae5e0c 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -59,6 +59,8 @@ In fact, many operations are designed to be most efficient when executed within For example, closing a handle to a connection or stream is ideally done during the "shutdown complete" indication. A crucial aspect of this design is that all blocking API (down) calls invoked within a callback always occur inline (to prevent deadlocks) and will take precedence over any calls in progress or queued from a separate thread. +It's also worth noting that MsQuic will not invoke a recursive callback to the application by default in these cases. +The one exception to this rule is if the application opts in via the `QUIC_STREAM_SHUTDOWN_FLAG_INLINE` flag when calling `StreamShutdown` from within a callback. ## Threading From a863beab34e7d4edc1b12d02c8a606dba3838eca Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Mon, 4 Nov 2024 16:49:39 +0000 Subject: [PATCH 10/13] per-object callback text --- docs/Execution.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Execution.md b/docs/Execution.md index f1bbae5e0c..02ea868a07 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -30,7 +30,7 @@ QUIC_STATUS ``` Above is an example of a callback delivered to the listener interface. -The application must register a callback handler to manage all events MsQuic may indicate, returning a status to show if it was successfully handled or not. +The application must register a per-object callback handler to manage all events MsQuic may indicate for that object, returning a status to show if it was successfully handled or not. 
This approach differs significantly from sockets and most networking libraries, where the application must make a call (e.g., `send` or `recv`) to determine if something happened. This design choice was made for several reasons: From c515b58b9a642a68e194db0ab407706772b7268a Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Mon, 4 Nov 2024 16:50:43 +0000 Subject: [PATCH 11/13] async --- docs/Execution.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Execution.md b/docs/Execution.md index 02ea868a07..4695f1df7c 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -6,7 +6,7 @@ The sections below detail the designs MsQuic uses and the reasons behind these c ## Event Model -In the MsQuic API, all state changes and notifications are indicated directly to the application via a callback. +In the MsQuic API, all asynchronous state changes and notifications are indicated directly to the application via a callback. This includes connection state changes, new streams being created, stream data being received, and stream sends completing. ```c From 0516c56696b17ea37b87c951e62c91d691e9b813 Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Mon, 4 Nov 2024 16:51:42 +0000 Subject: [PATCH 12/13] for QUIC --- docs/Execution.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Execution.md b/docs/Execution.md index 4695f1df7c..264c517c6a 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -72,7 +72,7 @@ These threads handle both the datapath (i.e., UDP) and QUIC layers. MsQuic aligns its processing logic with the rest of the networking stack (including hardware RSS) to ensure that all processing stays on the same NUMA node, and ideally, the same processor. The complexity of aligning processing across various threads and processors is why MsQuic manages its own threading by default. -This abstraction simplifies the development process for applications built on top of MsQuic, ensuring that things "just work" out of the box. 
+This abstraction simplifies the development process for applications built on top of MsQuic, ensuring that things "just work" for QUIC out of the box. Each thread manages the execution of one or more connections. Connections are distributed across threads based on their RSS alignment, which should evenly distribute traffic based on different UDP tuples. From 99a13600545e9c97812c2a33b034c05ded63cf1d Mon Sep 17 00:00:00 2001 From: Nick Banks Date: Mon, 4 Nov 2024 17:11:45 +0000 Subject: [PATCH 13/13] More about threads --- docs/Execution.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/Execution.md b/docs/Execution.md index 264c517c6a..7d7259e7a1 100644 --- a/docs/Execution.md +++ b/docs/Execution.md @@ -69,6 +69,8 @@ The number and configuration of these threads depend on the settings passed to [ Typically, MsQuic creates dedicated threads for each processor, which are hard-affinitized to a specific NUMA node and soft-affinitized (set as 'ideal processor') to a specific processor. These threads handle both the datapath (i.e., UDP) and QUIC layers. +By default, both layers are handled by a single thread (per processor), but MsQuic may be configured to run these layers on separate threads. +Using the same thread lets MsQuic achieve lower latency, while using separate threads lets it achieve higher throughput. MsQuic aligns its processing logic with the rest of the networking stack (including hardware RSS) to ensure that all processing stays on the same NUMA node, and ideally, the same processor. The complexity of aligning processing across various threads and processors is why MsQuic manages its own threading by default.
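
Reviewer note: the per-object callback model this series documents can be sketched roughly as below. This is only an illustration under stated assumptions: every name in it (`sketch_event`, `sketch_object`, `sketch_set_callback`, `sketch_indicate`, `app_stream_handler`) is a hypothetical stand-in, not part of the MsQuic API. It shows only the shape of the design: the application registers one handler per object, and the library pushes events to that handler instead of waiting for the application to poll.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT MsQuic types. */
typedef enum {
    SKETCH_EVENT_DATA_RECEIVED,
    SKETCH_EVENT_SHUTDOWN_COMPLETE
} sketch_event_type;

typedef struct {
    sketch_event_type type;
    size_t length; /* bytes received; meaningful only for DATA_RECEIVED */
} sketch_event;

/* Handler signature: returns 0 if the event was handled, nonzero otherwise. */
typedef int (*sketch_callback)(void* context, const sketch_event* event);

typedef struct {
    sketch_callback handler; /* the per-object callback */
    void* context;           /* app state handed back on every indication */
} sketch_object;

/* App side: register one handler (plus context) for one object. */
static void sketch_set_callback(sketch_object* obj, sketch_callback handler, void* context)
{
    obj->handler = handler;
    obj->context = context;
}

/* Library side: indicate an event by calling up into the app. */
static int sketch_indicate(sketch_object* obj, const sketch_event* event)
{
    assert(obj->handler != NULL);
    return obj->handler(obj->context, event);
}

/* App side: state and handler for a single (hypothetical) stream object. */
typedef struct {
    size_t total_bytes;
    int closed;
} app_stream_state;

static int app_stream_handler(void* context, const sketch_event* event)
{
    app_stream_state* state = (app_stream_state*)context;
    switch (event->type) {
    case SKETCH_EVENT_DATA_RECEIVED:
        state->total_bytes += event->length; /* keep the work here short */
        return 0;
    case SKETCH_EVENT_SHUTDOWN_COMPLETE:
        state->closed = 1; /* natural point to release the object */
        return 0;
    }
    return -1; /* unknown event type: report it as not handled */
}
```

In this sketch the handler does only bookkeeping and returns immediately, which mirrors the guidance in the patched text that callbacks should stay short and that cleanup fits naturally in the "shutdown complete" indication. Threading and serialization are deliberately left out.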