When authoring high-performance applications, the following general rules are valuable for building highly performant systems:
Share nothing across threads, even at the expense of memory. Sharing across thread boundaries leads to increased preemptions, costly thread contention, and less obvious expenses such as cache-line contention in the L2 cache.
When working with shared state that is seldom or never updated, give each thread its own copy, even at the expense of memory (see the sketch below).
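As a minimal illustration of the per-thread-copy idea, the sketch below uses Java's ThreadLocal so each worker gets its own private instance rather than contending over a shared one; the SimpleDateFormat here is just a stand-in for any state that is expensive to share:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class PerThreadCopy {
    // Each thread lazily receives its own SimpleDateFormat instance,
    // trading a little memory for zero cross-thread contention.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            // No lock needed: this thread's copy is never shared.
            String stamp = FORMAT.get().format(new Date());
            System.out.println(Thread.currentThread().getName() + " -> " + stamp);
        };
        Thread a = new Thread(work, "worker-a");
        Thread b = new Thread(work, "worker-b");
        a.start(); b.start();
        a.join(); b.join();
    }
}
```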
Create thread affinity if the workload represents sagas for object state, but keep in mind this may limit scalability within a single instance of a process. Where possible, isolating threads is ideal.
Embrace lock-free architectures. The fewer locks the better, which is obvious to most people. Achieving thread safety using lock-free patterns can be somewhat nuanced, so digging into the details of how the primitive/native locking semantics work and the concepts behind memory fencing can help ensure you have leaner execution paths (see the compare-and-swap sketch below).
Keep the number of dedicated long-running threads equal to the number of processing cores. It is easy to just spin up another thread. Unfortunately, the more threads you create, the more contention you are likely to create among them. Eventually, you may find your application spends so much time jumping between threads that there is no time to do any real work. This is known as a 'live lock' scenario and is somewhat challenging to debug.
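To make the lock-free idea concrete, here is a minimal sketch (assuming a simple shared counter as the workload) that uses compare-and-swap via Java's AtomicLong instead of a lock; the CAS loop retries on contention rather than blocking, and the JVM's atomics also supply the memory fencing mentioned above:

```java
import java.util.concurrent.atomic.AtomicLong;

public class LockFreeCounter {
    private final AtomicLong value = new AtomicLong();

    // Classic compare-and-swap retry loop: read, compute, attempt to
    // publish; on a lost race, re-read and try again instead of blocking.
    public long incrementAndGet() {
        long current, next;
        do {
            current = value.get();
            next = current + 1;
        } while (!value.compareAndSet(current, next));
        return next;
    }

    public static void main(String[] args) throws InterruptedException {
        LockFreeCounter counter = new LockFreeCounter();
        Runnable work = () -> { for (int i = 0; i < 100_000; i++) counter.incrementAndGet(); };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(counter.value.get()); // 200000, with no locks taken
    }
}
```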
Test the performance of your application using different threading patterns, on hardware that is representative of the production environment, to ensure the number of threads you have chosen is actually optimal.
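As a rough sketch of that kind of measurement (the fixed busy-work loop is a stand-in for your real workload), the code below times the same amount of work at several pool sizes so you can compare them on production-like hardware:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadCountProbe {
    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        for (int threads : new int[] {1, cores, cores * 2, cores * 4}) {
            long elapsed = timeWorkload(threads, 1_000);
            System.out.printf("%2d threads: %d ms%n", threads, elapsed);
        }
    }

    // Runs a fixed number of CPU-bound tasks on a pool of the given size
    // and reports wall-clock time; substitute your real workload here.
    static long timeWorkload(int threads, int tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                long x = 0;
                for (int j = 0; j < 1_000_000; j++) x += j; // busy work
                return x;
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```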
For background tasks that have more flexibility in how often they run, when they run, and how much work they do at any given time, consider Continuation or Task Scheduler patterns and collapse them onto fewer threads (or a single thread). Consider using patterns that utilize the ThreadPool instead of dedicated long-running threads (see the sketch below).
Stay in-memory and avoid or batch I/O where possible. File, database, and network I/O can be costly.
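One way to realize this in Java (a hedged sketch; the text's ThreadPool and Task Scheduler refer to the .NET equivalents) is a single-threaded ScheduledExecutorService that multiplexes all background chores, instead of giving each chore a dedicated long-running thread; the two task bodies here are hypothetical placeholders:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BackgroundScheduler {
    public static void main(String[] args) throws InterruptedException {
        // One shared thread services every periodic background task,
        // instead of each task owning a dedicated long-running thread.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        scheduler.scheduleWithFixedDelay(
                () -> System.out.println("flushing metrics..."), 0, 2, TimeUnit.SECONDS);
        scheduler.scheduleWithFixedDelay(
                () -> System.out.println("sweeping caches..."), 1, 5, TimeUnit.SECONDS);

        Thread.sleep(10_000); // let the demo run briefly
        scheduler.shutdown();
    }
}
```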
Consider batching updates when I/O is required. This includes buffering file writes, batching message transmissions, and so on.
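For example, a BufferedWriter applies the buffering idea the text suggests for file writes, coalescing many small writes into far fewer actual disk operations (a minimal sketch; the file name is hypothetical):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class BatchedWrites {
    public static void main(String[] args) throws IOException {
        Path log = Path.of("events.log"); // hypothetical output file
        // The writer's internal buffer batches thousands of small writes
        // into a handful of larger I/O operations.
        try (BufferedWriter out = Files.newBufferedWriter(log)) {
            for (int i = 0; i < 10_000; i++) {
                out.write("event " + i);
                out.newLine(); // stays in the buffer until it fills or close()
            }
        } // close() flushes the final partial buffer
    }
}
```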
Read more: Smelser.net