As a prerequisite, I am going to assume that you understand how TLS works, and in particular how __declspec(thread) variables work. There's a quite thorough treatise on the subject by Ken Johnson (better known as Skywing), who comments quite frequently on this site. The series starts here and continues for a total of 8 installments, ending here. That last page also has a table of contents so you can skip over the parts you already know to get to the parts you don't know.
Now that you've read Ken's articles...
No, wait I know you didn't read them and you're just skimming past it in the hopes that you will be able to fake your way through the rest of this article without having read the prerequisites. Well, okay, but don't be surprised when I get frustrated if you ask a question that is answered in the prerequisites.
Anyway, as you learned from Part 5 of Ken's series, the __declspec(thread) model, as originally envisioned, assumed that all DLLs which use the feature would be present at process startup, so that all the _tls_index values can be computed and the total sizes of each module's TLS data can be calculated before any threads get created. (Well, okay, the initial thread already got created, but that's okay; we'll set up that thread's TLS before we execute any application code.)
If you loaded a __declspec(thread)-dependent module dynamically, bad things happened. For one, TLS data was not set up for any pre-existing threads, since those threads were initialized before your module got loaded. Windows doesn't have a time machine where it can go back in time to when those threads were initialized and pre-reserve space for the TLS variables your new module needed. Nope, your module is just out of luck with respect to those pre-existing threads, and if it tries to use __declspec(thread) variables, it'll find that its TLS slot never got initialized, and there's no data there to access.
Read more: The old new thing
Now that you've read Ken's articles...
No, wait I know you didn't read them and you're just skimming past it in the hopes that you will be able to fake your way through the rest of this article without having read the prerequisites. Well, okay, but don't be surprised when I get frustrated if you ask a question that is answered in the prerequisites.
Anyway, as you learned from Part 5 of Ken's series, the __declspec(thread) model, as originally envisioned, assumed that all DLLs which use the feature would be present at process startup, so that all the _tls_index values can be computed and the total sizes of each module's TLS data can be calculated before any threads get created. (Well, okay, the initial thread already got created, but that's okay; we'll set up that thread's TLS before we execute any application code.)
If you loaded a __declspec(thread)-dependent module dynamically, bad things happened. For one, TLS data was not set up for any pre-existing threads, since those threads were initialized before your module got loaded. Windows doesn't have a time machine where it can go back in time to when those threads were initialized and pre-reserve space for the TLS variables your new module needed. Nope, your module is just out of luck with respect to those pre-existing threads, and if it tries to use __declspec(thread) variables, it'll find that its TLS slot never got initialized, and there's no data there to access.
Read more: The old new thing