[TriLUG] Questions about Threading
Ed Hill
ed at eh3.com
Tue Mar 26 13:43:02 EST 2002
On Tue, 2002-03-26 at 10:50, Jeremy P wrote:
> On Tue, 26 Mar 2002, Jeff Bollinger wrote:
>
> > I'm not a programmer, but I am curious about threading and the
> > advantage to having a multi-processor linux box (I guess running an
> > smp kernel). Do programs have to be written to take advantage of a
> > dual-processor, or will cycles be distributed evenly during a process?
> > How do you know if your program will support multiple processors?
>
> This is my general, non-programming understanding of this; hopefully
> someone can correct any misconceptions.
>
> Programs can be written to take direct advantage of multiple processors,
> but as far as I know, most for Linux don't. There seems to be some
> argument about different threading models, and until that's sorted out, we
> probably won't see much multithreaded stuff. The problem is that these
> things are very different among the Unix platforms, and most open source
> projects want their software to work on Linux, Solaris, BSD, etc. -- and
> each has very different systems for multithreaded processes.
Yes and no. Essentially all Unixes, including Linux, now have fairly
decent support for the POSIX threading model (pthreads).

Note that Linus defined a thread as a CoE (Context of Execution) and
threads in Linux are implemented using a clone() system call that works
much like fork(), except that the caller chooses what the child shares
with its parent (address space, open files, etc), so that Linux threads
essentially *are* processes. The reasons for this choice were:
  - full processes in Linux are already *very* lightweight and
    incur a (provably) minimal penalty due to context switches
    on many architectures (e.g. a context switch on Linux for x86
    is something like 50-100x faster than on Solaris for x86)
  - treating threads as processes within the kernel allows them
    to be easily and cheaply scheduled on both uni- and multi-proc
    systems
  - treating threads within the kernel as processes helps keep
    the kernel (scheduling, MM, etc) code much simpler
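One way to see that each thread is its own schedulable entity is to
compare IDs from inside a thread. This is only a rough sketch in Python
(not something from the original thread), assuming Python 3.8 or later
for threading.get_native_id(); the names describe and report are made
up for illustration.

```python
import os
import threading

def describe(label, report):
    # Record the process ID and the kernel-assigned thread ID as seen
    # from whichever thread calls this function.
    report[label] = (os.getpid(), threading.get_native_id())

report = {}
describe("main", report)

# Run the same function in a second thread of the same process.
worker = threading.Thread(target=describe, args=("worker", report))
worker.start()
worker.join()

for label, (pid, tid) in report.items():
    print(f"{label}: pid={pid} kernel thread id={tid}")
```

Both lines should report the same pid but different kernel thread IDs.
On Linux the main thread's ID equals the pid, and these IDs are what
"ps -L" lists in its LWP column.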
> My understanding is that "symmetric" multi-processing, the S in SMP,
> implies that each heavyweight (non-threaded) process is assigned to a
No. "Symmetric" means the processors are "equal": any of them can run
any kernel or user code, and they all have equal access to the system
memory. A contrasting design is NUMA (Non-Uniform Memory Access), where
each processor has its own local memory and reaches memory attached to
other processors more slowly, over an interconnect.
> given processor. If a computer only runs one single-process program, this
> does you no good. But since programs like Apache often use many
> sub-processes, the scheduler distributes the load pretty evenly by
> assigning some to one processor, and some to another. Also, the kernel
> apparently can easily switch the processes around to balance the load.
> If you run "top" on a multi-processor system, you can monitor this; on a
> loaded system the CPU% will add up to n*100%, with 100% for each
> processor.
A multi-processor box *will* help a little even if your code is not
threaded. The OS and daemon CPU load will tend to run on whichever
processor isn't, at that moment, running your non-threaded app, thus
giving it a (slight?) performance boost.
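By contrast, when the work is split across several processes, as in the
Apache case quoted above, the scheduler has something to spread over the
processors. A minimal sketch, assuming Python 3 and a made-up CPU-bound
job (CHILD_CODE and run_workers are hypothetical names, and the worker
count of 4 is arbitrary):

```python
import subprocess
import sys

# Hypothetical CPU-bound job, run in a separate interpreter process.
CHILD_CODE = "import os; print(os.getpid(), sum(i * i for i in range(200000)))"

def run_workers(count):
    """Start `count` independent processes at once; return (pid, result) pairs."""
    procs = [
        subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                         stdout=subprocess.PIPE, text=True)
        for _ in range(count)
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()   # wait for this child and read its output
        pid, value = out.split()
        results.append((int(pid), int(value)))
    return results

workers = run_workers(4)
for pid, value in workers:
    print(f"worker pid {pid} computed {value}")
```

All the children are started before any of them is waited on, so on an
SMP box the kernel is free to run them on different processors at the
same time. Nothing in the code says which processor each one gets.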
And note that processes are not assigned to CPUs in any lasting sense.
The OS scheduler may try to keep a process on a particular CPU (cache
affinity) but, in general, processes rapidly bounce in and out of
context on both uni-proc and SMP systems.
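On current Linux systems a process can ask which CPUs the scheduler is
allowed to place it on. A small sketch, assuming Python 3; the helper
name allowed_cpus and the fallback for non-Linux platforms are
assumptions for illustration only:

```python
import os

def allowed_cpus():
    """Return the set of CPU numbers this process may be scheduled on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux-specific call; 0 means "the calling process".
        return set(os.sched_getaffinity(0))
    # Fallback assumption for other platforms: every CPU is allowed.
    return set(range(os.cpu_count() or 1))

cpus = allowed_cpus()
print(f"this process may run on {len(cpus)} CPU(s): {sorted(cpus)}")
```

By default the set normally covers every processor in the box, which
matches the point above: the process is not tied to one CPU unless
something narrows the set, for example with os.sched_setaffinity().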
Ed
--
Edward H. Hill III, PhD
Post-Doctoral Researcher | Email: ed at eh3.com, ehill at mines.edu
Division of ESE | URL: http://www.eh3.com
Colorado School of Mines | Phone: 303-273-3483
Golden, CO 80401 | Fax: 303-273-3311
Key fingerprint = 5BDE 4DA1 66BE 4F7B BC17 3A0C 932B 7266 1E76 F123