
Using Multiple Processors


FlexPDE version 6 uses multi-threaded computation to support modern multi-core and multi-processor hardware configurations.  Only shared-memory multi-processors are supported, not clusters.

Each opened problem runs in its own computation thread, and can use up to eight additional computation threads.  A single main thread controls the graphic interface and screen display.

Matrix construction, residual calculations and linear system solvers are all multi-threaded.  Mesh generation and plot functions are not, although graphics load is shared between the problem thread and the main graphics thread.

Individual Problem Control

Each individual script can declare the number of worker threads to be used in the computation:

SELECT THREADS = <number>

requests that <number> worker threads be used, in addition to the main graphics thread and the individual problem thread.

Setting the Default

The default number of worker threads can be set by manually editing the configuration file "flexpde6.ini" in the "flexpde6user" folder.  This folder resides in the "My Documents" folder under Windows, and in the user's "home" folder under Linux and MacOSX.  Edit the line:

[THREADS] 1

to reflect the desired default number of worker threads.
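If you prefer to script this edit, the following Python sketch rewrites the thread entry in the text of the configuration file. It assumes the entry sits on a single line in the form shown above; the function name set_default_threads is our own illustration and is not part of FlexPDE. Reading and saving the file itself is left to the caller.

```python
import re

def set_default_threads(ini_text: str, threads: int) -> str:
    """Return ini_text with the [THREADS] entry set to `threads`.

    Assumes the entry is a single line starting with "[THREADS] <number>",
    as shown in this section; other layouts are not handled.
    """
    if threads < 0:
        raise ValueError("thread count must not be negative")
    # Match "[THREADS]" at the start of a line, followed by its number.
    pattern = re.compile(r"^\[THREADS\][ \t]*\d+", re.MULTILINE)
    new_text, count = pattern.subn(f"[THREADS] {threads}", ini_text)
    if count == 0:
        raise ValueError("no [THREADS] line found")
    return new_text

# Hypothetical file contents, for illustration only.
example = "[THREADS] 1\n"
print(set_default_threads(example, 4), end="")
```

The function raises an error rather than appending a new entry when no matching line is found, so that an unexpected file layout is noticed instead of silently changed.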

Command-Line Control

If you run FlexPDE6 from a command line and include the switch -T<number>, the default thread count will be set to <number>.  For example, the command line

flexpde6 -T4 problem

will set the default to 4 threads and load the script file "problem.pde".  The selected thread count will be written to the flexpde6.ini file on conclusion of the flexpde6 session.
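When FlexPDE is launched from another program, the same command line can be assembled in code. The Python sketch below only builds the argument list; the helper name flexpde_command is hypothetical, and the executable name "flexpde6" is taken from the example above and is assumed to be on the search path.

```python
from typing import List

def flexpde_command(threads: int, script: str, exe: str = "flexpde6") -> List[str]:
    """Build the argument list for starting FlexPDE with a default thread count."""
    if threads < 1:
        raise ValueError("thread count must be at least 1")
    # The switch is written as -T immediately followed by the number.
    return [exe, f"-T{threads}", script]

cmd = flexpde_command(4, "problem")
print(" ".join(cmd))

# To actually start the program (requires flexpde6 to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Keep in mind that, as noted above, the thread count given this way is written back to flexpde6.ini when the session ends, so it also becomes the default for later sessions.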

Speed Effects of Multiple Processors

Many factors influence the timing of a multi-threaded run.

The dominant factor is memory bandwidth. If the memory cannot keep up with the processors, then adding threads will make the run slower, because of the overhead of constructing and synchronizing threads and merging data.
The size of the problem also affects the speedup, because with a larger problem a smaller proportion of the data can be held in cache memory. The memory bandwidth limitation will therefore be more severe with a larger problem.
Graphics construction is not multi-threaded in FlexPDE V6. A script with many complex plots will therefore see its performance fall toward 1-thread levels.  (Graphic redraw is handled in a separate thread.)
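The memory bandwidth effect can be pictured with a deliberately crude model. The Python sketch below is not a description of how FlexPDE works and is not fitted to any measurement; the formula and all parameter values are assumptions chosen only to show the shape of the curve. It treats the run time as the larger of the divided compute time and a fixed memory-bound time, plus a per-thread overhead.

```python
def modeled_runtime(threads, compute_time, memory_time, overhead_per_thread):
    """Toy estimate of run time for a given number of worker threads.

    compute_time:        time the arithmetic would take on one thread
    memory_time:         lower bound set by memory bandwidth, independent of threads
    overhead_per_thread: cost of constructing, synchronizing and merging per thread
    """
    return max(compute_time / threads, memory_time) + overhead_per_thread * threads

# Hypothetical parameters, for illustration only.
for n in range(1, 5):
    t = modeled_runtime(n, compute_time=10.0, memory_time=3.3, overhead_per_thread=0.2)
    print(f"{n} thread(s): modeled time {t:.2f}")
```

With these made-up numbers the modeled time stops improving once the memory-bound term dominates, and the added overhead then makes the next thread a net loss.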

The following table shows our experience with speeds in versions 5 and 6. These tests were run on a 4-core AMD Phenom with 667 MHz 128-bit memory. Notice that the Black_Oil problem is significantly faster in version 6, even though it takes many more timesteps. The higher timestep count indicates that the timestep control in V6 is more pessimistic than in V5.  The speedup of V6 with 1 thread is partly due to the fact that graphic redraws are run in a separate thread in V6 but not in V5.

Notice that on this machine the memory saturates at 3 threads, so that the fourth thread produces no significant speed improvement (and in fact may be slower).

                      Black_Oil.pde             3D_FlowBox.pde
Version   Threads     CPU time    timesteps     CPU time
5         1           14:37       534           8:15
5         2           12:17       540           6:09
6         1           10:21       688           8:06
6         2           6:58        684           4:14
6         3           6:16        696           3:30
6         4           7:13        703           3:22
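To make the saturation easier to see, the version 6 timings can be converted to speedup and per-thread efficiency relative to the 1-thread run. The Python sketch below does this arithmetic; it assumes the CPU times are written as minutes:seconds, which the table does not state explicitly, and the helper names are our own.

```python
def to_seconds(clock: str) -> int:
    """Convert a "minutes:seconds" string to seconds (assumed format)."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)

# Version 6 CPU times copied from the table, keyed by thread count.
black_oil = {1: "10:21", 2: "6:58", 3: "6:16", 4: "7:13"}
flowbox = {1: "8:06", 2: "4:14", 3: "3:30", 4: "3:22"}

def speedups(times):
    """Speedup of each run relative to the 1-thread run of the same problem."""
    base = to_seconds(times[1])
    return {n: base / to_seconds(t) for n, t in times.items()}

for name, times in (("Black_Oil", black_oil), ("3D_FlowBox", flowbox)):
    for n, s in speedups(times).items():
        print(f"{name}: {n} thread(s), speedup {s:.2f}, efficiency {s / n:.2f}")
```

Under that assumption, the Black_Oil speedup peaks at about 1.65 with three threads and drops to about 1.43 with four, while 3D_FlowBox moves only from about 2.31 to about 2.41.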

 

 

