Particle Re-distribution for Load Balancing on the Cray T3D
Xin Yuan, Bohr He and Rami Melhem
Computer Science Department
University of Pittsburgh
PROCEDURE NAME AND PARAMETERS:
subroutine bisect(xmin, xmax, numbin, extbin, npartpemax,
nattri, nattrf, x, attri, attrf, npartpe, nghostpe, xminpe, xmaxpe,
binminpe, binmaxpe, bal_flag, bound_flag, usercut_flag, cutdimension, iwork, rwork)
INTRODUCTION
bisect() distributes particles among p PEs according to
different
criteria (described below) by using nested bi-section, bin sorting and domain
extention.
Particles are specified by their positions within a rectangular
parallelepiped physical domain, and each particle has a certain number
of attributes. One of these attributes may be a "weight" associated with that
particle.
The routine imposes on the domain a 3-dimensional grid of
bins whose size is specified by the user.
It, then, bisects the domain at bin boundaries into parallelepiped
subdomains. Each cut doubles the number of subdomains. After log p
cuts, the original physical domain is decomposed into p
parallelepipled subdomains with the number of particles or the sum of weights of
the particles in each subdomain being balanced.
Each subdomain is assigned to one PE along with the particles within the subdomain,
and the particles within a user specified extention of the subdomain.
HIGH LEVEL I/O SPECIFICATION
Inputs
Each PE has npartpe particles, where npartpe may differ from one PE to the next. The positions of the particles in the domain are stored in a local array x(npartpemax,3) along with other integer and floating point attributes that are stored in local arrays attri(npartpemax,nattri) and attrf(npartpemax,nattrf), respectively.Outputs
The physical space is divided up into subdomains each of which belongs to one processor. Thus aside from the external boundaries associated with the boundaries of the physical domain, there are internal boundaries associated with the subdomain assigned to a PE. The positions and attributes of the particles in that subdomain (corresponding to a single PE), are stored in the arrays x, attri and attrf. Moreover, by assigning a non-zero value to the input parameter extbin, the arrays x, attri and attrf will contain, in addition to the particles in the subdomain, replicated copies of the particles from the original domain that lie within extbin bins outside the subdomain boundaries. These replicated particles, called "ghost particles", may be needed to account for the interaction between subdomains. The number of ghost particles is returned in nghostpe. This provides the user with enough flexibility to avoid future off-processor communication to access these ghost particles.INTERFACE
The following parameters are passed tobisect.
The calling procedure has to set any variable marked
*in* or *inout*
to appropriate values before calling bisect(). The
variables marked *out*
and *inout* are set or modified, by bisect().
The variables marked *work*
are work arrays used internally by bisect().
Problem domain definition
- xmin(3), xmax(3)
- domain boundaries for the entire physical problem.*in*
- numbin(3)
- number of bins in each dimension. *in*
Particle parameters
- npartpe
- number of particles in the local array x at the calling PE. *inout*
- nghostpe
- number of ghosts in x following the regular particles. *out*
- npartpemax
- maximum number of particles allowed per PE. The subroutine writes out warnings if, at any point of the bisection step, npartpe + nghostpe exceeds npartpemax. *in*
- nattri
- number of integer attributes per particle. Can be set by the user at run time.*in*
- nattrf
- number of floating point attributes per particle. Can be set by the user at run time.*in*
Particle arrays - modified by bisect()
- x(npartpemax,3)
- array containing the positions of the particles. *inout*
- attri(npartpemax,nattri)
- array containing integer attributes. *inout*
- attrf(npartpemax,nattrf)
- array containing floating point attributes. *inout*
When Bisect returns, the first npartpe positions of the arrays -- x, attri and attrf on a given processor contain information about the particles assigned to that processor. The next nghostpe positions contain information about the ghosts duplicated on that processor.
Distribution specification
- 0
- balance the number of of particles per subdomain
- 1
- uniform bisection (all subdomains have roughly the same
volume - always bisect into equal halves).
The particles are swapped between processors so that each processor has
the particles that correspond to the local partition of a shared array
that is distributed in
(:block(numbin(1)), :block(numbin(2)),:block(numbin(3)))
fashion across the PEs. - 2
- balance a weighted sum of the particles per subdomain. The weight of each particle, i, is taken from the first array in attrf(). That is attrf(i,1). In typical use, the weight of a particle will be chosen to be proportional to the execution time needed to process that particle.
- extbin
- number of ghost bins used to extend the subdomains at cut boundaries. It is the same in each bisected direction. *in*
- 0
- no outside boundary conditions. Only particles close to interior boundaries, edges or corners are replicated.
- 1
- periodic boundary conditions. Particles close to interior boundaries, edges or corners are still replicated as before. Furthermore, a particle within extbin bins of a boundary in the physical domain is replicated as a ghost in the processor which owns the opposite boundary. The coordinates (position) of the replicated ghost particle are kept equal to the coordinates of the original particle.
- 2
- same as in bound_flag = 1, except that The coordinates (position) of the replicated ghost particle are changed to reflect its new position in the extended domain.
- 0
- cuts performed automatically in round robin
- 1
- cuts performed along user specified dimensions
Subdomain specification
- xminpe (3)
- Floating point variables that give the lower boundaries of the subdomain owned by the current processor. This represents the "inner" subdomain, i.e. without the ghost zones on the boundaries. *out*
- xmaxpe (3)
- Similar to "xminpe" above. The upper boundaries of the inner subdomain. *out*
- binminpe (3)
- Integer variables that give the lower bin number in each dimension of the subdomain owned by the current processor. *out*
- binmaxpe (3)
- Similarly for the upper bin number. *out*
Work arrays
- rwork(npartpemax,2)
- A floating point work array. *work*
- iwork(npartpemax,3)
- An integer work array. *work*
Particle Re-distribution for Load Balancing on the Cray T3D / Xin Yuan, Bohr He and Rami Melhem, Computer Science Department, University of Pittsburgh / melhem@cs.pitt.edu