pub struct ParallelOperation<'c> { /* private fields */ }
A parallel operation. Parallel for operation.
The “scf.parallel” operation represents a loop nest taking 4 groups of SSA
values as operands that represent the lower bounds, upper bounds, steps and
initial values, respectively. The operation defines a variadic number of
SSA values for its induction variables. It has one region capturing the
loop body. The induction variables are represented as arguments of this
region. These SSA values always have type index, which is the size of the
machine word. The steps are values of type index, required to be positive.
The lower and upper bounds specify a half-open range: the range includes
the lower bound but does not include the upper bound. The initial values
have the same types as results of “scf.parallel”. If there are no results,
the keyword init can be omitted.
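The half-open bounds and positive steps described above can be sketched in plain Rust. This is a minimal model, not part of this crate: the function name `iteration_space` is hypothetical, and `usize` stands in for the `index` type as a simplification.

```rust
/// Enumerates the induction-variable tuples of a loop nest.
///
/// Each dimension `d` covers the half-open range `lower[d]..upper[d]`,
/// advancing by `steps[d]`. Returns `None` when the three operand groups
/// differ in length or when a step is zero (steps must be positive).
fn iteration_space(
    lower: &[usize],
    upper: &[usize],
    steps: &[usize],
) -> Option<Vec<Vec<usize>>> {
    if lower.len() != upper.len() || lower.len() != steps.len() || steps.contains(&0) {
        return None;
    }
    // Start with a single empty tuple and extend it one dimension at a time.
    let mut points: Vec<Vec<usize>> = vec![Vec::new()];
    for dim in 0..lower.len() {
        let mut next = Vec::new();
        for prefix in &points {
            // The upper bound itself is never visited.
            for iv in (lower[dim]..upper[dim]).step_by(steps[dim]) {
                let mut point = prefix.clone();
                point.push(iv);
                next.push(point);
            }
        }
        points = next;
    }
    Some(points)
}
```

The sketch lists points in one fixed order only for convenience; the operation itself places no ordering on the iteration space.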
Semantically we require that the iteration space can be iterated in any order, and the loop body can be executed in parallel. If there are data races, the behavior is undefined.
The parallel loop operation supports reduction of values produced by individual iterations into a single result. This is modeled using the scf.reduce operation (see scf.reduce for details). Each result of a scf.parallel operation is associated with an initial value operand and reduce operation that is an immediate child. Reductions are matched to result and initial values in order of their appearance in the body. Consequently, we require that the body region has the same number of results and initial values as it has reduce operations.
The body region must contain exactly one block that terminates with “scf.yield” without operands. Parsing ParallelOp will create such a region and insert the terminator when it is absent from the custom format.
Example:
%init = arith.constant 0.0 : f32
scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init) -> f32 {
  %elem_to_reduce = memref.load %buffer[%iv] : memref<100xf32>
  scf.reduce(%elem_to_reduce) : f32 {
    ^bb0(%lhs : f32, %rhs: f32):
      %res = arith.addf %lhs, %rhs : f32
      scf.reduce.return %res : f32
  }
}
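The reduction in the example can be modelled in plain Rust with scoped threads. This is a rough sketch of the semantics only, not how MLIR lowers the operation: `parallel_reduce` and its `workers` parameter are hypothetical, each worker folds one chunk of the buffer with the combiner, and the partial results are then combined with the initial value.

```rust
use std::thread;

/// Models an `scf.parallel` loop over `buffer` with a single reduction.
///
/// `init` plays the role of the initial value operand and `combine` the
/// role of the reduction body (`arith.addf` in the example above).
fn parallel_reduce<T, F>(buffer: &[T], init: T, workers: usize, combine: F) -> T
where
    T: Copy + Send + Sync,
    F: Fn(T, T) -> T + Sync,
{
    let workers = workers.max(1);
    // Ceiling division, kept at least 1 because `chunks(0)` would panic.
    let chunk_len = ((buffer.len() + workers - 1) / workers).max(1);
    let combine_ref = &combine;

    let partials: Vec<Option<T>> = thread::scope(|scope| {
        // Spawn every worker first so the chunks are processed concurrently.
        let handles: Vec<_> = buffer
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().copied().reduce(|a, b| combine_ref(a, b)))
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .collect()
    });

    // Fold the per-chunk results into the initial value.
    partials
        .into_iter()
        .flatten()
        .fold(init, |acc, partial| combine(acc, partial))
}
```

Because iterations may run in any order, the result is only well defined when the combiner tolerates regrouping. Floating-point addition, for instance, can round differently depending on how the partial sums are grouped.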
Implementations
impl<'c> ParallelOperation<'c>
pub fn as_operation(&self) -> &Operation<'c>
Returns a generic operation.
pub fn builder(
    context: &'c Context,
    location: Location<'c>
) -> ParallelOperationBuilder<'c, Unset, Unset, Unset, Unset, Unset, Unset>
Creates a builder.
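The six `Unset` type parameters in the return type suggest a typestate builder, where each required field is tracked in the type and the final build step only becomes available once every field is set. That is a reading of the signature, not something this page confirms. The sketch below illustrates the pattern with a hypothetical two-field `LoopBuilder`; none of its names come from this crate.

```rust
use std::marker::PhantomData;

// Marker types recording whether a required field has been provided.
struct Set;
struct Unset;

// Hypothetical product of the builder.
struct Loop {
    lower: usize,
    upper: usize,
}

// Hypothetical builder; `L` and `U` track the state of `lower` and `upper`.
struct LoopBuilder<L, U> {
    lower: Option<usize>,
    upper: Option<usize>,
    _state: PhantomData<(L, U)>,
}

impl LoopBuilder<Unset, Unset> {
    // A fresh builder starts with every field unset.
    fn new() -> Self {
        LoopBuilder { lower: None, upper: None, _state: PhantomData }
    }
}

impl<U> LoopBuilder<Unset, U> {
    // Setting `lower` flips the first marker from `Unset` to `Set`.
    fn lower(self, lower: usize) -> LoopBuilder<Set, U> {
        LoopBuilder { lower: Some(lower), upper: self.upper, _state: PhantomData }
    }
}

impl<L> LoopBuilder<L, Unset> {
    // Setting `upper` flips the second marker from `Unset` to `Set`.
    fn upper(self, upper: usize) -> LoopBuilder<L, Set> {
        LoopBuilder { lower: self.lower, upper: Some(upper), _state: PhantomData }
    }
}

impl LoopBuilder<Set, Set> {
    // `build` exists only when both markers are `Set`, so a missing field
    // is rejected at compile time rather than at run time.
    fn build(self) -> Loop {
        Loop {
            lower: self.lower.expect("lower is set in this state"),
            upper: self.upper.expect("upper is set in this state"),
        }
    }
}
```

With this shape, `LoopBuilder::new().upper(8).lower(2).build()` compiles in either setter order, while `LoopBuilder::new().lower(2).build()` does not, because `build` is not defined for `LoopBuilder<Set, Unset>`.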