pub struct ParallelInsertSliceOperation<'c> { /* private fields */ }
Expand description
A parallel_insert_slice
operation.
Specify the tensor slice update of a single thread of a parent
ParallelCombiningOpInterface op.
.
The parallel_insert_slice
yields a subset tensor value to its parent
ParallelCombiningOpInterface. These subset tensor values are aggregated to
in some unspecified order into a full tensor value returned by the parent
parallel iterating op.
The parallel_insert_slice
is one such op allowed in the
ParallelCombiningOpInterface op.
Conflicting writes result in undefined semantics, in that the indices written to by multiple parallel updates might contain data from any of the updates, or even a malformed bit pattern.
If an index is updated exactly once, the value contained at that index in the resulting tensor will be equal to the value at a corresponding index of a slice that was used for the updated. If an index is not updated at all, its value will be equal to the one in the original tensor.
This op does not create a new value, which allows maintaining a clean separation between the subset and full tensor.
Note that we cannot mark this operation as pure (Pures), even though it has no side effects, because it will get DCEd during canonicalization.
The parallel_insert_slice operation supports the following arguments:
- source: the tensor that is inserted.
- dest: the tensor into which the source tensor is inserted.
- offsets: tensor-rank number of offsets into the
dest
tensor into which the slice is inserted. - sizes: tensor-rank number of sizes which specify the sizes of the source tensor type.
- strides: tensor-rank number of strides that specify subsampling in each dimension.
The representation based on offsets, sizes and strides support a
partially-static specification via attributes specified through the
static_offsets
, static_sizes
and static_strides
arguments. A special
sentinel value ShapedType::kDynamic encodes that the corresponding entry has
a dynamic value.
After buffer allocation, the “parallel_insert_slice” op is expected to lower into a memref.subview op.
A parallel_insert_slice operation may additionally specify insertion into a tensor of higher rank than the source tensor, along dimensions that are statically known to be of size 1. This rank-altering behavior is not required by the op semantics: this flexibility allows to progressively drop unit dimensions while lowering between different flavors of ops on that operate on tensors. The rank-altering behavior of tensor.parallel_insert_slice matches the rank-reducing behavior of tensor.insert_slice and tensor.extract_slice.
§Verification in the rank-reduced case
The same verification discussion and mechanisms apply as for ExtractSliceOp. Unlike ExtractSliceOp however, there is no need for a specific inference.
Implementations§
source§impl<'c> ParallelInsertSliceOperation<'c>
impl<'c> ParallelInsertSliceOperation<'c>
sourcepub fn as_operation(&self) -> &Operation<'c>
pub fn as_operation(&self) -> &Operation<'c>
Returns a generic operation.
sourcepub fn builder(
context: &'c Context,
location: Location<'c>
) -> ParallelInsertSliceOperationBuilder<'c, Unset, Unset, Unset, Unset, Unset, Unset, Unset, Unset>
pub fn builder( context: &'c Context, location: Location<'c> ) -> ParallelInsertSliceOperationBuilder<'c, Unset, Unset, Unset, Unset, Unset, Unset, Unset, Unset>
Creates a builder.