We start by explaining the simple case of matrix-vector multiplication:

where *A* is an matrix.
For our explanations, we will assume
that matrix *A* and vectors *x* and *y* are
created according to the driver given in Example 2.3.
Thus, the template is created with a
communicator and a blocking
of . Given this template,
vectors *x* and *y* are aligned with the first element
of the template vector and matrix *A* with the upper-left
element of the template matrix. This yields objects
`a`, `x`, and `y`.

As explained in Section 1.5, the following
steps will perform the matrix-vector multiply *y* = *A x* :

We will now show how to translate these operations into PLAPACK code.

- spread (collect) the entries of
xwithin columns of nodes,- perform the local matrix-vector multiply, and
- perform a distributed reduce (summation) of the local results within rows, leaving the global result in vector
y.

The mechanism used by PLAPACK to communicate is to describe
the initial and final distribution as objects, and perform
a copy or reduce.
Thus, the following statements will perform the
spread of *x* within columns of nodes:

PLA_Obj_datatype( a, &datatype ); PLA_Pvector_create( datatype, PLA_PROJ_ONTO_ROW, PLA_ALL_ROWS, n, template, PLA_ALIGN_FIRST, &xdup ); PLA_Copy( x, xdup );After this, all information is available locally to perform the local matrix-vector multiply. Before doing so, we need to create duplicated multiscalars to hold the constants ``0'' and ``1''. Also, a duplicated projected vector (column) must be created to hold the result:

PLA_Mscalar_create( datatype, PLA_ALL_ROWS, PLA_ALL_COLS, 1, 1, template, &one ); PLA_Obj_set_to_one( one ); PLA_Mscalar_create( datatype, PLA_ALL_ROWS, PLA_ALL_COLS, 1, 1, template, &zero ); PLA_Obj_set_to_zero( zero ); PLA_Pvector_create( datatype, PLA_PROJ_ONTO_COL, PLA_ALL_COLS, m, template, PLA_ALIGN_FIRST, &ydup ); PLA_Local_gemv( PLA_NO_TRANSPOSE, one, a, xdup, zero, ydup );

Finally, the local results (in the different versions of the duplicated
projected vector `ydup`) must be reduced into a single vector *y* :

PLA_Obj_set_to_zero( y ); PLA_Reduce( ydup, MPI_SUM, y );