HPL_dgemm - Man Page
C := alpha * op(A) * op(B) + beta * C.
Synopsis
#include "hpl.h"
void HPL_dgemm( const enum HPL_ORDER ORDER, const enum HPL_TRANS TRANSA, const enum HPL_TRANS TRANSB, const int M, const int N, const int K, const double ALPHA, const double * A, const int LDA, const double * B, const int LDB, const double BETA, double * C, const int LDC );
Description
HPL_dgemm performs one of the matrix-matrix operations
C := alpha * op( A ) * op( B ) + beta * C
where op( X ) is one of
op( X ) = X or op( X ) = X^T.
Alpha and beta are scalars, and A, B and C are matrices, with op(A) an m by k matrix, op(B) a k by n matrix and C an m by n matrix.
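The operation described above can be sketched as a plain triple loop. This is only an illustration of the semantics, not the HPL implementation: the name ref_dgemm is hypothetical, the transpose flags are plain ints rather than enum HPL_TRANS values, and only column-major storage is assumed.

```c
#include <assert.h>

/* Hypothetical reference routine (not part of HPL), column-major only.
 * Computes C := alpha * op(A) * op(B) + beta * C, where transa/transb
 * are nonzero to select the transposed operand. */
static void ref_dgemm(int transa, int transb, int m, int n, int k,
                      double alpha, const double *a, int lda,
                      const double *b, int ldb,
                      double beta, double *c, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            /* A and B are not read when alpha is zero. */
            if (alpha != 0.0) {
                for (int l = 0; l < k; l++) {
                    /* op(A)(i,l) and op(B)(l,j) */
                    double ail = transa ? a[l + i * lda] : a[i + l * lda];
                    double blj = transb ? b[j + l * ldb] : b[l + j * ldb];
                    sum += ail * blj;
                }
            }
            /* C is not read when beta is zero. */
            if (beta == 0.0)
                c[i + j * ldc] = alpha * sum;
            else
                c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
        }
    }
}
```

A production routine would block the loops for cache reuse; the sketch favors readability over speed.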
Arguments
- ORDER (local input) const enum HPL_ORDER
On entry, ORDER specifies the storage format of the operands as follows:
ORDER = HplRowMajor,
ORDER = HplColumnMajor.
- TRANSA (local input) const enum HPL_TRANS
On entry, TRANSA specifies the form of op(A) to be used in the matrix-matrix operation as follows:
TRANSA==HplNoTrans : op( A ) = A,
TRANSA==HplTrans : op( A ) = A^T,
TRANSA==HplConjTrans : op( A ) = A^T.
- TRANSB (local input) const enum HPL_TRANS
On entry, TRANSB specifies the form of op(B) to be used in the matrix-matrix operation as follows:
TRANSB==HplNoTrans : op( B ) = B,
TRANSB==HplTrans : op( B ) = B^T,
TRANSB==HplConjTrans : op( B ) = B^T.
- M (local input) const int
On entry, M specifies the number of rows of the matrix op(A) and of the matrix C. M must be at least zero.
- N (local input) const int
On entry, N specifies the number of columns of the matrix op(B) and the number of columns of the matrix C. N must be at least zero.
- K (local input) const int
On entry, K specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B). K must be at least zero.
- ALPHA (local input) const double
On entry, ALPHA specifies the scalar alpha. When ALPHA is supplied as zero then the elements of the matrices A and B need not be set on input.
- A (local input) const double *
On entry, A is an array of dimension (LDA,ka), where ka is k when TRANSA==HplNoTrans, and is m otherwise. Before entry with TRANSA==HplNoTrans, the leading m by k part of the array A must contain the matrix A, otherwise the leading k by m part of the array A must contain the matrix A.
- LDA (local input) const int
On entry, LDA specifies the first dimension of A as declared in the calling (sub) program. When TRANSA==HplNoTrans then LDA must be at least max(1,m), otherwise LDA must be at least max(1,k).
- B (local input) const double *
On entry, B is an array of dimension (LDB,kb), where kb is n when TRANSB==HplNoTrans, and is k otherwise. Before entry with TRANSB==HplNoTrans, the leading k by n part of the array B must contain the matrix B, otherwise the leading n by k part of the array B must contain the matrix B.
- LDB (local input) const int
On entry, LDB specifies the first dimension of B as declared in the calling (sub) program. When TRANSB==HplNoTrans then LDB must be at least max(1,k), otherwise LDB must be at least max(1,n).
- BETA (local input) const double
On entry, BETA specifies the scalar beta. When BETA is supplied as zero then the elements of the matrix C need not be set on input.
- C (local input/output) double *
On entry, C is an array of dimension (LDC,n). Before entry, the leading m by n part of the array C must contain the matrix C, except when beta is zero, in which case C need not be set on entry. On exit, the array C is overwritten by the m by n matrix ( alpha*op( A )*op( B ) + beta*C ).
- LDC (local input) const int
On entry, LDC specifies the first dimension of C as declared in the calling (sub) program. LDC must be at least max(1,m).
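The role of the leading dimensions LDA, LDB and LDC can be illustrated with a small sketch. It assumes column-major storage, where element (i,j) of an array with leading dimension ld sits at offset i + j*ld, so a sub-block of a larger array can be addressed by passing a pointer to its first element together with the parent's leading dimension. The helper names col_major_index and block_sum are hypothetical and not part of HPL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset of element (i,j) in a column-major
 * array whose leading (first) dimension is ld. */
static size_t col_major_index(int i, int j, int ld)
{
    assert(i >= 0 && j >= 0 && ld >= 1 && i < ld);
    return (size_t)i + (size_t)j * (size_t)ld;
}

/* Hypothetical helper: sum the entries of an m-by-n block that is
 * viewed through leading dimension ld. As for LDC, ld must be at
 * least max(1,m). */
static double block_sum(const double *a, int m, int n, int ld)
{
    assert(m >= 0 && n >= 0);
    assert(ld >= (m > 1 ? m : 1));
    double s = 0.0;
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++)
            s += a[col_major_index(i, j, ld)];
    return s;
}
```

For instance, with a 4 by 4 parent array p (leading dimension 4), the 2 by 2 block whose top-left entry is p(1,1) would be passed as p + col_major_index(1, 1, 4) with a leading dimension of 4, not 2.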
Example
#include <stdio.h>
#include "hpl.h"

int main(int argc, char *argv[])
{
   /* 2 x 2 operands stored in column-major order */
   double a[2*2], b[2*2], c[2*2];
   a[0] = 1.0; a[1] = 2.0; a[2] = 3.0; a[3] = 3.0;
   b[0] = 2.0; b[1] = 1.0; b[2] = 1.0; b[3] = 2.0;
   c[0] = 4.0; c[1] = 3.0; c[2] = 2.0; c[3] = 1.0;
   /* C := 2.0 * A * B - 1.0 * C */
   HPL_dgemm( HplColumnMajor, HplNoTrans, HplNoTrans,
              2, 2, 2, 2.0, a, 2, b, 2, -1.0, c, 2 );
   /* Print C one row per line */
   printf("  [%f,%f]\n", c[0], c[2]);
   printf("c=[%f,%f]\n", c[1], c[3]);
   return 0;
}
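As a check on the example above, the result can be worked out by hand. Reading the arrays in column-major order (as requested by HplColumnMajor) with alpha = 2 and beta = -1 gives:

```latex
A = \begin{pmatrix} 1 & 3 \\ 2 & 3 \end{pmatrix}, \quad
B = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad
C = \begin{pmatrix} 4 & 2 \\ 3 & 1 \end{pmatrix}

A B = \begin{pmatrix} 1\cdot 2 + 3\cdot 1 & 1\cdot 1 + 3\cdot 2 \\
                      2\cdot 2 + 3\cdot 1 & 2\cdot 1 + 3\cdot 2 \end{pmatrix}
    = \begin{pmatrix} 5 & 7 \\ 7 & 8 \end{pmatrix}

2\,A B - C = \begin{pmatrix} 10 & 14 \\ 14 & 16 \end{pmatrix}
           - \begin{pmatrix} 4 & 2 \\ 3 & 1 \end{pmatrix}
           = \begin{pmatrix} 6 & 12 \\ 11 & 15 \end{pmatrix}
```

The two printed rows should therefore contain 6 and 12, then 11 and 15, each shown with the six decimal places that %f produces by default.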
See Also
HPL_dtrsm (3).