Matrix Market Format and Parser
Overview
The Matrix Market Exchange Format is a de facto standard for distributing sparse matrices in scientific computing. It was developed by the National Institute of Standards and Technology (NIST) and is documented in Boisvert et al. (1997).
A Matrix Market file (extension .mtx) consists of:
A header line identifying the format
Optional comment lines
Size line: m n nnz
nnz data lines, each containing row col [value]
Example Matrix Market File
%%MatrixMarket matrix coordinate real general
%
% This is a 5×5 symmetric matrix with 13 non-zeros
%
5 5 13
1 1 2.0
1 2 1.0
2 1 1.0
2 3 3.0
3 2 3.0
...
Format Specification
Field Definitions
Object (second field):
| Token | Meaning |
|---|---|
| `matrix` | Sparse matrix (supported) |
| | Dense vector (not supported) |
| | Element stiffness matrix (not supported) |
Format (third field):
| Token | Storage Format | Our Support |
|---|---|---|
| `coordinate` | COO format (row, col, value triples) | Supported |
| `array` | Dense array, stored column-major in the file | Not supported in Phase 1 |
Data Type (fourth field):
| Token | Meaning | Our Treatment |
|---|---|---|
| `real` | IEEE-754 floating point | Stored as double |
| `integer` | Integer values | Promoted to double |
| `complex` | Complex number pair | Not supported in Phase 1 |
| `pattern` | No value column; value is implicitly 1.0 | Treated as value = 1.0 |
Symmetry (optional fifth field):
| Token | Meaning | Handling |
|---|---|---|
| `general` | All entries stored explicitly | Fully handled |
| `symmetric` | Only upper or lower triangle stored | Handled (duplicates to full matrix) |
| `skew-symmetric` | \(A_{ij} = -A_{ji}\) | Not yet handled |
| `hermitian` | Complex conjugate symmetry | Not yet handled |
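The field definitions above map onto a small banner-parsing routine. The following is a minimal sketch under stated assumptions, not the project's actual implementation: the names BannerSketch, to_lower_sketch, parse_banner_sketch and is_supported_sketch are hypothetical, tokens are assumed to be matched case-insensitively, and a missing symmetry token is assumed to default to general.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical container for the banner tokens (not the project's MatrixMarketHeader).
struct BannerSketch {
    std::string object;    // e.g. "matrix"
    std::string format;    // "coordinate" or "array"
    std::string field;     // "real", "integer", "complex" or "pattern"
    std::string symmetry;  // "general", "symmetric", ...
};

// Lower-case a token so that "Real" and "real" compare equal (assumed behaviour).
inline std::string to_lower_sketch(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Split the first line of a .mtx file into its banner tokens.
inline BannerSketch parse_banner_sketch(const std::string& line) {
    std::istringstream in(line);
    std::string magic;
    BannerSketch b;
    if (!(in >> magic) || magic != "%%MatrixMarket") {
        throw std::runtime_error("missing %%MatrixMarket banner: " + line);
    }
    if (!(in >> b.object >> b.format >> b.field)) {
        throw std::runtime_error("incomplete banner: " + line);
    }
    // Assumption: the symmetry token is optional and defaults to "general".
    if (!(in >> b.symmetry)) {
        b.symmetry = "general";
    }
    b.object = to_lower_sketch(b.object);
    b.format = to_lower_sketch(b.format);
    b.field = to_lower_sketch(b.field);
    b.symmetry = to_lower_sketch(b.symmetry);
    return b;
}

// Mirror the support tables above: coordinate matrices with real, integer or
// pattern entries and general or symmetric structure.
inline bool is_supported_sketch(const BannerSketch& b) {
    return b.object == "matrix" && b.format == "coordinate" &&
           (b.field == "real" || b.field == "integer" || b.field == "pattern") &&
           (b.symmetry == "general" || b.symmetry == "symmetric");
}
```

Keeping the tokens as strings makes the sketch independent of the enum definitions in the API reference; a real parser would presumably map them to MatrixFormat, ScalarField and SymmetryField.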
Index Convention
Important: Matrix Market uses 1-based indexing (Fortran convention). All indices are converted to 0-based (C convention) upon reading.
The conversion is simply row_csr = row_mm - 1 and col_csr = col_mm - 1.
Parsing Algorithm
The parser operates in two phases:
Phase 1: COO Parse
The file is read once, line by line:
Read header: parse the %%MatrixMarket banner on the first line
Skip comments: subsequent lines starting with %
Read size line: extract \(m, n, \text{nnz}\)
Read data lines: for each non-comment line:
    row_mm, col_mm, value = parse(line)
    row_csr = row_mm - 1
    col_csr = col_mm - 1
    append to COO_SparseMatrix
For pattern format, data lines carry no value column, and each entry is assigned the implicit value 1.0.
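The per-line step of Phase 1 can be sketched as follows. This is an illustrative sketch only: EntrySketch and parse_entry_sketch are hypothetical names, the project's COO_SparseMatrix layout is not shown in this document, and the error messages are placeholders. It reads the 1-based indices, applies the implicit value for pattern files, checks the index range, and returns 0-based indices.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical triple type; the project's COO_SparseMatrix may store entries differently.
struct EntrySketch {
    long row;      // 0-based row index
    long col;      // 0-based column index
    double value;  // 1.0 for pattern matrices
};

// Parse one coordinate data line of an m-by-n matrix.
// `is_pattern` selects the two-column form "row col" without a value.
inline EntrySketch parse_entry_sketch(const std::string& line, bool is_pattern,
                                      long m, long n) {
    std::istringstream in(line);
    long row_mm = 0;
    long col_mm = 0;
    double value = 1.0;  // implicit value used when is_pattern is true
    if (!(in >> row_mm >> col_mm)) {
        throw std::runtime_error("malformed data line: " + line);
    }
    // Integer-valued files parse here too: "7" is read as the double 7.0.
    if (!is_pattern && !(in >> value)) {
        throw std::runtime_error("missing value: " + line);
    }
    if (row_mm < 1 || row_mm > m || col_mm < 1 || col_mm > n) {
        throw std::runtime_error("index out of range: " + line);
    }
    // Convert from 1-based (Matrix Market) to 0-based indices.
    return EntrySketch{row_mm - 1, col_mm - 1, value};
}
```

For instance, under these assumptions the line "2 3 3.0" of a 5-by-5 matrix yields row 1, column 2, value 3.0.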
Phase 2: COO to CSR Conversion
After parsing into COO format, the matrix is converted to CSR using the algorithm described in CSR (Compressed Sparse Row) Format (see sparse_matrix.rst).
Symmetry Expansion
For symmetric matrices, the stored triangle represents the full matrix.
Each entry \((i, j)\) with \(i \neq j\) implicitly defines
\(A_{ji} = A_{ij}\). The parser expands these entries by writing each
off-diagonal entry twice, once for each position; diagonal entries are written once.
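A minimal sketch of this expansion, assuming the entries are held as a simple vector of triples (TripleSketch and expand_symmetric_sketch are hypothetical names, not part of the documented API):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 0-based (row, col, value) triple.
struct TripleSketch {
    long row;
    long col;
    double value;
};

// Return the full entry list for a symmetric matrix given one stored triangle.
// Each off-diagonal entry (i, j) is followed by its mirror (j, i);
// diagonal entries are copied once.
inline std::vector<TripleSketch> expand_symmetric_sketch(
        const std::vector<TripleSketch>& stored) {
    std::vector<TripleSketch> full;
    full.reserve(2 * stored.size());  // upper bound on the expanded size
    for (const TripleSketch& t : stored) {
        full.push_back(t);
        if (t.row != t.col) {
            full.push_back(TripleSketch{t.col, t.row, t.value});
        }
    }
    return full;
}
```

If the stored triangle holds nnz entries of which d lie on the diagonal, the expanded list has 2·nnz − d entries, so the nnz value on the size line is not the final entry count for symmetric files.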
Error Handling
The parser throws std::runtime_error on:
| Error Condition | Description |
|---|---|
| Missing banner | First line is not a %%MatrixMarket banner |
| Unsupported format | |
| Malformed size line | Non-integer values or wrong column count |
| Index out of range | row outside \([1, m]\) or col outside \([1, n]\) |
| I/O error | File not found, read failure |
The error message includes the problematic line number and content for debugging.
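One way to attach the line number and content to the exception is a small helper like the one below. This is a sketch only: make_parse_error_sketch is a hypothetical name, and the exact message layout is an assumption, since the document does not specify the wording the parser uses.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Build a std::runtime_error that carries the 1-based line number and the
// offending line. The message format here is an assumption for illustration.
inline std::runtime_error make_parse_error_sketch(const std::string& what,
                                                  std::size_t line_no,
                                                  const std::string& content) {
    return std::runtime_error("Matrix Market parse error at line " +
                              std::to_string(line_no) + ": " + what +
                              " [" + content + "]");
}
```

A caller would then write throw make_parse_error_sketch("index out of range", line_no, line); at the point where a check fails, so that every error condition in the table reports its location the same way.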
API Reference
Note: The C++ API reference below requires Doxygen to generate the XML
intermediate. Currently, these are documented for future integration with
Breathe. Run make docs after building the project with CMake to generate
the full API documentation.
- enum class MatrixFormat
Storage layout as declared in the banner.
-
enumerator COO
Coordinate format: row/col/value triples (what we read).
-
enumerator CSR
Array format: dense, stored column-major in the file (not yet supported).
- enum class ScalarField
Data type of each matrix entry.
-
enumerator REAL
IEEE-754 double (internal representation).
-
enumerator INTEGER
Integer values, promoted to double on read.
-
enumerator COMPLEX
Complex number pair (not supported in Phase 1).
-
enumerator PATTERN
Value is implicitly 1.0.
- enum class SymmetryField
Symmetry of the matrix.
-
enumerator GENERAL
All entries stored explicitly.
-
enumerator SYMMETRIC
Only one triangle stored; off-diagonal entries are duplicated on expansion.
-
enumerator SKEW_SYMMETRIC
\(A_{ij} = -A_{ji}\) (not yet handled).
-
enumerator HERMITIAN
Complex conjugate symmetry (not yet handled).
- struct MatrixMarketHeader
Parsed header: the four tokens of the %%MatrixMarket banner.
-
MatrixFormat format
-
ScalarField scalar
-
SymmetryField symmetry
-
SparseMatrix parse_matrix_market(const std::string &filepath)
Parse a .mtx file and return a CSR matrix.
This is the primary entry point. Internally it:
Reads into COO_SparseMatrix via parse_matrix_market_coo()
Converts to CSR via coo_to_csr()
- Parameters:
filepath – Absolute or relative path to the .mtx file
- Returns:
SparseMatrix in CSR format, ready for SpMV
- Throws:
std::runtime_error on malformed input
-
COO_SparseMatrix parse_matrix_market_coo(const std::string &filepath)
Parse a .mtx file and return a COO matrix.
Use this when you need the raw \((row, col, value)\) triples, for example to build other formats (ELL, HYB) or to inspect the original ordering.
- Parameters:
filepath – Absolute or relative path to the .mtx file
- Returns:
COO_SparseMatrix; entries are in file order (not sorted)
- Throws:
std::runtime_error on malformed input
-
SparseMatrix coo_to_csr(const COO_SparseMatrix &coo)
Convert a COO matrix to CSR format using counting sort.
Algorithm complexity: \(O(m + \text{nnz})\) time, \(O(m)\) auxiliary space.
- Parameters:
coo – Input matrix in COO format; entries need NOT be sorted
- Returns:
SparseMatrix in CSR format
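The counting-sort conversion can be sketched as below. The CooSketch and CsrSketch layouts and the name coo_to_csr_sketch are hypothetical stand-ins, since the real COO_SparseMatrix and SparseMatrix definitions are not shown here. The sketch is one common way to reach the stated \(O(m + \text{nnz})\) time and \(O(m)\) auxiliary space, not necessarily the project's code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal layouts for illustration.
struct CooSketch {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> row;  // 0-based row index per entry
    std::vector<std::size_t> col;  // 0-based column index per entry
    std::vector<double> val;       // value per entry
};

struct CsrSketch {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<std::size_t> col_idx;  // size nnz
    std::vector<double> val;           // size nnz
};

inline CsrSketch coo_to_csr_sketch(const CooSketch& coo) {
    const std::size_t nnz = coo.val.size();
    CsrSketch csr;
    csr.rows = coo.rows;
    csr.cols = coo.cols;
    csr.row_ptr.assign(coo.rows + 1, 0);
    csr.col_idx.resize(nnz);
    csr.val.resize(nnz);

    // 1. Count the entries in each row, shifted by one slot.
    for (std::size_t k = 0; k < nnz; ++k) {
        ++csr.row_ptr[coo.row[k] + 1];
    }
    // 2. A prefix sum turns the counts into row start offsets.
    for (std::size_t i = 0; i < coo.rows; ++i) {
        csr.row_ptr[i + 1] += csr.row_ptr[i];
    }
    // 3. Scatter entries; next[i] is the next free slot in row i (O(m) extra space).
    std::vector<std::size_t> next(csr.row_ptr.begin(), csr.row_ptr.end() - 1);
    for (std::size_t k = 0; k < nnz; ++k) {
        const std::size_t dst = next[coo.row[k]]++;
        csr.col_idx[dst] = coo.col[k];
        csr.val[dst] = coo.val[k];
    }
    return csr;
}
```

The scatter is stable, so entries within a row keep their file order; in this sketch column indices inside a row are therefore not sorted unless the input already was.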
References
Boisvert, R., Pozo, R., Remington, K. & Suzuki, J. (1997). The Matrix Market Exchange Formats. NIST Report NISTIR-6025. https://math.nist.gov/MatrixMarket/
Matrix Market website: https://math.nist.gov/MatrixMarket/