yaz-marcdump - Man Page

MARC record dump utility

Synopsis

yaz-marcdump [-i format] [-o format] [-f from] [-t to] [-l spec] [-c cfile] [-s prefix] [-C size] [-O offset] [-L limit] [-n] [-p] [-r] [-v] [-V] [file...]

Description

yaz-marcdump reads MARC records from one or more files. It parses each record and can write it in line format, ISO2709, MARCXML[1], MARC-in-JSON[2], or MarcXchange[3], as well as hex output.

This utility parses records in ISO2709 (raw MARC), line format, and MARC-in-JSON format, as well as XML that is structured as MARCXML or MarcXchange.

MARC-in-JSON encoding/decoding is supported in YAZ 5.0.5 and later.

Note

As of YAZ 2.1.18, OAI-MARC is no longer supported. OAI-MARC is deprecated. Use MARCXML instead.

By default, each record is written to standard output in a line format, with a new line for each field and $x for each subfield x. The output format may be changed with option -o.

yaz-marcdump can also be requested to perform character set conversion of each record.

Options

-i format

Specifies input format. Must be one of marcxml, marc (ISO2709), marcxchange (ISO25577), line (line mode MARC), turbomarc (Turbo MARC), or json (MARC-in-JSON).

-o format

Specifies output format. Must be one of marcxml, marc (ISO2709), marcxchange (ISO25577), line (line mode MARC), turbomarc (Turbo MARC), or json (MARC-in-JSON).

-f from

Specify the character set of the input MARC record. Should be used in conjunction with option -t. Refer to the yaz-iconv man page for supported character sets.

-t to

Specify the character set of the output. Should be used in conjunction with option -f. Refer to the yaz-iconv man page for supported character sets.

-l leaderspec

Specify a simple modification string for the MARC leader. The leaderspec is a comma-separated list of pos=value pairs, where pos is an integer offset (0 - 23) into the leader and value is either a quoted string or an integer (the character value in decimal). For example, to set the leader at offset 9 to a, use 9='a'.
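As a minimal sketch of the integer form, the decimal character value can be looked up with the shell's printf and placed in a leaderspec. The file names marc21.raw and marc21.leader.raw are placeholders, and the conversion only runs if yaz-marcdump and the input file are present.

```shell
# Look up the decimal character value of 'a' for use in a leaderspec.
ch='a'
code=$(printf '%d' "'$ch")      # a leading quote makes printf return the character code
leaderspec="9=$code"
echo "$leaderspec"              # prints 9=97

# Placeholder file names; skipped when the tool or the input is missing.
if command -v yaz-marcdump >/dev/null 2>&1 && [ -f marc21.raw ]; then
    yaz-marcdump -i marc -o marc -l "$leaderspec" marc21.raw >marc21.leader.raw
fi
```

The decimal form avoids shell quoting. The quoted form would need its single quotes protected from the shell, for example -l "9='a'".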

-s prefix

Splits a record batch into separate files whose names start with the given prefix, each holding at most "chunk" ISO2709 records. By default, chunk is 1 (one record per file). See option -C.

-C chunksize

Specifies the chunk size; to be used in conjunction with option -s.
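Since each file holds at most "chunk" records, the number of files to expect is the record count divided by the chunk size, rounded up. The sketch below assumes a hypothetical batch of 2500 records in marc21.raw and an arbitrary prefix "part". Standard output is discarded as a precaution, on the assumption that the regular record dump may still be written.

```shell
# Hypothetical values: record count, chunk size, prefix, and file name are placeholders.
total=2500
chunk=1000

# Ceiling division: at most $chunk records go into each file.
files=$(( (total + chunk - 1) / chunk ))
echo "expecting $files files"   # prints: expecting 3 files

# Skipped when the tool or the input is missing.
if command -v yaz-marcdump >/dev/null 2>&1 && [ -f marc21.raw ]; then
    yaz-marcdump -i marc -s part -C "$chunk" marc21.raw >/dev/null
fi
```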

-O offset

Integer offset of the first record to be written: 0 = first record, 1 = second, and so on. Combined with option -L, this allows a specific range of records to be processed.

-L limit

Integer limit on the number of records to be written. Combined with option -O, this allows a specific range of records to be processed.
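Because the offset counts from 0, a range given in 1-based record numbers has to be shifted by one. The following sketch derives -O and -L for records 101 through 150. The range and the file names are illustrative, and the command only runs if yaz-marcdump and the input file are present.

```shell
# Hypothetical 1-based range of records to extract.
first=101
last=150

offset=$((first - 1))           # -O counts from 0, so record 101 is offset 100
limit=$((last - first + 1))     # number of records in the inclusive range
echo "-O $offset -L $limit"     # prints -O 100 -L 50

# Placeholder file names; skipped when the tool or the input is missing.
if command -v yaz-marcdump >/dev/null 2>&1 && [ -f marc21.raw ]; then
    yaz-marcdump -i marc -o marc -O "$offset" -L "$limit" marc21.raw >range.raw
fi
```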

-p

Makes yaz-marcdump print record number and input file offset of each record read.

-n

Omits MARC output, so that the MARC input is only checked.

-r

Writes a summary of the number of records read by yaz-marcdump to stderr.

-v

Writes more information about the parsing process. Useful if you have ill-formatted ISO2709 records as input.

-V

Prints YAZ version.

Examples

The following command converts MARC21/USMARC in MARC-8 encoding to MARC21/USMARC in UTF-8 encoding. Leader offset 9 is set to 'a', given here as its decimal character value, 97. Both input and output records are ISO2709 encoded.

    yaz-marcdump -f MARC-8 -t UTF-8 -o marc -l 9=97 marc21.raw >marc21.utf8.raw

The same records may instead be converted to MARCXML in UTF-8:

    yaz-marcdump -f MARC-8 -t UTF-8 -o marcxml marc21.raw >marcxml.xml

Turbo MARC is a compact XML notation with the same semantics as MARCXML, but which allows for faster processing via XSLT. To generate Turbo MARC records encoded in UTF-8 from MARC21 (ISO2709), one could use:

    yaz-marcdump -f MARC8 -t UTF8 -o turbomarc -i marc marc21.raw >out.xml
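To check a file without converting it, the options -n, -p, -r and -v can be combined. The wrapper below is a sketch only: check_marc is a hypothetical helper name, and its return codes 2 and 127 are conventions chosen for this example, not part of yaz-marcdump.

```shell
# Hypothetical helper: parse an ISO2709 file without writing MARC output.
check_marc() {
    # Return 2 if the input file does not exist (convention of this sketch).
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 2; }
    # Return 127 if yaz-marcdump is not installed (convention of this sketch).
    command -v yaz-marcdump >/dev/null 2>&1 || { echo "yaz-marcdump not found" >&2; return 127; }
    # -n: omit MARC output, -p: print record number and offset,
    # -r: record count summary on stderr, -v: more parsing information.
    yaz-marcdump -i marc -n -p -r -v "$1"
}
```

It could then be called as check_marc marc21.raw, where marc21.raw is a placeholder file name.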

Files

prefix/bin/yaz-marcdump

prefix/include/yaz/marcdisp.h

See Also

yaz(7)

yaz-iconv(1)

Authors

Index Data

Notes

  1. MARCXML
    https://www.loc.gov/standards/marcxml/
  2. MARC-in-JSON
    https://rossfsinger.com/blog/2010/09/a-proposal-to-serialize-marc-in-json/
  3. MarcXchange
    https://www.loc.gov/standards/iso25577/

Info

09/19/2024 YAZ 5.34.2 Commands