Meta-information about MARC: an XML framework for validation, explanation and help systems
This page supports the paper Meta-information about MARC: an XML framework for validation, explanation and help systems, by JOAQUIM DE CARVALHO (BookMARC/University of Coimbra, Coimbra, Portugal), MARIA INÊS CORDEIRO (Art Library, Calouste Gulbenkian Foundation, Lisbon, Portugal), ANTÓNIO LOPES (BookMARC, Coimbra, Portugal) and MIGUEL VIEIRA (BookMARC, Coimbra, Portugal).
The paper was published in the Library Hi Tech journal.
A diagram of the flow of transformations
An XML schema for MARC21DOC rules
The purpose of the XML schema for MARC is to provide a formalism for representing MARC rules and the human-oriented information currently held in MARC manuals. The schema allows not only the automatic generation of HTML, PDF and Windows Help versions of the MARC manual, but also the production of stylesheets that transform records for validation or decoding purposes.
See MARC21DOC schema.
Samples
Download the samples described in the following sections.
Generation of validation stylesheets
The MARC21Validation.xsl stylesheet is automatically generated by the MARC21ValidationGenerator.xslt stylesheet. The generator uses the information contained in the MARC21DOC.xml file to build the validation rules that make up the validation stylesheet.
The MARC21Validation.xsl has the same structure and functionality as the LoC validation stylesheet, with the following differences and additional features:
Generation of decoding stylesheets
The EnglishFormat.xsl stylesheet is automatically generated by the EnglishFormatGenerator.xslt stylesheet. The generator uses the information contained in the MARC21DOC.xml file to create the decoding stylesheet.
Our EnglishFormat.xsl has the same structure and functionality as the LoC one, and some additional features:
HTML formatting of rules
The MARC21DOCtoHTML.xsl stylesheet transforms the XML version of the MARC 21 manual into an HTML document for reference purposes.
How to use the samples
Requires Java VM 1.3 or later.
At the DOS prompt or Linux/Unix terminal:
The transform.sh and transform.bat files are simple scripts that call the command-line interface of the Saxon Java XSLT processor. They take three arguments: the XML document, the XSL stylesheet and the output file name. They apply the stylesheet to the XML document and save the result in the output file.
./bin/transform.sh src/MARC21DOC.xml src/MARC21DOCtoHTML.xsl MARC21DOC.html
./bin/transform.sh src/MARC21DOC.xml src/EnglishFormatGenerator.xslt englishFormater.xsl
./bin/transform.sh src/sandburg.xml englishFormater.xsl english.html
./bin/transform.sh src/MARC21DOC.xml src/MARC21ValidationGenerator.xslt validator.xsl
./bin/transform.sh src/sandburg.xml validator.xsl errors.xml
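The order of the commands above matters: each generator has to run before the stylesheet it produces is applied to sandburg.xml. The following Python sketch makes that dependency explicit. It assumes it is started from the directory that contains bin/ and src/ and that transform.sh is executable; the names STEPS, build_command and run_pipeline are illustrative and not part of the samples.

```python
import subprocess

# (XML input, stylesheet, output) triples, in the same order as the commands
# above. Steps 2 and 4 generate the stylesheets that steps 3 and 5 apply.
STEPS = [
    ("src/MARC21DOC.xml", "src/MARC21DOCtoHTML.xsl", "MARC21DOC.html"),
    ("src/MARC21DOC.xml", "src/EnglishFormatGenerator.xslt", "englishFormater.xsl"),
    ("src/sandburg.xml", "englishFormater.xsl", "english.html"),
    ("src/MARC21DOC.xml", "src/MARC21ValidationGenerator.xslt", "validator.xsl"),
    ("src/sandburg.xml", "validator.xsl", "errors.xml"),
]


def build_command(xml_doc, stylesheet, output, script="./bin/transform.sh"):
    """Return the argument list for one call to the transform script."""
    return [script, xml_doc, stylesheet, output]


def run_pipeline(steps=STEPS, dry_run=False):
    """Run every step in order and return the commands that were built."""
    commands = [build_command(*step) for step in steps]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError on a non-zero exit status,
            # so a failed generator run stops the pipeline early.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline(dry_run=True)  # print the commands without executing them
```

With dry_run=True the sketch only prints the five command lines, which is a quick way to check the paths before running the real transformations.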
Additional examples
These additional examples demonstrate the error-detection capabilities of validator.xsl.
To test these examples, insert the errors shown below into the sandburg.xml file, then run the command:
./bin/transform.sh src/sandburg.xml validator.xsl errors.xml
<controlfield tag="999"> 92005291 </controlfield>
<error type="MandatoryControlfield" tag="001"/> <warning type="UnknownControlfieldTag"> <controlfield xmlns="http://www.loc.gov/MARC21/slim" tag="999">92005291</controlfield> </warning>
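In this first example a control field carries the tag 999 and the record has no 001 field, so the validator reports a missing mandatory field and an unknown tag. The same two checks can be sketched in Python. The tag sets below are hard-coded assumptions for illustration (the generated stylesheet derives its rules from MARC21DOC.xml), and check_controlfields is a hypothetical name.

```python
import xml.etree.ElementTree as ET

MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Assumed rule data; in the samples these rules come from MARC21DOC.xml.
KNOWN_CONTROL_TAGS = {"001", "003", "005", "006", "007", "008"}
MANDATORY_CONTROL_TAGS = {"001"}


def check_controlfields(record):
    """Return (kind, type, tag) tuples for the control fields of one record."""
    tags = [cf.get("tag") for cf in record.iter(MARC_NS + "controlfield")]
    findings = []
    # A mandatory tag that never occurs is an error.
    for tag in sorted(MANDATORY_CONTROL_TAGS - set(tags)):
        findings.append(("error", "MandatoryControlfield", tag))
    # A tag outside the known set is only a warning.
    for tag in tags:
        if tag not in KNOWN_CONTROL_TAGS:
            findings.append(("warning", "UnknownControlfieldTag", tag))
    return findings


record = ET.fromstring(
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<controlfield tag="999">92005291</controlfield>'
    '</record>'
)
for finding in check_controlfields(record):
    print(finding)
```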
<datafield tag="040" ind1=" " ind2=" "> <subfield code="a">DLC</subfield> <subfield code="c">DLC</subfield> <subfield code="x">DLC</subfield> </datafield>
<error type="InvalidSubfieldCode" tagID="d0e43"> <code>x</code> </error>
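Here an undefined subfield x is added to field 040, and the validator flags its code. A minimal Python sketch of the same check follows. The table of allowed codes is an assumption limited to field 040, and invalid_subfield_codes is a hypothetical helper, not part of the samples.

```python
import xml.etree.ElementTree as ET

MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Assumed rule data for a single field; the samples read it from MARC21DOC.xml.
ALLOWED_SUBFIELDS = {"040": {"a", "b", "c", "d", "e", "6", "8"}}


def invalid_subfield_codes(datafield):
    """Return the subfield codes that are not defined for the field's tag."""
    allowed = ALLOWED_SUBFIELDS.get(datafield.get("tag"))
    if allowed is None:
        return []  # this sketch has no rule for the tag, so nothing is flagged
    return [
        sf.get("code")
        for sf in datafield.findall(MARC_NS + "subfield")
        if sf.get("code") not in allowed
    ]


field = ET.fromstring(
    '<datafield xmlns="http://www.loc.gov/MARC21/slim" tag="040" ind1=" " ind2=" ">'
    '<subfield code="a">DLC</subfield>'
    '<subfield code="c">DLC</subfield>'
    '<subfield code="x">DLC</subfield>'
    '</datafield>'
)
print(invalid_subfield_codes(field))
```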
<controlfield tag="008">920219X1993 caua j 000 0 eng </controlfield>
<error type="InvalidPSubfield">
  <field tag="008" start="6" end="6">
    <invalid>X</invalid>
    <content>920219X1993 caua j 000 0 eng</content>
    <vocabulary>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="b" name="No dates given; B.C. date involved">
        <DESCRIPTION>Each character position in fields 008/07-10 and 008/11-14 contains a blank (#).</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="c" name="Serial item currently published">
        <DESCRIPTION>008/07-10 contain the beginning date of publication; 008/11-14 contain 9999.</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="d" name="Serial item ceased publication">
        <DESCRIPTION>008/07-10 contain the beginning date of publication; 008/11-14 contain the ending date of publication.</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="e" name="Detailed date">
        <DESCRIPTION>008/07-10 contain the year and 008/11-14 contain the month and day, recorded in the pattern mmdd.</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="i" name="Inclusive dates of collection">
        <DESCRIPTION/>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="k" name="Range of years of bulk of collection">
        <DESCRIPTION/>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="m" name="Multiple dates">
        <DESCRIPTION>008/07-10 usually contain the beginning date and 008/11-14 the ending date.</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="n" name="Dates unknown">
        <DESCRIPTION>Indicates that the dates appropriate for 008/07-10 and 008/11-14 are unknown (e.g., when no dates are given in field 260).</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="p" name="Date of distribution/release/issue and production/recording session when different">
        <DESCRIPTION/>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="q" name="Questionable date">
        <DESCRIPTION>008/07-10 contain the earliest possible date; 008/11-14 contain the latest possible date.</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="r" name="Reprint/reissue date and original date">
        <DESCRIPTION>008/07-10 contain the date of reproduction or reissue; 008/11-14 contain the date of the original, if known; 008/11-14 contain code u ("uuuu"), if unknown.</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="s" name="Single known date/probable date">
        <DESCRIPTION>008/07-10 contain the date; 008/11-14 contain blanks (####).</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="t" name="Publication date and copyright date">
        <DESCRIPTION/>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="u" name="Serial status unknown">
        <DESCRIPTION>008/07-10 contain the beginning date of publication; 008/11-14 contain code u ("uuuu").</DESCRIPTION>
      </ITEM>
      <ITEM xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" code="|" name="No attempt to code">
        <DESCRIPTION/>
      </ITEM>
    </vocabulary>
  </field>
</error>
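In this last example position 06 of field 008 (type of date/publication status) holds X, which is not among the codes listed in the vocabulary of the error report. A positional check of this kind can be sketched in Python as follows. The set of codes is copied from that report; check_positions is a hypothetical helper, and positions are counted from zero, as in the start and end attributes.

```python
# Codes allowed in 008/06, taken from the <vocabulary> element of the report.
CODES_008_06 = set("bcdeikmnpqrstu|")


def check_positions(content, start, end, vocabulary):
    """Return the invalid value found at content[start..end], or None if valid."""
    value = content[start:end + 1]  # end is inclusive, as in the error report
    return None if value in vocabulary else value


content = "920219X1993 caua j 000 0 eng"
invalid = check_positions(content, 6, 6, CODES_008_06)
if invalid is not None:
    print("invalid value at 008/06:", invalid)
```

The sketch returns only the offending value; the generated stylesheet also reports the full vocabulary, so that the error message can explain which codes would have been accepted.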