sblg - Man Page
static blog utility
Synopsis
Description
The sblg utility merges XML articles and templates in a number of ways.
Standalone mode (-c) merges a single article's content and metadata into a template. For example, "sblg -o- -c foo.xml" merges foo.xml into the template article-template.xml.
Blog mode (the default) merges multiple articles' content and metadata into a template. For example, "sblg -o- bar.xml baz.xml" merges bar.xml and baz.xml into the template blog-template.xml.
Combined mode (-C) links multiple articles' content and metadata in standalone style. For example, "sblg -o- -C bar.xml baz.xml" will show content for only bar.xml, but metadata for both inputs. The similar -L flag runs the process for each input file without reparsing.
Atom mode (-a) merges multiple articles into an Atom feed template.
JSON mode (-j) merges all articles into a JSON object.
By default, sblg operates in blog mode with template blog-template.xml. Its arguments are as follows:
- -a
Creates an Atom feed from its input files.
- -c
Create standalone articles instead of merging articles together.
- -l
Instead of emitting any output files, simply process the input and report a table of tags. This table consists of the input file name, a tab, then the tag (also known as article-major order). Escaped white-space in a tag is printed unescaped. You can also use -r to have tag-major order and -j for JSON output. Specify -l twice to show matches (tags for article-major, articles for tag-major) all on one tab-separated line, instead of one per line.
- -r
Print the -l tag listing in “tag-major” order wherein the first column is the tag and the second column is the article. If the -j flag is specified, this is JSON formatted.
- -j
JSON instead of XML output mode. This behaves as in blog mode, but outputs JSON instead of XML. If -l is specified, the tag listing will be displayed in JSON instead. See JSON Schema for details.
- -C file
Like -c, but creating a blog from the article in file with the remaining files being articles used for navigation.
- -L
Like -C, but acting on all input files, translating the input to output files such as in -c without -o. If there are multiple articles in an output file, the output is recreated for each (so only the last will remain). So running with “article0.xml article1.xml” will produce “article0.html article1.html” as if -C were separately specified for both. This avoids needing to parse all inputs for each input.
- -o file
Output file. If unspecified, standalone articles have .html appended to the input file name, unless the input file extension is .xml, in which case the .xml is replaced by .html. If multiple input files are specified, -o is ignored. If unspecified for the blog, blog.html is used by default. If unspecified for the Atom feed or JSON, atom.xml or blog.json, respectively, is used by default. Use -o - for standard output.
- -s sort
Change how articles are sorted before being written into navigation or article entries. The default is date, which sorts newest to oldest by date. You can also specify filename, which sorts in increasing A–Z case-sensitive order of the source filename; cmdline for the command-line order; ititle for the case-insensitive document title; or title for the case-sensitive document title. Each sort may be prefixed with "r" (e.g., rcmdline) to reverse the sort.
- -t template
Template for all modes. If unspecified, defaults to article-template.xml for -c, atom-template.xml for -a, and blog-template.xml otherwise.
- -V
Emits the version as sblg-xx.yy.zz and exits.
- file ...
Input files. In standalone mode with -c, input XML files are merged with a template into an output file. Otherwise, multiple input files are merged into a single blog.
All input must be well-formed XML. Element names and attributes are case-sensitive.
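The default naming rule described for -o in standalone mode can be sketched in a few lines of Python. This is only an illustration of the rule as worded above, not sblg's own code; the function name default_output_name is hypothetical, and the case-sensitive suffix match is an assumption.

```python
def default_output_name(input_name):
    """Sketch of the documented default: replace a trailing .xml with .html,
    otherwise append .html. Case-sensitive matching is an assumption."""
    if input_name.endswith(".xml"):
        return input_name[: -len(".xml")] + ".html"
    return input_name + ".html"


if __name__ == "__main__":
    for name in ("article1.xml", "notes.txt"):
        print(name, "->", default_output_name(name))
```

Under this reading, article1.xml maps to article1.html, while an input without the .xml extension simply gains .html.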
Article Input
Article input files consist of the following within the document:
<article data-sblg-article="1">
  <header>
    <h1>Article Name</h1>
    <address>Author Name</address>
    <time datetime="2013-06-29">29 June, 2013</time>
  </header>
  <aside>
    This is used as the feed <b>abstract</b>.
  </aside>
  <p>
    Some text in the <b>content</b>.
    <img src="foo.jpg" alt="An image for the feed" />
  </p>
</article>
All content outside of the element with the data-sblg-article="1" attribute, usually an <article>, is discarded. Then the article is scanned for the following:
- the article title (both as text data only and inclusive of markup) is extracted from the first <hn> (header 1–4);
- the article publication date is extracted from the datetime attribute of the first <time> (which must be a date, YYYY-MM-DD, or time, YYYY-MM-DDTHH:MM:SSZ) interpreted in UTC;
- the author (both as text data only and inclusive of markup) from the first <address>;
- the first <aside> is used for the feed abstract; and
- the first <img> is associated as the article's image.
These are all set once: subsequent occurrences do not override the earlier value. See data-sblg-aside, data-sblg-author, data-sblg-datetime, data-sblg-img, and data-sblg-title for explicitly setting or overriding these values.
If unspecified, the default article title text (and mark-up) is "Untitled article", the default author text (and mark-up) is "Unknown author", the publication time is set to the document's file-system creation time, the abstract is left empty, and the image is empty.
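The two accepted forms of the datetime attribute (a date, or a date-time ending in Z, both read as UTC) can be checked ahead of time with a short Python sketch. The helper name parse_sblg_datetime is hypothetical, and the strictness of the check is an assumption; sblg's own parser may accept or reject inputs differently.

```python
from datetime import datetime, timezone


def parse_sblg_datetime(value):
    """Accept YYYY-MM-DDTHH:MM:SSZ or YYYY-MM-DD and return a UTC datetime.
    Raises ValueError for anything else (an assumption about strictness)."""
    for fmt in ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d"):
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        # Both forms are interpreted in UTC, as the text above states.
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("not a date or UTC date-time: %r" % (value,))


if __name__ == "__main__":
    print(parse_sblg_datetime("2013-06-29").isoformat())
    print(parse_sblg_datetime("2013-06-29T12:30:00Z").isoformat())
```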
There are a number of special attributes that are recognised in the input file.
- data-sblg-aside=string
Sets the aside material as otherwise would be set from the first <aside> element. It overrides the previously set aside. The alternative data-sblg-const-aside only sets the aside if it has not yet been set.
- data-sblg-author=url
Sets the author as otherwise would be set from the first <address> element. It overrides the previously set author. The alternative data-sblg-const-author only sets the author if it has not yet been set.
- data-sblg-datetime=datetime
Overrides the first <time> element. This must be YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ. It overrides the previously set date. The alternative data-sblg-const-datetime only sets the date if it has not yet been set.
- data-sblg-img=url
Set the image associated with the article. It overrides any previously set image. The alternative data-sblg-const-img only sets the image if it has not yet been set.
- data-sblg-lang=string
May only be set on the <article> and specifies one or more space-separated languages for the document. You can escape spaces with a backslash (“\”) if you have spaces in the tag name, e.g., “foo\ bar”. These languages are removed in the “stripping” operations for the Tag Symbols.
- data-sblg-set-xxx=string
This allows arbitrary values to be attached to the article. For example, specifying data-sblg-set-foo="bar" sets the foo keyword to bar. If specified multiple times for the same key, only the last value is used. These may be retrieved with ${sblg-get} or queried with ${sblg-has} of the Tag Symbols.
- data-sblg-sort=first|last
May only be set on the <article> element and overrides the article's position relative to other articles. This can be either first or last. If multiple articles have the same sort override, they are ordered in the natural way.
- data-sblg-source=file
Set the source filename associated with the article. It overrides the implicit value set from the actual file.
- data-sblg-tags=string
This tag may be specified on any element within the article and consists of space-separated tag names. You can escape spaces with a backslash (“\”) if you have spaces in the tag name, e.g., “foo\ bar”. These tags are extracted for navigation tag operation. It may not contain any tabs.
- data-sblg-title=string
Sets the title as otherwise would be set in a <hN> element. It overrides the previously set title. The alternative data-sblg-const-title only sets the title if it has not yet been set.
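The space-separated, backslash-escaped form used by data-sblg-tags can be illustrated with a small Python splitter. This is a sketch under stated assumptions: the function name split_tags is hypothetical, a backslash is assumed to escape whatever single character follows it, and a trailing lone backslash is dropped. sblg itself may treat edge cases differently.

```python
def split_tags(value):
    """Split a data-sblg-tags style string on unescaped white-space.
    A backslash makes the next character literal (assumed behaviour)."""
    tags = []
    current = []
    escaped = False
    for ch in value:
        if escaped:
            current.append(ch)      # escaped character is kept verbatim
            escaped = False
        elif ch == "\\":
            escaped = True          # next character is literal
        elif ch.isspace():
            if current:             # unescaped white-space ends a tag
                tags.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tags.append("".join(current))
    return tags


if __name__ == "__main__":
    print(split_tags(r"foo\ bar baz"))
```

With the example from the text, “foo\ bar” stays a single tag containing a space, which matches how the -l listing prints escaped white-space unescaped.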
Standalone Template
The standalone template file replaces the first element with the data-sblg-article="1" attribute, usually an <article>, with the article contents.
<body>
  <header>This consists of a single blog entry.</header>
  <article>This is kept.</article>
  <article data-sblg-article="1">This is removed.</article>
  <footer>Something.</footer>
</body>
Article templates may contain the following attributes:
- data-sblg-article=boolean
If set to true, the contents are replaced with the input article. This only happens once: subsequent elements are ignored.
- data-sblg-ign-once=boolean
If an element with the data-sblg-article="1" attribute has this set to true, the element is not processed as an article and the data-sblg-ign-once attribute is removed.
See Tag Symbols for a list of symbols that will be replaced if found in attribute value or textual contexts. These may occur anywhere in the template document.
Blog Template
The blog template replaces elements with the data-sblg-article="1" attribute, usually <article>, with ordered (by default, newest to oldest) article contents. If there aren't enough articles, the element is removed.
Elements with a data-sblg-nav="1" attribute, usually <nav>, are replaced by the same list of articles within an unordered list.
If an element has both attributes, only the first is recognised.
Usually, the article elements are used for displaying full articles, while the navigation elements are used for displaying navigation to articles, such as just their titles, dates, and links.
<body>
  <header>This consists of two blog entries.</header>
  <nav data-sblg-nav="1" />
  <article data-sblg-article="1" />
  <article data-sblg-article="1" />
  <footer>Something.</footer>
</body>
Article templates may contain several attributes.
- data-sblg-article=boolean
If set to true, the contents (including the element itself) are replaced with the input article.
- data-sblg-articletag=string
If an element with the data-sblg-article="1" attribute contains this, limit displayed articles to those matching the space-separated tags or ${sblg-get|xxx} when in -L or -C mode. This scans for tags from the current article in the list of articles.
- data-sblg-ign-once=boolean
If an element with the data-sblg-article="1" attribute has this set to true, the element is not processed as an article and the data-sblg-ign-once attribute is removed.
- data-sblg-permlink=boolean
If an element with the data-sblg-article="1" attribute has this set to true, a permanent link to the article's input filename is emitted within a <div data-sblg-permlink="1"> element after the element with the data-sblg-article="1" attribute.
The navigation element may contain several attributes.
- data-sblg-navcontent=boolean
Deprecated alias for respective content and element styles list-keep and keep if true, list-summarise and keep if false.
- data-sblg-navstyle-content=style
Style for formatting articles into the content of the navigation element. May be keep, to output the content per-article and perform Tag Symbols substitution; summarise or summarize, to discard content and output the article time followed by a link to the article; list-keep, same as keep except surrounding each article with <li> and all articles with <ul>; or list-summarise, or list-summarize, same as summarise except surrounding each article with <li> and all articles with <ul>. If not given or unknown, defaults to list-summarise.
- data-sblg-navstyle-element=style
Style for the navigation element. May be keep, to output the element as-is once around all articles; keep-strip, to output the element without attributes once around all articles; repeat-strip, to output the element without attributes around each article (if the content styles are list-keep or list-summarise, the element is output within the <li>); or discard to suppress output. If not given or unknown, defaults to keep.
- data-sblg-navsort=sort
Overrides the global sort order given with -s. Uses the same names. If the sort name is not recognised, the attribute is silently ignored and the global sort order is used.
- data-sblg-navstart=number
The position of the first article to show as a navigation entry; earlier articles are skipped. If you have tags, only articles that would meet those tags are counted. Starts at one (a value of zero is the same as a value of one).
- data-sblg-navsz=number
If the <nav> element contains this attribute with a positive integer, it is used to limit the number of navigation entries.
- data-sblg-navtag=string
Only articles with matching tags are shown. You can specify multiple space-separated tags, for instance, data-sblg-navtag="foo bar" will search for foo or bar. Tags to be matched against are extracted from the space-separated data-sblg-tags attribute of each article's topmost element. Escape spaces with a backslash (“\”) if you have spaces in the tag name, e.g., “foo\ bar”. Use ${sblg-get|xxx} or (for multi-word values) ${sblg-get-escaped|xxx} when in -C or -L mode to use the current article's set data as part of a string, e.g., location-${sblg-get|location}.
- data-sblg-navxml=boolean
Deprecated alias for respective content and element styles keep and discard if true, list-summarise and keep if false.
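How data-sblg-navtag, data-sblg-navstart, and data-sblg-navsz might interact can be sketched in Python. This is one reading of the attribute descriptions above, not sblg's implementation: it assumes articles are already sorted, that tag filtering happens first, that navstart is a one-based position among the matching articles, and that a non-positive navsz means no limit. The function name select_nav_entries and the (name, tags) tuples are hypothetical.

```python
def select_nav_entries(articles, navtag=None, navstart=1, navsz=0):
    """articles: list of (name, set_of_tags) tuples, already in sort order.
    Returns the names that would appear as navigation entries (assumed rules)."""
    wanted = set(navtag or [])
    # An article matches if it carries any of the requested tags ("foo or bar").
    matching = [a for a in articles if not wanted or wanted & a[1]]
    # navstart counts from one; zero is treated the same as one.
    start = max(navstart, 1) - 1
    shown = matching[start:]
    # A positive navsz limits the number of entries.
    if navsz > 0:
        shown = shown[:navsz]
    return [name for name, _tags in shown]


if __name__ == "__main__":
    articles = [
        ("a.xml", {"foo"}),
        ("b.xml", {"bar"}),
        ("c.xml", {"baz"}),
        ("d.xml", {"foo", "bar"}),
    ]
    print(select_nav_entries(articles, navtag=["foo", "bar"], navstart=2, navsz=2))
```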
Combined Template
This is identical to the Blog Template except that a single article is noted with -C, and this is the only article displayed in the article stub. Furthermore, like in standalone mode, Tag Symbols may be used anywhere in the document template and refer to the current article unless within a navigation element, in which case the symbol resolves to the currently-printed article. In the given example,
<body>
  <header>This consists of two blog entries.</header>
  <nav data-sblg-nav="1" />
  <article data-sblg-article="1" />
  <article data-sblg-article="1" />
  <footer>Something.</footer>
</body>
the navigation would be populated by all articles, but only the first article stub would be filled in with the specified article. The second would be removed.
This follows the usual rules of data-sblg-articletag, so if the article you specify with -C doesn't have the correct tag, it won't inline the article.
Atom Template
The Atom template file must be a well-formed XML file where each <entry> element with a Boolean data-sblg-entry attribute is replaced by ordered (newest to oldest) article information. If there aren't enough articles, the element is removed. The template may contain pre-existing entries.
The following is a minimal template; anything less will not conform to the Atom specification:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <link href="http://example.org" />
  <title>A Title Here</title>
  <updated />
  <id />
  <entry data-sblg-entry="1" data-sblg-forall="1" />
</feed>
The recognised elements are as follows. Un-recognised elements are printed verbatim.
- <entry data-sblg-entry="1">
Filled-in article entry. If the attribute is not specified, the entry is retained verbatim. Otherwise it is filled in with an article's information.
- <id>
If this is empty, it is filled in with the URL in <link [rel="alternate"]>, which must exist. Otherwise, the value is copied and used for subsequent feed entries.
- <link [rel="alternate"]>
Unless an <id> is provided, the href attribute must be a full URL, e.g., <link href="https://kristaps.bsd.lv/">. Otherwise, it may be a relative path. This element must be first.
- <updated>
This is filled in with the most recent article. Its contents are discarded.
There are a number of special attributes that may be given to the above elements.
- data-sblg-altlink=boolean
If an <entry data-sblg-entry="1"> element contains this set to true, the alternate <link> is printed.
- data-sblg-altlink-fmt=string
If both data-sblg-entry and data-sblg-altlink are true for an <entry>, the value is used as the link address. Accepts Tag Symbols, most commonly being ${sblg-base}.
- data-sblg-atomcontent=boolean
If <entry data-sblg-entry="1"> contains this set to true, the contents are printed directly and the Tag Symbols are processed. This overrides data-sblg-altlink and data-sblg-content.
- data-sblg-content=boolean
If <entry data-sblg-entry="1"> contains this set to true, the article's contents (everything within the element having the data-sblg-article="1" attribute) are inlined within the <content> element with type html. Tag Symbols are processed.
- data-sblg-entry=boolean
Each <entry> element with this is filled in with article content.
- data-sblg-forall=boolean
If an <entry data-sblg-entry="1"> element contains this set to true, it is used for all remaining articles. Any <entry data-sblg-entry="1"> elements following this are discarded.
If not using data-sblg-atomcontent, entries are filled in with a <title>, <id>, <author>, HTML <content> (specified in the article as an <aside>), and alternate <link>. The <id> is constructed by appending the source filename, hash print, and date following the feed's <id> or <link> element.
When filling in HTML content, sblg will strip away HTML attributes that do not fit into a white-list. This white-list is defined by the W3C's Feed Validator.
JSON Schema
sblg can produce JSON with the -j flag. The structure of the JSON file is consumable either with a JSON schema (noted in the Files section) or using the typings that may be downloaded with npm(1):
npm install sblg
If -l is specified, the output schema is simply an array as follows. Let source1.xml and source2.xml be input files with a variety of tags.
[
  {"src": "source1.xml", "tags": ["tag1","tag2"]},
  {"src": "source2.xml", "tags": ["tag1"]}
]
If, however, -r is also specified, the reverse format is used:
[
  {"tag": "tag1", "srcs": ["source1.xml","source2.xml"]},
  {"tag": "tag2", "srcs": ["source1.xml"]}
]
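The relationship between the two listings can be shown by converting the article-major array into the tag-major one. The Python sketch below works only on the example data above; the function name to_tag_major is hypothetical, and the ordering of tags and sources in real sblg output is not specified here, so the first-seen order used below is an assumption.

```python
import json


def to_tag_major(article_major):
    """Invert [{"src", "tags"}] into [{"tag", "srcs"}], keeping first-seen order."""
    by_tag = {}
    for entry in article_major:
        for tag in entry["tags"]:
            by_tag.setdefault(tag, []).append(entry["src"])
    return [{"tag": tag, "srcs": srcs} for tag, srcs in by_tag.items()]


if __name__ == "__main__":
    article_major = json.loads(
        '[{"src": "source1.xml", "tags": ["tag1", "tag2"]},'
        ' {"src": "source2.xml", "tags": ["tag1"]}]'
    )
    print(json.dumps(to_tag_major(article_major), indent=2))
```

For the two example inputs this yields tag1 with both sources and tag2 with source1.xml only, matching the -r example.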
Tag Symbols
Within the template for -c or -C, or in any article contents written (either into an article or navigation entry), the following special strings are replaced. These symbols concern the current article being processed: in a navigation entry, or as article contents. In the event of the positional “next” and “prev” symbols, these refer to the article's position within the input articles. Obviously, -c has only a single article.
In general, these must be considered strict values, e.g., ${sblg-aside} and not ${ sblg-aside }. Some symbols accept optional arguments, which have the format ${sblg-tags[|argument]}. Here, |argument may be omitted.
Be careful in using tag symbols: the contents are copied directly, so if specifying a value within an HTML attribute that has a double-quote, the attribute will be prematurely closed.
To prevent regular text with ${...} from being processed, escape one or more characters, for example by writing the dollar sign as a character reference: &#36;{...}.
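The strict form of a symbol, with its optional |argument, can be approximated by a regular expression. The pattern below is an assumption made for illustration (names limited to lower-case letters and hyphens after the sblg- prefix, argument running up to the closing brace); it is not taken from sblg's source, and the helper name find_symbols is hypothetical.

```python
import re

# Assumed shape: ${sblg-name} or ${sblg-name|argument}, with no inner spaces.
SYMBOL = re.compile(r"\$\{(sblg-[a-z-]+)(?:\|([^}]*))?\}")


def find_symbols(text):
    """Return (name, argument) pairs; argument is None when omitted."""
    return [(m.group(1), m.group(2)) for m in SYMBOL.finditer(text)]


if __name__ == "__main__":
    sample = "${sblg-aside} ${ sblg-aside } location-${sblg-get|location}"
    print(find_symbols(sample))
```

The padded form ${ sblg-aside } is not matched, which mirrors the strictness described above.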
- ${sblg-abscount}
The total number of articles. This is only valid in <nav data-sblg-nav="1">, otherwise it always prints 1. See also ${sblg-count} and ${sblg-setcount}.
- ${sblg-abspos}
The position (from 1) of the article in the list of all articles. This is only valid in a <nav data-sblg-nav="1"> context, otherwise it always prints 1. See also ${sblg-pos}.
- ${sblg-aside}
The article's first aside with markup.
- ${sblg-asidetext}
The article's first aside, textual parts only.
- ${sblg-author}
The article's author with markup.
- ${sblg-authortext}
The article's author, textual parts only.
- ${sblg-realbase}
Like ${sblg-base}, and having the same sub-types, except deriving from ${sblg-real}.
- ${sblg-base}
Same as ${sblg-source} but with the last suffix part chopped off. For example, foo/bar.xml becomes foo/bar. The ${sblg-stripbase} variant will strip off the directory part and any suffix. For example, foo/bar.xml becomes bar. The ${sblg-striplangbase} variant will also strip the language. For example, if the “en” language was specified on the article, foo/bar.en.xml becomes bar.
- ${sblg-count}
The total number of articles that will be shown, i.e., taking into consideration the navigation length and offset. In standalone mode, this is always 1. In <nav data-sblg-nav="1">, it's the total number within the navigation. See also ${sblg-abscount} and ${sblg-setcount}.
- ${sblg-date}
The publication date as YYYY-MM-DD (UTC).
- ${sblg-datetime}
The publication date and time as YYYY-MM-DDTHH:MM:SSZ (UTC).
- ${sblg-datetime-fmt[|fmt]}
A human-readable representation of the date and, if specified, time in local time. This accepts an optional format string passed to strftime(3). If the format string is empty or “auto”, a human-readable date (with %x) or date-time (%c) is printed.
- ${sblg-img}
The article's associated image. This will be an empty string if no image was specified.
- ${sblg-first-base}
The first (newest) base name in the list of articles. There are also ${sblg-first-stripbase} and ${sblg-first-striplangbase} variants. See ${sblg-base}.
- ${sblg-last-base}
The last (oldest) base name in the list of articles. There are also ${sblg-last-stripbase} and ${sblg-last-striplangbase} variants. See ${sblg-base}.
- ${sblg-next-base}
The next base name when chronologically ordered from newest to oldest, wrapping back to the beginning for the last. There are also ${sblg-next-stripbase} and ${sblg-next-striplangbase} variants. See ${sblg-base}.
- ${sblg-next-has}
Prints sblg-next-has if there exists a next article in the ordered set, otherwise prints nothing.
- ${sblg-pos}
The position (from 1) of the articles actually shown. This always starts at 1 and increments by one, regardless of the tag filtering or starting position. In standalone mode, it always prints 1. In blog mode (outside of a <nav> context), it shows the position in the input files. Within a <nav> context, it shows the position within the navigation.
- ${sblg-pos-frac}
The fractional (0–1) value of ${sblg-pos}/${sblg-count}.
- ${sblg-pos-pct}
The percentage (0–100, not including the percent sign) form of ${sblg-pos-frac}.
- ${sblg-prev-base}
The previous base name when chronologically ordered from newest to oldest, wrapping back to the beginning for the last. There are also ${sblg-prev-stripbase} and ${sblg-prev-striplangbase} variants. See ${sblg-base}.
- ${sblg-prev-has}
Prints sblg-prev-has if there exists a previous article in the ordered set, otherwise prints nothing.
- ${sblg-get[|key]}
Print the value of key assigned in data-sblg-set-key. If unspecified or the key was not found, this is ignored and omitted from output. The lookup is case sensitive.
- ${sblg-get-escaped[|key]}
Like ${sblg-get[|key]}, but escapes the value of the key so that it may be used for data-sblg-navtag or data-sblg-articletag attribute values for multi-word tags.
- ${sblg-has[|key]}
Like ${sblg-get[|key]}, but queries whether the key exists. If it is specified and it does exist, then the string sblg-has-key is printed. This is useful in class attributes to test whether a given key has been specified.
- ${sblg-setcount}
Like ${sblg-count}, but only the articles matching the requested tags. See also ${sblg-count} and ${sblg-abscount}.
- ${sblg-real}
The article's actual source file. See ${sblg-source} for an overridable source indicator.
- ${sblg-source}
The source file associated with the article.
- ${sblg-tags[|tagspec]}
List of unique tags in the article, optionally filtered by those having the prefix tagspec. If the prefix is not specified, all tags are listed. Each tag (e.g., TAG) is listed as <span class="sblg-tag">TAG</span>. If no tags were found, a single <span class="sblg-tags-notfound"></span> is emitted.
- ${sblg-title}
The article title with markup.
- ${sblg-titletext}
The article title, textual parts only.
- ${sblg-url}
The output filename, which is empty for standard output.
- ${sblg-version}
The current sblg version as xx.yy.zz.
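The three base-name variants described for ${sblg-base} can be reproduced for the documented examples with a few lines of Python. This is a sketch, not sblg's code: the function names are hypothetical, and the handling of file names with several dots (beyond the foo/bar.en.xml example) is an assumption.

```python
import posixpath


def base(source):
    """foo/bar.xml -> foo/bar (last suffix chopped off)."""
    root, _suffix = posixpath.splitext(source)
    return root


def stripbase(source):
    """foo/bar.xml -> bar (directory part and last suffix removed)."""
    return posixpath.basename(base(source))


def striplangbase(source, langs):
    """foo/bar.en.xml with langs=["en"] -> bar (assumed: one language suffix)."""
    name = stripbase(source)
    for lang in langs:
        suffix = "." + lang
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


if __name__ == "__main__":
    print(base("foo/bar.xml"))
    print(stripbase("foo/bar.xml"))
    print(striplangbase("foo/bar.en.xml", ["en"]))
```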
Files
The following files are installed in /usr/share/doc/sblg.
- schema.json
JSON schema for output generated with -j.
Exit Status
The sblg utility exits 0 on success, and >0 if an error occurs.
Examples
First, create standalone HTML5 files (filled-in <article data-sblg-article="1">) from article fragments. An article-template.xml file is assumed to exist. This will create article1.html and article2.html from the re-write rule for the XML suffix.
% sblg -c article1.xml article2.xml
Next, merge formatted files into a front page. A blog-template.xml file is assumed to exist.
% sblg -o index.html article1.html article2.html
This will create index.html with filled-in <article data-sblg-article="1"> and <nav data-sblg-nav="1"> elements.
Combining the above two examples, we can specify a single article to be displayed along with a full navigation as follows:
% sblg -o article1.html -C article1.xml article1.xml article2.xml
This will fill the contents of article1.xml into the <article data-sblg-article="1"> but use both (along with any others) in the <nav data-sblg-nav="1">.
If we want to make an output article as in the above example for each element of the input, we could either run -C for each input element, or use -L to avoid re-running sblg for each input article, which can be costly for many articles!
% sblg -L article1.xml article2.xml
This re-writes the suffixes and fills in the <article data-sblg-article="1"> for article1.xml in article1.html, and so on. For each of these, it will fill in <nav data-sblg-nav="1">.
Standards
Input files and templates must be well-formed XML files. Output files are guaranteed to be XML as well. The Atom file template must be well-formed; output is guaranteed to satisfy the Atom 1.0 and Tag ID standards.
Authors
The sblg utility was written by Kristaps Dzonsons, kristaps@bsd.lv.
Caveats
Boolean XML values must have an attribute specified. In other words, <foo bar="1"> is valid, while <foo bar> is not.
HTML entity names with attributes, e.g. <a title="foo…">, are not properly passed to output.