public-inbox-extindex - Man Page
create and update external search indices
Synopsis
public-inbox-extindex [Options] EXTINDEX_DIR INBOX_DIR...
Description
public-inbox-extindex creates and updates an external search and overview database used by the read-only public-inbox PSGI (HTTP), NNTP, and IMAP interfaces. This requires either the Search::Xapian XS bindings OR the Xapian SWIG bindings, along with DBD::SQLite and DBI Perl modules.
Options
- -j JOBS
- --jobs=JOBS
- --no-fsync
- --dangerous
- --rethread
- --max-size SIZE
- --batch-size SIZE
These switches behave as they do for public-inbox-index(1).
- --all
Index all publicinbox entries in PI_CONFIG.
publicinbox entries indexed by public-inbox-extindex can have full Xapian searching abilities with the per-publicinbox indexlevel set to basic and their respective Xapian (xap15 or xapian15) directories removed. For multiple public-inboxes where cross-posting is common, this allows significant space savings on Xapian indices.
- --gc
Perform garbage collection instead of indexing. Use this if inboxes are removed from the extindex, or if messages are purged or removed from some inboxes.
- --reindex
Forces a re-index of all messages in the extindex. This can be used for in-place upgrades and bugfixes while read-only server processes are utilizing the index. Keep in mind this roughly doubles the size of the already-large Xapian database.
The extindex locks will be released roughly every 10s to allow public-inbox-mda(1) and public-inbox-watch(1) processes to write to the extindex.
- --fast
Used with --reindex, it will only look for new and stale entries and not touch already-indexed messages.
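The switches above combine into a few routine invocations. The following is a minimal sketch rather than an authoritative recipe: the EXTINDEX_DIR path, the run helper, and the extindex_* function names are assumptions made for illustration, and only switches documented on this page are used. Setting DRY_RUN prints each command instead of executing it.

```shell
#!/bin/sh
# Hypothetical location; substitute the real EXTINDEX_DIR.
EXTINDEX_DIR=${EXTINDEX_DIR:-/path/to/extindex_dir}

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
	if [ -n "${DRY_RUN:-}" ]; then
		echo "$*"
	else
		"$@"
	fi
}

# Create or update the extindex for every publicinbox entry in PI_CONFIG.
extindex_update() {
	run public-inbox-extindex --all "$EXTINDEX_DIR"
}

# Garbage-collect after inboxes or messages were removed
# (assumed form: --gc followed by the extindex directory).
extindex_gc() {
	run public-inbox-extindex --gc "$EXTINDEX_DIR"
}

# Reindex, touching only new and stale entries.
extindex_fast_reindex() {
	run public-inbox-extindex --reindex --fast --all "$EXTINDEX_DIR"
}
```

For example, DRY_RUN=1 extindex_gc prints the garbage-collection command line without touching the index.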
Files
Configuration
public-inbox-extindex does not write to the public-inbox-config(5) file; the extindex entry must be entered manually. The extindex name of all is a special case which corresponds to indexing --all inboxes. An example for --all is as follows:

    [extindex "all"]
        topdir = /path/to/extindex_dir
        url = all
        coderepo = foo
        coderepo = bar

Putting an extindex entry in the config allows PublicInbox::WWW to use it. You can have any number of extindex.$NAME sections where $NAME is something other than all to display a union of several inboxes.
See public-inbox-config(5) for more details.
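Because public-inbox-extindex never writes the config itself, the entry has to be created by hand or with a tool. As a sketch, assuming the config file is in git-config(1) format and git is installed, the first helper below writes the topdir and url keys of an extindex named all; the coderepo keys are left out because they refer to site-specific sections. The second helper sets the per-publicinbox indexlevel to basic, as described under --all. Both function names and their arguments are hypothetical and not part of public-inbox.

```shell
#!/bin/sh
# add_extindex_all CONFIG_FILE TOPDIR
# Writes an [extindex "all"] section; both arguments are caller-supplied.
add_extindex_all() {
	git config -f "$1" extindex.all.topdir "$2" &&
	git config -f "$1" extindex.all.url all
}

# set_indexlevel_basic CONFIG_FILE INBOX_NAME
# Sets publicinbox.INBOX_NAME.indexlevel to basic for one inbox.
set_indexlevel_basic() {
	git config -f "$1" "publicinbox.$2.indexlevel" basic
}
```

A call such as add_extindex_all "$HOME/.public-inbox/config" /path/to/extindex_dir would normally be followed by a public-inbox-extindex --all run. Removing the per-inbox Xapian directories is deliberately left as a manual step.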
Environment
- PI_CONFIG
Used to override the default "~/.public-inbox/config" value.
- XAPIAN_FLUSH_THRESHOLD
The number of documents to update before committing changes to disk. This environment variable is handled directly by Xapian; refer to the Xapian API documentation for more details.
Setting XAPIAN_FLUSH_THRESHOLD or publicinbox.indexBatchSize for a large --reindex may cause public-inbox-mda(1), public-inbox-learn(1) and public-inbox-watch(1) tasks to wait long and unpredictable periods of time during --reindex.
Default: none, uses publicinbox.indexBatchSize
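How the config path is chosen can be sketched in a few lines of shell. This only mirrors the default documented above; pi_config_path is a hypothetical helper and not part of public-inbox, and treating an empty PI_CONFIG like an unset one is an assumption.

```shell
#!/bin/sh
# Print the config path: PI_CONFIG when set (and non-empty, by assumption),
# otherwise the documented default under the home directory.
pi_config_path() {
	echo "${PI_CONFIG:-$HOME/.public-inbox/config}"
}
```

An alternate config can be selected for a single run by prefixing the command, for instance PI_CONFIG=/path/to/alt/config public-inbox-extindex --all /path/to/extindex_dir, where both paths are placeholders.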
Upgrading
Occasionally, public-inbox will update its schema version and require a full index by running this command.
Contact
Feedback welcome via plain-text mail to <mailto:meta@public-inbox.org>
The mail archives are hosted at <https://public-inbox.org/meta/> and <http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>
Copyright
Copyright all contributors <mailto:meta@public-inbox.org>
License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>
See Also
Search::Xapian, DBD::SQLite
Referenced By
lei(1), lei-add-external(1), lei-overview(7), lei-reindex(1), public-inbox-config(5), public-inbox-tuning(7).