The configuration for ukwa-pywb consists of several environment variables and the main config.yaml file.
The following environment variables can be used:
- UKWA_INDEX -- Points to the CDX/index source data. Required with the default config.yaml.
- UKWA_ARCHIVE -- Points to the WARC/archive source data. Required with the default config.yaml.
- WEBHDFS_USER -- For use with webhdfs:// archive source data, to set the WebHDFS user.name= field.
- WEBHDFS_TOKEN -- For use with webhdfs:// archive source data, to set the WebHDFS delegation= token.
- REDIS_URL -- A Redis URL (e.g. redis://redis:6379/0) pointing to a Redis instance for use with the Single-Concurrent Lock system.
- TEST_SESSION_LOCK_INTERVAL -- A custom interval to override the concurrent-lock timeout, designed for testing only. See Single-Concurrent Lock for more details.
- LOCKS_AUTH -- If set to username:password, provides Basic Auth access restrictions to all session lock operations. See Single-Concurrent Lock for more details.
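
As a rough illustration of how the variables listed above fit together, the following Python sketch checks an environment before the service is started. It is not part of ukwa-pywb: the `check_env` helper and its messages are hypothetical, and treating an empty username or password in LOCKS_AUTH as an error is an assumption made for this example.

```python
# Hypothetical pre-flight check; not part of ukwa-pywb itself.

# Variables documented as required with the default config.yaml.
REQUIRED_WITH_DEFAULT_CONFIG = ("UKWA_INDEX", "UKWA_ARCHIVE")


def check_env(env):
    """Return a list of human-readable problems found in `env` (a mapping)."""
    problems = []

    # Both source locations must be present and non-empty.
    for name in REQUIRED_WITH_DEFAULT_CONFIG:
        if not env.get(name):
            problems.append(f"{name} is not set")

    # LOCKS_AUTH is optional, but when present it should be username:password.
    # Assumption: both halves must be non-empty.
    locks_auth = env.get("LOCKS_AUTH")
    if locks_auth is not None:
        user, sep, password = locks_auth.partition(":")
        if not (user and sep and password):
            problems.append("LOCKS_AUTH must look like username:password")

    return problems
```

Calling `check_env(os.environ)` (after `import os`) and printing the returned list would report anything missing in the current shell or container environment.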
The default config.yaml is a good starting point for customizing the configuration.
The ukwa-pywb config supports the default pywb configuration options plus the following additional options:
To enable Memento Prefer header support, set this option:
enable_prefer: true
To enable localization, the following entries are needed:
locales_root_dir: ./i18n/translations/
locales: ['en', 'cy']
See Localization Docs for more info on localization.
The single-concurrent lock mode can be enabled per collection by setting single-use-lock: true in the collection config:
collections:
    ukwa:
        ...
        single-use-lock: true
See Single-Concurrent Lock docs for more info.
Access control files can be added to any collection, e.g.:
collections:
    ukwa:
        ...
        acl_paths:
            - /webarchive/block_list.aclj
See Access Controls for more info on the ACL configuration.
Environment variables can be referenced in the config as ${...} wherever needed, for example to define paths:
collections:
    ukwa:
        ...
        acl_paths:
            - ${BLOCK_ACL_PATH}
(This is how UKWA_INDEX and UKWA_ARCHIVE are used in the default config.)
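
To make the ${...} substitution concrete, here is a minimal Python sketch of this style of placeholder expansion. It is an illustration only, not the code pywb uses: the `expand_placeholders` function is hypothetical, and leaving unknown variables untouched is an assumption of this sketch that may differ from the real behaviour.

```python
import re

# Matches ${NAME}, where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(text, env):
    """Replace each ${NAME} in `text` with env[NAME].

    Assumption for this sketch: placeholders whose variable is not
    defined in `env` are left exactly as written.
    """
    def replace(match):
        name = match.group(1)
        return env.get(name, match.group(0))

    return _PLACEHOLDER.sub(replace, text)
```

With BLOCK_ACL_PATH set to /webarchive/block_list.aclj, the line `- ${BLOCK_ACL_PATH}` would expand to the same entry shown in the earlier acl_paths example.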
When mounting a custom config directory as /webarchive in Docker, it should contain a root config.yaml file.
Any paths to other files within the directory, such as access control files, should be absolute paths under /webarchive (for example, /webarchive/block_list.aclj).
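
The path rule above can be checked mechanically. The following Python sketch is a hypothetical helper, not something ukwa-pywb provides: it tests whether a configured path is absolute and resolves to a location under the mounted /webarchive directory. It works on the path strings only and does not touch the filesystem.

```python
import posixpath

# Mount point used inside the container, as described above.
CONFIG_ROOT = "/webarchive"


def is_under_config_root(path, root=CONFIG_ROOT):
    """Return True if `path` is absolute and lies at or below `root`."""
    if not posixpath.isabs(path):
        return False
    # Collapse "." and ".." segments so "/webarchive/../etc" is rejected.
    normalized = posixpath.normpath(path)
    return posixpath.commonpath([root, normalized]) == root
```

For instance, `is_under_config_root("/webarchive/block_list.aclj")` is true, while a relative path such as `block_list.aclj` is rejected.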
See Deployment Docs for more info on deployment options.