September 2024 edition. A quick-and-dirty web scraper for LDS General Conference that uses Pandoc to convert the scraped HTML to Org mode files.
You will need Leiningen 2.0.0 or above installed.
You will also need Pandoc, which is free and is also available through Linux package repositories. Pandoc performs the format conversion from HTML.
The output is Emacs Org mode, which ships with Emacs and is supported by Pandoc. Org syntax works like a fuller version of Markdown (Org is older and not inherently compatible with Markdown, although exporting to Markdown is possible).
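The HTML-to-Org step can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: the namespace `example.html-to-org` and the functions `pandoc-args` and `html->org` are hypothetical names made up for this example. It assumes `pandoc` is on your `PATH` and uses `clojure.java.shell/sh` to pipe an HTML string through it.

```clojure
(ns example.html-to-org
  (:require [clojure.java.shell :as shell]))

(defn pandoc-args
  "Argument vector for a pandoc call that reads HTML on stdin
  and writes Org on stdout (hypothetical helper for this sketch)."
  []
  ["pandoc" "--from=html" "--to=org"])

(defn html->org
  "Pipe an HTML string through pandoc and return the Org text.
  Throws if pandoc exits with a non-zero status."
  [html]
  (let [{:keys [exit out err]}
        (apply shell/sh (concat (pandoc-args) [:in html]))]
    (if (zero? exit)
      out
      (throw (ex-info "pandoc failed" {:exit exit :err err})))))
```

At the REPL, something like `(html->org "<h1>Title</h1><p>Some <em>text</em>.</p>")` should return the converted Org text as a string; the exact output depends on your Pandoc version.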
Right now this can only be run from the REPL, like:
gc-scraper.general-conference> (get-web-gc "my/slash/ending/output/directory/")
- [ ] Clean up handling of references (footnotes)
- [X] More robust output-dir handling
- [ ] Command-line operability
- [X] Allow specifying different conference years or URLs
Copyright © 2008-2024 Tory S. Anderson, released under the GPL v3 license. Use and modification are fine; attribution appreciated.