Novel Website Parser

This project parses a novel website so that novels can be searched and retrieved more easily than with the site's own tools, which are too limited for efficient searching. It provides both a synchronous and an asynchronous parser; the asynchronous one is approximately 8 times faster.
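The gap between the two modes can be illustrated with a minimal sketch, assuming the run time is dominated by waiting on HTTP responses. It uses only the standard library: `fetch_page` is a hypothetical stand-in that sleeps instead of making a request (the project itself lists aiohttp for this), and the concurrency limit of 10 is an arbitrary value, not a setting taken from this repository.

```python
import asyncio
import time


async def fetch_page(page: int, delay: float = 0.05) -> str:
    """Hypothetical stand-in for one HTTP request; a real parser would await aiohttp here."""
    await asyncio.sleep(delay)  # simulates network latency without blocking other tasks
    return f"<html>page {page}</html>"


async def fetch_sequential(pages: list[int]) -> list[str]:
    # One request at a time: total time is roughly len(pages) * delay.
    return [await fetch_page(page) for page in pages]


async def fetch_concurrent(pages: list[int], limit: int = 10) -> list[str]:
    # Up to `limit` requests wait on the network at the same time.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(page: int) -> str:
        async with semaphore:
            return await fetch_page(page)

    # gather() returns results in the same order as the input pages.
    return await asyncio.gather(*(bounded(page) for page in pages))


def timed(coro) -> tuple[list[str], float]:
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    pages = list(range(1, 11))
    _, sequential_time = timed(fetch_sequential(pages))
    _, concurrent_time = timed(fetch_concurrent(pages))
    print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

The actual ratio depends on the site's latency, any rate limiting, and how much time goes into parsing rather than waiting, so the numbers printed here say nothing about the project's measured speedup.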

Features

  • Synchronous Parsing: Basic parsing of the novel website.
  • Asynchronous Parsing: The same parsing run asynchronously, approximately 8 times faster than the synchronous parser.
  • Data Storage: Extracted data is saved in data/novels_data.json.
  • Error Handling: Robust error handling for fetching and parsing novel data.
  • Logging: Comprehensive logging for monitoring and debugging.
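
The storage, error-handling, and logging features above could fit together roughly as follows. This is a sketch under stated assumptions, not the project's code: the record fields (`title`, `chapters`), the logger name, and the function names are all hypothetical; only the output path `data/novels_data.json` comes from this README.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("novel_parser")  # hypothetical logger name


def parse_novel(raw: dict) -> dict:
    """Validate one scraped record; the field names are illustrative only."""
    title = raw["title"].strip()  # raises KeyError if the field is missing
    if not title:
        raise ValueError("empty title")
    return {"title": title, "chapters": int(raw.get("chapters", 0))}


def collect_novels(raw_records: list[dict]) -> list[dict]:
    """Parse every record, logging and skipping the ones that fail."""
    novels = []
    for index, raw in enumerate(raw_records):
        try:
            novels.append(parse_novel(raw))
        except (KeyError, ValueError, TypeError, AttributeError) as exc:
            # One bad record should not abort the whole run.
            logger.warning("Skipping record %d: %r", index, exc)
    return novels


def save_novels(novels: list[dict], path: Path = Path("data/novels_data.json")) -> Path:
    """Write the parsed records as UTF-8 JSON, creating the directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        json.dump(novels, handle, ensure_ascii=False, indent=2)
    return path
```

Calling `logging.basicConfig(level=logging.INFO)` before `collect_novels` makes the skipped records visible on the console; `ensure_ascii=False` keeps non-English titles readable in the output file.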

Dependencies

This project uses the following libraries:

  • requests
  • beautifulsoup4
  • lxml
  • aiohttp

Installation

  1. Clone the repository:
git clone git@github.com:RomaP13/animestuff-parser.git
cd animestuff-parser/
  2. Install dependencies using pipenv:
pipenv install

Usage

Synchronous Parsing

To run the synchronous parser:

make parse

Asynchronous Parsing

To run the asynchronous parser:

make async_parse

License

MIT