novel-spider-banzhu

What is this?

This is a specialized version of the novel-scraping project novel-spider, targeting one specific website.

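Why write this? The target site is unusual enough that conventional scraping techniques do not work, so downloading novel content from it requires special handling.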

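The script scrapes with Selenium. Because of the site's peculiarities, manual intervention is required during scraping.

When scraping with Selenium, the script drives a real browser to simulate the actions of a real user.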

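The site enforces extremely strict Cloudflare human verification, which cannot be bypassed automatically even with Selenium. (Bypassed by manually obtaining the value of the cf_clearance cookie.)
The site's body text contains "image text": special images are used in place of characters. (Detected and replaced.)
The site's body text contains "font text": special glyphs are used in place of characters. (Detected and replaced.)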
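
The "detect and replace" step for image text and font text can be sketched roughly as follows. This is only an illustration, not the project's actual code: the lookup tables, the image file names, the private-use code points, and the function names (`replace_image_text`, `replace_font_text`, `clean_body`) are all hypothetical, and the real mappings have to be collected from the target site.

```python
import re

# Hypothetical lookup tables, for illustration only. The real mappings must be
# collected from the target site and are not reproduced here.
IMAGE_TO_CHAR = {"a1b2.png": "的", "c3d4.png": "一"}  # image file name -> character
GLYPH_TO_CHAR = {"\ue001": "人", "\ue002": "大"}      # obfuscated glyph -> character

# Matches an <img> tag and captures the value of its src attribute.
IMG_TAG = re.compile(r'<img[^>]*\bsrc="([^"]+)"[^>]*>')


def replace_image_text(html: str) -> str:
    """Replace known "image text" tags with the character they stand for."""
    def substitute(match: re.Match) -> str:
        filename = match.group(1).rsplit("/", 1)[-1]
        # Unknown images are left untouched so they can be inspected later.
        return IMAGE_TO_CHAR.get(filename, match.group(0))
    return IMG_TAG.sub(substitute, html)


def replace_font_text(text: str) -> str:
    """Replace known "font text" code points with the real character."""
    return text.translate(str.maketrans(GLYPH_TO_CHAR))


def clean_body(html: str) -> str:
    """Apply both replacements to a chunk of chapter body text."""
    return replace_font_text(replace_image_text(html))
```

For example, `clean_body('我<img src="/images/a1b2.png">书')` would swap the image tag for the mapped character, while an image that is missing from the table is kept as-is, which makes gaps in the mapping easy to spot.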
