# Changelog
## 0.3 - 2020-09-09
### Added
- Parser for lolifox.cc.
### Removed
- BasicScraper. It is no longer needed now that there is a faster threaded version.
### Fixed
- The User-Agent is now applied correctly everywhere.
## 0.2.2 - 2020-07-20
### Added
- Parser for 8kun.top.
### Changed
- The check for whether a site is supported now just looks for a
substring.
- Edited the regex that checks whether a filename is just "image.ext" so that it
only checks that "image." is followed by 1 to 4 characters.
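
The two changes above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the `SUPPORTED_SITES` tuple, the function names, and the exact pattern are assumptions made for the example.

```python
import re

# Hypothetical list of supported hosts; the real project may store this differently.
SUPPORTED_SITES = ("4chan.org", "lainchan.org", "2ch.hk", "8kun.top")

# Assumed pattern: "image." followed by 1 to 4 characters and nothing else.
GENERIC_NAME = re.compile(r"image\..{1,4}")


def is_supported(url: str) -> bool:
    """Return True if any supported host appears as a substring of the URL."""
    return any(site in url for site in SUPPORTED_SITES)


def is_generic_name(filename: str) -> bool:
    """Return True for names like "image.png" that say nothing about the file."""
    return GENERIC_NAME.fullmatch(filename) is not None


print(is_supported("https://boards.4chan.org/g/thread/1"))  # True
print(is_generic_name("image.jpeg"))                         # True
print(is_generic_name("holiday.png"))                        # False
```

A plain substring test is simple, but it also accepts any URL that merely contains a supported host name somewhere in its text, which may or may not matter for this tool.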
### Notes
- Regarding the file size issue on 2ch.hk: it usually does report the size in
kB. The problem is that sometimes the value is just wrong.
## 0.2.1 - 2020-07-18
### Changed
- The program now tells you which thread doesn't exist or is about to be
scraped. This is useful for batch processing with scripts.
## 0.2.0 - 2020-07-18
### Added
- Threaded version of the scraper, so now it is fast as heck!
### Fixed
- Handled the situation where the OP's post has no comment and/or subject.
## 0.1.0 - 2020-07-08
### Added
- JSON parsers for 4chan.org, lainchan.org and 2ch.hk.
- Basic straightforward scraper that downloads files one by one.
### Issues
- 2ch.hk: I can't figure out what exactly it reports as the size and hash of a
file. Example: a file may have a size of 127798 bytes (125K) while 2ch reports
150, and the reported hash doesn't equal the computed one.
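
The size mismatch in the example above can be checked with a little arithmetic. The sketch below is only an illustration: the helper names are made up here, and it assumes the reported number is meant to be either decimal kB or binary KiB.

```python
def to_kb(size_in_bytes: int) -> int:
    """Size in decimal kilobytes (1 kB = 1000 bytes), rounded to nearest."""
    return round(size_in_bytes / 1000)


def to_kib(size_in_bytes: int) -> int:
    """Size in binary kibibytes (1 KiB = 1024 bytes), rounded to nearest."""
    return round(size_in_bytes / 1024)


def size_is_plausible(actual_bytes: int, reported: int) -> bool:
    """True if the reported number matches the real size in kB or in KiB."""
    return reported in (to_kb(actual_bytes), to_kib(actual_bytes))


actual = 127798   # bytes, from the example above
reported = 150    # value reported by 2ch.hk in the example above

print(to_kib(actual))                       # 125
print(size_is_plausible(actual, reported))  # False
```

Under either unit the reported 150 does not match the real size, which is consistent with the issue described above.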