This is a tool for scraping files from imageboards' threads.

It extracts the files from the JSON version of a thread and then downloads them
into the specified output directory. If no output directory is specified, it
creates the following directory hierarchy in the working directory:

    <imageboard name>
      |-<board name>
        |-<thread>
          |-[!op.txt]
          |-...
          |-...
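To make the two steps above more concrete, here is a minimal Python sketch. It is not the tool's actual code. It assumes a 4chan-style thread JSON, where a post with an attachment carries `tim` and `ext` fields and the file lives at `https://i.4cdn.org/<board>/<tim><ext>`. The names `file_urls`, `default_output_dir` and the `sample` data are hypothetical and exist only for illustration.

```python
from pathlib import Path


def file_urls(thread: dict, board: str) -> list:
    """Collect file URLs from a 4chan-style thread JSON object (assumed format)."""
    urls = []
    for post in thread.get("posts", []):
        # Posts without an attachment have no "tim" field, so they are skipped.
        if "tim" in post:
            urls.append(f"https://i.4cdn.org/{board}/{post['tim']}{post['ext']}")
    return urls


def default_output_dir(imageboard: str, board: str, thread_id: str,
                       base: Path = Path(".")) -> Path:
    """Build <imageboard name>/<board name>/<thread> under the working directory."""
    return base / imageboard / board / thread_id


# Hand-made sample shaped like a thread; hypothetical, not real data.
sample = {"posts": [
    {"no": 1100500, "com": "OP text", "tim": 1594234419000, "ext": ".jpg"},
    {"no": 1100501, "com": "a reply without a file"},
]}

print(file_urls(sample, "b"))
print(default_output_dir("4chan", "b", "1100500"))
```

Other imageboards expose different JSON layouts, so a real implementation would need one such extractor per imageboard.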

# Usage

```bash
scrapthechan [OPTIONS] (<url> | <imageboard> <board> <thread>)
```

`<url>` -- URL of a thread.

`<imageboard> <board> <thread>` -- imageboard name, board name and thread ID
given separately, e.g. `4chan b 1100500`.

`-o`, `--output-dir` -- output directory where all files will be dumped.

`--no-op` -- by default, the OP's post is saved in a `!op.txt` file. This flag
disables that behaviour. I decided to put an `!` in the name so that the file
appears at the top of a directory listing.

`-v`, `--version` prints the version of the program, and `-h`, `--help` prints
help for the program.
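For readers who want to see how a command line of this shape could be declared, the following is a rough `argparse` sketch. It is an illustration under assumptions, not the program's real parser: `build_parser`, `parse`, the `target` positional and the `0.0.0` version string are all hypothetical. Only the option names come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # argparse adds -h/--help on its own.
    parser = argparse.ArgumentParser(prog="scrapthechan")
    # Placeholder version string; the real value is not taken from the program.
    parser.add_argument("-v", "--version", action="version",
                        version="%(prog)s 0.0.0")
    parser.add_argument("-o", "--output-dir",
                        help="output directory where all files will be dumped")
    parser.add_argument("--no-op", action="store_true",
                        help="do not save the OP's post in !op.txt")
    # Either one value (<url>) or three (<imageboard> <board> <thread>).
    parser.add_argument("target", nargs="+",
                        metavar="<url> | <imageboard> <board> <thread>")
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if len(args.target) not in (1, 3):
        parser.error("expected <url> or <imageboard> <board> <thread>")
    return args


args = parse(["--no-op", "-o", "dump", "4chan", "b", "1100500"])
print(args.no_op, args.output_dir, args.target)
```

Collecting the positional arguments into one list and checking its length afterwards is one simple way to model the `(<url> | <imageboard> <board> <thread>)` alternative, since `argparse` has no built-in notion of "one or three" values.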

# Supported imageboards

- [4chan.org](https://4chan.org) since 0.1.0
- [lainchan.org](https://lainchan.org) since 0.1.0
- [2ch.hk](https://2ch.hk) since 0.1.0
- [8kun.top](https://8kun.top) since 0.2.2