This is a tool for scraping files from imageboards' threads.

It extracts the file links from a JSON version of a thread and downloads the files into the specified output directory. If no output directory is specified, it creates the following directory hierarchy in the working directory:

<imageboard name>
|-<board name>
  |-<thread>
    |-[!op.txt]
    |-...
  |-...
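
As a rough illustration of the two steps described above, here is a minimal sketch. It is not the tool's actual code: the function names `default_output_dir` and `extract_files` are hypothetical, and the JSON field names (`posts`, `tim`, `ext`, `filename`) and the `i.4cdn.org` file host are assumptions based on 4chan's public read-only JSON API. Other imageboards use different formats.

```python
from pathlib import Path
from typing import Dict, List


def default_output_dir(imageboard: str, board: str, thread: str,
                       base: Path = Path(".")) -> Path:
    """Build <imageboard>/<board>/<thread> under the base directory."""
    return base / imageboard / board / thread


def extract_files(thread_json: dict, board: str) -> List[Dict[str, str]]:
    """Collect a name and a download URL for every post that has a file.

    Assumes a 4chan-style thread JSON: a "posts" list in which posts with
    an attachment carry "tim" (stored name), "ext" and "filename".
    """
    files = []
    for post in thread_json.get("posts", []):
        if "tim" not in post:
            continue  # this post has no attachment
        stored_name = f"{post['tim']}{post['ext']}"
        files.append({
            "name": f"{post['filename']}{post['ext']}",
            "url": f"https://i.4cdn.org/{board}/{stored_name}",  # assumed host
        })
    return files
```

Each returned entry could then be downloaded into the directory given by `default_output_dir`, or into the directory passed with -o.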

Usage

scrapthechan [<url> | <imageboard> <board> <thread>] [-o,--output-dir] [--no-op]
 [-v,--version] [-h,--help]

There are two ways to pass a thread. One is to pass the full URL of the thread (the <url> argument); the other is to pass the thread as three components: <imageboard> is the name of the website (e.g. 4chan), <board> is the name of the board (e.g. wg), and <thread> is the number of the thread on that board.
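
To show how the two forms relate, here is a small sketch that splits a thread URL into the same three components. The helper name `split_thread_url` is hypothetical, and the URL shape `https://<sub>.<imageboard>.<tld>/<board>/thread/<number>` is an assumption modelled on 4chan; the tool's own parsing may differ.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_thread_url(url: str) -> Tuple[str, str, str]:
    """Return (imageboard, board, thread) for a 4chan-style thread URL."""
    parsed = urlparse(url)
    host_labels = (parsed.hostname or "").split(".")
    if len(host_labels) < 2:
        raise ValueError(f"cannot find an imageboard name in {url!r}")
    # Assumption: the label before the TLD names the imageboard,
    # e.g. "boards.4chan.org" -> "4chan".
    imageboard = host_labels[-2]

    parts = [p for p in parsed.path.split("/") if p]
    # Assumption: the path looks like /<board>/thread/<number>[/<slug>].
    if len(parts) < 3 or parts[1] != "thread" or not parts[2].isdigit():
        raise ValueError(f"not a thread URL: {url!r}")
    return imageboard, parts[0], parts[2]
```

Under these assumptions, `split_thread_url("https://boards.4chan.org/wg/thread/1234567")` gives the same triple as typing `4chan wg 1234567` on the command line.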

-o, --output-dir -- output directory where all files will be dumped.

--no-op -- by default the OP's post is saved in a !op.txt file. This flag disables this behaviour. I decided to put an ! in the name so that this file appears at the top of a directory listing.
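
A minimal sketch of this behaviour, assuming a 4chan-style thread JSON in which the first entry of `posts` is the OP and carries optional `sub` (subject) and `com` (comment) fields. The function name `save_op` and the exact text written to the file are illustrative only, not the tool's actual implementation.

```python
from pathlib import Path
from typing import Optional

OP_FILENAME = "!op.txt"


def save_op(thread_json: dict, out_dir: Path, no_op: bool = False) -> Optional[Path]:
    """Write the OP's post to !op.txt unless no_op is set."""
    if no_op:
        return None  # --no-op was given, skip the file entirely
    posts = thread_json.get("posts", [])
    if not posts:
        return None
    op = posts[0]  # assumption: the first post in the list is the OP
    text = f"{op.get('sub', '')}\n{op.get('com', '')}\n"
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / OP_FILENAME
    path.write_text(text, encoding="utf-8")
    return path
```

The ! prefix works because its character code is lower than that of any digit or letter, so a plain sort such as `sorted(["1594234397123.jpg", "!op.txt", "wall.png"])` puts `!op.txt` first. File managers that sort by locale rules may order it differently.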

-v, --version prints the version of the program, and -h, --help prints help for the program.
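
For readers who want to see how such a command line could be modelled, here is a sketch using Python's standard `argparse` module. It only mirrors the options listed above; the names `build_parser` and `parse_args`, the `target` attribute, and the version string are placeholders, not taken from the tool's source.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="scrapthechan")
    # Either one value (<url>) or three (<imageboard> <board> <thread>).
    parser.add_argument("target", nargs="+",
                        help="thread URL, or imageboard, board and thread")
    parser.add_argument("-o", "--output-dir", default=None,
                        help="directory where all files will be dumped")
    parser.add_argument("--no-op", action="store_true",
                        help="do not save the OP's post to !op.txt")
    # Placeholder version string; the real value is not known here.
    parser.add_argument("-v", "--version", action="version",
                        version="%(prog)s 0.0.0")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    if len(args.target) not in (1, 3):
        # parser.error prints the usage line and exits with status 2.
        parser.error("expected <url> or <imageboard> <board> <thread>")
    return args
```

The -h, --help option needs no explicit definition because `argparse` adds it automatically.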