RScraper is a family of independent tools including a scraper, browser addon, and chart generators.
- rtagger addon - the browser addon for tagging Reddit users
- tagger - the server for the browser addon
- hub - a GUI manager for the database and configuring the scraper
- init - one-off helper tools to initialise the database
- scraper - tool for scraping data from Reddit
- io - import/export tools (as an alternative to scraping Reddit yourself)
- man - UNIX man pages
- utils - CLI database admin tools
To install the rtagger browser addon, you do not need to install any of these packages; only the addon (or JavaScript script) is necessary. Only the server needs to install (and run) the rscraper-tagger package.
Even the server doesn't need any packages other than that one, though whoever is managing the server will want to install either the rscraper-io or the rscraper-scraper package to populate the database, the rscraper-gui package to manage it, and the rscraper-init package to initialise it.
See the hub usage guide for detailed instructions on using rscraper-hub.
See the man directory for more general instructions on using the other programs.
Debian-based systems can use the deb installer packages on the releases page: amd64 for x86_64 systems (most laptops and desktops), armhf for 32-bit ARM (e.g. Raspberry Pi). I have tested it on Ubuntu, Raspbian, and Debian. Other (up to date) Debian-based distros should also work.
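Choosing between the two installers can be sketched as a small shell function. This is only an illustration: `deb_arch_for` is a hypothetical helper, and the mapping of `armv7l` to armhf is an assumption about typical 32-bit Raspberry Pi systems. On a Debian-based system, `dpkg --print-architecture` reports the right label directly and is the more reliable source.

```shell
# Hypothetical helper: map a `uname -m` machine name to the deb
# architecture label used in the installer package names.
deb_arch_for() {
    case "$1" in
        x86_64) echo amd64 ;;   # most laptops and desktops
        armv7l) echo armhf ;;   # assumed: 32-bit ARM, e.g. Raspberry Pi
        *)
            echo "no prebuilt deb for $1" >&2
            return 1
            ;;
    esac
}

# Report the label for the current machine, or fall back to a hint.
deb_arch_for "$(uname -m)" || echo "build from source instead"
```

Any machine name not listed falls through to the error branch, which matches the advice to build from source on other systems.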
It should work on macOS and other Linux distros too, but I don't have access to such systems, so for now the only option there is to build from source.
Windows support is pending someone more knowledgeable about Windows builds helping out.
First install libcompsky:
regexp="https://github\.com/NotCompsky/libcompsky/releases/download/[0-9]+\.[0-9]+\.[0-9]+/libcompsky-[0-9]+\.[0-9]+\.[0-9]+-$(dpkg --print-architecture)\.deb"
url=$(curl -s https://api.github.com/repos/NotCompsky/libcompsky/releases/latest | grep -E "$regexp" | sed 's%.*"\(https://.*\)"%\1%g')
wget -O /tmp/libcompsky.deb "$url"
sudo apt install /tmp/libcompsky.deb
Then set the array of packages you wish to install (init is not required but the configuration guide assumes it is installed)
Then download the packages you want from the releases page.
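Once the packages are downloaded, they can be installed with apt in the same way as libcompsky. The following is a minimal sketch, not the project's own install script: `select_debs` is a hypothetical helper, and it assumes the downloaded files are named `rscraper-<name>...deb`, which may not match the actual release asset names.

```shell
# Hypothetical helper: print the .deb files in directory $1 whose names
# start with rscraper-<name>, for each package name given after it.
select_debs() {
    dir="$1"
    shift
    for name in "$@"; do
        for f in "$dir"/rscraper-"$name"*.deb; do
            # Skip the unexpanded pattern when nothing matches.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
    return 0
}

# Usage (assumed download directory and package choice):
# sudo apt install $(select_debs "$HOME/Downloads" init tagger scraper)
```

Passing full paths to `apt install` lets apt resolve dependencies for the local files, as in the libcompsky step.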
Then see the configuration guide.
If installation still fails for some reason, see installing on Ubuntu (and also make a bug report).
Windows is not supported yet, but I am very open to PRs. Some weeks ago it cross-compiled fine, so there shouldn't be many changes to the source code required to build it on or for Windows.
The big hurdle to build for Windows is doing one of the following:
- Modifying CMake to cross-compile on MXE for Windows
- Converting the CMake to pro files for qmake
- Converting the CMake to work with Visual Studio files
The person who issues a PR to allow building for Windows will get prominent recognition at the top of the page here. Create an issue if you want to discuss the steps I took in cross-compiling test versions.
To build from source, see BUILDING.md.
This is still in active development, so expect quite a few things to change.
What should stay the same is the database structure. Purely aesthetic changes - such as the names of columns - will not be made.
Backwards-incompatible changes are very unlikely in the database structure (defined in init.sql), tagger, init and io, and unlikely in utils.
Features may be added, in particular to rscraper-hub.