Create bot that periodically checks if citations still exist in policies #23

Open
milesmcc opened this issue Jun 29, 2020 · 7 comments

@milesmcc
Collaborator

What do you want added to PrivacySpy?
A bot that checks if policies still contain their citations, and if not, flags them for revision/as out of date.

Have you considered implementing this addition yourself and submitting a pull request?
Yes, I’ll probably end up building this myself. Creating an issue so that I don’t forget.

Additional context
None
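
The check requested in this issue can be sketched roughly as follows. This is a minimal illustration in Python, not the bot's actual code; the function name `find_missing_citations` and the sample strings are hypothetical, and it assumes the policy text and the stored citations are already available as plain strings.

```python
def find_missing_citations(policy_text, citations):
    """Return the citations that no longer appear verbatim in the policy text."""
    return [quote for quote in citations if quote not in policy_text]


# Hypothetical sample data, for illustration only.
policy = "We collect your email address. We never sell personal data."
citations = [
    "We never sell personal data.",
    "We delete logs after 30 days.",
]

missing = find_missing_citations(policy, citations)
for quote in missing:
    print(f"Citation not found, flag for revision: {quote!r}")
```

A plain substring test like this is strict: any difference in whitespace or punctuation between the stored quote and the live policy counts as a miss.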

@milesmcc milesmcc added the enhancement New feature or content label Jun 29, 2020
@milesmcc milesmcc self-assigned this Jun 29, 2020
@privacyspy-bot

Thanks for submitting this issue. @ibarakaiev has been assigned to determine next steps.

To learn about the PrivacySpy contribution process, check out the contribution guide.

@doamatto
Collaborator

Currently working on it. I can do it with the current API, but it would be super slow because all of the products are in one JSON file. I proposed adding a slug endpoint for products, which would make this much faster, keep the files smaller, and expand the possibilities of the PS API (the issue is #101).

@doamatto
Collaborator

With the close of #101 and #103, it appears my wish has been granted (yay!).

So far, this is what I have. It's lacking a few checks and is yet to be tested, but I'll let you guys know once it's in a prod-ready state™.

@doamatto
Collaborator

As I feared in a past mention, citations that don't match the policy text literally get caught as problems, which would then cause trouble (especially in prod). My proof of this is the 65 issues that were generated in testing and my now rate-limited PAT (which was funny to watch happen). It may be best to fix those citations before moving the bot to work on the official repo. I'll also do my best to make it work with the existing bot.
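
One possible way to reduce false positives from non-literal matches is to normalize both sides before comparing. The sketch below is in Python for illustration and is an assumption about what could work, not what the bot does; `normalize` and `is_cited` are hypothetical names, and the choice to fold case, whitespace, and typographic quotes is a judgment call.

```python
import re
import unicodedata

# Map typographic quotes to their ASCII equivalents.
_QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def normalize(text):
    """Fold compatibility characters, curly quotes, whitespace runs, and case."""
    text = unicodedata.normalize("NFKC", text).translate(_QUOTES)
    return re.sub(r"\s+", " ", text).strip().lower()


def is_cited(policy_text, citation):
    """True if the citation appears in the policy after normalization."""
    return normalize(citation) in normalize(policy_text)


print(is_cited("We don\u2019t sell\n  your   data.", "We don't sell your data."))
```

Looser matching trades missed changes for fewer spurious flags, so how aggressive the normalization should be is left open here.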

@doamatto
Collaborator

The bot is safe to use for tests with CI mode enabled (this doesn't create GitHub issues). Right now I'm still having a few problems with how issues are created, so I need to polish that before I can give the OK there. In case you guys need another link to the repo, it's here. Once it's entirely finished, I'm fine with transferring it into /orgs/Politiwatch.

cc @ibarakaiev @milesmcc
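
A CI mode like the one described above (report findings without creating GitHub issues) could look something like this. It is a hedged Python sketch, not the bot's Dart implementation; `report`, `ci_mode`, and the `create_issue` callback are all hypothetical names, and the callback stands in for whatever code would actually open an issue.

```python
def report(missing_by_product, ci_mode=True, create_issue=None):
    """Report missing citations and return a process-style exit code.

    In CI mode (or when no issue callback is supplied) findings are only
    printed. Otherwise each finding is passed to create_issue.
    """
    findings = [
        f"{slug}: citation not found: {quote!r}"
        for slug, quotes in sorted(missing_by_product.items())
        for quote in quotes
    ]
    for finding in findings:
        if ci_mode or create_issue is None:
            print(finding)
        else:
            create_issue(finding)
    # Non-zero signals a failed check to the CI runner.
    return 1 if findings else 0


exit_code = report({"example-product": ["We delete logs after 30 days."]})
```

Defaulting to `ci_mode=True` means a misconfigured run prints instead of opening issues, which would have avoided a burst of test issues against a rate-limited token.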

@ibarakaiev
Collaborator

Looks awesome, @doamatto! I see that you are still working on pre-rendering content and fixing character issues; I can't help you with code since I don't know Dart, but I know that at least in Python there are packages that do both of those things. Maybe try searching for something similar in Dart? (Or quickly rewrite in Python/Node.js, lol.)

@doamatto
Collaborator

doamatto commented Jul 1, 2021

> working on pre-rendering content and fixing character issues [...] I know that at least in Python there are packages that do both of those things

So far, it's actually been pretty smooth sailing. The only hiccup is with JavaServer Pages, but I think I have a fix for that too. The main issue I need to remedy is making sure that things like links are handled properly (should be easy to do with the html and yamh packages). Outside of that, it's just a matter of fixing formatting and adding workarounds for the escape sequences found in the vast majority of quotes (\", \n\n, and \r\n on older quotes).
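
The two clean-up steps mentioned above (reducing HTML such as links to visible text, and undoing escape sequences in older stored quotes) might be sketched like this. It uses Python's standard `html.parser` for illustration rather than the Dart packages named above; `visible_text` and `clean_stored_quote` are hypothetical names, and the assumption that older quotes contain literal backslash sequences is mine.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and discards tags, so link text stays inline."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html):
    """Return the text content of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def clean_stored_quote(quote):
    """Undo literal \\" and \\r\\n / \\n sequences, then collapse whitespace."""
    quote = quote.replace('\\"', '"')
    quote = quote.replace("\\r\\n", "\n").replace("\\n", "\n")
    return " ".join(quote.split())


page = '<p>We share data with <a href="/partners">our partners</a>.</p>'
print(visible_text(page))
print(clean_stored_quote('We \\"never\\" sell data.\\r\\nEver.'))
```

This extractor also keeps the contents of `script` and `style` elements, so a real pipeline would likely need to skip those.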

> I can't help you with code since I don't know Dart,

All good. If you did want to help in some way, try making sure those older quotes I mentioned above are slowly moved out (feel free to add to the PR I'm using, #10), or look into the mystery bugs to see why they are getting flagged.

The main reason I'm using Dart as opposed to Python or Node is that I (or really anyone) can easily compile it to a single binary, meaning it'll be both fast on its own and easy to use with things like cron scripts, GitHub Actions, et al. That, and I never re-installed Node until just now, when I needed it to make changes to the products and test locally (much faster than continuously pinging the official servers).
