support for creating a url_history.txt in each rips/<name> directory.… #861

Open
wants to merge 1 commit into
base: main

owski commented Aug 12, 2018

… this will considerably improve performance

Category

This change is exactly one of the following (please change [ ] to [x]) to indicate which:

  • [ ] a bug fix (Fix #...)
  • [ ] a new Ripper
  • [x] a refactoring
  • [x] a style change/fix

Description

As the url_history.txt file grows, performance decreases. This PR saves url_history.txt in each individual rips/<name> directory being processed instead. This does not break backwards compatibility with existing installations: if a url_history.txt file exists in ~/.config/ripme (or wherever the rip.properties file is), RipMe will continue to use that file until it is removed or renamed.

Example:

 java -jar target/ripme-1.7.61-jar-with-dependencies.jar -u 'https://www.instagram.com/puppies/'

head rips/instagram_puppies/url_history.txt
https://scontent-lga3-1.cdninstagram.com/vp/1ec454f99a4c6fd0ab16256b9156a58e/t51.2885-15/e35/38531095_542541279509261_7708136067639017472_n.jpg
https://scontent-lga3-1.cdninstagram.com/vp/3d725a1dcef1dfd264dd440a3517c326/t50.2886-16/37779799_743762896015565_6202087789383647232_n.mp4
https://scontent-lga3-1.cdninstagram.com/vp/7b5630909646ef319f035d9ee2e1b633/t50.2886-16/38130576_444954456007413_605107561397485568_n.mp4
https://scontent-lga3-1.cdninstagram.com/vp/4da5931f57d371d4ecd7a4b2bc30f31a/t51.2885-15/e35/37304592_641259729588873_5656126192554606592_n.jpg
https://scontent-lga3-1.cdninstagram.com/vp/d7bdab173e467d8074ccd4c6fed61fdd/t50.2886-16/37223956_256773365104275_5677799125769404342_n.mp4
https://scontent-lga3-1.cdninstagram.com/vp/cf8509a8704057005b028971bbc91578/t51.2885-15/e35/37202580_1571061726526461_4026329117543628800_n.jpg
https://scontent-lga3-1.cdninstagram.com/vp/86a87f388e25b285a41285853c2db34e/t51.2885-15/e35/35575312_501913830222914_9132211729659330560_n.jpg
https://scontent-lga3-1.cdninstagram.com/vp/5e869e98aaf810e8081073f98e127591/t51.2885-15/e35/35575471_2096872163659383_5062661617481678848_n.jpg
https://scontent-lga3-1.cdninstagram.com/vp/0eaf03e6b85f13768388375422c83d50/t51.2885-15/e35/35574810_1862890253749285_4106889737510322176_n.jpg
https://scontent-lga3-1.cdninstagram.com/vp/2a095643616b5de19d8dcc47599223b9/t51.2885-15/e35/35617062_2094110887524918_5019118049328889856_n.jpg
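
The fallback rule from the description could be sketched roughly as below. This is only an illustration, not the code in this PR: the class UrlHistoryLocator, the method resolve, and its two parameters are hypothetical names, and it assumes the caller already knows the directory that holds rip.properties and the rips/<name> directory being processed.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not taken from the RipMe sources. */
final class UrlHistoryLocator {
    static final String HISTORY_FILE = "url_history.txt";

    /**
     * Picks the history file for one rip.
     *
     * @param configDir directory that holds rip.properties (assumed known by the caller)
     * @param ripDir    the rips/<name> directory being processed
     */
    static Path resolve(Path configDir, Path ripDir) {
        Path global = configDir.resolve(HISTORY_FILE);
        // Backwards compatibility: an existing global file keeps being used
        // until the user removes or renames it.
        if (Files.isRegularFile(global)) {
            return global;
        }
        // Otherwise each rip directory gets its own, much smaller history file.
        return ripDir.resolve(HISTORY_FILE);
    }
}
```

With this shape, removing or renaming the global file is all it takes for a later run to switch to the per-directory files.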

Testing

Required verification:

  • [x] I've verified that there are no regressions in mvn test (there are no new failures or errors).
  • [x] I've verified that this change works as intended.
    • [x] Downloads all relevant content.
    • [x] Downloads content from multiple pages (as necessary or appropriate).
    • [x] Saves content at reasonable file names (e.g. page titles or content IDs) to help easily browse downloaded content.
  • [x] I've verified that this change did not break existing functionality (especially in the Ripper I modified).

Optional but recommended:

  • [ ] I've added a unit test to cover my change.

@coveralls

Coverage Status

Coverage decreased (-0.09%) to 38.763% when pulling d01fe77 on owski:master into b7f8b0e on RipMeApp:master.

@cyian-1756
Collaborator

cyian-1756 commented Aug 16, 2018

Overall this looks good to me, but I'd like a config option to always use the old location of url_history.txt. Basically, just add an if (!Utils.getConfigBoolean("history.use_global_url_history", false)) check at line 245.
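
One way the requested option could interact with the per-directory default is sketched below. It is a guess at the intent of the review comment, not the project's implementation: java.util.Properties stands in for rip.properties, HistoryLocationPolicy and its methods are hypothetical names, and only the key history.use_global_url_history comes from the comment above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical sketch; java.util.Properties stands in for rip.properties. */
final class HistoryLocationPolicy {
    // Key name as proposed in the review comment.
    static final String USE_GLOBAL_KEY = "history.use_global_url_history";
    static final String HISTORY_FILE = "url_history.txt";

    /** Reads the flag, defaulting to false so the per-directory behaviour stays the default. */
    static boolean useGlobal(Properties config) {
        return Boolean.parseBoolean(config.getProperty(USE_GLOBAL_KEY, "false"));
    }

    /** Chooses the history file for one rip under the assumed policy. */
    static Path resolve(Properties config, Path configDir, Path ripDir) {
        Path global = configDir.resolve(HISTORY_FILE);
        // The flag forces the old location; an existing global file is still
        // honoured even without the flag, as the PR description states.
        if (useGlobal(config) || Files.isRegularFile(global)) {
            return global;
        }
        return ripDir.resolve(HISTORY_FILE);
    }
}
```

Defaulting the flag to false keeps the new per-directory layout for fresh installations, while users who prefer a single file can opt back in with one line in their properties file.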

metaprime changed the base branch from master to main on January 5, 2025 02:11