Downloaded media files lose case in filenames if downloading albums without specifying any album filters #400
Comments
I am bemused by this. There is an algorithm that adds '(2)' to the second instance of the same filename. It depends on the folder structure, since duplicate filenames in different folders are OK. But I can't immediately see why using the album regex would change that. I'll review the code and see if I can spot anything. |
Might it be related to this? (line 135/258): I didn't check where settings.case_insensitive_fs is derived, but I am using Windows |
Having dug deeper into the code, I can see that indexing the media takes one of two routes depending on whether you specify --album-regex (or other variants of the album arguments). The index_photos_media route considers the value of case_insensitive_fs and sets the to_lower attribute, which eventually converts the filename to all lowercase if the value is true, or if it is unspecified and the derived value is true (which is the case on Windows). The album route does not use the case_insensitive_fs value at all. The question is: is it necessary to convert the filename to all lowercase if the filesystem is detected as case insensitive? If so, why does the album route work without doing so, or is there a bug for an edge case in the album route? |
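To make the difference between the two routes concrete, here is a minimal sketch of the behaviour described above. It is not the gphotos-sync code: the function names and the `detected_insensitive` parameter are hypothetical, and only `case_insensitive_fs` and `to_lower` are names taken from this thread.

```python
from typing import Optional


def effective_to_lower(case_insensitive_fs: Optional[bool],
                       detected_insensitive: bool) -> bool:
    """Hypothetical helper: an explicit setting wins, otherwise use detection."""
    if case_insensitive_fs is None:
        return detected_insensitive
    return case_insensitive_fs


def name_via_index_route(filename: str,
                         case_insensitive_fs: Optional[bool],
                         detected_insensitive: bool) -> str:
    """Hypothetical stand-in for the route taken without album filters."""
    to_lower = effective_to_lower(case_insensitive_fs, detected_insensitive)
    return filename.lower() if to_lower else filename


def name_via_album_route(filename: str) -> str:
    """Hypothetical stand-in for the album route, which ignores the setting."""
    return filename
```

With the setting unspecified on a filesystem detected as case insensitive, the first route lowercases the name and the second leaves it alone, which matches the symptom reported in this issue.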
Thanks for the detailed report. When the filesystem is case insensitive we do need to change the case internally, because the filename clash detection will fail otherwise. It is therefore a bug that we don't do this for album names; however, album names that differ only in case are reasonably unlikely, so I have got away with it so far. For your original issue, I'm guessing that the code thinks you have a case insensitive FS when you don't? Windows is usually case insensitive, but it does support other filesystems. |
The thing with Windows is that, by default, you are not allowed to have files with the same name in different case (so it fails this check: https://github.com/gilesknap/gphotos-sync/blob/main/src/gphotos_sync/Checks.py#L108), yet the case you give a file when naming it is retained. To be honest, converting everything to lowercase in the final output is not ideal, but I appreciate it could be a lot of work to change the filename conflict detection so that it retains the original case of filenames on a case insensitive filesystem. If you didn't fancy changing that, I think it would be best to fix the GoogleAlbumsSync route to consider the case_insensitive_fs flag, and then document clearly somewhere that if the filesystem is case insensitive everything will be converted to lowercase unless you pass '--case_insensitive_fs false' explicitly, with the disclaimer that doing so means files on Google Photos whose names differ only in case will overwrite each other. |
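The "case insensitive but case preserving" behaviour can be observed with a small probe. This is only a sketch of one way to test a folder, not the check in Checks.py; the function name and the probe filename are made up for illustration.

```python
import os
import tempfile


def probe_case_handling(folder):
    """Write a mixed-case file, then report how the filesystem treats its case.

    Returns (insensitive, stored_name): insensitive is True when the
    lowercase spelling resolves to the same file, and stored_name is the
    name the directory listing reports for the file that was written.
    """
    with tempfile.TemporaryDirectory(dir=folder) as tmp:
        written = "CaseProbe.txt"
        with open(os.path.join(tmp, written), "w") as handle:
            handle.write("probe")
        # On a case insensitive filesystem the lowercase path also exists.
        insensitive = os.path.exists(os.path.join(tmp, written.lower()))
        # The listing shows whether the original case was kept.
        stored_name = os.listdir(tmp)[0]
    return insensitive, stored_name
```

On a default Windows setup I would expect this to report the folder as case insensitive while still listing `CaseProbe.txt` in its original case, which is why lowercasing the output is not strictly required there.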
Re the Windows filesystem: I'm not sure I knew that, or at least if I did then I made a bad choice about how to handle these filesystems when I wrote the clash handling. I think we'd have to go with the latter fix you suggest. I should have made the clash check case insensitive and preserved case in filenames. But making this change now would cause loads of updates to existing syncs, and sorting out how to resolve those would be really fiddly. |
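For reference, a case insensitive clash check that still preserves case could look roughly like the sketch below. This is an illustration of the idea only, not the existing clash handling: the function name, the `(folder, filename)` input shape, and the exact ' (2)' suffix format are assumptions, and it ignores the corner case where a generated name collides with a real file.

```python
import os
from collections import defaultdict


def resolve_clashes(entries, case_insensitive_fs):
    """Return (folder, filename) pairs that do not clash on the target filesystem.

    Clashes are detected on a case-folded key when case_insensitive_fs is
    True, but the name that is returned keeps its original case.
    """
    counts = defaultdict(int)
    resolved = []
    for folder, filename in entries:
        if case_insensitive_fs:
            key = (folder.casefold(), filename.casefold())
        else:
            key = (folder, filename)
        counts[key] += 1
        occurrence = counts[key]
        if occurrence > 1:
            # Second and later instances get a numeric suffix before the extension.
            stem, ext = os.path.splitext(filename)
            filename = f"{stem} ({occurrence}){ext}"
        resolved.append((folder, filename))
    return resolved
```

Because the folder is part of the key, the same filename in different folders is still left alone, as it is today.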
Also the line 108 check covers other file systems that are fully case insensitive - not sure if they exist but I would not be surprised. |
I'm going through issues today and getting a release out. I'm putting this one on hold for a later release, as it's going to break lots of system tests and take some time to sort out. |
Also hitting this bug on macOS, with version 3.1.2 installed from pip today. The app indexes 732 photos but saves 1462, saving a duplicate of each file with a name ending in (2).
Using --album-regex "^(.*)$" as a workaround:
|
@hubert3 thanks for the report. I'll try to take a look at this again soon. |
I was using gphotos-sync to download everything from my Google Photo account a couple of months ago and noticed this issue (I was using release 3.0.3 at the time).
If I run gphotos-sync without any filters on album or album-regex, all the files that get downloaded have their filenames converted to all lowercase. However, when downloading a single album using --album or --album-regex, the case of the filenames is correctly retained on the downloaded media. To download my whole collection of photos and videos in a single run, I had to work around the issue by using --album-regex "^(.*)$", a regex that effectively matches all albums.
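As a quick sanity check that the workaround pattern really does select every album, here is a small sketch using Python's `re` module. The album titles are made-up examples, and how gphotos-sync applies the pattern internally is an assumption here (this uses `re.match`).

```python
import re

# The pattern passed to --album-regex in the workaround above.
ALL_ALBUMS = re.compile(r"^(.*)$")


def matching_albums(titles):
    """Return the titles the pattern matches (hypothetical helper)."""
    return [title for title in titles if ALL_ALBUMS.match(title)]


# Made-up album titles, including an empty one and mixed case.
sample_titles = ["Holiday 2019", "IMG Backup", "", "été à Paris"]
selected = matching_albums(sample_titles)
```

Every single-line title is matched, so the filter forces the album route without actually excluding anything.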