Committing files larger than 4 GB #1063
Comments
The size of the address space does not determine whether large files are
supported. A 32-bit process can handle files larger than 4 GB perfectly well if
the developer decides to implement that :). You would not load a whole file
into memory anyway (that would not scale well), but rather operate on chunks.
So the question is: does the 32-bit Git for Windows support large files?
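To illustrate the "operate on chunks" point, here is a minimal Python sketch (not Git's actual code) that computes a blob's object ID the way `git hash-object` defines it, SHA-1 over the header `blob <size>\0` followed by the content, while reading only a fixed-size chunk at a time. The function name `git_blob_sha1` and the 1 MiB chunk size are illustrative choices.

```python
import hashlib
import os


def git_blob_sha1(path, chunk_size=1 << 20):
    """Return the SHA-1 blob ID of a file, reading it in fixed-size chunks."""
    # Python integers are unbounded, so sizes above 4 GiB are not truncated.
    size = os.path.getsize(path)
    digest = hashlib.sha1()
    # Git hashes the header "blob <size>\0" followed by the raw content.
    digest.update(b"blob %d\0" % size)
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Memory use stays bounded by `chunk_size` regardless of file size, which is why the address-space width by itself does not limit the file size a program can process.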
On Feb 15, 2017 10:45 AM, "J Wyman" <[email protected]> wrote:
From the data you provided:
$ git --version --build-options
git version 2.11.1.windows.1
built from commit: 1c1842b
sizeof-long: 4
machine: x86_64
It appears that you are using a 32-bit version of Git. 4-byte longs can
address 4 GiB of memory, which is the most likely source of your problem.
Have you tried a 64-bit version of Git? If so, does your problem still
reproduce?
|
I used the 64-bit installer from git-for-windows... thought I'd get a 64-bit git with it: |
That was my mistake, I misread the output of the |
@elmorisor @whoisj the red herring was The problem is the Git source code, which uses |
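For readers puzzled by `sizeof-long: 4` next to `machine: x86_64` in the output above: on 64-bit Windows (the LLP64 data model) a C `long` is 4 bytes while pointers and `size_t` are 8 bytes, whereas on 64-bit Linux (LP64) `long` is 8 bytes. A small Python sketch using `ctypes` shows the widths on whatever platform it runs on; the helper names are made up for illustration.

```python
import ctypes


def c_type_widths():
    """Report the byte widths of a few C types on the current platform."""
    return {
        "long": ctypes.sizeof(ctypes.c_long),
        "size_t": ctypes.sizeof(ctypes.c_size_t),
        "pointer": ctypes.sizeof(ctypes.c_void_p),
    }


def max_unsigned(width_bytes):
    """Largest value an unsigned integer of the given byte width can hold."""
    return (1 << (8 * width_bytes)) - 1


widths = c_type_widths()
for name, width in widths.items():
    print(f"{name}: {width} bytes, max unsigned value {max_unsigned(width)}")
```

Any size stored in a 4-byte `unsigned long` tops out at 4 GiB minus one byte, even though the process itself is 64-bit.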
I have also hit this problem. |
Yes, it most certainly is. On 32-bit platforms, for example, you simply cannot map 4GB files into memory via |
Would really appreciate a fix for this. |
@polygonica I warmly welcome you to work on this. (While it may seem convenient to expect others, including myself, to fulfill your wishes, it rarely works.) There is already code to stream large objects (so that they do not have to be mapped into memory), and it should be possible to at least fall back to that option in |
Just to confirm, this is only reproducible on 32 bit builds correct? @dscho I can take a stab at it this weekend |
@isometric I am not really sure, as I have not followed the recent developments on the As a quick first glance, you may want to run It is always worth a look to see whether Git's test suite has something related, because then it is relatively easy and quick to run a test to validate possible fixes (or prove that they don't fix the issue). I also stumbled across the |
Had a number of power issues in the neighbourhood this weekend so didn't get a chance to take a look. I'll try to find some time next weekend. |
BTW, I was able to hash-object/cat-file a 5GB blob successfully w/ git in Ubuntu on Windows. It turns out that hash-object produces the same (correct?) result on both Linux and Windows, but cat-file fails on Windows only. I'm using 64-bit for both versions of Git. I used this repro instead:
|
It looks like this error boils down to There are many places in Git that use |
Right. And Git also uses |
Can I do anything to help? Is this a bug in git for windows, or does it need to be fixed upstream? This bug prevents me not only committing 4GB files directly, but also trying to use LFS. |
@ksulli help is always welcome. Please note that there had been a couple patches flying about on the mailing list, to try to address the However, for the concrete purpose of resolving this here issue, I think there is some sort of streaming mode available in the internal Git API. That would make it possible to, say, generate an object larger than 4GB via How's your C fu? I see that there is already something called Lines 1900 to 1922 in 918fa5c
So the trick would be to first test whether it works now, and if it does not, investigate in that code (possibly using a debugger and/or inserting debug statements) where things go south. If you need to debug this, that is really easy: install the Git for Windows SDK (it'll clone about half a gig worth of Git objects, though), then call Please let me know when/where you get stuck. |
Thanks for the pointers, I'm a bit rusty with compiled languages in general but this issue really irks me so I'll do my best. |
Thanks! As I said, any help is welcome. If you get stuck, just holler (and provide details ;-)). |
I also have a problem on windows with files >4 GB via Git LFS. Can you please tell me, when we can expect a bugfix for that in git? |
It really needs someone to help upstream Git with the migration to a streaming interface, if I understand dscho's well-informed comment above. If you are able to help with coding, that would be great. (many codez make all issues shallow ;-) |
Note in the referenced git lfs issue there is a workaround for using >4GB files with git-lfs on windows. It is just a slight change in workflow for those that can't wait or don't have the time to fix directly. |
The long vs. size_t difference doesn't matter, because neither of them should be used for file sizes or offsets. Instead, off_t must be used. Indeed, off_t is used throughout the code for this purpose. If there's a case where long or size_t is incorrectly used for a file size or offset, it must be changed to off_t. |
There is plenty of discussion on the upstream mailing list about the issue of the size of various types on different systems, and their incompatibilities. The archive https://public-inbox.org/git/?q= is probably the most useful one for searching. |
I think it might be this mindset that turned the discussion in this ticket away from a useful course: if you want something, you gotta put some effort behind it, not just wait for others to miraculously fulfill your wishes without getting anything in return. So I'll close this ticket, and let those who are putting in more effort than mere words (you know who you are) be active elsewhere (you know where), being grateful for it (you know I am). |
I'm sorry some users either don't realize or don't appreciate that much of the work on git-for-windows is done by volunteers. I think we lose a lot by closing this issue though - it is still an issue, it contains information about the root cause, and it is linked to as the cause of a git-lfs bug (git-lfs/git-lfs#2434). Would you consider reopening? Perhaps someone will pick it up someday (perhaps even me); while closing it may send a message, I also think it will create a good bit of confusion from folks watching the issue or dealing with it. |
extending on previous comment, try The current code for detecting zlib decode length errors is full of poorly defined behaviour because the up/down casting of the different variable types on different architectures produces different results (as opposed to undefined behaviour..). I expect that some 'C language lawyer' action is needed to cast the zlib stream length to ptrdiff and then use that (ptr arithmetic) ubiquitously to get consistent results on all platforms. I think the git_lfs link is a red herring because it fails to get to the bottom of the problem for systems where Windows can handle proper 64 bit addresses. |
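As background to the zlib remark: zlib's `z_stream` keeps its running totals (`total_in`, `total_out`) in `uLong`, which is an `unsigned long` and therefore 32 bits wide on Windows, so a caller that needs the true length of a large stream has to keep its own wider counter. The following Python sketch (an illustration of the streaming idea, not Git's code) inflates in bounded pieces and sums the lengths itself; `inflate_in_chunks` and `inflated_size` are hypothetical names.

```python
import zlib


def inflate_in_chunks(compressed, out_chunk=64 * 1024):
    """Yield decompressed pieces, capping each decompress() call at out_chunk bytes."""
    inflater = zlib.decompressobj()
    data = compressed
    while data:
        piece = inflater.decompress(data, out_chunk)
        # Input that could not be processed yet because of the output cap.
        data = inflater.unconsumed_tail
        if piece:
            yield piece
    # Any output still buffered inside the decompressor.
    tail = inflater.flush()
    if tail:
        yield tail


def inflated_size(compressed):
    """Total decompressed length, counted in an unbounded Python int."""
    return sum(len(piece) for piece in inflate_in_chunks(compressed))
```

Because the total is accumulated outside zlib, it does not depend on the width of `unsigned long` on the platform.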
I disagree. The valuable technical discussion with people following up with patches was not happening here. There are people putting their money where their mouth is, making sure that their wishes come true by putting some energy and effort behind it. Just not here. So: Let's just draw the curtain of charity over the rest of this ticket, and let it rest in peace. |
I understand the frustration, but following that logic means all the real issues that aren't seeing active investigation and/or fixing should be closed. Is that the plan moving forward? To someone who experiences the bug and ends up here via google, etc, there will be confusion. They'll think "oh, this is a known issue, cool - wait, it's closed - why am I still encountering it"? It would help users if something visible about the issue (perhaps title) could at least be updated to indicate that this issue is not fixed and users should not have any expectation that it ever will be. |
@aggieNick02 are you really trying your best to bind our time here? Is that what you want? To keep talking, talking, talking, and not get anything done?
I am totally not on board with this idea. Why? Because it makes you feel that you are a strict user and not responsible for anything while others should do all that. How about getting involved instead? How about you update this ticket with the progress? How about you pay attention to the discussion on the mailing list, summarizing where the progress is at? That is easily something you can do. And something that takes away the burden from others. Rather than piling and piling even more responsibility on those few who take care of the issues you want to see resolved. Or better put: trying to pile, because really, it is not the responsibility of anyone to take care of your wishes, not if you do not give them money or time or anything in return. So: while I see what you are saying about the confusion and about opaque progress, I have to point out that this is a community effort, and if you choose not to be part of that effort, you have no say in how it is run. If you choose to be part of the effort, your contributions will be appreciated. And even better: you can then have what you want, because you make it so. |
Dear dscho |
Dear @JohnFrampton thank you for speaking up. However, your speaking up does not help getting the issue at hand resolved, does it? What can you do to help? |
Well, I downloaded the code and had a look, and I have to find out what I understand and how I can deal with it. I will give it a try. But currently I'm paid to work on something else, so ... let's see ... |
That's good. Now let's also get you into the conversation with the people who are already working on this: please head over to gitgitgadget#115. |
Hi, just for others to know: this issue still occurs in Git for Windows 2.29.2. The linked gitgitgadget#115 is also closed but, as far as I can tell, not "complete"; it links to this, which is still open but has been quiet for over a year. |
Thanks for the update @srothery . If you want/need to work with larger files on windows, it is possible, but involves workarounds. There is a bit of discussion at git-lfs/git-lfs#2434, with the workaround explained in a post there by @technoweenie. It isn't perfect, but it is workable. We run a self-hosted git-lfs server with >4GB files both committed from and pulled to windows machines. |
Thanks @aggieNick02 - @technoweenie 's fix was to do with the smudge filter - should I also disable the clean filter too? If I do both of those does that mean for the whole repo all lfs files won't go into my working folder but the .git/lfs/objects right? I was looking to see if I could disable smudge/clean just for my files that are >4GB but can't spot examples or hints if this is possible. |
So it's been a little while since I've configured this, but here's what I remember/have settled on:
|
For anyone else stumbling onto this discussion and not knowing what to make of all those closed issues or what the state of the issue actually is: even though this issue was closed prematurely, folks organized somewhere else (thank you, whoever you are!). According to git-lfs/git-lfs#2434 (comment), the fix has been merged into Git for Windows for the 2.34.0 release, and there seems to be an effort to upstream the fix to mainline Git as well for 3.10, as far as I understood, but don't quote me on that; I might have misunderstood that part. And just to avoid any ambiguity: this was always a Windows-only bug. Upstreaming to mainline really just means that Windows builds created from the mainline sources will then behave correctly too. |
This issue was closed because the conversation was becoming counter-productive. You reminded me that I wanted to lock it, thank you. |
Setup
defaults?
to the issue you're seeing?
no
Details
CMD
Minimal, Complete, and Verifiable example
this will help us understand the issue.
I was managing my large files with the git-lfs extension. Some of them were more than 4 GB in size. After deleting one of those files from my working tree and doing a normal git checkout, I ended up with a crippled file only 46 MB in size.
For testing purposes I tried to commit a 4.3 GB file to my git repository without the LFS extension.
After deleting that file from the working tree and checking it out again, I expected the 4.3 GB file to
be present again. Instead I ended up with the same small file.
Seems like the file was never committed correctly. The .git directory is about 100 MB in size.
Reinstalling Git and changing machines did not change the issues.
Files smaller than 4GB are not affected.
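The 46 MB figure is consistent with the file size being truncated to 32 bits somewhere along the way. A quick Python check of the arithmetic, assuming a file of exactly 4,341,000,000 bytes (a hypothetical value chosen to match "4.3 GB"; the real size is not stated in the report):

```python
UINT32_MASK = (1 << 32) - 1


def truncate_to_uint32(size):
    """Keep only the low 32 bits, as storing into a 4-byte unsigned integer would."""
    return size & UINT32_MASK


assumed_size = 4_341_000_000  # hypothetical size, not taken from the report
truncated = truncate_to_uint32(assumed_size)
print(truncated)                    # 46032704 bytes
print(round(truncated / 1e6, 1))    # 46.0 (MB)
```

Sizes below 4 GiB pass through unchanged, which matches the observation that smaller files are not affected.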
After that I searched the gitconfig for settings related to 64-bit. I found core.packedGitLimit, which should default to 8 GB on 64-bit systems. I manually set it to 8g myself. Git told me that the value is out of range. Only after setting it to a value smaller than 4 GB could I use git normally again.
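One plausible reading of the "out of range" message, assuming the limit is parsed into a 4-byte C `unsigned long` (an assumption, not verified against Git's source here): Git config sizes accept the suffixes `k`, `m` and `g` as multiples of 1024, and `8g` simply does not fit in 32 bits. The Python sketch below mimics such a range check; `parse_size` and `ULONG_MAX_32` are hypothetical names.

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
ULONG_MAX_32 = 2 ** 32 - 1  # largest value of a 4-byte unsigned long


def parse_size(text, max_value):
    """Parse values such as '8g' or '512m' and reject those above max_value."""
    text = text.strip().lower()
    factor = 1
    if text and text[-1] in UNITS:
        factor = UNITS[text[-1]]
        text = text[:-1]
    value = int(text) * factor
    if value > max_value:
        raise OverflowError(f"value {value} is out of range (max {max_value})")
    return value


print(parse_size("3g", ULONG_MAX_32))  # fits in 32 bits
try:
    parse_size("8g", ULONG_MAX_32)
except OverflowError as error:
    print(error)
```

With an 8-byte limit (`2 ** 64 - 1`) the same `8g` value parses without error.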
URL to that repository to help us with testing?
Issue is not repository-specific