UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 10: invalid continuation byte #148
Comments
I also attempted to use the latest release and got a similar error:
|
@luawtf Could you make this @Fenixin Please note twoolie/NBT#144. |
@macfreek See the ZIP file I provided, the file you are looking for can be found in |
Also, thank you so much for the quick response! I'm really hopeful that I can get my gamesave back in one piece :P (or at least most of it). Is there any way to financially donate to or sponsor this project? |
I had the idea to replace
|
Hello! I might be wrong, but if I recall correctly this is fixed in the branch bugfix. I probably should have pushed this to master a long time ago. @luawtf, please give bugfix a try and tell us how it goes!
@macfreek, thanks for the heads-up! I have always considered these symptoms as corrupted chunks. I will give this a look when I have some free time to spend (maybe summer); right now this will have to wait.
@luawtf, I'm the maintainer of this project but, to tell you the truth, if regionfixer exists, it is thanks to @macfreek and his package, so feel free to donate to him. |
Thanks, I'll take a look! |
Using bugfix now, I get the same error as my hotfix, maybe I should make a new issue with this error specifically?:
|
I will give it a look, hopefully, next weekend. EDIT: there is no need for a new issue, thanks! |
Hi @Fenixin Here is a short script to reproduce it. You need to adjust the path to Minecraft-Region-Fixer and the affected MCA file.
You will find that if you I have not delved into the actual issue, sorry about that. I was just fighting with pdb (I seem to have encountered a bug in the Python multiprocessing library), so it took me a little longer than anticipated. Hope you can take it from here. Please tag me if it is an issue in the NBT library. Note that NBT is not really my library -- twoolie is the author. I maintained it for a while, but am not actively doing so. (@luawtf was lucky that I was just watching television on a Sunday evening, and chasing a bug felt more interesting than watching TV :) ) |
Wahoo, lucky me! Since Python isn't really my area of expertise, I've begun building a Minecraft-Region-Fixer-like tool that I'm calling anvil_recovery_tool in Rust. I'm going to be taking a different approach, working on region files directly. Also, instead of attempting to repair broken data, I'm generating valid (but empty) region files, then copying over whatever data I can (with a focus on block data). This includes using the very stable hematite_nbt NBT library, and falling back to a dirty system of searching the chunk blobs for the I think next time I might just back up my Minecraft world better, though :P |
Couldn't resist taking a closer look. The following is an even shorter snippet that reproduces the issue.
Of course, Minecraft-Region-Fixer shouldn't crash after this Unicode exception. @Fenixin I leave it up to you to make sure it doesn't. @luawtf It seems you indeed have a corrupt world on your hands. I found two issues:
Here is the relevant part of the NBT file for the corrupted book:
@luawtf in the attached file, I replaced these lines with:
Here is a short analysis of r.-1.4.mca:
I'll attach two region files, that I cleaned with the underlying NBT library. I removed the pointers to chunks in r.-1.4.mca.zip I hope these work for you. |
The underlying NBT library actually has 4 parts: world folders, region files, NBT structures, and Minecraft chunks. The world folder part is OK; the NBT structure part could use cleanup but has been fairly stable for the last 10+(!) years. The chunk part is hopelessly outdated (it didn't keep up with the changes in the data structure over the years). The region part is actually the part that I contributed. Having added lots of unit tests, I'm fairly certain that it is robust and will not corrupt any region more than it already is, even in really weird cases, e.g. when the headers of two different chunks point to the same location in the file. If you were to delete an overlapping chunk, it would not touch the other one, and gently move it to a free location. |
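For readers who want to check a region file for the "two headers pointing to the same location" case mentioned above, here is a minimal sketch. It is not the NBT library's implementation. It only assumes the usual Anvil layout, where the first 4 KiB of a region file hold 1024 location entries of 4 bytes each (a 3-byte big-endian sector offset plus a 1-byte sector count). The function names `read_locations` and `find_overlaps` are made up for this example.

```python
import struct
from collections import defaultdict

SECTOR_BYTES = 4096  # region files are organised in 4 KiB sectors


def read_locations(header: bytes):
    """Yield (chunk_index, first_sector, sector_count) for each stored chunk."""
    for index in range(1024):
        # Each entry is a big-endian uint32: high 3 bytes offset, low byte count.
        (entry,) = struct.unpack_from(">I", header, index * 4)
        first_sector, sector_count = entry >> 8, entry & 0xFF
        if first_sector and sector_count:  # zero means "chunk not present"
            yield index, first_sector, sector_count


def find_overlaps(header: bytes):
    """Return {sector: [chunk indexes]} for sectors claimed by more than one chunk."""
    owners = defaultdict(list)
    for index, first, count in read_locations(header):
        for sector in range(first, first + count):
            owners[sector].append(index)
    return {sector: idx for sector, idx in owners.items() if len(idx) > 1}


# Synthetic header (not taken from the attached world): two chunks share sector 3.
header = bytearray(SECTOR_BYTES)
struct.pack_into(">I", header, 0 * 4, (2 << 8) | 2)  # chunk 0 uses sectors 2-3
struct.pack_into(">I", header, 1 * 4, (3 << 8) | 1)  # chunk 1 uses sector 3
print(find_overlaps(bytes(header)))  # → {3: [0, 1]}
```

To try it on a real file, read the first 4096 bytes of the `.mca` file in binary mode and pass them to `find_overlaps`.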
OMG! Thank you so much, I'll test these new Anvil files right away! |
Update! The world loads fine now and while a considerable amount of stuff is back, I still ended up losing the spawn area :(, oh well. Thank you so much for the help! |
That doesn't look corrupted, it's just MUTF-8. Your example works for me with a simple change:
Don't forget to |
@macfreek |
Oh wow... thank you all! I will read this slowly and try to update RegionFixer. Busy weeks are coming for me but I will give this a good look when the time arrives. |
It seems that I'm not that good at predicting when I'm going to work on regionfixer... This should be fixed in the last release (v0.3.6). The fix is ugly but works. What I have done is to change the UTF-8 decoding to MUTF-8 (from here https://pypi.org/project/mutf8/) in nbt. Thanks very much for all your research and making this easy. I don't know if this should be pushed upstream to nbt. In order to make the use of regionfixer easy for everyone I've just included the library in the regionfixer code (which is not ideal). I'm going to close this. Open a new issue if you feel like I left something important out. Thank you all! |
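To illustrate why a strict UTF-8 decoder trips over byte 0xed on such strings, here is a small standard-library sketch. It is not the code from the mutf8 package or from regionfixer, and `decode_mutf8` is a name invented for this example. It relies on two properties of Java's Modified UTF-8: U+0000 is written as the two bytes C0 80, and characters above U+FFFF are written as a surrogate pair with each surrogate encoded as three bytes starting with 0xED.

```python
def decode_mutf8(data: bytes) -> str:
    """Decode Java-style Modified UTF-8 using only the standard library."""
    # MUTF-8 writes U+0000 as C0 80, which a UTF-8 decoder rejects.
    data = data.replace(b"\xc0\x80", b"\x00")
    # "surrogatepass" lets the 3-byte encoded surrogates through as
    # individual surrogate code points instead of raising an error.
    text = data.decode("utf-8", "surrogatepass")
    # A round trip through UTF-16 merges each surrogate pair into one character.
    return text.encode("utf-16-le", "surrogatepass").decode("utf-16-le", "surrogatepass")


# U+1F600 in MUTF-8: surrogates D83D and DE00, three bytes each.
sample = b"Hi \xed\xa0\xbd\xed\xb8\x80"

try:
    sample.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict UTF-8 fails:", exc.reason)

print(decode_mutf8(sample) == "Hi \U0001F600")  # → True
```

This version favours brevity over speed; a dedicated decoder such as the mutf8 package is the better choice for scanning large worlds.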
Did you include my mutf8 library just to keep the project dependency free or did you run into a bug with it? Just asking because using it like this you're not getting the (much, much, much) faster C extension, just the pure Python version. If speed isn't a concern you might as well just remove the C version. |
Hello! Thanks for the heads-up. I included the library to make it dependency free (easier for the user) and also because I did this in a hurry. Speed should be a concern, I should have tested speeds before and after using this approach to see the effects. Regionfixer is used to scan big worlds in servers so they are probably not very happy right now. Why is there the C version too? Because I did it in a hurry. Is there a solution that would have the best of both worlds? Easy for users and speed for people that want it. |
Both the Python and the C versions are significantly faster than the alternatives, but the C version is orders of magnitude faster. Your users shouldn't be too unhappy. If you want to bundle the C version you'll want to modify your setup.py to include an Extension so pip knows to build it, and you'll want CI to build and release as many binary wheels as possible so users don't need to compile (see cibuildwheel). The mutf8 package does all this, so you can cut and paste. Or, just stick with the .py version :) |
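One common way to get "easy for users, fast for those who want it" on the import side is to try a compiled module first and fall back to pure Python when it is missing. The sketch below shows only that pattern. The module name `_fast_decoder` and its `decode` attribute are hypothetical placeholders, not the real layout of the mutf8 package or of regionfixer.

```python
import importlib


def _decode_pure_python(data: bytes) -> str:
    """Slow but dependency-free MUTF-8 decoder used as the fallback."""
    data = data.replace(b"\xc0\x80", b"\x00")  # MUTF-8 encoding of U+0000
    text = data.decode("utf-8", "surrogatepass")
    return text.encode("utf-16-le", "surrogatepass").decode("utf-16-le", "surrogatepass")


def load_decoder(module_name: str = "_fast_decoder"):
    """Return (decode_function, implementation_label).

    `module_name` is a hypothetical compiled extension; if it cannot be
    imported, the pure Python decoder is returned instead.
    """
    try:
        module = importlib.import_module(module_name)
        return module.decode, "c"
    except ImportError:
        return _decode_pure_python, "python"


decode, implementation = load_decoder()
print("using", implementation, "decoder")
```

With this shape, users without a compiler or a matching wheel still get a working tool, while installs that include the compiled module pick it up automatically.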
Describe the bug
Region Fixer crashes with a UnicodeDecodeError when fixing corrupted chunks.
Full copied text from the MS-DOS view
Expected behavior
Region Fixer should attempt to fix corrupted chunks instead of crashing.
Screenshots
Files that would help solving the issue
ZIP file containing my server's world files
Desktop (please complete the following information):
Additional context
World files were in a .tar.xz archive that was corrupted. Also, since this is a UnicodeDecodeError, it might be related to the changes made in 528dfc6c0420799ee875792612c6c4a12d721044?