UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 10: invalid continuation byte #148

Closed
luavixen opened this issue Jan 17, 2021 · 22 comments

luavixen commented Jan 17, 2021

Describe the bug
Region Fixer crashes with a UnicodeDecodeError when fixing corrupted chunks.

Full copied text from the MS-DOS view

[lua@lua-box ~/Downloads/recover]$ python3 Minecraft-Region-Fixer/regionfixer.py --fix-wrong-located --fix-missing-tag --fix-corrupted minecraft/serversurvival

Welcome to Region Fixer!
(v 0.3.3)

############################################################
#############  Scanning world: serversurvival  #############
############################################################

World info:
There are 419 region files, 40 player files and 0 data files in the world directory.

-------------------- Checking level.dat --------------------
'level.dat' is readable

---------------- Scanning UUID player files ----------------
40 of 40|########################################################################################################################################################################|Time: 0:00:00

------------- Scanning old format player files -------------
Info: No files to scan.

---------- Scanning structures and map data files ----------
Info: No files to scan.

------------------ Scanning region files -------------------
419 of 419|######################################################################################################################################################################|Time: 0:02:17


############################################################
############# Scan results for: serversurvival #############
############################################################


Unreadable player files:
No problems found.

Unreadable data files:
No problems found.

Chunk problems:
-------------------------------
| Problem | Corrupted  Total  |
-------------------------------
|  Count  |    386     153751 |
-------------------------------

Region problems:
No problems found.





######### Repairing chunks with status: Corrupted ##########
Repairing chunks in regionset "Overworld":


Ops! Something went really wrong and regionfixer crashed.


Bug report:

<class 'UnicodeDecodeError'>
Traceback (most recent call last):
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 588, in <module>
    value = main()
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 548, in main
    fix_bad_chunks(args, w)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 71, in fix_bad_chunks
    counter = scanned_obj.fix_problematic_chunks(problem)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 1398, in fix_problematic_chunks
    counter += regionset.fix_problematic_chunks(status)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 936, in fix_problematic_chunks
    counter += self._set[r].fix_problematic_chunks(status)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 374, in fix_problematic_chunks
    chunk = region_file.get_chunk(*local_coords)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/region.py", line 599, in get_chunk
    return self.get_nbt(x, z)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/region.py", line 578, in get_nbt
    nbt = NBTFile(buffer=data)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 628, in __init__
    self.parse_file()
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 655, in parse_file
    self._parse_buffer(self.file)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 493, in _parse_buffer
    tag._parse_buffer(buffer)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 493, in _parse_buffer
    tag._parse_buffer(buffer)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 404, in _parse_buffer
    self.tags.append(TAGLIST[self.tagID](buffer=buffer))
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 476, in __init__
    self._parse_buffer(buffer)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 493, in _parse_buffer
    tag._parse_buffer(buffer)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 493, in _parse_buffer
    tag._parse_buffer(buffer)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 493, in _parse_buffer
    tag._parse_buffer(buffer)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 404, in _parse_buffer
    self.tags.append(TAGLIST[self.tagID](buffer=buffer))
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 345, in __init__
    self._parse_buffer(buffer)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/nbt.py", line 353, in _parse_buffer
    self.value = read.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 10: invalid continuation byte

Expected behavior
Region Fixer should attempt to fix corrupted chunks instead of crashing.

Screenshots

Files that would help solving the issue
ZIP file containing my server's world files

Desktop (please complete the following information):

Additional context
World files were in a .tar.xz archive that was corrupted. Also, since this is a UnicodeDecodeError, it might be related to the changes made in 528dfc6c0420799ee875792612c6c4a12d721044?

@luavixen luavixen added the Bug label Jan 17, 2021
@luavixen
Author

I also attempted to use the latest release and got a similar error:

[lua@lua-box ~/Downloads/recover]$ python3 Minecraft-Region-Fixer-0.3.3/regionfixer.py --fc --fm --fw --dt --verbose minecraft/serversurvival

Welcome to Region Fixer!
(v 0.3.3)

############################################################
#############  Scanning world: serversurvival  #############
############################################################

World info:
There are 419 region files, 40 player files and 0 data files in the world directory.

-------------------- Checking level.dat --------------------
'level.dat' is readable

---------------- Scanning UUID player files ----------------
Scanned playerdata/22af2493-7338-42e1-b9f5-03916a109f95.dat (File: "22af2493-7338-42e1-b9f5-03916a109f95.dat"; status: OK) 1/40
Scanned playerdata/c8b72170-9914-4988-ae3a-9c8c7d86a8ee.dat (File: "c8b72170-9914-4988-ae3a-9c8c7d86a8ee.dat"; status: OK) 2/40
Scanned playerdata/33890b8c-61d2-4ec4-a8ee-af8cd9dfbffd.dat (File: "33890b8c-61d2-4ec4-a8ee-af8cd9dfbffd.dat"; status: OK) 3/40
Scanned playerdata/531cf254-333c-39ea-bbc4-69c42ff80e25.dat (File: "531cf254-333c-39ea-bbc4-69c42ff80e25.dat"; status: OK) 4/40
Scanned playerdata/ccf462b1-33f9-4f15-98d0-6085a4d8b539.dat (File: "ccf462b1-33f9-4f15-98d0-6085a4d8b539.dat"; status: OK) 5/40
Scanned playerdata/169cb08d-ae68-4d4f-aec8-7f41d1cd0e90.dat (File: "169cb08d-ae68-4d4f-aec8-7f41d1cd0e90.dat"; status: OK) 6/40
Scanned playerdata/96b34d60-e7e1-4f7b-a94b-6b0419369798.dat (File: "96b34d60-e7e1-4f7b-a94b-6b0419369798.dat"; status: OK) 7/40
Scanned playerdata/18f5eb80-0ee3-46ee-807f-9a39113856b0.dat (File: "18f5eb80-0ee3-46ee-807f-9a39113856b0.dat"; status: OK) 8/40
Scanned playerdata/39898be2-3f90-4db8-a865-3cda04178207.dat (File: "39898be2-3f90-4db8-a865-3cda04178207.dat"; status: OK) 9/40
Scanned playerdata/4ae8a1fd-0392-4bda-8b28-b9c47f1dd7f2.dat (File: "4ae8a1fd-0392-4bda-8b28-b9c47f1dd7f2.dat"; status: OK) 10/40
Scanned playerdata/4d13b4d3-b2ed-49b0-8a63-8fb51fd85610.dat (File: "4d13b4d3-b2ed-49b0-8a63-8fb51fd85610.dat"; status: OK) 11/40
Scanned playerdata/8bdc22d7-e08a-4a18-9392-96206c59a70e.dat (File: "8bdc22d7-e08a-4a18-9392-96206c59a70e.dat"; status: OK) 12/40
Scanned playerdata/559b981f-d223-42f2-b7e9-adc7e5ec818b.dat (File: "559b981f-d223-42f2-b7e9-adc7e5ec818b.dat"; status: OK) 13/40
Scanned playerdata/ae60cf7c-6ba0-4cf6-884e-23decd3e0ab6.dat (File: "ae60cf7c-6ba0-4cf6-884e-23decd3e0ab6.dat"; status: OK) 14/40
Scanned playerdata/d81a461f-14bb-4016-b319-7c9e7f06f880.dat (File: "d81a461f-14bb-4016-b319-7c9e7f06f880.dat"; status: OK) 15/40
Scanned playerdata/6d69f4f2-f058-4461-82d4-113f05eb0583.dat (File: "6d69f4f2-f058-4461-82d4-113f05eb0583.dat"; status: OK) 16/40
Scanned playerdata/474b345e-5455-4d5a-83dc-855c6223c72f.dat (File: "474b345e-5455-4d5a-83dc-855c6223c72f.dat"; status: OK) 17/40
Scanned playerdata/84cc25f6-1689-4729-a3fa-43a79e428404.dat (File: "84cc25f6-1689-4729-a3fa-43a79e428404.dat"; status: OK) 18/40
Scanned playerdata/db791a2c-a8d8-4780-a9e7-4beb3ed2034a.dat (File: "db791a2c-a8d8-4780-a9e7-4beb3ed2034a.dat"; status: OK) 19/40
Scanned playerdata/4f656103-5cef-4e7c-a8e6-b7719567442a.dat (File: "4f656103-5cef-4e7c-a8e6-b7719567442a.dat"; status: OK) 20/40
Scanned playerdata/00000000-0000-0000-0009-01ff910b407a.dat (File: "00000000-0000-0000-0009-01ff910b407a.dat"; status: OK) 21/40
Scanned playerdata/988222fa-82b5-45f0-a045-67589c7c158e.dat (File: "988222fa-82b5-45f0-a045-67589c7c158e.dat"; status: OK) 22/40
Scanned playerdata/00000000-0000-0000-0009-01f3172bf7f6.dat (File: "00000000-0000-0000-0009-01f3172bf7f6.dat"; status: OK) 23/40
Scanned playerdata/56905fa1-0f82-4f2e-9151-907af8f50642.dat (File: "56905fa1-0f82-4f2e-9151-907af8f50642.dat"; status: OK) 24/40
Scanned playerdata/794e3115-9df6-44c9-be60-ab39868627f2.dat (File: "794e3115-9df6-44c9-be60-ab39868627f2.dat"; status: OK) 25/40
Scanned playerdata/0c1ce21c-1534-4ad3-94a8-17454b42ca32.dat (File: "0c1ce21c-1534-4ad3-94a8-17454b42ca32.dat"; status: OK) 26/40
Scanned playerdata/cb54969f-81f1-4c11-a19b-7305fa242b7e.dat (File: "cb54969f-81f1-4c11-a19b-7305fa242b7e.dat"; status: OK) 27/40
Scanned playerdata/e3a53f15-503b-4daf-b235-eb717539c70c.dat (File: "e3a53f15-503b-4daf-b235-eb717539c70c.dat"; status: OK) 28/40
Scanned playerdata/0fe9ec42-19b5-401f-a542-854713a157f2.dat (File: "0fe9ec42-19b5-401f-a542-854713a157f2.dat"; status: OK) 29/40
Scanned playerdata/aa1127d5-492e-4e7d-a53a-0bee6ae729de.dat (File: "aa1127d5-492e-4e7d-a53a-0bee6ae729de.dat"; status: OK) 30/40
Scanned playerdata/251c7671-7f47-3e19-b647-6684ab0d9b96.dat (File: "251c7671-7f47-3e19-b647-6684ab0d9b96.dat"; status: OK) 31/40
Scanned playerdata/704fd478-9208-4cf6-9df3-088b39319a58.dat (File: "704fd478-9208-4cf6-9df3-088b39319a58.dat"; status: OK) 32/40
Scanned playerdata/435e0d4a-9db9-489e-a65e-27b78f2d1c0e.dat (File: "435e0d4a-9db9-489e-a65e-27b78f2d1c0e.dat"; status: OK) 33/40
Scanned playerdata/895e7004-fe96-4a9f-ace3-774e2f77a714.dat (File: "895e7004-fe96-4a9f-ace3-774e2f77a714.dat"; status: OK) 34/40
Scanned playerdata/6950e399-cfb4-496b-970a-9bcc59e429f8.dat (File: "6950e399-cfb4-496b-970a-9bcc59e429f8.dat"; status: OK) 35/40
Scanned playerdata/00000000-0000-0000-0009-01fcc2d7e4d5.dat (File: "00000000-0000-0000-0009-01fcc2d7e4d5.dat"; status: OK) 36/40
Scanned playerdata/02f7f324-f948-4664-bd7c-8c9c9b46acfc.dat (File: "02f7f324-f948-4664-bd7c-8c9c9b46acfc.dat"; status: OK) 37/40
Scanned playerdata/d85ae3df-c16f-347a-a6e8-1d8d04e4ed51.dat (File: "d85ae3df-c16f-347a-a6e8-1d8d04e4ed51.dat"; status: OK) 38/40
Scanned playerdata/f641e067-15cc-4709-b15e-c321db3a39fb.dat (File: "f641e067-15cc-4709-b15e-c321db3a39fb.dat"; status: OK) 39/40
Scanned playerdata/cb49f78a-bca3-432e-899a-5b148e2aaa89.dat (File: "cb49f78a-bca3-432e-899a-5b148e2aaa89.dat"; status: OK) 40/40

------------- Scanning old format player files -------------
Info: No files to scan.

---------- Scanning structures and map data files ----------
Info: No files to scan.

------------------ Scanning region files -------------------
Scanned region/r.-22.-12.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 1/419
Scanned region/r.17.-24.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 2/419
Scanned region/r.-28.22.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 3/419
Scanned region/r.-4.-10.mca (c:0, w:0, tme:0, so:0, mt:0, t:74)........ 4/419
Scanned region/r.5.2.mca (c:0, w:0, tme:0, so:0, mt:0, t:740)....... 5/419
Scanned region/r.22.-17.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 6/419
Scanned region/r.-1955.-1953.mca (c:0, w:0, tme:0, so:0, mt:0, t:56)........ 7/419
Scanned region/r.3.6.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 8/419
Scanned region/r.-11.-17.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 9/419
Scanned region/r.-2.-8.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 10/419
Scanned region/r.-17.26.mca (c:0, w:0, tme:0, so:0, mt:0, t:928)....... 11/419
Scanned region/r.-10.31.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 12/419
Scanned region/r.0.1.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 13/419
Scanned region/r.-8.-19.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 14/419
Scanned region/r.-27.-24.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 15/419
Scanned region/r.-5.-6.mca (c:0, w:0, tme:0, so:0, mt:0, t:949)....... 16/419
Scanned region/r.12.4.mca (c:0, w:0, tme:0, so:0, mt:0, t:531)....... 17/419
Scanned region/r.-9.5.mca (c:0, w:0, tme:0, so:0, mt:0, t:484)....... 18/419
Scanned region/r.-18.-5.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 19/419
Scanned region/r.-8.6.mca (c:0, w:0, tme:0, so:0, mt:0, t:1012)...... 20/419
Scanned region/r.13.-22.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 21/419
Scanned region/r.-7.-25.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 22/419
Scanned region/r.4.-4.mca (c:0, w:0, tme:0, so:0, mt:0, t:753)....... 23/419
Scanned region/r.-19.1.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 24/419
Scanned region/r.0.2.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 25/419
Scanned region/r.-12.0.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 26/419
Scanned region/r.-30.5.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 27/419
Scanned region/r.-19.-29.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 28/419
Scanned region/r.-10.-16.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 29/419
Scanned region/r.11.-7.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 30/419
Scanned region/r.13.15.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 31/419
Scanned region/r.-10.20.mca (c:0, w:0, tme:0, so:0, mt:0, t:399)....... 32/419
Scanned region/r.3.1.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 33/419
Scanned region/r.-1.17.mca (c:0, w:0, tme:0, so:0, mt:0, t:94)........ 34/419
Scanned region/r.-7.12.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 35/419
Scanned region/r.-6.11.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 36/419
Scanned region/r.-4.-4.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 37/419
Scanned region/r.-7.21.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 38/419
Scanned region/r.8.22.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 39/419
Scanned region/r.-6.-7.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 40/419
Scanned region/r.10.10.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 41/419
Scanned region/r.-9.24.mca (c:0, w:0, tme:0, so:0, mt:0, t:45)........ 42/419
Scanned region/r.13.10.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 43/419
Scanned region/r.-8.4.mca (c:0, w:0, tme:0, so:0, mt:0, t:82)........ 44/419
Scanned region/r.-10.17.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 45/419
Scanned region/r.-10.5.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 46/419
Scanned region/r.-9.0.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 47/419
Scanned region/r.1.-9.mca (c:0, w:0, tme:0, so:0, mt:0, t:263)....... 48/419
Scanned region/r.-24.23.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 49/419
Scanned region/r.16.-25.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 50/419
Scanned region/r.-20.5.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 51/419
Scanned region/r.-1.12.mca (c:0, w:0, tme:0, so:0, mt:0, t:200)....... 52/419
Scanned region/r.-9.-4.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 53/419
Scanned region/r.-4.3.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 54/419
Scanned region/r.2.0.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 55/419
Scanned region/r.-15.18.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 56/419
Scanned region/r.16.23.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 57/419
Scanned region/r.-3.-1.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 58/419
Scanned region/r.-29.-17.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 59/419
Scanned region/r.-7.-14.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 60/419
Scanned region/r.-10.-25.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 61/419
Scanned region/r.1.5.mca (c:0, w:0, tme:0, so:0, mt:0, t:714)....... 62/419
Scanned region/r.-8.-9.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 63/419
Scanned region/r.-30.18.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 64/419
Scanned region/r.5.-19.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 65/419
Scanned region/r.0.-2.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 66/419
Scanned region/r.-6.-8.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 67/419
Scanned region/r.-14.25.mca (c:0, w:0, tme:0, so:0, mt:0, t:789)....... 68/419
Scanned region/r.7.-10.mca (c:0, w:0, tme:0, so:0, mt:0, t:9)......... 69/419
Scanned region/r.-12.-9.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 70/419
Scanned region/r.-12.23.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 71/419
Scanned region/r.7.1.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 72/419
Scanned region/r.18.23.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 73/419
Scanned region/r.-7.5.mca (c:0, w:0, tme:0, so:0, mt:0, t:322)....... 74/419
Scanned region/r.-1954.1953.mca (c:0, w:0, tme:0, so:0, mt:0, t:896)....... 75/419
Scanned region/r.-5.-7.mca (c:0, w:0, tme:0, so:0, mt:0, t:688)....... 76/419
Scanned region/r.-17.-10.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 77/419
Scanned region/r.-15.-25.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 78/419
Scanned region/r.6.3.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 79/419
Scanned region/r.-15.10.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 80/419
Scanned region/r.-2.-6.mca (c:0, w:0, tme:0, so:0, mt:0, t:1024)...... 81/419
Scanned region/r.-15.26.mca (c:0, w:0, tme:0, so:0, mt:0, t:996)....... 82/419
Scanned region/r.-12.-14.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 83/419
Scanned region/r.-7.11.mca (c:0, w:0, tme:0, so:0, mt:0, t:650)....... 84/419
Scanned region/r.20.-20.mca (c:0, w:0, tme:0, so:0, mt:0, t:1)......... 85/419


Ops! Something went really wrong and regionfixer crashed.


Bug report:

**********
*** Exception while scanning:
*** r.-2.-3.mca
**********
*** Printing the child's traceback:
*** Exception:<class 'UnicodeDecodeError'>'utf-8' codec can't decode byte 0xed in position 10: invalid continuation byte
**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/regionfixer_core/scan.py, line 806, in scan_region_file 
***   chunk, tup = scan_chunk(region_file,**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/regionfixer_core/scan.py, line 908, in scan_chunk 
***   chunk = region_file.get_chunk(*coords)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/region.py, line 599, in get_chunk 
***   return self.get_nbt(x, z)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/region.py, line 578, in get_nbt 
***   nbt = NBTFile(buffer=data)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 628, in __init__ 
***   self.parse_file()**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 655, in parse_file 
***   self._parse_buffer(self.file)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 493, in _parse_buffer 
***   tag._parse_buffer(buffer)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 493, in _parse_buffer 
***   tag._parse_buffer(buffer)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 404, in _parse_buffer 
***   self.tags.append(TAGLIST[self.tagID](buffer=buffer))**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 476, in __init__ 
***   self._parse_buffer(buffer)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 493, in _parse_buffer 
***   tag._parse_buffer(buffer)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 493, in _parse_buffer 
***   tag._parse_buffer(buffer)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 493, in _parse_buffer 
***   tag._parse_buffer(buffer)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 404, in _parse_buffer 
***   self.tags.append(TAGLIST[self.tagID](buffer=buffer))**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 345, in __init__ 
***   self._parse_buffer(buffer)**********
*** File /home/lua/Downloads/recover/Minecraft-Region-Fixer-0.3.3/nbt/nbt.py, line 353, in _parse_buffer 
***   self.value = read.decode("utf-8")
**********

@macfreek
Contributor

@luawtf Could you make this r.-2.-3.mca file available somehow?

@Fenixin Please note twoolie/NBT#144.
As I now understand it, the NBT format does not use standard UTF-8 but a non-standard variant sometimes referred to as MUTF-8. I'm not sure that's the root cause here, though (I would have expected a can't decode byte 0x00 instead of can't decode byte 0xed), but it is good to be able to pursue this matter.
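
For context, here is a minimal Python sketch of how MUTF-8 (Java's modified UTF-8) differs from strict UTF-8. It rests on two properties of that encoding: NUL is written as the two bytes C0 80, and characters outside the Basic Multilingual Plane are written as a surrogate pair of two 3-byte sequences, each starting with 0xED. A strict decoder rejects such a surrogate half at the 0xed byte, so that byte value is at least compatible with MUTF-8 input; whether that is what happened in this world is not established here. `decode_mutf8` is a hypothetical helper written for illustration, not part of the NBT library.

```python
def decode_mutf8(raw: bytes) -> str:
    """Decode Java-style modified UTF-8 (hypothetical helper, not an NBT API)."""
    # MUTF-8 stores NUL as the overlong pair C0 80; in well-formed input
    # the byte 0xC0 does not occur anywhere else.
    raw = raw.replace(b"\xc0\x80", b"\x00")
    # Surrogate halves (ED A0..BF xx) are illegal in strict UTF-8;
    # "surrogatepass" lets them through as lone surrogate code points.
    text = raw.decode("utf-8", "surrogatepass")
    # A UTF-16 round trip joins each high/low pair into one character.
    return text.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


# "A", then U+1F600 written as a surrogate pair, then an encoded NUL.
sample = b"A\xed\xa0\xbd\xed\xb8\x80\xc0\x80"

try:
    sample.decode("utf-8")
except UnicodeDecodeError as err:
    strict_error = err  # strict decoding stops at the 0xed byte

print(strict_error)
print(repr(decode_mutf8(sample)))
```

If the bytes in a chunk are genuinely damaged rather than MUTF-8, this helper fails too, which is one way to tell the two cases apart.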

@luavixen
Author

luavixen commented Jan 17, 2021

@macfreek See the ZIP file I provided, the file you are looking for can be found in serversurvival.zip/serversurvival/region/r.-2.-3.mca. I also attempted to attach the file to this comment, but GitHub has been stuck on "Uploading your files..." for the past 5 minutes (6MB file on a gigabit connection).

@luavixen
Author

Also, thank you so much for the quick response! I'm really hopeful that I can get my gamesave back in one piece :P (or at least most of it). Is there any way to financially donate to or sponsor this project?

@luavixen
Author

I had the idea to replace read.decode("utf-8") with read.decode("utf-8", "ignore") to ignore the Unicode errors, and managed to generate a new trace:

<class 'TypeError'>
Traceback (most recent call last):
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 374, in fix_problematic_chunks
    chunk = region_file.get_chunk(*local_coords)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/region.py", line 599, in get_chunk
    return self.get_nbt(x, z)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/region.py", line 574, in get_nbt
    data = self.get_blockdata(x, z) # This may raise a RegionFileFormatError.
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/region.py", line 566, in get_blockdata
    raise ChunkDataError(err)
nbt.region.ChunkDataError: Error -3 while decompressing data: invalid distance too far back

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 588, in <module>
    value = main()
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 548, in main
    fix_bad_chunks(args, w)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 71, in fix_bad_chunks
    counter = scanned_obj.fix_problematic_chunks(problem)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 1398, in fix_problematic_chunks
    counter += regionset.fix_problematic_chunks(status)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 936, in fix_problematic_chunks
    counter += self._set[r].fix_problematic_chunks(status)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 391, in fix_problematic_chunks
    out += dc.decompress(i)
TypeError: a bytes-like object is required, not 'int'
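
The final TypeError looks like the usual Python 3 pitfall: indexing or iterating over a bytes object yields ints, while a zlib decompress object only accepts bytes-like input. The sketch below assumes the repair code is trying to feed the compressed chunk to the decompressor one byte at a time and keep whatever comes out before the stream breaks; `salvage_zlib` is a hypothetical name, not a function from Minecraft-Region-Fixer.

```python
import zlib


def salvage_zlib(data: bytes) -> bytes:
    """Return whatever decompresses cleanly before the stream breaks."""
    dc = zlib.decompressobj()
    out = b""
    for pos in range(len(data)):
        try:
            # data[pos] would be an int in Python 3; a one-byte slice is bytes.
            out += dc.decompress(data[pos:pos + 1])
        except zlib.error:
            break  # corruption reached; keep what was recovered so far
    return out


payload = b"Sections" * 64
packed = zlib.compress(payload)
print(len(salvage_zlib(packed)), len(salvage_zlib(packed[:-8])))
```

Data recovered this way is only a prefix of the chunk, so it still has to be checked before being treated as valid NBT.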

@Fenixin
Owner

Fenixin commented Jan 17, 2021

Hello!

I might be wrong, but if I recall correctly this is fixed in the bugfix branch. I probably should have pushed this to master a long time ago.

@luawtf, please, give bugfix a try and tell us how it goes!

@Fenixin Please note twoolie/NBT#144.
As I now understand it, the NBT format does not use standard UTF-8 but a non-standard variant sometimes referred to as MUTF-8. I'm not sure that's the root cause here, though (I would have expected a can't decode byte 0x00 instead of can't decode byte 0xed), but it is good to be able to pursue this matter.

@macfreek, thanks for the heads-up!

I have always treated these UnicodeDecodeErrors as corrupted chunks because I have never found them in healthy worlds (I have a 42 GB world where this problem never shows up, and in such a big sample of region files the probability of it happening should be pretty high) and also because I had never imagined something like MUTF-8. Thanks again for the information.

I will take a look at this when I have some free time (maybe in summer); for now these symptoms will have to stay classified as corrupted chunks.

Also, thank you so much for the quick response! I'm really hopeful that I can get my gamesave back in one piece :P (or at least most of it). Is there any way to financially donate to or sponsor this project?

@luawtf, I'm the maintainer of this project but, to tell you the truth, regionfixer only exists thanks to @macfreek and his package, so feel free to donate to him.

@luavixen
Author

Thanks, I'll take a look!

@luavixen
Author

Using bugfix now, I get the same error as with my hotfix. Maybe I should make a new issue for this error specifically?

<class 'TypeError'>
Traceback (most recent call last):
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 374, in fix_problematic_chunks
    chunk = region_file.get_chunk(*local_coords)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/region.py", line 599, in get_chunk
    return self.get_nbt(x, z)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/region.py", line 574, in get_nbt
    data = self.get_blockdata(x, z) # This may raise a RegionFileFormatError.
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/nbt/region.py", line 566, in get_blockdata
    raise ChunkDataError(err)
nbt.region.ChunkDataError: Error -3 while decompressing data: invalid distance too far back

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 588, in <module>
    value = main()
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 548, in main
    fix_bad_chunks(args, w)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer.py", line 71, in fix_bad_chunks
    counter = scanned_obj.fix_problematic_chunks(problem)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 1395, in fix_problematic_chunks
    counter += regionset.fix_problematic_chunks(status)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 933, in fix_problematic_chunks
    counter += self._set[r].fix_problematic_chunks(status)
  File "/home/lua/Downloads/recover/Minecraft-Region-Fixer/regionfixer_core/world.py", line 391, in fix_problematic_chunks
    out += dc.decompress(i)
TypeError: a bytes-like object is required, not 'int'

@Fenixin
Owner

Fenixin commented Jan 17, 2021

I will give it a look, hopefully, next weekend.

EDIT: there is no need for a new issue, thanks!

@macfreek
Contributor

Hi @Fenixin

Here is a short script to reproduce it. You need to adjust the path to Minecraft-Region-Fixer and the affected MCA file.

#!/usr/bin/env python

# MRF_PATH = '/Users/freek/Repository/mine-minecraft/Minecraft-Region-Fixer'
MRF_PATH = '/Users/freek/Downloads/Minecraft-Region-Fixer-0.3.3/'
MCA_PATH = '/Users/freek/Downloads/serversurvival/serversurvival/region/r.-2.-3.mca'

# Make sure the libraries from the paths above are loaded, not copies from elsewhere
import sys
sys.path.insert(0, MRF_PATH)

import regionfixer_core.scan
import nbt

assert nbt.__file__.startswith(MRF_PATH)
assert regionfixer_core.scan.__file__.startswith(MRF_PATH)

region = nbt.region.RegionFile(filename=MCA_PATH)
coords = (28, 17)  # chunk coordinates local to the region file
global_coords = (-36, -79)  # the same chunk in world chunk coordinates
entity_limit = 300
result = regionfixer_core.scan.scan_chunk(region, coords, global_coords, entity_limit)
print(result)

You will find that the bug is there if you git checkout v0.3.3, but not for git checkout master or git checkout bugfix. That's simply because commit 503efd2 moved the UnicodeDecodeError catch to a higher level, so it really just masks the issue.
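
To illustrate what catching at a higher level means, here is a minimal sketch. It is not the code from commit 503efd2; `scan_one`, `broken_loader` and the status strings are hypothetical, and `load_chunk` stands in for something like `RegionFile.get_chunk`.

```python
def scan_one(load_chunk, x, z):
    """Classify one chunk instead of letting a decode error abort the scan."""
    try:
        chunk = load_chunk(x, z)
    except UnicodeDecodeError as err:
        # Record the failure and carry on with the next chunk.
        return None, "corrupted", f"{err.reason} (byte {err.start})"
    return chunk, "ok", ""


def broken_loader(x, z):
    # Stand-in for a chunk whose string tag holds undecodable bytes.
    return b"0123456789\xed\x41".decode("utf-8")


print(scan_one(broken_loader, 28, 17))
```

The scan then completes and reports the chunk as corrupted, but the bad string itself is still unparsed, which is why the underlying problem remains.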

I have not delved into the actual issue, sorry about that. I was fighting with pdb (I seem to have encountered a bug in Python's multiprocessing library), so it took me a little longer than anticipated. Hope you can take it from here. Please tag me if it is an issue in the NBT library.

Note that NBT is not really my library -- twoolie is the author. I maintained it for a while, but am not actively doing so. (@luawtf was lucky that I was just watching television on a Sunday evening, and chasing a bug felt more interesting than watching TV :) )

@luavixen
Author

luavixen commented Jan 17, 2021

Note that NBT is not really my library -- twoolie is the author. I maintained it for a while, but am not actively doing so. (@luawtf was lucky that I was just watching television on a Sunday evening, and chasing a bug felt more interesting than watching TV :) )

Wahoo, lucky me!

Since Python isn't really my area of expertise, I've begun building a Minecraft-Region-Fixer-like tool that I'm calling anvil_recovery_tool in Rust. I'm going to be taking a different approach, working on region files directly. Also, instead of attempting to repair broken data, I'm generating valid (but empty) region files, then copying over whatever data I can (with a focus on block data). This includes using the very stable hematite_nbt NBT library, and falling back to a dirty system of searching the chunk blobs for the Sections string then ripping out the binary data after it, then sanitising it to make it a valid NBT structure.

I think next time I might just back up my Minecraft world better, though :P

@macfreek
Contributor

macfreek commented Jan 18, 2021

Couldn't resist taking a closer look.

The following is an even shorter snippet that reproduces the issue.

import nbt
MCA_PATH = '/Users/freek/Downloads/serversurvival/serversurvival/region/r.-2.-3.mca'
region = nbt.region.RegionFile(filename = MCA_PATH)
nbt_data = region.get_chunk(28, 17)

Of course, Minecraft-Region-Fixer shouldn't crash after this Unicode exception. @Fenixin, I leave it up to you to make sure it doesn't.

@luawtf It seems you indeed have a corrupt world on your hands. I found two issues:

  • In r.-2.-3.mca, there is a corrupt minecraft:written_book in a lectern at x=-575, z=-1259.
  • File r.-1.4.mca got truncated, leading to 385 lost chunks (out of the 1024) in that region (x=-512...0, z=2048...2560).
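
The coordinates used in this thread are consistent with each other under the standard Anvil layout (16×16-block chunks, 32×32-chunk regions). A small sketch, taking the lectern's block position as x = -575, z = -1259 (the values in its NBT tags); `locate_block` is a hypothetical helper, not part of the NBT library.

```python
def locate_block(x: int, z: int):
    """Map a block position to (region file, local chunk, global chunk)."""
    cx, cz = x >> 4, z >> 4      # 16 blocks per chunk; >> floors negatives
    rx, rz = cx >> 5, cz >> 5    # 32 chunks per region
    local = (cx & 31, cz & 31)   # chunk index inside its region file
    return f"r.{rx}.{rz}.mca", local, (cx, cz)


print(locate_block(-575, -1259))
```

The local pair is what get_chunk takes, and the global pair is what the longer reproduction script passes as global_coords.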

Here is the relevant part of the NBT file for the corrupted book:

TAG_List('TileEntities'): [1 TAG_Compound(s)]
{
  TAG_Compound: {7 Entries}
  {
    TAG_Int('Page'): 0
    TAG_Compound('Book'): {3 Entries}
    {
      TAG_String('id'): minecraft:written_book
      TAG_Compound('tag'): {4 Entries}
      {
        TAG_List('pages'): [6 TAG_String(s)]
        {
          TAG_String: {"text":"H∴ ᓵᔑリ ||⚍ ᓭᒷᒷ ╎リℸ ̣  ᒲ|| ᒷ||ᒷᓭ ꖎ╎ꖌᒷ !¡ᒷリ ↸∷ᓭ?\nlᒷᔑ↸╎リ⊣ ||⚍ ↸∴リ, ╎リℸ ̣  ᒲ|| ᓵ∷ᒷ\nw⍑ᒷ∷ᒷ i'⍊ᒷ ʖᒷᓵᒲᒷ ᓭ リ⚍ᒲʖ, ∴╎ℸ ̣ ⍑⚍ℸ ̣  ᔑ ᓭ⚍ꖎ\nm|| ᓭ!¡╎∷╎ℸ ̣ 'ᓭ ᓭꖎᒷᒷ!¡╎リ⊣ ᓭᒲᒷ∴⍑ᒷ∷ᒷ ᓵꖎ↸\nuリℸ ̣ ╎ꖎ ||⚍ ⎓╎リ↸ ╎ℸ ̣  ℸ ̣ ⍑ᒷ∷ᒷ, ᔑリ↸ ꖎᒷᔑ↸ ╎ℸ ̣ , ʖᔑᓵꖌ, ⍑ᒲᒷ"}
          TAG_String: {"text":"wᔑꖌᒷ ᒲᒷ ⚍!¡ ╎リᓭ╎↸ᒷ\nwᔑꖌᒷ ᒲᒷ ⚍!¡ ╎リᓭ╎↸ᒷ\ncᔑꖎꖎ ᒲ|| リᔑᒲᒷ ᔑリ↸ ᓭᔑ⍊ᒷ ᒲᒷ ⎓∷ᒲ ℸ ̣ ⍑ᒷ ↸ᔑ∷ꖌ\nb╎↸ ᒲ|| ʖꖎ↸ ℸ ̣  ∷⚍リ\nbᒷ⎓∷ᒷ i ᓵᒲᒷ ⚍リ↸リᒷ"}
          TAG_String: {"text":"sᔑ⍊ᒷ ᒲᒷ ⎓∷ᒲ ℸ ̣ ⍑ᒷ リℸ ̣ ⍑╎リ⊣ i'⍊ᒷ ʖᒷᓵᒲᒷ\nn∴ ℸ ̣ ⍑ᔑℸ ̣  i ꖌリ∴ ∴⍑ᔑℸ ̣  i'ᒲ ∴╎ℸ ̣ ⍑⚍ℸ ̣ \ny⚍ ᓵᔑリ'ℸ ̣  ⋮⚍ᓭℸ ̣  ꖎᒷᔑ⍊ᒷ ᒲᒷ\nb∷ᒷᔑℸ ̣ ⍑ᒷ ╎リℸ ̣  ᒲᒷ ᔑリ↸ ᒲᔑꖌᒷ ᒲᒷ ∷ᒷᔑꖎ\nb∷╎リ⊣ ᒲᒷ ℸ ̣  ꖎ╎⎓ᒷ"}
          TAG_String: {"text":"wᔑꖌᒷ ᒲᒷ ⚍!¡ ╎リᓭ╎↸ᒷ\nwᔑꖌᒷ ᒲᒷ ⚍!¡ ╎リᓭ╎↸ᒷ\ncᔑꖎꖎ ᒲ|| リᔑᒲᒷ ᔑリ↸ ᓭᔑ⍊ᒷ ᒲᒷ ⎓∷ᒲ ℸ ̣ ⍑ᒷ ↸ᔑ∷ꖌ\nb╎↸ ᒲ|| ʖꖎ↸ ℸ ̣  ∷⚍リ\nbᒷ⎓∷ᒷ i ᓵᒲᒷ ⚍リ↸リᒷ\nsᔑ⍊ᒷ ᒲᒷ ⎓∷ᒲ ℸ ̣ ⍑ᒷ リℸ ̣ ⍑╎リ⊣ i'⍊ᒷ ʖᒷᓵᒲᒷ\nb∷╎リ⊣ ᒲᒷ ℸ ̣  ꖎ╎⎓ᒷ\nb∷╎リ⊣ ᒲᒷ ℸ ̣  ꖎ╎⎓ᒷ\nf∷⨅ᒷリ ╎リᓭ╎↸ᒷ, ∴╎ℸ ̣ ⍑⚍ℸ ̣  ||⚍∷ ℸ ̣ ⚍ᓵ⍑\nw╎ℸ ̣ ⍑⚍ℸ ̣  ||⚍∷ ꖎ⍊ᒷ, ↸ᔑ∷ꖎ╎リ⊣"}
          TAG_String: {"text":"oリꖎ|| ||⚍ ᔑ∷ᒷ ᒲ|| ꖎ╎⎓ᒷ\naᒲリ⊣ ℸ ̣ ⍑ᒷ ↸ᒷᔑ↸\ni'⍊ᒷ ʖᒷᒷリ ᓭꖎᒷᒷ!¡╎リ⊣ ᔑ ℸ ̣ ⍑⚍ᓭᔑリ↸ ||ᒷᔑ∷ᓭ ╎ℸ ̣  ᓭᒷᒷᒲᓭ\ngℸ ̣  ℸ ̣  !¡ᒷリ ᒲ|| ᒷ||ᒷᓭ ℸ ̣  ᒷ⍊ᒷ∷||ℸ ̣ ⍑╎リ⊣\ndリ'ℸ ̣  ꖎᒷℸ ̣  ᒲᒷ ↸╎ᒷ ⍑ᒷ∷ᒷ\nb∷╎リ⊣, ᒲᒷ, ℸ ̣ , ꖎ╎⎓ᒷ\nwᔑꖌᒷ ᒲᒷ ⚍!¡ ╎リᓭ╎↸ᒷ\nwᔑꖌᒷ ᒲᒷ ⚍!¡ ╎リᓭ╎↸ᒷ"}
          TAG_String: {"text":"cᔑꖎꖎ ᒲ|| リᔑᒲᒷ ᔑリ↸ ᓭᔑ⍊ᒷ ᒲᒷ ⎓∷ᒲ ℸ ̣ ⍑ᒷ ↸ᔑ∷ꖌ\nb╎↸ ᒲ|| ʖꖎ↸ ℸ ̣  ∷⚍リ\nbᒷ⎓∷ᒷ i ᓵᒲᒷ ⚍リ↸リᒷ\nsᔑ⍊ᒷ ᒲᒷ ⎓∷ᒲ ℸ ̣ ⍑ᒷ リℸ ̣ ⍑╎リ⊣ i'⍊ᒷ ʖᒷᓵᒲᒷ\nb∷╎リ⊣ ᒲᒷ ℸ ̣  ꖎ╎⎓ᒷ\nb∷╎リ⊣ ᒲᒷ ℸ ̣  ꖎ╎⎓ᒷ\nb∷╎リ⊣ ᒲᒷ ℸ ̣  ꖎ╎⎓ᒷ"}
        }
        TAG_String('title'): Truth - Vl. I
        TAG_String('author'): fairyflosslord7
        TAG_Byte('resolved'): 1
      }
      TAG_Byte('Count'): 1
    }
    TAG_Int('z'): -1259
    TAG_String('id'): minecraft:lectern
    TAG_Int('y'): 73
    TAG_Int('x'): -575
    TAG_Byte('keepPacked'): 0
  }
}

@luawtf in the attached file, I replaced these lines with:

TAG_List('pages'): [6 TAG_String(s)]
{
  TAG_String: {"text":"This page left blank"}
  TAG_String: {"text":"This page left blank"}
  TAG_String: {"text":"This page left blank"}
  TAG_String: {"text":"This page left blank"}
  TAG_String: {"text":"This page left blank"}
  TAG_String: {"text":"This page left blank"}
}
TAG_String('title'): Truth - Vl. I
TAG_String('author'): fairyflosslord7
TAG_Byte('resolved'): 1

Here is a short analysis of r.-1.4.mca:

File size is 4022784 bytes, which is not a multiple of 4096
chunk 0,4 has status -1: Out Of File
chunk 0,4 starts at sector 1115, while the file is only 983 sectors
chunk 0 ,4  part 1/2 outside file
chunk 0,11 has status -1: Out Of File
chunk 0,11 starts at sector 1617, while the file is only 983 sectors
chunk 0 ,11 part 1/2 outside file
chunk 0,12 has status -1: Out Of File
chunk 0,12 starts at sector 1707, while the file is only 983 sectors
...
...
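
A report like this can be approximated by reading only the region header. This is a rough sketch, not the code that produced the output above. It assumes the usual region layout: 4096-byte sectors, and a location table of 1024 big-endian entries at the start of the file, each holding a 3-byte sector offset and a 1-byte sector count, indexed by x + 32 * z. The name `out_of_file_chunks` is hypothetical.

```python
import struct

SECTOR = 4096  # region files are organised in 4 KiB sectors


def out_of_file_chunks(header: bytes, file_size: int):
    """Yield (x, z, first_sector) for chunks that start beyond the end of the file."""
    total_sectors = -(-file_size // SECTOR)  # round up: a partial sector still counts
    for index in range(1024):
        (entry,) = struct.unpack_from(">I", header, index * 4)
        first_sector, sector_count = entry >> 8, entry & 0xFF
        if first_sector == 0 and sector_count == 0:
            continue  # chunk was never generated
        if first_sector >= total_sectors:
            yield index % 32, index // 32, first_sector


# Synthetic header for illustration; the sector counts are assumed values.
header = bytearray(SECTOR)
struct.pack_into(">I", header, (0 + 32 * 4) * 4, (1115 << 8) | 2)  # chunk 0,4
struct.pack_into(">I", header, (1 + 32 * 0) * 4, (2 << 8) | 1)     # chunk 1,0, inside the file
print(list(out_of_file_chunks(bytes(header), 4022784)))  # [(0, 4, 1115)]
```

For a real file, the header would be the first 4096 bytes read in binary mode and `file_size` the size on disk. 4022784 bytes rounds up to 983 sectors, which matches the report.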

I'll attach two region files that I cleaned with the underlying NBT library.

I removed the pointers to chunks in r.-1.4.mca that were no longer there, and made sure the file has the proper length. The chunks that were still there will still exist; the others will be regenerated once you load Minecraft and start walking around in that area (x=-512...0, z=2048...2560).
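
The cleanup described here can be sketched in a few lines. This is an illustration under stated assumptions, not the NBT library's actual code. It assumes the 8 KiB header (location table followed by timestamp table) is intact, it treats any chunk that reaches into a missing or partial sector as lost, and `drop_lost_chunks` is a hypothetical name.

```python
import struct

SECTOR = 4096


def drop_lost_chunks(data: bytes) -> bytes:
    """Return a copy of a region file with dangling chunk pointers cleared and the length padded."""
    # Pad with zero bytes up to the next multiple of 4096.
    fixed = bytearray(data) + bytes(-len(data) % SECTOR)
    # Only sectors that are completely present in the original data count.
    total_sectors = len(data) // SECTOR
    for index in range(1024):
        (entry,) = struct.unpack_from(">I", fixed, index * 4)
        first_sector, sector_count = entry >> 8, entry & 0xFF
        if entry and first_sector + sector_count > total_sectors:
            struct.pack_into(">I", fixed, index * 4, 0)           # mark as "not generated"
            struct.pack_into(">I", fixed, SECTOR + index * 4, 0)  # clear its timestamp
    return bytes(fixed)
```

Working on an in-memory copy keeps the original file untouched until the result has been checked. Write the output to a new file rather than overwriting the damaged one.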

r.-1.4.mca.zip
r.-2.-3.mca.zip

I hope these work for you.

@macfreek
Contributor

I'm going to be taking a different approach, working on region files directly.

The underlying NBT library actually has 4 parts: world folders, region files, NBT structures, and Minecraft chunks. The world folder part is OK; the NBT structure part could use cleanup but has been fairly stable for the last 10+(!) years. The chunk part is hopelessly outdated (it didn't keep up with the changes in the data structure over the years). The region part is actually the part that I contributed. Because I added lots of unit tests, I'm fairly certain that it is robust and will not corrupt any region more than it already is, even in really weird cases, e.g. when the headers of two different chunks point to the same location in the file. If you delete one of the overlapping chunks, it would not touch the other one, and would gently move it to a free location.

@luavixen
Author

OMG! Thank you so much, I'll test these new Anvil files right away!

@luavixen
Author

Update! The world loads fine now and while a considerable amount of stuff is back, I still ended up losing the spawn area :(, oh well. Thank you so much for the help!

@TkTech

TkTech commented Jan 21, 2021

In r.-2.-3.mca, there is a corrupt minecraft:written_book at -1259, -575.

That doesn't look corrupted, it's just MUTF-8. Your example works for me with a simple change:

diff --git a/nbt/nbt.py b/nbt/nbt.py
index 46ccac1..7752251 100644
--- a/nbt/nbt.py
+++ b/nbt/nbt.py
@@ -5,6 +5,7 @@ Handle the NBT (Named Binary Tag) data format
 from struct import Struct, error as StructError
 from gzip import GzipFile
 from collections import MutableMapping, MutableSequence, Sequence
+import mutf8
 import sys

 _PY3 = sys.version_info >= (3,)
@@ -350,10 +351,10 @@ class TAG_String(TAG, Sequence):
         read = buffer.read(length.value)
         if len(read) != length.value:
             raise StructError()
-        self.value = read.decode("utf-8")
+        self.value = mutf8.decode_modified_utf8(read)

     def _render_buffer(self, buffer):
-        save_val = self.value.encode("utf-8")
+        save_val = mutf8.encode_modified_utf8(self.value)
         length = TAG_Short(len(save_val))
         length._render_buffer(buffer)
         buffer.write(save_val)

Don't forget to pip install mutf8.

@TkTech

TkTech commented Jan 21, 2021

(I would have expect a can't decode byte 0x00 instead of can't decode byte 0xed), but it is good to be able to pursue this matter.

@macfreek 0xED is the lead byte for a six-byte MUTF-8 codepoint.
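
To make the 0xED point concrete: Java's modified UTF-8 stores a character outside the Basic Multilingual Plane as a UTF-16 surrogate pair, with each half written as a three-byte sequence that begins with 0xED. It also writes U+0000 as the two bytes C0 80. Strict UTF-8 forbids both, which is why Python's built-in codec gives up. The following is a small pure-Python sketch of that behaviour. It is not the code of the mutf8 package, and the `*_sketch` names are made up for illustration.

```python
def encode_mutf8_sketch(text: str) -> bytes:
    """Encode text the way modified UTF-8 does (illustrative only)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            out += b"\xc0\x80"  # NUL is stored as an overlong two-byte form
        elif cp > 0xFFFF:
            # Split into a UTF-16 surrogate pair and encode each half as 3 bytes.
            cp -= 0x10000
            high, low = 0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


def decode_mutf8_sketch(data: bytes) -> str:
    """Decode modified UTF-8 by letting surrogates through, then re-pairing them."""
    text = data.replace(b"\xc0\x80", b"\x00").decode("utf-8", "surrogatepass")
    # A round trip through UTF-16 merges each surrogate pair into one character.
    return text.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


raw = encode_mutf8_sketch("\U0001F600")
print(raw.hex(" "))  # ed a0 bd ed b8 80
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    print("plain UTF-8 fails:", err)
print(decode_mutf8_sketch(raw) == "\U0001F600")  # True
```

The sketch raises on unpaired surrogates in the final step. A real decoder such as mutf8 handles the edge cases and is much faster.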

@Fenixin
Owner

Fenixin commented Jan 24, 2021

Oh wow... thank you all!

I will read this slowly and try to update RegionFixer. Busy weeks are coming for me but I will give this a good look when the time arrives.

@Fenixin
Owner

Fenixin commented Aug 1, 2022

It seems that I'm not that good with predicting when I'm going to work on regionfixer...

This should be fixed in the latest release (v0.3.6). The fix is ugly but works.

What I have done is to change the UTF-8 decoding to MUTF-8 (from here https://pypi.org/project/mutf8/) in nbt. Thanks very much for all your research and for making this easy.

I don't know if this should be pushed upstream to nbt. In order to make the use of regionfixer easy for everyone I've just included the library in regionfixer code (which is not ideal).

I'm going to close this. Open a new issue if you feel like I left something important out.

Thank you all!

@Fenixin Fenixin closed this as completed Aug 1, 2022
@TkTech

TkTech commented Aug 10, 2022

Did you include my mutf8 library just to keep the project dependency free or did you run into a bug with it? Just asking because using it like this you're not getting the (much, much, much) faster C extension, just the pure python version. If speed isn't a concern you might as well just remove the C version.

@Fenixin
Owner

Fenixin commented Aug 15, 2022

Did you include my mutf8 library just to keep the project dependency free or did you run into a bug with it? Just asking because using it like this you're not getting the (much, much, much) faster C extension, just the pure python version. If speed isn't a concern you might as well just remove the C version.

Hello!

Thanks for the heads-up.

I included the library to make it dependency free (easier for the user) and also because I did this in a hurry.

Speed should be a concern, I should have tested speeds before and after using this approach to see the effects. Regionfixer is used to scan big worlds in servers so they are probably not very happy right now. Why is there the C version too? Because I did it in a hurry.

Is there a solution that would have the best of both worlds? Easy for users and speed for people that want it.

@TkTech

TkTech commented Aug 15, 2022

Both the Python and the C versions are significantly faster than the alternatives, but the C version is orders of magnitude faster. Your users shouldn't be too unhappy.

If you want to bundle the C version you'll want to modify your setup.py to include an Extension so pip knows to build it, and you'll want CI to build and release as many binary wheels as possible so users don't need to compile (see cibuildwheel). The mutf8 package does all this, so you can cut and paste.

Or, just stick with the .py version :)
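
One possible middle ground, offered as a sketch and not as what regionfixer does: prefer the pip-installed mutf8 package when it is present, and fall back to a bundled pure-Python decoder otherwise. The fallback below is a tiny stand-in for a real bundled copy, and the `BACKEND` variable is only there for illustration.

```python
try:
    # Installed package: brings the fast C extension when a binary wheel is available.
    from mutf8 import decode_modified_utf8
    BACKEND = "installed mutf8"
except ImportError:
    BACKEND = "bundled pure-Python fallback"

    def decode_modified_utf8(data: bytes) -> str:
        """Minimal stand-in for a bundled pure-Python MUTF-8 decoder."""
        # Undo the two-byte NUL, let surrogate halves through, then re-pair them.
        text = data.replace(b"\xc0\x80", b"\x00").decode("utf-8", "surrogatepass")
        return text.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


print(BACKEND)
print(decode_modified_utf8(b"\xed\xa0\xbd\xed\xb8\x80") == "\U0001F600")
```

Users who do nothing still get a working tool, while server admins who care about scan time can run pip install mutf8 to pick up the faster path.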
