-
Great writeup, thanks for the detail. I don't have any ideas, but based on the stack trace I'm cross-linking what I suspect is the relevant source: https://github.com/OE4T/linux-tegra-5.10/blob/5921377f5ffb5b1fbca9e40a187d1059743ef631/nvidia/nvgpu/drivers/gpu/nvgpu/os/linux/module.c#L152. Is there a way to reproduce this with https://github.com/OE4T/tegra-demo-distro ?
-
Hi, just for information: I solved the kernel panic. It was really tricky. The underlying problem had been there for a while, but the kernel panic only shows up with the latest version of L4T. I found out that I had an error before as well, but the device still managed to shut down. I made a change to my machine configuration, and even if it has a " I didn't notice this for a while either, since it was a minimal OS without GPU drivers, but I built a small C++ program with Qt dependencies that activated Of course, this was not easy to reproduce, because unless all these conditions were met, the problem disappeared. I hope nobody runs into this again, but just in case, for information: I have had this error visible in dmesg ever since I introduced my own machine configuration, and it is of course gone once you install tegra-firmware again:
Anyway, thanks for the support and your work on the layer. Regards,
-
Hello,
I opened a new topic because, even though it follows on from a previous one, I think it deserves its own topic in case others have the same problem. The previous one is here: #1182
First I will describe the state of my layers and how I get to the kernel panic, then I will paste the kernel panic, and finally I will add some other observations comparing the behavior with the NVIDIA Ubuntu sample rootfs. I understand that it is a lot of information; I hope I am clear enough and that I am just missing a small detail that triggers all of it.
Summarizing the context:
After this context, here is my lovely kernel panic, in the hope that one of you recognizes what it could be and can give me a hint (in bold, what I suspect is important):
I am not sure whether it is related to the problem, but I have been comparing with the sample rootfs from NVIDIA, and these are the differences I found:
After a manual reset, the device starts properly. If I set it to boot from partition B with nvbootctrl, it does so. It does not switch from B back to A, though. It creates the variable in the esp partition in the same way (BootChainFwNext-781e084c-a330-417c-b678-38e696380cb has the proper value "0"), but somehow the value is not picked up. On the sample Ubuntu this works.
nvbootctrl dump-slot-info always shows retry_count=2 (instead of 3) for the current slot and retry_count=3 for the other slot, and that is right after the first flash, before any reboot. After kernel panics and reboots it shows the same: the count never goes down, and the bootloader never takes over to fall back to the previous partition.
In the sample Ubuntu, after a reboot both slots have nothing on the esp but one variable file, BootChainFwStatus-781e084c-a330-417c-b678-38e696380cb9. On the first boot after flashing this variable is not there, but two others are (TegraPlatformCompatSpec-781e084c-a330-417c-b678-38e696380cb9 and TegraPlatformSpec-781e084c-a330-417c-b678-38e696380cb9), and after a reboot those are gone.
For my distribution built with kirkstone, BootchainFwStatus is not even under /sys/firmware/efi/efivars, and of course it is not created as a variable on the esp partition. The other two are always there, even after a reboot (remember, there is a kernel panic, so I have to reset the device physically).
And I think that is all. I hope somebody recognizes something, because I am a little bit out of options.
Best regards,
Alvaro.
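For anyone who wants to repeat the efivars comparison described above on both images, here is a minimal sketch in Python. It is not taken from the post: the function name read_efivars and its parameters are made up for illustration. It only assumes the standard efivarfs layout, where each file is named Name-GUID and starts with a 4-byte little-endian attribute word followed by the variable data. The GUID is the one that appears in the variable names quoted above.

```python
import struct
from pathlib import Path

# GUID copied from the variable names quoted in the post.
NVIDIA_GUID = "781e084c-a330-417c-b678-38e696380cb9"


def read_efivars(guid=NVIDIA_GUID, root="/sys/firmware/efi/efivars"):
    """Return {variable name: (attributes, payload bytes)} for one vendor GUID.

    Hypothetical helper: 'root' is a parameter only so the function can be
    pointed at a copy of the directory taken from another image.
    """
    suffix = "-" + guid
    result = {}
    for path in sorted(Path(root).glob("*" + suffix)):
        raw = path.read_bytes()
        if len(raw) < 4:
            continue  # too short to hold the 4-byte attribute word
        (attrs,) = struct.unpack("<I", raw[:4])
        # Strip the "-GUID" suffix so only the variable name remains.
        result[path.name[: -len(suffix)]] = (attrs, raw[4:])
    return result
```

Running read_efivars() on the sample rootfs and on the kirkstone build, then comparing the returned keys and payloads, would show at a glance which of BootChainFwNext, BootChainFwStatus, TegraPlatformCompatSpec and TegraPlatformSpec exist on each image and what they contain.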