Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-O2 breaks this lib on atmel M0 #102

Open
cujomalainey opened this issue Jul 8, 2022 · 14 comments
Open

-O2 breaks this lib on atmel M0 #102

cujomalainey opened this issue Jul 8, 2022 · 14 comments

Comments

@cujomalainey
Copy link

  • Arduino board: itsy bitsy m0

  • Arduino IDE version (found in Arduino -> About Arduino menu): platformio 6.0.2

  • List the steps to reproduce the problem below (if possible attach a sketch or
    copy the sketch code in too):

Use the following build config and try and use the BusIO library

[env:itsy]
platform = atmelsam
board = adafruit_itsybitsy_m0
framework = arduino
build_flags = -O2
build_unflags = -Os
lib_deps = 6214

main.cpp

#include "Wire.h"
#include <Adafruit_SPIDevice.h>

void setup() {}
void loop() {}

build output (dirty build to reduce length)

Processing itsy (platform: atmelsam; board: adafruit_itsybitsy_m0; framework: arduino)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Verbose mode can be enabled via `-v, --verbose` option
CONFIGURATION: https://docs.platformio.org/page/boards/atmelsam/adafruit_itsybitsy_m0.html
PLATFORM: Atmel SAM (7.1.0+sha.ec05bdd) > Adafruit ItsyBitsy M0
HARDWARE: SAMD21G18A 48MHz, 32KB RAM, 256KB Flash
DEBUG: Current (atmel-ice) External (atmel-ice, blackmagic, jlink)
PACKAGES:
 - framework-arduino-samd-adafruit @ 1.7.5
 - framework-cmsis @ 2.50400.181126 (5.4.0)
 - framework-cmsis-atmel @ 1.2.2
 - toolchain-gccarmnoneeabi @ 1.90301.200702 (9.3.1)
LDF: Library Dependency Finder -> https://bit.ly/configure-pio-ldf
LDF Modes: Finder ~ chain, Compatibility ~ soft
Found 12 compatible libraries
Scanning dependencies...
Dependency Graph
|-- Adafruit BusIO @ 1.12.0
|   |-- SPI @ 1.0
|   |   |-- Adafruit Zero DMA Library @ 1.1.0
|   |   |   |-- Adafruit TinyUSB Library @ 1.4.3
|   |   |-- Adafruit TinyUSB Library @ 1.4.3
|   |-- Wire @ 1.0
|   |   |-- Adafruit TinyUSB Library @ 1.4.3
|-- Wire @ 1.0
|   |-- Adafruit TinyUSB Library @ 1.4.3
Building in release mode
Compiling .pio/build/itsy/src/main.cpp.o
Archiving .pio/build/itsy/lib716/libAdafruit_ZeroDMA.a
Indexing .pio/build/itsy/lib716/libAdafruit_ZeroDMA.a
Archiving .pio/build/itsy/lib0f0/libSPI.a
Indexing .pio/build/itsy/lib0f0/libSPI.a
Compiling .pio/build/itsy/lib6bd/Wire/Wire.cpp.o
Compiling .pio/build/itsy/libc16/Adafruit BusIO/Adafruit_BusIO_Register.cpp.o
Compiling .pio/build/itsy/libc16/Adafruit BusIO/Adafruit_I2CDevice.cpp.o
Compiling .pio/build/itsy/libc16/Adafruit BusIO/Adafruit_SPIDevice.cpp.o
/var/folders/wy/yddn63455wd2vk7m62shqd780000gn/T//ccEbUdN3.s: Assembler messages:
/var/folders/wy/yddn63455wd2vk7m62shqd780000gn/T//ccEbUdN3.s:462: Error: lo register required -- `sub r10,#1'
Compiling .pio/build/itsy/FrameworkArduino/WInterrupts.c.o
Compiling .pio/build/itsy/FrameworkArduino/WMath.cpp.o
Compiling .pio/build/itsy/FrameworkArduino/WString.cpp.o
Compiling .pio/build/itsy/FrameworkArduino/abi.cpp.o
Compiling .pio/build/itsy/FrameworkArduino/avr/dtostrf.c.o
*** [.pio/build/itsy/libc16/Adafruit BusIO/Adafruit_SPIDevice.cpp.o] Error 1
@cujomalainey
Copy link
Author

Note -O1 works fine

@cujomalainey
Copy link
Author

Was able to bisect down the optimizations 2 adds, this also repros the bug. build_flags = -O1 -fcode-hoisting -finline-functions -finline-small-functions

I'm guessing this is likely a bug in GCC now that I look at those flags, its probably not repurposing a register properly for an inlined function.

What is interesting is that by default the system is compiling with -Os which is -O2 minus a few options (not ones above) and includes -finline-functions but "tunes for size" so im guessing whatever is getting inlined and breaking isn't tuned for size.

reference gcc manual

@ladyada
Copy link
Member

ladyada commented Jul 10, 2022

thats unfortunate but....not sure what it could be. what if you de-inline any functions, does that help?

@hathach
Copy link
Member

hathach commented Jul 11, 2022

could you confirm if the issue reproducible with Arduino IDE by changing the -Os to -O2 option here (there are 3 occurrences)

https://github.com/adafruit/ArduinoCore-samd/blob/bd2a9cdbe7433ec88701591c49b1224c8686e940/platform.txt#L36-L42

@cujomalainey
Copy link
Author

@ladyada it appears to be the transfer function, forcing that to not be inlined fixes it

diff --git a/Adafruit_SPIDevice.cpp b/Adafruit_SPIDevice.cpp
index 44a8f55..db3e993 100644
--- a/Adafruit_SPIDevice.cpp
+++ b/Adafruit_SPIDevice.cpp
@@ -119,7 +119,7 @@ bool Adafruit_SPIDevice::begin(void) {
  *    @param  buffer The buffer to send and receive at the same time
  *    @param  len    The number of bytes to transfer
  */
-void Adafruit_SPIDevice::transfer(uint8_t *buffer, size_t len) {
+void __attribute__ ((noinline)) Adafruit_SPIDevice::transfer(uint8_t *buffer, size_t len) {
   if (_spi) {
     // hardware SPI is easy

@cujomalainey
Copy link
Author

Dug a bit deeper through the ASM, looks like its specifically the calls to delayMicroseconds (defined in cores/arduino/delay.h from the BSP) in transfer that are causing the issue when inlined in the transfer function.

/**
 * \brief Pauses the program for the amount of time (in microseconds) specified as parameter.
 *
 * \param dwUs the number of microseconds to pause (uint32_t)
 */
#if defined(__SAMD51__)
extern void delayMicroseconds( unsigned int );
#else
static __inline__ void delayMicroseconds( unsigned int ) __attribute__((always_inline, unused)) ;
static __inline__ void delayMicroseconds( unsigned int usec )
{
  if ( usec == 0 )
  {
    return ;
  }
  /*
   *  The following loop:
   *
   *    for (; ul; ul--) {
   *      __asm__ volatile("");
   *    }
   *
   *  produce the following assembly code:
   *
   *    loop:
   *      subs r3, #1        // 1 Core cycle
   *      bne.n loop         // 1 Core cycle + 1 if branch is taken
   */

  // VARIANT_MCK / 1000000 == cycles needed to delay 1uS
  //                     3 == cycles used in a loop
  uint32_t n = usec * (VARIANT_MCK / 1000000) / 3;

  __asm__ __volatile__(
    "1:              \n"
    "   sub %0, #1   \n" // substract 1 from %0 (n)
    "   bne 1b       \n" // if result is not 0 jump to 1
    : "+r" (n)           // '%0' is n variable with RW constraints
    :                    // no input
    :                    // no clobber
  );
  // https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
  // https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
}
#endif

This is the from the .s dump @ line 121 (test build config changed the line to this from 462 in original report)

@ 92 "/Users/curtismalainey/.platformio/packages/framework-arduino-samd-adafruit/cores/arduino/delay.h" 1
        1:
   sub r10, #1
   bne 1b

@ 0 "" 2

Note there is another call right before this one and its fine, but if the parser is going from reverse then this is not-helpful.

@ 92 "/Users/curtismalainey/.platformio/packages/framework-arduino-samd-adafruit/cores/arduino/delay.h" 1
        1:
   sub r2, #1
   bne 1b

@ 0 "" 2

Further debugging, looking into the ARMv6 ISA posted here. I think I figured out the error

Section A6.7.65 (pg 164)
Only 3 bits available to express the register Rdn. So assuming a linear addressing, you cannot specify anything above r7 for the sub instruction.

So back to blaming gcc unfortunately.

@ladyada
Copy link
Member

ladyada commented Aug 1, 2022

erk - im ok with forcing a non-inline attribute, can we by chance #ifdef that for the gcc version?

@cujomalainey
Copy link
Author

Sure, I can test that out, I'll also see if I can bisect the affected range since platformio makes adjusting toolchains easy

@ladyada
Copy link
Member

ladyada commented Aug 1, 2022

ok please note we do not have CI or testing for platformio so basically just make sure it passes the actions CI for arduino :)

@cujomalainey
Copy link
Author

Latest gcc on platformio is also broken

arm-none-eabi-g++ (xPack GNU Arm Embedded GCC x86_64) 10.3.1 20210824 (release)

I'll try and build gcc from source later this week on GCC 11 and 12 and see if its fixed there, otherwise i'll report upstream

@cujomalainey
Copy link
Author

Unable to repro on arm-none-eabi-g++ (GNU Toolchain for the Arm Architecture 11.2-2022.02 (arm-11.14)) 11.2.1 20220111 which i fetched straight from the arm website

So it looks like the gcc toolchain needs to be updated for the board, i don't even see that in homebrew formulas yet so i don't expect it to be for a bit.

Thankfully my project is not blocked by this (my project was slow because I was doing way too much float math on an itsybitsy_m0, which i have removed).

@cujomalainey
Copy link
Author

Note I checked the asm dump in my test and the second call was using r6 instead of r10

@ladyada
Copy link
Member

ladyada commented Aug 5, 2022

is the toolchain v defined in platformio - e.g. do you see the same issue in arduino ide?

@cujomalainey
Copy link
Author

is the toolchain v defined in platformio - e.g. do you see the same issue in arduino ide?

Will test when I have time, but I believe it inherits the same toolchains from the Arduino bsp as it tries to reuse as much as possible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants