MKRGSM bug with accents #101

Open
zolanet opened this issue Sep 18, 2019 · 9 comments


zolanet commented Sep 18, 2019

Hello,
I will be using an Arduino MKR GSM 1400 in an installation that is in French.
I have come across a bug in the library where accented characters are dropped from received messages unless the message also contains an emoji.

I am using the Receive SMS example: https://www.arduino.cc/en/Tutorial/MKRGSMExamplesReceiveSMS

If I send the following message:
"Teste avec accès aux accents"

The Serial Monitor returns:
"Teste avec accs aux accents"

But if I send:
"Teste avec accès aux accents 😜"

The Serial Monitor returns the following hex values:
"0054006500730074006500200061007600650063002000610063006300E80073002000610075007800200061006300630065006E007400730020D83DDE1C"

Which I can decode to:
"Teste avec accès aux accents 😜"

Is there a way to force hex values to be returned instead of text?
It seems that it's the only way to get accents to show up...

Regards
Marc
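
For reference, the hex output above is the message as UTF-16 big-endian code units, four hex digits each; the emoji at the end is a surrogate pair (D83D DE1C). A minimal sketch of the decoding in plain standard C++, so it can be tried off the board. The name `decodeUcs2Hex` and its helpers are hypothetical, not part of the MKRGSM library, and replacing malformed input with '?' is an assumption of this sketch.

```cpp
#include <cstdint>
#include <string>

// Value of one hex digit, or -1 if the character is not a hex digit.
static int hexDigitValue(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

// Append one Unicode code point to `out` as UTF-8.
static void appendUtf8(std::string &out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: decode hex-encoded UTF-16BE text into UTF-8.
// Malformed input (non-hex digit, lone surrogate, trailing partial group)
// is replaced with '?'.
std::string decodeUcs2Hex(const std::string &hex) {
    std::string out;
    std::uint32_t pendingHigh = 0;  // high surrogate waiting for its partner
    for (std::size_t i = 0; i + 4 <= hex.size(); i += 4) {
        std::uint32_t unit = 0;
        bool ok = true;
        for (std::size_t j = 0; j < 4; ++j) {
            int v = hexDigitValue(hex[i + j]);
            if (v < 0) { ok = false; break; }
            unit = (unit << 4) | static_cast<std::uint32_t>(v);
        }
        if (!ok) { out += '?'; pendingHigh = 0; continue; }
        if (pendingHigh != 0) {
            if (unit >= 0xDC00 && unit <= 0xDFFF) {
                // Combine the surrogate pair into one code point.
                appendUtf8(out, 0x10000 + ((pendingHigh - 0xD800) << 10) + (unit - 0xDC00));
                pendingHigh = 0;
                continue;
            }
            out += '?';  // high surrogate not followed by a low surrogate
            pendingHigh = 0;
        }
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            pendingHigh = unit;
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            out += '?';  // low surrogate without a preceding high surrogate
        } else {
            appendUtf8(out, unit);
        }
    }
    if (pendingHigh != 0 || hex.size() % 4 != 0) out += '?';
    return out;
}
```

Calling `decodeUcs2Hex` on the hex string above should give back the original sentence as UTF-8; on the board the same logic would need Arduino `String` in place of `std::string`.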

Author

zolanet commented Sep 26, 2019

Following up. The following sentence:
"J'ai décidé d'aller au café, mais il est rempli à craquer. Ça demande de la patience. Il y a des gens qui attendent dans l'entrée! C'est complètement fou!"
returns:
"J'ai dcid d'aller au caf, mais il est rempli craquer. a demande de la patience. Il y a des gens qui attendent dans l'entre! C'est compltement fou!"
But, if I use a "ê":
"J'ai décidé d'aller au café, mais il est rempli à craquer. Il faut être patient. Il y a des gens qui attendent dans l'entrée! C'est complètement fou!"
It returns:
"004A0027006100690020006400E900630069006400E90020006400270061006C006C00650072002000610075002000630061006600E9002C0020006D00610069007300200069006C0020006500730074002000720065006D0070006C0069002000E000200063007200610071007500650072002E00200049006C00200066006100750074002000EA007400720065002000700061007400690065006E0074002E00200049006C00200079002000610020006400650073002000670065006E0073002000710075006900200061007400740065006E00640065006E0074002000640061006E00730020006C00270065006E0074007200E90065002100200043002700650073007400200063006F006D0070006C00E800740065006D0065006E007400200066006F00750021"
Which decodes to:
"J'ai décidé d'aller au café, mais il est rempli à craquer. Il faut être patient. Il y a des gens qui attendent dans l'entrée! C'est complètement fou!"

Going further, messages containing any of the following accented letters are returned as hex values:
À Â â È Ê ê Ë ë Ï ï Î î Ô ô Ù Û û ç

The following accented characters are completely deleted from the string if used by themselves:
à è É é Ü ü ù Ç

The latter are all part of the GSM_7 Encoding: https://en.wikipedia.org/wiki/GSM_03.38#GSM_7-bit_default_alphabet_and_extension_table_of_3GPP_TS_23.038_.2F_GSM_03.38
So it seems that ê and emojis force hex output which is expected since they are not part of the GSM_7 encoding.

Please let me know how to force the Arduino to output hex values instead of characters.

regards
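
The observation above (characters outside the GSM 7-bit alphabet force hex output) can be checked offline. This is a minimal sketch in plain C++ that tests whether a code point belongs to the GSM 7-bit default alphabet or its extension table as listed in 3GPP TS 23.038. The function names `isGsm7` and `fitsGsm7` are hypothetical, and whether the sending phone really picks its encoding this way is an assumption based on the behaviour reported in this thread.

```cpp
#include <string>

// True if the code point is in the GSM 7-bit default alphabet or its
// extension table (3GPP TS 23.038).
bool isGsm7(char32_t cp) {
    // Control characters present in the alphabet: LF, FF (extension), CR.
    if (cp == 0x0A || cp == 0x0C || cp == 0x0D) return true;
    // Printable ASCII is covered, except the backtick (U+0060).
    if (cp >= 0x20 && cp <= 0x7E) return cp != 0x60;
    // Non-ASCII members of the alphabet.
    static const char32_t kOthers[] = {
        0x00A1, 0x00A3, 0x00A4, 0x00A5, 0x00A7, 0x00BF,  // ¡ £ ¤ ¥ § ¿
        0x00C4, 0x00C5, 0x00C6, 0x00C7, 0x00C9, 0x00D1,  // Ä Å Æ Ç É Ñ
        0x00D6, 0x00D8, 0x00DC, 0x00DF,                  // Ö Ø Ü ß
        0x00E0, 0x00E4, 0x00E5, 0x00E6, 0x00E8, 0x00E9,  // à ä å æ è é
        0x00EC, 0x00F1, 0x00F2, 0x00F6, 0x00F8, 0x00F9,  // ì ñ ò ö ø ù
        0x00FC,                                          // ü
        0x0393, 0x0394, 0x0398, 0x039B, 0x039E, 0x03A0,  // Γ Δ Θ Λ Ξ Π
        0x03A3, 0x03A6, 0x03A8, 0x03A9,                  // Σ Φ Ψ Ω
        0x20AC                                           // € (extension table)
    };
    for (char32_t other : kOthers) {
        if (cp == other) return true;
    }
    return false;
}

// True if every character of the message fits the GSM 7-bit alphabet,
// i.e. no character in it would require UCS2.
bool fitsGsm7(const std::u32string &text) {
    for (char32_t cp : text) {
        if (!isGsm7(cp)) return false;
    }
    return true;
}
```

With this table, é (U+00E9) and Ç (U+00C7) are inside the alphabet while ê (U+00EA) and ç (U+00E7) are not, which matches the two lists above.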

Contributor

Rocketct commented Sep 27, 2019

Hi @zolanet, as replied in #31 regarding special characters, we don't support the extended character set because the read() function uses a buffer of bytes. You could implement logic that handles the Unicode code points from the _incomingBuffer in the GSM_SMS class and recognizes the characters that you need.

Author

zolanet commented Sep 27, 2019

Thanks for the reply, @Rocketct.
This may be a bit beyond my skills. Is it as simple as:

int GSM_SMS::read()
{
  int bufferLength = _incomingBuffer.length();

  if (_smsDataIndex < bufferLength && _smsDataIndex <= _smsDataEndIndex) 
  {
		switch(_smsDataIndex)
		{
			case "é":
				return "00E9"
				_smsDataIndex++
				break;
			default:
				return _incomingBuffer[_smsDataIndex++]
				break;
		}
  }
  return -1;
}

If not, can you give a simple example?

regards


salvq commented Dec 25, 2019

I would also welcome help here...

@zolanet, have you found a way to convert the special characters like you suggested above?

I am trying to convert special characters to ASCII to get human readable message rather than HEX code.

Thanks

Contributor

Rocketct commented Mar 6, 2020

Sorry for the late response, @zolanet. You should check the value in the incoming buffer, like:
....
if (_incomingBuffer[_smsDataIndex] == special_Char_code) {
  _smsDataIndex++;  // advance only once per character read
  // return the value you need
}
return _incomingBuffer[_smsDataIndex++];
....

Let me find a working example; I'll post it as soon as I have one.

Contributor

Rocketct commented Mar 9, 2020

@zolanet I ran some tests and got the following results:

  • with the current configuration, the character set selected on the modem allows it to receive only the **GSM default alphabet (3GPP TS 23.038)**;
    this means that, contrary to what I reported before, it is the modem that filters the received message;

  • to allow reception of the special characters defined in the standard **(ISO/IEC 10646)**, you have to:

  1. change the character set: add the line MODEM.send("AT+CSCS=\"UCS2\""); at the very beginning of GSM_SMS::available(); this changes how the modem returns the message body;
  2. read the values from the incoming buffer and convert them to the characters you want.

In practice, the steps above mean:

    while ((c = sms.read()) != -1) {
      //Serial.print((char)c);
      //outbuf += String(c);
      // Keep only the hex digits that make up "00E8"
      if (c == '0') {
        outbuf += '0';
      }
      if (c == 'E') {
        outbuf += 'E';
      }
      if (c == '8') {
        outbuf += '8';
      }
    }
    // 00E8 is the UCS2 code for 'è'
    if (outbuf == "00E8") {
      Serial.println("è");
    }

    Serial.println("\nEND OF MESSAGE");

    // Delete message from modem memory
    sms.flush();
    Serial.println("MESSAGE DELETED");
  }

With AT debug enabled, the output is:

Without changing the character set:
AT+CMGL="REC UNREAD"

+CMGL: 2,"REC UNREAD","11111111",,"20/03/09,14:08:31+04"

After changing the character set:
AT+CSCS="UCS2"
OK
AT+CMGL="REC UNREAD"
+CMGL: 2,"REC UNREAD","00310031003100310031003100310031",,"20/03/09,14:27:48+04"
00E8

Note how "11111111" becomes "00310031003100310031003100310031" (I removed my number; this is an example, and I have checked that the correspondence works), where 0031 = '1' in ISO-8859-1. Notice that the difference between the first and the second read is the presence of 00E8, which is 'è'.

The example I used is the following:


// include the GSM library
#include <MKRGSM.h>

#include "arduino_secrets.h"
// Please enter your sensitive data in the Secret tab or arduino_secrets.h
// PIN Number
const char PINNUMBER[] = SECRET_PINNUMBER;

// initialize the library instances
GSM gsmAccess(true);
GSM_SMS sms;

// Array to hold the number an SMS is retrieved from
char senderNumber[80];

void setup() {
  // initialize serial communications and wait for port to open:
  Serial.begin(9600);
  while (!Serial) {
    ; // wait for serial port to connect. Needed for native USB port only
  }

  Serial.println("SMS Messages Receiver");

  // connection state
  bool connected = false;

  // Start GSM connection
  while (!connected) {
    if (gsmAccess.begin(PINNUMBER) == GSM_READY) {
      connected = true;
    } else {
      Serial.println("Not connected");
      delay(1000);
    }
  }

  Serial.println("GSM initialized");
  Serial.println("Waiting for messages");
}

void loop() {
  int c;

  // If there are any SMSs available()
  if (sms.available()) {
    Serial.println("Message received from:");

    // Get remote number
    sms.remoteNumber(senderNumber, 80);
    Serial.println(senderNumber);

    // An example of message disposal
    // Any messages starting with # should be discarded
    if (sms.peek() == '#') {
      Serial.println("Discarded SMS");
      sms.flush();
    }
//----------------------- build the string and find the equivalent char value ------------------------
    String outbuf = "";
    // Read message bytes and keep only the hex digits that make up "00E8"
    while ((c = sms.read()) != -1) {
      //Serial.print((char)c);
      //outbuf += String(c);
      if (c == '0') {
        outbuf += '0';
      }
      if (c == 'E') {
        outbuf += 'E';
      }
      if (c == '8') {
        outbuf += '8';
      }
    }
    // 00E8 is the UCS2 code for 'è'
    if (outbuf == "00E8") {
      Serial.println("è");
    }
//--------------------------------------------------------------------------------------------------------------------

    Serial.println("\nEND OF MESSAGE");

    // Delete message from modem memory
    sms.flush();
    Serial.println("MESSAGE DELETED");
  }

  delay(1000);

}
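
The loop in the example above only recognizes the digits 0, E and 8. A more general variant could accumulate any four hex digits into one 16-bit code unit and let the caller decide what to do with it. The following is a minimal sketch in plain C++; the class name `Ucs2HexAccumulator` is hypothetical and not part of MKRGSM, and it assumes the modem has been switched to UCS2 so that every character arrives as four hex digits.

```cpp
#include <cstdint>

// Hypothetical helper: collects hex digits one at a time (as a read() loop
// would deliver them) and yields a 16-bit code unit after every fourth digit.
class Ucs2HexAccumulator {
public:
    // Feed one character. Returns true when `unit` has been filled with a
    // complete code unit. A non-hex character discards the current group.
    bool feed(int c, std::uint16_t &unit) {
        int v;
        if (c >= '0' && c <= '9') {
            v = c - '0';
        } else if (c >= 'A' && c <= 'F') {
            v = c - 'A' + 10;
        } else if (c >= 'a' && c <= 'f') {
            v = c - 'a' + 10;
        } else {
            value_ = 0;
            count_ = 0;
            return false;
        }
        value_ = static_cast<std::uint16_t>((value_ << 4) | v);
        if (++count_ < 4) {
            return false;
        }
        unit = value_;
        value_ = 0;
        count_ = 0;
        return true;
    }

private:
    std::uint16_t value_ = 0;  // digits collected so far, as a number
    int count_ = 0;            // how many digits of the group have arrived
};
```

In the sketch's loop() this would be used roughly as `if (acc.feed(c, unit) && unit == 0x00E8) Serial.println("è");` inside the `while ((c = sms.read()) != -1)` loop, with `acc` and `unit` declared before the loop.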

Contributor

Rocketct commented Mar 9, 2020

Hope this helps.


Wortoll commented Jul 14, 2020

Hi there,
I had the same problem while making a sms-printer. (mkr1400 sms to Adafruit thermal printer)

Thanks to the information above, I managed to fix this by:

  • Changing the character set in the GSM_SMS.cpp file by adding MODEM.send("AT+CSCS=\"UCS2\""); right under GSM_SMS::available()

  • Converting the incoming UCS2 data to characters of choice with the use of 2 arrays.
    This is done by grouping every 4 incoming chars into a "blokje" (Dutch for "little block").
    Each "blokje" is compared against "blokjesDB" to find its position, and the corresponding character is retrieved from "outputDB". Characters not in "blokjesDB" are printed as #. Emoticons are UTF-16 surrogate pairs, 8 chars long, so they are returned as ##.

String blokjesDB [137] = {"0020","0021","0022","0023","0024","0025","0026","0027","0028","0029","002A","002B","002C","002D","002E","002F","0030","0031","0032","0033","0034","0035","0036","0037","0038","0039","003A","003B","003C","003D","003E","003F","0040","0041","0042","0043","0044","0045","0046","0047","0048","0049","004A","004B","004C","004D","004E","004F","0050","0051","0052","0053","0054","0055","0056","0057","0058","0059","005A","005B","005C","005D","005E","005F","0060","0061","0062","0063","0064","0065","0066","0067","0068","0069","006A","006B","006C","006D","006E","006F","0070","0071","0072","0073","0074","0075","0076","0077","0078","0079","007A","007B","007C","007D","007E","00A1","00C0","00C1","00C2","00C3","00C4","00C5","00C7","00C8","00C9","00CA","00CB","00CC","00CD","00CE","00CF","00D0","00D1","00D2","00D3","00D4","00D5","00D6","00D9","00DA","00DB","00DC","00DD","00DF","00E0","00E1","00E2","00E3","00E4","00E5","00E7","00E8","00E9","00EA","00EB"};     
String outputDB [137]  = {   " ",   "!",   " ",   "#",   "S",   "%",   "&",   " ",   "(",   ")",   "*",   "+",   ",",   "-",   ".",   "/",   "0",   "1",   "2",   "3",   "4",   "5",   "6",   "7",   "8",   "9",   ":",   " ",   "<",   "=",   ">",   "?",   "@",   "A",   "B",   "C",   "D",   "E",   "F",   "G",   "H",   "I",   "J",   "K",   "L",   "M",   "N",   "O",   "P",   "Q",   "R",   "S",   "T",   "U",   "V",   "W",   "X",   "Y",   "Z",   "[",   "/",   "]",   "^",   "_",   "",    "a",   "b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","{","|","}","~","!","A","A","A","A","A","A","C","E","E","E","E","I","I","I","I","D","N","O","O","O","O","O","U","U","U","U","Y","s","a","a","a","a","a","a","c","e","e","e","e"};
void messageContentPrinter()
{
    int c;
    int positie = 0;   // number of hex digits collected so far
    String blokje;     // current group of 4 hex digits (one UCS2 code unit)

    // Read message bytes and print them
    while ((c = sms.read()) != -1)
    {
        blokje.concat(String(char(c)));
        positie++;

        if (positie == 4) // filled "blokje" with 4 characters
        {
            bool found_output = false;

            Serial.print("blokje: "); Serial.print(blokje); Serial.print(" => ");

            // lookup in blokjesDB (start at 0; index -1 would read outside the array)
            for (int i = 0; i < 137; i++)
            {
                if (blokje.equals(blokjesDB[i]))
                {
                    //Serial.print(" output: ");
                    Serial.println(outputDB[i]);
                    found_output = true;
                    break;
                }
            }

            if (!found_output)
            {
                Serial.println("#"); // characters not in the "blokjesDB" are printed as #
            }

            positie = 0;
            blokje = "";
        }
    }
    sms.flush();
    Serial.println(" MESSAGE DELETED");
}
  • There was also an issue where the modem seemed to stop working after a day and a half. Implementing a watchdog makes sure the system gets rebooted; it's crude, but it should work. Funnily enough, the code has now been running for 4 days without any problems (maybe that is also because of the recent library update, I don't know).
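
As a possible alternative to the two String arrays above, the lookup can be done on the numeric value of each code unit, which avoids comparing strings for every character. This is only a sketch in plain C++: `toPrintableAscii` and `kLatin1Fold` are hypothetical names, ASCII is passed through unchanged (unlike the table above, which replaces a few symbols), and the lowercase mappings from 0x00EC upward are an assumed extension of the original table.

```cpp
#include <cstdint>

// One replacement character per code unit from 0x00C0 to 0x00FF.
// '#' marks characters with no single-letter ASCII stand-in.
static const char kLatin1Fold[] =
    "AAAAAA#C"   // 0x00C0-0x00C7: À Á Â Ã Ä Å Æ Ç
    "EEEEIIII"   // 0x00C8-0x00CF: È É Ê Ë Ì Í Î Ï
    "DNOOOOO#"   // 0x00D0-0x00D7: Ð Ñ Ò Ó Ô Õ Ö ×
    "#UUUUY#s"   // 0x00D8-0x00DF: Ø Ù Ú Û Ü Ý Þ ß
    "aaaaaa#c"   // 0x00E0-0x00E7: à á â ã ä å æ ç
    "eeeeiiii"   // 0x00E8-0x00EF: è é ê ë ì í î ï (from 0x00EC on: assumed)
    "#nooooo#"   // 0x00F0-0x00F7: ð ñ ò ó ô õ ö ÷ (assumed)
    "#uuuuy#y";  // 0x00F8-0x00FF: ø ù ú û ü ý þ ÿ (assumed)

static_assert(sizeof(kLatin1Fold) == 65, "table must cover 0x00C0-0x00FF");

// Hypothetical helper: map one UCS2 code unit to a printable ASCII character,
// or '#' when there is no stand-in (surrogates, other scripts, symbols).
char toPrintableAscii(std::uint16_t unit) {
    if (unit >= 0x20 && unit <= 0x7E) {
        return static_cast<char>(unit);  // printable ASCII passes through
    }
    if (unit == 0x00A1) {
        return '!';                      // inverted exclamation mark, as in the table above
    }
    if (unit >= 0x00C0 && unit <= 0x00FF) {
        return kLatin1Fold[unit - 0x00C0];
    }
    return '#';
}
```

Each half of an emoji surrogate pair falls through to '#', so emoticons still come out as ## as described above.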

QUESTION

  • Is there a way to receive a timestamp from the server? (It is possible to request an epoch timestamp, but conversions are not that easy. I noticed that when there is an error the server can send a timestamp in the error, so maybe it's possible to receive a timestamp that way?)
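
On the epoch conversion mentioned in the question: if an epoch value (seconds since 1970-01-01 UTC) is already available, the standard C time functions can turn it into a readable date. A minimal sketch in plain C++; `formatEpochUtc` is a hypothetical name, the output is UTC with no time-zone handling, and it is an assumption that these functions are available in the board's toolchain.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: format seconds since the Unix epoch as
// "YYYY-MM-DD HH:MM:SS" in UTC. Returns an empty string on failure.
std::string formatEpochUtc(std::time_t epochSeconds) {
    std::tm *utc = std::gmtime(&epochSeconds);
    if (utc == nullptr) {
        return std::string();
    }
    char buf[20];  // 19 characters plus the terminating NUL
    if (std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", utc) == 0) {
        return std::string();
    }
    return std::string(buf);
}
```

For example, `formatEpochUtc(0)` gives "1970-01-01 00:00:00".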


Wortoll commented Jul 26, 2020

EDIT:
It seems I should have started a new issue. => #118

Also, I turned the translator into a function so you can pass any data to it (e.g. the senderNumber).
I'll post this asap!
