-
Notifications
You must be signed in to change notification settings - Fork 154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replacement character aka �
appears out of nowhere sometimes
#1143
Comments
I can reproduce it without
If so, since apparently nothing was changed in this regard when QTerminal was ported to Qt6 (but that should be checked again), it may be related to Qt6. Although Konsole is also based on Qt6, it uses |
If you want to check the code (which would be very appreciated), this is what was done for porting to Qt6: https://github.com/lxqt/qtermwidget/pull/532/files (unfortunately, its commits weren't squashed before merging; hence mentioning the PR and not a commit). Small changes were also made after it, but I don't think they're relevant. It consists of using Qt6 methods and dropping of legacy encodings ( |
I superficially checked out the PR and whatever was after it as well, but, to by novice eye, nothing popped out as the culprit. Further, I switched I'm unclear whether this means that 1.4.0 is using some kind of qt5 even though I've still only dev-qt/qtbase 6.7.2 (maybe it does, it it's backward compatible?), but if not, then maybe something between that tag and the current git master on one or both of the repos could be the culprit? Looking a bit harder, I do seem to still have something about qt5, so I guess maybe it's using that?
I'll check the diff since 1.4.0 tag |
What I know for sure is that there were characters which Qt5 showed correctly (in any Qt5 app) but Qt6 didn't. That regression is fixed in qt6-base 6.7.1-4 from Arch. Qt 6.7.2 will come soon to my distro (Manjaro), but it's very unlikely that the regression has come back with it (I'll see soon). |
Your tests suggest it. I have 1.4.0 on my old laptop and will check it later. So, if it works fine there, we only need to find the needle in the haystack ;) |
I think you were right though, 1.4.0 working and 2.0.0 not, given the difference in changes, it doesn't seem to be either qterminal or qtermwidget. I had a superficial look and found, possibly not relevant: https://bugreports.qt.io/browse/QTBUG-117051 That being said, i still haven't checked qterminal's 1.4.0 diff, only qtermwidget's (and superficially) EDIT: done, nothing jumps out to me, but I should note I've never programmed anything with qt before, so I've no idea what's what :D I'm switching my focus on qt itself |
I'll join you soon. If the problem is in our commits, two pairs of eyes couldn't miss it. |
Progress: diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index c88ac12..a8ed6ea 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -224,9 +224,13 @@ void Emulation::receiveData(const char* text, int length)
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
+ qDebug() << "!!!!!!!!!! text:" << text;
QString str = QString::fromUtf8(text, length);
+ qDebug() << "!!!!!!!!!! str:" << str;
auto encoded = _fromUtf8(str);
- std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! encoded:" << encoded;
+ std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! unicodeText:" << QString::fromStdWString(unicodeText.c_str());
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++) with this patch i can redirect output to file and have Well, maybe I jumped the gun a little, as the above shows that 'text' was already having those replacement chars, so the transform didn't happen in this very patch place, but somewhere else, but the idea remains that something split it into chunks that didn't end on utf8 boundary or something, and then something else tried to make the chunks utf8 and thought the last char was invalid utf8 because it was split due to chunking of sorts. You'd know exactly what/why, I'm guessing. Remember I've 0 knowledge of Qt stuff. |
I think that it may be a Qt6 problem which Konsole avoids automatically by using the Qt5 methods provided by |
This is what I've done so far. First I checked QTerminal 1.4.0 and found no issue in this regard. A screenshot: Then I started to compare 1.4.0 and 2.0.0 with each other and focused on Later I'll compare other files that may be related to this, but I think the problem happens in |
the thing is that for some reason the |
ok I see the wrong assumption here on my part: I realized that because I just couldn't get this patch to work(it just wouldn't trigger), heh: diff --git a/CMakeLists.txt b/CMakeLists.txt
index 9cf3516..3533c86 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -153,6 +153,7 @@ set_target_properties( ${QTERMWIDGET_LIBRARY_NAME} PROPERTIES
VERSION ${QTERMWIDGET_VERSION}
)
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fexceptions")
if(APPLE)
target_compile_definitions(${QTERMWIDGET_LIBRARY_NAME}
diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index c88ac12..afc2c1e 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -224,9 +224,31 @@ void Emulation::receiveData(const char* text, int length)
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
+ qDebug() << "!!!!!!!!!! length:" << length;
+ qDebug() << "!!!!!!!!!! text:" << text;
QString str = QString::fromUtf8(text, length);
+ qDebug() << "!!!!!!!!!! str:" << str;
auto encoded = _fromUtf8(str);
- std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! encoded:" << encoded;
+ std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! unicodeText:" << QString::fromStdWString(unicodeText.c_str());
+ if (length >= 3 &&
+ ((text[length - 3] == '\xEF' &&
+ text[length - 2] == '\xBF' &&
+ text[length - 1] == '\xBD')||(text[0]=='\xEF' && text[1]=='\xBF' && text[2]=='\xBD'))) {
+
+ qDebug() << "Replacement character detected at the end of the input.";
+
+ // Option 1: Print stack trace
+ //printStackTrace();
+
+ // Option 2: Crash the program
+ // This can be done using assertions or by throwing an exception
+ // Q_ASSERT(false); // This will crash the program with a debug assertion
+ throw std::runtime_error("Replacement character detected at the end of the input. Crashing as requested.");
+ // Option 2: Crash the program using assertion
+ Q_ASSERT(false); // This will crash the program with a debug assertion
+ }
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++) therefore you may be right indeed However, I'm unclear what calls receiveData and why would it have worked before in 1.4.0 if the 'text' arg doesn't end in utf8 boundary. |
I've a stacktrace with who's calling (gdb) bt2
executing: 'frame apply all -q frame'
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
#1 0x00007ffff5ea9513 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
78 return __pthread_kill_implementation (threadid, signo, 0);
#2 0x00007ffff5e5164e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
26 int ret = __pthread_kill (__pthread_self (), sig);
#3 0x00007ffff5e399c5 in __GI_abort () at abort.c:79
79 raise (SIGABRT);
#4 0x00007ffff60a6198 in __gnu_cxx::__verbose_terminate_handler () at /usr/src/debug/sys-devel/gcc-14.1.1_p20240622/gcc-14-20240622/libstdc++-v3/libsupc++/vterminate.cc:95
95 abort();
#5 0x00007ffff60bfa3d in __cxxabiv1::__terminate (handler=<optimized out>) at /usr/src/debug/sys-devel/gcc-14.1.1_p20240622/gcc-14-20240622/libstdc++-v3/libsupc++/eh_terminate.cc:48
48 handler ();
#6 0x00007ffff60a5b0f in std::terminate () at /usr/src/debug/sys-devel/gcc-14.1.1_p20240622/gcc-14-20240622/libstdc++-v3/libsupc++/eh_terminate.cc:58
58 __cxxabiv1::__terminate (get_terminate ());
#7 0x00007ffff60bfe5c in __cxxabiv1::__cxa_throw (obj=<optimized out>, tinfo=0x7ffff629ddd8 <typeinfo for std::runtime_error>, dest=0x7ffff60d8b50 <std::runtime_error::~runtime_error()>) at /usr/src/debug/sys-devel/gcc-14.1.1_p20240622/gcc-14-20240622/libstdc++-v3/libsupc++/eh_throw.cc:98
98 std::terminate ();
#8 0x00007ffff7f34220 in Konsole::Emulation::receiveData (this=<optimized out>, text=text@entry=0x555555dba100 "\240\277\"\033[m\033[38;2;204;204;204m\033[48;2;0;0;0m, \033[m\033[38;2;179;246;192m\033[48;2;0;0;0m\"⢿\"\033[m\033[38;2;204;204;204m\033[48;2;0;0;0m,\033[m\r\n", ' ' <repeats 12 times>, "\033[m\033[38;2;179;246;192m\033[48;2;0;0;0m\"⡇\"\033[m\033[38;2;204;204;204m\033[48;"..., length=length@entry=1027) at /tmp/qtermwidget/lib/Emulation.cpp:247
247 throw std::runtime_error("Replacement character detected at the front of the input. Crashing as requested.");
#9 0x00007ffff7f7369b in Konsole::Session::onReceiveBlock (this=0x5555558d96f0, buf=0x555555dba100 "\240\277\"\033[m\033[38;2;204;204;204m\033[48;2;0;0;0m, \033[m\033[38;2;179;246;192m\033[48;2;0;0;0m\"⢿\"\033[m\033[38;2;204;204;204m\033[48;2;0;0;0m,\033[m\r\n", ' ' <repeats 12 times>, "\033[m\033[38;2;179;246;192m\033[48;2;0;0;0m\"⡇\"\033[m\033[38;2;204;204;204m\033[48;"..., len=1027) at /tmp/qtermwidget/lib/Session.cpp:902
902 _emulation->receiveData( buf, len );
#10 0x00007ffff67485b8 in doActivate<false> (sender=0x5555556fe530, signal_index=16, argv=0x7fffffffe170) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qobject.cpp:4098
4098 callFunction(receiver, QMetaObject::InvokeMetaMethod, method_relative, argv);
#11 0x00007ffff66f8ce9 in QMetaObject::activate (sender=sender@entry=0x5555556fe530, m=m@entry=0x7ffff7fb8100 <Konsole::Pty::staticMetaObject>, local_signal_index=local_signal_index@entry=0, argv=argv@entry=0x7fffffffe170) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qobject.cpp:4146
4146 doActivate<false>(sender, signal_index, argv);
#12 0x00007ffff7f952aa in Konsole::Pty::receivedData (this=this@entry=0x5555556fe530, _t1=<optimized out>, _t2=<optimized out>) at /tmp/qtermwidget/build/lib/moc_Pty.cpp:178
178 QMetaObject::activate(this, &staticMetaObject, 0, _a);
#13 0x00007ffff7f64ef0 in Konsole::Pty::dataReceived (this=0x5555556fe530) at /tmp/qtermwidget/lib/Pty.cpp:323
323 emit receivedData(data.constData(),data.size());
#14 0x00007ffff67485b8 in doActivate<false> (sender=0x5555556fe220, signal_index=3, argv=0x7fffffffe248) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qobject.cpp:4098
4098 callFunction(receiver, QMetaObject::InvokeMetaMethod, method_relative, argv);
#15 0x00007ffff66f8ce9 in QMetaObject::activate (sender=sender@entry=0x5555556fe220, m=m@entry=0x7ffff6a31de0 <QIODevice::staticMetaObject>, local_signal_index=local_signal_index@entry=0, argv=argv@entry=0x0) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qobject.cpp:4146
4146 doActivate<false>(sender, signal_index, argv);
#16 0x00007ffff675772b in QIODevice::readyRead (this=this@entry=0x5555556fe220) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2_build/src/corelib/Core_autogen/include/moc_qiodevice.cpp:211
211 QMetaObject::activate(this, &staticMetaObject, 0, nullptr);
#17 0x00007ffff7f62fb4 in KPtyDevicePrivate::_k_canRead (this=0x5555556f4c50) at /tmp/qtermwidget/lib/kptydevice.cpp:146
146 emit q->readyRead();
#18 0x00007ffff7f94fc9 in KPtyDevice::qt_static_metacall (_c=QMetaObject::InvokeMetaMethod, _id=1, _o=<optimized out>, _a=0x7fffffffe430) at /tmp/qtermwidget/build/lib/moc_kptydevice.cpp:105
105 case 1: { bool _r = _t->d_func()->_k_canRead();
#19 KPtyDevice::qt_static_metacall (_o=<optimized out>, _c=<optimized out>, _id=<optimized out>, _a=0x7fffffffe430) at /tmp/qtermwidget/build/lib/moc_kptydevice.cpp:98
98 void KPtyDevice::qt_static_metacall(QObject *_o, QMetaObject::Call _c, int _id, void **_a)
#20 0x00007ffff67485b8 in doActivate<false> (sender=0x5555556fe200, signal_index=5, argv=0x7fffffffe430) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qobject.cpp:4098
4098 callFunction(receiver, QMetaObject::InvokeMetaMethod, method_relative, argv);
#21 0x00007ffff66f8ce9 in QMetaObject::activate (sender=sender@entry=0x5555556fe200, m=m@entry=0x7ffff6a2f620 <QSocketNotifier::staticMetaObject>, local_signal_index=local_signal_index@entry=2, argv=argv@entry=0x7fffffffe430) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qobject.cpp:4146
4146 doActivate<false>(sender, signal_index, argv);
#22 0x00007ffff66f96a3 in QSocketNotifier::activated (this=this@entry=0x5555556fe200, _t1=<optimized out>, _t2=...) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2_build/src/corelib/Core_autogen/include/moc_qsocketnotifier.cpp:202
202 QMetaObject::activate(this, &staticMetaObject, 2, _a);
#23 0x00007ffff66f97ea in QSocketNotifier::event (this=0x5555556fe200, e=<optimized out>) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qsocketnotifier.h:82
82 Q_DECL_CONSTEXPR operator DescriptorType() const noexcept { return sockfd; }
#24 0x00007ffff7a1c7ed in QApplicationPrivate::notify_helper (this=<optimized out>, receiver=0x5555556fe200, e=0x7fffffffe5a0) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/widgets/kernel/qapplication.cpp:3287
3287 consumed = receiver->event(e);
#25 0x00007ffff676578c in QCoreApplication::notifyInternal2 (receiver=0x5555556fe200, event=0x7fffffffe5a0) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qcoreapplication.cpp:1142
1142 return self->notify(receiver, event);
#26 0x00007ffff6765924 in QCoreApplication::sendEvent (receiver=<optimized out>, event=<optimized out>) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qcoreapplication.cpp:1583
1583 return notifyInternal2(receiver, event);
#27 0x00007ffff6503b2e in socketNotifierSourceDispatch (source=0x555555664770) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qeventdispatcher_glib.cpp:75
75 QCoreApplication::sendEvent(p->socketNotifier, &event);
#28 0x00007ffff5d18d36 in g_main_dispatch (context=context@entry=0x7ffff0000ef0) at ../glib-2.78.6/glib/gmain.c:3476
3476 need_destroy = !(* dispatch) (source, callback, user_data);
#29 0x00007ffff5d1caa8 in g_main_context_dispatch_unlocked (context=0x7ffff0000ef0) at ../glib-2.78.6/glib/gmain.c:4284
4284 g_main_dispatch (context);
#30 g_main_context_iterate_unlocked (context=context@entry=0x7ffff0000ef0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib-2.78.6/glib/gmain.c:4349
4349 g_main_context_dispatch_unlocked (context);
#31 0x00007ffff5d1d3b6 in g_main_context_iteration (context=0x7ffff0000ef0, may_block=1) at ../glib-2.78.6/glib/gmain.c:4414
4414 retval = g_main_context_iterate_unlocked (context, may_block, TRUE, G_THREAD_SELF);
#32 0x00007ffff64ed625 in QEventDispatcherGlib::processEvents (this=0x55555565e900, flags=...) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/kernel/qeventdispatcher_glib.cpp:394
394 bool result = g_main_context_iteration(d->mainContext, canWait);
#33 0x00007ffff67927fa in QEventLoop::exec (this=this@entry=0x7fffffffe7f0, flags=..., flags@entry=...) at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/global/qflags.h:34
34 constexpr inline Q_IMPLICIT operator uint() const noexcept { return uint(i); }
#34 0x00007ffff67929ac in QCoreApplication::exec () at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/corelib/global/qflags.h:74
74 constexpr inline Q_IMPLICIT QFlags(Enum flags) noexcept : i(Int(flags)) {}
#35 0x00007ffff6d9460d in QGuiApplication::exec () at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/gui/kernel/qguiapplication.cpp:1926
1926 return QCoreApplication::exec();
#36 0x00007ffff7980930 in QApplication::exec () at /usr/src/debug/dev-qt/qtbase-6.7.2/qtbase-everywhere-src-6.7.2/src/widgets/kernel/qapplication.cpp:2555
2555 return QGuiApplication::exec();
#37 0x000055555557e615 in main (argc=<optimized out>, argv=<optimized out>) at /tmp/qterminal/src/main.cpp:219
219 int ret = app->exec();
(gdb) I used this patch: diff --git a/CMakeLists.txt b/CMakeLists.txt
index 9cf3516..3533c86 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -153,6 +153,7 @@ set_target_properties( ${QTERMWIDGET_LIBRARY_NAME} PROPERTIES
VERSION ${QTERMWIDGET_VERSION}
)
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fexceptions")
if(APPLE)
target_compile_definitions(${QTERMWIDGET_LIBRARY_NAME}
diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index c88ac12..5cfc653 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -224,9 +224,30 @@ void Emulation::receiveData(const char* text, int length)
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
+ qDebug() << "!!!!!!!!!! length:" << length;
+ qDebug() << "!!!!!!!!!! text:" << text;
QString str = QString::fromUtf8(text, length);
+ qDebug() << "!!!!!!!!!! str:" << str;
auto encoded = _fromUtf8(str);
- std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! encoded:" << encoded;
+ std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! unicodeText:" << QString::fromStdWString(unicodeText.c_str());
+ if (length >= 3 &&
+ (
+ /*(str[length - 3] == '\xEF' &&
+ str[length - 2] == '\xBF' &&
+ str[length - 1] == '\xBD')
+ ||*/
+ //(str[0]=='\xEF' && str[1]=='\xBF' && str[2]=='\xBD'))
+ //str.contains(QChar::ReplacementCharacter))
+ str[0]==QChar::ReplacementCharacter)
+ ) {
+
+ qDebug() << "Replacement character detected at the front of the input.";
+ throw std::runtime_error("Replacement character detected at the front of the input. Crashing as requested.");
+ // Option 2: Crash the program using assertion
+ Q_ASSERT(false); // This will crash the program with a debug assertion
+ }
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++) then ran this: #!/usr/bin/env -S -i PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:${PATH}" LANG="${LANG}" TERM='xterm-256color' DISPLAY="${DISPLAY}" HOME="${HOME}" bash --noprofile --norc
cp ./qterminal.ini ~/.config/foo_qterminal.conf
declare -x XDG_RUNTIME_DIR="/run/user/1000"
export QT_LOGGING_RULES="*=true"
#export QT_LOGGING_RULES="qt.*=true"
#qterminal -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt'
#LD_LIBRARY_PATH="/var/tmp/portage/x11-libs/qtermwidget-2.0.0/work/qtermwidget-2.0.0_build" qterminal -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt'
#LD_LIBRARY_PATH="/tmp/qtw1/usr/local/lib64" /tmp/qterminal/build/qterminal -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt'
#LD_LIBRARY_PATH="/tmp/qtw1" /tmp/qterminal/build/qterminal -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt'
#export XDG_RUNTIME_DIR="/run/user/1000"
#/tmp/qterminal/build/qterminal -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt'
#/tmp/qterminal/build/qterminal -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt'
#LD_LIBRARY_PATH="/tmp/qtw1/lib64" /tmp/qtw1/bin/qterminal -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt'
#LD_LIBRARY_PATH="/tmp/qtw2/lib64" /tmp/qtw2/bin/qterminal -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt'
echo "set environment LD_LIBRARY_PATH /tmp/qtw2/lib64" > /tmp/gdb_cmds.txt
echo "run -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt' > /tmp/foo7.log 2>&1 " >> /tmp/gdb_cmds.txt
gdb -x /tmp/gdb_cmds.txt /tmp/qtw2/bin/qterminal then held down git cloned ... chdir into it
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/tmp/qtw2 ..
make
make install A more robust patch which wouldn't trigger if the input has replacement chars, might be: diff --git a/CMakeLists.txt b/CMakeLists.txt
index 9cf3516..3533c86 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -153,6 +153,7 @@ set_target_properties( ${QTERMWIDGET_LIBRARY_NAME} PROPERTIES
VERSION ${QTERMWIDGET_VERSION}
)
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fexceptions")
if(APPLE)
target_compile_definitions(${QTERMWIDGET_LIBRARY_NAME}
diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index c88ac12..31cbdab 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -208,6 +208,23 @@ void Emulation::sendMouseEvent(int /*buttons*/, int /*column*/, int /*row*/, int
// default implementation does nothing
}
+int countReplacementCharactersInText(const char* text, int length) {
+ int count = 0;
+ for (int i = 0; i < length - 2; ++i) {
+ if (static_cast<unsigned char>(text[i]) == 0xEF &&
+ static_cast<unsigned char>(text[i + 1]) == 0xBF &&
+ static_cast<unsigned char>(text[i + 2]) == 0xBD) {
+ count++;
+ i += 2; // Skip the next two bytes
+ }
+ }
+ return count;
+}
+
+int countReplacementCharactersInString(const QString& str) {
+ return str.count(QChar::ReplacementCharacter);
+}
+
/*
We are doing code conversion from locale to unicode first.
TODO: Character composition from the old code. See #96536
@@ -224,9 +241,41 @@ void Emulation::receiveData(const char* text, int length)
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
+ qDebug() << "!!!!!!!!!! length:" << length;
+ qDebug() << "!!!!!!!!!! text:" << text;
QString str = QString::fromUtf8(text, length);
+ qDebug() << "!!!!!!!!!! str:" << str;
auto encoded = _fromUtf8(str);
- std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! encoded:" << encoded;
+ std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! unicodeText:" << QString::fromStdWString(unicodeText.c_str());
+// if (length >= 3 &&
+// (
+// /*(str[length - 3] == '\xEF' &&
+// str[length - 2] == '\xBF' &&
+// str[length - 1] == '\xBD')
+// ||*/
+// //(str[0]=='\xEF' && str[1]=='\xBF' && str[2]=='\xBD'))
+// //str.contains(QChar::ReplacementCharacter))
+// str[0]==QChar::ReplacementCharacter)
+// ) {
+//
+// qDebug() << "Replacement character detected at the front of the input.";
+// throw std::runtime_error("Replacement character detected at the front of the input. Crashing as requested.");
+// // Option 2: Crash the program using assertion
+// Q_ASSERT(false); // This will crash the program with a debug assertion
+// }
+ // Count replacement characters in 'text'
+ int textReplacementCount = countReplacementCharactersInText(text, length);
+ // Count replacement characters in 'str'
+ int strReplacementCount = countReplacementCharactersInString(str);
+
+ // Compare the counts
+ if (strReplacementCount > textReplacementCount) {
+ qDebug() << "More replacement characters found in 'str' than in 'text'.";
+ throw std::runtime_error("Something got replaced with replacement character. Crashing as requested.");
+ Q_ASSERT(false);//not reached
+ }
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++) So it would only trigger upon detecting replacements with replacement chars. Only tested on non-replacement chars input, though. |
This is the qt6 code where the replacement is made: I've patched out that 'if' and can confirm it's replacing the braille char with nothing on screen. So that's the code. But I think this might be solvable by qterminal/qtermwidget if someone figures out how that receiveData is called(I assume it's not called by qt6 but I may be wrong here, even though I did show the stacktrace above) and then maybe do a tentative I've no idea how this actually works, but definitely ensuring(somehow) that the send buffer is split only on utf8 char boundaries would prevent qt6 from spewing replacement chars. I'm guessing somehow 1.4.0 qt5 didn't replace those chars? else I can't explain how it worked before. EDIT: I followed it to that code (unless I made a mistake along the way), which is here: But if it did somehow keep internal state, as per the comment, it suggests it would recognize valid continuation(for example) if it was split in chunks like it is above in receiveData:
|
Wow I made a discovery, Ok I jumped the gun a little. What I meant to say is that using the same utf8 conversion like qtermwidget uses in version 2, even tho using qt5, the same replacement chars can be seen. In other words, this change is really the culprit: diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index 723f5d2..c88ac12 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -250,8 +224,9 @@ void Emulation::receiveData(const char* text, int length)
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
- QString utf16Text = _decoder->toUnicode(text,length);
- std::wstring unicodeText = utf16Text.toStdWString();
+ QString str = QString::fromUtf8(text, length);
+ auto encoded = _fromUtf8(str);
+ std::wstring unicodeText = encoded.data.toStdWString();
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++) because that This patch to test on 1.4.0: diff --git a/CMakeLists.txt b/CMakeLists.txt
index ce50ed0..1747931 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -153,6 +153,7 @@ set_target_properties( ${QTERMWIDGET_LIBRARY_NAME} PROPERTIES
VERSION ${QTERMWIDGET_VERSION}
)
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fexceptions")
if(APPLE)
target_compile_definitions(${QTERMWIDGET_LIBRARY_NAME}
diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index 723f5d2..19afe3a 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -38,6 +38,7 @@
#include <QThread>
#include <QTime>
+#include <QDebug>
// KDE
//#include <kdebug.h>
@@ -239,6 +240,23 @@ void Emulation::sendMouseEvent(int /*buttons*/, int /*column*/, int /*row*/, int
TODO: Character composition from the old code. See #96536
*/
+int countReplacementCharactersInText(const char* text, int length) {
+ int count = 0;
+ for (int i = 0; i < length - 2; ++i) {
+ if (static_cast<unsigned char>(text[i]) == 0xEF &&
+ static_cast<unsigned char>(text[i + 1]) == 0xBF &&
+ static_cast<unsigned char>(text[i + 2]) == 0xBD) {
+ count++;
+ i += 2; // Skip the next two bytes
+ }
+ }
+ return count;
+}
+
+int countReplacementCharactersInString(const QString& str) {
+ return str.count(QChar::ReplacementCharacter);
+}
+
void Emulation::receiveData(const char* text, int length)
{
emit stateSet(NOTIFYACTIVITY);
@@ -250,8 +268,24 @@ void Emulation::receiveData(const char* text, int length)
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
+ qDebug() << "!!!!!!!!!! length:" << length;
+ qDebug() << "!!!!!!!!!! text:" << text; //yes replacement chars here because it get converted to utf8
+ QString str = QString::fromUtf8(text, length);
+ qDebug() << "!!!!!!!!!! str:" << str; //yes replacement chars here
QString utf16Text = _decoder->toUnicode(text,length);
+ qDebug() << "!!!!!!!!!! utf16Text:" << utf16Text;//no repl. chars here, gets truncated before it
std::wstring unicodeText = utf16Text.toStdWString();
+ qDebug() << "!!!!!!!!!! unicodeText:" << QString::fromStdWString(unicodeText.c_str());//none here too
+ // Count replacement characters in 'text'
+ int textReplacementCount = countReplacementCharactersInText(text, length);
+ // Count replacement characters in 'str'
+ int strReplacementCount = countReplacementCharactersInString(str);
+
+ // Compare the counts
+ if (strReplacementCount > textReplacementCount) {
+ qDebug() << "More replacement characters found in 'str' than in 'text'.";
+ throw std::runtime_error("Something got replaced with replacement character. Crashing as requested.");
+ }
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++) in 1.4.0 it is ignoring the incomplete utf8 char at the end, and on next chunk it's somehow1 able to see the correct whole braille char, as if it remembered that it ignored the first byte of it and considered it when processing the first 2 bytes of the next chunk which are the last 2 bytes of the braille char. I think maybe that Footnotes |
Here I have a proof-of-concept patch that does what I suspect It's very crappy patch, but as a p.o.c. it should do, to show how 2.0.0 can be made to work, if we keep track of the state just like I presume diff --git a/CMakeLists.txt b/CMakeLists.txt
index 9cf3516..3533c86 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -153,6 +153,7 @@ set_target_properties( ${QTERMWIDGET_LIBRARY_NAME} PROPERTIES
VERSION ${QTERMWIDGET_VERSION}
)
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fexceptions")
if(APPLE)
target_compile_definitions(${QTERMWIDGET_LIBRARY_NAME}
diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index c88ac12..ed9c7f1 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -25,6 +25,7 @@
// System
#include <cstdio>
#include <cstdlib>
+#include <cstring> // for memcpy
#include <unistd.h>
#include <string>
@@ -208,6 +209,23 @@ void Emulation::sendMouseEvent(int /*buttons*/, int /*column*/, int /*row*/, int
// default implementation does nothing
}
+int countReplacementCharactersInText(const char* text, int length) {
+ int count = 0;
+ for (int i = 0; i < length - 2; ++i) {
+ if (static_cast<unsigned char>(text[i]) == 0xEF &&
+ static_cast<unsigned char>(text[i + 1]) == 0xBF &&
+ static_cast<unsigned char>(text[i + 2]) == 0xBD) {
+ count++;
+ i += 2; // Skip the next two bytes
+ }
+ }
+ return count;
+}
+
+int countReplacementCharactersInString(const QString& str) {
+ return str.count(QChar::ReplacementCharacter);
+}
+
/*
We are doing code conversion from locale to unicode first.
TODO: Character composition from the old code. See #96536
@@ -219,14 +237,105 @@ void Emulation::receiveData(const char* text, int length)
bufferedUpdate();
- /* XXX: the following code involves encoding & decoding of "UTF-16
- * surrogate pairs", which does not work with characters higher than
- * U+10FFFF
- * https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
- */
+// /* XXX: the following code involves encoding & decoding of "UTF-16
+// * surrogate pairs", which does not work with characters higher than
+// * U+10FFFF
+// * https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
+// */
+// qDebug() << "!!!!!!!!!! length:" << length;
+// qDebug() << "!!!!!!!!!! text:" << text;
+// QString str = QString::fromUtf8(text, length);
+// //QString str = fromUtf8(text, length);
+// qDebug() << "!!!!!!!!!! str:" << str;
+// auto encoded = _fromUtf8(str);
+// qDebug() << "!!!!!!!!!! encoded:" << encoded;
+// std::wstring unicodeText = encoded.data.toStdWString();
+// qDebug() << "!!!!!!!!!! unicodeText:" << QString::fromStdWString(unicodeText.c_str());
+//// if (length >= 3 &&
+//// (
+//// /*(str[length - 3] == '\xEF' &&
+//// str[length - 2] == '\xBF' &&
+//// str[length - 1] == '\xBD')
+//// ||*/
+//// //(str[0]=='\xEF' && str[1]=='\xBF' && str[2]=='\xBD'))
+//// //str.contains(QChar::ReplacementCharacter))
+//// str[0]==QChar::ReplacementCharacter)
+//// ) {
+////
+//// qDebug() << "Replacement character detected at the front of the input.";
+//// throw std::runtime_error("Replacement character detected at the front of the input. Crashing as requested.");
+//// // Option 2: Crash the program using assertion
+//// Q_ASSERT(false); // This will crash the program with a debug assertion
+//// }
+// // Count replacement characters in 'text'
+// int textReplacementCount = countReplacementCharactersInText(text, length);
+// // Count replacement characters in 'str'
+// int strReplacementCount = countReplacementCharactersInString(str);
+//
+// // Compare the counts
+// if (strReplacementCount > textReplacementCount) {
+// qDebug() << "More replacement characters found in 'str' than in 'text'.";
+// throw std::runtime_error("Something got replaced with replacement character. Crashing as requested.");
+// Q_ASSERT(false);//not reached
+// }
+ qDebug() << "!!!!!!!!!! length:" << length;
+ qDebug() << "!!!!!!!!!! text:" << text; //yes replacement chars here because it get converted to utf8
QString str = QString::fromUtf8(text, length);
- auto encoded = _fromUtf8(str);
- std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! str:" << str; //yes replacement chars here
+ // Count replacement characters in 'text'
+ int textReplacementCount = countReplacementCharactersInText(text, length);
+ // Count replacement characters in 'str'
+ int strReplacementCount = countReplacementCharactersInString(str);
+
+ //char combinedBuffer[3 + length];
+ //std::vector<char> combinedBuffer(_last_max_three_bytes_of_text.size() + length);
+
+ // do we have any new invalid utf8 bytes at the end of this new chunk?
+ int the_diff=strReplacementCount-textReplacementCount;
+ qDebug() << "!!!!!!!!!! the_diff:" << the_diff;
+ qDebug() << "!!!!!!!!!! _how_many_of_last_max_three_are_in:" << _how_many_of_last_max_three_are_in;
+
+
+
+ std::vector<uint8_t> combinedBuffer(_how_many_of_last_max_three_are_in + length);
+ //"allocates memory on the heap, but it is automatically deallocated when the vector goes out of scope, such as when the function exits"
+ int combinedLength = 0;
+ if (_how_many_of_last_max_three_are_in > 0) {
+ //copy last state bytes from the end of a prev. chunk which woulda been invalid utf8
+ //copy to the beginning of this new chunk
+ std::memcpy(combinedBuffer.data(), _last_max_three_bytes_of_text.data(), _how_many_of_last_max_three_are_in);
+ combinedLength+=_how_many_of_last_max_three_are_in;
+ }
+
+ // Compare the counts
+ if (the_diff>0) {
+ qDebug() << "!!!!!!!!!! More replacement characters("<<the_diff<<") found in 'str' than in 'text'.";
+ //throw std::runtime_error("Something got replaced with replacement character. Crashing as requested.");
+ }
+
+ // Copy new text to the combined buffer
+ // but not any invalid utf8 bytes from the end, if any.
+ std::memcpy(combinedBuffer.data() + combinedLength, text, length-the_diff);
+ combinedLength += length-the_diff;
+
+ // Try to convert the combined buffer to UTF-8
+ QString str2 = QString::fromUtf8(reinterpret_cast<const char*>(combinedBuffer.data()), combinedLength);
+ qDebug() << "!!!!!!!!!! resultAftermods:" << str2;
+ auto encoded = _fromUtf8(str2);
+ qDebug() << "!!!!!!!!!! encoded:" << encoded;
+ std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! unicodeText:" << QString::fromStdWString(unicodeText.c_str());
+// //QString utf16Text = _decoder->toUnicode(text,length-the_diff);
+// QString utf16Text = _decoder->toUnicode(reinterpret_cast<const char*>(combinedBuffer.data()),combinedLength);
+// qDebug() << "!!!!!!!!!! utf16Text:" << utf16Text;
+
+ // Store the last up to 3 bytes of invalid UTF-8 from the end of the text
+ if (the_diff > 0) {
+ _how_many_of_last_max_three_are_in = the_diff;
+ std::memcpy(_last_max_three_bytes_of_text.data(), text + length - the_diff, the_diff);
+ } else {
+ _how_many_of_last_max_three_are_in = 0;
+ }
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++)
diff --git a/lib/Emulation.h b/lib/Emulation.h
index 98ba2fe..27edaaf 100644
--- a/lib/Emulation.h
+++ b/lib/Emulation.h
@@ -470,6 +470,8 @@ protected:
const KeyboardTranslator* _keyTranslator; // the keyboard layout
+ std::array<uint8_t, 3> _last_max_three_bytes_of_text = {0};
+ int _how_many_of_last_max_three_are_in = 0;
protected slots:
/** same patch but for 1.4.0 (not needed as it works already without it, doh, but just to show that it works for 1.4.0 too, even if diff --git a/CMakeLists.txt b/CMakeLists.txt
index ce50ed0..1747931 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -153,6 +153,7 @@ set_target_properties( ${QTERMWIDGET_LIBRARY_NAME} PROPERTIES
VERSION ${QTERMWIDGET_VERSION}
)
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fexceptions")
if(APPLE)
target_compile_definitions(${QTERMWIDGET_LIBRARY_NAME}
diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index 723f5d2..8be0797 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -25,6 +25,7 @@
// System
#include <cstdio>
#include <cstdlib>
+#include <cstring> // for memcpy
#include <unistd.h>
#include <string>
@@ -38,6 +39,7 @@
#include <QThread>
#include <QTime>
+#include <QDebug>
// KDE
//#include <kdebug.h>
@@ -239,6 +241,23 @@ void Emulation::sendMouseEvent(int /*buttons*/, int /*column*/, int /*row*/, int
TODO: Character composition from the old code. See #96536
*/
+int countReplacementCharactersInText(const char* text, int length) {
+ int count = 0;
+ for (int i = 0; i < length - 2; ++i) {
+ if (static_cast<unsigned char>(text[i]) == 0xEF &&
+ static_cast<unsigned char>(text[i + 1]) == 0xBF &&
+ static_cast<unsigned char>(text[i + 2]) == 0xBD) {
+ count++;
+ i += 2; // Skip the next two bytes
+ }
+ }
+ return count;
+}
+
+int countReplacementCharactersInString(const QString& str) {
+ return str.count(QChar::ReplacementCharacter);
+}
+
void Emulation::receiveData(const char* text, int length)
{
emit stateSet(NOTIFYACTIVITY);
@@ -250,8 +269,79 @@ void Emulation::receiveData(const char* text, int length)
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
- QString utf16Text = _decoder->toUnicode(text,length);
+ qDebug() << "!!!!!!!!!! length:" << length;
+ qDebug() << "!!!!!!!!!! text:" << text; //yes replacement chars here because it get converted to utf8
+ QString str = QString::fromUtf8(text, length);
+ qDebug() << "!!!!!!!!!! str:" << str; //yes replacement chars here
+ // Count replacement characters in 'text'
+ int textReplacementCount = countReplacementCharactersInText(text, length);
+ // Count replacement characters in 'str'
+ int strReplacementCount = countReplacementCharactersInString(str);
+
+ //char combinedBuffer[3 + length];
+ //std::vector<char> combinedBuffer(_last_max_three_bytes_of_text.size() + length);
+
+ // do we have any new invalid utf8 bytes at the end of this new chunk?
+ int the_diff=strReplacementCount-textReplacementCount;
+ qDebug() << "!!!!!!!!!! the_diff:" << the_diff;
+ qDebug() << "!!!!!!!!!! _how_many_of_last_max_three_are_in:" << _how_many_of_last_max_three_are_in;
+ //qDebug() << "!!!!!!!!!! _last_max_three_bytes_of_text:" << _last_max_three_bytes_of_text;
+// qDebug() << "!!!!!!!!!! _last_max_three_bytes_of_text:";
+// for (const auto& byte : _last_max_three_bytes_of_text) {
+// qDebug() << QString::number(byte, 16).rightJustified(2, '0'); // Print each byte as a 2-digit hexadecimal number
+// }
+// qDebug() << "!!!!!!!!!! _last_max_three_bytes_of_text:";
+// for (const auto& byte : _last_max_three_bytes_of_text) {
+// qDebug() << QString::number(static_cast<unsigned char>(byte), 16).rightJustified(2, '0');
+// }
+// qDebug() << "!!!!!!!!!! _last_max_three_bytes_of_text:";
+// for (const auto& byte : _last_max_three_bytes_of_text) {
+// qDebug().noquote() << QString::number(static_cast<unsigned char>(byte), 16).rightJustified(2, '0');
+// }
+// qDebug() << "!!!!!!!!!! _last_max_three_bytes_of_text:";
+// for (const auto& byte : _last_max_three_bytes_of_text) {
+// qDebug().noquote() << QString("%1").arg(static_cast<unsigned char>(byte), 2, 16, QChar('0'));
+// }
+
+
+
+ std::vector<uint8_t> combinedBuffer(_how_many_of_last_max_three_are_in + length);
+ //"allocates memory on the heap, but it is automatically deallocated when the vector goes out of scope, such as when the function exits"
+ int combinedLength = 0;
+ if (_how_many_of_last_max_three_are_in > 0) {
+ //copy last state bytes from the end of a prev. chunk which woulda been invalid utf8
+ //copy to the beginning of this new chunk
+ std::memcpy(combinedBuffer.data(), _last_max_three_bytes_of_text.data(), _how_many_of_last_max_three_are_in);
+ combinedLength+=_how_many_of_last_max_three_are_in;
+ }
+
+ // Compare the counts
+ if (the_diff>0) {
+ qDebug() << "!!!!!!!!!! More replacement characters("<<the_diff<<") found in 'str' than in 'text'.";
+ //throw std::runtime_error("Something got replaced with replacement character. Crashing as requested.");
+ }
+
+ // Copy new text to the combined buffer
+ // but not any invalid utf8 bytes from the end, if any.
+ std::memcpy(combinedBuffer.data() + combinedLength, text, length-the_diff);
+ combinedLength += length-the_diff;
+
+ // Try to convert the combined buffer to UTF-8
+ QString str2 = QString::fromUtf8(reinterpret_cast<const char*>(combinedBuffer.data()), combinedLength);
+ qDebug() << "!!!!!!!!!! resultAftermods:" << str2;
+ //QString utf16Text = _decoder->toUnicode(text,length-the_diff);
+ QString utf16Text = _decoder->toUnicode(reinterpret_cast<const char*>(combinedBuffer.data()),combinedLength);
+ qDebug() << "!!!!!!!!!! utf16Text:" << utf16Text;//no repl. chars here, gets truncated before it
std::wstring unicodeText = utf16Text.toStdWString();
+ qDebug() << "!!!!!!!!!! unicodeText:" << QString::fromStdWString(unicodeText.c_str());//none here too
+
+ // Store the last up to 3 bytes of invalid UTF-8 from the end of the text
+ if (the_diff > 0) {
+ _how_many_of_last_max_three_are_in = the_diff;
+ std::memcpy(_last_max_three_bytes_of_text.data(), text + length - the_diff, the_diff);
+ } else {
+ _how_many_of_last_max_three_are_in = 0;
+ }
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++)
diff --git a/lib/Emulation.h b/lib/Emulation.h
index 7d2263b..b0e4f37 100644
--- a/lib/Emulation.h
+++ b/lib/Emulation.h
@@ -488,6 +488,8 @@ protected:
const QTextCodec* _codec;
QTextDecoder* _decoder;
const KeyboardTranslator* _keyTranslator; // the keyboard layout
+ std::array<uint8_t, 3> _last_max_three_bytes_of_text = {0};
+ int _how_many_of_last_max_three_are_in = 0;
protected slots:
/** So to apply that, on qtermwidget:
Then same build steps(minus the patch) for qterminal itself, (or use system one? then mod the path below), and then: I run it like: #!/usr/bin/env -S -i PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:${PATH}" LANG="${LANG}" TERM='xterm-256color' DISPLAY="${DISPLAY}" HOME="${HOME}" bash --noprofile --norc
cp ./qterminal.ini ~/.config/foo_qterminal.conf
declare -x XDG_RUNTIME_DIR="/run/user/1000"
export QT_LOGGING_RULES="*=true"
echo "set environment LD_LIBRARY_PATH /tmp/qtw2/lib64" > /tmp/gdb_cmds.txt
echo "run -p foo_qterminal -e 'less -R --shift 5 -- ./less.txt' > /tmp/foo7.log 2>&1 " >> /tmp/gdb_cmds.txt
gdb -x /tmp/gdb_cmds.txt /tmp/qtw2/bin/qterminal Then press and hold |
This worked here (based on https://doc.qt.io/qt-6/qstringdecoder.html): diff -ruNp qtermwidget-orig/lib/Emulation.cpp qtermwidget/lib/Emulation.cpp
--- qtermwidget-orig/lib/Emulation.cpp
+++ qtermwidget/lib/Emulation.cpp
@@ -52,7 +52,7 @@ Emulation::Emulation() :
_keyTranslator(nullptr),
_usesMouse(false),
_bracketedPasteMode(false),
- _fromUtf8(QStringEncoder::Utf16)
+ _toUtf16(QStringEncoder::Utf8)
{
// create screens with a default size
_screen[0] = new Screen(40,80);
@@ -224,9 +224,9 @@ void Emulation::receiveData(const char*
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
- QString str = QString::fromUtf8(text, length);
- auto encoded = _fromUtf8(str);
- std::wstring unicodeText = encoded.data.toStdWString();
+ QByteArray ba(text, length);
+ QString str = _toUtf16(ba);
+ std::wstring unicodeText = str.toStdWString();
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++)
diff -ruNp qtermwidget-orig/lib/Emulation.h qtermwidget/lib/Emulation.h
--- qtermwidget-orig/lib/Emulation.h
+++ qtermwidget/lib/Emulation.h
@@ -494,7 +494,7 @@ private:
bool _bracketedPasteMode;
QTimer _bulkTimer1{this};
QTimer _bulkTimer2{this};
- QStringEncoder _fromUtf8;
+ QStringDecoder _toUtf16;
};
}
Please test it. If I'm not mistaken, the logic used in the Qt6 port was incorrect. |
The patch works indeed! (EDIT: real world This is really cool patch! Cheers! I don't know why it didn't even cross my mind to check online for documentation, as you pointed out in another place modified version of the above patch, that I used to check that it does indeed work: diff --git a/CMakeLists.txt b/CMakeLists.txt
index 9cf3516..3533c86 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -153,6 +153,7 @@ set_target_properties( ${QTERMWIDGET_LIBRARY_NAME} PROPERTIES
VERSION ${QTERMWIDGET_VERSION}
)
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fexceptions")
if(APPLE)
target_compile_definitions(${QTERMWIDGET_LIBRARY_NAME}
diff --git a/lib/Emulation.cpp b/lib/Emulation.cpp
index c88ac12..dee470b 100644
--- a/lib/Emulation.cpp
+++ b/lib/Emulation.cpp
@@ -52,7 +52,7 @@ Emulation::Emulation() :
_keyTranslator(nullptr),
_usesMouse(false),
_bracketedPasteMode(false),
- _fromUtf8(QStringEncoder::Utf16)
+ _toUtf16(QStringEncoder::Utf8)
{
// create screens with a default size
_screen[0] = new Screen(40,80);
@@ -213,6 +213,23 @@ void Emulation::sendMouseEvent(int /*buttons*/, int /*column*/, int /*row*/, int
TODO: Character composition from the old code. See #96536
*/
+int countReplacementCharactersInText(const char* text, int length) {
+ int count = 0;
+ for (int i = 0; i < length - 2; ++i) {
+ if (static_cast<unsigned char>(text[i]) == 0xEF &&
+ static_cast<unsigned char>(text[i + 1]) == 0xBF &&
+ static_cast<unsigned char>(text[i + 2]) == 0xBD) {
+ count++;
+ i += 2; // Skip the next two bytes
+ }
+ }
+ return count;
+}
+
+int countReplacementCharactersInString(const QString& str) {
+ return str.count(QChar::ReplacementCharacter);
+}
+
void Emulation::receiveData(const char* text, int length)
{
emit stateSet(NOTIFYACTIVITY);
@@ -224,9 +241,29 @@ void Emulation::receiveData(const char* text, int length)
* U+10FFFF
* https://unicodebook.readthedocs.io/unicode_encodings.html#surrogates
*/
+ qDebug() << "!!!!!!!!!! length:" << length;
+ qDebug() << "!!!!!!!!!! text:" << text; //yes replacement chars here because it get converted to utf8
QString str = QString::fromUtf8(text, length);
- auto encoded = _fromUtf8(str);
- std::wstring unicodeText = encoded.data.toStdWString();
+ qDebug() << "!!!!!!!!!! str:" << str; //yes replacement chars here
+ // Count replacement characters in 'text'
+ int textReplacementCount = countReplacementCharactersInText(text, length);
+ // Count replacement characters in 'str'
+ int strReplacementCount = countReplacementCharactersInString(str);
+ int the_diff=strReplacementCount-textReplacementCount;
+ qDebug() << "!!!!!!!!!! the_diff:" << the_diff;
+ // Compare the counts
+
+
+
+ QByteArray ba(text, length);
+ QString str2 = _toUtf16(ba);
+ qDebug() << "!!!!!!!!!! str2:" << str2;
+ std::wstring unicodeText = str2.toStdWString();
+ qDebug() << "!!!!!!!!!! unicodeText:" << unicodeText.c_str();
+ if (the_diff>0) {
+ qDebug() << "!!!!!!!!!! More replacement characters("<<the_diff<<") found in 'str' than in 'text'.";
+ //throw std::runtime_error("Something got replaced with replacement character. Crashing as requested.");
+ }
//send characters to terminal emulator
for (size_t i=0;i<unicodeText.length();i++)
diff --git a/lib/Emulation.h b/lib/Emulation.h
index 98ba2fe..ad0c513 100644
--- a/lib/Emulation.h
+++ b/lib/Emulation.h
@@ -494,7 +494,7 @@ private:
bool _bracketedPasteMode;
QTimer _bulkTimer1{this};
QTimer _bulkTimer2{this};
- QStringEncoder _fromUtf8;
+ QStringDecoder _toUtf16;
};
} messed up the |
Actually, this part of Qt's documentation is exceptionally obscure and needs to be rewritten (I think I've seen a bug report about the doc), perhaps because it's new. If it wasn't for a few examples in it, I wouldn't understand anything. Since your experience in dealing with characters is more than mine, please test the patch under various circumstances. I'll make a PR and release 2.0.1 after merging it. But first I should check another part of the code, where the new Qt6 encoding/decoding methods are also used. Thanks a lot for the report and also for the discussion! |
Fixes lxqt/qterminal#1143 and related issues.
Fixes lxqt/qterminal#1143 and related issues.
This comment was marked as off-topic.
This comment was marked as off-topic.
Fixes lxqt/qterminal#1143 and related issues.
EDIT2: fixed in qterminal 2.0.1
EDIT: jump to working/tested patch by tsujan here below.
Expected Behavior
this didn't happen in qterminal 1.4.0 or so, iirc
so, not randomly replacing chars with
�
sounds like the expected behavior.Current Behavior
a braille character like
⠿
:is being shown, sometimes, as
���
note how it's made out of
UTF-8: e2 a0 bf
, this must be why there's 3 replacement character(s)�
:Possible Solution
I don't know yet, didn't look in the code. Took a while to find reproduction steps.
Steps to Reproduce (for bugs)
Note that, apparently the
cp
line isn't necessary1.You'll need
less.txt
file which I've generated fromnvim
viascript
command... well it doesn't matter I guess:less.txt
qterminal.ini.txt
(^ you'll have to rename this to remove the .txt extension, couldn't find the .ini to upload here without make it a .txt)
put both files in current dir before running the above bash script from current dir.
less
command open in a newqterminal
of 172x54 size(but size doesn't matter apparently, not withless
), you can now proceed as follows:Here's before step 3 and after step 3 is done:
Context
btop
gets messed up visually, but only inqterminal
, not inside sayalacritty
.System Information
LANG=en_US.utf8
$ less --version
less 643 (PCRE2 regular expressions)
chars
command/executable is https://github.com/antifuchs/charsFootnotes
so it would work with empty factory-generated profile(on my Gentoo system, but who knows if I've some patches or what not), so I see it work even in a small-sized qterminal the same, but when I tried it with
nvim
this was definitely not at all reproducible with any other sizes - seems to work now withless
. ↩The text was updated successfully, but these errors were encountered: