The RCP task (exactly-once) frequently reports an error saying that a given label has no node. Code analysis shows that the error is thrown by the getNodeId method:
private Long getNodeId(TransactionOperation txnOperation, String label) throws UserException {
    Long nodeId;
    // Save label->be hashmap when beginning a transaction, so that subsequent operators can send to the same BE
    if (TXN_BEGIN.equals(txnOperation)) {
        Long chosenNodeId = GlobalStateMgr.getCurrentState().getNodeMgr()
                .getClusterInfo().getNodeSelector().seqChooseBackendOrComputeId();
        nodeId = chosenNodeId;
        // txnNodeMap is an LRU cache, which atomically removes unused entries
        accessTxnNodeMapWithWriteLock(txnNodeMap -> txnNodeMap.put(label, chosenNodeId));
    } else {
        nodeId = accessTxnNodeMapWithReadLock(txnNodeMap -> txnNodeMap.get(label));
    }
    if (nodeId == null) {
        throw new UserException(String.format(
                "Transaction with op[%s] and label[%s] has no node.", txnOperation.getValue(), label));
    }
    return nodeId;
}
When nodeId == null, printing txnNodeMap shows that entries are missing and the data is incomplete. The cause is that txnNodeMap is a LinkedHashMap created with accessOrder=true, so every get() moves the accessed entry to the end of the map's internal linked list. A lookup is therefore a structural modification, not a pure read. accessTxnNodeMapWithReadLock takes only a read lock, which is shared, so several threads can run get() at the same time and can corrupt the map's internal links.
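The reordering described above can be seen without any threads. The following is a minimal sketch, not StarRocks code: the class name AccessOrderDemo and the label keys are made up for illustration. It shows that a single get() on an access-ordered LinkedHashMap changes the iteration order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of StarRocks.
class AccessOrderDemo {
    // Returns the key order of an access-ordered map after one get().
    static List<String> keysAfterGet() {
        // Third constructor argument true = accessOrder, as in the reported txnNodeMap.
        Map<String, Long> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("label-a", 1L);
        map.put("label-b", 2L);
        map.put("label-c", 3L);
        // Only a lookup, yet it relinks "label-a" to the tail of the internal list.
        map.get("label-a");
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet()); // [label-b, label-c, label-a]
    }
}
```

Because the relinking writes to shared fields, two threads doing this concurrently under a shared read lock have no mutual exclusion between them.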
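The proposed fix is to wrap txnNodeMap in Collections.synchronizedMap, so that every access, including get(), is serialized. The declaration of txnNodeMap should be as follows: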
private final Map<String, Long> txnNodeMap = Collections.synchronizedMap(new LinkedHashMap<>
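The declaration above is cut off in the report. As a hedged illustration of the general approach, and not the actual StarRocks declaration, the sketch below wraps an access-ordered LinkedHashMap in Collections.synchronizedMap. The class name SynchronizedLruSketch, the capacity MAX_ENTRIES, and the removeEldestEntry policy are assumptions made for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the StarRocks implementation.
class SynchronizedLruSketch {
    // Assumed capacity, chosen only to make the eviction visible.
    static final int MAX_ENTRIES = 2;

    // Every call on the returned map, including get(), runs under one mutex.
    static Map<String, Long> newTxnNodeMap() {
        return Collections.synchronizedMap(
                new LinkedHashMap<String, Long>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                        // Evict the least recently used entry once the cap is exceeded.
                        return size() > MAX_ENTRIES;
                    }
                });
    }

    public static void main(String[] args) {
        Map<String, Long> txnNodeMap = newTxnNodeMap();
        txnNodeMap.put("label-a", 1L);
        txnNodeMap.put("label-b", 2L);
        txnNodeMap.get("label-a");     // marks label-a as most recently used
        txnNodeMap.put("label-c", 3L); // evicts label-b, the least recently used
        System.out.println(txnNodeMap.keySet()); // [label-a, label-c]
    }
}
```

With this wrapper the separate read/write lock helpers would no longer be needed for single calls, but iterating over the map's views still requires synchronizing on the map manually.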
The same pattern appears in OpenJDK (see the REF link): a LinkedHashMap field is created with accessOrder=true.
private final LinkedHashMap<PixelsKey, ImageSoftReference> map
= new LinkedHashMap<>(16, 0.75f, true);
Access to it is guarded by the read lock of a ReentrantReadWriteLock.
public Image getImage(final PixelsKey key) {
    final ImageSoftReference ref;
    lock.readLock().lock();
    try {
        ref = map.get(key);
    } finally {
        lock.readLock().unlock();
    }
    return ref == null ? null : ref.get();
}
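But there is a catch: with accessOrder=true, LinkedHashMap.get is itself a structural modification, because it moves the accessed entry to the end of the internal linked list. It is therefore not safe to call under a shared read lock.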
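One way to keep the read/write lock and still be safe is to take the exclusive lock for get() as well. The sketch below is an assumption-laden illustration, not the OpenJDK or StarRocks fix: the class ExclusiveGetCache and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical cache wrapper, used only to illustrate the locking choice.
class ExclusiveGetCache<K, V> {
    private final Map<K, V> map = new LinkedHashMap<>(16, 0.75f, true);
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    V get(K key) {
        // get() reorders entries when accessOrder=true, so take the exclusive lock.
        lock.writeLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.writeLock().unlock();
        }
    }

    void put(K key, V value) {
        lock.writeLock().lock();
        try {
            map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        // size() does not touch the access order, so the shared lock is enough.
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ExclusiveGetCache<String, Long> cache = new ExclusiveGetCache<>();
        cache.put("label-a", 1L);
        System.out.println(cache.get("label-a")); // 1
    }
}
```

Since lookups now serialize, the read lock only helps operations that never reorder entries. If most calls are get(), a plain mutex or a synchronized map is simpler and behaves the same.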
StarRocks version (Required)
3.3.7
REF: https://mail.openjdk.org/pipermail/client-libs-dev/2024-October/023429.html