StringIndexOutOfBoundsException when quoted space/s are present in first row with parse option Ignore trailing whitespaces as true and Ignore leading whitespaces as false #542

Open
vikashah opened this issue Jun 14, 2024 · 1 comment

Comments

@vikashah

Parsing fails with com.univocity.parsers.common.TextParsingException caused by java.lang.StringIndexOutOfBoundsException, thrown at https://github.com/uniVocity/univocity-parsers/blob/master/src/main/java/com/univocity/parsers/common/ArgumentUtils.java#L601, with the following parser configuration:

Parser Configuration: CsvParserSettings:
Auto configuration enabled=true
Auto-closing enabled=true
Autodetect column delimiter=false
Autodetect quotes=false
Column reordering enabled=true
Delimiters for detection=null
Empty value=null
Escape unquoted values=false
Header extraction enabled=null
Headers=null
Ignore leading whitespaces=false
Ignore leading whitespaces in quotes=false
Ignore trailing whitespaces=true
Ignore trailing whitespaces in quotes=false
Input buffer size=1048576
Input reading on separate thread=true
Keep escape sequences=false
Keep quotes=false
Length of content displayed on error=-1
Line separator detection enabled=false
Maximum number of characters per column=4096
Maximum number of columns=512
Normalize escaped line separators=true
Null value=null
Number of records to read=all
Processor=none
Restricting data in exceptions=false
RowProcessor error handler=null
Selected fields=none
Skip bits as whitespace=true
Skip empty lines=true
Unescaped quote handling=null
Format configuration:
CsvFormat:
Comment character=#
Field delimiter=,
Line separator (normalized)=\n
Line separator sequence=\n
Quote character="
Quote escape character="
Quote escape escape character=null

The key points of this scenario are Ignore leading whitespaces=false and Ignore trailing whitespaces=true, combined with an input CSV file whose first line contains quoted spaces. The problem is reproducible with a CSV file like the one below:
" "

The while loop at https://github.com/uniVocity/univocity-parsers/blob/master/src/main/java/com/univocity/parsers/common/ArgumentUtils.java#L601 decrements the index and then accesses the character at that index without a bounds check. When the value contains only whitespace, the index drops below 0 and the access throws a StringIndexOutOfBoundsException.
Proposed fix: change the while condition to:
while (right && end >= 0 && input.charAt(end) <= ' ')
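To illustrate why the extra `end >= 0` check matters, here is a minimal, self-contained sketch of a trim helper that uses the proposed loop condition. It is not the library's actual code: the class name `TrimSketch`, the method signature, and everything outside the guarded while condition are assumptions made for illustration only.

```java
public final class TrimSketch {

    // Hypothetical stand-in for the trimming helper discussed above.
    // Only the guarded while condition is taken from the proposed fix.
    static String trim(String input, boolean left, boolean right) {
        if (input.isEmpty() || (!left && !right)) {
            return input;
        }

        // Skip leading whitespace only when requested.
        int begin = 0;
        while (left && begin < input.length() && input.charAt(begin) <= ' ') {
            begin++;
        }
        if (begin == input.length()) {
            return "";
        }

        // Walk back over trailing whitespace. Without `end >= 0`, an
        // all-whitespace input with left == false would call charAt(-1).
        int end = input.length() - 1;
        while (right && end >= 0 && input.charAt(end) <= ' ') {
            end--;
        }

        // If end is -1 here, begin is 0 and substring(0, 0) is the empty string.
        return input.substring(begin, end + 1);
    }

    public static void main(String[] args) {
        // Mirrors the reported case: a single space, trailing trim only.
        System.out.println("[" + trim(" ", false, true) + "]");
        System.out.println("[" + trim("  a b  ", false, true) + "]");
        System.out.println("[" + trim("  a b  ", true, true) + "]");
    }
}
```

In this sketch, the first call corresponds to the quoted single space from the sample file and returns an empty string instead of throwing.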

@UltraCharge99

Please see #534
