
[BUG] Cannot read excel files using the V2 API #896

Open
2 tasks done
massazan opened this issue Oct 7, 2024 · 5 comments

Comments


massazan commented Oct 7, 2024

Am I using the newest version of the library?

  • I have made sure that I'm using the latest version of the library.

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

When using the V2 API with version 0.20.4, the following error occurs: ClassCastException: scala.Some cannot be cast to [Lorg.apache.spark.sql.catalyst.InternalRow;
The error occurs when you omit the end boundary cell in the dataAddress parameter, e.g. "'0'!A5".

The error occurs for both Scala and PySpark.

Expected Behavior

The Spark DataFrameReader should return a DataFrame without errors.

Steps To Reproduce

The error occurs when you omit the end boundary cell in the dataAddress parameter, e.g. "'0'!A5":

```scala
val configs = Map(
  "inferSchema" -> "false",
  "dataAddress" -> "'0'!A5",
  "header" -> "false"
)

// Ensure you're using the spark-excel package
val df = spark.read.format("excel")
  .option("header", configs("header"))
  .option("inferSchema", configs("inferSchema"))
  .option("dataAddress", configs("dataAddress"))
  .load(s3_path)

df.show()
```
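As a point of comparison (not part of the original report): since the failure is described as triggered by the open-ended address, a read with an explicit end boundary in dataAddress should avoid that code path. A minimal sketch, assuming the same options as above; the sheet name and the A5:J1000 bounds are hypothetical placeholders:

```scala
// Hypothetical workaround sketch: supply an explicit end cell so the
// V2 reader is given a closed range instead of an open-ended one.
// "'0'!A5:J1000" is a placeholder; pick bounds that cover your data.
val dfBounded = spark.read.format("excel")
  .option("header", "false")
  .option("inferSchema", "false")
  .option("dataAddress", "'0'!A5:J1000") // explicit end boundary
  .load(s3_path)

dfBounded.show()
```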

Environment

- Spark version: DataBricks Runtime version: 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12)
- Spark-Excel version: com.crealytics:spark-excel_2.12:3.4.1_0.20.4
- OS:
- Cluster environment

Anything else?

API V1 works fine.
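For reference, the V1 reader is selected via the long format name com.crealytics.spark.excel rather than the short "excel" alias used by the V2 API. A minimal sketch of the V1 read with the same options as the repro above (s3_path as in the original snippet):

```scala
// V1 API sketch: same options, but the fully-qualified format name
// routes through the V1 data source implementation.
val dfV1 = spark.read
  .format("com.crealytics.spark.excel")
  .option("header", "false")
  .option("inferSchema", "false")
  .option("dataAddress", "'0'!A5")
  .load(s3_path)
```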


github-actions bot commented Oct 7, 2024

Please check these potential duplicates:

@nightscape
Owner

@massazan looks like this one: #808

@massazan
Author

massazan commented Oct 9, 2024

Hi @nightscape, yes, it is the same issue. I tried to install the artifact 3.4.2 as mentioned, but I still got problems with the Databricks Runtime 13.3. I tried Runtime 14.3 LTS and it works.
Are there any plans to solve the problem for Runtime 13.3?

Thanks

@sramesh-nlg

Hi. I am also facing the same issue when trying to read an Excel file from Azure ADLS storage.
Error message:
java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.v2.FileDataSourceV2.getPaths$(Lorg/apache/spark/sql/execution/datasources/v2/FileDataSourceV2;Lorg/apache/spark/sql/util/CaseInsensitiveStringMap;

I tried with both 13.3 LTS and 14.3.

Environment

  • Spark version: DataBricks Runtime version: 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12)
  • Spark-Excel version: com.crealytics:spark-excel_2.13:3.3.4_0.20.4
  • OS:
  • Cluster environment

I have tried with 2.13:3.5.1 as well, but still the same issue.

@nightscape
Owner

@sramesh-nlg you always need to use the version of spark-excel that best matches the Spark version:
https://mvnrepository.com/artifact/com.crealytics/spark-excel

@massazan unfortunately Databricks has a little bit of a habit of breaking API compatibility with the officially released Spark versions...
I don't plan to fix issues with Databricks as I'm not currently using it myself.
We're very open to PRs though 😃
