
Read Parquet and ORC HDFS files using PySpark

Ask Time:2021-07-09T22:06:17         Author:user2492356


I have Hive external tables created with the input format "org.apache.hadoop.hive.ql.io.parquet.serde.MapredParquetInputFormat" and the output format "org.apache.hadoop.hive.ql.io.parquet.serde.MapredParquetOutputFormat".

How do I read these Hive table files from HDFS using PySpark?
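
One possible approach, sketched here under assumptions rather than taken from the original post: since the table files are stored as Parquet (or ORC), Spark's built-in readers can usually load them directly from the table's HDFS location. The helper name `read_table_files`, the path `hdfs:///warehouse/my_db/my_table`, and the table name `my_db.my_table` are all hypothetical placeholders; the real location can be found with `DESCRIBE FORMATTED <table>` in Hive.

```python
def read_table_files(spark, path, file_format="parquet"):
    """Load the data files under `path` into a DataFrame.

    `spark` is an active SparkSession; `file_format` is "parquet" or "orc".
    """
    if file_format not in ("parquet", "orc"):
        raise ValueError(f"unsupported format: {file_format}")
    # DataFrameReader.format(...).load(...) reads every data file under the path.
    return spark.read.format(file_format).load(path)


def main():
    # Imported here so the helper above can be reused without a Spark install.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("read-hive-table-files")
        .enableHiveSupport()  # only needed for the spark.table(...) alternative
        .getOrCreate()
    )

    # Hypothetical HDFS location of the external table (an assumption).
    df = read_table_files(spark, "hdfs:///warehouse/my_db/my_table", "parquet")
    df.printSchema()

    # Alternative, assuming Spark is configured to reach the Hive metastore:
    # read the table by name and let the metastore supply location and format.
    # df = spark.table("my_db.my_table")
```

Calling `main()` from a script submitted with `spark-submit` would run the sketch. If the table is partitioned with Hive-style `key=value` directories, pointing the reader at the table's root directory typically lets Spark pick up the partition columns from the directory names.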

Author: user2492356. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/68318139/read-parquet-and-orc-hdfs-file-using-pyspark