This repository has been archived by the owner on Jan 3, 2023. It is now read-only.

Hive on MR query over compacted files throws EOFException "Cannot seek to negative offset" #2259

Open
MyqueWooMiddo opened this issue Sep 2, 2022 · 0 comments


I generated a small table, 'small_table', with several INSERTs, so there are many small files in HDFS.

I created an SSM compact rule on 'small_table' and then ran the action; the _container_file was produced in the same HDFS path.

When I run 'hadoop fs -cat /small_table/xx.txt', I can view its content.
At the same time, I can see the open and getXAttrs operations on xx.txt and _container_file in hdfs-audit.log.

In Hive, I can query small_table with a filter (without MR), but I can't run an aggregate such as count(*) on it (with MR).
It throws the following stack trace:

org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:271)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:217)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:345)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:176)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:445)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:257)
... 11 more
Caused by: java.io.EOFException: Cannot seek to negative offset
at org.apache.hadoop.hdfs.CompactInputStream.seek(CompactInputStream.java:135)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:71)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:140)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:99)
... 16 more
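
To make the innermost frames easier to read: the trace shows the LineRecordReader constructor calling seek() on the stream, and CompactInputStream.seek rejecting the requested position as negative. The sketch below is only a self-contained model of that kind of guard. GuardedStream and trySeek are hypothetical names, the "after EOF" message is made up for illustration, and none of this is the actual SSM or Hadoop source; it just shows that the message means a position below zero reached seek().

```java
import java.io.EOFException;
import java.io.IOException;

public class NegativeSeekSketch {

    /** Hypothetical stand-in for a seekable stream over one file of known length. */
    static final class GuardedStream {
        private final long length;
        private long pos;

        GuardedStream(long length) {
            this.length = length;
        }

        /** Moves the read position, rejecting targets outside [0, length]. */
        void seek(long targetPos) throws IOException {
            if (targetPos < 0) {
                // Same message text as in the reported stack trace.
                throw new EOFException("Cannot seek to negative offset");
            }
            if (targetPos > length) {
                // Illustrative message only, not taken from any library.
                throw new EOFException("Cannot seek after EOF");
            }
            pos = targetPos;
        }

        long getPos() {
            return pos;
        }
    }

    /** Returns the exception message for a seek attempt, or null if it succeeded. */
    static String trySeek(long length, long targetPos) {
        GuardedStream in = new GuardedStream(length);
        try {
            in.seek(targetPos);
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("seek(0):  " + trySeek(100, 0));
        System.out.println("seek(-1): " + trySeek(100, -1));
    }
}
```

If the real guard behaves like this, the useful thing to log in the failing map task would be the split start and length handed to the record reader, since that is the value that ends up in seek().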

My Hive version is 3.1.x and my Hadoop version is 3.3.x.
