One of our HBase regionservers occasionally dies. It happens fairly often, and it is not always the same node. Even with one regionserver down, the cluster still completes all of its jobs successfully.
The regionserver log at the time of the crash:
2019-08-07 16:21:28,056 INFO  [regionserver/slave2/192.168.1.155:16020.logRoller] wal.FSHLog: java.io.IOException: All datanodes DatanodeInfoWithStorage[192.168.1.160:50010,DS-0cc09d26-d327-4bad-b637-6c18b0d37592,DISK] are bad. Aborting...
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1224)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:990)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:507)
...
2019-08-07 16:22:08,106 FATAL [regionserver/slave2/192.168.1.155:16020.logRoller] regionserver.HRegionServer: ABORTING region server slave2,16020,1565149078476: Failed log close in log roller
org.apache.hadoop.hbase.regionserver.wal.FailedLogCloseException: hdfs://master:9000/hbasedata/WALs/slave2,16020,1565149078476/slave2%2C16020%2C1565149078476.1565162527881, unflushedEntries=0
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:915)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:717)
    at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:173)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: All datanodes DatanodeInfoWithStorage[192.168.1.160:50010,DS-0cc09d26-d327-4bad-b637-6c18b0d37592,DISK] are bad. Aborting...
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1224) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:990) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:507) 2019-08-07 16:22:08,106 FATAL [regionserver/slave2/192.168.1.155:16020.logRoller] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.phoenix.coprocessor.MetaDataEndpointImpl, org.apache.phoenix.coprocessor.ScanRegionObserver, org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver, org.apache.phoenix.hbase.index.Indexer, org.apache.phoenix.coprocessor.MetaDataRegionObserver, org.apache.phoenix.coprocessor.GroupedAggregateRegionObserver, org.apache.phoenix.coprocessor.ServerCachingEndpointImpl, org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint] 2019-08-07 16:22:08,164 INFO [regionserver/slave2/192.168.1.155:16020.logRoller] regionserver.HRegionServer: "FlushOutputSize_max" : 0, "FlushOutputSize_mean" : 0, "FlushOutputSize_25th_percentile" : 0, "FlushOutputSize_median" : 0, "FlushOutputSize_75th_percentile" : 0, "FlushOutputSize_90th_percentile" : 0, "FlushOutputSize_95th_percentile" : 0, "FlushOutputSize_98th_percentile" : 0, "FlushOutputSize_99th_percentile" : 0, "FlushOutputSize_99.9th_percentile" : 0, "FlushOutputSize_SizeRangeCount_100000000-inf" : 57, "Bulkload_count" : 0, "Bulkload_mean_rate" : 0.0, "Bulkload_1min_rate" : 0.0, "Bulkload_5min_rate" : 0.0, "Bulkload_15min_rate" : 0.0, "Bulkload_num_ops" : 0, "Bulkload_min" : 0, "Bulkload_max" : 0, "Bulkload_mean" : 0, "Bulkload_25th_percentile" : 0, "Bulkload_median" : 0, "Bulkload_75th_percentile" : 0, "Bulkload_90th_percentile" : 0, "Bulkload_95th_percentile" : 0, "Bulkload_98th_percentile" : 0, "Bulkload_99th_percentile" : 0, "Bulkload_99.9th_percentile" : 0 } ] } 2019-08-07 16:22:08,229 INFO [regionserver/slave2/192.168.1.155:16020.logRoller] 
regionserver.HRegionServer: STOPPED: Failed log close in log roller 2019-08-07 16:22:08,229 INFO [regionserver/slave2/192.168.1.155:16020.logRoller] regionserver.LogRoller: LogRoller exiting. 2019-08-07 16:22:08,234 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.SplitLogWorker: Sending interrupt to stop the worker thread 2019-08-07 16:22:08,234 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.HRegionServer: Stopping infoServer 2019-08-07 16:22:08,240 INFO [SplitLogWorker-slave2:16020] regionserver.SplitLogWorker: SplitLogWorker interrupted. Exiting. 2019-08-07 16:22:08,241 INFO [SplitLogWorker-slave2:16020] regionserver.SplitLogWorker: SplitLogWorker slave2,16020,1565149078476 exiting 2019-08-07 16:22:08,246 INFO [regionserver/slave2/192.168.1.155:16020] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16030 2019-08-07 16:22:08,350 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.HeapMemoryManager: Stopping HeapMemoryTuner chore. 2019-08-07 16:22:08,350 INFO [regionserver/slave2/192.168.1.155:16020] flush.RegionServerFlushTableProcedureManager: Stopping region server flush procedure manager abruptly. 2019-08-07 16:22:08,350 INFO [regionserver/slave2/192.168.1.155:16020] snapshot.RegionServerSnapshotManager: Stopping RegionServerSnapshotManager abruptly. 
2019-08-07 16:22:08,351 INFO [MemStoreFlusher.0] regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting 2019-08-07 16:22:08,351 INFO [MemStoreFlusher.1] regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting 2019-08-07 16:22:08,358 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.HRegionServer: aborting server slave2,16020,1565149078476 2019-08-07 16:22:08,359 INFO [regionserver/slave2/192.168.1.155:16020] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x26c0c913d3b0c0b 2019-08-07 16:22:08,361 INFO [StoreCloserThread-SYSTEM.LOG,\x0F\x00\x00\x00,1563579104305.472f6011874f2c84023779abad60cbb0.-1] regionserver.HStore: Closed 0 2019-08-07 16:22:08,363 INFO [RS_CLOSE_REGION-slave2:16020-1] write.TrackingParallelWriterIndexCommitter: Shutting down TrackingParallelWriterIndexCommitter 2019-08-07 16:22:08,363 INFO [RS_CLOSE_REGION-slave2:16020-1] parallel.BaseTaskRunner: Shutting down task runner because Indexer is being stopped 2019-08-07 16:22:08,364 INFO [StoreCloserThread-SYSTEM.CATALOG,,1563579082594.abe23c4dd29ce41c18c9369bad93f0bd.-1] regionserver.HStore: Closed 0 2019-08-07 16:22:08,364 INFO [StoreCloserThread-DATACENTER.SUBJECT_TOTAL_SCORE_MODEL,-\x00\x00\x00\x00\x00\x00\x00\x00\x00,1563579684320.a7e35efcfa1b78e213ac48348ebe5ac4.-1] regionserver.HStore: Closed 0 ... 
2019-08-07 16:22:08,452 INFO [StoreCloserThread-DATACENTER.QUEST_POINT_SCORE_MODEL,\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00,1563579663691.1a79f5cceb0158f8b5523b2f5b5af39c.-1] regionserver.HStore: Closed 0 2019-08-07 16:22:08,452 INFO [StoreCloserThread-DATACENTER.QUEST_POINT_SCORE_MODEL,@\x00\x00\x00\x00\x00\x00\x00\x00\x00,1563579663691.2a82bca6a5bd67e30866fba660d4d154.-1] regionserver.HStore: Closed 0 2019-08-07 16:22:08,452 INFO [RS_CLOSE_REGION-slave2:16020-0] write.TrackingParallelWriterIndexCommitter: Shutting down TrackingParallelWriterIndexCommitter 2019-08-07 16:22:08,452 INFO [RS_CLOSE_REGION-slave2:16020-0] parallel.BaseTaskRunner: Shutting down task runner because Indexer is being stopped 2019-08-07 16:22:08,452 INFO [RS_CLOSE_REGION-slave2:16020-0] write.TrackingParallelWriterIndexCommitter: Shutting down TrackingParallelWriterIndexCommitter 2019-08-07 16:22:08,452 INFO [RS_CLOSE_REGION-slave2:16020-0] parallel.BaseTaskRunner: Shutting down task runner because Indexer is being stopped 2019-08-07 16:22:08,452 INFO [RS_CLOSE_REGION-slave2:16020-1] write.TrackingParallelWriterIndexCommitter: Shutting down TrackingParallelWriterIndexCommitter 2019-08-07 16:22:08,453 INFO [RS_CLOSE_REGION-slave2:16020-1] parallel.BaseTaskRunner: Shutting down task runner because Indexer is being stopped 2019-08-07 16:22:08,453 INFO [RS_CLOSE_REGION-slave2:16020-0] regionserver.HRegion: Closed DATACENTER.QUEST_POINT_SCORE_MODEL,@\x00\x00\x00\x00\x00\x00\x00\x00\x00,1563579663691.2a82bca6a5bd67e30866fba660d4d154. 
2019-08-07 16:22:08,453 INFO [RS_CLOSE_REGION-slave2:16020-1] write.TrackingParallelWriterIndexCommitter: Shutting down TrackingParallelWriterIndexCommitter 2019-08-07 16:22:08,453 INFO [RS_CLOSE_REGION-slave2:16020-1] parallel.BaseTaskRunner: Shutting down task runner because Indexer is being stopped 2019-08-07 16:22:08,453 INFO [RS_CLOSE_REGION-slave2:16020-1] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x46c0c913d150c18 2019-08-07 16:22:08,460 INFO [RS_CLOSE_REGION-slave2:16020-1] zookeeper.ZooKeeper: Session: 0x46c0c913d150c18 closed 2019-08-07 16:22:08,461 INFO [RS_OPEN_REGION-slave2:16020-0-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x46c0c913d150c18 2019-08-07 16:22:08,461 INFO [RS_CLOSE_REGION-slave2:16020-1] regionserver.HRegion: Closed DATACENTER.QUEST_POINT_SCORE_MODEL,\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00,1563579663691.1a79f5cceb0158f8b5523b2f5b5af39c. 2019-08-07 16:22:08,579 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.HRegionServer: stopping server slave2,16020,1565149078476; all regions closed. 2019-08-07 16:22:08,581 WARN [regionserver/slave2/192.168.1.155:16020] wal.ProtobufLogWriter: Failed to write trailer, non-fatal, continuing... 
java.nio.channels.ClosedChannelException at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1715) at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:104) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) at java.io.DataOutputStream.write(DataOutputStream.java:107) at com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833) at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843) at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:80) at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.writeWALTrailer(ProtobufLogWriter.java:157) at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.close(ProtobufLogWriter.java:130) at org.apache.hadoop.hbase.regionserver.wal.FSHLog.shutdown(FSHLog.java:1104) at org.apache.hadoop.hbase.wal.DefaultWALProvider.shutdown(DefaultWALProvider.java:155) at org.apache.hadoop.hbase.wal.WALFactory.shutdown(WALFactory.java:226) at org.apache.hadoop.hbase.regionserver.HRegionServer.shutdownWAL(HRegionServer.java:1380) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1123) at java.lang.Thread.run(Thread.java:745) 2019-08-07 16:22:08,582 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.Leases: regionserver/slave2/192.168.1.155:16020 closing leases 2019-08-07 16:22:08,582 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.Leases: regionserver/slave2/192.168.1.155:16020 closed leases 2019-08-07 16:22:08,583 INFO [regionserver/slave2/192.168.1.155:16020] hbase.ChoreService: Chore service for: slave2,16020,1565149078476 had [[ScheduledChore: Name: CompactedHFilesCleaner Period: 120000 Unit: MILLISECONDS]] on shutdown 2019-08-07 16:22:08,584 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.CompactSplitThread: Waiting for Split Thread to finish... 
2019-08-07 16:22:08,584 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.CompactSplitThread: Waiting for Merge Thread to finish... 2019-08-07 16:22:08,584 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.CompactSplitThread: Waiting for Large Compaction Thread to finish... 2019-08-07 16:22:08,584 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.CompactSplitThread: Waiting for Small Compaction Thread to finish... 2019-08-07 16:22:08,606 INFO [regionserver/slave2/192.168.1.155:16020] ipc.RpcServer: Stopping server on 16020 2019-08-07 16:22:08,607 INFO [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: stopping 2019-08-07 16:22:08,607 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped 2019-08-07 16:22:08,607 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping 2019-08-07 16:22:08,648 INFO [regionserver/slave2/192.168.1.155:16020] zookeeper.ZooKeeper: Session: 0x36c0c913d370bd3 closed 2019-08-07 16:22:08,648 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.HRegionServer: stopping server slave2,16020,1565149078476; zookeeper connection closed. 
2019-08-07 16:22:08,648 INFO [regionserver/slave2/192.168.1.155:16020] regionserver.HRegionServer: regionserver/slave2/192.168.1.155:16020 exiting 2019-08-07 16:22:08,648 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x36c0c913d370bd3 2019-08-07 16:22:08,649 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting java.lang.RuntimeException: HRegionServer Aborted at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:68) at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127) at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2831) 2019-08-07 16:22:08,652 INFO [Thread-5] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@24faea88 2019-08-07 16:22:08,653 INFO [Thread-5] regionserver.ShutdownHook: Starting fs shutdown hook thread. 2019-08-07 16:22:08,662 INFO [Thread-5] regionserver.ShutdownHook: Shutdown hook finished.
Reading through the log, the key line is:
Caused by: java.io.IOException: All datanodes DatanodeInfoWithStorage[192.168.1.160:50010,DS-0cc09d26-d327-4bad-b637-6c18b0d37592,DISK] are bad. Aborting...
The DFSClient has marked every remaining datanode in the WAL write pipeline as bad, so the pipeline can no longer be used and the log roll fails. So what would cause an HDFS datanode to be marked as bad?
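When chasing this kind of abort, a practical first step is to pull the bad datanode's address out of the abort message and then go read that node's DataNode log for the same time window. A minimal extraction sketch (the abort line is embedded here as a sample; on a real cluster you would grep the regionserver log file instead, and log paths vary by install):

```shell
# Sample abort line copied from the regionserver log above.
LINE='java.io.IOException: All datanodes DatanodeInfoWithStorage[192.168.1.160:50010,DS-0cc09d26-d327-4bad-b637-6c18b0d37592,DISK] are bad. Aborting...'

# Extract the ip:port of the datanode the DFSClient gave up on.
echo "$LINE" | grep -oE '[0-9]{1,3}(\.[0-9]{1,3}){3}:[0-9]+'
# → 192.168.1.160:50010
```

With that address in hand, the next place to look is the DataNode log on that host, which is what the rest of this post does.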
The HDFS datanode log from the same time window:
2019-08-07 15:19:53,753 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077031896_3291873 file /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir51/blk_1077031896 for deletion
2019-08-07 15:19:53,753 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077031899_3291876 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir51/blk_1077031899 for deletion
...
2019-08-07 15:22:07,907 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046172_3306707 src: /192.168.1.155:49062 dest: /192.168.1.155:50010
2019-08-07 15:23:07,973 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1486468690-127.0.0.1-1550458385182:blk_1077046172_3306707
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.155:50010 remote=/192.168.1.155:49062]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:200)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:503)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:903)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:808)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
    at java.lang.Thread.run(Thread.java:745)
2019-08-07 15:23:07,974 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046172_3306707, type=HAS_DOWNSTREAM_IN_PIPELINE: Thread is interrupted.
2019-08-07 15:23:07,974 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046172_3306707, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2019-08-07 15:23:07,974 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1486468690-127.0.0.1-1550458385182:blk_1077046172_3306707 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.155:50010 remote=/192.168.1.155:49062] 2019-08-07 15:23:07,975 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: slave2:50010:DataXceiver error processing WRITE_BLOCK operation src: /192.168.1.155:49062 dst: /192.168.1.155:50010 java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.155:50010 remote=/192.168.1.155:49062] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) at java.io.DataInputStream.read(DataInputStream.java:149) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:200) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:503) at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:903) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:808) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253) at java.lang.Thread.run(Thread.java:745) 2019-08-07 15:31:08,824 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046173_3306711 src: /192.168.1.103:34120 dest: /192.168.1.155:50010 2019-08-07 15:31:08,849 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.103:33792, dest: /192.168.1.155:50010, bytes: 98, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1877039568_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046143_3306678, duration: 3600141660269 2019-08-07 15:31:08,849 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046143_3306678, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2019-08-07 15:41:47,810 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077046143_3306678 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046143 for deletion 2019-08-07 15:41:47,811 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077046143_3306678 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046143 2019-08-07 15:50:50,712 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046174_3306712 
src: /192.168.1.159:35002 dest: /192.168.1.155:50010 2019-08-07 15:50:50,723 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.159:35002, dest: /192.168.1.155:50010, bytes: 1193, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1320310655_11556, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046174_3306712, duration: 5691120 2019-08-07 15:50:50,724 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046174_3306712, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2019-08-07 15:50:50,851 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046175_3306713 src: /192.168.1.156:43630 dest: /192.168.1.155:50010 2019-08-07 15:50:50,862 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.156:43630, dest: /192.168.1.155:50010, bytes: 7114, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-10071416_11556, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046175_3306713, duration: 5076580 2019-08-07 15:50:50,862 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046175_3306713, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2019-08-07 15:50:50,988 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046176_3306714 src: /192.168.1.156:43634 dest: /192.168.1.155:50010 2019-08-07 15:50:51,002 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.156:43634, dest: /192.168.1.155:50010, bytes: 605, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-37037921_11556, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046176_3306714, duration: 5504399 2019-08-07 15:50:51,002 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046176_3306714, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2019-08-07 16:06:58,724 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046179_3306717 src: /192.168.1.159:35112 dest: /192.168.1.155:50010 2019-08-07 16:07:14,302 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.159:35112, dest: /192.168.1.155:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-788439705_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046179_3306717, duration: 15575351812 2019-08-07 16:07:14,302 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046179_3306717, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2019-08-07 16:09:40,007 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046190_3306728 src: /192.168.1.160:33804 dest: /192.168.1.155:50010 2019-08-07 16:09:53,521 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.160:33804, dest: /192.168.1.155:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_950405681_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046190_3306728, duration: 13512425969 2019-08-07 16:09:53,522 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046190_3306728, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2019-08-07 16:10:07,708 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046192_3306730 src: /192.168.1.160:33846 dest: /192.168.1.155:50010 2019-08-07 16:10:21,496 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.160:33846, dest: /192.168.1.155:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_950405681_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046192_3306730, duration: 13785989719 2019-08-07 16:10:21,496 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046192_3306730, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2019-08-07 16:10:48,601 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046195_3306733 src: /192.168.1.160:33906 dest: /192.168.1.155:50010 2019-08-07 16:11:03,518 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.160:33906, dest: /192.168.1.155:50010, bytes: 134217728, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_950405681_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046195_3306733, duration: 14914950461 2019-08-07 16:11:03,518 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046195_3306733, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2019-08-07 16:13:47,907 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077031996_3291987 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir52/blk_1077031996 for deletion 2019-08-07 16:13:47,908 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077031996_3291987 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir52/blk_1077031996 2019-08-07 16:14:24,137 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046197_3306735 src: /192.168.1.159:35162 dest: /192.168.1.155:50010 2019-08-07 16:14:24,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.159:35162, dest: /192.168.1.155:50010, bytes: 17116, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-788439705_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046197_3306735, duration: 1757285 2019-08-07 16:14:24,140 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046197_3306735, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2019-08-07 16:15:09,408 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046199_3306737 src: /192.168.1.159:35176 dest: /192.168.1.155:50010 2019-08-07 16:15:09,417 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046200_3306738 src: /192.168.1.159:35180 dest: /192.168.1.155:50010 2019-08-07 16:15:09,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.160:33306, dest: /192.168.1.155:50010, bytes: 533, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_950405681_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046170_3306705, duration: 3600140375370 2019-08-07 16:15:09,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046170_3306705, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2019-08-07 16:15:09,942 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046202_3306740 src: /192.168.1.160:33968 dest: /192.168.1.155:50010 2019-08-07 16:15:09,957 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.127:50922, dest: /192.168.1.155:50010, bytes: 98, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_905909294_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046171_3306706, duration: 3600100418049 2019-08-07 16:15:09,957 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046171_3306706, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2019-08-07 16:17:50,922 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077036496_3296673 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir69/blk_1077036496 for deletion 2019-08-07 16:17:50,924 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077036496_3296673 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir69/blk_1077036496 2019-08-07 16:21:47,933 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077045746_3306270 file /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir105/blk_1077045746 for deletion 2019-08-07 16:21:47,934 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077046197_3306735 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046197 for deletion 2019-08-07 16:21:47,934 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077046136_3306668 file 
/ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046136 for deletion
2019-08-07 16:21:47,935 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077045746_3306270 file /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir105/blk_1077045746
2019-08-07 16:21:47,935 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077046136_3306668 file /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046136
2019-08-07 16:21:47,935 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077046197_3306735 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046197
2019-08-07 16:22:08,089 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046203_3306741 src: /192.168.1.155:49404 dest: /192.168.1.155:50010
2019-08-07 16:22:08,657 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.155:49404, dest: /192.168.1.155:50010, bytes: 90, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_900371122_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046203_3306741, duration: 562750546
2019-08-07 16:22:08,657 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046203_3306741, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2019-08-07 16:22:09,492 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046205_3306744 src: /192.168.1.160:34010 dest: /192.168.1.155:50010
2019-08-07 16:22:09,496 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.160:34010, dest: /192.168.1.155:50010, bytes: 10706, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1877039568_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046205_3306744, duration: 2302870
2019-08-07 16:22:09,496 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046205_3306744, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2019-08-07 16:22:10,831 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046208_3306747 src: /192.168.1.159:35224 dest: /192.168.1.155:50010
2019-08-07 16:22:10,841 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.159:35224, dest: /192.168.1.155:50010, bytes: 10706, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-788439705_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046208_3306747, duration: 8458860
2019-08-07 16:22:10,841 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046208_3306747, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2019-08-07 16:22:14,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046172_3306743 src: /192.168.1.160:34034 dest: /192.168.1.155:50010
2019-08-07 16:22:14,508 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleting ReplicaBeingWritten, blk_1077046172_3306707, RBW getNumBytes() = 90 getBytesOnDisk() = 90 getVisibleLength()= 90 getVolume() = /ssd/hadoop/data/hadoop/datanode/current getBlockFile() = /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/rbw/blk_1077046172 bytesAcked=90 bytesOnDisk=90 file /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/rbw/blk_1077046172
2019-08-07 16:22:14,508 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 ReplicaBeingWritten, blk_1077046172_3306707, RBW getNumBytes() = 90 getBytesOnDisk() = 90 getVisibleLength()= 90 getVolume() = /ssd/hadoop/data/hadoop/datanode/current getBlockFile() = /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/rbw/blk_1077046172 bytesAcked=90 bytesOnDisk=90 file /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/rbw/blk_1077046172
2019-08-07 16:22:14,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received BP-1486468690-127.0.0.1-1550458385182:blk_1077046172_3306743 src: /192.168.1.160:34034 dest: /192.168.1.155:50010 of size 90
2019-08-07 16:22:17,040 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046210_3306749 src: /192.168.1.127:55824 dest: /192.168.1.155:50010
2019-08-07 16:22:17,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046212_3306751 src: /192.168.1.103:34436 dest: /192.168.1.155:50010
2019-08-07 16:22:17,933 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077046208_3306747 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046208 for deletion
2019-08-07 16:22:17,934 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077046205_3306744 file /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046205 for deletion
2019-08-07 16:22:17,934 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077046208_3306747 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046208
2019-08-07 16:22:17,934 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077046205_3306744 file /ssd/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir107/blk_1077046205
2019-08-07 16:22:19,508 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.103:34436, dest: /192.168.1.155:50010, bytes: 19610804, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1877039568_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046212_3306751, duration: 2334435937
2019-08-07 16:22:19,509 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046212_3306751, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
...
2019-08-07 16:29:50,963 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077042871_3303305 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir94/blk_1077042871
2019-08-07 16:29:50,963 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077037002_3297229 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir71/blk_1077037002
2019-08-07 16:29:50,964 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1486468690-127.0.0.1-1550458385182 blk_1077045887_3306411 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir106/blk_1077045887
2019-08-07 16:29:53,913 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1486468690-127.0.0.1-1550458385182:blk_1077046235_3306776 src: /192.168.1.159:35382 dest: /192.168.1.155:50010
2019-08-07 16:29:53,917 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.1.159:35382, dest: /192.168.1.155:50010, bytes: 49193, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-788439705_1, offset: 0, srvID: 8eaab255-07a1-4b87-85a5-e7f11b8a6e95, blockid: BP-1486468690-127.0.0.1-1550458385182:blk_1077046235_3306776, duration: 1683641
2019-08-07 16:29:53,917 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1486468690-127.0.0.1-1550458385182:blk_1077046235_3306776, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2019-08-07 16:30:50,952 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1077036386_3296557 file /opt/hadoop/data/hadoop/datanode/current/BP-1486468690-127.0.0.1-1550458385182/current/finalized/subdir50/subdir69/blk_1077036386 for deletion
From the logs you can see the occasional SocketTimeoutException, along with a large amount of FsDatasetAsyncDiskService activity. I first assumed this was replica-synchronization work, but FsDatasetAsyncDiskService is actually the datanode's background service for asynchronously deleting block files from disk.
My guess was that the datanode was kept so busy by all of these tasks that HBase eventually judged it a bad node, which would be consistent with the "All datanodes ... are bad" error.
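If the datanode really is being overwhelmed, one commonly suggested mitigation is to give it more transfer threads and a longer write timeout in hdfs-site.xml on each datanode. The property names below are standard HDFS settings, but the values are only illustrative, not tuned for this cluster:

```xml
<!-- hdfs-site.xml on each datanode; restart the datanodes after changing -->
<property>
  <!-- threads available for block send/receive (default 4096) -->
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8192</value>
</property>
<property>
  <!-- datanode write timeout in ms (default 480000, i.e. 8 minutes) -->
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
</property>
```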
Searching around online, I also found another explanation: the Linux machine has too many files open.
When the datanode is very busy it performs all kinds of file operations, so it is entirely possible for it to hit the Linux open-file limit; the regionserver then cannot operate on its files and aborts. This theory also holds up.
So I tried raising the Linux open-file limit.
Edit /etc/security/limits.conf and add:
hadoop soft nofile 1024000
hadoop hard nofile 1024000
hadoop soft nproc 1024000
hadoop hard nproc 1024000
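Note that limits.conf only takes effect for sessions started after the change, so it is worth verifying what a running process actually gets rather than trusting the file. A quick check on Linux, using the current shell ($$) as a stand-in for the real process; on a cluster node you would substitute the DataNode/HRegionServer pid (e.g. from jps):

```shell
# Show the limits that actually apply to a running process via /proc.
# $$ is this shell's pid; replace it with the datanode/regionserver pid.
grep -E 'Max (open files|processes)' /proc/$$/limits

# Count how many file descriptors the process currently holds --
# compare against the "Max open files" value above to see headroom.
ls /proc/$$/fd | wc -l
```

If the fd count is anywhere near the limit while the datanode is busy, the open-files theory becomes much more plausible.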