Hello,
I am seeing vMotion failures on one of our hosts running ESXi 6.0 Update 2. Has anyone had this kind of issue? Please see the information below and help me understand the root cause of the failure.
The vmkernel log shows the following during the migration failure:
2017-03-12T15:10:00.658Z cpu40:1522911)Migrate: vm 1522912: 3385: Setting VMOTION info: Source ts = 1489331391021669, src ip = <XXXXXXXX> dest ip = <10.115.135.24> Dest wid = 1559963 using SHARED swap
2017-03-12T15:10:00.659Z cpu40:1522911)Hbr: 3394: Migration start received (worldID=1522912) (migrateType=1) (event=0) (isSource=1) (sharedConfig=1)
2017-03-12T15:10:00.660Z cpu0:1553099)VMotionUtil: 3995: 1489331391021669 S: Stream connection 1 added.
2017-03-12T15:10:00.671Z cpu0:1553097)WARNING: VMotionUtil: 733: 1489331391021669 S: failed to read stream keepalive: Connection closed by remote host, possibly due to timeout
2017-03-12T15:10:00.671Z cpu0:1553097)WARNING: Migrate: 270: 1489331391021669 S: Failed: Connection closed by remote host, possibly due to timeout (0xbad003f) @0x41800da14f5e
2017-03-12T15:10:00.840Z cpu24:1522922)WARNING: Migrate: 5454: 1489331391021669 S: Migration considered a failure by the VMX. It is most likely a timeout, but check the VMX log for the true error.
2017-03-12T15:10:00.840Z cpu24:1522922)Hbr: 3488: Migration end received (worldID=1522912) (migrateType=1) (event=1) (isSource=1) (sharedConfig=1)
2017-03-12T15:10:00.875Z cpu29:1418851)Config: 681: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)
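To make it easier to line up attempts across hosts, here is a minimal Python sketch that pulls only the vMotion WARNING lines out of a vmkernel log and groups them by migration ID. The regular expression is inferred from the lines pasted above and is an assumption, not a documented log format; the function name `vmotion_warnings` is made up for this example.

```python
import re
from collections import defaultdict

# Pattern inferred from the pasted vmkernel lines (an assumption):
# <timestamp> cpuN:<world>)WARNING: <module>: <line>: <migration id> S|D: <message>
WARNING_RE = re.compile(
    r"^(?P<ts>\S+) cpu\d+:\d+\)WARNING: (?P<module>\w+): \d+: "
    r"(?P<mid>\d+) (?P<side>[SD]): (?P<msg>.*)$"
)


def vmotion_warnings(lines):
    """Group vMotion WARNING lines by migration ID.

    Returns {migration_id: [(timestamp, module, message), ...]}.
    Lines that do not match the pattern are ignored.
    """
    grouped = defaultdict(list)
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match:
            grouped[match.group("mid")].append(
                (match.group("ts"), match.group("module"), match.group("msg"))
            )
    return dict(grouped)
```

Running the same function over the vmkernel log from the destination host and comparing the entries for the same migration ID should show which side closed the connection first.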
The vmware.log for that VM shows the following:
2017-03-12T15:29:53.281Z| vcpu-0| I125: VMXVmdb_SetMigrationHostLogState: hostlog state transits to failure for migrate 'to' mid 1489332587942583
2017-03-12T15:29:53.305Z| vcpu-0| I125: MigrateSetStateFinished: type=1 new state=6
2017-03-12T15:29:53.305Z| vcpu-0| I125: MigrateSetState: Transitioning from state 2 to 6.
2017-03-12T15:29:53.305Z| vcpu-0| A100: ConfigDB: Setting config.readOnly = "FALSE"
2017-03-12T15:29:53.305Z| vcpu-0| I125: Migrate_SetFailureMsgList: switching to new log file.
2017-03-12T15:29:53.306Z| vcpu-0| I125: Migrate_SetFailureMsgList: Now in new log file.
2017-03-12T15:29:53.548Z| vcpu-0| I125: Migrate: Caching migration error message list:
2017-03-12T15:29:53.548Z| vcpu-0| I125: [msg.checkpoint.precopyfailure] Migration to host <10.115.135.62> failed with error Connection closed by remote host, possibly due to timeout (0xbad003f).
2017-03-12T15:29:53.548Z| vcpu-0| I125: [vob.vmotion.stream.keepalive.read.fail] vMotion migration [a738729:1489332587942583] failed to read stream keepalive: Connection closed by remote host, possibly due to timeout
2017-03-12T15:29:53.548Z| vcpu-0| I125: Migrate: cleaning up migration state.
2017-03-12T15:29:53.549Z| vcpu-0| I125: VigorTransport_ServerSendResponse opID=58b52a1e-ed-7c-851d seq=19532: Completed Migrate request.
2017-03-12T15:29:53.549Z| vcpu-0| I125: Migrate: Final status reported through Vigor.
2017-03-12T15:29:53.549Z| vcpu-0| I125: MigrateSetState: Transitioning from state 6 to 0.
2017-03-12T15:29:53.549Z| vcpu-0| I125: Migrate: Final status reported through VMDB.
2017-03-12T15:29:53.549Z| vcpu-0| I125: Msg_Post: Error
2017-03-12T15:29:53.549Z| vcpu-0| I125: [vob.vmotion.stream.keepalive.read.fail] vMotion migration [a738729:1489332587942583] failed to read stream keepalive: Connection closed by remote host, possibly due to timeout
2017-03-12T15:29:53.549Z| vcpu-0| I125: [msg.checkpoint.precopyfailure] Migration to host <10.115.135.62> failed with error Connection closed by remote host, possibly due to timeout (0xbad003f).
2017-03-12T15:29:53.549Z| vcpu-0| I125: ----------------------------------------
2017-03-12T15:29:53.582Z| vcpu-0| I125: Vigor_MessageRevoke: message 'msg.checkpoint.precopyfailure' (seq 18864120) is revoked
When I run vmkping there are no packet drops, but %DRPRX shows non-zero values in esxtop. Is this unusual or usual?
When I run esxcli network nic stats get -n vmnic0, I see receive missed errors. Is this usual or unusual?
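Since NIC counters like these are cumulative, a single reading may not say much; what probably matters is whether they climb while a vMotion is running. Below is a rough Python sketch that compares two saved copies of the NIC statistics output (one taken before a migration attempt, one after) and reports only the counters that changed. It assumes the output is a list of "Name: number" lines; that layout and the function names are assumptions for illustration.

```python
def parse_nic_stats(text):
    """Parse 'Name: number' lines into a dict; other lines are skipped.

    The 'Name: number' layout is an assumption about the saved output.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def counter_deltas(before, after):
    """Return {counter: after - before} for counters whose value changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }
```

If "Receive missed errors" only grows during migrations, that would point at the receive path on that uplink rather than at old, unrelated drops.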
Thank you
VJ