Getting error: SCSI error: return code = 0x00010000

Environment

  • Red Hat Enterprise Linux 5
  • Red Hat Enterprise Linux 6
  • Red Hat Enterprise Linux 7

Issue

Getting error: 'kernel: sd h:c:t:l: SCSI error: return code = 0x00010000':

May 22 23:50:10 localhost kernel: Device sdb not ready.
May 22 23:50:10 localhost kernel: end_request: I/O error, dev sdb, sector 0
May 22 23:50:10 localhost kernel: SCSI error : <0 0 2 14> return code = 0x10000

A SAN access issue was reported on a RHEL server, and the following messages were observed in /var/log/messages:

May  5 04:15:00 localhost kernel: sd 3:0:1:42: Unhandled error code
May  5 04:15:00 localhost kernel: sd 3:0:1:42: SCSI error: return code = 0x00010000
May  5 04:15:00 localhost kernel: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK

In device mapper multipath, some paths fail and go down. The logs contain many "tur checker reports path is down" events, and multipath -ll output shows paths in the failed/faulty state:

[root@example ~]# multipath -ll
sdb: checker msg is "tur checker reports path is down"
mpath0 (2001738006296000b) dm-8 IBM,2810XIV
[size=200G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=3][active]
 \_ 2:0:0:1 sda 8:0   [active][ready]
 \_ 2:0:1:1 sdb 8:16  [failed][faulty]
 \_ 4:0:8:1 sdc 8:32  [active][ready]
 \_ 4:0:9:1 sdd 8:48  [active][ready]

A SAN director failed and the system panicked after lpfc lost its devices.

Resolution

SCSI error: return code = 0x00010000 DID_NO_CONNECT

A hardware issue is likely behind the connectivity problems. Contact storage hardware support for assistance in determining the cause and addressing the problem.

Parallel SCSI

DID_NO_CONNECT = SCSI SELECTION failed because there was no device at the address specified

iSCSI

  • The iSCSI layer returns this if the replacement/recovery timeout has expired or if the user asked to shut down the session.

FC/SAN

  • Typically this will follow loss of a storage port, for example:
kernel:  rport-0:0-1: blocked FC remote port time out: saving binding
kernel: sd 0:0:1:22: Unhandled error code
kernel: sd 0:0:1:22: SCSI error: return code = 0x00010000
kernel: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
  • In the above example the remote port timed out and the controller lost connection to the devices behind that storage port.

Root Cause

The SCSI error return code of 0x00010000 is broken down into constituent parts and decoded as shown below:

0x00 01 00 00
-------------
           00   status byte : {likely} not valid, see other fields
        00         msg byte : {likely} not valid, see other fields
     01           host byte : DID_NO_CONNECT - no connection to device {possibly device doesn't exist or transport failed}
  00            driver byte : {likely} not valid, see other fields

The return code 0x00010000 is a DID_NO_CONNECT -- the hardware transport connection to the device is no longer available. This SCSI error indicates that an I/O command is being rejected because it cannot be sent to the device until the hardware transport becomes available again. These errors are a symptom of some other root cause, which needs to be found and addressed.
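The byte breakdown above can be reproduced with a few shifts and masks. The following is a minimal sketch, not a kernel interface: the function names are hypothetical, and the host byte table is only a partial list of the values defined in the Linux kernel's SCSI headers.

```python
# Partial list of host byte values from the Linux kernel's SCSI headers.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_scsi_result(code):
    """Split a 32-bit SCSI return code into its four constituent bytes."""
    return {
        "status": code & 0xFF,          # lowest byte
        "msg": (code >> 8) & 0xFF,
        "host": (code >> 16) & 0xFF,
        "driver": (code >> 24) & 0xFF,  # highest byte
    }


def host_byte_name(code):
    """Return the symbolic name of the host byte, if it is in the table."""
    host = decode_scsi_result(code)["host"]
    return HOST_BYTE_NAMES.get(host, "unknown (0x%02x)" % host)


if __name__ == "__main__":
    code = 0x00010000
    print(decode_scsi_result(code), host_byte_name(code))
```

For 0x00010000 only the host byte is non-zero, which is why the other three fields are described as likely not valid.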

Either access to the device is temporarily unavailable or the device is no longer available within the configuration. A temporary service interruption can be caused by maintenance activity in the SAN, such as a switch reboot. Such activity can cause a link down condition, remote port timeouts, or both. It is possible that the hardware transport connectivity will return at some later time. A permanent loss of connectivity can occur if storage is reconfigured so that the LUN is no longer presented to the host.

Check for the following or similar event messages within the system logs:

Mar 28 19:52:53 hostname kernel: qla2xxx 0000:04:00.1: LOOP DOWN detected (4 4 0 0).
Mar 28 19:53:23 hostname kernel:  rport-3:0-0: blocked FC remote port time out: saving binding
Mar 28 19:53:23 hostname kernel: sd 3:0:0:1: SCSI error: return code = 0x00010000
:
.

Look at what was being logged just before these DID_NO_CONNECTs started being logged.
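One way to follow this advice is to scan the log for the first DID_NO_CONNECT and print the lines that precede it. The sketch below is illustrative only: the function name and the default of 20 context lines are assumptions, and the pattern matches only the message formats shown in this article.

```python
import re
from collections import deque

# Matches "return code = 0x00010000" / "0x10000" and "hostbyte=DID_NO_CONNECT".
NO_CONNECT = re.compile(r"return code = 0x0*10000\b|hostbyte=DID_NO_CONNECT")


def lines_before_first_no_connect(lines, context=20):
    """Return (preceding_lines, matching_line) for the first DID_NO_CONNECT.

    Returns ([], None) if no matching line is found.
    """
    recent = deque(maxlen=context)  # keeps only the last `context` lines
    for line in lines:
        if NO_CONNECT.search(line):
            return list(recent), line
        recent.append(line)
    return [], None


if __name__ == "__main__":
    # /var/log/messages is the log file named earlier in this article.
    with open("/var/log/messages", errors="replace") as log:
        before, hit = lines_before_first_no_connect(line.rstrip("\n") for line in log)
    for line in before:
        print(line)
    if hit is not None:
        print(hit)
```

Link down and remote port timeout messages, like those in the example above, would show up in the preceding lines.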

Remote ports can also time out (lose connectivity) without a link down event.

There is a delay between the link down event at 19:52:53 and the remote port timeout at 19:53:23 -- a 30 second delay. When a remote port is lost, a delay within the driver called dev_loss_tmo (device loss timeout) is applied before any further action is taken. If the port returns to the configuration before that timeout expires, I/O is immediately retried. If the port has not returned by then, all I/O is immediately returned with DID_NO_CONNECT status, and any further I/O fails immediately once the remote port timeout event has occurred. See the configuration guides for more information on dev_loss_tmo behavior; dev_loss_tmo can also be set outside of multipath.
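On hosts using the FC transport class, the current dev_loss_tmo and state of each remote port are exposed under /sys/class/fc_remote_ports. The following sketch reads them; the function names are hypothetical, and it assumes the port_state and dev_loss_tmo attributes are present and readable on the system in question.

```python
import os

# sysfs directory where the FC transport class exposes remote ports.
FC_RPORT_DIR = "/sys/class/fc_remote_ports"


def read_attr(path):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        with open(path) as attr:
            return attr.read().strip()
    except OSError:
        return None


def list_rports(base=FC_RPORT_DIR):
    """Return one dict per remote port with its port_state and dev_loss_tmo."""
    if not os.path.isdir(base):
        return []  # no FC remote ports visible on this host
    rports = []
    for name in sorted(os.listdir(base)):
        rport_dir = os.path.join(base, name)
        rports.append({
            "rport": name,
            "port_state": read_attr(os.path.join(rport_dir, "port_state")),
            "dev_loss_tmo": read_attr(os.path.join(rport_dir, "dev_loss_tmo")),
        })
    return rports


if __name__ == "__main__":
    for rport in list_rports():
        print("%(rport)s state=%(port_state)s dev_loss_tmo=%(dev_loss_tmo)s" % rport)
```

In the log example above, a dev_loss_tmo of 30 would be consistent with the 30 second gap between the link down event and the remote port timeout.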

Other symptoms can result from the host losing connectivity to storage. If no connectivity remains, issues such as file systems going read-only can result from being unable to commit necessary metadata changes to the disks hosting the file system.

