Replacing a disk in a RAID array with megacli
I recently had to set up a DELL T610 server with a PERC H700 RAID controller on board. Everything was business as usual, with one catch. I decided to check how quickly a failed disk could be replaced. The server had the standard megacli utility installed, which manages all controllers that use the MegaRAID driver, including the one mentioned above. This seemingly trivial task turned out to be not so trivial, and I had to dig through the documentation.
I really liked one admin's concise description of the process: «But replacing disks with this utility is pure hardcore, only for real Tru-admins )).» http://skeletor.org.ua/?p=4093. I fully agree with him. In principle, he describes the whole process, but I still decided to share a few additions and my own experience. This megacli is such a non-obvious thing, with about 60 pages of documentation, that even with ready-made examples it took me a while to work out which values to substitute for the adapter, the array, the disk, and a certain row, whose meaning I could not figure out at all.
My server ran Debian 8 and had three arrays: raid1, raid1 and raid10. I pulled a disk out of the raid10 and replaced it with a new one.
Please note, this is important. I did not put the same disk back, as people often do, but a different one. These are fundamentally different events. If you pull a disk and then put that same disk back, the rebuild starts automatically and nothing else needs to be done. If you install a different physical disk, you have to go through everything I describe below.
First, check the state of the arrays:
# megacli -LDInfo -LAll -a0 -NoLog
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :sys
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 931.0 GB
Sector Size : 512
Mirror Data : 931.0 GB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Virtual Drive: 1 (Target Id: 1)
Name :file
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 3.637 TB
Sector Size : 512
Mirror Data : 3.637 TB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Virtual Drive: 2 (Target Id: 2)
Name :data
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 930.5 GB
Sector Size : 512
Mirror Data : 930.5 GB
State : Degraded
Strip Size : 64 KB
Number Of Drives per span:2
Span Depth : 2
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
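A listing like the one above can be condensed to one line per array. The following is only a minimal sketch in Python, assuming the field names look exactly as in this output; the function names and the sample text are made up for illustration. Reading a RAID level with Primary-1 and a Span Depth above 1 as RAID 10 is a common convention that I assume here; it is not something this output states directly.

```python
import re

# Hypothetical, shortened sample in the same format as the listing above.
SAMPLE_LDINFO = """\
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :sys
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
State : Optimal
Span Depth : 1
Virtual Drive: 2 (Target Id: 2)
Name :data
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
State : Degraded
Span Depth : 2
"""


def summarize_virtual_drives(ldinfo_text):
    """Return one dict per 'Virtual Drive:' block found in the text."""
    drives = []
    current = None
    for raw_line in ldinfo_text.splitlines():
        line = raw_line.strip()
        header = re.match(r"Virtual Drive:\s*(\d+)", line)
        if header:
            current = {"id": int(header.group(1))}
            drives.append(current)
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Name":
            current["name"] = value
        elif key == "State":
            current["state"] = value
        elif key == "Span Depth":
            current["span_depth"] = int(value)
        elif key == "RAID Level":
            primary = re.search(r"Primary-(\d+)", value)
            if primary:
                current["primary_level"] = int(primary.group(1))
    return drives


def guess_raid_label(drive):
    """Assumed convention: a spanned array of level N is reported as 'N0'."""
    primary = drive.get("primary_level")
    if drive.get("span_depth", 1) > 1:
        return f"RAID {primary}0"
    return f"RAID {primary}"


if __name__ == "__main__":
    for vd in summarize_virtual_drives(SAMPLE_LDINFO):
        print(vd["id"], vd["name"], guess_raid_label(vd), vd["state"])
```

To try it on a real system, the text would come from the saved output of the command above instead of the sample string.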
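Feel the hardcore yet? Not yet? Then let's move on. Note that the last array is marked as Degraded: a disk has been pulled from it. This is the raid10. Unfortunately, I never figured out how to see the array type through megacli; I could not tell where it says that the array is raid10. (The closest hint seems to be Span Depth : 2 together with Number Of Drives per span:2, that is, two mirrored spans combined into one array.) Now let's look at the list of disks: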
# megacli -PDlist -a0 -NoLog
Adapter #0
Enclosure Device ID: 32
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 0
WWN: 50014ee60430ad5b
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 931.512 GB [0x74706db0 Sectors]
Non Coerced Size: 931.012 GB [0x74606db0 Sectors]
Coerced Size: 931.0 GB [0x74600000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A00
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221107000000
Connected Port Number: 4(path0)
Inquiry Data: WD-WXU1E83HLFH6WDC WD1000DHTZ-04N21V0 04.06A00
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 1
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 1
WWN: 50014ee604ff97cf
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 931.512 GB [0x74706db0 Sectors]
Non Coerced Size: 931.012 GB [0x74606db0 Sectors]
Coerced Size: 931.0 GB [0x74600000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A01
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221106000000
Connected Port Number: 5(path0)
Inquiry Data: WD-WX11E44AFZ47WDC WD1000DHTZ-04N21V1 04.06A01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 2
Drive's position: DiskGroup: 1, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 2
WWN: 50014ee20a90eed3
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 1K02
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221105000000
Connected Port Number: 7(path0)
Inquiry Data: WD-WCC132351832WDC WD4000FYYZ-01UL1B1 01.01K02
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 3
Drive's position: DiskGroup: 1, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 3
WWN: 50014ee20a5ce036
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 1K02
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221104000000
Connected Port Number: 6(path0)
Inquiry Data: WD-WCC132299916WDC WD4000FYYZ-01UL1B1 01.01K02
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 4
Drive's position: DiskGroup: 2, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 4
WWN: 50014ee659eb7681
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 465.25 GB [0x3a280000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A01
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221103000000
Connected Port Number: 1(path0)
Inquiry Data: WD-WX71EA3ETUU4WDC WD5000HHTZ-04N21V1 04.06A01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 5
Drive's position: DiskGroup: 2, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 5
WWN: 50014ee3aabf9f80
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 465.25 GB [0x3a280000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A00
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221102000000
Connected Port Number: 2(path0)
Inquiry Data: WD-WXN1E32MCZLNWDC WD5000HHTZ-04N21V0 04.06A00
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 6
Drive's position: DiskGroup: 2, Span: 1, Arm: 0
Enclosure position: N/A
Device Id: 6
WWN: 50014ee75595a31c
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 465.25 GB [0x3a280000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A01
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221101000000
Connected Port Number: 0(path0)
Inquiry Data: WD-WX71E34SJY05WDC WD5000HHTZ-04N21V1 04.06A01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 32
Slot Number: 7
Drive's position: DiskGroup: 2, Span: 1, Arm: 1
Enclosure position: N/A
Device Id: 7
WWN: 50014ee659fa7138
Sequence Number: 3
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 465.25 GB [0x3a280000 Sectors]
Sector Size: 0
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: 6A01
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221100000000
Connected Port Number: 3(path0)
Inquiry Data: WD-WX71E34LFW34WDC WD5000HHTZ-04N21V1 04.06A01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
We are interested in the last disk. Its Firmware state is Unconfigured(good). That is because I had already plugged in a new blank disk in place of the old one. If there are problems with a disk, its state will be Failed. Next, you need to remember the following values for this disk:
- Enclosure Device ID: 32
- Slot Number: 7
- DiskGroup: 2
The first two are the enclosure ID and the slot number of the hard disk. We need them in the commands that follow to identify the disk. The last one appears to be the number of the DiskGroup the disk belongs to in the array description. I'm not 100% sure about this, but in my case these values matched for all disks and arrays, so most likely that is how it works. Use this number to check that the failed disk really belongs to the array you think it does.
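Picking these three values out of the long disk listing by eye is tedious, so here is a rough Python sketch that does it. It assumes the field names appear exactly as in the output above; the function name and the shortened sample are hypothetical, and I have not checked it against other controller or megacli versions.

```python
import re

# Hypothetical, shortened sample in the same format as the disk listing.
SAMPLE_PDLIST = """\
Adapter #0
Enclosure Device ID: 32
Slot Number: 6
Drive's position: DiskGroup: 2, Span: 1, Arm: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 32
Slot Number: 7
Drive's position: DiskGroup: 2, Span: 1, Arm: 1
Firmware state: Unconfigured(good), Spun Up
"""

POSITION_RE = re.compile(r"DiskGroup:\s*(\d+),\s*Span:\s*(\d+),\s*Arm:\s*(\d+)")


def parse_physical_drives(pdlist_text):
    """Return one dict per disk; a new disk starts at 'Enclosure Device ID'."""
    drives = []
    current = None
    for raw_line in pdlist_text.splitlines():
        key, sep, value = raw_line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            # Kept as text, since the value is not guaranteed to be numeric.
            current = {"enclosure": value}
            drives.append(current)
        elif current is None:
            continue
        elif key == "Slot Number":
            current["slot"] = int(value)
        elif key == "Drive's position":
            position = POSITION_RE.search(value)
            if position:
                group, span, arm = (int(n) for n in position.groups())
                current.update(disk_group=group, span=span, arm=arm)
        elif key == "Firmware state":
            current["state"] = value
    return drives


def unconfigured_good(drives):
    """Disks the controller sees but has not yet put into an array."""
    return [d for d in drives if d.get("state", "").startswith("Unconfigured(good)")]


if __name__ == "__main__":
    for disk in unconfigured_good(parse_physical_drives(SAMPLE_PDLIST)):
        print(f"[{disk['enclosure']}:{disk['slot']}] DiskGroup {disk.get('disk_group')}")
```

On the sample it reports the disk in slot 7 of enclosure 32, which is the pair used as [32:7] in the commands.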
I got a little ahead of myself and rushed the disk replacement. I pulled the disk, booted the server, and made sure that it worked without the disk and that the array knew it was in the Degraded state. After that I should have run the following commands.
Take the failed disk offline:
# megacli -PDOffline -PhysDrv [32:7] -a0
Mark it as missing:
# megacli -PDMarkMissing -PhysDrv [32:7] -a0
Prepare it for removal:
# megacli -PDPrpRmv -PhysDrv [32:7] -a0
I didn't do that; I simply powered off the server and installed the new disk. After powering it back on, I made sure that the new disk was present in the disk list and that its status was Unconfigured(good). Then I tell the controller that the disk has been replaced:
# megacli -PdReplaceMissing -PhysDrv [32:7] -Array3 -row1 -a0
I racked my brain over this command for a long time. Let me go through it piece by piece.
Array3. Where does the number 3 come from? Here is the description:
«The number N of the Array parameter is from the «Span Reference:» line you get using MegaCli -CfgDsply -aALL, minus the 0x0 part.»
Run the command that displays the configuration:
# megacli -CfgDsply -a0
==============================================================================
Adapter: 0
Product Name: PERC H700 Integrated
Memory: 1024MB
BBU: Absent
Serial No: 11900L8
==============================================================================
Number of DISK GROUPS: 3
DISK GROUP: 0
Number of Spans: 1
SPAN: 0
Span Reference: 0x00
Number of PDs: 2
Number of VDs: 1
Number of dedicated Hotspares: 0
Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :sys
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 931.0 GB
Sector Size : 512
Mirror Data : 931.0 GB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Physical Disk Information:
Physical Disk: 0
Enclosure Device ID: 32
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 0
WWN: 50014ee60430ad5b
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 931.512 GB [0x74706db0 Sectors]
Non Coerced Size: 931.012 GB [0x74606db0 Sectors]
Coerced Size: 931.0 GB [0x74600000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A00
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221107000000
Connected Port Number: 5(path0)
Inquiry Data: WD-WXU1E83HLFH6WDC WD1000DHTZ-04N21V0 04.06A00
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Physical Disk: 1
Enclosure Device ID: 32
Slot Number: 1
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 1
WWN: 50014ee604ff97cf
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 931.512 GB [0x74706db0 Sectors]
Non Coerced Size: 931.012 GB [0x74606db0 Sectors]
Coerced Size: 931.0 GB [0x74600000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A01
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221106000000
Connected Port Number: 4(path0)
Inquiry Data: WD-WX11E44AFZ47WDC WD1000DHTZ-04N21V1 04.06A01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
DISK GROUP: 1
Number of Spans: 1
SPAN: 0
Span Reference: 0x01
Number of PDs: 2
Number of VDs: 1
Number of dedicated Hotspares: 0
Virtual Drive Information:
Virtual Drive: 1 (Target Id: 1)
Name :file
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 3.637 TB
Sector Size : 512
Mirror Data : 3.637 TB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Physical Disk Information:
Physical Disk: 0
Enclosure Device ID: 32
Slot Number: 2
Drive's position: DiskGroup: 1, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 2
WWN: 50014ee20a90eed3
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 1K02
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221105000000
Connected Port Number: 7(path0)
Inquiry Data: WD-WCC132351832WDC WD4000FYYZ-01UL1B1 01.01K02
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Physical Disk: 1
Enclosure Device ID: 32
Slot Number: 3
Drive's position: DiskGroup: 1, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 3
WWN: 50014ee20a5ce036
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 1K02
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221104000000
Connected Port Number: 6(path0)
Inquiry Data: WD-WCC132299916WDC WD4000FYYZ-01UL1B1 01.01K02
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
SPANNED DISK GROUP: 0
Number of Spans: 2
SPAN: 0
Span Reference: 0x02
Number of PDs: 2
Number of VDs: 1
Number of dedicated Hotspares: 0
Virtual Drive Information:
Virtual Drive: 2 (Target Id: 2)
Name :data
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 930.5 GB
Sector Size : 512
Mirror Data : 930.5 GB
State : Optimal
Strip Size : 64 KB
Number Of Drives per span:2
Span Depth : 2
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Physical Disk Information:
Physical Disk: 0
Enclosure Device ID: 32
Slot Number: 4
Drive's position: DiskGroup: 2, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 4
WWN: 50014ee659eb7681
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 465.25 GB [0x3a280000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A01
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221103000000
Connected Port Number: 1(path0)
Inquiry Data: WD-WX71EA3ETUU4WDC WD5000HHTZ-04N21V1 04.06A01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Physical Disk: 1
Enclosure Device ID: 32
Slot Number: 5
Drive's position: DiskGroup: 2, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 5
WWN: 50014ee3aabf9f80
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 465.25 GB [0x3a280000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A00
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221102000000
Connected Port Number: 2(path0)
Inquiry Data: WD-WXN1E32MCZLNWDC WD5000HHTZ-04N21V0 04.06A00
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
SPAN: 1
Span Reference: 0x03
Number of PDs: 2
Number of VDs: 1
Number of dedicated Hotspares: 0
Virtual Drive Information:
Virtual Drive: 2 (Target Id: 2)
Name :data
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 930.5 GB
Sector Size : 512
Mirror Data : 930.5 GB
State : Degraded
Strip Size : 64 KB
Number Of Drives per span:2
Span Depth : 2
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Physical Disk Information:
Physical Disk: 0
Enclosure Device ID: 32
Slot Number: 6
Drive's position: DiskGroup: 2, Span: 1, Arm: 0
Enclosure position: N/A
Device Id: 6
WWN: 50014ee75595a31c
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 465.25 GB [0x3a280000 Sectors]
Sector Size: 0
Firmware state: Online, Spun Up
Device Firmware Level: 6A01
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221101000000
Connected Port Number: 0(path0)
Inquiry Data: WD-WX71E34SJY05WDC WD5000HHTZ-04N21V1 04.06A01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Physical Disk: 1
Enclosure Device ID: 32
Slot Number: 7
Drive's position: DiskGroup: 2, Span: 1, Arm: 1
Enclosure position: N/A
Device Id: 7
WWN: 50014ee659fa7138
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 465.761 GB [0x3a386030 Sectors]
Non Coerced Size: 465.261 GB [0x3a286030 Sectors]
Coerced Size: 465.25 GB [0x3a280000 Sectors]
Sector Size: 0
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: 6A01
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x4433221100000000
Connected Port Number: 3(path0)
Inquiry Data: WD-WX71E34LFW34WDC WD5000HHTZ-04N21V1 04.06A01
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's NCQ setting : N/A
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
The result is a wall of text that is very hard to read and analyze. I grep the output to figure out what is actually in there:
# megacli -CfgDsply -aALL | grep "Span Reference:"
Span Reference: 0x00
Span Reference: 0x01
Span Reference: 0x02
Span Reference: 0x03
I see four entries, although there are only three arrays. Let's reason logically. Since the last array is a RAID10, it is probably shown as two RAID1 sets. I checked the configuration output carefully and confirmed that this is the case. The first two arrays are listed as DISK GROUP: 0 and 1, and the raid10 as SPANNED DISK GROUP: 0, which contains SPAN: 0 and 1. One of the SPANs has the status Degraded and the parameter Span Reference: 0x03. According to the documentation, I need to take this value 0x03 and drop the 0x0. That leaves the number 3, hence the parameter Array3 in the command.
Next comes the row parameter. I tried very hard to understand what it is :) The description:
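The same conversion can be written as a tiny Python sketch. Treating the Span Reference value as a hexadecimal number is my assumption: the quoted description only says to drop the 0x0 part, and for values up to 0x09 both readings give the same result. The function name and the sample text are made up for illustration.

```python
import re

# Hypothetical sample: the grep output shown above.
SAMPLE_SPAN_REFS = """\
Span Reference: 0x00
Span Reference: 0x01
Span Reference: 0x02
Span Reference: 0x03
"""


def array_numbers(cfgdsply_text):
    """Return the number for the -ArrayN parameter for every span, in order."""
    refs = re.findall(r"Span Reference:\s*(0x[0-9a-fA-F]+)", cfgdsply_text)
    # Assumption: the value is hexadecimal, so 0x03 becomes 3.
    return [int(ref, 16) for ref in refs]


if __name__ == "__main__":
    print(array_numbers(SAMPLE_SPAN_REFS))
```

The last value in the list is the one that ends up in Array3.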
«The number N of the row parameter is the Physical Disk in that span or array starting with zero (it can be but is not always the physical disk’s slot!)».
Only now, while writing this article, do I easily see where this number comes from. While testing I was really slow and just could not work it out. The very bulky command output gets in the way; my eyes got tired of running over those walls of text. In short, it is the index of the disk within the failed SPAN. In my case it is the second disk in the SPAN, that is, 1, since counting starts from zero. That gives the parameter row1. Once again, the command to replace the failed disk:
# megacli -PdReplaceMissing -PhysDrv [32:7] -Array3 -row1 -a0
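To avoid mixing up the four numbers, the command line can be assembled from named values. This is only a sketch in Python: the helper name is hypothetical, it just builds and prints the string without running anything, and the flag spelling is copied from the command above.

```python
def replace_missing_command(enclosure, slot, array, row, adapter=0, binary="megacli"):
    """Build the PdReplaceMissing command line as text; nothing is executed."""
    values = {"enclosure": enclosure, "slot": slot, "array": array,
              "row": row, "adapter": adapter}
    for name, value in values.items():
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    return (f"{binary} -PdReplaceMissing -PhysDrv [{enclosure}:{slot}] "
            f"-Array{array} -row{row} -a{adapter}")


if __name__ == "__main__":
    # Values from this article: enclosure 32, slot 7, Span Reference 0x03, second disk in the span.
    print(replace_missing_command(enclosure=32, slot=7, array=3, row=1))
```

Printing the command first and reading it before pasting it into a root shell seemed safer to me than running it straight from a script.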
So far we have only told the controller that the disk was replaced. Now we need to start the rebuild:
# megacli -pdrbld -start -physdrv [32:7] -a0
Check the rebuild status with:
# megacli -pdrbld -showprog -physdrv [32:7] -a0
Rebuild Progress on Device at Enclosure 32, Slot 7 Completed 41% in 18 Minutes.
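From a progress line like this one you can roughly estimate how long is left. The Python sketch below assumes the line has exactly the format shown above and that the rebuild proceeds at a constant rate, which is a simplification; the function name is made up for illustration.

```python
import re

PROGRESS_RE = re.compile(r"Completed\s+(\d+)%\s+in\s+(\d+)\s+Minutes")


def estimate_remaining_minutes(progress_line):
    """Linear extrapolation: remaining = elapsed * 100 / percent - elapsed."""
    match = PROGRESS_RE.search(progress_line)
    if match is None:
        raise ValueError("unrecognized progress line")
    percent, elapsed = int(match.group(1)), int(match.group(2))
    if percent == 0:
        return None  # nothing to extrapolate from yet
    return elapsed * 100 / percent - elapsed


if __name__ == "__main__":
    line = ("Rebuild Progress on Device at Enclosure 32, Slot 7 "
            "Completed 41% in 18 Minutes.")
    print(f"about {estimate_remaining_minutes(line):.0f} minutes left")
```

For 41% in 18 minutes this gives roughly 26 more minutes, assuming the rate does not change.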
When the rebuild finishes, look at the array and disk information again. The array should become Optimal and the disk Online, Spun Up. At that point we forget megacli like a bad dream and go back to the pleasant and convenient mdadm.
I always test a disk failure and its replacement. I do it on all arrays, hardware and software. On hardware ones, so that there are no surprises like this and I have working instructions at hand. On software ones, mainly to make sure the bootloader is installed on all the required disks and the system will come up if something happens. As for reliability and disk replacement, I have no complaints about mdadm. Everything there is clear and simple.