gothic (Half-Immortal)
Joined: 2013-01-01 Posts: 12
|
Posted: Fri 2014-05-02 11:10:22 Post subject: NAS dropped a disk |
|
|
Back home for the May Day holiday, I found that the NAS I built last year had a problem, so:
Code: | # zpool status -v
pool: pool0
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 384K in 3h56m with 0 errors on Thu May 1 19:23:33 2014
config:
NAME STATE READ WRITE CKSUM
pool0 DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
gpt/st30a0zfs ONLINE 0 0 0
17537677132637262161 REMOVED 0 0 0 was /dev/gpt/st30b0zfs
gpt/st30c0zfs ONLINE 0 0 0
gpt/st30d0zfs ONLINE 0 0 0
gpt/st30e0zfs ONLINE 0 0 0
gpt/st30f0zfs ONLINE 0 0 0
errors: No known data errors
pool: pool1
state: ONLINE
scan: scrub repaired 0 in 1h31m with 0 errors on Sat Feb 1 13:53:13 2014
config:
NAME STATE READ WRITE CKSUM
pool1 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
gpt/st30a1zfs ONLINE 0 0 0
gpt/st30a2zfs ONLINE 0 0 0
gpt/st30a3zfs ONLINE 0 0 0
gpt/st30b1zfs ONLINE 0 0 0
gpt/st30b2zfs ONLINE 0 0 0
gpt/st30b3zfs ONLINE 0 0 0
gpt/st30c1zfs ONLINE 0 0 0
gpt/st30c2zfs ONLINE 0 0 0
gpt/st30c3zfs ONLINE 0 0 0
gpt/st30d1zfs ONLINE 0 0 0
gpt/st30d2zfs ONLINE 0 0 0
gpt/st30d3zfs ONLINE 0 0 0
gpt/st30e1zfs ONLINE 0 0 0
gpt/st30e2zfs ONLINE 0 0 0
gpt/st30e3zfs ONLINE 0 0 0
gpt/st30f1zfs ONLINE 0 0 0
gpt/st30f2zfs ONLINE 0 0 0
gpt/st30f3zfs ONLINE 0 0 0
errors: No known data errors
|
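With 24 disks the status listing gets long, so it can help to filter it down to the entries that are not ONLINE. The following is only a rough sketch: `unhealthy_vdevs` is a made-up helper name, and it assumes the column layout shown above (NAME, STATE, READ, WRITE, CKSUM).

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` text on stdin and print
# "<pool> <name> <state>" for every config entry whose STATE is not ONLINE.
unhealthy_vdevs() {
    awk '
        $1 == "pool:"                 { pool = $2; in_config = 0; next }
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }   # config table starts
        $1 == "errors:"               { in_config = 0; next }   # config table ends
        in_config && NF >= 5 && $2 != "ONLINE" { print pool, $1, $2 }
    '
}

# Usage on the NAS itself:
#   zpool status -v | unhealthy_vdevs
```

Fed the listing above, this would leave only the pool0, raidz2-0 and GUID lines. The GUID is what `zpool replace` needs later when the old device node no longer exists.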
Looks like a disk dropped out. Trying to bring it back online:
Code: | # zpool online pool0 /dev/gpt/st30b0zfs
warning: device '/dev/gpt/st30b0zfs' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present
|
It seems to be beyond saving, so let's swap the disk.
Take the disk offline in the pool:
Code: | # zpool offline pool0 /dev/gpt/st30b0zfs
|
Spin the drive down:
Code: | # camcontrol stop da1
Unit stopped successfully
|
Pull it out~~~~
Rescan the bus:
Code: | # camcontrol rescan all
Re-scan of bus 0 was successful
Re-scan of bus 1 was successful
Re-scan of bus 2 was successful
Re-scan of bus 3 was successful
Re-scan of bus 4 was successful
Re-scan of bus 5 was successful
Re-scan of bus 6 was successful
Re-scan of bus 7 was successful
Re-scan of bus 8 was successful
# camcontrol devlist
<ATA> at scbus0 target 0 lun 0 (pass0,da0)
<ATA> at scbus0 target 2 lun 0 (pass2,da2)
<ATA> at scbus0 target 3 lun 0 (pass3,da3)
<ATA> at scbus0 target 4 lun 0 (pass4,da4)
<ATA> at scbus0 target 5 lun 0 (pass5,da5)
<ATA> at scbus0 target 6 lun 0 (pass6,da6)
<ATA> at scbus0 target 7 lun 0 (pass7,da7)
<ATA> at scbus1 target 0 lun 0 (pass8,da8)
<ATA> at scbus1 target 1 lun 0 (pass9,da9)
<ATA> at scbus1 target 2 lun 0 (pass10,da10)
<ATA> at scbus1 target 3 lun 0 (pass11,da11)
<ATA> at scbus1 target 4 lun 0 (pass12,da12)
<ATA> at scbus1 target 5 lun 0 (pass13,da13)
<ATA> at scbus1 target 6 lun 0 (pass14,da14)
<ATA> at scbus1 target 7 lun 0 (pass15,da15)
<ATA> at scbus2 target 0 lun 0 (pass16,da16)
<ATA> at scbus2 target 1 lun 0 (pass17,da17)
<ATA> at scbus2 target 2 lun 0 (pass18,da18)
<ATA> at scbus2 target 3 lun 0 (pass19,da19)
<INTEL> at scbus3 target 0 lun 0 (ada0,pass20)
<ST3000DM001> at scbus5 target 0 lun 0 (ada1,pass21)
<ST3000DM001> at scbus6 target 0 lun 0 (ada2,pass22)
<ST3000DM001> at scbus7 target 0 lun 0 (ada3,pass23)
<ST3000DM001> at scbus8 target 0 lun 0 (ada4,pass24)
|
OK, da1 is gone.
Plug the new one in~~~~
Code: | # camcontrol rescan all
Re-scan of bus 0 was successful
Re-scan of bus 1 was successful
Re-scan of bus 2 was successful
Re-scan of bus 3 was successful
Re-scan of bus 4 was successful
Re-scan of bus 5 was successful
Re-scan of bus 6 was successful
Re-scan of bus 7 was successful
Re-scan of bus 8 was successful
# camcontrol devlist
<ATA> at scbus0 target 0 lun 0 (pass0,da0)
<ATA> at scbus0 target 2 lun 0 (pass2,da2)
<ATA> at scbus0 target 3 lun 0 (pass3,da3)
<ATA> at scbus0 target 4 lun 0 (pass4,da4)
<ATA> at scbus0 target 5 lun 0 (pass5,da5)
<ATA> at scbus0 target 6 lun 0 (pass6,da6)
<ATA> at scbus0 target 7 lun 0 (pass7,da7)
<ATA> at scbus0 target 8 lun 0 (da20,pass1)
<ATA> at scbus1 target 0 lun 0 (pass8,da8)
<ATA> at scbus1 target 1 lun 0 (pass9,da9)
<ATA> at scbus1 target 2 lun 0 (pass10,da10)
<ATA> at scbus1 target 3 lun 0 (pass11,da11)
<ATA> at scbus1 target 4 lun 0 (pass12,da12)
<ATA> at scbus1 target 5 lun 0 (pass13,da13)
<ATA> at scbus1 target 6 lun 0 (pass14,da14)
<ATA> at scbus1 target 7 lun 0 (pass15,da15)
<ATA> at scbus2 target 0 lun 0 (pass16,da16)
<ATA> at scbus2 target 1 lun 0 (pass17,da17)
<ATA> at scbus2 target 2 lun 0 (pass18,da18)
<ATA> at scbus2 target 3 lun 0 (pass19,da19)
<INTEL> at scbus3 target 0 lun 0 (ada0,pass20)
<ST3000DM001> at scbus5 target 0 lun 0 (ada1,pass21)
<ST3000DM001> at scbus6 target 0 lun 0 (ada2,pass22)
<ST3000DM001> at scbus7 target 0 lun 0 (ada3,pass23)
<ST3000DM001> at scbus8 target 0 lun 0 (ada4,pass24)
|
A new disk, da20, has shown up, paired with pass1.
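Spotting one changed line among two dozen by eye is easy to get wrong, so here is a small sketch that compares two saved `camcontrol devlist` listings. The helper names `devlist_names` and `devlist_diff` are invented for illustration, and it assumes the listings were saved to files beforehand.

```shell
#!/bin/sh
# Hypothetical helper: extract the daN/adaN names from `camcontrol devlist`
# text on stdin, one per line, sorted.
devlist_names() {
    sed -n 's/.*(\([^)]*\)).*/\1/p' | tr ',' '\n' | grep -E '^a?da[0-9]+$' | LC_ALL=C sort
}

# Hypothetical helper: print "- name" for disks only in the first listing
# and "+ name" for disks only in the second listing.
devlist_diff() {
    tmp_before=$(mktemp "${TMPDIR:-/tmp}/devlist.XXXXXX") || return 1
    tmp_after=$(mktemp "${TMPDIR:-/tmp}/devlist.XXXXXX") || return 1
    devlist_names < "$1" > "$tmp_before"
    devlist_names < "$2" > "$tmp_after"
    LC_ALL=C comm -23 "$tmp_before" "$tmp_after" | sed 's/^/- /'
    LC_ALL=C comm -13 "$tmp_before" "$tmp_after" | sed 's/^/+ /'
    rm -f "$tmp_before" "$tmp_after"
}

# Usage:
#   camcontrol devlist > before.txt    # before inserting the new disk
#   camcontrol devlist > after.txt     # after inserting it and rescanning
#   devlist_diff before.txt after.txt
```

Comparing the two listings in this post that way should report only da20 as added.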
Partition and initialize:
Code: | # sh ./init.sh da20 st30b0
da20 created
da20p1 added
da20p2 added
|
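The contents of init.sh are not shown in this thread. Purely as a guess at what a script with that output might do, here is a dry-run sketch that only prints commands. Everything in it is an assumption: the function name `plan_init`, the 4G swap partition as p1, the `swap`/`zfs` label suffixes, and the 4K-sector gnop device (suggested only by the `.nop` name used in the replace step).

```shell
#!/bin/sh
# Hypothetical stand-in for init.sh; NOT the original script.
# Prints the commands instead of running them, so nothing touches a disk.
# Assumed layout: p1 = 4G swap, p2 = ZFS, both GPT-labelled from $2,
# plus a gnop provider with 4096-byte sectors on top of the ZFS label.
plan_init() {
    disk=$1
    label=$2
    echo "gpart create -s gpt $disk"
    echo "gpart add -t freebsd-swap -s 4G -l ${label}swap $disk"
    echo "gpart add -t freebsd-zfs -l ${label}zfs $disk"
    echo "gnop create -S 4096 /dev/gpt/${label}zfs"
}

# Usage: review the plan first, then run it by hand if it looks right.
#   plan_init da20 st30b0
```

Printing the plan instead of executing it is deliberate: partitioning the wrong disk in a 24-bay box is not something to find out after the fact.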
Start the replacement:
Code: | # zpool replace pool0 17537677132637262161 /dev/gpt/st30b0zfs.nop
# zpool status
pool: pool0
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Fri May 2 02:17:55 2014
14.2G scanned out of 5.47T at 134M/s, 11h49m to go
2.36G resilvered, 0.25% done
config:
NAME STATE READ WRITE CKSUM
pool0 DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
gpt/st30a0zfs ONLINE 0 0 0
replacing-1 OFFLINE 0 0 0
17537677132637262161 OFFLINE 0 0 0 was /dev/gpt/st30b0zfs
gpt/st30b0zfs.nop ONLINE 0 0 0 (resilvering)
gpt/st30c0zfs ONLINE 0 0 0
gpt/st30d0zfs ONLINE 0 0 0
gpt/st30e0zfs ONLINE 0 0 0
gpt/st30f0zfs ONLINE 0 0 0
errors: No known data errors
pool: pool1
state: ONLINE
scan: scrub repaired 0 in 1h31m with 0 errors on Sat Feb 1 13:53:13 2014
config:
NAME STATE READ WRITE CKSUM
pool1 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
gpt/st30a1zfs ONLINE 0 0 0
gpt/st30a2zfs ONLINE 0 0 0
gpt/st30a3zfs ONLINE 0 0 0
gpt/st30b1zfs ONLINE 0 0 0
gpt/st30b2zfs ONLINE 0 0 0
gpt/st30b3zfs ONLINE 0 0 0
gpt/st30c1zfs ONLINE 0 0 0
gpt/st30c2zfs ONLINE 0 0 0
gpt/st30c3zfs ONLINE 0 0 0
gpt/st30d1zfs ONLINE 0 0 0
gpt/st30d2zfs ONLINE 0 0 0
gpt/st30d3zfs ONLINE 0 0 0
gpt/st30e1zfs ONLINE 0 0 0
gpt/st30e2zfs ONLINE 0 0 0
gpt/st30e3zfs ONLINE 0 0 0
gpt/st30f1zfs ONLINE 0 0 0
gpt/st30f2zfs ONLINE 0 0 0
gpt/st30f3zfs ONLINE 0 0 0
errors: No known data errors
|
Well, now to wait patiently XD...
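For anyone who would rather not re-read the whole status while waiting, a tiny sketch that pulls out just the "% done" figure. `scan_percent` is a made-up name, and it assumes the "N% done" wording shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the completion percentage(s) found in
# `zpool status` text on stdin (one line per pool with a scan in progress).
scan_percent() {
    sed -n 's/.* \([0-9][0-9.]*\)% done.*/\1/p'
}

# Usage: print the progress of pool0 every ten minutes.
#   while sleep 600; do zpool status pool0 | scan_percent; done
```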
Last edited by gothic on Sun 2014-05-11 23:46:28; edited 1 time in total |
|
rui0hu (Half-Immortal)
Joined: 2014-05-01 Posts: 12
|
Posted: Fri 2014-05-02 16:13:45 Post subject: |
|
|
Impressive. From last year until now, that's almost a year. |
|
zhengwei_zw (Taoist Priest)

Joined: 2005-10-14 Posts: 667 From: SC=CD
|
Posted: Tue 2014-05-06 14:15:13 Post subject: |
|
|
Pull it out!!!
Plug it in!!!
So smooth! _________________ www.sklinux.com server maintenance |
|
rui0hu (Half-Immortal)
Joined: 2014-05-01 Posts: 12
|
Posted: Thu 2014-05-08 17:43:38 Post subject: |
|
|
zhengwei_zw wrote: | Pull it out!!!
Plug it in!!!
So smooth! |
Same here! One pull, one plug. |
|
gothic (Half-Immortal)
Joined: 2013-01-01 Posts: 12
|
Posted: Mon 2014-05-12 00:06:11 Post subject: |
|
|
Time for the promised follow-up~
Code: | # zpool status
pool: pool0
state: ONLINE
scan: resilvered 934G in 4h23m with 0 errors on Fri May 2 06:40:56 2014
config:
NAME STATE READ WRITE CKSUM
pool0 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
gpt/st30a0zfs ONLINE 0 0 0
gpt/st30b0zfs.nop ONLINE 0 0 0
gpt/st30c0zfs ONLINE 0 0 0
gpt/st30d0zfs ONLINE 0 0 0
gpt/st30e0zfs ONLINE 0 0 0
gpt/st30f0zfs ONLINE 0 0 0
errors: No known data errors
pool: pool1
state: ONLINE
scan: scrub in progress since Fri May 2 09:04:08 2014
2.23T scanned out of 3.66T at 688M/s, 0h36m to go
0 repaired, 60.96% done
config:
NAME STATE READ WRITE CKSUM
pool1 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
gpt/st30a1zfs ONLINE 0 0 0
gpt/st30a2zfs ONLINE 0 0 0
gpt/st30a3zfs ONLINE 0 0 0
gpt/st30b1zfs ONLINE 0 0 0
gpt/st30b2zfs ONLINE 0 0 0
gpt/st30b3zfs ONLINE 0 0 0
gpt/st30c1zfs ONLINE 0 0 0
gpt/st30c2zfs ONLINE 0 0 0
gpt/st30c3zfs ONLINE 0 0 0
gpt/st30d1zfs ONLINE 0 0 0
gpt/st30d2zfs ONLINE 0 0 0
gpt/st30d3zfs ONLINE 0 0 0
gpt/st30e1zfs ONLINE 0 0 0
gpt/st30e2zfs ONLINE 0 0 0
gpt/st30e3zfs ONLINE 0 0 0
gpt/st30f1zfs ONLINE 0 0 0
gpt/st30f2zfs ONLINE 0 0 0
gpt/st30f3zfs ONLINE 0 0 0
errors: No known data errors
|
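The resilver is complete: 934G of data in 4 hours 23 minutes, an average of 60.6M/s.
I then scrubbed the other pool: no errors, and verifying 3.66T of data took 1 hour 32 minutes, an average of 695M/s.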
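The two averages can be reproduced with a few lines of arithmetic. This sketch assumes the G and T figures are 1024-based, as zpool reports them; `avg_mib_per_s` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: average throughput in MiB/s for <gib> GiB
# processed in <hours> hours and <minutes> minutes.
avg_mib_per_s() {
    awk -v gib="$1" -v h="$2" -v m="$3" \
        'BEGIN { printf "%.1f\n", gib * 1024 / (h * 3600 + m * 60) }'
}

avg_mib_per_s 934 4 23        # resilver: 934G in 4h23m, prints 60.6
avg_mib_per_s 3747.84 1 32    # scrub: 3.66T = 3747.84G in 1h32m, prints 695.3
```

The second result rounds to 695.3, which matches the roughly 695M/s quoted above.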
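Takeaways from this repair:
1. ZFS is great and very capable.
2. After a disk drops out, reads and writes on the degraded pool stutter. If you suddenly notice I/O problems one day, consider whether a disk has dropped.
3. A ZFS resilver only rebuilds the parts that actually hold data, so the speed is quite good. My old RAID5 rebuilds took days, and losing another disk during a rebuild left nothing to do but cry.
4. I used to think a hot spare was unnecessary at home, but now I think it is useful. When I get busy I simply forget to scrub. Next time I upgrade the storage I will consider adding a hot spare, and I will have the box email me every week.
5. Be sure to record each disk's physical position, serial number, and device name so it can be located quickly.
6. Hot-swap is a big help.
7. If you can, go straight for an off-the-shelf 2U or 4U server instead of building your own; DIY really doesn't turn out as well. Some servers now handle airflow and noise very well and are perfectly usable at home. Used ones aren't expensive either.
8. 24 Seagate drives ran 24/7 for a year and one failed, which is acceptable. I scanned the removed drive with MHDD; a large number of bad sectors showed up around the 3X% mark, and I sent it back for a replacement.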
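On point 4, one possible shape for the periodic email is a small cron job that only speaks up when `zpool status -x` reports something other than "all pools are healthy". This is a sketch under assumptions: the function names and the recipient address are placeholders, and working local mail delivery is assumed.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the given `zpool status -x`
# text indicates a problem, i.e. it is not the all-healthy message.
pool_needs_attention() {
    [ "$1" != "all pools are healthy" ]
}

# Hypothetical helper: mail the full status to $1 when something is wrong.
check_and_mail() {
    recipient=$1
    summary=$(zpool status -x)
    if pool_needs_attention "$summary"; then
        zpool status -v | mail -s "ZFS problem on $(hostname)" "$recipient"
    fi
}

# Usage from cron (the address is a placeholder):
#   check_and_mail admin@example.com
```

FreeBSD's periodic(8) also ships ZFS status and scrub options (see /etc/defaults/periodic.conf), which may cover the same need without a custom script.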
Finally, here is the uptime; the only reboot in between was for the 9.1 to 9.2 upgrade.
Code: | # top
last pid: 15082; load averages: 0.00, 0.00, 0.00 up 152+02:33:05 15:29:30
28 processes: 1 running, 27 sleeping
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 5318M Active, 1204M Inact, 96G Wired, 1674M Buf, 22G Free
ARC: 93G Total, 77G MFU, 16G MRU, 36K Anon, 562M Header, 199M Other
Swap: 96G Total, 96G Free
|
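Thanks, FreeBSD.
Also, I wrote up some notes on storage and wanted to post them to the wiki, but after searching for a long time I couldn't find where to add a new entry. It seems text is uploaded directly, yet I only see upload directories for resources (images and the like). Where should the text go? |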
|
door10000 (Taoist Acolyte)
Joined: 2012-01-27 Posts: 278 From: Qiyang, Hunan
|
Posted: Thu 2014-05-15 12:16:17 Post subject: |
|
|
What is the network transfer speed? |
|
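xjflyttp (Half-Immortal)
Joined: 2012-09-25 Posts: 77
|
Posted: Wed 2014-09-24 13:23:23 Post subject: |
|
|
I usually check each drive's realloc and unc records periodically with smartctl, and replace a drive as soon as a realloc shows up. |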
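A rough sketch of that check, assuming the usual ten-column attribute table printed by `smartctl -A` with the raw value in the last column. Mapping "realloc" and "unc" to attribute IDs 5, 187 and 198 is my reading, not something stated in the post, and `smart_warnings` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print
# "<attribute name> <raw value>" for attributes 5 (Reallocated_Sector_Ct),
# 187 (Reported_Uncorrect) and 198 (Offline_Uncorrectable) when the raw
# value is greater than zero.
smart_warnings() {
    awk '($1 == 5 || $1 == 187 || $1 == 198) && $10 + 0 > 0 { print $2, $10 }'
}

# Usage:
#   smartctl -A /dev/da0 | smart_warnings
```

No output means none of the three counters has moved; any line printed is a reason to look at that drive more closely.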
|
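outcrop (Half-Immortal)
Joined: 2005-01-05 Posts: 128
|
Posted: Fri 2016-10-28 11:50:35 Post subject: Nice |
|
|
Couldn't find a bookmark button, so here's a manual reply to save the thread _________________ entering bsd world... Electromechanical engineer: www.jdgcs.org |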
|
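million (Taoist Acolyte)
Joined: 2002-07-09 Posts: 283 From: StarBucks Cafe
|
Posted: Thu 2016-11-10 19:58:37 Post subject: Re: Nice |
|
|
outcrop wrote: | Couldn't find a bookmark button, so here's a manual reply to save the thread |
Use the Subscribe link at the bottom left. |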
|