-
-
Notifications
You must be signed in to change notification settings - Fork 6
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
3 changed files
with
70 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# 2024-12-11 Firmware upgrade on off2 | ||
|
||
We noticed that on off2, the latency was really high for two of the hard drives, | ||
namely sdc and sdd. | ||
|
||
After some research, Christian found that it could be linked to an old firmware on the disks. | ||
|
||
## Preparation | ||
|
||
We can see firmware version by using (here for sdd): | ||
```bash | ||
smartctl -a /dev/sdd|grep -i firmware | ||
``` | ||
It was `GB01` on the two culprits, while it was `GB03` on the others. | ||
|
||
Christian downloaded the firmware update for the disks, | ||
it comes as an installer: `Serial-ATA_Firmware_0942Y_LN_GB03_A00.BIN` | ||
|
||
You can find it by using the "Device Model" entry from `smartctl -a` (here `TOSHIBA MG07ACA14TEY`) | ||
and search for firmwaon the flyre. You land on `https://www.dell.com/support/home/fr-fr/drivers/driversdetails?driverid=0942y` | ||
|
||
|
||
## Changing the firmware | ||
|
||
The firmware update is happening on all disk on the server. | ||
|
||
We just run the program as root and it went smoothly. | ||
|
||
But we wanted to be sure that the firmware upgrade was taken into account | ||
(we didn't have the time to wait for a check, | ||
and where not sure of the figure seen with `zpool iostat`). | ||
|
||
So we decided to remove and re-add the disks to the server. | ||
Because our ZFS is raidz we can remove one disk at a time without losing data, | ||
and because we will remove it only for a short period, re-silver will be fast. | ||
|
||
So we removed sdd from ZFS pools | ||
```bash | ||
zpool offline zfs-hdd sdd4 | ||
# for rpool the name is different, but we can recognize thanks to the serial number (smartctl -a gives it) | ||
zpool offline rpool ata-TOSHIBA_MG07ACA14TEY_X8F0A0H8F9RG-part3 | ||
``` | ||
|
||
We unplugged the disk and plugged it back in. | ||
|
||
Then we re-added it to the ZFS pools: | ||
```bash | ||
zpool online zfs-hdd sdd4 | ||
zpool online rpool ata-TOSHIBA_MG07ACA14TEY_X8F0A0H8F9RG-part3 | ||
``` | ||
|
||
We then wait for re-silver to be completed, zpool status displays if a resilver is in progress or done: | ||
```bash | ||
zpool status | ||
``` | ||
It was almost instantaneous. | ||
|
||
We did the same operation for the other disk, `sdc`. | ||
|
||
Note: we did not offlined the disk from rpool because we forgot to do it. It went well the same, | ||
because ZFS was resilient enough and did the resilver automatically. | ||
|
||
|
||
## Result | ||
|
||
The result is easy to see in munin: | ||
|
||
![disk latency graph](./media/2024-12-11-munin-off2-latency-after-firmware-update.png) | ||
|
||
![disk usage graph](./media/2024-12-11-munin-off2-usage-after-firmware-update.png) |
Binary file added
BIN
+77.1 KB
docs/reports/media/2024-12-11-munin-off2-latency-after-firmware-update.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+61.3 KB
docs/reports/media/2024-12-11-munin-off2-usage-after-firmware-update.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.