Message ID | 7e185931-8830-5f31-7abb-5419bb255991@suse.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
On 8/2/16 8:20 AM, Jeff Mahoney wrote:
> Hi all -
>
> While investigating a weird report on an internal list I found a few old
> commits that don't look quite right and that may be very old bugs. I
> know it's hard to go back nearly 15 years, especially in the days where
> very short commit messages were still acceptable, and try to remember
> why certain changes happened. In this case, a weird corner case[1]
> would've been caught, xfs_repair would've bailed, and a file system may
> have survived (despite obvious user error).
>
> 1/ Commit 5000d01d212f (white space cleanup)

Meh, not a white space cleanup, is it!

> diff --git a/libxlog/util.c b/libxlog/util.c
> index 7aca165..aa3093d 100644
> --- a/libxlog/util.c
> +++ b/libxlog/util.c
> @@ -49,8 +49,10 @@ header_check_uuid(xfs_mount_t *mp, xlog_rec_header_t
> *head)
>  	printf("* ERROR: mismatched uuid in log\n"
>  	       "* SB : %s\n* log: %s\n",
>  		uu_sb, uu_log);
> +
> +	memcpy(&mp->m_sb.sb_uuid, head->h_fs_uuid, sizeof(uuid_t));
>
> -	return 1;
> +	return 0;
>  }

However, after seeing the mismatch, it "fixes" it by copying the header
uuid into the mount point uuid.

But that doesn't seem like the right approach at all, and it renders all
the callers who check the return value of header_check_uuid pointless.
So yeah, doesn't look good to me.

> This one may well have been intended as a repair operation or perhaps it
> was accidentally duplicated from another chunk in the patch. At any
> rate, it hides a mismatched UUID between the log and the superblock from
> the rest of xfs_repair. The user sees the "error" message but it
> carries on anyway.
>
> 2) Commit d321ceac8da (add libxlog directory.)
>
> I believe this was supposed to be as simple as pushing some
> functionality from logprint into a new libxlog library, but the result
> is that things that used to return an error no longer did. The
> print_exit global that was initialized to 1 in logprint is initialized
> to 0 in libxlog and never set. So we always print an error message but
> then carry on. So even if the header_check_uuid() call above would fail
> properly, the error is printed and then the error condition ignored.

Sigh, yeah, the old commits are wild west. :(

In this case header_check_uuid returned 0 anyway,so print_exit would
not have helped, but I think you're right.

A perfect storm of derp. ;)

> -Jeff
>
> [1] The details are still murky, but what I got was that the user ran
> xfs_repair -L (yup, i know) on an image file that contained partitions.
> It found a valid XFS superblock and then "repaired" the file system to
> an empty state since everything it found was "corrupt." I suspect that
> there was an XFS file system on the raw image file, which was then
> partitioned without clearing the MBR, and the expected XFS file system
> was created on the first partition. xfs_repair was pointed at the whole
> image file, discovered the old superblock, and remade the fs in its own
> image since nothing was at the proper locations.

Oh, so it found 4 matching, old, valid, superblocks? Ugh. I don't know
how to protect against that, although <handwave> it probably should have
found a few non-matching supers along the way as well. I wonder if we
should be more cautious in that case.

I could imagine that maybe for each candidate super we find, we should
look at its geometry, and spot-check the other locations that it indicates
should contain a superblock. If we get enough semi-valid but conflicting
"sets," maybe we should bail out and ask. It's quite a corner case, tho.

Any chance you have full xfs_repair output?

-Eric
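The spot-check idea in the last paragraph of this message could be sketched roughly as below. This is only an illustration under stated assumptions: it works on an in-memory image rather than a device, the helper names `sb_magic_at` and `count_backup_supers` are invented for the example and are not xfs_repair functions, and it looks only for the `XFSB` magic at the start of each allocation group instead of comparing full geometry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* On-disk XFS superblock magic: the ASCII bytes "XFSB". */
static const unsigned char XFS_SB_MAGIC_BYTES[4] = { 'X', 'F', 'S', 'B' };

/* Hypothetical helper: is there a superblock magic at byte offset off? */
static int sb_magic_at(const unsigned char *img, size_t img_len, uint64_t off)
{
	assert(img != NULL);

	/* Reject offsets that would read past the end of the image. */
	if (off > img_len || img_len - off < sizeof(XFS_SB_MAGIC_BYTES))
		return 0;
	return memcmp(img + off, XFS_SB_MAGIC_BYTES,
		      sizeof(XFS_SB_MAGIC_BYTES)) == 0;
}

/*
 * Hypothetical spot-check: given the geometry claimed by a candidate
 * superblock (agcount allocation groups of agblocks blocks, each block
 * blocksize bytes), count how many AG starts carry a superblock magic.
 */
static unsigned int count_backup_supers(const unsigned char *img,
					size_t img_len, uint32_t agcount,
					uint32_t agblocks, uint32_t blocksize)
{
	unsigned int found = 0;
	uint32_t agno;

	for (agno = 0; agno < agcount; agno++) {
		/* Widen before multiplying so large geometries do not wrap. */
		uint64_t off = (uint64_t)agno * agblocks * blocksize;

		if (sb_magic_at(img, img_len, off))
			found++;
	}
	return found;
}
```

A caller could compare the returned count against `agcount` for each candidate and stop to ask the user when several candidates each find only part of their expected set. What threshold counts as "enough" conflicting sets is left open here, as it is in the message.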
On 8/2/16 11:00 AM, Eric Sandeen wrote:
> On 8/2/16 8:20 AM, Jeff Mahoney wrote:
>> Hi all -
>>
>> While investigating a weird report on an internal list I found a few old
>> commits that don't look quite right and that may be very old bugs. I
>> know it's hard to go back nearly 15 years, especially in the days where
>> very short commit messages were still acceptable, and try to remember
>> why certain changes happened. In this case, a weird corner case[1]
>> would've been caught, xfs_repair would've bailed, and a file system may
>> have survived (despite obvious user error).
>>
>> 1/ Commit 5000d01d212f (white space cleanup)
>
> Meh, not a white space cleanup, is it!
>
>> diff --git a/libxlog/util.c b/libxlog/util.c
>> index 7aca165..aa3093d 100644
>> --- a/libxlog/util.c
>> +++ b/libxlog/util.c
>> @@ -49,8 +49,10 @@ header_check_uuid(xfs_mount_t *mp, xlog_rec_header_t
>> *head)
>>  	printf("* ERROR: mismatched uuid in log\n"
>>  	       "* SB : %s\n* log: %s\n",
>>  		uu_sb, uu_log);
>> +
>> +	memcpy(&mp->m_sb.sb_uuid, head->h_fs_uuid, sizeof(uuid_t));
>>
>> -	return 1;
>> +	return 0;
>>  }
>
> However, after seeing the mismatch, it "fixes" it by copying the header
> uuid into the mount point uuid.
>
> But that doesn't seem like the right approach at all, and it renders all
> the callers who check the return value of header_check_uuid pointless.
> So yeah, doesn't look good to me.
>
>> This one may well have been intended as a repair operation or perhaps it
>> was accidentally duplicated from another chunk in the patch. At any
>> rate, it hides a mismatched UUID between the log and the superblock from
>> the rest of xfs_repair. The user sees the "error" message but it
>> carries on anyway.
>>
>> 2) Commit d321ceac8da (add libxlog directory.)
>>
>> I believe this was supposed to be as simple as pushing some
>> functionality from logprint into a new libxlog library, but the result
>> is that things that used to return an error no longer did. The
>> print_exit global that was initialized to 1 in logprint is initialized
>> to 0 in libxlog and never set. So we always print an error message but
>> then carry on. So even if the header_check_uuid() call above would fail
>> properly, the error is printed and then the error condition ignored.
>
> Sigh, yeah, the old commits are wild west. :(
>
> In this case header_check_uuid returned 0 anyway,so print_exit would
> not have helped, but I think you're right.
>
> A perfect storm of derp. ;)

Haha, yep.

>> -Jeff
>>
>> [1] The details are still murky, but what I got was that the user ran
>> xfs_repair -L (yup, i know) on an image file that contained partitions.
>> It found a valid XFS superblock and then "repaired" the file system to
>> an empty state since everything it found was "corrupt." I suspect that
>> there was an XFS file system on the raw image file, which was then
>> partitioned without clearing the MBR, and the expected XFS file system
>> was created on the first partition. xfs_repair was pointed at the whole
>> image file, discovered the old superblock, and remade the fs in its own
>> image since nothing was at the proper locations.
>
> Oh, so it found 4 matching, old, valid, superblocks? Ugh. I don't know
> how to protect against that, although <handwave> it probably should have
> found a few non-matching supers along the way as well. I wonder if we
> should be more cautious in that case.

Well, it at least found the non-matching log but then ran into the
trouble above.

> I could imagine that maybe for each candidate super we find, we should
> look at its geometry, and spot-check the other locations that it indicates
> should contain a superblock. If we get enough semi-valid but conflicting
> "sets," maybe we should bail out and ask. It's quite a corner case, tho.

I'm not sure a geometry check would've helped here. The superblock
geometry still would've covered the whole, unpartitioned device. Since
we're already linking with blkid, maybe a check to see if there's a
partition table on the device and bail if it sees one, unless forced?
The part that I'm still trying to explain is how it managed to get a
good magic from the log and then got the wrong uuid.

> Any chance you have full xfs_repair output?

Sure, below. I heard back from the reporter and confirmed that my
hypothesis of mkfs -> fdisk -> mkfs was what happened. He's on SLE12
SP1, so that means xfsprogs 3.2.1.

-Jeff

---

labadmin:/ssd # xfs_repair postgre.raw -L
Phase 1 - find and verify superblock...
        - reporting progress in intervals of 15 minutes
sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
resetting superblock root inode pointer to 128
sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
resetting superblock realtime bitmap ino pointer to 129
sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
resetting superblock realtime summary ino pointer to 130
Phase 2 - using internal log
        - zero log...
* ERROR: mismatched uuid in log
* SB : ae384026-e3ac-4350-90b8-6dc1ac91595d
* log: 4cb79a9a-4b85-40fd-aed5-c7f2de36a3f5
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
Metadata corruption detected at block 0x1/0x200 Metadata corruption detected at block 0x2/0x200 bad magic # 0x0 for agf 0 bad version # 0 for agf 0 bad length 0 for agf 0, should be 491520 bad magic # 0x0 for agi 0 bad version # 0 for agi 0 bad length # 0 for agi 0, should be 491520 reset bad agf for ag 0 reset bad agi for ag 0 bad agbno 0 for btbno root, agno 0 bad agbno 0 for btbcnt root, agno 0 bad agbno 0 for inobt root, agno 0 agi unlinked bucket 0 is 0 in ag 0 (inode=0) agi unlinked bucket 1 is 0 in ag 0 (inode=0) agi unlinked bucket 2 is 0 in ag 0 (inode=0) agi unlinked bucket 3 is 0 in ag 0 (inode=0) agi unlinked bucket 4 is 0 in ag 0 (inode=0) agi unlinked bucket 5 is 0 in ag 0 (inode=0) agi unlinked bucket 6 is 0 in ag 0 (inode=0) agi unlinked bucket 7 is 0 in ag 0 (inode=0) agi unlinked bucket 8 is 0 in ag 0 (inode=0) agi unlinked bucket 9 is 0 in ag 0 (inode=0) agi unlinked bucket 10 is 0 in ag 0 (inode=0) agi unlinked bucket 11 is 0 in ag 0 (inode=0) agi unlinked bucket 12 is 0 in ag 0 (inode=0) agi unlinked bucket 13 is 0 in ag 0 (inode=0) agi unlinked bucket 14 is 0 in ag 0 (inode=0) agi unlinked bucket 15 is 0 in ag 0 (inode=0) agi unlinked bucket 16 is 0 in ag 0 (inode=0) agi unlinked bucket 17 is 0 in ag 0 (inode=0) agi unlinked bucket 18 is 0 in ag 0 (inode=0) agi unlinked bucket 19 is 0 in ag 0 (inode=0) agi unlinked bucket 20 is 0 in ag 0 (inode=0) agi unlinked bucket 21 is 0 in ag 0 (inode=0) agi unlinked bucket 22 is 0 in ag 0 (inode=0) agi unlinked bucket 23 is 0 in ag 0 (inode=0) agi unlinked bucket 24 is 0 in ag 0 (inode=0) agi unlinked bucket 25 is 0 in ag 0 (inode=0) agi unlinked bucket 26 is 0 in ag 0 (inode=0) agi unlinked bucket 27 is 0 in ag 0 (inode=0) agi unlinked bucket 28 is 0 in ag 0 (inode=0) agi unlinked bucket 29 is 0 in ag 0 (inode=0) agi unlinked bucket 30 is 0 in ag 0 (inode=0) agi unlinked bucket 31 is 0 in ag 0 (inode=0) agi unlinked bucket 32 is 0 in ag 0 (inode=0) agi unlinked bucket 33 is 0 in ag 0 (inode=0) agi 
unlinked bucket 34 is 0 in ag 0 (inode=0) agi unlinked bucket 35 is 0 in ag 0 (inode=0) agi unlinked bucket 36 is 0 in ag 0 (inode=0) agi unlinked bucket 37 is 0 in ag 0 (inode=0) agi unlinked bucket 38 is 0 in ag 0 (inode=0) agi unlinked bucket 39 is 0 in ag 0 (inode=0) agi unlinked bucket 40 is 0 in ag 0 (inode=0) agi unlinked bucket 41 is 0 in ag 0 (inode=0) agi unlinked bucket 42 is 0 in ag 0 (inode=0) agi unlinked bucket 43 is 0 in ag 0 (inode=0) agi unlinked bucket 44 is 0 in ag 0 (inode=0) agi unlinked bucket 45 is 0 in ag 0 (inode=0) agi unlinked bucket 46 is 0 in ag 0 (inode=0) agi unlinked bucket 47 is 0 in ag 0 (inode=0) agi unlinked bucket 48 is 0 in ag 0 (inode=0) agi unlinked bucket 49 is 0 in ag 0 (inode=0) agi unlinked bucket 50 is 0 in ag 0 (inode=0) agi unlinked bucket 51 is 0 in ag 0 (inode=0) agi unlinked bucket 52 is 0 in ag 0 (inode=0) agi unlinked bucket 53 is 0 in ag 0 (inode=0) agi unlinked bucket 54 is 0 in ag 0 (inode=0) agi unlinked bucket 55 is 0 in ag 0 (inode=0) agi unlinked bucket 56 is 0 in ag 0 (inode=0) agi unlinked bucket 57 is 0 in ag 0 (inode=0) agi unlinked bucket 58 is 0 in ag 0 (inode=0) agi unlinked bucket 59 is 0 in ag 0 (inode=0) agi unlinked bucket 60 is 0 in ag 0 (inode=0) agi unlinked bucket 61 is 0 in ag 0 (inode=0) agi unlinked bucket 62 is 0 in ag 0 (inode=0) agi unlinked bucket 63 is 0 in ag 0 (inode=0) sb_fdblocks 7860416, counted 7368900 - 10:29:06: scanning filesystem freespace - 16 of 16 allocation groups done root inode chunk not found Phase 3 - for each AG... - scan and clear agi unlinked lists... - 10:29:06: scanning agi unlinked lists - 16 of 16 allocation groups done - process known inodes and perform inode discovery... 
- agno = 0 - agno = 15 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x40/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption 
detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 Metadata corruption detected at block 0x50/0x2000 bad magic number 0x0 on inode 128 bad version number 0x0 on inode 128 bad magic number 0x0 on inode 129 bad version number 0x0 on inode 129 bad magic number 0x0 on inode 130 bad version number 0x0 on inode 130 bad magic number 0x0 on inode 131 bad version number 0x0 on inode 131 bad magic number 0x0 on inode 132 bad version number 0x0 on inode 132 bad magic number 0x0 on inode 133 bad version number 0x0 on inode 133 bad magic number 0x0 on inode 134 bad version number 0x0 on inode 134 bad magic number 0x0 on inode 135 bad version number 0x0 on inode 135 bad magic number 0x0 on inode 136 bad version number 0x0 on inode 136 bad magic number 0x0 on inode 137 bad version number 0x0 on inode 137 bad magic number 0x0 on inode 138 bad version number 0x0 on inode 138 
bad magic number 0x0 on inode 139 bad version number 0x0 on inode 139 bad magic number 0x0 on inode 140 bad version number 0x0 on inode 140 bad magic number 0x0 on inode 141 bad version number 0x0 on inode 141 bad magic number 0x0 on inode 142 bad version number 0x0 on inode 142 bad magic number 0x0 on inode 143 bad version number 0x0 on inode 143 bad magic number 0x0 on inode 144 bad version number 0x0 on inode 144 bad magic number 0x0 on inode 145 bad version number 0x0 on inode 145 bad magic number 0x0 on inode 146 bad version number 0x0 on inode 146 bad magic number 0x0 on inode 147 bad version number 0x0 on inode 147 bad magic number 0x0 on inode 148 bad version number 0x0 on inode 148 bad magic number 0x0 on inode 149 bad version number 0x0 on inode 149 bad magic number 0x0 on inode 150 bad version number 0x0 on inode 150 bad magic number 0x0 on inode 151 bad version number 0x0 on inode 151 bad magic number 0x0 on inode 152 bad version number 0x0 on inode 152 bad magic number 0x0 on inode 153 bad version number 0x0 on inode 153 bad magic number 0x0 on inode 154 bad version number 0x0 on inode 154 bad magic number 0x0 on inode 155 bad version number 0x0 on inode 155 bad magic number 0x0 on inode 156 bad version number 0x0 on inode 156 bad magic number 0x0 on inode 157 bad version number 0x0 on inode 157 bad magic number 0x0 on inode 158 bad version number 0x0 on inode 158 bad magic number 0x0 on inode 159 bad version number 0x0 on inode 159 bad magic number 0x0 on inode 160 bad version number 0x0 on inode 160 bad magic number 0x0 on inode 161 bad version number 0x0 on inode 161 bad magic number 0x0 on inode 162 bad version number 0x0 on inode 162 bad magic number 0x0 on inode 163 bad version number 0x0 on inode 163 bad magic number 0x0 on inode 164 bad version number 0x0 on inode 164 bad magic number 0x0 on inode 165 bad version number 0x0 on inode 165 bad magic number 0x0 on inode 166 bad version number 0x0 on inode 166 bad magic number 0x0 on inode 167 bad 
version number 0x0 on inode 167 bad magic number 0x0 on inode 168 bad version number 0x0 on inode 168 bad magic number 0x0 on inode 169 bad version number 0x0 on inode 169 bad magic number 0x0 on inode 170 bad version number 0x0 on inode 170 bad magic number 0x0 on inode 171 bad version number 0x0 on inode 171 bad magic number 0x0 on inode 172 bad version number 0x0 on inode 172 bad magic number 0x0 on inode 173 bad version number 0x0 on inode 173 bad magic number 0x0 on inode 174 bad version number 0x0 on inode 174 bad magic number 0x0 on inode 175 bad version number 0x0 on inode 175 bad magic number 0x0 on inode 176 bad version number 0x0 on inode 176 bad magic number 0x0 on inode 177 bad version number 0x0 on inode 177 bad magic number 0x0 on inode 178 bad version number 0x0 on inode 178 bad magic number 0x0 on inode 179 bad version number 0x0 on inode 179 bad magic number 0x0 on inode 180 bad version number 0x0 on inode 180 bad magic number 0x0 on inode 181 bad version number 0x0 on inode 181 bad magic number 0x0 on inode 182 bad version number 0x0 on inode 182 bad magic number 0x0 on inode 183 bad version number 0x0 on inode 183 bad magic number 0x0 on inode 184 bad version number 0x0 on inode 184 bad magic number 0x0 on inode 185 bad version number 0x0 on inode 185 bad magic number 0x0 on inode 186 bad version number 0x0 on inode 186 bad magic number 0x0 on inode 187 bad version number 0x0 on inode 187 bad magic number 0x0 on inode 188 bad version number 0x0 on inode 188 bad magic number 0x0 on inode 189 bad version number 0x0 on inode 189 bad magic number 0x0 on inode 190 bad version number 0x0 on inode 190 bad magic number 0x0 on inode 191 bad version number 0x0 on inode 191 bad magic number 0x0 on inode 128, resetting magic number bad version number 0x0 on inode 128, resetting version number imap claims a free inode 128 is in use, correcting imap and clearing inode cleared root inode 128 bad magic number 0x0 on inode 129, resetting magic number bad version 
number 0x0 on inode 129, resetting version number imap claims a free inode 129 is in use, correcting imap and clearing inode cleared realtime bitmap inode 129 bad magic number 0x0 on inode 130, resetting magic number bad version number 0x0 on inode 130, resetting version number imap claims a free inode 130 is in use, correcting imap and clearing inode cleared realtime summary inode 130 bad magic number 0x0 on inode 131, resetting magic number bad version number 0x0 on inode 131, resetting version number bad magic number 0x0 on inode 132, resetting magic number bad version number 0x0 on inode 132, resetting version number bad magic number 0x0 on inode 133, resetting magic number bad version number 0x0 on inode 133, resetting version number bad magic number 0x0 on inode 134, resetting magic number bad version number 0x0 on inode 134, resetting version number bad magic number 0x0 on inode 135, resetting magic number bad version number 0x0 on inode 135, resetting version number bad magic number 0x0 on inode 136, resetting magic number bad version number 0x0 on inode 136, resetting version number bad magic number 0x0 on inode 137, resetting magic number bad version number 0x0 on inode 137, resetting version number bad magic number 0x0 on inode 138, resetting magic number bad version number 0x0 on inode 138, resetting version number bad magic number 0x0 on inode 139, resetting magic number bad version number 0x0 on inode 139, resetting version number bad magic number 0x0 on inode 140, resetting magic number bad version number 0x0 on inode 140, resetting version number bad magic number 0x0 on inode 141, resetting magic number bad version number 0x0 on inode 141, resetting version number bad magic number 0x0 on inode 142, resetting magic number bad version number 0x0 on inode 142, resetting version number bad magic number 0x0 on inode 143, resetting magic number bad version number 0x0 on inode 143, resetting version number bad magic number 0x0 on inode 144, resetting magic 
number bad version number 0x0 on inode 144, resetting version number bad magic number 0x0 on inode 145, resetting magic number bad version number 0x0 on inode 145, resetting version number bad magic number 0x0 on inode 146, resetting magic number bad version number 0x0 on inode 146, resetting version number bad magic number 0x0 on inode 147, resetting magic number bad version number 0x0 on inode 147, resetting version number bad magic number 0x0 on inode 148, resetting magic number bad version number 0x0 on inode 148, resetting version number bad magic number 0x0 on inode 149, resetting magic number bad version number 0x0 on inode 149, resetting version number bad magic number 0x0 on inode 150, resetting magic number bad version number 0x0 on inode 150, resetting version number bad magic number 0x0 on inode 151, resetting magic number bad version number 0x0 on inode 151, resetting version number bad magic number 0x0 on inode 152, resetting magic number bad version number 0x0 on inode 152, resetting version number bad magic number 0x0 on inode 153, resetting magic number bad version number 0x0 on inode 153, resetting version number bad magic number 0x0 on inode 154, resetting magic number bad version number 0x0 on inode 154, resetting version number bad magic number 0x0 on inode 155, resetting magic number bad version number 0x0 on inode 155, resetting version number bad magic number 0x0 on inode 156, resetting magic number bad version number 0x0 on inode 156, resetting version number bad magic number 0x0 on inode 157, resetting magic number bad version number 0x0 on inode 157, resetting version number bad magic number 0x0 on inode 158, resetting magic number bad version number 0x0 on inode 158, resetting version number bad magic number 0x0 on inode 159, resetting magic number bad version number 0x0 on inode 159, resetting version number bad magic number 0x0 on inode 160, resetting magic number bad version number 0x0 on inode 160, resetting version number bad magic 
number 0x0 on inode 161, resetting magic number bad version number 0x0 on inode 161, resetting version number bad magic number 0x0 on inode 162, resetting magic number bad version number 0x0 on inode 162, resetting version number bad magic number 0x0 on inode 163, resetting magic number bad version number 0x0 on inode 163, resetting version number bad magic number 0x0 on inode 164, resetting magic number bad version number 0x0 on inode 164, resetting version number bad magic number 0x0 on inode 165, resetting magic number bad version number 0x0 on inode 165, resetting version number bad magic number 0x0 on inode 166, resetting magic number bad version number 0x0 on inode 166, resetting version number bad magic number 0x0 on inode 167, resetting magic number bad version number 0x0 on inode 167, resetting version number bad magic number 0x0 on inode 168, resetting magic number bad version number 0x0 on inode 168, resetting version number bad magic number 0x0 on inode 169, resetting magic number bad version number 0x0 on inode 169, resetting version number bad magic number 0x0 on inode 170, resetting magic number bad version number 0x0 on inode 170, resetting version number bad magic number 0x0 on inode 171, resetting magic number bad version number 0x0 on inode 171, resetting version number bad magic number 0x0 on inode 172, resetting magic number bad version number 0x0 on inode 172, resetting version number bad magic number 0x0 on inode 173, resetting magic number bad version number 0x0 on inode 173, resetting version number bad magic number 0x0 on inode 174, resetting magic number bad version number 0x0 on inode 174, resetting version number bad magic number 0x0 on inode 175, resetting magic number bad version number 0x0 on inode 175, resetting version number bad magic number 0x0 on inode 176, resetting magic number bad version number 0x0 on inode 176, resetting version number bad magic number 0x0 on inode 177, resetting magic number bad version number 0x0 on inode 
177, resetting version number bad magic number 0x0 on inode 178, resetting magic number bad version number 0x0 on inode 178, resetting version number bad magic number 0x0 on inode 179, resetting magic number bad version number 0x0 on inode 179, resetting version number bad magic number 0x0 on inode 180, resetting magic number bad version number 0x0 on inode 180, resetting version number bad magic number 0x0 on inode 181, resetting magic number bad version number 0x0 on inode 181, resetting version number bad magic number 0x0 on inode 182, resetting magic number bad version number 0x0 on inode 182, resetting version number bad magic number 0x0 on inode 183, resetting magic number bad version number 0x0 on inode 183, resetting version number bad magic number 0x0 on inode 184, resetting magic number bad version number 0x0 on inode 184, resetting version number bad magic number 0x0 on inode 185, resetting magic number bad version number 0x0 on inode 185, resetting version number bad magic number 0x0 on inode 186, resetting magic number bad version number 0x0 on inode 186, resetting version number bad magic number 0x0 on inode 187, resetting magic number bad version number 0x0 on inode 187, resetting version number bad magic number 0x0 on inode 188, resetting magic number bad version number 0x0 on inode 188, resetting version number bad magic number 0x0 on inode 189, resetting magic number bad version number 0x0 on inode 189, resetting version number bad magic number 0x0 on inode 190, resetting magic number bad version number 0x0 on inode 190, resetting version number bad magic number 0x0 on inode 191, resetting magic number bad version number 0x0 on inode 191, resetting version number - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - 10:29:06: process known inodes and inode discovery - 64 of 0 inodes done - process newly discovered inodes... 
        - 10:29:06: process newly discovered inodes - 16 of 16 allocation groups done
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
root inode lost
        - 10:29:06: setting up duplicate extent list - 16 of 16 allocation groups done
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 10
        - agno = 13
        - agno = 6
        - agno = 5
        - agno = 8
        - agno = 1
        - agno = 7
        - agno = 9
        - agno = 11
        - agno = 12
        - agno = 2
        - agno = 15
        - agno = 14
        - agno = 4
        - 10:29:06: check for inodes claiming duplicate blocks - 64 of 0 inodes done
Phase 5 - rebuild AG headers and trees...
        - 10:29:06: rebuild AG headers and trees - 16 of 16 allocation groups done
        - reset superblock...
Phase 6 - check inode connectivity...
reinitializing root directory
reinitializing realtime bitmap inode
reinitializing realtime summary inode
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
resetting inode 128 nlinks from 1 to 2
done
On 8/2/16 1:51 PM, Jeff Mahoney wrote:
> On 8/2/16 11:00 AM, Eric Sandeen wrote:

...

>> I could imagine that maybe for each candidate super we find, we should
>> look at its geometry, and spot-check the other locations that it indicates
>> should contain a superblock. If we get enough semi-valid but conflicting
>> "sets," maybe we should bail out and ask. It's quite a corner case, tho.
>
> I'm not sure a geometry check would've helped here. The superblock
> geometry still would've covered the whole, unpartitioned device. Since
> we're already linking with blkid, maybe a check to see if there's a
> partition table on the device and bail if it sees one, unless forced?
> The part that I'm still trying to explain is how it managed to get a
> good magic from the log and then got the wrong uuid.

Hm, yeah, maybe right off the bat, if the primary super looks bad do a
blkid check, and a (sigh) "are you sure?" just like we do for mkfs.

>> Any chance you have full xfs_repair output?
>
> Sure, below. I heard back from the reporter and confirmed that my
> hypothesis of mkfs -> fdisk -> mkfs was what happened. He's on SLE12
> SP1, so that means xfsprogs 3.2.1.
>
> -Jeff
>
> ---
>
> labadmin:/ssd # xfs_repair postgre.raw -L
> Phase 1 - find and verify superblock...
> - reporting progress in intervals of 15 minutes
> sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with
> calculated value 128
> resetting superblock root inode pointer to 128
> sb realtime bitmap inode 18446744073709551615 (NULLFSINO) inconsistent
> with calculated value 129
> resetting superblock realtime bitmap ino pointer to 129
> sb realtime summary inode 18446744073709551615 (NULLFSINO) inconsistent
> with calculated value 130
> resetting superblock realtime summary ino pointer to 130

huh. Ok, so I guess the primary in block zero was mostly-ok, and all of
the backup supers were still more or less intact, and it didn't have to
go searching...

-Eric
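The "is there a partition table here?" guard discussed above might be sketched as follows. The thread proposes using libblkid; this example deliberately does not call libblkid and is only a simplified stand-in that inspects an in-memory copy of sector 0. The helper names and the `force` parameter are assumptions made for the illustration. It checks for the `XFSB` magic at offset 0 and the MBR boot signature (bytes 0x55 0xAA at offsets 510 and 511).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE	512
#define MBR_SIG_OFFSET	510	/* MBR boot signature: 0x55 0xAA */

/* Hypothetical helper: sector 0 starts with the XFS superblock magic. */
static int looks_like_xfs_sb(const unsigned char *sector0, size_t len)
{
	return len >= 4 && memcmp(sector0, "XFSB", 4) == 0;
}

/* Hypothetical helper: sector 0 ends with the MBR boot signature. */
static int has_mbr_signature(const unsigned char *sector0, size_t len)
{
	return len >= SECTOR_SIZE &&
	       sector0[MBR_SIG_OFFSET] == 0x55 &&
	       sector0[MBR_SIG_OFFSET + 1] == 0xAA;
}

/*
 * Returns 1 when the caller should stop and ask the user: sector 0 looks
 * like both an XFS superblock and an MBR, and the operation was not
 * forced. Returns 0 otherwise.
 */
static int should_refuse_repair(const unsigned char *sector0, size_t len,
				int force)
{
	assert(sector0 != NULL);

	if (force)
		return 0;
	return looks_like_xfs_sb(sector0, len) &&
	       has_mbr_signature(sector0, len);
}
```

The boot signature alone is a weak signal, since it can also appear on sectors that hold no partition table, so a real check would need to examine the partition entries as well or rely on libblkid as proposed in the thread.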
diff --git a/libxlog/util.c b/libxlog/util.c
index 7aca165..aa3093d 100644
--- a/libxlog/util.c
+++ b/libxlog/util.c
@@ -49,8 +49,10 @@ header_check_uuid(xfs_mount_t *mp, xlog_rec_header_t *head)
 	printf("* ERROR: mismatched uuid in log\n"
 	       "* SB : %s\n* log: %s\n",
 		uu_sb, uu_log);
+
+	memcpy(&mp->m_sb.sb_uuid, head->h_fs_uuid, sizeof(uuid_t));
 
-	return 1;
+	return 0;
 }
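For contrast with the hunk above, a check that reports the mismatch to its caller and leaves the superblock UUID alone might look like the sketch below. The struct types are simplified stand-ins, not the real `xfs_mount_t` or `xlog_rec_header_t`, and the function name `check_log_uuid` is hypothetical. It is meant only to show the "return non-zero, do not overwrite" behaviour the thread argues for.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define UUID_LEN 16

/* Simplified stand-ins for the superblock and the log record header. */
struct fake_sb      { unsigned char sb_uuid[UUID_LEN]; };
struct fake_log_hdr { unsigned char h_fs_uuid[UUID_LEN]; };

/*
 * Returns 0 when the UUIDs match and 1 on mismatch. The superblock UUID
 * is never modified, so a caller that tests the return value actually
 * sees the error and can bail out.
 */
static int check_log_uuid(const struct fake_sb *sb,
			  const struct fake_log_hdr *head)
{
	assert(sb != NULL && head != NULL);

	if (memcmp(sb->sb_uuid, head->h_fs_uuid, UUID_LEN) == 0)
		return 0;

	fprintf(stderr, "* ERROR: mismatched uuid in log\n");
	return 1;
}
```

Taking both arguments as `const` pointers makes the "no silent repair" property visible in the signature: the function cannot copy the log UUID into the superblock even by accident.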