Bug #494

closed

Error during flashing (Enabled erase path optimisation)

Added by Alexander Goncharov 11 months ago. Updated about 2 months ago.

Status: Resolved
Priority: Normal
Category: -
Target version:
Start date: 06/14/2023
Due date:
% Done: 0%
Estimated time:
Affected versions:
Needs backport to:
Affected hardware:
Affected OS:

Description

I tried to flash a BIOS image to the W25Q128.V chip and encountered the following issue.

I ran the following commands:

$ cp ~/bios/img/flash.img ~/bios/img/flash.tr.img
$ truncate -s 16M "~/bios/img/flash.tr.img"
$ echo "00000:2d637e boot" > ~/bios/img/rom.layout
# flashrom --programmer ch341a_spi -l ~/bios/img/rom.layout -i boot -w ~/bios/img/flash.tr.img
flashrom 1.4.0-devel (git:v1.2-1298-g8e948f9) on Linux 6.3.3-arch1-1 (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Using region: "boot".
Found Winbond flash chip "W25Q128.V" (16384 kB, SPI) on ch341a_spi.
Reading old flash chip contents... done.
Region end not sector aligned! Extending end boundaries...
Erase/write done from 0 to 2d6fff
Verifying flash... FAILED at 0x002d637e! Expected=0x00, Found=0xff, failed byte count from 0x00000000-0x00ffffff: 0x2
Your flash chip is in an unknown state.
Please report this to the mailing list at flashrom@flashrom.org or
on IRC (see https://www.flashrom.org/Contact for details), thanks!
make: *** [Makefile:33: flash] Error 3

Immediately after this happened, I ran the same command again:

# flashrom --programmer ch341a_spi -l ~/bios/img/rom.layout -i boot -w ~/bios/img/flash.tr.img
flashrom 1.4.0-devel (git:v1.2-1298-g8e948f9) on Linux 6.3.3-arch1-1 (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Using region: "boot".
Found Winbond flash chip "W25Q128.V" (16384 kB, SPI) on ch341a_spi.
Reading old flash chip contents... done.
Region end not sector aligned! Extending end boundaries...
Erase/write done from 0 to 2d6fff

And it works... There is probably a bug in the erase mechanism.

Actions #1

Updated by Anastasia Klimchuk 8 months ago

  • Assignee set to Anastasia Klimchuk
Actions #2

Updated by Alexander Goncharov 8 months ago

I tested with this patch (https://review.coreboot.org/c/flashrom/+/77747). Unfortunately, it still gives an error.

NOTE: The failed byte count is now 0x1 instead of 0x2.

flashrom 1.4.0-devel (git:v1.2-1352-g0e529eb) on Linux 6.4.12-arch1-1 (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Using region: "boot".
Found Winbond flash chip "W25Q128.V" (16384 kB, SPI) on ch341a_spi.
Reading old flash chip contents... done.
Region end not sector aligned! Extending end boundaries...
Erase/write done from 0 to 2d6fff
Verifying flash... FAILED at 0x002d637e! Expected=0x00, Found=0xff, failed byte count from 0x00000000-0x00ffffff: 0x1
Your flash chip is in an unknown state.
Please report this to the mailing list at flashrom@flashrom.org or
on IRC (see https://www.flashrom.org/Contact for details), thanks!
Actions #3

Updated by Anastasia Klimchuk 7 months ago

I have an idea: since this chip can be emulated by dummyflasher, could you give me the initial and target images you are flashing, so that I can try to repro with dummy? If the issue is in the logic, the hardware should not matter: it should repro and fail at the verification step even with dummyflasher.
I don't know whether these images can be attached here; if not, could you send them to me by email?
Thank you!

Actions #4

Updated by Alexander Goncharov 7 months ago

Sorry for the delayed response. I'm unable to share the BIOS image publicly because of an NDA (the BIOS is still under development). So I looked for another image that gives the same error, and I found one!

Again, this is how I encountered the error:

[triste@reeva bins]$ sudo /home/triste/fw/flashrom/out/flashrom --programmer ch341a_spi --chip "W25Q128.V" --read dump_before_w25q128_151023.bin
flashrom 1.4.0-devel (git:v1.2-1356-g0223f76) on Linux 6.5.7-arch1-1 (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Found Winbond flash chip "W25Q128.V" (16384 kB, SPI) on ch341a_spi.
Reading flash... done.
[triste@reeva bins]$ make flash
cp bins/sbl.bin bins/sbl.tr.bin
truncate -s 16M "bins/sbl.tr.bin"
echo "00000:d00000 boot" > bins/sbl-rom.layout
sudo /home/triste/fw/flashrom/out/flashrom --programmer ch341a_spi --chip "W25Q128.V" -l bins/sbl-rom.layout -i boot -w bins/sbl.tr.bin
[sudo] password for triste: 
flashrom 1.4.0-devel (git:v1.2-1356-g0223f76) on Linux 6.5.7-arch1-1 (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Using region: "boot".
Found Winbond flash chip "W25Q128.V" (16384 kB, SPI) on ch341a_spi.
Reading old flash chip contents... done.
Region end not sector aligned! Extending end boundaries...
Erase/write done from 0 to d00fff
Verifying flash... FAILED at 0x00d00000! Expected=0x00, Found=0xff, failed byte count from 0x00000000-0x00ffffff: 0x2
Your flash chip is in an unknown state.
Please report this to the mailing list at flashrom@flashrom.org or
on IRC (see https://www.flashrom.org/Contact for details), thanks!
make: *** [Makefile:20: flash] Error 3
[triste@reeva bins]$ sudo /home/triste/fw/flashrom/out/flashrom --programmer ch341a_spi --chip "W25Q128.V" --read dump_after_w25q128_151023.bin
[sudo] password for triste: 
flashrom 1.4.0-devel (git:v1.2-1356-g0223f76) on Linux 6.5.7-arch1-1 (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Found Winbond flash chip "W25Q128.V" (16384 kB, SPI) on ch341a_spi.
Reading flash... done.

Unfortunately, the issue tracker refuses to attach these files (probably because of their size). I will send them to you by email, as you suggested. There will be 3 images:

dump before writing: dump_before_w25q128_151023.bin
writing with this binary: sbl.tr.bin
dump after writing: dump_after_w25q128_151023.bin

If anyone else is interested and needs the binaries, please let me know.

Actions #5

Updated by Anastasia Klimchuk 7 months ago

  • Status changed from New to In Progress
  • Target version changed from none to 1.4
  • Affected versions main added

Good news: I was able to repro with dummy with your images! I have this:

~/flashrom_repo/builddir3$ ./flashrom -p dummy:emulate=W25Q128FV,image=ticket494/dump_before_w25q128_151023.bin -l ticket494/sbl-rom.layout -i boot -w ticket494/sbl.tr.bin
flashrom 1.4.0-devel (git:v1.2-1367-g780689be) on Linux 6.2.0-34-generic (x86_64)
flashrom is free software, get the source code at https://flashrom.org
Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Using region: "boot".
Found Winbond flash chip "W25Q128.V" (16384 kB, SPI) on dummy.
Reading old flash chip contents... done.
Region end not sector aligned! Extending end boundaries...
Erase/write done from 0 to d00fff
Verifying flash... FAILED at 0x00d00000! Expected=0x00, Found=0xff, failed byte count from 0x00000000-0x00ffffff: 0x2
Your flash chip is in an unknown state.
Please report this to the mailing list at flashrom@flashrom.org or
on IRC (see https://www.flashrom.org/Contact for details), thanks!

So now I can do the debugging. I will update the ticket with any discoveries.

A note for anyone trying to run this with dummy: for this command, dummy updates the original image file (it emulates the write), so keep a backup of the file you pass to the image param. If you need to re-run the command, restore the file from the backup first.

Actions #6

Updated by Anastasia Klimchuk 7 months ago

I noticed that the layout only covers 13M, while the chip size is 16M.
So I did some experimenting and modified the layout file manually, extending the layout to match the full chip size.

This layout file doesn't work:

00000:d00000 boot
d00001:ffffff tail

This layout file works:

00000:ffffff boot

Given that the error says FAILED at 0x00d00000, this may have something to do with the layout region boundaries.

Actions #7

Updated by Anastasia Klimchuk 7 months ago

I just realised that the initial layout region is not 13M but 13M + 1 byte. This is why the end boundary was extended: the region containing that last extra byte is extended to fill a whole 4K block. And then something fails on that code branch.

As another experiment, I ran the write with a pure 13M layout, 00000:cfffff boot, and it worked successfully.

Not sure whether the 13M + 1 byte layout region was intentional or not, but it's very cool because it helped to discover an edge case!
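
The arithmetic behind this observation can be sketched in C. This is only an illustration, not flashrom code: the helper names align_end_inclusive and region_len are made up for this note, and the 4 KiB erase block size is taken from the "4K block" mentioned above. Layout ends are treated as inclusive, which is consistent with the log line "Erase/write done from 0 to d00fff".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: extend an INCLUSIVE region end to the last byte
 * of the erase block that contains it. block_size must be a power of two.
 */
static uint32_t align_end_inclusive(uint32_t end, uint32_t block_size)
{
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
    return end | (block_size - 1);
}

/* Hypothetical helper: number of bytes in an inclusive [start, end] range. */
static uint32_t region_len(uint32_t start, uint32_t end)
{
    assert(end >= start);
    return end - start + 1;
}
```

With a block size of 0x1000, an end of 0xd00000 extends to 0xd00fff, and the region 00000:d00000 is 0xd00001 bytes long (13M + 1 byte). An end of 0xcfffff is already the last byte of its block, so it stays unchanged and the region is exactly 0xd00000 bytes. The end 0x2d637e from the first report extends to 0x2d6fff in the same way.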

Actions #8

Updated by Anastasia Klimchuk 7 months ago

I have a few more discoveries. There is a branch of code in erasure_layout.c which runs when the region boundaries are extended:

    if (old_start - region_start) {
        read_flash(flashctx, curcontents + region_start, region_start, old_start - region_start);
        memcpy(old_start_buf, newcontents + region_start, old_start - region_start);
        memcpy(newcontents + region_start, curcontents + region_start, old_start - region_start);
    }
    if (region_end - old_end) {
        read_flash(flashctx, curcontents + old_end, old_end, region_end - old_end);
        memcpy(old_end_buf, newcontents + old_end, region_end - old_end);
        memcpy(newcontents + old_end, curcontents + old_end, region_end - old_end);
    }

It seems to copy the current contents of the chip into newcontents for the extended area, which would mean that the write operation does nothing for that extended area? That would explain the error message
Expected=0x00, Found=0xff

To experiment, I commented out that code, and this made a test fail: write_nonaligned_region_with_dummyflasher_test_success
After this I restored the code.

I looked into that test. It is very useful: it creates a layout region which is smaller than the chip erase block. I was wondering why it passes on head. I think it's because the new image it writes to the chip differs in only 2 bytes; the rest of the image is the same.
As an experiment, I changed the test to write a whole new image to the chip, and now the test fails on head:

diff --git a/tests/chip.c b/tests/chip.c
index 96be7b10..e947d552 100644
--- a/tests/chip.c
+++ b/tests/chip.c
@@ -451,7 +451,7 @@ void write_nonaligned_region_with_dummyflasher_test_success(void **state)
         * 0xAA 0xAA {MOCK_CHIP_SUBREGION_CONTENTS}, [..]}.
         */
        printf("Subregion chip write op..\n");
-       memset(newcontents, 0xAA, 2);
+       memset(newcontents, 0xAA, mock_chip_size);
        assert_int_equal(0, flashrom_image_write(&flashctx, newcontents, mock_chip_size, NULL));
        printf("Subregion chip write op done.\n");
Actions #9

Updated by Anastasia Klimchuk 6 months ago

I added a print message for every failed byte (there are only two failed bytes in this case).
The two failed bytes are exactly the first and the last bytes of the extended 4K block: 0x00d00000 and 0x00d00fff.

$ ./flashrom -p dummy:emulate=W25Q128FV,image=ticket494/dump_before_w25q128_151023.bin -l ticket494/sbl-rom.layout -i boot -w ticket494/sbl.tr.bin
flashrom 1.4.0-devel (git:v1.2-1371-ge094f04c) on Linux 6.2.0-34-generic (x86_64)
flashrom is free software, get the source code at https://flashrom.org

Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Using region: "boot".
Found Winbond flash chip "W25Q128.V" (16384 kB, SPI) on dummy.
Reading old flash chip contents... done.
Region end 0x00d00000 not sector aligned! Extend end boundaries by 0x00000fff bytes
Erase/write done from 0 to d00fff
Verifying flash... FAILED at 0x00d00000! Expected=0x00, Found=0xff,FAILED at 0x00d00fff! Expected=0xff, Found=0x00, failed byte count from 0x00000000-0x00ffffff: 0x2
Your flash chip is in an unknown state.
Please report this to the mailing list at flashrom@flashrom.org or
on IRC (see https://www.flashrom.org/Contact for details), thanks!
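
The two addresses above fit one possible reading of the branch quoted in comment #8, sketched below as a toy model. This is a hypothesis, not a confirmed root cause and not flashrom code: it assumes that old_end and region_end are inclusive offsets, and toy_mismatches, TOY_CHIP_SIZE and the 32-byte "chip" are invented for illustration. Under that assumption, copying region_end - old_end bytes starting at old_end covers old_end up to region_end - 1, so the first byte of the block keeps the old chip data and the last byte takes the new image data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_CHIP_SIZE 32u /* hypothetical tiny chip, not a real part */

/*
 * Toy model: builds the data written for an extended region and counts
 * bytes that differ from what verification expects. old_end and
 * region_end are INCLUSIVE (assumption). copy_from is the first offset
 * that is restored from the current chip contents before writing.
 */
static size_t toy_mismatches(size_t old_end, size_t region_end,
                             size_t copy_from, size_t bad[], size_t bad_cap)
{
    uint8_t cur[TOY_CHIP_SIZE], img[TOY_CHIP_SIZE], expect[TOY_CHIP_SIZE];
    size_t n = 0;

    assert(old_end < region_end && region_end < TOY_CHIP_SIZE);
    assert(copy_from + (region_end - old_end) <= TOY_CHIP_SIZE);

    memset(cur, 0xff, sizeof(cur)); /* chip before the write */
    memset(img, 0x00, sizeof(img)); /* new image */

    /* Expected result: new data inside the region, old data outside. */
    memcpy(expect, cur, sizeof(expect));
    memcpy(expect, img, old_end + 1);

    /* Same shape as the quoted branch: keep chip data in the extension. */
    memcpy(img + copy_from, cur + copy_from, region_end - old_end);

    /* "Write" offsets 0..region_end, then compare with the expectation. */
    memcpy(cur, img, region_end + 1);
    for (size_t i = 0; i < TOY_CHIP_SIZE; i++) {
        if (cur[i] != expect[i]) {
            if (n < bad_cap)
                bad[n] = i; /* remember the first few failing offsets */
            n++;
        }
    }
    return n;
}
```

In this model, with old_end = 16 and region_end = 23, passing copy_from = old_end leaves exactly two mismatching bytes, at offsets 16 and 23, which is the same first-and-last pattern as 0x00d00000 and 0x00d00fff. Passing copy_from = old_end + 1 leaves none. Whether this matches the real code path depends on the inclusive-offset assumption.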
Actions #10

Updated by Anastasia Klimchuk 6 months ago

The failing case is identified as follows:
1) the layout has a region which is not aligned with any of the erase blocks
2) therefore the region boundaries need to be extended to align with an erase block
3) specifically, the bug reproduces when
3.1) the end boundary needs to be extended
3.2) and the image to write has different bytes at the region boundary (different from the original data on the chip).

See also the patch with a potential fix: https://review.coreboot.org/c/flashrom/+/78984. The patch fixes all currently known failing cases.

On top of that, a few more test cases are added to cover the scenario: https://review.coreboot.org/c/flashrom/+/78985

Actions #11

Updated by Anastasia Klimchuk 5 months ago

The last thing I would like to do here is to test https://review.coreboot.org/c/flashrom/+/78984 with the original images where the issue was discovered, just to be sure. I understand they can't be shared, which is fine. If you could test with patch 78984 (without sharing the images) and tell me whether it works now, that would be great.
Thank you so much!

Actions #12

Updated by Anastasia Klimchuk about 2 months ago

  • Status changed from In Progress to Resolved