Project

General

Profile

Actions

Bug #525

open

Potential clash when custom get_flash_region() called in erase_write

Added by Anastasia Klimchuk 3 months ago. Updated about 1 month ago.

Status:
New
Priority:
Normal
Category:
-
Target version:
Start date:
02/05/2024
Due date:
% Done:

0%

Estimated time:
Affected versions:
Needs backport to:
Affected hardware:
Affected OS:

Description

erase_write in erasure_layout.c makes a call to get_flash_region(). This call happens after erase layout is calculated, however get_flash_region() can potentially return smaller region.

Issue needs research (and fix if needed).

Originally reported by Vincent Fazio, copying the report below:

As part of looking over #1 and #2, I developed some slight concerns about the loop on line 314 which is separate from
this patchset. I don't know if it warrants an issue and further discussion, and I haven't spelunked into the erase function
selection logic, but the concern I have there is we've already pre-calculated what erase functions we're using but
`get_flash_region` could return a region with a shorter end range, meaning the selected erase block fn could erase part
of the next region. So if we had a 32k erase block fn, it seems like if region.end forces the length of the erasable region
to 4k, we don't actually select a 4k erase function and instead continue to use a 32k function. If the next region is write
protected it seems like we'd have verification errors.

I think this currently only affects masters that have get_region defined, so `opaque_master_ich_hwseq` is at risk here.
This may have been less of a risk with the legacy path because, from what I remember, it always used the smallest working
erase block function, however the new path adjusts the function used based on the amount of changed data and coalesces blocks
when possible (I think).
Actions #1

Updated by Anastasia Klimchuk about 1 month ago

Looked into the code today, few things that I need to clarify. Writing down here

1) Line 319 in erasure_layout.c adjusts len.
What happens if len is indeed adjusted?
What happens with remaining area? Seems like it is ignored by next loop iteration?

2) What is the purpose of get_flash_region (in the context here)? Is it only to check region is not protected?
Why do we need to call get_flash_region at all (in erasure_layout) if check_for_unwritable_regions already checked protected regions earlier?
What is the added value of get_flash_region in this case?

3) It seems that for ichspi the layout should repeat the fd_regions layout?
Then the algo respects layout boundaries and works fine
Is it possible to enforce this? (Enforce layouts being the same?)

5) Can we erase across fd_regions? Any issues apart from write protection? Do these fd_regions have any physical difference for the chip memory ?

Actions #2

Updated by Anastasia Klimchuk about 1 month ago

We just discussed this with Nikolai, and tldr yes this is a legit issue. Vincent thank you so much for discovering!

Some more details, in addition to my previous comment:

1->
Remaining area will be fine, because loop increments by len, which is real len of erased block. So no area will be missed.
BUT the real issue is that erasefn is already pre-selected, so the opcode is selected, and even if region len is adjusted, the same opcode will be sent, and the same (pre-selected) size will be erased.

2+5 ->
The main purpose is to detect protected regions (which are currently only relevant for Intel fd_regions, but still). However, there are protected regions on Intel, so this is relevant use case.
fd_regions typically align with 4K block. So the smallest erase block will typically work, however optimisation can select large erase block (to optimise!) and this can cover multiple fd_regions

3 ->
The logical layout not always the same as fd_regions, although if they are the same then the issue won't repro.

4->
check_for_unwritable_regions is only called when flag skip_unwritable_regions is not set: the flag which tells to abort when any protected region found.
When the flag is set, we need to skip protected regions. But importantly we still need to detect them, and do not send opcodes for them (do not try to erase them!)

Actions

Also available in: Atom PDF