[v3,2/3] hw/sd: model a power-up delay, as a workaround for an EDK2 bug

Message ID 1454364936-18940-3-git-send-email-Andrew.Baumann@microsoft.com (mailing list archive)
State New, archived

Commit Message

Andrew Baumann Feb. 1, 2016, 10:15 p.m. UTC
The SD spec for ACMD41 says that a zero argument is an "inquiry"
ACMD41, which does not start initialisation and is used only for
retrieving the OCR. However, Tianocore EDK2 (UEFI) has a bug [1]: it
first sends an inquiry (zero) ACMD41. If that first request returns an
OCR value with the power up bit (0x80000000) set, it assumes the card
is ready and continues, leaving the card in the wrong state. (My
assumption is that this works on hardware, because no real card is
immediately powered up upon reset.)

This change models a delay of 0.5ms from the first ACMD41 to the power
being up. However, it also immediately sets the power on upon seeing a
non-zero (non-enquiry) ACMD41. This speeds up UEFI boot; it should
also account for guests that simply delay after card reset and then
issue an ACMD41 that they expect will succeed.

[1] https://github.com/tianocore/edk2/blob/master/EmbeddedPkg/Universal/MmcDxe/MmcIdentification.c#L279
(This is the loop starting with "We need to wait for the MMC or SD
card is ready")

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
---
Obviously this is a bug that should be fixed in EDK2. However, this
initialisation appears to have been around for quite a while in EDK2
(in various forms), and the fact that it has obviously worked with so
many real SD/MMC cards makes me think that it would be pragmatic to
have the workaround in QEMU as well.

You might argue that the delay timer should start on sd_reset(), and
not the first ACMD41. However, that doesn't work reliably with UEFI,
because a large delay often elapses between the two (particularly in
debug builds that do lots of printing to the serial port). If the
timer fires too early, we'll still hit the bug, but we also don't want
to set a huge timeout value, because some guests may depend on it
expiring.
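
For readers following the thread, the power-up behaviour described in the
commit message can be sketched outside QEMU roughly as below. This is only
an illustration under stated assumptions: ToyCard, toy_reset(),
toy_powerup() and toy_acmd41() are hypothetical names, the timer is reduced
to a deadline checked on each command, and virtual time is passed in by the
caller. It is not the patch's code and does not use QEMU's timer API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACMD41_ENQUIRY_MASK 0x00ffffffu
#define OCR_POWER_UP        0x80000000u
#define OCR_POWER_DELAY_NS  500000      /* 0.5ms, the delay used in the patch */

/* Hypothetical stand-in for the card; this is not QEMU's SDState. */
typedef struct {
    uint32_t ocr;         /* OCR register; bit 31 is the power up bit */
    bool timer_pending;   /* stand-in for timer_pending() */
    int64_t deadline_ns;  /* virtual time at which power up completes */
    bool ready;           /* stand-in for reaching sd_ready_state */
} ToyCard;

static void toy_reset(ToyCard *c)
{
    c->ocr = 0x00ffff00u; /* all voltages OK, not yet powered up */
    c->timer_pending = false;
    c->deadline_ns = 0;
    c->ready = false;
}

static void toy_powerup(ToyCard *c)
{
    assert(!(c->ocr & OCR_POWER_UP));
    c->ocr |= OCR_POWER_UP;
    c->timer_pending = false;
}

/* Handle one ACMD41 in idle state at virtual time now_ns; returns the OCR. */
static uint32_t toy_acmd41(ToyCard *c, uint32_t arg, int64_t now_ns)
{
    /* Assumption: the timer callback is modelled as a deadline check. */
    if (c->timer_pending && now_ns >= c->deadline_ns) {
        toy_powerup(c);
    }

    if (!(c->ocr & OCR_POWER_UP)) {
        if (arg & ACMD41_ENQUIRY_MASK) {
            /* Non-enquiry ACMD41: report power on immediately. */
            toy_powerup(c);
        } else if (!c->timer_pending) {
            /* First enquiry ACMD41: start the power-up delay. */
            c->timer_pending = true;
            c->deadline_ns = now_ns + OCR_POWER_DELAY_NS;
        }
    }

    /* Only a non-enquiry ACMD41 moves the card to the ready state. */
    if (arg & ACMD41_ENQUIRY_MASK) {
        c->ready = true;
    }
    return c->ocr;
}
```

With this model, an enquiry ACMD41 sent right after reset returns an OCR
without the power up bit, which is the condition that keeps the EDK2 loop
polling; a non-zero ACMD41 sees the bit set at once.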

 hw/sd/sd.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 77 insertions(+), 6 deletions(-)
 mode change 100644 => 100755 hw/sd/sd.c

Comments

Peter Maydell Feb. 3, 2016, 2:22 p.m. UTC | #1
On 1 February 2016 at 22:15, Andrew Baumann
<Andrew.Baumann@microsoft.com> wrote:
> The SD spec for ACMD41 says that a zero argument is an "inquiry"
> ACMD41, which does not start initialisation and is used only for
> retrieving the OCR. However, Tianocore EDK2 (UEFI) has a bug [1]: it
> first sends an inquiry (zero) ACMD41. If that first request returns an
> OCR value with the power up bit (0x80000000) set, it assumes the card
> is ready and continues, leaving the card in the wrong state. (My
> assumption is that this works on hardware, because no real card is
> immediately powered up upon reset.)
>
> This change models a delay of 0.5ms from the first ACMD41 to the power
> being up. However, it also immediately sets the power on upon seeing a
> non-zero (non-enquiry) ACMD41. This speeds up UEFI boot, it should
> also account for guests that simply delay after card reset and then
> issue an ACMD41 that they expect will succeed.
>
> [1] https://github.com/tianocore/edk2/blob/master/EmbeddedPkg/Universal/MmcDxe/MmcIdentification.c#L279
> (This is the loop starting with "We need to wait for the MMC or SD
> card is ready")
>
> Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
> ---
> Obviously this is a bug that should be fixed in EDK2. However, this
> initialisation appears to have been around for quite a while in EDK2
> (in various forms), and the fact that it has obviously worked with so
> many real SD/MMC cards makes me think that it would be pragmatic to
> have the workaround in QEMU as well.

Have you filed it as an EDK2 bug, just out of interest?

> -#define ACMD41_ENQUIRY_MASK 0x00ffffff
> +#define ACMD41_ENQUIRY_MASK     0x00ffffff
> +#define OCR_POWER_UP            0x80000000
> +#define OCR_POWER_DELAY         (get_ticks_per_sec() / 2000) /* 0.5ms */

It's kind of odd to have something here scaled by get_ticks_per_sec(),
but then later add it to a pure nanoseconds value. (It works because
get_ticks_per_sec() always returns a value indicating 1 tick per ns.)
I think it would be cleaner to:
 * have this #define be a nanosecond value, with no call to
   get_ticks_per_sec()
   (we have a NANOSECONDS_PER_SECOND constant if you want it)
 * call timer_mod_ns() rather than timer_mod()

The ticks-per-sec stuff is legacy which we don't need for new code.
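
A minimal sketch of what the suggestion above amounts to, written as
standalone C so it can be compiled without QEMU. The names
TOY_NANOSECONDS_PER_SECOND, OCR_POWER_DELAY_NS and ocr_power_deadline_ns()
are assumptions made for illustration; in QEMU itself the constant would be
NANOSECONDS_PER_SECOND and the resulting value would be passed to
timer_mod_ns().

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for QEMU's NANOSECONDS_PER_SECOND constant. */
#define TOY_NANOSECONDS_PER_SECOND 1000000000LL

/* The delay as a plain nanosecond value, no get_ticks_per_sec() call. */
#define OCR_POWER_DELAY_NS (TOY_NANOSECONDS_PER_SECOND / 2000) /* 0.5ms */

/*
 * Hypothetical helper: absolute expiry time for the power-up timer, given
 * the current virtual clock reading in nanoseconds. Both operands are in
 * nanoseconds, so no unit conversion is hidden in the addition.
 */
static int64_t ocr_power_deadline_ns(int64_t now_ns)
{
    return now_ns + OCR_POWER_DELAY_NS;
}
```

Keeping the macro in nanoseconds makes the unit of the sum explicit, rather
than relying on one tick happening to equal one nanosecond.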

>  /* Legacy initialization function for use by non-qdevified callers */
> @@ -1320,12 +1371,31 @@ static sd_rsp_type_t sd_app_command(SDState *sd,
>          }
>          switch (sd->state) {
>          case sd_idle_state:
> +            /* If it's the first ACMD41 since reset, we need to decide
> +             * whether to power up. If this is not an enquiry ACMD41,
> +             * we immediately report power on and proceed below to the
> +             * ready state, but if it is, we set a timer to model a
> +             * delay for power up. This works around a bug in EDK2
> +             * UEFI, which sends an initial enquiry ACMD41, but
> +             * assumes that the card is in ready state as soon as it
> +             * sees the power up bit set. */
> +            if (!(sd->ocr & OCR_POWER_UP)) {
> +                if ((req.arg & ACMD41_ENQUIRY_MASK) != 0) {
> +                    timer_del(sd->ocr_power_timer);
> +                    sd_ocr_powerup(sd);
> +                } else if (!timer_pending(sd->ocr_power_timer)) {
> +                    timer_mod(sd->ocr_power_timer,
> +                              (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL)
> +                               + OCR_POWER_DELAY));
> +                }
> +            }
> +
>              /* We accept any voltage.  10000 V is nothing.
>               *
> -             * We don't model init delay so just advance straight to ready state
> +             * Once we're powered up, we advance straight to ready state
>               * unless it's an enquiry ACMD41 (bits 23:0 == 0).
>               */
> -            if (req.arg & ACMD41_ENQUIRY_MASK) {
> +            if ((sd->ocr & OCR_POWER_UP) && (req.arg & ACMD41_ENQUIRY_MASK)) {
>                  sd->state = sd_ready_state;
>              }

Isn't (sd->ocr & OCR_POWER_UP) redundant in this check? If
(req.arg & ACMD41_ENQUIRY_MASK) is true then either:
 (a) OCR_POWER_UP was set when we came in to the function
 (b) OCR_POWER_UP wasn't set, but we went through the code path that
     deletes the timer and calls sd_ocr_powerup(), which will set
     OCR_POWER_UP
So the enquiry-mask bits being nonzero here implies OCR_POWER_UP must
be set.

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM
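
The redundancy argument in the review above can be checked mechanically.
The sketch below is an illustration only: checks_agree() is a hypothetical
helper that mirrors just the OCR update from the new block (the timer path
never sets the bit within the same call, so it is omitted) and then compares
the patched condition with the simplified one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACMD41_ENQUIRY_MASK 0x00ffffffu
#define OCR_POWER_UP        0x80000000u

/*
 * Returns true if, for the given OCR on entry and ACMD41 argument, the
 * patched check and the simplified check reach the same decision.
 */
static bool checks_agree(uint32_t ocr, uint32_t arg)
{
    /* Mirror of the new block: a non-enquiry ACMD41 powers up at once. */
    if (!(ocr & OCR_POWER_UP) && (arg & ACMD41_ENQUIRY_MASK) != 0) {
        ocr |= OCR_POWER_UP;
    }

    bool patched = (ocr & OCR_POWER_UP) && (arg & ACMD41_ENQUIRY_MASK);
    bool simplified = (arg & ACMD41_ENQUIRY_MASK) != 0;

    return patched == simplified;
}
```

Calling checks_agree() for both OCR states (power up bit clear or set) with
both a zero and a non-zero argument covers the cases (a) and (b) listed in
the review.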
Andrew Baumann Feb. 7, 2016, 10:54 p.m. UTC | #2
Hi Peter,

> From: Peter Maydell [mailto:peter.maydell@linaro.org]
> Sent: Thursday, 4 February 2016 1:22 AM
>
> On 1 February 2016 at 22:15, Andrew Baumann
> <Andrew.Baumann@microsoft.com> wrote:
> > The SD spec for ACMD41 says that a zero argument is an "inquiry"
> > ACMD41, which does not start initialisation and is used only for
> > retrieving the OCR. However, Tianocore EDK2 (UEFI) has a bug [1]: it
> > first sends an inquiry (zero) ACMD41. If that first request returns an
> > OCR value with the power up bit (0x80000000) set, it assumes the card
> > is ready and continues, leaving the card in the wrong state. (My
> > assumption is that this works on hardware, because no real card is
> > immediately powered up upon reset.)
> >
> > This change models a delay of 0.5ms from the first ACMD41 to the power
> > being up. However, it also immediately sets the power on upon seeing a
> > non-zero (non-enquiry) ACMD41. This speeds up UEFI boot, it should
> > also account for guests that simply delay after card reset and then
> > issue an ACMD41 that they expect will succeed.
> >
> > [1] https://github.com/tianocore/edk2/blob/master/EmbeddedPkg/Universal/MmcDxe/MmcIdentification.c#L279
> > (This is the loop starting with "We need to wait for the MMC or SD
> > card is ready")
> >
> > Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
> > ---
> > Obviously this is a bug that should be fixed in EDK2. However, this
> > initialisation appears to have been around for quite a while in EDK2
> > (in various forms), and the fact that it has obviously worked with so
> > many real SD/MMC cards makes me think that it would be pragmatic to
> > have the workaround in QEMU as well.
>
> Have you filed it as an EDK2 bug, just out of interest?

No, I haven't. I didn't see an obvious path to do so; I'm also lazy :)

> > -#define ACMD41_ENQUIRY_MASK 0x00ffffff
> > +#define ACMD41_ENQUIRY_MASK     0x00ffffff
> > +#define OCR_POWER_UP            0x80000000
> > +#define OCR_POWER_DELAY         (get_ticks_per_sec() / 2000) /* 0.5ms */
>
> It's kind of odd to have something here scaled by get_ticks_per_sec(),
> but then later add it to a pure nanoseconds value. (It works because
> get_ticks_per_sec() always returns a value indicating 1 tick per ns.)
> I think it would be cleaner to:
>  * have this #define be a nanosecond value, with no call to
>    get_ticks_per_sec()
>    (we have a NANOSECONDS_PER_SECOND constant if you want it)
>  * call timer_mod_ns() rather than timer_mod()
>
> The ticks-per-sec stuff is legacy which we don't need for new code.

Makes sense. I was obviously copying something legacy. I will change this.

> >  /* Legacy initialization function for use by non-qdevified callers */
> > @@ -1320,12 +1371,31 @@ static sd_rsp_type_t sd_app_command(SDState *sd,
> >          }
> >          switch (sd->state) {
> >          case sd_idle_state:
> > +            /* If it's the first ACMD41 since reset, we need to decide
> > +             * whether to power up. If this is not an enquiry ACMD41,
> > +             * we immediately report power on and proceed below to the
> > +             * ready state, but if it is, we set a timer to model a
> > +             * delay for power up. This works around a bug in EDK2
> > +             * UEFI, which sends an initial enquiry ACMD41, but
> > +             * assumes that the card is in ready state as soon as it
> > +             * sees the power up bit set. */
> > +            if (!(sd->ocr & OCR_POWER_UP)) {
> > +                if ((req.arg & ACMD41_ENQUIRY_MASK) != 0) {
> > +                    timer_del(sd->ocr_power_timer);
> > +                    sd_ocr_powerup(sd);
> > +                } else if (!timer_pending(sd->ocr_power_timer)) {
> > +                    timer_mod(sd->ocr_power_timer,
> > +                              (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL)
> > +                               + OCR_POWER_DELAY));
> > +                }
> > +            }
> > +
> >              /* We accept any voltage.  10000 V is nothing.
> >               *
> > -             * We don't model init delay so just advance straight to ready state
> > +             * Once we're powered up, we advance straight to ready state
> >               * unless it's an enquiry ACMD41 (bits 23:0 == 0).
> >               */
> > -            if (req.arg & ACMD41_ENQUIRY_MASK) {
> > +            if ((sd->ocr & OCR_POWER_UP) && (req.arg & ACMD41_ENQUIRY_MASK)) {
> >                  sd->state = sd_ready_state;
> >              }
>
> Isn't (sd->ocr & OCR_POWER_UP) redundant in this check? If
> (req.arg & ACMD41_ENQUIRY_MASK) is true then either:
>  (a) OCR_POWER_UP was set when we came in to the function
>  (b) OCR_POWER_UP wasn't set, but we went through the code path that
>      deletes the timer and calls sd_ocr_powerup(), which will set
>      OCR_POWER_UP
> So the enquiry-mask bits being nonzero here implies OCR_POWER_UP must
> be set.

You're right. I think this was a hangover from the previous version. I'll clean it up.

> Otherwise
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

Thanks!
Andrew

Patch

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
old mode 100644
new mode 100755
index 8514ac7..473d4a0
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -36,6 +36,7 @@ 
 #include "qemu/bitmap.h"
 #include "hw/qdev-properties.h"
 #include "qemu/error-report.h"
+#include "qemu/timer.h"
 
 //#define DEBUG_SD 1
 
@@ -46,7 +47,9 @@  do { fprintf(stderr, "SD: " fmt , ## __VA_ARGS__); } while (0)
 #define DPRINTF(fmt, ...) do {} while(0)
 #endif
 
-#define ACMD41_ENQUIRY_MASK 0x00ffffff
+#define ACMD41_ENQUIRY_MASK     0x00ffffff
+#define OCR_POWER_UP            0x80000000
+#define OCR_POWER_DELAY         (get_ticks_per_sec() / 2000) /* 0.5ms */
 
 typedef enum {
     sd_r0 = 0,    /* no response */
@@ -85,6 +88,7 @@  struct SDState {
     uint32_t mode;    /* current card mode, one of SDCardModes */
     int32_t state;    /* current card state, one of SDCardStates */
     uint32_t ocr;
+    QEMUTimer *ocr_power_timer;
     uint8_t scr[8];
     uint8_t cid[16];
     uint8_t csd[16];
@@ -199,8 +203,17 @@  static uint16_t sd_crc16(void *message, size_t width)
 
 static void sd_set_ocr(SDState *sd)
 {
-    /* All voltages OK, card power-up OK, Standard Capacity SD Memory Card */
-    sd->ocr = 0x80ffff00;
+    /* All voltages OK, Standard Capacity SD Memory Card, not yet powered up */
+    sd->ocr = 0x00ffff00;
+}
+
+static void sd_ocr_powerup(void *opaque)
+{
+    SDState *sd = opaque;
+
+    /* Set powered up bit in OCR */
+    assert(!(sd->ocr & OCR_POWER_UP));
+    sd->ocr |= OCR_POWER_UP;
 }
 
 static void sd_set_scr(SDState *sd)
@@ -475,10 +488,44 @@  static const BlockDevOps sd_block_ops = {
     .change_media_cb = sd_cardchange,
 };
 
+static bool sd_ocr_vmstate_needed(void *opaque)
+{
+    SDState *sd = opaque;
+
+    /* Include the OCR state (and timer) if it is not yet powered up */
+    return !(sd->ocr & OCR_POWER_UP);
+}
+
+static const VMStateDescription sd_ocr_vmstate = {
+    .name = "sd-card/ocr-state",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = sd_ocr_vmstate_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(ocr, SDState),
+        VMSTATE_TIMER_PTR(ocr_power_timer, SDState),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static int sd_vmstate_pre_load(void *opaque)
+{
+    SDState *sd = opaque;
+
+    /* If the OCR state is not included (prior versions, or not
+     * needed), then the OCR must be set as powered up. If the OCR state
+     * is included, this will be replaced by the state restore.
+     */
+    sd_ocr_powerup(sd);
+
+    return 0;
+}
+
 static const VMStateDescription sd_vmstate = {
     .name = "sd-card",
     .version_id = 1,
     .minimum_version_id = 1,
+    .pre_load = sd_vmstate_pre_load,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(mode, SDState),
         VMSTATE_INT32(state, SDState),
@@ -505,7 +552,11 @@  static const VMStateDescription sd_vmstate = {
         VMSTATE_BUFFER_POINTER_UNSAFE(buf, SDState, 1, 512),
         VMSTATE_BOOL(enable, SDState),
         VMSTATE_END_OF_LIST()
-    }
+    },
+    .subsections = (const VMStateDescription*[]) {
+        &sd_ocr_vmstate,
+        NULL
+    },
 };
 
 /* Legacy initialization function for use by non-qdevified callers */
@@ -1320,12 +1371,31 @@  static sd_rsp_type_t sd_app_command(SDState *sd,
         }
         switch (sd->state) {
         case sd_idle_state:
+            /* If it's the first ACMD41 since reset, we need to decide
+             * whether to power up. If this is not an enquiry ACMD41,
+             * we immediately report power on and proceed below to the
+             * ready state, but if it is, we set a timer to model a
+             * delay for power up. This works around a bug in EDK2
+             * UEFI, which sends an initial enquiry ACMD41, but
+             * assumes that the card is in ready state as soon as it
+             * sees the power up bit set. */
+            if (!(sd->ocr & OCR_POWER_UP)) {
+                if ((req.arg & ACMD41_ENQUIRY_MASK) != 0) {
+                    timer_del(sd->ocr_power_timer);
+                    sd_ocr_powerup(sd);
+                } else if (!timer_pending(sd->ocr_power_timer)) {
+                    timer_mod(sd->ocr_power_timer,
+                              (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL)
+                               + OCR_POWER_DELAY));
+                }
+            }
+
             /* We accept any voltage.  10000 V is nothing.
              *
-             * We don't model init delay so just advance straight to ready state
+             * Once we're powered up, we advance straight to ready state
              * unless it's an enquiry ACMD41 (bits 23:0 == 0).
              */
-            if (req.arg & ACMD41_ENQUIRY_MASK) {
+            if ((sd->ocr & OCR_POWER_UP) && (req.arg & ACMD41_ENQUIRY_MASK)) {
                 sd->state = sd_ready_state;
             }
 
@@ -1833,6 +1903,7 @@  static void sd_instance_init(Object *obj)
     SDState *sd = SD_CARD(obj);
 
     sd->enable = true;
+    sd->ocr_power_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, sd_ocr_powerup, sd);
 }
 
 static void sd_realize(DeviceState *dev, Error **errp)