Message ID | 20190314160428.3559-15-igor.j.konopko@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | lightnvm: next set of improvements for 5.2 | expand |
> On 14 Mar 2019, at 17.04, Igor Konopko <igor.j.konopko@intel.com> wrote: > > Currently when there is an IO error (or similar) on GC read path, pblk > still moves this line to free state, what leads to data mismatch issue. > > This patch adds a handling for such an error - after that changes this > line will be returned to closed state so it can be still in use > for reading - at least we will be able to return error status to user > when user tries to read such a data. > > Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> > --- > drivers/lightnvm/pblk-core.c | 8 ++++++++ > drivers/lightnvm/pblk-gc.c | 5 ++--- > drivers/lightnvm/pblk-read.c | 1 - > drivers/lightnvm/pblk.h | 2 ++ > 4 files changed, 12 insertions(+), 4 deletions(-) > > diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c > index 4d5cd99..6817f8f 100644 > --- a/drivers/lightnvm/pblk-core.c > +++ b/drivers/lightnvm/pblk-core.c > @@ -1786,6 +1786,14 @@ static void __pblk_line_put(struct pblk *pblk, struct pblk_line *line) > > spin_lock(&line->lock); > WARN_ON(line->state != PBLK_LINESTATE_GC); > + if (line->w_err_gc->has_gc_err) { > + spin_unlock(&line->lock); > + pblk_err(pblk, "line %d had errors during GC\n", line->id); > + pblk_put_line_back(pblk, line); > + line->w_err_gc->has_gc_err = 0; > + return; > + } > + > line->state = PBLK_LINESTATE_FREE; > trace_pblk_line_state(pblk_disk_name(pblk), line->id, > line->state); > diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c > index e23b192..63ee205 100644 > --- a/drivers/lightnvm/pblk-gc.c > +++ b/drivers/lightnvm/pblk-gc.c > @@ -59,7 +59,7 @@ static void pblk_gc_writer_kick(struct pblk_gc *gc) > wake_up_process(gc->gc_writer_ts); > } > > -static void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) > +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) > { > struct pblk_line_mgmt *l_mg = &pblk->l_mg; > struct list_head *move_list; > @@ -98,8 +98,7 @@ static void pblk_gc_line_ws(struct work_struct *work) > /* Read from GC victim block */ > ret = pblk_submit_read_gc(pblk, gc_rq); > if (ret) { > - pblk_err(pblk, "failed GC read in line:%d (err:%d)\n", > - line->id, ret); > + line->w_err_gc->has_gc_err = 1; > goto out; > } > > diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c > index 54422a2..6a77c24 100644 > --- a/drivers/lightnvm/pblk-read.c > +++ b/drivers/lightnvm/pblk-read.c > @@ -475,7 +475,6 @@ int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq) > > if (pblk_submit_io_sync(pblk, &rqd)) { > ret = -EIO; > - pblk_err(pblk, "GC read request failed\n"); > goto err_free_bio; > } > > diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h > index 5d1040a..52002f5 100644 > --- a/drivers/lightnvm/pblk.h > +++ b/drivers/lightnvm/pblk.h > @@ -427,6 +427,7 @@ struct pblk_smeta { > > struct pblk_w_err_gc { > int has_write_err; > + int has_gc_err; > __le64 *lba_list; > }; > > @@ -909,6 +910,7 @@ void pblk_gc_free_full_lines(struct pblk *pblk); > void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled, > int *gc_active); > int pblk_gc_sysfs_force(struct pblk *pblk, int force); > +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line); > > /* > * pblk rate limiter > -- > 2.9.5 Looks good to me. Reviewed-by: Javier González <javier@javigon.com>
On Thu, Mar 14, 2019 at 5:09 PM Igor Konopko <igor.j.konopko@intel.com> wrote: > > Currently when there is an IO error (or similar) on GC read path, pblk > still moves this line to free state, what leads to data mismatch issue. > > This patch adds a handling for such an error - after that changes this > line will be returned to closed state so it can be still in use > for reading - at least we will be able to return error status to user > when user tries to read such a data. > > Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> > --- > drivers/lightnvm/pblk-core.c | 8 ++++++++ > drivers/lightnvm/pblk-gc.c | 5 ++--- > drivers/lightnvm/pblk-read.c | 1 - > drivers/lightnvm/pblk.h | 2 ++ > 4 files changed, 12 insertions(+), 4 deletions(-) > > diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c > index 4d5cd99..6817f8f 100644 > --- a/drivers/lightnvm/pblk-core.c > +++ b/drivers/lightnvm/pblk-core.c > @@ -1786,6 +1786,14 @@ static void __pblk_line_put(struct pblk *pblk, struct pblk_line *line) > > spin_lock(&line->lock); > WARN_ON(line->state != PBLK_LINESTATE_GC); > + if (line->w_err_gc->has_gc_err) { > + spin_unlock(&line->lock); > + pblk_err(pblk, "line %d had errors during GC\n", line->id); > + pblk_put_line_back(pblk, line); > + line->w_err_gc->has_gc_err = 0; In a real bummer corner case, the line might have had a write error as well (line->w_err_gc->has_write_err == true) In that case we need to inform the rate limiter: pblk_rl_werr_line_out(&pblk->rl); > + return; > + } > + > line->state = PBLK_LINESTATE_FREE; > trace_pblk_line_state(pblk_disk_name(pblk), line->id, > line->state); > diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c > index e23b192..63ee205 100644 > --- a/drivers/lightnvm/pblk-gc.c > +++ b/drivers/lightnvm/pblk-gc.c > @@ -59,7 +59,7 @@ static void pblk_gc_writer_kick(struct pblk_gc *gc) > wake_up_process(gc->gc_writer_ts); > } > > -static void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) > +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) > { > struct pblk_line_mgmt *l_mg = &pblk->l_mg; > struct list_head *move_list; > @@ -98,8 +98,7 @@ static void pblk_gc_line_ws(struct work_struct *work) > /* Read from GC victim block */ > ret = pblk_submit_read_gc(pblk, gc_rq); > if (ret) { > - pblk_err(pblk, "failed GC read in line:%d (err:%d)\n", > - line->id, ret); > + line->w_err_gc->has_gc_err = 1; > goto out; > } > > diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c > index 54422a2..6a77c24 100644 > --- a/drivers/lightnvm/pblk-read.c > +++ b/drivers/lightnvm/pblk-read.c > @@ -475,7 +475,6 @@ int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq) > > if (pblk_submit_io_sync(pblk, &rqd)) { > ret = -EIO; > - pblk_err(pblk, "GC read request failed\n"); > goto err_free_bio; > } > > diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h > index 5d1040a..52002f5 100644 > --- a/drivers/lightnvm/pblk.h > +++ b/drivers/lightnvm/pblk.h > @@ -427,6 +427,7 @@ struct pblk_smeta { > > struct pblk_w_err_gc { > int has_write_err; > + int has_gc_err; > __le64 *lba_list; > }; > > @@ -909,6 +910,7 @@ void pblk_gc_free_full_lines(struct pblk *pblk); > void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled, > int *gc_active); > int pblk_gc_sysfs_force(struct pblk *pblk, int force); > +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line); > > /* > * pblk rate limiter > -- > 2.9.5 >
On 18.03.2019 13:14, Hans Holmberg wrote: > On Thu, Mar 14, 2019 at 5:09 PM Igor Konopko <igor.j.konopko@intel.com> wrote: >> >> Currently when there is an IO error (or similar) on GC read path, pblk >> still moves this line to free state, what leads to data mismatch issue. >> >> This patch adds a handling for such an error - after that changes this >> line will be returned to closed state so it can be still in use >> for reading - at least we will be able to return error status to user >> when user tries to read such a data. >> >> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> >> --- >> drivers/lightnvm/pblk-core.c | 8 ++++++++ >> drivers/lightnvm/pblk-gc.c | 5 ++--- >> drivers/lightnvm/pblk-read.c | 1 - >> drivers/lightnvm/pblk.h | 2 ++ >> 4 files changed, 12 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c >> index 4d5cd99..6817f8f 100644 >> --- a/drivers/lightnvm/pblk-core.c >> +++ b/drivers/lightnvm/pblk-core.c >> @@ -1786,6 +1786,14 @@ static void __pblk_line_put(struct pblk *pblk, struct pblk_line *line) >> >> spin_lock(&line->lock); >> WARN_ON(line->state != PBLK_LINESTATE_GC); >> + if (line->w_err_gc->has_gc_err) { >> + spin_unlock(&line->lock); >> + pblk_err(pblk, "line %d had errors during GC\n", line->id); >> + pblk_put_line_back(pblk, line); >> + line->w_err_gc->has_gc_err = 0; > > In a real bummer corner case, the line might have had a write error as well > (line->w_err_gc->has_write_err == true) > > In that case we need to inform the rate limiter: > pblk_rl_werr_line_out(&pblk->rl); Is that needed or rather preferred? I'm not clearing w_err_gc->has_write_err value in that case (and thus not informing the rate limiter), so the line will go back to exactly the same state as before GC (still with write error marked). > >> + return; >> + } >> + >> line->state = PBLK_LINESTATE_FREE; >> trace_pblk_line_state(pblk_disk_name(pblk), line->id, >> line->state); >> diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c >> index e23b192..63ee205 100644 >> --- a/drivers/lightnvm/pblk-gc.c >> +++ b/drivers/lightnvm/pblk-gc.c >> @@ -59,7 +59,7 @@ static void pblk_gc_writer_kick(struct pblk_gc *gc) >> wake_up_process(gc->gc_writer_ts); >> } >> >> -static void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) >> +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) >> { >> struct pblk_line_mgmt *l_mg = &pblk->l_mg; >> struct list_head *move_list; >> @@ -98,8 +98,7 @@ static void pblk_gc_line_ws(struct work_struct *work) >> /* Read from GC victim block */ >> ret = pblk_submit_read_gc(pblk, gc_rq); >> if (ret) { >> - pblk_err(pblk, "failed GC read in line:%d (err:%d)\n", >> - line->id, ret); >> + line->w_err_gc->has_gc_err = 1; >> goto out; >> } >> >> diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c >> index 54422a2..6a77c24 100644 >> --- a/drivers/lightnvm/pblk-read.c >> +++ b/drivers/lightnvm/pblk-read.c >> @@ -475,7 +475,6 @@ int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq) >> >> if (pblk_submit_io_sync(pblk, &rqd)) { >> ret = -EIO; >> - pblk_err(pblk, "GC read request failed\n"); >> goto err_free_bio; >> } >> >> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h >> index 5d1040a..52002f5 100644 >> --- a/drivers/lightnvm/pblk.h >> +++ b/drivers/lightnvm/pblk.h >> @@ -427,6 +427,7 @@ struct pblk_smeta { >> >> struct pblk_w_err_gc { >> int has_write_err; >> + int has_gc_err; >> __le64 *lba_list; >> }; >> >> @@ -909,6 +910,7 @@ void pblk_gc_free_full_lines(struct pblk *pblk); >> void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled, >> int *gc_active); >> int pblk_gc_sysfs_force(struct pblk *pblk, int force); >> +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line); >> >> /* >> * pblk rate limiter >> -- >> 2.9.5 >>
On Mon, Mar 18, 2019 at 2:22 PM Igor Konopko <igor.j.konopko@intel.com> wrote: > > > > On 18.03.2019 13:14, Hans Holmberg wrote: > > On Thu, Mar 14, 2019 at 5:09 PM Igor Konopko <igor.j.konopko@intel.com> wrote: > >> > >> Currently when there is an IO error (or similar) on GC read path, pblk > >> still moves this line to free state, what leads to data mismatch issue. > >> > >> This patch adds a handling for such an error - after that changes this > >> line will be returned to closed state so it can be still in use > >> for reading - at least we will be able to return error status to user > >> when user tries to read such a data. > >> > >> Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> > >> --- > >> drivers/lightnvm/pblk-core.c | 8 ++++++++ > >> drivers/lightnvm/pblk-gc.c | 5 ++--- > >> drivers/lightnvm/pblk-read.c | 1 - > >> drivers/lightnvm/pblk.h | 2 ++ > >> 4 files changed, 12 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c > >> index 4d5cd99..6817f8f 100644 > >> --- a/drivers/lightnvm/pblk-core.c > >> +++ b/drivers/lightnvm/pblk-core.c > >> @@ -1786,6 +1786,14 @@ static void __pblk_line_put(struct pblk *pblk, struct pblk_line *line) > >> > >> spin_lock(&line->lock); > >> WARN_ON(line->state != PBLK_LINESTATE_GC); > >> + if (line->w_err_gc->has_gc_err) { > >> + spin_unlock(&line->lock); > >> + pblk_err(pblk, "line %d had errors during GC\n", line->id); > >> + pblk_put_line_back(pblk, line); > >> + line->w_err_gc->has_gc_err = 0; > > > > In a real bummer corner case, the line might have had a write error as well > > (line->w_err_gc->has_write_err == true) > > > > In that case we need to inform the rate limiter: > > pblk_rl_werr_line_out(&pblk->rl); > > Is that needed or rather preferred? > > I'm not clearing w_err_gc->has_write_err value in that case (and thus > not informing the rate limiter), so the line will go back to exactly the > same state as before GC (still with write error marked). You're right, we will try to gc the line again. Thanks! Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com> > > > > >> + return; > >> + } > >> + > >> line->state = PBLK_LINESTATE_FREE; > >> trace_pblk_line_state(pblk_disk_name(pblk), line->id, > >> line->state); > >> diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c > >> index e23b192..63ee205 100644 > >> --- a/drivers/lightnvm/pblk-gc.c > >> +++ b/drivers/lightnvm/pblk-gc.c > >> @@ -59,7 +59,7 @@ static void pblk_gc_writer_kick(struct pblk_gc *gc) > >> wake_up_process(gc->gc_writer_ts); > >> } > >> > >> -static void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) > >> +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) > >> { > >> struct pblk_line_mgmt *l_mg = &pblk->l_mg; > >> struct list_head *move_list; > >> @@ -98,8 +98,7 @@ static void pblk_gc_line_ws(struct work_struct *work) > >> /* Read from GC victim block */ > >> ret = pblk_submit_read_gc(pblk, gc_rq); > >> if (ret) { > >> - pblk_err(pblk, "failed GC read in line:%d (err:%d)\n", > >> - line->id, ret); > >> + line->w_err_gc->has_gc_err = 1; > >> goto out; > >> } > >> > >> diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c > >> index 54422a2..6a77c24 100644 > >> --- a/drivers/lightnvm/pblk-read.c > >> +++ b/drivers/lightnvm/pblk-read.c > >> @@ -475,7 +475,6 @@ int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq) > >> > >> if (pblk_submit_io_sync(pblk, &rqd)) { > >> ret = -EIO; > >> - pblk_err(pblk, "GC read request failed\n"); > >> goto err_free_bio; > >> } > >> > >> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h > >> index 5d1040a..52002f5 100644 > >> --- a/drivers/lightnvm/pblk.h > >> +++ b/drivers/lightnvm/pblk.h > >> @@ -427,6 +427,7 @@ struct pblk_smeta { > >> > >> struct pblk_w_err_gc { > >> int has_write_err; > >> + int has_gc_err; > >> __le64 *lba_list; > >> }; > >> > >> @@ -909,6 +910,7 @@ void pblk_gc_free_full_lines(struct pblk *pblk); > >> void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled, > >> int *gc_active); > >> int pblk_gc_sysfs_force(struct pblk *pblk, int force); > >> +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line); > >> > >> /* > >> * pblk rate limiter > >> -- > >> 2.9.5 > >>
diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c index 4d5cd99..6817f8f 100644 --- a/drivers/lightnvm/pblk-core.c +++ b/drivers/lightnvm/pblk-core.c @@ -1786,6 +1786,14 @@ static void __pblk_line_put(struct pblk *pblk, struct pblk_line *line) spin_lock(&line->lock); WARN_ON(line->state != PBLK_LINESTATE_GC); + if (line->w_err_gc->has_gc_err) { + spin_unlock(&line->lock); + pblk_err(pblk, "line %d had errors during GC\n", line->id); + pblk_put_line_back(pblk, line); + line->w_err_gc->has_gc_err = 0; + return; + } + line->state = PBLK_LINESTATE_FREE; trace_pblk_line_state(pblk_disk_name(pblk), line->id, line->state); diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c index e23b192..63ee205 100644 --- a/drivers/lightnvm/pblk-gc.c +++ b/drivers/lightnvm/pblk-gc.c @@ -59,7 +59,7 @@ static void pblk_gc_writer_kick(struct pblk_gc *gc) wake_up_process(gc->gc_writer_ts); } -static void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line) { struct pblk_line_mgmt *l_mg = &pblk->l_mg; struct list_head *move_list; @@ -98,8 +98,7 @@ static void pblk_gc_line_ws(struct work_struct *work) /* Read from GC victim block */ ret = pblk_submit_read_gc(pblk, gc_rq); if (ret) { - pblk_err(pblk, "failed GC read in line:%d (err:%d)\n", - line->id, ret); + line->w_err_gc->has_gc_err = 1; goto out; } diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c index 54422a2..6a77c24 100644 --- a/drivers/lightnvm/pblk-read.c +++ b/drivers/lightnvm/pblk-read.c @@ -475,7 +475,6 @@ int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq) if (pblk_submit_io_sync(pblk, &rqd)) { ret = -EIO; - pblk_err(pblk, "GC read request failed\n"); goto err_free_bio; } diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h index 5d1040a..52002f5 100644 --- a/drivers/lightnvm/pblk.h +++ b/drivers/lightnvm/pblk.h @@ -427,6 +427,7 @@ struct pblk_smeta { struct pblk_w_err_gc { int has_write_err; + int has_gc_err; __le64 *lba_list; }; @@ -909,6 +910,7 @@ void pblk_gc_free_full_lines(struct pblk *pblk); void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled, int *gc_active); int pblk_gc_sysfs_force(struct pblk *pblk, int force); +void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line); /* * pblk rate limiter
Currently when there is an IO error (or similar) on GC read path, pblk still moves this line to free state, what leads to data mismatch issue. This patch adds a handling for such an error - after that changes this line will be returned to closed state so it can be still in use for reading - at least we will be able to return error status to user when user tries to read such a data. Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> --- drivers/lightnvm/pblk-core.c | 8 ++++++++ drivers/lightnvm/pblk-gc.c | 5 ++--- drivers/lightnvm/pblk-read.c | 1 - drivers/lightnvm/pblk.h | 2 ++ 4 files changed, 12 insertions(+), 4 deletions(-)