From patchwork Fri Feb 24 09:22:48 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Cornelia Huck
X-Patchwork-Id: 9589733
From: Cornelia Huck
To: peter.maydell@linaro.org
Date: Fri, 24 Feb 2017 10:22:48 +0100
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170224092259.16483-1-cornelia.huck@de.ibm.com>
References: <20170224092259.16483-1-cornelia.huck@de.ibm.com>
Message-Id: <20170224092259.16483-3-cornelia.huck@de.ibm.com>
Subject: [Qemu-devel] [PULL 02/13] s390x/kvm: detect some program check loops
Cc: Cornelia Huck, borntraeger@de.ibm.com, jfrei@linux.vnet.ibm.com,
 agraf@suse.de, qemu-devel@nongnu.org

From: Christian Borntraeger

Sometimes (e.g. during early boot) a guest is broken in such a way that
it loops, consuming 100% CPU, delivering operation exceptions (illegal
operation) while the pgm new PSW is not set properly. This results in
code being read from address zero, which usually contains another
illegal op. Let's detect this case and put the guest in crashed state.

Instead of only detecting this for address zero, apply a heuristic that
works for any program check new PSW, so that the guest also reaches the
crashed state if you provide some random ELF file to the -kernel option.

We do not want guest problem state to be able to trigger a guest panic,
e.g. by faulting on an address that is the same as the program check
new PSW, so we check for the problem state bit being off.

With this we
a: get rid of CPU consumption of such broken guests
b: keep the program old PSW. This makes it possible to find the
   original illegal operation, which makes debugging such early boot
   issues much easier than with single stepping.

This relies on the kernel using a similar heuristic and passing such
operation exceptions to user space.

Signed-off-by: Christian Borntraeger
Signed-off-by: Cornelia Huck
---
 target/s390x/kvm.c | 43 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 25367807f4..5ec050cf89 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -1867,6 +1867,40 @@ static void unmanageable_intercept(S390CPU *cpu, const char *str, int pswoffset)
     qemu_system_guest_panicked(NULL);
 }
 
+/* try to detect pgm check loops */
+static int handle_oper_loop(S390CPU *cpu, struct kvm_run *run)
+{
+    CPUState *cs = CPU(cpu);
+    PSW oldpsw, newpsw;
+
+    cpu_synchronize_state(cs);
+    newpsw.mask = ldq_phys(cs->as, cpu->env.psa +
+                           offsetof(LowCore, program_new_psw));
+    newpsw.addr = ldq_phys(cs->as, cpu->env.psa +
+                           offsetof(LowCore, program_new_psw) + 8);
+    oldpsw.mask = run->psw_mask;
+    oldpsw.addr = run->psw_addr;
+    /*
+     * Avoid endless loops of operation exceptions, if the pgm new
+     * PSW will cause a new operation exception.
+     * The heuristic checks if the pgm new psw is within 6 bytes before
+     * the faulting psw address (with same DAT, AS settings) and the
+     * new psw is not a wait psw and the fault was not triggered by
+     * problem state. In that case go into crashed state.
+     */
+
+    if (oldpsw.addr - newpsw.addr <= 6 &&
+        !(newpsw.mask & PSW_MASK_WAIT) &&
+        !(oldpsw.mask & PSW_MASK_PSTATE) &&
+        (newpsw.mask & PSW_MASK_ASC) == (oldpsw.mask & PSW_MASK_ASC) &&
+        (newpsw.mask & PSW_MASK_DAT) == (oldpsw.mask & PSW_MASK_DAT)) {
+        unmanageable_intercept(cpu, "operation exception loop",
+                               offsetof(LowCore, program_new_psw));
+        return EXCP_HALTED;
+    }
+    return 0;
+}
+
 static int handle_intercept(S390CPU *cpu)
 {
     CPUState *cs = CPU(cpu);
@@ -1914,11 +1948,14 @@ static int handle_intercept(S390CPU *cpu)
             r = EXCP_HALTED;
             break;
         case ICPT_OPEREXC:
-            /* currently only instr 0x0000 after enabled via capability */
+            /* check for break points */
             r = handle_sw_breakpoint(cpu, run);
             if (r == -ENOENT) {
-                enter_pgmcheck(cpu, PGM_OPERATION);
-                r = 0;
+                /* Then check for potential pgm check loops */
+                r = handle_oper_loop(cpu, run);
+                if (r == 0) {
+                    enter_pgmcheck(cpu, PGM_OPERATION);
+                }
             }
             break;
         case ICPT_SOFT_INTERCEPT:
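
For readers who want to try the heuristic outside of QEMU, the condition
in handle_oper_loop() can be sketched as a standalone predicate. This is
only an illustration, not part of the patch: struct psw,
looks_like_oper_loop() and demo_checks() are hypothetical names, and the
mask constants are assumed to follow the z/Architecture PSW layout
rather than being taken from QEMU's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed PSW mask bits (z/Architecture layout); not copied from QEMU. */
#define PSW_MASK_DAT    0x0400000000000000ULL
#define PSW_MASK_WAIT   0x0002000000000000ULL
#define PSW_MASK_PSTATE 0x0001000000000000ULL
#define PSW_MASK_ASC    0x0000C00000000000ULL

/* Hypothetical stand-in for the PSW type used in the patch. */
struct psw {
    uint64_t mask;
    uint64_t addr;
};

/*
 * Returns true if delivering the program check would most likely fault
 * again right away: the new PSW points at most 6 bytes before the old
 * (faulting) PSW address, is not a wait PSW, the fault did not come
 * from problem state, and DAT and address-space settings are unchanged.
 */
static bool looks_like_oper_loop(struct psw oldpsw, struct psw newpsw)
{
    /*
     * Unsigned subtraction: if the new PSW address lies after the old
     * one, the difference wraps to a huge value and the test fails.
     */
    return oldpsw.addr - newpsw.addr <= 6 &&
           !(newpsw.mask & PSW_MASK_WAIT) &&
           !(oldpsw.mask & PSW_MASK_PSTATE) &&
           (newpsw.mask & PSW_MASK_ASC) == (oldpsw.mask & PSW_MASK_ASC) &&
           (newpsw.mask & PSW_MASK_DAT) == (oldpsw.mask & PSW_MASK_DAT);
}

/* A few example inputs exercising each part of the condition. */
static void demo_checks(void)
{
    struct psw newpsw = { .mask = 0, .addr = 0 };
    struct psw oldpsw = { .mask = 0, .addr = 2 };

    /* 2-byte illegal op at address 0, new PSW also points at 0 */
    assert(looks_like_oper_loop(oldpsw, newpsw));

    /* fault raised from problem state: never treated as a loop */
    oldpsw.mask = PSW_MASK_PSTATE;
    assert(!looks_like_oper_loop(oldpsw, newpsw));

    /* new PSW is a wait PSW: the guest stops on its own */
    oldpsw.mask = 0;
    newpsw.mask = PSW_MASK_WAIT;
    assert(!looks_like_oper_loop(oldpsw, newpsw));

    /* fault far away from the new PSW address */
    newpsw.mask = 0;
    oldpsw.addr = 0x1000;
    assert(!looks_like_oper_loop(oldpsw, newpsw));
}
```

Calling demo_checks() from a test program runs the examples. The
problem state case is the one the commit message calls out: without
that check, unprivileged guest code could trigger a guest panic.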