commit 1230ae0e99e05ced8a945a1a2c5762ce5c6c97c9
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Thu Oct 1 11:36:53 2015 +0200

    Linux 3.14.54

commit 184e1198462223a7347a923d02370bf79207638b
Author: Keith Busch <keith.busch@intel.com>
Date:   Mon Mar 3 11:09:47 2014 -0700

    NVMe: Initialize device reference count earlier
    
    commit fb35e914b3f88cda9ee6f9d776910c35269c4ecf upstream.
    
    If an NVMe device becomes ready but fails to create IO queues, the driver
    creates a character device handle so the device can be managed. The
    device reference count needs to be initialized before creating the
    character device.
    
    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 34820fc89c5e635b7381e4060931ca30a63d110a
Author: Jan Kara <jack@suse.cz>
Date:   Wed Jan 7 13:49:08 2015 +0100

    udf: Check length of extended attributes and allocation descriptors
    
    commit 23b133bdc452aa441fcb9b82cbf6dd05cfd342d0 upstream.
    
    Check length of extended attributes and allocation descriptors when
    loading inodes from disk. Otherwise corrupted filesystems could confuse
    the code and make the kernel oops.
    
    This fixes CVE-2015-4167.
    
    Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
    Signed-off-by: Jan Kara <jack@suse.cz>
    [Use make_bad_inode() instead of branching due to older implementation.]
    Signed-off-by: Chas Williams <3chas3@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5f521316a9d5c70842744b8f3f872ab1a932711b
Author: Andy Lutomirski <luto@kernel.org>
Date:   Wed Jul 15 10:29:38 2015 -0700

    x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detection
    
    commit 810bc075f78ff2c221536eb3008eac6a492dba2d upstream.
    
    We have a tricky bug in the nested NMI code: if we see RSP
    pointing to the NMI stack on NMI entry from kernel mode, we
    assume that we are executing a nested NMI.
    
    This isn't quite true.  A malicious userspace program can point
    RSP at the NMI stack, issue SYSCALL, and arrange for an NMI to
    happen while RSP is still pointing at the NMI stack.
    
    Fix it with a sneaky trick.  Set DF in the region of code that
    the RSP check is intended to detect.  IRET will clear DF
    atomically.
    
    ( Note: other than paravirt, there's little need for all this
      complexity. We could check RIP instead of RSP. )
    
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9fea301b1f49f1fd741a9e78d2a7c8906f01b4f8
Author: Andy Lutomirski <luto@kernel.org>
Date:   Wed Jul 15 10:29:37 2015 -0700

    x86/nmi/64: Reorder nested NMI checks
    
    commit a27507ca2d796cfa8d907de31ad730359c8a6d06 upstream.
    
    Check the repeat_nmi .. end_repeat_nmi special case first.  The
    next patch will rework the RSP check and, as a side effect, the
    RSP check will no longer detect repeat_nmi .. end_repeat_nmi, so
    we'll need this ordering of the checks.
    
    Note: this is more subtle than it appears.  The check for
    repeat_nmi .. end_repeat_nmi jumps straight out of the NMI code
    instead of adjusting the "iret" frame to force a repeat.  This
    is necessary, because the code between repeat_nmi and
    end_repeat_nmi sets "NMI executing" and then writes to the
    "iret" frame itself.  If a nested NMI comes in and modifies the
    "iret" frame while repeat_nmi is also modifying it, we'll end up
    with garbage.  The old code got this right, as does the new
    code, but the new code is a bit more explicit.
    
    If we were to move the check right after the "NMI executing"
    check, then we'd get it wrong and have random crashes.
    
    ( Because the "NMI executing" check would jump to the code that would
      modify the "iret" frame without checking if the interrupted NMI was
      currently modifying it. )
    
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8a1a2c4d4a170f35acd71c419049c3da210418e3
Author: Andy Lutomirski <luto@kernel.org>
Date:   Wed Jul 15 10:29:36 2015 -0700

    x86/nmi/64: Improve nested NMI comments
    
    commit 0b22930ebad563ae97ff3f8d7b9f12060b4c6e6b upstream.
    
    I found the nested NMI documentation to be difficult to follow.
    Improve the comments.
    
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 66af900f21c6b0b1b59ac156245ca8d2b5d7b696
Author: Andy Lutomirski <luto@kernel.org>
Date:   Wed Jul 15 10:29:35 2015 -0700

    x86/nmi/64: Switch stacks on userspace NMI entry
    
    commit 9b6e6a8334d56354853f9c255d1395c2ba570e0a upstream.
    
    Returning to userspace is tricky: IRET can fail, and ESPFIX can
    rearrange the stack prior to IRET.
    
    The NMI nesting fixup relies on a precise stack layout and
    atomic IRET.  Rather than trying to teach the NMI nesting fixup
    to handle ESPFIX and failed IRET, punt: run NMIs that came from
    user mode on the normal kernel stack.
    
    This will make some nested NMIs visible to C code, but the C
    code is okay with that.
    
    As a side effect, this should speed up perf: it eliminates an
    RDMSR when NMIs come from user mode.
    
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
    Reviewed-by: Borislav Petkov <bp@suse.de>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 264477bc18b1b34a38b5211a75f2af3a0346f814
Author: Andy Lutomirski <luto@kernel.org>
Date:   Wed Jul 15 10:29:34 2015 -0700

    x86/nmi/64: Remove asm code that saves CR2
    
    commit 0e181bb58143cb4a2e8f01c281b0816cd0e4798e upstream.
    
    Now that do_nmi saves CR2, we don't need to save it in asm.
    
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
    Acked-by: Borislav Petkov <bp@suse.de>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a4896b65ac7a912ff9d4cd9d6956eee83a577bf1
Author: Andy Lutomirski <luto@kernel.org>
Date:   Wed Jul 15 10:29:33 2015 -0700

    x86/nmi: Enable nested do_nmi() handling for 64-bit kernels
    
    commit 9d05041679904b12c12421cbcf9cb5f4860a8d7b upstream.
    
    32-bit kernels handle nested NMIs in C.  Enable the exact same
    handling on 64-bit kernels as well.  This isn't currently
    necessary, but it will become necessary once the asm code starts
    allowing limited nesting.
    
    Signed-off-by: Andy Lutomirski <luto@kernel.org>
    Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 934d9b907aeec5f344ca801ed7361551199dfc69
Author: Markus Pargmann <mpa@pengutronix.de>
Date:   Wed Jul 29 15:46:03 2015 +0200

    Revert "iio: bmg160: IIO_BUFFER and IIO_TRIGGERED_BUFFER are required"
    
    This reverts commit 279c039ca63acbd69e69d6d7ddfed50346fb2185 which was
    commit 06d2f6ca5a38abe92f1f3a132b331eee773868c3 upstream as it should
    not have been applied.
    
    
    Reported-by: Luis Henriques <luis.henriques@canonical.com>
    Cc: Markus Pargmann <mpa@pengutronix.de>
    Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Cc: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bbd7b7ff10f4d49ffdc2f9fa8310c8d380039739
Author: Florian Westphal <fw@strlen.de>
Date:   Wed Aug 26 22:17:39 2015 -0700

    net: gso: use feature flag argument in all protocol gso handlers
    
    [ Upstream commit 1e16aa3ddf863c6b9f37eddf52503230a62dedb3 ]
    
    skb_gso_segment() has a 'features' argument representing offload features
    available to the output path.
    
    A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use
    those instead of the provided ones when handing encapsulation/tunnels.
    
    Depending on dev->hw_enc_features of the output device skb_gso_segment() can
    then return NULL even when the caller has disabled all GSO feature bits,
    as segmentation of inner header thinks device will take care of segmentation.
    
    This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs
    that did not fit the remaining token quota as the segmentation does not work
    when device supports corresponding hw offload capabilities.
    
    Cc: Pravin B Shelar <pshelar@nicira.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    [jay.vosburgh: backported to 3.14. ]
    Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f7600bc155c49ec97830743b4b91288858d01d2d
Author: Ivan Vecera <ivecera@redhat.com>
Date:   Thu Aug 6 22:48:23 2015 +0200

    bna: fix interrupts storm caused by erroneous packets
    
    [ Upstream commit ade4dc3e616e33c80d7e62855fe1b6f9895bc7c3 ]
    
    The commit "e29aa33 bna: Enable Multi Buffer RX" moved packets counter
    increment from the beginning of the NAPI processing loop after the check
    for erroneous packets so they are never accounted. This counter is used
    to inform firmware about number of processed completions (packets).
    As these packets are never acked the firmware fires IRQs for them again
    and again.
    
    Fixes: e29aa33 ("bna: Enable Multi Buffer RX")
    Signed-off-by: Ivan Vecera <ivecera@redhat.com>
    Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e602c012fd2d30cba979bc5b51bc3e5b52cc9065
Author: Eric Dumazet <edumazet@google.com>
Date:   Sat Aug 1 12:14:33 2015 +0200

    udp: fix dst races with multicast early demux
    
    [ Upstream commit 10e2eb878f3ca07ac2f05fa5ca5e6c4c9174a27a ]
    
    Multicast dst are not cached. They carry DST_NOCACHE.
    
    As mentioned in commit f8864972126899 ("ipv4: fix dst race in
    sk_dst_get()"), these dst need special care before caching them
    into a socket.
    
    Caching them is allowed only if their refcnt was not 0, ie we
    must use atomic_inc_not_zero()
    
    Also, we must use READ_ONCE() to fetch sk->sk_rx_dst, as mentioned
    in commit d0c294c53a771 ("tcp: prevent fetching dst twice in early demux
    code")
    
    Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux")
    Tested-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
    Reported-by: Alex Gartrell <agartrell@fb.com>
    Cc: Michal Kubeček <mkubecek@suse.cz>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ef9fff760b034f17c113944efb6b0b3b29321a45
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Sat Aug 1 15:33:26 2015 +0300

    rds: fix an integer overflow test in rds_info_getsockopt()
    
    [ Upstream commit 468b732b6f76b138c0926eadf38ac88467dcd271 ]
    
    "len" is a signed integer.  We check that len is not negative, so it
    goes from zero to INT_MAX.  PAGE_SIZE is unsigned long so the comparison
    is type promoted to unsigned long.  ULONG_MAX - 4095 is a higher than
    INT_MAX so the condition can never be true.
    
    I don't know if this is harmful but it seems safe to limit "len" to
    INT_MAX - 4095.
    
    Fixes: a8c879a7ee98 ('RDS: Info and stats')
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f0b9eeef1f0e71cce65b6dbd31f7df0b3e9e59b9
Author: Lars Westerhoff <lars.westerhoff@newtec.eu>
Date:   Tue Jul 28 01:32:21 2015 +0300

    packet: missing dev_put() in packet_do_bind()
    
    [ Upstream commit 158cd4af8dedbda0d612d448c724c715d0dda649 ]
    
    When binding a PF_PACKET socket, the use count of the bound interface is
    always increased with dev_hold in dev_get_by_{index,name}.  However,
    when rebound with the same protocol and device as in the previous bind
    the use count of the interface was not decreased.  Ultimately, this
    caused the deletion of the interface to fail with the following message:
    
    unregister_netdevice: waiting for dummy0 to become free. Usage count = 1
    
    This patch moves the dev_put out of the conditional part that was only
    executed when either the protocol or device changed on a bind.
    
    Fixes: 902fefb82ef7 ('packet: improve socket create/bind latency in some cases')
    Signed-off-by: Lars Westerhoff <lars.westerhoff@newtec.eu>
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Reviewed-by: Daniel Borkmann <dborkman@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 427f994587dde5a0064e85614a631c075481c4ba
Author: Wilson Kok <wkok@cumulusnetworks.com>
Date:   Tue Sep 22 21:40:22 2015 -0700

    fib_rules: fix fib rule dumps across multiple skbs
    
    [ Upstream commit 41fc014332d91ee90c32840bf161f9685b7fbf2b ]
    
    dump_rules returns skb length and not error.
    But when family == AF_UNSPEC, the caller of dump_rules
    assumes that it returns an error. Hence, when family == AF_UNSPEC,
    we continue trying to dump on -EMSGSIZE errors resulting in
    incorrect dump idx carried between skbs belonging to the same dump.
    This results in fib rule dump always only dumping rules that fit
    into the first skb.
    
    This patch fixes dump_rules to return error so that we exit correctly
    and idx is correctly maintained between skbs that are part of the
    same dump.
    
    Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
    Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6f6e8414631dd4bde7e0e29e9cab41d7e7c92970
Author: Jesse Gross <jesse@nicira.com>
Date:   Mon Sep 21 20:21:20 2015 -0700

    openvswitch: Zero flows on allocation.
    
    [ Upstream commit ae5f2fb1d51fa128a460bcfbe3c56d7ab8bf6a43 ]
    
    When support for megaflows was introduced, OVS needed to start
    installing flows with a mask applied to them. Since masking is an
    expensive operation, OVS also had an optimization that would only
    take the parts of the flow keys that were covered by a non-zero
    mask. The values stored in the remaining pieces should not matter
    because they are masked out.
    
    While this works fine for the purposes of matching (which must always
    look at the mask), serialization to netlink can be problematic. Since
    the flow and the mask are serialized separately, the uninitialized
    portions of the flow can be encoded with whatever values happen to be
    present.
    
    In terms of functionality, this has little effect since these fields
    will be masked out by definition. However, it leaks kernel memory to
    userspace, which is a potential security vulnerability. It is also
    possible that other code paths could look at the masked key and get
    uninitialized data, although this does not currently appear to be an
    issue in practice.
    
    This removes the mask optimization for flows that are being installed.
    This was always intended to be the case as the mask optimizations were
    really targetting per-packet flow operations.
    
    Fixes: 03f0d916 ("openvswitch: Mega flow implementation")
    Signed-off-by: Jesse Gross <jesse@nicira.com>
    Acked-by: Pravin B Shelar <pshelar@nicira.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 62f575aaba7ae93a4e02029d30f9dcf69b84470f
Author: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Date:   Thu Sep 10 17:31:15 2015 -0300

    sctp: fix race on protocol/netns initialization
    
    [ Upstream commit 8e2d61e0aed2b7c4ecb35844fe07e0b2b762dee4 ]
    
    Consider sctp module is unloaded and is being requested because an user
    is creating a sctp socket.
    
    During initialization, sctp will add the new protocol type and then
    initialize pernet subsys:
    
            status = sctp_v4_protosw_init();
            if (status)
                    goto err_protosw_init;
    
            status = sctp_v6_protosw_init();
            if (status)
                    goto err_v6_protosw_init;
    
            status = register_pernet_subsys(&sctp_net_ops);
    
    The problem is that after those calls to sctp_v{4,6}_protosw_init(), it
    is possible for userspace to create SCTP sockets like if the module is
    already fully loaded. If that happens, one of the possible effects is
    that we will have readers for net->sctp.local_addr_list list earlier
    than expected and sctp_net_init() does not take precautions while
    dealing with that list, leading to a potential panic but not limited to
    that, as sctp_sock_init() will copy a bunch of blank/partially
    initialized values from net->sctp.
    
    The race happens like this:
    
         CPU 0                           |  CPU 1
      socket()                           |
       __sock_create                     | socket()
        inet_create                      |  __sock_create
         list_for_each_entry_rcu(        |
            answer, &inetsw[sock->type], |
            list) {                      |   inet_create
          /* no hits */                  |
         if (unlikely(err)) {            |
          ...                            |
          request_module()               |
          /* socket creation is blocked  |
           * the module is fully loaded  |
           */                            |
           sctp_init                     |
            sctp_v4_protosw_init         |
             inet_register_protosw       |
              list_add_rcu(&p->list,     |
                           last_perm);   |
                                         |  list_for_each_entry_rcu(
                                         |     answer, &inetsw[sock->type],
            sctp_v6_protosw_init         |     list) {
                                         |     /* hit, so assumes protocol
                                         |      * is already loaded
                                         |      */
                                         |  /* socket creation continues
                                         |   * before netns is initialized
                                         |   */
            register_pernet_subsys       |
    
    Simply inverting the initialization order between
    register_pernet_subsys() and sctp_v4_protosw_init() is not possible
    because register_pernet_subsys() will create a control sctp socket, so
    the protocol must be already visible by then. Deferring the socket
    creation to a work-queue is not good specially because we loose the
    ability to handle its errors.
    
    So, as suggested by Vlad, the fix is to split netns initialization in
    two moments: defaults and control socket, so that the defaults are
    already loaded by when we register the protocol, while control socket
    initialization is kept at the same moment it is today.
    
    Fixes: 4db67e808640 ("sctp: Make the address lists per network namespace")
    Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
    Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 85aec6328f3346b0718211faad564a3ffa64f60e
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Thu Sep 10 20:05:46 2015 +0200

    netlink, mmap: transform mmap skb into full skb on taps
    
    [ Upstream commit 1853c949646005b5959c483becde86608f548f24 ]
    
    Ken-ichirou reported that running netlink in mmap mode for receive in
    combination with nlmon will throw a NULL pointer dereference in
    __kfree_skb() on nlmon_xmit(), in my case I can also trigger an "unable
    to handle kernel paging request". The problem is the skb_clone() in
    __netlink_deliver_tap_skb() for skbs that are mmaped.
    
    I.e. the cloned skb doesn't have a destructor, whereas the mmap netlink
    skb has it pointed to netlink_skb_destructor(), set in the handler
    netlink_ring_setup_skb(). There, skb->head is being set to NULL, so
    that in such cases, __kfree_skb() doesn't perform a skb_release_data()
    via skb_release_all(), where skb->head is possibly being freed through
    kfree(head) into slab allocator, although netlink mmap skb->head points
    to the mmap buffer. Similarly, the same has to be done also for large
    netlink skbs where the data area is vmalloced. Therefore, as discussed,
    make a copy for these rather rare cases for now. This fixes the issue
    on my and Ken-ichirou's test-cases.
    
    Reference: http://thread.gmane.org/gmane.linux.network/371129
    Fixes: bcbde0d449ed ("net: netlink: virtual tap device management")
    Reported-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c9cc1b7e3f713f4b76fad3f3ade88528b1dfb56b
Author: Richard Laing <richard.laing@alliedtelesis.co.nz>
Date:   Thu Sep 3 13:52:31 2015 +1200

    net/ipv6: Correct PIM6 mrt_lock handling
    
    [ Upstream commit 25b4a44c19c83d98e8c0807a7ede07c1f28eab8b ]
    
    In the IPv6 multicast routing code the mrt_lock was not being released
    correctly in the MFC iterator, as a result adding or deleting a MIF would
    cause a hang because the mrt_lock could not be acquired.
    
    This fix is a copy of the code for the IPv4 case and ensures that the lock
    is released correctly.
    
    Signed-off-by: Richard Laing <richard.laing@alliedtelesis.co.nz>
    Acked-by: Cong Wang <cwang@twopensource.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c3ab95905bfc2f384d5e24cc622a4b72108e4d5d
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Thu Sep 3 00:29:07 2015 +0200

    ipv6: fix exthdrs offload registration in out_rt path
    
    [ Upstream commit e41b0bedba0293b9e1e8d1e8ed553104b9693656 ]
    
    We previously register IPPROTO_ROUTING offload under inet6_add_offload(),
    but in error path, we try to unregister it with inet_del_offload(). This
    doesn't seem correct, it should actually be inet6_del_offload(), also
    ipv6_exthdrs_offload_exit() from that commit seems rather incorrect (it
    also uses rthdr_offload twice), but it got removed entirely later on.
    
    Fixes: 3336288a9fea ("ipv6: Switch to using new offload infrastructure.")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1db7ccf33f3c444a01a4d950fc6c23847524481e
Author: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Date:   Mon Aug 24 23:13:42 2015 +0300

    usbnet: Get EVENT_NO_RUNTIME_PM bit before it is cleared
    
    [ Upstream commit f50791ac1aca1ac1b0370d62397b43e9f831421a ]
    
    It is needed to check EVENT_NO_RUNTIME_PM bit of dev->flags in
    usbnet_stop(), but its value should be read before it is cleared
    when dev->flags is set to 0.
    
    The problem was spotted and the fix was provided by
    Oliver Neukum <oneukum@suse.de>.
    
    Signed-off-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
    Acked-by: Oliver Neukum <oneukum@suse.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a4240b5e7ac9f3718bd836471a27a212489585ac
Author: huaibin Wang <huaibin.wang@6wind.com>
Date:   Tue Aug 25 16:20:34 2015 +0200

    ip6_gre: release cached dst on tunnel removal
    
    [ Upstream commit d4257295ba1b389c693b79de857a96e4b7cd8ac0 ]
    
    When a tunnel is deleted, the cached dst entry should be released.
    
    This problem may prevent the removal of a netns (seen with a x-netns IPv6
    gre tunnel):
      unregister_netdevice: waiting for lo to become free. Usage count = 3
    
    CC: Dmitry Kozlov <xeb@mail.ru>
    Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
    Signed-off-by: huaibin Wang <huaibin.wang@6wind.com>
    Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f8952cdcb1bfd3c612ecb8a2e21ea9324ffd93db
Author: Jack Morgenstein <jackm@dev.mellanox.co.il>
Date:   Wed Jul 22 16:53:47 2015 +0300

    net/mlx4_core: Fix wrong index in propagating port change event to VFs
    
    [ Upstream commit 1c1bf34951e8d17941bf708d1901c47e81b15d55 ]
    
    The port-change event processing in procedure mlx4_eq_int() uses "slave"
    as the vf_oper array index. Since the value of "slave" is the PF function
    index, the result is that the PF link state is used for deciding to
    propagate the event for all the VFs. The VF link state should be used,
    so the VF function index should be used here.
    
    Fixes: 948e306d7d64 ('net/mlx4: Add VF link state support')
    Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
    Signed-off-by: Matan Barak <matanb@mellanox.com>
    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit afe158591ba9dc2b48aee305150c8a5b92b485e9
Author: Florian Westphal <fw@strlen.de>
Date:   Tue Jul 21 16:33:50 2015 +0200

    netlink: don't hold mutex in rcu callback when releasing mmapd ring
    
    [ Upstream commit 0470eb99b4721586ccac954faac3fa4472da0845 ]
    
    Kirill A. Shutemov says:
    
    This simple test-case trigers few locking asserts in kernel:
    
    int main(int argc, char **argv)
    {
            unsigned int block_size = 16 * 4096;
            struct nl_mmap_req req = {
                    .nm_block_size          = block_size,
                    .nm_block_nr            = 64,
                    .nm_frame_size          = 16384,
                    .nm_frame_nr            = 64 * block_size / 16384,
            };
            unsigned int ring_size;
    	int fd;
    
    	fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
            if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
                    exit(1);
            if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
                    exit(1);
    
    	ring_size = req.nm_block_nr * req.nm_block_size;
    	mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    	return 0;
    }
    
    +++ exited with 0 +++
    BUG: sleeping function called from invalid context at /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616
    in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
    3 locks held by init/1:
     #0:  (reboot_mutex){+.+...}, at: [<ffffffff81080959>] SyS_reboot+0xa9/0x220
     #1:  ((reboot_notifier_list).rwsem){.+.+..}, at: [<ffffffff8107f379>] __blocking_notifier_call_chain+0x39/0x70
     #2:  (rcu_callback){......}, at: [<ffffffff810d32e0>] rcu_do_batch.isra.49+0x160/0x10c0
    Preemption disabled at:[<ffffffff8145365f>] __delay+0xf/0x20
    
    CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-00009-gbddf4c4818e0 #253
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014
     ffff88017b3d8000 ffff88027bc03c38 ffffffff81929ceb 0000000000000102
     0000000000000000 ffff88027bc03c68 ffffffff81085a9d 0000000000000002
     ffffffff81ca2a20 0000000000000268 0000000000000000 ffff88027bc03c98
    Call Trace:
     <IRQ>  [<ffffffff81929ceb>] dump_stack+0x4f/0x7b
     [<ffffffff81085a9d>] ___might_sleep+0x16d/0x270
     [<ffffffff81085bed>] __might_sleep+0x4d/0x90
     [<ffffffff8192e96f>] mutex_lock_nested+0x2f/0x430
     [<ffffffff81932fed>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
     [<ffffffff81464143>] ? __this_cpu_preempt_check+0x13/0x20
     [<ffffffff8182fc3d>] netlink_set_ring+0x1ed/0x350
     [<ffffffff8182e000>] ? netlink_undo_bind+0x70/0x70
     [<ffffffff8182fe20>] netlink_sock_destruct+0x80/0x150
     [<ffffffff817e484d>] __sk_free+0x1d/0x160
     [<ffffffff817e49a9>] sk_free+0x19/0x20
    [..]
    
    Cong Wang says:
    
    We can't hold mutex lock in a rcu callback, [..]
    
    Thomas Graf says:
    
    The socket should be dead at this point. It might be simpler to
    add a netlink_release_ring() function which doesn't require
    locking at all.
    
    Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
    Diagnosed-by: Cong Wang <cwang@twopensource.com>
    Suggested-by: Thomas Graf <tgraf@suug.ch>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ba63f0d8c7a719eaf1efde68e78092e20217ce22
Author: Edward Hyunkoo Jee <edjee@google.com>
Date:   Tue Jul 21 09:43:59 2015 +0200

    inet: frags: fix defragmented packet's IP header for af_packet
    
    [ Upstream commit 0848f6428ba3a2e42db124d41ac6f548655735bf ]
    
    When ip_frag_queue() computes positions, it assumes that the passed
    sk_buff does not contain L2 headers.
    
    However, when PACKET_FANOUT_FLAG_DEFRAG is used, IP reassembly
    functions can be called on outgoing packets that contain L2 headers.
    
    Also, IPv4 checksum is not corrected after reassembly.
    
    Fixes: 7736d33f4262 ("packet: Add pre-defragmentation support for ipv4 fanouts.")
    Signed-off-by: Edward Hyunkoo Jee <edjee@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Willem de Bruijn <willemb@google.com>
    Cc: Jerry Chu <hkchu@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2850eec9fc91122f88a4f78af2f0afe0f3db43b3
Author: dingtianhong <dingtianhong@huawei.com>
Date:   Thu Jul 16 16:30:02 2015 +0800

    bonding: correct the MAC address for "follow" fail_over_mac policy
    
    [ Upstream commit a951bc1e6ba58f11df5ed5ddc41311e10f5fd20b ]
    
    The "follow" fail_over_mac policy is useful for multiport devices that
    either become confused or incur a performance penalty when multiple
    ports are programmed with the same MAC address, but the same MAC
    address still may happened by this steps for this policy:
    
    1) echo +eth0 > /sys/class/net/bond0/bonding/slaves
       bond0 has the same mac address with eth0, it is MAC1.
    
    2) echo +eth1 > /sys/class/net/bond0/bonding/slaves
       eth1 is backup, eth1 has MAC2.
    
    3) ifconfig eth0 down
       eth1 became active slave, bond will swap MAC for eth0 and eth1,
       so eth1 has MAC1, and eth0 has MAC2.
    
    4) ifconfig eth1 down
       there is no active slave, and eth1 still has MAC1, eth2 has MAC2.
    
    5) ifconfig eth0 up
       the eth0 became active slave again, the bond set eth0 to MAC1.
    
    Something wrong here, then if you set eth1 up, the eth0 and eth1 will have the same
    MAC address, it will break this policy for ACTIVE_BACKUP mode.
    
    This patch will fix this problem by finding the old active slave and
    swap them MAC address before change active slave.
    
    Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
    Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 55ae4d26822039182a7132619e1dae891ba744d2
Author: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date:   Wed Jul 15 21:52:51 2015 +0200

    bonding: fix destruction of bond with devices different from arphrd_ether
    
    [ Upstream commit 06f6d1094aa0992432b1e2a0920b0ee86ccd83bf ]
    
    When the bonding is being unloaded and the netdevice notifier is
    unregistered it executes NETDEV_UNREGISTER for each device which should
    remove the bond's proc entry but if the device enslaved is not of
    ARPHRD_ETHER type and is in front of the bonding, it may execute
    bond_release_and_destroy() first which would release the last slave and
    destroy the bond device leaving the proc entry and thus we will get the
    following error (with dynamic debug on for bond_netdev_event to see the
    events order):
    [  908.963051] eql: event: 9
    [  908.963052] eql: IFF_SLAVE
    [  908.963054] eql: event: 2
    [  908.963056] eql: IFF_SLAVE
    [  908.963058] eql: event: 6
    [  908.963059] eql: IFF_SLAVE
    [  908.963110] bond0: Releasing active interface eql
    [  908.976168] bond0: Destroying bond bond0
    [  908.976266] bond0 (unregistering): Released all slaves
    [  908.984097] ------------[ cut here ]------------
    [  908.984107] WARNING: CPU: 0 PID: 1787 at fs/proc/generic.c:575
    remove_proc_entry+0x112/0x160()
    [  908.984110] remove_proc_entry: removing non-empty directory
    'net/bonding', leaking at least 'bond0'
    [  908.984111] Modules linked in: bonding(-) eql(O) 9p nfsd auth_rpcgss
    oid_registry nfs_acl nfs lockd grace fscache sunrpc crct10dif_pclmul
    crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev qxl drm_kms_helper
    snd_hda_codec_generic aesni_intel ttm aes_x86_64 glue_helper pcspkr lrw
    gf128mul ablk_helper cryptd snd_hda_intel virtio_console snd_hda_codec
    psmouse serio_raw snd_hwdep snd_hda_core 9pnet_virtio 9pnet evdev joydev
    drm virtio_balloon snd_pcm snd_timer snd soundcore i2c_piix4 i2c_core
    pvpanic acpi_cpufreq parport_pc parport processor thermal_sys button
    autofs4 ext4 crc16 mbcache jbd2 hid_generic usbhid hid sg sr_mod cdrom
    ata_generic virtio_blk virtio_net floppy ata_piix e1000 libata ehci_pci
    virtio_pci scsi_mod uhci_hcd ehci_hcd virtio_ring virtio usbcore
    usb_common [last unloaded: bonding]
    
    [  908.984168] CPU: 0 PID: 1787 Comm: rmmod Tainted: G        W  O
    4.2.0-rc2+ #8
    [  908.984170] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    [  908.984172]  0000000000000000 ffffffff81732d41 ffffffff81525b34
    ffff8800358dfda8
    [  908.984175]  ffffffff8106c521 ffff88003595af78 ffff88003595af40
    ffff88003e3a4280
    [  908.984178]  ffffffffa058d040 0000000000000000 ffffffff8106c59a
    ffffffff8172ebd0
    [  908.984181] Call Trace:
    [  908.984188]  [<ffffffff81525b34>] ? dump_stack+0x40/0x50
    [  908.984193]  [<ffffffff8106c521>] ? warn_slowpath_common+0x81/0xb0
    [  908.984196]  [<ffffffff8106c59a>] ? warn_slowpath_fmt+0x4a/0x50
    [  908.984199]  [<ffffffff81218352>] ? remove_proc_entry+0x112/0x160
    [  908.984205]  [<ffffffffa05850e6>] ? bond_destroy_proc_dir+0x26/0x30
    [bonding]
    [  908.984208]  [<ffffffffa057540e>] ? bond_net_exit+0x8e/0xa0 [bonding]
    [  908.984217]  [<ffffffff8142f407>] ? ops_exit_list.isra.4+0x37/0x70
    [  908.984225]  [<ffffffff8142f52d>] ?
    unregister_pernet_operations+0x8d/0xd0
    [  908.984228]  [<ffffffff8142f58d>] ?
    unregister_pernet_subsys+0x1d/0x30
    [  908.984232]  [<ffffffffa0585269>] ? bonding_exit+0x23/0xdba [bonding]
    [  908.984236]  [<ffffffff810e28ba>] ? SyS_delete_module+0x18a/0x250
    [  908.984241]  [<ffffffff81086f99>] ? task_work_run+0x89/0xc0
    [  908.984244]  [<ffffffff8152b732>] ?
    entry_SYSCALL_64_fastpath+0x16/0x75
    [  908.984247] ---[ end trace 7c006ed4abbef24b ]---
    
    Thus remove the proc entry manually if bond_release_and_destroy() is
    used. Because of the checks in bond_remove_proc_entry() it's not a
    problem for a bond device to change namespaces (the bug fixed by the
    Fixes commit) but since commit
    f9399814927ad ("bonding: Don't allow bond devices to change network
    namespaces.") that can't happen anyway.
    
    Reported-by: Carol Soto <clsoto@linux.vnet.ibm.com>
    Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
    Fixes: a64d49c3dd50 ("bonding: Manage /proc/net/bonding/ entries from
                          the netdev events")
    Tested-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bc31dfc82c43b54fd193c24594192eb10b065d6d
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Jul 14 08:10:22 2015 +0200

    ipv6: lock socket in ip6_datagram_connect()
    
    [ Upstream commit 03645a11a570d52e70631838cb786eb4253eb463 ]
    
    ip6_datagram_connect() is doing a lot of socket changes without
    socket being locked.
    
    This looks wrong, at least for udp_lib_rehash() which could corrupt
    lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 006dc8bbef09b215c43d559661137211770f031a
Author: Tilman Schmidt <tilman@imap.cc>
Date:   Tue Jul 14 00:37:13 2015 +0200

    isdn/gigaset: reset tty->receive_room when attaching ser_gigaset
    
    [ Upstream commit fd98e9419d8d622a4de91f76b306af6aa627aa9c ]
    
    Commit 79901317ce80 ("n_tty: Don't flush buffer when closing ldisc"),
    first merged in kernel release 3.10, caused the following regression
    in the Gigaset M101 driver:
    
    Before that commit, when closing the N_TTY line discipline in
    preparation to switching to N_GIGASET_M101, receive_room would be
    reset to a non-zero value by the call to n_tty_flush_buffer() in
    n_tty's close method. With the removal of that call, receive_room
    might be left at zero, blocking data reception on the serial line.
    
    The present patch fixes that regression by setting receive_room
    to an appropriate value in the ldisc open method.
    
    Fixes: 79901317ce80 ("n_tty: Don't flush buffer when closing ldisc")
    Signed-off-by: Tilman Schmidt <tilman@imap.cc>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit cf4a8f241ebea97abf9438569f98ab3330bca01c
Author: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date:   Mon Jul 13 06:36:19 2015 -0700

    bridge: mdb: fix double add notification
    
    [ Upstream commit 5ebc784625ea68a9570d1f70557e7932988cd1b4 ]
    
    Since the mdb add/del code was introduced there have been 2 br_mdb_notify
    calls when doing br_mdb_add() resulting in 2 notifications on each add.
    
    Example:
     Command: bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
     Before patch:
     root@debian:~# bridge monitor all
     [MDB]dev br0 port eth1 grp 239.0.0.1 permanent
     [MDB]dev br0 port eth1 grp 239.0.0.1 permanent
    
     After patch:
     root@debian:~# bridge monitor all
     [MDB]dev br0 port eth1 grp 239.0.0.1 permanent
    
    Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
    Fixes: cfd567543590 ("bridge: add support of adding and deleting mdb entries")
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 414913c2d0a2141bdbe4cd8d9546c23d66d7b402
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue Aug 4 15:42:47 2015 +0800

    net: Fix skb_set_peeked use-after-free bug
    
    [ Upstream commit a0a2a6602496a45ae838a96db8b8173794b5d398 ]
    
    The commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec ("net: Clone
    skb before setting peeked flag") introduced a use-after-free bug
    in skb_recv_datagram.  This is because skb_set_peeked may create
    a new skb and free the existing one.  As it stands the caller will
    continue to use the old freed skb.
    
    This patch fixes it by making skb_set_peeked return the new skb
    (or the old one if unchanged).
    
    Fixes: 738ac1ebb96d ("net: Clone skb before setting peeked flag")
    Reported-by: Brenden Blanco <bblanco@plumgrid.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Tested-by: Brenden Blanco <bblanco@plumgrid.com>
    Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d8f7f61b238941cf2b31be65307b8f38cc0486b9
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Mon Jul 13 20:01:42 2015 +0800

    net: Fix skb csum races when peeking
    
    [ Upstream commit 89c22d8c3b278212eef6a8cc66b570bc840a6f5a ]
    
    When we calculate the checksum on the recv path, we store the
    result in the skb as an optimisation in case we need the checksum
    again down the line.
    
    This is in fact bogus for the MSG_PEEK case as this is done without
    any locking.  So multiple threads can peek and then store the result
    to the same skb, potentially resulting in bogus skb states.
    
    This patch fixes this by only storing the result if the skb is not
    shared.  This preserves the optimisations for the few cases where
    it can be done safely due to locking or other reasons, e.g., SIOCINQ.
    
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2482787e9f54e275373f32f0924fe4392c4c9e80
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Mon Jul 13 16:04:13 2015 +0800

    net: Clone skb before setting peeked flag
    
    [ Upstream commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec ]
    
    Shared skbs must not be modified and this is crucial for broadcast
    and/or multicast paths where we use it as an optimisation to avoid
    unnecessary cloning.
    
    The function skb_recv_datagram breaks this rule by setting peeked
    without cloning the skb first.  This causes funky races which leads
    to double-free.
    
    This patch fixes this by cloning the skb and replacing the skb
    in the list when setting skb->peeked.
    
    Fixes: a59322be07c9 ("[UDP]: Only increment counter on first peek/recv")
    Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 69f02461e4c79aec12c66ae657b3ded6f7a9c94c
Author: Julian Anastasov <ja@ssi.bg>
Date:   Thu Jul 9 09:59:10 2015 +0300

    net: call rcu_read_lock early in process_backlog
    
    [ Upstream commit 2c17d27c36dcce2b6bf689f41a46b9e909877c21 ]
    
    Incoming packet should be either in backlog queue or
    in RCU read-side section. Otherwise, the final sequence of
    flush_backlog() and synchronize_net() may miss packets
    that can run without device reference:
    
    CPU 1                  CPU 2
                           skb->dev: no reference
                           process_backlog:__skb_dequeue
                           process_backlog:local_irq_enable
    
    on_each_cpu for
    flush_backlog =>       IPI(hardirq): flush_backlog
                           - packet not found in backlog
    
                           CPU delayed ...
    synchronize_net
    - no ongoing RCU
    read-side sections
    
    netdev_run_todo,
    rcu_barrier: no
    ongoing callbacks
                           __netif_receive_skb_core:rcu_read_lock
                           - too late
    free dev
                           process packet for freed dev
    
    Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: Julian Anastasov <ja@ssi.bg>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2095ab2456da766174cda8bc105d7a7b1bc70e16
Author: Julian Anastasov <ja@ssi.bg>
Date:   Thu Jul 9 09:59:09 2015 +0300

    net: do not process device backlog during unregistration
    
    [ Upstream commit e9e4dd3267d0c5234c5c0f47440456b10875dec9 ]
    
    commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
    fixes a problem where processed packet comes from device
    with destroyed inetdev (dev->ip_ptr). This is not expected
    because inetdev_destroy is called in NETDEV_UNREGISTER
    phase and packets should not be processed after
    dev_close_many() and synchronize_net(). Above fix is still
    required because inetdev_destroy can be called for other
    reasons. But it shows the real problem: backlog can keep
    packets for long time and they do not hold reference to
    device. Such packets are then delivered to upper levels
    at the same time when device is unregistered.
    Calling flush_backlog after NETDEV_UNREGISTER_FINAL still
    accounts all packets from backlog but before that some packets
    continue to be delivered to upper levels long after the
    synchronize_net call which is supposed to wait the last
    ones. Also, as Eric pointed out, processed packets, mostly
    from other devices, can continue to add new packets to backlog.
    
    Fix the problem by moving flush_backlog early, after the
    device driver is stopped and before the synchronize_net() call.
    Then use netif_running check to make sure we do not add more
    packets to backlog. We have to do it in enqueue_to_backlog
    context when the local IRQ is disabled. As result, after the
    flush_backlog and synchronize_net sequence all packets
    should be accounted.
    
    Thanks to Eric W. Biederman for the test script and his
    valuable feedback!
    
    Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
    Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: Julian Anastasov <ja@ssi.bg>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 45b5e68e764981b1dc69417504bfc2ed9c72b064
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Wed Jul 8 21:42:11 2015 +0200

    net: pktgen: fix race between pktgen_thread_worker() and kthread_stop()
    
    [ Upstream commit fecdf8be2d91e04b0a9a4f79ff06499a36f5d14f ]
    
    pktgen_thread_worker() is obviously racy, kthread_stop() can come
    between the kthread_should_stop() check and set_current_state().
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reported-by: Jan Stancek <jstancek@redhat.com>
    Reported-by: Marcelo Leitner <mleitner@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d46c3f82dce0dd99a8ff22e8e45d86abb9683fd5
Author: Nikolay Aleksandrov <razor@blackwall.org>
Date:   Tue Jul 7 15:55:56 2015 +0200

    bridge: mdb: zero out the local br_ip variable before use
    
    [ Upstream commit f1158b74e54f2e2462ba5e2f45a118246d9d5b43 ]
    
    Since commit b0e9a30dd669 ("bridge: Add vlan id to multicast groups")
    there's a check in br_ip_equal() for a matching vlan id, but the mdb
    functions were not modified to use (or at least zero it) so when an
    entry was added it would have a garbage vlan id (from the local br_ip
    variable in __br_mdb_add/del) and this would prevent it from being
    matched and also deleted. So zero out the whole local ip var to protect
    ourselves from future changes and also to fix the current bug, since
    there's no vlan id support in the mdb uapi - use always vlan id 0.
    Example before patch:
    root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
    root@debian:~# bridge mdb
    dev br0 port eth1 grp 239.0.0.1 permanent
    root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent
    RTNETLINK answers: Invalid argument
    
    After patch:
    root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
    root@debian:~# bridge mdb
    dev br0 port eth1 grp 239.0.0.1 permanent
    root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent
    root@debian:~# bridge mdb
    
    Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
    Fixes: b0e9a30dd669 ("bridge: Add vlan id to multicast groups")
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4db9a9724fdf1a61ef99e09580237db07503d21e
Author: Stephen Smalley <sds@tycho.nsa.gov>
Date:   Tue Jul 7 09:43:45 2015 -0400

    net/tipc: initialize security state for new connection socket
    
    [ Upstream commit fdd75ea8df370f206a8163786e7470c1277a5064 ]
    
    Calling connect() with an AF_TIPC socket would trigger a series
    of error messages from SELinux along the lines of:
    SELinux: Invalid class 0
    type=AVC msg=audit(1434126658.487:34500): avc:  denied  { <unprintable> }
      for pid=292 comm="kworker/u16:5" scontext=system_u:system_r:kernel_t:s0
      tcontext=system_u:object_r:unlabeled_t:s0 tclass=<unprintable>
      permissive=0
    
    This was due to a failure to initialize the security state of the new
    connection sock by the tipc code, leaving it with junk in the security
    class field and an unlabeled secid.  Add a call to security_sk_clone()
    to inherit the security state from the parent socket.
    
    Reported-by: Tim Shearer <tim.shearer@overturenetworks.com>
    Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
    Acked-by: Paul Moore <paul@paul-moore.com>
    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5a2c3656bd466dffd0eaef5f8149a668be60a3bd
Author: Timo Teräs <timo.teras@iki.fi>
Date:   Tue Jul 7 08:34:13 2015 +0300

    ip_tunnel: fix ipv4 pmtu check to honor inner ip header df
    
    [ Upstream commit fc24f2b2094366da8786f59f2606307e934cea17 ]
    
    Frag needed should be sent only if the inner header asked
    to not fragment. Currently fragmentation is broken if the
    tunnel has df set, but df was not asked in the original
    packet. The tunnel's df needs to be still checked to update
    internally the pmtu cache.
    
    Commit 23a3647bc4f93bac broke it, and this commit fixes
    the ipv4 df check back to the way it was.
    
    Fixes: 23a3647bc4f93bac ("ip_tunnels: Use skb-len to PMTU check.")
    Cc: Pravin B Shelar <pshelar@nicira.com>
    Signed-off-by: Timo Teräs <timo.teras@iki.fi>
    Acked-by: Pravin B Shelar <pshelar@nicira.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c8fa57bbcfc2cd2ce5e6f4e89d860c58ba401279
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Tue Jul 7 00:07:52 2015 +0200

    rtnetlink: verify IFLA_VF_INFO attributes before passing them to driver
    
    [ Upstream commit 4f7d2cdfdde71ffe962399b7020c674050329423 ]
    
    Jason Gunthorpe reported that since commit c02db8c6290b ("rtnetlink: make
    SR-IOV VF interface symmetric"), we don't verify IFLA_VF_INFO attributes
    anymore with respect to their policy, that is, ifla_vfinfo_policy[].
    
    Before, they were part of ifla_policy[], but they have been nested since
    placed under IFLA_VFINFO_LIST, that contains the attribute IFLA_VF_INFO,
    which is another nested attribute for the actual VF attributes such as
    IFLA_VF_MAC, IFLA_VF_VLAN, etc.
    
    Despite the policy being split out from ifla_policy[] in this commit,
    it's never applied anywhere. nla_for_each_nested() only does basic nla_ok()
    testing for struct nlattr, but it doesn't know about the data context and
    their requirements.
    
    Fix, on top of Jason's initial work, does 1) parsing of the attributes
    with the right policy, and 2) using the resulting parsed attribute table
    from 1) instead of the nla_for_each_nested() loop (just like we used to
    do when still part of ifla_policy[]).
    
    Reference: http://thread.gmane.org/gmane.linux.network/368913
    Fixes: c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric")
    Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
    Cc: Chris Wright <chrisw@sous-sol.org>
    Cc: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
    Cc: Greg Rose <gregory.v.rose@intel.com>
    Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    Cc: Rony Efraim <ronye@mellanox.com>
    Cc: Vlad Zolotarov <vladz@cloudius-systems.com>
    Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Cc: Thomas Graf <tgraf@suug.ch>
    Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Vlad Zolotarov <vladz@cloudius-systems.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6133a5c1e5a5a001bff75977653a800fb5c13735
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Jul 6 17:13:26 2015 +0200

    net: graceful exit from netif_alloc_netdev_queues()
    
    [ Upstream commit d339727c2b1a10f25e6636670ab6e1841170e328 ]
    
    User space can crash kernel with
    
    ip link add ifb10 numtxqueues 100000 type ifb
    
    We must replace a BUG_ON() by proper test and return -EINVAL for
    crazy values.
    
    Fixes: 60877a32bce00 ("net: allow large number of tx queues")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1742bfe8f498dc13a1de8b2b452bd1ec3aac0544
Author: Angga <Hermin.Anggawijaya@alliedtelesis.co.nz>
Date:   Fri Jul 3 14:40:52 2015 +1200

    ipv6: Make MLD packets to only be processed locally
    
    [ Upstream commit 4c938d22c88a9ddccc8c55a85e0430e9c62b1ac5 ]
    
    Before commit daad151263cf ("ipv6: Make ipv6_is_mld() inline and use it
    from ip6_mc_input().") MLD packets were only processed locally. After the
    change, a copy of MLD packet goes through ip6_mr_input, causing
    MRT6MSG_NOCACHE message to be generated to user space.
    
    Make MLD packet only processed locally.
    
    Fixes: daad151263cf ("ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input().")
    Signed-off-by: Hermin Anggawijaya <hermin.anggawijaya@alliedtelesis.co.nz>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c824dd9a233f97fa01c6bad48634c32043684613
Author: Hin-Tak Leung <htl10@users.sourceforge.net>
Date:   Wed Sep 9 15:38:04 2015 -0700

    hfs,hfsplus: cache pages correctly between bnode_create and bnode_free
    
    commit 7cb74be6fd827e314f81df3c5889b87e4c87c569 upstream.
    
    Pages looked up by __hfs_bnode_create() (called by hfs_bnode_create() and
    hfs_bnode_find() for finding or creating pages corresponding to an inode)
    are immediately kmap()'ed and used (both read and write) and kunmap()'ed,
    and should not be page_cache_release()'ed until hfs_bnode_free().
    
    This patch fixes a problem I first saw in July 2012: merely running "du"
    on a large hfsplus-mounted directory a few times on a reasonably loaded
    system would get the hfsplus driver all confused and complaining about
    B-tree inconsistencies, and generates a "BUG: Bad page state".  Most
    recently, I can generate this problem on up-to-date Fedora 22 with shipped
    kernel 4.0.5, by running "du /" (="/" + "/home" + "/mnt" + other smaller
    mounts) and "du /mnt" simultaneously on two windows, where /mnt is a
    lightly-used QEMU VM image of the full Mac OS X 10.9:
    
    $ df -i / /home /mnt
    Filesystem                  Inodes   IUsed      IFree IUse% Mounted on
    /dev/mapper/fedora-root    3276800  551665    2725135   17% /
    /dev/mapper/fedora-home   52879360  716221   52163139    2% /home
    /dev/nbd0p2             4294967295 1387818 4293579477    1% /mnt
    
    After applying the patch, I was able to run "du /" (60+ times) and "du
    /mnt" (150+ times) continuously and simultaneously for 6+ hours.
    
    There are many reports of the hfsplus driver getting confused under load
    and generating "BUG: Bad page state" or other similar issues over the
    years.  [1]
    
    The unpatched code [2] has always been wrong since it entered the kernel
    tree.  The only reason why it gets away with it is that the
    kmap/memcpy/kunmap follow very quickly after the page_cache_release() so
    the kernel has not had a chance to reuse the memory for something else,
    most of the time.
    
    The current RW driver appears to have followed the design and development
    of the earlier read-only hfsplus driver [3], where-by version 0.1 (Dec
    2001) had a B-tree node-centric approach to
    read_cache_page()/page_cache_release() per bnode_get()/bnode_put(),
    migrating towards version 0.2 (June 2002) of caching and releasing pages
    per inode extents.  When the current RW code first entered the kernel [2]
    in 2005, there was an REF_PAGES conditional (and "//" commented out code)
    to switch between B-node centric paging to inode-centric paging.  There
    was a mistake with the direction of one of the REF_PAGES conditionals in
    __hfs_bnode_create().  In a subsequent "remove debug code" commit [4], the
    read_cache_page()/page_cache_release() per bnode_get()/bnode_put() were
    removed, but a page_cache_release() was mistakenly left in (propagating
    the "REF_PAGES <-> !REF_PAGE" mistake), and the commented-out
    page_cache_release() in bnode_release() (which should be spanned by
    !REF_PAGES) was never enabled.
    
    References:
    [1]:
    Michael Fox, Apr 2013
    http://www.spinics.net/lists/linux-fsdevel/msg63807.html
    ("hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'")
    
    Sasha Levin, Feb 2015
    http://lkml.org/lkml/2015/2/20/85 ("use after free")
    
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/740814
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1027887
    https://bugzilla.kernel.org/show_bug.cgi?id=42342
    https://bugzilla.kernel.org/show_bug.cgi?id=63841
    https://bugzilla.kernel.org/show_bug.cgi?id=78761
    
    [2]:
    http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
    fs/hfs/bnode.c?id=d1081202f1d0ee35ab0beb490da4b65d4bc763db
    commit d1081202f1d0ee35ab0beb490da4b65d4bc763db
    Author: Andrew Morton <akpm@osdl.org>
    Date:   Wed Feb 25 16:17:36 2004 -0800
    
        [PATCH] HFS rewrite
    
    http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
    fs/hfsplus/bnode.c?id=91556682e0bf004d98a529bf829d339abb98bbbd
    
    commit 91556682e0bf004d98a529bf829d339abb98bbbd
    Author: Andrew Morton <akpm@osdl.org>
    Date:   Wed Feb 25 16:17:48 2004 -0800
    
        [PATCH] HFS+ support
    
    [3]:
    http://sourceforge.net/projects/linux-hfsplus/
    
    http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/
    http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/
    
    http://linux-hfsplus.cvs.sourceforge.net/viewvc/linux-hfsplus/linux/\
    fs/hfsplus/bnode.c?r1=1.4&r2=1.5
    
    Date:   Thu Jun 6 09:45:14 2002 +0000
    Use buffer cache instead of page cache in bnode.c. Cache inode extents.
    
    [4]:
    http://git.kernel.org/cgit/linux/kernel/git/\
    stable/linux-stable.git/commit/?id=a5e3985fa014029eb6795664c704953720cc7f7d
    
    commit a5e3985fa014029eb6795664c704953720cc7f7d
    Author: Roman Zippel <zippel@linux-m68k.org>
    Date:   Tue Sep 6 15:18:47 2005 -0700
    
    [PATCH] hfs: remove debug code
    
    Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
    Signed-off-by: Sergei Antonov <saproj@gmail.com>
    Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
    Reported-by: Sasha Levin <sasha.levin@oracle.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
    Cc: Sougata Santra <sougata@tuxera.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bf6391e2938245fed93866aaa6ac903115869b32
Author: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Date:   Wed Jun 24 11:47:41 2015 +0300

    stmmac: troubleshoot unexpected bits in des0 & des1
    
    commit f1590670ce069eefeb93916391a67643e6ad1630 upstream.
    
    Current implementation of descriptor init procedure only takes
    care about setting/clearing ownership flag in "des0"/"des1"
    fields while it is perfectly possible to get unexpected bits
    set because of the following factors:
    
     [1] On driver probe underlying memory allocated with
         dma_alloc_coherent() might not be zeroed and so
         it will be filled with garbage.
    
     [2] During driver operation some bits could be set by SD/MMC
         controller (for example error flags etc).
    
    And unexpected and/or randomly set flags in "des0"/"des1"
    fields may lead to unpredictable behavior of GMAC DMA block.
    
    This change addresses both items above with:
    
     [1] Use of dma_zalloc_coherent() instead of simple
         dma_alloc_coherent() to make sure allocated memory is
         zeroed. That shouldn't affect performance because
         this allocation only happens once on driver probe.
    
     [2] Do explicit zeroing of both "des0" and "des1" fields
         of all buffer descriptors during initialization of
         DMA transfer.
    
    And while at it fixed identation of dma_free_coherent()
    counterpart as well.
    
    Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
    Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
    Cc: arc-linux-dev@synopsys.com
    Cc: linux-kernel@vger.kernel.org
    Cc: stable@vger.kernel.org
    Cc: David Miller <davem@davemloft.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 86f1e0692fe3ac6bd675f75e3bab0ce811e7bb3b
Author: Noa Osherovich <noaos@mellanox.com>
Date:   Thu Jul 30 17:34:24 2015 +0300

    IB/mlx4: Use correct SL on AH query under RoCE
    
    commit 5e99b139f1b68acd65e36515ca347b03856dfb5a upstream.
    
    The mlx4 IB driver implementation for ib_query_ah used a wrong offset
    (28 instead of 29) when link type is Ethernet. Fixed to use the correct one.
    
    Fixes: fa417f7b520e ('IB/mlx4: Add support for IBoE')
    Signed-off-by: Shani Michaeli <shanim@mellanox.com>
    Signed-off-by: Noa Osherovich <noaos@mellanox.com>
    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3fb968bf0f3c8e5792e0c128d7c1330cb5ce9ff5
Author: Jack Morgenstein <jackm@dev.mellanox.co.il>
Date:   Thu Jul 30 17:34:23 2015 +0300

    IB/mlx4: Forbid using sysfs to change RoCE pkeys
    
    commit 2b135db3e81301d0452e6aa107349abe67b097d6 upstream.
    
    The pkey mapping for RoCE must remain the default mapping:
    VFs:
      virtual index 0 = mapped to real index 0 (0xFFFF)
      All others indices: mapped to a real pkey index containing an
                          invalid pkey.
    PF:
      virtual index i = real index i.
    
    Don't allow users to change these mappings using files found in
    sysfs.
    
    Fixes: c1e7e466120b ('IB/mlx4: Add iov directory in sysfs under the ib device')
    Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 20dbae92ee5ee3d0b6d4b6fb6ba6428d9475467e
Author: Yishai Hadas <yishaih@mellanox.com>
Date:   Thu Aug 13 18:32:03 2015 +0300

    IB/uverbs: Fix race between ib_uverbs_open and remove_one
    
    commit 35d4a0b63dc0c6d1177d4f532a9deae958f0662c upstream.
    
    Fixes: 2a72f212263701b927559f6850446421d5906c41 ("IB/uverbs: Remove dev_table")
    
    Before this commit there was a device look-up table that was protected
    by a spin_lock used by ib_uverbs_open and by ib_uverbs_remove_one. When
    it was dropped and container_of was used instead, it enabled the race
    with remove_one as dev might be freed just after:
    dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev) but
    before the kref_get.
    
    In addition, this buggy patch added some dead code as
    container_of(x,y,z) can never be NULL and so dev can never be NULL.
    As a result the comment above ib_uverbs_open saying "the open method
    will either immediately run -ENXIO" is wrong as it can never happen.
    
    The solution follows Jason Gunthorpe suggestion from below URL:
    https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg25692.html
    
    cdev will hold a kref on the parent (the containing structure,
    ib_uverbs_device) and only when that kref is released it is
    guaranteed that open will never be called again.
    
    In addition, fixes the active count scheme to use an atomic
    not a kref to prevent WARN_ON as pointed by above comment
    from Jason.
    
    Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
    Signed-off-by: Shachar Raindel <raindel@mellanox.com>
    Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 58922dda37a03d4ae7a51f5fe1fc042cbc749b51
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Aug 26 11:00:37 2015 +0200

    IB/uverbs: reject invalid or unknown opcodes
    
    commit b632ffa7cee439ba5dce3b3bc4a5cbe2b3e20133 upstream.
    
    We have many WR opcodes that are only supported in kernel space
    and/or require optional information to be copied into the WR
    structure.  Reject all those not explicitly handled so that we
    can't pass invalid information to drivers.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
    Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 25a58b07f7993b551acf22dfdb90d81282efb200
Author: Mike Marciniszyn <mike.marciniszyn@intel.com>
Date:   Tue Jul 21 08:36:07 2015 -0400

    IB/qib: Change lkey table allocation to support more MRs
    
    commit d6f1c17e162b2a11e708f28fa93f2f79c164b442 upstream.
    
    The lkey table is allocated with with a get_user_pages() with an
    order based on a number of index bits from a module parameter.
    
    The underlying kernel code cannot allocate that many contiguous pages.
    
    There is no reason the underlying memory needs to be physically
    contiguous.
    
    This patch:
    - switches the allocation/deallocation to vmalloc/vfree
    - caps the number of bits to 23 to insure at least 1 generation bit
      o this matches the module parameter description
    
    Reviewed-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com>
    Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 18fd1e934ae3e204820b13a2b87bac9f7254a46d
Author: Hin-Tak Leung <htl10@users.sourceforge.net>
Date:   Wed Sep 9 15:38:07 2015 -0700

    hfs: fix B-tree corruption after insertion at position 0
    
    commit b4cc0efea4f0bfa2477c56af406cfcf3d3e58680 upstream.
    
    Fix B-tree corruption when a new record is inserted at position 0 in the
    node in hfs_brec_insert().
    
    This is an identical change to the corresponding hfs b-tree code to Sergei
    Antonov's "hfsplus: fix B-tree corruption after insertion at position 0",
    to keep similar code paths in the hfs and hfsplus drivers in sync, where
    appropriate.
    
    Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
    Cc: Sergei Antonov <saproj@gmail.com>
    Cc: Joe Perches <joe@perches.com>
    Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com>
    Cc: Anton Altaparmakov <anton@tuxera.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e254330e799f2745c6cb929c0df99644e5790d05
Author: David Vrabel <david.vrabel@citrix.com>
Date:   Fri Jan 9 18:06:12 2015 +0000

    xen/gntdev: convert priv->lock to a mutex
    
    commit 1401c00e59ea021c575f74612fe2dbba36d6a4ee upstream.
    
    Unmapping may require sleeping and we unmap while holding priv->lock, so
    convert it to a mutex.
    
    Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Cc: Ian Campbell <ian.campbell@citrix.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a78b86334c03d29ad4d426d6af47c1171c2e86ac
Author: NeilBrown <neilb@suse.com>
Date:   Mon Jul 6 17:37:49 2015 +1000

    md/raid10: always set reshape_safe when initializing reshape_position.
    
    commit 299b0685e31c9f3dcc2d58ee3beca761a40b44b3 upstream.
    
    'reshape_position' tracks where in the reshape we have reached.
    'reshape_safe' tracks where in the reshape we have safely recorded
    in the metadata.
    
    These are compared to determine when to update the metadata.
    So it is important that reshape_safe is initialised properly.
    Currently it isn't.  When starting a reshape from the beginning
    it usually has the correct value by luck.  But when reducing the
    number of devices in a RAID10, it has the wrong value and this leads
    to the metadata not being updated correctly.
    This can lead to corruption if the reshape is not allowed to complete.
    
    This patch is suitable for any -stable kernel which supports RAID10
    reshape, which is 3.5 and later.
    
    Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support")
    Signed-off-by: NeilBrown <neilb@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a30c50d2817332e8c5701134b6a869be1a9a1d70
Author: Jialing Fu <jlfu@marvell.com>
Date:   Fri Aug 28 11:13:09 2015 +0800

    mmc: core: fix race condition in mmc_wait_data_done
    
    commit 71f8a4b81d040b3d094424197ca2f1bf811b1245 upstream.
    
    The following panic is captured in ker3.14, but the issue still exists
    in latest kernel.
    ---------------------------------------------------------------------
    [   20.738217] c0 3136 (Compiler) Unable to handle kernel NULL pointer dereference
    at virtual address 00000578
    ......
    [   20.738499] c0 3136 (Compiler) PC is at _raw_spin_lock_irqsave+0x24/0x60
    [   20.738527] c0 3136 (Compiler) LR is at _raw_spin_lock_irqsave+0x20/0x60
    [   20.740134] c0 3136 (Compiler) Call trace:
    [   20.740165] c0 3136 (Compiler) [<ffffffc0008ee900>] _raw_spin_lock_irqsave+0x24/0x60
    [   20.740200] c0 3136 (Compiler) [<ffffffc0000dd024>] __wake_up+0x1c/0x54
    [   20.740230] c0 3136 (Compiler) [<ffffffc000639414>] mmc_wait_data_done+0x28/0x34
    [   20.740262] c0 3136 (Compiler) [<ffffffc0006391a0>] mmc_request_done+0xa4/0x220
    [   20.740314] c0 3136 (Compiler) [<ffffffc000656894>] sdhci_tasklet_finish+0xac/0x264
    [   20.740352] c0 3136 (Compiler) [<ffffffc0000a2b58>] tasklet_action+0xa0/0x158
    [   20.740382] c0 3136 (Compiler) [<ffffffc0000a2078>] __do_softirq+0x10c/0x2e4
    [   20.740411] c0 3136 (Compiler) [<ffffffc0000a24bc>] irq_exit+0x8c/0xc0
    [   20.740439] c0 3136 (Compiler) [<ffffffc00008489c>] handle_IRQ+0x48/0xac
    [   20.740469] c0 3136 (Compiler) [<ffffffc000081428>] gic_handle_irq+0x38/0x7c
    ----------------------------------------------------------------------
    Because in SMP, "mrq" has race condition between below two paths:
    path1: CPU0: <tasklet context>
      static void mmc_wait_data_done(struct mmc_request *mrq)
      {
         mrq->host->context_info.is_done_rcv = true;
         //
         // If CPU0 has just finished "is_done_rcv = true" in path1, and at
         // this moment, IRQ or ICache line missing happens in CPU0.
         // What happens in CPU1 (path2)?
         //
         // If the mmcqd thread in CPU1(path2) hasn't entered to sleep mode:
         // path2 would have chance to break from wait_event_interruptible
         // in mmc_wait_for_data_req_done and continue to run for next
         // mmc_request (mmc_blk_rw_rq_prep).
         //
         // Within mmc_blk_rq_prep, mrq is cleared to 0.
         // If below line still gets host from "mrq" as the result of
         // compiler, the panic happens as we traced.
         wake_up_interruptible(&mrq->host->context_info.wait);
      }
    
    path2: CPU1: <The mmcqd thread runs mmc_queue_thread>
      static int mmc_wait_for_data_req_done(...
      {
         ...
         while (1) {
               wait_event_interruptible(context_info->wait,
                       (context_info->is_done_rcv ||
                        context_info->is_new_req));
         	   static void mmc_blk_rw_rq_prep(...
               {
               ...
               memset(brq, 0, sizeof(struct mmc_blk_request));
    
    This issue happens very coincidentally; however adding mdelay(1) in
    mmc_wait_data_done as below could duplicate it easily.
    
       static void mmc_wait_data_done(struct mmc_request *mrq)
       {
         mrq->host->context_info.is_done_rcv = true;
    +    mdelay(1);
         wake_up_interruptible(&mrq->host->context_info.wait);
        }
    
    At runtime, IRQ or ICache line missing may just happen at the same place
    of the mdelay(1).
    
    This patch gets the mmc_context_info at the beginning of function, it can
    avoid this race condition.
    
    Signed-off-by: Jialing Fu <jlfu@marvell.com>
    Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
    Fixes: 2220eedfd7ae ("mmc: fix async request mechanism ....")
    Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9dbc5c788c0ad8a8ba5c58459770df125d456426
Author: Jann Horn <jann@thejh.net>
Date:   Wed Sep 9 15:38:28 2015 -0700

    fs: if a coredump already exists, unlink and recreate with O_EXCL
    
    commit fbb1816942c04429e85dbf4c1a080accc534299e upstream.
    
    It was possible for an attacking user to trick root (or another user) into
    writing his coredumps into an attacker-readable, pre-existing file using
    rename() or link(), causing the disclosure of secret data from the victim
    process' virtual memory.  Depending on the configuration, it was also
    possible to trick root into overwriting system files with coredumps.  Fix
    that issue by never writing coredumps into existing files.
    
    Requirements for the attack:
     - The attack only applies if the victim's process has a nonzero
       RLIMIT_CORE and is dumpable.
     - The attacker can trick the victim into coredumping into an
       attacker-writable directory D, either because the core_pattern is
       relative and the victim's cwd is attacker-writable or because an
       absolute core_pattern pointing to a world-writable directory is used.
     - The attacker has one of these:
      A: on a system with protected_hardlinks=0:
         execute access to a folder containing a victim-owned,
         attacker-readable file on the same partition as D, and the
         victim-owned file will be deleted before the main part of the attack
         takes place. (In practice, there are lots of files that fulfill
         this condition, e.g. entries in Debian's /var/lib/dpkg/info/.)
         This does not apply to most Linux systems because most distros set
         protected_hardlinks=1.
      B: on a system with protected_hardlinks=1:
         execute access to a folder containing a victim-owned,
         attacker-readable and attacker-writable file on the same partition
         as D, and the victim-owned file will be deleted before the main part
         of the attack takes place.
         (This seems to be uncommon.)
      C: on any system, independent of protected_hardlinks:
         write access to a non-sticky folder containing a victim-owned,
         attacker-readable file on the same partition as D
         (This seems to be uncommon.)
    
    The basic idea is that the attacker moves the victim-owned file to where
    he expects the victim process to dump its core.  The victim process dumps
    its core into the existing file, and the attacker reads the coredump from
    it.
    
    If the attacker can't move the file because he does not have write access
    to the containing directory, he can instead link the file to a directory
    he controls, then wait for the original link to the file to be deleted
    (because the kernel checks that the link count of the corefile is 1).
    
    A less reliable variant that requires D to be non-sticky works with link()
    and does not require deletion of the original link: link() the file into
    D, but then unlink() it directly before the kernel performs the link count
    check.
    
    On systems with protected_hardlinks=0, this variant allows an attacker to
    not only gain information from coredumps, but also clobber existing,
    victim-writable files with coredumps.  (This could theoretically lead to a
    privilege escalation.)
    
    Signed-off-by: Jann Horn <jann@thejh.net>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8803492d518ccbdafeabda75d5eabe433319214d
Author: Jaewon Kim <jaewon31.kim@samsung.com>
Date:   Tue Sep 8 15:02:21 2015 -0700

    vmscan: fix increasing nr_isolated incurred by putback unevictable pages
    
    commit c54839a722a02818677bcabe57e957f0ce4f841d upstream.
    
    reclaim_clean_pages_from_list() assumes that shrink_page_list() returns
    number of pages removed from the candidate list.  But shrink_page_list()
    puts back mlocked pages without passing it to caller and without
    counting as nr_reclaimed.  This increases nr_isolated.
    
    To fix this, this patch changes shrink_page_list() to pass unevictable
    pages back to caller.  Caller will take care those pages.
    
    Minchan said:
    
    It fixes two issues.
    
    1. With unevictable page, cma_alloc will be successful.
    
    Exactly speaking, cma_alloc of current kernel will fail due to
    unevictable pages.
    
    2. fix leaking of NR_ISOLATED counter of vmstat
    
    With it, too_many_isolated works.  Otherwise, it could make hang until
    the process get SIGKILL.
    
    Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 136af9b85ad0406325ab846e5148e3dfabea43c7
Author: Helge Deller <deller@gmx.de>
Date:   Thu Sep 3 22:45:21 2015 +0200

    parisc: Filter out spurious interrupts in PA-RISC irq handler
    
    commit b1b4e435e4ef7de77f07bf2a42c8380b960c2d44 upstream.
    
    When detecting a serial port on newer PA-RISC machines (with iosapic) we have a
    long way to go to find the right IRQ line, registering it, then registering the
    serial port and the irq handler for the serial port. During this phase spurious
    interrupts for the serial port may happen which then crashes the kernel because
    the action handler might not have been set up yet.
    
    So, basically it's a race condition between the serial port hardware and the
    CPU which sets up the necessary fields in the irq sructs. The main reason for
    this race is, that we unmask the serial port irqs too early without having set
    up everything properly before (which isn't easily possible because we need the
    IRQ number to register the serial ports).
    
    This patch is a work-around for this problem. It adds checks to the CPU irq
    handler to verify if the IRQ action field has been initialized already. If not,
    we just skip this interrupt (which isn't critical for a serial port at bootup).
    The real fix would probably involve rewriting all PA-RISC specific IRQ code
    (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of
    the irq chips and proper irq enabling along this line.
    
    This bug has been in the PA-RISC port since the beginning, but the crashes
    happened very rarely with currently used hardware.  But on the latest machine
    which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900,
    1GHz) and which has the largest possible L1 cache size (64MB each), the kernel
    crashed at every boot because of this race. So, without this patch the machine
    would currently be unuseable.
    
    For the record, here is the flow logic:
    1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
    2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
    3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
    4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
    5. serial_init_chip() then registers the 8250 port.
    Problems:
    - In step 4 the CPU irq shouldn't have been registered yet, but after step 5
    - If serial irq happens between 4 and 5 have finished, the kernel will crash
    
    Signed-off-by: Helge Deller <deller@gmx.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2d608a34435a5a8525c8d7c337030bae8137d507
Author: John David Anglin <dave.anglin@bell.net>
Date:   Mon Sep 7 20:13:28 2015 -0400

    parisc: Use double word condition in 64bit CAS operation
    
    commit 1b59ddfcf1678de38a1f8ca9fb8ea5eebeff1843 upstream.
    
    The attached change fixes the condition used in the "sub" instruction.
    A double word comparison is needed.  This fixes the 64-bit LWS CAS
    operation on 64-bit kernels.
    
    I can now enable 64-bit atomic support in GCC.
    
    Signed-off-by: John David Anglin <dave.anglin>
    Signed-off-by: Helge Deller <deller@gmx.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e6f5a8e4286cb8ff370f4ea436fc2eff88e8326c
Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Mon Aug 17 12:57:07 2015 -0500

    NFS: nfs_set_pgio_error sometimes misses errors
    
    commit e9ae58aeee8842a50f7e199d602a5ccb2e41a95f upstream.
    
    We should ensure that we always set the pgio_header's error field
    if a READ or WRITE RPC call returns an error. The current code depends
    on 'hdr->good_bytes' always being initialised to a large value, which
    is not always done correctly by callers.
    When this happens, applications may end up missing important errors.
    
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 05c5d5c75b6e30af9b00ecd07eb9f2733741e8df
Author: Kinglong Mee <kinglongmee@gmail.com>
Date:   Sat Aug 15 21:52:10 2015 +0800

    NFS: Fix a NULL pointer dereference of migration recovery ops for v4.2 client
    
    commit 18e3b739fdc826481c6a1335ce0c5b19b3d415da upstream.
    
    ---Steps to Reproduce--
    <nfs-server>
    # cat /etc/exports
    /nfs/referal  *(rw,insecure,no_subtree_check,no_root_squash,crossmnt)
    /nfs/old      *(ro,insecure,subtree_check,root_squash,crossmnt)
    
    <nfs-client>
    # mount -t nfs nfs-server:/nfs/ /mnt/
    # ll /mnt/*/
    
    <nfs-server>
    # cat /etc/exports
    /nfs/referal   *(rw,insecure,no_subtree_check,no_root_squash,crossmnt,refer=/nfs/old/@nfs-server)
    /nfs/old       *(ro,insecure,subtree_check,root_squash,crossmnt)
    # service nfs restart
    
    <nfs-client>
    # ll /mnt/*/    --->>>>> oops here
    
    [ 5123.102925] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [ 5123.103363] IP: [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4]
    [ 5123.103752] PGD 587b9067 PUD 3cbf5067 PMD 0
    [ 5123.104131] Oops: 0000 [#1]
    [ 5123.104529] Modules linked in: nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon parport_pc parport i2c_piix4 shpchp auth_rpcgss nfs_acl vmw_vmci lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi serio_raw scsi_transport_spi e1000 mptscsih mptbase ata_generic pata_acpi [last unloaded: nfsd]
    [ 5123.105887] CPU: 0 PID: 15853 Comm: ::1-manager Tainted: G           OE   4.2.0-rc6+ #214
    [ 5123.106358] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
    [ 5123.106860] task: ffff88007620f300 ti: ffff88005877c000 task.ti: ffff88005877c000
    [ 5123.107363] RIP: 0010:[<ffffffffa03ed38b>]  [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4]
    [ 5123.107909] RSP: 0018:ffff88005877fdb8  EFLAGS: 00010246
    [ 5123.108435] RAX: ffff880053f3bc00 RBX: ffff88006ce6c908 RCX: ffff880053a0d240
    [ 5123.108968] RDX: ffffea0000e6d940 RSI: ffff8800399a0000 RDI: ffff88006ce6c908
    [ 5123.109503] RBP: ffff88005877fe28 R08: ffffffff81c708a0 R09: 0000000000000000
    [ 5123.110045] R10: 00000000000001a2 R11: ffff88003ba7f5c8 R12: ffff880054c55800
    [ 5123.110618] R13: 0000000000000000 R14: ffff880053a0d240 R15: ffff880053a0d240
    [ 5123.111169] FS:  0000000000000000(0000) GS:ffffffff81c27000(0000) knlGS:0000000000000000
    [ 5123.111726] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 5123.112286] CR2: 0000000000000000 CR3: 0000000054cac000 CR4: 00000000001406f0
    [ 5123.112888] Stack:
    [ 5123.113458]  ffffea0000e6d940 ffff8800399a0000 00000000000167d0 0000000000000000
    [ 5123.114049]  0000000000000000 0000000000000000 0000000000000000 00000000a7ec82c6
    [ 5123.114662]  ffff88005877fe18 ffffea0000e6d940 ffff8800399a0000 ffff880054c55800
    [ 5123.115264] Call Trace:
    [ 5123.115868]  [<ffffffffa03fb44b>] nfs4_try_migration+0xbb/0x220 [nfsv4]
    [ 5123.116487]  [<ffffffffa03fcb3b>] nfs4_run_state_manager+0x4ab/0x7b0 [nfsv4]
    [ 5123.117104]  [<ffffffffa03fc690>] ? nfs4_do_reclaim+0x510/0x510 [nfsv4]
    [ 5123.117813]  [<ffffffff810a4527>] kthread+0xd7/0xf0
    [ 5123.118456]  [<ffffffff810a4450>] ? kthread_worker_fn+0x160/0x160
    [ 5123.119108]  [<ffffffff816d9cdf>] ret_from_fork+0x3f/0x70
    [ 5123.119723]  [<ffffffff810a4450>] ? kthread_worker_fn+0x160/0x160
    [ 5123.120329] Code: 4c 8b 6a 58 74 17 eb 52 48 8d 55 a8 89 c6 4c 89 e7 e8 4a b5 ff ff 8b 45 b0 85 c0 74 1c 4c 89 f9 48 8b 55 90 48 8b 75 98 48 89 df <41> ff 55 00 3d e8 d8 ff ff 41 89 c6 74 cf 48 8b 4d c8 65 48 33
    [ 5123.121643] RIP  [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4]
    [ 5123.122308]  RSP <ffff88005877fdb8>
    [ 5123.122942] CR2: 0000000000000000
    
    Fixes: ec011fe847 ("NFS: Introduce a vector of migration recovery ops")
    Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 74e1449ce81833e84e6e20c6ae0682f25d2321ae
Author: NeilBrown <neilb@suse.com>
Date:   Thu Jul 30 13:00:56 2015 +1000

    NFSv4: don't set SETATTR for O_RDONLY|O_EXCL
    
    commit efcbc04e16dfa95fef76309f89710dd1d99a5453 upstream.
    
    It is unusual to combine the open flags O_RDONLY and O_EXCL, but
    it appears that libre-office does just that.
    
    [pid  3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0
    [pid  3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL <unfinished ...>
    
    NFSv4 takes O_EXCL as a sign that a setattr command should be sent,
    probably to reset the timestamps.
    
    When it was an O_RDONLY open, the SETATTR command does not
    identify any actual attributes to change.
    If no delegation was provided to the open, the SETATTR uses the
    all-zeros stateid and the request is accepted (at least by the
    Linux NFS server - no harm, no foul).
    
    If a read-delegation was provided, this is used in the SETATTR
    request, and a Netapp filer will justifiably claim
    NFS4ERR_BAD_STATEID, which the Linux client takes as a sign
    to retry - indefinitely.
    
    So only treat O_EXCL specially if O_CREAT was also given.
    
    Signed-off-by: NeilBrown <neilb@suse.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 20ae73f43103d5c895d8ce8b1ac903ba817e2e1f
Author: Filipe Manana <fdmanana@suse.com>
Date:   Wed Aug 12 11:54:35 2015 +0100

    Btrfs: check if previous transaction aborted to avoid fs corruption
    
    commit 1f9b8c8fbc9a4d029760b16f477b9d15500e3a34 upstream.
    
    While we are committing a transaction, it's possible the previous one is
    still finishing its commit and therefore we wait for it to finish first.
    However we were not checking if that previous transaction ended up getting
    aborted after we waited for it to commit, so we ended up committing the
    current transaction which can lead to fs corruption because the new
    superblock can point to trees that have had one or more nodes/leafs that
    were never durably persisted.
    The following sequence diagram exemplifies how this is possible:
    
              CPU 0                                                        CPU 1
    
      transaction N starts
    
      (...)
    
      btrfs_commit_transaction(N)
    
        cur_trans->state = TRANS_STATE_COMMIT_START;
        (...)
        cur_trans->state = TRANS_STATE_COMMIT_DOING;
        (...)
    
        cur_trans->state = TRANS_STATE_UNBLOCKED;
        root->fs_info->running_transaction = NULL;
    
                                                                  btrfs_start_transaction()
                                                                     --> starts transaction N + 1
    
        btrfs_write_and_wait_transaction(trans, root);
          --> starts writing all new or COWed ebs created
              at transaction N
    
                                                                  creates some new ebs, COWs some
                                                                  existing ebs but doesn't COW or
                                                                  deletes eb X
    
                                                                  btrfs_commit_transaction(N + 1)
                                                                    (...)
                                                                    cur_trans->state = TRANS_STATE_COMMIT_START;
                                                                    (...)
                                                                    wait_for_commit(root, prev_trans);
                                                                      --> prev_trans == transaction N
    
        btrfs_write_and_wait_transaction() continues
        writing ebs
           --> fails writing eb X, we abort transaction N
               and set bit BTRFS_FS_STATE_ERROR on
               fs_info->fs_state, so no new transactions
               can start after setting that bit
    
           cleanup_transaction()
             btrfs_cleanup_one_transaction()
               wakes up task at CPU 1
    
                                                                    continues, doesn't abort because
                                                                    cur_trans->aborted (transaction N + 1)
                                                                    is zero, and no checks for bit
                                                                    BTRFS_FS_STATE_ERROR in fs_info->fs_state
                                                                    are made
    
                                                                    btrfs_write_and_wait_transaction(trans, root);
                                                                      --> succeeds, no errors during writeback
    
                                                                    write_ctree_super(trans, root, 0);
                                                                      --> succeeds
                                                                      --> we have now a superblock that points us
                                                                          to some root that uses eb X, which was
                                                                          never written to disk
    
    In this scenario future attempts to read eb X from disk results in an
    error message like "parent transid verify failed on X wanted Y found Z".
    
    So fix this by aborting the current transaction if after waiting for the
    previous transaction we verify that it was aborted.
    
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
    Signed-off-by: Chris Mason <clm@fb.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a601c5b4e8e40421f1aed26676e49a830a3fbd05
Author: Sakari Ailus <sakari.ailus@iki.fi>
Date:   Fri Jun 12 20:06:23 2015 -0300

    v4l: omap3isp: Fix sub-device power management code
    
    commit 9d39f05490115bf145e5ea03c0b7ec9d3d015b01 upstream.
    
    Commit 813f5c0ac5cc ("media: Change media device link_notify behaviour")
    modified the media controller link setup notification API and updated the
    OMAP3 ISP driver accordingly. As a side effect it introduced a bug by
    turning power on after setting the link instead of before. This results in
    sub-devices not being powered down in some cases when they should be. Fix
    it.
    
    Fixes: 813f5c0ac5cc [media] media: Change media device link_notify behaviour
    
    Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
    Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 565e7a31c0b9299b5d0805c3b96bf4cb8b9494e9
Author: David Härdeman <david@hardeman.nu>
Date:   Tue May 19 19:03:12 2015 -0300

    rc-core: fix remove uevent generation
    
    commit a66b0c41ad277ae62a3ae6ac430a71882f899557 upstream.
    
    The input_dev is already gone when the rc device is being unregistered
    so checking for its presence only means that no remove uevent will be
    generated.
    
    Signed-off-by: David Härdeman <david@hardeman.nu>
    Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0fce8532f6a8e94b409a778a1c607c8858a91e68
Author: Minfei Huang <mnfhuang@gmail.com>
Date:   Sun Jul 12 20:18:42 2015 +0800

    x86/mm: Initialize pmd_idx in page_table_range_init_count()
    
    commit 9962eea9e55f797f05f20ba6448929cab2a9f018 upstream.
    
    The variable pmd_idx is not initialized for the first iteration of the
    for loop.
    
    Assign the proper value which indexes the start address.
    
    Fixes: 719272c45b82 'x86, mm: only call early_ioremap_page_table_range_init() once'
    Signed-off-by: Minfei Huang <mnfhuang@gmail.com>
    Cc: tony.luck@intel.com
    Cc: wangnan0@huawei.com
    Cc: david.vrabel@citrix.com
    Reviewed-by: yinghai@kernel.org
    Link: http://lkml.kernel.org/r/1436703522-29552-1-git-send-email-mhuang@redhat.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 191276cd0e36f4747356b7a298db881886845820
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Fri Sep 4 15:42:39 2015 -0700

    mm: check if section present during memory block registering
    
    commit 04697858d89e4bf2650364f8d6956e2554e8ef88 upstream.
    
    Tony Luck found on his setup, if memory block size 512M will cause crash
    during booting.
    
      BUG: unable to handle kernel paging request at ffffea0074000020
      IP: get_nid_for_pfn+0x17/0x40
      PGD 128ffcb067 PUD 128ffc9067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in:
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc8 #1
      ...
      Call Trace:
         ? register_mem_sect_under_node+0x66/0xe0
         register_one_node+0x17b/0x240
         ? pci_iommu_alloc+0x6e/0x6e
         topology_init+0x3c/0x95
         do_one_initcall+0xcd/0x1f0
    
    The system has non continuous RAM address:
     BIOS-e820: [mem 0x0000001300000000-0x0000001cffffffff] usable
     BIOS-e820: [mem 0x0000001d70000000-0x0000001ec7ffefff] usable
     BIOS-e820: [mem 0x0000001f00000000-0x0000002bffffffff] usable
     BIOS-e820: [mem 0x0000002c18000000-0x0000002d6fffefff] usable
     BIOS-e820: [mem 0x0000002e00000000-0x00000039ffffffff] usable
    
    So there are start sections in memory block not present.  For example:
    
        memory block : [0x2c18000000, 0x2c20000000) 512M
    
    first three sections are not present.
    
    The current register_mem_sect_under_node() assume first section is
    present, but memory block section number range [start_section_nr,
    end_section_nr] would include not present section.
    
    For arch that support vmemmap, we don't setup memmap for struct page
    area within not present sections area.
    
    So skip the pfn range that belong to absent section.
    
    [akpm@linux-foundation.org: simplification]
    [rientjes@google.com: more simplification]
    Fixes: bdee237c0343 ("x86: mm: Use 2GB memory block size on large memory x86-64 systems")
    Fixes: 982792c782ef ("x86, mm: probe memory block size for generic x86 64bit")
    Signed-off-by: Yinghai Lu <yinghai@kernel.org>
    Signed-off-by: David Rientjes <rientjes@google.com>
    Reported-by: Tony Luck <tony.luck@intel.com>
    Tested-by: Tony Luck <tony.luck@intel.com>
    Cc: Greg KH <greg@kroah.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Tested-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b8144881745f9e48bb1827ec0014214a3629bfc5
Author: Jeffery Miller <jmiller@neverware.com>
Date:   Tue Sep 1 11:23:02 2015 -0400

    Add radeon suspend/resume quirk for HP Compaq dc5750.
    
    commit 09bfda10e6efd7b65bcc29237bee1765ed779657 upstream.
    
    With the radeon driver loaded the HP Compaq dc5750
    Small Form Factor machine fails to resume from suspend.
    Adding a quirk similar to other devices avoids
    the problem and the system resumes properly.
    
    Signed-off-by: Jeffery Miller <jmiller@neverware.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit cc45b90cf124f0f0afb07f81b6bb9f27bce3ad1a
Author: Jann Horn <jann@thejh.net>
Date:   Fri Sep 11 16:27:27 2015 +0200

    CIFS: fix type confusion in copy offload ioctl
    
    commit 4c17a6d56bb0cad3066a714e94f7185a24b40f49 upstream.
    
    This might lead to local privilege escalation (code execution as
    kernel) for systems where the following conditions are met:
    
     - CONFIG_CIFS_SMB2 and CONFIG_CIFS_POSIX are enabled
     - a cifs filesystem is mounted where:
      - the mount option "vers" was used and set to a value >=2.0
      - the attacker has write access to at least one file on the filesystem
    
    To attack this, an attacker would have to guess the target_tcon
    pointer (but guessing wrong doesn't cause a crash, it just returns an
    error code) and win a narrow race.
    
    Signed-off-by: Jann Horn <jann@thejh.net>
    Signed-off-by: Steve French <smfrench@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5d395c5df33ca77b6da7e36a1f971bdcc580ec4a
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date:   Tue Sep 15 12:30:08 2015 +0530

    powerpc/mm: Recompute hash value after a failed update
    
    commit 36b35d5d807b7e57aff7d08e63de8b17731ee211 upstream.
    
    If we had secondary hash flag set, we ended up modifying hash value in
    the updatepp code path. Hence with a failed updatepp we will be using
    a wrong hash value for the following hash insert. Fix this by
    recomputing hash before insert.
    
    Without this patch we can end up with using wrong slot number in linux
    pte. That can result in us missing an hash pte update or invalidate
    which can cause memory corruption or even machine check.
    
    Fixes: 6d492ecc6489 ("powerpc/THP: Add code to handle HPTE faults for hugepages")
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Reviewed-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4d14a9e6fb8bc461bc62a4289bd4bbca7cd610b3
Author: Thomas Huth <thuth@redhat.com>
Date:   Fri Jul 17 12:46:58 2015 +0200

    powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlers
    
    commit 1c2cb594441d02815d304cccec9742ff5c707495 upstream.
    
    The EPOW interrupt handler uses rtas_get_sensor(), which in turn
    uses rtas_busy_delay() to wait for RTAS becoming ready in case it
    is necessary. But rtas_busy_delay() is annotated with might_sleep()
    and thus may not be used by interrupts handlers like the EPOW handler!
    This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is
    enabled:
    
     BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496
     in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
     CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6
     Call Trace:
     [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable)
     [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180
     [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0
     [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0
     [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450
     [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300
     [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0
     [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260
     [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80
     [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200
     [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24
     [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110
     [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180
    
    Fix this issue by introducing a new rtas_get_sensor_fast() function
    that does not use rtas_busy_delay() - and thus can only be used for
    sensors that do not cause a BUSY condition - known as "fast" sensors.
    
    The EPOW sensor is defined to be "fast" in sPAPR - mpe.
    
    Fixes: 587f83e8dd50 ("powerpc/pseries: Use rtas_get_sensor in RAS code")
    Signed-off-by: Thomas Huth <thuth@redhat.com>
    Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 65489f8c8b3c5331e739596436ae9ceab1a4f39d
Author: Michael Ellerman <mpe@ellerman.id.au>
Date:   Fri Aug 7 16:19:43 2015 +1000

    powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash
    
    commit 74b5037baa2011a2799e2c43adde7d171b072f9e upstream.
    
    The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K
    PAGE_SIZE.
    
    However when built with a 4K PAGE_SIZE there is an additional config
    option which can be enabled, PPC_HAS_HASH_64K, which means the kernel
    also knows how to hash a 64K page even though the base PAGE_SIZE is 4K.
    
    This is used in one obscure configuration, to support 64K pages for SPU
    local store on the Cell processor when the rest of the kernel is using
    4K pages.
    
    In this configuration, pte_pagesize_index() is defined to just pass
    through its arguments to get_slice_psize(). However pte_pagesize_index()
    is called for both user and kernel addresses, whereas get_slice_psize()
    only knows how to handle user addresses.
    
    This has been broken forever, however until recently it happened to
    work. That was because in get_slice_psize() the large kernel address
    would cause the right shift of the slice mask to return zero.
    
    However in commit 7aa0727f3302 ("powerpc/mm: Increase the slice range to
    64TB"), the get_slice_psize() code was changed so that instead of a
    right shift we do an array lookup based on the address. When passed a
    kernel address this means we index way off the end of the slice array
    and return random junk.
    
    That is only fatal if we happen to hit something non-zero, but when we
    do return a non-zero value we confuse the MMU code and eventually cause
    a check stop.
    
    This fix is ugly, but simple. When we're called for a kernel address we
    return 4K, which is always correct in this configuration, otherwise we
    use the slice mask.
    
    Fixes: 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB")
    Reported-by: Cyril Bur <cyrilbur@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit aba2a0df20fd044e626ac5fb02bdd40d4cbbacb0
Author: Takashi Iwai <tiwai@suse.de>
Date:   Thu Aug 13 18:05:06 2015 +0200

    ALSA: hda - Use ALC880_FIXUP_FUJITSU for FSC Amilo M1437
    
    commit a161574e200ae63a5042120e0d8c36830e81bde3 upstream.
    
    It turned out that the machine has a bass speaker, so take a correct
    fixup entry.
    
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102501
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d93369f5349e986cee5877e70db66fcfad0df3c8
Author: Takashi Iwai <tiwai@suse.de>
Date:   Thu Aug 13 18:02:39 2015 +0200

    ALSA: hda - Enable headphone jack detect on old Fujitsu laptops
    
    commit bb148bdeb0ab16fc0ae8009799471e4d7180073b upstream.
    
    According to the bug report, FSC Amilo laptops with ALC880 can detect
    the headphone jack but currently the driver disables it.  It's partly
    intentionally, as non-working jack detect was reported in the past.
    Let's enable now.
    
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102501
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2350e51bd7a7b84a9d63638eb895a7ad73a4b66c
Author: Takashi Iwai <tiwai@suse.de>
Date:   Thu Sep 3 22:20:00 2015 -0700

    Input: evdev - do not report errors form flush()
    
    commit eb38f3a4f6e86f8bb10a3217ebd85ecc5d763aae upstream.
    
    We've got bug reports showing the old systemd-logind (at least
    system-210) aborting unexpectedly, and this turned out to be because
    of an invalid error code from close() call to evdev devices.  close()
    is supposed to return only either EINTR or EBADFD, while the device
    returned ENODEV.  logind was overreacting to it and decided to kill
    itself when an unexpected error code was received.  What a tragedy.
    
    The bad error code comes from flush fops, and actually evdev_flush()
    returns ENODEV when device is disconnected or client's access to it is
    revoked. But in these cases the fact that flush did not actually happen is
    not an error, but rather normal behavior. For non-disconnected devices
    result of flush is also not that interesting as there is no potential of
    data loss and even if it fails application has no way of handling the
    error. Because of that we are better off always returning success from
    evdev_flush().
    
    Also returning EINTR from flush()/close() is discouraged (as it is not
    clear how application should handle this error), so let's stop taking
    evdev->mutex interruptibly.
    
    Bugzilla: http://bugzilla.suse.com/show_bug.cgi?id=939834
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ea87c4cb9b71dd8d7870e4a30999f1eb14fca20b
Author: Marc Zyngier <marc.zyngier@arm.com>
Date:   Wed Sep 16 16:18:59 2015 +0100

    arm64: KVM: Disable virtual timer even if the guest is not using it
    
    commit c4cbba9fa078f55d9f6d081dbb4aec7cf969e7c7 upstream.
    
    When running a guest with the architected timer disabled (with QEMU and
    the kernel_irqchip=off option, for example), it is important to make
    sure the timer gets turned off. Otherwise, the guest may try to
    enable it anyway, leading to a screaming HW interrupt.
    
    The fix is to unconditionally turn off the virtual timer on guest
    exit.
    
    Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
    Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 81cce2cabe09f301bc7e85de30609d5e6704e209
Author: Will Deacon <will.deacon@arm.com>
Date:   Tue Mar 17 12:15:02 2015 +0000

    arm64: errata: add module build workaround for erratum #843419
    
    commit df057cc7b4fa59e9b55f07ffdb6c62bf02e99a00 upstream.
    
    Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
    lead to a memory access using an incorrect address in certain sequences
    headed by an ADRP instruction.
    
    There is a linker fix to generate veneers for ADRP instructions, but
    this doesn't work for kernel modules which are built as unlinked ELF
    objects.
    
    This patch adds a new config option for the erratum which, when enabled,
    builds kernel modules with the mcmodel=large flag. This uses absolute
    addressing for all kernel symbols, thereby removing the use of ADRP as
    a PC-relative form of addressing. The ADRP relocs are removed from the
    module loader so that we fail to load any potentially affected modules.
    
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 27369ba9e39eb4d27e6ff3755e210f4af616e56a
Author: Will Deacon <will.deacon@arm.com>
Date:   Wed Sep 2 18:49:28 2015 +0100

    arm64: head.S: initialise mdcr_el2 in el2_setup
    
    commit d10bcd473301888f957ec4b6b12aa3621be78d59 upstream.
    
    When entering the kernel at EL2, we fail to initialise the MDCR_EL2
    register which controls debug access and PMU capabilities at EL1.
    
    This patch ensures that the register is initialised so that all traps
    are disabled and all the PMU counters are available to the host. When a
    guest is scheduled, KVM takes care to configure trapping appropriately.
    
    Acked-by: Marc Zyngier <marc.zyngier@arm.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e9512ecf5f0337b84fc1f789a613dc264355e61e
Author: Will Deacon <will.deacon@arm.com>
Date:   Tue Sep 15 12:07:06 2015 +0100

    arm64: compat: fix vfp save/restore across signal handlers in big-endian
    
    commit bdec97a855ef1e239f130f7a11584721c9a1bf04 upstream.
    
    When saving/restoring the VFP registers from a compat (AArch32)
    signal frame, we rely on the compat registers forming a prefix of the
    native register file and therefore make use of copy_{to,from}_user to
    transfer between the native fpsimd_state and the compat_vfp_sigframe.
    
    Unfortunately, this doesn't work so well in a big-endian environment.
    Our fpsimd save/restore code operates directly on 128-bit quantities
    (Q registers) whereas the compat_vfp_sigframe represents the registers
    as an array of 64-bit (D) registers. The architecture packs the compat D
    registers into the Q registers, with the least significant bytes holding
    the lower register. Consequently, we need to swap the 64-bit halves when
    converting between these two representations on a big-endian machine.
    
    This patch replaces the __copy_{to,from}_user invocations in our
    compat VFP signal handling code with explicit __put_user loops that
    operate on 64-bit values and swap them accordingly.
    
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e66caeb33c2d3b1dd07dd9a8fc99218c4dd53c21
Author: Jeff Vander Stoep <jeffv@google.com>
Date:   Tue Aug 18 20:50:10 2015 +0100

    arm64: kconfig: Move LIST_POISON to a safe value
    
    commit bf0c4e04732479f650ff59d1ee82de761c0071f0 upstream.
    
    Move the poison pointer offset to 0xdead000000000000, a
    recognized value that is not mappable by user-space exploits.
    
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Thierry Strudel <tstrudel@google.com>
    Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3e0c3882922823cc90f5930c78c0fbcb217156b7
Author: Bob Copeland <me@bobcopeland.com>
Date:   Sat Jun 13 10:16:31 2015 -0400

    mac80211: enable assoc check for mesh interfaces
    
    commit 3633ebebab2bbe88124388b7620442315c968e8f upstream.
    
    We already set a station to be associated when peering completes, both
    in user space and in the kernel.  Thus we should always have an
    associated sta before sending data frames to that station.
    
    Failure to check assoc state can cause crashes in the lower-level driver
    due to transmitting unicast data frames before driver sta structures
    (e.g. ampdu state in ath9k) are initialized.  This occurred when
    forwarding in the presence of fixed mesh paths: frames were transmitted
    to stations with whom we hadn't yet completed peering.
    
    Reported-by: Alexis Green <agreen@cococorp.com>
    Tested-by: Jesse Jones <jjones@cococorp.com>
    Signed-off-by: Bob Copeland <me@bobcopeland.com>
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6c53b52eab4a3a98e59c1b9f98578d1a8a35a6ef
Author: Jean Delvare <jdelvare@suse.de>
Date:   Tue Sep 1 18:07:41 2015 +0200

    tg3: Fix temperature reporting
    
    commit d3d11fe08ccc9bff174fc958722b5661f0932486 upstream.
    
    The temperature registers appear to report values in degrees Celsius
    while the hwmon API mandates values to be exposed in millidegrees
    Celsius. Do the conversion so that the values reported by "sensors"
    are correct.
    
    Fixes: aed93e0bf493 ("tg3: Add hwmon support for temperature")
    Signed-off-by: Jean Delvare <jdelvare@suse.de>
    Cc: Prashant Sreedharan <prashant@broadcom.com>
    Cc: Michael Chan <mchan@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a4e80afe2e9d0bcdedb1c3811066b87ff9634ca6
Author: Adrien Schildknecht <adrien+dev@schischi.me>
Date:   Wed Aug 19 17:33:12 2015 +0200

    rtlwifi: rtl8192cu: Add new device ID
    
    commit 1642d09fb9b128e8e538b2a4179962a34f38dff9 upstream.
    
    The v2 of NetGear WNA1000M uses a different idProduct: USB ID 0846:9043
    
    Signed-off-by: Adrien Schildknecht <adrien+dev@schischi.me>
    Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
    Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 83b9d2dc51426de1a1ac8590e37f2d3ee317b626
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Mon Aug 10 17:35:07 2015 -0500

    unshare: Unsharing a thread does not require unsharing a vm
    
    commit 12c641ab8270f787dfcce08b5f20ce8b65008096 upstream.
    
    In the logic in the initial commit of unshare made creating a new
    thread group for a process, contingent upon creating a new memory
    address space for that process.  That is wrong.  Two separate
    processes in different thread groups can share a memory address space
    and clone allows creation of such proceses.
    
    This is significant because it was observed that mm_users > 1 does not
    mean that a process is multi-threaded, as reading /proc/PID/maps
    temporarily increments mm_users, which allows other processes to
    (accidentally) interfere with unshare() calls.
    
    Correct the check in check_unshare_flags() to test for
    !thread_group_empty() for CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM.
    For sighand->count > 1 for CLONE_SIGHAND and CLONE_VM.
    For !current_is_single_threaded instead of mm_users > 1 for CLONE_VM.
    
    By using the correct checks in unshare this removes the possibility of
    an accidental denial of service attack.
    
    Additionally using the correct checks in unshare ensures that only an
    explicit unshare(CLONE_VM) can possibly trigger the slow path of
    current_is_single_threaded().  As an explict unshare(CLONE_VM) is
    pointless it is not expected there are many applications that make
    that call.
    
    Fixes: b2e0d98705e60e45bbb3c0032c48824ad7ae0704 userns: Implement unshare of the user namespace
    Reported-by: Ricky Zhou <rickyz@chromium.org>
    Reported-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit efa9b80eddc078c30fa1b075318a82f81f6f5800
Author: Ming Lei <ming.lei@canonical.com>
Date:   Sun Aug 9 03:41:50 2015 -0400

    blk-mq: fix buffer overflow when reading sysfs file of 'pending'
    
    commit 596f5aad2a704b72934e5abec1b1b4114c16f45b upstream.
    
    There may be lots of pending requests so that the buffer of PAGE_SIZE
    can't hold them at all.
    
    One typical example is scsi-mq, the queue depth(.can_queue) of
    scsi_host and blk-mq is quite big but scsi_device's queue_depth
    is a bit small(.cmd_per_lun), then it is quite easy to have lots
    of pending requests in hw queue.
    
    This patch fixes the following warning and the related memory
    destruction.
    
    [  359.025101] fill_read_buffer: blk_mq_hw_sysfs_show+0x0/0x7d returned bad count^M
    [  359.055595] irq event stamp: 15537^M
    [  359.055606] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC ^M
    [  359.055614] Dumping ftrace buffer:^M
    [  359.055660]    (ftrace buffer empty)^M
    [  359.055672] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw^M
    [  359.055678] CPU: 4 PID: 21631 Comm: stress-ng-sysfs Not tainted 4.2.0-rc5-next-20150805 #434^M
    [  359.055679] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011^M
    [  359.055682] task: ffff8802161cc000 ti: ffff88021b4a8000 task.ti: ffff88021b4a8000^M
    [  359.055693] RIP: 0010:[<ffffffff811541c5>]  [<ffffffff811541c5>] __kmalloc+0xe8/0x152^M
    
    Signed-off-by: Ming Lei <ming.lei@canonical.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>