commit 6ebe34c1da893f1705452ab6a352dfeec548dafe
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed Feb 6 17:33:30 2019 +0100

    Linux 4.9.155

commit 987d8ff3a2d8fa5caf998b068347ec0a3d2e2300
Author: Amir Goldstein <amir73il@gmail.com>
Date:   Tue Oct 30 20:29:53 2018 +0200

    fanotify: fix handling of events on child sub-directory
    
    commit b469e7e47c8a075cc08bcd1e85d4365134bdcdd5 upstream.
    
    When an event is reported on a sub-directory and the parent inode has
    a mark mask with FS_EVENT_ON_CHILD|FS_ISDIR, the event will be sent to
    fsnotify() even if the event type is not in the parent mark mask
    (e.g. FS_OPEN).
    
    Furthermore, if that event happened on a mount or a filesystem with
    a mount/sb mark that does have that event type in its mask, the "on
    child" event will be reported on the mount/sb mark.  That is not
    desired, because the user will get a duplicate event for the same action.
    
    Note that the event reported on the victim inode is never merged with
    the event reported on the parent inode, because of the check in
    should_merge(): old_fsn->inode == new_fsn->inode.
    
    Fix this by looking for a match of an actual event type (i.e. not just
    FS_ISDIR) in parent's inode mark mask and by not reporting an "on child"
    event to group if event type is only found on mount/sb marks.
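    
    The mask test described above can be sketched as a small userspace
    model.  This is Python, not kernel code: parent_wants_event() is a
    hypothetical helper, and the flag values are illustrative
    assumptions rather than values taken from this patch.

```python
# Illustrative flag bits (assumed values, for demonstration only).
FS_OPEN = 0x00000020
FS_EVENT_ON_CHILD = 0x08000000
FS_ISDIR = 0x40000000

# Bits that qualify an event but are not event types themselves.
NON_EVENT_FLAGS = FS_EVENT_ON_CHILD | FS_ISDIR


def parent_wants_event(parent_mask: int, event_mask: int) -> bool:
    """Return True only if an actual event type matches the parent's mask."""
    if not parent_mask & FS_EVENT_ON_CHILD:
        return False
    # Ignore FS_ISDIR / FS_EVENT_ON_CHILD when looking for a match.
    return bool(parent_mask & event_mask & ~NON_EVENT_FLAGS)
```

    With a parent mask of FS_EVENT_ON_CHILD|FS_ISDIR plus some other
    event type, an FS_OPEN|FS_ISDIR event on a sub-directory no longer
    matches, because the only shared bit is FS_ISDIR.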
    
    [backport hint: The bug seems to have always been in fanotify, but this
                    patch will only apply cleanly to v4.19.y]
    
    Cc: <stable@vger.kernel.org> # v4.19
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    [amir: backport to v4.9]
    Signed-off-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d6f62ecb9e6daa5cea86a0146259effd51fa8930
Author: Dave Chinner <dchinner@redhat.com>
Date:   Fri May 11 11:20:57 2018 +1000

    fs: don't scan the inode cache before SB_BORN is set
    
    commit 79f546a696bff2590169fb5684e23d65f4d9f591 upstream.
    
    We recently had an oops reported on a 4.14 kernel in
    xfs_reclaim_inodes_count() where sb->s_fs_info pointed to garbage
    and so the m_perag_tree lookup walked into lala land.  It produces
    an oops down this path during the failed mount:
    
      radix_tree_gang_lookup_tag+0xc4/0x130
      xfs_perag_get_tag+0x37/0xf0
      xfs_reclaim_inodes_count+0x32/0x40
      xfs_fs_nr_cached_objects+0x11/0x20
      super_cache_count+0x35/0xc0
      shrink_slab.part.66+0xb1/0x370
      shrink_node+0x7e/0x1a0
      try_to_free_pages+0x199/0x470
      __alloc_pages_slowpath+0x3a1/0xd20
      __alloc_pages_nodemask+0x1c3/0x200
      cache_grow_begin+0x20b/0x2e0
      fallback_alloc+0x160/0x200
      kmem_cache_alloc+0x111/0x4e0
    
    The problem is that the superblock shrinker is running before the
    filesystem structures it depends on have been fully set up. i.e.
    the shrinker is registered in sget(), before ->fill_super() has been
    called, and the shrinker can call into the filesystem before
    fill_super() does its setup work. Essentially we are exposed to
    both use-after-free and use-before-initialisation bugs here.
    
    To fix this, add a check for the SB_BORN flag in super_cache_count.
    In general, this flag is not set until ->fs_mount() completes
    successfully, so we know that it is set after the filesystem
    setup has completed. This matches the trylock_super() behaviour
    which will not let super_cache_scan() run if SB_BORN is not set, and
    hence will prevent the superblock shrinker from entering the
    filesystem while it is being set up or after it has failed setup
    and is being torn down.
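    
    A minimal userspace sketch of the guard, in Python rather than
    kernel C.  The SuperBlock class, its fields and the flag value are
    hypothetical stand-ins chosen for illustration.

```python
SB_BORN = 1 << 29  # flag bit chosen for illustration only


class SuperBlock:
    """Hypothetical stand-in for struct super_block."""

    def __init__(self):
        self.flags = 0
        self.fs_info = None  # filled in by the filesystem's setup code


def super_cache_count(sb: SuperBlock) -> int:
    """Report cached objects, but never before the superblock is born."""
    if not sb.flags & SB_BORN:
        # Filesystem-private state may be unset or already torn down.
        return 0
    return sb.fs_info["cached_objects"]
```

    Returning 0 tells the caller there is nothing to shrink, so the
    filesystem is never entered while fs_info is not yet valid.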
    
    Cc: stable@kernel.org
    Signed-Off-By: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Aaron Lu <aaron.lu@linux.alibaba.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 50091945a65f3cb5388e4809cf822708014322a0
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Tue Jul 10 10:29:10 2018 +1000

    drivers: core: Remove glue dirs from sysfs earlier
    
    commit 726e41097920a73e4c7c33385dcc0debb1281e18 upstream.
    
    For devices with a class, we create a "glue" directory between
    the parent device and the new device with the class name.
    
    This directory is never "explicitly" removed when empty, however;
    this is left to the implicit sysfs removal done by kobject_release()
    when the object loses its last reference via kobject_put().
    
    This is problematic because as long as it's not been removed from
    sysfs, it is still present in the class kset and in sysfs directory
    structure.
    
    The presence in the class kset exposes a use after free bug fixed
    by the previous patch, but the presence in sysfs means that until
    the kobject is released, which can take a while (especially with
    kobject debugging), any attempt at re-creating it, such as binding
    a new device for that class/parent pair, will result in a sysfs
    duplicate file name error.
    
    This fixes it by instead doing an explicit kobject_del() when
    the glue dir is empty, by keeping track of the number of
    child devices of the gluedir.
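    
    The child-counting idea can be modelled in a few lines of Python.
    This is a sketch under assumptions: GlueDir, glue_mutex and the
    in_sysfs flag are hypothetical names, and setting in_sysfs to False
    stands in for the explicit kobject_del().

```python
import threading

glue_mutex = threading.Lock()  # stands in for the global mutex


class GlueDir:
    """Hypothetical model of a class glue directory."""

    def __init__(self, name: str):
        self.name = name
        self.children = 0
        self.in_sysfs = True


def add_child(glue: GlueDir) -> None:
    with glue_mutex:
        glue.children += 1


def cleanup_glue_dir(glue: GlueDir) -> None:
    """Drop one child and unlink the directory as soon as it is empty."""
    with glue_mutex:
        glue.children -= 1
        if glue.children == 0:
            glue.in_sysfs = False  # explicit removal, not deferred
```

    Because removal happens when the count reaches zero, a later device
    for the same class/parent pair would not collide with a stale entry.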
    
    This is made easy by the fact that all glue dir operations are
    done with a global mutex, and there's already a function
    (cleanup_glue_dir) called in all the right places taking that
    mutex that can be enhanced for this. It appears that this was
    in fact the intent of the function, but the implementation was
    wrong.
    
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Zubin Mithra <zsm@chromium.org>
    Cc: Guenter Roeck <groeck@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fb713a1737fbb75a5d24bd4663d34e3a2bf105b4
Author: Paulo Alcantara <paulo@paulo.ac>
Date:   Tue Nov 20 15:16:36 2018 -0200

    cifs: Always resolve hostname before reconnecting
    
    commit 28eb24ff75c5ac130eb326b3b4d0dcecfc0f427d upstream.
    
    In case a hostname resolves to a different IP address (e.g. long
    running mounts), make sure to resolve it every time prior to calling
    generic_ip_connect() in reconnect.
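    
    The same pattern in userspace terms, as a hedged Python sketch:
    reconnect() and its connect/resolve parameters are hypothetical,
    and only socket.getaddrinfo() is a real library call.

```python
import socket


def reconnect(hostname, port, connect, resolve=socket.getaddrinfo):
    """Resolve hostname afresh, then hand the first address to connect()."""
    # Never reuse an address cached from an earlier connection attempt.
    infos = resolve(hostname, port, type=socket.SOCK_STREAM)
    address = infos[0][4]  # sockaddr tuple, e.g. (ip, port)
    return connect(address)
```

    The resolver is injectable so that the behaviour can be exercised
    without network access.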
    
    Suggested-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Paulo Alcantara <palcantara@suse.de>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d2de58eb6eb921a0042470fb9467708aa3e09cbd
Author: David Hildenbrand <david@redhat.com>
Date:   Fri Feb 1 14:21:19 2019 -0800

    mm: migrate: don't rely on __PageMovable() of newpage after unlocking it
    
    commit e0a352fabce61f730341d119fbedf71ffdb8663f upstream.
    
    We had a race in the old balloon compaction code before b1123ea6d3b3
    ("mm: balloon: use general non-lru movable page feature") refactored it
    that became visible after backporting 195a8c43e93d ("virtio-balloon:
    deflate via a page list") without the refactoring.
    
    The bug existed from commit d6d86c0a7f8d ("mm/balloon_compaction:
    redesign ballooned pages management") till b1123ea6d3b3 ("mm: balloon:
    use general non-lru movable page feature").  d6d86c0a7f8d
    ("mm/balloon_compaction: redesign ballooned pages management") was
    backported to 3.12, so the broken kernels are stable kernels [3.12 -
    4.7].
    
    There was a subtle race between dropping the page lock of the newpage in
    __unmap_and_move() and checking for __is_movable_balloon_page(newpage).
    
    Just after dropping this page lock, virtio-balloon could go ahead and
    deflate the newpage, effectively dequeueing it and clearing PageBalloon,
    in turn making __is_movable_balloon_page(newpage) fail.
    
    This resulted in dropping the reference of the newpage via
    putback_lru_page(newpage) instead of put_page(newpage), leading to
    page->lru getting modified and a !LRU page ending up in the LRU lists.
    With 195a8c43e93d ("virtio-balloon: deflate via a page list")
    backported, one would suddenly get corrupted lists in
    release_pages_balloon():
    
    - WARNING: CPU: 13 PID: 6586 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
    - list_del corruption. prev->next should be ffffe253961090a0, but was dead000000000100
    
    Nowadays this race is no longer possible, but it is hidden behind very
    ugly handling of __ClearPageMovable() and __PageMovable().
    
    __ClearPageMovable() will not make __PageMovable() fail, only
    PageMovable().  So the new check (__PageMovable(newpage)) will still
    hold even after newpage was dequeued by virtio-balloon.
    
    If anybody ever changed that special handling, the bug would be
    reintroduced.  So instead, make it explicit and use the information
    of the original isolated page before migration.
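    
    The "sample the state before it can change" idea, as a Python model
    rather than kernel code.  Page, release_kind() and the after_unlock
    hook are hypothetical; the hook stands in for virtio-balloon
    touching newpage once its lock is dropped.

```python
class Page:
    """Hypothetical page with a single 'movable' property."""

    def __init__(self, movable: bool):
        self.movable = movable


def release_kind(page, newpage, after_unlock=lambda p: None):
    """Return the name of the release path chosen for newpage."""
    is_lru = not page.movable  # sampled from the source page up front
    # ... migration would happen here, then newpage is unlocked ...
    after_unlock(newpage)      # another context may now modify newpage
    # The decision uses the earlier sample, never newpage's current state.
    return "putback_lru_page" if is_lru else "put_page"
```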
    
    This patch can be backported fairly easily to stable kernels (in
    contrast to the refactoring).
    
    Link: http://lkml.kernel.org/r/20190129233217.10747-1-david@redhat.com
    Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management")
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reported-by: Vratislav Bendel <vbendel@redhat.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Rafael Aquini <aquini@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Dominik Brodowski <linux@dominikbrodowski.net>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Vratislav Bendel <vbendel@redhat.com>
    Cc: Rafael Aquini <aquini@redhat.com>
    Cc: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: <stable@vger.kernel.org>    [3.12 - 4.7]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5a3c49bb6127ac0d3deab44d7eac6f0d7b98efbe
Author: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Date:   Fri Feb 1 14:21:08 2019 -0800

    mm: hwpoison: use do_send_sig_info() instead of force_sig()
    
    commit 6376360ecbe525a9c17b3d081dfd88ba3e4ed65b upstream.
    
    Currently memory_failure() races against process exit, which can
    result in a kernel crash due to a NULL pointer dereference.
    
    The root cause is that memory_failure() uses force_sig() to forcibly
    kill asynchronous (meaning not in the current context) processes.  As
    discussed in thread https://lkml.org/lkml/2010/6/8/236 years ago for OOM
    fixes, this is not the right thing to do.  OOM solves this issue by using
    do_send_sig_info() as done in commit d2d393099de2 ("signal:
    oom_kill_task: use SEND_SIG_FORCED instead of force_sig()"), so this
    patch does the same for hwpoison.  do_send_sig_info() properly takes
    siglock via lock_task_sighand(), so it is free from the reported race.
    
    I confirmed that the reported bug reproduces when some delay is
    inserted in kill_procs(), and that it never reproduces with this patch.
    
    Note that memory_failure() can send another type of signal using
    force_sig_mceerr(), and the reported race shouldn't happen on it because
    force_sig_mceerr() is called only for synchronous processes (i.e.
    BUS_MCEERR_AR happens only when some process accesses the corrupted
    memory).
    
    Link: http://lkml.kernel.org/r/20190116093046.GA29835@hori1.linux.bs1.fc.nec.co.jp
    Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Reported-by: Jane Chu <jane.chu@oracle.com>
    Reviewed-by: Dan Williams <dan.j.williams@intel.com>
    Reviewed-by: William Kucharski <william.kucharski@oracle.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 274be582b7a9fd5bf6a882d184fe6811aa99f6be
Author: Shakeel Butt <shakeelb@google.com>
Date:   Fri Feb 1 14:20:54 2019 -0800

    mm, oom: fix use-after-free in oom_kill_process
    
    commit cefc7ef3c87d02fc9307835868ff721ea12cc597 upstream.
    
    Syzbot instance running on upstream kernel found a use-after-free bug in
    oom_kill_process.  On further inspection it seems like the process
    selected to be oom-killed has exited even before reaching
    read_lock(&tasklist_lock) in oom_kill_process().  More specifically the
    tsk->usage is 1 which is due to get_task_struct() in oom_evaluate_task()
    and the put_task_struct within for_each_thread() frees the tsk and
    for_each_thread() tries to access the tsk.  The easiest fix is to do
    get/put across the for_each_thread() on the selected task.
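    
    The get/put bracket can be illustrated with a toy reference count.
    This Python model is an assumption-laden sketch: Task, get_task(),
    put_task() and scan_threads() are hypothetical, and the put inside
    the loop merely stands in for the one the text describes.

```python
class Task:
    """Hypothetical task with a usage count and a thread list."""

    def __init__(self):
        self.usage = 1  # reference taken when the victim was selected
        self.freed = False
        self.threads = ["t0", "t1"]


def get_task(task):
    task.usage += 1


def put_task(task):
    task.usage -= 1
    if task.usage == 0:
        task.freed = True


def scan_threads(victim):
    get_task(victim)  # extra reference held across the whole walk
    visited = []
    for index, thread in enumerate(victim.threads):
        if index == 0:
            put_task(victim)  # models the put that happens inside the loop
        assert not victim.freed, "walk touched a freed task"
        visited.append(thread)
    put_task(victim)  # drop the extra reference only after the walk
    return visited
```

    Without the surrounding get/put, the put inside the loop would drop
    the count to zero while the walk is still using the task.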
    
    Now the next question is whether we should continue with the oom-kill,
    as the previously selected task has exited.  However, before adding
    more complexity and heuristics, let's answer why we even look at the
    children of the oom-kill selected task.  select_bad_process() has
    already selected the worst process in the system/memcg.  Due to the
    race, the selected process might not be the worst at kill time, but
    does that matter?  Userspace can use the oom_score_adj interface to
    prefer children to be killed before the parent.  I looked at the
    history, but it seems this behaviour predates git history.
    
    Link: http://lkml.kernel.org/r/20190121215850.221745-1-shakeelb@google.com
    Reported-by: syzbot+7fbbfa368521945f0e3d@syzkaller.appspotmail.com
    Fixes: 6b0c81b3be11 ("mm, oom: reduce dependency on tasklist_lock")
    Signed-off-by: Shakeel Butt <shakeelb@google.com>
    Reviewed-by: Roman Gushchin <guro@fb.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 44ccc0cce1e1250a11474acda2870c39cafa2b67
Author: Andrei Vagin <avagin@gmail.com>
Date:   Fri Feb 1 14:20:24 2019 -0800

    kernel/exit.c: release ptraced tasks before zap_pid_ns_processes
    
    commit 8fb335e078378c8426fabeed1ebee1fbf915690c upstream.
    
    Currently, exit_ptrace() adds all ptraced tasks to a dead list, then
    zap_pid_ns_processes() waits on all tasks in the current pidns, and only
    then are tasks from the dead list released.
    
    zap_pid_ns_processes() can get stuck waiting on tasks from the dead
    list.  In this case, we will have one unkillable process with one or
    more dead children.
    
    Thanks to Oleg for the advice to release tasks in find_child_reaper().
    
    Link: http://lkml.kernel.org/r/20190110175200.12442-1-avagin@gmail.com
    Fixes: 7c8bd2322c7f ("exit: ptrace: shift "reap dead" code from exit_ptrace() to forget_original_parent()")
    Signed-off-by: Andrei Vagin <avagin@gmail.com>
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 686ef4545a8427e040a2719485c8854e25430a0e
Author: Stefan Wahren <stefan.wahren@i2se.com>
Date:   Sun Dec 23 21:59:17 2018 +0100

    mmc: sdhci-iproc: handle mmc_of_parse() errors during probe
    
    commit 2bd44dadd5bfb4135162322fd0b45a174d4ad5bf upstream.
    
    We need to handle mmc_of_parse() errors during probe.
    
    This finally fixes the wifi regression on Raspberry Pi 3 series.
    In the error case, the wifi chip was permanently in reset because of
    the power sequence depending on the deferred probe of the GPIO expander.
    
    Fixes: b580c52d58d9 ("mmc: sdhci-iproc: add IPROC SDHCI driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
    Acked-by: Adrian Hunter <adrian.hunter@intel.com>
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3003149c6552bfdca41854854918927eeac19e43
Author: João Paulo Rechi Vita <jprvita@gmail.com>
Date:   Wed Oct 31 17:21:28 2018 -0700

    platform/x86: asus-nb-wmi: Drop mapping of 0x33 and 0x34 scan codes
    
    [ Upstream commit 71b12beaf12f21a53bfe100795d0797f1035b570 ]
    
    According to Asus firmware engineers, the meaning of these codes is only
    to notify the OS that the screen brightness has been turned on/off by
    the EC. This does not match the meaning of KEY_DISPLAYTOGGLE /
    KEY_DISPLAY_OFF, where userspace is expected to change the display
    brightness.
    
    Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com>
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit c4819f02e3147676b821b2c599bef92173efb308
Author: João Paulo Rechi Vita <jprvita@gmail.com>
Date:   Wed Oct 31 17:21:27 2018 -0700

    platform/x86: asus-nb-wmi: Map 0x35 to KEY_SCREENLOCK
    
    [ Upstream commit b3f2f3799a972d3863d0fdc2ab6287aef6ca631f ]
    
    When the OS registers to handle events from the display off hotkey, the
    EC will send a notification with 0x35 for every key press, independent
    of the backlight state.
    
    The behavior of this key on Windows, with the ATKACPI driver from Asus
    installed, is turning off the backlight of all connected displays with a
    fading effect, and any cursor input or key press turning the backlight
    back on. The key press or cursor input that wakes up the display is also
    passed through to the application under the cursor or under focus.
    
    The key that matches this behavior the closest is KEY_SCREENLOCK.
    
    Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com>
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 83466a5fd3aec53b4b6e8dd412c6ec4cc57ba67f
Author: Andreas Gruenbacher <agruenba@redhat.com>
Date:   Wed Jan 30 21:30:36 2019 +0100

    gfs2: Revert "Fix loop in gfs2_rbm_find"
    
    commit e74c98ca2d6ae4376cc15fa2a22483430909d96b upstream.
    
    This reverts commit 2d29f6b96d8f80322ed2dd895bca590491c38d34.
    
    It turns out that the fix can lead to a ~20 percent performance regression
    in initial writes to the page cache according to iozone.  Let's revert this
    for now to have more time for a proper fix.
    
    Cc: stable@vger.kernel.org # v3.13+
    Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 828316e65c09dfdb6338d151b52f165f723b1d6f
Author: James Morse <james.morse@arm.com>
Date:   Thu Jan 24 16:32:57 2019 +0000

    arm64: hibernate: Clean the __hyp_text to PoC after resume
    
    commit f7daa9c8fd191724b9ab9580a7be55cd1a67d799 upstream.
    
    During resume hibernate restores all physical memory. Any memory
    that is accessed with the MMU disabled needs to be cleaned to the
    PoC.
    
    KVM's __hyp_text was previously omitted as it runs with the MMU
    enabled, but now that the hyp-stub is located in this section,
    we must clean __hyp_text too.
    
    This ensures secondary CPUs that come online after hibernate
    has finished resuming, and load KVM via the freshly written
    hyp-stub, see the correct instructions.
    
    Signed-off-by: James Morse <james.morse@arm.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c5edde98996581443f7f5735b545f65ec050e4ec
Author: James Morse <james.morse@arm.com>
Date:   Thu Jan 24 16:32:56 2019 +0000

    arm64: hyp-stub: Forbid kprobing of the hyp-stub
    
    commit 8fac5cbdfe0f01254d9d265c6aa1a95f94f58595 upstream.
    
    The hyp-stub is loaded by the kernel's early startup code at EL2
    during boot, before KVM takes ownership later. The hyp-stub's
    text is part of the regular kernel text, meaning it can be kprobed.
    
    A breakpoint in the hyp-stub causes the CPU to spin in el2_sync_invalid.
    
    Add it to the __hyp_text.
    
    Signed-off-by: James Morse <james.morse@arm.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 62d1d2b720db405c94d85191083bb2eb218a55c0
Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Date:   Sun Jan 27 09:29:42 2019 +0100

    arm64: kaslr: ensure randomized quantities are clean also when kaslr is off
    
    commit 8ea235932314311f15ea6cf65c1393ed7e31af70 upstream.
    
    Commit 1598ecda7b23 ("arm64: kaslr: ensure randomized quantities are
    clean to the PoC") added cache maintenance to ensure that global
    variables set by the kaslr init routine are not wiped clean due to
    cache invalidation occurring during the second round of page table
    creation.
    
    However, if kaslr_early_init() exits early with no randomization
    being applied (either due to the lack of a seed, or because the user
    has disabled kaslr explicitly), no cache maintenance is performed,
    leading to the same issue we attempted to fix earlier, as far as the
    module_alloc_base variable is concerned.
    
    Note that module_alloc_base cannot be initialized statically, because
    that would cause it to be subject to a R_AARCH64_RELATIVE relocation,
    causing it to be overwritten by the second round of KASLR relocation
    processing.
    
    Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR")
    Cc: <stable@vger.kernel.org> # v4.6+
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d5adbc7a1b30c3e8e9fef69cbe95969b46cef8a4
Author: Koen Vandeputte <koen.vandeputte@ncentric.com>
Date:   Thu Jan 31 15:00:01 2019 -0600

    ARM: cns3xxx: Fix writing to wrong PCI config registers after alignment
    
    commit 65dbb423cf28232fed1732b779249d6164c5999b upstream.
    
    Originally, cns3xxx used its own functions for mapping, reading and
    writing config registers.
    
    Commit 802b7c06adc7 ("ARM: cns3xxx: Convert PCI to use generic config
    accessors") removed the internal PCI config write function in favor of
    the generic one:
    
      cns3xxx_pci_write_config() --> pci_generic_config_write()
    
    cns3xxx_pci_write_config() expected aligned addresses, being produced by
    cns3xxx_pci_map_bus() while the generic one pci_generic_config_write()
    actually expects the real address as both the function and hardware are
    capable of byte-aligned writes.
    
    This currently leads to pci_generic_config_write() writing to the wrong
    registers.
    
    For instance, upon ath9k module loading:
    
    - driver ath9k gets loaded
    - The driver wants to write value 0xA8 to register PCI_LATENCY_TIMER,
      located at 0x0D
    - cns3xxx_pci_map_bus() aligns the address to 0x0C
    - pci_generic_config_write() effectively writes 0xA8 into register 0x0C
      (CACHE_LINE_SIZE)
    
    Fix the bug by removing the alignment in the cns3xxx mapping function.
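    
    The ath9k example above can be reproduced with plain arithmetic.
    The Python below is a userspace illustration only: the two
    map_offset_*() helpers and config_write_byte() are hypothetical,
    and a bytearray stands in for the device's config space.

```python
PCI_CACHE_LINE_SIZE = 0x0C
PCI_LATENCY_TIMER = 0x0D


def map_offset_aligned(where: int) -> int:
    """Old behaviour: round the register offset down to a 4-byte boundary."""
    return where & ~0x3


def map_offset_exact(where: int) -> int:
    """Fixed behaviour: pass the register offset through unchanged."""
    return where


def config_write_byte(space: bytearray, offset: int, value: int) -> None:
    """Byte-wide write at exactly the offset it is given."""
    space[offset] = value & 0xFF
```

    Feeding PCI_LATENCY_TIMER through map_offset_aligned() yields 0x0C,
    so a byte write lands in the cache line size register instead.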
    
    Fixes: 802b7c06adc7 ("ARM: cns3xxx: Convert PCI to use generic config accessors")
    Signed-off-by: Koen Vandeputte <koen.vandeputte@ncentric.com>
    [lorenzo.pieralisi@arm.com: updated commit log]
    Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    Acked-by: Krzysztof Halasa <khalasa@piap.pl>
    Acked-by: Tim Harvey <tharvey@gateworks.com>
    Acked-by: Arnd Bergmann <arnd@arndb.de>
    CC: stable@vger.kernel.org      # v4.0+
    CC: Bjorn Helgaas <bhelgaas@google.com>
    CC: Olof Johansson <olof@lixom.net>
    CC: Robin Leblon <robin.leblon@ncentric.com>
    CC: Rob Herring <robh@kernel.org>
    CC: Russell King <linux@armlinux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 92744323a4d5c196fc868ec0a9f444827406e707
Author: Waiman Long <longman@redhat.com>
Date:   Wed Jan 30 13:52:36 2019 -0500

    fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb()
    
    commit 1dbd449c9943e3145148cc893c2461b72ba6fef0 upstream.
    
    The nr_dentry_unused per-cpu counter tracks dentries in both the LRU
    lists and the shrink lists where the DCACHE_LRU_LIST bit is set.
    
    The shrink_dcache_sb() function moves dentries from the LRU list to a
    shrink list and subtracts the dentry count from nr_dentry_unused.  This
    is incorrect as the nr_dentry_unused count will also be decremented in
    shrink_dentry_list() via d_shrink_del().
    
    To fix this double decrement, the decrement in the shrink_dcache_sb()
    function is taken out.
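    
    A toy model of the accounting after the fix, written in Python.
    DentryAccounting is hypothetical; its methods borrow the kernel
    function names only to show where the single decrement remains.

```python
class DentryAccounting:
    """Hypothetical model of the unused-dentry counter."""

    def __init__(self, lru):
        self.lru = list(lru)
        self.shrink_list = []
        self.nr_dentry_unused = len(self.lru)

    def shrink_dcache_sb(self):
        # Move everything to the shrink list; the counter is left alone
        # because these dentries are still accounted as unused.
        self.shrink_list, self.lru = self.lru, []
        self.shrink_dentry_list()

    def shrink_dentry_list(self):
        while self.shrink_list:
            self.shrink_list.pop()
            self.nr_dentry_unused -= 1  # the one and only decrement
```

    Subtracting the count in shrink_dcache_sb() as well would drive the
    counter negative by the number of dentries moved.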
    
    Fixes: 4e717f5c1083 ("list_lru: remove special case function list_lru_dispose_all.")
    Cc: stable@kernel.org
    Signed-off-by: Waiman Long <longman@redhat.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9033b4f2bc6f139f119d79d1145fa36d0d1d1517
Author: Pavel Shilovsky <pshilov@microsoft.com>
Date:   Sat Jan 26 12:21:32 2019 -0800

    CIFS: Do not count -ENODATA as failure for query directory
    
    commit 8e6e72aeceaaed5aeeb1cb43d3085de7ceb14f79 upstream.
    
    Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    CC: Stable <stable@vger.kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c574feb8a2cbb40f56a2b2fdb28fb892d97f44d5
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Wed Jan 30 12:49:48 2019 +0100

    ipvlan, l3mdev: fix broken l3s mode wrt local routes
    
    [ Upstream commit d5256083f62e2720f75bb3c5a928a0afe47d6bc3 ]
    
    While implementing ipvlan l3 and l3s mode for kubernetes CNI plugin,
    I ran into the issue that while l3 mode is working fine, l3s mode
    does not have any connectivity to kube-apiserver and hence all pods
    end up in Error state as well. The ipvlan master device sits on
    top of a bond device and hostns traffic to kube-apiserver (also running
    in hostns) is DNATed from 10.152.183.1:443 to 139.178.29.207:37573
    where the latter is the address of the bond0. While in l3 mode, a
    curl to https://10.152.183.1:443 or to https://139.178.29.207:37573
    works fine from hostns, neither of them does in the case of l3s. In
    the latter, only a curl to https://127.0.0.1:37573 appeared to work,
    whereas for local addresses of bond0 I saw the kernel suddenly
    starting to emit ARP requests to query the HW address of bond0, which
    remained unanswered, leaving neighbor entries in INCOMPLETE state.
    These ARP requests only happen while in l3s.
    
    Debugging this further, I found the issue is that l3s mode is piggy-
    backing on l3 master device, and in this case local routes are using
    l3mdev_master_dev_rcu(dev) instead of net->loopback_dev as per commit
    f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev
    if relevant") and 5f02ce24c269 ("net: l3mdev: Allow the l3mdev to be
    a loopback"). I found that reverting them back into using the
    net->loopback_dev fixed ipvlan l3s connectivity and got everything
    working for the CNI.
    
    Now judging from 4fbae7d83c98 ("ipvlan: Introduce l3s mode") and the
    l3mdev paper in [0], the sole reason why ipvlan l3s is relying
    on l3 master device is to get the l3mdev_ip_rcv() receive hook for
    setting the dst entry of the input route without adding its own
    ipvlan specific hacks into the receive path; however, any l3 domain
    semantics beyond just that are breaking l3s operation. Note that
    ipvlan also has the ability to dynamically switch its internal
    operation from l3 to l3s for all ports via ipvlan_set_port_mode()
    at runtime. In any case, l3 vs l3s solely distinguishes itself by
    'de-confusing' netfilter through switching skb->dev to ipvlan slave
    device late in NF_INET_LOCAL_IN before handing the skb to L4.
    
    Minimal fix taken here is to add an IFF_L3MDEV_RX_HANDLER flag which,
    if set from ipvlan setup, gets us only the wanted l3mdev_l3_rcv() hook
    without any additional l3mdev semantics on top. This should also have
    minimal impact since dev->priv_flags is already hot in cache. With
    this set, l3s mode is working fine and I also get things like
    masquerading pod traffic on the ipvlan master properly working.
    
      [0] https://netdevconf.org/1.2/papers/ahern-what-is-l3mdev-paper.pdf
    
    Fixes: f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
    Fixes: 5f02ce24c269 ("net: l3mdev: Allow the l3mdev to be a loopback")
    Fixes: 4fbae7d83c98 ("ipvlan: Introduce l3s mode")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Mahesh Bandewar <maheshb@google.com>
    Cc: David Ahern <dsa@cumulusnetworks.com>
    Cc: Florian Westphal <fw@strlen.de>
    Cc: Martynas Pumputis <m@lambda.lt>
    Acked-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 63f7ee6d8a78e156e0d10d67e8465085b84f0b78
Author: Jacob Wen <jian.w.wen@oracle.com>
Date:   Wed Jan 30 14:55:14 2019 +0800

    l2tp: fix reading optional fields of L2TPv3
    
    [ Upstream commit 4522a70db7aa5e77526a4079628578599821b193 ]
    
    Use pskb_may_pull() to make sure the optional fields are in skb linear
    parts, so we can safely read them later.
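    
    Why the pull matters can be shown with a loose userspace model.
    This Python sketch is not the kernel API: Skb, may_pull() and
    read_optional_field() are hypothetical and only imitate the idea of
    making bytes linear before reading them.

```python
class Skb:
    """Hypothetical buffer with a linear part and a paged part."""

    def __init__(self, linear: bytes, paged: bytes = b""):
        self.linear = bytearray(linear)
        self.paged = bytearray(paged)


def may_pull(skb: Skb, length: int) -> bool:
    """Make the first `length` bytes linear; False if the packet is too short."""
    missing = length - len(skb.linear)
    if missing <= 0:
        return True
    if missing > len(skb.paged):
        return False
    skb.linear += skb.paged[:missing]
    del skb.paged[:missing]
    return True


def read_optional_field(skb: Skb, offset: int, size: int):
    """Read a field only after ensuring it sits in the linear part."""
    if not may_pull(skb, offset + size):
        return None
    return bytes(skb.linear[offset:offset + size])
```

    Reading skb.linear directly without the pull would miss any bytes
    that still live in the paged part.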
    
    It's easy to reproduce the issue with a net driver that supports paged
    skb data. Just create an L2TPv3 over IP tunnel and then generate some
    network traffic.
    Once reproduced, rx err in /sys/kernel/debug/l2tp/tunnels will increase.
    
    Changes in v4:
    1. s/l2tp_v3_pull_opt/l2tp_v3_ensure_opt_in_linear/
    2. s/tunnel->version != L2TP_HDR_VER_2/tunnel->version == L2TP_HDR_VER_3/
    3. Add 'Fixes' in commit messages.
    
    Changes in v3:
    1. To keep consistency, move the code out of l2tp_recv_common.
    2. Use "net" instead of "net-next", since this is a bug fix.
    
    Changes in v2:
    1. Only fix L2TPv3 to make code simple.
       To fix both L2TPv3 and L2TPv2, we'd better refactor l2tp_recv_common.
       It's complicated to do so.
    2. Reloading pointers after pskb_may_pull
    
    Fixes: f7faffa3ff8e ("l2tp: Add L2TPv3 protocol support")
    Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
    Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
    Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>
    Acked-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 592bde86496e328360ab672f7558d0a37408b7ca
Author: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Date:   Tue Jan 16 23:01:55 2018 +0100

    l2tp: remove l2specific_len dependency in l2tp_core
    
    commit 62e7b6a57c7b9bf3c6fd99418eeec05b08a85c38 upstream.
    
    Remove l2specific_len dependency while building l2tpv3 header or
    parsing the received frame since default L2-Specific Sublayer is
    always four bytes long and we don't need to rely on a user supplied
    value.
    Moreover, the l2tp netlink code has no sanity checks to enforce the
    relation between l2specific_len and l2specific_type, so a malformed
    netlink message can set l2specific_type to L2TP_L2SPECTYPE_DEFAULT
    (or even L2TP_L2SPECTYPE_NONE) and l2specific_len to a value greater
    than 4, leaking memory on the wire and sending corrupted frames.
    
    Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
    Tested-by: Guillaume Nault <g.nault@alphalink.fr>
    Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bd6afb69dc86efbc8014c9ed6c30ac32572ad86b
Author: Aya Levin <ayal@mellanox.com>
Date:   Mon Dec 24 09:48:42 2018 +0200

    net/mlx5e: Allow MAC invalidation while spoofchk is ON
    
    [ Upstream commit 9d2cbdc5d334967c35b5f58c7bf3208e17325647 ]
    
    Prior to this patch the driver prohibited spoof checking on an invalid
    MAC. Now the user can set this configuration if they wish to.
    
    This is required since libvirt might invalidate the VF MAC by setting it
    to zero while spoofcheck is ON.
    
    Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore")
    Signed-off-by: Aya Levin <ayal@mellanox.com>
    Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dc0fb8cceee31ff5bb356509d7ca46417709742a
Author: Mathias Thore <mathias.thore@infinera.com>
Date:   Mon Jan 28 10:07:47 2019 +0100

    ucc_geth: Reset BQL queue when stopping device
    
    [ Upstream commit e15aa3b2b1388c399c1a2ce08550d2cc4f7e3e14 ]
    
    After a timeout event caused by, for example, a broadcast storm, when
    the MAC and PHY are reset, the BQL TX queue needs to be reset as
    well. Otherwise, the device will exhibit severe performance issues
    even after the storm has ended.
    
    Co-authored-by: David Gounaris <david.gounaris@infinera.com>
    Signed-off-by: Mathias Thore <mathias.thore@infinera.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5d16d812d92337fddc1fd2e9c79ec26d7cd3bbdd
Author: Bernard Pidoux <f6bvp@free.fr>
Date:   Fri Jan 25 11:46:40 2019 +0100

    net/rose: fix NULL ax25_cb kernel panic
    
    [ Upstream commit b0cf029234f9b18e10703ba5147f0389c382bccc ]
    
    When an internally generated frame is handled by rose_xmit(),
    rose_route_frame() is called:
    
            if (!rose_route_frame(skb, NULL)) {
                    dev_kfree_skb(skb);
                    stats->tx_errors++;
                    return NETDEV_TX_OK;
            }
    
    We have the same code sequence in Net/Rom, where an internally generated
    frame is handled by nr_xmit() calling nr_route_frame(skb, NULL).
    However, nr_route_frame() tests for the NULL argument, while
    rose_route_frame() does not.
    The kernel panic then occurs later on, when ax25cmp() is called with a
    NULL ax25_cb argument, as reported many times and recently by syzbot.
    
    We need to test if ax25 is NULL before using it.
    
    Testing:
    Built kernel with CONFIG_ROSE=y.
    
    Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
    Acked-by: Dmitry Vyukov <dvyukov@google.com>
    Reported-by: syzbot+1a2c456a1ea08fa5b5f7@syzkaller.appspotmail.com
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Ralf Baechle <ralf@linux-mips.org>
    Cc: Bernard Pidoux <f6bvp@free.fr>
    Cc: linux-hams@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit baa9e32336bf6d0d74a7c3486d2a27feaf57cd5f
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date:   Thu Jan 24 14:18:18 2019 -0800

    netrom: switch to sock timer API
    
    [ Upstream commit 63346650c1a94a92be61a57416ac88c0a47c4327 ]
    
    sk_reset_timer() and sk_stop_timer() properly handle
    sock refcnt for timer function. Switching to them
    could fix a refcounting bug reported by syzbot.
    
    Reported-and-tested-by: syzbot+defa700d16f1bd1b9a05@syzkaller.appspotmail.com
    Cc: Ralf Baechle <ralf@linux-mips.org>
    Cc: linux-hams@vger.kernel.org
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 024dcf5f6943b48f6b29538487cb0f656de784e7
Author: Aya Levin <ayal@mellanox.com>
Date:   Tue Jan 22 15:19:44 2019 +0200

    net/mlx4_core: Add masking for a few queries on HCA caps
    
    [ Upstream commit a40ded6043658444ee4dd6ee374119e4e98b33fc ]
    
    The driver reads the query HCA capabilities without the corresponding
    masks. Without the correct masks, the base addresses of the queues are
    unaligned. In addition, some reserved bits were wrongly read. Using the
    correct masks ensures alignment of the base addresses and allows future
    firmware versions to use the reserved bits safely.
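    
    As a rough sketch of the masking idea (not the driver code), clearing
    the low reserved bits of a raw capability word yields an aligned base
    address. The helper name toy_masked_base() and the number of reserved
    bits are assumptions chosen for illustration; the real field layouts
    are defined by the firmware interface:
    
    ```c
    #include <assert.h>
    #include <stdint.h>
    
    /* Clear the low `reserved_bits` bits of a raw capability word, so the
     * result is aligned to 2^reserved_bits and reserved bits are ignored.
     * The field layout here is hypothetical. */
    static uint64_t toy_masked_base(uint64_t raw, unsigned int reserved_bits)
    {
        assert(reserved_bits < 64);     /* shifting by 64 would be undefined */
        return raw & ~((UINT64_C(1) << reserved_bits) - 1);
    }
    ```
    
    With five reserved bits, a raw value of 0x12345 becomes 0x12340, and a
    value that is already aligned is returned unchanged.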
    
    Fixes: ab9c17a009ee ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet")
    Fixes: 0ff1fb654bec ("{NET, IB}/mlx4: Add device managed flow steering firmware API")
    Signed-off-by: Aya Levin <ayal@mellanox.com>
    Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d1ab05e48257728d21b36f73e9c1221341801eba
Author: Jacob Wen <jian.w.wen@oracle.com>
Date:   Thu Jan 31 15:18:56 2019 +0800

    l2tp: copy 4 more bytes to linear part if necessary
    
    [ Upstream commit 91c524708de6207f59dd3512518d8a1c7b434ee3 ]
    
    The size of the L2TPv2 header with all optional fields is 14 bytes.
    l2tp_udp_recv_core only moves 10 bytes to the linear part of an
    skb. This may lead to l2tp_recv_common reading data outside of the skb.
    
    This patch makes sure that there are at least 14 bytes in the linear
    part of an skb to meet the maximum need of l2tp_udp_recv_core and
    l2tp_recv_common. The minimum size of both a PPP HDLC-like frame and an
    Ethernet frame is larger than 14 bytes, so we are safe to do so.
    
    Also remove L2TP_HDR_SIZE_NOSEQ, it is unused now.
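    
    The 14-byte figure can be checked with a small user-space sketch that
    adds up the L2TPv2 header fields from RFC 2661. The TOY_* names and the
    helper are hypothetical and are not the kernel's definitions; the
    variable-length offset padding is deliberately left out:
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    
    /* Flag bits in the first 16-bit word of an L2TPv2 header (RFC 2661). */
    #define TOY_L2TP_FLAG_L 0x4000u  /* Length field present */
    #define TOY_L2TP_FLAG_S 0x0800u  /* Ns and Nr fields present */
    #define TOY_L2TP_FLAG_O 0x0200u  /* Offset Size field present */
    
    /* Header length in bytes for a given flags word, excluding any
     * offset padding that may follow the Offset Size field. */
    static size_t toy_l2tpv2_hdr_len(unsigned int flags)
    {
        size_t len = 2 + 2 + 2;     /* flags/version, tunnel ID, session ID */
    
        assert(flags <= 0xFFFFu);   /* the flags word is 16 bits wide */
        if (flags & TOY_L2TP_FLAG_L)
            len += 2;               /* Length */
        if (flags & TOY_L2TP_FLAG_S)
            len += 4;               /* Ns + Nr */
        if (flags & TOY_L2TP_FLAG_O)
            len += 2;               /* Offset Size */
        return len;
    }
    ```
    
    With only the sequence flag set the helper returns 10, and with all
    three optional flags set it returns 14, which is the gap of 4 bytes
    this patch closes.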
    
    Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
    Suggested-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>
    Acked-by: Guillaume Nault <gnault@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 16a2595d4c11229f0275a82a6d72f0925ce218bf
Author: David Ahern <dsahern@gmail.com>
Date:   Wed Jan 2 18:57:09 2019 -0800

    ipv6: Consider sk_bound_dev_if when binding a socket to an address
    
    [ Upstream commit c5ee066333ebc322a24a00a743ed941a0c68617e ]
    
    IPv6 does not consider if the socket is bound to a device when binding
    to an address. The result is that a socket can be bound to eth0 and then
    bound to the address of eth1. If the device is a VRF, the result is that
    a socket can only be bound to an address in the default VRF.
    
    Resolve by considering the device if sk_bound_dev_if is set.
    
    This problem exists from the beginning of git history.
    
    Signed-off-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c5cc933badef6ccf224723a2d5281ea6c3fdfbc6
Author: Jimmy Durand Wesolowski <jdw@amazon.de>
Date:   Thu Jan 31 15:19:39 2019 +0100

    fs: add the fsnotify call to vfs_iter_write
    
    A bug has been discovered when redirecting splice output to regular files
    on EXT4 and tmpfs. Other filesystems might be affected.
    This commit fixes the issue for stable series kernels, using one of the
    changes introduced during the rewrite and refactoring of vfs_iter_write
    in 4.13, specifically in
    commit abbb65899aec ("fs: implement vfs_iter_write using do_iter_write").
    
    This issue affects v4.4 and v4.9 stable series of kernels.
    
    Without this fix for v4.4 and v4.9 stable, the following upstream commits
    (and their dependencies) would need to be backported:
    * commit abbb65899aec ("fs: implement vfs_iter_write using do_iter_write")
    * commit 18e9710ee59c ("fs: implement vfs_iter_read using do_iter_read")
    * commit edab5fe38c2c
      ("fs: move more code into do_iter_read/do_iter_write")
    * commit 19c735868dd0 ("fs: remove __do_readv_writev")
    * commit 26c87fb7d10d ("fs: remove do_compat_readv_writev")
    * commit 251b42a1dc64 ("fs: remove do_readv_writev")
    
    as well as the following dependencies:
    * commit bb7462b6fd64
      ("vfs: use helpers for calling f_op->{read,write}_iter()")
    * commit 0f78d06ac1e9
      ("vfs: pass type instead of fn to do_{loop,iter}_readv_writev()")
    * commit 7687a7a4435f
      ("vfs: extract common parts of {compat_,}do_readv_writev()")
    
    In order to reduce the changes, this commit uses only the part of
    commit abbb65899aec ("fs: implement vfs_iter_write using do_iter_write")
    that fixes the issue.
    
    This issue and the reproducer can be found on
    https://bugzilla.kernel.org/show_bug.cgi?id=85381
    
    Reported-by: Richard Li <richardpku@gmail.com>
    Reported-by: Chad Miller <millchad@amazon.com>
    Reviewed-by: Stefan Nuernberger <snu@amazon.de>
    Reviewed-by: Frank Becker <becke@amazon.de>
    Signed-off-by: Jimmy Durand Wesolowski <jdw@amazon.de>

commit 90a7b84679dedb23660ed46976b964b8bf7f3a55
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Thu Jan 31 15:59:51 2019 +0100

    Fix "net: ipv4: do not handle duplicate fragments as overlapping"
    
    ade446403bfb ("net: ipv4: do not handle duplicate fragments as
    overlapping") was backported to many stable trees, but it had a problem
    that was "accidentally" fixed by the upstream commit 0ff89efb5246 ("ip:
    fail fast on IP defrag errors")
    
    This is the fixup for that problem as we do not want the larger patch in
    the older stable trees.
    
    Fixes: ade446403bfb ("net: ipv4: do not handle duplicate fragments as overlapping")
    Reported-by: Ivan Babrou <ivan@cloudflare.com>
    Reported-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>