commit f41c15f2c9a00489735036846ec7e474e52b14a6
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Thu Oct 9 12:18:54 2014 -0700

    Linux 3.10.57

commit bed5396573366682b2e07d79a08aefde1c5a8f52
Author: Stratos Karafotis <stratosk@semaphore.gr>
Date:   Wed Jun 5 19:01:25 2013 +0300

    cpufreq: ondemand: Change the calculation of target frequency
    
    commit dfa5bb622555d9da0df21b50f46ebdeef390041b upstream.
    
    The ondemand governor calculates load in terms of frequency and
    increases it only if load_freq is greater than up_threshold
    multiplied by the current or average frequency.  This appears to
    produce oscillations of frequency between min and max because,
    for example, a relatively small load can easily saturate the minimum
    frequency and lead the CPU to the max.  Then, it will decrease
    back to the min due to the small load_freq.
    
    Change the calculation method of load and target frequency on the
    basis of the following two observations:
    
     - Load computation should not depend on the current or average
       measured frequency.  For example, absolute load of 80% at 100MHz
       is not necessarily equivalent to 8% at 1000MHz in the next
       sampling interval.
    
     - It should be possible to increase the target frequency to any
       value present in the frequency table proportional to the absolute
       load, rather than to the max only, so that:
    
       Target frequency = C * load
    
       where we take C = policy->cpuinfo.max_freq / 100.
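
    As a rough sketch of that formula (not the actual governor code;
    load is assumed to be the absolute load percentage computed for the
    last sampling interval and policy the struct cpufreq_policy being
    managed):

        /* Target frequency = C * load, with C = max_freq / 100; the
         * frequency table lookup (CPUFREQ_RELATION_L) then picks the
         * lowest supported frequency at or above this value. */
        unsigned int freq_next = load * (policy->cpuinfo.max_freq / 100);

        __cpufreq_driver_target(policy, freq_next, CPUFREQ_RELATION_L);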
    
    Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
    The Phoronix benchmark of the Linux Kernel Compilation 3.1 test shows
    an increase of ~1.5% in performance.  cpufreq_stats (time_in_state)
    shows that middle frequencies are used more with this patch.  The
    highest and lowest frequencies were used less by ~9%.
    
    [rjw: We have run multiple other tests on kernels with this
     change applied and in the vast majority of cases it turns out
     that the resulting performance improvement also leads to reduced
     consumption of energy.  The change is additionally justified by
     the overall simplification of the code in question.]
    
    Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Cc: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 35c239149f6e5794da2285f30bdeb3b4dd4df3b6
Author: Andreas Schwab <schwab@linux-m68k.org>
Date:   Sat Sep 7 18:35:08 2013 +0200

    cpufreq: Fix wrong time unit conversion
    
    commit a857c0b9e24e39fe5be82451b65377795f9538d8 upstream.
    
    The time spent by a CPU under a given frequency is stored, in jiffies
    units, in the per-cpu variable cpufreq_stats_table->time_in_state[i],
    i being the index of the frequency.
    
    This is what is displayed in the following file on the right column:
    
         cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
         2301000 19835820
         2300000 3172
         [...]
    
    Now cpufreq converts this jiffies delta to clock_t before returning it
    to the user, as in the above file, and that conversion is achieved
    using the cputime64_to_clock_t() API.
    
    Although it accidentally works on traditional tick-based cputime accounting, where
    cputime_t maps directly to jiffies, it doesn't work with other types of cputime
    accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs
    or any granularity preferred by the architecture.
    
    For example we get a buggy zero delta on full dyntick configurations:
    
         cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
         2301000 0
         2300000 0
         [...]
    
    Fix this by using the proper jiffies_64_to_clock_t() conversion.
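
    A small user-space sketch of how the right-hand column can then be
    interpreted, assuming the usual clock_t units reported by
    sysconf(_SC_CLK_TCK) and a cpu0 path like the one above:

        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
                const char *path =
                        "/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state";
                FILE *f = fopen(path, "r");
                unsigned long freq;
                unsigned long long ticks;
                long tck = sysconf(_SC_CLK_TCK);   /* clock_t ticks per second */

                if (!f)
                        return 1;
                while (fscanf(f, "%lu %llu", &freq, &ticks) == 2)
                        printf("%lu kHz: %.2f s\n", freq, (double)ticks / tck);
                fclose(f);
                return 0;
        }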
    
    Reported-and-tested-by: Carsten Emde <C.Emde@osadl.org>
    Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Cc: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7dd311128022551d7876b26b7193157883494cd3
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Wed Jul 30 14:55:26 2014 +0200

    nl80211: clear skb cb before passing to netlink
    
    commit bd8c78e78d5011d8111bc2533ee73b13a3bd6c42 upstream.
    
    In testmode and vendor command reply/event SKBs we use the
    skb cb data to store nl80211 parameters between allocation
    and sending. This causes the code for CONFIG_NETLINK_MMAP
    to get confused, because it takes ownership of the skb cb
    data when the SKB is handed off to netlink, and it doesn't
    explicitly clear it.
    
    Clear the skb cb explicitly when we're done and before it
    gets passed to netlink to avoid this issue.
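
    The shape of the fix, as a sketch (the surrounding reply path is
    simplified; the point is only the memset of skb->cb before the
    netlink hand-off):

        /* nl80211 stashed its own parameters in skb->cb while building
         * the reply; wipe them so the netlink/CONFIG_NETLINK_MMAP code,
         * which assumes it owns cb, never sees stale data. */
        memset(skb->cb, 0, sizeof(skb->cb));
        return genlmsg_reply(skb, info);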
    
    Reported-by: Assaf Azulay <assaf.azulay@intel.com>
    Reported-by: David Spinadel <david.spinadel@intel.com>
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6353c97aa7c7dd6b0c3fe717eeacb39e3873259e
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date:   Wed Jul 9 21:18:32 2014 +0200

    drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
    
    commit bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9 upstream.
    
    Since Linux kernel 3.13, kthread_run() internally uses
    wait_for_completion_killable().  We may sometimes call kthread_run()
    while we still have a signal pending (one we used to kick our threads
    out of potentially blocking network functions), causing kthread_run()
    to mistake it for a new fatal signal and fail.
    
    Fix: flush_signals() before kthread_run().
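
    A sketch of the pattern described above (the worker function, its
    argument and the thread name are illustrative, not the actual drbd
    code):

        struct task_struct *worker;

        /* Drain any signal we sent earlier to kick our threads out of
         * blocking network functions, so kthread_run() does not mistake
         * it for a new fatal signal and fail. */
        flush_signals(current);
        worker = kthread_run(fence_peer_worker, connection, "drbd_fence_wk");
        if (IS_ERR(worker))
                return PTR_ERR(worker);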
    
    Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
    Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 00790d4526bd88e711999b9af04a0e896cfbf5a8
Author: Andrew Hunter <ahh@google.com>
Date:   Thu Sep 4 14:17:16 2014 -0700

    jiffies: Fix timeval conversion to jiffies
    
    commit d78c9300c51d6ceed9f6d078d4e9366f259de28c upstream.
    
    timeval_to_jiffies tried to round a timeval up to an integral number
    of jiffies, but the logic for doing so was incorrect: intervals
    corresponding to exactly N jiffies would become N+1. This manifested
    itself particularly when repeatedly stopping/starting an itimer:
    
    setitimer(ITIMER_PROF, &val, NULL);
    setitimer(ITIMER_PROF, NULL, &val);
    
    would add a full tick to val, _even if it was exactly representable in
    terms of jiffies_ (say, the result of a previous rounding.)  Doing
    this repeatedly would cause unbounded growth in val.  So fix the math.
    
    Here's what was wrong with the conversion: we essentially computed
    (eliding seconds)
    
    jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)
    
    by using scaling arithmetic, which took the best approximation of
    NSEC_PER_USEC/TICK_NSEC with a denominator of 2^USEC_JIFFIE_SC, call
    it x/(2^USEC_JIFFIE_SC), and computed:
    
    jiffies = (usec * x) >> USEC_JIFFIE_SC
    
    and rounded this calculation up in the intermediate form (since we
    can't necessarily exactly represent TICK_NSEC in usec.) But the
    scaling arithmetic is a (very slight) *over*approximation of the true
    value; that is, instead of dividing by (1 usec/ 1 jiffie), we
    effectively divided by (1 usec/1 jiffie)-epsilon (rounding
    down). This would normally be fine, but we want to round timeouts up,
    and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
    would be fine if our division was exact, but dividing this by the
    slightly smaller factor was equivalent to adding just _over_ 1 to the
    final result (instead of just _under_ 1, as desired.)
    
    In particular, with HZ=1000, we consistently computed that 10000 usec
    was 11 jiffies; the same was true for any exact multiple of
    TICK_NSEC.
    
    We could possibly still round in the intermediate form, adding
    something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
    convert usec->nsec, round in nanoseconds, and then convert using
    time*spec*_to_jiffies.  This adds one constant multiplication, and is
    not observably slower in microbenchmarks on recent x86 hardware.
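
    A minimal user-space sketch of the round-up-in-nanoseconds idea
    (HZ=1000 assumed for illustration; this is not the kernel code):

        #include <stdint.h>
        #include <stdio.h>

        #define HZ            1000ULL
        #define NSEC_PER_SEC  1000000000ULL
        #define NSEC_PER_USEC 1000ULL
        #define TICK_NSEC     (NSEC_PER_SEC / HZ)

        /* Convert usec -> nsec first, then round up in nanoseconds. */
        static uint64_t usecs_to_jiffies_sketch(uint64_t usec)
        {
                uint64_t nsec = usec * NSEC_PER_USEC;
                return (nsec + TICK_NSEC - 1) / TICK_NSEC;
        }

        int main(void)
        {
                /* An exact multiple of the tick stays exact ... */
                printf("10000 usec -> %llu jiffies\n",
                       (unsigned long long)usecs_to_jiffies_sketch(10000));
                /* ... while anything in between still rounds up. */
                printf("10001 usec -> %llu jiffies\n",
                       (unsigned long long)usecs_to_jiffies_sketch(10001));
                return 0;
        }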
    
    Tested: the following program:
    
    #include <stddef.h>
    #include <stdio.h>
    #include <sys/time.h>

    int main() {
      struct itimerval zero = {{0, 0}, {0, 0}};
      /* Initially set to 10 ms. */
      struct itimerval initial = zero;
      initial.it_interval.tv_usec = 10000;
      setitimer(ITIMER_PROF, &initial, NULL);
      /* Save and restore several times. */
      for (size_t i = 0; i < 10; ++i) {
        struct itimerval prev;
        setitimer(ITIMER_PROF, &zero, &prev);
        /* on old kernels, this goes up by TICK_USEC every iteration */
        printf("previous value: %ld %ld %ld %ld\n",
               prev.it_interval.tv_sec, prev.it_interval.tv_usec,
               prev.it_value.tv_sec, prev.it_value.tv_usec);
        setitimer(ITIMER_PROF, &prev, NULL);
      }
      return 0;
    }
    
    
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Paul Turner <pjt@google.com>
    Cc: Richard Cochran <richardcochran@gmail.com>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Reviewed-by: Paul Turner <pjt@google.com>
    Reported-by: Aaron Jacobs <jacobsa@google.com>
    Signed-off-by: Andrew Hunter <ahh@google.com>
    [jstultz: Tweaked to apply to 3.17-rc]
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    [bwh: Backported to 3.16: adjust filename]
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 06905ff8a6f07cb59b20311694f5d1454654808f
Author: NeilBrown <neilb@suse.de>
Date:   Thu Oct 2 13:45:00 2014 +1000

    md/raid5: disable 'DISCARD' by default due to safety concerns.
    
    commit 8e0e99ba64c7ba46133a7c8a3e3f7de01f23bd93 upstream.
    
    It has come to my attention (thanks Martin) that 'discard_zeroes_data'
    is only a hint.  Some devices in some cases don't do what it
    says on the label.
    
    The use of DISCARD in RAID5 depends on reads from discarded regions
    being predictably zero.  If a write to a previously discarded region
    performs a read-modify-write cycle, it assumes that the parity block
    was consistent with the data blocks.  If all were zero, this would
    be the case.  If some are and some aren't, this would not be the case.
    This could lead to data corruption after a device failure when
    data needs to be reconstructed from the parity.
    
    As we cannot trust 'discard_zeroes_data', ignore it by default
    and so disallow DISCARD on all raid4/5/6 arrays.
    
    As many devices are trustworthy, and as there are benefits to using
    DISCARD, add a module parameter to override this caution and cause
    DISCARD to work if discard_zeroes_data is set.
    
    If a site wants to enable DISCARD on some arrays but not on others,
    they should select DISCARD support at the filesystem level and set
    the raid456 module parameter.
        raid456.devices_handle_discard_safely=Y
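
    Assuming the standard module-parameter sysfs layout, the same knob
    can also be given at module load time or, where the parameter is
    writable on the running kernel, flipped at runtime:

        modprobe raid456 devices_handle_discard_safely=Y
        echo Y > /sys/module/raid456/parameters/devices_handle_discard_safely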
    
    As this is a data-safety issue, I believe this patch is suitable for
    -stable.
    DISCARD support for RAID456 was added in 3.7.
    
    Cc: Shaohua Li <shli@kernel.org>
    Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
    Cc: Mike Snitzer <snitzer@redhat.com>
    Cc: Heinz Mauelshagen <heinzm@redhat.com>
    Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
    Acked-by: Mike Snitzer <snitzer@redhat.com>
    Fixes: 620125f2bf8ff0c4969b79653b54d7bcc9d40637
    Signed-off-by: NeilBrown <neilb@suse.de>
    [bwh: Backported to 3.10: adjust context]
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f5d34b7cae6c6ddddb1797ebc0d0918954544108
Author: Hans Verkuil <hans.verkuil@cisco.com>
Date:   Sat Sep 20 16:16:35 2014 -0300

    media: vb2: fix VBI/poll regression
    
    commit 58d75f4b1ce26324b4d809b18f94819843a98731 upstream.
    
    The recent conversion of saa7134 to vb2 uncovered a poll() bug that
    broke the teletext applications alevt and mtt. These applications
    expect that calling poll() without having called VIDIOC_STREAMON will
    cause poll() to return POLLERR. That did not happen in vb2.
    
    This patch fixes that behavior.  It also fixes what happens when
    poll() is called after STREAMON but before any buffers have been
    queued: in that case poll() will also return POLLERR, but only for
    capture queues, since output queues will always return POLLOUT
    anyway in that situation.
    
    This brings the vb2 behavior in line with the old videobuf behavior.
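
    A user-space sketch of the behaviour these applications rely on (the
    device node and timeout are assumptions; the point is only that
    POLLERR is reported before STREAMON):

        #include <fcntl.h>
        #include <poll.h>
        #include <stdio.h>

        int main(void)
        {
                int fd = open("/dev/vbi0", O_RDWR | O_NONBLOCK);
                struct pollfd pfd = { .fd = fd, .events = POLLIN };

                if (fd < 0)
                        return 1;
                /* No VIDIOC_STREAMON has been issued, so a vb2-backed
                 * capture/VBI queue is expected to flag POLLERR here. */
                if (poll(&pfd, 1, 100) > 0 && (pfd.revents & POLLERR))
                        printf("POLLERR before STREAMON, as alevt/mtt expect\n");
                return 0;
        }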
    
    Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
    Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f35407acce23bab3727190a94468362dc8f030a1
Author: Mel Gorman <mgorman@suse.de>
Date:   Thu Oct 2 19:47:42 2014 +0100

    mm: numa: Do not mark PTEs pte_numa when splitting huge pages
    
    commit abc40bd2eeb77eb7c2effcaf63154aad929a1d5f upstream.
    
    This patch reverts 1ba6e0b50b ("mm: numa: split_huge_page: transfer the
    NUMA type from the pmd to the pte"). If a huge page is being split due
    to a protection change and the tail will be in a PROT_NONE vma, then
    NUMA hinting PTEs are temporarily created in the protected VMA.
    
     VM_RW|VM_PROTNONE
    |-----------------|
          ^
          split here
    
    In the specific case above, it should get fixed up by change_pte_range()
    but there is a window of opportunity for weirdness to happen. Similarly,
    if a huge page is shrunk and split during a protection update but before
    pmd_numa is cleared then a pte_numa can be left behind.
    
    Instead of adding complexity trying to deal with the case, this patch
    will not mark PTEs NUMA when splitting a huge page. NUMA hinting faults
    will not be triggered which is marginal in comparison to the complexity
    in dealing with the corner cases during THP split.
    
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Acked-by: Rik van Riel <riel@redhat.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 183c062c51c7e7663752b94ac399c830e4cb3c44
Author: Waiman Long <Waiman.Long@hp.com>
Date:   Wed Aug 6 16:05:36 2014 -0700

    mm, thp: move invariant bug check out of loop in __split_huge_page_map
    
    commit f8303c2582b889351e261ff18c4d8eb197a77db2 upstream.
    
    In __split_huge_page_map(), the check for page_mapcount(page) is
    invariant within the for loop.  Because the macro is implemented using
    atomic_read(), the redundant check cannot be optimized away by the
    compiler, leading to an unnecessary read of the page structure.
    
    This patch moves the invariant bug check out of the loop so that it
    will be done only once.  On a 3.16-rc1 based kernel, a microbenchmark
    that broke up 1000 transparent huge pages using munmap() had an
    execution time of 38,245us with the patch and 38,548us without it.
    The performance gain is about 1%.
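
    A sketch of the transformation only (the check shown is illustrative,
    not the exact condition in __split_huge_page_map()):

        /* page_mapcount() expands to an atomic_read(), so the compiler
         * cannot hoist this invariant check itself; do it once before
         * the loop rather than on every iteration. */
        BUG_ON(page_mapcount(page) < 1);
        for (i = 0; i < HPAGE_PMD_NR; i++) {
                /* ... set up the regular pte for this subpage ... */
        }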
    
    Signed-off-by: Waiman Long <Waiman.Long@hp.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Scott J Norton <scott.norton@hp.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 78a3db11cb0e9521572c7d0effbc63f2bd5dac12
Author: Steven Rostedt (Red Hat) <rostedt@goodmis.org>
Date:   Thu Oct 2 16:51:18 2014 -0400

    ring-buffer: Fix infinite spin in reading buffer
    
    commit 24607f114fd14f2f37e3e0cb3d47bce96e81e848 upstream.
    
    Commit 651e22f2701b "ring-buffer: Always reset iterator to reader page"
    fixed one bug but in the process caused another one. The reset is to
    update the header page, but that fix also changed the way the cached
    reads were updated. The cache reads are used to test if an iterator
    needs to be updated or not.
    
    A ring buffer iterator, when created, disables writes to the ring buffer
    but does not stop other readers or consuming reads from happening.
    Although all readers are synchronized via a lock, they are only
    synchronized when in the ring buffer functions. Those functions may
    be called by any number of readers. The iterator continues down when
    it is not interrupted by a consuming reader. If a consuming read
    occurs, the iterator starts from the beginning of the buffer.
    
    The way the iterator sees that a consuming read has happened since
    its last read is by checking the reader "cache". The cache holds the
    last counts of the read and the reader page itself.
    
    Commit 651e22f2701b changed what was saved by the cache_read when
    the rb_iter_reset() occurred, making the iterator never match the cache.
    Then if the iterator calls rb_iter_reset(), it will go into an
    infinite loop by checking if the cache doesn't match, doing the reset
    and retrying, just to see that the cache still doesn't match! This
    should never happen, as the reset is supposed to set the cache to the
    current value and there are locks that keep a consuming reader from
    having access to the data.
    
    Fixes: 651e22f2701b "ring-buffer: Always reset iterator to reader page"
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0000372a96216b393bacfa50fb0253c40f8cf3d1
Author: Josh Triplett <josh@joshtriplett.org>
Date:   Fri Oct 3 16:19:24 2014 -0700

    init/Kconfig: Fix HAVE_FUTEX_CMPXCHG to not break up the EXPERT menu
    
    commit 62b4d2041117f35ab2409c9f5c4b8d3dc8e59d0f upstream.
    
    commit 03b8c7b623c80af264c4c8d6111e5c6289933666 ("futex: Allow
    architectures to skip futex_atomic_cmpxchg_inatomic() test") added the
    HAVE_FUTEX_CMPXCHG symbol right below FUTEX.  This placed it right in
    the middle of the options for the EXPERT menu.  However,
    HAVE_FUTEX_CMPXCHG does not depend on EXPERT or FUTEX, so Kconfig stops
    placing items in the EXPERT menu, and displays the remaining several
    EXPERT items (starting with EPOLL) directly in the General Setup menu.
    
    Since both users of HAVE_FUTEX_CMPXCHG only select it "if FUTEX", make
    HAVE_FUTEX_CMPXCHG itself depend on FUTEX.  With this change, the
    subsequent items display as part of the EXPERT menu again; the EMBEDDED
    menu now appears as the next top-level item in the General Setup menu,
    which makes General Setup much shorter and more usable.
    
    Signed-off-by: Josh Triplett <josh@joshtriplett.org>
    Acked-by: Randy Dunlap <rdunlap@infradead.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bee870fc1af7c5109a0f167af3bfe7002a02e7f3
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Thu Oct 2 16:17:02 2014 -0700

    perf: fix perf bug in fork()
    
    commit 6c72e3501d0d62fc064d3680e5234f3463ec5a86 upstream.
    
    Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by
    calling perf_event_free_task() when failing sched_fork() we will not
    yet have done the memset() on ->perf_event_ctxp[] and will therefore
    try to 'free' the inherited contexts, which are still in use by the
    parent process.  This is bad.
    
    Suggested-by: Oleg Nesterov <oleg@redhat.com>
    Reported-by: Oleg Nesterov <oleg@redhat.com>
    Reported-by: Sylvain 'ythier' Hitier <sylvain.hitier@gmail.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 07d209bd092d023976fdb881ba6d4b30fe18aebe
Author: Jan Kara <jack@suse.cz>
Date:   Thu Sep 4 14:06:55 2014 +0200

    udf: Avoid infinite loop when processing indirect ICBs
    
    commit c03aa9f6e1f938618e6db2e23afef0574efeeb65 upstream.
    
    We did not implement any bound on the number of indirect ICBs we follow
    when loading an inode.  Thus a corrupted medium could cause the kernel
    to go into an infinite loop, possibly causing a stack overflow.
    
    Fix the possible stack overflow by removing the recursion from
    __udf_read_inode() and limiting the number of indirect ICBs we follow
    to avoid infinite loops.
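
    The shape of the fix, as a sketch (helper names and the exact bound
    are illustrative, not the real __udf_read_inode() code):

        #define UDF_MAX_ICB_NESTING 1024

        unsigned int indirections = 0;

        /* Follow the chain of indirect ICBs iteratively instead of
         * recursing, and give up on a corrupted medium that chains far
         * more entries than any sane filesystem would. */
        while (icb_is_indirect(fe)) {
                if (++indirections > UDF_MAX_ICB_NESTING) {
                        ret = -EIO;
                        goto out;
                }
                fe = udf_follow_indirect_icb(inode, fe);
        }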
    
    Signed-off-by: Jan Kara <jack@suse.cz>
    Cc: Chuck Ebbert <cebbert.lkml@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>