  1. Mar 13, 2022
  2. Mar 12, 2022
  3. Mar 11, 2022
    • ARM: Spectre-BHB: provide empty stub for non-config · 68453767
      Randy Dunlap authored
      
      When CONFIG_GENERIC_CPU_VULNERABILITIES is not set, references
      to spectre_v2_update_state() cause a build error, so provide an
      empty stub for that function when the Kconfig option is not set.
      
      Fixes this build error:
      
        arm-linux-gnueabi-ld: arch/arm/mm/proc-v7-bugs.o: in function `cpu_v7_bugs_init':
        proc-v7-bugs.c:(.text+0x52): undefined reference to `spectre_v2_update_state'
        arm-linux-gnueabi-ld: proc-v7-bugs.c:(.text+0x82): undefined reference to `spectre_v2_update_state'
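
The stub pattern described in this commit can be sketched in plain user-space C. This is only an illustration, not the kernel header: the macro `DEMO_GENERIC_CPU_VULNERABILITIES` and the names `demo_update_state()`, `demo_state` and `demo_bugs_init()` are hypothetical stand-ins for the Kconfig option, spectre_v2_update_state() and its caller.

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_GENERIC_CPU_VULNERABILITIES.
 * It is left undefined here to model a build without the option. */
/* #define DEMO_GENERIC_CPU_VULNERABILITIES 1 */

static unsigned int demo_state; /* last state recorded, 0 if none */

#ifdef DEMO_GENERIC_CPU_VULNERABILITIES
/* Real implementation: remember the reported mitigation state. */
static inline void demo_update_state(unsigned int state)
{
    demo_state = state;
}
#else
/* Empty stub: callers still compile and link, but nothing is recorded. */
static inline void demo_update_state(unsigned int state)
{
    (void)state;
}
#endif

/* A caller that is built regardless of the option. */
static unsigned int demo_bugs_init(void)
{
    demo_update_state(2);
    return demo_state;
}
```

With the macro undefined, the call site compiles unchanged and `demo_bugs_init()` simply leaves the state untouched, which is the point of providing the stub.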
      
      Fixes: b9baf5c8 ("ARM: Spectre-BHB workaround")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: kernel test robot <lkp@intel.com>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: patches@armlinux.org.uk
      Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 77fe1ba9
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - prevent users from enabling the alternatives framework (and thus
         errata handling) on XIP kernels, where runtime code patching does not
         function correctly.
      
       - properly detect offset overflow for AUIPC-based relocations in
         modules. This may manifest as modules calling arbitrary invalid
         addresses, depending on the address allocated when a module is
         loaded.
      
      * tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Fix auipc+jalr relocation range checks
        riscv: alternative only works on !XIP_KERNEL
    • Merge tag 'powerpc-5.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 878409ec
      Linus Torvalds authored
      Pull powerpc fix from Michael Ellerman:
       "Fix STACKTRACE=n build, in particular for skiroot_defconfig"
      
      * tag 'powerpc-5.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: Fix STACKTRACE=n build
    • ARM: fix Thumb2 regression with Spectre BHB · 6c7cb60b
      Russell King (Oracle) authored
      
      When building for Thumb2, the vectors make use of a local label. Sadly,
      the Spectre BHB code also uses a local label with the same number which
      results in the Thumb2 reference pointing at the wrong place. Fix this
      by changing the number used for the Spectre BHB local label.
      
      Fixes: b9baf5c8 ("ARM: Spectre-BHB workaround")
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'mmc-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 3977a3fb
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "MMC core:
         - Restore (mostly) the busy polling for MMC_SEND_OP_COND
      
        MMC host:
         - meson-gx: Fix DMA usage of meson_mmc_post_req()"
      
      * tag 'mmc-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: core: Restore (almost) the busy polling for MMC_SEND_OP_COND
        mmc: meson: Fix usage of meson_mmc_post_req()
    • x86/sgx: Free backing memory after faulting the enclave page · 08999b24
      Jarkko Sakkinen authored
      
      There is a limited amount of SGX memory (EPC) on each system.  When that
      memory is used up, SGX has its own swapping mechanism which is similar
      in concept but totally separate from the core mm/* code.  Instead of
      swapping to disk, SGX swaps from EPC to normal RAM.  That normal RAM
      comes from a shared memory pseudo-file and can itself be swapped by the
      core mm code.  There is a hierarchy like this:
      
      	EPC <-> shmem <-> disk
      
      After data is swapped back in from shmem to EPC, the shmem backing
      storage needs to be freed.  Currently, the backing shmem is not freed.
      This effectively wastes the shmem while the enclave is running.  The
      memory is recovered when the enclave is destroyed and the backing
      storage freed.
      
      Sort this out by freeing memory with shmem_truncate_range(), as soon as
      a page is faulted back to the EPC.  In addition, free the memory for
      PCMD pages as soon as all PCMD's in a page have been marked as unused
      by zeroing its contents.
      
      Cc: stable@vger.kernel.org
      Fixes: 1728ab54 ("x86/sgx: Add a page reclaimer")
      Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Link: https://lkml.kernel.org/r/20220303223859.273187-1-jarkko@kernel.org
    • Merge branch 'davidh' (fixes from David Howells) · 93ce9358
      Linus Torvalds authored
      Merge misc fixes from David Howells:
       "A set of patches for watch_queue filter issues noted by Jann. I've
        added in a cleanup patch from Christophe Jaillet to convert to using
        formal bitmap specifiers for the note allocation bitmap.
      
        Also two filesystem fixes (afs and cachefiles)"
      
      * emailed patches from David Howells <dhowells@redhat.com>:
        cachefiles: Fix volume coherency attribute
        afs: Fix potential thrashing in afs writeback
        watch_queue: Make comment about setting ->defunct more accurate
        watch_queue: Fix lack of barrier/sync/lock between post and read
        watch_queue: Free the alloc bitmap when the watch_queue is torn down
        watch_queue: Fix the alloc bitmap size to reflect notes allocated
        watch_queue: Use the bitmap API when applicable
        watch_queue: Fix to always request a pow-of-2 pipe ring size
        watch_queue: Fix to release page in ->release()
        watch_queue, pipe: Free watchqueue state after clearing pipe ring
        watch_queue: Fix filter limit check
    • cachefiles: Fix volume coherency attribute · 413a4a6b
      David Howells authored
      
      A network filesystem may set coherency data on a volume cookie, and if
      given, cachefiles will store this in an xattr on the directory in the
      cache corresponding to the volume.
      
      The function that sets the xattr just stores the contents of the volume
      coherency buffer directly into the xattr, with nothing added; the
      checking function, on the other hand, has a cut'n'paste error whereby it
      tries to interpret the xattr contents as would be the xattr on an
      ordinary file (using the cachefiles_xattr struct).  This results in a
      failure to match the coherency data because the buffer ends up being
      shifted by 18 bytes.
      
      Fix this by defining a structure specifically for the volume xattr and
      making both the setting and checking functions use it.
      
      Since the volume coherency doesn't work if used, take the opportunity to
      insert a reserved field for future use, set it to 0 and check that it is
      0.  Log mismatch through the appropriate tracepoint.
      
      Note that this only affects cifs; 9p, afs, ceph and nfs don't use the
      volume coherency data at the moment.
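
The 18-byte shift can be illustrated with a small user-space sketch. The two struct layouts below are assumptions chosen to match the figure quoted above (an 18-byte file-style header, and a volume header with one reserved word); the names `demo_file_xattr`, `demo_volume_xattr` and `demo_volume_check()` are hypothetical and do not reproduce the cachefiles code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed file-style header: 8 + 8 + 1 + 1 = 18 bytes before the data. */
struct demo_file_xattr {
    uint64_t object_size;
    uint64_t zero_point;
    uint8_t  type;
    uint8_t  content;
    uint8_t  data[];
} __attribute__((packed));

/* Assumed volume-specific layout: a reserved word, then the data. */
struct demo_volume_xattr {
    uint32_t reserved;  /* must be zero */
    uint8_t  data[];
} __attribute__((packed));

/* Offset at which a checker using the file layout looks for the data. */
static size_t demo_file_data_offset(void)
{
    return offsetof(struct demo_file_xattr, data);
}

/* Check a stored volume xattr against the expected coherency bytes. */
static int demo_volume_check(const uint8_t *xattr, size_t xlen,
                             const uint8_t *coherency, size_t clen)
{
    struct demo_volume_xattr hdr;

    assert(xattr != NULL && coherency != NULL);
    if (xlen != sizeof(hdr) + clen)
        return 0;               /* length mismatch */
    memcpy(&hdr, xattr, sizeof(hdr));
    if (hdr.reserved != 0)
        return 0;               /* reserved field must stay zero */
    return memcmp(xattr + sizeof(hdr), coherency, clen) == 0;
}
```

Because the setter and the checker share one structure in this sketch, the data offset cannot drift between them, which is the property the fix restores.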
      
      Fixes: 32e15003 ("fscache, cachefiles: Store the volume coherency data")
      Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: Steve French <smfrench@gmail.com>
      cc: linux-cifs@vger.kernel.org
      cc: linux-cachefs@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • afs: Fix potential thrashing in afs writeback · 173ce1ca
      David Howells authored
      
      In afs_writepages_region(), if the dirty page we find is undergoing
      writeback or write to cache, but the sync_mode is WB_SYNC_NONE, we go
      round the loop trying the same page again and again with no pausing or
      waiting unless and until another thread manages to clear the writeback
      and fscache flags.
      
      Fix this with three measures:
      
       (1) Advance start to after the page we found.
      
       (2) Break out of the loop and return if rescheduling is requested.
      
       (3) Arbitrarily give up after a maximum of 5 skips.
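
The three measures can be sketched as a simplified user-space loop. This is not the afs code: `struct demo_page`, `demo_writepages()` and the `need_resched` callback are hypothetical, and a page here is reduced to a single "busy" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_SKIPS 5    /* the "maximum of 5 skips" from the text */

struct demo_page {
    bool busy;  /* models "under writeback or write to cache" */
};

/*
 * Walk pages[start..n) and count how many could be written.
 * Returns the index to resume from; *written receives the count.
 */
static size_t demo_writepages(const struct demo_page *pages, size_t n,
                              size_t start, bool (*need_resched)(void),
                              size_t *written)
{
    int skips = 0;

    assert(pages != NULL && written != NULL);
    *written = 0;
    while (start < n) {
        if (pages[start].busy) {
            start++;                    /* (1) move past the busy page */
            if (need_resched && need_resched())
                break;                  /* (2) yield when asked to */
            if (++skips >= DEMO_MAX_SKIPS)
                break;                  /* (3) give up after 5 skips */
            continue;
        }
        (*written)++;
        start++;
    }
    return start;
}
```

Without step (1) the loop would test the same busy page forever; steps (2) and (3) bound how long it can spin even when every page is busy.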
      
      Fixes: 31143d5d ("AFS: implement basic file write support")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Marc Dionne <marc.dionne@auristor.com>
      Acked-by: Marc Dionne <marc.dionne@auristor.com>
      Link: https://lore.kernel.org/r/164692725757.2097000.2060513769492301854.stgit@warthog.procyon.org.uk/ # v1
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86/traps: Mark do_int3() NOKPROBE_SYMBOL · a365a65f
      Li Huafei authored
      
      Since kprobe_int3_handler() is called in do_int3(), probing do_int3()
      can cause a breakpoint recursion and crash the kernel. Therefore,
      do_int3() should be marked as NOKPROBE_SYMBOL.
      
      Fixes: 21e28290 ("x86/traps: Split int3 handler up")
      Signed-off-by: Li Huafei <lihuafei1@huawei.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20220310120915.63349-1-lihuafei1@huawei.com
    • watch_queue: Make comment about setting ->defunct more accurate · 4edc0760
      David Howells authored
      
      watch_queue_clear() has a comment stating that setting ->defunct to true
      prevents new additions as well as notifications.  Whilst the latter is
      true, the first part is superfluous since, at the time this function is
      called, the pipe cannot be accessed to add new event sources.
      
      Remove the "new additions" bit from the comment.
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue: Fix lack of barrier/sync/lock between post and read · 2ed147f0
      David Howells authored
      
      There's nothing to synchronise post_one_notification() versus
      pipe_read().  Whilst posting is done under pipe->rd_wait.lock, the
      reader only takes pipe->mutex which cannot bar notification posting as
      that may need to be made from contexts that cannot sleep.
      
      Fix this by setting pipe->head with a barrier in post_one_notification()
      and reading pipe->head with a barrier in pipe_read().
      
      If that's not sufficient, the rd_wait.lock will need to be taken,
      possibly in a ->confirm() op so that it only applies to notifications.
      The lock would, however, have to be dropped before copy_page_to_iter()
      is invoked.
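
The barrier pairing can be sketched with C11 atomics as a user-space analogue. The ring below is hypothetical (`struct demo_ring`, `demo_post()`, `demo_read()`) and assumes a single poster and a single reader; a release store on the head stands in for "setting pipe->head with a barrier" and an acquire load for "reading pipe->head with a barrier".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_RING_SIZE 8u   /* must be a power of two */

struct demo_ring {
    int slots[DEMO_RING_SIZE];
    atomic_uint head;   /* advanced by the poster */
    atomic_uint tail;   /* advanced by the reader */
};

/* Poster: fill the slot first, then publish it with a release store. */
static int demo_post(struct demo_ring *r, int note)
{
    unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail >= DEMO_RING_SIZE)
        return 0;   /* ring full */
    r->slots[head & (DEMO_RING_SIZE - 1)] = note;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 1;
}

/* Reader: the acquire load pairs with the release store above, so a
 * slot is read only after its contents have become visible. */
static int demo_read(struct demo_ring *r, int *note)
{
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

    assert(note != NULL);
    if (head == tail)
        return 0;   /* ring empty */
    *note = r->slots[tail & (DEMO_RING_SIZE - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 1;
}
```

The ordering matters in one direction only: the slot write must not be observed after the head update, otherwise the reader could consume a slot that still holds stale data.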
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue: Free the alloc bitmap when the watch_queue is torn down · 7ea1a012
      David Howells authored
      
      Free the watch_queue note allocation bitmap when the watch_queue is
      destroyed.
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue: Fix the alloc bitmap size to reflect notes allocated · 3b4c0371
      David Howells authored
      
      Currently, watch_queue_set_size() sets the number of notes available in
      wqueue->nr_notes according to the number of notes allocated, but sets
      the size of the bitmap to the unrounded number of notes originally asked
      for.
      
      Fix this by setting the bitmap size to the number of notes we're
      actually going to make available (ie. the number allocated).
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue: Use the bitmap API when applicable · a66bd757
      Christophe JAILLET authored
      
      Use bitmap_alloc() to simplify code, improve the semantics and reduce
      some open-coded arithmetic in allocator arguments.
      
      Also change a memset(0xff) into an equivalent bitmap_fill() to keep
      consistency.
      
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue: Fix to always request a pow-of-2 pipe ring size · 96a4d891
      David Howells authored
      
      The pipe ring size must always be a power of 2 as the head and tail
      pointers are masked off by AND'ing with the size of the ring - 1.
      watch_queue_set_size(), however, lets you specify any number of notes
      between 1 and 511.  This number is passed through to pipe_resize_ring()
      without checking/forcing its alignment.
      
      Fix this by rounding the number of slots required up to the nearest
      power of two.  The request is meant to guarantee that at least that many
      notifications can be generated before the queue is full, so rounding
      down isn't an option, but, alternatively, it may be better to give an
      error if we aren't allowed to allocate that much ring space.
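
Why the ring size has to be a power of two can be shown with a short sketch. The helpers `demo_roundup_pow_of_two()` and `demo_slot()` are hypothetical user-space stand-ins, not the kernel's own routines.

```c
#include <assert.h>

/* Round n up to the next power of two (n must be between 1 and 2^31). */
static unsigned int demo_roundup_pow_of_two(unsigned int n)
{
    unsigned int p = 1;

    assert(n >= 1 && n <= 0x80000000u);
    while (p < n)
        p <<= 1;
    return p;
}

/* Map a free-running head or tail counter to a slot index by masking. */
static unsigned int demo_slot(unsigned int counter, unsigned int ring_size)
{
    return counter & (ring_size - 1);
}
```

With a ring size of 8 the mask is 0b111 and every counter maps to a distinct slot within one lap. With a size of 6 the mask is 0b101, so counters 0 and 2 both land in slot 0 and some slots are never used, which is the failure mode that rounding up avoids.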
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue: Fix to release page in ->release() · c1853fba
      David Howells authored
      
      When a pipe ring descriptor points to a notification message, the
      refcount on the backing page is incremented by the generic get function,
      but the release function, which marks the bitmap, doesn't drop the page
      ref.
      
      Fix this by calling generic_pipe_buf_release() at the end of
      watch_queue_pipe_buf_release().
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue, pipe: Free watchqueue state after clearing pipe ring · db8facfc
      David Howells authored
      
      In free_pipe_info(), free the watchqueue state after clearing the pipe
      ring as each pipe ring descriptor has a release function, and in the
      case of a notification message, this is watch_queue_pipe_buf_release()
      which tries to mark the allocation bitmap that was previously released.
      
      Fix this by moving the put of the pipe's ref on the watch queue to after
      the ring has been cleared.  We still need to call watch_queue_clear()
      before doing that to make sure that the pipe is disconnected from any
      notification sources first.
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watch_queue: Fix filter limit check · c993ee0f
      David Howells authored
      
      In watch_queue_set_filter(), there are a couple of places where we check
      that the filter type value does not exceed what the type_filter bitmap
      can hold.  One place calculates the number of bits by:
      
         if (tf[i].type >= sizeof(wfilter->type_filter) * 8)
      
      which is fine, but the second does:
      
         if (tf[i].type >= sizeof(wfilter->type_filter) * BITS_PER_LONG)
      
      which is not.  This can lead to a couple of out-of-bounds writes due to
      a too-large type:
      
       (1) __set_bit() on wfilter->type_filter
       (2) Writing more elements in wfilter->filters[] than we allocated.
      
      Fix this by just using the proper WATCH_TYPE__NR instead, which is the
      number of types we actually know about.
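
The difference between the two limit calculations can be made concrete with a small sketch. The struct and constants are hypothetical (`struct demo_filter`, `DEMO_TYPE_NR`); only the arithmetic mirrors the two expressions quoted above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_TYPE_NR 2u     /* hypothetical count of known types */

struct demo_filter {
    unsigned long type_filter[1];   /* one bit per type */
};

/* Correct capacity: bytes in the bitmap times bits per byte. */
static size_t demo_bitmap_bits(void)
{
    return sizeof(((struct demo_filter *)0)->type_filter) * CHAR_BIT;
}

/* Buggy capacity: bytes times bits per long overstates the size. */
static size_t demo_bitmap_bits_buggy(void)
{
    return sizeof(((struct demo_filter *)0)->type_filter) * DEMO_BITS_PER_LONG;
}

/* The fix: accept only types below the number actually defined. */
static int demo_type_ok(unsigned int type)
{
    return type < DEMO_TYPE_NR;
}
```

The buggy bound is larger than the real one by a factor of `sizeof(unsigned long)`, so a type value between the two passes the check and indexes past the end of the bitmap. Checking against the count of known types sidesteps the size arithmetic entirely.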
      
      The bug may cause an oops looking something like:
      
        BUG: KASAN: slab-out-of-bounds in watch_queue_set_filter+0x659/0x740
        Write of size 4 at addr ffff88800d2c66bc by task watch_queue_oob/611
        ...
        Call Trace:
         <TASK>
         dump_stack_lvl+0x45/0x59
         print_address_description.constprop.0+0x1f/0x150
         ...
         kasan_report.cold+0x7f/0x11b
         ...
         watch_queue_set_filter+0x659/0x740
         ...
         __x64_sys_ioctl+0x127/0x190
         do_syscall_64+0x43/0x90
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
        Allocated by task 611:
         kasan_save_stack+0x1e/0x40
         __kasan_kmalloc+0x81/0xa0
         watch_queue_set_filter+0x23a/0x740
         __x64_sys_ioctl+0x127/0x190
         do_syscall_64+0x43/0x90
         entry_SYSCALL_64_after_hwframe+0x44/0xae
      
        The buggy address belongs to the object at ffff88800d2c66a0
         which belongs to the cache kmalloc-32 of size 32
        The buggy address is located 28 bytes inside of
         32-byte region [ffff88800d2c66a0, ffff88800d2c66c0)
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'drm-fixes-2022-03-11' of git://anongit.freedesktop.org/drm/drm · 79b00034
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "As expected at this stage it's pretty quiet, one sun4i mixer fix and
        one i915 display flicker fix:
      
        i915:
         - fix psr screen flicker
      
        sun4i:
         - mixer format fix"
      
      * tag 'drm-fixes-2022-03-11' of git://anongit.freedesktop.org/drm/drm:
        drm/sun4i: mixer: Fix P010 and P210 format numbers
        drm/i915/psr: Set "SF Partial Frame Enable" also on full update
    • riscv: Fix auipc+jalr relocation range checks · 0966d385
      Emil Renner Berthing authored
      
      RISC-V can do PC-relative jumps with a 32bit range using the following
      two instructions:
      
      	auipc	t0, imm20	; t0 = PC + imm20 * 2^12
      	jalr	ra, t0, imm12	; ra = PC + 4, PC = t0 + imm12
      
      Crucially both the 20bit immediate imm20 and the 12bit immediate imm12
      are treated as two's-complement signed values. For this reason the
      immediates are usually calculated like this:
      
      	imm20 = (offset + 0x800) >> 12
      	imm12 = offset & 0xfff
      
      ..where offset is the signed offset from the auipc instruction. When
      the 11th bit of offset is 0 the addition of 0x800 doesn't change the top
      20 bits and imm12 is considered positive. When the 11th bit is 1 the carry
      of the addition by 0x800 means imm20 is one higher, but since imm12 is
      then considered negative the two's complement representation means it
      all cancels out nicely.
      
      However, this addition by 0x800 (2^11) means an offset greater than or
      equal to 2^31 - 2^11 would overflow, so imm20 is considered negative and
      the result is a backwards jump. Similarly the lower range of offset is also
      moved down by 2^11 and hence the true 32bit range is
      
      	[-2^31 - 2^11, 2^31 - 2^11)
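
The range and the immediate split can be checked with a small user-space sketch. The helper names (`demo_valid_offset()`, `demo_split()`, `demo_join()`) are hypothetical, and 64-bit arithmetic is assumed so that out-of-range offsets can be represented at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if offset lies in [-2^31 - 2^11, 2^31 - 2^11). */
static bool demo_valid_offset(int64_t offset)
{
    const int64_t lo = -(INT64_C(1) << 31) - (INT64_C(1) << 11);
    const int64_t hi = (INT64_C(1) << 31) - (INT64_C(1) << 11);

    return offset >= lo && offset < hi;
}

/* Split offset into the two immediate fields, as in the formulas above. */
static void demo_split(int64_t offset, uint32_t *imm20, uint32_t *imm12)
{
    assert(imm20 != NULL && imm12 != NULL);
    *imm20 = (uint32_t)((((uint64_t)offset + 0x800) >> 12) & 0xfffff);
    *imm12 = (uint32_t)((uint64_t)offset & 0xfff);
}

/* Recombine the fields as signed values: imm20 * 2^12 + imm12. */
static int64_t demo_join(uint32_t imm20, uint32_t imm12)
{
    /* Sign-extend the 20-bit and 12-bit fields. */
    int64_t hi = (int64_t)((imm20 & 0xfffff) ^ 0x80000u) - 0x80000;
    int64_t lo = (int64_t)((imm12 & 0xfff) ^ 0x800u) - 0x800;

    return hi * 4096 + lo;
}
```

Offsets inside the range round-trip through `demo_split()` and `demo_join()` unchanged. The first offset past the upper bound, 2^31 - 2^11, comes back negative, which is the backwards jump the range check is meant to reject.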
      
      Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
      Fixes: e2c0cdfb ("RISC-V: User-facing API")
      Cc: stable@vger.kernel.org
      Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
    • Merge tag 'drm-intel-fixes-2022-03-10' of... · 30eb13a2
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2022-03-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      - Fix PSR2 when selective fetch is enabled and cursor at (-1, -1) (Jouni Högander)
      
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/YinTFSFg++HvuFpZ@tursulin-mobl2
    • Merge tag 'drm-misc-fixes-2022-03-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 1f37299b
      Dave Airlie authored
      
       * drm/sun4i: Fix P010 and P210 format numbers
      
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Thomas Zimmermann <tzimmermann@suse.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/YipS65Iuu7RMMlAa@linux-uq9g
    • Merge tag 'trace-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · dda64ead
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Minor tracing fixes:
      
         - Fix unregistering the same event twice. A user could disable the
           same event that osnoise will disable on unregistering.
      
         - Inform RCU of a quiescent state in the osnoise testing thread.
      
         - Fix some kerneldoc comments"
      
      * tag 'trace-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix some W=1 warnings in kernel doc comments
        tracing/osnoise: Force quiescent states while tracing
        tracing/osnoise: Do not unregister events twice
    • Merge tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 186d32bb
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth, and ipsec.
      
        Current release - regressions:
      
         - Bluetooth: fix unbalanced unlock in set_device_flags()
      
         - Bluetooth: fix not processing all entries on cmd_sync_work, make
           connect with qualcomm and intel adapters reliable
      
         - Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"
      
         - xdp: xdp_mem_allocator can be NULL in trace_mem_connect()
      
         - eth: ice: fix race condition and deadlock during interface enslave
      
        Current release - new code bugs:
      
         - tipc: fix incorrect order of state message data sanity check
      
        Previous releases - regressions:
      
         - esp: fix possible buffer overflow in ESP transformation
      
         - dsa: unlock the rtnl_mutex when dsa_master_setup() fails
      
         - phy: meson-gxl: fix interrupt handling in forced mode
      
         - smsc95xx: ignore -ENODEV errors when device is unplugged
      
        Previous releases - always broken:
      
         - xfrm: fix tunnel mode fragmentation behavior
      
         - esp: fix inter address family tunneling on GSO
      
         - tipc: fix null-deref due to race when enabling bearer
      
         - sctp: fix kernel-infoleak for SCTP sockets
      
         - eth: macb: fix lost RX packet wakeup race in NAPI receive
      
         - eth: intel: stop disabling VFs due to PF error responses

         - eth: bcmgenet: don't claim WOL when it's not available"
      
      * tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
        xdp: xdp_mem_allocator can be NULL in trace_mem_connect().
        ice: Fix race condition during interface enslave
        net: phy: meson-gxl: improve link-up behavior
        net: bcmgenet: Don't claim WOL when its not available
        net: arc_emac: Fix use after free in arc_mdio_probe()
        sctp: fix kernel-infoleak for SCTP sockets
        net: phy: correct spelling error of media in documentation
        net: phy: DP83822: clear MISR2 register to disable interrupts
        gianfar: ethtool: Fix refcount leak in gfar_get_ts_info
        selftests: pmtu.sh: Kill nettest processes launched in subshell.
        selftests: pmtu.sh: Kill tcpdump processes launched by subshell.
        NFC: port100: fix use-after-free in port100_send_complete
        net/mlx5e: SHAMPO, reduce TIR indication
        net/mlx5e: Lag, Only handle events from highest priority multipath entry
        net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
        net/mlx5: Fix a race on command flush flow
        net/mlx5: Fix size field in bufferx_reg struct
        ax25: Fix NULL pointer dereference in ax25_kill_by_device
        net: marvell: prestera: Add missing of_node_put() in prestera_switch_set_base_mac_addr
        net: ethernet: lpc_eth: Handle error for clk_enable
        ...
    • xdp: xdp_mem_allocator can be NULL in trace_mem_connect(). · e0ae7130
      Sebastian Andrzej Siewior authored
      
      Since the commit mentioned below, __xdp_reg_mem_model() can return a NULL
      pointer. This pointer is dereferenced in trace_mem_connect(), which leads
      to a segfault.
      
      The trace points (mem_connect + mem_disconnect) were put in place to
      pair connect/disconnect using the IDs. The ID is only assigned if
      __xdp_reg_mem_model() does not return NULL. That connect trace point is
      of no use if there is no ID.
      
      Skip that connect trace point if xdp_alloc is NULL.
      
      [ Toke Høiland-Jørgensen delivered the reasoning for skipping the trace
        point ]
      
      Fixes: 4a48ef70 ("xdp: Allow registering memory model without rxq reference")
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/YikmmXsffE+QajTB@linutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ice: Fix race condition during interface enslave · 5cb1ebdb
      Ivan Vecera authored
      
      Commit 5dbbbd01 ("ice: Avoid RTNL lock when re-creating
      auxiliary device") changed how the aux device is re-created, so that
      ice_plug_aux_dev() is now called from ice_service_task() context.
      Unfortunately, this opens a race window that can result in a deadlock
      when an interface leaves a LAG and immediately enters it again.
      
      Reproducer:
      ```
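      #!/bin/sh
      # Repeatedly enslave and release the interface to hit the race window.

      ip link add lag0 type bond mode 1 miimon 100
      ip link set lag0

      # seq(1) replaces the {1..10} brace expansion, which plain sh lacks.
      for n in $(seq 1 10); do
              echo "Cycle: $n"
              ip link set ens7f0 master lag0
              sleep 1
              ip link set ens7f0 nomaster
      done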
      ```
      
      This results in:
      [20976.208697] Workqueue: ice ice_service_task [ice]
      [20976.213422] Call Trace:
      [20976.215871]  __schedule+0x2d1/0x830
      [20976.219364]  schedule+0x35/0xa0
      [20976.222510]  schedule_preempt_disabled+0xa/0x10
      [20976.227043]  __mutex_lock.isra.7+0x310/0x420
      [20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
      [20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
      [20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
      [20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
      [20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
      [20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
      [20976.285691]  auxiliary_bus_probe+0x45/0x70
      [20976.289790]  really_probe+0x1f2/0x480
      [20976.298509]  driver_probe_device+0x49/0xc0
      [20976.302609]  bus_for_each_drv+0x79/0xc0
      [20976.306448]  __device_attach+0xdc/0x160
      [20976.310286]  bus_probe_device+0x9d/0xb0
      [20976.314128]  device_add+0x43c/0x890
      [20976.321287]  __auxiliary_device_add+0x43/0x60
      [20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
      [20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
      [20976.342591]  process_one_work+0x1a7/0x360
      [20976.350536]  worker_thread+0x30/0x390
      [20976.358128]  kthread+0x10a/0x120
      [20976.365547]  ret_from_fork+0x1f/0x40
      ...
      [20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627 flags:0x00004084
      [20976.446469] Call Trace:
      [20976.448921]  __schedule+0x2d1/0x830
      [20976.452414]  schedule+0x35/0xa0
      [20976.455559]  schedule_preempt_disabled+0xa/0x10
      [20976.460090]  __mutex_lock.isra.7+0x310/0x420
      [20976.464364]  device_del+0x36/0x3c0
      [20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
      [20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
      [20976.477288]  notifier_call_chain+0x47/0x70
      [20976.481386]  __netdev_upper_dev_link+0x18b/0x280
      [20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
      [20976.494475]  do_setlink+0x336/0xf50
      [20976.502517]  __rtnl_newlink+0x529/0x8b0
      [20976.543441]  rtnl_newlink+0x43/0x60
      [20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
      [20976.559238]  netlink_rcv_skb+0x4c/0x120
      [20976.563079]  netlink_unicast+0x196/0x230
      [20976.567005]  netlink_sendmsg+0x204/0x3d0
      [20976.570930]  sock_sendmsg+0x4c/0x50
      [20976.574423]  ____sys_sendmsg+0x1eb/0x250
      [20976.586807]  ___sys_sendmsg+0x7c/0xc0
      [20976.606353]  __sys_sendmsg+0x57/0xa0
      [20976.609930]  do_syscall_64+0x5b/0x1a0
      [20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      1. The command 'ip link ... set nomaster' causes ice_plug_aux_dev()
         to be called from ice_service_task() context; the aux device is
         created and the associated device->lock is taken.
      2. The command 'ip link ... set master...' calls ice's notifier under
         the RTNL lock, and that notifier calls ice_unplug_aux_dev(). That
         function tries to take the aux device->lock, but it is already
         held by ice_plug_aux_dev() from step 1.
      3. Later, ice_plug_aux_dev() tries to take the RTNL lock, but it is
         already held from step 2.
      4. Deadlock.
      
      The patch fixes this issue with the following changes:
      - The ICE_FLAG_PLUG_AUX_DEV bit stays set during the
        ice_plug_aux_dev() call in ice_service_task().
      - The bit is checked in ice_clear_rdma_cap(), and
        ice_unplug_aux_dev() is called only if it is not set. If it is set
        (in other words, plugging of the aux device was requested and
        ice_plug_aux_dev() is potentially running), the function only
        clears the bit.
      - Once the ice_plug_aux_dev() call in ice_service_task() has
        finished, the ICE_FLAG_PLUG_AUX_DEV bit is cleared, and it is also
        checked whether ice_clear_rdma_cap() had already cleared it. If
        so, the aux device is unplugged.
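
The flag handshake described in this list can be sketched with C11 atomics. This is a simplified, single-threaded model and not the ice driver: `struct demo_pf`, `demo_clear_cap()` and `demo_service_task()` are hypothetical, and the `during_plug` callback only simulates another context running while the plug is in progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_pf {
    atomic_bool plug_requested; /* models ICE_FLAG_PLUG_AUX_DEV */
    bool plugged;               /* is the aux device present? */
};

static void demo_unplug(struct demo_pf *pf)
{
    pf->plugged = false;
}

/* Models ice_clear_rdma_cap(): unplug only if no plug is pending. */
static void demo_clear_cap(struct demo_pf *pf)
{
    /* atomic_exchange() clears the flag and returns its old value. */
    if (!atomic_exchange(&pf->plug_requested, false))
        demo_unplug(pf);
}

/* Models the service task; during_plug simulates a concurrent caller. */
static void demo_service_task(struct demo_pf *pf,
                              void (*during_plug)(struct demo_pf *))
{
    assert(pf != NULL);
    if (!atomic_load(&pf->plug_requested))
        return;
    pf->plugged = true;         /* the plug itself; the flag stays set */
    if (during_plug)
        during_plug(pf);
    /* If the flag was cleared meanwhile, carry out the deferred unplug. */
    if (!atomic_exchange(&pf->plug_requested, false))
        demo_unplug(pf);
}
```

The point of the design is that the context holding the RTNL lock never has to wait for the device lock: it only clears a bit, and the service task performs the unplug on its behalf once the plug has finished.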
      
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Co-developed-by: Petr Oros <poros@redhat.com>
      Signed-off-by: Petr Oros <poros@redhat.com>
      Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
      Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  4. Mar 10, 2022