  1. Jan 31, 2023
    • ima: Align ima_file_mmap() parameters with mmap_file LSM hook · 4971c268
      Roberto Sassu authored
      
      Commit 98de59bf ("take calculation of final prot in
      security_mmap_file() into a helper") moved the code that updates prot to
      the protections the kernel actually applies into a new helper called
      mmap_prot().
      
      However, while without the helper ima_file_mmap() was getting the updated
      prot, with the helper ima_file_mmap() gets the original prot, which
      contains the protections requested by the application.
      
      A possible consequence of this change is that, if an application calls
      mmap() with only PROT_READ, and the kernel applies PROT_EXEC in addition,
      that application would have access to executable memory without having this
      event recorded in the IMA measurement list. This situation would occur for
      example if the application, before mmap(), calls the personality() system
      call with READ_IMPLIES_EXEC as the first argument.
      
      Align ima_file_mmap() parameters with those of the mmap_file LSM hook, so
      that IMA can receive both the requested prot and the final prot. Since the
      requested protections are stored in a new variable, and the final
      protections are stored in the existing variable, this effectively restores
      the original behavior of the MMAP_CHECK hook.
      
      Cc: stable@vger.kernel.org
      Fixes: 98de59bf ("take calculation of final prot in security_mmap_file() into a helper")
      Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
      Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
      Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
      4971c268
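The difference between the requested and the final protections can be sketched in userspace C. This is a simplified model, not the kernel code: `model_final_prot()` and `model_measures_mapping()` are hypothetical names, and the in-kernel helper has further conditions (for example noexec mounts) that are left out here.

```c
#include <stdbool.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/*
 * Model of the prot update: a task with READ_IMPLIES_EXEC that asks for
 * PROT_READ without PROT_EXEC ends up with PROT_EXEC as well.
 */
static unsigned long model_final_prot(unsigned long reqprot,
				      bool read_implies_exec)
{
	if ((reqprot & (PROT_READ | PROT_EXEC)) != PROT_READ)
		return reqprot;
	if (!read_implies_exec)
		return reqprot;
	return reqprot | PROT_EXEC;
}

/*
 * Hypothetical stand-in for a measurement hook that only cares about
 * executable mappings. The caller decides which prot value it is given.
 */
static bool model_measures_mapping(unsigned long prot)
{
	return (prot & PROT_EXEC) != 0;
}
```

With `reqprot = PROT_READ` and READ_IMPLIES_EXEC set, passing `reqprot` to `model_measures_mapping()` reports nothing to measure, while passing the result of `model_final_prot()` does. That is the gap the commit message describes.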
  2. Jan 19, 2023
    • fs: port acl to mnt_idmap · 700b7940
      Christian Brauner authored
      
      Convert to struct mnt_idmap.
      
      Last cycle we merged the necessary infrastructure in
      256c8aed ("fs: introduce dedicated idmap type for mounts").
      This is just the conversion to struct mnt_idmap.
      
      Currently we still pass around the plain namespace that was attached to a
      mount. This is generally convenient, but it makes it easy to conflate
      namespaces that are relevant at the filesystem level with namespaces
      that are relevant at the mount level. Especially for non-vfs developers
      without detailed knowledge of this area, this is a potential source of
      bugs.
      
      Once the conversion to struct mnt_idmap is done, all helpers down to the
      really low-level helpers will take a struct mnt_idmap argument instead of
      two namespace arguments. This makes it impossible to conflate the two,
      eliminating that class of bugs. All of the vfs and all filesystems
      only operate on struct mnt_idmap.
      
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      700b7940
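Why a dedicated type helps can be illustrated with a small userspace model. The struct and function names below are hypothetical stand-ins, not the kernel's definitions, and the "mapping" is reduced to a fixed offset purely for illustration.

```c
#include <stdint.h>

/* Simplified stand-ins; not the kernel's definitions. */
struct model_user_ns { uint32_t id_offset; };	/* a plain namespace */
struct model_mnt_idmap { uint32_t id_offset; };	/* a mount's idmapping */

/*
 * Old style: two arguments of the same type. Swapping them compiles
 * cleanly and silently produces a different result.
 */
static uint32_t old_map_id(const struct model_user_ns *mnt_userns,
			   const struct model_user_ns *fs_userns,
			   uint32_t id)
{
	return id - fs_userns->id_offset + mnt_userns->id_offset;
}

/*
 * New style: the mount side has its own type, so passing the arguments
 * in the wrong order is a pointer type mismatch the compiler can flag.
 */
static uint32_t new_map_id(const struct model_mnt_idmap *idmap,
			   const struct model_user_ns *fs_userns,
			   uint32_t id)
{
	return id - fs_userns->id_offset + idmap->id_offset;
}
```

In this sketch `old_map_id(&fs, &mnt, id)` is accepted even though the arguments are reversed, whereas the equivalent mistake with `new_map_id()` does not type-check.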
    • fs: port xattr to mnt_idmap · 39f60c1c
      Christian Brauner authored
      
      Convert to struct mnt_idmap.
      
      Last cycle we merged the necessary infrastructure in
      256c8aed ("fs: introduce dedicated idmap type for mounts").
      This is just the conversion to struct mnt_idmap.
      
      Currently we still pass around the plain namespace that was attached to a
      mount. This is generally convenient, but it makes it easy to conflate
      namespaces that are relevant at the filesystem level with namespaces
      that are relevant at the mount level. Especially for non-vfs developers
      without detailed knowledge of this area, this is a potential source of
      bugs.
      
      Once the conversion to struct mnt_idmap is done, all helpers down to the
      really low-level helpers will take a struct mnt_idmap argument instead of
      two namespace arguments. This makes it impossible to conflate the two,
      eliminating that class of bugs. All of the vfs and all filesystems
      only operate on struct mnt_idmap.
      
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      39f60c1c
    • fs: port ->permission() to pass mnt_idmap · 4609e1f1
      Christian Brauner authored
      
      Convert to struct mnt_idmap.
      
      Last cycle we merged the necessary infrastructure in
      256c8aed ("fs: introduce dedicated idmap type for mounts").
      This is just the conversion to struct mnt_idmap.
      
      Currently we still pass around the plain namespace that was attached to a
      mount. This is generally convenient, but it makes it easy to conflate
      namespaces that are relevant at the filesystem level with namespaces
      that are relevant at the mount level. Especially for non-vfs developers
      without detailed knowledge of this area, this is a potential source of
      bugs.
      
      Once the conversion to struct mnt_idmap is done, all helpers down to the
      really low-level helpers will take a struct mnt_idmap argument instead of
      two namespace arguments. This makes it impossible to conflate the two,
      eliminating that class of bugs. All of the vfs and all filesystems
      only operate on struct mnt_idmap.
      
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      4609e1f1
    • fs: port ->setattr() to pass mnt_idmap · c1632a0f
      Christian Brauner authored
      
      Convert to struct mnt_idmap.
      
      Last cycle we merged the necessary infrastructure in
      256c8aed ("fs: introduce dedicated idmap type for mounts").
      This is just the conversion to struct mnt_idmap.
      
      Currently we still pass around the plain namespace that was attached to a
      mount. This is generally convenient, but it makes it easy to conflate
      namespaces that are relevant at the filesystem level with namespaces
      that are relevant at the mount level. Especially for non-vfs developers
      without detailed knowledge of this area, this is a potential source of
      bugs.
      
      Once the conversion to struct mnt_idmap is done, all helpers down to the
      really low-level helpers will take a struct mnt_idmap argument instead of
      two namespace arguments. This makes it impossible to conflate the two,
      eliminating that class of bugs. All of the vfs and all filesystems
      only operate on struct mnt_idmap.
      
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      c1632a0f
  3. Nov 16, 2022
    • LSM: Better reporting of actual LSMs at boot · 86ef3c73
      Kees Cook authored
      
      Enhance the details reported by "lsm.debug" in several ways:
      
      - report contents of "security="
      - report contents of "CONFIG_LSM"
      - report contents of "lsm="
      - report any early LSM details
      - whitespace-align the output of similar phases for easier visual parsing
      - change "disabled" to more accurate "skipped"
      - explain what "skipped" and "ignored" mean in a parenthetical
      
      Upgrade the "security= is ignored" warning from pr_info to pr_warn,
      and include full arguments list to make the cause even more clear.
      
      Replace static "Security Framework initializing" pr_info with specific
      list of the resulting order of enabled LSMs.
      
      For example, if the kernel is built with:
      
      CONFIG_SECURITY_SELINUX=y
      CONFIG_SECURITY_APPARMOR=y
      CONFIG_SECURITY_LOADPIN=y
      CONFIG_SECURITY_YAMA=y
      CONFIG_SECURITY_SAFESETID=y
      CONFIG_SECURITY_LOCKDOWN_LSM=y
      CONFIG_SECURITY_LANDLOCK=y
      CONFIG_INTEGRITY=y
      CONFIG_BPF_LSM=y
      CONFIG_DEFAULT_SECURITY_APPARMOR=y
      CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,selinux,
                  smack,tomoyo,apparmor,bpf"
      
      Booting without options will show:
      
      LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
           safesetid,integrity,selinux,bpf
      landlock: Up and running.
      Yama: becoming mindful.
      LoadPin: ready to pin (currently not enforcing)
      SELinux:  Initializing.
      LSM support for eBPF active
      
      Boot with "lsm.debug" will show:
      
      LSM: legacy security= *unspecified*
      LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
                      selinux,smack,tomoyo,apparmor,bpf
      LSM: boot arg lsm= *unspecified*
      LSM:   early started: lockdown (enabled)
      LSM:   first ordered: capability (enabled)
      LSM: builtin ordered: landlock (enabled)
      LSM: builtin ignored: lockdown (not built into kernel)
      LSM: builtin ordered: yama (enabled)
      LSM: builtin ordered: loadpin (enabled)
      LSM: builtin ordered: safesetid (enabled)
      LSM: builtin ordered: integrity (enabled)
      LSM: builtin ordered: selinux (enabled)
      LSM: builtin ignored: smack (not built into kernel)
      LSM: builtin ignored: tomoyo (not built into kernel)
      LSM: builtin ordered: apparmor (enabled)
      LSM: builtin ordered: bpf (enabled)
      LSM: exclusive chosen:   selinux
      LSM: exclusive disabled: apparmor
      LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
                            safesetid,integrity,selinux,bpf
      LSM: cred blob size       = 32
      LSM: file blob size       = 16
      LSM: inode blob size      = 72
      LSM: ipc blob size        = 8
      LSM: msg_msg blob size    = 4
      LSM: superblock blob size = 80
      LSM: task blob size       = 8
      LSM: initializing capability
      LSM: initializing landlock
      landlock: Up and running.
      LSM: initializing yama
      Yama: becoming mindful.
      LSM: initializing loadpin
      LoadPin: ready to pin (currently not enforcing)
      LSM: initializing safesetid
      LSM: initializing integrity
      LSM: initializing selinux
      SELinux:  Initializing.
      LSM: initializing bpf
      LSM support for eBPF active
      
      And some examples of how the lsm.debug ordering report changes...
      
      With "lsm.debug security=selinux":
      
      LSM: legacy security=selinux
      LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
                      selinux,smack,tomoyo,apparmor,bpf
      LSM: boot arg lsm= *unspecified*
      LSM:   early started: lockdown (enabled)
      LSM:   first ordered: capability (enabled)
      LSM: security=selinux disabled: apparmor (only one legacy major LSM)
      LSM: builtin ordered: landlock (enabled)
      LSM: builtin ignored: lockdown (not built into kernel)
      LSM: builtin ordered: yama (enabled)
      LSM: builtin ordered: loadpin (enabled)
      LSM: builtin ordered: safesetid (enabled)
      LSM: builtin ordered: integrity (enabled)
      LSM: builtin ordered: selinux (enabled)
      LSM: builtin ignored: smack (not built into kernel)
      LSM: builtin ignored: tomoyo (not built into kernel)
      LSM: builtin ordered: apparmor (disabled)
      LSM: builtin ordered: bpf (enabled)
      LSM: exclusive chosen:   selinux
      LSM: initializing lsm=lockdown,capability,landlock,yama,loadpin,
                            safesetid,integrity,selinux,bpf
      
      With "lsm.debug lsm=integrity,selinux,loadpin,crabability,bpf,
                          loadpin,loadpin":
      
      LSM: legacy security= *unspecified*
      LSM: CONFIG_LSM=landlock,lockdown,yama,loadpin,safesetid,integrity,
                      selinux,smack,tomoyo,apparmor,bpf
      LSM: boot arg lsm=integrity,selinux,loadpin,crabability,bpf,loadpin,
                        loadpin
      LSM:   early started: lockdown (enabled)
      LSM:   first ordered: capability (enabled)
      LSM: cmdline ordered: integrity (enabled)
      LSM: cmdline ordered: selinux (enabled)
      LSM: cmdline ordered: loadpin (enabled)
      LSM: cmdline ignored: crabability (not built into kernel)
      LSM: cmdline ordered: bpf (enabled)
      LSM: cmdline skipped: apparmor (not in requested order)
      LSM: cmdline skipped: yama (not in requested order)
      LSM: cmdline skipped: safesetid (not in requested order)
      LSM: cmdline skipped: landlock (not in requested order)
      LSM: exclusive chosen:   selinux
      LSM: initializing lsm=lockdown,capability,integrity,selinux,loadpin,bpf
      
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: linux-security-module@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Acked-by: Mickaël Salaün <mic@digikod.net>
      [PM: line wrapped commit description]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      86ef3c73
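The "cmdline ordered / ignored" part of the report above can be approximated in userspace C. This is a sketch under stated assumptions: the builtin table and all `model_*` names are hypothetical, and exclusive-LSM selection as well as the always-first `lockdown` and `capability` entries are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical set of LSMs "built into" this model kernel. */
static const char *const model_builtin[] = {
	"landlock", "yama", "loadpin", "safesetid",
	"integrity", "selinux", "apparmor", "bpf",
};

static bool model_is_builtin(const char *name, size_t len)
{
	size_t n = sizeof(model_builtin) / sizeof(model_builtin[0]);

	for (size_t i = 0; i < n; i++)
		if (strlen(model_builtin[i]) == len &&
		    memcmp(model_builtin[i], name, len) == 0)
			return true;
	return false;
}

/* True if the comma-separated list in out already holds name. */
static bool model_already_listed(const char *out, const char *name, size_t len)
{
	for (const char *p = out; *p; ) {
		size_t n = strcspn(p, ",");

		if (n == len && memcmp(p, name, len) == 0)
			return true;
		p += n;
		if (*p == ',')
			p++;
	}
	return false;
}

/*
 * Walk cmdline left to right and append each known, not-yet-seen name
 * to out. Unknown names are ignored and repeats are dropped.
 * Returns the number of names written, or -1 if out is too small.
 */
static int model_order_lsms(const char *cmdline, char *out, size_t outlen)
{
	size_t used = 0;
	int count = 0;

	if (outlen == 0)
		return -1;
	out[0] = '\0';

	for (const char *p = cmdline; *p; ) {
		size_t n = strcspn(p, ",");

		if (n > 0 && model_is_builtin(p, n) &&
		    !model_already_listed(out, p, n)) {
			size_t need = n + (used ? 1 : 0);

			if (used + need + 1 > outlen)
				return -1;
			if (used)
				out[used++] = ',';
			memcpy(out + used, p, n);
			used += n;
			out[used] = '\0';
			count++;
		}
		p += n;
		if (*p == ',')
			p++;
	}
	return count;
}
```

Fed the `lsm=` string from the last example, the model keeps `integrity,selinux,loadpin,bpf`, drops the misspelled `crabability`, and ignores the repeated `loadpin` entries.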
  4. Nov 05, 2022
    • lsm: make security_socket_getpeersec_stream() sockptr_t safe · b10b9c34
      Paul Moore authored
      
      Commit 4ff09db1 ("bpf: net: Change sk_getsockopt() to take the
      sockptr_t argument") made it possible to call sk_getsockopt()
      with both user and kernel address space buffers through the use of
      the sockptr_t type.  Unfortunately at the time of conversion the
      security_socket_getpeersec_stream() LSM hook was written to only
      accept userspace buffers, and in a desire to avoid having to change
      the LSM hook the commit author simply passed the sockptr_t's
      userspace buffer pointer.  Since the only sk_getsockopt() callers
      at the time of conversion which used kernel sockptr_t buffers did
      not allow SO_PEERSEC, and hence the
      security_socket_getpeersec_stream() hook, this was acceptable but
      also very fragile as future changes presented the possibility of
      silently passing kernel space pointers to the LSM hook.
      
      There are several ways to protect against this, including careful
      code review of future commits, but since relying on code review to
      catch bugs is a recipe for disaster and the upstream eBPF maintainer
      is "strongly against defensive programming", this patch updates the
      LSM hook, and all of the implementations to support sockptr_t and
      safely handle both user and kernel space buffers.
      
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Acked-by: John Johansen <john.johansen@canonical.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      b10b9c34
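The idea of a pointer that carries its own address-space tag can be modeled in userspace as follows. Everything here is a hypothetical stand-in: `model_sockptr_t` only mirrors the concept, and `model_copy_to_user()` merely simulates a checked copy that may fail.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Tagged pointer: the flag records which address space ptr refers to. */
typedef struct {
	void *ptr;
	bool is_kernel;
} model_sockptr_t;

static int model_user_copies;	/* counts trips through the "user" path */

/* Stand-in for a checked copy to user space; returns 0 on success. */
static int model_copy_to_user(void *dst, const void *src, size_t len)
{
	if (dst == NULL)
		return -EFAULT;	/* models a faulting user address */
	model_user_copies++;
	memcpy(dst, src, len);
	return 0;
}

/* Pick the copy routine from the tag so callers cannot mix the two up. */
static int model_copy_to_sockptr(model_sockptr_t dst, const void *src,
				 size_t len)
{
	if (!dst.is_kernel)
		return model_copy_to_user(dst.ptr, src, len);
	memcpy(dst.ptr, src, len);
	return 0;
}
```

A hook that accepts the tagged type and copies through one helper like this handles both buffer kinds, instead of assuming every caller hands it a userspace pointer.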
  5. Oct 20, 2022
    • integrity: implement get and set acl hook · e61b135f
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      So far posix acls were passed as a void blob to the security and
      integrity modules. Some of them like evm then proceed to interpret the
      void pointer and convert it into the kernel internal struct posix acl
      representation to perform their integrity checking magic. This is
      obviously pretty problematic as that requires knowledge that only the
      vfs is guaranteed to have and has led to various bugs. Add a proper
      security hook for setting posix acls and pass down the posix acls in
      their appropriate vfs format instead of hacking it through a void
      pointer stored in the uapi format.
      
      I spent considerable time in the security module and integrity
      infrastructure and audited all codepaths. EVM is the only part that
      really has restrictions based on the actual posix acl values passed
      through it (e.g., i_mode). Before this dedicated hook EVM used to translate
      from the uapi posix acl format sent to it in the form of a void pointer
      into the vfs format. This is not a good thing. Instead of hacking around in
      the uapi struct, give EVM the posix acls in the appropriate vfs format and
      perform sane permission checks that mirror what it used to do in the
      generic xattr hook.
      
      IMA doesn't have any restrictions on posix acls. When posix acls are
      changed it just wants to update its appraisal status to trigger an EVM
      revalidation.
      
      The removal of posix acls is equivalent to passing NULL to the posix set
      acl hooks. This is the same as before through the generic xattr api.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Acked-by: Paul Moore <paul@paul-moore.com> (LSM)
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      e61b135f
    • security: add get, remove and set acl hook · 72b3897e
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      So far posix acls were passed as a void blob to the security and
      integrity modules. Some of them like evm then proceed to interpret the
      void pointer and convert it into the kernel internal struct posix acl
      representation to perform their integrity checking magic. This is
      obviously pretty problematic as that requires knowledge that only the
      vfs is guaranteed to have and has led to various bugs. Add a proper
      security hook for setting posix acls and pass down the posix acls in
      their appropriate vfs format instead of hacking it through a void
      pointer stored in the uapi format.
      
      In the next patches we implement the hooks for the few security modules
      that do actually have restrictions on posix acls.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      72b3897e
  6. Oct 19, 2022
  7. Sep 28, 2022
  8. Sep 01, 2022
  9. Aug 26, 2022
  10. Aug 16, 2022
    • security, lsm: Introduce security_create_user_ns() · 7cd4c5c2
      Frederick Lawler authored
      User namespaces are an effective tool for allowing programs to run with
      privileges without requiring them to run as root. User
      namespaces may also be used as a sandboxing technique. However, attackers
      sometimes leverage user namespaces as an initial attack vector to perform
      some exploit. [1,2,3]
      
      While it is not the unprivileged user namespace functionality itself
      that makes the kernel exploitable, users/administrators might want to
      limit more granularly, or at least monitor, how various processes use
      this functionality while vulnerable kernel subsystems are being patched.
      
      Preventing user namespace creation already comes in a few forms, in
      order of granularity:
      
              1. /proc/sys/user/max_user_namespaces sysctl
              2. Distro specific patch(es)
              3. CONFIG_USER_NS
      
      To block a task based on its attributes, the LSM hook cred_prepare is a
      decent candidate for use because it provides more granular control, and
      it is called before create_user_ns():
      
              cred = prepare_creds()
                      security_prepare_creds()
                              call_int_hook(cred_prepare, ...
              if (cred)
                      create_user_ns(cred)
      
      Since security_prepare_creds() is meant for LSMs to copy and prepare
      credentials, access control is an unintended use of the hook. [4]
      Further, security_prepare_creds() will always return an ENOMEM error if
      the hook returns any non-zero error code.
      
      This hook also does not handle the clone3 case, which requires us to
      access a user space pointer to know if we're in the CLONE_NEWUSER
      call path, which may be subject to a TOCTTOU attack.
      
      Lastly, cred_prepare is called in many call paths, and a targeted hook
      further limits the frequency of calls which is a beneficial outcome.
      Therefore introduce a new function security_create_user_ns() with an
      accompanying userns_create LSM hook.
      
      With the new userns_create hook, users will have better observability of
      and access control over user namespace creation. Users
      should expect that normal operation of user namespaces will behave as
      usual, and only be impacted when controls are implemented by users or
      administrators.
      
      This hook takes the prepared creds for LSM authors to write policy
      against. On success, the new namespace is applied to credentials,
      otherwise an error is returned.
      
      Links:
      1. https://nvd.nist.gov/vuln/detail/CVE-2022-0492
      2. https://nvd.nist.gov/vuln/detail/CVE-2022-25636
      3. https://nvd.nist.gov/vuln/detail/CVE-2022-34918
      4. https://lore.kernel.org/all/1c4b1c0d-12f6-6e9e-a6a3-cdce7418110c@schaufler-ca.com/
      
      
      
      Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Reviewed-by: KP Singh <kpsingh@kernel.org>
      Signed-off-by: Frederick Lawler <fred@cloudflare.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      7cd4c5c2
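How a targeted hook lets a policy's own error code reach the caller can be sketched in userspace C. All `model_*` names and the uid-based policy are assumptions made for illustration; the hook list is modeled as a plain array of function pointers.

```c
#include <errno.h>
#include <stddef.h>

struct model_cred { unsigned int uid; };	/* stand-in for prepared creds */

typedef int (*model_userns_create_hook)(const struct model_cred *cred);

/* Hypothetical policy: only uid 0 may create user namespaces. */
static int model_deny_non_root(const struct model_cred *cred)
{
	return cred->uid == 0 ? 0 : -EPERM;
}

static int model_allow_all(const struct model_cred *cred)
{
	(void)cred;
	return 0;
}

/* Call each registered hook in order and stop at the first error. */
static int model_security_create_user_ns(const model_userns_create_hook *hooks,
					 size_t nhooks,
					 const struct model_cred *cred)
{
	for (size_t i = 0; i < nhooks; i++) {
		int rc = hooks[i](cred);

		if (rc != 0)
			return rc;
	}
	return 0;
}

/* Caller side: the namespace is only set up if every hook agreed. */
static int model_create_user_ns(const model_userns_create_hook *hooks,
				size_t nhooks, const struct model_cred *cred)
{
	int rc = model_security_create_user_ns(hooks, nhooks, cred);

	if (rc < 0)
		return rc;	/* the hook's own error code is preserved */
	return 0;		/* namespace setup would continue here */
}
```

In this model a denial surfaces as `-EPERM` rather than being collapsed into a generic allocation failure, and with no hooks registered the call behaves as it did before.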
  11. Jul 15, 2022
  12. Jun 26, 2022
    • security: pass down mount idmapping to setattr hook · 0e363cf3
      Christian Brauner authored
      Before this change we used to take a shortcut and place the actual
      values that would be written to inode->i_{g,u}id into struct iattr. This
      had the advantage that we moved idmappings mostly out of the picture
      early on but it made reasoning about changes more difficult than it
      should be.
      
      The filesystem was never explicitly told that it dealt with an idmapped
      mount. The transition to the value that needed to be stored in
      inode->i_{g,u}id appeared way too early and increased the probability of
      bugs in various codepaths.
      
      We now place the same value in struct iattr no matter if this is an
      idmapped mount or not. The vfs will only deal with type safe
      vfs{g,u}id_t. This makes it massively safer to perform permission checks
      as the type will tell us what checks we need to perform and what helpers
      we need to use.
      
      Adapt the security_inode_setattr() helper to pass down the mount's
      idmapping to account for that change.
      
      Link: https://lore.kernel.org/r/20220621141454.2914719-8-brauner@kernel.org
      
      
      Cc: Seth Forshee <sforshee@digitalocean.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Aleksa Sarai <cyphar@cyphar.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      0e363cf3
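The type-safety argument can be illustrated with distinct wrapper structs. This is a userspace model only: the `model_*` types and helpers are hypothetical, and the mount's idmapping is reduced to a fixed offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same representation, but not interchangeable as far as the compiler goes. */
typedef struct { uint32_t val; } model_kuid_t;	 /* value stored in the inode */
typedef struct { uint32_t val; } model_vfsuid_t; /* value seen through the mount */

/* Hypothetical idmapping: the mount shifts IDs by a fixed offset. */
static model_vfsuid_t model_make_vfsuid(uint32_t mnt_offset, model_kuid_t kuid)
{
	return (model_vfsuid_t){ kuid.val + mnt_offset };
}

static model_kuid_t model_from_vfsuid(uint32_t mnt_offset,
				      model_vfsuid_t vfsuid)
{
	return (model_kuid_t){ vfsuid.val - mnt_offset };
}

/* The only sanctioned way to compare across the two types. */
static bool model_vfsuid_eq_kuid(model_vfsuid_t vfsuid, model_kuid_t kuid)
{
	return vfsuid.val == kuid.val;
}
```

Because both types are structs, writing `vfsuid == kuid` directly does not compile in C, so every comparison has to go through a helper that states which kind of value it expects.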
  13. May 24, 2022
  14. May 23, 2022
  15. May 13, 2022
  16. Apr 13, 2022
  17. Feb 15, 2022
  18. Jan 28, 2022
    • security, lsm: dentry_init_security() Handle multi LSM registration · 7f5056b9
      Vivek Goyal authored
      
      A ceph user has reported that ceph is crashing with a kernel NULL pointer
      dereference. The backtrace follows.
      
      /proc/version: Linux version 5.16.2-arch1-1 (linux@archlinux) (gcc (GCC)
      11.1.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP PREEMPT Thu, 20 Jan 2022
      16:18:29 +0000
      distro / arch: Arch Linux / x86_64
      SELinux is not enabled
      ceph cluster version: 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503)
      
      relevant dmesg output:
      [   30.947129] BUG: kernel NULL pointer dereference, address:
      0000000000000000
      [   30.947206] #PF: supervisor read access in kernel mode
      [   30.947258] #PF: error_code(0x0000) - not-present page
      [   30.947310] PGD 0 P4D 0
      [   30.947342] Oops: 0000 [#1] PREEMPT SMP PTI
      [   30.947388] CPU: 5 PID: 778 Comm: touch Not tainted 5.16.2-arch1-1 #1
      86fbf2c313cc37a553d65deb81d98e9dcc2a3659
      [   30.947486] Hardware name: Gigabyte Technology Co., Ltd. B365M
      DS3H/B365M DS3H, BIOS F5 08/13/2019
      [   30.947569] RIP: 0010:strlen+0x0/0x20
      [   30.947616] Code: b6 07 38 d0 74 16 48 83 c7 01 84 c0 74 05 48 39 f7 75
      ec 31 c0 31 d2 89 d6 89 d7 c3 48 89 f8 31 d2 89 d6 89 d7 c3 0
      f 1f 40 00 <80> 3f 00 74 12 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31
      ff
      [   30.947782] RSP: 0018:ffffa4ed80ffbbb8 EFLAGS: 00010246
      [   30.947836] RAX: 0000000000000000 RBX: ffffa4ed80ffbc60 RCX:
      0000000000000000
      [   30.947904] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
      0000000000000000
      [   30.947971] RBP: ffff94b0d15c0ae0 R08: 0000000000000000 R09:
      0000000000000000
      [   30.948040] R10: 0000000000000000 R11: 0000000000000000 R12:
      0000000000000000
      [   30.948106] R13: 0000000000000001 R14: ffffa4ed80ffbc60 R15:
      0000000000000000
      [   30.948174] FS:  00007fc7520f0740(0000) GS:ffff94b7ced40000(0000)
      knlGS:0000000000000000
      [   30.948252] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   30.948308] CR2: 0000000000000000 CR3: 0000000104a40001 CR4:
      00000000003706e0
      [   30.948376] Call Trace:
      [   30.948404]  <TASK>
      [   30.948431]  ceph_security_init_secctx+0x7b/0x240 [ceph
      49f9c4b9bf5be8760f19f1747e26da33920bce4b]
      [   30.948582]  ceph_atomic_open+0x51e/0x8a0 [ceph
      49f9c4b9bf5be8760f19f1747e26da33920bce4b]
      [   30.948708]  ? get_cached_acl+0x4d/0xa0
      [   30.948759]  path_openat+0x60d/0x1030
      [   30.948809]  do_filp_open+0xa5/0x150
      [   30.948859]  do_sys_openat2+0xc4/0x190
      [   30.948904]  __x64_sys_openat+0x53/0xa0
      [   30.948948]  do_syscall_64+0x5c/0x90
      [   30.948989]  ? exc_page_fault+0x72/0x180
      [   30.949034]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [   30.949091] RIP: 0033:0x7fc7521e25bb
      [   30.950849] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00
      00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 0
      0 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 54 24 28 64 48 2b 14
      25
      
      The core of the problem is that ceph checks the return code from
      security_dentry_init_security() and, if the return code is 0, assumes
      everything is fine and continues to call strlen(name), which crashes.
      
      Typically the SELinux LSM returns 0 and sets name to "security.selinux",
      which is not a problem. If SELinux is not compiled in or is disabled, the
      hook returns -EOPNOTSUPP and ceph deals with it.
      
      But somehow in this configuration, 0 is being returned and "name" is
      not being initialized, and that creates the problem.
      
      Our suspicion is that the BPF LSM is registering a hook for
      dentry_init_security() and returns the hook default of 0.
      
      LSM_HOOK(int, 0, dentry_init_security, struct dentry *dentry,...)
      
      I have not been able to reproduce it just by doing CONFIG_BPF_LSM=y.
      Stephen has tested the patch though and confirms it solves the problem
      for him.
      
      dentry_init_security() is written in such a way that it expects only one
      LSM to register the hook. At least that's the expectation with the current
      code.
      
      If another LSM registers a hook and returns the default, the call will
      simply return 0 as of now, and that will break ceph.
      
      Hence, the suggestion is to change the semantics of this hook a bit. If
      there are no LSMs, or no LSM takes ownership and initializes the security
      context, then return -EOPNOTSUPP. Also allow at most one LSM to initialize
      the security context. This hook can't deal with multiple LSMs trying to
      init the security context. This patch implements this new behavior.
      
      Reported-by: Stephen Muth <smuth4@gmail.com>
      Tested-by: Stephen Muth <smuth4@gmail.com>
      Suggested-by: Casey Schaufler <casey@schaufler-ca.com>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Reviewed-by: Serge Hallyn <serge@hallyn.com>
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: <stable@vger.kernel.org> # 5.16.0
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Acked-by: Christian Brauner <brauner@kernel.org>
      Signed-off-by: James Morris <jmorris@namei.org>
      7f5056b9
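The proposed semantics (a default of "not supported", and the first LSM that takes ownership wins) can be sketched in userspace C. The `model_*` names are hypothetical and the hook list is modeled as a plain array; this is an illustration of the rule described in the message, not the patch itself.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_HOOK_DEFAULT (-EOPNOTSUPP)

typedef int (*model_dentry_init_hook)(const char **xattr_name);

/* An LSM that provides a context and names its xattr. */
static int model_selinux_like(const char **xattr_name)
{
	*xattr_name = "security.selinux";
	return 0;
}

/* An LSM that registers the hook but takes no ownership. */
static int model_passive(const char **xattr_name)
{
	(void)xattr_name;
	return MODEL_HOOK_DEFAULT;
}

static int model_dentry_init_security(const model_dentry_init_hook *hooks,
				      size_t nhooks, const char **xattr_name)
{
	for (size_t i = 0; i < nhooks; i++) {
		int rc = hooks[i](xattr_name);

		if (rc != MODEL_HOOK_DEFAULT)
			return rc;	/* first LSM to take ownership wins */
	}
	return MODEL_HOOK_DEFAULT;	/* nobody initialized a context */
}
```

With only a passive hook registered, the caller now sees `-EOPNOTSUPP` and never reaches the point of calling `strlen()` on an uninitialized name.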
    • LSM: general protection fault in legacy_parse_param · ecff3057
      Casey Schaufler authored
      
      The usual LSM hook "bail on fail" scheme doesn't work for cases where
      a security module may return an error code indicating that it does not
      recognize an input.  In this particular case Smack sees a mount option
      that it recognizes, and returns 0. A call to a BPF hook follows, which
      returns -ENOPARAM, which confuses the caller because Smack has processed
      its data.
      
      The SELinux hook incorrectly returns 1 on success. There was a time
      when this was correct, however the current expectation is that it
      return 0 on success. This is repaired.
      
      Reported-by: <syzbot+d1e3b1d92d25abf97943@syzkaller.appspotmail.com>
      Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
      Acked-by: James Morris <jamorris@linux.microsoft.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      ecff3057
  19. Dec 06, 2021
  20. Nov 22, 2021
  21. Nov 14, 2021
    • net,lsm,selinux: revert the security_sctp_assoc_established() hook · 1aa3b220
      Paul Moore authored
      
      This patch reverts two prior patches, e7310c94
      ("security: implement sctp_assoc_established hook in selinux") and
      7c2ef024 ("security: add sctp_assoc_established hook"), which
      create the security_sctp_assoc_established() LSM hook and provide a
      SELinux implementation.  Unfortunately these two patches were merged
      without proper review (the Reviewed-by and Tested-by tags from
      Richard Haines were for previous revisions of these patches that
      were significantly different) and there are outstanding objections
      from the SELinux maintainers regarding these patches.
      
      Work is currently ongoing to correct the problems identified in the
      reverted patches, as well as others that have come up during review,
      but it is unclear at this point in time when that work will be ready
      for inclusion in the mainline kernel.  In the interest of not keeping
      objectionable code in the kernel for multiple weeks, and potentially
      a kernel release, we are reverting the two problematic patches.
      
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1aa3b220
  22. Nov 12, 2021
    • net,lsm,selinux: revert the security_sctp_assoc_established() hook · 32a370ab
      Paul Moore authored
      
      This patch reverts two prior patches, e7310c94
      ("security: implement sctp_assoc_established hook in selinux") and
      7c2ef024 ("security: add sctp_assoc_established hook"), which
      create the security_sctp_assoc_established() LSM hook and provide a
      SELinux implementation.  Unfortunately these two patches were merged
      without proper review (the Reviewed-by and Tested-by tags from
      Richard Haines were for previous revisions of these patches that
      were significantly different) and there are outstanding objections
      from the SELinux maintainers regarding these patches.
      
      Work is currently ongoing to correct the problems identified in the
      reverted patches, as well as others that have come up during review,
      but it is unclear at this point in time when that work will be ready
      for inclusion in the mainline kernel.  In the interest of not keeping
      objectionable code in the kernel for multiple weeks, and potentially
      a kernel release, we are reverting the two problematic patches.
      
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      32a370ab
  23. Nov 03, 2021
  24. Oct 20, 2021
  25. Oct 15, 2021
  26. Oct 14, 2021
    • LSM: Avoid warnings about potentially unused hook variables · 86dd9fd5
      Kees Cook authored
      
      Building with W=1 shows many unused const variable warnings. These can
      be silenced, as we're well aware of their being potentially unused:
      
      ./include/linux/lsm_hook_defs.h:36:18: error: 'ptrace_access_check_default' defined but not used [-Werror=unused-const-variable=]
         36 | LSM_HOOK(int, 0, ptrace_access_check, struct task_struct *child,
            |                  ^~~~~~~~~~~~~~~~~~~
      security/security.c:706:32: note: in definition of macro 'LSM_RET_DEFAULT'
        706 | #define LSM_RET_DEFAULT(NAME) (NAME##_default)
            |                                ^~~~
      security/security.c:711:9: note: in expansion of macro 'DECLARE_LSM_RET_DEFAULT_int'
        711 |         DECLARE_LSM_RET_DEFAULT_##RET(DEFAULT, NAME)
            |         ^~~~~~~~~~~~~~~~~~~~~~~~
      ./include/linux/lsm_hook_defs.h:36:1: note: in expansion of macro 'LSM_HOOK'
         36 | LSM_HOOK(int, 0, ptrace_access_check, struct task_struct *child,
            | ^~~~~~~~
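
      A minimal userspace sketch of how such a warning can be silenced,
      assuming GCC or Clang: the declaration macro marks each generated
      constant as possibly unused. MAYBE_UNUSED, RET_DEFAULT,
      DECLARE_RET_DEFAULT_int and the two hook names are illustrative
      stand-ins, not the kernel's definitions.

```c
/* Userspace stand-in for an "unused is fine" annotation (GCC/Clang). */
#define MAYBE_UNUSED __attribute__((unused))

#define RET_DEFAULT(NAME) NAME##_default

/* Declare a per-hook default; the attribute keeps W=1 style builds quiet. */
#define DECLARE_RET_DEFAULT_int(DEFAULT, NAME) \
	static const int MAYBE_UNUSED RET_DEFAULT(NAME) = (DEFAULT);

/* Hypothetical hooks: one default is read below, the other never is. */
DECLARE_RET_DEFAULT_int(0, hook_used)
DECLARE_RET_DEFAULT_int(-1, hook_never_referenced)

/* Only this default is referenced; the other one would otherwise warn. */
int hook_used_result(void)
{
	return RET_DEFAULT(hook_used);
}
```

      Without the attribute, building this with -Wunused-const-variable
      would flag hook_never_referenced_default in the same way as the
      log above.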
      
      Cc: James Morris <jmorris@namei.org>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Cc: KP Singh <kpsingh@chromium.org>
      Cc: linux-security-module@vger.kernel.org
      Reported-by: kernel test robot <lkp@intel.com>
      Link: https://lore.kernel.org/linux-mm/202110131608.zms53FPR-lkp@intel.com/
      
      
      Fixes: 98e828a0 ("security: Refactor declaration of LSM hooks")
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: James Morris <jamorris@linux.microsoft.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      86dd9fd5
  27. Sep 20, 2021
    • lsm,io_uring: add LSM hooks to io_uring · cdc1404a
      Paul Moore authored
      
      A full explanation of io_uring is beyond the scope of this commit
      description, but in summary it is an asynchronous I/O mechanism
      which allows for I/O requests and the resulting data to be queued
      in memory mapped "rings" which are shared between the kernel and
      userspace.  Optionally, io_uring offers the ability for applications
      to spawn kernel threads to dequeue I/O requests from the ring and
      submit the requests in the kernel, helping to minimize the syscall
      overhead.  Rings are accessed in userspace by memory mapping a file
      descriptor provided by io_uring_setup(2), and can be shared
      between applications as one might do with any open file descriptor.
      Finally, process credentials can be registered with a given ring
      and any process with access to that ring can submit I/O requests
      using any of the registered credentials.
      
      While the io_uring functionality is widely recognized as offering a
      vastly improved and high-performing asynchronous I/O mechanism, its
      ability to allow processes to submit I/O requests with credentials
      other than their own presents a challenge to LSMs.  When a process
      creates a new io_uring ring the ring's credentials are inherited
      from the calling process; if this ring is shared with another
      process operating with different credentials there is the potential
      to bypass the LSM's security policy.  Similarly, registering
      credentials with a given ring allows any process with access to that
      ring to submit I/O requests with those credentials.
      
      In an effort to allow LSMs to apply security policy to io_uring I/O
      operations, this patch adds two new LSM hooks.  These hooks, in
      conjunction with the LSM anonymous inode support previously
      submitted, allow an LSM to apply access control policy to the
      sharing of io_uring rings as well as any io_uring credential changes
      requested by a process.
      
      The new LSM hooks are described below:
      
       * int security_uring_override_creds(cred)
         Controls if the current task, executing an io_uring operation,
         is allowed to override its credentials with @cred.  In cases
         where the current task is a user application, the current
         credentials will be those of the user application.  In cases
         where the current task is a kernel thread servicing io_uring
         requests the current credentials will be those of the io_uring
         ring (inherited from the process that created the ring).
      
       * int security_uring_sqpoll(void)
         Controls if the current task is allowed to create an io_uring
         polling thread (IORING_SETUP_SQPOLL).  Without a SQPOLL thread
         in the kernel, processes must submit I/O requests via
         io_uring_enter(2), which allows us to compare any requested
         credential changes against the application making the request.
         With a SQPOLL thread, we can no longer compare requested
         credential changes against the application making the request;
         instead, the comparison is made against the ring's credentials.
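
      The difference between the two cases can be sketched with a toy
      userspace model. Every type and function here is hypothetical
      (cred_model, ring_model, task_model, the "same uid" policy); it only
      illustrates which credentials an override request ends up being
      compared against, and is not how any real LSM implements the hooks.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kernel objects. */
struct cred_model {
	int uid;
};

struct ring_model {
	struct cred_model creator;	/* inherited from the ring's creator */
};

struct task_model {
	bool is_sqpoll_kthread;		/* true for the in-kernel polling thread */
	struct cred_model own;		/* credentials of a user application */
};

/* Which credentials count as "current" when an override is requested. */
static const struct cred_model *current_cred_model(const struct task_model *task,
						    const struct ring_model *ring)
{
	return task->is_sqpoll_kthread ? &ring->creator : &task->own;
}

/* Toy policy: only allow an override that keeps the same uid. */
static int toy_uring_override_creds(const struct cred_model *cur,
				    const struct cred_model *requested)
{
	return cur->uid == requested->uid ? 0 : -EACCES;
}

/* Apply the toy policy to whichever credentials are "current". */
static int check_override(const struct task_model *task,
			  const struct ring_model *ring,
			  const struct cred_model *requested)
{
	return toy_uring_override_creds(current_cred_model(task, ring), requested);
}
```

      In this model, an application with uid 1000 asking for uid 0 is
      refused when it submits the request itself, but the same request
      is allowed when a polling thread services a ring created by uid 0.
      That gap is why creating the polling thread gets its own hook.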
      
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      cdc1404a
  28. Aug 10, 2021
    • bpf: Add lockdown check for probe_write_user helper · 51e1bb9e
      Daniel Borkmann authored
      
      Back then, commit 96ae5227 ("bpf: Add bpf_probe_write_user BPF helper
      to be called in tracers") added the bpf_probe_write_user() helper in order
      to allow overriding user space memory. Its original goal was to have a
      facility to "debug, divert, and manipulate execution of semi-cooperative
      processes" under CAP_SYS_ADMIN. Writing to kernel memory was explicitly
      disallowed since it would otherwise tamper with the kernel's integrity.
      
      One use case was shown in cf9b1199 ("samples/bpf: Add test/example of
      using bpf_probe_write_user bpf helper") where the program DNATs traffic
      at the time of connect(2) syscall, meaning, it rewrites the arguments to
      a syscall while they're still in userspace, and before the syscall has a
      chance to copy the argument into kernel space. These days we have better
      mechanisms in BPF for achieving the same (e.g. for load-balancers), but
      without having to write to userspace memory.
      
      Of course the bpf_probe_write_user() helper can also be used for many
      other things, for both good and bad purposes. Outside of BPF, there is
      a similar mechanism in ptrace(2) such as PTRACE_PEEK{TEXT,DATA} and
      PTRACE_POKE{TEXT,DATA}, but using it would likely require some more effort.
      Commit 96ae5227 explicitly dedicated the helper to experimentation
      purposes only. Thus, move the helper's availability behind a newly added
      LOCKDOWN_BPF_WRITE_USER lockdown knob so that the helper is disabled under
      the "integrity" mode. More fine-grained control can also be implemented
      from the LSM side with this change.
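
      The gating idea can be sketched in userspace as follows. The enum,
      its ordering, and the function names are simplified assumptions
      made for illustration; only the general shape (a helper lookup that
      returns nothing once the lockdown level reaches the relevant knob)
      is taken from the description above.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified, hypothetical ordering of lockdown reasons and levels. */
enum lockdown_model {
	MODEL_LOCKDOWN_NONE,
	MODEL_LOCKDOWN_BPF_WRITE_USER,		/* the new knob */
	MODEL_LOCKDOWN_INTEGRITY_MAX,		/* "integrity" mode */
	MODEL_LOCKDOWN_CONFIDENTIALITY_MAX,	/* "confidentiality" mode */
};

struct helper_proto_model {
	const char *name;
};

static const struct helper_proto_model probe_write_user_proto = {
	"bpf_probe_write_user",
};

/* A feature is locked down once the active level reaches its knob. */
static int model_locked_down(enum lockdown_model level,
			     enum lockdown_model what)
{
	return level >= what ? -EPERM : 0;
}

/* Hand out the helper only while the knob is not locked down. */
static const struct helper_proto_model *
get_probe_write_user(enum lockdown_model level)
{
	if (model_locked_down(level, MODEL_LOCKDOWN_BPF_WRITE_USER) < 0)
		return NULL;
	return &probe_write_user_proto;
}
```

      Because the knob sits below the integrity level in this model, the
      helper is unavailable in both "integrity" and "confidentiality"
      modes and available only when no lockdown is active.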
      
      Fixes: 96ae5227 ("bpf: Add bpf_probe_write_user BPF helper to be called in tracers")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      51e1bb9e
  29. Aug 09, 2021
  30. May 21, 2021
  31. May 11, 2021
  32. Apr 22, 2021