Skip to content
  1. Apr 16, 2020
    • Vasily Averin's avatar
      keys: Fix proc_keys_next to increase position index · 86d32f9a
      Vasily Averin authored
      If seq_file .next function does not change position index,
      read after some lseek can generate unexpected output:
      
          $ dd if=/proc/keys bs=1  # full usual output
          0f6bfdf5 I--Q---     2 perm 3f010000  1000  1000 user      4af2f79ab8848d0a: 740
          1fb91b32 I--Q---     3 perm 1f3f0000  1000 65534 keyring   _uid.1000: 2
          27589480 I--Q---     1 perm 0b0b0000     0     0 user      invocation_id: 16
          2f33ab67 I--Q---   152 perm 3f030000     0     0 keyring   _ses: 2
          33f1d8fa I--Q---     4 perm 3f030000  1000  1000 keyring   _ses: 1
          3d427fda I--Q---     2 perm 3f010000  1000  1000 user      69ec44aec7678e5a: 740
          3ead4096 I--Q---     1 perm 1f3f0000  1000 65534 keyring   _uid_ses.1000: 1
          521+0 records in
          521+0 records out
          521 bytes copied, 0,00123769 s, 421 kB/s
      
      But a read after lseek in middle of last line results in the partial
      last line and then a repeat of the final line:
      
          $ dd if=/proc/keys bs=500 skip=1
          dd: /proc/keys: cannot skip to specified offset
          g   _uid_ses.1000: 1
          3ead4096 I--Q---     1 perm 1f3f0000  1000 65534 keyring   _uid_ses.1000: 1
          0+1 records in
          0+1 records out
          97 bytes copied, 0,000135035 s, 718 kB/s
      
      and a read after lseek beyond end of file results in the last line being
      shown:
      
          $ dd if=/proc/keys bs=1000 skip=1   # read after lseek beyond end of file
          dd: /proc/keys: cannot skip to specified offset
          3ead4096 I--Q---     1 perm 1f3f0000  1000 65534 keyring   _uid_ses.1000: 1
          0+1 records in
          0+1 records out
          76 bytes copied, 0,000119981 s, 633 kB/s
      
      See https://bugzilla.kernel.org/show_bug.cgi?id=206283
      
      
      
      Fixes: 1f4aace6 ("fs/seq_file.c: simplify seq_file iteration code ...")
      Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      86d32f9a
  2. Apr 15, 2020
  3. Mar 30, 2020
  4. Mar 29, 2020
    • KP Singh's avatar
      bpf: lsm: Initialize the BPF LSM hooks · 520b7aa0
      KP Singh authored
      
      
      * The hooks are initialized using the definitions in
        include/linux/lsm_hook_defs.h.
      * The LSM can be enabled / disabled with CONFIG_BPF_LSM.
      
      Signed-off-by: default avatarKP Singh <kpsingh@google.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: default avatarBrendan Jackman <jackmanb@google.com>
      Reviewed-by: default avatarFlorent Revest <revest@google.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarJames Morris <jamorris@linux.microsoft.com>
      Link: https://lore.kernel.org/bpf/20200329004356.27286-6-kpsingh@chromium.org
      520b7aa0
    • KP Singh's avatar
      security: Refactor declaration of LSM hooks · 98e828a0
      KP Singh authored
      
      
      The information about the different types of LSM hooks is scattered
      in two locations i.e. union security_list_options and
      struct security_hook_heads. Rather than duplicating this information
      even further for BPF_PROG_TYPE_LSM, define all the hooks with the
      LSM_HOOK macro in lsm_hook_defs.h which is then used to generate all
      the data structures required by the LSM framework.
      
      The LSM hooks are defined as:
      
        LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
      
      with <default_value> acccessible in security.c as:
      
        LSM_RET_DEFAULT(<hook_name>)
      
      Signed-off-by: default avatarKP Singh <kpsingh@google.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: default avatarBrendan Jackman <jackmanb@google.com>
      Reviewed-by: default avatarFlorent Revest <revest@google.com>
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Reviewed-by: default avatarCasey Schaufler <casey@schaufler-ca.com>
      Acked-by: default avatarJames Morris <jamorris@linux.microsoft.com>
      Link: https://lore.kernel.org/bpf/20200329004356.27286-3-kpsingh@chromium.org
      98e828a0
    • Waiman Long's avatar
      KEYS: Avoid false positive ENOMEM error on key read · 4f088249
      Waiman Long authored
      
      
      By allocating a kernel buffer with a user-supplied buffer length, it
      is possible that a false positive ENOMEM error may be returned because
      the user-supplied length is just too large even if the system do have
      enough memory to hold the actual key data.
      
      Moreover, if the buffer length is larger than the maximum amount of
      memory that can be returned by kmalloc() (2^(MAX_ORDER-1) number of
      pages), a warning message will also be printed.
      
      To reduce this possibility, we set a threshold (PAGE_SIZE) over which we
      do check the actual key length first before allocating a buffer of the
      right size to hold it. The threshold is arbitrary, it is just used to
      trigger a buffer length check. It does not limit the actual key length
      as long as there is enough memory to satisfy the memory request.
      
      To further avoid large buffer allocation failure due to page
      fragmentation, kvmalloc() is used to allocate the buffer so that vmapped
      pages can be used when there is not a large enough contiguous set of
      pages available for allocation.
      
      In the extremely unlikely scenario that the key keeps on being changed
      and made longer (still <= buflen) in between 2 __keyctl_read_key()
      calls, the __keyctl_read_key() calling loop in keyctl_read_key() may
      have to be iterated a large number of times, but definitely not infinite.
      
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      4f088249
    • Waiman Long's avatar
      KEYS: Don't write out to userspace while holding key semaphore · d3ec10aa
      Waiman Long authored
      
      
      A lockdep circular locking dependency report was seen when running a
      keyutils test:
      
      [12537.027242] ======================================================
      [12537.059309] WARNING: possible circular locking dependency detected
      [12537.088148] 4.18.0-147.7.1.el8_1.x86_64+debug #1 Tainted: G OE    --------- -  -
      [12537.125253] ------------------------------------------------------
      [12537.153189] keyctl/25598 is trying to acquire lock:
      [12537.175087] 000000007c39f96c (&mm->mmap_sem){++++}, at: __might_fault+0xc4/0x1b0
      [12537.208365]
      [12537.208365] but task is already holding lock:
      [12537.234507] 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220
      [12537.270476]
      [12537.270476] which lock already depends on the new lock.
      [12537.270476]
      [12537.307209]
      [12537.307209] the existing dependency chain (in reverse order) is:
      [12537.340754]
      [12537.340754] -> #3 (&type->lock_class){++++}:
      [12537.367434]        down_write+0x4d/0x110
      [12537.385202]        __key_link_begin+0x87/0x280
      [12537.405232]        request_key_and_link+0x483/0xf70
      [12537.427221]        request_key+0x3c/0x80
      [12537.444839]        dns_query+0x1db/0x5a5 [dns_resolver]
      [12537.468445]        dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs]
      [12537.496731]        cifs_reconnect+0xe04/0x2500 [cifs]
      [12537.519418]        cifs_readv_from_socket+0x461/0x690 [cifs]
      [12537.546263]        cifs_read_from_socket+0xa0/0xe0 [cifs]
      [12537.573551]        cifs_demultiplex_thread+0x311/0x2db0 [cifs]
      [12537.601045]        kthread+0x30c/0x3d0
      [12537.617906]        ret_from_fork+0x3a/0x50
      [12537.636225]
      [12537.636225] -> #2 (root_key_user.cons_lock){+.+.}:
      [12537.664525]        __mutex_lock+0x105/0x11f0
      [12537.683734]        request_key_and_link+0x35a/0xf70
      [12537.705640]        request_key+0x3c/0x80
      [12537.723304]        dns_query+0x1db/0x5a5 [dns_resolver]
      [12537.746773]        dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs]
      [12537.775607]        cifs_reconnect+0xe04/0x2500 [cifs]
      [12537.798322]        cifs_readv_from_socket+0x461/0x690 [cifs]
      [12537.823369]        cifs_read_from_socket+0xa0/0xe0 [cifs]
      [12537.847262]        cifs_demultiplex_thread+0x311/0x2db0 [cifs]
      [12537.873477]        kthread+0x30c/0x3d0
      [12537.890281]        ret_from_fork+0x3a/0x50
      [12537.908649]
      [12537.908649] -> #1 (&tcp_ses->srv_mutex){+.+.}:
      [12537.935225]        __mutex_lock+0x105/0x11f0
      [12537.954450]        cifs_call_async+0x102/0x7f0 [cifs]
      [12537.977250]        smb2_async_readv+0x6c3/0xc90 [cifs]
      [12538.000659]        cifs_readpages+0x120a/0x1e50 [cifs]
      [12538.023920]        read_pages+0xf5/0x560
      [12538.041583]        __do_page_cache_readahead+0x41d/0x4b0
      [12538.067047]        ondemand_readahead+0x44c/0xc10
      [12538.092069]        filemap_fault+0xec1/0x1830
      [12538.111637]        __do_fault+0x82/0x260
      [12538.129216]        do_fault+0x419/0xfb0
      [12538.146390]        __handle_mm_fault+0x862/0xdf0
      [12538.167408]        handle_mm_fault+0x154/0x550
      [12538.187401]        __do_page_fault+0x42f/0xa60
      [12538.207395]        do_page_fault+0x38/0x5e0
      [12538.225777]        page_fault+0x1e/0x30
      [12538.243010]
      [12538.243010] -> #0 (&mm->mmap_sem){++++}:
      [12538.267875]        lock_acquire+0x14c/0x420
      [12538.286848]        __might_fault+0x119/0x1b0
      [12538.306006]        keyring_read_iterator+0x7e/0x170
      [12538.327936]        assoc_array_subtree_iterate+0x97/0x280
      [12538.352154]        keyring_read+0xe9/0x110
      [12538.370558]        keyctl_read_key+0x1b9/0x220
      [12538.391470]        do_syscall_64+0xa5/0x4b0
      [12538.410511]        entry_SYSCALL_64_after_hwframe+0x6a/0xdf
      [12538.435535]
      [12538.435535] other info that might help us debug this:
      [12538.435535]
      [12538.472829] Chain exists of:
      [12538.472829]   &mm->mmap_sem --> root_key_user.cons_lock --> &type->lock_class
      [12538.472829]
      [12538.524820]  Possible unsafe locking scenario:
      [12538.524820]
      [12538.551431]        CPU0                    CPU1
      [12538.572654]        ----                    ----
      [12538.595865]   lock(&type->lock_class);
      [12538.613737]                                lock(root_key_user.cons_lock);
      [12538.644234]                                lock(&type->lock_class);
      [12538.672410]   lock(&mm->mmap_sem);
      [12538.687758]
      [12538.687758]  *** DEADLOCK ***
      [12538.687758]
      [12538.714455] 1 lock held by keyctl/25598:
      [12538.732097]  #0: 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220
      [12538.770573]
      [12538.770573] stack backtrace:
      [12538.790136] CPU: 2 PID: 25598 Comm: keyctl Kdump: loaded Tainted: G
      [12538.844855] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
      [12538.881963] Call Trace:
      [12538.892897]  dump_stack+0x9a/0xf0
      [12538.907908]  print_circular_bug.isra.25.cold.50+0x1bc/0x279
      [12538.932891]  ? save_trace+0xd6/0x250
      [12538.948979]  check_prev_add.constprop.32+0xc36/0x14f0
      [12538.971643]  ? keyring_compare_object+0x104/0x190
      [12538.992738]  ? check_usage+0x550/0x550
      [12539.009845]  ? sched_clock+0x5/0x10
      [12539.025484]  ? sched_clock_cpu+0x18/0x1e0
      [12539.043555]  __lock_acquire+0x1f12/0x38d0
      [12539.061551]  ? trace_hardirqs_on+0x10/0x10
      [12539.080554]  lock_acquire+0x14c/0x420
      [12539.100330]  ? __might_fault+0xc4/0x1b0
      [12539.119079]  __might_fault+0x119/0x1b0
      [12539.135869]  ? __might_fault+0xc4/0x1b0
      [12539.153234]  keyring_read_iterator+0x7e/0x170
      [12539.172787]  ? keyring_read+0x110/0x110
      [12539.190059]  assoc_array_subtree_iterate+0x97/0x280
      [12539.211526]  keyring_read+0xe9/0x110
      [12539.227561]  ? keyring_gc_check_iterator+0xc0/0xc0
      [12539.249076]  keyctl_read_key+0x1b9/0x220
      [12539.266660]  do_syscall_64+0xa5/0x4b0
      [12539.283091]  entry_SYSCALL_64_after_hwframe+0x6a/0xdf
      
      One way to prevent this deadlock scenario from happening is to not
      allow writing to userspace while holding the key semaphore. Instead,
      an internal buffer is allocated for getting the keys out from the
      read method first before copying them out to userspace without holding
      the lock.
      
      That requires taking out the __user modifier from all the relevant
      read methods as well as additional changes to not use any userspace
      write helpers. That is,
      
        1) The put_user() call is replaced by a direct copy.
        2) The copy_to_user() call is replaced by memcpy().
        3) All the fault handling code is removed.
      
      Compiling on a x86-64 system, the size of the rxrpc_read() function is
      reduced from 3795 bytes to 2384 bytes with this patch.
      
      Fixes: ^1da177e4 ("Linux-2.6.12-rc2")
      Reviewed-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      d3ec10aa
  5. Mar 25, 2020
  6. Mar 15, 2020
    • Yang Xu's avatar
      KEYS: reaching the keys quotas correctly · 2e356101
      Yang Xu authored
      
      
      Currently, when we add a new user key, the calltrace as below:
      
      add_key()
        key_create_or_update()
          key_alloc()
          __key_instantiate_and_link
            generic_key_instantiate
              key_payload_reserve
                ......
      
      Since commit a08bf91c ("KEYS: allow reaching the keys quotas exactly"),
      we can reach max bytes/keys in key_alloc, but we forget to remove this
      limit when we reserver space for payload in key_payload_reserve. So we
      can only reach max keys but not max bytes when having delta between plen
      and type->def_datalen. Remove this limit when instantiating the key, so we
      can keep consistent with key_alloc.
      
      Also, fix the similar problem in keyctl_chown_key().
      
      Fixes: 0b77f5bf ("keys: make the keyring quotas controllable through /proc/sys")
      Fixes: a08bf91c ("KEYS: allow reaching the keys quotas exactly")
      Cc: stable@vger.kernel.org # 5.0.x
      Cc: Eric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarYang Xu <xuyang2018.jy@cn.fujitsu.com>
      Reviewed-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Reviewed-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      2e356101
  7. Mar 12, 2020
  8. Mar 05, 2020
  9. Feb 28, 2020
    • Tushar Sugandhi's avatar
      integrity: Remove duplicate pr_fmt definitions · 555d6d71
      Tushar Sugandhi authored
      
      
      The #define for formatting log messages, pr_fmt, is duplicated in the
      files under security/integrity.
      
      This change moves the definition to security/integrity/integrity.h and
      removes the duplicate definitions in the other files under
      security/integrity.
      
      With this change, the messages in the following files will be prefixed
      with 'integrity'.
      
           security/integrity/platform_certs/platform_keyring.c
           security/integrity/platform_certs/load_powerpc.c
           security/integrity/platform_certs/load_uefi.c
           security/integrity/iint.c
      
           e.g. "integrity: Error adding keys to platform keyring %s\n"
      
      And the messages in the following file will be prefixed with 'ima'.
      
           security/integrity/ima/ima_mok.c
      
           e.g. "ima: Allocating IMA blacklist keyring.\n"
      
      For the rest of the files under security/integrity, there will be no
      change in the message format.
      
      Suggested-by: default avatarShuah Khan <skhan@linuxfoundation.org>
      Suggested-by: default avatarJoe Perches <joe@perches.com>
      Signed-off-by: default avatarTushar Sugandhi <tusharsu@linux.microsoft.com>
      Reviewed-by: default avatarLakshmi Ramasubramanian <nramas@linux.microsoft.com>
      Signed-off-by: default avatarMimi Zohar <zohar@linux.ibm.com>
      555d6d71
    • Tushar Sugandhi's avatar
      IMA: Add log statements for failure conditions · 72ec611c
      Tushar Sugandhi authored
      
      
      process_buffer_measurement() does not have log messages for failure
      conditions.
      
      This change adds a log statement in the above function.
      
      Suggested-by: default avatarJoe Perches <joe@perches.com>
      Signed-off-by: default avatarTushar Sugandhi <tusharsu@linux.microsoft.com>
      Reviewed-by: default avatarMimi Zohar <zohar@linux.ibm.com>
      Reviewed-by: default avatarLakshmi Ramasubramanian <nramas@linux.microsoft.com>
      Signed-off-by: default avatarMimi Zohar <zohar@linux.ibm.com>
      72ec611c
    • Tushar Sugandhi's avatar
      IMA: Update KBUILD_MODNAME for IMA files to ima · e2bf6814
      Tushar Sugandhi authored
      
      
      The kbuild Makefile specifies object files for vmlinux in the $(obj-y)
      lists. These lists depend on the kernel configuration[1].
      
      The kbuild Makefile for IMA combines the object files for IMA into a
      single object file namely ima.o. All the object files for IMA should be
      combined into ima.o. But certain object files are being added to their
      own $(obj-y). This results in the log messages from those modules getting
      prefixed with their respective base file name, instead of "ima". This is
      inconsistent with the log messages from the IMA modules that are combined
      into ima.o.
      
      This change fixes the above issue.
      
      [1] Documentation\kbuild\makefiles.rst
      
      Signed-off-by: default avatarTushar Sugandhi <tusharsu@linux.microsoft.com>
      Reviewed-by: default avatarMimi Zohar <zohar@linux.ibm.com>
      Reviewed-by: default avatarLakshmi Ramasubramanian <nramas@linux.microsoft.com>
      Signed-off-by: default avatarMimi Zohar <zohar@linux.ibm.com>
      e2bf6814
    • Stephen Smalley's avatar
      selinux: remove unused initial SIDs and improve handling · e3e0b582
      Stephen Smalley authored
      Remove initial SIDs that have never been used or are no longer used by
      the kernel from its string table, which is also used to generate the
      SECINITSID_* symbols referenced in code.  Update the code to
      gracefully handle the fact that these can now be NULL. Stop treating
      it as an error if a policy defines additional initial SIDs unknown to
      the kernel.  Do not load unused initial SID contexts into the sidtab.
      Fix the incorrect usage of the name from the ocontext in error
      messages when loading initial SIDs since these are not presently
      written to the kernel policy and are therefore always NULL.
      
      After this change, it is possible to safely reclaim and reuse some of
      the unused initial SIDs without compatibility issues.  Specifically,
      unused initial SIDs that were being assigned the same context as the
      unlabeled initial SID in policies can be reclaimed and reused for
      another purpose, with existing policies still treating them as having
      the unlabeled context and future policies having the option of mapping
      them to a more specific context.  For example, this could have been
      used when the infiniband labeling support was introduced to define
      initial SIDs for the default pkey and endport SIDs similar to the
      handling of port/netif/node SIDs rather than always using
      SECINITSID_UNLABELED as the default.
      
      The set of safely reclaimable unused initial SIDs across all known
      policies is igmp_packet (13), icmp_socket (14), tcp_socket (15), kmod
      (24), policy (25), and scmp_packet (26); these initial SIDs were
      assigned the same context as unlabeled in all known policies including
      mls.  If only considering non-mls policies (i.e. assuming that mls
      users always upgrade policy with their kernels), the set of safely
      reclaimable unused initial SIDs further includes file_labels (6), init
      (7), sysctl_modprobe (16), and sysctl_fs (18) through sysctl_dev (23).
      
      Adding new initial SIDs beyond SECINITSID_NUM to policy unfortunately
      became a fatal error in commit 24ed7fda ("selinux: use separate
      table for initial SID lookup") and even before that it could cause
      problems on a policy reload (collision between the new initial SID and
      one allocated at runtime) ever since commit 42596eaf ("selinux:
      load the initial SIDs upon every policy load") so we cannot safely
      start adding new initial SIDs to policies beyond SECINITSID_NUM (27)
      until such a time as all such kernels do not need to be supported and
      only those that include this commit are relevant. That is not a big
      deal since we haven't added a new initial SID since 2004 (v2.6.7) and
      we have plenty of unused ones we can reclaim if we truly need one.
      
      If we want to avoid the wasted storage in initial_sid_to_string[]
      and/or sidtab->isids[] for the unused initial SIDs, we could introduce
      an indirection between the kernel initial SID values and the policy
      initial SID values and just map the policy SID values in the ocontexts
      to the kernel values during policy_load_isids(). Originally I thought
      we'd do this by preserving the initial SID names in the kernel policy
      and creating a mapping at load time like we do for the security
      classes and permissions but that would require a new kernel policy
      format version and associated changes to libsepol/checkpolicy and I'm
      not sure it is justified. Simpler approach is just to create a fixed
      mapping table in the kernel from the existing fixed policy values to
      the kernel values. Less flexible but probably sufficient.
      
      A separate selinux userspace change was applied in
      https://github.com/SELinuxProject/selinux/commit/8677ce5e8f592950ae6f14cea1b68a20ddc1ac25
      to enable removal of most of the unused initial SID contexts from
      policies, but there is no dependency between that change and this one.
      That change permits removing all of the unused initial SID contexts
      from policy except for the fs and sysctl SID contexts.  The initial
      SID declarations themselves would remain in policy to preserve the
      values of subsequent ones but the contexts can be dropped.  If/when
      the kernel decides to reuse one of them, future policies can change
      the name and start assigning a context again without breaking
      compatibility.
      
      Here is how I would envision staging changes to the initial SIDs in a
      compatible manner after this commit is applied:
      
      1. At any time after this commit is applied, the kernel could choose
      to reclaim one of the safely reclaimable unused initial SIDs listed
      above for a new purpose (i.e. replace its NULL entry in the
      initial_sid_to_string[] table with a new name and start using the
      newly generated SECINITSID_name symbol in code), and refpolicy could
      at that time rename its declaration of that initial SID to reflect its
      new purpose and start assigning it a context going
      forward. Existing/old policies would map the reclaimed initial SID to
      the unlabeled context, so that would be the initial default behavior
      until policies are updated. This doesn't depend on the selinux
      userspace change; it will work with existing policies and userspace.
      
      2. In 6 months or so we'll have another SELinux userspace release that
      will include the libsepol/checkpolicy support for omitting unused
      initial SID contexts.
      
      3. At any time after that release, refpolicy can make that release its
      minimum build requirement and drop the sid context statements (but not
      the sid declarations) for all of the unused initial SIDs except for
      fs and sysctl, which must remain for compatibility on policy
      reload with old kernels and for compatibility with kernels that were
      still using SECINITSID_SYSCTL (< 2.6.39). This doesn't depend on this
      kernel commit; it will work with previous kernels as well.
      
      4. After N years for some value of N, refpolicy decides that it no
      longer cares about policy reload compatibility for kernels that
      predate this kernel commit, and refpolicy drops the fs and sysctl
      SID contexts from policy too (but retains the declarations).
      
      5. After M years for some value of M, the kernel decides that it no
      longer cares about compatibility with refpolicies that predate step 4
      (dropping the fs and sysctl SIDs), and those two SIDs also become
      safely reclaimable.  This step is optional and need not ever occur unless
      we decide that the need to reclaim those two SIDs outweighs the
      compatibility cost.
      
      6. After O years for some value of O, refpolicy decides that it no
      longer cares about policy load (not just reload) compatibility for
      kernels that predate this kernel commit, and both kernel and refpolicy
      can then start adding and using new initial SIDs beyond 27. This does
      not depend on the previous change (step 5) and can occur independent
      of it.
      
      Fixes: https://github.com/SELinuxProject/selinux-kernel/issues/12
      
      
      Signed-off-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      e3e0b582
    • Ondrej Mosnacek's avatar
      selinux: reduce the use of hard-coded hash sizes · e0ac568d
      Ondrej Mosnacek authored
      
      
      Instead allocate hash tables with just the right size based on the
      actual number of elements (which is almost always known beforehand, we
      just need to defer the hashtab allocation to the right time). The only
      case when we don't know the size (with the current policy format) is the
      new filename transitions hashtable. Here I just left the existing value.
      
      After this patch, the time to load Fedora policy on x86_64 decreases
      from 790 ms to 167 ms. If the unconfined module is removed, it decreases
      from 750 ms to 122 ms. It is also likely that other operations are going
      to be faster, mainly string_to_context_struct() or mls_compute_sid(),
      but I didn't try to quantify that.
      
      The memory usage of all hash table arrays increases from ~58 KB to
      ~163 KB (with Fedora policy on x86_64).
      
      Signed-off-by: default avatarOndrej Mosnacek <omosnace@redhat.com>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      e0ac568d
  10. Feb 23, 2020
  11. Feb 22, 2020
    • Richard Haines's avatar
      selinux: Add xfs quota command types · e4cfa05e
      Richard Haines authored
      
      
      Add Q_XQUOTAOFF, Q_XQUOTAON and Q_XSETQLIM to trigger filesystem quotamod
      permission check.
      
      Add Q_XGETQUOTA, Q_XGETQSTAT, Q_XGETQSTATV and Q_XGETNEXTQUOTA to trigger
      filesystem quotaget permission check.
      
      Signed-off-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      e4cfa05e
    • Ondrej Mosnacek's avatar
      selinux: optimize storage of filename transitions · c3a27611
      Ondrej Mosnacek authored
      In these rules, each rule with the same (target type, target class,
      filename) values is (in practice) always mapped to the same result type.
      Therefore, it is much more efficient to group the rules by (ttype,
      tclass, filename).
      
      Thus, this patch drops the stype field from the key and changes the
      datum to be a linked list of one or more structures that contain a
      result type and an ebitmap of source types that map the given target to
      the given result type under the given filename. The size of the hash
      table is also incremented to 2048 to be more optimal for Fedora policy
      (which currently has ~2500 unique (ttype, tclass, filename) tuples,
      regardless of whether the 'unconfined' module is enabled).
      
      Not only does this dramtically reduce memory usage when the policy
      contains a lot of unconfined domains (ergo a lot of filename based
      transitions), but it also slightly reduces memory usage of strongly
      confined policies (modeled on Fedora policy with 'unconfined' module
      disabled) and significantly reduces lookup times of these rules on
      Fedora (roughly matches the performance of the rhashtable conversion
      patch [1] posted recently to selinux@vger.kernel.org).
      
      An obvious next step is to change binary policy format to match this
      layout, so that disk space is also saved. However, since that requires
      more work (including matching userspace changes) and this patch is
      already beneficial on its own, I'm posting it separately.
      
      Performance/memory usage comparison:
      
      Kernel           | Policy load | Policy load   | Mem usage | Mem usage     | openbench
                       |             | (-unconfined) |           | (-unconfined) | (createfiles)
      -----------------|-------------|---------------|-----------|---------------|--------------
      reference        |       1,30s |         0,91s |      90MB |          77MB | 55 us/file
      rhashtable patch |       0.98s |         0,85s |      85MB |          75MB | 38 us/file
      this patch       |       0,95s |         0,87s |      75MB |          75MB | 40 us/file
      
      (Memory usage is measured after boot. With SELinux disabled the memory
      usage was ~60MB on the same system.)
      
      [1] https://lore.kernel.org/selinux/20200116213937.77795-1-dev@lynxeye.de/T/
      
      
      
      Signed-off-by: default avatarOndrej Mosnacek <omosnace@redhat.com>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      c3a27611
  12. Feb 18, 2020
  13. Feb 13, 2020
  14. Feb 12, 2020
  15. Feb 10, 2020
  16. Feb 07, 2020
  17. Feb 05, 2020
    • Ondrej Mosnacek's avatar
      selinux: fix sidtab string cache locking · 39a706fb
      Ondrej Mosnacek authored
      Avoiding taking a lock in an IRQ context is not enough to prevent
      deadlocks, as discovered by syzbot:
      
      ===
      WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
      5.5.0-syzkaller #0 Not tainted
      -----------------------------------------------------
      syz-executor.0/8927 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
      ffff888027c94098 (&(&s->cache_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
      ffff888027c94098 (&(&s->cache_lock)->rlock){+.+.}, at: sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533
      
      and this task is already holding:
      ffffffff898639b0 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
      ffffffff898639b0 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
      which would create a new lock dependency:
       (&(&nf_conntrack_locks[i])->rlock){+.-.} -> (&(&s->cache_lock)->rlock){+.+.}
      
      but this new dependency connects a SOFTIRQ-irq-safe lock:
       (&(&nf_conntrack_locks[i])->rlock){+.-.}
      
      [...]
      
      other info that might help us debug this:
      
       Possible interrupt unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&(&s->cache_lock)->rlock);
                                     local_irq_disable();
                                     lock(&(&nf_conntrack_locks[i])->rlock);
                                     lock(&(&s->cache_lock)->rlock);
        <Interrupt>
          lock(&(&nf_conntrack_locks[i])->rlock);
      
       *** DEADLOCK ***
      [...]
      ===
      
      Fix this by simply locking with irqsave/irqrestore and stop giving up on
      !in_task(). It makes the locking a bit slower, but it shouldn't make a
      big difference in real workloads. Under the scenario from [1] (only
      cache hits) it only increased the runtime overhead from the
      security_secid_to_secctx() function from ~2% to ~3% (it was ~5-65%
      before introducing the cache).
      
      [1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259
      
      
      
      Fixes: d97bd23c ("selinux: cache the SID -> context string translation")
      Reported-by: default avatar <syzbot+61cba5033e2072d61806@syzkaller.appspotmail.com>
      Signed-off-by: default avatarOndrej Mosnacek <omosnace@redhat.com>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      39a706fb
    • Hridya Valsaraju's avatar
      selinux: fix typo in filesystem name · a20456ae
      Hridya Valsaraju authored
      
      
      Correct the filesystem name to "binder" to enable genfscon per-file
      labelling for binderfs.
      
      Fixes: 7a4b5194 ("selinux: allow per-file labelling for binderfs")
      Signed-off-by: default avatarHridya Valsaraju <hridya@google.com>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      [PM: slight style changes to the subj/description]
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      a20456ae
    • Casey Schaufler's avatar
      broken ping to ipv6 linklocal addresses on debian buster · 87fbfffc
      Casey Schaufler authored
      
      
      I am seeing ping failures to IPv6 linklocal addresses with Debian
      buster. Easiest example to reproduce is:
      
      $ ping -c1 -w1 ff02::1%eth1
      connect: Invalid argument
      
      $ ping -c1 -w1 ff02::1%eth1
      PING ff02::01%eth1(ff02::1%eth1) 56 data bytes
      64 bytes from fe80::e0:f9ff:fe0c:37%eth1: icmp_seq=1 ttl=64 time=0.059 ms
      
      git bisect traced the failure to
      commit b9ef5513 ("smack: Check address length before reading address family")
      
      Arguably ping is being stupid since the buster version is not setting
      the address family properly (ping on stretch for example does):
      
      $ strace -e connect ping6 -c1 -w1 ff02::1%eth1
      connect(5, {sa_family=AF_UNSPEC,
      sa_data="\4\1\0\0\0\0\377\2\0\0\0\0\0\0\0\0\0\0\0\0\0\1\3\0\0\0"}, 28)
      = -1 EINVAL (Invalid argument)
      
      but the command works fine on kernels prior to this commit, so this is
      breakage which goes against the Linux paradigm of "don't break userspace"
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarDavid Ahern <dsahern@gmail.com>
      Suggested-by: default avatarTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: default avatarCasey Schaufler <casey@schaufler-ca.com>
      
       security/smack/smack_lsm.c | 41 +++++++++++++++++++----------------------
       1 file changed, 19 insertions(+), 22 deletions(-)
      87fbfffc
  18. Jan 27, 2020
Loading