- Feb 03, 2019
-
-
Deepa Dinamani authored
struct __kernel_old_timeval is supposed to have the same layout as struct timeval. However, it was inadvertently missed that __kernel_suseconds has a different definition for sparc64. Provide an asm-specific override that fixes it. Reported-by:
Arnd Bergmann <arnd@arndb.de> Suggested-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Deepa Dinamani <deepa.kernel@gmail.com> Acked-by:
Willem de Bruijn <willemb@google.com> Cc: sparclinux@vger.kernel.org Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Deepa Dinamani authored
The SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING options, the way they are currently defined, are not y2038 safe. Subsequent patches in the series add new y2038 safe versions of these options which provide 64-bit timestamps on all architectures uniformly. Hence, rename the existing options with OLD tag suffixes. Also note that the kernel will not use the untagged SO_TIMESTAMP* and SCM_TIMESTAMP* options internally anymore. Signed-off-by:
Deepa Dinamani <deepa.kernel@gmail.com> Acked-by:
Willem de Bruijn <willemb@google.com> Cc: deller@gmx.de Cc: dhowells@redhat.com Cc: jejb@parisc-linux.org Cc: ralf@linux-mips.org Cc: rth@twiddle.net Cc: linux-afs@lists.infradead.org Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-parisc@vger.kernel.org Cc: linux-rdma@vger.kernel.org Cc: sparclinux@vger.kernel.org Signed-off-by:
David S. Miller <davem@davemloft.net>
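To illustrate why a 32-bit seconds field is not y2038 safe, here is a minimal sketch. The helper name is hypothetical and not part of the patch; it only models what happens when a seconds count is stored in a signed 32-bit field, assuming the usual two's-complement wrap-around.

```c
#include <stdint.h>

/* Hypothetical helper: model storing a seconds count in a signed 32-bit
 * field, as a 32-bit time_t would. Values past INT32_MAX wrap around
 * (the narrowing conversion is implementation-defined in C; common
 * compilers wrap modulo 2^32). */
static int64_t seconds_as_seen_through_32bit(int64_t secs)
{
    return (int32_t)(uint32_t)secs;
}
```

INT32_MAX seconds after the epoch is 2038-01-19 03:14:07 UTC. One second later the stored value becomes INT32_MIN, which is the failure the 64-bit timestamp options avoid.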
-
- Jan 17, 2019
-
-
David Rheinsberg authored
This introduces a new generic SOL_SOCKET-level socket option called SO_BINDTOIFINDEX. It behaves similarly to SO_BINDTODEVICE, but takes a network interface index as argument, rather than the network interface name. User-space often refers to network-interfaces via their index, but has to temporarily resolve it to a name for a call into SO_BINDTODEVICE. This might pose problems when the network-device is renamed asynchronously by other parts of the system. When this happens, the SO_BINDTODEVICE might either fail, or worse, it might bind to the wrong device. In most cases user-space only ever operates on devices which they either manage themselves, or otherwise have a guarantee that the device name will not change (e.g., devices that are UP cannot be renamed). However, particularly in libraries this guarantee is non-obvious and it would be nice if that race-condition would simply not exist. It would make it easier for those libraries to operate even in situations where the device-name might change under the hood. A real use-case that we recently hit is trying to start the network stack early in the initrd but make it survive into the real system. Existing distributions rename network-interfaces during the transition from initrd into the real system. This, obviously, cannot affect devices that are up and running (unless you also consider moving them between network-namespaces). However, the network manager now has to make sure its management engine for dormant devices will not run in parallel to these renames. Particularly, when you offload operations like DHCP into separate processes, these might set up their sockets early, and thus have to resolve the device-name, possibly running into this race-condition. By avoiding a call to resolve the device-name, we no longer depend on the name and can run network setup of dormant devices in parallel to the transition off the initrd. The SO_BINDTOIFINDEX socket option plugs this race. Reviewed-by:
Tom Gundersen <teg@jklm.no> Signed-off-by:
David Herrmann <dh.herrmann@gmail.com> Acked-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
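A minimal user-space sketch of how the option described above might be used. The helper name bind_to_ifindex is hypothetical; the sketch assumes the libc headers already define SO_BINDTOIFINDEX and reports ENOPROTOOPT otherwise, rather than guessing the per-architecture option number.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: bind a socket to an interface by index, with no
 * name lookup in between. Returns 0 on success or a negative errno. */
static int bind_to_ifindex(int fd, int ifindex)
{
#ifdef SO_BINDTOIFINDEX
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTOIFINDEX,
                   &ifindex, sizeof(ifindex)) < 0)
        return -errno;
    return 0;
#else
    /* Headers predate the option; treat it as unsupported. */
    (void)fd;
    (void)ifindex;
    return -ENOPROTOOPT;
#endif
}
```

Because the index is passed straight through, a concurrent rename of the interface cannot redirect the bind to a different device.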
-
- Jan 06, 2019
-
-
Masahiro Yamada authored
Now that Kbuild automatically creates asm-generic wrappers for missing mandatory headers, it is redundant to list the same headers in generic-y and mandatory-y. Suggested-by:
Sam Ravnborg <sam@ravnborg.org> Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by:
Sam Ravnborg <sam@ravnborg.org>
-
Masahiro Yamada authored
These comments are leftovers of commit fcc8487d ("uapi: export all headers under uapi directories"). Prior to that commit, exported headers had to be explicitly added to header-y. Now, all headers under the uapi/ directories are exported. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com>
-
- Nov 19, 2018
-
-
Firoz Khan authored
The system call table generation script must be run to generate the unistd_32/64.h and syscall_table_32/64/c32.h files. This patch contains the changes which invoke the script. This patch will generate the unistd_32/64.h and syscall_table_32/64/c32.h files by the syscall table generation script invoked by parisc/Makefile, and the generated files must be identical to the removed files. The generated uapi header file will be included in uapi/asm/unistd.h and the generated system call table header file will be included by the kernel/systbls_32/64.S file. Signed-off-by:
Firoz Khan <firoz.khan@linaro.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Firoz Khan authored
The NR_syscalls macro holds the number of system calls that exist in the sparc architecture. We have to change the value of NR_syscalls if we add or delete a system call. One of the patches in this patch series has a script which will generate a uapi header based on the syscall.tbl file. The syscall.tbl file contains the total number of system calls information. So we have two options to update the NR_syscalls value. 1. Update NR_syscalls in asm/unistd.h manually by counting the number of system calls. No need to update NR_syscalls until we either add a new system call or delete an existing system call. 2. We can keep this feature in the above-mentioned script, which will count the number of syscalls and keep it in a generated file. In this case we don't need to explicitly update NR_syscalls in the asm/unistd.h file. The 2nd option will be the recommended one. For that, I added the __NR_syscalls macro in uapi/asm/unistd.h along with NR_syscalls in asm/unistd.h. The macro __NR_syscalls is also added to make the naming convention the same across all architectures. While __NR_syscalls isn't strictly part of the uapi, having it as part of the generated header simplifies the implementation. We also need to enclose this macro with #ifdef __KERNEL__ to avoid side effects. Signed-off-by:
Firoz Khan <firoz.khan@linaro.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Firoz Khan authored
All the __IGNORE* entries reside in the uapi header file; move them to the non-uapi header asm/unistd.h, as they are not used by any user space applications. It is correct to keep the __IGNORE* entries in the non-uapi header asm/unistd.h, while uapi/asm/unistd.h must hold only information useful for user space applications. One of the patches in this patch series will generate the uapi header file. The information which is directly used by user space applications must be present in the uapi file. Signed-off-by:
Firoz Khan <firoz.khan@linaro.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 09, 2018
-
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 03, 2018
-
-
Eric W. Biederman authored
Rework the definition of struct siginfo so that the array padding struct siginfo to SI_MAX_SIZE can be placed in a union alongside the rest of the struct siginfo members. The result is that we no longer need the __ARCH_SI_PREAMBLE_SIZE or SI_PAD_SIZE definitions. Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
Eric W. Biederman authored
When moving all of the architecture-specific si_codes into siginfo.h, I apparently overlooked EMT_TAGOVF. Move it now. Remove the now redundant test in siginfo_layout for SIGEMT, as NSIGEMT is now always defined. Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
- Oct 02, 2018
-
-
Nicolas Ferre authored
Add the ISO7816 ioctl and associated accessors and data structure. Drivers can then use this common implementation to handle ISO7816 (smart cards). Signed-off-by:
Nicolas Ferre <nicolas.ferre@microchip.com> [ludovic.desroches@microchip.com: squash and rebase, removal of gpios, checkpatch fixes] Signed-off-by:
Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Jul 04, 2018
-
-
Richard Cochran authored
This patch introduces SO_TXTIME. User space enables this option in order to pass a desired future transmit time in a CMSG when calling sendmsg(2). The argument to this socket option is an 8-byte struct provided by the uapi header net_tstamp.h, defined as: struct sock_txtime { clockid_t clockid; u32 flags; }; Note that new fields were added to struct sock by filling a 2-byte hole found in the struct. For that reason, neither the struct size nor the number of cachelines was altered. Signed-off-by:
Richard Cochran <rcochran@linutronix.de> Signed-off-by:
Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
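The 8-byte size quoted above can be checked with a small sketch. The struct below is a local mirror written for illustration, with a hypothetical name; real code would include the uapi header net_tstamp.h instead. It assumes a Linux libc where clockid_t is a 32-bit integer.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Local mirror of the struct quoted in the commit message. The name
 * sock_txtime_sketch is hypothetical; it is not the uapi definition. */
struct sock_txtime_sketch {
    clockid_t clockid; /* clock the requested transmit time refers to */
    uint32_t  flags;   /* option flags */
};

/* Report the size so the claim can be checked on the build platform. */
static size_t sock_txtime_sketch_size(void)
{
    return sizeof(struct sock_txtime_sketch);
}
```

Two 4-byte members with no padding give 8 bytes, which is the size the commit message states for the socket option argument.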
-
- Jun 05, 2018
-
-
Dmitry V. Levin authored
Starting with commit v4.14-rc1~60^2^2~1, a SIGFPE signal sent via kill results in wrong values in the si_pid and si_uid fields of compat siginfo_t. This happens due to FPE_FIXME being defined to 0 for sparc, and at the same time siginfo_layout() introduced by the same commit returns SIL_FAULT for SIGFPE if si_code == SI_USER and FPE_FIXME is defined to 0. Fix this regression by removing the FPE_FIXME macro and changing all its users to assign FPE_FLTUNK to si_code instead of FPE_FIXME. Note that FPE_FLTUNK is a new macro introduced by commit 266da65e. Tested with commit v4.16-11958-g16e205cf42da. This bug was found by the strace test suite. Link: https://github.com/strace/strace/issues/21 Fixes: cc731525 ("signal: Remove kernel interal si_code magic") Thanks-to: Anatoly Pugachev <matorola@gmail.com> Signed-off-by:
Dmitry V. Levin <ldv@altlinux.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
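The user-visible behaviour at stake can be probed with a short sketch: send SIGFPE to the current process with kill(2) and inspect the siginfo that comes back. The function name is hypothetical, the sketch assumes a single-threaded caller, and it exercises the native siginfo path rather than the compat one the fix targets.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if a SIGFPE sent via kill(2) is reported
 * with si_code == SI_USER and the sender's pid and uid, 0 if the fields
 * look wrong, and -1 if the probe itself failed. */
static int sigfpe_from_kill_is_reported_correctly(void)
{
    sigset_t set, old;
    siginfo_t info;
    int ok = -1;

    sigemptyset(&set);
    sigaddset(&set, SIGFPE);

    /* Block SIGFPE so it stays pending and can be fetched synchronously. */
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    if (kill(getpid(), SIGFPE) == 0 && sigwaitinfo(&set, &info) == SIGFPE)
        ok = info.si_code == SI_USER &&
             info.si_pid == getpid() &&
             info.si_uid == getuid();

    sigprocmask(SIG_SETMASK, &old, NULL);
    return ok;
}
```

On an affected kernel, a 32-bit process running this kind of check would be expected to see mismatching si_pid and si_uid values, which is how the strace test suite noticed the regression.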
-
- May 15, 2018
-
-
Jens Axboe authored
Nobody is using it anymore, and it's been abandoned. Since David is fine with removing it, kill it. Suggested-by:
Christoph Hellwig <hch@lst.de> Acked-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Apr 30, 2018
-
-
Rob Gardner authored
The license text in both oradax files mistakenly specifies "version 3" of the GNU General Public License. This is corrected to specify "version 2". Signed-off-by:
Rob Gardner <rob.gardner@oracle.com> Signed-off-by:
Jonathan Helman <jonathan.helman@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Apr 20, 2018
-
-
Arnd Bergmann authored
sparc uses a nonstandard variation of the generic sysvipc data structures, intended to have the padding moved around so it can deal with big-endian 32-bit user space that has 64-bit time_t. Unlike most architectures, sparc actually succeeded in defining this right for big-endian CPUs, but as everyone else got it wrong, we just use the same hack everywhere. This just takes the same approach here that we have for the asm-generic headers and adds separate 32-bit fields for the upper halves of the timestamps, to let libc deal with the mess in user space. Signed-off-by:
Arnd Bergmann <arnd@arndb.de>
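As a sketch of what "let libc deal with the mess" could look like, the helper below joins a 32-bit lower half and a 32-bit upper half into one 64-bit timestamp. The function and parameter names are hypothetical and do not refer to the actual struct fields.

```c
#include <stdint.h>

/* Hypothetical helper: combine the lower and upper 32-bit halves of a
 * timestamp, as exposed through two separate fields, into a signed
 * 64-bit seconds value. */
static int64_t join_time64(uint32_t low, uint32_t high)
{
    return (int64_t)(((uint64_t)high << 32) | low);
}
```

With the upper half zero, the result is simply the unsigned lower half, so timestamps past 2038 stay positive instead of wrapping.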
-
- Apr 19, 2018
-
-
Dmitry V. Levin authored
Starting with commit v4.14-rc1~60^2^2~1, a SIGFPE signal sent via kill results in wrong values in the si_pid and si_uid fields of compat siginfo_t. This happens due to FPE_FIXME being defined to 0 for sparc, and at the same time siginfo_layout() introduced by the same commit returns SIL_FAULT for SIGFPE if si_code == SI_USER and FPE_FIXME is defined to 0. Fix this regression by removing the FPE_FIXME macro and changing all its users to assign FPE_FLTUNK to si_code instead of FPE_FIXME. Note that FPE_FLTUNK is a new macro introduced by commit 266da65e. Tested with commit v4.16-11958-g16e205cf42da. This bug was found by the strace test suite. In the discussion about FPE_FLTUNK on sparc David Miller said: > Eric, feel free to do something similar on Sparc. Link: https://github.com/strace/strace/issues/21 Fixes: cc731525 ("signal: Remove kernel interal si_code magic") Fixes: 2.3.41 Cc: David Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Conceptually-Acked-By:
David Miller <davem@davemloft.net> Thanks-to: Anatoly Pugachev <matorola@gmail.com> Signed-off-by:
Dmitry V. Levin <ldv@altlinux.org> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com>
-
- Mar 20, 2018
-
-
Khalid Aziz authored
Commit c6202ca7 ("sparc64: Add auxiliary vectors to report platform ADI properties") adds auxiliary vectors to report ADI capabilities on sparc64 platform only. This needs to be uniform across 64-bit and 32-bit. This patch makes the same vectors available on 32-bit as well. Signed-off-by:
Khalid Aziz <khalid.aziz@oracle.com> Cc: Khalid Aziz <khalid@gonehiking.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 18, 2018
-
-
Khalid Aziz authored
ADI is a new feature supported on SPARC M7 and newer processors to allow hardware to catch rogue accesses to memory. ADI is supported for data fetches only and not instruction fetches. An app can enable ADI on its data pages, set version tags on them and use versioned addresses to access the data pages. Upper bits of the address contain the version tag. On M7 processors, upper four bits (bits 63-60) contain the version tag. If a rogue app attempts to access ADI enabled data pages, its access is blocked and processor generates an exception. Please see Documentation/sparc/adi.txt for further details. This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable MCD (Memory Corruption Detection) on selected memory ranges, enable TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI version tags on page swap out/in or migration. ADI is not enabled by default for any task. A task must explicitly enable ADI on a memory range and set version tag for ADI to be effective for the task. Signed-off-by:
Khalid Aziz <khalid.aziz@oracle.com> Cc: Khalid Aziz <khalid@gonehiking.org> Reviewed-by:
Anthony Yznaga <anthony.yznaga@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
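The versioned-address layout described above (version tag in bits 63-60 on M7) can be sketched as plain bit arithmetic. The macro and function names are hypothetical, and the 4-bit width is the M7 value from the text; portable code would take the number of tag bits from the platform rather than hard-coding it.

```c
#include <stdint.h>

/* Hypothetical names; the layout is the M7 one described in the text. */
#define ADI_SKETCH_TAG_SHIFT 60
#define ADI_SKETCH_TAG_MASK  (UINT64_C(0xF) << ADI_SKETCH_TAG_SHIFT)

/* Place a 4-bit version tag into the upper bits of an address. */
static uint64_t adi_sketch_set_tag(uint64_t addr, unsigned int tag)
{
    return (addr & ~ADI_SKETCH_TAG_MASK) |
           ((uint64_t)(tag & 0xF) << ADI_SKETCH_TAG_SHIFT);
}

/* Extract the version tag from a versioned address. */
static unsigned int adi_sketch_get_tag(uint64_t addr)
{
    return (unsigned int)((addr & ADI_SKETCH_TAG_MASK) >> ADI_SKETCH_TAG_SHIFT);
}
```

An access through an address whose tag does not match the tag set on the memory is what the hardware blocks.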
-
Khalid Aziz authored
ADI feature on M7 and newer processors has three important properties relevant to userspace apps using ADI capabilities - (1) Size of block of memory an ADI version tag applies to, (2) Number of uppermost bits in virtual address used to encode ADI tag, and (3) The value M7 processor will force the ADI tags to if it detects uncorrectable error in an ADI tagged cacheline. Kernel can retrieve these properties for a platform through machine description provided by the firmware. This patch adds code to retrieve these properties and report them to userspace through auxiliary vectors. Signed-off-by:
Khalid Aziz <khalid.aziz@oracle.com> Cc: Khalid Aziz <khalid@gonehiking.org> Reviewed-by:
Anthony Yznaga <anthony.yznaga@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Khalid Aziz authored
SPARC M7 processor adds new control register fields, ASIs and a new trap to support the ADI (Application Data Integrity) feature. This patch adds definitions for these register fields, ASIs and a handler for the new precise memory corruption detected trap. Signed-off-by:
Khalid Aziz <khalid.aziz@oracle.com> Cc: Khalid Aziz <khalid@gonehiking.org> Reviewed-by:
Anthony Yznaga <anthony.yznaga@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Feb 11, 2018
-
-
Al Viro authored
except, again, POLLFREE and POLL_BUSY_LOOP. With this, we finally get to the promised end result: - POLL{IN,OUT,...} are plain integers and *not* in __poll_t, so any stray instances of ->poll() still using those will be caught by sparse. - eventpoll.c and select.c warning-free wrt __poll_t - no more kernel-side definitions of POLL... - userland ones are visible through the entire kernel (and used pretty much only for mangle/demangle) - same behavior as after the first series (i.e. sparc et al. epoll(2) working correctly). Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jan 22, 2018
-
-
Rob Gardner authored
DAX is a coprocessor which resides on the SPARC M7 (DAX1) and M8 (DAX2) processor chips, and has direct access to the CPU's L3 caches as well as physical memory. It can perform several operations on data streams with various input and output formats. This driver provides a transport mechanism and has limited knowledge of the various opcodes and data formats. A user space library provides high level services and translates these into low level commands which are then passed into the driver and subsequently the hypervisor and the coprocessor. The library is the recommended way for applications to use the coprocessor, and the driver interface is not intended for general use. Signed-off-by:
Rob Gardner <rob.gardner@oracle.com> Signed-off-by:
Jonathan Helman <jonathan.helman@oracle.com> Signed-off-by:
Sanath Kumar <sanath099@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Dec 05, 2017
-
-
Hendrik Brueckner authored
Commit 0515e599 ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type") introduced the bpf_perf_event_data structure which exports the pt_regs structure. This is OK for multiple architectures but fails for s390 and arm64, which do not export pt_regs. Programs using them, for example the bpf selftest, fail to compile on these architectures. For s390, exporting the pt_regs is not an option because s390 wants to allow changes to it. For arm64, there is a user_pt_regs structure that covers parts of the pt_regs structure for use by user space. To solve the broken uapi for s390 and arm64, introduce an abstract type for pt_regs and add an asm/bpf_perf_event.h file that makes the type concrete. An asm-generic header file covers the architectures that export pt_regs today. The arch-specific enablement for s390 and arm64 follows in separate commits. Reported-by:
Thomas Richter <tmricht@linux.vnet.ibm.com> Fixes: 0515e599 ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type") Signed-off-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-and-tested-by:
Thomas Richter <tmricht@linux.vnet.ibm.com> Acked-by:
Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net>
-
- Nov 30, 2017
-
-
Al Viro authored
mangle/demangle on the way to/from userland Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- Nov 27, 2017
-
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- Nov 15, 2017
-
-
Nagarathnam Muthusamy authored
Following patch is based on work done by Nick Alcock on 64-bit vDSO for sparc in Oracle linux. I have extended it to include support for 32-bit vDSO for sparc on 64-bit kernel. vDSO for sparc is based on the X86 implementation. This patch provides vDSO support for both 64-bit and 32-bit programs on 64-bit kernel. vDSO will be disabled on 32-bit linux kernel on sparc.

*) vclock_gettime.c contains all the vdso functions. Since the data page is mapped before the vdso code page, the pointer to the data page is obtained by subtracting an offset from an address in the vdso code page. The return address stored in %i7 is used for this purpose.
*) During compilation, both 32-bit and 64-bit vdso images are compiled and are converted into raw bytes by the vdso2c program to be ready for mapping into the process. 32-bit images are compiled only if CONFIG_COMPAT is enabled. vdso2c generates two files, vdso-image-64.c and vdso-image-32.c, which contain the respective vDSO image in a C structure.
*) During vdso initialization, the required number of vdso pages are allocated and raw bytes are copied into the pages.
*) During every exec, these pages are mapped into the process through arch_setup_additional_pages and the location of the mapping is passed on to the process through aux vector AT_SYSINFO_EHDR, which is used by glibc.
*) A new update_vsyscall routine for sparc is added to keep the data page in vdso updated.
*) As vDSO cannot contain dynamically relocatable references, a new version of cpu_relax is added for the use of vDSO.

This change also requires a putback to glibc to use vDSO. For testing, programs planning to try vDSO can be compiled against the generated vdso(64/32).so in the source.

Testing:
========

[root@localhost ~]# cat vdso_test.c
int main()
{
        struct timespec tv_start, tv_end;
        struct timeval tv_tmp;
        int i;
        int count = 1 * 1000 * 10000;
        long long diff;

        clock_gettime(0, &tv_start);
        for (i = 0; i < count; i++)
                gettimeofday(&tv_tmp, NULL);
        clock_gettime(0, &tv_end);
        diff = (long long)(tv_end.tv_sec - tv_start.tv_sec)*(1*1000*1000*1000);
        diff += (tv_end.tv_nsec - tv_start.tv_nsec);
        printf("Start sec: %d\n", tv_start.tv_sec);
        printf("End sec : %d\n", tv_end.tv_sec);
        printf("%d cycles in %lld ns = %f ns/cycle\n", count, diff,
               (double)diff / (double)count);
        return 0;
}

[root@localhost ~]# cc vdso_test.c -o t32_without_fix -m32 -lrt
[root@localhost ~]# ./t32_without_fix
Start sec: 1502396130
End sec : 1502396140
10000000 cycles in 9565148528 ns = 956.514853 ns/cycle

[root@localhost ~]# cc vdso_test.c -o t32_with_fix -m32 ./vdso32.so.dbg
[root@localhost ~]# ./t32_with_fix
Start sec: 1502396168
End sec : 1502396169
10000000 cycles in 798141262 ns = 79.814126 ns/cycle

[root@localhost ~]# cc vdso_test.c -o t64_without_fix -m64 -lrt
[root@localhost ~]# ./t64_without_fix
Start sec: 1502396208
End sec : 1502396218
10000000 cycles in 9846091800 ns = 984.609180 ns/cycle

[root@localhost ~]# cc vdso_test.c -o t64_with_fix -m64 ./vdso64.so.dbg
[root@localhost ~]# ./t64_with_fix
Start sec: 1502396257
End sec : 1502396257
10000000 cycles in 380984048 ns = 38.098405 ns/cycle

V1 to V2 Changes:
=================
Added hot patching code to switch the read stick instruction to read tick instruction based on the hardware.

V2 to V3 Changes:
=================
Merged latest changes from sparc-next and moved the initialization of clocksource_tick.archdata.vclock_mode to time_init_early. Disabled queued spinlock and rwlock configuration when simulating 32-bit config to compile 32-bit VDSO.

V3 to V4 Changes:
=================
Hardcoded the page size as 8192 in linker script for both 64-bit and 32-bit binaries. Removed unused variables in vdso2c.h. Added -mv8plus flag to Makefile to prevent the generation of relocation entries for __lshrdi3 in 32-bit vdso binary. Signed-off-by:
Nick Alcock <nick.alcock@oracle.com> Signed-off-by:
Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by:
Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Nov 02, 2017
-
-
Greg Kroah-Hartman authored
Many user space API headers are missing licensing information, which makes it hard for compliance tools to determine the correct license. By default, files without license information fall under the default license of the kernel, which is GPLV2. Marking them GPLV2 would exclude them from being included in non GPLV2 code, which is obviously not intended. The user space API headers fall under the syscall exception which is in the kernel's COPYING file: NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work". Otherwise syscall usage would not be possible. Update the files which contain no license information with an SPDX license identifier. The chosen identifier is 'GPL-2.0 WITH Linux-syscall-note' which is the officially assigned identifier for the Linux syscall exception. SPDX license identifiers are a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. See the previous patch in this series for the methodology of how this patch was researched. Reviewed-by:
Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by:
Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Aug 04, 2017
-
-
Willem de Bruijn authored
The send call ignores unknown flags. Legacy applications may already unwittingly pass MSG_ZEROCOPY. Continue to ignore this flag unless a socket opts in to zerocopy. Introduce socket option SO_ZEROCOPY to enable MSG_ZEROCOPY processing. Processes can also query this socket option to detect kernel support for the feature. Older kernels will return ENOPROTOOPT. Signed-off-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
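The feature probe mentioned above might look like the following sketch. The helper name is hypothetical; it assumes the libc headers define SO_ZEROCOPY and treats missing headers the same as a kernel that answers ENOPROTOOPT.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical probe: returns 1 if the kernel knows SO_ZEROCOPY for this
 * socket, 0 if it reports ENOPROTOOPT (or the headers lack the option),
 * and a negative errno for any other failure. */
static int zerocopy_supported(int fd)
{
#ifdef SO_ZEROCOPY
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &val, &len) == 0)
        return 1;
    return errno == ENOPROTOOPT ? 0 : -errno;
#else
    (void)fd;
    return 0;
#endif
}
```

A caller would still need to enable the option with setsockopt before passing MSG_ZEROCOPY, since the flag is ignored on sockets that have not opted in.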
-
- Jul 24, 2017
-
-
Eric W. Biederman authored
struct siginfo is a union and the kernel since 2.4 has been hiding a union tag in the high 16bits of si_code using the values: __SI_KILL __SI_TIMER __SI_POLL __SI_FAULT __SI_CHLD __SI_RT __SI_MESGQ __SI_SYS

While this looks plausible on the surface, in practice this situation has not worked well.

- Injected positive signals are not copied to user space properly unless they have these magic high bits set.
- Injected positive signals are not reported properly by signalfd unless they have these magic high bits set.
- These kernel internal values leaked to userspace via ptrace_peek_siginfo
- It was possible to inject these kernel internal values and cause the kernel to misbehave.
- Kernel developers got confused and expected these kernel internal values in userspace in kernel self tests.
- Kernel developers got confused and set si_code to __SI_FAULT which is SI_USER in userspace which causes userspace to think an ordinary user sent the signal and that it was not kernel generated.
- The values make it impossible to reorganize the code to transform siginfo_copy_to_user into a plain copy_to_user, as si_code must be massaged before being passed to userspace.

So remove these kernel internal si codes and make the kernel code simpler and more maintainable. To replace these kernel internal magic si_codes introduce the helper function siginfo_layout, that takes a signal number and an si_code and computes which union member of siginfo is being used. Have siginfo_layout return an enumeration so that gcc will have enough information to warn if a switch statement does not handle all of the union members. A couple of architectures have a messed up ABI that defines signal specific duplications of SI_USER which causes more special cases in siginfo_layout than I would like. The good news is only problem architectures pay the cost. Update all of the code that used the previous magic __SI_ values to use the new SIL_ values and to call siginfo_layout to get those values. 
Except where not all of the cases are handled, remove the defaults in the switch statements so that if a new case is missed in the future the lack will show up at compile time. Modify the code that copies siginfo si_code to userspace to just copy the value and not cast si_code to a short first. The high bits are no longer used to hold a magic union member. Fix up the siginfo header files to stop including the __SI_ values in their constants and, for the headers that were missing it, to properly update the number of si_codes for each signal type. The fixes to the copy_siginfo_from_user32 implementations have the interesting property that several of them previously should never have worked, as the __SI_ values they depended upon were kernel internal. With that dependency gone those implementations should work much better. The idea of not passing the __SI_ values out to userspace and then not reinserting them has been tested with criu and criu worked without changes. Ref: 2.4.0-test1 Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
- Jul 20, 2017
-
-
Eric W. Biederman authored
Setting si_code to __SI_FAULT results in userspace seeing an si_code of 0. This is the same si_code as SI_USER. POSIX and common sense require that SI_USER not be a signal specific si_code. As such this use of 0 for the si_code is a pretty horribly broken ABI. This was introduced in 2.3.41 so this mess has had a long time for people to be able to start depending on it. As this bug has existed for 17 years already I don't know if it is worth fixing. It is definitely worth documenting what is going on so that no one decides to copy this bad decision. Cc: "David S. Miller" <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
- Jul 17, 2017
-
-
Gleb Fotengauer-Malinovskiy authored
This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag because it doesn't copy anything from/to userspace to access the argument. Fixes: 54ebbfb1 ("tty: add TIOCGPTPEER ioctl") Signed-off-by:
Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Acked-by:
Aleksa Sarai <asarai@suse.de> Acked-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Jul 11, 2017
-
-
Masahiro Yamada authored
Since commit fcc8487d ("uapi: export all headers under uapi directories"), all (and only) headers under uapi directories are exported, but asm-generic wrappers are still exceptions. To complete de-coupling the uapi from kernel headers, move generic-y of exported headers to uapi/asm/Kbuild. With this change, "make headers_install" will just need to parse uapi/asm/Kbuild to build up exported headers. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com>
-
- Jun 21, 2017
-
-
David Rheinsberg authored
This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to retrieve the auxiliary groups of the remote peer. It is designed to naturally extend SO_PEERCRED. That is, the underlying data is from the same credentials. Regarding its syntax, it is based on SO_PEERSEC. That is, if the provided buffer is too small, ERANGE is returned and @optlen is updated. Otherwise, the information is copied, @optlen is set to the actual size, and 0 is returned. While SO_PEERCRED (and thus `struct ucred') already returns the primary group, it lacks the auxiliary group vector. However, nearly all access controls (including kernel side VFS and SYSVIPC, but also user-space polkit, DBus, ...) consider the entire set of groups, rather than just the primary group. But this is currently not possible with pure SO_PEERCRED. Instead, user-space has to work around this and query the system database for the auxiliary groups of a UID retrieved via SO_PEERCRED. Unfortunately, there is no race-free way to query the auxiliary groups of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space solution is to use getgrouplist(3p), which itself falls back to NSS and whatever is configured in nsswitch.conf(3). This effectively checks which groups we *would* assign to the user if it logged in *now*. On normal systems it is as easy as reading /etc/group, but with NSS it can resort to querying network databases (e.g., LDAP), using IPC or network communication. Long story short: Whenever we want to use auxiliary groups for access checks on IPC, we need further IPC to talk to the user/group databases, rather than just relying on SO_PEERCRED and the incoming socket. This is unfortunate, and might even result in dead-locks if the database query uses the same IPC as the original request. So far, those recursions / dead-locks have been avoided by using primitive IPC for all crucial NSS modules. 
However, we want to avoid re-inventing the wheel for each NSS module that might be involved in user/group queries. Hence, we would preferably make DBus (and other IPC that supports access-management based on groups) work without resorting to the user/group database. This new SO_PEERGROUPS socket option would allow us to make dbus-daemon work without ever calling into NSS. Cc: Michal Sekletar <msekleta@redhat.com> Cc: Simon McVittie <simon.mcvittie@collabora.co.uk> Reviewed-by:
Tom Gundersen <teg@jklm.no> Signed-off-by:
David Herrmann <dh.herrmann@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
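The ERANGE / @optlen convention described in the SO_PEERGROUPS message above can be illustrated with a short user-space sketch. It is not taken from the patch: the helper names `peer_groups()` and `socketpair_peer_group_count()` are hypothetical, and the fallback value 59 for SO_PEERGROUPS is an assumption that only holds on architectures using asm-generic/socket.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 59 is the asm-generic value. Other architectures may use a
 * different number, so the system headers win whenever they define it. */
#ifndef SO_PEERGROUPS
#define SO_PEERGROUPS 59
#endif

/*
 * Hypothetical helper: fetch the peer's auxiliary groups.
 * Returns the number of groups and stores a malloc()ed array in *out
 * (NULL if the peer has no auxiliary groups); the caller frees it.
 * Returns -1 with errno set on failure.
 */
ssize_t peer_groups(int fd, gid_t **out)
{
    gid_t *buf = NULL;
    socklen_t len = 0;
    int saved;

    assert(out != NULL);
    *out = NULL;

    /* Start with an empty buffer. On ERANGE the kernel has written the
     * required size into len, so grow the buffer and try again. */
    while (getsockopt(fd, SOL_SOCKET, SO_PEERGROUPS, buf, &len) != 0) {
        gid_t *tmp;

        if (errno != ERANGE)
            goto fail;
        tmp = realloc(buf, len);
        if (tmp == NULL)
            goto fail;
        buf = tmp;
    }

    /* On success len holds the actual size in bytes. */
    *out = buf;
    return (ssize_t)(len / sizeof(gid_t));

fail:
    saved = errno;
    free(buf);
    errno = saved;
    return -1;
}

/* Hypothetical demo: over a fresh socketpair the "peer" is the calling
 * process itself, so this counts our own auxiliary groups. */
ssize_t socketpair_peer_group_count(void)
{
    int sv[2];
    gid_t *groups = NULL;
    ssize_t n;
    int saved;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    n = peer_groups(sv[0], &groups);
    saved = errno;
    free(groups);
    close(sv[0]);
    close(sv[1]);
    errno = saved;
    return n;
}
```

A daemon would call `peer_groups()` on an accepted AF_UNIX connection next to its existing SO_PEERCRED query, so that both results come from the same peer credentials and no NSS lookup is needed.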
-
- Jun 09, 2017
-
-
Aleksa Sarai authored
When opening the slave end of a PTY, it is not possible for userspace to safely ensure that /dev/pts/$num is actually a slave (in cases where the mount namespace in which devpts was mounted is controlled by an untrusted process). In addition, there are several unresolvable race conditions if userspace were to attempt to detect attacks through stat(2) and other similar methods (it is also not clear how userspace could detect attacks involving FUSE). Resolve this by providing an interface for userspace to safely open the "peer" end of a PTY file descriptor by using the dentry cached by devpts. Since it is not possible to have an open master PTY without having its slave exposed in /dev/pts, this interface is safe. This interface currently does not provide a way to get the master pty (since it is not clear whether such an interface is safe or even useful). Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Valentin Rothberg <vrothberg@suse.com> Signed-off-by:
Aleksa Sarai <asarai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- May 22, 2017
-
-
David S. Miller authored
A definition was only provided for platforms using asm-generic/socket.h; define it for the others as well. Reported-by:
Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 10, 2017
-
-
Nicolas Dichtel authored
Regularly, when a new header is created in include/uapi/, the developer forgets to add it in the corresponding Kbuild file. This error is usually detected after the release is out. In fact, all headers under uapi directories should be exported, thus it's useless to have an exhaustive list. After this patch, the following files, which were not exported, are now exported (with make headers_install_all):
asm-arc/kvm_para.h asm-arc/ucontext.h asm-blackfin/shmparam.h asm-blackfin/ucontext.h asm-c6x/shmparam.h asm-c6x/ucontext.h asm-cris/kvm_para.h asm-h8300/shmparam.h asm-h8300/ucontext.h asm-hexagon/shmparam.h asm-m32r/kvm_para.h asm-m68k/kvm_para.h asm-m68k/shmparam.h asm-metag/kvm_para.h asm-metag/shmparam.h asm-metag/ucontext.h asm-mips/hwcap.h asm-mips/reg.h asm-mips/ucontext.h asm-nios2/kvm_para.h asm-nios2/ucontext.h asm-openrisc/shmparam.h asm-parisc/kvm_para.h asm-powerpc/perf_regs.h asm-sh/kvm_para.h asm-sh/ucontext.h asm-tile/shmparam.h asm-unicore32/shmparam.h asm-unicore32/ucontext.h asm-x86/hwcap2.h asm-xtensa/kvm_para.h drm/armada_drm.h drm/etnaviv_drm.h drm/vgem_drm.h linux/aspeed-lpc-ctrl.h linux/auto_dev-ioctl.h linux/bcache.h linux/btrfs_tree.h linux/can/vxcan.h linux/cifs/cifs_mount.h linux/coresight-stm.h linux/cryptouser.h linux/fsmap.h linux/genwqe/genwqe_card.h linux/hash_info.h linux/kcm.h linux/kcov.h linux/kfd_ioctl.h linux/lightnvm.h linux/module.h linux/nbd-netlink.h linux/nilfs2_api.h linux/nilfs2_ondisk.h linux/nsfs.h linux/pr.h linux/qrtr.h linux/rpmsg.h linux/sched/types.h linux/sed-opal.h linux/smc.h linux/smc_diag.h linux/stm.h linux/switchtec_ioctl.h linux/vfio_ccw.h linux/wil6210_uapi.h rdma/bnxt_re-abi.h
Note that I have removed from this list the files which are generated in every exported directory (like .install or .install.cmd). Thanks to Julien Floret <julien.floret@6wind.com> for the tip to get all subdirs with a pure makefile command.
For the record, note that exported files for asm directories are a mix of files listed by: - include/uapi/asm-generic/Kbuild.asm; - arch/<arch>/include/uapi/asm/Kbuild; - arch/<arch>/include/asm/Kbuild. Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by:
Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by:
Russell King <rmk+kernel@armlinux.org.uk> Acked-by:
Mark Salter <msalter@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com>
-
- Apr 24, 2017
-
-
David S. Miller authored
Hook up statx. Ignore pkeys system calls; we don't have protection keys on SPARC. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Apr 08, 2017
-
-
Chenbo Feng authored
Introduce a new getsockopt operation to retrieve the socket cookie for a specific socket based on the socket fd. It returns a unique non-decreasing cookie for each socket. Tested: https://android-review.googlesource.com/#/c/358163/ Acked-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
Chenbo Feng <fengc@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-