Aleksa Sarai [see §317C(6)] cyphar @SUSE Linux GmbH Oceania (circa 1984) https://www.cyphar.com Backdoors are courtesy of the Australian Government. "designated communications provider" under §317C(6) of the Telecommunications Act 1997.
pull request comment zfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

Should probably be rebased now that #9549 has been merged.

snajpa

comment created time in 2 days

delete branch cyphar/zfs

delete branch : zfs-renameat2-flags

delete time in 2 days

PR closed zfsonlinux/zfs

zfs_rename: support RENAME_* flags Status: Code Review Needed

Motivation and Context

To allow overlayfs-on-ZFS (#8648), we need to support the renameat2(2) flags (most importantly, RENAME_WHITEOUT and RENAME_EXCHANGE). This requires quite a few changes to the ordering of link operations during rename (and the related error paths).

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
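For readers unfamiliar with the flags, here is a minimal sketch of which combinations renameat2(2) accepts. `rename_flags_check` is a hypothetical helper written for this illustration and is not part of the PR; the flag values are the ones Linux defines in `<linux/fs.h>`, and the rule shown (RENAME_EXCHANGE cannot be combined with the other two) is the one documented in the man page.

```c
#include <errno.h>

/* Values as defined by Linux; guarded in case libc already provides them. */
#ifndef RENAME_NOREPLACE
#define	RENAME_NOREPLACE	(1u << 0)	/* fail if the target exists */
#endif
#ifndef RENAME_EXCHANGE
#define	RENAME_EXCHANGE		(1u << 1)	/* atomically swap two names */
#endif
#ifndef RENAME_WHITEOUT
#define	RENAME_WHITEOUT		(1u << 2)	/* leave a whiteout at the source */
#endif

/*
 * Hypothetical helper: returns 0 for a flag combination that renameat2(2)
 * accepts and -EINVAL otherwise.
 */
static int
rename_flags_check(unsigned int flags)
{
	/* Unknown bits are rejected. */
	if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT))
		return (-EINVAL);

	/* An exchange needs both names to exist, so it excludes the others. */
	if ((flags & RENAME_EXCHANGE) &&
	    (flags & (RENAME_NOREPLACE | RENAME_WHITEOUT)))
		return (-EINVAL);

	return (0);
}
```

A filesystem that implements the flags still has to honour each one itself; the check above only covers the argument validation.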

How Has This Been Tested?

I conducted a variety of smoke-tests to check that all of the renameat2(2) flags (and ordinary rename) operated correctly. After all the runs, I checked that there were no leaks with zdb -bbb. However, I have not tested the error path at all (and I have a feeling this is almost certainly where I've made mistakes).

  • [ ] TODO: Run the rename-related xfstests.

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Performance enhancement (non-breaking change which improves efficiency)
  • [x] Code cleanup (non-breaking change which makes code smaller or more readable)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation (a change to man pages or other documentation)

Checklist:

  • [x] My code follows the ZFS on Linux code style requirements.
  • [ ] I have updated the documentation accordingly.
  • [x] I have read the contributing document.
  • [ ] I have added tests to cover my changes.
  • [ ] All new and existing tests passed.
  • [x] All commit messages are properly formatted and contain Signed-off-by.
+716 -82

14 comments

22 changed files

cyphar

pr closed time in 2 days

pull request comment zfsonlinux/zfs

zfs_rename: support RENAME_* flags

This is being carried by #9414.

cyphar

comment created time in 2 days

push event cyphar/linux

Matthew Wilcox (Oracle)

commit sha 91abab83839aa2eba073e4a63c729832fdb27ea1

XArray: Fix xas_next() with a single entry at 0

If there is only a single entry at 0, the first time we call xas_next(), we return the entry. Unfortunately, all subsequent times we call xas_next(), we also return the entry at 0 instead of noticing that the xa_index is now greater than zero. This broke find_get_pages_contig().

Fixes: 64d3e9a9e0cc ("xarray: Step through an XArray")
Reported-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Martin Blumenstingl

commit sha 44b09b11b813b8550e6b976ea51593bc23bba8d1

clk: meson: gxbb: let sar_adc_clk_div set the parent clock rate

The meson-saradc driver manually sets the input clock for sar_adc_clk_sel. Update the GXBB clock driver (which is used on GXBB, GXL and GXM) so the rate settings on sar_adc_clk_div are propagated up to sar_adc_clk_sel which will let the common clock framework select the best matching parent clock if we want that.

This makes sar_adc_clk_div consistent with the axg-aoclk and g12a-aoclk drivers, which both also specify CLK_SET_RATE_PARENT.

Fixes: 33d0fcdfe0e870 ("clk: gxbb: add the SAR ADC clocks and expose them")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>

Neil Armstrong

commit sha 4a079643fc73247667000ba54fbccc2acadb04a5

clk: meson: g12a: fix cpu clock rate setting

CLK_SET_RATE_NO_REPARENT is wrongly set on the g12a cpu premux0 clocks flags, and CLK_SET_RATE_PARENT is required for the g12a cpu premux0 clock and the g12b cpub premux0 clock, otherwise CCF always selects the SYS_PLL clock to feed the cpu cluster.

Fixes: ffae8475b90c ("clk: meson: g12a: add notifiers to handle cpu clock change")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>

Neil Armstrong

commit sha 90b171f6035688236a3f09117a683020be45603a

clk: meson: g12a: set CLK_MUX_ROUND_CLOSEST on the cpu clock muxes

When setting the 100MHz, 500MHz, 666MHz and 1GHz rate for CPU clocks, CCF will use the SYS_PLL to handle these frequencies, but:
- using FIXED_PLL derived FCLK_DIV2/DIV3 clocks is more precise
- the Amlogic G12A/G12B/SM1 Suspend handling in firmware doesn't handle entering suspend using SYS_PLL for these frequencies

Adding CLK_MUX_ROUND_CLOSEST on all the muxes of the non-SYS_PLL cpu clock tree helps CCF always selecting the FCLK_DIV2/DIV3 as source for these frequencies.

Fixes: ffae8475b90c ("clk: meson: g12a: add notifiers to handle cpu clock change")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>

Eugen Hristev

commit sha 2200ab6a7403f4fcd052c55ca328fc942f9392b6

clk: at91: sam9x60: fix programmable clock

The prescaler mask for sam9x60 must be 0xff (8 bits). Being set to 0, means that we cannot set any prescaler, thus the programmable clocks do not work (except the case with prescaler 0) Set the mask accordingly in layout struct.

Fixes: 01e2113de9a5 ("clk: at91: add sam9x60 pmc driver")
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Link: https://lkml.kernel.org/r/1569321191-27606-1-git-send-email-eugen.hristev@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>

Mika Westerberg

commit sha fd5c46b754d4799afda8dcdd6851e0390aa4961a

thunderbolt: Read DP IN adapter first two dwords in one go

When we discover existing DP tunnels the code checks whether DP IN adapter port is enabled by calling tb_dp_port_is_enabled() before it continues the discovery process. On Light Ridge (gen 1) controller reading only the first dword of the DP IN config space causes subsequent access to the same DP IN port path config space to fail or return invalid data as can be seen in the below splat:

    thunderbolt 0000:07:00.0: CFG_ERROR(0:d): Invalid config space or offset
    Call Trace:
     tb_cfg_read+0xb9/0xd0
     __tb_path_deactivate_hop+0x98/0x210
     tb_path_activate+0x228/0x7d0
     tb_tunnel_restart+0x95/0x200
     tb_handle_hotplug+0x30e/0x630
     process_one_work+0x1b4/0x340
     worker_thread+0x44/0x3d0
     kthread+0xeb/0x120
     ? process_one_work+0x340/0x340
     ? kthread_park+0xa0/0xa0
     ret_from_fork+0x1f/0x30

If both DP In adapter config dwords are read in one go the issue does not reproduce. This is likely firmware bug but we can work it around by always reading the two dwords in one go. There should be no harm for other controllers either so can do it unconditionally.

Link: https://lkml.org/lkml/2019/8/28/160
Reported-by: Brad Campbell <lists2009@fnarfbargle.com>
Tested-by: Brad Campbell <lists2009@fnarfbargle.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>

Mika Westerberg

commit sha 6f6709734274aef75058356e029d5e8f86d0d53b

thunderbolt: Fix lockdep circular locking depedency warning

When lockdep is enabled, plugging Thunderbolt dock on Dominik's laptop triggers following splat:

    ======================================================
    WARNING: possible circular locking dependency detected
    5.3.0-rc6+ #1 Tainted: G T
    ------------------------------------------------------
    pool-/usr/lib/b/1258 is trying to acquire lock:
    000000005ab0ad43 (pci_rescan_remove_lock){+.+.}, at: authorized_store+0xe8/0x210

    but task is already holding lock:
    00000000bfb796b5 (&tb->lock){+.+.}, at: authorized_store+0x7c/0x210

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&tb->lock){+.+.}:
           __mutex_lock+0xac/0x9a0
           tb_domain_add+0x2d/0x130
           nhi_probe+0x1dd/0x330
           pci_device_probe+0xd2/0x150
           really_probe+0xee/0x280
           driver_probe_device+0x50/0xc0
           bus_for_each_drv+0x84/0xd0
           __device_attach+0xe4/0x150
           pci_bus_add_device+0x4e/0x70
           pci_bus_add_devices+0x2e/0x66
           pci_bus_add_devices+0x59/0x66
           pci_bus_add_devices+0x59/0x66
           enable_slot+0x344/0x450
           acpiphp_check_bridge.part.0+0x119/0x150
           acpiphp_hotplug_notify+0xaa/0x140
           acpi_device_hotplug+0xa2/0x3f0
           acpi_hotplug_work_fn+0x1a/0x30
           process_one_work+0x234/0x580
           worker_thread+0x50/0x3b0
           kthread+0x10a/0x140
           ret_from_fork+0x3a/0x50

    -> #0 (pci_rescan_remove_lock){+.+.}:
           __lock_acquire+0xe54/0x1ac0
           lock_acquire+0xb8/0x1b0
           __mutex_lock+0xac/0x9a0
           authorized_store+0xe8/0x210
           kernfs_fop_write+0x125/0x1b0
           vfs_write+0xc2/0x1d0
           ksys_write+0x6c/0xf0
           do_syscall_64+0x50/0x180
           entry_SYSCALL_64_after_hwframe+0x49/0xbe

    other info that might help us debug this:

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&tb->lock);
                                   lock(pci_rescan_remove_lock);
                                   lock(&tb->lock);
      lock(pci_rescan_remove_lock);

     *** DEADLOCK ***

    5 locks held by pool-/usr/lib/b/1258:
     #0: 000000003df1a1ad (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x4d/0x60
     #1: 0000000095a40b02 (sb_writers#6){.+.+}, at: vfs_write+0x185/0x1d0
     #2: 0000000017a7d714 (&of->mutex){+.+.}, at: kernfs_fop_write+0xf2/0x1b0
     #3: 000000004f262981 (kn->count#208){.+.+}, at: kernfs_fop_write+0xfa/0x1b0
     #4: 00000000bfb796b5 (&tb->lock){+.+.}, at: authorized_store+0x7c/0x210

    stack backtrace:
    CPU: 0 PID: 1258 Comm: pool-/usr/lib/b Tainted: G T 5.3.0-rc6+ #1

On an system using ACPI hotplug the host router gets hotplugged first and then the firmware starts sending notifications about connected devices so the above scenario should not happen in reality. However, after taking a second look at commit a03e828915c0 ("thunderbolt: Serialize PCIe tunnel creation with PCI rescan") that introduced the locking, I don't think it is actually correct. It may have cured the symptom but probably the real root cause was somewhere closer to PCI stack and possibly is already fixed with recent kernels. I also tried to reproduce the original issue with the commit reverted but could not.

So to keep lockdep happy and the code bit less complex drop calls to pci_lock_rescan_remove()/pci_unlock_rescan_remove() in tb_switch_set_authorized() effectively reverting a03e828915c0.

Link: https://lkml.org/lkml/2019/8/30/513
Fixes: a03e828915c0 ("thunderbolt: Serialize PCIe tunnel creation with PCI rescan")
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>

Mika Westerberg

commit sha 747125db6dcd8bcc21f13d013f6e6a2acade21ee

thunderbolt: Drop unnecessary read when writing LC command in Ice Lake

The read is not needed as we overwrite the returned value in the next line anyway so drop it.

Fixes: 3cdb9446a117 ("thunderbolt: Add support for Intel Ice Lake")
Reported-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>

Alexandru Ardelean

commit sha 24e1eb5c0d78cfb9750b690bbe997d4d59170258

iio: imu: adis16480: make sure provided frequency is positive

It could happen that either `val` or `val2` [provided from userspace] is negative. In that case the computed frequency could get a weird value.

Fix this by checking that neither of the 2 variables is negative, and check that the computed result is not-zero.

Fixes: e4f959390178 ("iio: imu: adis16480 switch sampling frequency attr to core support")
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
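The two checks the commit message describes can be sketched in user space as follows. This is an illustration under assumptions, not the driver code: `freq_to_millihz` is a hypothetical name, and the units (`val` in whole hertz, `val2` in millionths of a hertz) are assumed here rather than stated in the message.

```c
#include <errno.h>

/*
 * Hypothetical helper: combine an integer part and a micro part into
 * millihertz, rejecting negative inputs and a zero result.
 */
static int
freq_to_millihz(int val, int val2, unsigned long long *out)
{
	unsigned long long t;

	/* Both halves come from userspace, so neither may be negative. */
	if (val < 0 || val2 < 0)
		return (-EINVAL);

	t = (unsigned long long)val * 1000 + (unsigned long long)val2 / 1000;

	/* A zero frequency cannot be used as a sampling rate. */
	if (t == 0)
		return (-EINVAL);

	*out = t;
	return (0);
}
```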

Andreas Klinger

commit sha 431f7667bd6889a274913162dfd19cce9d84848e

iio: srf04: fix wrong limitation in distance measuring

The measured time value in the driver is limited to the maximum distance which can be read by the sensor.

This limitation was wrong and is fixed by this patch.

It also takes into account that we are supporting a variety of sensors today and that the recently added sensors have a higher maximum distance range.

Changes in v2:
- Added a Tested-by

Suggested-by: Zbyněk Kocur <zbynek.kocur@fel.cvut.cz>
Tested-by: Zbyněk Kocur <zbynek.kocur@fel.cvut.cz>
Signed-off-by: Andreas Klinger <ak@it-klinger.de>
Cc:<Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Michal Suchanek

commit sha 52eb063d153ac310058fbaa91577a72c0e7a7169

soundwire: depend on ACPI

The device cannot be probed on !ACPI and gives this warning:

    drivers/soundwire/slave.c:16:12: warning: ‘sdw_slave_add’ defined but not used [-Wunused-function]
     static int sdw_slave_add(struct sdw_bus *bus,
                ^~~~~~~~~~~~~

Cc: stable@vger.kernel.org
Fixes: 7c3cd189b86d ("soundwire: Add Master registration")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Link: https://lore.kernel.org/r/bd685232ea511251eeb9554172f1524eabf9a46e.1570097621.git.msuchanek@suse.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>

Michal Suchanek

commit sha 0f8c0f8a7782178c40157b2feb6a532493cbadd3

soundwire: depend on ACPI || OF

Now devicetree is supported for probing soundwire as well.

On platforms built with !ACPI !OF (ie s390x) the device still cannot be probed and gives a build warning.

Cc: stable@vger.kernel.org
Fixes: a2e484585ad3 ("soundwire: core: add device tree support for slave devices")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Link: https://lore.kernel.org/r/0b89b4ea16a93f523105c81a2f718b0cd7ec66f2.1570097621.git.msuchanek@suse.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>

Fabien Parent

commit sha 41d49e7939de5ec532d86494185b2ca2e99c848a

clocksource/drivers/mediatek: Fix error handling

When timer_of_init fails, it cleans up after itself by undoing everything it did during the initialization function.

mtk_syst_init and mtk_gpt_init both call timer_of_cleanup if timer_of_init fails. timer_of_cleanup try to release the resource taken. Since these resources have already been cleaned up by timer_of_init, we end up getting a few warnings printed:

    [ 0.001935] WARNING: CPU: 0 PID: 0 at __clk_put+0xe8/0x128
    [ 0.002650] Modules linked in:
    [ 0.003058] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.67+ #1
    [ 0.003852] Hardware name: MediaTek MT8183 (DT)
    [ 0.004446] pstate: 20400085 (nzCv daIf +PAN -UAO)
    [ 0.005073] pc : __clk_put+0xe8/0x128
    [ 0.005555] lr : clk_put+0xc/0x14
    [ 0.005988] sp : ffffff80090b3ea0
    [ 0.006422] x29: ffffff80090b3ea0 x28: 0000000040e20018
    [ 0.007121] x27: ffffffc07bfff780 x26: 0000000000000001
    [ 0.007819] x25: ffffff80090bda80 x24: ffffff8008ec5828
    [ 0.008517] x23: ffffff80090bd000 x22: ffffff8008d8b2e8
    [ 0.009216] x21: 0000000000000001 x20: fffffffffffffdfb
    [ 0.009914] x19: ffffff8009166180 x18: 00000000002bffa8
    [ 0.010612] x17: ffffffc012996980 x16: 0000000000000000
    [ 0.011311] x15: ffffffbf004a6800 x14: 3536343038393334
    [ 0.012009] x13: 2079726576652073 x12: 7eb9c62c5c38f100
    [ 0.012707] x11: ffffff80090b3ba0 x10: ffffff80090b3ba0
    [ 0.013405] x9 : 0000000000000004 x8 : 0000000000000040
    [ 0.014103] x7 : ffffffc079400270 x6 : 0000000000000000
    [ 0.014801] x5 : ffffffc079400248 x4 : 0000000000000000
    [ 0.015499] x3 : 0000000000000000 x2 : 0000000000000000
    [ 0.016197] x1 : ffffff80091661c0 x0 : fffffffffffffdfb
    [ 0.016896] Call trace:
    [ 0.017218] __clk_put+0xe8/0x128
    [ 0.017654] clk_put+0xc/0x14
    [ 0.018048] timer_of_cleanup+0x60/0x7c
    [ 0.018551] mtk_syst_init+0x8c/0x9c
    [ 0.019020] timer_probe+0x6c/0xe0
    [ 0.019469] time_init+0x14/0x44
    [ 0.019893] start_kernel+0x2d0/0x46c
    [ 0.020378] ---[ end trace 8c1efabea1267649 ]---
    [ 0.020982] ------------[ cut here ]------------
    [ 0.021586] Trying to vfree() nonexistent vm area ((____ptrval____))
    [ 0.022427] WARNING: CPU: 0 PID: 0 at __vunmap+0xd0/0xd8
    [ 0.023119] Modules linked in:
    [ 0.023524] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.19.67+ #1
    [ 0.024498] Hardware name: MediaTek MT8183 (DT)
    [ 0.025091] pstate: 60400085 (nZCv daIf +PAN -UAO)
    [ 0.025718] pc : __vunmap+0xd0/0xd8
    [ 0.026176] lr : __vunmap+0xd0/0xd8
    [ 0.026632] sp : ffffff80090b3e90
    [ 0.027066] x29: ffffff80090b3e90 x28: 0000000040e20018
    [ 0.027764] x27: ffffffc07bfff780 x26: 0000000000000001
    [ 0.028462] x25: ffffff80090bda80 x24: ffffff8008ec5828
    [ 0.029160] x23: ffffff80090bd000 x22: ffffff8008d8b2e8
    [ 0.029858] x21: 0000000000000000 x20: 0000000000000000
    [ 0.030556] x19: ffffff800800d000 x18: 00000000002bffa8
    [ 0.031254] x17: 0000000000000000 x16: 0000000000000000
    [ 0.031952] x15: ffffffbf004a6800 x14: 3536343038393334
    [ 0.032651] x13: 2079726576652073 x12: 7eb9c62c5c38f100
    [ 0.033349] x11: ffffff80090b3b40 x10: ffffff80090b3b40
    [ 0.034047] x9 : 0000000000000005 x8 : 5f5f6c6176727470
    [ 0.034745] x7 : 5f5f5f5f28282061 x6 : ffffff80091c86ef
    [ 0.035443] x5 : ffffff800852b690 x4 : 0000000000000000
    [ 0.036141] x3 : 0000000000000002 x2 : 0000000000000002
    [ 0.036839] x1 : 7eb9c62c5c38f100 x0 : 7eb9c62c5c38f100
    [ 0.037536] Call trace:
    [ 0.037859] __vunmap+0xd0/0xd8
    [ 0.038271] vunmap+0x24/0x30
    [ 0.038664] __iounmap+0x2c/0x34
    [ 0.039088] timer_of_cleanup+0x70/0x7c
    [ 0.039591] mtk_syst_init+0x8c/0x9c
    [ 0.040060] timer_probe+0x6c/0xe0
    [ 0.040507] time_init+0x14/0x44
    [ 0.040932] start_kernel+0x2d0/0x46c

This commit remove the calls to timer_of_cleanup when timer_of_init fails since it is unnecessary and actually cause warnings to be printed.

Fixes: a0858f937960 ("mediatek: Convert the driver to timer-of")
Signed-off-by: Fabien Parent <fparent@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/linux-arm-kernel/20190919191315.25190-1-fparent@baylibre.com/

Stephen Boyd

commit sha 3d883e896947cceeb0e290dfffe0bc16912c90ae

Merge tag 'clk-meson-fixes-v5.4-1' of https://github.com/BayLibre/clk-meson into clk-fixes

Pull first round of amlogic clock fixes from Jerome Brunet:

- This fixes the clock rate propagation for the g12a cpu and gxbb adc clocks.

* tag 'clk-meson-fixes-v5.4-1' of https://github.com/BayLibre/clk-meson:
  clk: meson: g12a: set CLK_MUX_ROUND_CLOSEST on the cpu clock muxes
  clk: meson: g12a: fix cpu clock rate setting
  clk: meson: gxbb: let sar_adc_clk_div set the parent clock rate

Geert Uytterhoeven

commit sha 7693de9f7aa4e2993fbd7094863304be6a4bbe16

clocksource/drivers/sh_mtu2: Do not loop using platform_get_irq_by_name()

As platform_get_irq_by_name() now prints an error when the interrupt does not exist, looping over possibly non-existing interrupts causes the printing of scary messages like:

    sh_mtu2 fcff0000.timer: IRQ tgi1a not found
    sh_mtu2 fcff0000.timer: IRQ tgi2a not found

Fix this by using the platform_irq_count() helper, to avoid touching non-existent interrupts. Limit the returned number of interrupts to the maximum number of channels currently supported by the driver in a future-proof way, i.e. using ARRAY_SIZE() instead of a hardcoded number.

Fixes: 7723f4c5ecdb8d83 ("driver core: platform: Add an error message to platform_get_irq*()")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20191016143003.28561-1-geert+renesas@glider.be

Leonard Crestez

commit sha 83c774f0c69d9d1b32812f3fcf7dde9b556d2670

interconnect: qcom: Fix icc_onecell_data allocation

This is a struct with a trailing zero-length array of icc_node pointers but it's allocated as if it were a single array of icc_nodes instead.

This allocates too much memory at probe time but shouldn't have any noticeable effect. Both sdm845 and qcs404 are affected.

Fix by replacing kcalloc with kzalloc and using the "struct_size" macro.

Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Fixes: 5e4e6c4d3ae0 ("interconnect: qcom: Add QCS404 interconnect provider driver")
Link: https://lore.kernel.org/linux-pm/a7360abb6561917e30bbfaa6084578449152bf1d.1569348056.git.leonard.crestez@nxp.com/
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
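A user-space sketch of the sizing rule behind that fix: a header followed by a trailing array of pointers needs `sizeof(header) + n * sizeof(pointer)` bytes, computed with an overflow check. The names `fam_data`, `fam_size` and `fam_alloc` are hypothetical stand-ins chosen for this illustration; the kernel itself uses `kzalloc()` with the `struct_size()` macro, as the message says.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct fam_node;			/* opaque, only pointers are stored */

struct fam_data {
	size_t num_nodes;
	struct fam_node *nodes[];	/* flexible array member */
};

/* Bytes needed for the header plus n pointers; SIZE_MAX on overflow. */
static size_t
fam_size(size_t n)
{
	if (n > (SIZE_MAX - sizeof (struct fam_data)) /
	    sizeof (struct fam_node *))
		return (SIZE_MAX);
	return (sizeof (struct fam_data) + n * sizeof (struct fam_node *));
}

/* Zeroed allocation sized for exactly n trailing pointers. */
static struct fam_data *
fam_alloc(size_t n)
{
	size_t sz = fam_size(n);
	struct fam_data *d;

	if (sz == SIZE_MAX)
		return (NULL);
	d = calloc(1, sz);
	if (d != NULL)
		d->num_nodes = n;
	return (d);
}
```

Sizing by whole node structures instead of pointers over-allocates, which matches the commit's note that the bug wasted memory without visible effect.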

Georgi Djakov

commit sha a8dfe193a60c6db7c54e03e3f1b96e0aa7244990

interconnect: Add locking in icc_set_tag()

We must ensure that the tag is not changed while we aggregate the requests. Currently the icc_set_tag() is not using any locks and this may cause the values to be aggregated incorrectly. Fix this by acquiring the icc_lock while we set the tag.

Link: https://lore.kernel.org/lkml/20191018141750.17032-1-georgi.djakov@linaro.org/
Fixes: 127ab2cc5f19 ("interconnect: Add support for path tags")
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>

Bard Liao

commit sha cf9249626f72878b6d205a4965093cba5cce98df

soundwire: intel: fix intel_register_dai PDI offsets and numbers

There are two issues, likely copy/paste:

1. Use cdns->pcm.num_in instead of stream_num_in for consistency with the rest of the code. This was not detected earlier since platforms did not have input-only PDIs.

2. use the correct offset for bi-dir PDM, based on IN and OUT PDIs. Again this was not detected since PDM was not supported earlier.

Reported-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20190916192348.467-2-pierre-louis.bossart@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

Andy Shevchenko

commit sha 29c2c6aa32405dfee4a29911a51ba133edcedb0f

pinctrl: intel: Avoid potential glitches if pin is in GPIO mode

When consumer requests a pin, in order to be on the safest side, we switch it first to GPIO mode followed by immediate transition to the input state. Due to posted writes it's luckily to be a single I/O transaction.

However, if firmware or boot loader already configures the pin to the GPIO mode, user expects no glitches for the requested pin. We may check if the pin is pre-configured and leave it as is till the actual consumer toggles its state to avoid glitches.

Fixes: 7981c0015af2 ("pinctrl: intel: Add Intel Sunrisepoint pin controller and GPIO support")
Depends-on: f5a26acf0162 ("pinctrl: intel: Initialize GPIO properly when used through irqchip")
Cc: stable@vger.kernel.org
Cc: fei.yang@intel.com
Reported-by: Oliver Barta <oliver.barta@aptiv.com>
Reported-by: Malin Jonsson <malin.jonsson@ericsson.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>

Stephan Gerhold

commit sha 9110d1b0e229cebb1ffce0c04db2b22beffd513d

ASoC: msm8916-wcd-analog: Fix RX1 selection in RDAC2 MUX

According to the PM8916 Hardware Register Description, CDC_D_CDC_CONN_HPHR_DAC_CTL has only a single bit (RX_SEL) to switch between RX1 (0) and RX2 (1). It is not possible to disable it entirely to achieve the "ZERO" state.

However, at the moment the "RDAC2 MUX" mixer defines three possible values ("ZERO", "RX2" and "RX1"). Setting the mixer to "ZERO" actually configures it to RX1. Setting the mixer to "RX1" has (seemingly) no effect.

Remove "ZERO" and replace it with "RX1" to fix this.

Fixes: 585e881e5b9e ("ASoC: codecs: Add msm8916-wcd analog codec")
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20191020153007.206070-1-stephan@gerhold.net
Signed-off-by: Mark Brown <broonie@kernel.org>

push time in 2 days

Pull request review comment zfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

 zfs_rename(struct inode *sdip, char *snm, struct inode *tdip, char *tnm,

 	zfs_inode_update(szp);
 	iput(ZTOI(szp));
+	if (wzp) {
+		zfs_inode_update(wzp);
+		iput(ZTOI(wzp));
+	}
 	if (tzp) {
 		zfs_inode_update(tzp);
 		iput(ZTOI(tzp));
 	}

+	if (zl != NULL)
+		zfs_rename_unlock(&zl);
+
+	zfs_dirent_unlock(sdl);
+	zfs_dirent_unlock(tdl);
+
 	if (zfsvfs->z_os->os_sync == ZFS_SYNC_ALWAYS)
 		zil_commit(zilog, 0);

 	ZFS_EXIT(zfsvfs);
 	return (error);
+
+	/*
+	 * Clean-up path for broken link state.
+	 *
+	 * At this point we are in a (very) bad state, so we need to do our
+	 * best to correct the state. In particular, all of the nlinks are
+	 * wrong because we were destroying and creating links with ZRENAMING.
+	 *
+	 * In some form, all of these operations have to resolve the state:
+	 *
+	 *  * link_destroy() *must* succeed. Fortunately, this is very likely
+	 *    since we only just created it.
+	 *
+	 *  * link_create()s are allowed to fail (though they shouldn't because
+	 *    we only just unlinked them and are putting the entries back
+	 *    during clean-up). But if they fail, we can just forcefully drop
+	 *    the nlink value to (at the very least) avoid broken nlink values
+	 *    -- though in the case of non-empty directories we will have to
+	 *    panic (otherwise we'd have a leaked directory with a broken ..).
+	 */
+commit_unlink_td_szp:
+	VERIFY3U(zfs_link_destroy(tdl, szp, tx, ZRENAMING, NULL), ==, 0);
+commit_link_tzp:
+	if (tzp) {
+		if (zfs_link_create(tdl, tzp, tx, ZRENAMING))
+			VERIFY3U(zfs_drop_nlink(tzp, tx, NULL), ==, 0);
+	}
+commit_link_szp:
+	if (zfs_link_create(sdl, szp, tx, ZRENAMING))
+		VERIFY3U(zfs_drop_nlink(szp, tx, NULL), ==, 0);
+	goto commit;

You're quite right -- carry on. :wink:
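The label chain in the diff above follows the usual C pattern of undoing completed steps in reverse order. A toy model of that pattern, under stated assumptions: `toy_dir`, `toy_step` and `toy_rename` are hypothetical names, the "directory" is just two integers, and failures are injected by step number. It is not the ZFS code and leaves out locking, transactions and link counts.

```c
/* Toy directory: each name holds an inode number, 0 means "no entry". */
struct toy_dir {
	int src;	/* inode linked at the source name */
	int tgt;	/* inode linked at the target name */
};

/* Pretend step n fails when it equals fail_at (0 means nothing fails). */
static int
toy_step(int n, int fail_at)
{
	return (n == fail_at ? -1 : 0);
}

static int
toy_rename(struct toy_dir *d, int fail_at)
{
	int moved = d->src;
	int old_tgt = d->tgt;
	int error;

	error = toy_step(1, fail_at);	/* 1: unlink the source */
	if (error)
		goto out;
	d->src = 0;

	error = toy_step(2, fail_at);	/* 2: unlink the old target */
	if (error)
		goto relink_src;
	d->tgt = 0;

	error = toy_step(3, fail_at);	/* 3: link the source at the target */
	if (error)
		goto relink_tgt;
	d->tgt = moved;
	return (0);

	/* Each label undoes one step, then falls through to the next. */
relink_tgt:
	d->tgt = old_tgt;
relink_src:
	d->src = moved;
out:
	return (error);
}
```

Whichever step fails, the caller sees the original two entries again, which is the property the real clean-up path is trying to preserve.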

snajpa

comment created time in 3 days

Pull request review comment zfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

 zfs_rename(struct inode *sdip, char *snm, struct inode *tdip, char *tnm,
 		return (error);
 	}

-	if (tzp)	/* Attempt to remove the existing target */
-		error = zfs_link_destroy(tdl, tzp, tx, zflg, NULL);
+	/*
+	 * Unlink the source.
+	 */
+	szp->z_pflags |= ZFS_AV_MODIFIED;
+	if (tdzp->z_pflags & ZFS_PROJINHERIT)
+		szp->z_pflags |= ZFS_PROJINHERIT;
+
+	error = sa_update(szp->z_sa_hdl, SA_ZPL_FLAGS(zfsvfs),
+	    (void *)&szp->z_pflags, sizeof (uint64_t), tx);
+	ASSERT0(error);

-	if (error == 0) {
-		error = zfs_link_create(tdl, szp, tx, ZRENAMING);
-		if (error == 0) {
-			szp->z_pflags |= ZFS_AV_MODIFIED;
-			if (tdzp->z_pflags & ZFS_PROJINHERIT)
-				szp->z_pflags |= ZFS_PROJINHERIT;
+	error = zfs_link_destroy(sdl, szp, tx, ZRENAMING, NULL);
+	if (error)
+		goto commit;
+
+	/*
+	 * Unlink the target.
+	 */
+	if (tzp) {
+		int tzflg = zflg;
+
+		if (flags & RENAME_EXCHANGE) {
+			/* This inode will be re-linked soon. */
+			tzflg |= ZRENAMING;
+
+			tzp->z_pflags |= ZFS_AV_MODIFIED;
+			if (sdzp->z_pflags & ZFS_PROJINHERIT)
+				tzp->z_pflags |= ZFS_PROJINHERIT;

-			error = sa_update(szp->z_sa_hdl, SA_ZPL_FLAGS(zfsvfs),
-			    (void *)&szp->z_pflags, sizeof (uint64_t), tx);
+			error = sa_update(tzp->z_sa_hdl, SA_ZPL_FLAGS(zfsvfs),
+			    (void *)&tzp->z_pflags, sizeof (uint64_t), tx);
 			ASSERT0(error);
+		}
+		error = zfs_link_destroy(tdl, tzp, tx, tzflg, NULL);
+		if (error)
+			goto commit_link_szp;
+	}

-			error = zfs_link_destroy(sdl, szp, tx, ZRENAMING, NULL);
-			if (error == 0) {
-				zfs_log_rename(zilog, tx, TX_RENAME |
-				    (flags & FIGNORECASE ? TX_CI : 0), sdzp,
-				    sdl->dl_name, tdzp, tdl->dl_name, szp);
-			} else {
-				/*
-				 * At this point, we have successfully created
-				 * the target name, but have failed to remove
-				 * the source name.  Since the create was done
-				 * with the ZRENAMING flag, there are
-				 * complications; for one, the link count is
-				 * wrong.  The easiest way to deal with this
-				 * is to remove the newly created target, and
-				 * return the original error.  This must
-				 * succeed; fortunately, it is very unlikely to
-				 * fail, since we just created it.
-				 */
-				VERIFY3U(zfs_link_destroy(tdl, szp, tx,
-				    ZRENAMING, NULL), ==, 0);
-			}
-		} else {
-			/*
-			 * If we had removed the existing target, subsequent
-			 * call to zfs_link_create() to add back the same entry
-			 * but, the new dnode (szp) should not fail.
-			 */
-			ASSERT(tzp == NULL);
+	/*
+	 * Create the new target links:
+	 *   * We always link the target.
+	 *   * RENAME_WHITEOUT: Create a whiteout inode in-place of the source.
+	 *   * RENAME_EXCHANGE: Link the old target to the source.
+	 */
+	error = zfs_link_create(tdl, szp, tx, ZRENAMING);
+	if (error) {
+		/*
+		 * If we have removed the existing target, a subsequent call to
+		 * zfs_link_create() to add back the same entry, but with a new
+		 * dnode (szp), should not fail.
+		 */
+		ASSERT3P(tzp, ==, NULL);
+		goto commit_link_tzp;
+	}
+
+	if (flags & RENAME_EXCHANGE) {
+		error = zfs_link_create(sdl, tzp, tx, ZRENAMING);
+		/*
+		 * The same argument as zfs_link_create() failing for
+		 * szp applies here, since the source directory must
+		 * have had an entry we are replacing.
+		 */
+		ASSERT3U(error, ==, 0);
+		if (error)
+			goto commit_unlink_td_szp;
+	} else if (flags & RENAME_WHITEOUT) {
+		zfs_mknode(sdzp, &wo_vap, tx, cr, 0, &wzp, &acl_ids);
+		error = zfs_link_create(sdl, wzp, tx, ZNEW);
+		if (error) {
+			zfs_znode_delete(wzp, tx);
+			remove_inode_hash(ZTOI(wzp));
+			goto commit_unlink_td_szp;
 		}
+		/* No need to zfs_log_create_txtype here. */
 	}

+	if (fuid_dirtied)
+		zfs_fuid_sync(zfsvfs, tx);
+
+	if (flags & RENAME_EXCHANGE) {
+		zfs_log_rename_exchange(zilog, tx,
+		    (flags & FIGNORECASE ? TX_CI : 0), sdzp,
+		    sdl->dl_name, tdzp, tdl->dl_name, szp);
+	} else if (flags & RENAME_WHITEOUT) {
+		vsecattr_t vsecp;
+
+		vsecp.vsa_mask |= VSA_ACE_ALLTYPES;
+		error = zfs_getacl(szp, &vsecp, B_TRUE, cr);

Fair enough -- when writing them I was unsure whether ASSERT or VERIFY was the more accepted practice within the ZFS codebase. I would also opt for VERIFY.
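The practical difference between the two families is that ASSERT-style checks are only active in debug builds, while VERIFY-style checks are always evaluated. A minimal sketch of that distinction, with assumptions spelled out: `TOY_ASSERT`, `TOY_VERIFY`, `TOY_DEBUG` and the failure counter are hypothetical stand-ins, and the real macros panic instead of counting.

```c
static int toy_failures;	/* counts failed checks instead of panicking */

static int
toy_check_failed(void)
{
	toy_failures++;
	return (0);
}

/* Always evaluated, in the spirit of the VERIFY family. */
#define	TOY_VERIFY(expr)	((void)((expr) ? 0 : toy_check_failed()))

/* Only evaluated in debug builds, in the spirit of the ASSERT family. */
#ifdef TOY_DEBUG
#define	TOY_ASSERT(expr)	TOY_VERIFY(expr)
#else
#define	TOY_ASSERT(expr)	((void)0)
#endif

static int
toy_handle(int error)
{
	TOY_ASSERT(error == 0);	/* vanishes without TOY_DEBUG */
	if (error)		/* so non-debug builds still need this */
		return (-1);
	return (0);
}
```

This is also why an assertion followed by an explicit `if (error)` is not redundant: on a production build only the `if` remains.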

snajpa

comment created time in 3 days

Pull request review comment zfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

 zfs_log_rename(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
 	zil_itx_assign(zilog, itx, tx);
 }

+/*
+ * At the moment, only Linux supports renameat2 variant of renameat, which
+ * adds three new flags of interest for us:
+ *     RENAME_NOREPLACE: if the target name at the moment of the call exists,
+ *                       don't rewrite it and return error
+ *     RENAME_EXCHANGE: atomically swap the two names on the filesystem
+ *     RENAME_WHITEOUT: creates a whiteout inode in place of renamed file as
+ *                      an atomic operation
+ *
+ * Ideally, these operations should be represented as new ZFS Intent Log record
+ * types, which should mandate a new ZFS feature flag due to the on-disk format
+ * change. One would use spa_feature_incr/decr functions to indicate that we're
+ * actually actively using the new on-disk txtypes - but these functions are
+ * only supposed to be called from the txg syncing context.
+ * This means that would need to force out in-progress txg to disk and start
+ * a new one before writing any ZIL records, just so we can be sure that ZIL
+ * replaying ZFS gets told it should expect potentially incompatible ZIL
+ * txtypes. Doing this would hurt performance.
+ * Alternatively, we could just activate the feature on a pool when these
+ * renameat2 flags get first used and leave it at that - which would render the
+ * pool importable read-only on implementations without the new feature flag,
+ * even when no new txtypes IL records would be present on-disk - which on most
+ * setups could be 'almost' all the time, so it'd be a shame to have them all
+ * read-only on non-Linux platforms.
+ * As a third option, at least until more platforms implement renameat2, we
+ * choose to rely on the fact that the ZIL is replayed in single-threaded mode
+ * before the dataset is mounted. This way, we can represent the otherwise
+ * atomic operations as a series of plain good old txtypes known to all current
+ * OpenZFS implementations. To do that, we use these hacky functions:
+ *
+ * zfs_log_rename_exchange
+ * zfs_log_rename_whiteout
+ *
+ *     To represent atomic rename with old non-atomic operations, we need
+ *     a temporary new name; so we try picking a name until we succeed, then
+ *     we get a dirent lock for that temp name until the final itx gets queued
+ */
+
+void
+zfs_log_rename_exchange(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
+    znode_t *sdzp, char *sname, znode_t *tdzp, char *dname, znode_t *szp)
+{
+	zfs_dirlock_t *tmpdl;
+	znode_t *tmpzp = NULL;
+	char rndname[16];
+	char *tmpname;
+	int retries = 0;
+	int error;
+
+	tmpname = kmem_alloc(MAXPATHLEN, KM_SLEEP);
+	ASSERT3P(tmpname, !=, NULL);
+
+	for (int i = 0; i < 16; i++) {
+		int r = 0xFF;
+		while (r > 127 || r == 0 || r == '/')
+			random_get_pseudo_bytes((void *)&r, 1);
+		rndname[i] = (char)r;
+	}
+
+	do {
+		retries++;
+		(void) snprintf(tmpname, MAXPATHLEN,
+		    "%s.zfs_renameat2_emul_%s%4d",

You don't need to add logic -- my point was that it should be prefixed with a . but it's only a nit.

snajpa

comment created time in 3 days
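
The comment block in the diff above describes emulating an atomic exchange with a series of ordinary renames through a temporary name. A minimal user-space sketch of that idea in Python follows. The helper `emulate_exchange` and its temporary-name pattern are hypothetical (the leading dot follows the review nit), and the three steps are not atomic, so this only makes sense where nothing else touches the directory, which is what the diff relies on single-threaded ZIL replay for.

```python
import os
import secrets


def emulate_exchange(dirpath: str, a: str, b: str, max_retries: int = 16) -> str:
    """Swap names a and b inside dirpath using three plain renames.

    Hypothetical helper: NOT atomic, unlike renameat2(RENAME_EXCHANGE).
    Returns the temporary name that was used.
    """
    def full(name: str) -> str:
        return os.path.join(dirpath, name)

    # Keep picking random dot-prefixed names until an unused one is found.
    for _ in range(max_retries):
        tmp = f".{a}.exchange_tmp_{secrets.token_hex(4)}"
        if not os.path.lexists(full(tmp)):
            break
    else:
        raise FileExistsError("no unused temporary name found")

    os.rename(full(a), full(tmp))  # a   -> tmp
    os.rename(full(b), full(a))    # b   -> a
    os.rename(full(tmp), full(b))  # tmp -> b
    return tmp
```

A crash between the steps would leave the temporary name behind, which is one reason a real implementation needs more care than this sketch shows.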

pull request commentzfsonlinux/zfs

Remove zpl_revalidate

:+1: Removing ->d_revalidate solves one of the problems on the path to overlay-on-ZFS support. Assuming that the new invalidation handling is correct (it seems right to me, at least on a surface-level), this looks good overall. Thanks @snajpa.

snajpa

comment created time in 3 days

issue commentcontainers/buildah

COPY is broken with relative symbolic links

You get a "too many symbolic links" error because what happens is that securejoin treats the link as being a link to itself (because /../../../README.md becomes /README.md). If you get an error from securejoin it's usually because a user is doing something dodgy, so adding some additional wrapping warnings in buildah wouldn't be a bad idea.

debarshiray

comment created time in 6 days
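
The clamping described in the comment above can be illustrated lexically. This is only a sketch in Python, not securejoin's actual implementation (securejoin is a Go library that also walks symlinks); `clamp_link_target` is a hypothetical helper that treats `/` as the scoped root.

```python
import posixpath


def clamp_link_target(link_dir: str, target: str) -> str:
    """Resolve a relative link target lexically, never escaping "/".

    Hypothetical illustration: ".." components that would climb above
    the root are simply dropped, as when resolving inside a scoped root.
    """
    joined = posixpath.join("/", link_dir, target)
    return posixpath.normpath(joined)
```

With this clamping, a link at `/README.md` that points to `../../../README.md` resolves back to `/README.md`, that is, to itself, which is consistent with the "too many symbolic links" error mentioned above.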

issue commentopencontainers/runc

cgroup2: procHooks: failed to load program: operation not permitted

@AkihiroSuda Is systemd setting a seccomp filter or AppArmor profile?

AkihiroSuda

comment created time in 6 days

pull request commentopencontainers/runtime-spec

Add mount and start hooks

@h-vetinari Aside from the few nits that I just commented, I'm fine with the bulk of the PR. We just need to get the @opencontainers/runtime-spec-maintainers to review it.

RenaudWasTaken

comment created time in 6 days

Pull request review commentopencontainers/runtime-spec

Add mount and start hooks

 For POSIX platforms, the configuration structure supports `hooks` for configurin
         * **`env`** (array of strings, OPTIONAL) with the same semantics as [IEEE Std 1003.1-2008's `environ`][ieee-1003.1-2008-xbd-c8.1].
         * **`timeout`** (int, OPTIONAL) is the number of seconds before aborting the hook.
             If set, `timeout` MUST be greater than zero.
+    * **`create-runtime`** (array of objects, OPTIONAL) is an array of [create-runtime hooks](#create-runtime-hooks).
+        Entries in the array have the same schema as prestart entries.
+    * **`create-container`** (array of objects, OPTIONAL) is an array of [create-container hooks](#create-container-hooks).
+        Entries in the array have the same schema as prestart entries.
+    * **`start-container`** (array of objects, OPTIONAL) is an array of [start-container hooks](#start-container-hooks).
+        Entries in the array have the same schema as prestart entries.
     * **`poststart`** (array of objects, OPTIONAL) is an array of [post-start hooks](#poststart).
-        Entries in the array have the same schema as pre-start entries.
+        Entries in the array have the same schema as prestart entries.
     * **`poststop`** (array of objects, OPTIONAL) is an array of [post-stop hooks](#poststop).
-        Entries in the array have the same schema as pre-start entries.
+        Entries in the array have the same schema as prestart entries.

 Hooks allow users to specify programs to run before or after various lifecycle events.
 Hooks MUST be called in the listed order.
-Hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
 The [state](runtime.md#state) of the container MUST be passed to hooks over stdin so that they may do work appropriate to the current state of the container.

 ### <a name="configHooksPrestart" />Prestart

-The pre-start hooks MUST be called after the [`start`](runtime.md#start) operation is called but [before the user-specified program command is executed](runtime.md#lifecycle).
+The prestart hooks MUST be called after the [`start`](runtime.md#start) operation is called but [before the user-specified program command is executed](runtime.md#lifecycle).
 On Linux, for example, they are called after the container namespaces are created, so they provide an opportunity to customize the container (e.g. the network namespace could be specified in this hook).

+Note: Prestart hooks were deprecated in favor of create-runtime, create-container and start-container hooks, which allow more granular hook control during the create and start phase.
+
+Prestart hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
+
+### <a name="configHooksCreateRuntime" />Create Runtime Hooks
+
+The create-runtime hooks MUST be called as part of the [`create`](runtime.md#create) operation after the runtime environment has been created (according to the configuration in config.json) but before the pivot_root or any equivalent operation has been executed.
+The create-runtime hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
+
+On Linux, for example, they are called after the container namespaces are created, so they provide an opportunity to customize the container (e.g. the network namespace could be specified in this hook).
+
+Note: the incorrect `runc` implementation of the prestart hooks, maps to a create-runtime hook.
+For runtimes that implement the deprecated prestart hooks as create-runtime hooks, create-runtime hooks MUST be called after the prestart hooks.

I agree with @h-vetinari -- please re-add the runc reference.

RenaudWasTaken

comment created time in 6 days

Pull request review commentopencontainers/runtime-spec

Add mount and start hooks

 For POSIX platforms, the configuration structure supports `hooks` for configurin
         * **`env`** (array of strings, OPTIONAL) with the same semantics as [IEEE Std 1003.1-2008's `environ`][ieee-1003.1-2008-xbd-c8.1].
         * **`timeout`** (int, OPTIONAL) is the number of seconds before aborting the hook.
             If set, `timeout` MUST be greater than zero.
+    * **`create-runtime`** (array of objects, OPTIONAL) is an array of [create-runtime hooks](#create-runtime-hooks).
+        Entries in the array have the same schema as prestart entries.
+    * **`create-container`** (array of objects, OPTIONAL) is an array of [create-container hooks](#create-container-hooks).
+        Entries in the array have the same schema as prestart entries.
+    * **`start-container`** (array of objects, OPTIONAL) is an array of [start-container hooks](#start-container-hooks).
+        Entries in the array have the same schema as prestart entries.
     * **`poststart`** (array of objects, OPTIONAL) is an array of [post-start hooks](#poststart).
-        Entries in the array have the same schema as pre-start entries.
+        Entries in the array have the same schema as prestart entries.

nit: "pre-start" isn't really a word -- write it as prestart (the same applies throughout your changes).

RenaudWasTaken

comment created time in 6 days

Pull request review commentopencontainers/runtime-spec

Add mount and start hooks

 For POSIX platforms, the configuration structure supports `hooks` for configurin
         * **`env`** (array of strings, OPTIONAL) with the same semantics as [IEEE Std 1003.1-2008's `environ`][ieee-1003.1-2008-xbd-c8.1].
         * **`timeout`** (int, OPTIONAL) is the number of seconds before aborting the hook.
             If set, `timeout` MUST be greater than zero.
+    * **`create-runtime`** (array of objects, OPTIONAL) is an array of [create-runtime hooks](#create-runtime-hooks).
+        Entries in the array have the same schema as prestart entries.
+    * **`create-container`** (array of objects, OPTIONAL) is an array of [create-container hooks](#create-container-hooks).
+        Entries in the array have the same schema as prestart entries.
+    * **`start-container`** (array of objects, OPTIONAL) is an array of [start-container hooks](#start-container-hooks).
+        Entries in the array have the same schema as prestart entries.
     * **`poststart`** (array of objects, OPTIONAL) is an array of [post-start hooks](#poststart).
-        Entries in the array have the same schema as pre-start entries.
+        Entries in the array have the same schema as prestart entries.
     * **`poststop`** (array of objects, OPTIONAL) is an array of [post-stop hooks](#poststop).
-        Entries in the array have the same schema as pre-start entries.
+        Entries in the array have the same schema as prestart entries.

 Hooks allow users to specify programs to run before or after various lifecycle events.
 Hooks MUST be called in the listed order.
-Hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
 The [state](runtime.md#state) of the container MUST be passed to hooks over stdin so that they may do work appropriate to the current state of the container.

 ### <a name="configHooksPrestart" />Prestart

-The pre-start hooks MUST be called after the [`start`](runtime.md#start) operation is called but [before the user-specified program command is executed](runtime.md#lifecycle).
+The prestart hooks MUST be called after the [`start`](runtime.md#start) operation is called but [before the user-specified program command is executed](runtime.md#lifecycle).
 On Linux, for example, they are called after the container namespaces are created, so they provide an opportunity to customize the container (e.g. the network namespace could be specified in this hook).

+Note: Prestart hooks were deprecated in favor of create-runtime, create-container and start-container hooks, which allow more granular hook control during the create and start phase.
+
+Prestart hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
+
+### <a name="configHooksCreateRuntime" />Create Runtime Hooks
+
+The create-runtime hooks MUST be called as part of the [`create`](runtime.md#create) operation after the runtime environment has been created (according to the configuration in config.json) but before the pivot_root or any equivalent operation has been executed.
+The create-runtime hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
+
+On Linux, for example, they are called after the container namespaces are created, so they provide an opportunity to customize the container (e.g. the network namespace could be specified in this hook).
+
+Note: the incorrect `runc` implementation of the prestart hooks, maps to a create-runtime hook.
+For runtimes that implement the deprecated prestart hooks as create-runtime hooks, create-runtime hooks MUST be called after the prestart hooks.
+
+### <a name="configHooksCreateContainer" />Create Container Hooks
+
+The create-container hooks MUST be called as part of the [`create`](runtime.md#create) operation after the runtime environment has been created (according to the configuration in config.json) but before the pivot_root or any equivalent operation has been executed.
+The create-container hooks MUST be executed in the [container namespace](glossary.md#container-namespace).
+THe create-container hooks MUST be called after the create-runtime hooks.
+
+For example, on Linux this would happen before the `pivot_root` operation is executed but after the mount namespace was created and setup.
+
+### <a name="configHooksStartContainer" />Start Container Hooks
+
+The start-container hooks MUST be called [before the user-specified process is executed](runtime.md#lifecycle) as part of the [`start`](runtime.md#start) operation.
+This hook can be used to execute some operations in the container, for example running the ldconfig binary on linux before the container process is spawned.
+
+The start-container hooks MUST be executed in the [container namespace](glossary.md#container-namespace).
+
 ### <a name="configHooksPoststart" />Poststart

 The post-start hooks MUST be called [after the user-specified process is executed](runtime.md#lifecycle) but before the [`start`](runtime.md#start) operation returns.
 For example, this hook can notify the user that the container process is spawned.

+Poststart hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
+
 ### <a name="configHooksPoststop" />Poststop

 The post-stop hooks MUST be called [after the container is deleted](runtime.md#lifecycle) but before the [`delete`](runtime.md#delete) operation returns.
 Cleanup or debugging functions are examples of such a hook.

+Poststop hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
+
+### Summary
+
+See the below table for a summary of hooks and when they are called:
+
+|          Name         | Namespace |                                                            When                                                                   |
+| --------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------|
+| prestart (Deprecated) | runtime   | After the start  operation is called but before the user-specified program command is executed                                    |
+| create-runtime        | runtime   | During the create operation, after the runtime environment has been created and before the pivot root or any equivalent operation |
+| create-container      | container | During the create operation after the runtime environment has been created and before the pivot root or any equivalent operation  |

And a missing full stop for the first three hooks.

RenaudWasTaken

comment created time in 6 days
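
The hook-to-namespace mapping proposed in the diff above can be summarised as a small lookup table. This is a sketch in Python based only on the quoted diff text (the table in the hunk is cut off after three rows, so the remaining entries are taken from the prose sections); `HOOKS` and `hooks_in_namespace` are hypothetical names, and no ordering between hooks is implied.

```python
# name -> (namespace the hook runs in, operation it belongs to),
# as described in the quoted runtime-spec diff.
HOOKS = {
    "prestart": ("runtime", "start"),  # deprecated in the diff
    "create-runtime": ("runtime", "create"),
    "create-container": ("container", "create"),
    "start-container": ("container", "start"),
    "poststart": ("runtime", "start"),
    "poststop": ("runtime", "delete"),
}


def hooks_in_namespace(namespace: str) -> list[str]:
    """Return the hook names executed in the given namespace, sorted by name."""
    return sorted(name for name, (ns, _op) in HOOKS.items() if ns == namespace)
```

For instance, `hooks_in_namespace("container")` lists only the two hooks that the diff says run inside the container namespace.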

pull request commentopencontainers/runtime-spec

Add mount and start hooks

@bart0sh The old hooks must be deprecated (that's ignoring the fact that the names weren't very good in the first place -- I'm of half a mind to suggest org.opencontainers.runtime.hook.* as the hook names). Sorry, but that's non-negotiable -- the current hooks are never going to be usable because some project won't correctly update their handling or documentation.

Every other OCI runtime I'm aware of has (incorrectly, though understandably) copied runc's incorrect semantics because the widely-used NVIDIA and CNI hooks wouldn't work as-is with the semantics defined in the current spec. As much as I wish this wasn't the case, people blindly copy what runc does because the runtime-spec doesn't correctly capture the semantics needed to get an OCI container runtime to be fully functional (don't even get me started on --console-socket and how inadequate "terminal": true is).

I do wish the situation was better, and that we could just ignore a runc bug as being "just a bug in one implementation" (I would've loved to release runc 1.0 two years ago) but that simply isn't the state of things.

RenaudWasTaken

comment created time in 8 days

push eventcyphar/dotfiles

Aleksa Sarai

commit sha b78fabf05fcfda224d257726c4bb0b9a626c1d38

mutt: clean up suse setup Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Aleksa Sarai

commit sha 523b0757ec434299fe8b68be98d72f2a125f9211

mutt: sort by send date Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in 8 days

push eventcyphar/linux

Aleksa Sarai

commit sha 33b643b02378ae39c00459cfd6a20580b64328ff

open: introduce openat2(2) syscall /* Background. */ For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). Userspace also has a hard time figuring out whether a particular flag is supported on a particular kernel. While it is now possible with contemporary kernels (thanks to [3]), older kernels will expose unknown flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during openat(2) time matches modern syscall designs and is far more fool-proof. In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. We'd therefore like to add a new flag argument. Adding a new syscall allows us to finally fix the flag-ignoring problem, and we can make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. */ /* * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. */ struct open_how { /* ... */ }; int openat2(int dfd, const char *pathname, struct open_how *how, size_t size); /* Description. 
*/ The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR) will result in -EINVAL. In addition, this field is 64-bits wide to allow for more O_ flags than currently permitted with openat(2). mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. __padding Must be set to all zeroes. resolve Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE is no longer permitted for openat(2). As far as I can tell, this has always been a bug and appears to not be used by userspace (and I've not seen any problems on my machines by disallowing it). If it turns out this breaks something, we can special-case it and only permit it for openat(2) but not openat2(2). /* Testing. */ In a follow-up patch there are over 200 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios. In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). 
During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace). /* Future Work. */ Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking auto-mount during resolution). Furthermore, there are some other proposed changes to the openat(2) interface (the most obvious example is magic-link hardening[5]) which would be a good opportunity to add a way for userspace to restrict how O_PATH file descriptors can be re-opened. [1]: https://lwn.net/Articles/588444/ [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com [3]: commit 629e014bb834 ("fs: completely ignore unknown open flags") [4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 [5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/ Suggested-by: Christian Brauner <christian@brauner.io> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
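
The extensible-struct convention described in the commit message (new fields are appended, and their zero value is a no-op default) can be modelled in a few lines. This is a user-space sketch in Python of the size-negotiation rule, not kernel code; the function name is borrowed from the kernel helper `copy_struct_from_user()` mentioned in the selftests commit, and the byte-string interface is an assumption made for illustration.

```python
import errno


def copy_struct_from_user(ksize: int, user_bytes: bytes) -> bytes:
    """Model of the struct+size negotiation for an extensible struct.

    ksize is the struct size this "kernel" knows about; user_bytes is
    the struct as passed by userspace (its length plays the role of
    the size argument).
    """
    usize = len(user_bytes)
    # Newer userspace, older kernel: fields the kernel does not know
    # about must all be zero, otherwise the request is rejected.
    if usize > ksize and any(user_bytes[ksize:]):
        raise OSError(errno.E2BIG, "unknown trailing fields are non-zero")
    # Older userspace, newer kernel: missing fields are zero-filled,
    # which means "use the no-op default".
    head = user_bytes[:ksize]
    return head + bytes(ksize - len(head))
```

Under this rule an old binary keeps working on a new kernel, and a new binary works on an old kernel as long as it leaves the newer fields zeroed.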

Aleksa Sarai

commit sha 6d2b41d0ca4867e4592fe8b9ce814d9cfaf88c5b

selftests: add openat2(2) selftests Test all of the various openat2(2) flags. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. The main things these self-tests are enforcing are: * The struct+usize ABI for openat2(2) and copy_struct_from_user() to ensure that upgrades will be handled gracefully (in addition, ensuring that misaligned structures are also handled correctly). * The -EINVAL checks for openat2(2) are all correctly handled to avoid userspace passing unknown or conflicting flag sets (most importantly, ensuring that invalid flag combinations are checked). * All of the RESOLVE_* semantics (including errno values) are correctly handled with various combinations of paths and flags. * RESOLVE_IN_ROOT correctly protects against the symlink rename(2) attack that has been responsible for several CVEs (and likely will be responsible for several more). Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
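
The -EINVAL checks that these selftests exercise follow the rules stated in the openat2(2) commit message: unknown flag bits, conflicting combinations such as O_PATH|O_RDWR, and a non-zero mode without O_CREAT or O_TMPFILE are all rejected. A minimal sketch of such strict validation in Python follows; the bit values and the `check_how` helper are illustrative assumptions, not the kernel's definitions.

```python
import errno

# Illustrative bit values only; real O_* values are architecture-specific.
O_RDWR = 0x2
O_CREAT = 0x40
O_PATH = 0x200000
O_TMPFILE = 0x400000
KNOWN_FLAGS = O_RDWR | O_CREAT | O_PATH | O_TMPFILE


def check_how(flags: int, mode: int = 0) -> None:
    """Raise OSError(EINVAL) for input that a strict syscall would reject."""
    if flags & ~KNOWN_FLAGS:
        raise OSError(errno.EINVAL, "unknown flag bits")
    if flags & O_PATH and flags & O_RDWR:
        raise OSError(errno.EINVAL, "O_PATH conflicts with O_RDWR")
    if mode and not flags & (O_CREAT | O_TMPFILE):
        raise OSError(errno.EINVAL, "mode set without O_CREAT or O_TMPFILE")
```

Rejecting unknown bits up front is what lets userspace probe for flag support by simply trying a flag and checking for EINVAL.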

Aleksa Sarai

commit sha 5455c481c176bc0830817387754eb432bb2dfefd

Documentation: path-lookup: mention LOOKUP_MAGICLINK_JUMPED Now that we have a special flag to signify magic-link jumps, mention it within the path-lookup docs. And now that "magic link" is the correct term for nd_jump_link()-style symlinks, clean up references to this type of "symlink". Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in 8 days

push eventcyphar/linux

Aleksa Sarai

commit sha 965b32c9415855fe57a389fc79a851906591117c

selftests: add openat2(2) selftests Test all of the various openat2(2) flags. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. The main things these self-tests are enforcing are: * The struct+usize ABI for openat2(2) and copy_struct_from_user() to ensure that upgrades will be handled gracefully (in addition, ensuring that misaligned structures are also handled correctly). * The -EINVAL checks for openat2(2) are all correctly handled to avoid userspace passing unknown or conflicting flag sets (most importantly, ensuring that invalid flag combinations are checked). * All of the RESOLVE_* semantics (including errno values) are correctly handled with various combinations of paths and flags. * RESOLVE_IN_ROOT correctly protects against the symlink rename(2) attack that has been responsible for several CVEs (and likely will be responsible for several more). Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Aleksa Sarai

commit sha 7bf9c8e6b9e024b6c59443fc13ea0a96e0346294

Documentation: path-lookup: mention LOOKUP_MAGICLINK_JUMPED Now that we have a special flag to signify magic-link jumps, mention it within the path-lookup docs. And now that "magic link" is the correct term for nd_jump_link()-style symlinks, clean up references to this type of "symlink". Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in 8 days

push eventcyphar/man-pages

Michael Kerrisk

commit sha 81c2368f462f25714f4a153abb61a25a142cf701

clone.2: Rename arguments for consistency with clone3() Sometime soon, we'll have to add documentation of clone3() to this page. As a preparatory step, make the names of the clone() arguments the same as the fields in the clone3() 'args' struct: ctid ==> child_tid; ptid ==> parent_tid; newtls ==> tls; child_stack ==> stack. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 5fbce8f22b802ef68b1ffda70f476232ff503e40

clone.2: Add some subsection headings Again, in preparation for adding clone3() documentation. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha b7cf324fd8eb68db3b1893bac5414a5514c8ca81

clone.2: Minor change: move a paragraph from DESCRIPTION to NOTES Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha e2bf12346d08fe7ef01bdcfd5b7054a9fe1a10ab

clone.2: wfix Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha faa0e55ae9e490d71c826546bbdef954a1800969

clone.2: Document clone3() Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha d89d14246a868a74641fa2f6f6f4d8cc2e3ae36d

clone3.2: New link to clone(2) Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha fb1fa92b0ade8521228b7cd674bfaf3e530ad7e9

clone.2: srcfix: update copyright Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 5261b0fe757ee6349f8bf96ac386bc293948ba92

clone.2: Minor improvements following clone3() additions Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 16853a31ee3c4e11753eb2f7ea57a606a1c9c6f3

clone.2: Introduce "flags mask" as a generic term for clone()/clone3() Use "flags mask" as a generic term to refer to the clone() 'flags' argument and the clone3() 'cl_args.flags' field. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 462ce23d491904a0b46252dc97c8cb42391c093e

seccomp.2: Switch to "considerate language" Thanks-to: https://twitter.com/expensivestevie Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 640453bbea08f75c3e682aeb17f9661b2588aab6

cgroups.7: Switch to "considerate language" Thanks-to: https://twitter.com/expensivestevie Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 4e4e9e83b6716e16137a6d2c9625b8d3ab3843bf

pidfd_open.2: tfix Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Torin Carey

commit sha 6a141329250b8a41672b88548e1ffe982051e736

unix.7: tfix Signed-off-by: Torin Carey <torin@tcarey.uk> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Torin Carey

commit sha 897367f900a353d279913c144ca0dc99d9f531c1

unix.7: tfix In the given example, the second recvmsg(2) call should receive four bytes, as the third sendmsg(2) call only sends four. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Jakub Wilk

commit sha 1191b4e7b27f43ba810230eedd7014ced0b03480

netlink.7: tfix Signed-off-by: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Jakub Wilk

commit sha a9e52b437f8d88c9afddc9fe43bc60d667884ef1

clone.2: Include clone3 in NAME section. Signed-off-by: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Yang Xu

commit sha 13a07cc4854796313c06e34a09d2ace5bd6fe45e

copy_file_range.2: tfix Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Yang Xu

commit sha ae848b1d80dec0703b74a76c7983e213c1cd0b3a

quotactl.2: Add some details about Q_QUOTAON For Q_QUOTAON, on old kernels we can use quotacheck -ug to generate quota files. But on current kernels, we can also hide them in system inodes and indicate them by using the "quota" or "project" feature. For user or group quota, we can do as below (e.g. ext4): mkfs.ext4 -F -O quota /dev/sda5 mount /dev/sda5 /mnt quotactl(QCMD(Q_QUOTAON, USRQUOTA), /dev/sda5, QFMT_VFS_V0, NULL); For project quota, we can do as below (e.g. ext4): mkfs.ext4 -F -O quota,project /dev/sda5 mount /dev/sda5 /mnt quotactl(QCMD(Q_QUOTAON, PRJQUOTA), /dev/sda5, QFMT_VFS_V0, NULL); Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha f5fd82cc4ea03dc980e982fb6af514c299825f3f

quotactl.2: Minor tweaks to Yang Xu's patch Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha a5394cba1c77678050fe7e2f6a2c7e29bbf8d9ab

quotactl.2: tfix Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

push time in 8 days

push eventcyphar/linux

Michał Mirosław

commit sha b3a81c777dcb093020680490ab970d85e2f6f04f

HID: fix error message in hid_open_report() On HID report descriptor parsing error the code displays bogus pointer instead of error offset (subtracts start=NULL from end). Make the message more useful by displaying correct error offset and include total buffer size for reference. This was carried over from ancient times - "Fixed" commit just promoted the message from DEBUG to ERROR. Cc: stable@vger.kernel.org Fixes: 8c3d52fc393b ("HID: make parser more verbose about parsing errors by default") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Colin Ian King

commit sha fe2199cfd1516e90e03c033c52c9a28da09d9986

HID: prodikeys: make array keys static const, makes object smaller Don't populate the array keys on the stack but instead make it static const. Makes the object code smaller by 166 bytes. Before: text data bss dec hex filename 18931 5872 480 25283 62c3 drivers/hid/hid-prodikeys.o After: text data bss dec hex filename 18669 5968 480 25117 621d drivers/hid/hid-prodikeys.o (gcc version 9.2.1, amd64) Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Alan Stern

commit sha d9d4b1e46d9543a82c23f6df03f4ad697dab361b

HID: Fix assumption that devices have inputs The syzbot fuzzer found a slab-out-of-bounds write bug in the hid-gaff driver. The problem is caused by the driver's assumption that the device must have an input report. While this will be true for all normal HID input devices, a suitably malicious device can violate the assumption. The same assumption is present in over a dozen other HID drivers. This patch fixes them by checking that the list of hid_inputs for the hid_device is nonempty before allowing it to be used. Reported-and-tested-by: syzbot+403741a091bf41d4ae79@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>

Nicolas Boichat

commit sha 9e4dbc4646a84b2562ea7c64a542740687ff7daf

HID: google: add magnemite/masterball USB ids Add 2 additional hammer-like devices. Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Sven Eckelmann

commit sha a8d23cbbf6c9f515ed678204ad2962be7c336344

batman-adv: Avoid free/alloc race when handling OGM2 buffer A B.A.T.M.A.N. V virtual interface has an OGM2 packet buffer which is initialized using data from the netdevice notifier and other rtnetlink related hooks. It is sent regularly via various slave interfaces of the batadv virtual interface and in this process also modified (realloced) to integrate additional state information via TVLV containers. It must be avoided that the worker item is executed without a common lock with the netdevice notifier/rtnetlink helpers. Otherwise it can either happen that half modified data is sent out or the functions modifying the OGM2 buffer try to access already freed memory regions. Fixes: 0da0035942d4 ("batman-adv: OGMv2 - add basic infrastructure") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>

Sven Eckelmann

commit sha 40e220b4218bb3d278e5e8cc04ccdfd1c7ff8307

batman-adv: Avoid free/alloc race when handling OGM buffer Each slave interface of an B.A.T.M.A.N. IV virtual interface has an OGM packet buffer which is initialized using data from netdevice notifier and other rtnetlink related hooks. It is sent regularly via various slave interfaces of the batadv virtual interface and in this process also modified (realloced) to integrate additional state information via TVLV containers. It must be avoided that the worker item is executed without a common lock with the netdevice notifier/rtnetlink helpers. Otherwise it can either happen that half modified/freed data is sent out or functions modifying the OGM buffer try to access already freed memory regions. Reported-by: syzbot+0cc629f19ccb8534935b@syzkaller.appspotmail.com Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>

Zhenfang Wang

commit sha 8b6bc5fd71e677864d1a3b896b3069a6e0c5e214

dmaengine: sprd: Fix the link-list pointer register configuration issue We will set the link-list pointer register point to next link-list configuration's physical address, which can load DMA configuration from the link-list node automatically. But the link-list node's physical address can be larger than 32bits, and now Spreadtrum DMA driver only supports 32bits physical address, which may cause loading a incorrect DMA configuration when starting the link-list transfer mode. According to the DMA datasheet, we can use SRC_BLK_STEP register (bit28 - bit31) to save the high bits of the link-list node's physical address to fix this issue. Fixes: 4ac695464763 ("dmaengine: sprd: Support DMA link-list mode") Signed-off-by: Zhenfang Wang <zhenfang.wang@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Link: https://lore.kernel.org/r/eadfe9295499efa003e1c344e67e2890f9d1d780.1568267061.git.baolin.wang@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>

Sameer Pujar

commit sha 9ec691f48b5ef741a48af8932ccaec859c67e8f1

dmaengine: tegra210-adma: fix transfer failure From Tegra186 onwards OUTSTANDING_REQUESTS field is added in channel configuration register(bits 7:4) which defines the maximum number of reads from the source and writes to the destination that may be outstanding at any given point of time. This field must be programmed with a value between 1 and 8. A value of 0 will prevent any transfers from happening. Thus added 'has_outstanding_reqs' bool member in chip data structure and is set to false for Tegra210, since the field is not applicable. For Tegra186 it is set to true and channel configuration is updated with maximum outstanding requests. Fixes: 433de642a76c ("dmaengine: tegra210-adma: add support for Tegra186/Tegra194") Cc: stable@vger.kernel.org Signed-off-by: Sameer Pujar <spujar@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/1568626513-16541-1-git-send-email-spujar@nvidia.com Signed-off-by: Vinod Koul <vkoul@kernel.org>

Robin Gong

commit sha bd73dfabdda280fc5f05bdec79b6721b4b2f035f

dmaengine: imx-sdma: fix size check for sdma script_number

Illegal memory will be touch if SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V3 (41) exceed the size of structure sdma_script_start_addrs(40), thus cause memory corrupt such as slob block header so that kernel trap into while() loop forever in slob_free(). Please refer to below code piece in imx-sdma.c:

	for (i = 0; i < sdma->script_number; i++)
		if (addr_arr[i] > 0)
			saddr_arr[i] = addr_arr[i]; /* memory corrupt here */

That issue was brought by commit a572460be9cf ("dmaengine: imx-sdma: Add support for version 3 firmware") because SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V3 (38->41 3 scripts added) not align with script number added in sdma_script_start_addrs(2 scripts).

Fixes: a572460be9cf ("dmaengine: imx-sdma: Add support for version 3 firmware")
Cc: stable@vger.kernel
Link: https://www.spinics.net/lists/arm-kernel/msg754895.html
Signed-off-by: Robin Gong <yibin.gong@nxp.com>
Reported-by: Jurgen Lambrecht <J.Lambrecht@TELEVIC.com>
Link: https://lore.kernel.org/r/1569347584-3478-1-git-send-email-yibin.gong@nxp.com
[vkoul: update the patch title]
Signed-off-by: Vinod Koul <vkoul@kernel.org>

Vivek Goyal

commit sha 112e72373d1f60f1e4558d0a7f0de5da39a1224d

virtio-fs: Change module name to virtiofs.ko

We have been calling it virtio_fs and even file name is virtio_fs.c. Module name is virtio_fs.ko but when registering file system user is supposed to specify filesystem type as "virtiofs".

Masayoshi Mizuma reported that he specified filesytem type as "virtio_fs" and got this warning on console.

  ------------[ cut here ]------------
  request_module fs-virtio_fs succeeded, but still no fs?
  WARNING: CPU: 1 PID: 1234 at fs/filesystems.c:274 get_fs_type+0x12c/0x138
  Modules linked in: ... virtio_fs fuse virtio_net net_failover ...
  CPU: 1 PID: 1234 Comm: mount Not tainted 5.4.0-rc1 #1

So looks like kernel could find the module virtio_fs.ko but could not find filesystem type after that.

It probably is better to rename module name to virtiofs.ko so that above warning goes away in case user ends up specifying wrong fs name.

Reported-by: Masayoshi Mizuma <msys.mizuma@gmail.com>
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

Jiri Benc

commit sha 9e8acd9c44a0dd52b2922eeb82398c04e356c058

bpf: lwtunnel: Fix reroute supplying invalid dst The dst in bpf_input() has lwtstate field set. As it is of the LWTUNNEL_ENCAP_BPF type, lwtstate->data is struct bpf_lwt. When the bpf program returns BPF_LWT_REROUTE, ip_route_input_noref is directly called on this skb. This causes invalid memory access, as ip_route_input_slow calls skb_tunnel_info(skb) that expects the dst->lwstate->data to be struct ip_tunnel_info. This results to struct bpf_lwt being accessed as struct ip_tunnel_info. Drop the dst before calling the IP route input functions (both for IPv4 and IPv6). Reported by KASAN. Fixes: 3bd0b15281af ("bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Peter Oskolkov <posk@google.com> Link: https://lore.kernel.org/bpf/111664d58fe4e9dd9c8014bb3d0b2dab93086a9e.1570609794.git.jbenc@redhat.com

Radhey Shyam Pandey

commit sha 68fe2b520cee829ed518b4b1f64d2a557bcbffe1

dmaengine: xilinx_dma: Fix 64-bit simple AXIDMA transfer In AXI DMA simple mode also pass MSB bits of source and destination address to xilinx_write function. It fixes simple AXI DMA operation mode using 64-bit addressing. Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Link: https://lore.kernel.org/r/1569495060-18117-2-git-send-email-radhey.shyam.pandey@xilinx.com Signed-off-by: Vinod Koul <vkoul@kernel.org>

Radhey Shyam Pandey

commit sha 6c6de1ddb1be3840f2ed5cc9d009a622720940c9

dmaengine: xilinx_dma: Fix control reg update in vdma_channel_set_config In vdma_channel_set_config clear the delay, frame count and master mask before updating their new values. It avoids programming incorrect state when input parameters are different from default. Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Acked-by: Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Link: https://lore.kernel.org/r/1569495060-18117-3-git-send-email-radhey.shyam.pandey@xilinx.com Signed-off-by: Vinod Koul <vkoul@kernel.org>

Baolin Wang

commit sha ec1ac309596a7bdf206743b092748205f6cd5720

dmaengine: sprd: Fix the possible memory leak issue If we terminate the channel to free all descriptors associated with this channel, we will leak the memory of current descriptor if the current descriptor is not completed, since it had been deteled from the desc_issued list and have not been added into the desc_completed list. Thus we should check if current descriptor is completed or not, when freeing the descriptors associated with one channel, if not, we should free it to avoid this issue. Fixes: 9b3b8171f7f4 ("dmaengine: sprd: Add Spreadtrum DMA driver") Reported-by: Zhenfang Wang <zhenfang.wang@unisoc.com> Tested-by: Zhenfang Wang <zhenfang.wang@unisoc.com> Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Link: https://lore.kernel.org/r/170dbbc6d5366b6fa974ce2d366652e23a334251.1570609788.git.baolin.wang@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>

Miklos Szeredi

commit sha 3f22c7467136adfa6d2a7baf7cd5c573f0641bd1

virtio-fs: don't show mount options Virtio-fs does not accept any mount options, so it's confusing and wrong to show any in /proc/mounts. Reported-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

Zhang Lixu

commit sha 16ff7bf6dbcc6f77d2eec1ac9120edf44213c2f1

HID: intel-ish-hid: fix wrong error handling in ishtp_cl_alloc_tx_ring() When allocating tx ring buffers failed, should free tx buffers, not rx buffers. Signed-off-by: Zhang Lixu <lixu.zhang@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Christophe Leroy

commit sha d10f60ae27d26d811e2a1bb39ded47df96d7499f

powerpc/32s: fix allow/prevent_user_access() when crossing segment boundaries. Make sure starting addr is aligned to segment boundary so that when incrementing the segment, the starting address of the new segment is below the end address. Otherwise the last segment might get missed. Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/067a1b09f15f421d40797c2d04c22d4049a1cee8.1571071875.git.christophe.leroy@c-s.fr
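The boundary problem described in this commit message can be illustrated with a small userspace sketch. This is not the kernel code: `segments_visited`, `SEGMENT_SIZE`, and the `align_start` switch are hypothetical names made up for the illustration, and the 256 MiB segment size is an assumption about the 32-bit Book3S layout. The function only counts how many segments a walk over `[addr, end)` touches, with and without aligning the start address first.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed segment size for the illustration: 256 MiB. */
#define SEGMENT_SIZE 0x10000000u

/*
 * Count the segments visited when walking [addr, end) one segment at a time.
 * With align_start == 0 the walk begins at the raw address, so the cursor can
 * step past `end` before it reaches the segment that contains `end - 1`.
 */
static unsigned int
segments_visited(uint32_t addr, uint32_t end, int align_start)
{
	uint64_t cur = addr;	/* 64-bit cursor so the increment cannot wrap */
	unsigned int n = 0;

	if (align_start)
		cur &= ~(uint64_t)(SEGMENT_SIZE - 1);

	while (cur < end) {
		n++;	/* real code would update this segment's state here */
		cur += SEGMENT_SIZE;
	}
	return (n);
}
```

For a range such as `0x1fffffff..0x20000001`, which straddles a segment boundary, the unaligned walk visits one segment and the aligned walk visits two, which is the "last segment might get missed" case the message refers to.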

Rafi Wiener

commit sha c8973df2da677f375f8b12b6eefca2f44c8884d5

RDMA/mlx5: Clear old rate limit when closing QP Before QP is closed it changes to ERROR state, when this happens the QP was left with old rate limit that was already removed from the table. Fixes: 7d29f349a4b9 ("IB/mlx5: Properly adjust rate limit on QP state transitions") Signed-off-by: Rafi Wiener <rafiw@mellanox.com> Signed-off-by: Oleg Kuporosov <olegk@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20191002120243.16971-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>

Kaike Wan

commit sha 9ed5bd7d22241ad232fd3a5be404e83eb6cadc04

IB/hfi1: Avoid excessive retry for TID RDMA READ request

A TID RDMA READ request could be retried under one of the following conditions:
- The RC retry timer expires;
- A later TID RDMA READ RESP packet is received before the next expected one.

For the latter, under normal conditions, the PSN in IB space is used for comparison. More specifically, the IB PSN in the incoming TID RDMA READ RESP packet is compared with the last IB PSN of a given TID RDMA READ request to determine if the request should be retried. This is similar to the retry logic for noraml RDMA READ request.

However, if a TID RDMA READ RESP packet is lost due to congestion, header suppresion will be disabled and each incoming packet will raise an interrupt until the hardware flow is reloaded. Under this condition, each packet KDETH PSN will be checked by software against r_next_psn and a retry will be requested if the packet KDETH PSN is later than r_next_psn. Since each TID RDMA READ segment could have up to 64 packets and each TID RDMA READ request could have many segments, we could make far more retries under such conditions, and thus leading to RETRY_EXC_ERR status.

This patch fixes the issue by removing the retry when the incoming packet KDETH PSN is later than r_next_psn. Instead, it resorts to RC timer and normal IB PSN comparison for any request retry.

Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20191004204035.26542.41684.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>

Mike Marciniszyn

commit sha 22bb13653410424d9fce8d447506a41f8292f22f

IB/hfi1: Use a common pad buffer for 9B and 16B packets There is no reason for a different pad buffer for the two packet types. Expand the current buffer allocation to allow for both packet types. Fixes: f8195f3b14a0 ("IB/hfi1: Eliminate allocation while atomic") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Kaike Wan <kaike.wan@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Link: https://lore.kernel.org/r/20191004204934.26838.13099.stgit@awfm-01.aw.intel.com Signed-off-by: Doug Ledford <dledford@redhat.com>

push time in 8 days

push eventcyphar/cyphar.com

Aleksa Sarai

commit sha 5ed5b38cc0efdf62dac14d375443784b12ed6d5d

srv: tor-hs: fix problems with /var/run in container Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in 9 days

push eventcyphar/cyphar.com

Aleksa Sarai

commit sha debbb46dff7e8a864c77a4201cc7c087f4af566e

srv: nextcloud: only permit 10.0.0.0/8 clients Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in 9 days

startedgorhill/uMatrix

started time in 10 days

pull request commentopencontainers/runtime-spec

Add mount and start hooks

@amshinde I have no problem with adding *-guest-* hooks as well, though I think that:

  • We should put that in a separate PR to not further block this one.
  • We will need to figure out what the semantics for non-VM runtimes should be. We could make create-guest-* == create-* in that case, but I'm not sure how much I like that. I'd also want to put the word "vm" in the hook name to clarify that those hooks are only useful for VM-based runtimes (create-vmguest-* or something).
RenaudWasTaken

comment created time in 13 days

delete branch cyphar/umoci

delete branch : casext-more-hooks

delete time in 14 days

push eventopenSUSE/umoci

Aleksa Sarai

commit sha bd8c6d61c383374955a584d96680921a328c95a7

oci: casext: further hookification of blobs This is necessary in order for references to non-stock OCI blobs to work properly (in order for OCIv2 to work properly). Hopefully this is also extensible enough that anyone who has a custom blob format can make use of it. Signed-off-by: Aleksa Sarai <asarai@suse.de>

Aleksa Sarai

commit sha c0dd46ae078f2b17f3ef9630c53c0614006a2ea2

Merge branch 'pr-307'

Aleksa Sarai (1):
  oci: casext: further hookification of blobs

LGTMs: @cyphar
Closes #307

push time in 14 days

PR merged openSUSE/umoci

oci: casext: further hookification of blobs lgtm/need 1

This is necessary in order for references to non-stock OCI blobs to work properly (in order for OCIv2 to work properly). Hopefully this is also extensible enough that anyone who has a custom blob format can make use of it.

Signed-off-by: Aleksa Sarai <asarai@suse.de>

+293 -91

1 comment

5 changed files

cyphar

pr closed time in 14 days

PR closed openSUSE/umoci

oci: casext: further hookification of blobs lgtm/need 1

This is necessary in order for references to non-stock OCI blobs to work properly (in order for OCIv2 to work properly). Hopefully this is also extensible enough that anyone who has a custom blob format can make use of it.

Signed-off-by: Aleksa Sarai <asarai@suse.de>

+293 -91

1 comment

5 changed files

cyphar

pr closed time in 14 days

pull request commentopenSUSE/umoci

oci: casext: further hookification of blobs

LGTM.

cyphar

comment created time in 14 days

PR opened openSUSE/umoci

oci: casext: further hookification of blobs

This is necessary in order for references to non-stock OCI blobs to work properly (in order for OCIv2 to work properly). Hopefully this is also extensible enough that anyone who has a custom blob format can make use of it.

Signed-off-by: Aleksa Sarai <asarai@suse.de>

+293 -91

0 comment

5 changed files

pr created time in 14 days

create branchcyphar/umoci

branch : casext-more-hooks

created branch time in 14 days

Pull request review commentzfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

 zfs_log_rename(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
 	zil_itx_assign(zilog, itx, tx);
 }
 
+/*
+ * At the moment, only Linux supports renameat2 variant of renameat, which
+ * adds three new flags of interest for us:
+ *     RENAME_NOREPLACE: if the target name at the moment of the call exists,
+ *                       don't rewrite it and return error
+ *     RENAME_EXCHANGE: atomically swap the two names on the filesystem
+ *     RENAME_WHITEOUT: creates a whiteout inode in place of renamed file as
+ *                      an atomic operation
+ *
+ * Ideally, these operations should be represented as new ZFS Intent Log record
+ * types, which should mandate a new ZFS feature flag due to the on-disk format
+ * change. One would use spa_feature_incr/decr functions to indicate that we're
+ * actually actively using the new on-disk txtypes - but these functions are
+ * only supposed to be called from the txg syncing context.
+ * This means that would need to force out in-progress txg to disk and start
+ * a new one before writing any ZIL records, just so we can be sure that ZIL
+ * replaying ZFS gets told it should expect potentially incompatible ZIL
+ * txtypes. Doing this would hurt performance.
+ * Alternatively, we could just activate the feature on a pool when these
+ * renameat2 flags get first used and leave it at that - which would render the
+ * pool importable read-only on implementations without the new feature flag,
+ * even when no new txtypes IL records would be present on-disk - which on most
+ * setups could be 'almost' all the time, so it'd be a shame to have them all
+ * read-only on non-Linux platforms.
+ * As a third option, at least until more platforms implement renameat2, we
+ * choose to rely on the fact that the ZIL is replayed in single-threaded mode
+ * before the dataset is mounted. This way, we can represent the otherwise
+ * atomic operations as a series of plain good old txtypes known to all current
+ * OpenZFS implementations. To do that, we use these hacky functions:
+ *
+ * zfs_log_rename_exchange
+ * zfs_log_rename_whiteout
+ *
+ *     To represent atomic rename with old non-atomic operations, we need
+ *     a temporary new name; so we try picking a name until we succeed, then
+ *     we get a dirent lock for that temp name until the final itx gets queued
+ */
+
+void
+zfs_log_rename_exchange(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
+    znode_t *sdzp, char *sname, znode_t *tdzp, char *dname, znode_t *szp)
+{
+	zfs_dirlock_t *tmpdl;
+	znode_t *tmpzp = NULL;
+	char rndname[16];
+	char *tmpname;
+	int retries = 0;
+	int error;
+
+	tmpname = kmem_alloc(MAXPATHLEN, KM_SLEEP);
+	ASSERT3P(tmpname, !=, NULL);
+
+	for (int i = 0; i < 16; i++) {
+		int r = 0xFF;
+		while (r > 127 || r == 0 || r == '/')
+			random_get_pseudo_bytes((void *)&r, 1);
+		rndname[i] = (char)r;
+	}
+
+	do {
+		retries++;
+		(void) snprintf(tmpname, MAXPATHLEN,
+		    "%s.zfs_renameat2_emul_%s%4d",
+		    dname, rndname, retries);
+		error = zfs_dirent_lock(&tmpdl, tdzp, tmpname,
+		    &tmpzp, ZNEW, NULL, NULL);
+	} while (error != 0 && retries != INT_MAX);
+	ASSERT3U(retries, !=, INT_MAX);
+
+	zfs_log_rename(zilog, tx, txtype, sdzp, sname, tdzp, tmpname, szp);
+	zfs_log_rename(zilog, tx, txtype, tdzp, dname, sdzp, sname, szp);
+	zfs_log_rename(zilog, tx, txtype, tdzp, tmpname, tdzp, dname, szp);

Nit: I think the ordering should be the more traditional:

	/* dst -> tmp */
	zfs_log_rename(zilog, tx, txtype, tdzp, dname, tdzp, tmpname, szp);
	/* src -> dst */
	zfs_log_rename(zilog, tx, txtype, sdzp, sname, tdzp, dname, szp);
	/* tmp -> src */
	zfs_log_rename(zilog, tx, txtype, tdzp, tmpname, sdzp, sname, szp);
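To make the suggested ordering concrete, here is a toy userspace model of the three-step exchange. It is only a sketch under stated assumptions: `struct entry`, `toy_rename`, `toy_lookup`, and `toy_exchange` are hypothetical helpers, not ZFS functions, and `toy_rename` deliberately refuses to replace an existing name so that each step can be checked on its own.

```c
#include <assert.h>
#include <string.h>

/* Toy directory entry: a name bound to an inode number. */
struct entry {
	const char *name;
	int ino;
};

/* Rename `from` to `to`; returns -1 if `from` is missing or `to` exists. */
static int
toy_rename(struct entry *dir, int n, const char *from, const char *to)
{
	int src = -1;

	for (int i = 0; i < n; i++) {
		if (strcmp(dir[i].name, to) == 0)
			return (-1);
		if (strcmp(dir[i].name, from) == 0)
			src = i;
	}
	if (src < 0)
		return (-1);
	dir[src].name = to;
	return (0);
}

/* Return the inode number bound to `name`, or -1 if there is no such entry. */
static int
toy_lookup(const struct entry *dir, int n, const char *name)
{
	for (int i = 0; i < n; i++) {
		if (strcmp(dir[i].name, name) == 0)
			return (dir[i].ino);
	}
	return (-1);
}

/* Exchange `src` and `dst` using `tmp` as a scratch name. */
static int
toy_exchange(struct entry *dir, int n, const char *src, const char *dst,
    const char *tmp)
{
	if (toy_rename(dir, n, dst, tmp) != 0)	/* dst -> tmp */
		return (-1);
	if (toy_rename(dir, n, src, dst) != 0)	/* src -> dst */
		return (-1);
	return (toy_rename(dir, n, tmp, src));	/* tmp -> src */
}
```

Starting from `a` bound to inode 1 and `b` bound to inode 2, `toy_exchange(dir, 2, "a", "b", ".tmp")` leaves `a` bound to inode 2 and `b` bound to inode 1, with no `.tmp` entry left behind.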
snajpa

comment created time in 14 days

Pull request review commentzfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

 zfs_log_rename(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
 	zil_itx_assign(zilog, itx, tx);
 }
 
+/*
+ * At the moment, only Linux supports renameat2 variant of renameat, which
+ * adds three new flags of interest for us:
+ *     RENAME_NOREPLACE: if the target name at the moment of the call exists,
+ *                       don't rewrite it and return error
+ *     RENAME_EXCHANGE: atomically swap the two names on the filesystem
+ *     RENAME_WHITEOUT: creates a whiteout inode in place of renamed file as
+ *                      an atomic operation
+ *
+ * Ideally, these operations should be represented as new ZFS Intent Log record
+ * types, which should mandate a new ZFS feature flag due to the on-disk format
+ * change. One would use spa_feature_incr/decr functions to indicate that we're
+ * actually actively using the new on-disk txtypes - but these functions are
+ * only supposed to be called from the txg syncing context.
+ * This means that would need to force out in-progress txg to disk and start
+ * a new one before writing any ZIL records, just so we can be sure that ZIL
+ * replaying ZFS gets told it should expect potentially incompatible ZIL
+ * txtypes. Doing this would hurt performance.
+ * Alternatively, we could just activate the feature on a pool when these
+ * renameat2 flags get first used and leave it at that - which would render the
+ * pool importable read-only on implementations without the new feature flag,
+ * even when no new txtypes IL records would be present on-disk - which on most
+ * setups could be 'almost' all the time, so it'd be a shame to have them all
+ * read-only on non-Linux platforms.
+ * As a third option, at least until more platforms implement renameat2, we
+ * choose to rely on the fact that the ZIL is replayed in single-threaded mode
+ * before the dataset is mounted. This way, we can represent the otherwise
+ * atomic operations as a series of plain good old txtypes known to all current
+ * OpenZFS implementations. To do that, we use these hacky functions:
+ *
+ * zfs_log_rename_exchange
+ * zfs_log_rename_whiteout
+ *
+ *     To represent atomic rename with old non-atomic operations, we need
+ *     a temporary new name; so we try picking a name until we succeed, then
+ *     we get a dirent lock for that temp name until the final itx gets queued
+ */
+
+void
+zfs_log_rename_exchange(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
+    znode_t *sdzp, char *sname, znode_t *tdzp, char *dname, znode_t *szp)
+{
+	zfs_dirlock_t *tmpdl;
+	znode_t *tmpzp = NULL;
+	char rndname[16];
+	char *tmpname;
+	int retries = 0;
+	int error;
+
+	tmpname = kmem_alloc(MAXPATHLEN, KM_SLEEP);
+	ASSERT3P(tmpname, !=, NULL);
+
+	for (int i = 0; i < 16; i++) {
+		int r = 0xFF;
+		while (r > 127 || r == 0 || r == '/')
+			random_get_pseudo_bytes((void *)&r, 1);
+		rndname[i] = (char)r;
+	}
+
+	do {
+		retries++;
+		(void) snprintf(tmpname, MAXPATHLEN,
+		    "%s.zfs_renameat2_emul_%s%4d",

I'd argue this should be a hidden file.

snajpa

comment created time in 14 days

Pull request review commentzfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

 zfs_log_rename(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
 	zil_itx_assign(zilog, itx, tx);
 }
 
+/*
+ * At the moment, only Linux supports renameat2 variant of renameat, which
+ * adds three new flags of interest for us:
+ *     RENAME_NOREPLACE: if the target name at the moment of the call exists,
+ *                       don't rewrite it and return error
+ *     RENAME_EXCHANGE: atomically swap the two names on the filesystem
+ *     RENAME_WHITEOUT: creates a whiteout inode in place of renamed file as
+ *                      an atomic operation
+ *
+ * Ideally, these operations should be represented as new ZFS Intent Log record
+ * types, which should mandate a new ZFS feature flag due to the on-disk format
+ * change. One would use spa_feature_incr/decr functions to indicate that we're
+ * actually actively using the new on-disk txtypes - but these functions are
+ * only supposed to be called from the txg syncing context.
+ * This means that would need to force out in-progress txg to disk and start
+ * a new one before writing any ZIL records, just so we can be sure that ZIL
+ * replaying ZFS gets told it should expect potentially incompatible ZIL
+ * txtypes. Doing this would hurt performance.
+ * Alternatively, we could just activate the feature on a pool when these
+ * renameat2 flags get first used and leave it at that - which would render the
+ * pool importable read-only on implementations without the new feature flag,
+ * even when no new txtypes IL records would be present on-disk - which on most
+ * setups could be 'almost' all the time, so it'd be a shame to have them all
+ * read-only on non-Linux platforms.
+ * As a third option, at least until more platforms implement renameat2, we
+ * choose to rely on the fact that the ZIL is replayed in single-threaded mode
+ * before the dataset is mounted. This way, we can represent the otherwise
+ * atomic operations as a series of plain good old txtypes known to all current
+ * OpenZFS implementations. To do that, we use these hacky functions:
+ *
+ * zfs_log_rename_exchange
+ * zfs_log_rename_whiteout
+ *
+ *     To represent atomic rename with old non-atomic operations, we need
+ *     a temporary new name; so we try picking a name until we succeed, then
+ *     we get a dirent lock for that temp name until the final itx gets queued
+ */
+
+void
+zfs_log_rename_exchange(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
+    znode_t *sdzp, char *sname, znode_t *tdzp, char *dname, znode_t *szp)
+{
+	zfs_dirlock_t *tmpdl;
+	znode_t *tmpzp = NULL;
+	char rndname[16];
+	char *tmpname;
+	int retries = 0;
+	int error;
+
+	tmpname = kmem_alloc(MAXPATHLEN, KM_SLEEP);
+	ASSERT3P(tmpname, !=, NULL);
+
+	for (int i = 0; i < 16; i++) {
+		int r = 0xFF;
+		while (r > 127 || r == 0 || r == '/')

I really don't like it much either. My suggestion would be to just take some random bytes and hash them -- or represent the set of random bytes in hex (whichever is simpler to implement).

I also think that the retry logic (having an incrementing counter) in addition to this random-name logic is unnecessary -- just generate a new candidate random value each time. It's very unlikely that you would hit this case at all (let alone enough times to make random_get_pseudo_bytes become a significant performance problem).
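The hex option could look roughly like the sketch below. It is plain userspace C rather than kernel code, and `hex_encode`, `make_tmpname`, and the `.<base>.tmp-<hex>` name layout are hypothetical choices for illustration only; in the PR the random bytes would come from random_get_pseudo_bytes, here the caller simply passes them in. Hex output contains only `[0-9a-f]`, so there is no need to filter out NUL, `/`, or non-ASCII bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hex-encode `len` bytes into `out`, which must hold 2 * len + 1 chars. */
static void
hex_encode(const unsigned char *in, size_t len, char *out)
{
	static const char digits[] = "0123456789abcdef";

	for (size_t i = 0; i < len; i++) {
		out[2 * i] = digits[in[i] >> 4];
		out[2 * i + 1] = digits[in[i] & 0x0f];
	}
	out[2 * len] = '\0';
}

/*
 * Build ".<base>.tmp-<hex>" in `buf` from at most 8 caller-supplied random
 * bytes. Returns 0 on success, -1 if the input is too long or the name would
 * be truncated.
 */
static int
make_tmpname(char *buf, size_t buflen, const char *base,
    const unsigned char *rnd, size_t rndlen)
{
	char hex[2 * 8 + 1];
	int n;

	if (rndlen > 8)
		return (-1);
	hex_encode(rnd, rndlen, hex);
	n = snprintf(buf, buflen, ".%s.tmp-%s", base, hex);
	if (n < 0 || (size_t)n >= buflen)
		return (-1);
	return (0);
}
```

On a name collision the caller would draw fresh random bytes and call `make_tmpname` again, rather than appending a retry counter.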

snajpa

comment created time in 14 days

Pull request review commentzfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

 zfs_rename(struct inode *sdip, char *snm, struct inode *tdip, char *tnm,
 
 	zfs_inode_update(szp);
 	iput(ZTOI(szp));
+	if (wzp) {
+		zfs_inode_update(wzp);
+		iput(ZTOI(wzp));
+	}
 	if (tzp) {
 		zfs_inode_update(tzp);
 		iput(ZTOI(tzp));
 	}
 
+	if (zl != NULL)
+		zfs_rename_unlock(&zl);
+
+	zfs_dirent_unlock(sdl);
+	zfs_dirent_unlock(tdl);
+
 	if (zfsvfs->z_os->os_sync == ZFS_SYNC_ALWAYS)
 		zil_commit(zilog, 0);
 
 	ZFS_EXIT(zfsvfs);
 	return (error);
+
+	/*
+	 * Clean-up path for broken link state.
+	 *
+	 * At this point we are in a (very) bad state, so we need to do our
+	 * best to correct the state. In particular, all of the nlinks are
+	 * wrong because we were destroying and creating links with ZRENAMING.
+	 *
+	 * In some form, all of thee operations have to resolve the state:
+	 *
+	 *  * link_destroy() *must* succeed. Fortunately, this is very likely
+	 *    since we only just created it.
+	 *
+	 *  * link_create()s are allowed to fail (though they shouldn't because
+	 *    we only just unlinked them and are putting the entries back
+	 *    during clean-up). But if they fail, we can just forcefully drop
+	 *    the nlink value to (at the very least) avoid broken nlink values
+	 *    -- though in the case of non-empty directories we will have to
+	 *    panic (otherwise we'd have a leaked directory with a broken ..).
+	 */
+commit_unlink_td_szp:
+	VERIFY3U(zfs_link_destroy(tdl, szp, tx, ZRENAMING, NULL), ==, 0);
+commit_link_tzp:
+	if (tzp) {
+		if (zfs_link_create(tdl, tzp, tx, ZRENAMING))
+			VERIFY3U(zfs_drop_nlink(tzp, tx, NULL), ==, 0);
+	}
+commit_link_szp:
+	if (zfs_link_create(sdl, szp, tx, ZRENAMING))
+		VERIFY3U(zfs_drop_nlink(szp, tx, NULL), ==, 0);
+	goto commit;

I think in the original PR there was an agreement that this handling wasn't necessary anymore (and neither was zfs_drop_nlink) now that we take a dirent lock on the source to make sure the allocations won't fail. Or did the dirent locking change never land in this PR?

snajpa

comment created time in 14 days

Pull request review commentzfsonlinux/zfs

OverlayFS support (d_revalidate out and support renameat2 flags)

 zfs_rename(struct inode *sdip, char *snm, struct inode *tdip, char *tnm,
 		return (error);
 	}
 
-	if (tzp)	/* Attempt to remove the existing target */
-		error = zfs_link_destroy(tdl, tzp, tx, zflg, NULL);
+	/*
+	 * Unlink the source.
+	 */
+	szp->z_pflags |= ZFS_AV_MODIFIED;
+	if (tdzp->z_pflags & ZFS_PROJINHERIT)
+		szp->z_pflags |= ZFS_PROJINHERIT;
+
+	error = sa_update(szp->z_sa_hdl, SA_ZPL_FLAGS(zfsvfs),
+	    (void *)&szp->z_pflags, sizeof (uint64_t), tx);
+	ASSERT0(error);
 
-	if (error == 0) {
-		error = zfs_link_create(tdl, szp, tx, ZRENAMING);
-		if (error == 0) {
-			szp->z_pflags |= ZFS_AV_MODIFIED;
-			if (tdzp->z_pflags & ZFS_PROJINHERIT)
-				szp->z_pflags |= ZFS_PROJINHERIT;
+	error = zfs_link_destroy(sdl, szp, tx, ZRENAMING, NULL);
+	if (error)
+		goto commit;
+
+	/*
+	 * Unlink the target.
+	 */
+	if (tzp) {
+		int tzflg = zflg;
+
+		if (flags & RENAME_EXCHANGE) {
+			/* This inode will be re-linked soon. */
+			tzflg |= ZRENAMING;
+
+			tzp->z_pflags |= ZFS_AV_MODIFIED;
+			if (sdzp->z_pflags & ZFS_PROJINHERIT)
+				tzp->z_pflags |= ZFS_PROJINHERIT;
 
-			error = sa_update(szp->z_sa_hdl, SA_ZPL_FLAGS(zfsvfs),
-			    (void *)&szp->z_pflags, sizeof (uint64_t), tx);
+			error = sa_update(tzp->z_sa_hdl, SA_ZPL_FLAGS(zfsvfs),
+			    (void *)&tzp->z_pflags, sizeof (uint64_t), tx);
 			ASSERT0(error);
+		}
+		error = zfs_link_destroy(tdl, tzp, tx, tzflg, NULL);
+		if (error)
+			goto commit_link_szp;
+	}
 
-			error = zfs_link_destroy(sdl, szp, tx, ZRENAMING, NULL);
-			if (error == 0) {
-				zfs_log_rename(zilog, tx, TX_RENAME |
-				    (flags & FIGNORECASE ? TX_CI : 0), sdzp,
-				    sdl->dl_name, tdzp, tdl->dl_name, szp);
-			} else {
-				/*
-				 * At this point, we have successfully created
-				 * the target name, but have failed to remove
-				 * the source name.  Since the create was done
-				 * with the ZRENAMING flag, there are
-				 * complications; for one, the link count is
-				 * wrong.  The easiest way to deal with this
-				 * is to remove the newly created target, and
-				 * return the original error.  This must
-				 * succeed; fortunately, it is very unlikely to
-				 * fail, since we just created it.
-				 */
-				VERIFY3U(zfs_link_destroy(tdl, szp, tx,
-				    ZRENAMING, NULL), ==, 0);
-			}
-		} else {
-			/*
-			 * If we had removed the existing target, subsequent
-			 * call to zfs_link_create() to add back the same entry
-			 * but, the new dnode (szp) should not fail.
-			 */
-			ASSERT(tzp == NULL);
+	/*
+	 * Create the new target links:
+	 *   * We always link the target.
+	 *   * RENAME_WHITEOUT: Create a whiteout inode in-place of the source.
+	 *   * RENAME_EXCHANGE: Link the old target to the source.
+	 */
+	error = zfs_link_create(tdl, szp, tx, ZRENAMING);
+	if (error) {
+		/*
+		 * If we have removed the existing target, a subsequent call to
+		 * zfs_link_create() to add back the same entry, but with a new
+		 * dnode (szp), should not fail.
+		 */
+		ASSERT3P(tzp, ==, NULL);
+		goto commit_link_tzp;
+	}
+
+	if (flags & RENAME_EXCHANGE) {
+		error = zfs_link_create(sdl, tzp, tx, ZRENAMING);
+		/*
+		 * The same argument as zfs_link_create() failing for
+		 * szp applies here, since the source directory must
+		 * have had an entry we are replacing.
+		 */
+		ASSERT3U(error, ==, 0);
+		if (error)
+			goto commit_unlink_td_szp;
+	} else if (flags & RENAME_WHITEOUT) {
+		zfs_mknode(sdzp, &wo_vap, tx, cr, 0, &wzp, &acl_ids);
+		error = zfs_link_create(sdl, wzp, tx, ZNEW);
+		if (error) {
+			zfs_znode_delete(wzp, tx);
+			remove_inode_hash(ZTOI(wzp));
+			goto commit_unlink_td_szp;
 		}
+		/* No need to zfs_log_create_txtype here. */
 	}
 
+	if (fuid_dirtied)
+		zfs_fuid_sync(zfsvfs, tx);
+
+	if (flags & RENAME_EXCHANGE) {
+		zfs_log_rename_exchange(zilog, tx,
+		    (flags & FIGNORECASE ? TX_CI : 0), sdzp,
+		    sdl->dl_name, tdzp, tdl->dl_name, szp);
+	} else if (flags & RENAME_WHITEOUT) {
+		vsecattr_t vsecp;
+
+		vsecp.vsa_mask |= VSA_ACE_ALLTYPES;
+		error = zfs_getacl(szp, &vsecp, B_TRUE, cr);

This error isn't handled anywhere -- there should probably at least be an ASSERT3U(error, ==, 0) if not a goto commit_unlink_td_szp.

snajpa

comment created time in 14 days
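
The review point about the unhandled `zfs_getacl()` result can be sketched in isolation. This is a standalone illustration, not the PR's code: `fake_getacl`, `struct rename_trace` and `rename_whiteout_tail` are hypothetical stand-ins, and only the label name `commit_unlink_td_szp` is taken from the diff. The idea is simply that a failed lookup should jump to the unwind label instead of falling through to the logging step.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for zfs_getacl(): returns 0 or a positive errno. */
static int fake_getacl(int should_fail)
{
	return should_fail ? EIO : 0;
}

/* Records which steps ran so the unwind path can be observed. */
struct rename_trace {
	int logged;	/* the rename was written to the intent log */
	int unwound;	/* the newly created target link was removed */
};

/* Tail of a rename that must not log anything if the ACL lookup fails. */
static int rename_whiteout_tail(int getacl_fails, struct rename_trace *tr)
{
	int error;

	assert(tr != NULL);

	error = fake_getacl(getacl_fails);
	if (error)
		goto commit_unlink_td_szp;	/* do not fall through to logging */

	tr->logged = 1;
	return (0);

commit_unlink_td_szp:
	tr->unwound = 1;
	return (error);
}
```

Whether the real code should assert or unwind here depends on whether `zfs_getacl()` can actually fail at that point, which the comment leaves open.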

issue comment opencontainers/runtime-spec

[RFC] allow to skip `setgroups(2)`

(Also I would seriously suggest that this is functionality that should be exposed through a runtime-specific annotation and not a first-class field in config.json -- the runtime-spec already has lots of really odd features we probably shouldn't have added, and this one just rubs me the wrong way.)

giuseppe

comment created time in 15 days

issue comment opencontainers/runtime-spec

[RFC] allow to skip `setgroups(2)`

If we do add an option, it needs to have a really scary name (disableSetgroupSecurity or something). Not dropping supplementary groups weakens the userns security boundary, and really is something that very few people should actually want to do (not least because touching unmapped files will confuse all sorts of programs).

In my view, the best solution to the problem of such volumes is to do exactly what LXD does -- "punch out" the GID that the storage volume is owned by (by adding a single 1:1 mapping for that GID). The most ideal solution would be the next-gen "shiftfs" work that was discussed recently, but obviously we'll have to wait for that to actually land.

giuseppe

comment created time in 15 days
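
To make the "punch out" idea concrete, here is a rough sketch of what such a `gid_map` could look like. It assumes the usual three-column `/proc/<pid>/gid_map` format (inside ID, outside ID, length); the function name `format_punched_gid_map` and the example ranges are hypothetical, and this is not taken from LXD's implementation. Writing a multi-line map also needs the appropriate privileges (for example via `newgidmap`), which the sketch does not cover.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a gid_map that maps container GIDs [0, count) onto host GIDs
 * starting at host_base, except for `punched`, which is mapped 1:1 onto
 * the same host GID. Returns 0 on success, -1 on bad input or if buf is
 * too small.
 */
static int format_punched_gid_map(char *buf, size_t len,
    unsigned int host_base, unsigned int count, unsigned int punched)
{
	char below[64] = "", above[64] = "";
	int n;

	if (count == 0 || punched >= count)
		return -1;
	/* The punched host GID must not already be part of the shifted range. */
	if (punched >= host_base && punched - host_base < count)
		return -1;

	if (punched > 0)
		snprintf(below, sizeof(below), "0 %u %u\n", host_base, punched);
	if (punched + 1 < count)
		snprintf(above, sizeof(above), "%u %u %u\n", punched + 1,
		    host_base + punched + 1, count - punched - 1);

	n = snprintf(buf, len, "%s%u %u 1\n%s", below, punched, punched, above);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a shifted range of 65536 IDs at host base 100000 and a storage volume owned by GID 1000, this yields three ranges: everything below 1000, the single 1:1 entry, and everything above it.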

push event cyphar/linux

Marco Felsch

commit sha 131cb1210d4b58acb0695707dad2eb90dcb50a2a

regulator: of: fix suspend-min/max-voltage parsing Currently the regulator-suspend-min/max-microvolt must be within the root regulator node but the dt-bindings specifies it as subnode properties for the regulator-state-[mem/disk/standby] node. The only DT using this bindings currently is the at91-sama5d2_xplained.dts and this DT uses it correctly. I don't know if it isn't tested but it can't work without this fix. Fixes: f7efad10b5c4 ("regulator: add PM suspend and resume hooks") Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Link: https://lore.kernel.org/r/20190917154021.14693-3-m.felsch@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>


Marco Felsch

commit sha f8970d341eec73c976a3462b9ecdb02b60b84dd6

regulator: core: make regulator_register() EPROBE_DEFER aware Sometimes it can happen that the regulator_of_get_init_data() can't retrieve the config due to a not probed device the regulator depends on. Fix that by checking the return value of of_parse_cb() and return EPROBE_DEFER in such cases. Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Link: https://lore.kernel.org/r/20190917154021.14693-4-m.felsch@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>


Sylwester Nawrocki

commit sha fb629fa2587d0c150792d87e3053664bfc8dc78c

ASoC: samsung: arndale: Add missing OF node dereferencing Ensure there is no OF node references kept when the driver is removed/unbound. Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20190920130218.32690-3-s.nawrocki@samsung.com Signed-off-by: Mark Brown <broonie@kernel.org>


Sylwester Nawrocki

commit sha ca2347190adb5e4eece73a2b16e96e651c46246b

ASoC: wm8994: Do not register inapplicable controls for WM1811 In case of WM1811 device there are currently being registered controls referring to registers not existing on that device. It has been noticed when getting values of "AIF1ADC2 Volume", "AIF1DAC2 Volume" controls was failing during ALSA state restoring at boot time: "amixer: Mixer hw:0 load error: Device or resource busy" Reading some registers through I2C was failing with EBUSY error and indeed these registers were not available according to the datasheet. To fix this controls not available on WM1811 are moved to a separate array and registered only for WM8994 and WM8958. There are some further differences between WM8994 and WM1811, e.g. registers 603h, 604h, 605h, which are not covered in this patch. Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Link: https://lore.kernel.org/r/20190920130218.32690-2-s.nawrocki@samsung.com Signed-off-by: Mark Brown <broonie@kernel.org>


Dan Carpenter

commit sha 901e822b2e365dac4727e0ddffb444a2554b0a89

ASoC: soc-component: fix a couple missing error assignments There were a couple places where the return value wasn't assigned so the error handling wouldn't trigger. Fixes: 5c0769af4caf ("ASoC: soc-dai: add snd_soc_dai_bespoke_trigger()") Fixes: 95aef3553384 ("ASoC: soc-dai: add snd_soc_dai_trigger()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20190923142257.GB31251@mwanda Signed-off-by: Mark Brown <broonie@kernel.org>
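
The class of bug this commit describes is easy to reproduce in miniature. The following is a generic sketch, not the ASoC code: `trigger`, `start_buggy` and `start_fixed` are hypothetical names. In the buggy variant the callee's return value is never stored, so the error check that follows can never fire.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical callee that reports failure with a negative errno. */
static int trigger(int fail)
{
	return fail ? -EINVAL : 0;
}

/* Buggy: the call's result is dropped, so `ret` is always 0 here. */
static int start_buggy(int fail)
{
	int ret = 0;

	trigger(fail);		/* return value not assigned */
	if (ret < 0)
		return ret;	/* unreachable in practice */
	return 0;
}

/* Fixed: assign the result so the error path is actually taken. */
static int start_fixed(int fail)
{
	int ret;

	ret = trigger(fail);
	if (ret < 0)
		return ret;
	return 0;
}
```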


Axel Lin

commit sha 1d6db22ff7d67a17c571543c69c63b1d261249b0

regulator: fixed: Prevent NULL pointer dereference when !CONFIG_OF Use of_device_get_match_data which has NULL test for match before dereference match->data. Add NULL test for drvtype so it still works for fixed_voltage_ops when !CONFIG_OF. Signed-off-by: Axel Lin <axel.lin@ingics.com> Reviewed-by: Philippe Schenker <philippe.schenker@toradex.com> Link: https://lore.kernel.org/r/20190922022928.28355-1-axel.lin@ingics.com Signed-off-by: Mark Brown <broonie@kernel.org>


Philippe Schenker

commit sha 58283636a5a03e64ad5d03bd282e3b66dcfa2c49

dt-bindings: fixed-regulator: fix compatible enum Remove 'const:' in the compatible enum. This was breaking make dt_binding_check since it has more than one compatible string. Fixes: 9c86d003d620 ("dt-bindings: regulator: add regulator-fixed-clock binding") Signed-off-by: Philippe Schenker <philippe.schenker@toradex.com> Link: https://lore.kernel.org/r/20190923081840.23391-1-philippe.schenker@toradex.com Signed-off-by: Mark Brown <broonie@kernel.org>


Marco Felsch

commit sha a72865f057820ea9f57597915da4b651d65eb92f

regulator: da9062: fix suspend_enable/disable preparation Currently the suspend reg_field maps to the pmic voltage selection bits and is used during suspend_enable/disable() and during get_mode(). This seems to be wrong for both use cases. Use case one (suspend_enable/disable): Those callbacks are used to mark a regulator device as enabled/disabled during suspend. Marking the regulator enabled during suspend is done by the LDOx_CONF/BUCKx_CONF bit within the LDOx_CONT/BUCKx_CONT registers. Setting this bit tells the DA9062 PMIC state machine to keep the regulator on in POWERDOWN mode and switch to suspend voltage. Use case two (get_mode): The get_mode callback is used to retrieve the active mode state. Since the regulator-setting-A is used for the active state and regulator-setting-B for the suspend state there is no need to check which regulator setting is active. Fixes: 4068e5182ada ("regulator: da9062: DA9062 regulator driver") Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Reviewed-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Link: https://lore.kernel.org/r/20190917124246.11732-2-m.felsch@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>


Dan Carpenter

commit sha 752c938a5c14b8cbf0ed3ffbfa637fb166255c3f

ASoC: topology: Fix a signedness bug in soc_tplg_dapm_widget_create() The "template.id" variable is an enum and in this context GCC will treat it as an unsigned int so it can never be less than zero. Fixes: 8a9782346dcc ("ASoC: topology: Add topology core") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20190925110624.GR3264@mwanda Signed-off-by: Mark Brown <broonie@kernel.org>
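
One common way to avoid this kind of signedness bug is to keep the lookup result in a plain `int` until it has been checked, and only then convert it to the enum. The sketch below illustrates that pattern under assumed names (`enum widget_type`, `lookup_widget_type`, `create_widget` are hypothetical); it is not the actual change made in `soc_tplg_dapm_widget_create()`.

```c
#include <assert.h>
#include <errno.h>

enum widget_type { WIDGET_INPUT, WIDGET_OUTPUT, WIDGET_MIXER };

/* Hypothetical lookup: returns a valid widget_type value or -EINVAL. */
static int lookup_widget_type(int raw)
{
	if (raw < WIDGET_INPUT || raw > WIDGET_MIXER)
		return -EINVAL;
	return raw;
}

static int create_widget(int raw, enum widget_type *out)
{
	/*
	 * Keep the result in a signed int. Storing it straight into an
	 * enum that the compiler treats as unsigned would make the
	 * "< 0" check below always false.
	 */
	int id = lookup_widget_type(raw);

	if (id < 0)
		return id;

	*out = (enum widget_type)id;
	return 0;
}
```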


Jernej Skrabec

commit sha 2511366797fa6ab4a404b4b000ef7cd262aaafe8

arm64: dts: allwinner: a64: pine64-plus: Add PHY regulator delay Depending on kernel and bootloader configuration, it's possible that Realtek ethernet PHY isn't powered on properly. According to the datasheet, it needs 30ms to power up and then some more time before it can be used. Fix that by adding 100ms ramp delay to regulator responsible for powering PHY. Fixes: 94dcfdc77fc5 ("arm64: allwinner: pine64-plus: Enable dwmac-sun8i") Suggested-by: Ondrej Jirman <megous@megous.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Maxime Ripard <mripard@kernel.org>


Vasily Khoruzhick

commit sha ed3e9406bcbc32f84dc4aa4cb4767852e5ab086c

arm64: dts: allwinner: a64: Drop PMU node Looks like PMU in A64 is broken, it generates no interrupts at all and as result 'perf top' shows no events. Tested on Pine64-LTS. Fixes: 34a97fcc71c2 ("arm64: dts: allwinner: a64: Add PMU node") Cc: Harald Geyer <harald@ccbib.org> Cc: Jared D. McNeill <jmcneill@NetBSD.org> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Emmanuel Vadot <manu@FreeBSD.org> Signed-off-by: Maxime Ripard <mripard@kernel.org>


Jernej Skrabec

commit sha ccdf3aaa27ded6db9a93eed3ca7468bb2353b8fe

arm64: dts: allwinner: a64: sopine-baseboard: Add PHY regulator delay It turns out that sopine-baseboard needs same fix as pine64-plus for ethernet PHY. Here too Realtek ethernet PHY chip needs additional power on delay to properly initialize. Datasheet mentions that chip needs 30 ms to be properly powered on and that it needs some more time to be initialized. Fix that by adding 100ms ramp delay to regulator responsible for powering PHY. Note that issue was found out and fix tested on pine64-lts, but it's basically the same as sopine-baseboard, only layout and connectors differ. Fixes: bdfe4cebea11 ("arm64: allwinner: a64: add Ethernet PHY regulator for several boards") Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Maxime Ripard <mripard@kernel.org>


Rayagonda Kokatanur

commit sha 965f6603e3335a953f4f876792074cb36bf65f7f

arm64: dts: Fix gpio to pinmux mapping There are total of 151 non-secure gpio (0-150) and four pins of pinmux (91, 92, 93 and 94) are not mapped to any gpio pin, hence update same in DT. Fixes: 8aa428cc1e2e ("arm64: dts: Add pinctrl DT nodes for Stingray SOC") Signed-off-by: Rayagonda Kokatanur <rayagonda.kokatanur@broadcom.com> Reviewed-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>


Axel Lin

commit sha f64db548799e0330897c3203680c2ee795ade518

regulator: ti-abb: Fix timeout in ti_abb_wait_txdone/ti_abb_clear_all_txdone ti_abb_wait_txdone() may return -ETIMEDOUT when ti_abb_check_txdone() returns true in the latest iteration of the while loop because the timeout value is abb->settling_time + 1. Similarly, ti_abb_clear_all_txdone() may return -ETIMEDOUT when ti_abb_check_txdone() returns false in the latest iteration of the while loop. Fix it. Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Nishanth Menon <nm@ti.com> Link: https://lore.kernel.org/r/20190929095848.21960-1-axel.lin@ingics.com Signed-off-by: Mark Brown <broonie@kernel.org>
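
The off-by-one described here comes from deciding "timed out" by looking at the loop counter after the loop, rather than at whether the condition was actually met. A minimal sketch of a polling loop that avoids this is shown below; `wait_txdone`, `struct fake_hw` and `ready_after` are hypothetical names used only for illustration, and the real driver would delay between polls.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical hardware model: becomes ready on the Nth poll. */
struct fake_hw {
	int polls;
	int ready_on;
};

static bool ready_after(void *ctx)
{
	struct fake_hw *hw = ctx;

	return ++hw->polls >= hw->ready_on;
}

/*
 * Poll up to max_polls times. Success is decided by the check itself,
 * so a condition that becomes true on the very last poll is reported
 * as success rather than as -ETIMEDOUT.
 */
static int wait_txdone(bool (*check)(void *), void *ctx, int max_polls)
{
	int i;

	for (i = 0; i < max_polls; i++) {
		if (check(ctx))
			return 0;
		/* a real driver would udelay() here */
	}
	return -ETIMEDOUT;
}
```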


Sathyanarayana Nujella

commit sha 4bb41984bf2f4cb8ed6ec1579d317790bd941788

ASoC: max98373: check for device node before parsing The Oops below occurs on a system which uses ACPI instead of a device node: of_get_named_gpiod_flags: can't parse 'maxim,reset-gpio' property of node '(null)[0]' BUG: kernel NULL pointer dereference, address: 0000000000000010 This patch avoids the NULL pointer dereference by adding a check before parsing and initializing the reset-gpio pin as invalid. Signed-off-by: Sathyanarayana Nujella <sathyanarayana.nujella@intel.com> Signed-off-by: Jairaj Arava <jairaj.arava@intel.com> Link: https://lore.kernel.org/r/1569702150-11976-1-git-send-email-sathyanarayana.nujella@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>


Andy Shevchenko

commit sha 57ff2df1b952c7934d7b0e1d3a2ec403ec76edec

pinctrl: intel: Allocate IRQ chip dynamic Keeping the IRQ chip definition static shares it with multiple instances of the GPIO chip in the system. This is bad and now we get this warning from GPIO library: "detected irqchip that is shared with multiple gpiochips: please fix the driver." Hence, move the IRQ chip definition from being driver static into the struct intel_pinctrl. So a unique IRQ chip is used for each GPIO chip instance. Fixes: ee1a6ca43dba ("pinctrl: intel: Add Intel Broxton pin controller support") Depends-on: 5ff56b015e85 ("pinctrl: intel: Disable GPIO pin interrupts in suspend") Reported-by: Federico Ricchiuto <fed.ricchiuto@gmail.com> Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>


Dmitry Torokhov

commit sha 260996c30f4f3a732f45045e3e0efe27017615e4

pinctrl: cherryview: restore Strago DMI workaround for all versions This is essentially a revert of: e3f72b749da2 ("pinctrl: cherryview: fix Strago DMI workaround") and 86c5dd6860a6 ("pinctrl: cherryview: limit Strago DMI workarounds to version 1.0"), because even with 1.1 versions of BIOS there are some pins that are configured as interrupts but not claimed by any driver, and they sometimes fire up and result in interrupt storms that cause the touchpad to stop functioning and other issues. Given that we are unlikely to qualify another firmware version for a while it is better to keep the workaround active on all Strago boards. Reported-by: Alex Levin <levinale@chromium.org> Fixes: 86c5dd6860a6 ("pinctrl: cherryview: limit Strago DMI workarounds to version 1.0") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Alex Levin <levinale@chromium.org> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>


Li Xu

commit sha 9daf4fd0302b2559223cf90dae7dc510c6679047

ASoC: wm_adsp: Fix theoretical NULL pointer for alg_region Fix potential NULL pointer dereference for alg_region in wm_adsp_buffer_parse_legacy. In practice this can never happen as loading the firmware should have failed at the wm_adsp2_setup_algs stage, however probably better for the code to be robust against future changes and this is more helpful for static analysis. Signed-off-by: Li Xu <li.xu@cirrus.com> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20191001130911.19238-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>


Charles Keepax

commit sha f75841aa3b4bf02fcc7941af2d3e00ff74a93bdb

regulator: lochnagar: Add on_off_delay for VDDCORE The VDDCORE regulator takes a good length of time to discharge down, so add an on_off_delay to ensure DCVDD is removed before it is powered on again. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20191001132017.1785-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>


Pierre-Louis Bossart

commit sha 798614885a0e1b867ceb0197c30c2d82575c73b0

ASoC: SOF: loader: fix kernel oops on firmware boot failure When we fail to boot the firmware, we encounter a kernel oops in hda_dsp_get_registers(), which is called conditionally in hda_dsp_dump() when the sdev->boot_complete flag is set. Setting this flag _after_ dumping the data fixes the issue and does not change the programming flow. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20190927200538.660-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>


push time in 16 days

push event cyphar/linux

Aleksa Sarai

commit sha 6de35ed7417d3cb451ed3c940a2a6e7777b55d21

namei: LOOKUP_IN_ROOT: chroot-like path resolution /* Background. */ Container runtimes or other administrative management processes will often interact with root filesystems while in the host mount namespace, because the cost of doing a chroot(2) on every operation is too prohibitive (especially in Go, which cannot safely use vfork). However, a malicious program can trick the management process into doing operations on files outside of the root filesystem through careful crafting of symlinks. Most programs that need this feature have attempted to make this process safe, by doing all of the path resolution in userspace (with symlinks being scoped to the root of the malicious root filesystem). Unfortunately, this method is prone to foot-guns and usually such implementations have subtle security bugs. Thus, what userspace needs is a way to resolve a path as though it were in a chroot(2) -- with all absolute symlinks being resolved relative to the dirfd root (and ".." components being stuck under the dirfd root[1]) It is much simpler and more straight-forward to provide this functionality in-kernel (because it can be done far more cheaply and correctly). More classical applications that also have this problem (which have their own potentially buggy userspace path sanitisation code) include web servers, archive extraction tools, network file servers, and so on. [1]: At the moment, ".." and magic-link jumping are disallowed for the same reason it is disabled for LOOKUP_BENEATH -- currently it is not safe to allow it. Future patches may enable it unconditionally once we have resolved the possible races (for "..") and semantics (for magic-link jumping). /* Userspace API. */ LOOKUP_IN_ROOT will be exposed to userspace through openat2(2). 
There is a slight change in behaviour regarding pathnames -- if the pathname is absolute then the dirfd is still used as the root of resolution if LOOKUP_IN_ROOT is specified (this is to avoid obvious foot-guns, at the cost of a minor API inconsistency). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


Aleksa Sarai

commit sha 85f5248daee004eb459713fe77b339be8365e67e

namei: permit ".." resolution with LOOKUP_{IN_ROOT,BENEATH} This patch allows for LOOKUP_BENEATH and LOOKUP_IN_ROOT to safely permit ".." resolution (in the case of LOOKUP_BENEATH the resolution will still fail if ".." resolution would resolve a path outside of the root -- while LOOKUP_IN_ROOT will chroot(2)-style scope it). Magic-link jumps are still disallowed entirely[*]. The need for this patch (and the original no-".." restriction) is explained by observing there is a fairly easy-to-exploit race condition with chroot(2) (and thus by extension LOOKUP_IN_ROOT and LOOKUP_BENEATH if ".." is allowed) where a rename(2) of a path can be used to "skip over" nd->root and thus escape to the filesystem above nd->root. thread1 [attacker]: for (;;) renameat2(AT_FDCWD, "/a/b/c", AT_FDCWD, "/a/d", RENAME_EXCHANGE); thread2 [victim]: for (;;) openat2(dirb, "b/c/../../etc/shadow", { .flags = O_PATH, .resolve = RESOLVE_IN_ROOT } ); With fairly significant regularity, thread2 will resolve to "/etc/shadow" rather than "/a/b/etc/shadow". There is also a similar (though somewhat more privileged) attack using MS_MOVE. With this patch, such cases will be detected *during* ".." resolution and will return -EAGAIN for userspace to decide to either retry or abort the lookup. It should be noted that ".." is the weak point of chroot(2) -- walking *into* a subdirectory tautologically cannot result in you walking *outside* nd->root (except through a bind-mount or magic-link). There is also no other way for a directory's parent to change (which is the primary worry with ".." resolution here) other than a rename or MS_MOVE. This is a first-pass implementation, where -EAGAIN will be returned if any rename or mount occurs anywhere on the host (in any namespace). This will result in spurious errors, but there isn't a satisfactory alternative (other than denying ".." altogether). 
One other possible alternative (which previous versions of this patch used) would be to check with path_is_under() if there was a racing rename or mount (after re-taking the relevant seqlocks). While this does work, it results in possible O(n*m) behaviour if there are many renames or mounts occurring *anywhere on the system*. A variant of the above attack is included in the selftests for openat2(2) later in this patch series. I've run this test on several machines for several days and no instances of a breakout were detected. While this is not concrete proof that this is safe, when combined with the above argument it should lend some trustworthiness to this construction. [*] It may be acceptable in the future to do a path_is_under() check (as with the alternative solution for "..") for magic-links after they are resolved. However this seems unlikely to be a feature that people *really* need -- it can be added later if it turns out a lot of people want it. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
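
Since the commit leaves it to userspace "to decide to either retry or abort the lookup" on -EAGAIN, a caller might wrap the lookup in a bounded retry loop. The sketch below is one possible shape for that, under assumptions: `retry_on_eagain`, `struct flaky_op` and `flaky` are hypothetical, and the operation is modelled as returning a negative errno directly (a real syscall wrapper would return -1 and set `errno` instead).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical operation that fails with -EAGAIN a few times, then succeeds. */
struct flaky_op {
	int failures_left;
	int calls;
};

static int flaky(void *ctx)
{
	struct flaky_op *f = ctx;

	f->calls++;
	if (f->failures_left > 0) {
		f->failures_left--;
		return -EAGAIN;	/* a racing rename or mount was detected */
	}
	return 3;		/* pretend this is a file descriptor */
}

/*
 * Call op() until it returns something other than -EAGAIN, giving up
 * after max_tries attempts. A bound matters because the first-pass
 * implementation can fail spuriously under unrelated rename activity.
 */
static int retry_on_eagain(int (*op)(void *), void *ctx, int max_tries)
{
	int ret = -EAGAIN;
	int i;

	for (i = 0; i < max_tries && ret == -EAGAIN; i++)
		ret = op(ctx);
	return ret;
}
```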


Aleksa Sarai

commit sha 9d6e20a6301519580fc8029321d63bb402256df0

open: introduce openat2(2) syscall /* Background. */ For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. Thus it's necessary to (at the very least) add an additional flag. Adding a new syscall allows us to finally fix the flag-ignoring problem, and we can make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. */ /* * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. */ struct open_how { /* ... */ }; int openat2(int dfd, const char *pathname, struct open_how *how, size_t size); /* Description. */ The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR) will result in -EINVAL. In addition, this field is 64-bits wide to allow for more O_ flags than currently permitted with openat(2). 
mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. __padding Must be set to all zeroes. resolve Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE is no longer permitted for openat(2). As far as I can tell, this has always been a bug and appears to not be used by userspace (and I've not seen any problems on my machines by disallowing it). If it turns out this breaks something, we can special-case it and only permit it for openat(2) but not openat2(2). /* Testing. */ In a follow-up patch there are over 200 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios. In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace). /* Future Work. */ Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking automount during resolution). 
Furthermore, there are some other proposed changes to the openat(2) interface (the most obvious example is magic-link hardening[4]) which would be a good opportunity to add a way for userspace to restrict how O_PATH file descriptors can be re-opened. [1]: https://lwn.net/Articles/588444/ [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com [3]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 [4]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/ Suggested-by: Christian Brauner <christian@brauner.io> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
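
The "extensible structure plus size" convention described in this commit can be modelled in plain userspace C. The sketch below imitates the semantics the series attributes to copy_struct_from_user(): an older, shorter struct is zero-extended, and a newer, longer struct is accepted only if the bytes the receiver does not understand are all zero. The names `struct how_v1` and `copy_extensible` are hypothetical, and this is an illustration of the convention rather than the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "version 1" of an extensible argument struct. */
struct how_v1 {
	uint64_t flags;
	uint64_t resolve;
};

/*
 * Copy a caller-supplied struct of usize bytes into a receiver struct
 * of ksize bytes.
 *   usize < ksize: older caller, missing fields read as zero (no-op).
 *   usize > ksize: newer caller, allowed only if the extra bytes are
 *                  zero, otherwise -E2BIG.
 */
static int copy_extensible(void *dst, size_t ksize,
    const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t size = usize < ksize ? usize : ksize;
	size_t i;

	for (i = ksize; i < usize; i++) {
		if (p[i] != 0)
			return -E2BIG;
	}

	memset(dst, 0, ksize);
	memcpy(dst, src, size);
	return 0;
}
```

Because a zero value always means "no-op default", an old binary and a new kernel (or the reverse) can interoperate as long as no unknown feature is actually requested.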


Aleksa Sarai

commit sha 3fc3654a28dca18cedb23d81f7346572fef8b28e

selftests: add openat2(2) selftests Test all of the various openat2(2) flags. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. The main things these self-tests are enforcing are: * The struct+usize ABI for openat2(2) and copy_struct_from_user() to ensure that upgrades will be handled gracefully (in addition, ensuring that misaligned structures are also handled correctly). * The -EINVAL checks for openat2(2) are all correctly handled to avoid userspace passing unknown or conflicting flag sets (most importantly, ensuring that invalid flag combinations are checked). * All of the RESOLVE_* semantics (including errno values) are correctly handled with various combinations of paths and flags. * RESOLVE_IN_ROOT correctly protects against the symlink rename(2) attack that has been responsible for several CVEs (and likely will be responsible for several more). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

view details

Aleksa Sarai

commit sha 8e85875e67c2e09f81f63578a1f8f5ef82a1628d

Documentation: path-lookup: mention LOOKUP_MAGICLINK_JUMPED Now that we have a special flag to signify magic-link jumps, mention it within the path-lookup docs. And now that "magic link" is the correct term for nd_jump_link()-style symlinks, clean up references to this type of "symlink". Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

view details

push time in 17 days

push event cyphar/linux

Aleksa Sarai

commit sha 6156cc1b628fb29a1772bde82211b7abb06fbaea

namei: LOOKUP_IN_ROOT: chroot-like path resolution /* Background. */ Container runtimes or other administrative management processes will often interact with root filesystems while in the host mount namespace, because the cost of doing a chroot(2) on every operation is too prohibitive (especially in Go, which cannot safely use vfork). However, a malicious program can trick the management process into doing operations on files outside of the root filesystem through careful crafting of symlinks. Most programs that need this feature have attempted to make this process safe, by doing all of the path resolution in userspace (with symlinks being scoped to the root of the malicious root filesystem). Unfortunately, this method is prone to foot-guns and usually such implementations have subtle security bugs. Thus, what userspace needs is a way to resolve a path as though it were in a chroot(2) -- with all absolute symlinks being resolved relative to the dirfd root (and ".." components being stuck under the dirfd root[1]) It is much simpler and more straight-forward to provide this functionality in-kernel (because it can be done far more cheaply and correctly). More classical applications that also have this problem (which have their own potentially buggy userspace path sanitisation code) include web servers, archive extraction tools, network file servers, and so on. [1]: At the moment, ".." and magic-link jumping are disallowed for the same reason it is disabled for LOOKUP_BENEATH -- currently it is not safe to allow it. Future patches may enable it unconditionally once we have resolved the possible races (for "..") and semantics (for magic-link jumping). /* Userspace API. */ LOOKUP_IN_ROOT will be exposed to userspace through openat2(2). 
There is a slight change in behaviour regarding pathnames -- if the pathname is absolute then the dirfd is still used as the root of resolution if LOOKUP_IN_ROOT is specified (this is to avoid obvious foot-guns, at the cost of a minor API inconsistency). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


Aleksa Sarai

commit sha ee8794d9739144ab73a6667de49b0b3829fe20a2

namei: permit ".." resolution with LOOKUP_{IN_ROOT,BENEATH} This patch allows for LOOKUP_BENEATH and LOOKUP_IN_ROOT to safely permit ".." resolution (in the case of LOOKUP_BENEATH the resolution will still fail if ".." resolution would resolve a path outside of the root -- while LOOKUP_IN_ROOT will chroot(2)-style scope it). Magic-link jumps are still disallowed entirely[*]. The need for this patch (and the original no-".." restriction) is explained by observing there is a fairly easy-to-exploit race condition with chroot(2) (and thus by extension LOOKUP_IN_ROOT and LOOKUP_BENEATH if ".." is allowed) where a rename(2) of a path can be used to "skip over" nd->root and thus escape to the filesystem above nd->root. thread1 [attacker]: for (;;) renameat2(AT_FDCWD, "/a/b/c", AT_FDCWD, "/a/d", RENAME_EXCHANGE); thread2 [victim]: for (;;) openat2(dirb, "b/c/../../etc/shadow", { .flags = O_PATH, .resolve = RESOLVE_IN_ROOT } ); With fairly significant regularity, thread2 will resolve to "/etc/shadow" rather than "/a/b/etc/shadow". There is also a similar (though somewhat more privileged) attack using MS_MOVE. With this patch, such cases will be detected *during* ".." resolution and will return -EAGAIN for userspace to decide to either retry or abort the lookup. It should be noted that ".." is the weak point of chroot(2) -- walking *into* a subdirectory tautologically cannot result in you walking *outside* nd->root (except through a bind-mount or magic-link). There is also no other way for a directory's parent to change (which is the primary worry with ".." resolution here) other than a rename or MS_MOVE. This is a first-pass implementation, where -EAGAIN will be returned if any rename or mount occurs anywhere on the host (in any namespace). This will result in spurious errors, but there isn't a satisfactory alternative (other than denying ".." altogether). 
One other possible alternative (which previous versions of this patch used) would be to check with path_is_under() if there was a racing rename or mount (after re-taking the relevant seqlocks). While this does work, it results in possible O(n*m) behaviour if there are many renames or mounts occurring *anywhere on the system*. A variant of the above attack is included in the selftests for openat2(2) later in this patch series. I've run this test on several machines for several days and no instances of a breakout were detected. While this is not concrete proof that this is safe, when combined with the above argument it should lend some trustworthiness to this construction. [*] It may be acceptable in the future to do a path_is_under() check (as with the alternative solution for "..") for magic-links after they are resolved. However this seems unlikely to be a feature that people *really* need -- it can be added later if it turns out a lot of people want it. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Aleksa Sarai

commit sha 9374c3e2a44f241427ff60fe748d356d53e33572

open: introduce openat2(2) syscall /* Background. */ For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. Thus it's necessary to (at the very least) add an additional flag. Adding a new syscall allows us to finally fix the flag-ignoring problem, and we can make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. */ /* * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. */ struct open_how { /* ... */ }; int openat2(int dfd, const char *pathname, struct open_how *how, size_t size); /* Description. */ The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR) will result in -EINVAL. In addition, this field is 64-bits wide to allow for more O_ flags than currently permitted with openat(2). 
mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. __padding Must be set to all zeroes. resolve Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE is no longer permitted for openat(2). As far as I can tell, this has always been a bug and appears to not be used by userspace (and I've not seen any problems on my machines by disallowing it). If it turns out this breaks something, we can special-case it and only permit it for openat(2) but not openat2(2). /* Testing. */ In a follow-up patch there are over 200 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios. In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace). /* Future Work. */ Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking automount during resolution). 
Furthermore, there are some other proposed changes to the openat(2) interface (the most obvious example is magic-link hardening[4]) which would be a good opportunity to add a way for userspace to restrict how O_PATH file descriptors can be re-opened. [1]: https://lwn.net/Articles/588444/ [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com [3]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 [4]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/ Suggested-by: Christian Brauner <christian@brauner.io> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Aleksa Sarai

commit sha 5b84a3779d7888806bebb3305bfc40ac3d748e8c

selftests: add openat2(2) selftests Test all of the various openat2(2) flags. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. The main things these self-tests are enforcing are: * The struct+usize ABI for openat2(2) and copy_struct_from_user() to ensure that upgrades will be handled gracefully (in addition, ensuring that misaligned structures are also handled correctly). * The -EINVAL checks for openat2(2) are all correctly handled to avoid userspace passing unknown or conflicting flag sets (most importantly, ensuring that invalid flag combinations are checked). * All of the RESOLVE_* semantics (including errno values) are correctly handled with various combinations of paths and flags. * RESOLVE_IN_ROOT correctly protects against the symlink rename(2) attack that has been responsible for several CVEs (and likely will be responsible for several more). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
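The struct+usize ABI mentioned above can be illustrated with a userspace model of the size handling. This is a sketch, assuming the usual extensible-struct rules: a smaller (older) userspace struct is zero-extended, and a larger (newer) one is accepted only if the bytes the kernel does not know about are all zero. `copy_struct_model` is a hypothetical name and works on plain memory, not on user pointers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Model of copying a userspace struct of size usize into a kernel
 * struct of size ksize. Returns 0 on success or -E2BIG if userspace set
 * bytes beyond what this "kernel" understands. */
static int copy_struct_model(void *dst, size_t ksize,
			     const void *src, size_t usize)
{
	size_t size = usize < ksize ? usize : ksize;
	const unsigned char *s = src;

	assert(dst != NULL && src != NULL);

	/* Newer userspace: every trailing byte we do not know must be zero. */
	for (size_t i = ksize; i < usize; i++)
		if (s[i] != 0)
			return -E2BIG;

	/* Older userspace: zero-fill the fields it does not know about. */
	if (usize < ksize)
		memset((unsigned char *)dst + size, 0, ksize - size);

	memcpy(dst, src, size);
	return 0;
}
```

This is why a zero value has to act as a no-op default for every future field: an old binary and a new kernel then agree on the meaning of the zero-filled tail.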

Aleksa Sarai

commit sha 4dca371ac71c8bffff8a4e368aad66354be0263b

Documentation: path-lookup: mention LOOKUP_MAGICLINK_JUMPED Now that we have a special flag to signify magic-link jumps, mention it within the path-lookup docs. And now that "magic link" is the correct term for nd_jump_link()-style symlinks, clean up references to this type of "symlink". Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in 17 days

push event cyphar/linux

Talel Shenhar

commit sha 9c426b770bd088f18899f836093d810a83b59b98

irqchip/al-fic: Add support for irq retrigger Introduce interrupts retrigger support for Amazon's Annapurna Labs Fabric Interrupt Controller. Signed-off-by: Talel Shenhar <talel@amazon.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1568018358-18985-1-git-send-email-talel@amazon.com

Sandeep Sheriker Mallikarjun

commit sha 212fbf2c9e84ceb267cadd8342156b69b54b8135

irqchip/atmel-aic5: Add support for sam9x60 irqchip Add support for SAM9X60 irqchip. Signed-off-by: Sandeep Sheriker Mallikarjun <sandeepsheriker.mallikarjun@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1568026835-6646-1-git-send-email-claudiu.beznea@microchip.com [claudiu.beznea@microchip.com: update aic5_irq_fixups[], update documentation]

Zenghui Yu

commit sha c107d613f9204ff9c7624c229938153d7492c56e

irqchip/gic-v3: Fix GIC_LINE_NR accessor As per GIC spec, ITLinesNumber indicates the maximum SPI INTID that the GIC implementation supports. And the maximum SPI INTID an implementation might support is 1019 (field value 11111). max(GICD_TYPER_SPIS(...), 1020) is not what we actually want for GIC_LINE_NR. Fix it to min(GICD_TYPER_SPIS(...), 1020). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1568789850-14080-1-git-send-email-yuzenghui@huawei.com
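The difference between the two expressions is easy to see with numbers. The sketch below assumes the GIC encoding in which the 5-bit ITLinesNumber field N stands for 32 * (N + 1) interrupt lines; the function names are hypothetical and stand in for the macros named in the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Lines encoded by the 5-bit ITLinesNumber field: 32 * (N + 1). */
static unsigned int typer_spis(uint32_t typer)
{
	return ((typer & 0x1f) + 1) * 32;
}

/* Fixed behaviour: never report more than 1020 lines. */
static unsigned int line_nr_fixed(uint32_t typer)
{
	unsigned int nr = typer_spis(typer);

	if (nr > 1020)
		nr = 1020;
	assert(nr <= 1020);
	return nr;
}

/* Old behaviour: max() turns 1020 into a floor instead of a ceiling. */
static unsigned int line_nr_buggy(uint32_t typer)
{
	unsigned int nr = typer_spis(typer);

	return nr > 1020 ? nr : 1020;
}
```

With the field at its maximum (11111), the encoded value is 1024, which the fixed version caps at 1020. With a small field value such as 0, the fixed version reports 32 lines while the old one would still claim 1020.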

Marc Zyngier

commit sha bb0fed1c60cccbe4063b455a7228818395dac86e

irqchip/sifive-plic: Switch to fasteoi flow The SiFive PLIC interrupt controller seems to have all the HW features to support the fasteoi flow, but the driver seems to be stuck in a distant past. Bring it into the 21st century. Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Palmer Dabbelt <palmer@sifive.com> (QEMU Boot) Tested-by: Darius Rad <darius@bluespec.com> (on 2 HW PLIC implementations) Tested-by: Paul Walmsley <paul.walmsley@sifive.com> (HiFive Unleashed) Reviewed-by: Palmer Dabbelt <palmer@sifive.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/8636gxskmj.wl-maz@kernel.org

Steffen Maier

commit sha 82a9ac7130cf51c2640800fb0ef19d3a05cb8fff

scsi: core: fix missing .cleanup_rq for SCSI hosts without request batching This was missing from scsi_mq_ops_no_commit of linux-next commit 8930a6c20791 ("scsi: core: add support for request batching") from Martin's scsi/5.4/scsi-queue or James' scsi/misc. See also linux-next commit b7e9e1fb7a92 ("scsi: implement .cleanup_rq callback") from block/for-next. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: 8930a6c20791 ("scsi: core: add support for request batching") Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ming Lei <ming.lei@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Steffen Maier

commit sha 6b6fa7a5c86e1269d9f0c9a5b902072351317387

scsi: core: fix dh and multipathing for SCSI hosts without request batching This was missing from scsi_device_from_queue() due to the introduction of another new scsi_mq_ops_no_commit of linux-next commit 8930a6c20791 ("scsi: core: add support for request batching") from Martin's scsi/5.4/scsi-queue or James' scsi/misc. Only devicehandler code seems to call scsi_device_from_queue(): *** drivers/scsi/scsi_dh.c: scsi_dh_activate[255] sdev = scsi_device_from_queue(q); scsi_dh_set_params[302] sdev = scsi_device_from_queue(q); scsi_dh_attach[325] sdev = scsi_device_from_queue(q); scsi_dh_attached_handler_name[363] sdev = scsi_device_from_queue(q); Fixes multipath tools follow-on errors: $ multipath -v6 ... libdevmapper: ioctl/libdm-iface.c(1887): device-mapper: reload ioctl on mpatha failed: No such device ... mpatha: failed to load map, error 19 ... showing also as kernel messages: device-mapper: table: 252:0: multipath: error attaching hardware handler device-mapper: ioctl: error adding target to table Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: 8930a6c20791 ("scsi: core: add support for request batching") Cc: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Shuah Khan

commit sha 55d554f5d14071f7c2c5dbd88d0a2eb695c97d16

tools: bpf: Use !building_out_of_srctree to determine srctree make TARGETS=bpf kselftest fails with: Makefile:127: tools/build/Makefile.include: No such file or directory When the bpf tool make is invoked from tools Makefile, srctree is cleared and the current logic checks for srctree being equal to the empty string to determine srctree location from CURDIR. When the build is invoked from selftests/bpf Makefile, the srctree is set to "." and the same logic used for the empty srctree case is needed to determine srctree. Check building_out_of_srctree undefined as the condition for both cases to fix "make TARGETS=bpf kselftest" build failure. Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20190927011344.4695-1-skhan@linuxfoundation.org

Yonghong Song

commit sha 1bd63524593b964934a33afd442df16b8f90e2b5

libbpf: handle symbol versioning properly for libbpf.a bcc uses libbpf repo as a submodule. It brings in libbpf source code and builds everything together to produce shared libraries. With latest libbpf, I got the following errors: /bin/ld: libbcc_bpf.so.0.10.0: version node not found for symbol xsk_umem__create@LIBBPF_0.0.2 /bin/ld: failed to set dynamic section sizes: Bad value collect2: error: ld returned 1 exit status make[2]: *** [src/cc/libbcc_bpf.so.0.10.0] Error 1 In xsk.c, we have asm(".symver xsk_umem__create_v0_0_2, xsk_umem__create@LIBBPF_0.0.2"); asm(".symver xsk_umem__create_v0_0_4, xsk_umem__create@@LIBBPF_0.0.4"); The linker thinks the build is for LIBBPF but cannot find the proper version LIBBPF_0.0.2/4, so it emits errors. I also confirmed that using libbpf.a to produce a shared library also has issues: -bash-4.4$ cat t.c extern void *xsk_umem__create; void * test() { return xsk_umem__create; } -bash-4.4$ gcc -c -fPIC t.c -bash-4.4$ gcc -shared t.o libbpf.a -o t.so /bin/ld: t.so: version node not found for symbol xsk_umem__create@LIBBPF_0.0.2 /bin/ld: failed to set dynamic section sizes: Bad value collect2: error: ld returned 1 exit status -bash-4.4$ Symbol versioning does happen in commonly used libraries, e.g., elfutils and glibc. For static libraries, for a versioned symbol, the old definitions will be ignored, and the symbol will be an alias to the latest definition. For example, glibc sched_setaffinity is versioned. 
-bash-4.4$ readelf -s /usr/lib64/libc.so.6 | grep sched_setaffinity 756: 000000000013d3d0 13 FUNC GLOBAL DEFAULT 13 sched_setaffinity@GLIBC_2.3.3 757: 00000000000e2e70 455 FUNC GLOBAL DEFAULT 13 sched_setaffinity@@GLIBC_2.3.4 1800: 0000000000000000 0 FILE LOCAL DEFAULT ABS sched_setaffinity.c 4228: 00000000000e2e70 455 FUNC LOCAL DEFAULT 13 __sched_setaffinity_new 4648: 000000000013d3d0 13 FUNC LOCAL DEFAULT 13 __sched_setaffinity_old 7338: 000000000013d3d0 13 FUNC GLOBAL DEFAULT 13 sched_setaffinity@GLIBC_2 7380: 00000000000e2e70 455 FUNC GLOBAL DEFAULT 13 sched_setaffinity@@GLIBC_ -bash-4.4$ For the static library, the definition of sched_setaffinity aliases to the new definition. -bash-4.4$ readelf -s /usr/lib64/libc.a | grep sched_setaffinity File: /usr/lib64/libc.a(sched_setaffinity.o) 8: 0000000000000000 455 FUNC GLOBAL DEFAULT 1 __sched_setaffinity_new 12: 0000000000000000 455 FUNC WEAK DEFAULT 1 sched_setaffinity For both elfutils and glibc, additional macros are used to control different handling of symbol versioning w.r.t static and shared libraries. For elfutils, the macro is SYMBOL_VERSIONING (https://sourceware.org/git/?p=elfutils.git;a=blob;f=lib/eu-config.h). For glibc, the macro is SHARED (https://sourceware.org/git/?p=glibc.git;a=blob;f=include/shlib-compat.h;hb=refs/heads/master) This patch used SHARED as the macro name. After this patch, the libbpf.a has -bash-4.4$ readelf -s libbpf.a | grep xsk_umem__create 372: 0000000000017145 1190 FUNC GLOBAL DEFAULT 1 xsk_umem__create_v0_0_4 405: 0000000000017145 1190 FUNC GLOBAL DEFAULT 1 xsk_umem__create 499: 00000000000175eb 103 FUNC GLOBAL DEFAULT 1 xsk_umem__create_v0_0_2 -bash-4.4$ No versioned symbols for xsk_umem__create. The libbpf.a can be used to build a shared library successfully. 
-bash-4.4$ cat t.c extern void *xsk_umem__create; void * test() { return xsk_umem__create; } -bash-4.4$ gcc -c -fPIC t.c -bash-4.4$ gcc -shared t.o libbpf.a -o t.so -bash-4.4$ Fixes: 10d30e301732 ("libbpf: add flags to umem config") Cc: Kevin Laatz <kevin.laatz@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Oliver Neukum

commit sha 21e3d6c81179bbdfa279efc8de456c34b814cfd2

scsi: sd: Ignore a failure to sync cache due to lack of authorization I've got a report about a UAS drive enclosure reporting back Sense: Logical unit access not authorized if the drive it holds is password protected. While the drive is obviously unusable in that state as a mass storage device, it still exists as a sd device and when the system is asked to perform a suspend of the drive, it will be sent a SYNCHRONIZE CACHE. If that fails due to password protection, the error must be ignored. Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20190903101840.16483-1-oneukum@suse.com Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Daniel Wagner

commit sha 9bc6157f5fd0e898c94f3018d088a3419bde0d8f

scsi: qla2xxx: Remove WARN_ON_ONCE in qla2x00_status_cont_entry() Commit 88263208dd23 ("scsi: qla2xxx: Complain if sp->done() is not called from the completion path") introduced the WARN_ON_ONCE in qla2x00_status_cont_entry(). The assumption was that there is only one status continuations element. According to the firmware documentation it is possible that multiple status continuations are emitted by the firmware. Fixes: 88263208dd23 ("scsi: qla2xxx: Complain if sp->done() is not called from the completion path") Link: https://lore.kernel.org/r/20190927073031.62296-1-dwagner@suse.de Cc: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Liu Xiang

commit sha 6db7bfb431220d78e34d2d0afdb7c12683323588

iommu/arm-smmu: Free context bitmap in the err path of arm_smmu_init_domain_context When alloc_io_pgtable_ops fails, the context bitmap which was just allocated by __arm_smmu_alloc_bitmap should be freed to release the resource. Signed-off-by: Liu Xiang <liuxiang_1999@126.com> Signed-off-by: Will Deacon <will@kernel.org>

Robin Murphy

commit sha 52f325f4eb321ea2e8a0779f49a3866be58bc694

iommu/io-pgtable-arm: Correct Mali attributes Whilst Midgard's MEMATTR follows a similar principle to the VMSA MAIR, the actual attribute values differ, so although it currently appears to work to some degree, we probably shouldn't be using our standard stage 1 MAIR for that. Instead, generate a reasonable MEMATTR with attribute values borrowed from the kbase driver; at this point we'll be overriding or ignoring pretty much all of the LPAE config, so just implement these Mali details in a dedicated allocator instead of pretending to subclass the standard VMSA format. Fixes: d08d42de6432 ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format") Tested-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>

Robin Murphy

commit sha 1be08f458d1602275b02f5357ef069957058f3fd

iommu/io-pgtable-arm: Support all Mali configurations In principle, Midgard GPUs supporting smaller VA sizes should only require 3-level pagetables, since level 0 only resolves bits 48:40 of the address. However, the kbase driver does not appear to have any notion of a variable start level, and empirically T720 and T820 rapidly blow up with translation faults unless given a full 4-level table, despite only supporting a 33-bit VA size. The 'real' IAS value is still valuable in terms of validating addresses on map/unmap, so tweak the allocator to allow smaller values while still forcing the resultant tables to the full 4 levels. As far as I can test, this should make all known Midgard variants happy. Fixes: d08d42de6432 ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format") Tested-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>

Brian Vazquez

commit sha 86c1aea84b97120a6d428ce17a2ebd55be677f56

selftests/bpf: test_progs: Don't leak server_fd in tcp_rtt server_fd needs to be closed if pthread can't be created. Fixes: 8a03222f508b ("selftests/bpf: test_progs: fix client/server race in tcp_rtt") Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20191001173728.149786-2-brianvv@google.com

Brian Vazquez

commit sha a2d074e4c6e81ec9ab359d54f0b88273c738de37

selftests/bpf: test_progs: Don't leak server_fd in test_sockopt_inherit server_fd needs to be closed if pthread can't be created. Fixes: e3e02e1d9c24 ("selftests/bpf: test_progs: convert test_sockopt_inherit") Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20191001173728.149786-3-brianvv@google.com

Stanislaw Gruszka

commit sha c91a9cfe9f6d136172a52ff6e01b3f83ba850c19

rt2x00: initialize last_reset Initialize last_reset variable to INITIAL_JIFFIES, otherwise it is not possible to test H/W reset for first 5 minutes of system run. Fixes: e403fa31ed71 ("rt2x00: add restart hw") Reported-and-tested-by: Jonathan Liu <net147@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

Marco Felsch

commit sha afce285b859cea91c182015fc9858ea58c26cd0e

Input: da9063 - fix capability and drop KEY_SLEEP Since commit f889beaaab1c ("Input: da9063 - report KEY_POWER instead of KEY_SLEEP during power key-press") KEY_SLEEP isn't supported anymore. This caused input device to not generate any events if "dlg,disable-key-power" is set. Fix this by unconditionally setting KEY_POWER capability, and not declaring KEY_SLEEP. Fixes: f889beaaab1c ("Input: da9063 - report KEY_POWER instead of KEY_SLEEP during power key-press") Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

Yauhen Kharuzhy

commit sha bd3b8480237680b5967aee3c814b92b2fd87a582

Input: goodix - add support for 9-bytes reports Some variants of Goodix touchscreen firmwares use 9-bytes finger report format instead of common 8-bytes format. This report format may be present as: struct goodix_contact_data { uint8_t unknown1; uint8_t track_id; uint8_t unknown2; uint16_t x; uint16_t y; uint16_t w; }__attribute__((packed)); Add support for such format and use it for Lenovo Yoga Book notebook (which uses a Goodix touchpad as a touch keyboard). Signed-off-by: Yauhen Kharuzhy <jekhor@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
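The 9-byte layout quoted above can be checked and decoded in a few lines. This is a sketch, not the driver's code: `struct contact`, `le16_at` and `decode_9b` are hypothetical names, and the multi-byte fields are assumed to be little-endian on the wire.

```c
#include <assert.h>
#include <stdint.h>

/* Layout as given in the commit message. */
struct goodix_contact_data {
	uint8_t unknown1;
	uint8_t track_id;
	uint8_t unknown2;
	uint16_t x;
	uint16_t y;
	uint16_t w;
} __attribute__((packed));

/* Without the packed attribute, padding would make this larger than 9. */
_Static_assert(sizeof(struct goodix_contact_data) == 9,
	       "9-byte report format expected");

/* Hypothetical decoded form used only in this example. */
struct contact {
	uint8_t track_id;
	uint16_t x, y, w;
};

/* Read a 16-bit value byte by byte (assumed little-endian), which avoids
 * unaligned accesses and host byte-order dependence. */
static uint16_t le16_at(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode one 9-byte finger report using the field offsets implied by
 * the packed struct: track_id at 1, x at 3, y at 5, w at 7. */
static struct contact decode_9b(const uint8_t *report)
{
	struct contact c;

	assert(report != 0);
	c.track_id = report[1];
	c.x = le16_at(&report[3]);
	c.y = le16_at(&report[5]);
	c.w = le16_at(&report[7]);
	return c;
}
```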

Daniel Black

commit sha 35b9ad840892d979dbeffe702dae95a3cbefa07b

ACPI: HMAT: ACPI_HMAT_MEMORY_PD_VALID is deprecated since ACPI-6.3 ACPI-6.3 corresponds to when HMAT revision was bumped from 1 to 2. In this version ACPI_HMAT_MEMORY_PD_VALID was deprecated and made reserved. As such in revision 2+ we shouldn't be testing this flag. This is as per ACPI-6.3, 5.2.27.3, Table 5-145 "Memory Proximity Domain Attributes Structure" for Flags. Signed-off-by: Daniel Black <daniel@linux.ibm.com> Reviewed-by: Tao Xu <tao3.xu@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Geert Uytterhoeven

commit sha e8307ec51efebf579da5966aa5da5ab5353c61c7

mmc: renesas_sdhi: Do not use platform_get_irq() to count interrupts As platform_get_irq() now prints an error when the interrupt does not exist, counting interrupts by looping until failure causes the printing of scary messages like: renesas_sdhi_internal_dmac ee140000.sd: IRQ index 1 not found Fix this by using the platform_irq_count() helper to avoid touching non-existent interrupts. Fixes: 7723f4c5ecdb8d83 ("driver core: platform: Add an error message to platform_get_irq*()") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

push time in 17 days

delete branch cyphar/umoci

delete branch : dont-mask-error

delete time in 19 days

PR merged openSUSE/umoci

tar_extract: don't mask original error on short write lgtm/need 1

If there was an error in io.Copy() which resulted in a short write, we used to hide the error (though the error would be quite useful for debugging). Thus only set err to io.ErrShortWrite if there's no error set already.

Fixes #301 Reported-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Aleksa Sarai <asarai@suse.de>

+5 -1

2 comments

1 changed file

cyphar

pr closed time in 19 days

push event openSUSE/umoci

Aleksa Sarai

commit sha 63d4feb6f9fd3b9c509c092184f2b1e9e18b0589

tar_extract: don't mask original error on short write If there was an error in io.Copy() which resulted in a short write, we used to hide the error (though the error would be quite useful for debugging). Thus only set err to io.ErrShortWrite if there's no error set already. Reported-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Aleksa Sarai <asarai@suse.de>

Aleksa Sarai

commit sha 55e2ed6ae040db06fbf3b1595ddd8c4ad487d11b

merge branch 'pr-306' Aleksa Sarai (1): tar_extract: don't mask original error on short write LGTMs: @cyphar Closes #306

push time in 19 days

PR closed openSUSE/umoci

tar_extract: don't mask original error on short write lgtm/need 1

If there was an error in io.Copy() which resulted in a short write, we used to hide the error (though the error would be quite useful for debugging). Thus only set err to io.ErrShortWrite if there's no error set already.

Fixes #301 Reported-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Aleksa Sarai <asarai@suse.de>

+5 -1

2 comments

1 changed file

cyphar

pr closed time in 19 days

pull request comment openSUSE/umoci

tar_extract: don't mask original error on short write

LGTM.

cyphar

comment created time in 19 days

PR closed openSUSE/umoci

tar_extract: don't mask original error on short write lgtm/need 2

The original error may be useful in figuring out why the short write happened, so let's not mask it and instead include it in the new error.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>

+1 -1

1 comment

1 changed file

tych0

pr closed time in 19 days

pull request comment openSUSE/umoci

tar_extract: don't mask original error on short write

Closed in favour of #306.

tych0

comment created time in 19 days

PR opened openSUSE/umoci

tar_extract: don't mask original error on short write

If there was an error in io.Copy() which resulted in a short write, we used to hide the error (though the error would be quite useful for debugging). Thus only set err to io.ErrShortWrite if there's no error set already.

Fixes #301 Reported-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Aleksa Sarai <asarai@suse.de>

+5 -1

0 comment

1 changed file

pr created time in 19 days

create branch cyphar/umoci

branch : dont-mask-error

created branch time in 19 days

Pull request review comment openSUSE/umoci

tar_extract: don't mask original error on short write

 func (te *TarExtractor) UnpackEntry(root string, hdr *tar.Header, r io.Reader) (
 		// We need to make sure that we copy all of the bytes.
 		n, err := io.Copy(fh, r)
 		if int64(n) != hdr.Size {
-			err = io.ErrShortWrite
+			err = errors.Wrapf(io.ErrShortWrite, "didn't write all the bytes to the new file (%s)", err)

Ah wait, err might be nil -- I will rework this patch and clean it up (no need to go through a bunch of back-and-forth over a one-line change).

tych0

comment created time in 19 days

issue comment openSUSE/libpathrs

protection against unprivileged symbolic links?

For the kernel side, I could try to add a RESOLVE_PRIVILEGED_SYMLINKS to a future version of the openat2 patchset which implements those semantics. But I'd like to get openat2 merged first :wink:.

maltek

comment created time in 20 days

issue comment openSUSE/libpathrs

protection against unprivileged symbolic links?

It isn't implemented, though I'm definitely not against the idea. One problem, though, is that we cannot (as easily) implement it for the kernel-mode driver because that actually does the resolution in-kernel.

Though, I think there is a question about what threat model we are defending against by implementing it -- symlinks will always be resolved within a given Root, so are we saying that even within a Root we want to stop an administrative process from following links that could be bad if an in-container administrative process followed them?

maltek

comment created time in 20 days

push event cyphar/man-pages

Aleksa Sarai

commit sha 0c4c9daacba8b1b3f34bd8bbd2953bafa27cc9d1

openat2.2: document new openat2(2) syscall Rather than trying to merge the new syscall documentation into open.2 (which would probably result in the man-page being incomprehensible), instead the new syscall gets its own dedicated page with links between open(2) and openat2(2) to avoid duplicating information such as the list of O_* flags or common errors. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Aleksa Sarai

commit sha 424e3c34630892b98afdac63270a8a348dc5e43f

path_resolution.7: update to mention openat2(2) features Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in 20 days

push event cyphar/man-pages

Michael Kerrisk

commit sha ee81d7e418523d9fd25b27adcaf27c9885e215af

namespaces.7: Include manual page references in the summary table of namespace types Make the page more compact by removing the stub subsections that list the manual pages for the namespace types. And while we're here, add an explanation of the table columns. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha b27d444f347bdc0146980a3db76428f82b391ba1

pivot_root.2: Remove an imprecision in description Remove the text that suggests that pivot_root() changes the root directory and CWD of processes that have their root directory and CWD on the old root *filesystem*. Change "filesystem" to "directory". Reported-by: Philipp Wendler <ml@philippwendler.de> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 9f3af6b8c88fe59eb8e7fdfe15e7861535aa94e1

pivot_root.2: Simplify discussion of restrictions for 'new_root' Philipp Wendler noted that the text on the restrictions for 'new_root' was slightly contradictory, and things could be clarified and simplified by describing the restrictions on 'new_root' in one place. Reported-by: Philipp Wendler <ml@philippwendler.de> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 33313a260ccd688984217a6cebc6f4604baf9028

pivot_root.2: Change "filesystem" to "mount" in various places Quoting Eric: If we are going to be pedantic "filesystem" is really the wrong concept here. The section about bind mount clarifies it, but I wonder if there is a better term. I think I would say: "new_root and put_old must not be on the same mount as the current root." I think using "mount" instead of "filesystem" keeps the concepts less confusing. As I am reading through this email and seeing text that is trying to be precise and clear then hitting the term "filesystem" is a bit jarring. pivot_root doesn't care a thing for file systems. pivot_root only cares about mounts. And by a "mount" I mean the thing that you get when you create a bind mount or you call mount normally. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 666373fc08d456ab6a92bae3ae6ee2b12f838fbd

pivot_root.2: Reword one of the restrictions on 'new_root' As suggested by Eric Biederman. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 0843016c9b4f9bca4bc2738bef16065bc3a19fc1

pivot_root.2: s/root filesystem/root mount/ As suggested by Eric Biederman. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 97076c5a0b05a76c14db8ebbeb9209eb96b8865d

pivot_root.2: Minor change: relocate a paragraph in NOTES Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 3db820fe18093c6b63a678c0ed31b52ce050b640

pivot_root.2: Add a subsection header for the pivot_root(".", ".") discussion Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 01c64c3b4bd08c5ff7f18af8b5a5c72c628e867f

pivot_root.2: Relegate text about what pivot_root() may or may not do to NOTES The text stating that "pivot_root() may or may not change the current root and the current working directory of any processes or threads which use the old root directory" was written 19 years ago, before the system call itself was even finalized in the kernel. The implementation has never changed, and it won't change in the future, since that would cause user-space breakage. The existence of that text in DESCRIPTION, followed by qualifying text stating what the implementation actually does (and has always done) makes for confusing reading. Therefore, relegate this text to a historical note in NOTES (so that readers with long memories can see why the manual page was changed) and rework the text in DESCRIPTION accordingly. Reported-by: Philipp Wendler <ml@philippwendler.de> Reported-by: Eric W. Biederman <ebiederm@xmission.com> Reported-by: Reid Priedhorsky <reidpr@lanl.gov> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 542175d8e4999026be8a1397b8f4e531925b73b7

pivot_root.2: Tweak text of an EINVAL error to correspond to DESCRIPTION Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha ba4b07c30f51605e304a37f08297d9cd4d5c8930

pivot_root.2: Another couple of s/filesystem/mount/ This is consistent with some earlier changes suggested by Eric Biederman. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 875298005de4057fc083c0ff99e3077d5d775fe7

pivot_root.2: Minor wording tweaks Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha a2dd6388a739021e4015baabcb3d1838cbbd3866

pivot_root.2: Update the copyright and license After my rewriting, almost nothing of the original page remains, so update the copyright. As the author, I'm relicensing to the "verbatim" license most commonly used in man pages. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Christian Brauner

commit sha 9f93898154e35179e42a7a337d92fb24155471a8

clone.2: Document CLONE_PIDFD Add an entry for CLONE_PIDFD. This flag is available starting with kernel 5.2. If specified, a process file descriptor ("pidfd") referring to the child process will be returned in the ptid argument. Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 99f6c1d734e4b8f424c7c54f812db4aefa511976

clone.2: srcfix: wrap source at sentence boundaries Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 4e98b0747699e972dbc107ddf19ef8ca130fcc34

clone.2: Minor tweaks to Christian Brauner's patch Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 0eec009fb384d076e5d1c080727c40197224f65b

clone.2: ffix (split a paragraph) Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha b4ebffb2306bfa2c5c010c29610ee20aed78e51c

clone.2: The close-on-exec flag is set on the new FD returned by CLONE_PIDFD In the kernel source (kernel/fork.c::copy_process()), there is: pidfile = anon_inode_getfile("[pidfd]", &pidfd_fops, pid, O_RDWR | O_CLOEXEC); Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha 7d7dc1877f4eb48abb4d70590a04ca99ba2b57d1

clone.2: Remove a CLONE_PIDFD detail that wasn't true in the final implementation Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

Michael Kerrisk

commit sha b97cc7ae40c5830147f5973f418cde77494b0461

clone.2: Minor wording improvements Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

push time in 20 days

push event cyphar/man-pages

Michael Kerrisk

commit sha ee81d7e418523d9fd25b27adcaf27c9885e215af

namespaces.7: Include manual page references in the summary table of namespace types Make the page more compact by removing the stub subsections that list the manual pages for the namespace types. And while we're here, add an explanation of the table columns. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

push time in 20 days

push event cyphar/cyphar.com

Aleksa Sarai

commit sha 200d23117f43c51fdbf0d9e319282af38486445c

srv: tor-relay: update nickname Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in 24 days

push event openSUSE/umoci

David Trudgian

commit sha 3335a0dd0c582f2c6ea9fa12a658129a9c7572ac

tar_extract: don't error on fs without xattr support Closes #303 If we are extracting to a filesystem that does not support xattrs, make sure that an ENOTSUP from clearxattr or listxattr results in a warning, not an error. Signed-off-by: David Trudgian <dave@trudgian.net>

David Trudgian

commit sha 72ae591149af4a819e680a8f157a59cda2be6a26

tar_extract: only warn for forbidden xattrs Closes #302 Rootless mode currently warns if a forbidden xattr is seen, while extractions as root error out. Make root extractions warn, so that docker images such as cern/slc6-base:latest can be extracted as root without failing due to this error. Signed-off-by: David Trudgian <dave@trudgian.net>

Aleksa Sarai

commit sha a2e72a14734bef79652748e41b4ee15252cf681f

merge branch 'pr-304' David Trudgian (2): tar_extract: only warn for forbidden xattrs tar_extract: don't error on fs without xattr support LGTMs: @cyphar Closes #304

push time in a month

PR merged openSUSE/umoci

tar_extract: don't error on fs without xattr support lgtm/need 1

This is a simple proposal to address #303 - I understand this may be naive / it may be that inability to deal with the xattrs is considered more than a warning.

Closes #303

If we are extracting to a filesystem that does not support xattrs, make sure that an ENOTSUP from clearxattr or listxattr results in a warning, not an error.

Signed-off-by: David Trudgian dave@trudgian.net

+14 -9

2 comments

1 changed file

dctrud

pr closed time in a month

issue closed openSUSE/umoci

ENOTSUP error in extraction to NFS

In https://github.com/sylabs/singularity/issues/4593 a Singularity user reports trying to build a sandbox (directory) rootfs to an NFS mounted filesystem that doesn't support xattrs. We use umoci to extract the layers to a rootfs.

This errors out with the following from the umoci code:

get dirHdr.Xattrs: unpriv.llistxattr: operation not supported

This seems to be triggered in the code here:

https://github.com/openSUSE/umoci/blob/4fc9627cfc26f9e2a2af67662cb1e3381d4580f6/oci/layer/tar_extract.go#L307

There is an Llistxattr call that, unlike the other xattr operations in tar_extract.go, is not protected against ENOTSUP on filesystems that do not support xattrs.

Is it reasonable to just add a warning similar to:

https://github.com/openSUSE/umoci/blob/4fc9627cfc26f9e2a2af67662cb1e3381d4580f6/oci/layer/tar_extract.go#L178

... here?

closed time in a month

dctrud

issue closed openSUSE/umoci

selinux xattr errors for Scientific Linux as with root

We're now using umoci for OCI layer extractions in https://github.com/sylabs/singularity due to the more faithful extraction of permissions compared to containers/image-tools - as recommended by @cyphar in https://github.com/opencontainers/image-tools/issues/218

In a recent issue https://github.com/sylabs/singularity/issues/4578 a user reports that umoci extraction errors out for the CERN Scientific Linux 6 docker image (cern/slc6-base:latest). The reason for this is that the layers for this image contain selinux xattrs. In the rootless flow (--fakeroot option for Singularity builds) umoci will just warn about the forbidden xattr. In the flow for extraction with root it will error.

unpack layer: unpack entry: .: apply hdr metadata: restore xattr metadata: saw forbidden xattr "security.selinux": .

I understand from @olifre, who reported the issue to Singularity, that the CERN SL6 docker image is built with koji. Following up the trail, I see that koji is using https://github.com/redhat-imaging/imagefactory/commit/8c61f3e2c494771a23f2649360c3f6528fbcfe73 to build a docker image. Since the referenced commit in mid-2016, that tool excludes selinux xattrs, so there's an argument here that the docker image is broken. However, given that docker can handle this without issue, would it be reasonable for umoci to warn rather than error in extractions as root (like it already does rootless), by default or via an option?

closed time in a month

dctrud

pull request comment opencontainers/runc

Set unified mountpoint in find mnt func

Isn't /sys/fs/cgroup/unified meant to be the unified mountpoint? Or was this changed again?

crosbymichael

comment created time in a month

pull request comment openSUSE/umoci

tar_extract: don't error on fs without xattr support

LGTM.

dctrud

comment created time in a month

CommitCommentEvent

issue comment openSUSE/umoci

selinux xattr errors for Scientific Linux as with root

We can definitely change it to give a warning and ignore it -- Docker is just straight-up silently ignoring the forbidden xattrs (which is definitely the wrong thing to do).
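
A warn-and-skip policy for forbidden xattrs could look roughly like the sketch below. It is an illustration under stated assumptions rather than umoci's implementation: `forbiddenXattrs` and `filterXattrs` are hypothetical names, and the deny-list contains only `security.selinux`, the one name that appears in the reported error.

```go
package main

import (
	"fmt"
	"log"
	"sort"
)

// forbiddenXattrs is a hypothetical deny-list. Only "security.selinux" is
// taken from the reported error message; a real list may differ.
var forbiddenXattrs = map[string]bool{"security.selinux": true}

// filterXattrs returns the xattrs that may be applied to path. Forbidden
// entries are skipped with a warning instead of aborting the extraction.
func filterXattrs(path string, xattrs map[string]string) map[string]string {
	names := make([]string, 0, len(xattrs))
	for name := range xattrs {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic warning order

	allowed := make(map[string]string, len(xattrs))
	for _, name := range names {
		if forbiddenXattrs[name] {
			log.Printf("warning: xattr{%s} ignoring forbidden xattr %q", path, name)
			continue
		}
		allowed[name] = xattrs[name]
	}
	return allowed
}

func main() {
	in := map[string]string{"security.selinux": "ctx", "user.comment": "ok"}
	fmt.Println(filterXattrs(".", in))
}
```

Unlike silently dropping the attribute, the warning leaves a trace that the extracted tree differs from what the layer requested.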

dctrud

comment created time in a month

Pull request review comment openSUSE/umoci

tar_extract: don't error on fs without xattr support

 func (te *TarExtractor) UnpackEntry(root string, hdr *tar.Header, r io.Reader) (
 		//       tar_generate.go.
 		xattrs, err := te.fsEval.Llistxattr(dir)
 		if err != nil {
-			return errors.Wrap(err, "get dirHdr.Xattrs")
+			if errors.Cause(err) != unix.ENOTSUP {
+				return errors.Wrap(err, "get dirHdr.Xattrs")
+			}
+			log.Warnf("ignoring ENOTSUP on listxattr %q", dir)

Same nit as above.

dctrud

comment created time in a month

Pull request review comment openSUSE/umoci

tar_extract: don't error on fs without xattr support

 func (te *TarExtractor) restoreMetadata(path string, hdr *tar.Header) error {
 	// Apply xattrs. In order to make sure that we *only* have the xattr set we
 	// want, we first clear the set of xattrs from the file then apply the ones
 	// set in the tar.Header.
-	if err := te.fsEval.Lclearxattrs(path, ignoreXattrs); err != nil {
-		return errors.Wrapf(err, "clear xattr metadata: %s", path)
+	err := te.fsEval.Lclearxattrs(path, ignoreXattrs)
+	if err != nil {
+		if errors.Cause(err) != unix.ENOTSUP {
+			return errors.Wrapf(err, "clear xattr metadata: %s", path)
+		}
+		log.Warnf("ignoring ENOTSUP on clearxattrs %q", path)

Please follow the same formatting for warnings:

log.Warnf("xattr{%s} ignoring ENOTSUP on setxattr %q", hdr.Name, name)

(I also just noticed a typo in the original ENOTSUP check -- while you're at it, can you please change it from xatt to xattr? Thanks.)

dctrud

comment created time in a month

create branch cyphar/linux

branch : cgroup-pids-memory_barrier

created branch time in a month

Pull request review comment openSUSE/umoci

tar_extract: don't mask original error on short write

 func (te *TarExtractor) UnpackEntry(root string, hdr *tar.Header, r io.Reader) (
 		// We need to make sure that we copy all of the bytes.
 		n, err := io.Copy(fh, r)
 		if int64(n) != hdr.Size {
-			err = io.ErrShortWrite
+			err = errors.Wrapf(io.ErrShortWrite, "didn't write all the bytes to the new file (%s)", err)

nit: %v is preferred for errors, but I'll merge this anyway.

tych0

comment created time in a month

push event cyphar/linux

Remi Pommarel

commit sha de10ac47597e7a3596b27631d0d5ce5f48d2c099

iio: adc: meson_saradc: Fix memory allocation order meson_saradc's irq handler uses priv->regmap so make sure that it is allocated before the irq get enabled. This also fixes crash when CONFIG_DEBUG_SHIRQ is enabled, as device managed resources are freed in the inverted order they had been allocated, priv->regmap was freed before the spurious fake irq that CONFIG_DEBUG_SHIRQ adds called the handler. Fixes: 3af109131b7eb8 ("iio: adc: meson-saradc: switch from polling to interrupt mode") Reported-by: Elie Roudninski <xademax@gmail.com> Signed-off-by: Remi Pommarel <repk@triplefau.lt> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Tested-by: Elie ROUDNINSKI <xademax@gmail.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Lorenzo Bianconi

commit sha 85ae3aeedeccb6febb0c6f9d5346d9c6419ad925

iio: imu: st_lsm6dsx: forbid 0 sensor sensitivity Do not allow configuring null sensor gain since it will force to 0 device outputs Fixes: c8d4066c7246 ("iio: imu: st_lsm6dsx: remove invalid gain value for LSM9DS1") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Pascal Bouwmann

commit sha 6c59a962e081df6d8fe43325bbfabec57e0d4751

iio: fix center temperature of bmc150-accel-core The center temperature of the supported devices stored in the constant BMC150_ACCEL_TEMP_CENTER_VAL is not 24 degrees but 23 degrees. It seems that some datasheets were inconsistent on this value leading to the error. For most usecases will only make minor difference so not queued for stable. Signed-off-by: Pascal Bouwmann <bouwmann@tau-tec.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Al Viro

commit sha d4f4de5e5ef8efde85febb6876cd3c8ab1631999

Fix the locking in dcache_readdir() and friends

There are two problems in dcache_readdir() - one is that lockless traversal of the list needs non-trivial cooperation of d_alloc() (at least a switch to list_add_rcu(), and probably more than just that) and another is that it assumes that no removal will happen without the directory locked exclusive. Said assumption had always been there, never had been stated explicitly and is violated by several places in the kernel (devpts and selinuxfs).

* replacement of next_positive() with different calling conventions: it returns struct list_head * instead of struct dentry *; the latter is passed in and out by reference, grabbing the result and dropping the original value.
* scan is under ->d_lock. If we run out of timeslice, cursor is moved after the last position we'd reached and we reschedule; then the scan continues from that place. To avoid livelocks between multiple lseek() (with cursors getting moved past each other, never reaching the real entries) we always skip the cursors, need_resched() or not.
* returned list_head * is either ->d_child of dentry we'd found or ->d_subdirs of parent (if we got to the end of the list).
* dcache_readdir() and dcache_dir_lseek() switched to new helper. dcache_readdir() always holds a reference to dentry passed to dir_emit() now. Cursor is moved to just before the entry where dir_emit() has failed or into the very end of the list, if we'd run out.
* move_cursor() eliminated - it had sucky calling conventions and after fixing that it became simply list_move() (in lseek and scan_positives) or list_move_tail() (in readdir).

All operations with the list are under ->d_lock now, and we do not depend upon having all file removals done with parent locked exclusive anymore.

Cc: stable@vger.kernel.org
Reported-by: "zhengbin (A)" <zhengbin13@huawei.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Ian Rogers

commit sha 4b0b2b096da9d296e0e5668cdfba8613bd6f5bc8

libsubcmd: Make _FORTIFY_SOURCE defines dependent on the feature Unconditionally defining _FORTIFY_SOURCE can break tools that don't work with it, such as memory sanitizers: https://github.com/google/sanitizers/wiki/AddressSanitizer#faq Fixes: 4b6ab94eabe4 ("perf subcmd: Create subcmd library") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20190925195924.152834-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Ian Rogers

commit sha e3e2cf3d5b1fe800b032e14c0fdcd9a6fb20cf3b

perf tests: Avoid raising SEGV using an obvious NULL dereference

An optimized build such as:

  make -C tools/perf CLANG=1 CC=clang EXTRA_CFLAGS="-O3

will turn the dereference operation into a ud2 instruction, raising a SIGILL rather than a SIGSEGV. Use raise(..) for correctness and clarity.

Similar issues were addressed in Numfor Mbiziwo-Tiapo's patch:

  https://lkml.org/lkml/2019/7/8/1234

Committer testing:

Before:

  [root@quaco ~]# perf test hooks
  55: perf hooks : Ok
  [root@quaco ~]# perf test -v hooks
  55: perf hooks :
  --- start ---
  test child forked, pid 17092
  SIGSEGV is observed as expected, try to recover.
  Fatal error (SEGFAULT) in perf hook 'test'
  test child finished with 0
  ---- end ----
  perf hooks: Ok
  [root@quaco ~]#

After:

  [root@quaco ~]# perf test hooks
  55: perf hooks : Ok
  [root@quaco ~]# perf test -v hooks
  55: perf hooks :
  --- start ---
  test child forked, pid 17909
  SIGSEGV is observed as expected, try to recover.
  Fatal error (SEGFAULT) in perf hook 'test'
  test child finished with 0
  ---- end ----
  perf hooks: Ok
  [root@quaco ~]#

Fixes: a074865e60ed ("perf tools: Introduce perf hooks")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20190925195924.152834-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Ian Rogers

commit sha d586ac10ce56b2381b8e1d8ed74660c1b2b8ab0d

perf docs: Allow man page date to be specified With this change if a perf_date parameter is provided to asciidoc then it will override the default date written to the man page metadata. Without this change, or if the perf_date isn't specified, then the current date is written to the metadata. Having this parameter allows the metadata to be constant if builds happen on different dates. The name of the parameter is intended to be consistent with the existing perf_version parameter. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20190921041327.155054-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Arnaldo Carvalho de Melo

commit sha 08a96a31474a732fd654575ced843b94bc3212e1

tools headers uapi: Sync drm/i915_drm.h with the kernel sources To pick the change in: bf73fc0fa9cf ("drm/i915: Show support for accurate sw PMU busyness tracking") That don't result in any changes in tooling, just silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h' diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-o651nt7vpz93tu3nmx4f3xql@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Arnaldo Carvalho de Melo

commit sha b1ba55cf1cfb9f3e0e00d743534684a25bf66d28

tools headers uapi: Sync asm-generic/mman-common.h with the kernel

To pick the changes from:

  1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
  9c276cc65a58 ("mm: introduce MADV_COLD")

That result in these changes in the tools:

  $ tools/perf/trace/beauty/madvise_behavior.sh > before
  $ cp include/uapi/asm-generic/mman-common.h tools/include/uapi/asm-generic/mman-common.h
  $ git diff
  diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
  index 63b1f506ea67..c160a5354eb6 100644
  --- a/tools/include/uapi/asm-generic/mman-common.h
  +++ b/tools/include/uapi/asm-generic/mman-common.h
  @@ -67,6 +67,9 @@
   #define MADV_WIPEONFORK 18 /* Zero memory on fork, child only */
   #define MADV_KEEPONFORK 19 /* Undo MADV_WIPEONFORK */
  +#define MADV_COLD 20 /* deactivate these pages */
  +#define MADV_PAGEOUT 21 /* reclaim these pages */
  +
   /* compatibility flags */
   #define MAP_FILE 0
  $ tools/perf/trace/beauty/madvise_behavior.sh > after
  $ diff -u before after
  --- before 2019-09-27 11:29:43.346320100 -0300
  +++ after 2019-09-27 11:30:03.838570439 -0300
  @@ -16,6 +16,8 @@
   [17] = "DODUMP",
   [18] = "WIPEONFORK",
   [19] = "KEEPONFORK",
  + [20] = "COLD",
  + [21] = "PAGEOUT",
   [100] = "HWPOISON",
   [101] = "SOFT_OFFLINE",
   };
  $

I.e. now when madvise gets those behaviours as args, it will be able to translate from the number to a human readable string.

This addresses the following perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman-common.h' differs from latest version at 'include/uapi/asm-generic/mman-common.h'
  diff -u tools/include/uapi/asm-generic/mman-common.h include/uapi/asm-generic/mman-common.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-n40y6c4sa49p29q6sl8w3ufx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
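
The number-to-name translation that this commit message describes for madvise arguments amounts to a lookup table. The sketch below is only an analogy written in Go, not how perf implements its beautifiers (those are C tables generated by the shell script): `madviseBehavior` and `behaviorName` are hypothetical names, and the table holds only the entries visible in the diff output of the commit message.

```go
package main

import "fmt"

// madviseBehavior maps madvise(2) advice values to short names. Only the
// entries visible in the commit message's diff output are listed.
var madviseBehavior = map[int]string{
	17:  "DODUMP",
	18:  "WIPEONFORK",
	19:  "KEEPONFORK",
	20:  "COLD",
	21:  "PAGEOUT",
	100: "HWPOISON",
	101: "SOFT_OFFLINE",
}

// behaviorName returns the symbolic name for v, falling back to a hex
// rendering of the raw number when the value is not in the table.
func behaviorName(v int) string {
	if name, ok := madviseBehavior[v]; ok {
		return name
	}
	return fmt.Sprintf("%#x", v)
}

func main() {
	for _, v := range []int{20, 21, 42} {
		fmt.Println(v, behaviorName(v))
	}
}
```

Keeping the table in sync with the header is what the copied uapi file is for: a new constant that is missing from the table would only ever be shown as a raw number.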

Arnaldo Carvalho de Melo

commit sha 05f371f8c55d69e4c04db4473085303291e4e734

tools headers uapi: Sync linux/usbdevice_fs.h with the kernel sources

To pick up the changes from:

  4ed3350539aa ("USB: usbfs: Add a capability flag for runtime suspend")
  7794f486ed0b ("usbfs: Add ioctls for runtime power management")

This triggers these changes in the kernel sources, automagically supporting these new ioctls in the 'perf trace' beautifiers.

Soon this will be used in things like filter expressions for tracepoints in 'perf record', 'perf trace', 'perf top', i.e. filter expressions will do a lookup to turn things like USBDEVFS_WAIT_FOR_RESUME into _IO('U', 35) before associating the tracepoint expression to tracepoint perf event.

  $ tools/perf/trace/beauty/usbdevfs_ioctl.sh > before
  $ cp include/uapi/linux/usbdevice_fs.h tools/include/uapi/linux/usbdevice_fs.h
  $ git diff
  diff --git a/tools/include/uapi/linux/usbdevice_fs.h b/tools/include/uapi/linux/usbdevice_fs.h
  index 78efe870c2b7..cf525cddeb94 100644
  --- a/tools/include/uapi/linux/usbdevice_fs.h
  +++ b/tools/include/uapi/linux/usbdevice_fs.h
  @@ -158,6 +158,7 @@ struct usbdevfs_hub_portinfo {
   #define USBDEVFS_CAP_MMAP 0x20
   #define USBDEVFS_CAP_DROP_PRIVILEGES 0x40
   #define USBDEVFS_CAP_CONNINFO_EX 0x80
  +#define USBDEVFS_CAP_SUSPEND 0x100
   /* USBDEVFS_DISCONNECT_CLAIM flags & struct */
  @@ -223,5 +224,8 @@ struct usbdevfs_streams {
   * extending size of the data returned.
   */
   #define USBDEVFS_CONNINFO_EX(len) _IOC(_IOC_READ, 'U', 32, len)
  +#define USBDEVFS_FORBID_SUSPEND _IO('U', 33)
  +#define USBDEVFS_ALLOW_SUSPEND _IO('U', 34)
  +#define USBDEVFS_WAIT_FOR_RESUME _IO('U', 35)
   #endif /* _UAPI_LINUX_USBDEVICE_FS_H */
  $ tools/perf/trace/beauty/usbdevfs_ioctl.sh > after
  $ diff -u before after
  --- before 2019-09-27 11:41:50.634867620 -0300
  +++ after 2019-09-27 11:42:07.453102978 -0300
  @@ -24,6 +24,9 @@
   [30] = "DROP_PRIVILEGES",
   [31] = "GET_SPEED",
   [32] = "CONNINFO_EX",
  + [33] = "FORBID_SUSPEND",
  + [34] = "ALLOW_SUSPEND",
  + [35] = "WAIT_FOR_RESUME",
   [3] = "RESETEP",
   [4] = "SETINTERFACE",
   [5] = "SETCONFIGURATION",
  $

This addresses the following perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/usbdevice_fs.h' differs from latest version at 'include/uapi/linux/usbdevice_fs.h'
  diff -u tools/include/uapi/linux/usbdevice_fs.h include/uapi/linux/usbdevice_fs.h

Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-x1rb109b9nfi7pukota82xhj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


Arnaldo Carvalho de Melo

commit sha 0ae4061223a3d097222cbec6599370e54db17731

tools headers uapi: Sync linux/fs.h with the kernel sources To pick the changes from: 78a1b96bcf7a ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl") 23c688b54016 ("fscrypt: allow unprivileged users to add/remove keys for v2 policies") 5dae460c2292 ("fscrypt: v2 encryption policy support") 5a7e29924dac ("fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl") b1c0ec3599f4 ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl") 22d94f493bfb ("fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl") 3b6df59bc4d2 ("fscrypt: use FSCRYPT_* definitions, not FS_*") 2336d0deb2d4 ("fscrypt: use FSCRYPT_ prefix for uapi constants") 7af0ab0d3aab ("fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h>") That don't trigger any changes in tooling, as it so far is used only for: $ grep -l 'fs\.h' tools/perf/trace/beauty/*.sh | xargs grep regex= tools/perf/trace/beauty/rename_flags.sh:regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+RENAME_([[:alnum:]_]+)[[:space:]]+\(1[[:space:]]*<<[[:space:]]*([[:xdigit:]]+)[[:space:]]*\)[[:space:]]*.*' tools/perf/trace/beauty/sync_file_range.sh:regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+SYNC_FILE_RANGE_([[:alnum:]_]+)[[:space:]]+([[:xdigit:]]+)[[:space:]]*.*' tools/perf/trace/beauty/usbdevfs_ioctl.sh:regex="^#[[:space:]]*define[[:space:]]+USBDEVFS_(\w+)(\(\w+\))?[[:space:]]+_IO[CWR]{0,2}\([[:space:]]*(_IOC_\w+,[[:space:]]*)?'U'[[:space:]]*,[[:space:]]*([[:digit:]]+).*" tools/perf/trace/beauty/usbdevfs_ioctl.sh:regex="^#[[:space:]]*define[[:space:]]+USBDEVFS_(\w+)[[:space:]]+_IO[WR]{0,2}\([[:space:]]*'U'[[:space:]]*,[[:space:]]*([[:digit:]]+).*" $ This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h' diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: 
https://lkml.kernel.org/n/tip-44g48exl9br9ba0t64chqb4i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


Arnaldo Carvalho de Melo

commit sha b7ad6108484221f431372b94763b74e550d16c93

tools headers kvm: Sync kvm headers with the kernel sources To pick the changes in: 200824f55eef ("KVM: s390: Disallow invalid bits in kvm_valid_regs and kvm_dirty_regs") 4a53d99dd0c2 ("KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode") 7396d337cfad ("KVM: x86: Return to userspace with internal error on unexpected exit reason") 92f35b751c71 ("KVM: arm/arm64: vgic: Allow more than 256 vcpus for KVM_IRQ_LINE") None of them trigger any changes in tooling, this time this is just to silence these perf build warnings: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h' diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h Warning: Kernel ABI header at 'tools/arch/s390/include/uapi/asm/kvm.h' differs from latest version at 'arch/s390/include/uapi/asm/kvm.h' diff -u tools/arch/s390/include/uapi/asm/kvm.h arch/s390/include/uapi/asm/kvm.h Warning: Kernel ABI header at 'tools/arch/arm/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm/include/uapi/asm/kvm.h' diff -u tools/arch/arm/include/uapi/asm/kvm.h arch/arm/include/uapi/asm/kvm.h Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h' diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Liran Alon <liran.alon@oracle.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Huth <thuth@redhat.com> Link: https://lkml.kernel.org/n/tip-akuugvvjxte26kzv23zp5d2z@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


Ian Rogers

commit sha 7d4c85b7035eb2f9ab217ce649dcd1bfaf0cacd3

perf llvm: Don't access out-of-scope array The 'test_dir' variable is assigned to the 'release' array which is out-of-scope 3 lines later. Extend the scope of the 'release' array so that an out-of-scope array isn't accessed. Bug detected by clang's address sanitizer. Fixes: 07bc5c699a3d ("perf tools: Make fetch_kernel_version() publicly available") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lore.kernel.org/lkml/20190926220018.25402-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
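The class of bug described above, a pointer that outlives the block-scoped array it points into, can be sketched as follows. This is not the perf code itself: `demo_fetch_release()` and `demo_build_path()` are hypothetical stand-ins, and only the fixed shape is shown as live code, with the buggy shape described in a comment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper that fills a caller-provided buffer with a version string. */
int demo_fetch_release(char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s", "5.4.0-demo");

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Fixed shape: 'release' lives in the same scope as every use of 'test_dir'.
 *
 * The buggy shape would declare 'release' inside the if-block below and
 * still read 'test_dir' after the block ends, which is undefined behaviour
 * because the array's lifetime is already over at that point.
 */
int demo_build_path(char *out, size_t outlen)
{
	char release[128];
	const char *test_dir = "unknown";
	int n;

	if (demo_fetch_release(release, sizeof(release)) == 0)
		test_dir = release;

	n = snprintf(out, outlen, "/lib/modules/%s/build", test_dir);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Widening the array's scope costs a few bytes of stack for the whole function, which is usually an acceptable trade for not depending on how the compiler reuses stack slots.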


Thomas Richter

commit sha 02d084792273e8a5f1813dcad988229a45be96ea

perf vendor events s390: Add JSON transaction for machine type 8561 Add s390 transaction counter definition for machine 8561. This is the same file as for the predecessor machine. Fixes: 6e67d77d673d ("perf vendor events s390: Add JSON files for machine type 8561") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: http://lore.kernel.org/lkml/20190927081147.18345-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


Thomas Richter

commit sha 0d0e5ecec6116db6031829299e74cc71240c9ff3

perf vendor events s390: Use s390 machine name instead of type 8561 In the pmu-events directory for JSON file definitions use the official machine name IBM z15 instead of machine type number 8561. This is consistent with previous machines. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: http://lore.kernel.org/lkml/20190927081147.18345-2-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


Steve MacLean

commit sha ee212d6ea20887c0ef352be8563ca13dbf965906

perf map: Fix overlapped map handling Whenever an mmap/mmap2 event occurs, the map tree must be updated to add a new entry. If a new map overlaps a previous map, the overlapped section of the previous map is effectively unmapped, but the non-overlapping sections are still valid. maps__fixup_overlappings() is responsible for creating any new map entries from the previously overlapped map. It optionally creates a before and an after map. When creating the after map the existing code failed to adjust the map.pgoff. This meant the new after map would incorrectly calculate the file offset for the ip. This results in incorrect symbol name resolution for any ip in the after region. Make maps__fixup_overlappings() correctly populate map.pgoff. Add an assert that new mapping matches old mapping at the beginning of the after map. Committer-testing: Validated correct parsing of libcoreclr.so symbols from .NET Core 3.0 preview9 (which didn't strip symbols). Preparation: ~/dotnet3.0-preview9/dotnet new webapi -o perfSymbol cd perfSymbol ~/dotnet3.0-preview9/dotnet publish perf record ~/dotnet3.0-preview9/dotnet \ bin/Debug/netcoreapp3.0/publish/perfSymbol.dll ^C Before: perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\ grep libcoreclr.so | head -n 4 dotnet 1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \ r-xp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \ rwxp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \ rwxp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.705249: 250000 cpu-clock: \ 7fe6159a1f99 [unknown] \ (.../3.0.0-preview9-19423-09/libcoreclr.so) After: perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\ grep libcoreclr.so | head -n 4 dotnet 
1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \ r-xp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \ rwxp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \ rwxp .../3.0.0-preview9-19423-09/libcoreclr.so All the [unknown] symbols were resolved. Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com> Tested-by: Brian Robbins <brianrob@microsoft.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: John Keeping <john@metanate.com> Cc: John Salem <josalem@microsoft.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom McDonald <thomas.mcdonald@microsoft.com> Link: http://lore.kernel.org/lkml/BN8PR21MB136270949F22A6A02335C238F7800@BN8PR21MB1362.namprd21.prod.outlook.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
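The pgoff adjustment described in this commit can be illustrated with a small model. The `demo_map` structure and helper names below are simplifications invented for this sketch (the real perf map carries much more state), but the arithmetic is the point: the "after" remainder must start its file offset where the overlapping map ends, and the assert mirrors the one the commit message says was added.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified map: addresses [start, end) backed by a file at pgoff. */
struct demo_map {
	uint64_t start;
	uint64_t end;
	uint64_t pgoff;
};

/* Translate an address inside the map into an offset within the backing file. */
uint64_t demo_map_ip(const struct demo_map *m, uint64_t ip)
{
	return ip - m->start + m->pgoff;
}

/*
 * Build the "after" remainder of 'old' once 'newer' has overlapped part of it.
 * Without the pgoff adjustment, every address in the remainder would resolve
 * to a file offset that is too small by (newer->end - old->start).
 */
struct demo_map demo_make_after(const struct demo_map *old,
				const struct demo_map *newer)
{
	struct demo_map after = *old;

	after.start = newer->end;
	after.pgoff += newer->end - old->start;

	/* The old and the new mapping must agree at the start of the remainder. */
	assert(demo_map_ip(old, newer->end) == demo_map_ip(&after, newer->end));
	return after;
}
```

Feeding in the first two libcoreclr.so mappings from the output above (the 0x768000-byte r-xp map and the 0x1000-byte rwxp map at file offset 0x24e000) gives a remainder whose pgoff continues directly after the overlapping page.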


Steve MacLean

commit sha b59711e9b0d22fd47abfa00602fd8c365cdd3ab7

perf inject jit: Fix JIT_CODE_MOVE filename During perf inject --jit, JIT_CODE_MOVE records were injecting MMAP records with an incorrect filename. Specifically it was missing the ".so" suffix. Further the JIT_CODE_LOAD records were silently truncating the jr->load.code_index field to 32 bits before generating the filename. Make both records emit the same filename based on the full 64 bit code_index field. Fixes: 9b07e27f88b9 ("perf inject: Add jitdump mmap injection support") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brian Robbins <brianrob@microsoft.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: John Keeping <john@metanate.com> Cc: John Salem <josalem@microsoft.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom McDonald <thomas.mcdonald@microsoft.com> Link: http://lore.kernel.org/lkml/BN8PR21MB1362FF8F127B31DBF4121528F7800@BN8PR21MB1362.namprd21.prod.outlook.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
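The truncation problem described above comes down to formatting a 64-bit index with a 32-bit conversion. A minimal sketch of a shared helper that both record types could call is shown below. The function name and the exact `jitted-<pid>-<index>.so` pattern are assumptions made for this illustration, not a quote of the perf source.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/*
 * Build one object file name from the full 64-bit code index, so that the
 * LOAD and MOVE paths cannot disagree. Returns 0 on success and -1 if the
 * name does not fit into 'buf'.
 */
int demo_jit_filename(char *buf, size_t len, const char *dir,
		      int pid, uint64_t code_index)
{
	/* PRIu64 keeps all 64 bits; "%d" or "%u" here would drop the upper half. */
	int n = snprintf(buf, len, "%s/jitted-%d-%" PRIu64 ".so",
			 dir, pid, code_index);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Routing both code paths through one formatter also removes the chance of one of them forgetting the ".so" suffix.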


Steve MacLean

commit sha 2657983b4c0d81632c6a73bae469951b0d341251

perf docs: Correct and clarify jitdump spec Specification claims latest version of jitdump file format is 2. Current jit dump reading code treats 1 as the latest version. Correct spec to match code. The original language made it unclear the value to be written in the magic field. Revise language that the writer always writes the same value. Specify that the reader uses the value to detect endian mismatches. Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brian Robbins <brianrob@microsoft.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Keeping <john@metanate.com> Cc: John Salem <josalem@microsoft.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Tom McDonald <thomas.mcdonald@microsoft.com> Link: http://lore.kernel.org/lkml/BN8PR21MB1362F63CDE7AC69736FC7F9EF7800@BN8PR21MB1362.namprd21.prod.outlook.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
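The magic-field behaviour clarified by this commit (the writer always emits one value, the reader uses it to notice a byte-order mismatch) can be sketched as a small classifier. The constant 0x4A695444 ("JiTD") is the value given in the jitdump specification. The enum and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_JIT_MAGIC    0x4A695444u /* "JiTD", always written in the writer's byte order */
#define DEMO_JIT_MAGIC_SW 0x4454694Au /* the same bytes as seen by an opposite-endian reader */

enum demo_jit_endian {
	DEMO_JIT_NATIVE,   /* header fields can be used as-is */
	DEMO_JIT_SWAPPED,  /* every multi-byte field must be byte-swapped */
	DEMO_JIT_INVALID   /* not a jitdump file */
};

/* Reverse the byte order of a 32-bit value. */
uint32_t demo_bswap32(uint32_t v)
{
	return ((v >> 24) & 0x000000ffu) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | ((v << 24) & 0xff000000u);
}

/* Classify a magic value as read, without any conversion, from the file header. */
enum demo_jit_endian demo_check_magic(uint32_t magic)
{
	if (magic == DEMO_JIT_MAGIC)
		return DEMO_JIT_NATIVE;
	if (magic == DEMO_JIT_MAGIC_SW)
		return DEMO_JIT_SWAPPED;
	return DEMO_JIT_INVALID;
}
```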


Andi Kleen

commit sha e98df280bc2a499fd41d7f9e2d6733884de69902

perf script brstackinsn: Fix recovery from LBR/binary mismatch When the LBR data and the instructions in a binary do not match the loop printing instructions could get confused and print a long stream of bogus <bad> instructions. The problem was that if the instruction decoder cannot decode an instruction, ilen wasn't initialized, so the loop going through the basic block would continue with the previous value. Harden the code to avoid such problems: - Make sure ilen is always freshly initialized and is 0 for bad instructions. - Do not overrun the code buffer while printing instructions - Print a warning message if the final jump is not on an instruction boundary. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20190927233546.11533-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


Andi Kleen

commit sha 6bdfd9f118bd59cf0f85d3bf4b72b586adea17c1

perf jevents: Fix period for Intel fixed counters The Intel fixed counters use a special table to override the JSON information. During this override the period information from the JSON file got dropped, which results in inst_retired.any and similar running with frequency mode instead of a period. Just specify the expected period in the table. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20190927233546.11533-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


push time in a month

delete branch cyphar/awesome-linux-containers

delete branch : patch-1

delete time in a month

created tag cyphar/linux

tag openat2/v14

Personal fork of the Linux kernel source tree

created time in a month

push event cyphar/dotfiles

Aleksa Sarai

commit sha 79a927ae68cc36accd7791addf794144efba3a29

nvim: clean up mail workflow Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


push time in a month

issue comment opencontainers/runc

Manually Patching containerd and runc

You shouldn't need to update containerd at all (the CVE fix is just in runc). As for patching runc -- you could try to use 1.0.0-rc9 (this should "just work" -- there haven't been too many changes between the versions). However, I have backported the change for the version of runc which Docker 19.03.3 uses (and the previous version too) for openSUSE and SUSE -- but that really shouldn't be necessary for you to use. If you want though, I can post them.

RedbackThomson

comment created time in a month

create branch cyphar/man-pages

branch : magiclink

created branch time in a month

push event cyphar/man-pages

Michael Kerrisk

commit sha c6ed23c5da678bc89b8fee6370f65ddd3e8b4907

perf_event_open.2: SEE ALSO: add Documentation/admin-guide/perf-security.rst Reported-by: Alexey Budankov <alexey.budankov@linux.intel.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Carlos O'Donell

commit sha dbb01cbbdb60c34a16d9d48cb58ed3680a5dd36d

pthread_setcancelstate.3, pthreads.7, signal-safety.7: Describe issues with cancellation points in signal handlers In a recent conversation with Mathieu Desnoyers I was reminded that we haven't written up anything about how deferred cancellation and asynchronous signal handlers interact. Mathieu ran into some of this behaviour and I promised to improve the documentation in this area to point out the potential pitfall. Thoughts? 8< --- 8< --- 8< In pthread_setcancelstate.3, pthreads.7, and signal-safety.7 we describe that if you have an asynchronous signal nesting over a deferred cancellation region that any cancellation point in the signal handler may trigger a cancellation that will behave as-if it was an asynchronous cancellation. This asynchronous cancellation may have unexpected effects on the consistency of the application. Therefore care should be taken with asynchronous signals and deferred cancellation. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 50639a2a18a4670a4c0055569bfbb59492ce3dc4

pthread_setcancelstate.3, pthreads.7: srcfix: wrap source lines at sentence boundaries Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 0b6cf5d26e0c65bc5573f4f226bfc87cdaedc281

pthreads.7: Minor tweaks to Carlos O'Donell's patch Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha a2fc45a9f8892b100b8290d630fc662d72119d29

mount_namespaces.7: It may be desirable to disable propagation after creating a namespace After creating a new mount namespace, it may be desirable to disable mount propagation. Give the reader a more explicit hint about this. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 93cc3b3827d2d2421bdf989df347c690acf808a2

pivot_root.2: Simplify pivot_root(".", ".") example Eric Biederman notes that the change in commit f646ac88ef83969 was not strictly necessary for this example, since one of the already documented requirements is that various mount points must not have shared propagation, or else pivot_root() will fail. So, simplify the example. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha a0c97331949452db81585ebcba221ff1539f7472

mount_namespaces.7: Clarify description of "less privileged" mount namespaces The current text talks about "parent mount namespaces", but there is no such concept. As confirmed by Eric Biederman, what is meant here is "the mount namespace this mount namespace started as a copy of". So, this change writes up Eric's description in a more detailed way. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha ed425459c50cb71c71b08eed903a9efca40b63fc

mount_namespaces.7: tfix Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 632940d96d7ec94867cddb6535a44bf5a9d80072

mount.2: NOTES: add subsection heading for /proc/[pid]/{mounts,mountinfo} Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 5d3bcce72dae6733d7f2f3a817aec0d843181080

mount.2: wfix Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha dd858bfd5e9389addf635510807206b6f1a06a73

mount.2: Rework the text on mount namespaces a little Eliminate the term "Per-process namespaces" and add a reference to mount_namespaces(7). Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha e0e0ba7d016027dea9a93272c9f56dbd82eea0f3

mount.2: tfix Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 459fe995466750ccb8a3222dd4956dd21b67d477

mount.2: Describe the concept of "parent mounts" Reported-by: Reid Priedhorsky <reidpr@lanl.gov> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 8f2a9129e6a4c5e87b587aa3b61bd21712e0b009

pivot_root.2: Remove the term 'old_root' Reid noted a confusion between 'old_root' (my attempt at a shorthand for the old root point) and 'put_old'. Eliminate the confusion by replacing the shorthand with "old root mount point". Reported-by: Reid Priedhorsky <reidpr@lanl.gov> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 47b69a37cf330eec33057f7f396c6584ad701864

pivot_root.2: srcfix: FIXME Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 534755eed9b8157ee1ad3227d693d0384f5ca3aa

mount_namespaces.7: Explain how a namespace's mount point list is initialized Provide a more detailed explanation of the initialization of the mount point list in a new mount namespace. Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 19416046c54fb023f2daaf8c84644ad11d8fb068

mount_namespaces.7: Tweak discussion of "less privileged" mount namespace Eric Biederman: I hate to nitpick, but I am going to say that when I read the text above the phrase "mount namespace of the process that created the new mount namespace" feels wrong. Either you use unshare(2) and the mount namespace of the process that created the mount namespace changes. Or you use clone(2) and you could argue it is the new child that created the mount namespace. Having a different mount namespace at the end of the creation operation feels like it makes your phrase confusing about what the starting mount namespace is. I hate to use references that are ambiguous when things are changing. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Michael Kerrisk

commit sha 4d75df3711a60ae7d61a3308750edbed421c64a2

mount_namespaces.7: tfix Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>


Aleksa Sarai

commit sha efa14aa82ec50ce0836e3e8a8806740d8c7ae609

openat2.2: document new openat2(2) syscall Rather than trying to merge the new syscall documentation into open.2 (which would probably result in the man-page being incomprehensible), instead the new syscall gets its own dedicated page with links between open(2) and openat2(2) to avoid duplicating information such as the list of O_* flags or common errors. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


Aleksa Sarai

commit sha 0f8db9ea49f1a5ecc4ee6260d5c8f2d0c8e379fc

path_resolution.7: update to mention openat2(2) features Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


push time in a month

create branch cyphar/linux

branch : openat2/old

created branch time in a month

create branch cyphar/linux

branch : openat2/master

created branch time in a month

push event cyphar/dotfiles

Aleksa Sarai

commit sha e97104e5df700b6b5936aeacc9e1108fe5f2da9f

nvim: clean up mail workflow Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


push time in a month

create branch cyphar/linux

branch : magic-links/master

created branch time in a month

issue comment opencontainers/runc

Compilation fails on ubuntu 14.04

Oh, that's strange (that's the latest version of libseccomp) -- I wonder if it's because of the Go compiler version (making it not handle pkg-config correctly). Can you try with a newer Go compiler (you can download a static version from https://golang.org/)?

johnnylei

comment created time in a month

issue comment opencontainers/runc

Compilation fails on ubuntu 14.04

You need to have a fairly recent version of libseccomp (also your Go compiler is ancient).

johnnylei

comment created time in a month

push event cyphar/linux

Aleksa Sarai

commit sha 2cdeee71ecb535cc823a02e5c1385eb4885ddeaf

open: introduce openat2(2) syscall

/* Background. */

For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1].

This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2).

In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. Thus it's necessary to (at the very least) add an additional flag.

Furthermore, the new magic-link hardening semantics also give an opportunity for us to allow userspace to further restrict the re-openability of O_PATH file descriptors. This feature cannot be indicated through the access mode bits (because O_RDONLY=0, and O_PATH has historically ignored them -- making them off-limits) nor through the file mode (because glibc has historically passed garbage values for non-O_{CREAT,TMPFILE} cases to openat(2)[3]). Thus we'd need yet one more additional flag.

Adding a new syscall allows us to finally fix the flag-ignoring problem, and make it extensible enough so that we will hopefully never need an openat3(2).

/* Syscall Prototype. */

  /*
   * open_how is an extensible structure (similar in interface to
   * clone3(2) or sched_setattr(2)). The size parameter must be set to
   * sizeof(struct open_how), to allow for future extensions. All future
   * extensions will be appended to open_how, with their zero value
   * acting as a no-op default.
   */
  struct open_how { /* ... */ };

  int openat2(int dfd, const char *pathname, struct open_how *how, size_t size);

/* Description. */

The initial version of 'struct open_how' contains the following fields:

  flags
    Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect (O_PATH|O_RDWR) will result in -EINVAL. This field is expanded to be 64-bits wide to allow for more O_ flags than currently permitted with openat(2).

  mode
    The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE.

  upgrade_mask
    Restrict the re-opening of a new O_PATH file descriptor (can be used to restrict re-opening with MAY_READ or MAY_WRITE). Must be set to zero if flags does not contain O_PATH.

  __padding
    Must be set to zero.

  resolve
    Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag).

      RESOLVE_NO_XDEV       => LOOKUP_NO_XDEV
      RESOLVE_NO_SYMLINKS   => LOOKUP_NO_SYMLINKS
      RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS
      RESOLVE_BENEATH       => LOOKUP_BENEATH
      RESOLVE_IN_ROOT       => LOOKUP_IN_ROOT

open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it).

/* Testing. */

In a follow-up patch there are over 300 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios.

In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace).

/* Future Work. */

Currently there is no way to restrict execute permissions on file descriptors. Once that exists, we could add UPGRADE_NOEXEC.

Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking automount during resolution).

[1]: https://lwn.net/Articles/588444/
[2]: https://lwn.net/Articles/558949/
[3]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523

Suggested-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
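For readers who want to see what a call looks like from userspace, here is a hedged sketch of invoking openat2(2) through syscall(2) with RESOLVE_BENEATH. Note the assumptions: the structure below uses the three-field layout (flags, mode, resolve), the syscall number 437 and the RESOLVE_BENEATH value 0x08 from the interface that later landed in mainline, which differs from the open_how described in this patch revision (no upgrade_mask or __padding). The `demo_*` names are local to this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_openat2
#define SYS_openat2 437 /* assumption: the number used by mainline kernels */
#endif

/* Local copy of the mainline layout; not the struct from this patch revision. */
struct demo_open_how {
	uint64_t flags;
	uint64_t mode;
	uint64_t resolve;
};

#define DEMO_RESOLVE_BENEATH 0x08 /* value from the mainline UAPI header */

/*
 * Open 'path' read-only, refusing any resolution that would leave 'dirfd'.
 * Returns a file descriptor, or a negative errno value on failure
 * (-ENOSYS on kernels without openat2(2)).
 */
int demo_open_beneath(int dirfd, const char *path)
{
	struct demo_open_how how;
	long fd;

	/* Zero everything first: unknown or future fields must read as 0. */
	memset(&how, 0, sizeof(how));
	how.flags = O_RDONLY | O_CLOEXEC;
	how.resolve = DEMO_RESOLVE_BENEATH;

	fd = syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
	return fd < 0 ? -errno : (int)fd;
}
```

Passing `sizeof(how)` as the last argument is what lets the kernel tell which revision of the structure the caller was built against.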


Aleksa Sarai

commit sha 9418f53e801073a65d4e6085a760db6deec59e62

selftests: add openat2(2) selftests Test all of the various openat2(2) flags, as well as how file descriptor re-opening works. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. In addition, the memfd selftest is fixed to no longer depend on the now-disallowed functionality of upgrading an O_RDONLY descriptor to O_RDWR. The main things these self-tests are enforcing are: * The struct+usize ABI for openat2(2) and copy_struct_from_user() to ensure that upgrades will be handled gracefully (in addition, ensuring that misaligned structures are also handled correctly). * The -EINVAL checks for openat2(2) are all correctly handled to avoid userspace passing unknown or conflicting flag sets (most importantly, ensuring that invalid flag combinations are checked). * All of the RESOLVE_* semantics (including errno values) are correctly handled with various combinations of paths and flags. * RESOLVE_IN_ROOT correctly protects against the symlink rename(2) attack that has been responsible for several CVEs (and likely will be responsible for several more). * The magic-link trailing mode semantics correctly block re-opens in all of the relevant cases, as well as checking that the "flip-flop" attack is correctly protected against. * O_PATH has the correct semantics (the mode is g+rwx for ordinary files, but for trailing magic-links the mode gets inherited). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
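The struct+usize rule that the first bullet refers to can be modelled in plain userspace C. The sketch below imitates the semantics of the kernel's copy_struct_from_user() on ordinary memory: a shorter caller struct is zero-extended, while a longer one is accepted only if the bytes the "kernel" does not know about are all zero. `demo_copy_struct()` is a made-up name, and this is a model for illustration, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * dst/ksize: the structure as the (older or newer) kernel knows it.
 * src/usize: the structure and size that userspace passed in.
 * Returns 0 on success or -E2BIG if userspace set a field we cannot honour.
 */
int demo_copy_struct(void *dst, size_t ksize, const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t size = usize < ksize ? usize : ksize;
	size_t i;

	/* Newer userspace: trailing bytes beyond ksize must all be zero. */
	for (i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -E2BIG;

	/* Older userspace: fields it does not know about default to zero. */
	memset(dst, 0, ksize);
	memcpy(dst, src, size);
	return 0;
}
```

Because zero is defined as the no-op default for every future field, both directions of version skew degrade gracefully instead of silently ignoring a request.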


push time in a month

push event cyphar/linux

Aleksa Sarai

commit sha c97a8eac9a8d23cb6c68d1a9ac366e3824f76265

namei: obey trailing magic-link DAC permissions /* Background. */ The ability for userspace to "re-open" file descriptors through /proc/self/fd has been a very useful tool for all sorts of usecases (container runtimes are just one common example). However, the current interface for doing this has resulted in some pretty subtle security holes. Userspace can re-open a file descriptor with more permissions than the original, which can result in cases such as /proc/$pid/exe being re-opened O_RDWR at a later date even though (by definition) /proc/$pid/exe cannot be opened for writing. When combined with O_PATH the results can get even more confusing. We cannot block this outright. Aside from userspace already depending on it, it's a useful feature which can actually increase the security of userspace. For instance, LXC keeps an O_PATH of the container's /dev/pts/ptmx that gets re-opened to create new ptys and then uses TIOCGPTPEER to get the slave end[1]. This allows for pty allocation without resolving paths inside an (untrusted) container's rootfs. There isn't a trivial way of doing this that is as straight-forward and safe as O_PATH re-opening. /* Semantics. */ Instead we have to restrict it in such a way that it doesn't break (good) users but does block potential attackers. The solution applied in this patch is to restrict *re-opening* (not resolution through) magic-links by requiring that mode of the link be obeyed. Normal symlinks have modes of a+rwx but magic-links can have other modes. These magic-link modes were historically ignored during path resolution, but they've now been re-purposed for more useful ends. It is also necessary to define semantics for the mode of an O_PATH descriptor, since re-opening a magic-link through an O_PATH needs to be just as restricted as the corresponding magic-link -- otherwise the above protection can be bypassed. There are two distinct cases: 1. The target is a regular file (not a magic-link). 
Userspace depends on being able to re-open the O_PATH of a regular file, so we must define the mode to be a+rwx. 2. The target is a magic-link. In this case, we simply copy the mode of the magic-link. This results in an O_PATH of a magic-link effectively acting as a no-op in terms of how much re-opening privileges a process has. CAP_DAC_OVERRIDE can be used to override all of these restrictions, but we only permit &init_userns's capabilities to affect these semantics. The reason for this is that there isn't a clear way to track what user_ns is the original owner of a given O_PATH chain -- thus an unprivileged user could create a new userns and O_PATH the file descriptor, owning it. All signs would indicate that the user really does have CAP_DAC_OVERRIDE over the new descriptor and the protection would be bypassed. We thus opt for the more conservative approach. /* Special Considerations. */ In order to make sure we only impact nd_jump_link()-style jumps with these restrictions, we introduce a new temporary LOOKUP_MAGICLINK_JUMPED flag which is set by nd_jump_link(). Jann Horn discovered an attack in a previous version of this patch, where an attacker could swap a single fd between a re-openable file and a non-reopenable one and thus re-open the non-reopenable fd (exploiting the fact that procfs inode metadata is generated somewhat-lazily). To fix this, we need to save the magic-link's mode during nd_jump_link() -- rather than just using nd->link_inode->i_mode. A PoC of this attack (as well as various other attack scenarios) is included as a selftest later in the patch series. /* Testing. */ I have run this patch on several machines for several months. So far, the only processes which have ever hit this case ("loadkeys" and "kbd_mode" from the kbd package[2]) gracefully handle the permission error and do not cause any user-visible problems. In order to give users a heads-up, a warning is output to dmesg whenever may_open_magiclink() refuses access. 
If this does end up causing problems, we can expand the permissions on initial (non-O_PATH) open() to include extra permission bits based on whether the user could've opened the file with other access modes at the time. /* Future Work. */ There is one additional way in which an attacker could re-open a file descriptor using magic-links -- by bind-mounting the magic-link and then opening the bind-mount. This patch does not protect against this (because it will require more invasive changes to the mount code). Userspace can protect against this particular attack in the meantime through seccomp (and in fact, all mainstream container runtimes have protections against containers doing such bind-mounts). Thus, we can safely punt on this issue for now. [1]: commit 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl") [2]: http://git.altlinux.org/people/legion/packages/kbd.git Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Suggested-by: Andy Lutomirski <luto@kernel.org> Suggested-by: Christian Brauner <christian@brauner.io> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
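The two O_PATH cases described in the Semantics section can be modelled in a few lines of userspace C. This is only a sketch of the rule as stated in the commit message, not the kernel implementation: the names `opath_mode()`, `reopen_allowed()` and the `MODEL_*` bits are hypothetical, and the check looks at the owner bits only, whereas the kernel performs a full DAC check.

```c
#include <stdbool.h>

/* Hypothetical access bits for this model (not kernel constants). */
enum { MODEL_READ = 04, MODEL_WRITE = 02 };

/*
 * Effective re-open mode of an O_PATH descriptor, following the two cases
 * in the commit message: a regular target gets a+rwx, a magic-link target
 * inherits the mode of the magic-link.
 */
unsigned opath_mode(bool target_is_magiclink, unsigned magiclink_mode)
{
	return target_is_magiclink ? (magiclink_mode & 0777) : 0777;
}

/*
 * Simplified check: a re-open is allowed only if every requested access bit
 * is present in the owner bits of the mode. (Assumption: owner bits only.)
 */
bool reopen_allowed(unsigned mode, unsigned want)
{
	unsigned owner_bits = (mode >> 6) & 07;

	return (want & ~owner_bits) == 0;
}
```

Under this model, a magic-link with mode 0500 (such as one pointing at a running executable) permits a read re-open but refuses a write re-open, while an O_PATH of an ordinary file permits both.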

Aleksa Sarai

commit sha 5750e2f941d06112dda7a18d8e4c6431bf9d5ebf

procfs: switch magic-link modes to be more sane Now that magic-link modes are obeyed for file re-opening purposes, some of the pre-existing magic-link modes need to be adjusted to be more semantically correct. The most blatant example of this is /proc/self/exe, which had a mode of a+rwx even though tautologically the file could never be opened for writing (because it is the current->mm of a live process). With the new O_PATH restrictions, changing the default mode of these magic-links allows us to entirely block delayed-access attacks such as we saw in CVE-2019-5736. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Aleksa Sarai

commit sha ba12e23f78fd98fc39eb3d326cb277263da9cf03

Documentation: update path-lookup to mention trailing magic-links We've introduced new (somewhat subtle) behaviour regarding trailing magic-links, so it's best to make sure everyone can follow along with the reasoning behind trailing_magiclink(). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Aleksa Sarai

commit sha 2e303dedba3e198dc47c338ad36171cce01e91e4

open: O_EMPTYPATH: procfs-less file descriptor re-opening /* Background. */ Userspace has made use of /proc/self/fd very liberally to allow for descriptors to be re-opened. There are a wide variety of uses for this feature, but it has always required constructing a pathname and could not be done without procfs mounted. The obvious solution for this is to extend openat(2) to have an AT_EMPTY_PATH-equivalent -- O_EMPTYPATH. Now that descriptor re-opening has been made safe through the new magic-link resolution restrictions, we can replicate these restrictions for O_EMPTYPATH (we compute the effective mode of the procfs magic-link, and restrict it the same way trailing_magiclink() does). /* Compatibility. */ Because of open(2)'s peculiar (read: complete lack of) flag checking, we need to ensure that programs which passed garbage bits (and worked) previously will continue to work with this change. Passing an empty path on Linux has provided -ENOENT for an incredibly long time, and O_EMPTYPATH acts as a no-op if the path is non-empty. Thus it is very unlikely that any existing programs will change their behaviour. When doing openat(O_EMPTYPATH|O_PATH), O_PATH takes precedence and O_EMPTYPATH is ignored. Very few users ever have a need to O_PATH re-open an existing file descriptor, and so accommodating them at the expense of further complicating O_PATH makes little sense. Ultimately, if users ask for this we can always add support for this in openat2(2) -- since openat2(2) will give an error if passed (O_EMPTYPATH|O_PATH). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
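For context, the procfs-based idiom that this commit wants to make unnecessary looks roughly like the following. This is a minimal sketch rather than code from the patch: `proc_fd_path()` and `reopen_via_procfs()` are hypothetical helper names, and the re-open only works when procfs is mounted at /proc.

```c
#include <fcntl.h>
#include <stdio.h>

/* Build the magic-link pathname for a descriptor; returns 0, or -1 if buf is too small. */
int proc_fd_path(int fd, char *buf, size_t buflen)
{
	int n = snprintf(buf, buflen, "/proc/self/fd/%d", fd);

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Re-open an existing descriptor with new flags by going through procfs. */
int reopen_via_procfs(int fd, int flags)
{
	char path[64];

	if (proc_fd_path(fd, path, sizeof(path)) < 0)
		return -1;
	return open(path, flags);
}
```

The pathname construction and the dependency on a mounted procfs are exactly the two costs the commit message calls out.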

Aleksa Sarai

commit sha 4a534c589bb3d4e9c518af9b385aafca8645f86d

namei: O_BENEATH-style resolution restriction flags /* Background. */ The need for some sort of control over VFS's path resolution (to avoid malicious paths resulting in inadvertent breakouts) has been a very long-standing desire of many userspace applications throughout the history of Unix. While some improvements have been made (such as O_NOFOLLOW or AT_NO_AUTOMOUNT), most of the new APIs have involved restricting the final component in a path's lookup -- completely ignoring the rest of the path components. Userspace programs have thus been forced to implement their own (and usually subtly broken) methods of ensuring path components they don't wish to resolve are detected. Aside from making it more complicated to write such programs safely, there are some things which are effectively impossible to safely handle correctly (for instance, magic-links cannot reliably be differentiated from symlinks on filesystems that may contain magic-links). It would be a massive improvement to provide these types of resolution restriction features to userspace. This is a refresh of Al's AT_NO_JUMPS patchset[1] (which was a variation on David Drysdale's O_BENEATH patchset[2], which in turn was based on the Capsicum project[3]). Input from Linus and Andy in the AT_NO_JUMPS thread[4] determined most of the API changes made in this refresh. /* Userspace API. */ These flags will be exposed to userspace through openat2(2). /* Semantics. */ The following new LOOKUP flags are defined, and (in contrast to most other LOOKUP flags, they apply to all components of path resolution as opposed to only the final component). LOOKUP_NO_XDEV Disallow mount-point crossing (both *down* into one, or *up* from one). Both bind-mounts and cross-filesystem mounts are blocked by this flag. The naming is based on "find -xdev" as well as -EXDEV (though find(1) doesn't walk upwards, the semantics seem obvious). 
LOOKUP_NO_MAGICLINKS Disallows "magic-link" resolution ("symlinks" that are resolved through nd_jump_link()). This is important to provide explicitly, because magic-links can be used to trick privileged programs into bypassing normal path resolution restriction mechanisms (such as mount namespaces). Such programs likely want to permit ordinary symlink resolution, but don't wish to permit magic-links. It should be noted that prior to this, there was no way for userspace to unambiguously verify whether a symlink was a magic-link.
LOOKUP_NO_SYMLINKS Disallows resolution through symlinks (includes magic-links).
LOOKUP_BENEATH Disallow "escapes" from the starting point of the filesystem tree during resolution (you must stay "beneath" the starting point at all times). Currently this is done by disallowing ".." and absolute paths (either in the given path or found during symlink resolution) entirely, as well as all magic-link jumping.
The wholesale banning of ".." is because it is currently not safe to allow ".." resolution (races can cause the path to be moved outside of the root -- this is conceptually similar to historical chroot(2) escape attacks). Future patches in this series will address this, and will re-enable ".." resolution once it is safe. With those patches, ".." resolution will only be allowed if it remains in the root throughout resolution (such as "a/../b" not "a/../../outside/b").
The banning of magic-link jumping is done because it is not clear whether semantically they should be allowed -- while some magic-links are safe there are many that can cause escapes (and once a resolution is outside of the root, O_BENEATH will no longer detect it). Future patches may re-enable magic-link jumping when such jumps would remain inside the root.
The LOOKUP_NO_*LINK flags return -ELOOP if path resolution would violate their requirements, while the others all return -EXDEV.
[1]: https://lore.kernel.org/lkml/20170429220414.GT29622@ZenIV.linux.org.uk/ [2]: https://lore.kernel.org/lkml/1415094884-18349-1-git-send-email-drysdale@google.com/ [3]: https://lore.kernel.org/lkml/1404124096-21445-1-git-send-email-drysdale@google.com/ [4]: https://lwn.net/Articles/723057/ Cc: Christian Brauner <christian@brauner.io> Suggested-by: David Drysdale <drysdale@google.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Suggested-by: Andy Lutomirski <luto@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
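The initial LOOKUP_BENEATH rule for the path string itself (no absolute paths, no ".." components) can be illustrated with a purely lexical check. This is a hedged userspace sketch with a hypothetical function name; it cannot see ".." or absolute targets hidden inside symlinks, which is precisely why the commit argues the restriction belongs in the kernel.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if the path string, taken on its own, would be rejected by
 * the first-version LOOKUP_BENEATH rule: it is absolute, or one of its
 * components is exactly "..". Symlink contents are NOT examined.
 */
bool path_has_beneath_violation(const char *path)
{
	const char *p = path;

	if (path[0] == '/')
		return true;			/* absolute path */

	while (*p) {
		size_t len;

		while (*p == '/')
			p++;			/* skip repeated separators */
		len = strcspn(p, "/");		/* length of this component */
		if (len == 2 && p[0] == '.' && p[1] == '.')
			return true;		/* ".." component */
		p += len;
	}
	return false;
}
```

Components that merely contain dots, such as "b..c" or "..d", pass the check; only a component that is exactly ".." is rejected.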

Aleksa Sarai

commit sha 0b529e2e85e81b82caf258ce4bc890fd8c7693ab

namei: LOOKUP_IN_ROOT: chroot-like path resolution /* Background. */ Container runtimes or other administrative management processes will often interact with root filesystems while in the host mount namespace, because the cost of doing a chroot(2) on every operation is too prohibitive (especially in Go, which cannot safely use vfork). However, a malicious program can trick the management process into doing operations on files outside of the root filesystem through careful crafting of symlinks. Most programs that need this feature have attempted to make this process safe, by doing all of the path resolution in userspace (with symlinks being scoped to the root of the malicious root filesystem). Unfortunately, this method is prone to foot-guns and usually such implementations have subtle security bugs. Thus, what userspace needs is a way to resolve a path as though it were in a chroot(2) -- with all absolute symlinks being resolved relative to the dirfd root (and ".." components being stuck under the dirfd root[1]) It is much simpler and more straight-forward to provide this functionality in-kernel (because it can be done far more cheaply and correctly). More classical applications that also have this problem (which have their own potentially buggy userspace path sanitisation code) include web servers, archive extraction tools, network file servers, and so on. [1]: At the moment, ".." and magic-link jumping are disallowed for the same reason it is disabled for LOOKUP_BENEATH -- currently it is not safe to allow it. Future patches may enable it unconditionally once we have resolved the possible races (for "..") and semantics (for magic-link jumping). /* Userspace API. */ LOOKUP_IN_ROOT will be exposed to userspace through openat2(2). 
There is a slight change in behaviour regarding pathnames -- if the pathname is absolute then the dirfd is still used as the root of resolution if LOOKUP_IN_ROOT is specified (this is to avoid obvious foot-guns, at the cost of a minor API inconsistency). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
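The intended chroot-like semantics (absolute paths start at the dirfd root, and ".." cannot climb above it) can be illustrated with a lexical normaliser. This is only a sketch of the semantics under a hypothetical name, `scoped_normalize()`; it ignores symlinks entirely, so it is not a safe substitute for the in-kernel resolution the commit argues for.

```c
#include <stddef.h>
#include <string.h>

/*
 * Lexically resolve `path` as though the dirfd were "/". The result is a
 * path relative to that root ("" means the root itself). Returns 0 on
 * success, or -1 if `out` is too small.
 */
int scoped_normalize(const char *path, char *out, size_t outlen)
{
	size_t used = 0;
	const char *p = path;

	if (outlen == 0)
		return -1;
	out[0] = '\0';

	while (*p) {
		size_t len;

		while (*p == '/')
			p++;	/* a leading '/' means the scoped root, not the host root */
		len = strcspn(p, "/");
		if (len == 0)
			break;

		if (len == 1 && p[0] == '.') {
			/* "." changes nothing */
		} else if (len == 2 && p[0] == '.' && p[1] == '.') {
			/* drop the last component; at the root, ".." stays at the root */
			while (used > 0 && out[used - 1] != '/')
				used--;
			if (used > 0)
				used--;	/* remove the separator as well */
			out[used] = '\0';
		} else {
			size_t need = len + (used ? 1 : 0);

			if (used + need + 1 > outlen)
				return -1;
			if (used)
				out[used++] = '/';
			memcpy(out + used, p, len);
			used += len;
			out[used] = '\0';
		}
		p += len;
	}
	return 0;
}
```

With this model, "/etc/../../passwd" normalises to "passwd" under the root instead of escaping it.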

Aleksa Sarai

commit sha 2cef835d19181edf72bc176d3a1c0ccaee50ab3e

namei: permit ".." resolution with LOOKUP_{IN_ROOT,BENEATH}
This patch allows for LOOKUP_BENEATH and LOOKUP_IN_ROOT to safely permit ".." resolution (in the case of LOOKUP_BENEATH the resolution will still fail if ".." resolution would resolve a path outside of the root -- while LOOKUP_IN_ROOT will chroot(2)-style scope it). Magic-link jumps are still disallowed entirely[*].
The need for this patch (and the original no-".." restriction) is explained by observing there is a fairly easy-to-exploit race condition with chroot(2) (and thus by extension LOOKUP_IN_ROOT and LOOKUP_BENEATH if ".." is allowed) where a rename(2) of a path can be used to "skip over" nd->root and thus escape to the filesystem above nd->root.
  thread1 [attacker]:
    for (;;)
      renameat2(AT_FDCWD, "/a/b/c", AT_FDCWD, "/a/d", RENAME_EXCHANGE);
  thread2 [victim]:
    for (;;)
      openat2(dirb, "b/c/../../etc/shadow", { .flags = O_PATH, .resolve = RESOLVE_IN_ROOT } );
With fairly significant regularity, thread2 will resolve to "/etc/shadow" rather than "/a/b/etc/shadow". There is also a similar (though somewhat more privileged) attack using MS_MOVE.
With this patch, such cases will be detected *during* ".." resolution and will return -EAGAIN for userspace to decide to either retry or abort the lookup. It should be noted that ".." is the weak point of chroot(2) -- walking *into* a subdirectory tautologically cannot result in you walking *outside* nd->root (except through a bind-mount or magic-link). There is also no other way for a directory's parent to change (which is the primary worry with ".." resolution here) other than a rename or MS_MOVE.
This is a first-pass implementation, where -EAGAIN will be returned if any rename or mount occurs anywhere on the host (in any namespace). This will result in spurious errors, but there isn't a satisfactory alternative (other than denying ".." altogether).
One other possible alternative (which previous versions of this patch used) would be to check with path_is_under() if there was a racing rename or mount (after re-taking the relevant seqlocks). While this does work, it results in possible O(n*m) behaviour if there are many renames or mounts occurring *anywhere on the system*.
A variant of the above attack is included in the selftests for openat2(2) later in this patch series. I've run this test on several machines for several days and no instances of a breakout were detected. While this is not concrete proof that this is safe, when combined with the above argument it should lend some trustworthiness to this construction.
[*] It may be acceptable in the future to do a path_is_under() check (as with the alternative solution for "..") for magic-links after they are resolved. However this seems unlikely to be a feature that people *really* need -- it can be added later if it turns out a lot of people want it.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
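Since the patch leaves the retry-or-abort decision for -EAGAIN to userspace, a caller might wrap its lookup in a bounded retry loop. The sketch below is an assumption about how such a caller could look, not part of the patch: `retry_on_eagain()` is a hypothetical helper, the callback follows the kernel-style convention of returning a descriptor or a negative errno, and `flaky_lookup()` is a stand-in that simulates a lookup racing with renames.

```c
#include <errno.h>

/* Callback returns a file descriptor (>= 0) or a negative errno value. */
typedef int (*lookup_fn)(void *ctx);

/* Call fn until it stops returning -EAGAIN or max_attempts is exhausted. */
int retry_on_eagain(lookup_fn fn, void *ctx, int max_attempts)
{
	int ret = -EAGAIN;

	for (int i = 0; i < max_attempts; i++) {
		ret = fn(ctx);
		if (ret != -EAGAIN)
			break;
	}
	return ret;
}

/*
 * Illustrative stand-in for a real lookup: fails with -EAGAIN as many times
 * as the counter in ctx says, then "succeeds" with descriptor number 3.
 */
int flaky_lookup(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		return -EAGAIN;
	}
	return 3;
}
```

Bounding the number of attempts matters here because the first-pass implementation returns -EAGAIN for any rename or mount on the host, so an unbounded loop could spin on a busy system.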

Aleksa Sarai

commit sha 3b717886e90546153059b24e40bdfc2747e298b6

open: introduce openat2(2) syscall /* Background. */ For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. Thus it's necessary to (at the very least) add an additional flag. Furthermore, the new magic-link hardening semantics also give an opportunity for us to allow userspace to further restrict the re-openability of O_PATH file descriptors. This feature cannot be indicated through the access mode bits (because O_RDONLY=0, and O_PATH has historically ignored them -- making them off-limits) nor through the file mode (because glibc has historically passed garbage values for non-O_{CREAT,TMPFILE} cases to openat(2)[3]). Thus we'd need yet one more additional flag. Adding a new syscall allows us to finally fix the flag-ignoring problem, and make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. */ /* * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. 
*/ struct open_how { /* ... */ }; int openat2(int dfd, const char *pathname, struct open_how *how, size_t size); /* Description. */ The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect (O_PATH|O_RDWR) will result in -EINVAL. This field is expanded to be 64-bits wide to allow for more O_ flags than currently permitted with openat(2). mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. upgrade_mask Restrict the re-opening of a new O_PATH file descriptor (can be used to restrict re-opening with MAY_READ or MAY_WRITE). Must be set to zero if flags does not contain O_PATH. __padding Must be set to zero. resolve Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). /* Future Work. */ Currently there is no way to restrict execute permissions on file descriptors. Once that exists, we could add UPGRADE_NOEXEC. Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking automount during resolution). [1]: https://lwn.net/Articles/588444/ [2]: https://lwn.net/Articles/558949/ [3]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 Suggested-by: Christian Brauner <christian@brauner.io> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
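The size-versioned struct convention described above (extensions are appended, and a zero value means "no-op") can be modelled in userspace as follows. This is a sketch of the behaviour of the kernel's copy_struct_from_user() helper, which the selftests commit in this series names, written from its documented semantics rather than copied from the kernel; `copy_struct_model()` and the two `how_v*` structs are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "old" and "new" layouts of an extensible argument struct. */
struct how_v1 { uint64_t flags; };
struct how_v2 { uint64_t flags; uint64_t extra; };

/*
 * Copy a caller-supplied struct of usize bytes into a receiver that knows
 * ksize bytes:
 *  - caller struct smaller: the unknown tail of dst is zero-filled;
 *  - caller struct larger: accepted only if the extra bytes are all zero,
 *    otherwise -E2BIG (the caller asked for a feature we do not know).
 */
int copy_struct_model(void *dst, size_t ksize, const void *src, size_t usize)
{
	size_t size = usize < ksize ? usize : ksize;
	const unsigned char *s = src;

	if (usize > ksize) {
		for (size_t i = ksize; i < usize; i++)
			if (s[i] != 0)
				return -E2BIG;
	} else if (usize < ksize) {
		memset((unsigned char *)dst + size, 0, ksize - size);
	}
	memcpy(dst, src, size);
	return 0;
}
```

This is what lets an old binary run on a new kernel and a new binary run on an old kernel, as long as the new binary leaves the fields the old kernel does not know about set to zero.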

Aleksa Sarai

commit sha 09c9ad53f6fee74f6aed7b6441b4314aea8c04a5

selftests: add openat2(2) selftests
Test all of the various openat2(2) flags, as well as how file descriptor re-opening works. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. In addition, the memfd selftest is fixed to no longer depend on the now-disallowed functionality of upgrading an O_RDONLY descriptor to O_RDWR.
The main things these self-tests are enforcing are:
* The struct+usize ABI for openat2(2) and copy_struct_from_user() to ensure that upgrades will be handled gracefully (in addition, ensuring that misaligned structures are also handled correctly).
* The -EINVAL checks for openat2(2) are all correctly handled to avoid userspace passing unknown or conflicting flag sets (most importantly, ensuring that invalid flag combinations are checked).
* All of the RESOLVE_* semantics (including errno values) are correctly handled with various combinations of paths and flags.
* RESOLVE_IN_ROOT correctly protects against the symlink rename(2) attack that has been responsible for several CVEs (and likely will be responsible for several more).
* The magic-link trailing mode semantics correctly block re-opens in all of the relevant cases, as well as checking that the "flip-flop" attack is correctly protected against.
* O_PATH has the correct semantics (the mode is a+rwx for ordinary files, but for trailing magic-links the mode gets inherited).
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

push time in a month

delete branch cyphar/linux

delete branch : copy_struct_from_user/cleanup

delete time in a month

Pull request review commentopencontainers/runc

nsenter: cloned_binary: do not clone exe if on readonly rootfs

 static void *must_realloc(void *ptr, size_t size)
 /*
  * Verify whether we are currently in a self-cloned program (namely, is
  * /proc/self/exe a memfd). F_GET_SEALS will only succeed for memfds (or rather
- * for shmem files), and we want to be sure it's actually sealed.
+ * for shmem files), and we want to be sure it's actually sealed. In case the
+ * the program is on a read-only rootfs, do not clone.
  */
 static int is_self_cloned(void)
 {
 	int fd, ret, is_cloned = 0;
 	struct stat statbuf = {};
 	struct statfs fsbuf = {};
+	struct statvfs stat;
+	char abs_path[1024];

Regarding your suggestion to set '_LIBCONTAINER_CLONED_BINARY' to 1 will require change in every container that were currently running.

Unless I'm very mistaken, you should be able to just set _LIBCONTAINER_CLONED_BINARY=1 before you call into runc (you can probably do this by creating a wrapper around /usr/bin/runc or maybe even set it as an environment variable via a Kubelet configuration or through systemd). If you can't do that, I'd be willing to take a patch which allowed passing it.

Also, could you specify the security concern you have in having an additional check whether /proc/self/exe is on a read-only rootfs or not?

If the host is only temporarily read-only (or if the attacker can cause the host to temporarily become read-write) then you can trivially re-exploit CVE-2019-5736 (defeating the whole point of this protection). This is why we require the bind-mount to be ours and why we lazy-umount it (there are even ways of working around that protection -- which is why we generally prefer doing it with memfds -- but the mount-related attacks require too many privileges to be a realistic security concern).

Maybe this isn't a concern for Google, but I'd prefer to avoid opening the door to additional attacks for everyone -- _LIBCONTAINER_CLONED_BINARY=1 is a hack that Google can use to disable it because you're sure that your environment is secure enough.

vaibhav-rustagi

comment created time in a month

push eventcyphar/linux

Tony Lindgren

commit sha dd8882a255388ba66175098b1560d4f81c100d30

clk: ti: dra7: Fix mcasp8 clock bits There's a typo for dra7 mcasp clkctrl bit, it should be 22 like the other mcasp instances, and not 24. And in dra7xx_clks[] we have the bits wrong way around. Fixes: dffa9051d546 ("clk: ti: dra7: add new clkctrl data") Cc: linux-clk@vger.kernel.org Cc: Michael Turquette <mturquette@baylibre.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Suman Anna <s-anna@ti.com> Cc: Tero Kristo <t-kristo@ti.com> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com>

Tony Lindgren

commit sha 2d3c8ba3cffa00f76bedb713c8c2126c82d8cd13

ARM: dts: Fix wrong clocks for dra7 mcasp
The ahclkr clkctrl clock bit 28 only exists for mcasp 1 and 2 on dra7. This causes the following warning on beagle-x15:
ti-sysc 48468000.target-module: could not add child clock ahclkr: -19
Also the mcasp clkctrl clock bits are wrong.
For mcasp1 and 2 we have four clocks at bits 28, 24, 22 and 0:
bit 28 is ahclkr
bit 24 is ahclkx
bit 22 is auxclk
bit 0 is fck
For mcasp3 to 8 we have three clocks at bits 24, 22 and 0:
bit 24 is ahclkx
bit 22 is auxclk
bit 0 is fck
We do not currently have auxclk at bit 22 mapped for the drivers; that can be added if needed.
Fixes: 5241ccbf2819 ("ARM: dts: Add missing ranges for dra7 mcasp l3 ports") Cc: Suman Anna <s-anna@ti.com> Cc: Tero Kristo <t-kristo@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>

H. Nikolaus Schaller

commit sha f1f028ff89cb0d37db299d48e7b2ce19be040d52

DTS: ARM: gta04: introduce legacy spi-cs-high to make display work again commit 6953c57ab172 "gpio: of: Handle SPI chipselect legacy bindings" did introduce logic to centrally handle the legacy spi-cs-high property in combination with cs-gpios. This assumes that the polarity of the CS has to be inverted if spi-cs-high is missing, even and especially if non-legacy GPIO_ACTIVE_HIGH is specified. The DTS for the GTA04 was originally introduced under the assumption that there is no need for spi-cs-high if the gpio is defined with proper polarity GPIO_ACTIVE_HIGH. This was not a problem until gpiolib changed the interpretation of GPIO_ACTIVE_HIGH and missing spi-cs-high. The effect is that the missing spi-cs-high is now interpreted as CS being low (despite GPIO_ACTIVE_HIGH) which turns off the SPI interface when the panel is to be programmed by the panel driver. Therefore, we have to add the redundant and legacy spi-cs-high property to properly activate CS. Cc: stable@vger.kernel.org Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Tony Lindgren <tony@atomide.com>

Tony Lindgren

commit sha c01f5120ca7cf2994336c42b8a9cae697121ffb3

Merge branch 'fixes-merge-window-pt2' into fixes

Adam Ford

commit sha 04e0e1777a792223897b8c21eb4e0a5b2624d9df

ARM: omap2plus_defconfig: Enable DRM_TI_TFP410 The TFP410 driver was removed but the replacement driver was never enabled. This patch enables DRM_TI_TFP410. Fixes: be3143d8b27f ("drm/omap: Remove TFP410 and DVI connector drivers") Signed-off-by: Adam Ford <aford173@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com>

Tony Lindgren

commit sha 1d70ded8567ccf70353ff97e3b13f267c889f934

ARM: omap2plus_defconfig: Enable more droid4 devices as loadable modules Droid4 needs USB option serial driver for modem, and lm3532 for the LCD backlight. Note that the LCD backlight does not yet get enabled automatically, but needs to be done manually with: # echo 50 > /sys/class/leds/lm3532::backlight/brightness Signed-off-by: Tony Lindgren <tony@atomide.com>

Tony Lindgren

commit sha 4ef5d76b453908f21341e661a9b6f96862f6f589

ARM: dts: Fix gpio0 flags for am335x-icev2 The ti,no-idle-on-init and ti,no-reset-on-init flags need to be at the interconnect target module level for the modules that have it defined. Otherwise we get the following warnings: dts flag should be at module level for ti,no-idle-on-init dts flag should be at module level for ti,no-reset-on-init Fixes: 87fc89ced3a7 ("ARM: dts: am335x: Move l4 child devices to probe them with ti-sysc") Cc: Lokesh Vutla <lokeshvutla@ti.com> Reported-by: Suman Anna <s-anna@ti.com> Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>

Tony Lindgren

commit sha 8ad8041b98c665b6147e607b749586d6e20ba73a

ARM: OMAP2+: Fix missing reset done flag for am3 and am43 For ti,sysc-omap4 compatible devices with no sysstatus register, we do have reset done status available in the SOFTRESET bit that clears when the reset is done. This is documented for example in am437x TRM for DMTIMER_TIOCP_CFG register. The am335x TRM just says that SOFTRESET bit value 1 means reset is ongoing, but it behaves the same way clearing after reset is done. With the ti-sysc driver handling this automatically based on no sysstatus register defined, we see warnings if SYSC_HAS_RESET_STATUS is missing in the legacy platform data: ti-sysc 48042000.target-module: sysc_flags 00000222 != 00000022 ti-sysc 48044000.target-module: sysc_flags 00000222 != 00000022 ti-sysc 48046000.target-module: sysc_flags 00000222 != 00000022 ... Let's fix these warnings by adding SYSC_HAS_RESET_STATUS. Let's also remove the useless parentheses while at it. If it turns out we do have ti,sysc-omap4 compatible devices without a working SOFTRESET bit we can set up additional quirk handling for it. Signed-off-by: Tony Lindgren <tony@atomide.com>

Tony Lindgren

commit sha 17529d43b21c72466e9109d602c6f5c360a1a9e8

ARM: OMAP2+: Add missing LCDC midlemode for am335x TRM "Table 13-34. SYSCONFIG Register Field Descriptions" lists both standbymode and idlemode that should be just the sidle and midle registers where midle is currently unconfigured for lcdc_sysc. As the dts data has been generated based on lcdc_sysc, we now have an empty "ti,sysc-midle" property. And so we currently get a warning for lcdc because of a difference with dts provided configuration compared to the legacy platform data. This is because lcdc has SYSC_HAS_MIDLEMODE configured in the platform data without configuring the modes. Let's fix the issue by adding the missing midlemode to lcdc_sysc, and configuring the "ti,sysc-midle" property based on the TRM values. Fixes: f711c575cfec ("ARM: dts: am335x: Add l4 interconnect hierarchy and ti-sysc data") Cc: Jyri Sarha <jsarha@ti.com> Cc: Keerthy <j-keerthy@ti.com> Cc: Robert Nelson <robertcnelson@gmail.com> Cc: Suman Anna <s-anna@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>

Tony Lindgren

commit sha cf395f7ddb9ebc6b2d28d83b53d18aa4e7c19701

ARM: OMAP2+: Fix warnings with broken omap2_set_init_voltage() This code is currently unable to find the dts opp tables as ti-cpufreq needs to set them up first based on speed binning. We stopped initializing the opp tables with platform code years ago for device tree based booting with commit 92d51856d740 ("ARM: OMAP3+: do not register non-dt OPP tables for device tree boot"), and all of mach-omap2 is now booting using device tree. We currently get the following errors on init: omap2_set_init_voltage: unable to find boot up OPP for vdd_mpu omap2_set_init_voltage: unable to set vdd_mpu omap2_set_init_voltage: unable to find boot up OPP for vdd_core omap2_set_init_voltage: unable to set vdd_core omap2_set_init_voltage: unable to find boot up OPP for vdd_iva omap2_set_init_voltage: unable to set vdd_iva Let's just drop the unused code. Nowadays ti-cpufreq should be used to initialize things properly. Cc: Adam Ford <aford173@gmail.com> Cc: André Roth <neolynx@gmail.com> Cc: "H. Nikolaus Schaller" <hns@goldelico.com> Cc: Nishanth Menon <nm@ti.com> Cc: Tero Kristo <t-kristo@ti.com> Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit Signed-off-by: Tony Lindgren <tony@atomide.com>

Peter Ujfalusi

commit sha f90ec6cdf674248dcad85bf9af6e064bf472b841

ARM: dts: am4372: Set memory bandwidth limit for DISPC Set memory bandwidth limit to filter out resolutions above 720p@60Hz to avoid underflow errors due to the bandwidth needs of higher resolutions. am43xx can not provide enough bandwidth to DISPC to correctly handle 'high' resolutions. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>

Keerthy

commit sha a304c0a60252fde9daccd9f8a07a598021f6956a

arm64/ARM: configs: Change CONFIG_REMOTEPROC from m to y Commit 6334150e9a36 ("remoteproc: don't allow modular build") changes CONFIG_REMOTEPROC to a boolean from a tristate config option which inhibits all defconfigs marking CONFIG_REMOTEPROC as a module in compiling the remoteproc and dependent config options. So fix the configs to have CONFIG_REMOTEPROC built in. Link: https://lore.kernel.org/r/20190920075946.13282-5-j-keerthy@ti.com Fixes: 6334150e9a36 ("remoteproc: don't allow modular build") Signed-off-by: Keerthy <j-keerthy@ti.com> Acked-by: Will Deacon <will@kernel.org> [olof: Fixed up all 4 occurrences in this one commit] Signed-off-by: Olof Johansson <olof@lixom.net>


Linus Walleij

commit sha cdee3b60af594403bd389e6e8239bcd0b4a159fc

ARM: dts: ux500: Fix up the CPU thermal zone This fixes up the default ux500 CPU thermal zone: - Set polling delay to 0 and explain why - Set passive polling delay to 250 - Remove restrictions from the CPU cooling device, we should use all cpufreq steps to cool down if needed. Link: https://lore.kernel.org/r/20191001074628.8122-1-linus.walleij@linaro.org Fixes: b786a05f6ce4 ("ARM: dts: ux500: Update thermal zone") Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>


Olof Johansson

commit sha bcec1221c945baeb30200ce52abde93be5fb1be5

Merge tag 'omap-for-v5.4/fixes-rc1-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Fixes for omaps for v5.4-rc cycle Here are fixes for omaps to deal with a few regressions, and to fix more boot time errors and warnings: - The recent ti-sysc interconnect target module driver changes had incorrect clock bits for both clocks and dts that cause warnings - For omap3-gta04, gpio changes caused the LCD to break a while back, and after discussing things the right fix is to set spi-cs-high - Recent omapdrm changes to use generic panels caused tfp410 to be disabled as we now must enable the generic support for it in defconfig - Recent omapdrm and backlight changes also finally made the droid4 LCD work, so let's enable it in the defconfig so it can be used out of the box. This is not strictly a fix, but we still also have the older CONFIG_MFD_TI_LMU options available so this cuts down the confusion for trying to guess which display and which backlight is needed - Recent ti-sysc interconnect target module changes need the gpio module disabled on some boards, but this now needs to happen at the module level, not at the gpio driver level - Recent changes to probe system timers with ti-sysc caused warnings about mismatch in sysconfig registers, so let's configure the option for RESET_STATUS as available in the TRMs - Recent changes to probe LCDC with ti-sysc caused warnings about mismatch in sysconfig registers, so let's configure the missing idlemodes for both platform data and dts as documented in TRMs - Since we moved mach-omap2 to probe with device tree, we've been getting voltage controller warnings. 
Turns out this code is no longer needed, so let's just remove omap2_set_init_voltage() to get rid of the pointless warnings - Configure am4372 dispc memory bandwidth to avoid underflow errors * tag 'omap-for-v5.4/fixes-rc1-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: am4372: Set memory bandwidth limit for DISPC ARM: OMAP2+: Fix warnings with broken omap2_set_init_voltage() ARM: OMAP2+: Add missing LCDC midlemode for am335x ARM: OMAP2+: Fix missing reset done flag for am3 and am43 ARM: dts: Fix gpio0 flags for am335x-icev2 ARM: omap2plus_defconfig: Enable more droid4 devices as loadable modules ARM: omap2plus_defconfig: Enable DRM_TI_TFP410 DTS: ARM: gta04: introduce legacy spi-cs-high to make display work again ARM: dts: Fix wrong clocks for dra7 mcasp clk: ti: dra7: Fix mcasp8 clock bits Link: https://lore.kernel.org/r/pull-1570040410-308159@atomide.com Signed-off-by: Olof Johansson <olof@lixom.net>


Patrice Chotard

commit sha 60c1b3e25728e0485f08e72e61c3cad5331925a3

ARM: multi_v7_defconfig: Fix SPI_STM32_QSPI support SPI_STM32_QSPI must be built in, as the rootfs can be located on a QSPI memory device. Link: https://lore.kernel.org/r/20191004124025.17394-1-patrice.chotard@st.com Signed-off-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Olof Johansson <olof@lixom.net>


Andrey Smirnov

commit sha 2cf2aa6a69db0b17b3979144287af8775c1c1534

dma-mapping: fix false positive warnings in dma_common_free_remap() Commit 5cf4537975bb ("dma-mapping: introduce a dma_common_find_pages helper") changed the invalid input check in dma_common_free_remap() from: if (!area || !area->flags != VM_DMA_COHERENT) to if (!area || !area->flags != VM_DMA_COHERENT || !area->pages) which seems to produce false positives for memory obtained via dma_common_contiguous_remap() This triggers the following warning message when doing "reboot" on ZII VF610 Dev Board Rev B: WARNING: CPU: 0 PID: 1 at kernel/dma/remap.c:112 dma_common_free_remap+0x88/0x8c trying to free invalid coherent area: 9ef82980 Modules linked in: CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.3.0-rc6-next-20190820 #119 Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree) Backtrace: [<8010d1ec>] (dump_backtrace) from [<8010d588>] (show_stack+0x20/0x24) r7:8015ed78 r6:00000009 r5:00000000 r4:9f4d9b14 [<8010d568>] (show_stack) from [<8077e3f0>] (dump_stack+0x24/0x28) [<8077e3cc>] (dump_stack) from [<801197a0>] (__warn.part.3+0xcc/0xe4) [<801196d4>] (__warn.part.3) from [<80119830>] (warn_slowpath_fmt+0x78/0x94) r6:00000070 r5:808e540c r4:81c03048 [<801197bc>] (warn_slowpath_fmt) from [<8015ed78>] (dma_common_free_remap+0x88/0x8c) r3:9ef82980 r2:808e53e0 r7:00001000 r6:a0b1e000 r5:a0b1e000 r4:00001000 [<8015ecf0>] (dma_common_free_remap) from [<8010fa9c>] (remap_allocator_free+0x60/0x68) r5:81c03048 r4:9f4d9b78 [<8010fa3c>] (remap_allocator_free) from [<801100d0>] (__arm_dma_free.constprop.3+0xf8/0x148) r5:81c03048 r4:9ef82900 [<8010ffd8>] (__arm_dma_free.constprop.3) from [<80110144>] (arm_dma_free+0x24/0x2c) r5:9f563410 r4:80110120 [<80110120>] (arm_dma_free) from [<8015d80c>] (dma_free_attrs+0xa0/0xdc) [<8015d76c>] (dma_free_attrs) from [<8020f3e4>] (dma_pool_destroy+0xc0/0x154) r8:9efa8860 r7:808f02f0 r6:808f02d0 r5:9ef82880 r4:9ef82780 [<8020f324>] (dma_pool_destroy) from [<805525d0>] (ehci_mem_cleanup+0x6c/0x150) r7:9f563410 r6:9efa8810 r5:00000000 
r4:9efd0148 [<80552564>] (ehci_mem_cleanup) from [<80558e0c>] (ehci_stop+0xac/0xc0) r5:9efd0148 r4:9efd0000 [<80558d60>] (ehci_stop) from [<8053c4bc>] (usb_remove_hcd+0xf4/0x1b0) r7:9f563410 r6:9efd0074 r5:81c03048 r4:9efd0000 [<8053c3c8>] (usb_remove_hcd) from [<8056361c>] (host_stop+0x48/0xb8) r7:9f563410 r6:9efd0000 r5:9f5f4040 r4:9f5f5040 [<805635d4>] (host_stop) from [<80563d0c>] (ci_hdrc_host_destroy+0x34/0x38) r7:9f563410 r6:9f5f5040 r5:9efa8800 r4:9f5f4040 [<80563cd8>] (ci_hdrc_host_destroy) from [<8055ef18>] (ci_hdrc_remove+0x50/0x10c) [<8055eec8>] (ci_hdrc_remove) from [<804a2ed8>] (platform_drv_remove+0x34/0x4c) r7:9f563410 r6:81c4f99c r5:9efa8810 r4:9efa8810 [<804a2ea4>] (platform_drv_remove) from [<804a18a8>] (device_release_driver_internal+0xec/0x19c) r5:00000000 r4:9efa8810 [<804a17bc>] (device_release_driver_internal) from [<804a1978>] (device_release_driver+0x20/0x24) r7:9f563410 r6:81c41ed0 r5:9efa8810 r4:9f4a1dac [<804a1958>] (device_release_driver) from [<804a01b8>] (bus_remove_device+0xdc/0x108) [<804a00dc>] (bus_remove_device) from [<8049c204>] (device_del+0x150/0x36c) r7:9f563410 r6:81c03048 r5:9efa8854 r4:9efa8810 [<8049c0b4>] (device_del) from [<804a3368>] (platform_device_del.part.2+0x20/0x84) r10:9f563414 r9:809177e0 r8:81cb07dc r7:81c78320 r6:9f563454 r5:9efa8800 r4:9efa8800 [<804a3348>] (platform_device_del.part.2) from [<804a3420>] (platform_device_unregister+0x28/0x34) r5:9f563400 r4:9efa8800 [<804a33f8>] (platform_device_unregister) from [<8055dce0>] (ci_hdrc_remove_device+0x1c/0x30) r5:9f563400 r4:00000001 [<8055dcc4>] (ci_hdrc_remove_device) from [<805652ac>] (ci_hdrc_imx_remove+0x38/0x118) r7:81c78320 r6:9f563454 r5:9f563410 r4:9f541010 [<8056538c>] (ci_hdrc_imx_shutdown) from [<804a2970>] (platform_drv_shutdown+0x2c/0x30) [<804a2944>] (platform_drv_shutdown) from [<8049e4fc>] (device_shutdown+0x158/0x1f0) [<8049e3a4>] (device_shutdown) from [<8013ac80>] (kernel_restart_prepare+0x44/0x48) r10:00000058 r9:9f4d8000 r8:fee1dead 
r7:379ce700 r6:81c0b280 r5:81c03048 r4:00000000 [<8013ac3c>] (kernel_restart_prepare) from [<8013ad14>] (kernel_restart+0x1c/0x60) [<8013acf8>] (kernel_restart) from [<8013af84>] (__do_sys_reboot+0xe0/0x1d8) r5:81c03048 r4:00000000 [<8013aea4>] (__do_sys_reboot) from [<8013b0ec>] (sys_reboot+0x18/0x1c) r8:80101204 r7:00000058 r6:00000000 r5:00000000 r4:00000000 [<8013b0d4>] (sys_reboot) from [<80101000>] (ret_fast_syscall+0x0/0x54) Exception stack(0x9f4d9fa8 to 0x9f4d9ff0) 9fa0: 00000000 00000000 fee1dead 28121969 01234567 379ce700 9fc0: 00000000 00000000 00000000 00000058 00000000 00000000 00000000 00016d04 9fe0: 00028e0c 7ec87c64 000135ec 76c1f410 Restore original invalid input check in dma_common_free_remap() to avoid this problem. Fixes: 5cf4537975bb ("dma-mapping: introduce a dma_common_find_pages helper") Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> [hch: just revert the offending hunk instead of creating a new helper] Signed-off-by: Christoph Hellwig <hch@lst.de>


Linus Torvalds

commit sha 43b815c6a8e7dbccb5b8bd9c4b099c24bc22d135

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "A few fixes this time around: - Fixup of some clock specifications for DRA7 (device-tree fix) - Removal of some dead/legacy CPU OPP/PM code for OMAP that throws warnings at boot - A few more minor fixups for OMAPs, most around display - Enable STM32 QSPI as =y since their rootfs sometimes comes from there - Switch CONFIG_REMOTEPROC to =y since it went from tristate to bool - Fix of thermal zone definition for ux500 (5.4 regression)" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: multi_v7_defconfig: Fix SPI_STM32_QSPI support ARM: dts: ux500: Fix up the CPU thermal zone arm64/ARM: configs: Change CONFIG_REMOTEPROC from m to y ARM: dts: am4372: Set memory bandwidth limit for DISPC ARM: OMAP2+: Fix warnings with broken omap2_set_init_voltage() ARM: OMAP2+: Add missing LCDC midlemode for am335x ARM: OMAP2+: Fix missing reset done flag for am3 and am43 ARM: dts: Fix gpio0 flags for am335x-icev2 ARM: omap2plus_defconfig: Enable more droid4 devices as loadable modules ARM: omap2plus_defconfig: Enable DRM_TI_TFP410 DTS: ARM: gta04: introduce legacy spi-cs-high to make display work again ARM: dts: Fix wrong clocks for dra7 mcasp clk: ti: dra7: Fix mcasp8 clock bits


Linus Torvalds

commit sha 7cdb85df6061d001fffd09c6adfbcf20356615e2

Merge tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping regression fix from Christoph Hellwig: "Revert an incorrect hunk from a patch that caused problems on various arm boards (Andrey Smirnov)" * tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: fix false positive warnings in dma_common_free_remap()


Linus Torvalds

commit sha b212921b13bda088a004328457c5c21458262fe2

elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings In commit 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") we changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the executable mappings. Then, people reported that it broke some binaries that had overlapping segments from the same file, and commit ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some overlaying elf segment cases. But only some - despite the summary line of that commit, it only did it when it also does a temporary brk vma for one obvious overlapping case. Now Russell King reports another overlapping case with old 32-bit x86 binaries, which doesn't trigger that limited case. End result: we had better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED. Yes, it's a sign of old binaries generated with old tool-chains, but we do pride ourselves on not breaking existing setups. This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp() and the old load_elf_library() use-cases, because nobody has reported breakage for those. Yet. Note that in all the cases seen so far, the overlapping elf sections seem to be just re-mapping of the same executable with different section attributes. We could possibly introduce a new MAP_FIXED_NOFILECHANGE flag or similar, which acts like NOREPLACE, but allows just remapping the same executable file using different protection flags. It's not clear that would make a huge difference to anything, but if people really hate that "elf remaps over previous maps" behavior, maybe at least a more limited form of remapping would alleviate some concerns. Alternatively, we should take a look at our elf_map() logic to see if we end up not mapping things properly the first time. In the meantime, this is the minimal "don't do that then" patch while people hopefully think about it more. 
Reported-by: Russell King <linux@armlinux.org.uk> Fixes: 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") Fixes: ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments") Cc: Michal Hocko <mhocko@suse.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
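The difference between the two flags discussed above can be seen from userspace. The following is a minimal sketch, not the ELF loader's code: it assumes a Linux kernel that honours MAP_FIXED_NOREPLACE, and the helper names (`map_one_page`, `fixed_replaces_overlap`, `noreplace_refuses_overlap`) as well as the fallback flag value are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000 /* assumed fallback for older libc headers */
#endif

/* Hypothetical helper: map `len` bytes of anonymous memory at `hint`. */
static void *map_one_page(void *hint, int prot, int extra_flags, size_t len)
{
    return mmap(hint, len, prot,
                MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);
}

/* Returns 1 if MAP_FIXED silently replaced an existing mapping. */
int fixed_replaces_overlap(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    void *a = map_one_page(NULL, PROT_READ | PROT_WRITE, 0, (size_t)page);
    if (a == MAP_FAILED)
        return -1;

    /* Same address again: MAP_FIXED discards whatever was mapped there. */
    void *b = map_one_page(a, PROT_READ, MAP_FIXED, (size_t)page);
    int replaced = (b == a);

    munmap(a, (size_t)page);
    return replaced;
}

/* Returns 1 if MAP_FIXED_NOREPLACE refused the overlap with EEXIST. */
int noreplace_refuses_overlap(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    void *a = map_one_page(NULL, PROT_READ | PROT_WRITE, 0, (size_t)page);
    if (a == MAP_FAILED)
        return -1;

    errno = 0;
    void *b = map_one_page(a, PROT_READ, MAP_FIXED_NOREPLACE, (size_t)page);
    int refused = (b == MAP_FAILED && errno == EEXIST);

    /* A kernel that does not know the flag treats the address as a hint. */
    if (b != MAP_FAILED && b != a)
        munmap(b, (size_t)page);
    munmap(a, (size_t)page);
    return refused;
}
```

A binary whose segments overlap hits the second case when it is loaded with MAP_FIXED_NOREPLACE, which is why the commit goes back to MAP_FIXED for the executable mappings.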


Linus Torvalds

commit sha da0c9ea146cbe92b832f1b0f694840ea8eb33cce

Linux 5.4-rc2


push time in a month

create branch cyphar/linux

branch : copy_struct_from_user/cleanup

created branch time in a month

delete branch cyphar/linux

delete branch : copy_struct_from_user/master

delete time in a month

push event cyphar/linux

Aleksa Sarai

commit sha 31b045c14114db4d0aea96c40a9e49f6d4aa2a8d

open: openat2(2) syscall The most obvious syscall to add support for the new LOOKUP_* scoping flags would be openat(2). However, there are a few reasons why this is not the best course of action: * The new LOOKUP_* flags are intended to be security features, and openat(2) will silently ignore all unknown flags. This means that users would need to avoid foot-gunning themselves constantly when using this interface if it were part of openat(2). This can be fixed by having userspace libraries handle this for users[1], but should be avoided if possible. * Resolution scoping feels like a different operation to the existing O_* flags. And since openat(2) has limited flag space, it seems to be quite wasteful to clutter it with 5 flags that are all resolution-related. Arguably O_NOFOLLOW is also a resolution flag but its entire purpose is to error out if you encounter a trailing symlink -- not to scope resolution. * Other systems would be able to reimplement this syscall allowing for cross-OS standardisation rather than being hidden amongst O_* flags which may result in it not being used by all the parties that might want to use it (file servers, web servers, container runtimes, etc). * It gives us the opportunity to iterate on the O_PATH interface. In particular, the new @how->upgrade_mask field for fd re-opening is only possible because we have a clean slate without needing to re-use the ACC_MODE flag design nor the existing openat(2) @mode semantics. To this end, we introduce the openat2(2) syscall. It provides all of the features of openat(2) through the @how->flags argument, but also provides a new @how->resolve argument which exposes RESOLVE_* flags that map to our new LOOKUP_* flags. It also eliminates the long-standing ugliness of variadic-open(2) by embedding it in a struct -- and allows for future extensions using the copy_struct_from_user() model. 
In order to allow for userspace to lock down their usage of file descriptor re-opening, openat2(2) has the ability for users to disallow certain re-opening modes through @how->upgrade_mask. At the moment, there is no UPGRADE_NOEXEC. And finally, unlike open(2) we require all flag bits to be valid. This not only includes checking against VALID_OPEN_FLAGS, but more importantly includes checks that incompatible O_* flags are not set. Part of this includes a refactor of old syscalls so that their subtle flag-clearing behaviour is far more explicit and slightly more obvious than it was previously. [1]: https://github.com/openSUSE/libpathrs Suggested-by: Christian Brauner <christian@brauner.io> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
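The extensible-struct idea behind copy_struct_from_user() can be modelled in plain userspace C. This is only a sketch under stated assumptions: `copy_struct_sketch`, `struct how_v1` and `struct how_v2` are hypothetical names invented here, and the code works on ordinary memory instead of user pointers. It shows the two compatibility rules: a shorter struct from an older caller is zero-extended, and a longer struct from a newer caller is accepted only if the bytes the receiver does not know about are all zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two hypothetical revisions of the same argument struct. */
struct how_v1 { uint64_t flags; };
struct how_v2 { uint64_t flags; uint64_t resolve; };

/*
 * Userspace model of the size-versioned copy (not the kernel code):
 * `ksize` is the size the receiver understands, `usize` is the size
 * the caller passed in.
 */
int copy_struct_sketch(void *dst, size_t ksize, const void *src, size_t usize)
{
    const unsigned char *bytes = src;
    size_t common = usize < ksize ? usize : ksize;

    /* Newer caller: unknown trailing fields must be zero, else refuse. */
    for (size_t i = ksize; i < usize; i++)
        if (bytes[i] != 0)
            return -E2BIG;

    /* Older caller: fields it does not know about default to zero. */
    memset(dst, 0, ksize);
    memcpy(dst, src, common);
    return 0;
}

/* Usage: an old caller (v1 layout) talking to a receiver that knows v2. */
void example_old_caller(void)
{
    struct how_v1 old = { .flags = 3 };
    struct how_v2 seen;

    int ret = copy_struct_sketch(&seen, sizeof(seen), &old, sizeof(old));
    assert(ret == 0 && seen.flags == 3 && seen.resolve == 0);
}
```

Because the caller always passes the struct size, fields can be appended later without a new syscall number, as long as zero keeps meaning "behave as before".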


Aleksa Sarai

commit sha c5e09b6c8af96c43d08ccc35431df5589bbc773e

selftests: add openat2(2) selftests Test all of the various openat2(2) flags, as well as how file descriptor re-opening works. A small stress-test of a symlink-rename attack is included to show that the protections against ".."-based attacks are sufficient. In addition, the memfd selftest is fixed to no longer depend on the now-disallowed functionality of upgrading an O_RDONLY descriptor to O_RDWR. The main things these self-tests are enforcing are: * The struct+usize ABI for openat2(2) and copy_struct_from_user() to ensure that upgrades will be handled gracefully (in addition, ensuring that misaligned structures are also handled correctly). * The -EINVAL checks for openat2(2) are all correctly handled to avoid userspace passing unknown or conflicting flag sets (most importantly, ensuring that invalid flag combinations are checked). * All of the RESOLVE_* semantics (including errno values) are correctly handled with various combinations of paths and flags. * RESOLVE_IN_ROOT correctly protects against the symlink rename(2) attack that has been responsible for several CVEs (and likely will be responsible for several more). * The magic-link trailing mode semantics correctly block re-opens in all of the relevant cases, as well as checking that the "flip-flop" attack is correctly protected against. * O_PATH has the correct semantics (the mode is g+rwx for ordinary files, but for trailing magic-links the mode gets inherited). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


Aleksa Sarai

commit sha 7a5de20649deb06a667aace9e8232cbcb5959833

Documentation: update path-lookup to mention trailing magic-links We've introduced new (somewhat subtle) behaviour regarding trailing magic-links, so it's best to make sure everyone can follow along with the reasoning behind trailing_magiclink(). Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


push time in a month

push event cyphar/man-pages

Aleksa Sarai

commit sha 7341ae9f0f41c355acba41570061fae76c9d77b2

openat2.2: document new openat2(2) syscall Rather than trying to merge the new syscall documentation into open.2 (which would probably result in the man-page being incomprehensible), instead the new syscall gets its own dedicated page with links between open(2) and openat2(2) to avoid duplicating information such as the list of O_* flags or common errors. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>


push time in a month

push event cyphar/linux

Johan Hovold

commit sha 7fd25e6fc035f4b04b75bca6d7e8daa069603a76

ieee802154: atusb: fix use-after-free at disconnect The disconnect callback was accessing the hardware-descriptor private data after having freed it. Fixes: 7490b008d123 ("ieee802154: add support for atusb transceiver") Cc: stable <stable@vger.kernel.org> # 4.2 Cc: Alexander Aring <alex.aring@gmail.com> Reported-by: syzbot+f4509a9138a1472e7e80@syzkaller.appspotmail.com Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>


Laurence Oberman

commit sha c9c53749375ccce0191f89b86732e642f914bd82

scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF The qla2xxx driver had this issue as well when the newer array firmware returned the retry_delay_timer in the fcp_rsp. The bnx2fc is not handling the masking of the scope bits either so the retry_delay_timestamp value lands up being a large value added to the timer timestamp delaying I/O for up to 27 Minutes. This patch adds similar code to handle this to the bnx2fc driver to avoid the huge delay. Link: https://lore.kernel.org/r/1568210202-12794-1-git-send-email-loberman@redhat.com Signed-off-by: Laurence Oberman <loberman@redhat.com> Reported-by: David Jeffery <djeffery@redhat.com> Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Stanley Chu

commit sha f51913eef23f74c3bd07899dc7f1ed6df9e521d8

scsi: ufs: skip shutdown if hba is not powered In some cases, hba may go through shutdown flow without successful initialization and then make system hang. For example, if ufshcd_change_power_mode() gets error and leads to ufshcd_hba_exit() to release resources of the host, future shutdown flow may hang the system since the host register will be accessed in unpowered state. To solve this issue, simply add checking to skip shutdown for above kind of situation. Link: https://lore.kernel.org/r/1568780438-28753-1-git-send-email-stanley.chu@mediatek.com Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Acked-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Austin Kim

commit sha 4b062e7cf940aa4f38fedc3a964b24cb095373f0

scsi: qedf: Remove always false 'tmp_prio < 0' statement Since tmp_prio is declared as u8, the following statement is always false. tmp_prio < 0 So remove 'always false' statement. Link: https://lore.kernel.org/r/20190919075548.GA112801@LGEARND20B15 Signed-off-by: Austin Kim <austindh.kim@gmail.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
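Why the comparison is dead code can be shown with a small standalone sketch. The function names and the upper bound used below are hypothetical and are not taken from the qedf driver; the point is only that a negative input stored in an 8-bit unsigned variable wraps to a large positive value, so it has to be caught by an upper-bound check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: store a user-supplied value in an 8-bit unsigned field. */
uint8_t store_prio(int raw)
{
    return (uint8_t)raw; /* conversion is modulo 256, so -1 becomes 255 */
}

/* Hypothetical range check against an assumed maximum priority. */
int prio_in_range(uint8_t tmp_prio, uint8_t max_prio)
{
    /*
     * A `tmp_prio < 0` test would never be true here, because uint8_t
     * only holds 0..255. Only the upper bound is meaningful.
     */
    return tmp_prio <= max_prio;
}

/* Usage: a negative input is rejected by the upper bound alone. */
void example_negative_input(void)
{
    uint8_t prio = store_prio(-1);

    assert(prio == 255);
    assert(!prio_in_range(prio, 7));
}
```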


Long Li

commit sha 0ed8810276907c8a633dc8cecc48dabb6678cd23

scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue storvsc doesn't use a dedicated hardware queue for a given CPU queue. When issuing I/O, it selects returning CPU (hardware queue) dynamically based on vmbus channel usage across all channels. This patch advertises num_present_cpus() as number of hardware queues. This will have upper layer setup 1:1 mapping between hardware queue and CPU queue and avoid unnecessary locking when issuing I/O. Link: https://lore.kernel.org/r/1567790660-48142-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Xiang Chen

commit sha 70054aa39a013fa52eff432f2223b8bd5c0048f8

scsi: megaraid: disable device when probe failed after enabled device For a PCI device, the device needs to be disabled when probe fails after the device has been enabled. Link: https://lore.kernel.org/r/1567818450-173315-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


YueHaibing

commit sha 4b6b1bb68628ec49e4cbfabd8ac17a169f079cd8

scsi: hisi_sas: Make three functions static Fix sparse warnings: drivers/scsi/hisi_sas/hisi_sas_main.c:3686:6: warning: symbol 'hisi_sas_debugfs_release' was not declared. Should it be static? drivers/scsi/hisi_sas/hisi_sas_main.c:3708:5: warning: symbol 'hisi_sas_debugfs_alloc' was not declared. Should it be static? drivers/scsi/hisi_sas/hisi_sas_main.c:3799:6: warning: symbol 'hisi_sas_debugfs_bist_init' was not declared. Should it be static? Link: https://lore.kernel.org/r/20190923054035.19036-1-yuehaibing@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Himanshu Madhani

commit sha 248a445adfc8c33ffd67cf1f2e336578e34f9e21

scsi: qla2xxx: Silence fwdump template message Print if fwdt template is present or not, only when ql2xextended_error_logging is enabled. Link: https://lore.kernel.org/r/20190912180918.6436-2-hmadhani@marvell.com Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Quinn Tran

commit sha c3b6a1d397420a0fdd97af2f06abfb78adc370df

scsi: qla2xxx: Fix unbound sleep in fcport delete path. There are instances, though rare, where a LOGO request cannot be sent out and the thread in free session done can wait indefinitely. Fix this by putting an upper bound to sleep. Link: https://lore.kernel.org/r/20190912180918.6436-3-hmadhani@marvell.com Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Quinn Tran

commit sha fd5564ba54e0d8a9e3e823d311b764232e09eb5f

scsi: qla2xxx: Fix stale mem access on driver unload On driver unload, the 'remove_one' thread was allowed to advance while session cleanup still lagged behind. This patch ensures session deletion will finish before remove_one can advance. Link: https://lore.kernel.org/r/20190912180918.6436-4-hmadhani@marvell.com Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Quinn Tran

commit sha f5187b7d1ac66b61676f896751d3af9fcf8dd592

scsi: qla2xxx: Optimize NPIV tear down process When an NPIV port is being torn down, this patch sets a flag to indicate VPORT_DELETE. This prevents relogin from being triggered. Link: https://lore.kernel.org/r/20190912180918.6436-5-hmadhani@marvell.com Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Quinn Tran

commit sha 7f2a398d59d658818f3d219645164676fbbc88e8

scsi: qla2xxx: Fix N2N link reset Fix stalled link recovery for N2N with FC-NVMe connection. Link: https://lore.kernel.org/r/20190912180918.6436-6-hmadhani@marvell.com Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Quinn Tran

commit sha f3f1938bb673b1b5ad182c4608f5f8a24921eea3

scsi: qla2xxx: Fix N2N link up fail During link up/bounce, the qla driver would do a command flush as part of cleanup. In this case, the flush can interfere with FW state. This patch allows FW to be in control of link up. Link: https://lore.kernel.org/r/20190912180918.6436-7-hmadhani@marvell.com Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Quinn Tran

commit sha 0aabb6b699f72dca96988d3f428e222f932dc889

scsi: qla2xxx: Fix Nport ID display value For N2N, the NPort ID is assigned by driver in the PLOGI ELS. According to FW Spec the byte order for SID is not the same as DID. Link: https://lore.kernel.org/r/20190912180918.6436-8-hmadhani@marvell.com Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Tested-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Christophe JAILLET

commit sha 21a045e430e56725628f361dede8a882e92469b9

ieee802154: mcr20a: simplify a bit 'mcr20a_handle_rx_read_buf_complete()' Use a 'skb_put_data()' variant instead of rewriting it. The __skb_put_data variant is safe here. It is obvious that the skb can not overflow. It has just been allocated a few lines above with the same 'len'. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Xue Liu <liuxuenetmail@gmail.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>


Navid Emamdoost

commit sha 6402939ec86eaf226c8b8ae00ed983936b164908

ieee802154: ca8210: prevent memory leak In ca8210_probe the allocated pdata needs to be assigned to spi_device->dev.platform_data before calling ca8210_get_platform_data. Otherwise when ca8210_get_platform_data fails pdata cannot be released. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Link: https://lore.kernel.org/r/20190917224713.26371-1-navid.emamdoost@gmail.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>


Michal Vokáč

commit sha 7ae6d93c8f052b7a77ba56ed0f654e22a2876739

net: dsa: qca8k: Use up to 7 ports for all operations The QCA8K family supports up to 7 ports. So use the existing QCA8K_NUM_PORTS define to allocate the switch structure and limit all operations with the switch ports. This was not an issue until commit 0394a63acfe2 ("net: dsa: enable and disable all ports") disabled all unused ports. Since the unused ports 7-11 are outside of the correct register range on this switch some registers were rewritten with invalid content. Fixes: 6b93fb46480a ("net-next: dsa: add new driver for qca8xxx family") Fixes: a0c02161ecfc ("net: dsa: variable number of ports") Fixes: 0394a63acfe2 ("net: dsa: enable and disable all ports") Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>


Eric Dumazet

commit sha e9789c7cc182484fc031fd88097eb14cb26c4596

sch_cbq: validate TCA_CBQ_WRROPT to avoid crash syzbot reported a crash in cbq_normalize_quanta() caused by an out of range cl->priority. iproute2 enforces this check, but malicious users do not. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI Modules linked in: CPU: 1 PID: 26447 Comm: syz-executor.1 Not tainted 5.3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:cbq_normalize_quanta.part.0+0x1fd/0x430 net/sched/sch_cbq.c:902 RSP: 0018:ffff8801a5c333b0 EFLAGS: 00010206 RAX: 0000000020000003 RBX: 00000000fffffff8 RCX: ffffc9000712f000 RDX: 00000000000043bf RSI: ffffffff83be8962 RDI: 0000000100000018 RBP: ffff8801a5c33420 R08: 000000000000003a R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000002ef R13: ffff88018da95188 R14: dffffc0000000000 R15: 0000000000000015 FS: 00007f37d26b1700(0000) GS:ffff8801dad00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004c7cec CR3: 00000001bcd0a006 CR4: 00000000001626f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffff83be9d57>] cbq_normalize_quanta include/net/pkt_sched.h:27 [inline] [<ffffffff83be9d57>] cbq_addprio net/sched/sch_cbq.c:1097 [inline] [<ffffffff83be9d57>] cbq_set_wrr+0x2d7/0x450 net/sched/sch_cbq.c:1115 [<ffffffff83bee8a7>] cbq_change_class+0x987/0x225b net/sched/sch_cbq.c:1537 [<ffffffff83b96985>] tc_ctl_tclass+0x555/0xcd0 net/sched/sch_api.c:2329 [<ffffffff83a84655>] rtnetlink_rcv_msg+0x485/0xc10 net/core/rtnetlink.c:5248 [<ffffffff83cadf0a>] netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2510 [<ffffffff83a7db6d>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5266 [<ffffffff83cac2c6>] netlink_unicast_kernel net/netlink/af_netlink.c:1324 [inline] [<ffffffff83cac2c6>] 
netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1350 [<ffffffff83cacd4a>] netlink_sendmsg+0x89a/0xd50 net/netlink/af_netlink.c:1939 [<ffffffff8399d46e>] sock_sendmsg_nosec net/socket.c:673 [inline] [<ffffffff8399d46e>] sock_sendmsg+0x12e/0x170 net/socket.c:684 [<ffffffff8399f1fd>] ___sys_sendmsg+0x81d/0x960 net/socket.c:2359 [<ffffffff839a2d05>] __sys_sendmsg+0x105/0x1d0 net/socket.c:2397 [<ffffffff839a2df9>] SYSC_sendmsg net/socket.c:2406 [inline] [<ffffffff839a2df9>] SyS_sendmsg+0x29/0x30 net/socket.c:2404 [<ffffffff8101ccc8>] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305 [<ffffffff84400091>] entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
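The class of bug described above, an attacker-controlled value used as an array index without a range check, can be sketched in a few lines. This is an illustration only: `SKETCH_MAXPRIO`, `struct sketch_sched` and `sketch_set_priority` are hypothetical and do not reproduce the sch_cbq code. The sketch shows the validation running before the value is ever used, which is the step that tooling such as iproute2 performs but a hostile caller skips.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAXPRIO 8 /* assumed upper bound, for illustration only */

/* Hypothetical scheduler state with one counter per priority band. */
struct sketch_sched {
    unsigned int nclasses[SKETCH_MAXPRIO + 1];
};

/*
 * Validate a caller-supplied priority before it is used as an index.
 * Returns 0 on success, -EINVAL if the value is out of range.
 */
int sketch_set_priority(struct sketch_sched *q, unsigned int priority)
{
    assert(q != NULL);

    if (priority > SKETCH_MAXPRIO)
        return -EINVAL; /* reject instead of indexing out of bounds */

    q->nclasses[priority]++;
    return 0;
}
```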


Haishuang Yan

commit sha 0e141f757b2c78c983df893e9993313e2dc21e38

erspan: remove the incorrect mtu limit for erspan

erspan driver calls ether_setup(), after commit 61e84623ace3 ("net: centralize net_device min/max MTU checking"), the range of mtu is [min_mtu, max_mtu], which is [68, 1500] by default.

It causes the dev mtu of the erspan device to not be greater than 1500, this limit value is not correct for ipgre tap device.

Tested:
Before patch:
# ip link set erspan0 mtu 1600
Error: mtu greater than device maximum.
After patch:
# ip link set erspan0 mtu 1600
# ip -d link show erspan0
21: erspan0@NONE: <BROADCAST,MULTICAST> mtu 1600 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 0

Fixes: 61e84623ace3 ("net: centralize net_device min/max MTU checking")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
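
The [min_mtu, max_mtu] range check mentioned in this commit message can be modelled with a small C sketch. It is a userspace approximation under stated assumptions: `netdev_sketch` and `check_mtu_sketch` are hypothetical names, and treating `max_mtu == 0` as "no upper bound" is inferred from the `maxmtu 0` shown in the tested output rather than copied from kernel source.

```c
#include <errno.h>

/* Hypothetical, reduced stand-in for a network device's MTU limits. */
struct netdev_sketch {
	unsigned int min_mtu;
	unsigned int max_mtu; /* assumed: 0 means no upper bound */
};

/*
 * Centralized range check: reject values below min_mtu, and values
 * above max_mtu only when an upper bound is actually configured.
 * Returns 0 when new_mtu is acceptable, -EINVAL otherwise.
 */
int check_mtu_sketch(const struct netdev_sketch *dev, int new_mtu)
{
	if (new_mtu < 0 || (unsigned int)new_mtu < dev->min_mtu)
		return -EINVAL;
	if (dev->max_mtu > 0 && (unsigned int)new_mtu > dev->max_mtu)
		return -EINVAL;
	return 0;
}
```

With the default limits of 68 and 1500, a request for 1600 is refused, matching the "before patch" behaviour. With the upper bound set to 0, the same request passes, matching the "after patch" output.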

Huacai Chen

commit sha 2b6509f42dc6c96ef0b625888d4aa56d2e140330

MIPS: Loongson64: Fix boot failure after dropping boot_mem_map

From commit a94e4f24ec83 ("MIPS: init: Drop boot_mem_map") onwards, add_memory_region() is handled by memblock_add()/memblock_reserve() directly and all bootmem API should be converted to memblock API. Otherwise it will lead to boot failure, especially in the NUMA case because add_memory_region() loses the node_id information.

Fixes: a94e4f24ec836c8984f83959 ("MIPS: init: Drop boot_mem_map")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
[paul.burton@mips.com:
  - Invert node_id check to de-indent the switch statement & avoid lines over 80 characters.
  - Fixup commit reference in commit message.]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: linux-mips@vger.kernel.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: Huacai Chen <chenhuacai@gmail.com>

push time in a month

issue comment opencontainers/runc

Missing pieces running rootless containers on RHEL 7.4 + Documentation

@tnguyen14 Yes (sorry I didn't see your question earlier).

foldingbeauty

comment created time in a month

issue closed opencontainers/runc

localunittest error

Recently I ran make localunittest on the host and hit the following error:

`checkpoint_test.go:70: copy error "exit status 1": "cp: cannot stat '/busybox/': No such file or directory\n"
=== RUN TestCheckpoint
--- FAIL: TestCheckpoint (0.01s)
    checkpoint_test.go:70: copy error "exit status 1": "cp: cannot stat '/busybox/': No such file or directory\n"
=== RUN TestExecPS
--- FAIL: TestExecPS (0.01s)
    utils_test.go:55: exec_test.go:40: unexpected error: copy error "exit status 1": "cp: cannot stat '/busybox/*': No such file or directory\n"

=== RUN TestUsernsExecPS
--- FAIL: TestUsernsExecPS (0.01s)
    utils_test.go:55: exec_test.go:40: unexpected error: copy error "exit status 1": "cp: cannot stat '/busybox/*': No such file or directory\n"

=== RUN TestIPCPrivate
--- FAIL: TestIPCPrivate (0.01s)
    utils_test.go:55: exec_test.go:73: unexpected error: copy error "exit status 1": "cp: cannot stat '/busybox/*': No such file or directory\n"

=== RUN TestIPCHost
--- FAIL: TestIPCHost (0.01s)
    utils_test.go:55: exec_test.go:98: unexpected error: copy error "exit status 1": "cp: cannot stat '/busybox/*': No such file or directory\n"`

It is caused by the missing /busybox directory. I found that the busybox environment is only built in the Dockerfile, which means that if we run localunittest directly on the host, this error will appear.

I am not sure whether it is necessary to add a patch for it.

closed time in a month

zhlhahaha

issue comment opencontainers/runc

localunittest error

Yeah, this is how our tests are written -- maybe it would be nice to make it neater, but none of us really have time to fix it up. If you would like to, feel free to send a patch.

zhlhahaha

comment created time in a month
