Commit c2e13112 authored by Joel Fernandes (Google)'s avatar Joel Fernandes (Google) Committed by Paul E. McKenney
Browse files

rcu/segcblist: Add additional comments to explain smp_mb()



One counter-intuitive property of RCU is the fact that full memory
barriers are needed both before and after updates to the full
(non-segmented) length.  This patch therefore helps to assist the
reader's intuition by adding appropriate comments.

[ paulmck:  Wordsmithing. ]
Signed-off-by: default avatarJoel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: default avatarPaul E. McKenney <paulmck@kernel.org>
parent 6bc33582
Loading
Loading
Loading
Loading
+64 −4
Original line number Diff line number Diff line
@@ -94,17 +94,77 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
 * field to disagree with the actual number of callbacks on the structure.
 * This increase is fully ordered with respect to the callers accesses
 * both before and after.
 *
 * So why on earth is a memory barrier required both before and after
 * the update to the ->len field???
 *
 * The reason is that rcu_barrier() locklessly samples each CPU's ->len
 * field, and if a given CPU's field is zero, avoids IPIing that CPU.
 * This can of course race with both queuing and invoking of callbacks.
 * Failing to correctly handle either of these races could result in
 * rcu_barrier() failing to IPI a CPU that actually had callbacks queued
 * which rcu_barrier() was obligated to wait on.  And if rcu_barrier()
 * failed to wait on such a callback, unloading certain kernel modules
 * would result in calls to functions whose code was no longer present in
 * the kernel, for but one example.
 *
 * Therefore, ->len transitions from 1->0 and 0->1 have to be carefully
 * ordered with respect with both list modifications and the rcu_barrier().
 *
 * The queuing case is CASE 1 and the invoking case is CASE 2.
 *
 * CASE 1: Suppose that CPU 0 has no callbacks queued, but invokes
 * call_rcu() just as CPU 1 invokes rcu_barrier().  CPU 0's ->len field
 * will transition from 0->1, which is one of the transitions that must
 * be handled carefully.  Without the full memory barriers after the ->len
 * update and at the beginning of rcu_barrier(), the following could happen:
 *
 * CPU 0				CPU 1
 *
 * call_rcu().
 *					rcu_barrier() sees ->len as 0.
 * set ->len = 1.
 *					rcu_barrier() does nothing.
 *					module is unloaded.
 * callback invokes unloaded function!
 *
 * With the full barriers, any case where rcu_barrier() sees ->len as 0 will
 * have unambiguously preceded the return from the racing call_rcu(), which
 * means that this call_rcu() invocation is OK to not wait on.  After all,
 * you are supposed to make sure that any problematic call_rcu() invocations
 * happen before the rcu_barrier().
 *
 *
 * CASE 2: Suppose that CPU 0 is invoking its last callback just as
 * CPU 1 invokes rcu_barrier().  CPU 0's ->len field will transition from
 * 1->0, which is one of the transitions that must be handled carefully.
 * Without the full memory barriers before the ->len update and at the
 * end of rcu_barrier(), the following could happen:
 *
 * CPU 0				CPU 1
 *
 * start invoking last callback
 * set ->len = 0 (reordered)
 *					rcu_barrier() sees ->len as 0
 *					rcu_barrier() does nothing.
 *					module is unloaded
 * callback executing after unloaded!
 *
 * With the full barriers, any case where rcu_barrier() sees ->len as 0
 * will be fully ordered after the completion of the callback function,
 * so that the module unloading operation is completely safe.
 *
 */
void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
{
#ifdef CONFIG_RCU_NOCB_CPU
	smp_mb__before_atomic(); /* Up to the caller! */
	smp_mb__before_atomic(); // Read header comment above.
	atomic_long_add(v, &rsclp->len);
	smp_mb__after_atomic(); /* Up to the caller! */
	smp_mb__after_atomic();  // Read header comment above.
#else
	smp_mb(); /* Up to the caller! */
	smp_mb(); // Read header comment above.
	WRITE_ONCE(rsclp->len, rsclp->len + v);
	smp_mb(); /* Up to the caller! */
	smp_mb(); // Read header comment above.
#endif
}