Re: [PATCH 6/6] sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine

On Tue, Feb 13, 2018 at 02:18:12PM +0000, Mel Gorman wrote:
> On Tue, Feb 13, 2018 at 03:01:37PM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 13, 2018 at 01:37:30PM +0000, Mel Gorman wrote:
> > > +static void
> > > +update_wa_numa_placement(struct task_struct *p, int prev_cpu, int target)
> > > +{
> > > +	unsigned long interval;
> > > +
> > > +	if (!static_branch_likely(&sched_numa_balancing))
> > > +		return;
> > > +
> > > +	/* If balancing has no preference then continue gathering data */
> > > +	if (p->numa_preferred_nid == -1)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * If the wakeup is not affecting locality then it is neutral from
> > > +	 * the perspective of NUMA balancing so continue gathering data.
> > > +	 */
> > > +	if (cpus_share_cache(prev_cpu, target))
> > > +		return;
> > 
> > Dang, I wanted to mention this before, but it slipped my mind. The
> > comment and code don't match.
> > 
> > Did you want to write:
> > 
> > 	if (cpu_to_node(prev_cpu) == cpu_to_node(target))
> > 		return;
> > 
> 
> Well, it was deliberate. While it's possible to be on the same memory
> node and not sharing cache, the scheduler typically is more concerned with
> the LLC than NUMA per se. If they share LLC, then I also assume that they
> the LLC than NUMA per-se. If they share LLC, then I also assume that they
> share memory locality.

True, but the remaining code only has an effect on NUMA balancing, which
is concerned with nodes. So I don't see the point of using something
potentially smaller than a node.

Suppose someone built hardware where a node has 2 cache clusters; then
we'd still set a wake_affine back-off for NUMA balancing, even though
the task remains on the same node.

How would that be useful?
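
To make the hypothetical concrete, here is a minimal userspace sketch, not
kernel code. The topology tables and every mock_* / reaches_backoff_* name
are made up for illustration; they only stand in for cpus_share_cache()
and cpu_to_node(), and "reaches back-off" just means the early return in
update_wa_numa_placement() is not taken.

```c
#include <stdbool.h>

/*
 * Hypothetical topology, invented for illustration: 8 CPUs, 2 NUMA nodes,
 * 2 LLC clusters per node. None of the mock_* names are kernel APIs.
 */
#define MOCK_NR_CPUS 8

static const int mock_cpu_node[MOCK_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };
static const int mock_cpu_llc[MOCK_NR_CPUS]  = { 0, 0, 1, 1, 2, 2, 3, 3 };

/* Stand-in for cpus_share_cache(): true when both CPUs share an LLC. */
static bool mock_cpus_share_cache(int a, int b)
{
	return mock_cpu_llc[a] == mock_cpu_llc[b];
}

/* Stand-in for cpu_to_node(). */
static int mock_cpu_to_node(int cpu)
{
	return mock_cpu_node[cpu];
}

/* Check as posted: continue to the back-off code unless the LLC is shared. */
static bool reaches_backoff_llc(int prev_cpu, int target)
{
	return !mock_cpus_share_cache(prev_cpu, target);
}

/* Suggested check: continue only when the wakeup crosses nodes. */
static bool reaches_backoff_node(int prev_cpu, int target)
{
	return mock_cpu_to_node(prev_cpu) != mock_cpu_to_node(target);
}
```

With prev_cpu = 0 and target = 2 (same node, different cluster) the LLC
check continues to the back-off code while the node check returns early.
For 0 -> 1 neither continues, and for 0 -> 4 both do.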