[Oisf-devel] [PATCH V2 0/6] Align elements in 'trans_q' and 'data_queues' array

Holger Eitzenberger holger at eitzenberger.org
Mon Oct 28 08:35:11 UTC 2013

Hi Victor,

> I think it would be more useful if you would send the patches inline, so
> I can comment on them line by line. Alternatively, a github pull request
> works as well.

I am using 'quilt' to send the patches out, and am a bit surprised
that it doesn't send them inline already.  Will check whether a github
pull request is more convenient.

> > The first patch is just cleanup, as it only introduces NUM_QUEUES.
> > But from grepping through the source I see that I may not have spotted
> > all places.
> I need to check more carefully, but I'm almost sure TMQ_MAX_QUEUES used
> in tm-queues.[ch] is related.

Ok, will check.

> These queues are used in the autofp runmode to transfer packets between
> the pktacq thread and the workers, so there is one per thread. This
> limits us to 256 threads. I think nfq may actually be worse, as it uses
> more queues iirc. Making this dynamic has been on my list for quite some
> time.

I am sure this would make sense.  I spotted this issue without
actually knowing how often those queues are used in practice,
and therefore can't make a statement about the expected performance
gain.

> The __ALIGN macro looks convenient, I think we should replace the
> current __attribute__((__aligned__(a))) users with it as well.

Ok, I can do so in a subsequent patch.
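
For reference, a rough sketch of how such a macro could be applied to
the queue arrays.  This is not the code from the patch: CLS (an assumed
64-byte cache line), ExampleQueue and its fields are hypothetical
stand-ins, and NUM_QUEUES is set to 256 only because that is the limit
mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size; the real value depends on the target CPU. */
#define CLS 64

/* Assumed macro body: a thin wrapper around the GCC/Clang attribute. */
#define __ALIGN(a) __attribute__((__aligned__(a)))

/* Hypothetical queue type standing in for the real one. */
typedef struct ExampleQueue_ {
    uint32_t len;
    void *top;
    void *bot;
} __ALIGN(CLS) ExampleQueue;

#define NUM_QUEUES 256

/* Aligning the type pads sizeof() up to a multiple of CLS, so no two
 * array elements share a cache line. */
static_assert(sizeof(ExampleQueue) % CLS == 0, "element not padded to CLS");

static ExampleQueue example_queues[NUM_QUEUES];

/* Returns 1 if every array element starts on a cache-line boundary. */
static int QueuesAreCacheAligned(void)
{
    for (size_t i = 0; i < NUM_QUEUES; i++) {
        if (((uintptr_t)&example_queues[i] % CLS) != 0)
            return 0;
    }
    return 1;
}
```

The cost is memory: each element grows to a full cache line even though
its fields need much less.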

> For the allocation functions we use the wrappers from util-mem.h, so in
> the case of posix_memalign, we would use SCMallocAligned. Although I see
> that it is a wrapper for _mm_malloc().

Creating SCMallocAligned() surely makes sense, so I'll create that.
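
A minimal sketch of what a posix_memalign() based wrapper could look
like.  The names ExampleMallocAligned and ExampleFreeAligned are
hypothetical, and this is not the SCMallocAligned from util-mem.h,
which as you say wraps _mm_malloc().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign() in strict modes */
#endif
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns memory aligned to 'align' bytes, or
 * NULL on failure.  posix_memalign() requires 'align' to be a power of
 * two and a multiple of sizeof(void *); it returns non-zero on error. */
static void *ExampleMallocAligned(size_t size, size_t align)
{
    void *ptr = NULL;

    if (posix_memalign(&ptr, align, size) != 0)
        return NULL;
    return ptr;
}

/* Memory obtained from posix_memalign() is released with plain free(). */
static void ExampleFreeAligned(void *ptr)
{
    free(ptr);
}
```

One difference to keep in mind: memory from _mm_malloc() has to be
released with _mm_free(), while posix_memalign() memory goes through
free(), so the free wrapper has to match the allocator.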

> Minimal testing on my dual core laptop (still traveling) suggests a
> minor slowdown (17.2s to 17.6s for one test, consistently over multiple
> runs). Will try again on bigger hardware later when I'm home.

As said above, I don't have a good idea of the actual performance
gain from the alignment itself.  But I'd expect it to be
visible on a large machine with many threads.  Also, this change
goes in the direction of the 'dynamic queues' you described.

I think I can resend this evening.

