All times are UTC - 6 hours




PostPosted: Mon Mar 14, 2005 7:08 am 
I'd like to rewrite these two functions in AltiVec:
toLower(char *a)
toUpper(char *a)

The original ones actually take just a single char as a parameter, but where they're used I could easily switch to vectorized versions. My guess is that they would be considerably faster, but I'd have to use a lookup table or something like that. Could anyone give me some pointers on lookup tables in AltiVec?

Thanks :-)

Konstantinos


PostPosted: Mon Mar 14, 2005 7:05 pm 
I wrote the following two functions for one of the CrabFire filters. They require no memory lookups so they will not pollute the data cache or hog the memory bus.

Code:
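vector unsigned char vec_tolower(vector unsigned char str)
{
   /* From Holger Bettag's table of constants. vec_splat_u8 yields
      vector unsigned char, so everything is declared unsigned.
      Both bounds are exclusive. */
   vector unsigned char below_A = vec_rl(vec_splat_u8(4), vec_splat_u8(4));                                 /* 0x40 '@' */
   vector unsigned char above_Z = vec_or(vec_rl(vec_splat_u8(0xb), vec_splat_u8(0xb)), vec_splat_u8(0xb));  /* 0x5b '[' */
   vector unsigned char diff    = vec_rl(vec_splat_u8(1), vec_splat_u8(5));                                 /* 0x20 */

   vector bool char gt   = vec_cmpgt(str, below_A);
   vector bool char lt   = vec_cmplt(str, above_Z);
   vector bool char mask = vec_and(gt, lt);          /* set for 'A'..'Z' */
   vector unsigned char small = vec_add(str, diff);  /* lower-case candidate */
   return vec_sel(str, small, mask);
}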


Code:
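vector unsigned char vec_toupper(vector unsigned char str)
{
   /* From Holger Bettag's table of constants. Both bounds are exclusive,
      so the upper one must be 0x7b, one past 'z'. vec_avg rounds up:
      (0 + 0xf6 + 1) >> 1 = 0x7b. A splat of -13 would give 0x7a and
      leave 'z' itself unconverted. */
   vector unsigned char below_a = vec_rl(vec_splat_u8(3), vec_splat_u8(5));    /* 0x60 '`' */
   vector unsigned char above_z = vec_avg(vec_splat_u8(0), vec_splat_u8(-10)); /* 0x7b '{' */
   vector unsigned char diff    = vec_rl(vec_splat_u8(1), vec_splat_u8(5));    /* 0x20 */

   vector bool char gt   = vec_cmpgt(str, below_a);
   vector bool char lt   = vec_cmplt(str, above_z);
   vector bool char mask = vec_and(gt, lt);          /* set for 'a'..'z' */
   vector unsigned char upper = vec_sub(str, diff);  /* upper-case candidate */
   return vec_sel(str, upper, mask);
}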
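
For anyone who wants to check the constants without a PowerPC at hand, here is a rough scalar model in plain C of what one byte lane of the two functions computes. The helper names (rotl8, avg8, shift_if_between, model_tolower, model_toupper) are made up for this sketch and are not part of AltiVec. It assumes unsigned bytes and exclusive bounds on both sides, so the upper bound for toupper is taken as 0x7b, one past 'z'.

```c
#include <stdint.h>

/* 8-bit rotate left: per-byte behaviour of vec_rl on unsigned chars
   (only the low 3 bits of the count matter). */
static uint8_t rotl8(uint8_t x, unsigned n)
{
    n &= 7;
    return (uint8_t)((x << n) | (x >> ((8 - n) & 7)));
}

/* Rounded unsigned average: per-byte behaviour of vec_avg. */
static uint8_t avg8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* One lane of compare-and-select: if lo < c < hi, add delta (mod 256),
   otherwise pass c through unchanged. */
static uint8_t shift_if_between(uint8_t c, uint8_t lo, uint8_t hi, uint8_t delta)
{
    uint8_t mask  = (uint8_t)((c > lo && c < hi) ? 0xFF : 0x00);
    uint8_t moved = (uint8_t)(c + delta);
    return (uint8_t)((c & (uint8_t)~mask) | (moved & mask));
}

static uint8_t model_tolower(uint8_t c)
{
    uint8_t lo = rotl8(4, 4);                          /* 0x40 '@' */
    uint8_t hi = (uint8_t)(rotl8(0x0b, 0x0b) | 0x0b);  /* 0x5b '[' */
    return shift_if_between(c, lo, hi, rotl8(1, 5));   /* +0x20 */
}

static uint8_t model_toupper(uint8_t c)
{
    uint8_t lo = rotl8(3, 5);                          /* 0x60 '`' */
    uint8_t hi = avg8(0, 0xF6);                        /* 0x7b '{' (0xF6 = splat of -10) */
    return shift_if_between(c, lo, hi, (uint8_t)(0u - rotl8(1, 5)));  /* -0x20 */
}
```

Bytes outside the two ASCII letter ranges, including everything at 0x80 and above, pass through untouched, so accented characters in 8-bit encodings are not converted.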


PostPosted: Tue Mar 15, 2005 2:36 am 
The most basic table lookup in AltiVec is a single vector permute (vec_perm): the 32 table entries reside in the 'left' and 'right' data vectors, and 16 indexes of 5 bits each reside in the 'permute control' vector.
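
To make the permute semantics concrete, here is a small scalar model in plain C of what the permute does across its 16 lanes. The function name perm32 is hypothetical; on real hardware this whole loop is the single call vec_perm(left, right, control).

```c
#include <stdint.h>

/* Scalar model of a 16-lane permute: each control byte picks one of the
   32 bytes of left||right using its low 5 bits; the upper 3 bits of the
   control byte are ignored. */
static void perm32(const uint8_t left[16], const uint8_t right[16],
                   const uint8_t control[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        unsigned idx = control[i] & 0x1F;
        out[i] = (idx < 16) ? left[idx] : right[idx - 16];
    }
}
```

Because the upper index bits are ignored rather than rejected, an index of 37 silently reads entry 5. That is exactly what makes the extension to larger tables possible.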

To extend this to larger lookup tables, you need to compare and select based on the higher index bits. For example, for a table with 64 entries, you do two 32-entry lookups as explained above, one into each half of the table. Then you isolate bit number 5 of each index (the one valued 32) with an AND, compare the result to zero, and use the outcome of that comparison to select between the two initial lookups.

In practice you'd compute the boolean mask in parallel with the lookups (they happen in independent execution units).
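
The 64-entry case can be sketched in scalar C as follows. This is only a per-lane model with made-up names (lane_perm, lookup64); in AltiVec the corresponding steps would be two vec_perm calls, a vec_and with a splat of 32, a vec_cmpeq against zero, and a vec_sel.

```c
#include <stdint.h>

/* 32-entry lookup for one index byte, as a permute does per lane:
   only the low 5 bits of idx are used. */
static uint8_t lane_perm(const uint8_t *tbl32, uint8_t idx)
{
    return tbl32[idx & 0x1F];
}

/* 64-entry lookup over 16 lanes. Indexes are assumed to be below 64;
   bits 6 and 7 are simply ignored here. */
static void lookup64(const uint8_t table[64], const uint8_t idx[16],
                     uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        uint8_t low  = lane_perm(table, idx[i]);       /* entries 0..31  */
        uint8_t high = lane_perm(table + 32, idx[i]);  /* entries 32..63 */

        /* Isolate bit 5 and compare with zero: all-ones where it is clear. */
        uint8_t is_low = (uint8_t)(((idx[i] & 0x20) == 0) ? 0xFF : 0x00);

        /* Bitwise select between the two candidate results. */
        out[i] = (uint8_t)((low & is_low) | (high & (uint8_t)~is_low));
    }
}
```

The mask depends only on the index, not on either lookup result, which is why it can be computed while the permutes are in flight.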

As you can see, the decision tree can be extended recursively to include more significant bits, up to the limit of a full 8-bit lookup table. It looks inelegant, but it is almost always a win over scalar code indexing an actual char array in memory. The biggest problem is that the lookup table and the invariant values occupy a lot of vector registers: a full 256-entry byte table alone fills 16 of the 32 vector registers. So there is not much headroom to do more calculation right before or after the lookup (you wouldn't want the compiler to spill and refill registers to/from memory).
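
Carried all the way to 8 bits, the decision tree looks roughly like this for a single lane. Again this is a scalar sketch in plain C with hypothetical names (sel8, bit_set, lookup256_lane), not AltiVec code: the eight table reads stand in for eight permutes over pairs of table registers, and each sel8 stands in for a vec_sel.

```c
#include <stdint.h>

/* Bitwise select: take b where mask bits are set, else a (like vec_sel). */
static uint8_t sel8(uint8_t a, uint8_t b, uint8_t mask)
{
    return (uint8_t)((a & (uint8_t)~mask) | (b & mask));
}

/* All-ones if the given index bit is set, else zero. */
static uint8_t bit_set(uint8_t idx, uint8_t bit)
{
    return (uint8_t)((idx & bit) ? 0xFF : 0x00);
}

/* Full 256-entry lookup for one lane: eight 32-entry lookups on the low
   5 bits, then a three-level select tree on bits 5, 6 and 7. */
static uint8_t lookup256_lane(const uint8_t table[256], uint8_t idx)
{
    uint8_t r[8];
    for (int k = 0; k < 8; k++)
        r[k] = table[32 * k + (idx & 0x1F)];   /* one permute per 32-entry slice */

    uint8_t m5 = bit_set(idx, 0x20);
    uint8_t m6 = bit_set(idx, 0x40);
    uint8_t m7 = bit_set(idx, 0x80);

    uint8_t s0 = sel8(r[0], r[1], m5);   /* level 1: bit 5 */
    uint8_t s1 = sel8(r[2], r[3], m5);
    uint8_t s2 = sel8(r[4], r[5], m5);
    uint8_t s3 = sel8(r[6], r[7], m5);

    uint8_t t0 = sel8(s0, s1, m6);       /* level 2: bit 6 */
    uint8_t t1 = sel8(s2, s3, m6);

    return sel8(t0, t1, m7);             /* level 3: bit 7 */
}
```

That comes to eight permutes, three masks and seven selects per 16 bytes, which gives a feel for the cost compared with the handful of operations in the compare-and-select versions earlier in the thread.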

