Old July 6th, 2008, 07:01 AM
Ira Rosen
Support load permutation in loop-aware SLP

Hi,

The current loop-aware SLP scheme starts from a group of adjacent stores and
follows use-def chains until getting to a group of loads. The loads must be
adjacent and their order must match the order of the stores, i.e., no
permutations are currently allowed.
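
To make the restriction concrete, here is a minimal sketch of the two situations. The function names and constants are made up for illustration and are not taken from the patch or its testcases. The first loop has a group of four adjacent stores fed by adjacent loads in the same order, which is the shape the current scheme expects. The second loop reads the same group in reversed order, so it would need a load permutation.

```c
#include <stddef.h>

/* Adjacent stores out[4i..4i+3], fed by adjacent loads in[4i..4i+3] in the
   same order: the shape the current scheme expects. */
void scale4_in_order(unsigned *out, const unsigned *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[4 * i + 0] = in[4 * i + 0] * 3;
        out[4 * i + 1] = in[4 * i + 1] * 5;
        out[4 * i + 2] = in[4 * i + 2] * 7;
        out[4 * i + 3] = in[4 * i + 3] * 9;
    }
}

/* Same store group, but the loads appear in reversed order relative to the
   stores, so a permutation of the loaded elements would be required. */
void scale4_reversed(unsigned *out, const unsigned *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[4 * i + 0] = in[4 * i + 3] * 3;
        out[4 * i + 1] = in[4 * i + 2] * 5;
        out[4 * i + 2] = in[4 * i + 1] * 7;
        out[4 * i + 3] = in[4 * i + 0] * 9;
    }
}
```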

This patch adds support for a specific type of load permutation, along
with general support for load permutations in SLP. It aims to vectorize RGB
to YUV conversion, which can be viewed as {y, u, v} = M * {r, g, b}, where M
is a matrix of constant coefficients, and the calculation is performed in a
single-nested loop:
  for i
    yi = M00 * ri + M01 * gi + M02 * bi
    ui = M10 * ri + M11 * gi + M12 * bi
    vi = M20 * ri + M21 * gi + M22 * bi
The required permutation of loads transforms the rgb stream into {r,r,r},
{g,g,g} and {b,b,b} vectors (ignoring vector size for simplicity).
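
As a scalar reference, the loop could be written in C roughly as follows. This is only a sketch: the interleaved in/out layout, the int element type and the coefficient values in M are assumptions chosen for illustration, not the real conversion constants or the code from the testcases.

```c
#include <stddef.h>

/* Placeholder coefficients (assumed values, not real RGB-to-YUV constants). */
static const int M[3][3] = {
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9},
};

/* in:  r0 g0 b0 r1 g1 b1 ...   out: y0 u0 v0 y1 u1 v1 ... */
void rgb_to_yuv(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int r = in[3 * i + 0];
        int g = in[3 * i + 1];
        int b = in[3 * i + 2];

        /* The three stores form one group of adjacent stores. The first
           multiply in each statement reads r, the second g, the third b,
           which is where the {r,r,r}, {g,g,g}, {b,b,b} vectors come from. */
        out[3 * i + 0] = M[0][0] * r + M[0][1] * g + M[0][2] * b;
        out[3 * i + 1] = M[1][0] * r + M[1][1] * g + M[1][2] * b;
        out[3 * i + 2] = M[2][0] * r + M[2][1] * g + M[2][2] * b;
    }
}
```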

The SLP analysis detects such cases: all the loads in the same SLP node
must access the same memory location, and all the SLP nodes that contain
loads must form a group of adjacent memory accesses. The transformation
phase generates vector permutations of the input vectors with
compiler-generated masks, depending on the data type, vectorization factor and size
of SLP nodes.
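
To illustrate how such masks depend on the group size and the number of vector elements, here is a small sketch. The helper slp_perm_mask is hypothetical and is not one of the functions added by the patch. It assumes the loaded data is viewed as one flat stream and returns indices into that stream; a real permute instruction typically selects from two input vectors, so the indices would still have to be rebased onto the vectors actually being combined.

```c
/* Hypothetical helper (not from the patch): fill mask[0..nunits-1] with the
   stream indices needed for vector number `vec` of SLP node `node`.

   group_size  number of scalar statements in the store group (3 for y,u,v)
   nunits      number of elements in one vector
   node        which load node: 0 for r, 1 for g, 2 for b
   vec         which output vector of that node is being built */
void slp_perm_mask(unsigned group_size, unsigned nunits,
                   unsigned node, unsigned vec, unsigned *mask)
{
    for (unsigned e = 0; e < nunits; e++) {
        unsigned pos = vec * nunits + e;   /* position in the output stream */
        unsigned iter = pos / group_size;  /* scalar iteration it belongs to */
        mask[e] = iter * group_size + node;
    }
}
```

With group_size 3 and nunits 4, the masks for the r node come out as {0,0,0,3}, {3,3,6,6} and {6,9,9,9}, i.e. each r value is replicated three times across the vectors that feed the y, u, v stores.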

Bootstrapped with vectorization enabled on ppc-linux and tested on Cell SPU
and ppc-linux.
OK for mainline?

Thanks,
Ira

ChangeLog:

* target.h (struct vectorize): Add new target builtin.
* tree-vectorizer.h (enum slp_load_perm_type): New.
(struct _slp_tree): Add new field loads_perm_type.
(struct _slp_instance): Add new field same_perm_nodes.
(SLP_INSTANCE_SAME_PERM_NODES): New.
(SLP_TREE_LOADS_PERM_TYPE, TARG_VEC_PERMUTE_COST): New.
(vectorizable_load): Add argument.
(vect_transform_slp_perm_load): New.
* tree-vect-analyze.c (vect_analyze_operations): Add an argument to
vectorizable_load.
(vect_build_slp_tree): Add new argument. Allow load permutations for the
case when all the loads in the same SLP node access the same memory
location.
(vect_analyze_slp_instance): In case of same location loads check that the
loads from different nodes form an interleaving chain. Sort the nodes
according to the chain.
* target-def.h (): New.
* tree-vect-transform.c (vect_transform_stmt): Add new argument.
(vectorizable_store): Allow number of created vectors to be greater than
the size of an interleaving group. Don't go along the interleaving chain
for SLP.
(vect_create_mask_and_perm): New function.
(vect_get_mask_element, vect_transform_slp_perm_load): Likewise.
(vectorizable_load): Allocate DR_CHAIN according to the number of
generated vectors. Don't keep the created vectors statements in the node
if permutation is required. Call vect_transform_slp_perm_load to generate
the permutation.
(vect_transform_stmt): Add new argument. Call vectorizable_load with
additional argument. Don't wait for other stores in case of SLP.
(vect_schedule_slp_instance): Add new argument. Calculate the number of
vector statements. In case of loads from the same location, allocate
vectorized statements structure for all the related SLP nodes. Call
vect_transform_stmt with additional argument.
(vect_schedule_slp): Remove one argument. Move number of vector statements
calculation to vect_schedule_slp_instance.
(vect_transform_loop): Call vect_transform_stmt and vect_schedule_slp with
correct arguments.
* config/spu/spu.c (spu_builtin_vec_perm): New.
(): Redefine.
* config/spu/spu.h (TARG_VEC_PERMUTE_COST): Define.
* config/rs6000/rs6000.c (rs6000_builtin_vec_perm): New.
(): Redefine.

testsuite/ChangeLog:

* lib/target-supports.exp (): New.
* gcc.dg/vect/slp-perm-1.c: New testcase.
* gcc.dg/vect/slp-perm-2.c: Likewise.
* gcc.dg/vect/slp-perm-3.c: Likewise.
* gcc.dg/vect/slp-perm-4.c: Likewise.
* gcc.dg/vect/slp-perm-5.c: Likewise.
* gcc.dg/vect/slp-perm-6.c: Likewise.
* gcc.dg/vect/slp-perm-7.c: Likewise.
* gcc.dg/vect/slp-perm-8.c: Likewise.
* gcc.dg/vect/slp-perm-9.c: Likewise.

(See attached file: slp-perm.txt)(See attached file: tests.txt)

Old July 6th, 2008, 07:20 PM
David Edelsohn
Support load permutation in loop-aware SLP

* config/rs6000/rs6000.c (rs6000_builtin_vec_perm): New.
(): Redefine.

The rs6000 part of the patch is okay.

Thanks, David
