From llvm at cs.uiuc.edu Mon Apr 11 00:18:20 2005 From: llvm at cs.uiuc.edu (LLVM) Date: Mon, 11 Apr 2005 00:18:20 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/ASCI_Purple/ Message-ID: <200504110518.AAA27252@zion.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/ASCI_Purple: --- Log message: Directory /var/cvs/llvm/llvm-test/MultiSource/Benchmarks/ASCI_Purple added to the repository --- Diffs of the changes: (+0 -0) 0 files changed From llvm at cs.uiuc.edu Mon Apr 11 00:18:40 2005 From: llvm at cs.uiuc.edu (LLVM) Date: Mon, 11 Apr 2005 00:18:40 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/ Message-ID: <200504110518.AAA27261@zion.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000: --- Log message: Directory /var/cvs/llvm/llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000 added to the repository --- Diffs of the changes: (+0 -0) 0 files changed From llvm at cs.uiuc.edu Mon Apr 11 00:19:06 2005 From: llvm at cs.uiuc.edu (LLVM) Date: Mon, 11 Apr 2005 00:19:06 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/docs/ Message-ID: <200504110519.AAA27270@zion.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/docs: --- Log message: Directory /var/cvs/llvm/llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/docs added to the repository --- Diffs of the changes: (+0 -0) 0 files changed From duraid at octopus.com.au Mon Apr 11 00:22:19 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Mon, 11 Apr 2005 00:22:19 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/docs/smg2000.readme Message-ID: <200504110522.AAA27465@zion.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/docs: smg2000.readme added (r1.1) --- Log message: * add the SMG2000 benchmark. 
This one is brutal on memory, anything you can do in terms of prefetching/unrolling/SWP will probably help. --- Diffs of the changes: (+389 -0) smg2000.readme | 389 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 389 insertions(+) Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/docs/smg2000.readme diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/docs/smg2000.readme:1.1 *** /dev/null Mon Apr 11 00:22:17 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/docs/smg2000.readme Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,389 ---- + %========================================================================== + %========================================================================== + + Code Description + + A. General description: + + SMG2000 is a parallel semicoarsening multigrid solver for the linear + systems arising from finite difference, finite volume, or finite + element discretizations of the diffusion equation, + + \grad \cdot ( D \grad u ) + \sigma u = f + + on logically rectangular grids. The code solves both 2D and 3D + problems with discretization stencils of up to 9-point in 2D and up to + 27-point in 3D. See the following paper for details on the algorithm + and its parallel implementation/performance: + + P. N. Brown, R. D. Falgout, and J. E. Jones, + "Semicoarsening multigrid on distributed memory machines", + SIAM Journal on Scientific Computing, 21 (2000), pp. 1823-1834. + Also available as LLNL technical report UCRL-JC-130720. + + The driver provided with SMG2000 builds linear systems for the special + case of the above equation, + + - cx u_xx - cy u_yy - cz u_zz = (1/h)^2 , (in 3D) + - cx u_xx - cy u_yy = (1/h)^2 , (in 2D) + + with Dirichlet boundary conditions of u = 0, where h is the mesh + spacing in each direction. Standard finite differences are used to + discretize the equations, yielding 5-pt. and 7-pt. stencils in 2D and + 3D, respectively. 
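The 7-point discretization described above can be sketched in C as follows. This is only an illustration and is not taken from the SMG2000 sources: the names `u_at` and `apply_stencil7`, the cubic n x n x n grid, and the x-fastest storage order are all assumptions made for this example.

```c
#include <stddef.h>

/* Read u at (i, j, k). Points outside the n x n x n grid are treated as 0,
   which encodes the Dirichlet condition u = 0. The storage order
   (x fastest) is an assumption of this sketch. */
static double u_at(const double *u, long n, long i, long j, long k)
{
    if (i < 0 || j < 0 || k < 0 || i >= n || j >= n || k >= n)
        return 0.0;
    return u[((size_t)k * (size_t)n + (size_t)j) * (size_t)n + (size_t)i];
}

/* Compute y = A u, where A is the standard 7-point finite-difference
   discretization of -cx u_xx - cy u_yy - cz u_zz with mesh spacing h. */
void apply_stencil7(const double *u, double *y, long n, double h,
                    double cx, double cy, double cz)
{
    double inv_h2 = 1.0 / (h * h);
    long i, j, k;

    for (k = 0; k < n; k++)
        for (j = 0; j < n; j++)
            for (i = 0; i < n; i++)
            {
                double c  = u_at(u, n, i, j, k);
                /* Second differences in each direction: 2u - u_left - u_right */
                double ax = cx * (2.0 * c - u_at(u, n, i - 1, j, k) - u_at(u, n, i + 1, j, k));
                double ay = cy * (2.0 * c - u_at(u, n, i, j - 1, k) - u_at(u, n, i, j + 1, k));
                double az = cz * (2.0 * c - u_at(u, n, i, j, k - 1) - u_at(u, n, i, j, k + 1));
                y[((size_t)k * (size_t)n + (size_t)j) * (size_t)n + (size_t)i] =
                    (ax + ay + az) * inv_h2;
            }
}
```

Each output value reads seven inputs and performs only a handful of floating-point operations, which is consistent with the readme's remark that the code is memory-access bound.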
+ + To determine when the solver has converged, the driver currently uses + the relative-residual stopping criterion, + + ||r_k||_2 / ||b||_2 < tol + + with tol = 10^-6. + + This solver can serve as a key component for achieving scalability in + radiation diffusion simulations. + + B. Coding: + + SMG2000 is written in ISO-C. It is an SPMD code which uses MPI. + Parallelism is achieved by data decomposition. The driver provided + with SMG2000 achieves this decomposition by simply subdividing the + grid into logical P x Q x R (in 3D) chunks of equal size. + + C. Parallelism: + + SMG2000 is a highly synchronous code. The communication and + computation patterns exhibit the surface-to-volume relationship + common to many parallel scientific codes. Hence, parallel efficiency + is largely determined by the size of the data "chunks" mentioned + above, and the speed of communications and computations on the + machine. SMG2000 is also memory-access bound, doing only about 1-2 + computations per memory access, so memory-access speeds will also have + a large impact on performance. + + %========================================================================== + %========================================================================== + + Files in this Distribution + + NOTE: The SMG2000 code is derived directly from the hypre library, a large + linear solver library that is being developed in the Center for Applied + Scientific Computing (CASC) at LLNL. 
+ + In the smg2000 directory the following files are included: + + COPYRIGHT_and_DISCLAIMER + HYPRE_config.h + Makefile + Makefile.include + + The following subdirectories are also included: + + docs + krylov + struct_ls + struct_mv + test + utilities + + In the 'docs' directory the following files are included: + + smg2000.readme + + In the 'krylov' directory the following files are included: + + HYPRE_pcg.c + Makefile + krylov.h + pcg.c + + In the 'struct_ls' directory the following files are included: + + HYPRE_struct_ls.h + HYPRE_struct_pcg.c + HYPRE_struct_smg.c + Makefile + coarsen.c + cyclic_reduction.c + general.c + headers.h + pcg_struct.c + point_relax.c + semi_interp.c + semi_restrict.c + smg.c + smg.h + smg2_setup_rap.c + smg3_setup_rap.c + smg_axpy.c + smg_relax.c + smg_residual.c + smg_setup.c + smg_setup_interp.c + smg_setup_rap.c + smg_setup_restrict.c + smg_solve.c + struct_ls.h + + In the 'struct_mv' directory the following files are included: + + HYPRE_struct_grid.c + HYPRE_struct_matrix.c + HYPRE_struct_mv.h + HYPRE_struct_stencil.c + HYPRE_struct_vector.c + Makefile + box.c + box_algebra.c + box_alloc.c + box_neighbors.c + communication.c + communication_info.c + computation.c + grow.c + headers.h + hypre_box_smp_forloop.h + project.c + struct_axpy.c + struct_copy.c + struct_grid.c + struct_innerprod.c + struct_io.c + struct_matrix.c + struct_matrix_mask.c + struct_matvec.c + struct_mv.h + struct_scale.c + struct_stencil.c + struct_vector.c + + In the 'test' directory the following files are included: + + Makefile + smg2000.c + + In the 'utilities' directory the following files are included: + + HYPRE_utilities.h + Makefile + general.h + hypre_smp_forloop.h + memory.c + memory.h + mpistubs.c + mpistubs.h + random.c + threading.c + threading.h + timer.c + timing.c + timing.h + utilities.h + version + + %========================================================================== + 
%========================================================================== + + Building the Code + + SMG2000 uses a simple Makefile system for building the code. All + compiler and link options are set by modifying the file + 'smg2000/Makefile.include' appropriately. This file is then included + in each of the following makefiles: + + krylov/Makefile + struct_ls/Makefile + struct_mv/Makefile + test/Makefile + utilities/Makefile + + To build the code, first modify the 'Makefile.include' file + appropriately, then type (in the smg2000 directory) + + make + + Other available targets are + + make clean (deletes .o files) + make veryclean (deletes .o files, libraries, and executables) + + To configure the code to run with: + 1 - OpenMP only, add '-DHYPRE_USING_OPENMP -DHYPRE_SEQUENTIAL' to + the 'INCLUDE_CFLAGS' line in the 'Makefile.include' file and + use a valid OpenMP compiler. + 2 - OpenMP with MPI, add '-DHYPRE_USING_OPENMP -DTIMER_USE_MPI' + to the 'INCLUDE_CFLAGS' line in the 'Makefile.include' file + and use a valid OpenMP compiler and MPI library. + 3 - MPI only, add '-DTIMER_USE_MPI' to the 'INCLUDE_CFLAGS' line + in the 'Makefile.include' file and use a valid MPI library. + + %========================================================================== + %========================================================================== + + Optimization and Improvement Challenges + + This code is memory-access bound. We believe it would be very + difficult to obtain "good" cache reuse with an optimized version of + the code. 
+ + %========================================================================== + %========================================================================== + + Parallelism and Scalability Expectations + + SMG2000 has been run on the following platforms: + + Blue-Pacific - up to 1000 procs + Red - up to 3150 procs + Compaq cluster - up to 64 procs + Sun Sparc Ultra 10's - up to 4 machines + + Consider increasing both problem size and number of processors in tandem. + On scalable architectures, time-to-solution for SMG2000 will initially + increase, then it will level off at a modest number of processors, + remaining roughly constant for larger numbers of processors. Iteration + counts will also increase slightly for small to modest sized problems, + then level off at a roughly constant number for larger problem sizes. + + For example, we get the following results for a 3D problem with + cx = 0.1, cy = 1.0, and cz = 10.0, for a problem distributed on + a logical P x Q x R processor topology, with fixed local problem + size per processor given as 35x35x35: + + "P x Q x R" P "iters" "setup time" "solve time" + 1x1x1 1 6 1.681680 23.255241 + 2x2x2 8 6 3.738600 32.262907 + 3x3x3 27 6 6.601194 41.341892 + 6x6x6 216 7 12.310776 46.672215 + 8x8x8 512 7 18.968893 50.051737 + 10x10x10 1000 7 18.890876 54.094806 + 14x15x15 3150 8 30.635085 62.725305 + + These results were obtained on ASCI Red. + + %========================================================================== + %========================================================================== + + Running the Code + + The driver for SMG2000 is called `smg2000', and is located in the + smg2000/test subdirectory. Type + + mpirun -np 1 smg2000 -help + + to get usage information. 
This prints out the following: + + Usage: .../smg2000/test/smg2000 [] + + -n : problem size per block + -P : processor topology + -b : blocking per processor + -c : diffusion coefficients + -v : number of pre and post relaxations + -d : problem dimension (2 or 3) + -solver : solver ID (default = 0) + 0 - SMG + 1 - CG with SMG precond + 2 - CG with diagonal scaling + 3 - CG + + All of the arguments are optional. The most important options for the + SMG2000 compact application are the `-n' and `-P' options. The `-n' + option allows one to specify the local problem size per MPI process, + and the `-P' option specifies the process topology on which to run. + The global problem size will be * by * by *. + + When running with OpenMP, the number of threads used per MPI process + is controlled via the OMP_NUM_THREADS environment variable. + + %========================================================================== + %========================================================================== + + Timing Issues + + If using MPI, the whole code is timed using the MPI timers. If not using + MPI, standard system timers are used. Timing results are printed to + standard out, and are divided into "Setup Phase" times and "Solve Phase" + times. Timings for a few individual routines are also printed out. + + %========================================================================== + %========================================================================== + + Memory Needed + + SMG2000 is a memory intensive code, and its memory needs are somewhat + complicated to describe. For the 3D problems discussed in this + document, memory requirements are roughly 54 times the local problem + size times the size of a double plus some overhead for storing ghost + points, etc. in the code. The overhead required by this version of + the SMG code grows essentially like the logarithm of the problem size. 
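The rule of thumb above can be turned into a quick estimate. The following is a minimal sketch: the function name `smg_memory_estimate_bytes` is hypothetical and not part of SMG2000, and the ghost-point overhead is deliberately left out because the readme does not quantify it.

```c
#include <stddef.h>

/* Rough per-process estimate in bytes for a 3D run:
   54 * (local problem size) * sizeof(double).
   The ghost-point overhead mentioned in the readme is NOT included,
   so this is a lower bound on the real requirement. */
double smg_memory_estimate_bytes(size_t nx, size_t ny, size_t nz)
{
    return 54.0 * (double)nx * (double)ny * (double)nz * (double)sizeof(double);
}
```

For the 35x35x35 local problem size used in the scalability table, and assuming an 8-byte double, this gives 54 * 42875 * 8 = 18,522,000 bytes, or roughly 18.5 MB per process before overhead.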
+ + %========================================================================== + %========================================================================== + + About the Data + + SMG2000 does not read in any data. All control is on the execute line. + + %========================================================================== + %========================================================================== + + Expected Results + + Consider the following run: + + mpirun -np 1 smg2000 -n 12 12 12 -c 2.0 3.0 40 + + This is what SMG2000 prints out: + + Running with these driver parameters: + (nx, ny, nz) = (12, 12, 12) + (Px, Py, Pz) = (1, 1, 1) + (bx, by, bz) = (1, 1, 1) + (cx, cy, cz) = (2.000000, 3.000000, 40.000000) + (n_pre, n_post) = (1, 1) + dim = 3 + solver ID = 0 + ============================================= + Struct Interface: + ============================================= + Struct Interface: + wall clock time = 0.005627 seconds + cpu clock time = 0.010000 seconds + + ============================================= + Setup phase times: + ============================================= + SMG Setup: + wall clock time = 0.330096 seconds + cpu clock time = 0.330000 seconds + + ============================================= + Solve phase times: + ============================================= + SMG Solve: + wall clock time = 0.686244 seconds + cpu clock time = 0.480000 seconds + + + Iterations = 4 + Final Relative Residual Norm = 8.972097e-07 + + The relative residual norm may differ slightly from machine to machine + or compiler to compiler, but should only differ very slightly (say, + the 6th or 7th decimal place). Also, the code should generate nearly + identical results for a given problem, independent of the data + distribution. The only part of the code that does not guarantee + bitwise identical results is the inner product used to compute norms. + In practice, the above residual norm has remained the same. 
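One way to check a run against the expected output is to compare the final relative residual norms with a relative tolerance rather than for exact equality. The helper below is only a sketch: the name `residual_norms_agree` is hypothetical, and the tolerance to use is an assumption (the readme only says differences should appear around the 6th or 7th decimal place).

```c
/* Return 1 if the observed norm matches the expected norm to within
   rel_tol (relative difference), 0 otherwise. */
int residual_norms_agree(double expected, double observed, double rel_tol)
{
    double diff  = observed - expected;
    double scale = expected;

    if (diff < 0.0)
        diff = -diff;
    if (scale < 0.0)
        scale = -scale;

    return diff <= rel_tol * scale;
}
```

For instance, `residual_norms_agree(8.972097e-07, observed, 1e-5)` would accept a norm that differs from the value printed above only in its last printed digits, and reject one that differs in the second or third digit.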
+ + %========================================================================== + %========================================================================== + + Release and Modification Record + + LLNL code release number: UCRL-CODE-2000-022 + + (c) 2000 The Regents of the University of California + + See the file COPYRIGHT_and_DISCLAIMER for a complete copyright notice, + contact person, and disclaimer. From duraid at octopus.com.au Mon Apr 11 00:22:19 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Mon, 11 Apr 2005 00:22:19 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_config.h HYPRE_pcg.c HYPRE_struct_grid.c HYPRE_struct_ls.h HYPRE_struct_matrix.c HYPRE_struct_mv.h HYPRE_struct_pcg.c HYPRE_struct_smg.c HYPRE_struct_stencil.c HYPRE_struct_vector.c HYPRE_utilities.h LICENSE.txt Makefile box.c box_algebra.c box_alloc.c box_neighbors.c coarsen.c communication.c communication_info.c computation.c cyclic_reduction.c general.c general.h grow.c headers.h hypre_box_smp_forloop.h hypre_smp_forloop.h krylov.h memory.c memory.h mpistubs.c mpistubs.h pcg.c pcg_struct.c point_relax.c project.c random.c semi_interp.c semi_restrict.c smg.c smg.h smg2000.c smg2_setup_rap.c smg3_setup_rap.c smg_axpy.c smg_relax.c smg_residual.c smg_setup.c smg_setup_interp.c smg_setup_rap.c smg_setup_restrict.c smg_solve.c struct_axpy.c struct_copy.c struct_grid.c struct_innerprod.c struct_io.c struct_ls.h struct_matrix.c struct_matrix_mask.c struct_matvec.c struct_mv.h struct_scale.c struct_stencil.c
struct_vector.c threading.c threading.h timer.c timing.c timing.h utilities.h Message-ID: <200504110522.AAA27469@zion.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000: HYPRE_config.h added (r1.1) HYPRE_pcg.c added (r1.1) HYPRE_struct_grid.c added (r1.1) HYPRE_struct_ls.h added (r1.1) HYPRE_struct_matrix.c added (r1.1) HYPRE_struct_mv.h added (r1.1) HYPRE_struct_pcg.c added (r1.1) HYPRE_struct_smg.c added (r1.1) HYPRE_struct_stencil.c added (r1.1) HYPRE_struct_vector.c added (r1.1) HYPRE_utilities.h added (r1.1) LICENSE.txt added (r1.1) Makefile added (r1.1) box.c added (r1.1) box_algebra.c added (r1.1) box_alloc.c added (r1.1) box_neighbors.c added (r1.1) coarsen.c added (r1.1) communication.c added (r1.1) communication_info.c added (r1.1) computation.c added (r1.1) cyclic_reduction.c added (r1.1) general.c added (r1.1) general.h added (r1.1) grow.c added (r1.1) headers.h added (r1.1) hypre_box_smp_forloop.h added (r1.1) hypre_smp_forloop.h added (r1.1) krylov.h added (r1.1) memory.c added (r1.1) memory.h added (r1.1) mpistubs.c added (r1.1) mpistubs.h added (r1.1) pcg.c added (r1.1) pcg_struct.c added (r1.1) point_relax.c added (r1.1) project.c added (r1.1) random.c added (r1.1) semi_interp.c added (r1.1) semi_restrict.c added (r1.1) smg.c added (r1.1) smg.h added (r1.1) smg2000.c added (r1.1) smg2_setup_rap.c added (r1.1) smg3_setup_rap.c added (r1.1) smg_axpy.c added (r1.1) smg_relax.c added (r1.1) smg_residual.c added (r1.1) smg_setup.c added (r1.1) smg_setup_interp.c added (r1.1) smg_setup_rap.c added (r1.1) smg_setup_restrict.c added (r1.1) smg_solve.c added (r1.1) struct_axpy.c added (r1.1) struct_copy.c added (r1.1) struct_grid.c added (r1.1) struct_innerprod.c added (r1.1) struct_io.c added (r1.1) struct_ls.h added (r1.1) struct_matrix.c added (r1.1) struct_matrix_mask.c added (r1.1) struct_matvec.c added (r1.1) struct_mv.h added (r1.1) struct_scale.c added (r1.1) struct_stencil.c added (r1.1) struct_vector.c added 
(r1.1) threading.c added (r1.1) threading.h added (r1.1) timer.c added (r1.1) timing.c added (r1.1) timing.h added (r1.1) utilities.h added (r1.1) --- Log message: * add the SMG2000 benchmark. This one is brutal on memory, anything you can do in terms of prefetching/unrolling/SWP will probably help. --- Diffs of the changes: (+28313 -0) HYPRE_config.h | 2 HYPRE_pcg.c | 187 ++++ HYPRE_struct_grid.c | 97 ++ HYPRE_struct_ls.h | 671 +++++++++++++++ HYPRE_struct_matrix.c | 247 +++++ HYPRE_struct_mv.h | 344 ++++++++ HYPRE_struct_pcg.c | 473 +++++++++++ HYPRE_struct_smg.c | 191 ++++ HYPRE_struct_stencil.c | 68 + HYPRE_struct_vector.c | 312 +++++++ HYPRE_utilities.h | 60 + LICENSE.txt | 43 + Makefile | 15 box.c | 437 ++++++++++ box_algebra.c | 397 +++++++++ box_alloc.c | 143 +++ box_neighbors.c | 344 ++++++++ coarsen.c | 832 +++++++++++++++++++ communication.c | 1569 ++++++++++++++++++++++++++++++++++++ communication_info.c | 701 ++++++++++++++++ computation.c | 405 +++++++++ cyclic_reduction.c | 1215 ++++++++++++++++++++++++++++ general.c | 38 general.h | 33 grow.c | 96 ++ headers.h | 15 hypre_box_smp_forloop.h | 20 hypre_smp_forloop.h | 52 + krylov.h | 851 +++++++++++++++++++ memory.c | 311 +++++++ memory.h | 125 ++ mpistubs.c | 496 +++++++++++ mpistubs.h | 192 ++++ pcg.c | 658 +++++++++++++++ pcg_struct.c | 245 +++++ point_relax.c | 779 ++++++++++++++++++ project.c | 118 ++ random.c | 49 + semi_interp.c | 333 +++++++ semi_restrict.c | 301 +++++++ smg.c | 423 +++++++++ smg.h | 114 ++ smg2000.c | 638 ++++++++++++++ smg2_setup_rap.c | 985 +++++++++++++++++++++++ smg3_setup_rap.c | 2044 ++++++++++++++++++++++++++++++++++++++++++++++++ smg_axpy.c | 76 + smg_relax.c | 989 +++++++++++++++++++++++ smg_residual.c | 356 ++++++++ smg_setup.c | 435 ++++++++++ smg_setup_interp.c | 315 +++++++ smg_setup_rap.c | 135 +++ smg_setup_restrict.c | 46 + smg_solve.c | 327 +++++++ struct_axpy.c | 75 + struct_copy.c | 75 + struct_grid.c | 667 +++++++++++++++ struct_innerprod.c | 117 ++ 
struct_io.c | 154 +++ struct_ls.h | 431 ++++++++++ struct_matrix.c | 928 +++++++++++++++++++++ struct_matrix_mask.c | 124 ++ struct_matvec.c | 610 ++++++++++++++ struct_mv.h | 1685 +++++++++++++++++++++++++++++++++++++++ struct_scale.c | 66 + struct_stencil.c | 212 ++++ struct_vector.c | 921 +++++++++++++++++++++ threading.c | 263 ++++++ threading.h | 81 + timer.c | 45 + timing.c | 626 ++++++++++++++ timing.h | 130 +++ utilities.h | 755 +++++++++++++++++ 72 files changed, 28313 insertions(+) Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_config.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_config.h:1.1 *** /dev/null Mon Apr 11 00:22:17 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_config.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,2 ---- + + /* All configuration for SMG2000 is done in the Makefile */ Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_pcg.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_pcg.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_pcg.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,187 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * HYPRE_PCG interface + * + *****************************************************************************/ + #include "krylov.h" + + /*-------------------------------------------------------------------------- + * HYPRE_PCGCreate does not exist. 
Call the appropriate function which + * also specifies the vector type, e.g. HYPRE_ParCSRPCGCreate + *--------------------------------------------------------------------------*/ + + /*-------------------------------------------------------------------------- + * HYPRE_PCGDestroy + *--------------------------------------------------------------------------*/ + + /* + int + HYPRE_PCGDestroy( HYPRE_Solver solver )*/ + /* >>> This is something we can't do without knowing the vector_type. + We can't save it in and pull it out of solver because that isn't + really a known struct. */ + /* + { + if ( vector_type=="ParCSR" ) { + return HYPRE_ParCSRPCGDestroy( HYPRE_Solver solver ); + } + else { + return 0; + } + }*/ + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSetup + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSetup( HYPRE_Solver solver, + HYPRE_Matrix A, + HYPRE_Vector b, + HYPRE_Vector x ) + { + return( hypre_PCGSetup( solver, + A, + b, + x ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSolve + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSolve( HYPRE_Solver solver, + HYPRE_Matrix A, + HYPRE_Vector b, + HYPRE_Vector x ) + { + return( hypre_PCGSolve( (void *) solver, + (void *) A, + (void *) b, + (void *) x ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSetTol + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSetTol( HYPRE_Solver solver, + double tol ) + { + return( hypre_PCGSetTol( (void *) solver, tol ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSetMaxIter + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSetMaxIter( HYPRE_Solver solver, + int max_iter ) 
+ { + return( hypre_PCGSetMaxIter( (void *) solver, max_iter ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSetStopCrit + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSetStopCrit( HYPRE_Solver solver, + int stop_crit ) + { + return( hypre_PCGSetStopCrit( (void *) solver, stop_crit ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSetTwoNorm + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSetTwoNorm( HYPRE_Solver solver, + int two_norm ) + { + return( hypre_PCGSetTwoNorm( (void *) solver, two_norm ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSetRelChange + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSetRelChange( HYPRE_Solver solver, + int rel_change ) + { + return( hypre_PCGSetRelChange( (void *) solver, rel_change ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSetPrecond + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSetPrecond( HYPRE_Solver solver, + HYPRE_PtrToSolverFcn precond, + HYPRE_PtrToSolverFcn precond_setup, + HYPRE_Solver precond_solver ) + { + return( hypre_PCGSetPrecond( (void *) solver, + precond, precond_setup, + (void *) precond_solver ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGGetPrecond + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGGetPrecond( HYPRE_Solver solver, + HYPRE_Solver *precond_data_ptr ) + { + return( hypre_PCGGetPrecond( (void *) solver, + (HYPRE_Solver *) precond_data_ptr ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGSetLogging + 
*--------------------------------------------------------------------------*/ + + int + HYPRE_PCGSetLogging( HYPRE_Solver solver, + int logging ) + { + return( hypre_PCGSetLogging( (void *) solver, logging ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGGetNumIterations + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGGetNumIterations( HYPRE_Solver solver, + int *num_iterations ) + { + return( hypre_PCGGetNumIterations( (void *) solver, num_iterations ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_PCGGetFinalRelativeResidualNorm + *--------------------------------------------------------------------------*/ + + int + HYPRE_PCGGetFinalRelativeResidualNorm( HYPRE_Solver solver, + double *norm ) + { + return( hypre_PCGGetFinalRelativeResidualNorm( (void *) solver, norm ) ); + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_grid.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_grid.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_grid.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,97 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * HYPRE_StructGrid interface + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * HYPRE_StructGridCreate + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructGridCreate( MPI_Comm comm, + int dim, + HYPRE_StructGrid *grid ) + { + int ierr; + + ierr = hypre_StructGridCreate(comm, dim, grid); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructGridDestroy + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructGridDestroy( HYPRE_StructGrid grid ) + { + return ( hypre_StructGridDestroy(grid) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructGridSetExtents + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructGridSetExtents( HYPRE_StructGrid grid, + int *ilower, + int *iupper ) + { + hypre_Index new_ilower; + hypre_Index new_iupper; + + int d; + + hypre_ClearIndex(new_ilower); + hypre_ClearIndex(new_iupper); + for (d = 0; d < hypre_StructGridDim((hypre_StructGrid *) grid); d++) + { + hypre_IndexD(new_ilower, d) = ilower[d]; + hypre_IndexD(new_iupper, d) = iupper[d]; + } + + return ( hypre_StructGridSetExtents(grid, new_ilower, new_iupper) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_SetStructGridPeriodicity + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructGridSetPeriodic( HYPRE_StructGrid grid, + int *periodic ) + { + hypre_Index new_periodic; + + int d; + + 
hypre_ClearIndex(new_periodic); + for (d = 0; d < hypre_StructGridDim(grid); d++) + { + hypre_IndexD(new_periodic, d) = periodic[d]; + } + + return ( hypre_StructGridSetPeriodic(grid, new_periodic) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructGridAssemble + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructGridAssemble( HYPRE_StructGrid grid ) + { + return ( hypre_StructGridAssemble(grid) ); + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_ls.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_ls.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_ls.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,671 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header file for HYPRE_ls library + * + *****************************************************************************/ + + #ifndef HYPRE_STRUCT_LS_HEADER + #define HYPRE_STRUCT_LS_HEADER + + #include "HYPRE_config.h" + #include "HYPRE_utilities.h" + #include "HYPRE_struct_mv.h" + + #ifdef __cplusplus + extern "C" { + #endif + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct Solvers + * + * These solvers use matrix/vector storage schemes that are tailored + * to structured grid problems. 
+ * + * @memo Linear solvers for structured grids + **/ + /*@{*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct Solvers + **/ + /*@{*/ + + struct hypre_StructSolver_struct; + /** + * The solver object. + **/ + typedef struct hypre_StructSolver_struct *HYPRE_StructSolver; + + typedef int (*HYPRE_PtrToStructSolverFcn)(HYPRE_StructSolver, + HYPRE_StructMatrix, + HYPRE_StructVector, + HYPRE_StructVector); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct Jacobi Solver + **/ + /*@{*/ + + /** + * Create a solver object. + **/ + int HYPRE_StructJacobiCreate(MPI_Comm comm, + HYPRE_StructSolver *solver); + + /** + * Destroy a solver object. An object should be explicitly destroyed + * using this destructor when the user's code no longer needs direct + * access to it. Once destroyed, the object must not be referenced + * again. Note that the object may not be deallocated at the + * completion of this call, since there may be internal package + * references to the object. The object will then be destroyed when + * all internal reference counts go to zero. + **/ + int HYPRE_StructJacobiDestroy(HYPRE_StructSolver solver); + + /** + **/ + int HYPRE_StructJacobiSetup(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + /** + * Solve the system. + **/ + int HYPRE_StructJacobiSolve(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + /** + * (Optional) Set the convergence tolerance. + **/ + int HYPRE_StructJacobiSetTol(HYPRE_StructSolver solver, + double tol); + + /** + * (Optional) Set maximum number of iterations. 
+ **/ + int HYPRE_StructJacobiSetMaxIter(HYPRE_StructSolver solver, + int max_iter); + + /** + * (Optional) Use a zero initial guess. + **/ + int HYPRE_StructJacobiSetZeroGuess(HYPRE_StructSolver solver); + + /** + * (Optional) Use a nonzero initial guess. + **/ + int HYPRE_StructJacobiSetNonZeroGuess(HYPRE_StructSolver solver); + + /** + * Return the number of iterations taken. + **/ + int HYPRE_StructJacobiGetNumIterations(HYPRE_StructSolver solver, + int *num_iterations); + + /** + * Return the norm of the final relative residual. + **/ + int HYPRE_StructJacobiGetFinalRelativeResidualNorm(HYPRE_StructSolver solver, + double *norm); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct PFMG Solver + **/ + /*@{*/ + + /** + * Create a solver object. + **/ + int HYPRE_StructPFMGCreate(MPI_Comm comm, + HYPRE_StructSolver *solver); + + /** + * Destroy a solver object. + **/ + int HYPRE_StructPFMGDestroy(HYPRE_StructSolver solver); + + /** + **/ + int HYPRE_StructPFMGSetup(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + /** + * Solve the system. + **/ + int HYPRE_StructPFMGSolve(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + /** + * (Optional) Set the convergence tolerance. + **/ + int HYPRE_StructPFMGSetTol(HYPRE_StructSolver solver, + double tol); + + /** + * (Optional) Set maximum number of iterations. + **/ + int HYPRE_StructPFMGSetMaxIter(HYPRE_StructSolver solver, + int max_iter); + + /** + * (Optional) Additionally require that the relative difference in + * successive iterates be small. + **/ + int HYPRE_StructPFMGSetRelChange(HYPRE_StructSolver solver, + int rel_change); + + /** + * (Optional) Use a zero initial guess. 
+ **/ + int HYPRE_StructPFMGSetZeroGuess(HYPRE_StructSolver solver); + + /** + * (Optional) Use a nonzero initial guess. + **/ + int HYPRE_StructPFMGSetNonZeroGuess(HYPRE_StructSolver solver); + + /** + * (Optional) Set relaxation type. + **/ + int HYPRE_StructPFMGSetRelaxType(HYPRE_StructSolver solver, + int relax_type); + + /** + * (Optional) Set number of pre-relaxation sweeps. + **/ + int HYPRE_StructPFMGSetNumPreRelax(HYPRE_StructSolver solver, + int num_pre_relax); + + /** + * (Optional) Set number of post-relaxation sweeps. + **/ + int HYPRE_StructPFMGSetNumPostRelax(HYPRE_StructSolver solver, + int num_post_relax); + + /** + * (Optional) Skip relaxation on certain grids for isotropic problems. + **/ + int HYPRE_StructPFMGSetSkipRelax(HYPRE_StructSolver solver, + int skip_relax); + + /* + * RE-VISIT + **/ + int HYPRE_StructPFMGSetDxyz(HYPRE_StructSolver solver, + double *dxyz); + + /** + * (Optional) Set the amount of logging to do. + **/ + int HYPRE_StructPFMGSetLogging(HYPRE_StructSolver solver, + int logging); + + /** + * Return the number of iterations taken. + **/ + int HYPRE_StructPFMGGetNumIterations(HYPRE_StructSolver solver, + int *num_iterations); + + /** + * Return the norm of the final relative residual. + **/ + int HYPRE_StructPFMGGetFinalRelativeResidualNorm(HYPRE_StructSolver solver, + double *norm); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct SMG Solver + **/ + /*@{*/ + + /** + * Create a solver object. + **/ + int HYPRE_StructSMGCreate(MPI_Comm comm, + HYPRE_StructSolver *solver); + + /** + * Destroy a solver object. + **/ + int HYPRE_StructSMGDestroy(HYPRE_StructSolver solver); + + /** + **/ + int HYPRE_StructSMGSetup(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + /** + * Solve the system. 
+ **/ + int HYPRE_StructSMGSolve(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + /* + * RE-VISIT + **/ + int HYPRE_StructSMGSetMemoryUse(HYPRE_StructSolver solver, + int memory_use); + + /** + * (Optional) Set the convergence tolerance. + **/ + int HYPRE_StructSMGSetTol(HYPRE_StructSolver solver, + double tol); + + /** + * (Optional) Set maximum number of iterations. + **/ + int HYPRE_StructSMGSetMaxIter(HYPRE_StructSolver solver, + int max_iter); + + /** + * (Optional) Additionally require that the relative difference in + * successive iterates be small. + **/ + int HYPRE_StructSMGSetRelChange(HYPRE_StructSolver solver, + int rel_change); + + /** + * (Optional) Use a zero initial guess. + **/ + int HYPRE_StructSMGSetZeroGuess(HYPRE_StructSolver solver); + + /** + * (Optional) Use a nonzero initial guess. + **/ + int HYPRE_StructSMGSetNonZeroGuess(HYPRE_StructSolver solver); + + /** + * (Optional) Set number of pre-relaxation sweeps. + **/ + int HYPRE_StructSMGSetNumPreRelax(HYPRE_StructSolver solver, + int num_pre_relax); + + /** + * (Optional) Set number of post-relaxation sweeps. + **/ + int HYPRE_StructSMGSetNumPostRelax(HYPRE_StructSolver solver, + int num_post_relax); + + /** + * (Optional) Set the amount of logging to do. + **/ + int HYPRE_StructSMGSetLogging(HYPRE_StructSolver solver, + int logging); + + /** + * Return the number of iterations taken. + **/ + int HYPRE_StructSMGGetNumIterations(HYPRE_StructSolver solver, + int *num_iterations); + + /** + * Return the norm of the final relative residual. + **/ + int HYPRE_StructSMGGetFinalRelativeResidualNorm(HYPRE_StructSolver solver, + double *norm); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct PCG Solver + **/ + /*@{*/ + + /** + * Create a solver object. 
+ **/ + int HYPRE_StructPCGCreate(MPI_Comm comm, + HYPRE_StructSolver *solver); + + /** + * Destroy a solver object. + **/ + int HYPRE_StructPCGDestroy(HYPRE_StructSolver solver); + + /** + **/ + int HYPRE_StructPCGSetup(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + /** + * Solve the system. + **/ + int HYPRE_StructPCGSolve(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + /** + * (Optional) Set the convergence tolerance. + **/ + int HYPRE_StructPCGSetTol(HYPRE_StructSolver solver, + double tol); + + /** + * (Optional) Set maximum number of iterations. + **/ + int HYPRE_StructPCGSetMaxIter(HYPRE_StructSolver solver, + int max_iter); + + /** + * (Optional) Use the two-norm in stopping criteria. + **/ + int HYPRE_StructPCGSetTwoNorm(HYPRE_StructSolver solver, + int two_norm); + + /** + * (Optional) Additionally require that the relative difference in + * successive iterates be small. + **/ + int HYPRE_StructPCGSetRelChange(HYPRE_StructSolver solver, + int rel_change); + + /** + * (Optional) Set the preconditioner to use. + **/ + int HYPRE_StructPCGSetPrecond(HYPRE_StructSolver solver, + HYPRE_PtrToStructSolverFcn precond, + HYPRE_PtrToStructSolverFcn precond_setup, + HYPRE_StructSolver precond_solver); + + /** + * (Optional) Set the amount of logging to do. + **/ + int HYPRE_StructPCGSetLogging(HYPRE_StructSolver solver, + int logging); + + /** + * Return the number of iterations taken. + **/ + int HYPRE_StructPCGGetNumIterations(HYPRE_StructSolver solver, + int *num_iterations); + + /** + * Return the norm of the final relative residual. + **/ + int HYPRE_StructPCGGetFinalRelativeResidualNorm(HYPRE_StructSolver solver, + double *norm); + + /** + * Setup routine for diagonal preconditioning. 
+ **/ + int HYPRE_StructDiagScaleSetup(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector y, + HYPRE_StructVector x); + + /** + * Solve routine for diagonal preconditioning. + **/ + int HYPRE_StructDiagScale(HYPRE_StructSolver solver, + HYPRE_StructMatrix HA, + HYPRE_StructVector Hy, + HYPRE_StructVector Hx); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /* + * @name Struct GMRES Solver + **/ + /*@{*/ + + /** + * Create a solver object. + **/ + int + HYPRE_StructGMRESCreate( MPI_Comm comm, HYPRE_StructSolver *solver ); + + + /** + * Destroy a solver object. + **/ + int + HYPRE_StructGMRESDestroy( HYPRE_StructSolver solver ); + + + /** + * set up + **/ + int + HYPRE_StructGMRESSetup( HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x ); + + + /** + * Solve the system. + **/ + int + HYPRE_StructGMRESSolve( HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x ); + + + /** + * (Optional) Set the convergence tolerance. + **/ + int + HYPRE_StructGMRESSetTol( HYPRE_StructSolver solver, + double tol ); + + /** + * (Optional) Set maximum number of iterations. + **/ + int + HYPRE_StructGMRESSetMaxIter( HYPRE_StructSolver solver, + int max_iter ); + + + /** + * (Optional) Set the preconditioner to use. + **/ + int + HYPRE_StructGMRESSetPrecond( HYPRE_StructSolver solver, + HYPRE_PtrToStructSolverFcn precond, + HYPRE_PtrToStructSolverFcn precond_setup, + HYPRE_StructSolver precond_solver ); + + /** + * (Optional) Set the amount of logging to do. + **/ + int + HYPRE_StructGMRESSetLogging( HYPRE_StructSolver solver, + int logging ); + + /** + * Return the number of iterations taken. + **/ + int + HYPRE_StructGMRESGetNumIterations( HYPRE_StructSolver solver, + int *num_iterations ); + + /** + * Return the norm of the final relative residual. 
+ **/ + int + HYPRE_StructGMRESGetFinalRelativeResidualNorm( HYPRE_StructSolver solver, + double *norm ); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /* + * @name Struct SparseMSG Solver + **/ + + int HYPRE_StructSparseMSGCreate(MPI_Comm comm, + HYPRE_StructSolver *solver); + + int HYPRE_StructSparseMSGDestroy(HYPRE_StructSolver solver); + + int HYPRE_StructSparseMSGSetup(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + int HYPRE_StructSparseMSGSolve(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + int HYPRE_StructSparseMSGSetTol(HYPRE_StructSolver solver, + double tol); + + int HYPRE_StructSparseMSGSetMaxIter(HYPRE_StructSolver solver, + int max_iter); + + int HYPRE_StructSparseMSGSetJump(HYPRE_StructSolver solver, + int jump); + + int HYPRE_StructSparseMSGSetRelChange(HYPRE_StructSolver solver, + int rel_change); + + int HYPRE_StructSparseMSGSetZeroGuess(HYPRE_StructSolver solver); + + int HYPRE_StructSparseMSGSetNonZeroGuess(HYPRE_StructSolver solver); + + int HYPRE_StructSparseMSGSetRelaxType(HYPRE_StructSolver solver, + int relax_type); + + int HYPRE_StructSparseMSGSetNumPreRelax(HYPRE_StructSolver solver, + int num_pre_relax); + + int HYPRE_StructSparseMSGSetNumPostRelax(HYPRE_StructSolver solver, + int num_post_relax); + + int HYPRE_StructSparseMSGSetNumFineRelax(HYPRE_StructSolver solver, + int num_fine_relax); + + int HYPRE_StructSparseMSGSetLogging(HYPRE_StructSolver solver, + int logging); + + int HYPRE_StructSparseMSGGetNumIterations(HYPRE_StructSolver solver, + int *num_iterations); + + int HYPRE_StructSparseMSGGetFinalRelativeResidualNorm(HYPRE_StructSolver solver, + double *norm); + + /*-------------------------------------------------------------------------- + 
*--------------------------------------------------------------------------*/ + + /* + * @name Struct Hybrid Solver + **/ + + int HYPRE_StructHybridCreate(MPI_Comm comm, + HYPRE_StructSolver *solver); + + int HYPRE_StructHybridDestroy(HYPRE_StructSolver solver); + + int HYPRE_StructHybridSetup(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + int HYPRE_StructHybridSolve(HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x); + + int HYPRE_StructHybridSetTol(HYPRE_StructSolver solver, + double tol); + + int HYPRE_StructHybridSetConvergenceTol(HYPRE_StructSolver solver, + double cf_tol); + + int HYPRE_StructHybridSetDSCGMaxIter(HYPRE_StructSolver solver, + int dscg_max_its); + + int HYPRE_StructHybridSetPCGMaxIter(HYPRE_StructSolver solver, + int pcg_max_its); + + int HYPRE_StructHybridSetTwoNorm(HYPRE_StructSolver solver, + int two_norm); + + int HYPRE_StructHybridSetRelChange(HYPRE_StructSolver solver, + int rel_change); + + int HYPRE_StructHybridSetPrecond(HYPRE_StructSolver solver, + HYPRE_PtrToStructSolverFcn precond, + HYPRE_PtrToStructSolverFcn precond_setup, + HYPRE_StructSolver precond_solver); + + int HYPRE_StructHybridSetLogging(HYPRE_StructSolver solver, + int logging); + + int HYPRE_StructHybridGetNumIterations(HYPRE_StructSolver solver, + int *num_its); + + int HYPRE_StructHybridGetDSCGNumIterations(HYPRE_StructSolver solver, + int *dscg_num_its); + + int HYPRE_StructHybridGetPCGNumIterations(HYPRE_StructSolver solver, + int *pcg_num_its); + + int HYPRE_StructHybridGetFinalRelativeResidualNorm(HYPRE_StructSolver solver, + double *norm); + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /*@}*/ + + #ifdef __cplusplus + } + #endif + + #endif + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_matrix.c diff -c /dev/null 
llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_matrix.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_matrix.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,247 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * HYPRE_StructMatrix interface + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixCreate + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixCreate( MPI_Comm comm, + HYPRE_StructGrid grid, + HYPRE_StructStencil stencil, + HYPRE_StructMatrix *matrix ) + { + *matrix = hypre_StructMatrixCreate(comm, grid, stencil); + + return 0; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixDestroy + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixDestroy( HYPRE_StructMatrix matrix ) + { + return( hypre_StructMatrixDestroy(matrix) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixInitialize + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixInitialize( HYPRE_StructMatrix matrix ) + { + return ( hypre_StructMatrixInitialize(matrix) ); + } + + /*-------------------------------------------------------------------------- + * 
HYPRE_StructMatrixSetValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixSetValues( HYPRE_StructMatrix matrix, + int *grid_index, + int num_stencil_indices, + int *stencil_indices, + double *values ) + { + hypre_Index new_grid_index; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_grid_index); + for (d = 0; d < hypre_StructGridDim(hypre_StructMatrixGrid(matrix)); d++) + { + hypre_IndexD(new_grid_index, d) = grid_index[d]; + } + + ierr = hypre_StructMatrixSetValues(matrix, new_grid_index, + num_stencil_indices, stencil_indices, + values, 0); + + return (ierr); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixSetBoxValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixSetBoxValues( HYPRE_StructMatrix matrix, + int *ilower, + int *iupper, + int num_stencil_indices, + int *stencil_indices, + double *values ) + { + hypre_Index new_ilower; + hypre_Index new_iupper; + hypre_Box *new_value_box; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_ilower); + hypre_ClearIndex(new_iupper); + for (d = 0; d < hypre_StructGridDim(hypre_StructMatrixGrid(matrix)); d++) + { + hypre_IndexD(new_ilower, d) = ilower[d]; + hypre_IndexD(new_iupper, d) = iupper[d]; + } + new_value_box = hypre_BoxCreate(); + hypre_BoxSetExtents(new_value_box, new_ilower, new_iupper); + + ierr = hypre_StructMatrixSetBoxValues(matrix, new_value_box, + num_stencil_indices, stencil_indices, + values, 0); + + hypre_BoxDestroy(new_value_box); + + return (ierr); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixAddToValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixAddToValues( HYPRE_StructMatrix matrix, + int *grid_index, + int num_stencil_indices, + int *stencil_indices, + double *values ) + { + hypre_Index 
new_grid_index; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_grid_index); + for (d = 0; d < hypre_StructGridDim(hypre_StructMatrixGrid(matrix)); d++) + { + hypre_IndexD(new_grid_index, d) = grid_index[d]; + } + + ierr = hypre_StructMatrixSetValues(matrix, new_grid_index, + num_stencil_indices, stencil_indices, + values, 1); + + return (ierr); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixAddToBoxValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixAddToBoxValues( HYPRE_StructMatrix matrix, + int *ilower, + int *iupper, + int num_stencil_indices, + int *stencil_indices, + double *values ) + { + hypre_Index new_ilower; + hypre_Index new_iupper; + hypre_Box *new_value_box; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_ilower); + hypre_ClearIndex(new_iupper); + for (d = 0; d < hypre_StructGridDim(hypre_StructMatrixGrid(matrix)); d++) + { + hypre_IndexD(new_ilower, d) = ilower[d]; + hypre_IndexD(new_iupper, d) = iupper[d]; + } + new_value_box = hypre_BoxCreate(); + hypre_BoxSetExtents(new_value_box, new_ilower, new_iupper); + + ierr = hypre_StructMatrixSetBoxValues(matrix, new_value_box, + num_stencil_indices, stencil_indices, + values, 1); + + hypre_BoxDestroy(new_value_box); + + return (ierr); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixAssemble + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixAssemble( HYPRE_StructMatrix matrix ) + { + return( hypre_StructMatrixAssemble(matrix) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixSetNumGhost + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixSetNumGhost( HYPRE_StructMatrix matrix, + int *num_ghost ) + { + return ( hypre_StructMatrixSetNumGhost(matrix, 
num_ghost) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixGetGrid + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixGetGrid( HYPRE_StructMatrix matrix, HYPRE_StructGrid *grid ) + { + int ierr = 0; + + *grid = hypre_StructMatrixGrid(matrix); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixSetSymmetric + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixSetSymmetric( HYPRE_StructMatrix matrix, + int symmetric ) + { + int ierr = 0; + + hypre_StructMatrixSymmetric(matrix) = symmetric; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructMatrixPrint + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructMatrixPrint( char *filename, + HYPRE_StructMatrix matrix, + int all ) + { + return ( hypre_StructMatrixPrint(filename, matrix, all) ); + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_mv.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_mv.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_mv.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,344 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #ifndef HYPRE_STRUCT_MV_HEADER + #define HYPRE_STRUCT_MV_HEADER + + #include "HYPRE_config.h" + #include "HYPRE_utilities.h" + + #ifdef __cplusplus + extern "C" { + #endif + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct System Interface + * + * This interface represents a structured-grid conceptual view of a + * linear system. + * + * @memo A structured-grid conceptual interface + * @author Robert D. Falgout + **/ + /*@{*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct Grids + **/ + /*@{*/ + + struct hypre_StructGrid_struct; + /** + * A grid object is constructed out of several ``boxes'', defined on a + * global abstract index space. + **/ + typedef struct hypre_StructGrid_struct *HYPRE_StructGrid; + + /** + * Create an {\tt ndim}-dimensional grid object. + **/ + int HYPRE_StructGridCreate(MPI_Comm comm, + int ndim, + HYPRE_StructGrid *grid); + + /** + * Destroy a grid object. An object should be explicitly destroyed + * using this destructor when the user's code no longer needs direct + * access to it. Once destroyed, the object must not be referenced + * again. Note that the object may not be deallocated at the + * completion of this call, since there may be internal package + * references to the object. The object will then be destroyed when + * all internal reference counts go to zero. + **/ + int HYPRE_StructGridDestroy(HYPRE_StructGrid grid); + + /** + * Set the extents for a box on the grid. + **/ + int HYPRE_StructGridSetExtents(HYPRE_StructGrid grid, + int *ilower, + int *iupper); + + /** + * Finalize the construction of the grid before using. 
+ **/ + int HYPRE_StructGridAssemble(HYPRE_StructGrid grid); + + /** + * (Optional) Set periodic. + **/ + int HYPRE_StructGridSetPeriodic(HYPRE_StructGrid grid, + int *periodic); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct Stencils + **/ + /*@{*/ + + struct hypre_StructStencil_struct; + /** + * The stencil object. + **/ + typedef struct hypre_StructStencil_struct *HYPRE_StructStencil; + + /** + * Create a stencil object for the specified number of spatial dimensions + * and stencil entries. + **/ + int HYPRE_StructStencilCreate(int ndim, + int size, + HYPRE_StructStencil *stencil); + + /** + * Destroy a stencil object. + **/ + int HYPRE_StructStencilDestroy(HYPRE_StructStencil stencil); + + /** + * Set a stencil entry. + * + * NOTE: The name of this routine will eventually be changed to + * {\tt HYPRE\_StructStencilSetEntry}. + **/ + int HYPRE_StructStencilSetElement(HYPRE_StructStencil stencil, + int entry, + int *offset); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct Matrices + **/ + /*@{*/ + + struct hypre_StructMatrix_struct; + /** + * The matrix object. + **/ + typedef struct hypre_StructMatrix_struct *HYPRE_StructMatrix; + + /** + * Create a matrix object. + **/ + int HYPRE_StructMatrixCreate(MPI_Comm comm, + HYPRE_StructGrid grid, + HYPRE_StructStencil stencil, + HYPRE_StructMatrix *matrix); + + /** + * Destroy a matrix object. + **/ + int HYPRE_StructMatrixDestroy(HYPRE_StructMatrix matrix); + + /** + * Prepare a matrix object for setting coefficient values. + **/ + int HYPRE_StructMatrixInitialize(HYPRE_StructMatrix matrix); + + /** + * Set matrix coefficients index by index. 
+ **/ + int HYPRE_StructMatrixSetValues(HYPRE_StructMatrix matrix, + int *index, + int nentries, + int *entries, + double *values); + + /** + * Set matrix coefficients a box at a time. + **/ + int HYPRE_StructMatrixSetBoxValues(HYPRE_StructMatrix matrix, + int *ilower, + int *iupper, + int nentries, + int *entries, + double *values); + /** + * Add to matrix coefficients index by index. + **/ + int HYPRE_StructMatrixAddToValues(HYPRE_StructMatrix matrix, + int *index, + int nentries, + int *entries, + double *values); + + /** + * Add to matrix coefficients a box at a time. + **/ + int HYPRE_StructMatrixAddToBoxValues(HYPRE_StructMatrix matrix, + int *ilower, + int *iupper, + int nentries, + int *entries, + double *values); + + /** + * Finalize the construction of the matrix before using. + **/ + int HYPRE_StructMatrixAssemble(HYPRE_StructMatrix matrix); + + /** + * (Optional) Define symmetry properties of the matrix. By default, + * matrices are assumed to be nonsymmetric. Significant storage + * savings can be made if the matrix is symmetric. + **/ + int HYPRE_StructMatrixSetSymmetric(HYPRE_StructMatrix matrix, + int symmetric); + + /** + * Print the matrix to file. This is mainly for debugging purposes. + **/ + int HYPRE_StructMatrixPrint(char *filename, + HYPRE_StructMatrix matrix, + int all); + + /*@}*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Struct Vectors + **/ + /*@{*/ + + struct hypre_StructVector_struct; + /** + * The vector object. + **/ + typedef struct hypre_StructVector_struct *HYPRE_StructVector; + + /** + * Create a vector object. + **/ + int HYPRE_StructVectorCreate(MPI_Comm comm, + HYPRE_StructGrid grid, + HYPRE_StructVector *vector); + + /** + * Destroy a vector object. + **/ + int HYPRE_StructVectorDestroy(HYPRE_StructVector vector); + + /** + * Prepare a vector object for setting coefficient values. 
+ **/ + int HYPRE_StructVectorInitialize(HYPRE_StructVector vector); + + /** + * Set vector coefficients index by index. + **/ + int HYPRE_StructVectorSetValues(HYPRE_StructVector vector, + int *index, + double value); + + /** + * Set vector coefficients a box at a time. + **/ + int HYPRE_StructVectorSetBoxValues(HYPRE_StructVector vector, + int *ilower, + int *iupper, + double *values); + /** + * Set vector coefficients index by index. + **/ + int HYPRE_StructVectorAddToValues(HYPRE_StructVector vector, + int *index, + double value); + + /** + * Set vector coefficients a box at a time. + **/ + int HYPRE_StructVectorAddToBoxValues(HYPRE_StructVector vector, + int *ilower, + int *iupper, + double *values); + + /** + * Finalize the construction of the vector before using. + **/ + int HYPRE_StructVectorAssemble(HYPRE_StructVector vector); + + /** + * Get vector coefficients index by index. + **/ + int HYPRE_StructVectorGetValues(HYPRE_StructVector vector, + int *index, + double *value); + + /** + * Get vector coefficients a box at a time. + **/ + int HYPRE_StructVectorGetBoxValues(HYPRE_StructVector vector, + int *ilower, + int *iupper, + double *values); + + /** + * Print the vector to file. This is mainly for debugging purposes. + **/ + int HYPRE_StructVectorPrint(char *filename, + HYPRE_StructVector vector, + int all); + + /*@}*/ + /*@}*/ + + /*-------------------------------------------------------------------------- + * Miscellaneous: These probably do not belong in the interface. 
+ *--------------------------------------------------------------------------*/ + + int HYPRE_StructMatrixSetNumGhost(HYPRE_StructMatrix matrix, + int *num_ghost); + + int HYPRE_StructMatrixGetGrid(HYPRE_StructMatrix matrix, + HYPRE_StructGrid *grid); + + struct hypre_CommPkg_struct; + typedef struct hypre_CommPkg_struct *HYPRE_CommPkg; + + int HYPRE_StructVectorSetNumGhost(HYPRE_StructVector vector, + int *num_ghost); + + int HYPRE_StructVectorSetConstantValues(HYPRE_StructVector vector, + double values); + + int HYPRE_StructVectorGetMigrateCommPkg(HYPRE_StructVector from_vector, + HYPRE_StructVector to_vector, + HYPRE_CommPkg *comm_pkg); + + int HYPRE_StructVectorMigrate(HYPRE_CommPkg comm_pkg, + HYPRE_StructVector from_vector, + HYPRE_StructVector to_vector); + + int HYPRE_CommPkgDestroy(HYPRE_CommPkg comm_pkg); + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + #ifdef __cplusplus + } + #endif + + #endif + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_pcg.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_pcg.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_pcg.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,473 ---- + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #include "headers.h" + + /*==========================================================================*/ + /** Creates a new PCG solver object. + + {\bf Input files:} + headers.h + + @return Error code. 
+ + @param comm [IN] + MPI communicator + @param solver [OUT] + solver structure + + @see HYPRE_StructPCGDestroy */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGCreate( MPI_Comm comm, HYPRE_StructSolver *solver ) + { + /* The function names with a PCG in them are in + struct_ls/pcg_struct.c . These functions do rather little - + e.g., cast to the correct type - before calling something else. + These names should be called, e.g., hypre_struct_Free, to reduce the + chance of name conflicts. */ + hypre_PCGFunctions * pcg_functions = + hypre_PCGFunctionsCreate( + hypre_CAlloc, hypre_StructKrylovFree, hypre_StructKrylovCreateVector, + hypre_StructKrylovDestroyVector, hypre_StructKrylovMatvecCreate, + hypre_StructKrylovMatvec, hypre_StructKrylovMatvecDestroy, + hypre_StructKrylovInnerProd, hypre_StructKrylovCopyVector, + hypre_StructKrylovClearVector, + hypre_StructKrylovScaleVector, hypre_StructKrylovAxpy, + hypre_StructKrylovIdentitySetup, hypre_StructKrylovIdentity ); + + *solver = ( (HYPRE_StructSolver) hypre_PCGCreate( pcg_functions ) ); + + return 0; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Destroys a PCG solver object. + + {\bf Input files:} + headers.h + + @return Error code. + + @param solver [IN/OUT] + solver structure + + @see HYPRE_StructPCGCreate */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGDestroy( HYPRE_StructSolver solver ) + { + return( hypre_PCGDestroy( (void *) solver ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Precomputes tasks necessary for doing the solve. This routine + ensures that the setup for the preconditioner is also called. 
+ + NOTE: This is supposed to be an optional call, but currently is required. + + {\bf Input files:} + headers.h + + @return Error code. + + @param solver [IN/OUT] + solver structure + @param A [IN] + coefficient matrix + @param b [IN] + right-hand-side vector + @param x [IN] + unknown vector + + @see HYPRE_StructPCGSolve */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGSetup( HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x ) + { + return( HYPRE_PCGSetup( (HYPRE_Solver) solver, + (HYPRE_Matrix) A, + (HYPRE_Vector) b, + (HYPRE_Vector) x ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Performs the PCG linear solve. + + {\bf Input files:} + headers.h + + @return Error code. + + @param solver [IN/OUT] + solver structure + @param A [IN] + coefficient matrix + @param b [IN] + right-hand-side vector + @param x [IN] + unknown vector + + @see HYPRE_StructPCGSetup */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGSolve( HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x ) + { + return( HYPRE_PCGSolve( (HYPRE_Solver) solver, + (HYPRE_Matrix) A, + (HYPRE_Vector) b, + (HYPRE_Vector) x ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** (Optional) Set the stopping tolerance. + + {\bf Input files:} + headers.h + + @return Error code. 
+ + @param solver [IN/OUT] + solver structure + @param tol [IN] + PCG solver tolerance + + @see HYPRE_StructPCGSolve, HYPRE_StructPCGSetup */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGSetTol( HYPRE_StructSolver solver, + double tol ) + { + return( HYPRE_PCGSetTol( (HYPRE_Solver) solver, tol ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** (Optional) Set the maximum number of iterations. + + {\bf Input files:} + headers.h + + @return Error code. + + @param solver [IN/OUT] + solver structure + @param max_iter [IN] + PCG solver maximum number of iterations + + @see HYPRE_StructPCGSolve, HYPRE_StructPCGSetup */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGSetMaxIter( HYPRE_StructSolver solver, + int max_iter ) + { + return( HYPRE_PCGSetMaxIter( (HYPRE_Solver) solver, max_iter ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** (Optional) Set the type of norm to use in the stopping criteria. + If parameter two\_norm is set to 0, the preconditioner norm is used. + If set to 1, the two-norm is used. + + {\bf Input files:} + headers.h + + @return Error code. 
+ + @param solver [IN/OUT] + solver structure + @param two_norm [IN] + boolean indicating whether or not to use the two-norm + + @see HYPRE_StructPCGSolve, HYPRE_StructPCGSetup */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGSetTwoNorm( HYPRE_StructSolver solver, + int two_norm ) + { + return( HYPRE_PCGSetTwoNorm( (HYPRE_Solver) solver, two_norm ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** (Optional) Set whether or not to do an additional relative change + stopping test. If parameter rel\_change is set to 0, no additional + stopping test is done. If set to 1, the additional test is done. + + {\bf Input files:} + headers.h + + @return Error code. + + @param solver [IN/OUT] + solver structure + @param rel_change [IN] + boolean indicating whether or not to do relative change test + + @see HYPRE_StructPCGSolve, HYPRE_StructPCGSetup */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGSetRelChange( HYPRE_StructSolver solver, + int rel_change ) + { + return( HYPRE_PCGSetRelChange( (HYPRE_Solver) solver, rel_change ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** (Optional) Sets the preconditioner to use in PCG. The default is no + preconditioner, i.e. the solver is just conjugate gradients (CG). + + {\bf Input files:} + headers.h + + @return Error code. 
+ + @param solver [IN/OUT] + solver structure + @param precond [IN] + pointer to the preconditioner solve function + @param precond_setup [IN] + pointer to the preconditioner setup function + @param precond_solver [IN/OUT] + preconditioner solver structure + + @see HYPRE_StructPCGSolve, HYPRE_StructPCGSetup*/ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGSetPrecond( HYPRE_StructSolver solver, + HYPRE_PtrToStructSolverFcn precond, + HYPRE_PtrToStructSolverFcn precond_setup, + HYPRE_StructSolver precond_solver ) + { + return( HYPRE_PCGSetPrecond( (HYPRE_Solver) solver, + (HYPRE_PtrToSolverFcn) precond, + (HYPRE_PtrToSolverFcn) precond_setup, + (HYPRE_Solver) precond_solver ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** (Optional) Set the type of logging to do. Currently, if parameter + logging is set to 0, no logging is done. If set to 1, the norms and + relative norms for each iteration are saved. + + {\bf Input files:} + headers.h + + @return Error code. + + @param solver [IN/OUT] + solver structure + @param logging [IN] + integer indicating what type of logging to do + + @see HYPRE_StructPCGSolve, HYPRE_StructPCGSetup */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGSetLogging( HYPRE_StructSolver solver, + int logging ) + { + return( HYPRE_PCGSetLogging( (HYPRE_Solver) solver, logging ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** (Optional) Gets the number of iterations done in the solve. + + {\bf Input files:} + headers.h + + @return Error code. 
+ + @param solver [IN] + solver structure + @param num_iterations [OUT] + number of iterations + + @see HYPRE_StructPCGSolve, HYPRE_StructPCGSetup */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGGetNumIterations( HYPRE_StructSolver solver, + int *num_iterations ) + { + return( HYPRE_PCGGetNumIterations( (HYPRE_Solver) solver, num_iterations ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** (Optional) Gets the final relative residual norm for the solve. + + {\bf Input files:} + headers.h + + @return Error code. + + @param solver [IN] + solver structure + @param norm [OUT] + final relative residual norm + + @see HYPRE_StructPCGSolve, HYPRE_StructPCGSetup */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructPCGGetFinalRelativeResidualNorm( HYPRE_StructSolver solver, + double *norm ) + { + return( HYPRE_PCGGetFinalRelativeResidualNorm( (HYPRE_Solver) solver, norm ) ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Setup routine for diagonally scaling a vector. + + {\bf Input files:} + headers.h + + @return Error code. 
+ + @param solver [IN/OUT] + solver structure + @param A [IN] + coefficient matrix + @param b [IN] + right-hand-side vector + @param x [IN] + unknown vector + + @see HYPRE_StructDiagScale */ + /*--------------------------------------------------------------------------*/ + + int + HYPRE_StructDiagScaleSetup( HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector y, + HYPRE_StructVector x ) + { + return 0; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Diagonally scale a vector. + + {\bf Input files:} + headers.h + + @return Error code. + + @param solver [IN/OUT] + solver structure + @param A [IN] + coefficient matrix + @param b [IN] + right-hand-side vector + @param x [IN] + unknown vector + + @see HYPRE_StructDiagScaleSetup */ + /*--------------------------------------------------------------------------*/ + /*-------------------------------------------------------------------------- + * HYPRE_StructDiagScale + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructDiagScale( HYPRE_StructSolver solver, + HYPRE_StructMatrix HA, + HYPRE_StructVector Hy, + HYPRE_StructVector Hx ) + { + hypre_StructMatrix *A = (hypre_StructMatrix *) HA; + hypre_StructVector *y = (hypre_StructVector *) Hy; + hypre_StructVector *x = (hypre_StructVector *) Hx; + + hypre_BoxArray *boxes; + hypre_Box *box; + + hypre_Box *A_data_box; + hypre_Box *y_data_box; + hypre_Box *x_data_box; + + double *Ap; + double *yp; + double *xp; + + int Ai; + int yi; + int xi; + + hypre_Index index; + hypre_IndexRef start; + hypre_Index stride; + hypre_Index loop_size; + + int i; + int loopi, loopj, loopk; + + int ierr = 0; + + /* x = D^{-1} y */ + hypre_SetIndex(stride, 1, 1, 1); + boxes = hypre_StructGridBoxes(hypre_StructMatrixGrid(A)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + + A_data_box = 
hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), i); + x_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + y_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + + hypre_SetIndex(index, 0, 0, 0); + Ap = hypre_StructMatrixExtractPointerByIndex(A, i, index); + xp = hypre_StructVectorBoxData(x, i); + yp = hypre_StructVectorBoxData(y, i); + + start = hypre_BoxIMin(box); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi,xi,Ai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, yi) + { + xp[xi] = yp[yi] / Ap[Ai]; + } + hypre_BoxLoop3End(Ai, xi, yi); + } + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_smg.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_smg.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_smg.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,191 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * HYPRE_StructSMG interface + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGCreate + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGCreate( MPI_Comm comm, HYPRE_StructSolver *solver ) + { + *solver = ( (HYPRE_StructSolver) hypre_SMGCreate( comm ) ); + + return 0; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGDestroy + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGDestroy( HYPRE_StructSolver solver ) + { + return( hypre_SMGDestroy( (void *) solver ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetup + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetup( HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x ) + { + return( hypre_SMGSetup( (void *) solver, + (hypre_StructMatrix *) A, + (hypre_StructVector *) b, + (hypre_StructVector *) x ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSolve + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSolve( HYPRE_StructSolver solver, + HYPRE_StructMatrix A, + HYPRE_StructVector b, + HYPRE_StructVector x ) + { + return( hypre_SMGSolve( (void *) solver, + (hypre_StructMatrix *) A, + (hypre_StructVector *) b, + (hypre_StructVector *) x ) ); + } + + /*-------------------------------------------------------------------------- + * 
HYPRE_StructSMGSetMemoryUse + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetMemoryUse( HYPRE_StructSolver solver, + int memory_use ) + { + return( hypre_SMGSetMemoryUse( (void *) solver, memory_use ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetTol + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetTol( HYPRE_StructSolver solver, + double tol ) + { + return( hypre_SMGSetTol( (void *) solver, tol ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetMaxIter + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetMaxIter( HYPRE_StructSolver solver, + int max_iter ) + { + return( hypre_SMGSetMaxIter( (void *) solver, max_iter ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetRelChange + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetRelChange( HYPRE_StructSolver solver, + int rel_change ) + { + return( hypre_SMGSetRelChange( (void *) solver, rel_change ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetZeroGuess + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetZeroGuess( HYPRE_StructSolver solver ) + { + return( hypre_SMGSetZeroGuess( (void *) solver, 1 ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetNonZeroGuess + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetNonZeroGuess( HYPRE_StructSolver solver ) + { + return( hypre_SMGSetZeroGuess( (void *) solver, 0 ) ); + } + + 
/*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetNumPreRelax + * + * Note that we require at least 1 pre-relax sweep. + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetNumPreRelax( HYPRE_StructSolver solver, + int num_pre_relax ) + { + return( hypre_SMGSetNumPreRelax( (void *) solver, num_pre_relax) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetNumPostRelax + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetNumPostRelax( HYPRE_StructSolver solver, + int num_post_relax ) + { + return( hypre_SMGSetNumPostRelax( (void *) solver, num_post_relax) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGSetLogging + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGSetLogging( HYPRE_StructSolver solver, + int logging ) + { + return( hypre_SMGSetLogging( (void *) solver, logging) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGGetNumIterations + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGGetNumIterations( HYPRE_StructSolver solver, + int *num_iterations ) + { + return( hypre_SMGGetNumIterations( (void *) solver, num_iterations ) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructSMGGetFinalRelativeResidualNorm + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructSMGGetFinalRelativeResidualNorm( HYPRE_StructSolver solver, + double *norm ) + { + return( hypre_SMGGetFinalRelativeResidualNorm( (void *) solver, norm ) ); + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_stencil.c diff -c /dev/null 
llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_stencil.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_stencil.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,68 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * HYPRE_StructStencil interface + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * HYPRE_StructStencilCreate + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructStencilCreate( int dim, + int size, + HYPRE_StructStencil *stencil ) + { + hypre_Index *shape; + + shape = hypre_CTAlloc(hypre_Index, size); + + *stencil = hypre_StructStencilCreate(dim, size, shape); + + return 0; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructStencilSetElement + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructStencilSetElement( HYPRE_StructStencil stencil, + int element_index, + int *offset ) + { + int ierr = 0; + + hypre_Index *shape; + int d; + + shape = hypre_StructStencilShape(stencil); + hypre_ClearIndex(shape[element_index]); + for (d = 0; d < hypre_StructStencilDim(stencil); d++) + { + hypre_IndexD(shape[element_index], d) = offset[d]; + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructStencilDestroy + 
*--------------------------------------------------------------------------*/ + + int + HYPRE_StructStencilDestroy( HYPRE_StructStencil stencil ) + { + return ( hypre_StructStencilDestroy(stencil) ); + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_vector.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_vector.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_struct_vector.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,312 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * HYPRE_StructVector interface + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorCreate + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorCreate( MPI_Comm comm, + HYPRE_StructGrid grid, + HYPRE_StructVector *vector ) + { + int ierr = 0; + + *vector = hypre_StructVectorCreate(comm, grid); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorDestroy + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorDestroy( HYPRE_StructVector struct_vector ) + { + return( hypre_StructVectorDestroy(struct_vector) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorInitialize + 
*--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorInitialize( HYPRE_StructVector vector ) + { + return ( hypre_StructVectorInitialize(vector) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorSetValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorSetValues( HYPRE_StructVector vector, + int *grid_index, + double values ) + { + hypre_Index new_grid_index; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_grid_index); + for (d = 0; d < hypre_StructGridDim(hypre_StructVectorGrid(vector)); d++) + { + hypre_IndexD(new_grid_index, d) = grid_index[d]; + } + + ierr = hypre_StructVectorSetValues(vector, new_grid_index, values, 0); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorSetBoxValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorSetBoxValues( HYPRE_StructVector vector, + int *ilower, + int *iupper, + double *values ) + { + hypre_Index new_ilower; + hypre_Index new_iupper; + hypre_Box *new_value_box; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_ilower); + hypre_ClearIndex(new_iupper); + for (d = 0; d < hypre_StructGridDim(hypre_StructVectorGrid(vector)); d++) + { + hypre_IndexD(new_ilower, d) = ilower[d]; + hypre_IndexD(new_iupper, d) = iupper[d]; + } + new_value_box = hypre_BoxCreate(); + hypre_BoxSetExtents(new_value_box, new_ilower, new_iupper); + + ierr = hypre_StructVectorSetBoxValues(vector, new_value_box, values, 0 ); + + hypre_BoxDestroy(new_value_box); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorAddToValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorAddToValues( HYPRE_StructVector vector, + int 
*grid_index, + double values ) + { + hypre_Index new_grid_index; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_grid_index); + for (d = 0; d < hypre_StructGridDim(hypre_StructVectorGrid(vector)); d++) + { + hypre_IndexD(new_grid_index, d) = grid_index[d]; + } + + ierr = hypre_StructVectorSetValues(vector, new_grid_index, values, 1); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorAddToBoxValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorAddToBoxValues( HYPRE_StructVector vector, + int *ilower, + int *iupper, + double *values ) + { + hypre_Index new_ilower; + hypre_Index new_iupper; + hypre_Box *new_value_box; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_ilower); + hypre_ClearIndex(new_iupper); + for (d = 0; d < hypre_StructGridDim(hypre_StructVectorGrid(vector)); d++) + { + hypre_IndexD(new_ilower, d) = ilower[d]; + hypre_IndexD(new_iupper, d) = iupper[d]; + } + new_value_box = hypre_BoxCreate(); + hypre_BoxSetExtents(new_value_box, new_ilower, new_iupper); + + ierr = hypre_StructVectorSetBoxValues(vector, new_value_box, values, 1); + + hypre_BoxDestroy(new_value_box); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorGetValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorGetValues( HYPRE_StructVector vector, + int *grid_index, + double *values_ptr ) + { + hypre_Index new_grid_index; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_grid_index); + for (d = 0; d < hypre_StructGridDim(hypre_StructVectorGrid(vector)); d++) + { + hypre_IndexD(new_grid_index, d) = grid_index[d]; + } + + ierr = hypre_StructVectorGetValues(vector, new_grid_index, values_ptr); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * 
HYPRE_StructVectorGetBoxValues + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorGetBoxValues( HYPRE_StructVector vector, + int *ilower, + int *iupper, + double *values ) + { + hypre_Index new_ilower; + hypre_Index new_iupper; + hypre_Box *new_value_box; + + int d; + int ierr = 0; + + hypre_ClearIndex(new_ilower); + hypre_ClearIndex(new_iupper); + for (d = 0; d < hypre_StructGridDim(hypre_StructVectorGrid(vector)); d++) + { + hypre_IndexD(new_ilower, d) = ilower[d]; + hypre_IndexD(new_iupper, d) = iupper[d]; + } + new_value_box = hypre_BoxCreate(); + hypre_BoxSetExtents(new_value_box, new_ilower, new_iupper); + + ierr = hypre_StructVectorGetBoxValues(vector, new_value_box, values); + + hypre_BoxDestroy(new_value_box); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorAssemble + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorAssemble( HYPRE_StructVector vector ) + { + return( hypre_StructVectorAssemble(vector) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorPrint + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorPrint( char *filename, + HYPRE_StructVector vector, + int all ) + { + return ( hypre_StructVectorPrint(filename, vector, all) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorSetNumGhost + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorSetNumGhost( HYPRE_StructVector vector, + int *num_ghost ) + { + return ( hypre_StructVectorSetNumGhost(vector, num_ghost) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorSetConstantValues + 
*--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorSetConstantValues( HYPRE_StructVector vector, + double values ) + { + return( hypre_StructVectorSetConstantValues(vector, values) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorGetMigrateCommPkg + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorGetMigrateCommPkg( HYPRE_StructVector from_vector, + HYPRE_StructVector to_vector, + HYPRE_CommPkg *comm_pkg ) + { + int ierr = 0; + + *comm_pkg = hypre_StructVectorGetMigrateCommPkg(from_vector, to_vector); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * HYPRE_StructVectorMigrate + *--------------------------------------------------------------------------*/ + + int + HYPRE_StructVectorMigrate( HYPRE_CommPkg comm_pkg, + HYPRE_StructVector from_vector, + HYPRE_StructVector to_vector ) + { + return( hypre_StructVectorMigrate( comm_pkg, from_vector, to_vector) ); + } + + /*-------------------------------------------------------------------------- + * HYPRE_CommPkgDestroy + *--------------------------------------------------------------------------*/ + + int + HYPRE_CommPkgDestroy( HYPRE_CommPkg comm_pkg ) + { + return ( hypre_CommPkgDestroy(comm_pkg) ); + } + + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_utilities.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_utilities.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/HYPRE_utilities.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,60 ---- + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and 
disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header file for HYPRE_utilities library + * + *****************************************************************************/ + + #ifndef HYPRE_UTILITIES_HEADER + #define HYPRE_UTILITIES_HEADER + + #include + + #ifndef HYPRE_SEQUENTIAL + #include "mpi.h" + #endif + + #ifdef HYPRE_USING_OPENMP + #include + #endif + + #ifdef __cplusplus + extern "C" { + #endif + + /* + * Before a version of HYPRE goes out the door, increment the version + * number and check in this file (for CVS to substitute the Date). + */ + #define HYPRE_Version() "HYPRE 1.4.0b $Date: 2005/04/11 05:22:07 $ Compiled: " __DATE__ " " __TIME__ + + #ifdef HYPRE_USE_PTHREADS + #ifndef hypre_MAX_THREADS + #define hypre_MAX_THREADS 128 + #endif + #endif + + /*-------------------------------------------------------------------------- + * Structures + *--------------------------------------------------------------------------*/ + + #ifdef HYPRE_SEQUENTIAL + typedef int MPI_Comm; + #endif + + /*-------------------------------------------------------------------------- + * Prototypes + *--------------------------------------------------------------------------*/ + + #ifdef __cplusplus + } + #endif + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/LICENSE.txt diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/LICENSE.txt:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/LICENSE.txt Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,43 ---- + + + + NOTICE + + This work was produced at the University of California, Lawrence + Livermore National Laboratory (UC LLNL) under contract + no. W-7405-ENG-48 (Contract 48) between the U.S. 
Department of Energy + (DOE) and The Regents of the University of California (University) for + the operation of UC LLNL. The rights of the Federal Government are + reserved under Contract 48 subject to the restrictions agreed upon by + the DOE and University as allowed under DOE Acquisition Letter 97-1. + + + + DISCLAIMER + + This work was prepared as an account of work sponsored by an agency of + the United States Government. Neither the United States Government nor + the University of California nor any of their employees, makes any + warranty, express or implied, or assumes any liability or + responsibility for the accuracy, completeness, or usefulness of any + information, apparatus, product, or process disclosed, or represents + that its use would not infringe privately-owned rights. Reference + herein to any specific commercial products, process, or service by + trade name, trademark, manufacturer or otherwise does not necessarily + constitute or imply its endorsement, recommendation, or favoring by + the United States Government or the University of California. The + views and opinions of authors expressed herein do not necessarily + state or reflect those of the United States Government or the + University of California, and shall not be used for advertising or + product endorsement purposes. + + + + NOTIFICATION OF COMMERCIAL USE + + Commercialization of this product is prohibited without notifying the + Department of Energy (DOE) or Lawrence Livermore National Laboratory + (LLNL). + + + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/Makefile diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/Makefile:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/Makefile Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,15 ---- + LEVEL = ../../../.. + + PROG = smg2000 + + CPPFLAGS += -D_POSIX_SOURCE -DHYPRE_TIMING -DHYPRE_SEQUENTIAL + CPPFLAGS += -I. 
+ + LIBS += -lm + LDFLAGS += -lm + + #include $(LLVM_OBJ_ROOT)/Makefile.config + + RUN_OPTIONS ="-n 100 40 100 -c 0.1 1.0 10.0" + + include ../../../Makefile.multisrc Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,437 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Member functions for hypre_Box class: + * Basic class functions. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_BoxCreate + *--------------------------------------------------------------------------*/ + + hypre_Box * + hypre_BoxCreate( ) + { + hypre_Box *box; + + #if 1 + box = hypre_TAlloc(hypre_Box, 1); + #else + box = hypre_BoxAlloc(); + #endif + + return box; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxSetExtents + *--------------------------------------------------------------------------*/ + + int + hypre_BoxSetExtents( hypre_Box *box, + hypre_Index imin, + hypre_Index imax ) + { + int ierr = 0; + + hypre_CopyIndex(imin, hypre_BoxIMin(box)); + hypre_CopyIndex(imax, hypre_BoxIMax(box)); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxArrayCreate + 
*--------------------------------------------------------------------------*/ + + hypre_BoxArray * + hypre_BoxArrayCreate( int size ) + { + hypre_BoxArray *box_array; + + box_array = hypre_TAlloc(hypre_BoxArray, 1); + + hypre_BoxArrayBoxes(box_array) = hypre_CTAlloc(hypre_Box, size); + hypre_BoxArraySize(box_array) = size; + hypre_BoxArrayAllocSize(box_array) = size; + + return box_array; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxArraySetSize + *--------------------------------------------------------------------------*/ + + int + hypre_BoxArraySetSize( hypre_BoxArray *box_array, + int size ) + { + int ierr = 0; + int alloc_size; + + alloc_size = hypre_BoxArrayAllocSize(box_array); + + if (size > alloc_size) + { + alloc_size = size + hypre_BoxArrayExcess; + + hypre_BoxArrayBoxes(box_array) = + hypre_TReAlloc(hypre_BoxArrayBoxes(box_array), + hypre_Box, alloc_size); + + hypre_BoxArrayAllocSize(box_array) = alloc_size; + } + + hypre_BoxArraySize(box_array) = size; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxArrayArrayCreate + *--------------------------------------------------------------------------*/ + + hypre_BoxArrayArray * + hypre_BoxArrayArrayCreate( int size ) + { + hypre_BoxArrayArray *box_array_array; + int i; + + box_array_array = hypre_CTAlloc(hypre_BoxArrayArray, 1); + + hypre_BoxArrayArrayBoxArrays(box_array_array) = + hypre_CTAlloc(hypre_BoxArray *, size); + + for (i = 0; i < size; i++) + { + hypre_BoxArrayArrayBoxArray(box_array_array, i) = hypre_BoxArrayCreate(0); + } + hypre_BoxArrayArraySize(box_array_array) = size; + + return box_array_array; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_BoxDestroy( hypre_Box *box ) + { + int ierr = 0; + + if (box) + { + #if 1 + 
hypre_TFree(box); + #else + hypre_BoxFree(box); + #endif + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxArrayDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_BoxArrayDestroy( hypre_BoxArray *box_array ) + { + int ierr = 0; + + if (box_array) + { + hypre_TFree(hypre_BoxArrayBoxes(box_array)); + hypre_TFree(box_array); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxArrayArrayDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_BoxArrayArrayDestroy( hypre_BoxArrayArray *box_array_array ) + { + int ierr = 0; + int i; + + if (box_array_array) + { + hypre_ForBoxArrayI(i, box_array_array) + hypre_BoxArrayDestroy( + hypre_BoxArrayArrayBoxArray(box_array_array, i)); + + hypre_TFree(hypre_BoxArrayArrayBoxArrays(box_array_array)); + hypre_TFree(box_array_array); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxDuplicate: + * Return a duplicate box. + *--------------------------------------------------------------------------*/ + + hypre_Box * + hypre_BoxDuplicate( hypre_Box *box ) + { + hypre_Box *new_box; + + new_box = hypre_BoxCreate(); + hypre_CopyBox(box, new_box); + + return new_box; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxArrayDuplicate: + * Return a duplicate box_array. 
+ *--------------------------------------------------------------------------*/ + + hypre_BoxArray * + hypre_BoxArrayDuplicate( hypre_BoxArray *box_array ) + { + hypre_BoxArray *new_box_array; + + int i; + + new_box_array = hypre_BoxArrayCreate(hypre_BoxArraySize(box_array)); + hypre_ForBoxI(i, box_array) + { + hypre_CopyBox(hypre_BoxArrayBox(box_array, i), + hypre_BoxArrayBox(new_box_array, i)); + } + + return new_box_array; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxArrayArrayDuplicate: + * Return a duplicate box_array_array. + *--------------------------------------------------------------------------*/ + + hypre_BoxArrayArray * + hypre_BoxArrayArrayDuplicate( hypre_BoxArrayArray *box_array_array ) + { + hypre_BoxArrayArray *new_box_array_array; + hypre_BoxArray **new_box_arrays; + int new_size; + + hypre_BoxArray **box_arrays; + int i; + + new_size = hypre_BoxArrayArraySize(box_array_array); + new_box_array_array = hypre_BoxArrayArrayCreate(new_size); + + if (new_size) + { + new_box_arrays = hypre_BoxArrayArrayBoxArrays(new_box_array_array); + box_arrays = hypre_BoxArrayArrayBoxArrays(box_array_array); + + for (i = 0; i < new_size; i++) + { + hypre_AppendBoxArray(box_arrays[i], new_box_arrays[i]); + } + } + + return new_box_array_array; + } + + /*-------------------------------------------------------------------------- + * hypre_AppendBox: + * Append box to the end of box_array. + * The box_array may be empty. + *--------------------------------------------------------------------------*/ + + int + hypre_AppendBox( hypre_Box *box, + hypre_BoxArray *box_array ) + { + int ierr = 0; + int size; + + size = hypre_BoxArraySize(box_array); + hypre_BoxArraySetSize(box_array, (size + 1)); + hypre_CopyBox(box, hypre_BoxArrayBox(box_array, size)); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_DeleteBox: + * Delete box from box_array. 
+ *--------------------------------------------------------------------------*/ + + int + hypre_DeleteBox( hypre_BoxArray *box_array, + int index ) + { + int ierr = 0; + int i; + + for (i = index; i < hypre_BoxArraySize(box_array) - 1; i++) + { + hypre_CopyBox(hypre_BoxArrayBox(box_array, i+1), + hypre_BoxArrayBox(box_array, i)); + } + + hypre_BoxArraySize(box_array) --; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_AppendBoxArray: + * Append box_array_0 to the end of box_array_1. + * The box_array_1 may be empty. + *--------------------------------------------------------------------------*/ + + int + hypre_AppendBoxArray( hypre_BoxArray *box_array_0, + hypre_BoxArray *box_array_1 ) + { + int ierr = 0; + int size, size_0; + int i; + + size = hypre_BoxArraySize(box_array_1); + size_0 = hypre_BoxArraySize(box_array_0); + hypre_BoxArraySetSize(box_array_1, (size + size_0)); + + /* copy box_array_0 boxes into box_array_1 */ + for (i = 0; i < size_0; i++) + { + hypre_CopyBox(hypre_BoxArrayBox(box_array_0, i), + hypre_BoxArrayBox(box_array_1, size + i)); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxGetSize: + *--------------------------------------------------------------------------*/ + + int + hypre_BoxGetSize( hypre_Box *box, + hypre_Index size ) + { + hypre_IndexD(size, 0) = hypre_BoxSizeD(box, 0); + hypre_IndexD(size, 1) = hypre_BoxSizeD(box, 1); + hypre_IndexD(size, 2) = hypre_BoxSizeD(box, 2); + + return 0; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxGetStrideSize: + *--------------------------------------------------------------------------*/ + + int + hypre_BoxGetStrideSize( hypre_Box *box, + hypre_Index stride, + hypre_Index size ) + { + int d, s; + + for (d = 0; d < 3; d++) + { + s = hypre_BoxSizeD(box, d); + if (s > 0) + { + s = (s - 1) / hypre_IndexD(stride, d) + 1; 
+ } + hypre_IndexD(size, d) = s; + } + + return 0; + } + + /*-------------------------------------------------------------------------- + * hypre_IModPeriod: + *--------------------------------------------------------------------------*/ + + int + hypre_IModPeriod( int i, + int period ) + + { + int i_mod_p; + int shift; + + if (period == 0) + { + i_mod_p = i; + } + else if (i >= period) + { + i_mod_p = i % period; + } + else if (i < 0) + { + shift = ( -i / period + 1 ) * period; + i_mod_p = ( i + shift ) % period; + } + else + { + i_mod_p = i; + } + + return i_mod_p; + } + + /*-------------------------------------------------------------------------- + * hypre_IModPeriodX: + * Perhaps should be a macro? + *--------------------------------------------------------------------------*/ + + int + hypre_IModPeriodX( hypre_Index index, + hypre_Index periodic ) + { + return hypre_IModPeriod(hypre_IndexX(index), hypre_IndexX(periodic)); + } + + + /*-------------------------------------------------------------------------- + * hypre_IModPeriodY: + * Perhaps should be a macro? + *--------------------------------------------------------------------------*/ + + int + hypre_IModPeriodY( hypre_Index index, + hypre_Index periodic ) + { + return hypre_IModPeriod(hypre_IndexY(index), hypre_IndexY(periodic)); + } + + + /*-------------------------------------------------------------------------- + * hypre_IModPeriodZ: + * Perhaps should be a macro? 
+ *--------------------------------------------------------------------------*/ + + int + hypre_IModPeriodZ( hypre_Index index, + hypre_Index periodic ) + { + return hypre_IModPeriod(hypre_IndexZ(index), hypre_IndexZ(periodic)); + } + + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_algebra.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_algebra.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_algebra.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,397 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Member functions for hypre_Box class: + * Box algebra functions. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_IntersectBoxes: + * Intersect box1 and box2. + * If the boxes do not intersect, the result is a box with zero volume. 
+ *--------------------------------------------------------------------------*/ + + int + hypre_IntersectBoxes( hypre_Box *box1, + hypre_Box *box2, + hypre_Box *ibox ) + { + int ierr = 0; + int d; + + /* find x, y, and z bounds */ + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(ibox, d) = + hypre_max(hypre_BoxIMinD(box1, d), hypre_BoxIMinD(box2, d)); + hypre_BoxIMaxD(ibox, d) = + hypre_min(hypre_BoxIMaxD(box1, d), hypre_BoxIMaxD(box2, d)); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SubtractBoxes: + * Compute box1 - box2. + *--------------------------------------------------------------------------*/ + + int + hypre_SubtractBoxes( hypre_Box *box1, + hypre_Box *box2, + hypre_BoxArray *box_array ) + { + int ierr = 0; + + hypre_Box *box; + hypre_Box *rembox; + int d, size; + + /*------------------------------------------------------ + * Set the box array size to the maximum possible, + * plus one, to have space for the remainder box. 
+ *------------------------------------------------------*/ + + hypre_BoxArraySetSize(box_array, 7); + + /*------------------------------------------------------ + * Subtract the boxes by cutting box1 in x, y, then z + *------------------------------------------------------*/ + + rembox = hypre_BoxArrayBox(box_array, 6); + hypre_CopyBox(box1, rembox); + + size = 0; + for (d = 0; d < 3; d++) + { + /* if the boxes do not intersect, the subtraction is trivial */ + if ( (hypre_BoxIMinD(box2, d) > hypre_BoxIMaxD(rembox, d)) || + (hypre_BoxIMaxD(box2, d) < hypre_BoxIMinD(rembox, d)) ) + { + hypre_CopyBox(box1, hypre_BoxArrayBox(box_array, 0)); + size = 1; + break; + } + + /* update the box array */ + else + { + if ( hypre_BoxIMinD(box2, d) > hypre_BoxIMinD(rembox, d) ) + { + box = hypre_BoxArrayBox(box_array, size); + hypre_CopyBox(rembox, box); + hypre_BoxIMaxD(box, d) = hypre_BoxIMinD(box2, d) - 1; + hypre_BoxIMinD(rembox, d) = hypre_BoxIMinD(box2, d); + size++; + } + if ( hypre_BoxIMaxD(box2, d) < hypre_BoxIMaxD(rembox, d) ) + { + box = hypre_BoxArrayBox(box_array, size); + hypre_CopyBox(rembox, box); + hypre_BoxIMinD(box, d) = hypre_BoxIMaxD(box2, d) + 1; + hypre_BoxIMaxD(rembox, d) = hypre_BoxIMaxD(box2, d); + size++; + } + } + } + hypre_BoxArraySetSize(box_array, size); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_UnionBoxes: + * Compute the union of all boxes. + * + * To compute the union, we first construct a logically rectangular, + * variably spaced, 3D grid called block. 
Each cell (i,j,k) of block + * corresponds to a box with extents given by + * + * iminx = block_index[0][i] + * iminy = block_index[1][j] + * iminz = block_index[2][k] + * imaxx = block_index[0][i+1] - 1 + * imaxy = block_index[1][j+1] - 1 + * imaxz = block_index[2][k+1] - 1 + * + * The size of block is given by + * + * sizex = block_sz[0] + * sizey = block_sz[1] + * sizez = block_sz[2] + * + * We initially set all cells of block that are part of the union to + * + * factor[2] + factor[1] + factor[0] + * + * where + * + * factor[0] = 1; + * factor[1] = (block_sz[0] + 1); + * factor[2] = (block_sz[1] + 1) * factor[1]; + * + * The cells of block are then "joined" in x first, then y, then z. + * The result is that each nonzero entry of block corresponds to a + * box in the union with extents defined by factoring the entry, then + * indexing into the block_index array. + * + * Note: Special care has to be taken for boxes of size 0. + * + *--------------------------------------------------------------------------*/ + + int + hypre_UnionBoxes( hypre_BoxArray *boxes ) + { + int ierr = 0; + + hypre_Box *box; + + int *block_index[3]; + int block_sz[3], block_volume; + int *block; + int index; + int size; + int factor[3]; + + int iminmax[2], imin[3], imax[3]; + int ii[3], dd[3]; + int join; + int i_tmp0, i_tmp1; + int ioff, joff, koff; + int bi, d, i, j, k; + + int index_not_there; + + /*------------------------------------------------------ + * If the size of boxes is less than 2, return + *------------------------------------------------------*/ + + if (hypre_BoxArraySize(boxes) < 2) + { + return ierr; + } + + /*------------------------------------------------------ + * Set up the block_index array + *------------------------------------------------------*/ + + i_tmp0 = 2 * hypre_BoxArraySize(boxes); + block_index[0] = hypre_TAlloc(int, 3 * i_tmp0); + block_sz[0] = 0; + for (d = 1; d < 3; d++) + { + block_index[d] = block_index[d-1] + i_tmp0; + block_sz[d] = 0; + } + + 
hypre_ForBoxI(bi, boxes) + { + box = hypre_BoxArrayBox(boxes, bi); + + for (d = 0; d < 3; d++) + { + iminmax[0] = hypre_BoxIMinD(box, d); + iminmax[1] = hypre_BoxIMaxD(box, d) + 1; + + for (i = 0; i < 2; i++) + { + /* find the new index position in the block_index array */ + index_not_there = 1; + for (j = 0; j < block_sz[d]; j++) + { + if (iminmax[i] <= block_index[d][j]) + { + if (iminmax[i] == block_index[d][j]) + index_not_there = 0; + break; + } + } + + /* if the index is already there, don't add it again */ + if (index_not_there) + { + for (k = block_sz[d]; k > j; k--) + block_index[d][k] = block_index[d][k-1]; + block_index[d][j] = iminmax[i]; + block_sz[d]++; + } + } + } + } + + for (d = 0; d < 3; d++) + block_sz[d]--; + block_volume = block_sz[0] * block_sz[1] * block_sz[2]; + + /*------------------------------------------------------ + * Set factor values + *------------------------------------------------------*/ + + factor[0] = 1; + factor[1] = (block_sz[0] + 1); + factor[2] = (block_sz[1] + 1) * factor[1]; + + /*------------------------------------------------------ + * Set up the block array + *------------------------------------------------------*/ + + block = hypre_CTAlloc(int, block_volume); + + hypre_ForBoxI(bi, boxes) + { + box = hypre_BoxArrayBox(boxes, bi); + + /* find the block_index indices corresponding to the current box */ + for (d = 0; d < 3; d++) + { + j = 0; + + while (hypre_BoxIMinD(box, d) != block_index[d][j]) + j++; + imin[d] = j; + + while (hypre_BoxIMaxD(box, d) + 1 != block_index[d][j]) + j++; + imax[d] = j; + } + + /* note: boxes of size zero will not be added to block */ + for (k = imin[2]; k < imax[2]; k++) + { + for (j = imin[1]; j < imax[1]; j++) + { + for (i = imin[0]; i < imax[0]; i++) + { + index = ((k) * block_sz[1] + j) * block_sz[0] + i; + + block[index] = factor[2] + factor[1] + factor[0]; + } + } + } + } + + /*------------------------------------------------------ + * Join block array in x, then y, then z + * + * 
Notes: + * - ii[0], ii[1], and ii[2] correspond to indices + * in x, y, and z respectively. + * - dd specifies the order in which to loop over + * the three dimensions. + *------------------------------------------------------*/ + + for (d = 0; d < 3; d++) + { + switch(d) + { + case 0: /* join in x */ + dd[0] = 0; + dd[1] = 1; + dd[2] = 2; + break; + + case 1: /* join in y */ + dd[0] = 1; + dd[1] = 0; + dd[2] = 2; + break; + + case 2: /* join in z */ + dd[0] = 2; + dd[1] = 1; + dd[2] = 0; + break; + } + + for (ii[dd[2]] = 0; ii[dd[2]] < block_sz[dd[2]]; ii[dd[2]]++) + { + for (ii[dd[1]] = 0; ii[dd[1]] < block_sz[dd[1]]; ii[dd[1]]++) + { + join = 0; + for (ii[dd[0]] = 0; ii[dd[0]] < block_sz[dd[0]]; ii[dd[0]]++) + { + index = ((ii[2]) * block_sz[1] + ii[1]) * block_sz[0] + ii[0]; + + if ((join) && (block[index] == i_tmp1)) + { + block[index] = 0; + block[i_tmp0] += factor[dd[0]]; + } + else + { + if (block[index]) + { + i_tmp0 = index; + i_tmp1 = block[index]; + join = 1; + } + else + join = 0; + } + } + } + } + } + + /*------------------------------------------------------ + * Set up the boxes BoxArray + *------------------------------------------------------*/ + + size = 0; + for (index = 0; index < block_volume; index++) + { + if (block[index]) + size++; + } + hypre_BoxArraySetSize(boxes, size); + + index = 0; + size = 0; + for (k = 0; k < block_sz[2]; k++) + { + for (j = 0; j < block_sz[1]; j++) + { + for (i = 0; i < block_sz[0]; i++) + { + if (block[index]) + { + ioff = (block[index] % factor[1]) ; + joff = (block[index] % factor[2]) / factor[1]; + koff = (block[index] ) / factor[2]; + + box = hypre_BoxArrayBox(boxes, size); + hypre_BoxIMinD(box, 0) = block_index[0][i]; + hypre_BoxIMinD(box, 1) = block_index[1][j]; + hypre_BoxIMinD(box, 2) = block_index[2][k]; + hypre_BoxIMaxD(box, 0) = block_index[0][i + ioff] - 1; + hypre_BoxIMaxD(box, 1) = block_index[1][j + joff] - 1; + hypre_BoxIMaxD(box, 2) = block_index[2][k + koff] - 1; + + size++; + } + + index++; + } 
+ } + } + + /*--------------------------------------------------------- + * Clean up and return + *---------------------------------------------------------*/ + + hypre_TFree(block_index[0]); + hypre_TFree(block); + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_alloc.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_alloc.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_alloc.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,143 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Box allocation routines. These hopefully increase efficiency + * and reduce memory fragmentation. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * Box memory data structure and static variables used to manage free + * list and blocks to be freed by the finalization routine. + *--------------------------------------------------------------------------*/ + + union box_memory + { + union box_memory *d_next; + hypre_Box d_box; + }; + + static union box_memory *s_free = NULL; + static union box_memory *s_finalize = NULL; + static int s_at_a_time = 1000; + static int s_count = 0; + + /*-------------------------------------------------------------------------- + * Allocate a new block of memory and thread it into the free list. 
The + * first block will always be put on the finalize list to be freed by + * the hypre_BoxFinalizeMemory() routine to remove memory leaks. + *--------------------------------------------------------------------------*/ + + static int + hypre_AllocateBoxBlock() + { + int ierr = 0; + union box_memory *ptr; + int i; + + ptr = hypre_TAlloc(union box_memory, s_at_a_time); + ptr[0].d_next = s_finalize; + s_finalize = &ptr[0]; + + for (i = (s_at_a_time - 1); i > 0; i--) + { + ptr[i].d_next = s_free; + s_free = &ptr[i]; + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * Set up the allocation block size and allocate the first memory block. + *--------------------------------------------------------------------------*/ + + int + hypre_BoxInitializeMemory( const int at_a_time ) + { + int ierr = 0; + + if (at_a_time > 0) + { + s_at_a_time = at_a_time; + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * Free all of the memory used to manage boxes. This should only be + * called at the end of the program to collect free memory. The blocks + * in the finalize list are freed. + *--------------------------------------------------------------------------*/ + + int + hypre_BoxFinalizeMemory() + { + int ierr = 0; + union box_memory *byebye; + + while (s_finalize) + { + byebye = s_finalize; + s_finalize = (s_finalize -> d_next); + hypre_TFree(byebye); + } + s_finalize = NULL; + s_free = NULL; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * Allocate a box from the free list. If no boxes exist on the free + * list, then allocate a block of memory to repopulate the free list. 
+ *--------------------------------------------------------------------------*/ + + hypre_Box * + hypre_BoxAlloc() + { + union box_memory *ptr = NULL; + + if (!s_free) + { + hypre_AllocateBoxBlock(); + } + + ptr = s_free; + s_free = (s_free -> d_next); + s_count++; + return( &(ptr -> d_box) ); + } + + /*-------------------------------------------------------------------------- + * Put a box back on the free list. + *--------------------------------------------------------------------------*/ + + int + hypre_BoxFree( hypre_Box *box ) + { + int ierr = 0; + union box_memory *ptr = (union box_memory *) box; + + (ptr -> d_next) = s_free; + s_free = ptr; + s_count--; + + if (!s_count) + { + hypre_BoxFinalizeMemory(); + } + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_neighbors.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_neighbors.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/box_neighbors.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,344 ---- + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Member functions for the hypre_BoxNeighbors class. 
+ * + *****************************************************************************/ + + #include "headers.h" + + #define DEBUG 0 + + #if DEBUG + char filename[255]; + FILE *file; + int my_rank; + hypre_Box *box; + static int debug_count = 0; + #endif + + /*-------------------------------------------------------------------------- + * hypre_RankLinkCreate: + *--------------------------------------------------------------------------*/ + + int + hypre_RankLinkCreate( int rank, + hypre_RankLink **rank_link_ptr) + { + hypre_RankLink *rank_link; + + rank_link = hypre_TAlloc(hypre_RankLink, 1); + + hypre_RankLinkRank(rank_link) = rank; + hypre_RankLinkNext(rank_link) = NULL; + + *rank_link_ptr = rank_link; + + return 0; + } + + /*-------------------------------------------------------------------------- + * hypre_RankLinkDestroy: + *--------------------------------------------------------------------------*/ + + int + hypre_RankLinkDestroy( hypre_RankLink *rank_link ) + { + int ierr = 0; + + if (rank_link) + { + hypre_TFree(rank_link); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxNeighborsCreate: + *--------------------------------------------------------------------------*/ + + int + hypre_BoxNeighborsCreate( hypre_BoxArray *boxes, + int *procs, + int *ids, + int first_local, + int num_local, + int num_periodic, + hypre_BoxNeighbors **neighbors_ptr ) + { + hypre_BoxNeighbors *neighbors; + + neighbors = hypre_CTAlloc(hypre_BoxNeighbors, 1); + hypre_BoxNeighborsRankLinks(neighbors) = + hypre_CTAlloc(hypre_RankLinkArray, num_local); + + hypre_BoxNeighborsBoxes(neighbors) = boxes; + hypre_BoxNeighborsProcs(neighbors) = procs; + hypre_BoxNeighborsIDs(neighbors) = ids; + hypre_BoxNeighborsFirstLocal(neighbors) = first_local; + hypre_BoxNeighborsNumLocal(neighbors) = num_local; + hypre_BoxNeighborsNumPeriodic(neighbors) = num_periodic; + + *neighbors_ptr = neighbors; + + return 0; + } + + 
/*-------------------------------------------------------------------------- + * hypre_BoxNeighborsAssemble: + * + * Finds boxes that are "near" the local boxes, + * where near is defined by `max_distance'. + * + * Note: A box is not a neighbor of itself, but it will appear + * in the `boxes' BoxArray. + * + * Note: The box ids remain in increasing order, and the box procs + * remain in non-decreasing order. + * + * Note: All boxes on my processor remain in the neighborhood. However, + * they may not be a neighbor of any local box. + * + *--------------------------------------------------------------------------*/ + + int + hypre_BoxNeighborsAssemble( hypre_BoxNeighbors *neighbors, + int max_distance, + int prune ) + { + hypre_BoxArray *boxes; + int *procs; + int *ids; + int first_local; + int num_local; + int num_periodic; + + int keep_box; + int num_boxes; + + hypre_RankLink *rank_link; + + hypre_Box *local_box; + hypre_Box *neighbor_box; + + int distance; + int distance_index[3]; + + int diff; + int i, j, d, ilocal, inew; + + int ierr = 0; + + /*--------------------------------------------- + * Find neighboring boxes + *---------------------------------------------*/ + + boxes = hypre_BoxNeighborsBoxes(neighbors); + procs = hypre_BoxNeighborsProcs(neighbors); + ids = hypre_BoxNeighborsIDs(neighbors); + first_local = hypre_BoxNeighborsFirstLocal(neighbors); + num_local = hypre_BoxNeighborsNumLocal(neighbors); + num_periodic = hypre_BoxNeighborsNumPeriodic(neighbors); + + /*--------------------------------------------- + * Find neighboring boxes + *---------------------------------------------*/ + + inew = 0; + num_boxes = 0; + hypre_ForBoxI(i, boxes) + { + keep_box = 0; + for (j = 0; j < num_local + num_periodic; j++) + { + ilocal = first_local + j; + if (i != ilocal) + { + local_box = hypre_BoxArrayBox(boxes, ilocal); + neighbor_box = hypre_BoxArrayBox(boxes, i); + + /* compute distance info */ + distance = 0; + for (d = 0; d < 3; d++) + { + distance_index[d] = 
0; + + diff = hypre_BoxIMinD(neighbor_box, d) - + hypre_BoxIMaxD(local_box, d); + if (diff > 0) + { + distance_index[d] = 1; + distance = hypre_max(distance, diff); + } + + diff = hypre_BoxIMinD(local_box, d) - + hypre_BoxIMaxD(neighbor_box, d); + if (diff > 0) + { + distance_index[d] = -1; + distance = hypre_max(distance, diff); + } + } + + /* create new rank_link */ + if (distance <= max_distance) + { + keep_box = 1; + + if (j < num_local) + { + hypre_RankLinkCreate(num_boxes, &rank_link); + hypre_RankLinkNext(rank_link) = + hypre_BoxNeighborsRankLink(neighbors, j, + distance_index[0], + distance_index[1], + distance_index[2]); + hypre_BoxNeighborsRankLink(neighbors, j, + distance_index[0], + distance_index[1], + distance_index[2]) = rank_link; + } + } + } + else + { + keep_box = 1; + } + } + + if (prune) + { + /* use procs array to store which boxes to keep */ + if (keep_box) + { + procs[i] = -procs[i]; + if (inew < i) + { + procs[inew] = i; + } + inew = i + 1; + + num_boxes++; + } + } + else + { + /* keep all of the boxes */ + num_boxes++; + } + } + + if (prune) + { + i = 0; + for (inew = 0; inew < num_boxes; inew++) + { + if (procs[i] > 0) + { + i = procs[i]; + } + hypre_CopyBox(hypre_BoxArrayBox(boxes, i), + hypre_BoxArrayBox(boxes, inew)); + procs[inew] = -procs[i]; + ids[inew] = ids[i]; + if (i == first_local) + { + first_local = inew; + } + + i++; + } + } + + hypre_BoxArraySetSize(boxes, num_boxes); + hypre_BoxNeighborsFirstLocal(neighbors) = first_local; + + #if DEBUG + MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); + + sprintf(filename, "zneighbors.%05d", my_rank); + + if ((file = fopen(filename, "a")) == NULL) + { + printf("Error: can't open output file %s\n", filename); + exit(1); + } + + fprintf(file, "\n\n============================\n\n"); + fprintf(file, "\n\n%d\n\n", debug_count++); + fprintf(file, "num_boxes = %d\n", num_boxes); + for (i = 0; i < num_boxes; i++) + { + box = hypre_BoxArrayBox(boxes, i); + fprintf(file, "(%d,%d,%d) X (%d,%d,%d) ; 
(%d,%d); %d\n", + hypre_BoxIMinX(box),hypre_BoxIMinY(box),hypre_BoxIMinZ(box), + hypre_BoxIMaxX(box),hypre_BoxIMaxY(box),hypre_BoxIMaxZ(box), + procs[i], ids[i], hypre_BoxVolume(box)); + } + fprintf(file, "first_local = %d\n", first_local); + fprintf(file, "num_local = %d\n", num_local); + fprintf(file, "num_periodic = %d\n", num_periodic); + fprintf(file, "max_distance = %d\n", max_distance); + + fprintf(file, "\n"); + + fflush(file); + fclose(file); + #endif + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BoxNeighborsDestroy: + *--------------------------------------------------------------------------*/ + + int + hypre_BoxNeighborsDestroy( hypre_BoxNeighbors *neighbors ) + { + hypre_RankLink *rank_link; + hypre_RankLink *next_rank_link; + + int b, i, j, k; + + int ierr = 0; + + if (neighbors) + { + for (b = 0; b < hypre_BoxNeighborsNumLocal(neighbors); b++) + { + for (k = -1; k <= 1; k++) + { + for (j = -1; j <= 1; j++) + { + for (i = -1; i <= 1; i++) + { + rank_link = + hypre_BoxNeighborsRankLink(neighbors, b, i, j, k); + while (rank_link) + { + next_rank_link = hypre_RankLinkNext(rank_link); + hypre_RankLinkDestroy(rank_link); + rank_link = next_rank_link; + } + } + } + } + } + hypre_BoxArrayDestroy(hypre_BoxNeighborsBoxes(neighbors)); + hypre_TFree(hypre_BoxNeighborsProcs(neighbors)); + hypre_TFree(hypre_BoxNeighborsIDs(neighbors)); + hypre_TFree(hypre_BoxNeighborsRankLinks(neighbors)); + hypre_TFree(neighbors); + } + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/coarsen.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/coarsen.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/coarsen.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,832 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + 
* + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #include "headers.h" + + #define DEBUG 0 + + #if DEBUG + char filename[255]; + FILE *file; + static int debug_count = 0; + #endif + + /*-------------------------------------------------------------------------- + * hypre_StructMapFineToCoarse + * + * NOTE: findex and cindex are indexes on the fine and coarse index space, + * and do not stand for "F-pt index" and "C-pt index". + *--------------------------------------------------------------------------*/ + + int + hypre_StructMapFineToCoarse( hypre_Index findex, + hypre_Index index, + hypre_Index stride, + hypre_Index cindex ) + { + hypre_IndexX(cindex) = + (hypre_IndexX(findex) - hypre_IndexX(index)) / hypre_IndexX(stride); + hypre_IndexY(cindex) = + (hypre_IndexY(findex) - hypre_IndexY(index)) / hypre_IndexY(stride); + hypre_IndexZ(cindex) = + (hypre_IndexZ(findex) - hypre_IndexZ(index)) / hypre_IndexZ(stride); + + return 0; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMapCoarseToFine + * + * NOTE: findex and cindex are indexes on the fine and coarse index space, + * and do not stand for "F-pt index" and "C-pt index". 
+ *--------------------------------------------------------------------------*/ + + int + hypre_StructMapCoarseToFine( hypre_Index cindex, + hypre_Index index, + hypre_Index stride, + hypre_Index findex ) + { + hypre_IndexX(findex) = + hypre_IndexX(cindex) * hypre_IndexX(stride) + hypre_IndexX(index); + hypre_IndexY(findex) = + hypre_IndexY(cindex) * hypre_IndexY(stride) + hypre_IndexY(index); + hypre_IndexZ(findex) = + hypre_IndexZ(cindex) * hypre_IndexZ(stride) + hypre_IndexZ(index); + + return 0; + } + + #if 1 + + /*-------------------------------------------------------------------------- + * hypre_StructCoarsen - NEW + * + * This routine coarsens the grid, 'fgrid', by the coarsening factor, + * 'stride', using the index mapping in 'hypre_StructMapFineToCoarse'. + * The basic algorithm is as follows: + * + * 1. Coarsen the neighborhood boxes. + * + * 2. Loop through neighborhood boxes, and compute the minimum + * positive outside xyz distances from local boxes to neighbor boxes. + * If some xyz distance is less than desired, receive neighborhood + * information from the neighbor box processor. + * + * 3. Loop through neighborhood boxes, and compute the minimum + * positive outside xyz distances from neighbor boxes to local boxes. + * If some xyz distance is less than desired, send neighborhood + * information to the neighbor box processor. + * + * 4. If the boolean variable, 'prune', is nonzero, eliminate boxes of + * size zero from the coarse grid. + * + * Notes: + * + * 1. All neighborhood info is sent. + * + * 2. Positive outside difference, d, is defined as follows: + * + * |<---- d ---->| + * ------ ------ + * | | | | + * | box1 | | box2 | + * | | | | + * ------ ------ + * + * 3. Neighborhoods must contain all boxes associated with the + * processor where it lives. In particular, "periodic boxes", (i.e., + * those boxes that were shifted to handle periodicity) associated + * with local boxes should remain in the neighborhood. 
The neighbor + * class routines insure this. + * + * 4. Processor numbers must appear in non-decreasing order in the + * neighborhood box array, and IDs must be unique and appear in + * increasing order. + * + * 5. Neighborhood information only needs to be exchanged the first + * time a box boundary moves within the max_distance perimeter. + * + * 6. Boxes of size zero must also be considered when determining + * neighborhood information exchanges. + * + * 7. This routine will work only if the coarsening factor is <= 2. + * To extend this algorithm to work with larger coarsening factors, + * more than one exchange of neighbor information will be needed after + * each processor coarsens its own neighborhood. + * + *--------------------------------------------------------------------------*/ + + #define hypre_StructCoarsenBox(box, index, stride) \ + hypre_ProjectBox(box, index, stride);\ + hypre_StructMapFineToCoarse(hypre_BoxIMin(box), index, stride,\ + hypre_BoxIMin(box));\ + hypre_StructMapFineToCoarse(hypre_BoxIMax(box), index, stride,\ + hypre_BoxIMax(box)) + + int + hypre_StructCoarsen( hypre_StructGrid *fgrid, + hypre_Index index, + hypre_Index stride, + int prune, + hypre_StructGrid **cgrid_ptr ) + { + int ierr = 0; + + hypre_StructGrid *cgrid; + + MPI_Comm comm; + int dim; + hypre_BoxNeighbors *neighbors; + hypre_BoxArray *hood_boxes; + int num_hood; + int *hood_procs; + int *hood_ids; + int first_local; + int num_local; + int num_periodic; + int max_distance; + hypre_Box *bounding_box; + hypre_Index periodic; + + MPI_Request *send_requests; + MPI_Status *send_status; + int *send_buffer; + int send_size; + MPI_Request *recv_requests; + MPI_Status *recv_status; + int **recv_buffers; + int *recv_sizes; + int my_rank; + + int *send_procs; + int *recv_procs; + int num_sends; + int num_recvs; + + hypre_BoxArray *new_hood_boxes; + int new_num_hood; + int *new_hood_procs; + int *new_hood_ids; + int new_first_local; + int new_num_local; + int new_num_periodic; + + 
hypre_Box *box; + hypre_Box *local_box; + hypre_Box *neighbor_box; + hypre_Box *local_cbox; + hypre_Box *neighbor_cbox; + hypre_Index imin; + hypre_Index imax; + int alloc_size; + + double perimeter_count, cperimeter_count; + /*double diff, distance, perimeter_count, cperimeter_count;*/ + + int *iarray; + int *jrecv; + int i, j, d, ilocal; + int data_id, min_id, jj; + + /*----------------------------------------- + * Copy needed info from fgrid + *-----------------------------------------*/ + + comm = hypre_StructGridComm(fgrid); + dim = hypre_StructGridDim(fgrid); + neighbors = hypre_StructGridNeighbors(fgrid); + hood_boxes = hypre_BoxArrayDuplicate(hypre_BoxNeighborsBoxes(neighbors)); + num_hood = hypre_BoxArraySize(hood_boxes); + + iarray = hypre_BoxNeighborsProcs(neighbors); + hood_procs = hypre_TAlloc(int, num_hood); + for (i = 0; i < num_hood; i++) + { + hood_procs[i] = iarray[i]; + } + + iarray = hypre_BoxNeighborsIDs(neighbors); + hood_ids = hypre_TAlloc(int, num_hood); + for (i = 0; i < num_hood; i++) + { + hood_ids[i] = iarray[i]; + } + + first_local = hypre_BoxNeighborsFirstLocal(neighbors); + num_local = hypre_BoxNeighborsNumLocal(neighbors); + num_periodic = hypre_BoxNeighborsNumPeriodic(neighbors); + + max_distance = hypre_StructGridMaxDistance(fgrid); + bounding_box = hypre_BoxDuplicate(hypre_StructGridBoundingBox(fgrid)); + hypre_CopyIndex(hypre_StructGridPeriodic(fgrid), periodic); + + MPI_Comm_rank(comm, &my_rank); + + #if DEBUG + sprintf(filename, "zcoarsen.%05d", my_rank); + + if ((file = fopen(filename, "a")) == NULL) + { + printf("Error: can't open output file %s\n", filename); + exit(1); + } + + fprintf(file, "\n\n============================\n\n"); + fprintf(file, "\n\n%d\n\n", debug_count++); + fprintf(file, "num_hood = %d\n", num_hood); + for (i = 0; i < num_hood; i++) + { + box = hypre_BoxArrayBox(hood_boxes, i); + fprintf(file, "(%d,%d,%d) X (%d,%d,%d) ; (%d,%d); %d\n", + hypre_BoxIMinX(box),hypre_BoxIMinY(box),hypre_BoxIMinZ(box), + 
hypre_BoxIMaxX(box),hypre_BoxIMaxY(box),hypre_BoxIMaxZ(box), + hood_procs[i], hood_ids[i], hypre_BoxVolume(box)); + } + fprintf(file, "first_local = %d\n", first_local); + fprintf(file, "num_local = %d\n", num_local); + fprintf(file, "num_periodic = %d\n", num_periodic); + #endif + + /*----------------------------------------- + * Coarsen bounding box + *-----------------------------------------*/ + + hypre_StructCoarsenBox(bounding_box, index, stride); + + /*----------------------------------------- + * Coarsen neighborhood boxes & determine + * send / recv procs + * + * NOTE: Currently, this always communicates + * with all neighboring processes. + *-----------------------------------------*/ + + local_cbox = hypre_BoxCreate(); + neighbor_cbox = hypre_BoxCreate(); + + num_recvs = 0; + num_sends = 0; + recv_procs = NULL; + send_procs = NULL; + for (i = 0; i < num_hood; i++) + { + if (hood_procs[i] != my_rank) + { + for (j = 0; j < num_local; j++) + { + ilocal = first_local + j; + + local_box = hypre_BoxArrayBox(hood_boxes, ilocal); + neighbor_box = hypre_BoxArrayBox(hood_boxes, i); + + /* coarsen boxes being considered */ + hypre_CopyBox(local_box, local_cbox); + hypre_StructCoarsenBox(local_cbox, index, stride); + hypre_CopyBox(neighbor_box, neighbor_cbox); + hypre_StructCoarsenBox(neighbor_cbox, index, stride); + + /*----------------------- + * Receive info? 
+ *-----------------------*/ + + /* always communicate */ + #if 0 + perimeter_count = 0; + cperimeter_count = 0; + for (d = 0; d < 3; d++) + { + distance = max_distance; + diff = hypre_BoxIMaxD(neighbor_box, d) - + hypre_BoxIMaxD(local_box, d); + if (diff > 0) + { + distance = hypre_min(distance, diff); + } + diff = hypre_BoxIMinD(local_box, d) - + hypre_BoxIMinD(neighbor_box, d); + if (diff > 0) + { + distance = hypre_min(distance, diff); + } + if (distance < max_distance) + { + perimeter_count++; + } + + distance = max_distance; + diff = hypre_BoxIMaxD(neighbor_cbox, d) - + hypre_BoxIMaxD(local_cbox, d); + if (diff > 0) + { + distance = hypre_min(distance, diff); + } + diff = hypre_BoxIMinD(local_cbox, d) - + hypre_BoxIMinD(neighbor_cbox, d); + if (diff > 0) + { + distance = hypre_min(distance, diff); + } + if (distance < max_distance) + { + cperimeter_count++; + } + } + #else + perimeter_count = 0; + cperimeter_count = 1; + #endif + if (cperimeter_count > perimeter_count) + { + if (num_recvs == 0) + { + recv_procs = hypre_TAlloc(int, num_hood); + recv_procs[num_recvs] = hood_procs[i]; + num_recvs++; + } + else if (hood_procs[i] != recv_procs[num_recvs-1]) + { + recv_procs[num_recvs] = hood_procs[i]; + num_recvs++; + } + } + + /*----------------------- + * Send info? 
+ *-----------------------*/ + + /* always communicate */ + #if 0 + perimeter_count = 0; + cperimeter_count = 0; + for (d = 0; d < 3; d++) + { + distance = max_distance; + diff = hypre_BoxIMaxD(local_box, d) - + hypre_BoxIMaxD(neighbor_box, d); + if (diff > 0) + { + distance = hypre_min(distance, diff); + } + diff = hypre_BoxIMinD(neighbor_box, d) - + hypre_BoxIMinD(local_box, d); + if (diff > 0) + { + distance = hypre_min(distance, diff); + } + if (distance < max_distance) + { + perimeter_count++; + } + + distance = max_distance; + diff = hypre_BoxIMaxD(local_cbox, d) - + hypre_BoxIMaxD(neighbor_cbox, d); + if (diff > 0) + { + distance = hypre_min(distance, diff); + } + diff = hypre_BoxIMinD(neighbor_cbox, d) - + hypre_BoxIMinD(local_cbox, d); + if (diff > 0) + { + distance = hypre_min(distance, diff); + } + if (distance < max_distance) + { + cperimeter_count++; + } + } + #else + perimeter_count = 0; + cperimeter_count = 1; + #endif + if (cperimeter_count > perimeter_count) + { + if (num_sends == 0) + { + send_procs = hypre_TAlloc(int, num_hood); + send_procs[num_sends] = hood_procs[i]; + num_sends++; + } + else if (hood_procs[i] != send_procs[num_sends-1]) + { + send_procs[num_sends] = hood_procs[i]; + num_sends++; + } + } + } + } + } + + hypre_BoxDestroy(local_cbox); + hypre_BoxDestroy(neighbor_cbox); + + /* coarsen neighborhood boxes */ + for (i = 0; i < num_hood; i++) + { + box = hypre_BoxArrayBox(hood_boxes, i); + hypre_StructCoarsenBox(box, index, stride); + } + + #if DEBUG + fprintf(file, "num_recvs = %d\n", num_recvs); + for (i = 0; i < num_recvs; i++) + { + fprintf(file, "%d ", recv_procs[i]); + } + fprintf(file, "\n"); + fprintf(file, "num_sends = %d\n", num_sends); + for (i = 0; i < num_sends; i++) + { + fprintf(file, "%d ", send_procs[i]); + } + fprintf(file, "\n"); + + fflush(file); + fclose(file); + #endif + + /*----------------------------------------- + * Exchange neighbor info with other procs + *-----------------------------------------*/ + + /* 
neighbor size info - post receives */ + if (num_recvs) + { + recv_requests = hypre_TAlloc(MPI_Request, num_recvs); + recv_status = hypre_TAlloc(MPI_Status, num_recvs); + + recv_sizes = hypre_TAlloc(int, num_recvs); + for (i = 0; i < num_recvs; i++) + { + MPI_Irecv(&recv_sizes[i], 1, MPI_INT, + recv_procs[i], 0, comm, &recv_requests[i]); + } + } + + /* neighbor size info - post sends */ + if (num_sends) + { + send_requests = hypre_TAlloc(MPI_Request, num_sends); + send_status = hypre_TAlloc(MPI_Status, num_sends); + + send_size = 8 * hypre_BoxArraySize(hood_boxes); + for (i = 0; i < num_sends; i++) + { + MPI_Isend(&send_size, 1, MPI_INT, + send_procs[i], 0, comm, &send_requests[i]); + } + } + + /* neighbor size info - complete receives */ + if (num_recvs) + { + MPI_Waitall(num_recvs, recv_requests, recv_status); + } + + /* neighbor size info - complete sends */ + if (num_sends) + { + MPI_Waitall(num_sends, send_requests, send_status); + } + + /*-----------------------------------------*/ + + /* neighbor info - post receives */ + if (num_recvs) + { + recv_buffers = hypre_TAlloc(int *, num_recvs); + for (i = 0; i < num_recvs; i++) + { + recv_buffers[i] = hypre_SharedTAlloc(int, recv_sizes[i]); + MPI_Irecv(recv_buffers[i], recv_sizes[i], MPI_INT, + recv_procs[i], 0, comm, &recv_requests[i]); + } + } + + /* neighbor info - post sends */ + if (num_sends) + { + /* pack the send buffer */ + send_buffer = hypre_SharedTAlloc(int, send_size); + j = 0; + for (i = 0; i < num_hood; i++) + { + send_buffer[j++] = hood_ids[i]; + send_buffer[j++] = hood_procs[i]; + box = hypre_BoxArrayBox(hood_boxes, i); + for (d = 0; d < 3; d++) + { + send_buffer[j++] = hypre_BoxIMinD(box, d); + send_buffer[j++] = hypre_BoxIMaxD(box, d); + } + } + + for (i = 0; i < num_sends; i++) + { + MPI_Isend(send_buffer, send_size, MPI_INT, + send_procs[i], 0, comm, &send_requests[i]); + } + } + + /* neighbor info - complete receives */ + if (num_recvs) + { + MPI_Waitall(num_recvs, recv_requests, recv_status); 
+ + hypre_TFree(recv_requests); + hypre_TFree(recv_status); + } + + /* neighbor info - complete sends */ + if (num_sends) + { + MPI_Waitall(num_sends, send_requests, send_status); + + hypre_TFree(send_requests); + hypre_TFree(send_status); + hypre_TFree(send_buffer); + } + + /*----------------------------------------- + * Unpack the recv buffers to create + * new neighborhood info + *-----------------------------------------*/ + + if (num_recvs) + { + alloc_size = num_hood; + new_hood_boxes = hypre_BoxArrayCreate(alloc_size); + hypre_BoxArraySetSize(new_hood_boxes, 0); + new_hood_procs = hypre_TAlloc(int, alloc_size); + new_hood_ids = hypre_TAlloc(int, alloc_size); + + box = hypre_BoxCreate(); + + j = 0; + jrecv = hypre_CTAlloc(int, num_recvs); + new_num_hood = 0; + while (1) + { + data_id = -2; + + /* inspect neighborhood */ + if (j < num_hood) + { + if (data_id == -2) + { + min_id = hood_ids[j]; + data_id = -1; + } + else if (hood_ids[j] < min_id) + { + min_id = hood_ids[j]; + data_id = -1; + } + else if (hood_ids[j] == min_id) + { + j++; + } + } + + /* inspect recv buffer neighborhoods */ + for (i = 0; i < num_recvs; i++) + { + jj = jrecv[i]; + if (jj < recv_sizes[i]) + { + if (data_id == -2) + { + min_id = recv_buffers[i][jj]; + data_id = i; + } + else if (recv_buffers[i][jj] < min_id) + { + min_id = recv_buffers[i][jj]; + data_id = i; + } + else if (recv_buffers[i][jj] == min_id) + { + jrecv[i] += 8; + } + } + } + + /* put data into new neighborhood structures */ + if (data_id > -2) + { + if (new_num_hood == alloc_size) + { + alloc_size += num_hood; + new_hood_procs = + hypre_TReAlloc(new_hood_procs, int, alloc_size); + new_hood_ids = + hypre_TReAlloc(new_hood_ids, int, alloc_size); + } + + if (data_id == -1) + { + /* get data from neighborhood */ + new_hood_procs[new_num_hood] = hood_procs[j]; + new_hood_ids[new_num_hood] = hood_ids[j]; + hypre_AppendBox(hypre_BoxArrayBox(hood_boxes, j), + new_hood_boxes); + if (j == first_local) + { + new_first_local = 
new_num_hood; + } + + j++; + } + else + { + /* get data from recv buffer neighborhoods */ + jj = jrecv[data_id]; + new_hood_ids[new_num_hood] = recv_buffers[data_id][jj++]; + new_hood_procs[new_num_hood] = recv_buffers[data_id][jj++]; + for (d = 0; d < 3; d++) + { + hypre_IndexD(imin, d) = recv_buffers[data_id][jj++]; + hypre_IndexD(imax, d) = recv_buffers[data_id][jj++]; + } + hypre_BoxSetExtents(box, imin, imax); + hypre_AppendBox(box, new_hood_boxes); + jrecv[data_id] = jj; + } + + new_num_hood++; + } + else + { + break; + } + } + + for (i = 0; i < num_recvs; i++) + { + hypre_TFree(recv_buffers[i]); + } + hypre_TFree(recv_buffers); + hypre_TFree(recv_sizes); + + hypre_BoxDestroy(box); + hypre_TFree(jrecv); + + hypre_BoxArrayDestroy(hood_boxes); + hypre_TFree(hood_procs); + hypre_TFree(hood_ids); + + hood_boxes = new_hood_boxes; + num_hood = new_num_hood; + hood_procs = new_hood_procs; + hood_ids = new_hood_ids; + first_local = new_first_local; + } + + hypre_TFree(send_procs); + hypre_TFree(recv_procs); + + /*----------------------------------------- + * Eliminate boxes of size zero + *-----------------------------------------*/ + + if (prune) + { + j = 0; + new_first_local = -1; + new_num_local = 0; + new_num_periodic = 0; + for (i = 0; i < num_hood; i++) + { + box = hypre_BoxArrayBox(hood_boxes, i); + if ( hypre_BoxVolume(box) ) + { + hypre_CopyBox(box, hypre_BoxArrayBox(hood_boxes, j)); + hood_procs[j] = hood_procs[i]; + hood_ids[j] = hood_ids[i]; + if ((i >= first_local) && + (i < first_local + num_local)) + { + if (new_first_local == -1) + { + new_first_local = j; + } + new_num_local++; + } + else if ((i >= first_local + num_local) && + (i < first_local + num_local + num_periodic)) + { + new_num_periodic++; + } + j++; + } + } + num_hood = j; + hypre_BoxArraySetSize(hood_boxes, num_hood); + first_local = new_first_local; + num_local = new_num_local; + num_periodic = new_num_periodic; + } + + /*----------------------------------------- + * Build the coarse 
grid + *-----------------------------------------*/ + + hypre_StructGridCreate(comm, dim, &cgrid); + + /* set neighborhood */ + hypre_StructGridSetHood(cgrid, hood_boxes, hood_procs, hood_ids, + first_local, num_local, num_periodic, bounding_box); + + hypre_StructGridSetHoodInfo(cgrid, max_distance); + + /* set periodicity */ + for (d = 0; d < dim; d++) + { + if (hypre_IndexD(periodic, d) > 0) + { + hypre_IndexD(periodic, d) = + hypre_IndexD(periodic, d) / hypre_IndexD(stride, d); + } + } + hypre_StructGridSetPeriodic(cgrid, periodic); + + hypre_StructGridAssemble(cgrid); + + *cgrid_ptr = cgrid; + + return ierr; + } + + #undef hypre_StructCoarsenBox + + #else + + /*-------------------------------------------------------------------------- + * hypre_StructCoarsen - TEMPORARY + *--------------------------------------------------------------------------*/ + + int + hypre_StructCoarsen( hypre_StructGrid *fgrid, + hypre_Index index, + hypre_Index stride, + int prune, + hypre_StructGrid **cgrid_ptr ) + { + int ierr = 0; + + hypre_StructGrid *cgrid; + + MPI_Comm comm = hypre_StructGridComm(fgrid); + int dim = hypre_StructGridDim(fgrid); + hypre_BoxArray *boxes; + hypre_Index periodic; + + hypre_Box *box; + + int i, d; + + hypre_StructGridCreate(comm, dim, &cgrid); + + /* coarsen boxes */ + boxes = hypre_BoxArrayDuplicate(hypre_StructGridBoxes(fgrid)); + hypre_ProjectBoxArray(boxes, index, stride); + for (i = 0; i < hypre_BoxArraySize(boxes); i++) + { + box = hypre_BoxArrayBox(boxes, i); + hypre_StructMapFineToCoarse(hypre_BoxIMin(box), index, stride, + hypre_BoxIMin(box)); + hypre_StructMapFineToCoarse(hypre_BoxIMax(box), index, stride, + hypre_BoxIMax(box)); + } + + /* set boxes */ + hypre_StructGridSetBoxes(cgrid, boxes); + + /* set periodicity */ + hypre_CopyIndex(hypre_StructGridPeriodic(fgrid), periodic); + for (d = 0; d < dim; d++) + { + if (hypre_IndexD(periodic, d) > 0) + { + hypre_IndexD(periodic, d) = + hypre_IndexD(periodic, d) / hypre_IndexD(stride, d); + } + 
} + hypre_StructGridSetPeriodic(cgrid, periodic); + + hypre_StructGridAssemble(cgrid); + + *cgrid_ptr = cgrid; + + return ierr; + } + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/communication.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/communication.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/communication.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,1569 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #include "headers.h" + + /*==========================================================================*/ + /*==========================================================================*/ + /** Create a communication package. A grid-based description of a + communication exchange is passed in. This description is then + compiled into an intermediate processor-based description of the + communication. It may further be compiled into a form based on the + message-passing layer in the routine hypre\_CommitCommPkg. This + proceeds as follows based on several compiler flags: + + \begin{itemize} + \item If HYPRE\_COMM\_SIMPLE is defined, the intermediate + processor-based description is not compiled into a form based on + the message-passing layer. This intermediate description is used + directly to pack and unpack buffers during the communications. + No MPI derived datatypes are used. + \item Else if HYPRE\_COMM\_VOLATILE is defined, the communication + package is not committed, and the intermediate processor-based + description is retained. The package is committed at communication + time. 
+ \item Else the communication package is committed, and the intermediate + processor-based description is freed up. + \end{itemize} + + {\bf Note:} + The input boxes and processes are destroyed. + + {\bf Input files:} + headers.h + + @return Communication package. + + @param send_boxes [IN] + description of the grid data to be sent to other processors. + @param recv_boxes [IN] + description of the grid data to be received from other processors. + @param send_data_space [IN] + description of the stored grid data associated with the sends. + @param recv_data_space [IN] + description of the stored grid data associated with the receives. + @param send_processes [IN] + processors that data is to be sent to. + @param recv_processes [IN] + processors that data is to be received from. + @param num_values [IN] + number of data values to be sent for each grid index. + @param comm [IN] + communicator. + + @see hypre_CommPkgCreateInfo, hypre_CommPkgCommit, hypre_CommPkgDestroy */ + /*--------------------------------------------------------------------------*/ + + hypre_CommPkg * + hypre_CommPkgCreate( hypre_BoxArrayArray *send_boxes, + hypre_BoxArrayArray *recv_boxes, + hypre_Index send_stride, + hypre_Index recv_stride, + hypre_BoxArray *send_data_space, + hypre_BoxArray *recv_data_space, + int **send_processes, + int **recv_processes, + int num_values, + MPI_Comm comm, + hypre_Index periodic ) + { + hypre_CommPkg *comm_pkg; + + int num_sends; + int *send_procs; + hypre_CommType **send_types; + int num_recvs; + int *recv_procs; + hypre_CommType **recv_types; + + hypre_CommType *copy_from_type; + hypre_CommType *copy_to_type; + + int i; + + /*------------------------------------------------------ + * Put arguments into hypre_CommPkg + *------------------------------------------------------*/ + + comm_pkg = hypre_CTAlloc(hypre_CommPkg, 1); + + hypre_CommPkgNumValues(comm_pkg) = num_values; + hypre_CommPkgComm(comm_pkg) = comm; + + 
/*------------------------------------------------------ + * Set up communication information + *------------------------------------------------------*/ + + hypre_CommPkgCreateInfo(send_boxes, send_stride, + send_data_space, send_processes, + num_values, comm, periodic, + &num_sends, &send_procs, + &send_types, &copy_from_type); + + hypre_CommPkgNumSends(comm_pkg) = num_sends; + hypre_CommPkgSendProcs(comm_pkg) = send_procs; + hypre_CommPkgSendTypes(comm_pkg) = send_types; + hypre_CommPkgCopyFromType(comm_pkg) = copy_from_type; + + hypre_CommPkgCreateInfo(recv_boxes, recv_stride, + recv_data_space, recv_processes, + num_values, comm, periodic, + &num_recvs, &recv_procs, + &recv_types, &copy_to_type); + + hypre_CommPkgNumRecvs(comm_pkg) = num_recvs; + hypre_CommPkgRecvProcs(comm_pkg) = recv_procs; + hypre_CommPkgRecvTypes(comm_pkg) = recv_types; + hypre_CommPkgCopyToType(comm_pkg) = copy_to_type; + + /*------------------------------------------------------ + * Destroy the input boxes and processes + *------------------------------------------------------*/ + + hypre_ForBoxArrayI(i, send_boxes) + hypre_TFree(send_processes[i]); + hypre_BoxArrayArrayDestroy(send_boxes); + hypre_TFree(send_processes); + + hypre_ForBoxArrayI(i, recv_boxes) + hypre_TFree(recv_processes[i]); + hypre_BoxArrayArrayDestroy(recv_boxes); + hypre_TFree(recv_processes); + + #if defined(HYPRE_COMM_SIMPLE) || defined(HYPRE_COMM_VOLATILE) + #else + hypre_CommPkgCommit(comm_pkg); + + /* free up comm types */ + for (i = 0; i < hypre_CommPkgNumSends(comm_pkg); i++) + hypre_CommTypeDestroy(hypre_CommPkgSendType(comm_pkg, i)); + hypre_TFree(hypre_CommPkgSendTypes(comm_pkg)); + for (i = 0; i < hypre_CommPkgNumRecvs(comm_pkg); i++) + hypre_CommTypeDestroy(hypre_CommPkgRecvType(comm_pkg, i)); + hypre_TFree(hypre_CommPkgRecvTypes(comm_pkg)); + #endif + + return comm_pkg; + } + + /*==========================================================================*/ + 
/*==========================================================================*/ + /** Destroy a communication package. + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_pkg [IN/OUT] + communication package. + + @see hypre_CommPkgCreate */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommPkgDestroy( hypre_CommPkg *comm_pkg ) + { + int ierr = 0; + #if defined(HYPRE_COMM_SIMPLE) || defined(HYPRE_COMM_VOLATILE) + int i; + #else + #endif + + if (comm_pkg) + { + #if defined(HYPRE_COMM_SIMPLE) || defined(HYPRE_COMM_VOLATILE) + /* free up comm types */ + for (i = 0; i < hypre_CommPkgNumSends(comm_pkg); i++) + hypre_CommTypeDestroy(hypre_CommPkgSendType(comm_pkg, i)); + hypre_TFree(hypre_CommPkgSendTypes(comm_pkg)); + for (i = 0; i < hypre_CommPkgNumRecvs(comm_pkg); i++) + hypre_CommTypeDestroy(hypre_CommPkgRecvType(comm_pkg, i)); + hypre_TFree(hypre_CommPkgRecvTypes(comm_pkg)); + #else + hypre_CommPkgUnCommit(comm_pkg); + #endif + + hypre_TFree(hypre_CommPkgSendProcs(comm_pkg)); + hypre_TFree(hypre_CommPkgRecvProcs(comm_pkg)); + + hypre_CommTypeDestroy(hypre_CommPkgCopyFromType(comm_pkg)); + hypre_CommTypeDestroy(hypre_CommPkgCopyToType(comm_pkg)); + + hypre_TFree(comm_pkg); + } + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Initialize a non-blocking communication exchange. + + \begin{itemize} + \item If HYPRE\_COMM\_SIMPLE is defined, the communication buffers are + created, the send buffer is manually packed, and the communication + requests are posted. No MPI derived datatypes are used. + \item Else if HYPRE\_COMM\_VOLATILE is defined, the communication + package is committed, the communication requests are posted, then + the communication package is un-committed. + \item Else the communication requests are posted. 
+ \end{itemize} + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_pkg [IN] + communication package. + @param send_data [IN] + reference pointer for the send data. + @param recv_data [IN] + reference pointer for the recv data. + @param comm_handle [OUT] + communication handle. + + @see hypre_FinalizeCommunication, hypre_CommPkgCreate */ + /*--------------------------------------------------------------------------*/ + + #if defined(HYPRE_COMM_SIMPLE) + + int + hypre_InitializeCommunication( hypre_CommPkg *comm_pkg, + double *send_data, + double *recv_data, + hypre_CommHandle **comm_handle_ptr ) + { + int ierr = 0; + + hypre_CommHandle *comm_handle; + + int num_sends = hypre_CommPkgNumSends(comm_pkg); + int num_recvs = hypre_CommPkgNumRecvs(comm_pkg); + MPI_Comm comm = hypre_CommPkgComm(comm_pkg); + + int num_requests; + MPI_Request *requests; + MPI_Status *status; + double **send_buffers; + double **recv_buffers; + int *send_sizes; + int *recv_sizes; + + hypre_CommType *send_type; + hypre_CommTypeEntry *send_entry; + hypre_CommType *recv_type; + hypre_CommTypeEntry *recv_entry; + + int *length_array; + int *stride_array; + + double *iptr, *jptr, *kptr, *lptr, *bptr; + + int i, j, k, ii, jj, kk, ll; + int entry_size, total_size; + + /*-------------------------------------------------------------------- + * allocate requests and status + *--------------------------------------------------------------------*/ + + num_requests = num_sends + num_recvs; + requests = hypre_CTAlloc(MPI_Request, num_requests); + status = hypre_CTAlloc(MPI_Status, num_requests); + + /*-------------------------------------------------------------------- + * allocate buffers + *--------------------------------------------------------------------*/ + + /* allocate send buffers */ + send_buffers = hypre_TAlloc(double *, num_sends); + send_sizes = hypre_TAlloc(int, num_sends); + total_size = 0; + for (i = 0; i < num_sends; i++) + { + send_type = 
hypre_CommPkgSendType(comm_pkg, i); + + send_sizes[i] = 0; + for (j = 0; j < hypre_CommTypeNumEntries(send_type); j++) + { + send_entry = hypre_CommTypeCommEntry(send_type, j); + length_array = hypre_CommTypeEntryLengthArray(send_entry); + + entry_size = 1; + for (k = 0; k < 4; k++) + { + entry_size *= length_array[k]; + } + send_sizes[i] += entry_size; + } + + total_size += send_sizes[i]; + } + if (num_sends > 0) + { + send_buffers[0] = hypre_SharedTAlloc(double, total_size); + for (i = 1; i < num_sends; i++) + { + send_buffers[i] = send_buffers[i-1] + send_sizes[i-1]; + } + } + + /* allocate recv buffers */ + recv_buffers = hypre_TAlloc(double *, num_recvs); + recv_sizes = hypre_TAlloc(int, num_recvs); + total_size = 0; + for (i = 0; i < num_recvs; i++) + { + recv_type = hypre_CommPkgRecvType(comm_pkg, i); + + recv_sizes[i] = 0; + for (j = 0; j < hypre_CommTypeNumEntries(recv_type); j++) + { + recv_entry = hypre_CommTypeCommEntry(recv_type, j); + length_array = hypre_CommTypeEntryLengthArray(recv_entry); + + entry_size = 1; + for (k = 0; k < 4; k++) + { + entry_size *= length_array[k]; + } + recv_sizes[i] += entry_size; + } + + total_size += recv_sizes[i]; + } + if (num_recvs > 0) + { + recv_buffers[0] = hypre_SharedTAlloc(double, total_size); + for (i = 1; i < num_recvs; i++) + { + recv_buffers[i] = recv_buffers[i-1] + recv_sizes[i-1]; + } + } + + /*-------------------------------------------------------------------- + * pack send buffers + *--------------------------------------------------------------------*/ + + for (i = 0; i < num_sends; i++) + { + send_type = hypre_CommPkgSendType(comm_pkg, i); + + bptr = (double *) send_buffers[i]; + for (j = 0; j < hypre_CommTypeNumEntries(send_type); j++) + { + send_entry = hypre_CommTypeCommEntry(send_type, j); + length_array = hypre_CommTypeEntryLengthArray(send_entry); + stride_array = hypre_CommTypeEntryStrideArray(send_entry); + + lptr = send_data + hypre_CommTypeEntryOffset(send_entry); + for (ll = 0; ll < 
length_array[3]; ll++) + { + kptr = lptr; + for (kk = 0; kk < length_array[2]; kk++) + { + jptr = kptr; + for (jj = 0; jj < length_array[1]; jj++) + { + if (stride_array[0] == 1) + { + memcpy(bptr, jptr, length_array[0]*sizeof(double)); + } + else + { + iptr = jptr; + for (ii = 0; ii < length_array[0]; ii++) + { + bptr[ii] = *iptr; + iptr += stride_array[0]; + } + } + bptr += length_array[0]; + jptr += stride_array[1]; + } + kptr += stride_array[2]; + } + lptr += stride_array[3]; + } + } + } + + /*-------------------------------------------------------------------- + * post receives and initiate sends + *--------------------------------------------------------------------*/ + + j = 0; + for(i = 0; i < num_recvs; i++) + { + MPI_Irecv(recv_buffers[i], recv_sizes[i], MPI_DOUBLE, + hypre_CommPkgRecvProc(comm_pkg, i), + 0, comm, &requests[j++]); + } + for(i = 0; i < num_sends; i++) + { + MPI_Isend(send_buffers[i], send_sizes[i], MPI_DOUBLE, + hypre_CommPkgSendProc(comm_pkg, i), + 0, comm, &requests[j++]); + } + + hypre_ExchangeLocalData(comm_pkg, send_data, recv_data); + + /*-------------------------------------------------------------------- + * set up comm_handle and return + *--------------------------------------------------------------------*/ + + comm_handle = hypre_TAlloc(hypre_CommHandle, 1); + + hypre_CommHandleCommPkg(comm_handle) = comm_pkg; + hypre_CommHandleSendData(comm_handle) = send_data; + hypre_CommHandleRecvData(comm_handle) = recv_data; + hypre_CommHandleNumRequests(comm_handle) = num_requests; + hypre_CommHandleRequests(comm_handle) = requests; + hypre_CommHandleStatus(comm_handle) = status; + hypre_CommHandleSendBuffers(comm_handle) = send_buffers; + hypre_CommHandleRecvBuffers(comm_handle) = recv_buffers; + hypre_CommHandleSendSizes(comm_handle) = send_sizes; + hypre_CommHandleRecvSizes(comm_handle) = recv_sizes; + + *comm_handle_ptr = comm_handle; + + return ierr; + } + + 
/*--------------------------------------------------------------------------*/ + + #else + + int + hypre_InitializeCommunication( hypre_CommPkg *comm_pkg, + double *send_data, + double *recv_data, + hypre_CommHandle **comm_handle_ptr ) + { + int ierr = 0; + + hypre_CommHandle *comm_handle; + + int num_sends = hypre_CommPkgNumSends(comm_pkg); + int num_recvs = hypre_CommPkgNumRecvs(comm_pkg); + MPI_Comm comm = hypre_CommPkgComm(comm_pkg); + + int num_requests; + MPI_Request *requests; + MPI_Status *status; + + int i, j; + + /*-------------------------------------------------------------------- + * allocate requests and status + *--------------------------------------------------------------------*/ + + num_requests = num_sends + num_recvs; + requests = hypre_CTAlloc(MPI_Request, num_requests); + status = hypre_CTAlloc(MPI_Status, num_requests); + + /*-------------------------------------------------------------------- + * post receives and initiate sends + *--------------------------------------------------------------------*/ + + #if defined(HYPRE_COMM_VOLATILE) + /* commit the communication package */ + hypre_CommPkgCommit(comm_pkg); + #else + #endif + + j = 0; + for(i = 0; i < num_recvs; i++) + { + MPI_Irecv((void *)recv_data, 1, + hypre_CommPkgRecvMPIType(comm_pkg, i), + hypre_CommPkgRecvProc(comm_pkg, i), + 0, comm, &requests[j++]); + } + for(i = 0; i < num_sends; i++) + { + MPI_Isend((void *)send_data, 1, + hypre_CommPkgSendMPIType(comm_pkg, i), + hypre_CommPkgSendProc(comm_pkg, i), + 0, comm, &requests[j++]); + } + + #if defined(HYPRE_COMM_VOLATILE) + /* un-commit the communication package */ + hypre_CommPkgUnCommit(comm_pkg); + #else + #endif + + hypre_ExchangeLocalData(comm_pkg, send_data, recv_data); + + /*-------------------------------------------------------------------- + * set up comm_handle and return + *--------------------------------------------------------------------*/ + + comm_handle = hypre_TAlloc(hypre_CommHandle, 1); + + 
hypre_CommHandleCommPkg(comm_handle) = comm_pkg; + hypre_CommHandleSendData(comm_handle) = send_data; + hypre_CommHandleRecvData(comm_handle) = recv_data; + hypre_CommHandleNumRequests(comm_handle) = num_requests; + hypre_CommHandleRequests(comm_handle) = requests; + hypre_CommHandleStatus(comm_handle) = status; + + *comm_handle_ptr = comm_handle; + + return ierr; + } + + #endif + + /*==========================================================================*/ + /*==========================================================================*/ + /** Finalize a communication exchange. This routine blocks until all + of the communication requests are completed. + + \begin{itemize} + \item If HYPRE\_COMM\_SIMPLE is defined, the communication requests + are completed, and the receive buffer is manually unpacked. + \item Else if HYPRE\_COMM\_VOLATILE is defined, the communication requests + are completed and the communication package is un-committed. + \item Else the communication requests are completed. + \end{itemize} + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_handle [IN/OUT] + communication handle. 
+ + @see hypre_InitializeCommunication, hypre_CommPkgCreate */ + /*--------------------------------------------------------------------------*/ + + #if defined(HYPRE_COMM_SIMPLE) + + int + hypre_FinalizeCommunication( hypre_CommHandle *comm_handle ) + { + + int ierr = 0; + + hypre_CommPkg *comm_pkg = hypre_CommHandleCommPkg(comm_handle); + double **send_buffers = hypre_CommHandleSendBuffers(comm_handle); + double **recv_buffers = hypre_CommHandleRecvBuffers(comm_handle); + int *send_sizes = hypre_CommHandleSendSizes(comm_handle); + int *recv_sizes = hypre_CommHandleRecvSizes(comm_handle); + int num_sends = hypre_CommPkgNumSends(comm_pkg); + int num_recvs = hypre_CommPkgNumRecvs(comm_pkg); + + hypre_CommType *recv_type; + hypre_CommTypeEntry *recv_entry; + + int *length_array; + int *stride_array; + + double *iptr, *jptr, *kptr, *lptr, *bptr; + + int i, j, ii, jj, kk, ll; + + /*-------------------------------------------------------------------- + * finish communications + *--------------------------------------------------------------------*/ + + if (hypre_CommHandleNumRequests(comm_handle)) + { + MPI_Waitall(hypre_CommHandleNumRequests(comm_handle), + hypre_CommHandleRequests(comm_handle), + hypre_CommHandleStatus(comm_handle)); + } + + /*-------------------------------------------------------------------- + * unpack recv buffers + *--------------------------------------------------------------------*/ + + for (i = 0; i < num_recvs; i++) + { + recv_type = hypre_CommPkgRecvType(comm_pkg, i); + + bptr = (double *) recv_buffers[i]; + for (j = 0; j < hypre_CommTypeNumEntries(recv_type); j++) + { + recv_entry = hypre_CommTypeCommEntry(recv_type, j); + length_array = hypre_CommTypeEntryLengthArray(recv_entry); + stride_array = hypre_CommTypeEntryStrideArray(recv_entry); + + lptr = hypre_CommHandleRecvData(comm_handle) + + hypre_CommTypeEntryOffset(recv_entry); + for (ll = 0; ll < length_array[3]; ll++) + { + kptr = lptr; + for (kk = 0; kk < length_array[2]; kk++) + { + 
jptr = kptr; + for (jj = 0; jj < length_array[1]; jj++) + { + if (stride_array[0] == 1) + { + memcpy(jptr, bptr, length_array[0]*sizeof(double)); + } + else + { + iptr = jptr; + for (ii = 0; ii < length_array[0]; ii++) + { + *iptr = bptr[ii]; + iptr += stride_array[0]; + } + } + bptr += length_array[0]; + jptr += stride_array[1]; + } + kptr += stride_array[2]; + } + lptr += stride_array[3]; + } + } + } + + /*-------------------------------------------------------------------- + * Free up communication handle + *--------------------------------------------------------------------*/ + + hypre_TFree(hypre_CommHandleRequests(comm_handle)); + hypre_TFree(hypre_CommHandleStatus(comm_handle)); + if (num_sends > 0) + { + hypre_SharedTFree(send_buffers[0]); + } + if (num_recvs > 0) + { + hypre_SharedTFree(recv_buffers[0]); + } + hypre_TFree(send_buffers); + hypre_TFree(recv_buffers); + hypre_TFree(send_sizes); + hypre_TFree(recv_sizes); + hypre_TFree(comm_handle); + + return ierr; + } + + #else + + int + hypre_FinalizeCommunication( hypre_CommHandle *comm_handle ) + { + int ierr = 0; + + if (hypre_CommHandleNumRequests(comm_handle)) + { + MPI_Waitall(hypre_CommHandleNumRequests(comm_handle), + hypre_CommHandleRequests(comm_handle), + hypre_CommHandleStatus(comm_handle)); + } + + /*-------------------------------------------------------------------- + * Free up communication handle + *--------------------------------------------------------------------*/ + + hypre_TFree(hypre_CommHandleRequests(comm_handle)); + hypre_TFree(hypre_CommHandleStatus(comm_handle)); + hypre_TFree(comm_handle); + + return ierr; + } + + #endif + + /*==========================================================================*/ + /*==========================================================================*/ + /** Execute local data exchanges. + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_pkg [IN] + communication package. 
+ @param send_data [IN] + reference pointer for the send data. + @param recv_data [IN] + reference pointer for the recv data. + + @see hypre_InitializeCommunication */ + /*--------------------------------------------------------------------------*/ + + int + hypre_ExchangeLocalData( hypre_CommPkg *comm_pkg, + double *send_data, + double *recv_data ) + { + hypre_CommType *copy_from_type; + hypre_CommType *copy_to_type; + hypre_CommTypeEntry *copy_from_entry; + hypre_CommTypeEntry *copy_to_entry; + + double *from_dp; + int *from_stride_array; + int from_i; + double *to_dp; + int *to_stride_array; + int to_i; + + int *length_array; + int i0, i1, i2, i3; + + int i; + int ierr = 0; + + /*-------------------------------------------------------------------- + * copy local data + *--------------------------------------------------------------------*/ + + copy_from_type = hypre_CommPkgCopyFromType(comm_pkg); + copy_to_type = hypre_CommPkgCopyToType(comm_pkg); + + for (i = 0; i < hypre_CommTypeNumEntries(copy_from_type); i++) + { + copy_from_entry = hypre_CommTypeCommEntry(copy_from_type, i); + copy_to_entry = hypre_CommTypeCommEntry(copy_to_type, i); + + from_dp = send_data + hypre_CommTypeEntryOffset(copy_from_entry); + to_dp = recv_data + hypre_CommTypeEntryOffset(copy_to_entry); + + /* copy data only when necessary */ + if (to_dp != from_dp) + { + length_array = hypre_CommTypeEntryLengthArray(copy_from_entry); + + from_stride_array = hypre_CommTypeEntryStrideArray(copy_from_entry); + to_stride_array = hypre_CommTypeEntryStrideArray(copy_to_entry); + + for (i3 = 0; i3 < length_array[3]; i3++) + { + for (i2 = 0; i2 < length_array[2]; i2++) + { + for (i1 = 0; i1 < length_array[1]; i1++) + { + from_i = (i3*from_stride_array[3] + + i2*from_stride_array[2] + + i1*from_stride_array[1] ); + to_i = (i3*to_stride_array[3] + + i2*to_stride_array[2] + + i1*to_stride_array[1] ); + for (i0 = 0; i0 < length_array[0]; i0++) + { + to_dp[to_i] = from_dp[from_i]; + + from_i += 
from_stride_array[0]; + to_i += to_stride_array[0]; + } + } + } + } + } + } + + return ( ierr ); + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Create a communication type. + + {\bf Input files:} + headers.h + + @return Communication type. + + @param comm_entries [IN] + array of pointers to communication type entries. + @param num_entries [IN] + number of elements in comm\_entries array. + + @see hypre_CommTypeDestroy */ + /*--------------------------------------------------------------------------*/ + + hypre_CommType * + hypre_CommTypeCreate( hypre_CommTypeEntry **comm_entries, + int num_entries ) + { + hypre_CommType *comm_type; + + comm_type = hypre_TAlloc(hypre_CommType, 1); + + hypre_CommTypeCommEntries(comm_type) = comm_entries; + hypre_CommTypeNumEntries(comm_type) = num_entries; + + return comm_type; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Destroy a communication type. + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_type [IN] + communication type. 
+ + @see hypre_CommTypeCreate */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommTypeDestroy( hypre_CommType *comm_type ) + { + int ierr = 0; + hypre_CommTypeEntry *comm_entry; + int i; + + if (comm_type) + { + if ( hypre_CommTypeCommEntries(comm_type) != NULL ) + { + for (i = 0; i < hypre_CommTypeNumEntries(comm_type); i++) + { + comm_entry = hypre_CommTypeCommEntry(comm_type, i); + hypre_CommTypeEntryDestroy(comm_entry); + } + } + + hypre_TFree(hypre_CommTypeCommEntries(comm_type)); + hypre_TFree(comm_type); + } + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Create a communication type entry. + + {\bf Input files:} + headers.h + + @return Communication type entry. + + @param box [IN] + description of the grid data to be communicated. + @param data_box [IN] + description of the stored grid data. + @param num_values [IN] + number of data values to be communicated for each grid index. + @param data_box_offset [IN] + offset from some location in memory of the data associated with the + imin index of data_box. 
+ + @see hypre_CommTypeEntryDestroy */ + /*--------------------------------------------------------------------------*/ + + hypre_CommTypeEntry * + hypre_CommTypeEntryCreate( hypre_Box *box, + hypre_Index stride, + hypre_Box *data_box, + int num_values, + int data_box_offset ) + { + hypre_CommTypeEntry *comm_entry; + + int *length_array; + int *stride_array; + + hypre_Index size; + int i, j, dim; + + comm_entry = hypre_TAlloc(hypre_CommTypeEntry, 1); + + /*------------------------------------------------------ + * Set imin, imax, and offset + *------------------------------------------------------*/ + + hypre_CopyIndex(hypre_BoxIMin(box), + hypre_CommTypeEntryIMin(comm_entry)); + hypre_CopyIndex(hypre_BoxIMax(box), + hypre_CommTypeEntryIMax(comm_entry)); + + hypre_CommTypeEntryOffset(comm_entry) = + data_box_offset + hypre_BoxIndexRank(data_box, hypre_BoxIMin(box)); + + /*------------------------------------------------------ + * Set length_array, stride_array, and dim + *------------------------------------------------------*/ + + length_array = hypre_CommTypeEntryLengthArray(comm_entry); + stride_array = hypre_CommTypeEntryStrideArray(comm_entry); + + /* initialize length_array */ + hypre_BoxGetStrideSize(box, stride, size); + for (i = 0; i < 3; i++) + length_array[i] = hypre_IndexD(size, i); + length_array[3] = num_values; + + /* initialize stride_array */ + for (i = 0; i < 3; i++) + { + stride_array[i] = hypre_IndexD(stride, i); + for (j = 0; j < i; j++) + stride_array[i] *= hypre_BoxSizeD(data_box, j); + } + stride_array[3] = hypre_BoxVolume(data_box); + + /* eliminate dimensions with length_array = 1 */ + dim = 4; + i = 0; + while (i < dim) + { + if(length_array[i] == 1) + { + for(j = i; j < (dim - 1); j++) + { + length_array[j] = length_array[j+1]; + stride_array[j] = stride_array[j+1]; + } + length_array[dim - 1] = 1; + stride_array[dim - 1] = 1; + dim--; + } + else + { + i++; + } + } + + #if 0 + /* sort the array according to length_array (largest to 
smallest) */ + for (i = (dim-1); i > 0; i--) + for (j = 0; j < i; j++) + if (length_array[j] < length_array[j+1]) + { + i_tmp = length_array[j]; + length_array[j] = length_array[j+1]; + length_array[j+1] = i_tmp; + + i_tmp = stride_array[j]; + stride_array[j] = stride_array[j+1]; + stride_array[j+1] = i_tmp; + } + #endif + + /* if every len was 1 we need to fix to communicate at least one */ + if(!dim) + dim = 1; + + hypre_CommTypeEntryDim(comm_entry) = dim; + + return comm_entry; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Destroy a communication type entry. + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_entry [IN/OUT] + communication type entry. + + @see hypre_CommTypeEntryCreate */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommTypeEntryDestroy( hypre_CommTypeEntry *comm_entry ) + { + int ierr = 0; + + if (comm_entry) + { + hypre_TFree(comm_entry); + } + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Compute a processor-based description of a communication from a + grid-based one. Used to construct a communication package. + + {\bf Input files:} + headers.h + + @return Error code. + + @param boxes [IN] + description of the grid data to be communicated to other processors. + @param data_space [IN] + description of the stored grid data associated with the communications. + @param processes [IN] + processors that data is to be communicated with. + @param num_values [IN] + number of data values to be communicated for each grid index. + @param comm [IN] + communicator. + @param num_comms_ptr [OUT] + number of communications. 
The number of communications is defined + by the number of processors involved in the communications, not + counting ``my processor''. + @param comm_processes_ptr [OUT] + processor + ranks involved in the communications. + @param comm_types_ptr [OUT] + inter-processor communication types. + @param copy_type_ptr [OUT] + intra-processor communication type (copies). + + @see hypre_CommPkgCreate, hypre_CommTypeSort */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommPkgCreateInfo( hypre_BoxArrayArray *boxes, + hypre_Index stride, + hypre_BoxArray *data_space, + int **processes, + int num_values, + MPI_Comm comm, + hypre_Index periodic, + int *num_comms_ptr, + int **comm_processes_ptr, + hypre_CommType ***comm_types_ptr, + hypre_CommType **copy_type_ptr) + { + int num_comms; + int *comm_processes; + hypre_CommType **comm_types; + hypre_CommType *copy_type; + + hypre_CommTypeEntry ***comm_entries; + int *num_entries; + + hypre_BoxArray *box_array; + hypre_Box *box; + hypre_Box *data_box; + int data_box_offset; + + int i, j, p, m; + int num_procs, my_proc; + + int ierr = 0; + + /*--------------------------------------------------------- + * Misc stuff + *---------------------------------------------------------*/ + + MPI_Comm_size(comm, &num_procs ); + MPI_Comm_rank(comm, &my_proc ); + + /*------------------------------------------------------ + * Loop over boxes and compute num_entries. 
+ *------------------------------------------------------*/ + + num_entries = hypre_CTAlloc(int, num_procs); + + num_comms = 0; + hypre_ForBoxArrayI(i, boxes) + { + box_array = hypre_BoxArrayArrayBoxArray(boxes, i); + + hypre_ForBoxI(j, box_array) + { + box = hypre_BoxArrayBox(box_array, j); + p = processes[i][j]; + + if (hypre_BoxVolume(box) != 0) + { + num_entries[p]++; + if ((num_entries[p] == 1) && (p != my_proc)) + { + num_comms++; + } + } + } + } + + /*------------------------------------------------------ + * Loop over boxes and compute comm_entries + * and comm_processes. + *------------------------------------------------------*/ + + comm_entries = hypre_CTAlloc(hypre_CommTypeEntry **, num_procs); + comm_processes = hypre_TAlloc(int, num_comms); + + m = 0; + data_box_offset = 0; + hypre_ForBoxArrayI(i, boxes) + { + box_array = hypre_BoxArrayArrayBoxArray(boxes, i); + data_box = hypre_BoxArrayBox(data_space, i); + + hypre_ForBoxI(j, box_array) + { + box = hypre_BoxArrayBox(box_array, j); + p = processes[i][j]; + + if (hypre_BoxVolume(box) != 0) + { + /* allocate comm_entries pointer */ + if (comm_entries[p] == NULL) + { + comm_entries[p] = + hypre_CTAlloc(hypre_CommTypeEntry *, num_entries[p]); + num_entries[p] = 0; + + if (p != my_proc) + { + comm_processes[m] = p; + m++; + } + } + + comm_entries[p][num_entries[p]] = + hypre_CommTypeEntryCreate(box, stride, data_box, + num_values, data_box_offset); + + num_entries[p]++; + } + } + + data_box_offset += hypre_BoxVolume(data_box) * num_values; + } + + /*------------------------------------------------------ + * Loop over comm_entries and build comm_types + *------------------------------------------------------*/ + + comm_types = hypre_TAlloc(hypre_CommType *, num_comms); + + for (m = 0; m < num_comms; m++) + { + p = comm_processes[m]; + comm_types[m] = hypre_CommTypeCreate(comm_entries[p], num_entries[p]); + hypre_CommTypeSort(comm_types[m], periodic); + } + + 
/*------------------------------------------------------ + * Build copy_type + *------------------------------------------------------*/ + + if (comm_entries[my_proc] != NULL) + { + p = my_proc; + copy_type = hypre_CommTypeCreate(comm_entries[p], num_entries[p]); + hypre_CommTypeSort(copy_type, periodic); + } + else + { + copy_type = hypre_CommTypeCreate(NULL, 0); + } + + /*------------------------------------------------------ + * Return + *------------------------------------------------------*/ + + hypre_TFree(comm_entries); + hypre_TFree(num_entries); + + *num_comms_ptr = num_comms; + *comm_processes_ptr = comm_processes; + *comm_types_ptr = comm_types; + *copy_type_ptr = copy_type; + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Sort the entries of a communication type. This routine is used to + maintain consistency in communications. + + {\bf Input files:} + headers.h + + {\bf Note:} + The entries are sorted by imin first. Entries with common imin are + then sorted by imax. This assumes that imin and imax define a unique + communication type. + + @return Error code. + + @param comm_type [IN/OUT] + communication type to be sorted. 
+ + @see hypre_CommPkgCreateInfo */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommTypeSort( hypre_CommType *comm_type, + hypre_Index periodic ) + { + hypre_CommTypeEntry **comm_entries = hypre_CommTypeCommEntries(comm_type); + int num_entries = hypre_CommTypeNumEntries(comm_type); + + hypre_CommTypeEntry *comm_entry; + hypre_IndexRef imin0, imin1; + int *imax0, *imax1; + int swap; + int i, j, ii, jj; + int ierr = 0; + + #if 1 + /*------------------------------------------------ + * Sort by imin: + *------------------------------------------------*/ + + for (i = (num_entries - 1); i > 0; i--) + { + for (j = 0; j < i; j++) + { + swap = 0; + imin0 = hypre_CommTypeEntryIMin(comm_entries[j]); + imin1 = hypre_CommTypeEntryIMin(comm_entries[j+1]); + if ( hypre_IModPeriodZ(imin0, periodic) > + hypre_IModPeriodZ(imin1, periodic) ) + { + swap = 1; + } + else if ( hypre_IModPeriodZ(imin0, periodic) == + hypre_IModPeriodZ(imin1, periodic) ) + { + if ( hypre_IModPeriodY(imin0, periodic) > + hypre_IModPeriodY(imin1, periodic) ) + { + swap = 1; + } + else if ( hypre_IModPeriodY(imin0, periodic) == + hypre_IModPeriodY(imin1, periodic) ) + { + if ( hypre_IModPeriodX(imin0, periodic) > + hypre_IModPeriodX(imin1, periodic) ) + { + swap = 1; + } + } + } + + if (swap) + { + comm_entry = comm_entries[j]; + comm_entries[j] = comm_entries[j+1]; + comm_entries[j+1] = comm_entry; + } + } + } + + /*------------------------------------------------ + * Sort entries with common imin by imax: + *------------------------------------------------*/ + + for (ii = 0; ii < (num_entries - 1); ii = jj) + { + /* want jj where entries ii through jj-1 have common imin */ + imin0 = hypre_CommTypeEntryIMin(comm_entries[ii]); + for (jj = (ii + 1); jj < num_entries; jj++) + { + imin1 = hypre_CommTypeEntryIMin(comm_entries[jj]); + if ( ( hypre_IModPeriodX(imin0, periodic) != + hypre_IModPeriodX(imin1, periodic) ) || + ( hypre_IModPeriodY(imin0, periodic) != 
+ hypre_IModPeriodY(imin1, periodic) ) || + ( hypre_IModPeriodZ(imin0, periodic) != + hypre_IModPeriodZ(imin1, periodic) ) ) + { + break; + } + } + + /* sort entries ii through jj-1 by imax */ + for (i = (jj - 1); i > ii; i--) + { + for (j = ii; j < i; j++) + { + swap = 0; + imax0 = hypre_CommTypeEntryIMax(comm_entries[j]); + imax1 = hypre_CommTypeEntryIMax(comm_entries[j+1]); + if ( hypre_IModPeriodZ(imax0, periodic) > + hypre_IModPeriodZ(imax1, periodic) ) + { + swap = 1; + } + else if ( hypre_IModPeriodZ(imax0, periodic) == + hypre_IModPeriodZ(imax1, periodic) ) + { + if ( hypre_IModPeriodY(imax0, periodic) > + hypre_IModPeriodY(imax1, periodic) ) + { + swap = 1; + } + else if ( hypre_IModPeriodY(imax0, periodic) == + hypre_IModPeriodY(imax1, periodic) ) + { + if ( hypre_IModPeriodX(imax0, periodic) > + hypre_IModPeriodX(imax1, periodic) ) + { + swap = 1; + } + } + } + + if (swap) + { + comm_entry = comm_entries[j]; + comm_entries[j] = comm_entries[j+1]; + comm_entries[j+1] = comm_entry; + } + } + } + } + #endif + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Compile a communication package into a form based on the + message-passing layer. + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_pkg [IN/OUT] + communication package. 
+ + @see hypre_CommPkgCreate, hypre_InitializeCommunication, + hypre_CommTypeBuildMPI, hypre_CommPkgUnCommit */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommPkgCommit( hypre_CommPkg *comm_pkg ) + { + int ierr = 0; + + /* create send MPI_Datatypes */ + hypre_CommPkgSendMPITypes(comm_pkg) = + hypre_TAlloc(MPI_Datatype, hypre_CommPkgNumSends(comm_pkg)); + hypre_CommTypeBuildMPI(hypre_CommPkgNumSends(comm_pkg), + hypre_CommPkgSendProcs(comm_pkg), + hypre_CommPkgSendTypes(comm_pkg), + hypre_CommPkgSendMPITypes(comm_pkg)); + + /* create recv MPI_Datatypes */ + hypre_CommPkgRecvMPITypes(comm_pkg) = + hypre_TAlloc(MPI_Datatype, hypre_CommPkgNumRecvs(comm_pkg)); + hypre_CommTypeBuildMPI(hypre_CommPkgNumRecvs(comm_pkg), + hypre_CommPkgRecvProcs(comm_pkg), + hypre_CommPkgRecvTypes(comm_pkg), + hypre_CommPkgRecvMPITypes(comm_pkg)); + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Destroy the message-passing-layer component of the communication + package. + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_pkg [IN/OUT] + communication package. 
+ + @see hypre_CommPkgCommit */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommPkgUnCommit( hypre_CommPkg *comm_pkg ) + { + MPI_Datatype *types; + int i; + int ierr = 0; + + if (comm_pkg) + { + types = hypre_CommPkgSendMPITypes(comm_pkg); + if (types) + { + for (i = 0; i < hypre_CommPkgNumSends(comm_pkg); i++) + MPI_Type_free(&types[i]); + hypre_TFree(types); + } + + types = hypre_CommPkgRecvMPITypes(comm_pkg); + if (types) + { + for (i = 0; i < hypre_CommPkgNumRecvs(comm_pkg); i++) + MPI_Type_free(&types[i]); + hypre_TFree(types); + } + } + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Create an MPI-based description of a communication from a + processor-based one. + + {\bf Input files:} + headers.h + + @return Error code. + + @param num_comms [IN] + number of communications. + @param comm_procs [IN] + processor ranks involved in the communications. + @param comm_types [IN] + processor-based communication types. + @param comm_mpi_types [OUT] + MPI derived data-types. 
+ + @see hypre_CommPkgCommit, hypre_CommTypeEntryBuildMPI */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommTypeBuildMPI( int num_comms, + int *comm_procs, + hypre_CommType **comm_types, + MPI_Datatype *comm_mpi_types ) + { + hypre_CommType *comm_type; + hypre_CommTypeEntry *comm_entry; + int num_entries; + int *comm_entry_blocklengths; + MPI_Aint *comm_entry_displacements; + MPI_Datatype *comm_entry_mpi_types; + + int m, i; + int ierr = 0; + + for (m = 0; m < num_comms; m++) + { + comm_type = comm_types[m]; + + num_entries = hypre_CommTypeNumEntries(comm_type); + comm_entry_blocklengths = hypre_TAlloc(int, num_entries); + comm_entry_displacements = hypre_TAlloc(MPI_Aint, num_entries); + comm_entry_mpi_types = hypre_TAlloc(MPI_Datatype, num_entries); + + for (i = 0; i < num_entries; i++) + { + comm_entry = hypre_CommTypeCommEntry(comm_type, i); + + /* set blocklengths */ + comm_entry_blocklengths[i] = 1; + + /* compute displacements */ + comm_entry_displacements[i] = + hypre_CommTypeEntryOffset(comm_entry) * sizeof(double); + + /* compute types */ + hypre_CommTypeEntryBuildMPI(comm_entry, &comm_entry_mpi_types[i]); + } + + /* create `comm_mpi_types' */ + MPI_Type_struct(num_entries, comm_entry_blocklengths, + comm_entry_displacements, comm_entry_mpi_types, + &comm_mpi_types[m]); + MPI_Type_commit(&comm_mpi_types[m]); + + /* free up memory */ + for (i = 0; i < num_entries; i++) + MPI_Type_free(&comm_entry_mpi_types[i]); + hypre_TFree(comm_entry_blocklengths); + hypre_TFree(comm_entry_displacements); + hypre_TFree(comm_entry_mpi_types); + } + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Create an MPI-based description of a communication entry. + + {\bf Input files:} + headers.h + + @return Error code. + + @param comm_entry [IN] + communication entry. 
+ @param comm_entry_mpi_type [OUT] + MPI derived data-type. + + @see hypre_CommTypeBuildMPI */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CommTypeEntryBuildMPI( hypre_CommTypeEntry *comm_entry, + MPI_Datatype *comm_entry_mpi_type ) + { + int dim = hypre_CommTypeEntryDim(comm_entry); + int *length_array = hypre_CommTypeEntryLengthArray(comm_entry); + int *stride_array = hypre_CommTypeEntryStrideArray(comm_entry); + + MPI_Datatype *old_type; + MPI_Datatype *new_type; + MPI_Datatype *tmp_type; + + int i; + int ierr = 0; + + if (dim == 1) + { + MPI_Type_hvector(length_array[0], 1, + (MPI_Aint)(stride_array[0]*sizeof(double)), + MPI_DOUBLE, comm_entry_mpi_type); + } + else + { + old_type = hypre_CTAlloc(MPI_Datatype, 1); + new_type = hypre_CTAlloc(MPI_Datatype, 1); + + MPI_Type_hvector(length_array[0], 1, + (MPI_Aint)(stride_array[0]*sizeof(double)), + MPI_DOUBLE, old_type); + for (i = 1; i < (dim - 1); i++) + { + MPI_Type_hvector(length_array[i], 1, + (MPI_Aint)(stride_array[i]*sizeof(double)), + *old_type, new_type); + + MPI_Type_free(old_type); + tmp_type = old_type; + old_type = new_type; + new_type = tmp_type; + + } + MPI_Type_hvector(length_array[i], 1, + (MPI_Aint)(stride_array[i]*sizeof(double)), + *old_type, comm_entry_mpi_type); + MPI_Type_free(old_type); + + hypre_TFree(old_type); + hypre_TFree(new_type); + } + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/communication_info.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/communication_info.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/communication_info.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,701 ---- + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, 
contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + *****************************************************************************/ + + #include "headers.h" + + /*==========================================================================*/ + /*==========================================================================*/ + /** Return descriptions of communications patterns for a given + grid-stencil computation. These patterns are defined by intersecting + the data dependencies of each box (including data dependencies within + the box) with its neighbor boxes. + + {\bf Note:} It is assumed that the grids neighbor information is + sufficiently large. + + {\bf Note:} No concept of data ownership is assumed. As a result, + problematic communications patterns can be produced when the grid + boxes overlap. For example, it is likely that some boxes will have + send and receive patterns that overlap. + + {\bf Input files:} + headers.h + + @return Error code. + + @param grid [IN] + computational grid + @param stencil [IN] + computational stencil + @param send_boxes_ptr [OUT] + description of the grid data to be sent to other processors. + @param recv_boxes_ptr [OUT] + description of the grid data to be received from other processors. + @param send_procs_ptr [OUT] + processors that data is to be sent to. + @param recv_procs_ptr [OUT] + processors that data is to be received from. 
+ + @see hypre_CreateComputeInfo */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CreateCommInfoFromStencil( hypre_StructGrid *grid, + hypre_StructStencil *stencil, + hypre_BoxArrayArray **send_boxes_ptr, + hypre_BoxArrayArray **recv_boxes_ptr, + int ***send_procs_ptr, + int ***recv_procs_ptr ) + { + int ierr = 0; + + /* output variables */ + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_procs; + int **recv_procs; + + /* internal variables */ + hypre_BoxArray *boxes = hypre_StructGridBoxes(grid); + + hypre_BoxNeighbors *neighbors; + hypre_BoxArray *neighbor_boxes; + int *neighbor_procs; + hypre_Box *neighbor_box; + + hypre_Index *stencil_shape; + hypre_IndexRef stencil_offset; + + hypre_Box *box; + hypre_Box *shift_box; + + hypre_BoxArray *send_box_array; + hypre_BoxArray *recv_box_array; + int send_box_array_size; + int recv_box_array_size; + + hypre_BoxArray **cbox_arrays; + int *cbox_arrays_i; + int num_cbox_arrays; + + int i, j, k, m, n; + int s, d; + + /* temporary work variables */ + hypre_Box *box0; + + /*------------------------------------------------------ + * Determine neighbors: + *------------------------------------------------------*/ + + neighbors = hypre_StructGridNeighbors(grid); + + /*------------------------------------------------------ + * Compute send/recv boxes and procs + *------------------------------------------------------*/ + + send_boxes = hypre_BoxArrayArrayCreate(hypre_BoxArraySize(boxes)); + recv_boxes = hypre_BoxArrayArrayCreate(hypre_BoxArraySize(boxes)); + send_procs = hypre_CTAlloc(int *, hypre_BoxArraySize(boxes)); + recv_procs = hypre_CTAlloc(int *, hypre_BoxArraySize(boxes)); + + stencil_shape = hypre_StructStencilShape(stencil); + + neighbor_boxes = hypre_BoxNeighborsBoxes(neighbors); + neighbor_procs = hypre_BoxNeighborsProcs(neighbors); + + box0 = hypre_BoxCreate(); + shift_box = hypre_BoxCreate(); + + cbox_arrays = + 
hypre_CTAlloc(hypre_BoxArray *, hypre_BoxArraySize(neighbor_boxes)); + cbox_arrays_i = + hypre_CTAlloc(int, hypre_BoxArraySize(neighbor_boxes)); + + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + hypre_CopyBox(box, shift_box); + + /*------------------------------------------------ + * Compute recv_box_array for box i + *------------------------------------------------*/ + + num_cbox_arrays = 0; + for (s = 0; s < hypre_StructStencilSize(stencil); s++) + { + stencil_offset = stencil_shape[s]; + + /* shift box by stencil_offset */ + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(shift_box, d) = + hypre_BoxIMinD(box, d) + hypre_IndexD(stencil_offset, d); + hypre_BoxIMaxD(shift_box, d) = + hypre_BoxIMaxD(box, d) + hypre_IndexD(stencil_offset, d); + } + + hypre_BeginBoxNeighborsLoop(j, neighbors, i, stencil_offset) + { + neighbor_box = hypre_BoxArrayBox(neighbor_boxes, j); + + hypre_IntersectBoxes(shift_box, neighbor_box, box0); + if (hypre_BoxVolume(box0)) + { + if (cbox_arrays[j] == NULL) + { + cbox_arrays[j] = hypre_BoxArrayCreate(0); + cbox_arrays_i[num_cbox_arrays] = j; + num_cbox_arrays++; + } + hypre_AppendBox(box0, cbox_arrays[j]); + } + } + hypre_EndBoxNeighborsLoop; + } + + /* union the boxes in cbox_arrays */ + recv_box_array_size = 0; + for (m = 0; m < num_cbox_arrays; m++) + { + j = cbox_arrays_i[m]; + hypre_UnionBoxes(cbox_arrays[j]); + recv_box_array_size += hypre_BoxArraySize(cbox_arrays[j]); + } + + /* create recv_box_array and recv_procs */ + recv_box_array = hypre_BoxArrayArrayBoxArray(recv_boxes, i); + hypre_BoxArraySetSize(recv_box_array, recv_box_array_size); + recv_procs[i] = hypre_CTAlloc(int, recv_box_array_size); + n = 0; + for (m = 0; m < num_cbox_arrays; m++) + { + j = cbox_arrays_i[m]; + hypre_ForBoxI(k, cbox_arrays[j]) + { + recv_procs[i][n] = neighbor_procs[j]; + hypre_CopyBox(hypre_BoxArrayBox(cbox_arrays[j], k), + hypre_BoxArrayBox(recv_box_array, n)); + n++; + } + hypre_BoxArrayDestroy(cbox_arrays[j]); + cbox_arrays[j] 
= NULL; + } + + /*------------------------------------------------ + * Compute send_box_array for box i + *------------------------------------------------*/ + + num_cbox_arrays = 0; + for (s = 0; s < hypre_StructStencilSize(stencil); s++) + { + stencil_offset = stencil_shape[s]; + + /* transpose stencil_offset */ + for (d = 0; d < 3; d++) + { + hypre_IndexD(stencil_offset, d) = + -hypre_IndexD(stencil_offset, d); + } + + /* shift box by transpose stencil_offset */ + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(shift_box, d) = + hypre_BoxIMinD(box, d) + hypre_IndexD(stencil_offset, d); + hypre_BoxIMaxD(shift_box, d) = + hypre_BoxIMaxD(box, d) + hypre_IndexD(stencil_offset, d); + } + + hypre_BeginBoxNeighborsLoop(j, neighbors, i, stencil_offset) + { + neighbor_box = hypre_BoxArrayBox(neighbor_boxes, j); + + hypre_IntersectBoxes(shift_box, neighbor_box, box0); + if (hypre_BoxVolume(box0)) + { + /* shift box0 back */ + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(box0, d) -= + hypre_IndexD(stencil_offset, d); + hypre_BoxIMaxD(box0, d) -= + hypre_IndexD(stencil_offset, d); + } + + if (cbox_arrays[j] == NULL) + { + cbox_arrays[j] = hypre_BoxArrayCreate(0); + cbox_arrays_i[num_cbox_arrays] = j; + num_cbox_arrays++; + } + hypre_AppendBox(box0, cbox_arrays[j]); + } + } + hypre_EndBoxNeighborsLoop; + + /* restore stencil_offset */ + for (d = 0; d < 3; d++) + { + hypre_IndexD(stencil_offset, d) = + -hypre_IndexD(stencil_offset, d); + } + } + + /* union the boxes in cbox_arrays */ + send_box_array_size = 0; + for (m = 0; m < num_cbox_arrays; m++) + { + j = cbox_arrays_i[m]; + hypre_UnionBoxes(cbox_arrays[j]); + send_box_array_size += hypre_BoxArraySize(cbox_arrays[j]); + } + + /* create send_box_array and send_procs */ + send_box_array = hypre_BoxArrayArrayBoxArray(send_boxes, i); + hypre_BoxArraySetSize(send_box_array, send_box_array_size); + send_procs[i] = hypre_CTAlloc(int, send_box_array_size); + n = 0; + for (m = 0; m < num_cbox_arrays; m++) + { + j = cbox_arrays_i[m]; 
+ hypre_ForBoxI(k, cbox_arrays[j]) + { + send_procs[i][n] = neighbor_procs[j]; + hypre_CopyBox(hypre_BoxArrayBox(cbox_arrays[j], k), + hypre_BoxArrayBox(send_box_array, n)); + n++; + } + hypre_BoxArrayDestroy(cbox_arrays[j]); + cbox_arrays[j] = NULL; + } + } + + hypre_TFree(cbox_arrays); + hypre_TFree(cbox_arrays_i); + + hypre_BoxDestroy(shift_box); + hypre_BoxDestroy(box0); + + /*------------------------------------------------------ + * Return + *------------------------------------------------------*/ + + *send_boxes_ptr = send_boxes; + *recv_boxes_ptr = recv_boxes; + *send_procs_ptr = send_procs; + *recv_procs_ptr = recv_procs; + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Return descriptions of communications patterns for a given grid + with "ghost zones". These patterns are defined by intersecting each + box, "grown" by the number of "ghost zones", with its neighbor boxes. + + {\bf Note:} It is assumed that the grids neighbor information is + sufficiently large. + + {\bf Input files:} + headers.h + + @return Error code. + + @param grid [IN] + computational grid + @param num_ghost [IN] + number of ghost zones in each direction + @param send_boxes_ptr [OUT] + description of the grid data to be sent to other processors. + @param recv_boxes_ptr [OUT] + description of the grid data to be received from other processors. + @param send_procs_ptr [OUT] + processors that data is to be sent to. + @param recv_procs_ptr [OUT] + processors that data is to be received from. 
+ + @see hypre_CreateCommInfoFromStencil */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CreateCommInfoFromNumGhost( hypre_StructGrid *grid, + int *num_ghost, + hypre_BoxArrayArray **send_boxes_ptr, + hypre_BoxArrayArray **recv_boxes_ptr, + int ***send_procs_ptr, + int ***recv_procs_ptr ) + { + int ierr = 0; + + /* output variables */ + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_procs; + int **recv_procs; + + /* internal variables */ + hypre_BoxArray *boxes = hypre_StructGridBoxes(grid); + int *ids = hypre_StructGridIDs(grid); + + hypre_BoxNeighbors *neighbors; + hypre_BoxArray *neighbor_boxes; + int *neighbor_procs; + int *neighbor_ids; + hypre_Box *neighbor_box; + + hypre_Box *box; + hypre_Box *grow_box; + + hypre_BoxArray *send_box_array; + hypre_BoxArray *recv_box_array; + int send_box_array_size; + int recv_box_array_size; + + hypre_BoxArray **cbox_arrays; + int *cbox_arrays_i; + int num_cbox_arrays; + + int i, j, k, m, n, d; + + /* temporary work variables */ + hypre_Box *box0; + + /*------------------------------------------------------ + * Determine neighbors: + *------------------------------------------------------*/ + + neighbors = hypre_StructGridNeighbors(grid); + + /*------------------------------------------------------ + * Compute send/recv boxes and procs + *------------------------------------------------------*/ + + send_boxes = hypre_BoxArrayArrayCreate(hypre_BoxArraySize(boxes)); + recv_boxes = hypre_BoxArrayArrayCreate(hypre_BoxArraySize(boxes)); + send_procs = hypre_CTAlloc(int *, hypre_BoxArraySize(boxes)); + recv_procs = hypre_CTAlloc(int *, hypre_BoxArraySize(boxes)); + + neighbor_boxes = hypre_BoxNeighborsBoxes(neighbors); + neighbor_procs = hypre_BoxNeighborsProcs(neighbors); + neighbor_ids = hypre_BoxNeighborsIDs(neighbors); + + box0 = hypre_BoxCreate(); + grow_box = hypre_BoxCreate(); + + cbox_arrays = + hypre_CTAlloc(hypre_BoxArray *, 
hypre_BoxArraySize(neighbor_boxes)); + cbox_arrays_i = + hypre_CTAlloc(int, hypre_BoxArraySize(neighbor_boxes)); + + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + hypre_CopyBox(box, grow_box); + + /*------------------------------------------------ + * Compute recv_box_array for box i + *------------------------------------------------*/ + + num_cbox_arrays = 0; + + /* grow box by num_ghost */ + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(grow_box, d) = + hypre_BoxIMinD(box, d) - num_ghost[2*d]; + hypre_BoxIMaxD(grow_box, d) = + hypre_BoxIMaxD(box, d) + num_ghost[2*d + 1]; + } + + hypre_ForBoxI(j, neighbor_boxes) + { + if (ids[i] != neighbor_ids[j]) + { + neighbor_box = hypre_BoxArrayBox(neighbor_boxes, j); + + hypre_IntersectBoxes(grow_box, neighbor_box, box0); + if (hypre_BoxVolume(box0)) + { + if (cbox_arrays[j] == NULL) + { + cbox_arrays[j] = hypre_BoxArrayCreate(0); + cbox_arrays_i[num_cbox_arrays] = j; + num_cbox_arrays++; + } + hypre_AppendBox(box0, cbox_arrays[j]); + } + } + } + + /* union the boxes in cbox_arrays */ + recv_box_array_size = 0; + for (m = 0; m < num_cbox_arrays; m++) + { + j = cbox_arrays_i[m]; + hypre_UnionBoxes(cbox_arrays[j]); + recv_box_array_size += hypre_BoxArraySize(cbox_arrays[j]); + } + + /* create recv_box_array and recv_procs */ + recv_box_array = hypre_BoxArrayArrayBoxArray(recv_boxes, i); + hypre_BoxArraySetSize(recv_box_array, recv_box_array_size); + recv_procs[i] = hypre_CTAlloc(int, recv_box_array_size); + n = 0; + for (m = 0; m < num_cbox_arrays; m++) + { + j = cbox_arrays_i[m]; + hypre_ForBoxI(k, cbox_arrays[j]) + { + recv_procs[i][n] = neighbor_procs[j]; + hypre_CopyBox(hypre_BoxArrayBox(cbox_arrays[j], k), + hypre_BoxArrayBox(recv_box_array, n)); + n++; + } + hypre_BoxArrayDestroy(cbox_arrays[j]); + cbox_arrays[j] = NULL; + } + + /*------------------------------------------------ + * Compute send_box_array for box i + *------------------------------------------------*/ + + num_cbox_arrays = 0; + + 
hypre_ForBoxI(j, neighbor_boxes) + { + if (ids[i] != neighbor_ids[j]) + { + neighbor_box = hypre_BoxArrayBox(neighbor_boxes, j); + + /* grow neighbor box by num_ghost */ + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(grow_box, d) = + hypre_BoxIMinD(neighbor_box, d) - num_ghost[2*d]; + hypre_BoxIMaxD(grow_box, d) = + hypre_BoxIMaxD(neighbor_box, d) + num_ghost[2*d + 1]; + } + + hypre_IntersectBoxes(box, grow_box, box0); + if (hypre_BoxVolume(box0)) + { + if (cbox_arrays[j] == NULL) + { + cbox_arrays[j] = hypre_BoxArrayCreate(0); + cbox_arrays_i[num_cbox_arrays] = j; + num_cbox_arrays++; + } + hypre_AppendBox(box0, cbox_arrays[j]); + } + } + } + + /* union the boxes in cbox_arrays */ + send_box_array_size = 0; + for (m = 0; m < num_cbox_arrays; m++) + { + j = cbox_arrays_i[m]; + hypre_UnionBoxes(cbox_arrays[j]); + send_box_array_size += hypre_BoxArraySize(cbox_arrays[j]); + } + + /* create send_box_array and send_procs */ + send_box_array = hypre_BoxArrayArrayBoxArray(send_boxes, i); + hypre_BoxArraySetSize(send_box_array, send_box_array_size); + send_procs[i] = hypre_CTAlloc(int, send_box_array_size); + n = 0; + for (m = 0; m < num_cbox_arrays; m++) + { + j = cbox_arrays_i[m]; + hypre_ForBoxI(k, cbox_arrays[j]) + { + send_procs[i][n] = neighbor_procs[j]; + hypre_CopyBox(hypre_BoxArrayBox(cbox_arrays[j], k), + hypre_BoxArrayBox(send_box_array, n)); + n++; + } + hypre_BoxArrayDestroy(cbox_arrays[j]); + cbox_arrays[j] = NULL; + } + } + + hypre_TFree(cbox_arrays); + hypre_TFree(cbox_arrays_i); + + hypre_BoxDestroy(grow_box); + hypre_BoxDestroy(box0); + + /*------------------------------------------------------ + * Return + *------------------------------------------------------*/ + + *send_boxes_ptr = send_boxes; + *recv_boxes_ptr = recv_boxes; + *send_procs_ptr = send_procs; + *recv_procs_ptr = recv_procs; + + return ierr; + } + + /*==========================================================================*/ + 
/*==========================================================================*/ + /** Return descriptions of communications patterns for migrating data + from one grid distribution to another. + + {\bf Input files:} + headers.h + + @return Error code. + + @param from_grid [IN] + grid distribution to migrate data from. + @param to_grid [IN] + grid distribution to migrate data to. + @param send_boxes_ptr [OUT] + description of the grid data to be sent to other processors. + @param recv_boxes_ptr [OUT] + description of the grid data to be received from other processors. + @param send_procs_ptr [OUT] + processors that data is to be sent to. + @param recv_procs_ptr [OUT] + processors that data is to be received from. + + @see hypre_StructMatrixMigrate, hypre_StructVectorMigrate */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CreateCommInfoFromGrids( hypre_StructGrid *from_grid, + hypre_StructGrid *to_grid, + hypre_BoxArrayArray **send_boxes_ptr, + hypre_BoxArrayArray **recv_boxes_ptr, + int ***send_procs_ptr, + int ***recv_procs_ptr ) + { + int ierr = 0; + + /* output variables */ + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_procs; + int **recv_procs; + + hypre_BoxArrayArray *comm_boxes; + int **comm_procs; + hypre_BoxArray *comm_box_array; + hypre_Box *comm_box; + + hypre_StructGrid *local_grid; + hypre_StructGrid *remote_grid; + + hypre_BoxArray *local_boxes; + hypre_BoxArray *remote_boxes; + hypre_BoxArray *remote_all_boxes; + int *remote_all_procs; + int remote_first_local; + + hypre_Box *local_box; + hypre_Box *remote_box; + + int i, j, k, r; + + /*------------------------------------------------------ + * Set up communication info + *------------------------------------------------------*/ + + for (r = 0; r < 2; r++) + { + switch(r) + { + case 0: + local_grid = from_grid; + remote_grid = to_grid; + break; + + case 1: + local_grid = to_grid; + remote_grid = from_grid; + break; 
+ } + + /*--------------------------------------------------- + * Compute comm_boxes and comm_procs + *---------------------------------------------------*/ + + local_boxes = hypre_StructGridBoxes(local_grid); + remote_boxes = hypre_StructGridBoxes(remote_grid); + hypre_GatherAllBoxes(hypre_StructGridComm(remote_grid), remote_boxes, + &remote_all_boxes, + &remote_all_procs, + &remote_first_local); + + comm_boxes = hypre_BoxArrayArrayCreate(hypre_BoxArraySize(local_boxes)); + comm_procs = hypre_CTAlloc(int *, hypre_BoxArraySize(local_boxes)); + + comm_box = hypre_BoxCreate(); + hypre_ForBoxI(i, local_boxes) + { + local_box = hypre_BoxArrayBox(local_boxes, i); + + comm_box_array = hypre_BoxArrayArrayBoxArray(comm_boxes, i); + comm_procs[i] = + hypre_CTAlloc(int, hypre_BoxArraySize(remote_all_boxes)); + + hypre_ForBoxI(j, remote_all_boxes) + { + remote_box = hypre_BoxArrayBox(remote_all_boxes, j); + + hypre_IntersectBoxes(local_box, remote_box, comm_box); + if (hypre_BoxVolume(comm_box)) + { + k = hypre_BoxArraySize(comm_box_array); + comm_procs[i][k] = remote_all_procs[j]; + + hypre_AppendBox(comm_box, comm_box_array); + } + } + + comm_procs[i] = + hypre_TReAlloc(comm_procs[i], + int, hypre_BoxArraySize(comm_box_array)); + } + hypre_BoxDestroy(comm_box); + + hypre_BoxArrayDestroy(remote_all_boxes); + hypre_TFree(remote_all_procs); + + switch(r) + { + case 0: + send_boxes = comm_boxes; + send_procs = comm_procs; + break; + + case 1: + recv_boxes = comm_boxes; + recv_procs = comm_procs; + break; + } + } + + /*------------------------------------------------------ + * Return + *------------------------------------------------------*/ + + *send_boxes_ptr = send_boxes; + *recv_boxes_ptr = recv_boxes; + *send_procs_ptr = send_procs; + *recv_procs_ptr = recv_procs; + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/computation.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/computation.c:1.1 *** /dev/null Mon Apr 11 
00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/computation.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,405 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + *****************************************************************************/ + + #include "headers.h" + + /*==========================================================================*/ + /*==========================================================================*/ + /** Return descriptions of communications and computations patterns + for a given grid-stencil computation. If HYPRE\_OVERLAP\_COMM\_COMP + is defined, then the patterns are computed to allow for overlapping + communications and computations. The default is no overlap. + + {\bf Note:} This routine assumes that the grid boxes do not overlap. + + {\bf Input files:} + headers.h + + @return Error code. + + @param grid [IN] + computational grid + @param stencil [IN] + computational stencil + @param send_boxes_ptr [OUT] + description of the grid data to be sent to other processors. + @param recv_boxes_ptr [OUT] + description of the grid data to be received from other processors. + @param send_processes_ptr [OUT] + processors that data is to be sent to. + @param recv_processes_ptr [OUT] + processors that data is to be received from. + @param indt_boxes_ptr [OUT] + description of computations that do not depend on communicated data. + @param dept_boxes_ptr [OUT] + description of computations that depend on communicated data. 
+ + @see hypre_CreateCommInfoFromStencil */ + /*--------------------------------------------------------------------------*/ + + int + hypre_CreateComputeInfo( hypre_StructGrid *grid, + hypre_StructStencil *stencil, + hypre_BoxArrayArray **send_boxes_ptr, + hypre_BoxArrayArray **recv_boxes_ptr, + int ***send_processes_ptr, + int ***recv_processes_ptr, + hypre_BoxArrayArray **indt_boxes_ptr, + hypre_BoxArrayArray **dept_boxes_ptr ) + { + int ierr = 0; + + /* output variables */ + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + + /* internal variables */ + hypre_BoxArray *boxes; + + hypre_BoxArray *cbox_array; + hypre_Box *cbox; + + int i; + + #ifdef HYPRE_OVERLAP_COMM_COMP + hypre_Box *rembox; + hypre_Index *stencil_shape; + int border[3][2] = {{0, 0}, {0, 0}, {0, 0}}; + int cbox_array_size; + int s, d; + #endif + + /*------------------------------------------------------ + * Extract needed grid info + *------------------------------------------------------*/ + + boxes = hypre_StructGridBoxes(grid); + + /*------------------------------------------------------ + * Get communication info + *------------------------------------------------------*/ + + hypre_CreateCommInfoFromStencil(grid, stencil, + &send_boxes, &recv_boxes, + &send_processes, &recv_processes); + + #ifdef HYPRE_OVERLAP_COMM_COMP + + /*------------------------------------------------------ + * Compute border info + *------------------------------------------------------*/ + + stencil_shape = hypre_StructStencilShape(stencil); + for (s = 0; s < hypre_StructStencilSize(stencil); s++) + { + for (d = 0; d < 3; d++) + { + i = hypre_IndexD(stencil_shape[s], d); + if (i < 0) + { + border[d][0] = hypre_max(border[d][0], -i); + } + else if (i > 0) + { + border[d][1] = hypre_max(border[d][1], i); + } + } + } + + /*------------------------------------------------------ + * Set 
up the dependent boxes + *------------------------------------------------------*/ + + dept_boxes = hypre_BoxArrayArrayCreate(hypre_BoxArraySize(boxes)); + + rembox = hypre_BoxCreate(); + hypre_ForBoxI(i, boxes) + { + cbox_array = hypre_BoxArrayArrayBoxArray(dept_boxes, i); + hypre_BoxArraySetSize(cbox_array, 6); + + hypre_CopyBox(hypre_BoxArrayBox(boxes, i), rembox); + cbox_array_size = 0; + for (d = 0; d < 3; d++) + { + if ( (hypre_BoxVolume(rembox)) && (border[d][0]) ) + { + cbox = hypre_BoxArrayBox(cbox_array, cbox_array_size); + hypre_CopyBox(rembox, cbox); + hypre_BoxIMaxD(cbox, d) = + hypre_BoxIMinD(cbox, d) + border[d][0] - 1; + hypre_BoxIMinD(rembox, d) = + hypre_BoxIMinD(cbox, d) + border[d][0]; + cbox_array_size++; + } + if ( (hypre_BoxVolume(rembox)) && (border[d][1]) ) + { + cbox = hypre_BoxArrayBox(cbox_array, cbox_array_size); + hypre_CopyBox(rembox, cbox); + hypre_BoxIMinD(cbox, d) = + hypre_BoxIMaxD(cbox, d) - border[d][1] + 1; + hypre_BoxIMaxD(rembox, d) = + hypre_BoxIMaxD(cbox, d) - border[d][1]; + cbox_array_size++; + } + } + hypre_BoxArraySetSize(cbox_array, cbox_array_size); + } + hypre_BoxDestroy(rembox); + + /*------------------------------------------------------ + * Set up the independent boxes + *------------------------------------------------------*/ + + indt_boxes = hypre_BoxArrayArrayCreate(hypre_BoxArraySize(boxes)); + + hypre_ForBoxI(i, boxes) + { + cbox_array = hypre_BoxArrayArrayBoxArray(indt_boxes, i); + hypre_BoxArraySetSize(cbox_array, 1); + cbox = hypre_BoxArrayBox(cbox_array, 0); + hypre_CopyBox(hypre_BoxArrayBox(boxes, i), cbox); + + for (d = 0; d < 3; d++) + { + if ( (border[d][0]) ) + { + hypre_BoxIMinD(cbox, d) += border[d][0]; + } + if ( (border[d][1]) ) + { + hypre_BoxIMaxD(cbox, d) -= border[d][1]; + } + } + } + + #else + + /*------------------------------------------------------ + * Set up the independent boxes + *------------------------------------------------------*/ + + indt_boxes = 
hypre_BoxArrayArrayCreate(hypre_BoxArraySize(boxes)); + + /*------------------------------------------------------ + * Set up the dependent boxes + *------------------------------------------------------*/ + + dept_boxes = hypre_BoxArrayArrayCreate(hypre_BoxArraySize(boxes)); + + hypre_ForBoxI(i, boxes) + { + cbox_array = hypre_BoxArrayArrayBoxArray(dept_boxes, i); + hypre_BoxArraySetSize(cbox_array, 1); + cbox = hypre_BoxArrayBox(cbox_array, 0); + hypre_CopyBox(hypre_BoxArrayBox(boxes, i), cbox); + } + + #endif + + /*------------------------------------------------------ + * Return + *------------------------------------------------------*/ + + *send_boxes_ptr = send_boxes; + *recv_boxes_ptr = recv_boxes; + *send_processes_ptr = send_processes; + *recv_processes_ptr = recv_processes; + *indt_boxes_ptr = indt_boxes; + *dept_boxes_ptr = dept_boxes; + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Create a computation package from a grid-based description of a + communication-computation pattern. + + {\bf Note:} + The input boxes and processes are destroyed. + + {\bf Input files:} + headers.h + + @return Error code. + + @param send_boxes [IN] + description of the grid data to be sent to other processors. + @param recv_boxes [IN] + description of the grid data to be received from other processors. + @param send_stride [IN] + stride to use for send data. + @param recv_stride [IN] + stride to use for receive data. + @param send_processes [IN] + processors that data is to be sent to. + @param recv_processes [IN] + processors that data is to be received from. + @param indt_boxes_ptr [IN] + description of computations that do not depend on communicated data. + @param dept_boxes_ptr [IN] + description of computations that depend on communicated data. + @param stride [IN] + stride to use for computations. 
+ @param grid [IN] + computational grid + @param data_space [IN] + description of the stored data associated with the grid. + @param num_values [IN] + number of data values associated with each grid index. + @param compute_pkg_ptr [OUT] + pointer to a computation package + + @see hypre_CommPkgCreate, hypre_ComputePkgDestroy */ + /*--------------------------------------------------------------------------*/ + + int + hypre_ComputePkgCreate( hypre_BoxArrayArray *send_boxes, + hypre_BoxArrayArray *recv_boxes, + hypre_Index send_stride, + hypre_Index recv_stride, + int **send_processes, + int **recv_processes, + hypre_BoxArrayArray *indt_boxes, + hypre_BoxArrayArray *dept_boxes, + hypre_Index stride, + hypre_StructGrid *grid, + hypre_BoxArray *data_space, + int num_values, + hypre_ComputePkg **compute_pkg_ptr ) + { + int ierr = 0; + hypre_ComputePkg *compute_pkg; + + compute_pkg = hypre_CTAlloc(hypre_ComputePkg, 1); + + hypre_ComputePkgCommPkg(compute_pkg) = + hypre_CommPkgCreate(send_boxes, recv_boxes, + send_stride, recv_stride, + data_space, data_space, + send_processes, recv_processes, + num_values, hypre_StructGridComm(grid), + hypre_StructGridPeriodic(grid)); + + hypre_ComputePkgIndtBoxes(compute_pkg) = indt_boxes; + hypre_ComputePkgDeptBoxes(compute_pkg) = dept_boxes; + hypre_CopyIndex(stride, hypre_ComputePkgStride(compute_pkg)); + + hypre_StructGridRef(grid, &hypre_ComputePkgGrid(compute_pkg)); + hypre_ComputePkgDataSpace(compute_pkg) = data_space; + hypre_ComputePkgNumValues(compute_pkg) = num_values; + + *compute_pkg_ptr = compute_pkg; + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Destroy a computation package. + + {\bf Input files:} + headers.h + + @return Error code. + + @param compute_pkg [IN/OUT] + computation package. 
+ + @see hypre_ComputePkgCreate */ + /*--------------------------------------------------------------------------*/ + + int + hypre_ComputePkgDestroy( hypre_ComputePkg *compute_pkg ) + { + int ierr = 0; + + if (compute_pkg) + { + hypre_CommPkgDestroy(hypre_ComputePkgCommPkg(compute_pkg)); + + hypre_BoxArrayArrayDestroy(hypre_ComputePkgIndtBoxes(compute_pkg)); + hypre_BoxArrayArrayDestroy(hypre_ComputePkgDeptBoxes(compute_pkg)); + + hypre_StructGridDestroy(hypre_ComputePkgGrid(compute_pkg)); + + hypre_TFree(compute_pkg); + } + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Initialize a non-blocking communication exchange. The independent + computations may be done after a call to this routine, to allow for + overlap of communications and computations. + + {\bf Input files:} + headers.h + + @return Error code. + + @param compute_pkg [IN] + computation package. + @param data [IN] + pointer to the associated data. + @param comm_handle [OUT] + communication handle. + + @see hypre_FinalizeIndtComputations, hypre_ComputePkgCreate, + hypre_InitializeCommunication */ + /*--------------------------------------------------------------------------*/ + + int + hypre_InitializeIndtComputations( hypre_ComputePkg *compute_pkg, + double *data, + hypre_CommHandle **comm_handle_ptr ) + { + int ierr = 0; + hypre_CommPkg *comm_pkg = hypre_ComputePkgCommPkg(compute_pkg); + + ierr = hypre_InitializeCommunication(comm_pkg, data, data, comm_handle_ptr); + + return ierr; + } + + /*==========================================================================*/ + /*==========================================================================*/ + /** Finalize a communication exchange. The dependent computations may + be done after a call to this routine. + + {\bf Input files:} + headers.h + + @return Error code. 
+ + @param comm_handle [IN/OUT] + communication handle. + + @see hypre_InitializeIndtComputations, hypre_FinalizeCommunication */ + /*--------------------------------------------------------------------------*/ + + int + hypre_FinalizeIndtComputations( hypre_CommHandle *comm_handle ) + { + int ierr = 0; + + ierr = hypre_FinalizeCommunication(comm_handle); + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/cyclic_reduction.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/cyclic_reduction.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/cyclic_reduction.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,1215 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * Cyclic reduction algorithm (coded as if it were a 1D MG method) + * + *****************************************************************************/ + + #include "headers.h" + + #define DEBUG 0 + + /*-------------------------------------------------------------------------- + * Macros + *--------------------------------------------------------------------------*/ + + #define hypre_CycRedSetCIndex(base_index, base_stride, level, cdir, cindex) \ + {\ + if (level > 0)\ + hypre_SetIndex(cindex, 0, 0, 0);\ + else\ + hypre_CopyIndex(base_index, cindex);\ + hypre_IndexD(cindex, cdir) += 0;\ + } + + #define hypre_CycRedSetFIndex(base_index, base_stride, level, cdir, findex) \ + {\ + if (level > 0)\ + hypre_SetIndex(findex, 0, 0, 0);\ + else\ + hypre_CopyIndex(base_index, findex);\ + hypre_IndexD(findex, cdir) += 
1;\ + } + + #define hypre_CycRedSetStride(base_index, base_stride, level, cdir, stride) \ + {\ + if (level > 0)\ + hypre_SetIndex(stride, 1, 1, 1);\ + else\ + hypre_CopyIndex(base_stride, stride);\ + hypre_IndexD(stride, cdir) *= 2;\ + } + + /*-------------------------------------------------------------------------- + * hypre_CyclicReductionData data structure + *--------------------------------------------------------------------------*/ + + typedef struct + { + MPI_Comm comm; + + int num_levels; + + int cdir; /* coarsening direction */ + hypre_Index base_index; + hypre_Index base_stride; + + hypre_StructGrid **grid_l; + + hypre_BoxArray *base_points; + hypre_BoxArray **fine_points_l; + + double *data; + hypre_StructMatrix **A_l; + hypre_StructVector **x_l; + + hypre_ComputePkg **down_compute_pkg_l; + hypre_ComputePkg **up_compute_pkg_l; + + int time_index; + int solve_flops; + + } hypre_CyclicReductionData; + + /*-------------------------------------------------------------------------- + * hypre_CyclicReductionCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_CyclicReductionCreate( MPI_Comm comm ) + { + hypre_CyclicReductionData *cyc_red_data; + + cyc_red_data = hypre_CTAlloc(hypre_CyclicReductionData, 1); + + (cyc_red_data -> comm) = comm; + (cyc_red_data -> cdir) = 0; + (cyc_red_data -> time_index) = hypre_InitializeTiming("CyclicReduction"); + + /* set defaults */ + hypre_SetIndex((cyc_red_data -> base_index), 0, 0, 0); + hypre_SetIndex((cyc_red_data -> base_stride), 1, 1, 1); + + return (void *) cyc_red_data; + } + + /*-------------------------------------------------------------------------- + * hypre_CycRedCreateCoarseOp + *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_CycRedCreateCoarseOp( hypre_StructMatrix *A, + hypre_StructGrid *coarse_grid, + int cdir ) + { + hypre_StructMatrix *Ac; + + hypre_Index *Ac_stencil_shape; + 
hypre_StructStencil *Ac_stencil; + int Ac_stencil_size; + int Ac_stencil_dim; + int Ac_num_ghost[] = {0, 0, 0, 0, 0, 0}; + + int i; + int stencil_rank; + + Ac_stencil_dim = 1; + + /*----------------------------------------------- + * Define Ac_stencil + *-----------------------------------------------*/ + + stencil_rank = 0; + + /*----------------------------------------------- + * non-symmetric case: + * + * 3 point fine grid stencil produces 3 point Ac + *-----------------------------------------------*/ + + if (!hypre_StructMatrixSymmetric(A)) + { + Ac_stencil_size = 3; + Ac_stencil_shape = hypre_CTAlloc(hypre_Index, Ac_stencil_size); + for (i = -1; i < 2; i++) + { + /* Storage for 3 elements (c,w,e) */ + hypre_SetIndex(Ac_stencil_shape[stencil_rank],i,0,0); + stencil_rank++; + } + } + + /*----------------------------------------------- + * symmetric case: + * + * 3 point fine grid stencil produces 3 point Ac + * + * Only store the lower triangular part + diagonal = 2 entries, + * lower triangular means the lower triangular part of the matrix + * in the standard lexicographic ordering. 
+ *-----------------------------------------------*/ + + else + { + Ac_stencil_size = 2; + Ac_stencil_shape = hypre_CTAlloc(hypre_Index, Ac_stencil_size); + for (i = -1; i < 1; i++) + { + + /* Storage for 2 elements in (c,w) */ + hypre_SetIndex(Ac_stencil_shape[stencil_rank],i,0,0); + stencil_rank++; + } + } + + Ac_stencil = hypre_StructStencilCreate(Ac_stencil_dim, Ac_stencil_size, + Ac_stencil_shape); + + Ac = hypre_StructMatrixCreate(hypre_StructMatrixComm(A), + coarse_grid, Ac_stencil); + + hypre_StructStencilDestroy(Ac_stencil); + + /*----------------------------------------------- + * Coarse operator is symmetric iff fine operator is + *-----------------------------------------------*/ + + hypre_StructMatrixSymmetric(Ac) = hypre_StructMatrixSymmetric(A); + + /*----------------------------------------------- + * Set number of ghost points + *-----------------------------------------------*/ + + Ac_num_ghost[2*cdir] = 1; + if (!hypre_StructMatrixSymmetric(A)) + { + Ac_num_ghost[2*cdir + 1] = 1; + } + hypre_StructMatrixSetNumGhost(Ac, Ac_num_ghost); + + hypre_StructMatrixInitializeShell(Ac); + + return Ac; + } + + /*-------------------------------------------------------------------------- + * hypre_CycRedSetupCoarseOp + *--------------------------------------------------------------------------*/ + + int + hypre_CycRedSetupCoarseOp( hypre_StructMatrix *A, + hypre_StructMatrix *Ac, + hypre_Index cindex, + hypre_Index cstride ) + + { + hypre_Index index; + + hypre_StructGrid *fgrid; + int *fgrid_ids; + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + int *cgrid_ids; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index fstart; + hypre_IndexRef stridef; + hypre_Index loop_size; + + int fi, ci; + int loopi, loopj, loopk; + + hypre_Box *A_dbox; + hypre_Box *Ac_dbox; + + double *a_cc, *a_cw, *a_ce; + double *ac_cc, *ac_cw, *ac_ce; + + int iA, iAm1, iAp1; + int iAc; + + int xOffsetA; + + int ierr = 0; + + stridef = cstride; + 
hypre_SetIndex(stridec, 1, 1, 1); + + fgrid = hypre_StructMatrixGrid(A); + fgrid_ids = hypre_StructGridIDs(fgrid); + + cgrid = hypre_StructMatrixGrid(Ac); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + + fi = 0; + hypre_ForBoxI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + hypre_StructMapCoarseToFine(cstart, cindex, cstride, fstart); + + A_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), fi); + Ac_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(Ac), ci); + + /*----------------------------------------------- + * Extract pointers for 3-point fine grid operator: + * + * a_cc is pointer for center coefficient + * a_cw is pointer for west coefficient + * a_ce is pointer for east coefficient + *-----------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + a_cc = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,0,0); + a_cw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,0,0); + a_ce = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + /*----------------------------------------------- + * Extract pointers for coarse grid operator - always 3-point: + * + * If A is symmetric so is Ac. We build only the + * lower triangular part (plus diagonal). + * + * ac_cc is pointer for center coefficient (etc.) 
+ *-----------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + ac_cc = hypre_StructMatrixExtractPointerByIndex(Ac, ci, index); + + hypre_SetIndex(index,-1,0,0); + ac_cw = hypre_StructMatrixExtractPointerByIndex(Ac, ci, index); + + if(!hypre_StructMatrixSymmetric(A)) + { + hypre_SetIndex(index,1,0,0); + ac_ce = hypre_StructMatrixExtractPointerByIndex(Ac, ci, index); + } + + /*----------------------------------------------- + * Define offsets for fine grid stencil and interpolation + * + * In the BoxLoop below I assume iA and iAc refer + * to data associated with the point which we are + * building the stencil for. The below offsets + * are used in referring to data associated with + * other points. + *-----------------------------------------------*/ + + hypre_SetIndex(index,1,0,0); + xOffsetA = hypre_BoxOffsetDistance(A_dbox,index); + + /*----------------------------------------------- + * non-symmetric case + *-----------------------------------------------*/ + + if(!hypre_StructMatrixSymmetric(A)) + { + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop2Begin(loop_size, + A_dbox, fstart, stridef, iA, + Ac_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iA,iAc,iAm1,iAp1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, iA, iAc) + { + iAm1 = iA - xOffsetA; + iAp1 = iA + xOffsetA; + + ac_cw[iAc] = - a_cw[iA] *a_cw[iAm1] / a_cc[iAm1]; + + ac_cc[iAc] = a_cc[iA] + - a_cw[iA] * a_ce[iAm1] / a_cc[iAm1] + - a_ce[iA] * a_cw[iAp1] / a_cc[iAp1]; + + ac_ce[iAc] = - a_ce[iA] *a_ce[iAp1] / a_cc[iAp1]; + + } + hypre_BoxLoop2End(iA, iAc); + } + + /*----------------------------------------------- + * symmetric case + *-----------------------------------------------*/ + + else + { + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop2Begin(loop_size, + A_dbox, fstart, stridef, iA, + Ac_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iA,iAc,iAm1,iAp1 + 
#include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, iA, iAc) + { + iAm1 = iA - xOffsetA; + iAp1 = iA + xOffsetA; + + ac_cw[iAc] = - a_cw[iA] *a_cw[iAm1] / a_cc[iAm1]; + + ac_cc[iAc] = a_cc[iA] + - a_cw[iA] * a_ce[iAm1] / a_cc[iAm1] + - a_ce[iA] * a_cw[iAp1] / a_cc[iAp1]; + } + hypre_BoxLoop2End(iA, iAc); + } + + } /* end ForBoxI */ + + hypre_StructMatrixAssemble(Ac); + + /*----------------------------------------------------------------------- + * Collapse stencil in periodic direction on coarsest grid. + *-----------------------------------------------------------------------*/ + + if (hypre_IndexX(hypre_StructGridPeriodic(cgrid)) == 1) + { + hypre_ForBoxI(ci, cgrid_boxes) + { + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + + Ac_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(Ac), ci); + + /*----------------------------------------------- + * Extract pointers for coarse grid operator - always 3-point: + * + * If A is symmetric so is Ac. We build only the + * lower triangular part (plus diagonal). + * + * ac_cc is pointer for center coefficient (etc.) 
+ *-----------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + ac_cc = hypre_StructMatrixExtractPointerByIndex(Ac, ci, index); + + hypre_SetIndex(index,-1,0,0); + ac_cw = hypre_StructMatrixExtractPointerByIndex(Ac, ci, index); + + if(!hypre_StructMatrixSymmetric(A)) + { + hypre_SetIndex(index,1,0,0); + ac_ce = hypre_StructMatrixExtractPointerByIndex(Ac, ci, index); + } + + + /*----------------------------------------------- + * non-symmetric case + *-----------------------------------------------*/ + + if(!hypre_StructMatrixSymmetric(A)) + { + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + Ac_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + ac_cc[iAc] += (ac_cw[iAc] + ac_ce[iAc]); + ac_cw[iAc] = 0.0; + ac_ce[iAc] = 0.0; + } + hypre_BoxLoop1End(iAc); + } + + /*----------------------------------------------- + * symmetric case + *-----------------------------------------------*/ + + else + { + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + Ac_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + ac_cc[iAc] += (2.0 * ac_cw[iAc]); + ac_cw[iAc] = 0.0; + } + hypre_BoxLoop1End(iAc); + } + + } /* end ForBoxI */ + + } + + hypre_StructMatrixAssemble(Ac); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_CyclicReductionSetup + *--------------------------------------------------------------------------*/ + + int + hypre_CyclicReductionSetup( void *cyc_red_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_CyclicReductionData *cyc_red_data = cyc_red_vdata; + + MPI_Comm comm = (cyc_red_data -> comm); + int cdir = (cyc_red_data -> cdir); + hypre_IndexRef 
base_index = (cyc_red_data -> base_index); + hypre_IndexRef base_stride = (cyc_red_data -> base_stride); + + int num_levels; + hypre_StructGrid **grid_l; + hypre_BoxArray *base_points; + hypre_BoxArray **fine_points_l; + double *data; + int data_size = 0; + hypre_StructMatrix **A_l; + hypre_StructVector **x_l; + hypre_ComputePkg **down_compute_pkg_l; + hypre_ComputePkg **up_compute_pkg_l; + + hypre_Index cindex; + hypre_Index findex; + hypre_Index stride; + + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + + hypre_StructGrid *grid; + + hypre_Box *cbox; + + int l; + int flop_divisor; + + int x_num_ghost[] = {0, 0, 0, 0, 0, 0}; + + int ierr = 0; + + /*----------------------------------------------------- + * Set up coarse grids + *-----------------------------------------------------*/ + + grid = hypre_StructMatrixGrid(A); + + /* Compute a preliminary num_levels value based on the grid */ + cbox = hypre_BoxDuplicate(hypre_StructGridBoundingBox(grid)); + num_levels = hypre_Log2(hypre_BoxSizeD(cbox, cdir)) + 2; + + grid_l = hypre_TAlloc(hypre_StructGrid *, num_levels); + hypre_StructGridRef(grid, &grid_l[0]); + for (l = 0; ; l++) + { + /* set cindex and stride */ + hypre_CycRedSetCIndex(base_index, base_stride, l, cdir, cindex); + hypre_CycRedSetStride(base_index, base_stride, l, cdir, stride); + + /* check to see if we should coarsen */ + if ( hypre_BoxIMinD(cbox, cdir) == hypre_BoxIMaxD(cbox, cdir) ) + { + /* stop coarsening */ + break; + } + + /* coarsen cbox */ + hypre_ProjectBox(cbox, cindex, stride); + hypre_StructMapFineToCoarse(hypre_BoxIMin(cbox), cindex, stride, + hypre_BoxIMin(cbox)); + hypre_StructMapFineToCoarse(hypre_BoxIMax(cbox), cindex, stride, + hypre_BoxIMax(cbox)); + + /* coarsen the grid */ + hypre_StructCoarsen(grid_l[l], cindex, stride, 1, &grid_l[l+1]); + } + num_levels = l + 1; + + /* free up some things */ 
+ hypre_BoxDestroy(cbox); + + (cyc_red_data -> num_levels) = num_levels; + (cyc_red_data -> grid_l) = grid_l; + + /*----------------------------------------------------- + * Set up base points + *-----------------------------------------------------*/ + + base_points = hypre_BoxArrayDuplicate(hypre_StructGridBoxes(grid_l[0])); + hypre_ProjectBoxArray(base_points, base_index, base_stride); + + (cyc_red_data -> base_points) = base_points; + + /*----------------------------------------------------- + * Set up fine points + *-----------------------------------------------------*/ + + fine_points_l = hypre_TAlloc(hypre_BoxArray *, num_levels); + + for (l = 0; l < (num_levels - 1); l++) + { + hypre_CycRedSetCIndex(base_index, base_stride, l, cdir, cindex); + hypre_CycRedSetFIndex(base_index, base_stride, l, cdir, findex); + hypre_CycRedSetStride(base_index, base_stride, l, cdir, stride); + + fine_points_l[l] = + hypre_BoxArrayDuplicate(hypre_StructGridBoxes(grid_l[l])); + hypre_ProjectBoxArray(fine_points_l[l], findex, stride); + } + + fine_points_l[l] = + hypre_BoxArrayDuplicate(hypre_StructGridBoxes(grid_l[l])); + if (num_levels == 1) + { + hypre_ProjectBoxArray(fine_points_l[l], base_index, base_stride); + } + + (cyc_red_data -> fine_points_l) = fine_points_l; + + /*----------------------------------------------------- + * Set up matrix and vector structures + *-----------------------------------------------------*/ + + A_l = hypre_TAlloc(hypre_StructMatrix *, num_levels); + x_l = hypre_TAlloc(hypre_StructVector *, num_levels); + + A_l[0] = hypre_StructMatrixRef(A); + x_l[0] = hypre_StructVectorRef(x); + + x_num_ghost[2*cdir] = 1; + x_num_ghost[2*cdir + 1] = 1; + + for (l = 0; l < (num_levels - 1); l++) + { + A_l[l+1] = hypre_CycRedCreateCoarseOp(A_l[l], grid_l[l+1], cdir); + data_size += hypre_StructMatrixDataSize(A_l[l+1]); + + x_l[l+1] = hypre_StructVectorCreate(comm, grid_l[l+1]); + hypre_StructVectorSetNumGhost(x_l[l+1], x_num_ghost); + 
hypre_StructVectorInitializeShell(x_l[l+1]); + data_size += hypre_StructVectorDataSize(x_l[l+1]); + } + + data = hypre_SharedCTAlloc(double, data_size); + + (cyc_red_data -> data) = data; + + for (l = 0; l < (num_levels - 1); l++) + { + hypre_StructMatrixInitializeData(A_l[l+1], data); + data += hypre_StructMatrixDataSize(A_l[l+1]); + hypre_StructVectorInitializeData(x_l[l+1], data); + hypre_StructVectorAssemble(x_l[l+1]); + data += hypre_StructVectorDataSize(x_l[l+1]); + } + + (cyc_red_data -> A_l) = A_l; + (cyc_red_data -> x_l) = x_l; + + /*----------------------------------------------------- + * Set up coarse grid operators + *-----------------------------------------------------*/ + + for (l = 0; l < (num_levels - 1); l++) + { + hypre_CycRedSetCIndex(base_index, base_stride, l, cdir, cindex); + hypre_CycRedSetStride(base_index, base_stride, l, cdir, stride); + + hypre_CycRedSetupCoarseOp(A_l[l], A_l[l+1], cindex, stride); + } + + /*---------------------------------------------------------- + * Set up compute packages + *----------------------------------------------------------*/ + + down_compute_pkg_l = hypre_TAlloc(hypre_ComputePkg *, (num_levels - 1)); + up_compute_pkg_l = hypre_TAlloc(hypre_ComputePkg *, (num_levels - 1)); + + for (l = 0; l < (num_levels - 1); l++) + { + hypre_CycRedSetCIndex(base_index, base_stride, l, cdir, cindex); + hypre_CycRedSetFIndex(base_index, base_stride, l, cdir, findex); + hypre_CycRedSetStride(base_index, base_stride, l, cdir, stride); + + hypre_CreateComputeInfo(grid_l[l], hypre_StructMatrixStencil(A_l[l]), + &send_boxes, &recv_boxes, + &send_processes, &recv_processes, + &indt_boxes, &dept_boxes); + + /* down-cycle */ + hypre_ProjectBoxArrayArray(send_boxes, findex, stride); + hypre_ProjectBoxArrayArray(recv_boxes, findex, stride); + hypre_ProjectBoxArrayArray(indt_boxes, cindex, stride); + hypre_ProjectBoxArrayArray(dept_boxes, cindex, stride); + hypre_ComputePkgCreate(send_boxes, recv_boxes, + stride, stride, + 
send_processes, recv_processes, + indt_boxes, dept_boxes, + stride, grid_l[l], + hypre_StructVectorDataSpace(x_l[l]), 1, + &down_compute_pkg_l[l]); + + hypre_CreateComputeInfo(grid_l[l], hypre_StructMatrixStencil(A_l[l]), + &send_boxes, &recv_boxes, + &send_processes, &recv_processes, + &indt_boxes, &dept_boxes); + + /* up-cycle */ + hypre_ProjectBoxArrayArray(send_boxes, cindex, stride); + hypre_ProjectBoxArrayArray(recv_boxes, cindex, stride); + hypre_ProjectBoxArrayArray(indt_boxes, findex, stride); + hypre_ProjectBoxArrayArray(dept_boxes, findex, stride); + hypre_ComputePkgCreate(send_boxes, recv_boxes, + stride, stride, + send_processes, recv_processes, + indt_boxes, dept_boxes, + stride, grid_l[l], + hypre_StructVectorDataSpace(x_l[l]), 1, + &up_compute_pkg_l[l]); + } + + (cyc_red_data -> down_compute_pkg_l) = down_compute_pkg_l; + (cyc_red_data -> up_compute_pkg_l) = up_compute_pkg_l; + + /*----------------------------------------------------- + * Compute solve flops + *-----------------------------------------------------*/ + + flop_divisor = (hypre_IndexX(base_stride) * + hypre_IndexY(base_stride) * + hypre_IndexZ(base_stride) ); + (cyc_red_data -> solve_flops) = + hypre_StructVectorGlobalSize(x_l[0])/2/flop_divisor; + (cyc_red_data -> solve_flops) += + 5*hypre_StructVectorGlobalSize(x_l[0])/2/flop_divisor; + for (l = 1; l < (num_levels - 1); l++) + { + (cyc_red_data -> solve_flops) += + 10*hypre_StructVectorGlobalSize(x_l[l])/2; + } + + if (num_levels > 1) + { + (cyc_red_data -> solve_flops) += + hypre_StructVectorGlobalSize(x_l[l])/2; + } + + + /*----------------------------------------------------- + * Finalize some things + *-----------------------------------------------------*/ + + #if DEBUG + { + char filename[255]; + + /* debugging stuff */ + for (l = 0; l < num_levels; l++) + { + sprintf(filename, "yout_A.%02d", l); + hypre_StructMatrixPrint(filename, A_l[l], 0); + } + } + #endif + + return ierr; + } + + 
/*-------------------------------------------------------------------------- + * hypre_CyclicReduction + * + * The solution vectors on each level are also used to store the + * right-hand-side data. We can do this because of the red-black + * nature of the algorithm and the fact that the method is exact, + * allowing one to assume initial guesses of zero on all grid levels. + *--------------------------------------------------------------------------*/ + + int + hypre_CyclicReduction( void *cyc_red_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_CyclicReductionData *cyc_red_data = cyc_red_vdata; + + int num_levels = (cyc_red_data -> num_levels); + int cdir = (cyc_red_data -> cdir); + hypre_IndexRef base_index = (cyc_red_data -> base_index); + hypre_IndexRef base_stride = (cyc_red_data -> base_stride); + hypre_BoxArray *base_points = (cyc_red_data -> base_points); + hypre_BoxArray **fine_points_l = (cyc_red_data -> fine_points_l); + hypre_StructMatrix **A_l = (cyc_red_data -> A_l); + hypre_StructVector **x_l = (cyc_red_data -> x_l); + hypre_ComputePkg **down_compute_pkg_l = + (cyc_red_data -> down_compute_pkg_l); + hypre_ComputePkg **up_compute_pkg_l = + (cyc_red_data -> up_compute_pkg_l); + + hypre_StructGrid *fgrid; + int *fgrid_ids; + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + int *cgrid_ids; + + hypre_CommHandle *comm_handle; + + hypre_BoxArrayArray *compute_box_aa; + hypre_BoxArray *compute_box_a; + hypre_Box *compute_box; + + hypre_Box *A_dbox; + hypre_Box *x_dbox; + hypre_Box *b_dbox; + hypre_Box *xc_dbox; + + double *Ap, *Awp, *Aep; + double *xp, *xwp, *xep; + double *bp; + double *xcp; + + int Ai; + int xi; + int bi; + int xci; + + hypre_Index cindex; + hypre_Index stride; + + hypre_Index index; + hypre_Index loop_size; + hypre_Index start; + hypre_Index startc; + hypre_Index stridec; + + int compute_i, fi, ci, j, l; + int loopi, loopj, loopk; + + int ierr = 0; + + hypre_BeginTiming(cyc_red_data 
-> time_index); + + + /*-------------------------------------------------- + * Initialize some things + *--------------------------------------------------*/ + + hypre_SetIndex(stridec, 1, 1, 1); + + hypre_StructMatrixDestroy(A_l[0]); + hypre_StructVectorDestroy(x_l[0]); + A_l[0] = hypre_StructMatrixRef(A); + x_l[0] = hypre_StructVectorRef(x); + + /*-------------------------------------------------- + * Copy b into x + *--------------------------------------------------*/ + + compute_box_a = base_points; + hypre_ForBoxI(fi, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, fi); + + x_dbox = hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), fi); + b_dbox = hypre_BoxArrayBox(hypre_StructVectorDataSpace(b), fi); + + xp = hypre_StructVectorBoxData(x, fi); + bp = hypre_StructVectorBoxData(b, fi); + + hypre_CopyIndex(hypre_BoxIMin(compute_box), start); + hypre_BoxGetStrideSize(compute_box, base_stride, loop_size); + + hypre_BoxLoop2Begin(loop_size, + x_dbox, start, base_stride, xi, + b_dbox, start, base_stride, bi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,xi,bi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, xi, bi) + { + xp[xi] = bp[bi]; + } + hypre_BoxLoop2End(xi, bi); + } + + /*-------------------------------------------------- + * Down cycle: + * + * 1) Do an F-relaxation sweep with zero initial guess + * 2) Compute and inject residual at C-points + * - computations are at C-points + * - communications are at F-points + * + * Notes: + * - Before these two steps are executed, the + * fine-grid solution vector contains the right-hand-side. + * - After these two steps are executed, the fine-grid + * solution vector contains the right-hand side at + * C-points and the current solution approximation at + * F-points. The coarse-grid solution vector contains + * the restricted (injected) fine-grid residual. + * - The coarsest grid solve is built into this loop + * because it involves the same code as step 1. 
+ *--------------------------------------------------*/ + + /* The break out of this loop is just before step 2 below */ + for (l = 0; ; l++) + { + /* set cindex and stride */ + hypre_CycRedSetCIndex(base_index, base_stride, l, cdir, cindex); + hypre_CycRedSetStride(base_index, base_stride, l, cdir, stride); + + /* Step 1 */ + compute_box_a = fine_points_l[l]; + hypre_ForBoxI(fi, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, fi); + + A_dbox = + hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A_l[l]), fi); + x_dbox = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x_l[l]), fi); + + hypre_SetIndex(index, 0, 0, 0); + Ap = hypre_StructMatrixExtractPointerByIndex(A_l[l], fi, index); + xp = hypre_StructVectorBoxData(x_l[l], fi); + + hypre_CopyIndex(hypre_BoxIMin(compute_box), start); + hypre_BoxGetStrideSize(compute_box, stride, loop_size); + + hypre_BoxLoop2Begin(loop_size, + A_dbox, start, stride, Ai, + x_dbox, start, stride, xi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,Ai,xi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, Ai, xi) + { + xp[xi] /= Ap[Ai]; + } + hypre_BoxLoop2End(Ai, xi); + } + + if (l == (num_levels - 1)) + break; + + /* Step 2 */ + fgrid = hypre_StructVectorGrid(x_l[l]); + fgrid_ids = hypre_StructGridIDs(fgrid); + cgrid = hypre_StructVectorGrid(x_l[l+1]); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + xp = hypre_StructVectorData(x_l[l]); + hypre_InitializeIndtComputations(down_compute_pkg_l[l], xp, + &comm_handle); + compute_box_aa = + hypre_ComputePkgIndtBoxes(down_compute_pkg_l[l]); + } + break; + + case 1: + { + hypre_FinalizeIndtComputations(comm_handle); + compute_box_aa = + hypre_ComputePkgDeptBoxes(down_compute_pkg_l[l]); + } + break; + } + + fi = 0; + hypre_ForBoxArrayI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + 
compute_box_a = + hypre_BoxArrayArrayBoxArray(compute_box_aa, fi); + + A_dbox = + hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A_l[l]), fi); + x_dbox = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x_l[l]), fi); + xc_dbox = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x_l[l+1]), ci); + + xp = hypre_StructVectorBoxData(x_l[l], fi); + xcp = hypre_StructVectorBoxData(x_l[l+1], ci); + + hypre_SetIndex(index, -1, 0, 0); + Awp = + hypre_StructMatrixExtractPointerByIndex(A_l[l], fi, index); + xwp = hypre_StructVectorBoxData(x_l[l], fi) + + hypre_BoxOffsetDistance(x_dbox, index); + + hypre_SetIndex(index, 1, 0, 0); + Aep = + hypre_StructMatrixExtractPointerByIndex(A_l[l], fi, index); + xep = hypre_StructVectorBoxData(x_l[l], fi) + + hypre_BoxOffsetDistance(x_dbox, index); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + hypre_CopyIndex(hypre_BoxIMin(compute_box), start); + hypre_StructMapFineToCoarse(start, cindex, stride, + startc); + + hypre_BoxGetStrideSize(compute_box, stride, loop_size); + + hypre_BoxLoop3Begin(loop_size, + A_dbox, start, stride, Ai, + x_dbox, start, stride, xi, + xc_dbox, startc, stridec, xci); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,Ai,xi,xci + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, xci) + { + xcp[xci] = xp[xi] - + Awp[Ai]*xwp[xi] - + Aep[Ai]*xep[xi]; + } + hypre_BoxLoop3End(Ai, xi, xci); + } + } + } + } + + /*-------------------------------------------------- + * Up cycle: + * + * 1) Inject coarse error into fine-grid solution + * vector (this is the solution at the C-points) + * 2) Do an F-relaxation sweep on Ax = 0 and update + * solution at F-points + * - computations are at F-points + * - communications are at C-points + *--------------------------------------------------*/ + + for (l = (num_levels - 2); l >= 0; l--) + { + /* set cindex and stride */ + hypre_CycRedSetCIndex(base_index, base_stride, l, cdir, cindex); + 
hypre_CycRedSetStride(base_index, base_stride, l, cdir, stride); + + /* Step 1 */ + fgrid = hypre_StructVectorGrid(x_l[l]); + fgrid_ids = hypre_StructGridIDs(fgrid); + cgrid = hypre_StructVectorGrid(x_l[l+1]); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + fi = 0; + hypre_ForBoxI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + compute_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + hypre_CopyIndex(hypre_BoxIMin(compute_box), startc); + hypre_StructMapCoarseToFine(startc, cindex, stride, start); + + x_dbox = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x_l[l]), fi); + xc_dbox = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x_l[l+1]), ci); + + xp = hypre_StructVectorBoxData(x_l[l], fi); + xcp = hypre_StructVectorBoxData(x_l[l+1], ci); + + hypre_BoxGetSize(compute_box, loop_size); + + hypre_BoxLoop2Begin(loop_size, + x_dbox, start, stride, xi, + xc_dbox, startc, stridec, xci); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,xi,xci + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, xi, xci) + { + xp[xi] = xcp[xci]; + } + hypre_BoxLoop2End(xi, xci); + } + + /* Step 2 */ + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + xp = hypre_StructVectorData(x_l[l]); + hypre_InitializeIndtComputations(up_compute_pkg_l[l], xp, + &comm_handle); + compute_box_aa = + hypre_ComputePkgIndtBoxes(up_compute_pkg_l[l]); + } + break; + + case 1: + { + hypre_FinalizeIndtComputations(comm_handle); + compute_box_aa = + hypre_ComputePkgDeptBoxes(up_compute_pkg_l[l]); + } + break; + } + + hypre_ForBoxArrayI(fi, compute_box_aa) + { + compute_box_a = + hypre_BoxArrayArrayBoxArray(compute_box_aa, fi); + + A_dbox = + hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A_l[l]), fi); + x_dbox = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x_l[l]), fi); + + hypre_SetIndex(index, 0, 0, 0); + Ap = hypre_StructMatrixExtractPointerByIndex(A_l[l], fi, index); + 
xp = hypre_StructVectorBoxData(x_l[l], fi); + + hypre_SetIndex(index, -1, 0, 0); + Awp = + hypre_StructMatrixExtractPointerByIndex(A_l[l], fi, index); + xwp = hypre_StructVectorBoxData(x_l[l], fi) + + hypre_BoxOffsetDistance(x_dbox, index); + + hypre_SetIndex(index, 1, 0, 0); + Aep = + hypre_StructMatrixExtractPointerByIndex(A_l[l], fi, index); + xep = hypre_StructVectorBoxData(x_l[l], fi) + + hypre_BoxOffsetDistance(x_dbox, index); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + hypre_CopyIndex(hypre_BoxIMin(compute_box), start); + hypre_BoxGetStrideSize(compute_box, stride, loop_size); + + hypre_BoxLoop2Begin(loop_size, + A_dbox, start, stride, Ai, + x_dbox, start, stride, xi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,Ai,xi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, Ai, xi) + { + xp[xi] -= (Awp[Ai]*xwp[xi] + + Aep[Ai]*xep[xi] ) / Ap[Ai]; + } + hypre_BoxLoop2End(Ai, xi); + } + } + } + } + + /*----------------------------------------------------- + * Finalize some things + *-----------------------------------------------------*/ + + hypre_IncFLOPCount(cyc_red_data -> solve_flops); + hypre_EndTiming(cyc_red_data -> time_index); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_CyclicReductionSetBase + *--------------------------------------------------------------------------*/ + + int + hypre_CyclicReductionSetBase( void *cyc_red_vdata, + hypre_Index base_index, + hypre_Index base_stride ) + { + hypre_CyclicReductionData *cyc_red_data = cyc_red_vdata; + int d; + int ierr = 0; + + for (d = 0; d < 3; d++) + { + hypre_IndexD((cyc_red_data -> base_index), d) = + hypre_IndexD(base_index, d); + hypre_IndexD((cyc_red_data -> base_stride), d) = + hypre_IndexD(base_stride, d); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_CyclicReductionDestroy + 
*--------------------------------------------------------------------------*/ + + int + hypre_CyclicReductionDestroy( void *cyc_red_vdata ) + { + hypre_CyclicReductionData *cyc_red_data = cyc_red_vdata; + + int l; + int ierr = 0; + + if (cyc_red_data) + { + hypre_BoxArrayDestroy(cyc_red_data -> base_points); + hypre_StructGridDestroy(cyc_red_data -> grid_l[0]); + hypre_StructMatrixDestroy(cyc_red_data -> A_l[0]); + hypre_StructVectorDestroy(cyc_red_data -> x_l[0]); + for (l = 0; l < ((cyc_red_data -> num_levels) - 1); l++) + { + hypre_StructGridDestroy(cyc_red_data -> grid_l[l+1]); + hypre_BoxArrayDestroy(cyc_red_data -> fine_points_l[l]); + hypre_StructMatrixDestroy(cyc_red_data -> A_l[l+1]); + hypre_StructVectorDestroy(cyc_red_data -> x_l[l+1]); + hypre_ComputePkgDestroy(cyc_red_data -> down_compute_pkg_l[l]); + hypre_ComputePkgDestroy(cyc_red_data -> up_compute_pkg_l[l]); + } + hypre_BoxArrayDestroy(cyc_red_data -> fine_points_l[l]); + hypre_SharedTFree(cyc_red_data -> data); + hypre_TFree(cyc_red_data -> grid_l); + hypre_TFree(cyc_red_data -> fine_points_l); + hypre_TFree(cyc_red_data -> A_l); + hypre_TFree(cyc_red_data -> x_l); + hypre_TFree(cyc_red_data -> down_compute_pkg_l); + hypre_TFree(cyc_red_data -> up_compute_pkg_l); + + hypre_FinalizeTiming(cyc_red_data -> time_index); + hypre_TFree(cyc_red_data); + } + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/general.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/general.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/general.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,38 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_Log2: + * This routine returns the integer, floor(log_2(p)). + * If p <= 0, it returns a -1. + *--------------------------------------------------------------------------*/ + + int + hypre_Log2(int p) + { + int e; + + if (p <= 0) + return -1; + + e = 0; + while (p > 1) + { + e += 1; + p /= 2; + } + + return e; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/general.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/general.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/general.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,33 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * General structures and values + * + *****************************************************************************/ + + #ifndef hypre_GENERAL_HEADER + #define hypre_GENERAL_HEADER + + /*-------------------------------------------------------------------------- + * Define various functions + *--------------------------------------------------------------------------*/ + + #ifndef hypre_max + #define hypre_max(a,b) (((a)<(b)) ? (b) : (a)) + #endif + #ifndef hypre_min + #define hypre_min(a,b) (((a)<(b)) ? 
(a) : (b)) + #endif + + #ifndef hypre_round + #define hypre_round(x) ( ((x) < 0.0) ? ((int)(x - 0.5)) : ((int)(x + 0.5)) ) + #endif + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/grow.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/grow.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/grow.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,96 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Routines for "growing" boxes. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_GrowBoxByStencil: + * The argument `transpose' is a boolean that indicates whether + * or not to use the transpose of the stencil. 
+ *--------------------------------------------------------------------------*/ + + hypre_BoxArray * + hypre_GrowBoxByStencil( hypre_Box *box, + hypre_StructStencil *stencil, + int transpose ) + { + hypre_BoxArray *grow_box_array; + + hypre_BoxArray *shift_box_array; + hypre_Box *shift_box; + + hypre_Index *stencil_shape; + + int s, d; + + stencil_shape = hypre_StructStencilShape(stencil); + + shift_box_array = hypre_BoxArrayCreate(hypre_StructStencilSize(stencil)); + shift_box = hypre_BoxCreate(); + for (s = 0; s < hypre_StructStencilSize(stencil); s++) + { + if (transpose) + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(shift_box, d) = + hypre_BoxIMinD(box, d) - hypre_IndexD(stencil_shape[s], d); + hypre_BoxIMaxD(shift_box, d) = + hypre_BoxIMaxD(box, d) - hypre_IndexD(stencil_shape[s], d); + } + else + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(shift_box, d) = + hypre_BoxIMinD(box, d) + hypre_IndexD(stencil_shape[s], d); + hypre_BoxIMaxD(shift_box, d) = + hypre_BoxIMaxD(box, d) + hypre_IndexD(stencil_shape[s], d); + } + + hypre_CopyBox(shift_box, hypre_BoxArrayBox(shift_box_array, s)); + } + hypre_BoxDestroy(shift_box); + + hypre_UnionBoxes(shift_box_array); + grow_box_array = shift_box_array; + + return grow_box_array; + } + + /*-------------------------------------------------------------------------- + * hypre_GrowBoxArrayByStencil: + *--------------------------------------------------------------------------*/ + + hypre_BoxArrayArray * + hypre_GrowBoxArrayByStencil( hypre_BoxArray *box_array, + hypre_StructStencil *stencil, + int transpose ) + { + hypre_BoxArrayArray *grow_box_array_array; + + int i; + + grow_box_array_array = + hypre_BoxArrayArrayCreate(hypre_BoxArraySize(box_array)); + + hypre_ForBoxI(i, box_array) + { + hypre_BoxArrayDestroy( + hypre_BoxArrayArrayBoxArray(grow_box_array_array, i)); + hypre_BoxArrayArrayBoxArray(grow_box_array_array, i) = + hypre_GrowBoxByStencil(hypre_BoxArrayBox(box_array, i), + stencil, transpose); + } + + return 
grow_box_array_array; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/headers.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/headers.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/headers.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,15 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #include + #include + #include + + #include "struct_ls.h" + #include "struct_mv.h" Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/hypre_box_smp_forloop.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/hypre_box_smp_forloop.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/hypre_box_smp_forloop.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,20 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #if HYPRE_USING_PGCC_SMP + #define HYPRE_SMP_PRIVATE \ + HYPRE_BOX_SMP_PRIVATE,hypre__nx,hypre__ny,hypre__nz,hypre__block + #include "hypre_smp_forloop.h" + #else + #define HYPRE_SMP_PRIVATE \ + HYPRE_BOX_SMP_PRIVATE,hypre__nx,hypre__ny,hypre__nz + #include "hypre_smp_forloop.h" + #endif + #undef HYPRE_BOX_SMP_PRIVATE + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/hypre_smp_forloop.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/hypre_smp_forloop.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/hypre_smp_forloop.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,52 ---- + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /***************************************************************************** + * Wrapper code for SMP compiler directives. Translates + * hypre SMP directives into the appropriate Open MP, + * IBM, SGI, or pgcc (Red) SMP compiler directives. 
+ ****************************************************************************/ + + #ifndef HYPRE_SMP_PRIVATE + #define HYPRE_SMP_PRIVATE + #endif + + #ifdef HYPRE_USING_OPENMP + #ifndef HYPRE_SMP_REDUCTION_OP + #pragma omp parallel for private(HYPRE_SMP_PRIVATE) schedule(static) + #endif + #ifdef HYPRE_SMP_REDUCTION_OP + #pragma omp parallel for private(HYPRE_SMP_PRIVATE) \ + reduction(HYPRE_SMP_REDUCTION_OP: HYPRE_SMP_REDUCTION_VARS) \ + schedule(static) + #endif + #endif + + #ifdef HYPRE_USING_SGI_SMP + #pragma parallel + #pragma pfor + #pragma schedtype(gss) + #pragma chunksize(10) + #endif + + #ifdef HYPRE_USING_IBM_SMP + #pragma parallel_loop + #pragma schedule (guided,10) + #endif + + #ifdef HYPRE_USING_PGCC_SMP + #ifndef HYPRE_SMP_REDUCTION_OP + #pragma parallel local(HYPRE_SMP_PRIVATE) pfor + #endif + #ifdef HYPRE_SMP_REDUCTION_OP + #endif + #endif + + #undef HYPRE_SMP_PRIVATE + #undef HYPRE_SMP_REDUCTION_OP + #undef HYPRE_SMP_REDUCTION_VARS Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/krylov.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/krylov.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/krylov.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,851 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * krylov solver headers + * + *****************************************************************************/ + + #ifndef HYPRE_ALL_KRYLOV_HEADER + #define HYPRE_ALL_KRYLOV_HEADER + + #include + #include + #include + + #ifndef max + #define max(a,b) (((a)<(b)) ? (b) : (a)) + #endif + + #define hypre_CTAllocF(type, count, funcs) \ + ( (type *)(*(funcs->CAlloc))\ + ((unsigned int)(count), (unsigned int)sizeof(type)) ) + + #define hypre_TFreeF( ptr, funcs ) \ + ( (*(funcs->Free))((char *)ptr), ptr = NULL ) + + /* A pointer to a type which is never defined, sort of works like void* ... */ + #ifndef HYPRE_SOLVER_STRUCT + #define HYPRE_SOLVER_STRUCT + struct hypre_Solver_struct; + typedef struct hypre_Solver_struct *HYPRE_Solver; + /* similar pseudo-void* for Matrix and Vector: */ + #endif + #ifndef HYPRE_MATRIX_STRUCT + #define HYPRE_MATRIX_STRUCT + struct hypre_Matrix_struct; + typedef struct hypre_Matrix_struct *HYPRE_Matrix; + #endif + #ifndef HYPRE_VECTOR_STRUCT + #define HYPRE_VECTOR_STRUCT + struct hypre_Vector_struct; + typedef struct hypre_Vector_struct *HYPRE_Vector; + #endif + + typedef int (*HYPRE_PtrToSolverFcn)(HYPRE_Solver, + HYPRE_Matrix, + HYPRE_Vector, + HYPRE_Vector); + + #endif + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * BiCGSTAB bicgstab + * + *****************************************************************************/ + + #ifndef HYPRE_KRYLOV_BiCGSTAB_HEADER + #define HYPRE_KRYLOV_BiCGSTAB_HEADER + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Generic BiCGSTAB Interface + * + * A general description of the interface goes here... + * + * @memo A generic BiCGSTAB linear solver interface + * @version 0.1 + * @author Jeffrey F. Painter + **/ + /*@{*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /*-------------------------------------------------------------------------- + * hypre_BiCGSTABData and hypre_BiCGSTABFunctions + *--------------------------------------------------------------------------*/ + + + /** + * @name BiCGSTAB structs + * + * Description... + **/ + /*@{*/ + + /** + * The {\tt hypre\_BiCGSTABSFunctions} object ... 
+ **/ + + /* functions in pcg_struct.c which aren't used here: + char *hypre_ParKrylovCAlloc( int count , int elt_size ); + int hypre_ParKrylovFree( char *ptr ); + void *hypre_ParKrylovCreateVectorArray( int n , void *vvector ); + int hypre_ParKrylovMatvecT( void *matvec_data , double alpha , void *A , void *x , double beta , void *y ); + int hypre_ParKrylovClearVector( void *x ); + */ + /* functions in pcg_struct.c which are used here: + void *hypre_ParKrylovCreateVector( void *vvector ); + int hypre_ParKrylovDestroyVector( void *vvector ); + void *hypre_ParKrylovMatvecCreate( void *A , void *x ); + int hypre_ParKrylovMatvec( void *matvec_data , double alpha , void *A , void *x , double beta , void *y ); + int hypre_ParKrylovMatvecDestroy( void *matvec_data ); + double hypre_ParKrylovInnerProd( void *x , void *y ); + int hypre_ParKrylovCopyVector( void *x , void *y ); + int hypre_ParKrylovScaleVector( double alpha , void *x ); + int hypre_ParKrylovAxpy( double alpha , void *x , void *y ); + int hypre_ParKrylovCommInfo( void *A , int *my_id , int *num_procs ); + int hypre_ParKrylovIdentitySetup( void *vdata , void *A , void *b , void *x ); + int hypre_ParKrylovIdentity( void *vdata , void *A , void *b , void *x ); + */ + + typedef struct + { + void *(*CreateVector)( void *vvector ); + int (*DestroyVector)( void *vvector ); + void *(*MatvecCreate)( void *A , void *x ); + int (*Matvec)( void *matvec_data , double alpha , void *A , void *x , double beta , void *y ); + int (*MatvecDestroy)( void *matvec_data ); + double (*InnerProd)( void *x , void *y ); + int (*CopyVector)( void *x , void *y ); + int (*ScaleVector)( double alpha , void *x ); + int (*Axpy)( double alpha , void *x , void *y ); + int (*CommInfo)( void *A , int *my_id , int *num_procs ); + int (*IdentitySetup)( void *vdata , void *A , void *b , void *x ); + int (*Identity)( void *vdata , void *A , void *b , void *x ); + + int (*precond)(); + int (*precond_setup)(); + + } hypre_BiCGSTABFunctions; + + /** + 
* The {\tt hypre\_BiCGSTABData} object ... + **/ + + typedef struct + { + int min_iter; + int max_iter; + int stop_crit; + double tol; + double rel_residual_norm; + + void *A; + void *r; + void *r0; + void *s; + void *v; + void *p; + void *q; + + void *matvec_data; + void *precond_data; + + hypre_BiCGSTABFunctions * functions; + + /* log info (always logged) */ + int num_iterations; + + /* additional log info (logged when `logging' > 0) */ + int logging; + double *norms; + char *log_file_name; + + } hypre_BiCGSTABData; + + #ifdef __cplusplus + extern "C" { + #endif + + /** + * @name generic BiCGSTAB Solver + * + * Description... + **/ + /*@{*/ + + /** + * Description... + * + * @param param [IN] ... + **/ + + hypre_BiCGSTABFunctions * + hypre_BiCGSTABFunctionsCreate( + void *(*CreateVector)( void *vvector ), + int (*DestroyVector)( void *vvector ), + void *(*MatvecCreate)( void *A , void *x ), + int (*Matvec)( void *matvec_data , double alpha , void *A , void *x , double beta , void *y ), + int (*MatvecDestroy)( void *matvec_data ), + double (*InnerProd)( void *x , void *y ), + int (*CopyVector)( void *x , void *y ), + int (*ScaleVector)( double alpha , void *x ), + int (*Axpy)( double alpha , void *x , void *y ), + int (*CommInfo)( void *A , int *my_id , int *num_procs ), + int (*precond)(), + int (*precond_setup)() + ); + + + /** + * Description... + * + * @param param [IN] ... + **/ + + void * + hypre_BiCGSTABCreate( hypre_BiCGSTABFunctions * bicgstab_functions ); + + + #ifdef __cplusplus + } + #endif + + #endif + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * cgnr (conjugate gradient on the normal equations A^TAx = A^Tb) functions + * + *****************************************************************************/ + + #ifndef HYPRE_KRYLOV_CGNR_HEADER + #define HYPRE_KRYLOV_CGNR_HEADER + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Generic CGNR Interface + * + * A general description of the interface goes here... + * + * @memo A generic CGNR linear solver interface + * @version 0.1 + * @author Jeffrey F. Painter + **/ + /*@{*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /*-------------------------------------------------------------------------- + * hypre_CGNRData and hypre_CGNRFunctions + *--------------------------------------------------------------------------*/ + + + /** + * @name CGNR structs + * + * Description... + **/ + /*@{*/ + + /** + * The {\tt hypre\_CGNRSFunctions} object ... 
+ **/ + + typedef struct + { + int (*CommInfo) ( void *A, int *my_id, int *num_procs ); + void * (*CreateVector) ( void *vector ); + int (*DestroyVector) ( void *vector ); + void * (*MatvecCreate) ( void *A, void *x ); + int (*Matvec) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ); + int (*MatvecT) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ); + int (*MatvecDestroy) ( void *matvec_data ); + double (*InnerProd) ( void *x, void *y ); + int (*CopyVector) ( void *x, void *y ); + int (*ClearVector) ( void *x ); + int (*ScaleVector) ( double alpha, void *x ); + int (*Axpy) ( double alpha, void *x, void *y ); + int (*precond_setup) ( void *vdata, void *A, void *b, void *x ); + int (*precond) ( void *vdata, void *A, void *b, void *x ); + int (*precondT) ( void *vdata, void *A, void *b, void *x ); + } hypre_CGNRFunctions; + + /** + * The {\tt hypre\_CGNRData} object ... + **/ + + typedef struct + { + double tol; + double rel_residual_norm; + int min_iter; + int max_iter; + int stop_crit; + + void *A; + void *p; + void *q; + void *r; + void *t; + + void *matvec_data; + void *precond_data; + + hypre_CGNRFunctions * functions; + + /* log info (always logged) */ + int num_iterations; + + /* additional log info (logged when `logging' > 0) */ + int logging; + double *norms; + char *log_file_name; + + } hypre_CGNRData; + + + #ifdef __cplusplus + extern "C" { + #endif + + + /** + * @name generic CGNR Solver + * + * Description... + **/ + /*@{*/ + + + /** + * Description... + * + * @param param [IN] ... 
+ **/ + hypre_CGNRFunctions * + hypre_CGNRFunctionsCreate( + int (*CommInfo) ( void *A, int *my_id, int *num_procs ), + void * (*CreateVector) ( void *vector ), + int (*DestroyVector) ( void *vector ), + void * (*MatvecCreate) ( void *A, void *x ), + int (*Matvec) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ), + int (*MatvecT) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ), + int (*MatvecDestroy) ( void *matvec_data ), + double (*InnerProd) ( void *x, void *y ), + int (*CopyVector) ( void *x, void *y ), + int (*ClearVector) ( void *x ), + int (*ScaleVector) ( double alpha, void *x ), + int (*Axpy) ( double alpha, void *x, void *y ), + int (*PrecondSetup) ( void *vdata, void *A, void *b, void *x ), + int (*Precond) ( void *vdata, void *A, void *b, void *x ), + int (*PrecondT) ( void *vdata, void *A, void *b, void *x ) + ); + + /** + * Description... + * + * @param param [IN] ... + **/ + + void * + hypre_CGNRCreate( hypre_CGNRFunctions *cgnr_functions ); + + #ifdef __cplusplus + } + #endif + + #endif + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * GMRES gmres + * + *****************************************************************************/ + + #ifndef HYPRE_KRYLOV_GMRES_HEADER + #define HYPRE_KRYLOV_GMRES_HEADER + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Generic GMRES Interface + * + * A general description of the interface goes here... 
+ * + * @memo A generic GMRES linear solver interface + * @version 0.1 + * @author Jeffrey F. Painter + **/ + /*@{*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /*-------------------------------------------------------------------------- + * hypre_GMRESData and hypre_GMRESFunctions + *--------------------------------------------------------------------------*/ + + + /** + * @name GMRES structs + * + * Description... + **/ + /*@{*/ + + /** + * The {\tt hypre\_GMRESFunctions} object ... + **/ + + typedef struct + { + char * (*CAlloc) ( int count, int elt_size ); + int (*Free) ( char *ptr ); + int (*CommInfo) ( void *A, int *my_id, int *num_procs ); + void * (*CreateVector) ( void *vector ); + void * (*CreateVectorArray) ( int size, void *vectors ); + int (*DestroyVector) ( void *vector ); + void * (*MatvecCreate) ( void *A, void *x ); + int (*Matvec) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ); + int (*MatvecDestroy) ( void *matvec_data ); + double (*InnerProd) ( void *x, void *y ); + int (*CopyVector) ( void *x, void *y ); + int (*ClearVector) ( void *x ); + int (*ScaleVector) ( double alpha, void *x ); + int (*Axpy) ( double alpha, void *x, void *y ); + + int (*precond)(); + int (*precond_setup)(); + + } hypre_GMRESFunctions; + + /** + * The {\tt hypre\_GMRESData} object ... 
+ **/ + + typedef struct + { + int k_dim; + int min_iter; + int max_iter; + int stop_crit; + double tol; + double rel_residual_norm; + + void *A; + void *r; + void *w; + void **p; + + void *matvec_data; + void *precond_data; + + hypre_GMRESFunctions * functions; + + /* log info (always logged) */ + int num_iterations; + + /* additional log info (logged when `logging' > 0) */ + int logging; + double *norms; + char *log_file_name; + + } hypre_GMRESData; + + #ifdef __cplusplus + extern "C" { + #endif + + /** + * @name generic GMRES Solver + * + * Description... + **/ + /*@{*/ + + /** + * Description... + * + * @param param [IN] ... + **/ + + hypre_GMRESFunctions * + hypre_GMRESFunctionsCreate( + char * (*CAlloc) ( int count, int elt_size ), + int (*Free) ( char *ptr ), + int (*CommInfo) ( void *A, int *my_id, int *num_procs ), + void * (*CreateVector) ( void *vector ), + void * (*CreateVectorArray) ( int size, void *vectors ), + int (*DestroyVector) ( void *vector ), + void * (*MatvecCreate) ( void *A, void *x ), + int (*Matvec) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ), + int (*MatvecDestroy) ( void *matvec_data ), + double (*InnerProd) ( void *x, void *y ), + int (*CopyVector) ( void *x, void *y ), + int (*ClearVector) ( void *x ), + int (*ScaleVector) ( double alpha, void *x ), + int (*Axpy) ( double alpha, void *x, void *y ), + int (*PrecondSetup) ( void *vdata, void *A, void *b, void *x ), + int (*Precond) ( void *vdata, void *A, void *b, void *x ) + ); + + /** + * Description... + * + * @param param [IN] ... + **/ + + void * + hypre_GMRESCreate( hypre_GMRESFunctions *gmres_functions ); + + #ifdef __cplusplus + } + #endif + #endif + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Preconditioned conjugate gradient (Omin) headers + * + *****************************************************************************/ + + #ifndef HYPRE_KRYLOV_PCG_HEADER + #define HYPRE_KRYLOV_PCG_HEADER + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /** + * @name Generic PCG Interface + * + * A general description of the interface goes here... + * + * @memo A generic PCG linear solver interface + * @version 0.1 + * @author Jeffrey F. Painter + **/ + /*@{*/ + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + /*-------------------------------------------------------------------------- + * hypre_PCGData and hypre_PCGFunctions + *--------------------------------------------------------------------------*/ + + + /** + * @name PCG structs + * + * Description... + **/ + /*@{*/ + + /** + * The {\tt hypre\_PCGSFunctions} object ... + **/ + + typedef struct + { + char * (*CAlloc) ( int count, int elt_size ); + int (*Free) ( char *ptr ); + void * (*CreateVector) ( void *vector ); + int (*DestroyVector) ( void *vector ); + void * (*MatvecCreate) ( void *A, void *x ); + int (*Matvec) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ); + int (*MatvecDestroy) ( void *matvec_data ); + double (*InnerProd) ( void *x, void *y ); + int (*CopyVector) ( void *x, void *y ); + int (*ClearVector) ( void *x ); + int (*ScaleVector) ( double alpha, void *x ); + int (*Axpy) ( double alpha, void *x, void *y ); + + int (*precond)(); + int (*precond_setup)(); + } hypre_PCGFunctions; + + /** + * The {\tt hypre\_PCGData} object ... 
+ **/ + + /* rel_change!=0 means: if pass the other stopping criteria, + also check the relative change in the solution x. + stop_crit!=0 means: absolute error tolerance rather than + the usual relative error tolerance on the residual. Never + applies if rel_change!=0. + */ + + typedef struct + { + double tol; + double cf_tol; + int max_iter; + int two_norm; + int rel_change; + int stop_crit; + + void *A; + void *p; + void *s; + void *r; + + void *matvec_data; + void *precond_data; + + hypre_PCGFunctions * functions; + + /* log info (always logged) */ + int num_iterations; + + /* additional log info (logged when `logging' > 0) */ + int logging; + double *norms; + double *rel_norms; + + } hypre_PCGData; + + #ifdef __cplusplus + extern "C" { + #endif + + + /** + * @name generic PCG Solver + * + * Description... + **/ + /*@{*/ + + /** + * Description... + * + * @param param [IN] ... + **/ + + hypre_PCGFunctions * + hypre_PCGFunctionsCreate( + char * (*CAlloc) ( int count, int elt_size ), + int (*Free) ( char *ptr ), + void * (*CreateVector) ( void *vector ), + int (*DestroyVector) ( void *vector ), + void * (*MatvecCreate) ( void *A, void *x ), + int (*Matvec) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ), + int (*MatvecDestroy) ( void *matvec_data ), + double (*InnerProd) ( void *x, void *y ), + int (*CopyVector) ( void *x, void *y ), + int (*ClearVector) ( void *x ), + int (*ScaleVector) ( double alpha, void *x ), + int (*Axpy) ( double alpha, void *x, void *y ), + int (*PrecondSetup) ( void *vdata, void *A, void *b, void *x ), + int (*Precond) ( void *vdata, void *A, void *b, void *x ) + ); + + /** + * Description... + * + * @param param [IN] ... 
+ **/ + + void * + hypre_PCGCreate( hypre_PCGFunctions *pcg_functions ); + + #ifdef __cplusplus + } + #endif + + #endif + + #ifndef hypre_KRYLOV_HEADER + #define hypre_KRYLOV_HEADER + + #ifdef __cplusplus + extern "C" { + #endif + + + /* HYPRE_bicgstab.c */ + int HYPRE_BiCGSTABDestroy( HYPRE_Solver solver ); + int HYPRE_BiCGSTABSetup( HYPRE_Solver solver , HYPRE_Matrix A , HYPRE_Vector b , HYPRE_Vector x ); + int HYPRE_BiCGSTABSolve( HYPRE_Solver solver , HYPRE_Matrix A , HYPRE_Vector b , HYPRE_Vector x ); + int HYPRE_BiCGSTABSetTol( HYPRE_Solver solver , double tol ); + int HYPRE_BiCGSTABSetMinIter( HYPRE_Solver solver , int min_iter ); + int HYPRE_BiCGSTABSetMaxIter( HYPRE_Solver solver , int max_iter ); + int HYPRE_BiCGSTABSetStopCrit( HYPRE_Solver solver , int stop_crit ); + int HYPRE_BiCGSTABSetPrecond( HYPRE_Solver solver , HYPRE_PtrToSolverFcn precond , HYPRE_PtrToSolverFcn precond_setup , HYPRE_Solver precond_solver ); + int HYPRE_BiCGSTABGetPrecond( HYPRE_Solver solver , HYPRE_Solver *precond_data_ptr ); + int HYPRE_BiCGSTABSetLogging( HYPRE_Solver solver , int logging ); + int HYPRE_BiCGSTABGetNumIterations( HYPRE_Solver solver , int *num_iterations ); + int HYPRE_BiCGSTABGetFinalRelativeResidualNorm( HYPRE_Solver solver , double *norm ); + + /* HYPRE_cgnr.c */ + int HYPRE_CGNRDestroy( HYPRE_Solver solver ); + int HYPRE_CGNRSetup( HYPRE_Solver solver , HYPRE_Matrix A , HYPRE_Vector b , HYPRE_Vector x ); + int HYPRE_CGNRSolve( HYPRE_Solver solver , HYPRE_Matrix A , HYPRE_Vector b , HYPRE_Vector x ); + int HYPRE_CGNRSetTol( HYPRE_Solver solver , double tol ); + int HYPRE_CGNRSetMinIter( HYPRE_Solver solver , int min_iter ); + int HYPRE_CGNRSetMaxIter( HYPRE_Solver solver , int max_iter ); + int HYPRE_CGNRSetStopCrit( HYPRE_Solver solver , int stop_crit ); + int HYPRE_CGNRSetPrecond( HYPRE_Solver solver , HYPRE_PtrToSolverFcn precond , HYPRE_PtrToSolverFcn precondT , HYPRE_PtrToSolverFcn precond_setup , HYPRE_Solver precond_solver ); + int 
HYPRE_CGNRGetPrecond( HYPRE_Solver solver , HYPRE_Solver *precond_data_ptr ); + int HYPRE_CGNRSetLogging( HYPRE_Solver solver , int logging ); + int HYPRE_CGNRGetNumIterations( HYPRE_Solver solver , int *num_iterations ); + int HYPRE_CGNRGetFinalRelativeResidualNorm( HYPRE_Solver solver , double *norm ); + + /* HYPRE_gmres.c */ + int HYPRE_GMRESSetup( HYPRE_Solver solver , HYPRE_Matrix A , HYPRE_Vector b , HYPRE_Vector x ); + int HYPRE_GMRESSolve( HYPRE_Solver solver , HYPRE_Matrix A , HYPRE_Vector b , HYPRE_Vector x ); + int HYPRE_GMRESSetKDim( HYPRE_Solver solver , int k_dim ); + int HYPRE_GMRESSetTol( HYPRE_Solver solver , double tol ); + int HYPRE_GMRESSetMinIter( HYPRE_Solver solver , int min_iter ); + int HYPRE_GMRESSetMaxIter( HYPRE_Solver solver , int max_iter ); + int HYPRE_GMRESSetStopCrit( HYPRE_Solver solver , int stop_crit ); + int HYPRE_GMRESSetPrecond( HYPRE_Solver solver , HYPRE_PtrToSolverFcn precond , HYPRE_PtrToSolverFcn precond_setup , HYPRE_Solver precond_solver ); + int HYPRE_GMRESGetPrecond( HYPRE_Solver solver , HYPRE_Solver *precond_data_ptr ); + int HYPRE_GMRESSetLogging( HYPRE_Solver solver , int logging ); + int HYPRE_GMRESGetNumIterations( HYPRE_Solver solver , int *num_iterations ); + int HYPRE_GMRESGetFinalRelativeResidualNorm( HYPRE_Solver solver , double *norm ); + + /* HYPRE_pcg.c */ + int HYPRE_PCGSetup( HYPRE_Solver solver , HYPRE_Matrix A , HYPRE_Vector b , HYPRE_Vector x ); + int HYPRE_PCGSolve( HYPRE_Solver solver , HYPRE_Matrix A , HYPRE_Vector b , HYPRE_Vector x ); + int HYPRE_PCGSetTol( HYPRE_Solver solver , double tol ); + int HYPRE_PCGSetMaxIter( HYPRE_Solver solver , int max_iter ); + int HYPRE_PCGSetStopCrit( HYPRE_Solver solver , int stop_crit ); + int HYPRE_PCGSetTwoNorm( HYPRE_Solver solver , int two_norm ); + int HYPRE_PCGSetRelChange( HYPRE_Solver solver , int rel_change ); + int HYPRE_PCGSetPrecond( HYPRE_Solver solver , HYPRE_PtrToSolverFcn precond , HYPRE_PtrToSolverFcn precond_setup , HYPRE_Solver 
precond_solver ); + int HYPRE_PCGGetPrecond( HYPRE_Solver solver , HYPRE_Solver *precond_data_ptr ); + int HYPRE_PCGSetLogging( HYPRE_Solver solver , int logging ); + int HYPRE_PCGGetNumIterations( HYPRE_Solver solver , int *num_iterations ); + int HYPRE_PCGGetFinalRelativeResidualNorm( HYPRE_Solver solver , double *norm ); + + /* bicgstab.c */ + hypre_BiCGSTABFunctions *hypre_BiCGSTABFunctionsCreate( void *(*CreateVector )(void *vvector ), int (*DestroyVector )(void *vvector ), void *(*MatvecCreate )(void *A ,void *x ), int (*Matvec )(void *matvec_data ,double alpha ,void *A ,void *x ,double beta ,void *y ), int (*MatvecDestroy )(void *matvec_data ), double (*InnerProd )(void *x ,void *y ), int (*CopyVector )(void *x ,void *y ), int (*ScaleVector )(double alpha ,void *x ), int (*Axpy )(double alpha ,void *x ,void *y ), int (*CommInfo )(void *A ,int *my_id ,int *num_procs ), int (*precond )(), int (*precond_setup )()); + void *hypre_BiCGSTABCreate( hypre_BiCGSTABFunctions *bicgstab_functions ); + int hypre_BiCGSTABDestroy( void *bicgstab_vdata ); + int hypre_BiCGSTABSetup( void *bicgstab_vdata , void *A , void *b , void *x ); + int hypre_BiCGSTABSolve( void *bicgstab_vdata , void *A , void *b , void *x ); + int hypre_BiCGSTABSetTol( void *bicgstab_vdata , double tol ); + int hypre_BiCGSTABSetMinIter( void *bicgstab_vdata , int min_iter ); + int hypre_BiCGSTABSetMaxIter( void *bicgstab_vdata , int max_iter ); + int hypre_BiCGSTABSetStopCrit( void *bicgstab_vdata , double stop_crit ); + int hypre_BiCGSTABSetPrecond( void *bicgstab_vdata , int (*precond )(), int (*precond_setup )(), void *precond_data ); + int hypre_BiCGSTABGetPrecond( void *bicgstab_vdata , HYPRE_Solver *precond_data_ptr ); + int hypre_BiCGSTABSetLogging( void *bicgstab_vdata , int logging ); + int hypre_BiCGSTABGetNumIterations( void *bicgstab_vdata , int *num_iterations ); + int hypre_BiCGSTABGetFinalRelativeResidualNorm( void *bicgstab_vdata , double *relative_residual_norm ); + + /* cgnr.c */ + 
hypre_CGNRFunctions *hypre_CGNRFunctionsCreate( int (*CommInfo )(void *A ,int *my_id ,int *num_procs ), void *(*CreateVector )(void *vector ), int (*DestroyVector )(void *vector ), void *(*MatvecCreate )(void *A ,void *x ), int (*Matvec )(void *matvec_data ,double alpha ,void *A ,void *x ,double beta ,void *y ), int (*MatvecT )(void *matvec_data ,double alpha ,void *A ,void *x ,double beta ,void *y ), int (*MatvecDestroy )(void *matvec_data ), double (*InnerProd )(void *x ,void *y ), int (*CopyVector )(void *x ,void *y ), int (*ClearVector )(void *x ), int (*ScaleVector )(double alpha ,void *x ), int (*Axpy )(double alpha ,void *x ,void *y ), int (*PrecondSetup )(void *vdata ,void *A ,void *b ,void *x ), int (*Precond )(void *vdata ,void *A ,void *b ,void *x ), int (*PrecondT )(void *vdata ,void *A ,void *b ,void *x )); + void *hypre_CGNRCreate( hypre_CGNRFunctions *cgnr_functions ); + int hypre_CGNRDestroy( void *cgnr_vdata ); + int hypre_CGNRSetup( void *cgnr_vdata , void *A , void *b , void *x ); + int hypre_CGNRSolve( void *cgnr_vdata , void *A , void *b , void *x ); + int hypre_CGNRSetTol( void *cgnr_vdata , double tol ); + int hypre_CGNRSetMinIter( void *cgnr_vdata , int min_iter ); + int hypre_CGNRSetMaxIter( void *cgnr_vdata , int max_iter ); + int hypre_CGNRSetStopCrit( void *cgnr_vdata , int stop_crit ); + int hypre_CGNRSetPrecond( void *cgnr_vdata , int (*precond )(), int (*precondT )(), int (*precond_setup )(), void *precond_data ); + int hypre_CGNRGetPrecond( void *cgnr_vdata , HYPRE_Solver *precond_data_ptr ); + int hypre_CGNRSetLogging( void *cgnr_vdata , int logging ); + int hypre_CGNRGetNumIterations( void *cgnr_vdata , int *num_iterations ); + int hypre_CGNRGetFinalRelativeResidualNorm( void *cgnr_vdata , double *relative_residual_norm ); + + /* gmres.c */ + hypre_GMRESFunctions *hypre_GMRESFunctionsCreate( char *(*CAlloc )(int count ,int elt_size ), int (*Free )(char *ptr ), int (*CommInfo )(void *A ,int *my_id ,int *num_procs ), void 
*(*CreateVector )(void *vector ), void *(*CreateVectorArray )(int size ,void *vectors ), int (*DestroyVector )(void *vector ), void *(*MatvecCreate )(void *A ,void *x ), int (*Matvec )(void *matvec_data ,double alpha ,void *A ,void *x ,double beta ,void *y ), int (*MatvecDestroy )(void *matvec_data ), double (*InnerProd )(void *x ,void *y ), int (*CopyVector )(void *x ,void *y ), int (*ClearVector )(void *x ), int (*ScaleVector )(double alpha ,void *x ), int (*Axpy )(double alpha ,void *x ,void *y ), int (*PrecondSetup )(void *vdata ,void *A ,void *b ,void *x ), int (*Precond )(void *vdata ,void *A ,void *b ,void *x )); + void *hypre_GMRESCreate( hypre_GMRESFunctions *gmres_functions ); + int hypre_GMRESDestroy( void *gmres_vdata ); + int hypre_GMRESSetup( void *gmres_vdata , void *A , void *b , void *x ); + int hypre_GMRESSolve( void *gmres_vdata , void *A , void *b , void *x ); + int hypre_GMRESSetKDim( void *gmres_vdata , int k_dim ); + int hypre_GMRESSetTol( void *gmres_vdata , double tol ); + int hypre_GMRESSetMinIter( void *gmres_vdata , int min_iter ); + int hypre_GMRESSetMaxIter( void *gmres_vdata , int max_iter ); + int hypre_GMRESSetStopCrit( void *gmres_vdata , double stop_crit ); + int hypre_GMRESSetPrecond( void *gmres_vdata , int (*precond )(), int (*precond_setup )(), void *precond_data ); + int hypre_GMRESGetPrecond( void *gmres_vdata , HYPRE_Solver *precond_data_ptr ); + int hypre_GMRESSetLogging( void *gmres_vdata , int logging ); + int hypre_GMRESGetNumIterations( void *gmres_vdata , int *num_iterations ); + int hypre_GMRESGetFinalRelativeResidualNorm( void *gmres_vdata , double *relative_residual_norm ); + + /* pcg.c */ + hypre_PCGFunctions *hypre_PCGFunctionsCreate( char *(*CAlloc )(int count ,int elt_size ), int (*Free )(char *ptr ), void *(*CreateVector )(void *vector ), int (*DestroyVector )(void *vector ), void *(*MatvecCreate )(void *A ,void *x ), int (*Matvec )(void *matvec_data ,double alpha ,void *A ,void *x ,double beta ,void *y ), int 
(*MatvecDestroy )(void *matvec_data ), double (*InnerProd )(void *x ,void *y ), int (*CopyVector )(void *x ,void *y ), int (*ClearVector )(void *x ), int (*ScaleVector )(double alpha ,void *x ), int (*Axpy )(double alpha ,void *x ,void *y ), int (*PrecondSetup )(void *vdata ,void *A ,void *b ,void *x ), int (*Precond )(void *vdata ,void *A ,void *b ,void *x )); + void *hypre_PCGCreate( hypre_PCGFunctions *pcg_functions ); + int hypre_PCGDestroy( void *pcg_vdata ); + int hypre_PCGSetup( void *pcg_vdata , void *A , void *b , void *x ); + int hypre_PCGSolve( void *pcg_vdata , void *A , void *b , void *x ); + int hypre_PCGSetTol( void *pcg_vdata , double tol ); + int hypre_PCGSetConvergenceFactorTol( void *pcg_vdata , double cf_tol ); + int hypre_PCGSetMaxIter( void *pcg_vdata , int max_iter ); + int hypre_PCGSetTwoNorm( void *pcg_vdata , int two_norm ); + int hypre_PCGSetRelChange( void *pcg_vdata , int rel_change ); + int hypre_PCGSetStopCrit( void *pcg_vdata , int stop_crit ); + int hypre_PCGGetPrecond( void *pcg_vdata , HYPRE_Solver *precond_data_ptr ); + int hypre_PCGSetPrecond( void *pcg_vdata , int (*precond )(), int (*precond_setup )(), void *precond_data ); + int hypre_PCGSetLogging( void *pcg_vdata , int logging ); + int hypre_PCGGetNumIterations( void *pcg_vdata , int *num_iterations ); + int hypre_PCGPrintLogging( void *pcg_vdata , int myid ); + int hypre_PCGGetFinalRelativeResidualNorm( void *pcg_vdata , double *relative_residual_norm ); + + + #ifdef __cplusplus + } + #endif + + #endif + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/memory.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/memory.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/memory.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,311 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See 
the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Memory management utilities + * + *****************************************************************************/ + #include <stdlib.h> + #include <stdio.h> + #include "utilities.h" + + #ifdef HYPRE_USE_PTHREADS + #include "threading.h" + + #ifdef HYPRE_USE_UMALLOC + #include "umalloc_local.h" + + #define _umalloc_(size) (threadid == hypre_NumThreads) ? \ + (char *) malloc(size) : \ + (char *) _umalloc(_uparam[threadid].myheap, size) + #define _ucalloc_(count, size) (threadid == hypre_NumThreads) ? \ + (char *) calloc(count, size) : \ + (char *) _ucalloc(_uparam[threadid].myheap,\ + count, size) + #define _urealloc_(ptr, size) (threadid == hypre_NumThreads) ? \ + (char *) realloc(ptr, size) : \ + (char *) _urealloc(ptr, size) + #define _ufree_(ptr) (threadid == hypre_NumThreads) ?
\ + free(ptr) : _ufree(ptr) + #endif + #else + #ifdef HYPRE_USE_UMALLOC + #undef HYPRE_USE_UMALLOC + #endif + #endif + + /****************************************************************************** + * + * Standard routines + * + *****************************************************************************/ + + /*-------------------------------------------------------------------------- + * hypre_OutOfMemory + *--------------------------------------------------------------------------*/ + + int + hypre_OutOfMemory( int size ) + { + printf("Out of memory trying to allocate %d bytes\n", size); + fflush(stdout); + + return 0; + } + + /*-------------------------------------------------------------------------- + * hypre_MAlloc + *--------------------------------------------------------------------------*/ + + char * + hypre_MAlloc( int size ) + { + char *ptr; + + if (size > 0) + { + #ifdef HYPRE_USE_UMALLOC + int threadid = hypre_GetThreadID(); + + ptr = _umalloc_(size); + #else + ptr = malloc(size); + #endif + + #if 1 + if (ptr == NULL) + { + hypre_OutOfMemory(size); + } + #endif + } + else + { + ptr = NULL; + } + + return ptr; + } + + /*-------------------------------------------------------------------------- + * hypre_CAlloc + *--------------------------------------------------------------------------*/ + + char * + hypre_CAlloc( int count, + int elt_size ) + { + char *ptr; + int size = count*elt_size; + + if (size > 0) + { + #ifdef HYPRE_USE_UMALLOC + int threadid = hypre_GetThreadID(); + + ptr = _ucalloc_(count, elt_size); + #else + ptr = calloc(count, elt_size); + #endif + + #if 1 + if (ptr == NULL) + { + hypre_OutOfMemory(size); + } + #endif + } + else + { + ptr = NULL; + } + + return ptr; + } + + /*-------------------------------------------------------------------------- + * hypre_ReAlloc + *--------------------------------------------------------------------------*/ + + char * + hypre_ReAlloc( char *ptr, + int size ) + { + #ifdef HYPRE_USE_UMALLOC + if 
(ptr == NULL) + { + ptr = hypre_MAlloc(size); + } + else if (size == 0) + { + hypre_Free(ptr); + } + else + { + int threadid = hypre_GetThreadID(); + ptr = _urealloc_(ptr, size); + } + #else + ptr = realloc(ptr, size); + #endif + + #if 1 + if ((ptr == NULL) && (size > 0)) + { + hypre_OutOfMemory(size); + } + #endif + + return ptr; + } + + /*-------------------------------------------------------------------------- + * hypre_Free + *--------------------------------------------------------------------------*/ + + void + hypre_Free( char *ptr ) + { + if (ptr) + { + #ifdef HYPRE_USE_UMALLOC + int threadid = hypre_GetThreadID(); + + _ufree_(ptr); + #else + free(ptr); + #endif + } + } + + + /*-------------------------------------------------------------------------- + * These Shared routines are for one thread to allocate memory for data that + * will be visible to all threads. The file-scope pointer + * global_alloc_ptr is used in these routines. + *--------------------------------------------------------------------------*/ + + #ifdef HYPRE_USE_PTHREADS + + char *global_alloc_ptr; + double *global_data_ptr; + + /*-------------------------------------------------------------------------- + * hypre_SharedMAlloc + *--------------------------------------------------------------------------*/ + + char * + hypre_SharedMAlloc( int size ) + { + char *ptr; + int unthreaded = pthread_equal(initial_thread, pthread_self()); + int I_call_malloc = unthreaded || + pthread_equal(hypre_thread[0],pthread_self()); + + if (I_call_malloc) { + global_alloc_ptr = hypre_MAlloc( size ); + } + + hypre_barrier(&talloc_mtx, unthreaded); + ptr = global_alloc_ptr; + hypre_barrier(&talloc_mtx, unthreaded); + + return ptr; + } + + /*-------------------------------------------------------------------------- + * hypre_SharedCAlloc + *--------------------------------------------------------------------------*/ + + char * + hypre_SharedCAlloc( int count, + int elt_size ) + { + char *ptr; + int unthreaded =
pthread_equal(initial_thread, pthread_self()); + int I_call_calloc = unthreaded || + pthread_equal(hypre_thread[0],pthread_self()); + + if (I_call_calloc) { + global_alloc_ptr = hypre_CAlloc( count, elt_size ); + } + + hypre_barrier(&talloc_mtx, unthreaded); + ptr = global_alloc_ptr; + hypre_barrier(&talloc_mtx, unthreaded); + + return ptr; + } + + /*-------------------------------------------------------------------------- + * hypre_SharedReAlloc + *--------------------------------------------------------------------------*/ + + char * + hypre_SharedReAlloc( char *ptr, + int size ) + { + int unthreaded = pthread_equal(initial_thread, pthread_self()); + int I_call_realloc = unthreaded || + pthread_equal(hypre_thread[0],pthread_self()); + + if (I_call_realloc) { + global_alloc_ptr = hypre_ReAlloc( ptr, size ); + } + + hypre_barrier(&talloc_mtx, unthreaded); + ptr = global_alloc_ptr; + hypre_barrier(&talloc_mtx, unthreaded); + + return ptr; + } + + /*-------------------------------------------------------------------------- + * hypre_SharedFree + *--------------------------------------------------------------------------*/ + + void + hypre_SharedFree( char *ptr ) + { + int unthreaded = pthread_equal(initial_thread, pthread_self()); + int I_call_free = unthreaded || + pthread_equal(hypre_thread[0],pthread_self()); + + hypre_barrier(&talloc_mtx, unthreaded); + if (I_call_free) { + hypre_Free(ptr); + } + hypre_barrier(&talloc_mtx, unthreaded); + } + + /*-------------------------------------------------------------------------- + * hypre_IncrementSharedDataPtr + *--------------------------------------------------------------------------*/ + + double * + hypre_IncrementSharedDataPtr( double *ptr, int size ) + { + int unthreaded = pthread_equal(initial_thread, pthread_self()); + int I_increment = unthreaded || + pthread_equal(hypre_thread[0],pthread_self()); + + if (I_increment) { + global_data_ptr = ptr + size; + } + + hypre_barrier(&talloc_mtx, unthreaded); + ptr = 
global_data_ptr; + hypre_barrier(&talloc_mtx, unthreaded); + + return ptr; + } + + #endif + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/memory.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/memory.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/memory.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,125 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header file for memory management utilities + * + *****************************************************************************/ + + #ifndef hypre_MEMORY_HEADER + #define hypre_MEMORY_HEADER + + #ifdef __cplusplus + extern "C" { + #endif + + /*-------------------------------------------------------------------------- + * Use "Debug Malloc Library", dmalloc + *--------------------------------------------------------------------------*/ + + #ifdef HYPRE_MEMORY_DMALLOC + + #define hypre_InitMemoryDebug(id) hypre_InitMemoryDebugDML(id) + #define hypre_FinalizeMemoryDebug() hypre_FinalizeMemoryDebugDML() + + #define hypre_TAlloc(type, count) \ + ( (type *)hypre_MAllocDML((unsigned int)(sizeof(type) * (count)),\ + __FILE__, __LINE__) ) + + #define hypre_CTAlloc(type, count) \ + ( (type *)hypre_CAllocDML((unsigned int)(count), (unsigned int)sizeof(type),\ + __FILE__, __LINE__) ) + + #define hypre_TReAlloc(ptr, type, count) \ + ( (type *)hypre_ReAllocDML((char *)ptr,\ + (unsigned int)(sizeof(type) * (count)),\ + __FILE__, __LINE__) ) + + #define hypre_TFree(ptr) \ + ( hypre_FreeDML((char *)ptr, __FILE__, __LINE__), ptr = 
NULL ) + + /*-------------------------------------------------------------------------- + * Use standard memory routines + *--------------------------------------------------------------------------*/ + + #else + + #define hypre_InitMemoryDebug(id) + #define hypre_FinalizeMemoryDebug() + + #define hypre_TAlloc(type, count) \ + ( (type *)hypre_MAlloc((unsigned int)(sizeof(type) * (count))) ) + + #define hypre_CTAlloc(type, count) \ + ( (type *)hypre_CAlloc((unsigned int)(count), (unsigned int)sizeof(type)) ) + + #define hypre_TReAlloc(ptr, type, count) \ + ( (type *)hypre_ReAlloc((char *)ptr, (unsigned int)(sizeof(type) * (count))) ) + + #define hypre_TFree(ptr) \ + ( hypre_Free((char *)ptr), ptr = NULL ) + + #endif + + + #ifdef HYPRE_USE_PTHREADS + + #define hypre_SharedTAlloc(type, count) \ + ( (type *)hypre_SharedMAlloc((unsigned int)(sizeof(type) * (count))) ) + + + #define hypre_SharedCTAlloc(type, count) \ + ( (type *)hypre_SharedCAlloc((unsigned int)(count),\ + (unsigned int)sizeof(type)) ) + + #define hypre_SharedTReAlloc(ptr, type, count) \ + ( (type *)hypre_SharedReAlloc((char *)ptr,\ + (unsigned int)(sizeof(type) * (count))) ) + + #define hypre_SharedTFree(ptr) \ + ( hypre_SharedFree((char *)ptr), ptr = NULL ) + + #else + + #define hypre_SharedTAlloc(type, count) hypre_TAlloc(type, (count)) + #define hypre_SharedCTAlloc(type, count) hypre_CTAlloc(type, (count)) + #define hypre_SharedTReAlloc(ptr, type, count) hypre_TReAlloc(ptr, type, (count)) + #define hypre_SharedTFree(ptr) hypre_TFree(ptr) + + #endif + + /*-------------------------------------------------------------------------- + * Prototypes + *--------------------------------------------------------------------------*/ + + /* memory.c */ + int hypre_OutOfMemory( int size ); + char *hypre_MAlloc( int size ); + char *hypre_CAlloc( int count , int elt_size ); + char *hypre_ReAlloc( char *ptr , int size ); + void hypre_Free( char *ptr ); + char *hypre_SharedMAlloc( int size ); + char *hypre_SharedCAlloc( int
count , int elt_size ); + char *hypre_SharedReAlloc( char *ptr , int size ); + void hypre_SharedFree( char *ptr ); + double *hypre_IncrementSharedDataPtr( double *ptr , int size ); + + /* memory_dmalloc.c */ + int hypre_InitMemoryDebugDML( int id ); + int hypre_FinalizeMemoryDebugDML( void ); + char *hypre_MAllocDML( int size , char *file , int line ); + char *hypre_CAllocDML( int count , int elt_size , char *file , int line ); + char *hypre_ReAllocDML( char *ptr , int size , char *file , int line ); + void hypre_FreeDML( char *ptr , char *file , int line ); + + #ifdef __cplusplus + } + #endif + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/mpistubs.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/mpistubs.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/mpistubs.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,496 ---- + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Fake mpi stubs to generate serial codes without mpi + * + *****************************************************************************/ + + #include "utilities.h" + + #ifdef HYPRE_SEQUENTIAL + + int + hypre_MPI_Init( int *argc, + char ***argv ) + { + return(0); + } + + int + hypre_MPI_Finalize( ) + { + return(0); + } + + int + hypre_MPI_Abort( hypre_MPI_Comm comm, + int errorcode ) + { + return(0); + } + + double + hypre_MPI_Wtime( ) + { + return(0.0); + } + + double + hypre_MPI_Wtick( ) + { + return(0.0); + } + + int + hypre_MPI_Barrier( hypre_MPI_Comm comm ) + { + return(0); + } + + int + hypre_MPI_Comm_create( hypre_MPI_Comm comm, + hypre_MPI_Group group, + hypre_MPI_Comm *newcomm ) + { + return(0); + } + + int + hypre_MPI_Comm_dup( hypre_MPI_Comm comm, + hypre_MPI_Comm *newcomm ) + { + return(0); + } + + int + hypre_MPI_Comm_size( hypre_MPI_Comm comm, + int *size ) + { + *size = 1; + return(0); + } + + int + hypre_MPI_Comm_rank( hypre_MPI_Comm comm, + int *rank ) + { + *rank = 0; + return(0); + } + + int + hypre_MPI_Comm_free( hypre_MPI_Comm *comm ) + { + return 0; + } + + int + hypre_MPI_Comm_group( hypre_MPI_Comm comm, + hypre_MPI_Group *group ) + { + return(0); + } + + int + hypre_MPI_Group_incl( hypre_MPI_Group group, + int n, + int *ranks, + hypre_MPI_Group *newgroup ) + { + return(0); + } + + int + hypre_MPI_Group_free( hypre_MPI_Group *group ) + { + return 0; + } + + int + hypre_MPI_Address( void *location, + hypre_MPI_Aint *address ) + { + return(0); + } + + int + hypre_MPI_Get_count( hypre_MPI_Status *status, + hypre_MPI_Datatype datatype, + int *count ) + { + return(0); + } + + int + hypre_MPI_Alltoall( void *sendbuf, + int sendcount, + hypre_MPI_Datatype sendtype, + void *recvbuf, + int recvcount, + hypre_MPI_Datatype recvtype, + hypre_MPI_Comm comm ) + { + 
return(0); + } + + int + hypre_MPI_Allgather( void *sendbuf, + int sendcount, + hypre_MPI_Datatype sendtype, + void *recvbuf, + int recvcount, + hypre_MPI_Datatype recvtype, + hypre_MPI_Comm comm ) + { + int i; + + switch (sendtype) + { + case hypre_MPI_INT: + { + int *crecvbuf = (int *)recvbuf; + int *csendbuf = (int *)sendbuf; + for (i = 0; i < sendcount; i++) + { + crecvbuf[i] = csendbuf[i]; + } + } + break; + + case hypre_MPI_DOUBLE: + { + double *crecvbuf = (double *)recvbuf; + double *csendbuf = (double *)sendbuf; + for (i = 0; i < sendcount; i++) + { + crecvbuf[i] = csendbuf[i]; + } + } + break; + + case hypre_MPI_CHAR: + { + char *crecvbuf = (char *)recvbuf; + char *csendbuf = (char *)sendbuf; + for (i = 0; i < sendcount; i++) + { + crecvbuf[i] = csendbuf[i]; + } + } + break; + } + + return(0); + } + + int + hypre_MPI_Allgatherv( void *sendbuf, + int sendcount, + hypre_MPI_Datatype sendtype, + void *recvbuf, + int *recvcounts, + int *displs, + hypre_MPI_Datatype recvtype, + hypre_MPI_Comm comm ) + { + return ( hypre_MPI_Allgather(sendbuf, sendcount, sendtype, + recvbuf, *recvcounts, recvtype, comm) ); + } + + int + hypre_MPI_Gather( void *sendbuf, + int sendcount, + hypre_MPI_Datatype sendtype, + void *recvbuf, + int recvcount, + hypre_MPI_Datatype recvtype, + int root, + hypre_MPI_Comm comm ) + { + return ( hypre_MPI_Allgather(sendbuf, sendcount, sendtype, + recvbuf, recvcount, recvtype, comm) ); + } + + int + hypre_MPI_Scatter( void *sendbuf, + int sendcount, + hypre_MPI_Datatype sendtype, + void *recvbuf, + int recvcount, + hypre_MPI_Datatype recvtype, + int root, + hypre_MPI_Comm comm ) + { + return ( hypre_MPI_Allgather(sendbuf, sendcount, sendtype, + recvbuf, recvcount, recvtype, comm) ); + } + + int + hypre_MPI_Bcast( void *buffer, + int count, + hypre_MPI_Datatype datatype, + int root, + hypre_MPI_Comm comm ) + { + return(0); + } + + int + hypre_MPI_Send( void *buf, + int count, + hypre_MPI_Datatype datatype, + int dest, + int tag, + hypre_MPI_Comm 
comm ) + { + return(0); + } + + int + hypre_MPI_Recv( void *buf, + int count, + hypre_MPI_Datatype datatype, + int source, + int tag, + hypre_MPI_Comm comm, + hypre_MPI_Status *status ) + { + return(0); + } + + int + hypre_MPI_Isend( void *buf, + int count, + hypre_MPI_Datatype datatype, + int dest, + int tag, + hypre_MPI_Comm comm, + hypre_MPI_Request *request ) + { + return(0); + } + + int + hypre_MPI_Irecv( void *buf, + int count, + hypre_MPI_Datatype datatype, + int source, + int tag, + hypre_MPI_Comm comm, + hypre_MPI_Request *request ) + { + return(0); + } + + int + hypre_MPI_Send_init( void *buf, + int count, + hypre_MPI_Datatype datatype, + int dest, + int tag, + hypre_MPI_Comm comm, + hypre_MPI_Request *request ) + { + return 0; + } + + int + hypre_MPI_Recv_init( void *buf, + int count, + hypre_MPI_Datatype datatype, + int dest, + int tag, + hypre_MPI_Comm comm, + hypre_MPI_Request *request ) + { + return 0; + } + + int + hypre_MPI_Irsend( void *buf, + int count, + hypre_MPI_Datatype datatype, + int dest, + int tag, + hypre_MPI_Comm comm, + hypre_MPI_Request *request ) + { + return 0; + } + + int + hypre_MPI_Startall( int count, + hypre_MPI_Request *array_of_requests ) + { + return 0; + } + + int + hypre_MPI_Probe( int source, + int tag, + hypre_MPI_Comm comm, + hypre_MPI_Status *status ) + { + return 0; + } + + int + hypre_MPI_Iprobe( int source, + int tag, + hypre_MPI_Comm comm, + int *flag, + hypre_MPI_Status *status ) + { + return 0; + } + + int + hypre_MPI_Test( hypre_MPI_Request *request, + int *flag, + hypre_MPI_Status *status ) + { + *flag = 1; + return(0); + } + + int + hypre_MPI_Testall( int count, + hypre_MPI_Request *array_of_requests, + int *flag, + hypre_MPI_Status *array_of_statuses ) + { + *flag = 1; + return(0); + } + + int + hypre_MPI_Wait( hypre_MPI_Request *request, + hypre_MPI_Status *status ) + { + return(0); + } + + int + hypre_MPI_Waitall( int count, + hypre_MPI_Request *array_of_requests, + hypre_MPI_Status *array_of_statuses ) + { 
+ return(0); + } + + int + hypre_MPI_Waitany( int count, + hypre_MPI_Request *array_of_requests, + int *index, + hypre_MPI_Status *status ) + { + return(0); + } + + int + hypre_MPI_Allreduce( void *sendbuf, + void *recvbuf, + int count, + hypre_MPI_Datatype datatype, + hypre_MPI_Op op, + hypre_MPI_Comm comm ) + { + switch (datatype) + { + case hypre_MPI_INT: + { + int *crecvbuf = (int *)recvbuf; + int *csendbuf = (int *)sendbuf; + crecvbuf[0] = csendbuf[0]; + } + break; + + case hypre_MPI_DOUBLE: + { + double *crecvbuf = (double *)recvbuf; + double *csendbuf = (double *)sendbuf; + crecvbuf[0] = csendbuf[0]; + } + break; + + case hypre_MPI_CHAR: + { + char *crecvbuf = (char *)recvbuf; + char *csendbuf = (char *)sendbuf; + crecvbuf[0] = csendbuf[0]; + } + break; + } + + return(0); + } + + int + hypre_MPI_Request_free( hypre_MPI_Request *request ) + { + return 0; + } + + int + hypre_MPI_Type_contiguous( int count, + hypre_MPI_Datatype oldtype, + hypre_MPI_Datatype *newtype ) + { + return(0); + } + + int + hypre_MPI_Type_vector( int count, + int blocklength, + int stride, + hypre_MPI_Datatype oldtype, + hypre_MPI_Datatype *newtype ) + { + return(0); + } + + int + hypre_MPI_Type_hvector( int count, + int blocklength, + hypre_MPI_Aint stride, + hypre_MPI_Datatype oldtype, + hypre_MPI_Datatype *newtype ) + { + return(0); + } + + int + hypre_MPI_Type_struct( int count, + int *array_of_blocklengths, + hypre_MPI_Aint *array_of_displacements, + hypre_MPI_Datatype *array_of_types, + hypre_MPI_Datatype *newtype ) + { + return(0); + } + + int + hypre_MPI_Type_commit( hypre_MPI_Datatype *datatype ) + { + return(0); + } + + int + hypre_MPI_Type_free( hypre_MPI_Datatype *datatype ) + { + return(0); + } + + #else + + /* this is used only to eliminate compiler warnings */ + int hypre_empty; + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/mpistubs.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/mpistubs.h:1.1 *** /dev/null Mon Apr 11 
00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/mpistubs.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,192 ---- + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Fake mpi stubs to generate serial codes without mpi + * + *****************************************************************************/ + + #ifndef hypre_MPISTUBS + #define hypre_MPISTUBS + + #ifdef HYPRE_SEQUENTIAL + + #ifdef __cplusplus + extern "C" { + #endif + + /*-------------------------------------------------------------------------- + * Change all MPI names to hypre_MPI names to avoid link conflicts + * + * NOTE: MPI_Comm is the only MPI symbol in the HYPRE user interface, + * and is defined in `HYPRE_utilities.h'. 
+ *--------------------------------------------------------------------------*/ + + #define MPI_Comm hypre_MPI_Comm + #define MPI_Group hypre_MPI_Group + #define MPI_Request hypre_MPI_Request + #define MPI_Datatype hypre_MPI_Datatype + #define MPI_Status hypre_MPI_Status + #define MPI_Op hypre_MPI_Op + #define MPI_Aint hypre_MPI_Aint + + #define MPI_COMM_WORLD hypre_MPI_COMM_WORLD + + #define MPI_BOTTOM hypre_MPI_BOTTOM + + #define MPI_DOUBLE hypre_MPI_DOUBLE + #define MPI_INT hypre_MPI_INT + #define MPI_CHAR hypre_MPI_CHAR + #define MPI_LONG hypre_MPI_LONG + + #define MPI_SUM hypre_MPI_SUM + #define MPI_MIN hypre_MPI_MIN + #define MPI_MAX hypre_MPI_MAX + #define MPI_LOR hypre_MPI_LOR + + #define MPI_UNDEFINED hypre_MPI_UNDEFINED + #define MPI_REQUEST_NULL hypre_MPI_REQUEST_NULL + #define MPI_ANY_SOURCE hypre_MPI_ANY_SOURCE + + #define MPI_Init hypre_MPI_Init + #define MPI_Finalize hypre_MPI_Finalize + #define MPI_Abort hypre_MPI_Abort + #define MPI_Wtime hypre_MPI_Wtime + #define MPI_Wtick hypre_MPI_Wtick + #define MPI_Barrier hypre_MPI_Barrier + #define MPI_Comm_create hypre_MPI_Comm_create + #define MPI_Comm_dup hypre_MPI_Comm_dup + #define MPI_Comm_group hypre_MPI_Comm_group + #define MPI_Comm_size hypre_MPI_Comm_size + #define MPI_Comm_rank hypre_MPI_Comm_rank + #define MPI_Comm_free hypre_MPI_Comm_free + #define MPI_Group_incl hypre_MPI_Group_incl + #define MPI_Group_free hypre_MPI_Group_free + #define MPI_Address hypre_MPI_Address + #define MPI_Get_count hypre_MPI_Get_count + #define MPI_Alltoall hypre_MPI_Alltoall + #define MPI_Allgather hypre_MPI_Allgather + #define MPI_Allgatherv hypre_MPI_Allgatherv + #define MPI_Gather hypre_MPI_Gather + #define MPI_Scatter hypre_MPI_Scatter + #define MPI_Bcast hypre_MPI_Bcast + #define MPI_Send hypre_MPI_Send + #define MPI_Recv hypre_MPI_Recv + #define MPI_Isend hypre_MPI_Isend + #define MPI_Irecv hypre_MPI_Irecv + #define MPI_Send_init hypre_MPI_Send_init + #define MPI_Recv_init hypre_MPI_Recv_init + #define 
MPI_Irsend hypre_MPI_Irsend + #define MPI_Startall hypre_MPI_Startall + #define MPI_Probe hypre_MPI_Probe + #define MPI_Iprobe hypre_MPI_Iprobe + #define MPI_Test hypre_MPI_Test + #define MPI_Testall hypre_MPI_Testall + #define MPI_Wait hypre_MPI_Wait + #define MPI_Waitall hypre_MPI_Waitall + #define MPI_Waitany hypre_MPI_Waitany + #define MPI_Allreduce hypre_MPI_Allreduce + #define MPI_Request_free hypre_MPI_Request_free + #define MPI_Type_contiguous hypre_MPI_Type_contiguous + #define MPI_Type_vector hypre_MPI_Type_vector + #define MPI_Type_hvector hypre_MPI_Type_hvector + #define MPI_Type_struct hypre_MPI_Type_struct + #define MPI_Type_commit hypre_MPI_Type_commit + #define MPI_Type_free hypre_MPI_Type_free + + /*-------------------------------------------------------------------------- + * Types, etc. + *--------------------------------------------------------------------------*/ + + /* These types have associated creation and destruction routines */ + typedef int hypre_MPI_Comm; + typedef int hypre_MPI_Group; + typedef int hypre_MPI_Request; + typedef int hypre_MPI_Datatype; + + typedef struct { int MPI_SOURCE; } hypre_MPI_Status; + typedef int hypre_MPI_Op; + typedef int hypre_MPI_Aint; + + #define hypre_MPI_COMM_WORLD 0 + + #define hypre_MPI_BOTTOM 0x0 + + #define hypre_MPI_DOUBLE 0 + #define hypre_MPI_INT 1 + #define hypre_MPI_CHAR 2 + #define hypre_MPI_LONG 3 + + #define hypre_MPI_SUM 0 + #define hypre_MPI_MIN 1 + #define hypre_MPI_MAX 2 + #define hypre_MPI_LOR 3 + + #define hypre_MPI_UNDEFINED -9999 + #define hypre_MPI_REQUEST_NULL 0 + #define hypre_MPI_ANY_SOURCE 1 + + /*-------------------------------------------------------------------------- + * Prototypes + *--------------------------------------------------------------------------*/ + + /* mpistubs.c */ + int hypre_MPI_Init( int *argc , char ***argv ); + int hypre_MPI_Finalize( void ); + int hypre_MPI_Abort( hypre_MPI_Comm comm , int errorcode ); + double hypre_MPI_Wtime( void ); + double 
hypre_MPI_Wtick( void ); + int hypre_MPI_Barrier( hypre_MPI_Comm comm ); + int hypre_MPI_Comm_create( hypre_MPI_Comm comm , hypre_MPI_Group group , hypre_MPI_Comm *newcomm ); + int hypre_MPI_Comm_dup( hypre_MPI_Comm comm , hypre_MPI_Comm *newcomm ); + int hypre_MPI_Comm_size( hypre_MPI_Comm comm , int *size ); + int hypre_MPI_Comm_rank( hypre_MPI_Comm comm , int *rank ); + int hypre_MPI_Comm_free( hypre_MPI_Comm *comm ); + int hypre_MPI_Comm_group( hypre_MPI_Comm comm , hypre_MPI_Group *group ); + int hypre_MPI_Group_incl( hypre_MPI_Group group , int n , int *ranks , hypre_MPI_Group *newgroup ); + int hypre_MPI_Group_free( hypre_MPI_Group *group ); + int hypre_MPI_Address( void *location , hypre_MPI_Aint *address ); + int hypre_MPI_Get_count( hypre_MPI_Status *status , hypre_MPI_Datatype datatype , int *count ); + int hypre_MPI_Alltoall( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int recvcount , hypre_MPI_Datatype recvtype , hypre_MPI_Comm comm ); + int hypre_MPI_Allgather( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int recvcount , hypre_MPI_Datatype recvtype , hypre_MPI_Comm comm ); + int hypre_MPI_Allgatherv( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int *recvcounts , int *displs , hypre_MPI_Datatype recvtype , hypre_MPI_Comm comm ); + int hypre_MPI_Gather( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int recvcount , hypre_MPI_Datatype recvtype , int root , hypre_MPI_Comm comm ); + int hypre_MPI_Scatter( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int recvcount , hypre_MPI_Datatype recvtype , int root , hypre_MPI_Comm comm ); + int hypre_MPI_Bcast( void *buffer , int count , hypre_MPI_Datatype datatype , int root , hypre_MPI_Comm comm ); + int hypre_MPI_Send( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm ); + int hypre_MPI_Recv( void *buf , int 
count , hypre_MPI_Datatype datatype , int source , int tag , hypre_MPI_Comm comm , hypre_MPI_Status *status ); + int hypre_MPI_Isend( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Irecv( void *buf , int count , hypre_MPI_Datatype datatype , int source , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Send_init( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Recv_init( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Irsend( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Startall( int count , hypre_MPI_Request *array_of_requests ); + int hypre_MPI_Probe( int source , int tag , hypre_MPI_Comm comm , hypre_MPI_Status *status ); + int hypre_MPI_Iprobe( int source , int tag , hypre_MPI_Comm comm , int *flag , hypre_MPI_Status *status ); + int hypre_MPI_Test( hypre_MPI_Request *request , int *flag , hypre_MPI_Status *status ); + int hypre_MPI_Testall( int count , hypre_MPI_Request *array_of_requests , int *flag , hypre_MPI_Status *array_of_statuses ); + int hypre_MPI_Wait( hypre_MPI_Request *request , hypre_MPI_Status *status ); + int hypre_MPI_Waitall( int count , hypre_MPI_Request *array_of_requests , hypre_MPI_Status *array_of_statuses ); + int hypre_MPI_Waitany( int count , hypre_MPI_Request *array_of_requests , int *index , hypre_MPI_Status *status ); + int hypre_MPI_Allreduce( void *sendbuf , void *recvbuf , int count , hypre_MPI_Datatype datatype , hypre_MPI_Op op , hypre_MPI_Comm comm ); + int hypre_MPI_Request_free( hypre_MPI_Request *request ); + int hypre_MPI_Type_contiguous( int count , hypre_MPI_Datatype oldtype , hypre_MPI_Datatype *newtype ); + int 
hypre_MPI_Type_vector( int count , int blocklength , int stride , hypre_MPI_Datatype oldtype , hypre_MPI_Datatype *newtype ); + int hypre_MPI_Type_hvector( int count , int blocklength , hypre_MPI_Aint stride , hypre_MPI_Datatype oldtype , hypre_MPI_Datatype *newtype ); + int hypre_MPI_Type_struct( int count , int *array_of_blocklengths , hypre_MPI_Aint *array_of_displacements , hypre_MPI_Datatype *array_of_types , hypre_MPI_Datatype *newtype ); + int hypre_MPI_Type_commit( hypre_MPI_Datatype *datatype ); + int hypre_MPI_Type_free( hypre_MPI_Datatype *datatype ); + + #ifdef __cplusplus + } + #endif + + #endif + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/pcg.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/pcg.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/pcg.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,658 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Preconditioned conjugate gradient (Omin) functions + * + *****************************************************************************/ + + /* This was based on the pcg.c formerly in struct_ls, with + changes (GetPrecond and stop_crit) for compatibility with the pcg.c + in parcsr_ls and elsewhere. 
Incompatibilities with the + parcsr_ls version: + - logging is different; no attempt has been made to be the same + - treatment of b=0 in Ax=b is different: this returns x=0; the parcsr + version iterates with a special stopping criterion + */ + + #include "krylov.h" + + /*-------------------------------------------------------------------------- + * hypre_PCGFunctionsCreate + *--------------------------------------------------------------------------*/ + + hypre_PCGFunctions * + hypre_PCGFunctionsCreate( + char * (*CAlloc) ( int count, int elt_size ), + int (*Free) ( char *ptr ), + void * (*CreateVector) ( void *vector ), + int (*DestroyVector) ( void *vector ), + void * (*MatvecCreate) ( void *A, void *x ), + int (*Matvec) ( void *matvec_data, double alpha, void *A, + void *x, double beta, void *y ), + int (*MatvecDestroy) ( void *matvec_data ), + double (*InnerProd) ( void *x, void *y ), + int (*CopyVector) ( void *x, void *y ), + int (*ClearVector) ( void *x ), + int (*ScaleVector) ( double alpha, void *x ), + int (*Axpy) ( double alpha, void *x, void *y ), + int (*PrecondSetup) ( void *vdata, void *A, void *b, void *x ), + int (*Precond) ( void *vdata, void *A, void *b, void *x ) + ) + { + hypre_PCGFunctions * pcg_functions; + pcg_functions = (hypre_PCGFunctions *) + CAlloc( 1, sizeof(hypre_PCGFunctions) ); + + pcg_functions->CAlloc = CAlloc; + pcg_functions->Free = Free; + pcg_functions->CreateVector = CreateVector; + pcg_functions->DestroyVector = DestroyVector; + pcg_functions->MatvecCreate = MatvecCreate; + pcg_functions->Matvec = Matvec; + pcg_functions->MatvecDestroy = MatvecDestroy; + pcg_functions->InnerProd = InnerProd; + pcg_functions->CopyVector = CopyVector; + pcg_functions->ClearVector = ClearVector; + pcg_functions->ScaleVector = ScaleVector; + pcg_functions->Axpy = Axpy; + /* default preconditioner must be set here but can be changed later... 
*/ + pcg_functions->precond_setup = PrecondSetup; + pcg_functions->precond = Precond; + + return pcg_functions; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_PCGCreate( hypre_PCGFunctions *pcg_functions ) + { + hypre_PCGData *pcg_data; + + pcg_data = hypre_CTAllocF(hypre_PCGData, 1, pcg_functions); + + pcg_data -> functions = pcg_functions; + + /* set defaults */ + (pcg_data -> tol) = 1.0e-06; + (pcg_data -> cf_tol) = 0.0; + (pcg_data -> max_iter) = 1000; + (pcg_data -> two_norm) = 0; + (pcg_data -> rel_change) = 0; + (pcg_data -> stop_crit) = 0; + (pcg_data -> matvec_data) = NULL; + (pcg_data -> precond_data) = NULL; + (pcg_data -> logging) = 0; + (pcg_data -> norms) = NULL; + (pcg_data -> rel_norms) = NULL; + + return (void *) pcg_data; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_PCGDestroy( void *pcg_vdata ) + { + hypre_PCGData *pcg_data = pcg_vdata; + hypre_PCGFunctions *pcg_functions = pcg_data->functions; + int ierr = 0; + + if (pcg_data) + { + if ((pcg_data -> logging) > 0) + { + hypre_TFreeF( pcg_data -> norms, pcg_functions ); + hypre_TFreeF( pcg_data -> rel_norms, pcg_functions ); + } + + (*(pcg_functions->MatvecDestroy))(pcg_data -> matvec_data); + + (*(pcg_functions->DestroyVector))(pcg_data -> p); + (*(pcg_functions->DestroyVector))(pcg_data -> s); + (*(pcg_functions->DestroyVector))(pcg_data -> r); + + hypre_TFreeF( pcg_data, pcg_functions ); + hypre_TFreeF( pcg_functions, pcg_functions ); + } + + return(ierr); + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetup + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSetup( void *pcg_vdata, + void 
*A, + void *b, + void *x ) + { + hypre_PCGData *pcg_data = pcg_vdata; + hypre_PCGFunctions *pcg_functions = pcg_data->functions; + int max_iter = (pcg_data -> max_iter); + int (*precond_setup)() = (pcg_functions -> precond_setup); + void *precond_data = (pcg_data -> precond_data); + int ierr = 0; + + (pcg_data -> A) = A; + + /*-------------------------------------------------- + * The arguments for CreateVector are important to + * maintain consistency between the setup and + * compute phases of matvec and the preconditioner. + *--------------------------------------------------*/ + + (pcg_data -> p) = (*(pcg_functions->CreateVector))(x); + (pcg_data -> s) = (*(pcg_functions->CreateVector))(x); + (pcg_data -> r) = (*(pcg_functions->CreateVector))(b); + + (pcg_data -> matvec_data) = (*(pcg_functions->MatvecCreate))(A, x); + + precond_setup(precond_data, A, b, x); + + /*----------------------------------------------------- + * Allocate space for log info + *-----------------------------------------------------*/ + + if ((pcg_data -> logging) > 0) + { + (pcg_data -> norms) = hypre_CTAllocF( double, max_iter + 1, + pcg_functions); + (pcg_data -> rel_norms) = hypre_CTAllocF( double, max_iter + 1, + pcg_functions ); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSolve + *-------------------------------------------------------------------------- + * + * We use the following convergence test as the default (see Ashby, Holst, + * Manteuffel, and Saylor): + * + * ||e||_A ||r||_C + * ------- <= [kappa_A(C*A)]^(1/2) ------- < tol + * ||x||_A ||b||_C + * + * where we let (for the time being) kappa_A(CA) = 1. 
+ * We implement the test as: + * + * gamma = <C*r,r>/<C*b,b> < (tol^2) = eps + * + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSolve( void *pcg_vdata, + void *A, + void *b, + void *x ) + { + hypre_PCGData *pcg_data = pcg_vdata; + hypre_PCGFunctions *pcg_functions = pcg_data->functions; + + double tol = (pcg_data -> tol); + double cf_tol = (pcg_data -> cf_tol); + int max_iter = (pcg_data -> max_iter); + int two_norm = (pcg_data -> two_norm); + int rel_change = (pcg_data -> rel_change); + int stop_crit = (pcg_data -> stop_crit); + void *p = (pcg_data -> p); + void *s = (pcg_data -> s); + void *r = (pcg_data -> r); + void *matvec_data = (pcg_data -> matvec_data); + int (*precond)() = (pcg_functions -> precond); + void *precond_data = (pcg_data -> precond_data); + int logging = (pcg_data -> logging); + double *norms = (pcg_data -> norms); + double *rel_norms = (pcg_data -> rel_norms); + + double alpha, beta; + double gamma, gamma_old; + double bi_prod, i_prod, eps; + double pi_prod, xi_prod; + + double i_prod_0; + double cf_ave_0 = 0.0; + double cf_ave_1 = 0.0; + double weight; + + double guard_zero_residual; + + int i = 0; + int ierr = 0; + + /*----------------------------------------------------------------------- + * With relative change convergence test on, it is possible to attempt + * another iteration with a zero residual. This causes the parameter + * alpha to go NaN. The guard_zero_residual parameter is to circumvent + * this. Perhaps it should be set to something non-zero (but small).
+ *-----------------------------------------------------------------------*/ + + guard_zero_residual = 0.0; + + /*----------------------------------------------------------------------- + * Start pcg solve + *-----------------------------------------------------------------------*/ + + /* compute eps */ + if (two_norm) + { + /* bi_prod = <b,b> */ + bi_prod = (*(pcg_functions->InnerProd))(b, b); + } + else + { + /* bi_prod = <C*b,b> */ + (*(pcg_functions->ClearVector))(p); + precond(precond_data, A, b, p); + bi_prod = (*(pcg_functions->InnerProd))(p, b); + }; + + eps = tol*tol; + if ( bi_prod > 0.0 ) { + if ( stop_crit && !rel_change ) { /* absolute tolerance */ + eps = eps / bi_prod; + } + } + else /* bi_prod==0.0: the rhs vector b is zero */ + { + /* Set x equal to zero and return */ + (*(pcg_functions->CopyVector))(b, x); + if (logging > 0) + { + norms[0] = 0.0; + rel_norms[i] = 0.0; + } + ierr = 0; + return ierr; + }; + + /* r = b - Ax */ + (*(pcg_functions->CopyVector))(b, r); + (*(pcg_functions->Matvec))(matvec_data, -1.0, A, x, 1.0, r); + + /* Set initial residual norm */ + if (logging > 0 || cf_tol > 0.0) + { + i_prod_0 = (*(pcg_functions->InnerProd))(r,r); + if (logging > 0) norms[0] = sqrt(i_prod_0); + } + + /* p = C*r */ + (*(pcg_functions->ClearVector))(p); + precond(precond_data, A, r, p); + + /* gamma = <r,p> */ + gamma = (*(pcg_functions->InnerProd))(r,p); + + while ((i+1) <= max_iter) + { + i++; + + /* s = A*p */ + (*(pcg_functions->Matvec))(matvec_data, 1.0, A, p, 0.0, s); + + /* alpha = gamma / <s,p> */ + alpha = gamma / (*(pcg_functions->InnerProd))(s, p); + + gamma_old = gamma; + + /* x = x + alpha*p */ + (*(pcg_functions->Axpy))(alpha, p, x); + + /* r = r - alpha*s */ + (*(pcg_functions->Axpy))(-alpha, s, r); + + /* s = C*r */ + (*(pcg_functions->ClearVector))(s); + precond(precond_data, A, r, s); + + /* gamma = <r,s> */ + gamma = (*(pcg_functions->InnerProd))(r, s); + + /* set i_prod for convergence test */ + if (two_norm) + i_prod = (*(pcg_functions->InnerProd))(r,r); + else
+ i_prod = gamma; + + #if 0 + if (two_norm) + printf("Iter (%d): ||r||_2 = %e, ||r||_2/||b||_2 = %e\n", + i, sqrt(i_prod), (bi_prod ? sqrt(i_prod/bi_prod) : 0)); + else + printf("Iter (%d): ||r||_C = %e, ||r||_C/||b||_C = %e\n", + i, sqrt(i_prod), (bi_prod ? sqrt(i_prod/bi_prod) : 0)); + #endif + + /* log norm info */ + if (logging > 0) + { + norms[i] = sqrt(i_prod); + rel_norms[i] = bi_prod ? sqrt(i_prod/bi_prod) : 0; + } + + /* check for convergence */ + if (i_prod / bi_prod < eps) + { + if (rel_change && i_prod > guard_zero_residual) + { + pi_prod = (*(pcg_functions->InnerProd))(p,p); + xi_prod = (*(pcg_functions->InnerProd))(x,x); + if ((alpha*alpha*pi_prod/xi_prod) < eps) + break; + } + else + { + break; + } + } + + /*-------------------------------------------------------------------- + * Optional test to see if adequate progress is being made. + * The average convergence factor is recorded and compared + * against the tolerance 'cf_tol'. The weighting factor is + * intended to pay more attention to the test when an accurate + * estimate for average convergence factor is available. + *--------------------------------------------------------------------*/ + + if (cf_tol > 0.0) + { + cf_ave_0 = cf_ave_1; + cf_ave_1 = pow( i_prod / i_prod_0, 1.0/(2.0*i)); + + weight = fabs(cf_ave_1 - cf_ave_0); + weight = weight / max(cf_ave_1, cf_ave_0); + weight = 1.0 - weight; + #if 0 + printf("I = %d: cf_new = %e, cf_old = %e, weight = %e\n", + i, cf_ave_1, cf_ave_0, weight ); + #endif + if (weight * cf_ave_1 > cf_tol) break; + } + + /* beta = gamma / gamma_old */ + beta = gamma / gamma_old; + + /* p = s + beta p */ + (*(pcg_functions->ScaleVector))(beta, p); + (*(pcg_functions->Axpy))(1.0, s, p); + } + + #if 0 + if (two_norm) + printf("Iterations = %d: ||r||_2 = %e, ||r||_2/||b||_2 = %e\n", + i, sqrt(i_prod), (bi_prod ? sqrt(i_prod/bi_prod) : 0)); + else + printf("Iterations = %d: ||r||_C = %e, ||r||_C/||b||_C = %e\n", + i, sqrt(i_prod), (bi_prod ? 
sqrt(i_prod/bi_prod) : 0)); + #endif + + /*----------------------------------------------------------------------- + * Print log + *-----------------------------------------------------------------------*/ + + #if 0 + if (logging > 0) + { + if (two_norm) + { + printf("Iters ||r||_2 ||r||_2/||b||_2\n"); + printf("----- ------------ ------------ \n"); + } + else + { + printf("Iters ||r||_C ||r||_C/||b||_C\n"); + printf("----- ------------ ------------ \n"); + } + for (j = 1; j <= i; j++) + { + printf("% 5d %e %e\n", j, norms[j], rel_norms[j]); + } + } + #endif + + (pcg_data -> num_iterations) = i; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetTol + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSetTol( void *pcg_vdata, + double tol ) + { + hypre_PCGData *pcg_data = pcg_vdata; + int ierr = 0; + + (pcg_data -> tol) = tol; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetConvergenceFactorTol + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSetConvergenceFactorTol( void *pcg_vdata, + double cf_tol ) + { + hypre_PCGData *pcg_data = pcg_vdata; + int ierr = 0; + + (pcg_data -> cf_tol) = cf_tol; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetMaxIter + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSetMaxIter( void *pcg_vdata, + int max_iter ) + { + hypre_PCGData *pcg_data = pcg_vdata; + int ierr = 0; + + (pcg_data -> max_iter) = max_iter; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetTwoNorm + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSetTwoNorm( void *pcg_vdata, + int two_norm ) + { + hypre_PCGData
*pcg_data = pcg_vdata; + int ierr = 0; + + (pcg_data -> two_norm) = two_norm; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetRelChange + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSetRelChange( void *pcg_vdata, + int rel_change ) + { + hypre_PCGData *pcg_data = pcg_vdata; + int ierr = 0; + + (pcg_data -> rel_change) = rel_change; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetStopCrit + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSetStopCrit( void *pcg_vdata, + int stop_crit ) + { + hypre_PCGData *pcg_data = pcg_vdata; + int ierr = 0; + + (pcg_data -> stop_crit) = stop_crit; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGGetPrecond + *--------------------------------------------------------------------------*/ + + int + hypre_PCGGetPrecond( void *pcg_vdata, + HYPRE_Solver *precond_data_ptr ) + { + hypre_PCGData *pcg_data = pcg_vdata; + int ierr = 0; + + *precond_data_ptr = (HYPRE_Solver)(pcg_data -> precond_data); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetPrecond + *--------------------------------------------------------------------------*/ + + int + hypre_PCGSetPrecond( void *pcg_vdata, + int (*precond)(), + int (*precond_setup)(), + void *precond_data ) + { + hypre_PCGData *pcg_data = pcg_vdata; + hypre_PCGFunctions *pcg_functions = pcg_data->functions; + int ierr = 0; + + (pcg_functions -> precond) = precond; + (pcg_functions -> precond_setup) = precond_setup; + (pcg_data -> precond_data) = precond_data; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGSetLogging + 
*--------------------------------------------------------------------------*/ + + int + hypre_PCGSetLogging( void *pcg_vdata, + int logging) + { + hypre_PCGData *pcg_data = pcg_vdata; + int ierr = 0; + + (pcg_data -> logging) = logging; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGGetNumIterations + *--------------------------------------------------------------------------*/ + + int + hypre_PCGGetNumIterations( void *pcg_vdata, + int *num_iterations ) + { + hypre_PCGData *pcg_data = pcg_vdata; + int ierr = 0; + + *num_iterations = (pcg_data -> num_iterations); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGPrintLogging + *--------------------------------------------------------------------------*/ + + int + hypre_PCGPrintLogging( void *pcg_vdata, + int myid) + { + hypre_PCGData *pcg_data = pcg_vdata; + + int num_iterations = (pcg_data -> num_iterations); + int logging = (pcg_data -> logging); + double *norms = (pcg_data -> norms); + double *rel_norms = (pcg_data -> rel_norms); + + int i; + int ierr = 0; + + if (myid == 0) + { + if (logging > 0) + { + for (i = 0; i < num_iterations; i++) + { + printf("Residual norm[%d] = %e ", i, norms[i]); + printf("Relative residual norm[%d] = %e\n", i, rel_norms[i]); + } + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PCGGetFinalRelativeResidualNorm + *--------------------------------------------------------------------------*/ + + int + hypre_PCGGetFinalRelativeResidualNorm( void *pcg_vdata, + double *relative_residual_norm ) + { + hypre_PCGData *pcg_data = pcg_vdata; + + int num_iterations = (pcg_data -> num_iterations); + int logging = (pcg_data -> logging); + double *rel_norms = (pcg_data -> rel_norms); + + int ierr = -1; + + if (logging > 0) + { + *relative_residual_norm = rel_norms[num_iterations]; + ierr = 0; + } + 
+ return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/pcg_struct.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/pcg_struct.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/pcg_struct.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,245 ---- + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Struct matrix-vector implementation of PCG interface routines. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovCAlloc + *--------------------------------------------------------------------------*/ + + char * + hypre_StructKrylovCAlloc( int count, + int elt_size ) + { + return( hypre_CAlloc( count, elt_size ) ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovFree + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovFree( char *ptr ) + { + int ierr = 0; + + hypre_Free( ptr ); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovCreateVector + *--------------------------------------------------------------------------*/ + + void * + hypre_StructKrylovCreateVector( void *vvector ) + { + hypre_StructVector *vector = vvector; + hypre_StructVector *new_vector; + + new_vector = hypre_StructVectorCreate( 
hypre_StructVectorComm(vector), + hypre_StructVectorGrid(vector) ); + hypre_StructVectorInitialize(new_vector); + hypre_StructVectorAssemble(new_vector); + + return ( (void *) new_vector ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovCreateVectorArray + *--------------------------------------------------------------------------*/ + + void * + hypre_StructKrylovCreateVectorArray(int n, void *vvector ) + { + hypre_StructVector *vector = vvector; + hypre_StructVector **new_vector; + int i; + + new_vector = hypre_CTAlloc(hypre_StructVector*,n); + for (i=0; i < n; i++) + { + HYPRE_StructVectorCreate(hypre_StructVectorComm(vector), + hypre_StructVectorGrid(vector), + (HYPRE_StructVector *) &new_vector[i] ); + HYPRE_StructVectorInitialize((HYPRE_StructVector) new_vector[i]); + HYPRE_StructVectorAssemble((HYPRE_StructVector) new_vector[i]); + } + + return ( (void *) new_vector ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovDestroyVector + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovDestroyVector( void *vvector ) + { + hypre_StructVector *vector = vvector; + + return( hypre_StructVectorDestroy( vector ) ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovMatvecCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_StructKrylovMatvecCreate( void *A, + void *x ) + { + void *matvec_data; + + matvec_data = hypre_StructMatvecCreate(); + hypre_StructMatvecSetup(matvec_data, A, x); + + return ( matvec_data ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovMatvec + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovMatvec( void *matvec_data, + double alpha, + void *A, + void 
*x, + double beta, + void *y ) + { + return ( hypre_StructMatvecCompute( matvec_data, + alpha, + (hypre_StructMatrix *) A, + (hypre_StructVector *) x, + beta, + (hypre_StructVector *) y ) ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovMatvecDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovMatvecDestroy( void *matvec_data ) + { + return ( hypre_StructMatvecDestroy( matvec_data ) ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovInnerProd + *--------------------------------------------------------------------------*/ + + double + hypre_StructKrylovInnerProd( void *x, + void *y ) + { + return ( hypre_StructInnerProd( (hypre_StructVector *) x, + (hypre_StructVector *) y ) ); + } + + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovCopyVector + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovCopyVector( void *x, + void *y ) + { + return ( hypre_StructCopy( (hypre_StructVector *) x, + (hypre_StructVector *) y ) ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovClearVector + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovClearVector( void *x ) + { + return ( hypre_StructVectorSetConstantValues( (hypre_StructVector *) x, + 0.0 ) ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovScaleVector + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovScaleVector( double alpha, + void *x ) + { + return ( hypre_StructScale( alpha, (hypre_StructVector *) x ) ); + } + + /*-------------------------------------------------------------------------- + * 
hypre_StructKrylovAxpy + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovAxpy( double alpha, + void *x, + void *y ) + { + return ( hypre_StructAxpy( alpha, (hypre_StructVector *) x, + (hypre_StructVector *) y ) ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovIdentitySetup (for a default preconditioner) + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovIdentitySetup( void *vdata, + void *A, + void *b, + void *x ) + + { + return 0; + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovIdentity (for a default preconditioner) + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovIdentity( void *vdata, + void *A, + void *b, + void *x ) + + { + return( hypre_StructKrylovCopyVector( b, x ) ); + } + + /*-------------------------------------------------------------------------- + * hypre_StructKrylovCommInfo + *--------------------------------------------------------------------------*/ + + int + hypre_StructKrylovCommInfo( void *A, + int *my_id, + int *num_procs ) + { + MPI_Comm comm = hypre_StructMatrixComm((hypre_StructMatrix *) A); + MPI_Comm_size(comm,num_procs); + MPI_Comm_rank(comm,my_id); + return 0; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/point_relax.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/point_relax.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/point_relax.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,779 ---- + /*BHEADER********************************************************************** + * (c) 1999 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and 
disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxData data structure + *--------------------------------------------------------------------------*/ + + typedef struct + { + MPI_Comm comm; + + double tol; /* not yet used */ + int max_iter; + int rel_change; /* not yet used */ + int zero_guess; + double weight; + + int num_pointsets; + int *pointset_sizes; + int *pointset_ranks; + hypre_Index *pointset_strides; + hypre_Index **pointset_indices; + + hypre_StructMatrix *A; + hypre_StructVector *b; + hypre_StructVector *x; + + hypre_StructVector *t; + + int diag_rank; + + hypre_ComputePkg **compute_pkgs; + + /* log info (always logged) */ + int num_iterations; + int time_index; + int flops; + + } hypre_PointRelaxData; + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_PointRelaxCreate( MPI_Comm comm ) + { + hypre_PointRelaxData *relax_data; + + hypre_Index stride; + hypre_Index indices[1]; + + relax_data = hypre_CTAlloc(hypre_PointRelaxData, 1); + + (relax_data -> comm) = comm; + (relax_data -> time_index) = hypre_InitializeTiming("PointRelax"); + + /* set defaults */ + (relax_data -> tol) = 1.0e-06; + (relax_data -> max_iter) = 1000; + (relax_data -> rel_change) = 0; + (relax_data -> zero_guess) = 0; + (relax_data -> weight) = 1.0; + (relax_data -> num_pointsets) = 0; + (relax_data -> pointset_sizes) = NULL; + (relax_data -> pointset_ranks) = NULL; + (relax_data -> pointset_strides) = NULL; + (relax_data -> pointset_indices) = NULL; + (relax_data -> t) = 
NULL; + + hypre_SetIndex(stride, 1, 1, 1); + hypre_SetIndex(indices[0], 0, 0, 0); + hypre_PointRelaxSetNumPointsets((void *) relax_data, 1); + hypre_PointRelaxSetPointset((void *) relax_data, 0, 1, stride, indices); + + return (void *) relax_data; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxDestroy( void *relax_vdata ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int i; + int ierr = 0; + + if (relax_data) + { + for (i = 0; i < (relax_data -> num_pointsets); i++) + { + hypre_TFree(relax_data -> pointset_indices[i]); + hypre_ComputePkgDestroy(relax_data -> compute_pkgs[i]); + } + hypre_TFree(relax_data -> pointset_sizes); + hypre_TFree(relax_data -> pointset_ranks); + hypre_TFree(relax_data -> pointset_strides); + hypre_TFree(relax_data -> pointset_indices); + hypre_StructMatrixDestroy(relax_data -> A); + hypre_StructVectorDestroy(relax_data -> b); + hypre_StructVectorDestroy(relax_data -> x); + hypre_TFree(relax_data -> compute_pkgs); + hypre_StructVectorDestroy(relax_data -> t); + + hypre_FinalizeTiming(relax_data -> time_index); + hypre_TFree(relax_data); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxSetup + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxSetup( void *relax_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + + int num_pointsets = (relax_data -> num_pointsets); + int *pointset_sizes = (relax_data -> pointset_sizes); + hypre_Index *pointset_strides = (relax_data -> pointset_strides); + hypre_Index **pointset_indices = (relax_data -> pointset_indices); + hypre_StructVector *t; + int diag_rank; + hypre_ComputePkg **compute_pkgs; + + hypre_Index 
unit_stride; + hypre_Index diag_index; + hypre_IndexRef stride; + hypre_IndexRef index; + + hypre_StructGrid *grid; + hypre_StructStencil *stencil; + + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + + hypre_BoxArrayArray *orig_indt_boxes; + hypre_BoxArrayArray *orig_dept_boxes; + hypre_BoxArrayArray *box_aa; + hypre_BoxArray *box_a; + hypre_Box *box; + int box_aa_size; + int box_a_size; + hypre_BoxArrayArray *new_box_aa; + hypre_BoxArray *new_box_a; + hypre_Box *new_box; + + double scale; + int frac; + + int i, j, k, p, m, compute_i; + int ierr = 0; + + /*---------------------------------------------------------- + * Set up the temp vector + *----------------------------------------------------------*/ + + if ((relax_data -> t) == NULL) + { + t = hypre_StructVectorCreate(hypre_StructVectorComm(b), + hypre_StructVectorGrid(b)); + hypre_StructVectorSetNumGhost(t, hypre_StructVectorNumGhost(b)); + hypre_StructVectorInitialize(t); + hypre_StructVectorAssemble(t); + (relax_data -> t) = t; + } + + /*---------------------------------------------------------- + * Find the matrix diagonal + *----------------------------------------------------------*/ + + grid = hypre_StructMatrixGrid(A); + stencil = hypre_StructMatrixStencil(A); + + hypre_SetIndex(diag_index, 0, 0, 0); + diag_rank = hypre_StructStencilElementRank(stencil, diag_index); + + /*---------------------------------------------------------- + * Set up the compute packages + *----------------------------------------------------------*/ + + hypre_SetIndex(unit_stride, 1, 1, 1); + + compute_pkgs = hypre_CTAlloc(hypre_ComputePkg *, num_pointsets); + + for (p = 0; p < num_pointsets; p++) + { + hypre_CreateComputeInfo(grid, stencil, + &send_boxes, &recv_boxes, + &send_processes, &recv_processes, + &orig_indt_boxes, &orig_dept_boxes); + + stride = pointset_strides[p]; + + for 
(compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + box_aa = orig_indt_boxes; + break; + + case 1: + box_aa = orig_dept_boxes; + break; + } + box_aa_size = hypre_BoxArrayArraySize(box_aa); + new_box_aa = hypre_BoxArrayArrayCreate(box_aa_size); + + for (i = 0; i < box_aa_size; i++) + { + box_a = hypre_BoxArrayArrayBoxArray(box_aa, i); + box_a_size = hypre_BoxArraySize(box_a); + new_box_a = hypre_BoxArrayArrayBoxArray(new_box_aa, i); + hypre_BoxArraySetSize(new_box_a, box_a_size * pointset_sizes[p]); + + k = 0; + for (m = 0; m < pointset_sizes[p]; m++) + { + index = pointset_indices[p][m]; + + for (j = 0; j < box_a_size; j++) + { + box = hypre_BoxArrayBox(box_a, j); + new_box = hypre_BoxArrayBox(new_box_a, k); + + hypre_CopyBox(box, new_box); + hypre_ProjectBox(new_box, index, stride); + + k++; + } + } + } + + switch(compute_i) + { + case 0: + indt_boxes = new_box_aa; + break; + + case 1: + dept_boxes = new_box_aa; + break; + } + } + + hypre_ComputePkgCreate(send_boxes, recv_boxes, + unit_stride, unit_stride, + send_processes, recv_processes, + indt_boxes, dept_boxes, + stride, grid, + hypre_StructVectorDataSpace(x), 1, + &compute_pkgs[p]); + + hypre_BoxArrayArrayDestroy(orig_indt_boxes); + hypre_BoxArrayArrayDestroy(orig_dept_boxes); + } + + /*---------------------------------------------------------- + * Set up the relax data structure + *----------------------------------------------------------*/ + + (relax_data -> A) = hypre_StructMatrixRef(A); + (relax_data -> x) = hypre_StructVectorRef(x); + (relax_data -> b) = hypre_StructVectorRef(b); + (relax_data -> diag_rank) = diag_rank; + (relax_data -> compute_pkgs) = compute_pkgs; + + /*----------------------------------------------------- + * Compute flops + *-----------------------------------------------------*/ + + scale = 0.0; + for (p = 0; p < num_pointsets; p++) + { + stride = pointset_strides[p]; + frac = hypre_IndexX(stride); + frac *= hypre_IndexY(stride); + frac *= 
hypre_IndexZ(stride); + scale += (pointset_sizes[p] / frac); + } + (relax_data -> flops) = scale * (hypre_StructMatrixGlobalSize(A) + + hypre_StructVectorGlobalSize(x)); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelax + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelax( void *relax_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + + int max_iter = (relax_data -> max_iter); + int zero_guess = (relax_data -> zero_guess); + double weight = (relax_data -> weight); + int num_pointsets = (relax_data -> num_pointsets); + int *pointset_ranks = (relax_data -> pointset_ranks); + hypre_Index *pointset_strides = (relax_data -> pointset_strides); + hypre_StructVector *t = (relax_data -> t); + int diag_rank = (relax_data -> diag_rank); + hypre_ComputePkg **compute_pkgs = (relax_data -> compute_pkgs); + + hypre_ComputePkg *compute_pkg; + hypre_CommHandle *comm_handle; + + hypre_BoxArrayArray *compute_box_aa; + hypre_BoxArray *compute_box_a; + hypre_Box *compute_box; + + hypre_Box *A_data_box; + hypre_Box *b_data_box; + hypre_Box *x_data_box; + hypre_Box *t_data_box; + + int Ai; + int bi; + int xi; + int ti; + + double *Ap; + double *bp; + double *xp; + double *tp; + + hypre_IndexRef stride; + hypre_IndexRef start; + hypre_Index loop_size; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + int stencil_size; + + int iter, p, compute_i, i, j, si; + int loopi, loopj, loopk; + int pointset; + + int ierr = 0; + + /*---------------------------------------------------------- + * Initialize some things and deal with special cases + *----------------------------------------------------------*/ + + hypre_BeginTiming(relax_data -> time_index); + + hypre_StructMatrixDestroy(relax_data -> A); + hypre_StructVectorDestroy(relax_data -> b); + 
hypre_StructVectorDestroy(relax_data -> x); + (relax_data -> A) = hypre_StructMatrixRef(A); + (relax_data -> x) = hypre_StructVectorRef(x); + (relax_data -> b) = hypre_StructVectorRef(b); + + (relax_data -> num_iterations) = 0; + + /* if max_iter is zero, return */ + if (max_iter == 0) + { + /* if using a zero initial guess, return zero */ + if (zero_guess) + { + hypre_StructVectorSetConstantValues(x, 0.0); + } + + hypre_EndTiming(relax_data -> time_index); + return ierr; + } + + stencil = hypre_StructMatrixStencil(A); + stencil_shape = hypre_StructStencilShape(stencil); + stencil_size = hypre_StructStencilSize(stencil); + + /*---------------------------------------------------------- + * Do zero_guess iteration + *----------------------------------------------------------*/ + + p = 0; + iter = 0; + + if (zero_guess) + { + pointset = pointset_ranks[p]; + compute_pkg = compute_pkgs[pointset]; + stride = pointset_strides[pointset]; + + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + compute_box_aa = hypre_ComputePkgIndtBoxes(compute_pkg); + } + break; + + case 1: + { + compute_box_aa = hypre_ComputePkgDeptBoxes(compute_pkg); + } + break; + } + + hypre_ForBoxArrayI(i, compute_box_aa) + { + compute_box_a = hypre_BoxArrayArrayBoxArray(compute_box_aa, i); + + A_data_box = + hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), i); + b_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(b), i); + x_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + + Ap = hypre_StructMatrixBoxData(A, i, diag_rank); + bp = hypre_StructVectorBoxData(b, i); + xp = hypre_StructVectorBoxData(x, i); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + start = hypre_BoxIMin(compute_box); + hypre_BoxGetStrideSize(compute_box, stride, loop_size); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + b_data_box, start, stride, bi, + x_data_box, start, stride, xi); + #define 
HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,Ai,bi,xi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, bi, xi) + { + xp[xi] = bp[bi] / Ap[Ai]; + } + hypre_BoxLoop3End(Ai, bi, xi); + } + } + } + + if (weight != 1.0) + { + hypre_StructScale(weight, x); + } + + p = (p + 1) % num_pointsets; + iter = iter + (p == 0); + } + + /*---------------------------------------------------------- + * Do regular iterations + *----------------------------------------------------------*/ + + while (iter < max_iter) + { + pointset = pointset_ranks[p]; + compute_pkg = compute_pkgs[pointset]; + stride = pointset_strides[pointset]; + + hypre_StructCopy(x, t); + + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + xp = hypre_StructVectorData(x); + hypre_InitializeIndtComputations(compute_pkg, xp, &comm_handle); + compute_box_aa = hypre_ComputePkgIndtBoxes(compute_pkg); + } + break; + + case 1: + { + hypre_FinalizeIndtComputations(comm_handle); + compute_box_aa = hypre_ComputePkgDeptBoxes(compute_pkg); + } + break; + } + + hypre_ForBoxArrayI(i, compute_box_aa) + { + compute_box_a = hypre_BoxArrayArrayBoxArray(compute_box_aa, i); + + A_data_box = + hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), i); + b_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(b), i); + x_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + t_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(t), i); + + bp = hypre_StructVectorBoxData(b, i); + tp = hypre_StructVectorBoxData(t, i); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + start = hypre_BoxIMin(compute_box); + hypre_BoxGetStrideSize(compute_box, stride, loop_size); + + hypre_BoxLoop2Begin(loop_size, + b_data_box, start, stride, bi, + t_data_box, start, stride, ti); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,bi,ti + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, bi, ti) + { + 
tp[ti] = bp[bi]; + } + hypre_BoxLoop2End(bi, ti); + + for (si = 0; si < stencil_size; si++) + { + if (si != diag_rank) + { + Ap = hypre_StructMatrixBoxData(A, i, si); + xp = hypre_StructVectorBoxData(x, i) + + hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si]); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + t_data_box, start, stride, ti); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,Ai,xi,ti + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, ti) + { + tp[ti] -= Ap[Ai] * xp[xi]; + } + hypre_BoxLoop3End(Ai, xi, ti); + } + } + + Ap = hypre_StructMatrixBoxData(A, i, diag_rank); + + hypre_BoxLoop2Begin(loop_size, + A_data_box, start, stride, Ai, + t_data_box, start, stride, ti); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,Ai,ti + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, Ai, ti) + { + tp[ti] /= Ap[Ai]; + } + hypre_BoxLoop2End(Ai, ti); + } + } + } + + if (weight != 1.0) + { + hypre_StructScale((1.0 - weight), x); + hypre_StructAxpy(weight, t, x); + } + else + { + hypre_StructCopy(t, x); + } + + p = (p + 1) % num_pointsets; + iter = iter + (p == 0); + } + + (relax_data -> num_iterations) = iter; + + /*----------------------------------------------------------------------- + * Return + *-----------------------------------------------------------------------*/ + + hypre_IncFLOPCount(relax_data -> flops); + hypre_EndTiming(relax_data -> time_index); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxSetTol + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxSetTol( void *relax_vdata, + double tol ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> tol) = tol; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * 
hypre_PointRelaxSetMaxIter + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxSetMaxIter( void *relax_vdata, + int max_iter ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> max_iter) = max_iter; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxSetZeroGuess + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxSetZeroGuess( void *relax_vdata, + int zero_guess ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> zero_guess) = zero_guess; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxSetWeight + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxSetWeight( void *relax_vdata, + double weight ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> weight) = weight; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxSetNumPointsets + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxSetNumPointsets( void *relax_vdata, + int num_pointsets ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int i; + int ierr = 0; + + /* free up old pointset memory */ + for (i = 0; i < (relax_data -> num_pointsets); i++) + { + hypre_TFree(relax_data -> pointset_indices[i]); + } + hypre_TFree(relax_data -> pointset_sizes); + hypre_TFree(relax_data -> pointset_ranks); + hypre_TFree(relax_data -> pointset_strides); + hypre_TFree(relax_data -> pointset_indices); + + /* alloc new pointset memory */ + (relax_data -> num_pointsets) = num_pointsets; + (relax_data -> pointset_sizes) = hypre_TAlloc(int, num_pointsets); + (relax_data -> pointset_ranks) = 
hypre_TAlloc(int, num_pointsets); + (relax_data -> pointset_strides) = hypre_TAlloc(hypre_Index, num_pointsets); + (relax_data -> pointset_indices) = hypre_TAlloc(hypre_Index *, + num_pointsets); + for (i = 0; i < num_pointsets; i++) + { + (relax_data -> pointset_sizes[i]) = 0; + (relax_data -> pointset_ranks[i]) = i; + (relax_data -> pointset_indices[i]) = NULL; + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxSetPointset + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxSetPointset( void *relax_vdata, + int pointset, + int pointset_size, + hypre_Index pointset_stride, + hypre_Index *pointset_indices ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int i; + int ierr = 0; + + /* free up old pointset memory */ + hypre_TFree(relax_data -> pointset_indices[pointset]); + + /* alloc new pointset memory */ + (relax_data -> pointset_indices[pointset]) = + hypre_TAlloc(hypre_Index, pointset_size); + + (relax_data -> pointset_sizes[pointset]) = pointset_size; + hypre_CopyIndex(pointset_stride, + (relax_data -> pointset_strides[pointset])); + for (i = 0; i < pointset_size; i++) + { + hypre_CopyIndex(pointset_indices[i], + (relax_data -> pointset_indices[pointset][i])); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxSetPointsetRank + *--------------------------------------------------------------------------*/ + + int + hypre_PointRelaxSetPointsetRank( void *relax_vdata, + int pointset, + int pointset_rank ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> pointset_ranks[pointset]) = pointset_rank; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PointRelaxSetTempVec + *--------------------------------------------------------------------------*/ + + int + 
hypre_PointRelaxSetTempVec( void *relax_vdata, + hypre_StructVector *t ) + { + hypre_PointRelaxData *relax_data = relax_vdata; + int ierr = 0; + + hypre_StructVectorDestroy(relax_data -> t); + (relax_data -> t) = hypre_StructVectorRef(t); + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/project.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/project.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/project.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,118 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Projection routines. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_ProjectBox: + * Projects a box onto a strided index space that contains the + * index `index' and has stride `stride'. + * + * Note: An "empty" projection is represented by a box with volume 0. 
+ *--------------------------------------------------------------------------*/ + + int + hypre_ProjectBox( hypre_Box *box, + hypre_Index index, + hypre_Index stride ) + { + int i, s, d, hl, hu, kl, ku; + int ierr = 0; + + /*------------------------------------------------------ + * project in all 3 dimensions + *------------------------------------------------------*/ + + for (d = 0; d < 3; d++) + { + + i = hypre_IndexD(index, d); + s = hypre_IndexD(stride, d); + + hl = hypre_BoxIMinD(box, d) - i; + hu = hypre_BoxIMaxD(box, d) - i; + + if ( hl <= 0 ) + kl = (int) (hl / s); + else + kl = (int) ((hl + (s-1)) / s); + + if ( hu >= 0 ) + ku = (int) (hu / s); + else + ku = (int) ((hu - (s-1)) / s); + + hypre_BoxIMinD(box, d) = i + kl * s; + hypre_BoxIMaxD(box, d) = i + ku * s; + + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_ProjectBoxArray: + * + * Note: The dimensions of the modified box array are not changed. + * So, it is possible to have boxes with volume 0. + *--------------------------------------------------------------------------*/ + + int + hypre_ProjectBoxArray( hypre_BoxArray *box_array, + hypre_Index index, + hypre_Index stride ) + { + hypre_Box *box; + int i; + int ierr = 0; + + hypre_ForBoxI(i, box_array) + { + box = hypre_BoxArrayBox(box_array, i); + hypre_ProjectBox(box, index, stride); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_ProjectBoxArrayArray: + * + * Note: The dimensions of the modified box array-array are not changed. + * So, it is possible to have boxes with volume 0. 
+ *--------------------------------------------------------------------------*/ + + int + hypre_ProjectBoxArrayArray( hypre_BoxArrayArray *box_array_array, + hypre_Index index, + hypre_Index stride ) + { + hypre_BoxArray *box_array; + hypre_Box *box; + int i, j; + int ierr = 0; + + hypre_ForBoxArrayI(i, box_array_array) + { + box_array = hypre_BoxArrayArrayBoxArray(box_array_array, i); + hypre_ForBoxI(j, box_array) + { + box = hypre_BoxArrayBox(box_array, j); + hypre_ProjectBox(box, index, stride); + } + } + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/random.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/random.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/random.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,49 ---- + /*BHEADER********************************************************************** + * (c) 1996 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Routines for generating random numbers. + * + *****************************************************************************/ + + + /*-------------------------------------------------------------------------- + * Static variables + *--------------------------------------------------------------------------*/ + + static int Seed = 13579; + + #define L 1664525 + #define M 1024 + + /*-------------------------------------------------------------------------- + * hypre_SeedRand: + * The seed must always be positive. 
+ * + * Note: the internal seed must be positive and odd, so it is set + * to (2*input_seed - 1); + *--------------------------------------------------------------------------*/ + + void hypre_SeedRand(seed) + int seed; + { + Seed = (2*seed - 1) % M; + } + + /*-------------------------------------------------------------------------- + * hypre_Rand + *--------------------------------------------------------------------------*/ + + double hypre_Rand() + { + Seed = (L * Seed) % M; + + return ( ((double) Seed) / ((double) M) ); + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/semi_interp.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/semi_interp.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/semi_interp.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,333 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_SemiInterpData data structure + *--------------------------------------------------------------------------*/ + + typedef struct + { + hypre_StructMatrix *P; + int P_stored_as_transpose; + hypre_ComputePkg *compute_pkg; + hypre_Index cindex; + hypre_Index findex; + hypre_Index stride; + + int time_index; + + } hypre_SemiInterpData; + + /*-------------------------------------------------------------------------- + * hypre_SemiInterpCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_SemiInterpCreate( ) + { + hypre_SemiInterpData *interp_data; + + interp_data = hypre_CTAlloc(hypre_SemiInterpData, 1); + (interp_data -> time_index) = hypre_InitializeTiming("SemiInterp"); + + return (void *) interp_data; + } + + /*-------------------------------------------------------------------------- + * hypre_SemiInterpSetup + *--------------------------------------------------------------------------*/ + + int + hypre_SemiInterpSetup( void *interp_vdata, + hypre_StructMatrix *P, + int P_stored_as_transpose, + hypre_StructVector *xc, + hypre_StructVector *e, + hypre_Index cindex, + hypre_Index findex, + hypre_Index stride ) + { + hypre_SemiInterpData *interp_data = interp_vdata; + + hypre_StructGrid *grid; + hypre_StructStencil *stencil; + + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + + hypre_ComputePkg *compute_pkg; + + int ierr = 0; + + /*---------------------------------------------------------- + * 
Set up the compute package + *----------------------------------------------------------*/ + + grid = hypre_StructVectorGrid(e); + stencil = hypre_StructMatrixStencil(P); + + hypre_CreateComputeInfo(grid, stencil, + &send_boxes, &recv_boxes, + &send_processes, &recv_processes, + &indt_boxes, &dept_boxes); + + hypre_ProjectBoxArrayArray(send_boxes, cindex, stride); + hypre_ProjectBoxArrayArray(recv_boxes, cindex, stride); + hypre_ProjectBoxArrayArray(indt_boxes, findex, stride); + hypre_ProjectBoxArrayArray(dept_boxes, findex, stride); + + hypre_ComputePkgCreate(send_boxes, recv_boxes, + stride, stride, + send_processes, recv_processes, + indt_boxes, dept_boxes, + stride, grid, + hypre_StructVectorDataSpace(e), 1, + &compute_pkg); + + /*---------------------------------------------------------- + * Set up the interp data structure + *----------------------------------------------------------*/ + + (interp_data -> P) = hypre_StructMatrixRef(P); + (interp_data -> P_stored_as_transpose) = P_stored_as_transpose; + (interp_data -> compute_pkg) = compute_pkg; + hypre_CopyIndex(cindex, (interp_data -> cindex)); + hypre_CopyIndex(findex, (interp_data -> findex)); + hypre_CopyIndex(stride, (interp_data -> stride)); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SemiInterp: + *--------------------------------------------------------------------------*/ + + int + hypre_SemiInterp( void *interp_vdata, + hypre_StructMatrix *P, + hypre_StructVector *xc, + hypre_StructVector *e ) + { + int ierr = 0; + + hypre_SemiInterpData *interp_data = interp_vdata; + + int P_stored_as_transpose; + hypre_ComputePkg *compute_pkg; + hypre_IndexRef cindex; + hypre_IndexRef findex; + hypre_IndexRef stride; + + hypre_StructGrid *fgrid; + int *fgrid_ids; + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + int *cgrid_ids; + + hypre_CommHandle *comm_handle; + + hypre_BoxArrayArray *compute_box_aa; + hypre_BoxArray *compute_box_a; 
+ hypre_Box *compute_box; + + hypre_Box *P_dbox; + hypre_Box *xc_dbox; + hypre_Box *e_dbox; + + int Pi; + int xci; + int ei; + + double *Pp0, *Pp1; + double *xcp; + double *ep, *ep0, *ep1; + + hypre_Index loop_size; + hypre_Index start; + hypre_Index startc; + hypre_Index stridec; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + + int compute_i, fi, ci, j; + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Initialize some things + *-----------------------------------------------------------------------*/ + + hypre_BeginTiming(interp_data -> time_index); + + P_stored_as_transpose = (interp_data -> P_stored_as_transpose); + compute_pkg = (interp_data -> compute_pkg); + cindex = (interp_data -> cindex); + findex = (interp_data -> findex); + stride = (interp_data -> stride); + + stencil = hypre_StructMatrixStencil(P); + stencil_shape = hypre_StructStencilShape(stencil); + + hypre_SetIndex(stridec, 1, 1, 1); + + /*----------------------------------------------------------------------- + * Compute e at coarse points (injection) + *-----------------------------------------------------------------------*/ + + fgrid = hypre_StructVectorGrid(e); + fgrid_ids = hypre_StructGridIDs(fgrid); + cgrid = hypre_StructVectorGrid(xc); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + + fi = 0; + hypre_ForBoxI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + compute_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + hypre_CopyIndex(hypre_BoxIMin(compute_box), startc); + hypre_StructMapCoarseToFine(startc, cindex, stride, start); + + e_dbox = hypre_BoxArrayBox(hypre_StructVectorDataSpace(e), fi); + xc_dbox = hypre_BoxArrayBox(hypre_StructVectorDataSpace(xc), ci); + + ep = hypre_StructVectorBoxData(e, fi); + xcp = hypre_StructVectorBoxData(xc, ci); + + hypre_BoxGetSize(compute_box, loop_size); + + hypre_BoxLoop2Begin(loop_size, + e_dbox, start, 
stride, ei, + xc_dbox, startc, stridec, xci); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,ei,xci + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, ei, xci) + { + ep[ei] = xcp[xci]; + } + hypre_BoxLoop2End(ei, xci); + } + + /*----------------------------------------------------------------------- + * Compute e at fine points + *-----------------------------------------------------------------------*/ + + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + ep = hypre_StructVectorData(e); + hypre_InitializeIndtComputations(compute_pkg, ep, &comm_handle); + compute_box_aa = hypre_ComputePkgIndtBoxes(compute_pkg); + } + break; + + case 1: + { + hypre_FinalizeIndtComputations(comm_handle); + compute_box_aa = hypre_ComputePkgDeptBoxes(compute_pkg); + } + break; + } + + hypre_ForBoxArrayI(fi, compute_box_aa) + { + compute_box_a = hypre_BoxArrayArrayBoxArray(compute_box_aa, fi); + + P_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(P), fi); + e_dbox = hypre_BoxArrayBox(hypre_StructVectorDataSpace(e), fi); + + if (P_stored_as_transpose) + { + Pp0 = hypre_StructMatrixBoxData(P, fi, 1); + Pp1 = hypre_StructMatrixBoxData(P, fi, 0) - + hypre_BoxOffsetDistance(P_dbox, stencil_shape[0]); + } + else + { + Pp0 = hypre_StructMatrixBoxData(P, fi, 0); + Pp1 = hypre_StructMatrixBoxData(P, fi, 1); + } + ep = hypre_StructVectorBoxData(e, fi); + ep0 = ep + hypre_BoxOffsetDistance(e_dbox, stencil_shape[0]); + ep1 = ep + hypre_BoxOffsetDistance(e_dbox, stencil_shape[1]); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + hypre_CopyIndex(hypre_BoxIMin(compute_box), start); + hypre_StructMapFineToCoarse(start, findex, stride, startc); + + hypre_BoxGetStrideSize(compute_box, stride, loop_size); + + hypre_BoxLoop2Begin(loop_size, + P_dbox, startc, stridec, Pi, + e_dbox, start, stride, ei); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,Pi,ei + #include 
"hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, Pi, ei) + { + ep[ei] = (Pp0[Pi] * ep0[ei] + + Pp1[Pi] * ep1[ei]); + } + hypre_BoxLoop2End(Pi, ei); + } + } + } + + /*----------------------------------------------------------------------- + * Return + *-----------------------------------------------------------------------*/ + + hypre_IncFLOPCount(3*hypre_StructVectorGlobalSize(xc)); + hypre_EndTiming(interp_data -> time_index); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SemiInterpDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_SemiInterpDestroy( void *interp_vdata ) + { + int ierr = 0; + + hypre_SemiInterpData *interp_data = interp_vdata; + + if (interp_data) + { + hypre_StructMatrixDestroy(interp_data -> P); + hypre_ComputePkgDestroy(interp_data -> compute_pkg); + hypre_FinalizeTiming(interp_data -> time_index); + hypre_TFree(interp_data); + } + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/semi_restrict.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/semi_restrict.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/semi_restrict.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,301 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_SemiRestrictData data structure + *--------------------------------------------------------------------------*/ + + typedef struct + { + hypre_StructMatrix *R; + int R_stored_as_transpose; + hypre_ComputePkg *compute_pkg; + hypre_Index cindex; + hypre_Index stride; + + int time_index; + + } hypre_SemiRestrictData; + + /*-------------------------------------------------------------------------- + * hypre_SemiRestrictCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_SemiRestrictCreate( ) + { + hypre_SemiRestrictData *restrict_data; + + restrict_data = hypre_CTAlloc(hypre_SemiRestrictData, 1); + + (restrict_data -> time_index) = hypre_InitializeTiming("SemiRestrict"); + + return (void *) restrict_data; + } + + /*-------------------------------------------------------------------------- + * hypre_SemiRestrictSetup + *--------------------------------------------------------------------------*/ + + int + hypre_SemiRestrictSetup( void *restrict_vdata, + hypre_StructMatrix *R, + int R_stored_as_transpose, + hypre_StructVector *r, + hypre_StructVector *rc, + hypre_Index cindex, + hypre_Index findex, + hypre_Index stride ) + { + hypre_SemiRestrictData *restrict_data = restrict_vdata; + + hypre_StructGrid *grid; + hypre_StructStencil *stencil; + + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + + hypre_ComputePkg *compute_pkg; + + int ierr = 0; + + 
/*---------------------------------------------------------- + * Set up the compute package + *----------------------------------------------------------*/ + + grid = hypre_StructVectorGrid(r); + stencil = hypre_StructMatrixStencil(R); + + hypre_CreateComputeInfo(grid, stencil, + &send_boxes, &recv_boxes, + &send_processes, &recv_processes, + &indt_boxes, &dept_boxes); + + hypre_ProjectBoxArrayArray(send_boxes, findex, stride); + hypre_ProjectBoxArrayArray(recv_boxes, findex, stride); + hypre_ProjectBoxArrayArray(indt_boxes, cindex, stride); + hypre_ProjectBoxArrayArray(dept_boxes, cindex, stride); + + hypre_ComputePkgCreate(send_boxes, recv_boxes, + stride, stride, + send_processes, recv_processes, + indt_boxes, dept_boxes, + stride, grid, + hypre_StructVectorDataSpace(r), 1, + &compute_pkg); + + /*---------------------------------------------------------- + * Set up the restrict data structure + *----------------------------------------------------------*/ + + (restrict_data -> R) = hypre_StructMatrixRef(R); + (restrict_data -> R_stored_as_transpose) = R_stored_as_transpose; + (restrict_data -> compute_pkg) = compute_pkg; + hypre_CopyIndex(cindex ,(restrict_data -> cindex)); + hypre_CopyIndex(stride ,(restrict_data -> stride)); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SemiRestrict: + *--------------------------------------------------------------------------*/ + + int + hypre_SemiRestrict( void *restrict_vdata, + hypre_StructMatrix *R, + hypre_StructVector *r, + hypre_StructVector *rc ) + { + int ierr = 0; + + hypre_SemiRestrictData *restrict_data = restrict_vdata; + + int R_stored_as_transpose; + hypre_ComputePkg *compute_pkg; + hypre_IndexRef cindex; + hypre_IndexRef stride; + + hypre_StructGrid *fgrid; + int *fgrid_ids; + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + int *cgrid_ids; + + hypre_CommHandle *comm_handle; + + hypre_BoxArrayArray *compute_box_aa; + hypre_BoxArray 
*compute_box_a; + hypre_Box *compute_box; + + hypre_Box *R_dbox; + hypre_Box *r_dbox; + hypre_Box *rc_dbox; + + int Ri; + int ri; + int rci; + + double *Rp0, *Rp1; + double *rp, *rp0, *rp1; + double *rcp; + + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index startc; + hypre_Index stridec; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + + int compute_i, fi, ci, j; + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Initialize some things. + *-----------------------------------------------------------------------*/ + + hypre_BeginTiming(restrict_data -> time_index); + + R_stored_as_transpose = (restrict_data -> R_stored_as_transpose); + compute_pkg = (restrict_data -> compute_pkg); + cindex = (restrict_data -> cindex); + stride = (restrict_data -> stride); + + stencil = hypre_StructMatrixStencil(R); + stencil_shape = hypre_StructStencilShape(stencil); + + hypre_SetIndex(stridec, 1, 1, 1); + + /*-------------------------------------------------------------------- + * Restrict the residual. 
+ *--------------------------------------------------------------------*/ + + fgrid = hypre_StructVectorGrid(r); + fgrid_ids = hypre_StructGridIDs(fgrid); + cgrid = hypre_StructVectorGrid(rc); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + rp = hypre_StructVectorData(r); + hypre_InitializeIndtComputations(compute_pkg, rp, &comm_handle); + compute_box_aa = hypre_ComputePkgIndtBoxes(compute_pkg); + } + break; + + case 1: + { + hypre_FinalizeIndtComputations(comm_handle); + compute_box_aa = hypre_ComputePkgDeptBoxes(compute_pkg); + } + break; + } + + fi = 0; + hypre_ForBoxArrayI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + compute_box_a = hypre_BoxArrayArrayBoxArray(compute_box_aa, fi); + + R_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(R), fi); + r_dbox = hypre_BoxArrayBox(hypre_StructVectorDataSpace(r), fi); + rc_dbox = hypre_BoxArrayBox(hypre_StructVectorDataSpace(rc), ci); + + if (R_stored_as_transpose) + { + Rp0 = hypre_StructMatrixBoxData(R, fi, 1) - + hypre_BoxOffsetDistance(R_dbox, stencil_shape[1]); + Rp1 = hypre_StructMatrixBoxData(R, fi, 0); + } + else + { + Rp0 = hypre_StructMatrixBoxData(R, fi, 0); + Rp1 = hypre_StructMatrixBoxData(R, fi, 1); + } + rp = hypre_StructVectorBoxData(r, fi); + rp0 = rp + hypre_BoxOffsetDistance(r_dbox, stencil_shape[0]); + rp1 = rp + hypre_BoxOffsetDistance(r_dbox, stencil_shape[1]); + rcp = hypre_StructVectorBoxData(rc, ci); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + start = hypre_BoxIMin(compute_box); + hypre_StructMapFineToCoarse(start, cindex, stride, startc); + + hypre_BoxGetStrideSize(compute_box, stride, loop_size); + hypre_BoxLoop3Begin(loop_size, + R_dbox, startc, stridec, Ri, + r_dbox, start, stride, ri, + rc_dbox, startc, stridec, rci); + #define HYPRE_BOX_SMP_PRIVATE 
loopk,loopi,loopj,Ri,ri,rci + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ri, ri, rci) + { + rcp[rci] = rp[ri] + (Rp0[Ri] * rp0[ri] + + Rp1[Ri] * rp1[ri]); + } + hypre_BoxLoop3End(Ri, ri, rci); + } + } + } + + /*----------------------------------------------------------------------- + * Return + *-----------------------------------------------------------------------*/ + + hypre_IncFLOPCount(4*hypre_StructVectorGlobalSize(rc)); + hypre_EndTiming(restrict_data -> time_index); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SemiRestrictDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_SemiRestrictDestroy( void *restrict_vdata ) + { + int ierr = 0; + + hypre_SemiRestrictData *restrict_data = restrict_vdata; + + if (restrict_data) + { + hypre_StructMatrixDestroy(restrict_data -> R); + hypre_ComputePkgDestroy(restrict_data -> compute_pkg); + hypre_FinalizeTiming(restrict_data -> time_index); + hypre_TFree(restrict_data); + } + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,423 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + #include "smg.h" + + /*-------------------------------------------------------------------------- + * hypre_SMGCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_SMGCreate( MPI_Comm comm ) + { + hypre_SMGData *smg_data; + + smg_data = hypre_CTAlloc(hypre_SMGData, 1); + + (smg_data -> comm) = comm; + (smg_data -> time_index) = hypre_InitializeTiming("SMG"); + + /* set defaults */ + (smg_data -> memory_use) = 0; + (smg_data -> tol) = 1.0e-06; + (smg_data -> max_iter) = 200; + (smg_data -> rel_change) = 0; + (smg_data -> zero_guess) = 0; + (smg_data -> max_levels) = 0; + (smg_data -> num_pre_relax) = 1; + (smg_data -> num_post_relax) = 1; + (smg_data -> cdir) = 2; + hypre_SetIndex((smg_data -> base_index), 0, 0, 0); + hypre_SetIndex((smg_data -> base_stride), 1, 1, 1); + (smg_data -> logging) = 0; + + /* initialize */ + (smg_data -> num_levels) = -1; + + return (void *) smg_data; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_SMGDestroy( void *smg_vdata ) + { + hypre_SMGData *smg_data = smg_vdata; + + int l; + int ierr = 0; + + if (smg_data) + { + if ((smg_data -> logging) > 0) + { + hypre_TFree(smg_data -> norms); + hypre_TFree(smg_data -> rel_norms); + } + + if ((smg_data -> num_levels) > -1) + { + for (l = 0; l < ((smg_data -> num_levels) - 1); l++) + { + hypre_SMGRelaxDestroy(smg_data -> relax_data_l[l]); + hypre_SMGResidualDestroy(smg_data -> residual_data_l[l]); + hypre_SemiRestrictDestroy(smg_data -> restrict_data_l[l]); + hypre_SemiInterpDestroy(smg_data 
-> interp_data_l[l]); + } + hypre_SMGRelaxDestroy(smg_data -> relax_data_l[l]); + if (l == 0) + { + hypre_SMGResidualDestroy(smg_data -> residual_data_l[l]); + } + hypre_TFree(smg_data -> relax_data_l); + hypre_TFree(smg_data -> residual_data_l); + hypre_TFree(smg_data -> restrict_data_l); + hypre_TFree(smg_data -> interp_data_l); + + hypre_StructVectorDestroy(smg_data -> tb_l[0]); + hypre_StructVectorDestroy(smg_data -> tx_l[0]); + hypre_StructGridDestroy(smg_data -> grid_l[0]); + hypre_StructMatrixDestroy(smg_data -> A_l[0]); + hypre_StructVectorDestroy(smg_data -> b_l[0]); + hypre_StructVectorDestroy(smg_data -> x_l[0]); + for (l = 0; l < ((smg_data -> num_levels) - 1); l++) + { + hypre_StructGridDestroy(smg_data -> grid_l[l+1]); + hypre_StructGridDestroy(smg_data -> PT_grid_l[l+1]); + hypre_StructMatrixDestroy(smg_data -> A_l[l+1]); + if (smg_data -> PT_l[l] == smg_data -> R_l[l]) + { + hypre_StructMatrixDestroy(smg_data -> PT_l[l]); + } + else + { + hypre_StructMatrixDestroy(smg_data -> PT_l[l]); + hypre_StructMatrixDestroy(smg_data -> R_l[l]); + } + hypre_StructVectorDestroy(smg_data -> b_l[l+1]); + hypre_StructVectorDestroy(smg_data -> x_l[l+1]); + hypre_StructVectorDestroy(smg_data -> tb_l[l+1]); + hypre_StructVectorDestroy(smg_data -> tx_l[l+1]); + } + hypre_SharedTFree(smg_data -> data); + hypre_TFree(smg_data -> grid_l); + hypre_TFree(smg_data -> PT_grid_l); + hypre_TFree(smg_data -> A_l); + hypre_TFree(smg_data -> PT_l); + hypre_TFree(smg_data -> R_l); + hypre_TFree(smg_data -> b_l); + hypre_TFree(smg_data -> x_l); + hypre_TFree(smg_data -> tb_l); + hypre_TFree(smg_data -> tx_l); + } + + hypre_FinalizeTiming(smg_data -> time_index); + hypre_TFree(smg_data); + } + + return(ierr); + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetMemoryUse + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetMemoryUse( void *smg_vdata, + int memory_use ) + { + hypre_SMGData 
*smg_data = smg_vdata; + int ierr = 0; + + (smg_data -> memory_use) = memory_use; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetTol + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetTol( void *smg_vdata, + double tol ) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + + (smg_data -> tol) = tol; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetMaxIter + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetMaxIter( void *smg_vdata, + int max_iter ) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + + (smg_data -> max_iter) = max_iter; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetRelChange + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetRelChange( void *smg_vdata, + int rel_change ) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + + (smg_data -> rel_change) = rel_change; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetZeroGuess + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetZeroGuess( void *smg_vdata, + int zero_guess ) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + + (smg_data -> zero_guess) = zero_guess; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetNumPreRelax + * Note that we require at least 1 pre-relax sweep. 
+ *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetNumPreRelax( void *smg_vdata, + int num_pre_relax ) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + + (smg_data -> num_pre_relax) = hypre_max(num_pre_relax,1); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetNumPostRelax + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetNumPostRelax( void *smg_vdata, + int num_post_relax ) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + + (smg_data -> num_post_relax) = num_post_relax; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetBase + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetBase( void *smg_vdata, + hypre_Index base_index, + hypre_Index base_stride ) + { + hypre_SMGData *smg_data = smg_vdata; + int d; + int ierr = 0; + + for (d = 0; d < 3; d++) + { + hypre_IndexD((smg_data -> base_index), d) = + hypre_IndexD(base_index, d); + hypre_IndexD((smg_data -> base_stride), d) = + hypre_IndexD(base_stride, d); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetLogging + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetLogging( void *smg_vdata, + int logging) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + + (smg_data -> logging) = logging; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGGetNumIterations + *--------------------------------------------------------------------------*/ + + int + hypre_SMGGetNumIterations( void *smg_vdata, + int *num_iterations ) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + + *num_iterations = (smg_data -> num_iterations); 
+ + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGPrintLogging + *--------------------------------------------------------------------------*/ + + int + hypre_SMGPrintLogging( void *smg_vdata, + int myid) + { + hypre_SMGData *smg_data = smg_vdata; + int ierr = 0; + int i; + int num_iterations = (smg_data -> num_iterations); + int logging = (smg_data -> logging); + double *norms = (smg_data -> norms); + double *rel_norms = (smg_data -> rel_norms); + + + if (myid == 0) + { + if (logging > 0) + { + for (i = 0; i < num_iterations; i++) + { + printf("Residual norm[%d] = %e ",i,norms[i]); + printf("Relative residual norm[%d] = %e\n",i,rel_norms[i]); + } + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGGetFinalRelativeResidualNorm + *--------------------------------------------------------------------------*/ + + int + hypre_SMGGetFinalRelativeResidualNorm( void *smg_vdata, + double *relative_residual_norm ) + { + hypre_SMGData *smg_data = smg_vdata; + + int max_iter = (smg_data -> max_iter); + int num_iterations = (smg_data -> num_iterations); + int logging = (smg_data -> logging); + double *rel_norms = (smg_data -> rel_norms); + + int ierr = -1; + + + if (logging > 0) + { + if (num_iterations == max_iter) + { + *relative_residual_norm = rel_norms[num_iterations-1]; + } + else + { + *relative_residual_norm = rel_norms[num_iterations]; + } + + ierr = 0; + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetStructVectorConstantValues + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetStructVectorConstantValues( hypre_StructVector *vector, + double values, + hypre_BoxArray *box_array, + hypre_Index stride ) + { + int ierr = 0; + + hypre_Box *v_data_box; + + int vi; + double *vp; + + hypre_Box *box; + hypre_Index 
loop_size; + hypre_IndexRef start; + + int loopi, loopj, loopk; + int i; + + /*----------------------------------------------------------------------- + * Set the vector coefficients + *-----------------------------------------------------------------------*/ + + hypre_ForBoxI(i, box_array) + { + box = hypre_BoxArrayBox(box_array, i); + start = hypre_BoxIMin(box); + + v_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(vector), i); + vp = hypre_StructVectorBoxData(vector, i); + + hypre_BoxGetStrideSize(box, stride, loop_size); + + hypre_BoxLoop1Begin(loop_size, + v_data_box, start, stride, vi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,vi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, vi) + { + vp[vi] = values; + } + hypre_BoxLoop1End(vi); + } + + return ierr; + } + + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,114 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for the SMG solver + * + *****************************************************************************/ + + #ifndef hypre_SMG_HEADER + #define hypre_SMG_HEADER + + /*-------------------------------------------------------------------------- + * hypre_SMGData: + *--------------------------------------------------------------------------*/ + + typedef struct + { + MPI_Comm comm; + + int memory_use; + double tol; + int max_iter; + int rel_change; + int zero_guess; + int max_levels; /* max_level <= 0 means no limit */ + + int num_levels; + + int num_pre_relax; /* number of pre relaxation sweeps */ + int num_post_relax; /* number of post relaxation sweeps */ + + int cdir; /* coarsening direction */ + + /* base index space info */ + hypre_Index base_index; + hypre_Index base_stride; + + hypre_StructGrid **grid_l; + hypre_StructGrid **PT_grid_l; + + double *data; + hypre_StructMatrix **A_l; + hypre_StructMatrix **PT_l; + hypre_StructMatrix **R_l; + hypre_StructVector **b_l; + hypre_StructVector **x_l; + + /* temp vectors */ + hypre_StructVector **tb_l; + hypre_StructVector **tx_l; + hypre_StructVector **r_l; + hypre_StructVector **e_l; + + void **relax_data_l; + void **residual_data_l; + void **restrict_data_l; + void **interp_data_l; + + /* log info (always logged) */ + int num_iterations; + int time_index; + + /* additional log info (logged when `logging' > 0) */ + int logging; + double *norms; + double *rel_norms; + + } hypre_SMGData; + + /*-------------------------------------------------------------------------- + * Utility routines: + *--------------------------------------------------------------------------*/ + + #define hypre_SMGSetBIndex(base_index, base_stride, level, bindex) \ + {\ + if (level > 0)\ + hypre_SetIndex(bindex, 0, 0, 0);\ + else\ + 
hypre_CopyIndex(base_index, bindex);\ + } + + #define hypre_SMGSetBStride(base_index, base_stride, level, bstride) \ + {\ + if (level > 0)\ + hypre_SetIndex(bstride, 1, 1, 1);\ + else\ + hypre_CopyIndex(base_stride, bstride);\ + } + + #define hypre_SMGSetCIndex(base_index, base_stride, level, cdir, cindex) \ + {\ + hypre_SMGSetBIndex(base_index, base_stride, level, cindex);\ + hypre_IndexD(cindex, cdir) += 0;\ + } + + #define hypre_SMGSetFIndex(base_index, base_stride, level, cdir, findex) \ + {\ + hypre_SMGSetBIndex(base_index, base_stride, level, findex);\ + hypre_IndexD(findex, cdir) += 1;\ + } + + #define hypre_SMGSetStride(base_index, base_stride, level, cdir, stride) \ + {\ + hypre_SMGSetBStride(base_index, base_stride, level, stride);\ + hypre_IndexD(stride, cdir) *= 2;\ + } + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,638 ---- + #include + #include + #include + + #include "utilities.h" + #include "HYPRE_struct_ls.h" + #include "krylov.h" + + /*-------------------------------------------------------------------------- + * Test driver for structured matrix interface (structured storage) + *--------------------------------------------------------------------------*/ + + /*---------------------------------------------------------------------- + * Standard 7-point laplacian in 3D with grid and anisotropy determined + * as command line arguments. Do `driver -help' for usage info. 
+ *----------------------------------------------------------------------*/ + + int + main( int argc, + char *argv[] ) + { + int arg_index; + int print_usage; + int nx, ny, nz; + int P, Q, R; + int bx, by, bz; + double cx, cy, cz; + int solver_id; + + int A_num_ghost[6] = {0, 0, 0, 0, 0, 0}; + + HYPRE_StructMatrix A; + HYPRE_StructVector b; + HYPRE_StructVector x; + + HYPRE_StructSolver solver; + HYPRE_StructSolver precond; + int num_iterations; + int time_index; + double final_res_norm; + + int num_procs, myid; + + int p, q, r; + int dim; + int n_pre, n_post; + int nblocks, volume; + + int **iupper; + int **ilower; + + int istart[3]; + + int **offsets; + + HYPRE_StructGrid grid; + HYPRE_StructStencil stencil; + + int *stencil_indices; + double *values; + + int i, s, d; + int ix, iy, iz, ib; + + /*----------------------------------------------------------- + * Initialize some stuff + *-----------------------------------------------------------*/ + + /* Initialize MPI */ + MPI_Init(&argc, &argv); + + MPI_Comm_size(MPI_COMM_WORLD, &num_procs ); + MPI_Comm_rank(MPI_COMM_WORLD, &myid ); + + /*----------------------------------------------------------- + * Set defaults + *-----------------------------------------------------------*/ + + dim = 3; + + nx = 10; + ny = 10; + nz = 10; + + P = num_procs; + Q = 1; + R = 1; + + bx = 1; + by = 1; + bz = 1; + + cx = 1.0; + cy = 1.0; + cz = 1.0; + + n_pre = 1; + n_post = 1; + + solver_id = 0; + + istart[0] = -17; + istart[1] = 0 ; + istart[2] = 32; + + /*----------------------------------------------------------- + * Parse command line + *-----------------------------------------------------------*/ + + print_usage = 0; + arg_index = 1; + while (arg_index < argc) + { + if ( strcmp(argv[arg_index], "-n") == 0 ) + { + arg_index++; + nx = atoi(argv[arg_index++]); + ny = atoi(argv[arg_index++]); + nz = atoi(argv[arg_index++]); + } + else if ( strcmp(argv[arg_index], "-P") == 0 ) + { + arg_index++; + P = atoi(argv[arg_index++]); + Q = 
atoi(argv[arg_index++]); + R = atoi(argv[arg_index++]); + } + else if ( strcmp(argv[arg_index], "-b") == 0 ) + { + arg_index++; + bx = atoi(argv[arg_index++]); + by = atoi(argv[arg_index++]); + bz = atoi(argv[arg_index++]); + } + else if ( strcmp(argv[arg_index], "-c") == 0 ) + { + arg_index++; + cx = atof(argv[arg_index++]); + cy = atof(argv[arg_index++]); + cz = atof(argv[arg_index++]); + } + else if ( strcmp(argv[arg_index], "-v") == 0 ) + { + arg_index++; + n_pre = atoi(argv[arg_index++]); + n_post = atoi(argv[arg_index++]); + } + else if ( strcmp(argv[arg_index], "-d") == 0 ) + { + arg_index++; + dim = atoi(argv[arg_index++]); + } + else if ( strcmp(argv[arg_index], "-solver") == 0 ) + { + arg_index++; + solver_id = atoi(argv[arg_index++]); + } + else if ( strcmp(argv[arg_index], "-help") == 0 ) + { + print_usage = 1; + break; + } + else + { + break; + } + } + + /*----------------------------------------------------------- + * Print usage info + *-----------------------------------------------------------*/ + + if ( (print_usage) && (myid == 0) ) + { + printf("\n"); + printf("Usage: %s []\n", argv[0]); + printf("\n"); + printf(" -n : problem size per block\n"); + printf(" -P : processor topology\n"); + printf(" -b : blocking per processor\n"); + printf(" -c : diffusion coefficients\n"); + printf(" -v : number of pre and post relaxations\n"); + printf(" -d : problem dimension (2 or 3)\n"); + printf(" -solver : solver ID (default = 0)\n"); + printf(" 0 - SMG\n"); + printf(" 1 - CG with SMG precond\n"); + printf(" 2 - CG with diagonal scaling\n"); + printf(" 3 - CG\n"); + printf("\n"); + } + + if ( print_usage ) + { + exit(1); + } + + /*----------------------------------------------------------- + * Check a few things + *-----------------------------------------------------------*/ + + if ((P*Q*R) != num_procs) + { + printf("Error: Invalid number of processors or processor topology \n"); + exit(1); + } + + 
/*----------------------------------------------------------- + * Print driver parameters + *-----------------------------------------------------------*/ + + if (myid == 0) + { + printf("Running with these driver parameters:\n"); + printf(" (nx, ny, nz) = (%d, %d, %d)\n", nx, ny, nz); + printf(" (Px, Py, Pz) = (%d, %d, %d)\n", P, Q, R); + printf(" (bx, by, bz) = (%d, %d, %d)\n", bx, by, bz); + printf(" (cx, cy, cz) = (%f, %f, %f)\n", cx, cy, cz); + printf(" (n_pre, n_post) = (%d, %d)\n", n_pre, n_post); + printf(" dim = %d\n", dim); + printf(" solver ID = %d\n", solver_id); + } + + /*----------------------------------------------------------- + * Synchronize so that timings make sense + *-----------------------------------------------------------*/ + + MPI_Barrier(MPI_COMM_WORLD); + + time_index = hypre_InitializeTiming("Struct Interface"); + hypre_BeginTiming(time_index); + + /*----------------------------------------------------------- + * Set up the grid structure + *-----------------------------------------------------------*/ + + switch (dim) + { + case 1: + volume = nx; + nblocks = bx; + stencil_indices = hypre_CTAlloc(int, 2); + offsets = hypre_CTAlloc(int*, 2); + offsets[0] = hypre_CTAlloc(int, 1); + offsets[0][0] = -1; + offsets[1] = hypre_CTAlloc(int, 1); + offsets[1][0] = 0; + /* compute p from P and myid */ + p = myid % P; + break; + case 2: + volume = nx*ny; + nblocks = bx*by; + stencil_indices = hypre_CTAlloc(int, 3); + offsets = hypre_CTAlloc(int*, 3); + offsets[0] = hypre_CTAlloc(int, 2); + offsets[0][0] = -1; + offsets[0][1] = 0; + offsets[1] = hypre_CTAlloc(int, 2); + offsets[1][0] = 0; + offsets[1][1] = -1; + offsets[2] = hypre_CTAlloc(int, 2); + offsets[2][0] = 0; + offsets[2][1] = 0; + /* compute p,q from P,Q and myid */ + p = myid % P; + q = (( myid - p)/P) % Q; + break; + case 3: + volume = nx*ny*nz; + nblocks = bx*by*bz; + stencil_indices = hypre_CTAlloc(int, 4); + offsets = hypre_CTAlloc(int*, 4); + offsets[0] = hypre_CTAlloc(int, 3); + 
offsets[0][0] = -1; + offsets[0][1] = 0; + offsets[0][2] = 0; + offsets[1] = hypre_CTAlloc(int, 3); + offsets[1][0] = 0; + offsets[1][1] = -1; + offsets[1][2] = 0; + offsets[2] = hypre_CTAlloc(int, 3); + offsets[2][0] = 0; + offsets[2][1] = 0; + offsets[2][2] = -1; + offsets[3] = hypre_CTAlloc(int, 3); + offsets[3][0] = 0; + offsets[3][1] = 0; + offsets[3][2] = 0; + /* compute p,q,r from P,Q,R and myid */ + p = myid % P; + q = (( myid - p)/P) % Q; + r = ( myid - p - P*q)/( P*Q ); + break; + } + + ilower = hypre_CTAlloc(int*, nblocks); + iupper = hypre_CTAlloc(int*, nblocks); + for (i = 0; i < nblocks; i++) + { + ilower[i] = hypre_CTAlloc(int, dim); + iupper[i] = hypre_CTAlloc(int, dim); + } + + for (i = 0; i < dim; i++) + { + A_num_ghost[2*i] = 1; + A_num_ghost[2*i + 1] = 1; + } + + /* compute ilower and iupper from (p,q,r), (bx,by,bz), and (nx,ny,nz) */ + ib = 0; + switch (dim) + { + case 1: + for (ix = 0; ix < bx; ix++) + { + ilower[ib][0] = istart[0]+ nx*(bx*p+ix); + iupper[ib][0] = istart[0]+ nx*(bx*p+ix+1) - 1; + ib++; + } + break; + case 2: + for (iy = 0; iy < by; iy++) + for (ix = 0; ix < bx; ix++) + { + ilower[ib][0] = istart[0]+ nx*(bx*p+ix); + iupper[ib][0] = istart[0]+ nx*(bx*p+ix+1) - 1; + ilower[ib][1] = istart[1]+ ny*(by*q+iy); + iupper[ib][1] = istart[1]+ ny*(by*q+iy+1) - 1; + ib++; + } + break; + case 3: + for (iz = 0; iz < bz; iz++) + for (iy = 0; iy < by; iy++) + for (ix = 0; ix < bx; ix++) + { + ilower[ib][0] = istart[0]+ nx*(bx*p+ix); + iupper[ib][0] = istart[0]+ nx*(bx*p+ix+1) - 1; + ilower[ib][1] = istart[1]+ ny*(by*q+iy); + iupper[ib][1] = istart[1]+ ny*(by*q+iy+1) - 1; + ilower[ib][2] = istart[2]+ nz*(bz*r+iz); + iupper[ib][2] = istart[2]+ nz*(bz*r+iz+1) - 1; + ib++; + } + break; + } + + HYPRE_StructGridCreate(MPI_COMM_WORLD, dim, &grid); + for (ib = 0; ib < nblocks; ib++) + { + HYPRE_StructGridSetExtents(grid, ilower[ib], iupper[ib]); + } + HYPRE_StructGridAssemble(grid); + + /*----------------------------------------------------------- + * 
Set up the stencil structure + *-----------------------------------------------------------*/ + + HYPRE_StructStencilCreate(dim, dim + 1, &stencil); + for (s = 0; s < dim + 1; s++) + { + HYPRE_StructStencilSetElement(stencil, s, offsets[s]); + } + + /*----------------------------------------------------------- + * Set up the matrix structure + *-----------------------------------------------------------*/ + + HYPRE_StructMatrixCreate(MPI_COMM_WORLD, grid, stencil, &A); + HYPRE_StructMatrixSetSymmetric(A, 1); + HYPRE_StructMatrixSetNumGhost(A, A_num_ghost); + HYPRE_StructMatrixInitialize(A); + /*----------------------------------------------------------- + * Fill in the matrix elements + *-----------------------------------------------------------*/ + + values = hypre_CTAlloc(double, (dim +1)*volume); + + /* Set the coefficients for the grid */ + for (i = 0; i < (dim + 1)*volume; i += (dim + 1)) + { + for (s = 0; s < (dim + 1); s++) + { + stencil_indices[s] = s; + switch (dim) + { + case 1: + values[i ] = -cx; + values[i+1] = 2.0*(cx); + break; + case 2: + values[i ] = -cx; + values[i+1] = -cy; + values[i+2] = 2.0*(cx+cy); + break; + case 3: + values[i ] = -cx; + values[i+1] = -cy; + values[i+2] = -cz; + values[i+3] = 2.0*(cx+cy+cz); + break; + } + } + } + for (ib = 0; ib < nblocks; ib++) + { + HYPRE_StructMatrixSetBoxValues(A, ilower[ib], iupper[ib], (dim+1), + stencil_indices, values); + } + + /* Zero out stencils reaching to real boundary */ + for (i = 0; i < volume; i++) + { + values[i] = 0.0; + } + for (d = 0; d < dim; d++) + { + for (ib = 0; ib < nblocks; ib++) + { + if( ilower[ib][d] == istart[d] ) + { + i = iupper[ib][d]; + iupper[ib][d] = istart[d]; + stencil_indices[0] = d; + HYPRE_StructMatrixSetBoxValues(A, ilower[ib], iupper[ib], + 1, stencil_indices, values); + iupper[ib][d] = i; + } + } + } + + HYPRE_StructMatrixAssemble(A); + #if 0 + HYPRE_StructMatrixPrint("driver.out.A", A, 0); + #endif + + hypre_TFree(values); + + 
/*----------------------------------------------------------- + * Set up the linear system + *-----------------------------------------------------------*/ + + values = hypre_CTAlloc(double, volume); + + HYPRE_StructVectorCreate(MPI_COMM_WORLD, grid, &b); + HYPRE_StructVectorInitialize(b); + for (i = 0; i < volume; i++) + { + values[i] = 1.0; + } + for (ib = 0; ib < nblocks; ib++) + { + HYPRE_StructVectorSetBoxValues(b, ilower[ib], iupper[ib], values); + } + HYPRE_StructVectorAssemble(b); + #if 0 + HYPRE_StructVectorPrint("driver.out.b", b, 0); + #endif + + HYPRE_StructVectorCreate(MPI_COMM_WORLD, grid, &x); + HYPRE_StructVectorInitialize(x); + for (i = 0; i < volume; i++) + { + values[i] = 0.0; + } + for (ib = 0; ib < nblocks; ib++) + { + HYPRE_StructVectorSetBoxValues(x, ilower[ib], iupper[ib], values); + } + HYPRE_StructVectorAssemble(x); + #if 0 + HYPRE_StructVectorPrint("driver.out.x0", x, 0); + #endif + + hypre_TFree(values); + + hypre_EndTiming(time_index); + hypre_PrintTiming("Struct Interface", MPI_COMM_WORLD); + hypre_FinalizeTiming(time_index); + hypre_ClearTiming(); + + /*----------------------------------------------------------- + * Solve the system using SMG + *-----------------------------------------------------------*/ + + if (solver_id == 0) + { + time_index = hypre_InitializeTiming("SMG Setup"); + hypre_BeginTiming(time_index); + + HYPRE_StructSMGCreate(MPI_COMM_WORLD, &solver); + HYPRE_StructSMGSetMemoryUse(solver, 0); + HYPRE_StructSMGSetMaxIter(solver, 50); + HYPRE_StructSMGSetTol(solver, 1.0e-06); + HYPRE_StructSMGSetRelChange(solver, 0); + HYPRE_StructSMGSetNumPreRelax(solver, n_pre); + HYPRE_StructSMGSetNumPostRelax(solver, n_post); + HYPRE_StructSMGSetLogging(solver, 1); + HYPRE_StructSMGSetup(solver, A, b, x); + + hypre_EndTiming(time_index); + hypre_PrintTiming("Setup phase times", MPI_COMM_WORLD); + hypre_FinalizeTiming(time_index); + hypre_ClearTiming(); + + time_index = hypre_InitializeTiming("SMG Solve"); + 
hypre_BeginTiming(time_index); + + HYPRE_StructSMGSolve(solver, A, b, x); + + hypre_EndTiming(time_index); + hypre_PrintTiming("Solve phase times", MPI_COMM_WORLD); + hypre_FinalizeTiming(time_index); + hypre_ClearTiming(); + + HYPRE_StructSMGGetNumIterations(solver, &num_iterations); + HYPRE_StructSMGGetFinalRelativeResidualNorm(solver, &final_res_norm); + HYPRE_StructSMGDestroy(solver); + } + + /*----------------------------------------------------------- + * Solve the system using CG + *-----------------------------------------------------------*/ + + if (solver_id > 0) + { + time_index = hypre_InitializeTiming("PCG Setup"); + hypre_BeginTiming(time_index); + + HYPRE_StructPCGCreate(MPI_COMM_WORLD, &solver); + HYPRE_PCGSetMaxIter( (HYPRE_Solver)solver, 50 ); + HYPRE_PCGSetTol( (HYPRE_Solver)solver, 1.0e-06 ); + HYPRE_PCGSetTwoNorm( (HYPRE_Solver)solver, 1 ); + HYPRE_PCGSetRelChange( (HYPRE_Solver)solver, 0 ); + HYPRE_PCGSetLogging( (HYPRE_Solver)solver, 1 ); + + if (solver_id == 1) + { + /* use symmetric SMG as preconditioner */ + HYPRE_StructSMGCreate(MPI_COMM_WORLD, &precond); + HYPRE_StructSMGSetMemoryUse(precond, 0); + HYPRE_StructSMGSetMaxIter(precond, 1); + HYPRE_StructSMGSetTol(precond, 0.0); + HYPRE_StructSMGSetZeroGuess(precond); + HYPRE_StructSMGSetNumPreRelax(precond, n_pre); + HYPRE_StructSMGSetNumPostRelax(precond, n_post); + HYPRE_StructSMGSetLogging(precond, 0); + HYPRE_PCGSetPrecond((HYPRE_Solver) solver, + (HYPRE_PtrToSolverFcn) HYPRE_StructSMGSolve, + (HYPRE_PtrToSolverFcn) HYPRE_StructSMGSetup, + (HYPRE_Solver) precond); + } + + else if (solver_id == 2) + { + /* use diagonal scaling as preconditioner */ + precond = NULL; + HYPRE_PCGSetPrecond((HYPRE_Solver) solver, + (HYPRE_PtrToSolverFcn) HYPRE_StructDiagScale, + (HYPRE_PtrToSolverFcn) HYPRE_StructDiagScaleSetup, + (HYPRE_Solver) precond); + } + + HYPRE_PCGSetup((HYPRE_Solver)solver, + (HYPRE_Matrix)A, (HYPRE_Vector)b, (HYPRE_Vector)x); + + hypre_EndTiming(time_index); + 
hypre_PrintTiming("Setup phase times", MPI_COMM_WORLD); + hypre_FinalizeTiming(time_index); + hypre_ClearTiming(); + + time_index = hypre_InitializeTiming("PCG Solve"); + hypre_BeginTiming(time_index); + + HYPRE_PCGSolve((HYPRE_Solver)solver, + (HYPRE_Matrix)A, (HYPRE_Vector)b, (HYPRE_Vector)x); + + hypre_EndTiming(time_index); + hypre_PrintTiming("Solve phase times", MPI_COMM_WORLD); + hypre_FinalizeTiming(time_index); + hypre_ClearTiming(); + + HYPRE_PCGGetNumIterations((HYPRE_Solver)solver, &num_iterations); + HYPRE_PCGGetFinalRelativeResidualNorm((HYPRE_Solver)solver, + &final_res_norm); + HYPRE_StructPCGDestroy(solver); + + if (solver_id == 1) + { + HYPRE_StructSMGDestroy(precond); + } + } + + /*----------------------------------------------------------- + * Print the solution and other info + *-----------------------------------------------------------*/ + + #if 0 + HYPRE_StructVectorPrint("driver.out.x", x, 0); + #endif + + if (myid == 0) + { + printf("\n"); + printf("Iterations = %d\n", num_iterations); + printf("Final Relative Residual Norm = %e\n", final_res_norm); + printf("\n"); + } + + /*----------------------------------------------------------- + * Finalize things + *-----------------------------------------------------------*/ + + HYPRE_StructGridDestroy(grid); + HYPRE_StructStencilDestroy(stencil); + HYPRE_StructMatrixDestroy(A); + HYPRE_StructVectorDestroy(b); + HYPRE_StructVectorDestroy(x); + + for (i = 0; i < nblocks; i++) + { + hypre_TFree(iupper[i]); + hypre_TFree(ilower[i]); + } + hypre_TFree(ilower); + hypre_TFree(iupper); + hypre_TFree(stencil_indices); + + for ( i = 0; i < (dim + 1); i++) + hypre_TFree(offsets[i]); + hypre_TFree(offsets); + + /* Finalize MPI */ + MPI_Finalize(); + + return (0); + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2_setup_rap.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2_setup_rap.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- 
llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2_setup_rap.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,985 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + #include "smg.h" + + /*-------------------------------------------------------------------------- + * hypre_SMG2CreateRAPOp + * Sets up new coarse grid operator structure. + *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_SMG2CreateRAPOp( hypre_StructMatrix *R, + hypre_StructMatrix *A, + hypre_StructMatrix *PT, + hypre_StructGrid *coarse_grid ) + { + hypre_StructMatrix *RAP; + + hypre_Index *RAP_stencil_shape; + hypre_StructStencil *RAP_stencil; + int RAP_stencil_size; + int RAP_stencil_dim; + int RAP_num_ghost[] = {1, 1, 1, 1, 0, 0}; + + int j, i; + int stencil_rank; + + RAP_stencil_dim = 2; + + /*----------------------------------------------------------------------- + * Define RAP_stencil + *-----------------------------------------------------------------------*/ + + stencil_rank = 0; + + /*----------------------------------------------------------------------- + * non-symmetric case + *-----------------------------------------------------------------------*/ + + if (!hypre_StructMatrixSymmetric(A)) + { + + /*-------------------------------------------------------------------- + * 5 or 9 point fine grid stencil produces 9 point RAP + *--------------------------------------------------------------------*/ + 
RAP_stencil_size = 9; + RAP_stencil_shape = hypre_CTAlloc(hypre_Index, RAP_stencil_size); + for (j = -1; j < 2; j++) + { + for (i = -1; i < 2; i++) + { + + /*-------------------------------------------------------------- + * Storage for 9 elements (c,w,e,n,s,sw,se,nw,ne) + *--------------------------------------------------------------*/ + hypre_SetIndex(RAP_stencil_shape[stencil_rank],i,j,0); + stencil_rank++; + } + } + } + + /*----------------------------------------------------------------------- + * symmetric case + *-----------------------------------------------------------------------*/ + + else + { + + /*-------------------------------------------------------------------- + * 5 or 9 point fine grid stencil produces 9 point RAP + * Only store the lower triangular part + diagonal = 5 entries, + * lower triangular means the lower triangular part of the matrix + * in the standard lexicographic ordering. + *--------------------------------------------------------------------*/ + RAP_stencil_size = 5; + RAP_stencil_shape = hypre_CTAlloc(hypre_Index, RAP_stencil_size); + for (j = -1; j < 1; j++) + { + for (i = -1; i < 2; i++) + { + + /*-------------------------------------------------------------- + * Store 5 elements in (c,w,s,sw,se) + *--------------------------------------------------------------*/ + if( i+j <=0 ) + { + hypre_SetIndex(RAP_stencil_shape[stencil_rank],i,j,0); + stencil_rank++; + } + } + } + } + + RAP_stencil = hypre_StructStencilCreate(RAP_stencil_dim, RAP_stencil_size, + RAP_stencil_shape); + + RAP = hypre_StructMatrixCreate(hypre_StructMatrixComm(A), + coarse_grid, RAP_stencil); + + hypre_StructStencilDestroy(RAP_stencil); + + /*----------------------------------------------------------------------- + * Coarse operator is symmetric iff fine operator is + *-----------------------------------------------------------------------*/ + hypre_StructMatrixSymmetric(RAP) = hypre_StructMatrixSymmetric(A); + + 
/*----------------------------------------------------------------------- + * Set number of ghost points + *-----------------------------------------------------------------------*/ + if (hypre_StructMatrixSymmetric(A)) + { + RAP_num_ghost[1] = 0; + RAP_num_ghost[3] = 0; + } + hypre_StructMatrixSetNumGhost(RAP, RAP_num_ghost); + + return RAP; + } + + /*-------------------------------------------------------------------------- + * Routines to build RAP. These routines are fairly general + * 1) No assumptions about symmetry of A + * 2) No assumption that R = transpose(P) + * 3) 5 or 9-point fine grid A + * + * I am, however, assuming that the c-to-c interpolation is the identity. + * + * I've written two routines - hypre_SMG2BuildRAPSym to build the + * lower triangular part of RAP (including the diagonal) and + * hypre_SMG2BuildRAPNoSym to build the upper triangular part of RAP + * (excluding the diagonal). So using symmetric storage, only the + * first routine would be called. With full storage both would need to + * be called. 
+ * + *--------------------------------------------------------------------------*/ + + int + hypre_SMG2BuildRAPSym( hypre_StructMatrix *A, + hypre_StructMatrix *PT, + hypre_StructMatrix *R, + hypre_StructMatrix *RAP, + hypre_Index cindex, + hypre_Index cstride ) + + { + + hypre_Index index; + + hypre_StructStencil *fine_stencil; + int fine_stencil_size; + + hypre_StructGrid *fgrid; + int *fgrid_ids; + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + int *cgrid_ids; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index fstart; + hypre_IndexRef stridef; + hypre_Index loop_size; + + int fi, ci; + int loopi, loopj, loopk; + + hypre_Box *A_dbox; + hypre_Box *PT_dbox; + hypre_Box *R_dbox; + hypre_Box *RAP_dbox; + + double *pa, *pb; + double *ra, *rb; + + double *a_cc, *a_cw, *a_ce, *a_cs, *a_cn; + double *a_csw, *a_cse, *a_cnw; + + double *rap_cc, *rap_cw, *rap_cs; + double *rap_csw, *rap_cse; + + int iA, iAm1, iAp1; + int iAc; + int iP, iP1; + int iR; + + int yOffsetA; + int xOffsetP; + int yOffsetP; + + int ierr = 0; + + fine_stencil = hypre_StructMatrixStencil(A); + fine_stencil_size = hypre_StructStencilSize(fine_stencil); + + stridef = cstride; + hypre_SetIndex(stridec, 1, 1, 1); + + fgrid = hypre_StructMatrixGrid(A); + fgrid_ids = hypre_StructGridIDs(fgrid); + + cgrid = hypre_StructMatrixGrid(RAP); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + + fi = 0; + hypre_ForBoxI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + hypre_StructMapCoarseToFine(cstart, cindex, cstride, fstart); + + A_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), fi); + PT_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(PT), fi); + R_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(R), fi); + RAP_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(RAP), ci); + + 
/*----------------------------------------------------------------- + * Extract pointers for interpolation operator: + * pa is pointer for weight for f-point above c-point + * pb is pointer for weight for f-point below c-point + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,1,0); + pa = hypre_StructMatrixExtractPointerByIndex(PT, fi, index); + + hypre_SetIndex(index,0,-1,0); + pb = hypre_StructMatrixExtractPointerByIndex(PT, fi, index); + + /*----------------------------------------------------------------- + * Extract pointers for restriction operator: + * ra is pointer for weight for f-point above c-point + * rb is pointer for weight for f-point below c-point + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,1,0); + ra = hypre_StructMatrixExtractPointerByIndex(R, fi, index); + + hypre_SetIndex(index,0,-1,0); + rb = hypre_StructMatrixExtractPointerByIndex(R, fi, index); + + /*----------------------------------------------------------------- + * Extract pointers for 5-point fine grid operator: + * + * a_cc is pointer for center coefficient + * a_cw is pointer for west coefficient + * a_ce is pointer for east coefficient + * a_cs is pointer for south coefficient + * a_cn is pointer for north coefficient + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + a_cc = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,0,0); + a_cw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,0,0); + a_ce = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,-1,0); + a_cs = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,1,0); + a_cn = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + /*----------------------------------------------------------------- + * Extract additional pointers for 9-point fine 
grid operator: + * + * a_csw is pointer for southwest coefficient + * a_cse is pointer for southeast coefficient + * a_cnw is pointer for northwest coefficient + * a_cne is pointer for northeast coefficient + *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 5) + { + hypre_SetIndex(index,-1,-1,0); + a_csw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,-1,0); + a_cse = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,1,0); + a_cnw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + } + + /*----------------------------------------------------------------- + * Extract pointers for coarse grid operator - always 9-point: + * + * We build only the lower triangular part (plus diagonal). + * + * rap_cc is pointer for center coefficient (etc.) + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + rap_cc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,0); + rap_cw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,0); + rap_cs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,-1,0); + rap_csw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,0); + rap_cse = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + /*----------------------------------------------------------------- + * Define offsets for fine grid stencil and interpolation + * + * In the BoxLoop below I assume iA and iP refer to data associated + * with the point which we are building the stencil for. The below + * Offsets are used in referring to data associated with other points. 
+ *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,1,0); + yOffsetA = hypre_BoxOffsetDistance(A_dbox,index); + yOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + hypre_SetIndex(index,1,0,0); + xOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + + /*----------------------------------------------------------------- + * Switch statement to direct control to appropriate BoxLoop depending + * on stencil size. Default is full 9-point. + *-----------------------------------------------------------------*/ + + switch (fine_stencil_size) + { + + /*-------------------------------------------------------------- + * Loop for symmetric 5-point fine grid operator; produces a + * symmetric 9-point coarse grid operator. We calculate only the + * lower triangular stencil entries: (southwest, south, southeast, + * west, and center). + *--------------------------------------------------------------*/ + + case 5: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - yOffsetA; + iAp1 = iA + yOffsetA; + + iP1 = iP - yOffsetP - xOffsetP; + rap_csw[iAc] = rb[iR] * a_cw[iAm1] * pa[iP1]; + + iP1 = iP - yOffsetP; + rap_cs[iAc] = rb[iR] * a_cc[iAm1] * pa[iP1] + + rb[iR] * a_cs[iAm1] + + a_cs[iA] * pa[iP1]; + + iP1 = iP - yOffsetP + xOffsetP; + rap_cse[iAc] = rb[iR] * a_ce[iAm1] * pa[iP1]; + + iP1 = iP - xOffsetP; + rap_cw[iAc] = a_cw[iA] + + rb[iR] * a_cw[iAm1] * pb[iP1] + + ra[iR] * a_cw[iAp1] * pa[iP1]; + + rap_cc[iAc] = a_cc[iA] + + rb[iR] * a_cc[iAm1] * pb[iP] + + ra[iR] * a_cc[iAp1] * pa[iP] + + rb[iR] * a_cn[iAm1] + + ra[iR] * a_cs[iAp1] + + a_cs[iA] * pb[iP] + + a_cn[iA] * pa[iP]; + + } + 
hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + /*-------------------------------------------------------------- + * Loop for symmetric 9-point fine grid operator; produces a + * symmetric 9-point coarse grid operator. We calculate only the + * lower triangular stencil entries: (southwest, south, southeast, + * west, and center). + *--------------------------------------------------------------*/ + + default: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - yOffsetA; + iAp1 = iA + yOffsetA; + + iP1 = iP - yOffsetP - xOffsetP; + rap_csw[iAc] = rb[iR] * a_cw[iAm1] * pa[iP1] + + rb[iR] * a_csw[iAm1] + + a_csw[iA] * pa[iP1]; + + iP1 = iP - yOffsetP; + rap_cs[iAc] = rb[iR] * a_cc[iAm1] * pa[iP1] + + rb[iR] * a_cs[iAm1] + + a_cs[iA] * pa[iP1]; + + iP1 = iP - yOffsetP + xOffsetP; + rap_cse[iAc] = rb[iR] * a_ce[iAm1] * pa[iP1] + + rb[iR] * a_cse[iAm1] + + a_cse[iA] * pa[iP1]; + + iP1 = iP - xOffsetP; + rap_cw[iAc] = a_cw[iA] + + rb[iR] * a_cw[iAm1] * pb[iP1] + + ra[iR] * a_cw[iAp1] * pa[iP1] + + rb[iR] * a_cnw[iAm1] + + ra[iR] * a_csw[iAp1] + + a_csw[iA] * pb[iP1] + + a_cnw[iA] * pa[iP1]; + + rap_cc[iAc] = a_cc[iA] + + rb[iR] * a_cc[iAm1] * pb[iP] + + ra[iR] * a_cc[iAp1] * pa[iP] + + rb[iR] * a_cn[iAm1] + + ra[iR] * a_cs[iAp1] + + a_cs[iA] * pb[iP] + + a_cn[iA] * pa[iP]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + } /* end switch statement */ + + } /* end ForBoxI */ + + return ierr; + } + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + int + hypre_SMG2BuildRAPNoSym( hypre_StructMatrix *A, + 
hypre_StructMatrix *PT, + hypre_StructMatrix *R, + hypre_StructMatrix *RAP, + hypre_Index cindex, + hypre_Index cstride ) + + { + + hypre_Index index; + + hypre_StructStencil *fine_stencil; + int fine_stencil_size; + + hypre_StructGrid *fgrid; + int *fgrid_ids; + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + int *cgrid_ids; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index fstart; + hypre_IndexRef stridef; + hypre_Index loop_size; + + int fi, ci; + int loopi, loopj, loopk; + + hypre_Box *A_dbox; + hypre_Box *PT_dbox; + hypre_Box *R_dbox; + hypre_Box *RAP_dbox; + + double *pa, *pb; + double *ra, *rb; + + double *a_cc, *a_cw, *a_ce, *a_cn; + double *a_cse, *a_cnw, *a_cne; + + double *rap_ce, *rap_cn; + double *rap_cnw, *rap_cne; + + int iA, iAm1, iAp1; + int iAc; + int iP, iP1; + int iR; + + int yOffsetA; + int xOffsetP; + int yOffsetP; + + int ierr = 0; + + fine_stencil = hypre_StructMatrixStencil(A); + fine_stencil_size = hypre_StructStencilSize(fine_stencil); + + stridef = cstride; + hypre_SetIndex(stridec, 1, 1, 1); + + fgrid = hypre_StructMatrixGrid(A); + fgrid_ids = hypre_StructGridIDs(fgrid); + + cgrid = hypre_StructMatrixGrid(RAP); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + + fi = 0; + hypre_ForBoxI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + hypre_StructMapCoarseToFine(cstart, cindex, cstride, fstart); + + A_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), fi); + PT_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(PT), fi); + R_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(R), fi); + RAP_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(RAP), ci); + + /*----------------------------------------------------------------- + * Extract pointers for interpolation operator: + * pa is pointer for weight for f-point above 
c-point + * pb is pointer for weight for f-point below c-point + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,1,0); + pa = hypre_StructMatrixExtractPointerByIndex(PT, fi, index); + + hypre_SetIndex(index,0,-1,0); + pb = hypre_StructMatrixExtractPointerByIndex(PT, fi, index); + + /*----------------------------------------------------------------- + * Extract pointers for restriction operator: + * ra is pointer for weight for f-point above c-point + * rb is pointer for weight for f-point below c-point + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,1,0); + ra = hypre_StructMatrixExtractPointerByIndex(R, fi, index); + + hypre_SetIndex(index,0,-1,0); + rb = hypre_StructMatrixExtractPointerByIndex(R, fi, index); + + /*----------------------------------------------------------------- + * Extract pointers for 5-point fine grid operator: + * + * a_cc is pointer for center coefficient + * a_cw is pointer for west coefficient + * a_ce is pointer for east coefficient + * a_cs is pointer for south coefficient + * a_cn is pointer for north coefficient + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + a_cc = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,0,0); + a_cw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,0,0); + a_ce = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,1,0); + a_cn = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + /*----------------------------------------------------------------- + * Extract additional pointers for 9-point fine grid operator: + * + * a_csw is pointer for southwest coefficient + * a_cse is pointer for southeast coefficient + * a_cnw is pointer for northwest coefficient + * a_cne is pointer for northeast coefficient + 
*-----------------------------------------------------------------*/ + + if(fine_stencil_size > 5) + { + hypre_SetIndex(index,1,-1,0); + a_cse = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,1,0); + a_cnw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,1,0); + a_cne = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + } + + /*----------------------------------------------------------------- + * Extract pointers for coarse grid operator - always 9-point: + * + * We build only the upper triangular part. + * + * rap_ce is pointer for east coefficient (etc.) + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,1,0,0); + rap_ce = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,0); + rap_cn = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,0); + rap_cne = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,1,0); + rap_cnw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + /*----------------------------------------------------------------- + * Define offsets for fine grid stencil and interpolation + * + * In the BoxLoop below I assume iA and iP refer to data associated + * with the point which we are building the stencil for. The below + * Offsets are used in referring to data associated with other points. + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,1,0); + yOffsetA = hypre_BoxOffsetDistance(A_dbox,index); + yOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + hypre_SetIndex(index,1,0,0); + xOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + + /*----------------------------------------------------------------- + * Switch statement to direct control to appropriate BoxLoop depending + * on stencil size. Default is full 9-point. 
+ *-----------------------------------------------------------------*/ + + switch (fine_stencil_size) + { + + /*-------------------------------------------------------------- + * Loop for 5-point fine grid operator; produces upper triangular + * part of 9-point coarse grid operator - excludes diagonal. + * stencil entries: (northeast, north, northwest, and east) + *--------------------------------------------------------------*/ + + case 5: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - yOffsetA; + iAp1 = iA + yOffsetA; + + iP1 = iP + yOffsetP + xOffsetP; + rap_cne[iAc] = ra[iR] * a_ce[iAp1] * pb[iP1]; + + iP1 = iP + yOffsetP; + rap_cn[iAc] = ra[iR] * a_cc[iAp1] * pb[iP1] + + ra[iR] * a_cn[iAp1] + + a_cn[iA] * pb[iP1]; + + iP1 = iP + yOffsetP - xOffsetP; + rap_cnw[iAc] = ra[iR] * a_cw[iAp1] * pb[iP1]; + + iP1 = iP + xOffsetP; + rap_ce[iAc] = a_ce[iA] + + rb[iR] * a_ce[iAm1] * pb[iP1] + + ra[iR] * a_ce[iAp1] * pa[iP1]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + /*-------------------------------------------------------------- + * Loop for 9-point fine grid operator; produces upper triangular + * part of 9-point coarse grid operator - excludes diagonal. 
+ * stencil entries: (northeast, north, northwest, and east) + *--------------------------------------------------------------*/ + + default: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - yOffsetA; + iAp1 = iA + yOffsetA; + + iP1 = iP + yOffsetP + xOffsetP; + rap_cne[iAc] = ra[iR] * a_ce[iAp1] * pb[iP1] + + ra[iR] * a_cne[iAp1] + + a_cne[iA] * pb[iP1]; + + iP1 = iP + yOffsetP; + rap_cn[iAc] = ra[iR] * a_cc[iAp1] * pb[iP1] + + ra[iR] * a_cn[iAp1] + + a_cn[iA] * pb[iP1]; + + iP1 = iP + yOffsetP - xOffsetP; + rap_cnw[iAc] = ra[iR] * a_cw[iAp1] * pb[iP1] + + ra[iR] * a_cnw[iAp1] + + a_cnw[iA] * pb[iP1]; + + iP1 = iP + xOffsetP; + rap_ce[iAc] = a_ce[iA] + + rb[iR] * a_ce[iAm1] * pb[iP1] + + ra[iR] * a_ce[iAp1] * pa[iP1] + + rb[iR] * a_cne[iAm1] + + ra[iR] * a_cse[iAp1] + + a_cse[iA] * pb[iP1] + + a_cne[iA] * pa[iP1]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + } /* end switch statement */ + + } /* end ForBoxI */ + + return ierr; + } + + + /*-------------------------------------------------------------------------- + * hypre_SMG2RAPPeriodicSym + * Collapses stencil in periodic direction on coarsest grid. 
+ *--------------------------------------------------------------------------*/ + + + int + hypre_SMG2RAPPeriodicSym( hypre_StructMatrix *RAP, + hypre_Index cindex, + hypre_Index cstride ) + + { + hypre_Index index; + + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index loop_size; + + int ci; + int loopi, loopj, loopk; + + hypre_Box *RAP_dbox; + + double *rap_cc, *rap_cw, *rap_cs; + double *rap_csw, *rap_cse; + + int iAc; + int iAcm1; + + int xOffset; + + double zero = 0.0; + + int ierr = 0; + + hypre_SetIndex(stridec, 1, 1, 1); + + cgrid = hypre_StructMatrixGrid(RAP); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + + if (hypre_IndexY(hypre_StructGridPeriodic(cgrid)) == 1) + { + hypre_StructMatrixAssemble(RAP); + hypre_ForBoxI(ci, cgrid_boxes) + { + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + + RAP_dbox = + hypre_BoxArrayBox(hypre_StructMatrixDataSpace(RAP), ci); + + hypre_SetIndex(index,1,0,0); + xOffset = hypre_BoxOffsetDistance(RAP_dbox,index); + + /*----------------------------------------------------------------- + * Extract pointers for coarse grid operator - always 9-point: + *-----------------------------------------------------------------*/ + hypre_SetIndex(index,0,0,0); + rap_cc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,0); + rap_cw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,0); + rap_cs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,-1,0); + rap_csw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,0); + rap_cse = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_BoxGetSize(cgrid_box, loop_size); + + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc,iAcm1 + 
#include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + iAcm1 = iAc - xOffset; + + rap_cw[iAc] += (rap_cse[iAcm1] + rap_csw[iAc]); + rap_cc[iAc] += (2.0 * rap_cs[iAc]); + } + hypre_BoxLoop1End(iAc); + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + rap_csw[iAc] = zero; + rap_cs[iAc] = zero; + rap_cse[iAc] = zero; + } + hypre_BoxLoop1End(iAc); + + } /* end ForBoxI */ + + } + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMG2RAPPeriodicNoSym + * Collapses stencil in periodic direction on coarsest grid. + *--------------------------------------------------------------------------*/ + + int + hypre_SMG2RAPPeriodicNoSym( hypre_StructMatrix *RAP, + hypre_Index cindex, + hypre_Index cstride ) + + { + + hypre_Index index; + + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index loop_size; + + int ci; + int loopi, loopj, loopk; + + hypre_Box *RAP_dbox; + + double *rap_cc, *rap_cw, *rap_cs; + double *rap_csw, *rap_cse; + double *rap_ce, *rap_cn; + double *rap_cnw, *rap_cne; + + int iAc; + + double zero = 0.0; + + int ierr = 0; + + hypre_SetIndex(stridec, 1, 1, 1); + + cgrid = hypre_StructMatrixGrid(RAP); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + + if (hypre_IndexY(hypre_StructGridPeriodic(cgrid)) == 1) + { + hypre_ForBoxI(ci, cgrid_boxes) + { + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + + RAP_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(RAP), ci); + + /*----------------------------------------------------------------- + * Extract pointers for coarse grid operator - always 9-point: + *-----------------------------------------------------------------*/ + 
hypre_SetIndex(index,0,0,0); + rap_cc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,0); + rap_cw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,0); + rap_cs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,-1,0); + rap_csw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,0); + rap_cse = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,0,0); + rap_ce = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,0); + rap_cn = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,0); + rap_cne = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,1,0); + rap_cnw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + rap_cw[iAc] += (rap_cnw[iAc] + rap_csw[iAc]); + rap_cnw[iAc] = zero; + rap_csw[iAc] = zero; + + rap_cc[iAc] += (rap_cn[iAc] + rap_cs[iAc]); + rap_cn[iAc] = zero; + rap_cs[iAc] = zero; + + rap_ce[iAc] += (rap_cne[iAc] + rap_cse[iAc]); + rap_cne[iAc] = zero; + rap_cse[iAc] = zero; + } + hypre_BoxLoop1End(iAc); + + } /* end ForBoxI */ + + } + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg3_setup_rap.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg3_setup_rap.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg3_setup_rap.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,2044 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the 
University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + #include "smg.h" + + /*-------------------------------------------------------------------------- + * hypre_SMG3CreateRAPOp + * Sets up new coarse grid operator stucture. + *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_SMG3CreateRAPOp( hypre_StructMatrix *R, + hypre_StructMatrix *A, + hypre_StructMatrix *PT, + hypre_StructGrid *coarse_grid ) + { + hypre_StructMatrix *RAP; + + hypre_Index *RAP_stencil_shape; + hypre_StructStencil *RAP_stencil; + int RAP_stencil_size; + int RAP_stencil_dim; + int RAP_num_ghost[] = {1, 1, 1, 1, 1, 1}; + + hypre_StructStencil *A_stencil; + int A_stencil_size; + + int k, j, i; + int stencil_rank; + + RAP_stencil_dim = 3; + + A_stencil = hypre_StructMatrixStencil(A); + A_stencil_size = hypre_StructStencilSize(A_stencil); + + /*----------------------------------------------------------------------- + * Define RAP_stencil + *-----------------------------------------------------------------------*/ + + stencil_rank = 0; + + /*----------------------------------------------------------------------- + * non-symmetric case + *-----------------------------------------------------------------------*/ + + if (!hypre_StructMatrixSymmetric(A)) + { + + /*-------------------------------------------------------------------- + * 7 or 15 point fine grid stencil produces 15 point RAP + *--------------------------------------------------------------------*/ + if( A_stencil_size <= 15) + { + RAP_stencil_size = 15; + RAP_stencil_shape = 
hypre_CTAlloc(hypre_Index, RAP_stencil_size); + for (k = -1; k < 2; k++) + { + for (j = -1; j < 2; j++) + { + for (i = -1; i < 2; i++) + { + + /*-------------------------------------------------------- + * Storage for c,w,e,n,s elements in each plane + *--------------------------------------------------------*/ + if( i*j == 0 ) + { + hypre_SetIndex(RAP_stencil_shape[stencil_rank],i,j,k); + stencil_rank++; + } + } + } + } + } + + /*-------------------------------------------------------------------- + * 19 or 27 point fine grid stencil produces 27 point RAP + *--------------------------------------------------------------------*/ + else + { + RAP_stencil_size = 27; + RAP_stencil_shape = hypre_CTAlloc(hypre_Index, RAP_stencil_size); + for (k = -1; k < 2; k++) + { + for (j = -1; j < 2; j++) + { + for (i = -1; i < 2; i++) + { + + /*-------------------------------------------------------- + * Storage for 9 elements (c,w,e,n,s,sw,se,nw,se) in + * each plane + *--------------------------------------------------------*/ + hypre_SetIndex(RAP_stencil_shape[stencil_rank],i,j,k); + stencil_rank++; + } + } + } + } + } + + /*----------------------------------------------------------------------- + * symmetric case + *-----------------------------------------------------------------------*/ + + else + { + + /*-------------------------------------------------------------------- + * 7 or 15 point fine grid stencil produces 15 point RAP + * Only store the lower triangular part + diagonal = 8 entries, + * lower triangular means the lower triangular part on the matrix + * in the standard lexicalgraphic ordering. 
+ *--------------------------------------------------------------------*/ + if( A_stencil_size <= 15) + { + RAP_stencil_size = 8; + RAP_stencil_shape = hypre_CTAlloc(hypre_Index, RAP_stencil_size); + for (k = -1; k < 1; k++) + { + for (j = -1; j < 2; j++) + { + for (i = -1; i < 2; i++) + { + + /*-------------------------------------------------------- + * Store 5 elements in lower plane (c,w,e,s,n) + * and 3 elements in same plane (c,w,s) + *--------------------------------------------------------*/ + if( i*j == 0 && i+j+k <= 0) + { + hypre_SetIndex(RAP_stencil_shape[stencil_rank],i,j,k); + stencil_rank++; + } + } + } + } + } + + /*-------------------------------------------------------------------- + * 19 or 27 point fine grid stencil produces 27 point RAP + * Only store the lower triangular part + diagonal = 14 entries, + * lower triangular means the lower triangular part on the matrix + * in the standard lexicalgraphic ordering. + *--------------------------------------------------------------------*/ + else + { + RAP_stencil_size = 14; + RAP_stencil_shape = hypre_CTAlloc(hypre_Index, RAP_stencil_size); + for (k = -1; k < 1; k++) + { + for (j = -1; j < 2; j++) + { + for (i = -1; i < 2; i++) + { + + /*-------------------------------------------------------- + * Store 9 elements in lower plane (c,w,e,s,n,sw,se,nw,ne) + * and 5 elements in same plane (c,w,s,sw,se) + *--------------------------------------------------------*/ + if( k < 0 || (i+j+k <=0 && j < 1) ) + { + hypre_SetIndex(RAP_stencil_shape[stencil_rank],i,j,k); + stencil_rank++; + } + } + } + } + } + } + + RAP_stencil = hypre_StructStencilCreate(RAP_stencil_dim, RAP_stencil_size, + RAP_stencil_shape); + RAP = hypre_StructMatrixCreate(hypre_StructMatrixComm(A), + coarse_grid, RAP_stencil); + + hypre_StructStencilDestroy(RAP_stencil); + + /*----------------------------------------------------------------------- + * Coarse operator in symmetric iff fine operator is + 
*-----------------------------------------------------------------------*/ + hypre_StructMatrixSymmetric(RAP) = hypre_StructMatrixSymmetric(A); + + /*----------------------------------------------------------------------- + * Set number of ghost points + *-----------------------------------------------------------------------*/ + if (hypre_StructMatrixSymmetric(A)) + { + RAP_num_ghost[1] = 0; + RAP_num_ghost[3] = 0; + RAP_num_ghost[5] = 0; + } + hypre_StructMatrixSetNumGhost(RAP, RAP_num_ghost); + + return RAP; + } + + /*-------------------------------------------------------------------------- + * Routines to build RAP. These routines are fairly general + * 1) No assumptions about symmetry of A + * 2) No assumption that R = transpose(P) + * 3) 7,15,19 or 27-point fine grid A + * + * I am, however, assuming that the c-to-c interpolation is the identity. + * + * I've written a two routines - hypre_SMG3BuildRAPSym to build the lower + * triangular part of RAP (including the diagonal) and + * hypre_SMG3BuildRAPNoSym to build the upper triangular part of RAP + * (excluding the diagonal). So using symmetric storage, only the first + * routine would be called. With full storage both would need to be called. 
+ * + *--------------------------------------------------------------------------*/ + + int + hypre_SMG3BuildRAPSym( hypre_StructMatrix *A, + hypre_StructMatrix *PT, + hypre_StructMatrix *R, + hypre_StructMatrix *RAP, + hypre_Index cindex, + hypre_Index cstride ) + + { + + hypre_Index index; + + hypre_StructStencil *fine_stencil; + int fine_stencil_size; + + hypre_StructGrid *fgrid; + int *fgrid_ids; + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + int *cgrid_ids; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index fstart; + hypre_IndexRef stridef; + hypre_Index loop_size; + + int fi, ci; + int loopi, loopj, loopk; + + hypre_Box *A_dbox; + hypre_Box *PT_dbox; + hypre_Box *R_dbox; + hypre_Box *RAP_dbox; + + double *pa, *pb; + double *ra, *rb; + + double *a_cc, *a_cw, *a_ce, *a_cs, *a_cn; + double *a_ac, *a_aw, *a_as; + double *a_bc, *a_bw, *a_be, *a_bs, *a_bn; + double *a_csw, *a_cse, *a_cnw, *a_cne; + double *a_asw, *a_ase; + double *a_bsw, *a_bse, *a_bnw, *a_bne; + + double *rap_cc, *rap_cw, *rap_cs; + double *rap_bc, *rap_bw, *rap_be, *rap_bs, *rap_bn; + double *rap_csw, *rap_cse; + double *rap_bsw, *rap_bse, *rap_bnw, *rap_bne; + + int iA, iAm1, iAp1; + int iAc; + int iP, iP1; + int iR; + + int zOffsetA; + int xOffsetP; + int yOffsetP; + int zOffsetP; + + int ierr = 0; + + fine_stencil = hypre_StructMatrixStencil(A); + fine_stencil_size = hypre_StructStencilSize(fine_stencil); + + stridef = cstride; + hypre_SetIndex(stridec, 1, 1, 1); + + fgrid = hypre_StructMatrixGrid(A); + fgrid_ids = hypre_StructGridIDs(fgrid); + + cgrid = hypre_StructMatrixGrid(RAP); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + + fi = 0; + hypre_ForBoxI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + hypre_StructMapCoarseToFine(cstart, cindex, cstride, fstart); + + A_dbox = 
hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), fi); + PT_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(PT), fi); + R_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(R), fi); + RAP_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(RAP), ci); + + /*----------------------------------------------------------------- + * Extract pointers for interpolation operator: + * pa is pointer for weight for f-point above c-point + * pb is pointer for weight for f-point below c-point + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,1); + pa = hypre_StructMatrixExtractPointerByIndex(PT, fi, index); + + hypre_SetIndex(index,0,0,-1); + pb = hypre_StructMatrixExtractPointerByIndex(PT, fi, index); + + /*----------------------------------------------------------------- + * Extract pointers for restriction operator: + * ra is pointer for weight for f-point above c-point + * rb is pointer for weight for f-point below c-point + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,1); + ra = hypre_StructMatrixExtractPointerByIndex(R, fi, index); + + hypre_SetIndex(index,0,0,-1); + rb = hypre_StructMatrixExtractPointerByIndex(R, fi, index); + + /*----------------------------------------------------------------- + * Extract pointers for 7-point fine grid operator: + * + * a_cc is pointer for center coefficient + * a_cw is pointer for west coefficient in same plane + * a_ce is pointer for east coefficient in same plane + * a_cs is pointer for south coefficient in same plane + * a_cn is pointer for north coefficient in same plane + * a_ac is pointer for center coefficient in plane above + * a_bc is pointer for center coefficient in plane below + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + a_cc = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,0,0); + a_cw = 
hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,0,0); + a_ce = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,-1,0); + a_cs = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,1,0); + a_cn = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,0,1); + a_ac = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,0,-1); + a_bc = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + /*----------------------------------------------------------------- + * Extract additional pointers for 15-point fine grid operator: + * + * a_aw is pointer for west coefficient in plane above + * a_ae is pointer for east coefficient in plane above + * a_as is pointer for south coefficient in plane above + * a_an is pointer for north coefficient in plane above + * a_bw is pointer for west coefficient in plane below + * a_be is pointer for east coefficient in plane below + * a_bs is pointer for south coefficient in plane below + * a_bn is pointer for north coefficient in plane below + *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 7) + { + hypre_SetIndex(index,-1,0,1); + a_aw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,-1,1); + a_as = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,0,-1); + a_bw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,0,-1); + a_be = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,-1,-1); + a_bs = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,1,-1); + a_bn = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + } + + /*----------------------------------------------------------------- + * Extract additional pointers for 19-point fine grid operator: + * + * a_csw is pointer for 
southwest coefficient in same plane + * a_cse is pointer for southeast coefficient in same plane + * a_cnw is pointer for northwest coefficient in same plane + * a_cne is pointer for northeast coefficient in same plane + *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 15) + { + hypre_SetIndex(index,-1,-1,0); + a_csw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,-1,0); + a_cse = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,1,0); + a_cnw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,1,0); + a_cne = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + } + + /*----------------------------------------------------------------- + * Extract additional pointers for 27-point fine grid operator: + * + * a_asw is pointer for southwest coefficient in plane above + * a_ase is pointer for southeast coefficient in plane above + * a_anw is pointer for northwest coefficient in plane above + * a_ane is pointer for northeast coefficient in plane above + * a_bsw is pointer for southwest coefficient in plane below + * a_bse is pointer for southeast coefficient in plane below + * a_bnw is pointer for northwest coefficient in plane below + * a_bne is pointer for northeast coefficient in plane below + *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 19) + { + hypre_SetIndex(index,-1,-1,1); + a_asw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,-1,1); + a_ase = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,-1,-1); + a_bsw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,-1,-1); + a_bse = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,1,-1); + a_bnw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,1,-1); 
+ a_bne = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + } + + /*----------------------------------------------------------------- + * Extract pointers for 15-point coarse grid operator: + * + * We build only the lower triangular part (plus diagonal). + * + * rap_cc is pointer for center coefficient (etc.) + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + rap_cc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,0); + rap_cw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,0); + rap_cs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,0,-1); + rap_bc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,-1); + rap_bw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,0,-1); + rap_be = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,-1); + rap_bs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,-1); + rap_bn = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + /*----------------------------------------------------------------- + * Extract additional pointers for 27-point coarse grid operator: + * + * A 27-point coarse grid operator is produced when the fine grid + * stencil is 19 or 27 point. + * + * We build only the lower triangular part. + * + * rap_csw is pointer for southwest coefficient in same plane (etc.) 
+ *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 15) + { + hypre_SetIndex(index,-1,-1,0); + rap_csw = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,0); + rap_cse = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,-1,-1); + rap_bsw = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,-1); + rap_bse = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,1,-1); + rap_bnw = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,-1); + rap_bne = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + } + + /*----------------------------------------------------------------- + * Define offsets for fine grid stencil and interpolation + * + * In the BoxLoop below I assume iA and iP refer to data associated + * with the point which we are building the stencil for. The below + * Offsets are used in refering to data associated with other points. + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,1); + zOffsetA = hypre_BoxOffsetDistance(A_dbox,index); + zOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + hypre_SetIndex(index,0,1,0); + yOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + hypre_SetIndex(index,1,0,0); + xOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + + /*-------------------------------------------------------------------- + * Switch statement to direct control to apropriate BoxLoop depending + * on stencil size. Default is full 27-point. + *-----------------------------------------------------------------*/ + + switch (fine_stencil_size) + { + + /*-------------------------------------------------------------- + * Loop for symmetric 7-point fine grid operator; produces a + * symmetric 15-point coarse grid operator. 
We calculate only the + * lower triangular stencil entries: (below-south, below-west, + * below-center, below-east, below-north, center-south, + * center-west, and center-center). + *--------------------------------------------------------------*/ + + case 7: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - zOffsetA; + iAp1 = iA + zOffsetA; + + iP1 = iP - zOffsetP - yOffsetP; + rap_bs[iAc] = rb[iR] * a_cs[iAm1] * pa[iP1]; + + iP1 = iP - zOffsetP - xOffsetP; + rap_bw[iAc] = rb[iR] * a_cw[iAm1] * pa[iP1]; + + iP1 = iP - zOffsetP; + rap_bc[iAc] = a_bc[iA] * pa[iP1] + + rb[iR] * a_cc[iAm1] * pa[iP1] + + rb[iR] * a_bc[iAm1]; + + iP1 = iP - zOffsetP + xOffsetP; + rap_be[iAc] = rb[iR] * a_ce[iAm1] * pa[iP1]; + + iP1 = iP - zOffsetP + yOffsetP; + rap_bn[iAc] = rb[iR] * a_cn[iAm1] * pa[iP1]; + + iP1 = iP - yOffsetP; + rap_cs[iAc] = a_cs[iA] + + rb[iR] * a_cs[iAm1] * pb[iP1] + + ra[iR] * a_cs[iAp1] * pa[iP1]; + + iP1 = iP - xOffsetP; + rap_cw[iAc] = a_cw[iA] + + rb[iR] * a_cw[iAm1] * pb[iP1] + + ra[iR] * a_cw[iAp1] * pa[iP1]; + + rap_cc[iAc] = a_cc[iA] + + rb[iR] * a_cc[iAm1] * pb[iP] + + ra[iR] * a_cc[iAp1] * pa[iP] + + rb[iR] * a_ac[iAm1] + + ra[iR] * a_bc[iAp1] + + a_bc[iA] * pb[iP] + + a_ac[iA] * pa[iP]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + /*-------------------------------------------------------------- + * Loop for symmetric 15-point fine grid operator; produces a + * symmetric 15-point coarse grid operator. We calculate only the + * lower triangular stencil entries: (below-south, below-west, + * below-center, below-east, below-north, center-south, + * center-west, and center-center). 
+ *--------------------------------------------------------------*/ + + case 15: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - zOffsetA; + iAp1 = iA + zOffsetA; + + iP1 = iP - zOffsetP - yOffsetP; + rap_bs[iAc] = rb[iR] * a_cs[iAm1] * pa[iP1] + + rb[iR] * a_bs[iAm1] + + a_bs[iA] * pa[iP1]; + + iP1 = iP - zOffsetP - xOffsetP; + rap_bw[iAc] = rb[iR] * a_cw[iAm1] * pa[iP1] + + rb[iR] * a_bw[iAm1] + + a_bw[iA] * pa[iP1]; + + iP1 = iP - zOffsetP; + rap_bc[iAc] = a_bc[iA] * pa[iP1] + + rb[iR] * a_cc[iAm1] * pa[iP1] + + rb[iR] * a_bc[iAm1]; + + iP1 = iP - zOffsetP + xOffsetP; + rap_be[iAc] = rb[iR] * a_ce[iAm1] * pa[iP1] + + rb[iR] * a_be[iAm1] + + a_be[iA] * pa[iP1]; + + iP1 = iP - zOffsetP + yOffsetP; + rap_bn[iAc] = rb[iR] * a_cn[iAm1] * pa[iP1] + + rb[iR] * a_bn[iAm1] + + a_bn[iA] * pa[iP1]; + + iP1 = iP - yOffsetP; + rap_cs[iAc] = a_cs[iA] + + rb[iR] * a_cs[iAm1] * pb[iP1] + + ra[iR] * a_cs[iAp1] * pa[iP1] + + a_bs[iA] * pb[iP1] + + a_as[iA] * pa[iP1] + + rb[iR] * a_as[iAm1] + + ra[iR] * a_bs[iAp1]; + + iP1 = iP - xOffsetP; + rap_cw[iAc] = a_cw[iA] + + rb[iR] * a_cw[iAm1] * pb[iP1] + + ra[iR] * a_cw[iAp1] * pa[iP1] + + a_bw[iA] * pb[iP1] + + a_aw[iA] * pa[iP1] + + rb[iR] * a_aw[iAm1] + + ra[iR] * a_bw[iAp1]; + + rap_cc[iAc] = a_cc[iA] + + rb[iR] * a_cc[iAm1] * pb[iP] + + ra[iR] * a_cc[iAp1] * pa[iP] + + rb[iR] * a_ac[iAm1] + + ra[iR] * a_bc[iAp1] + + a_bc[iA] * pb[iP] + + a_ac[iA] * pa[iP]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + /*-------------------------------------------------------------- + * Loop for symmetric 19-point fine grid operator; produces a + * symmetric 27-point coarse grid operator. 
We calculate only the + * lower triangular stencil entries: (below-southwest, below-south, + * below-southeast, below-west, below-center, below-east, + * below-northwest, below-north, below-northeast, center-southwest, + * center-south, center-southeast, center-west, and center-center). + *--------------------------------------------------------------*/ + + case 19: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - zOffsetA; + iAp1 = iA + zOffsetA; + + iP1 = iP - zOffsetP - yOffsetP - xOffsetP; + rap_bsw[iAc] = rb[iR] * a_csw[iAm1] * pa[iP1]; + + iP1 = iP - zOffsetP - yOffsetP; + rap_bs[iAc] = rb[iR] * a_cs[iAm1] * pa[iP1] + + rb[iR] * a_bs[iAm1] + + a_bs[iA] * pa[iP1]; + + iP1 = iP - zOffsetP - yOffsetP + xOffsetP; + rap_bse[iAc] = rb[iR] * a_cse[iAm1] * pa[iP1]; + + iP1 = iP - zOffsetP - xOffsetP; + rap_bw[iAc] = rb[iR] * a_cw[iAm1] * pa[iP1] + + rb[iR] * a_bw[iAm1] + + a_bw[iA] * pa[iP1]; + + iP1 = iP - zOffsetP; + rap_bc[iAc] = a_bc[iA] * pa[iP1] + + rb[iR] * a_cc[iAm1] * pa[iP1] + + rb[iR] * a_bc[iAm1]; + + iP1 = iP - zOffsetP + xOffsetP; + rap_be[iAc] = rb[iR] * a_ce[iAm1] * pa[iP1] + + rb[iR] * a_be[iAm1] + + a_be[iA] * pa[iP1]; + + iP1 = iP - zOffsetP + yOffsetP - xOffsetP; + rap_bnw[iAc] = rb[iR] * a_cnw[iAm1] * pa[iP1]; + + iP1 = iP - zOffsetP + yOffsetP; + rap_bn[iAc] = rb[iR] * a_cn[iAm1] * pa[iP1] + + rb[iR] * a_bn[iAm1] + + a_bn[iA] * pa[iP1]; + + iP1 = iP - zOffsetP + yOffsetP + xOffsetP; + rap_bne[iAc] = rb[iR] * a_cne[iAm1] * pa[iP1]; + + iP1 = iP - yOffsetP - xOffsetP; + rap_csw[iAc] = a_csw[iA] + + rb[iR] * a_csw[iAm1] * pb[iP1] + + ra[iR] * a_csw[iAp1] * pa[iP1]; + + iP1 = iP - yOffsetP; + rap_cs[iAc] = 
a_cs[iA] + + rb[iR] * a_cs[iAm1] * pb[iP1] + + ra[iR] * a_cs[iAp1] * pa[iP1] + + a_bs[iA] * pb[iP1] + + a_as[iA] * pa[iP1] + + rb[iR] * a_as[iAm1] + + ra[iR] * a_bs[iAp1]; + + iP1 = iP - yOffsetP + xOffsetP; + rap_cse[iAc] = a_cse[iA] + + rb[iR] * a_cse[iAm1] * pb[iP1] + + ra[iR] * a_cse[iAp1] * pa[iP1]; + + iP1 = iP - xOffsetP; + rap_cw[iAc] = a_cw[iA] + + rb[iR] * a_cw[iAm1] * pb[iP1] + + ra[iR] * a_cw[iAp1] * pa[iP1] + + a_bw[iA] * pb[iP1] + + a_aw[iA] * pa[iP1] + + rb[iR] * a_aw[iAm1] + + ra[iR] * a_bw[iAp1]; + + rap_cc[iAc] = a_cc[iA] + + rb[iR] * a_cc[iAm1] * pb[iP] + + ra[iR] * a_cc[iAp1] * pa[iP] + + rb[iR] * a_ac[iAm1] + + ra[iR] * a_bc[iAp1] + + a_bc[iA] * pb[iP] + + a_ac[iA] * pa[iP]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + /*-------------------------------------------------------------- + * Loop for symmetric 27-point fine grid operator; produces a + * symmetric 27-point coarse grid operator. We calculate only the + * lower triangular stencil entries: (below-southwest, below-south, + * below-southeast, below-west, below-center, below-east, + * below-northwest, below-north, below-northeast, center-southwest, + * center-south, center-southeast, center-west, and center-center). 
+ *--------------------------------------------------------------*/ + + default: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - zOffsetA; + iAp1 = iA + zOffsetA; + + iP1 = iP - zOffsetP - yOffsetP - xOffsetP; + rap_bsw[iAc] = rb[iR] * a_csw[iAm1] * pa[iP1] + + rb[iR] * a_bsw[iAm1] + + a_bsw[iA] * pa[iP1]; + + iP1 = iP - zOffsetP - yOffsetP; + rap_bs[iAc] = rb[iR] * a_cs[iAm1] * pa[iP1] + + rb[iR] * a_bs[iAm1] + + a_bs[iA] * pa[iP1]; + + iP1 = iP - zOffsetP - yOffsetP + xOffsetP; + rap_bse[iAc] = rb[iR] * a_cse[iAm1] * pa[iP1] + + rb[iR] * a_bse[iAm1] + + a_bse[iA] * pa[iP1]; + + iP1 = iP - zOffsetP - xOffsetP; + rap_bw[iAc] = rb[iR] * a_cw[iAm1] * pa[iP1] + + rb[iR] * a_bw[iAm1] + + a_bw[iA] * pa[iP1]; + + iP1 = iP - zOffsetP; + rap_bc[iAc] = a_bc[iA] * pa[iP1] + + rb[iR] * a_cc[iAm1] * pa[iP1] + + rb[iR] * a_bc[iAm1]; + + iP1 = iP - zOffsetP + xOffsetP; + rap_be[iAc] = rb[iR] * a_ce[iAm1] * pa[iP1] + + rb[iR] * a_be[iAm1] + + a_be[iA] * pa[iP1]; + + iP1 = iP - zOffsetP + yOffsetP - xOffsetP; + rap_bnw[iAc] = rb[iR] * a_cnw[iAm1] * pa[iP1] + + rb[iR] * a_bnw[iAm1] + + a_bnw[iA] * pa[iP1]; + + iP1 = iP - zOffsetP + yOffsetP; + rap_bn[iAc] = rb[iR] * a_cn[iAm1] * pa[iP1] + + rb[iR] * a_bn[iAm1] + + a_bn[iA] * pa[iP1]; + + iP1 = iP - zOffsetP + yOffsetP + xOffsetP; + rap_bne[iAc] = rb[iR] * a_cne[iAm1] * pa[iP1] + + rb[iR] * a_bne[iAm1] + + a_bne[iA] * pa[iP1]; + + iP1 = iP - yOffsetP - xOffsetP; + rap_csw[iAc] = a_csw[iA] + + rb[iR] * a_csw[iAm1] * pb[iP1] + + ra[iR] * a_csw[iAp1] * pa[iP1] + + a_bsw[iA] * pb[iP1] + + a_asw[iA] * pa[iP1] + + rb[iR] * a_asw[iAm1] + + ra[iR] * a_bsw[iAp1]; + + iP1 = iP - yOffsetP; + 
rap_cs[iAc] = a_cs[iA] + + rb[iR] * a_cs[iAm1] * pb[iP1] + + ra[iR] * a_cs[iAp1] * pa[iP1] + + a_bs[iA] * pb[iP1] + + a_as[iA] * pa[iP1] + + rb[iR] * a_as[iAm1] + + ra[iR] * a_bs[iAp1]; + + iP1 = iP - yOffsetP + xOffsetP; + rap_cse[iAc] = a_cse[iA] + + rb[iR] * a_cse[iAm1] * pb[iP1] + + ra[iR] * a_cse[iAp1] * pa[iP1] + + a_bse[iA] * pb[iP1] + + a_ase[iA] * pa[iP1] + + rb[iR] * a_ase[iAm1] + + ra[iR] * a_bse[iAp1]; + + iP1 = iP - xOffsetP; + rap_cw[iAc] = a_cw[iA] + + rb[iR] * a_cw[iAm1] * pb[iP1] + + ra[iR] * a_cw[iAp1] * pa[iP1] + + a_bw[iA] * pb[iP1] + + a_aw[iA] * pa[iP1] + + rb[iR] * a_aw[iAm1] + + ra[iR] * a_bw[iAp1]; + + rap_cc[iAc] = a_cc[iA] + + rb[iR] * a_cc[iAm1] * pb[iP] + + ra[iR] * a_cc[iAp1] * pa[iP] + + rb[iR] * a_ac[iAm1] + + ra[iR] * a_bc[iAp1] + + a_bc[iA] * pb[iP] + + a_ac[iA] * pa[iP]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + } /* end switch statement */ + + } /* end ForBoxI */ + + return ierr; + } + + /*-------------------------------------------------------------------------- + *--------------------------------------------------------------------------*/ + + int + hypre_SMG3BuildRAPNoSym( hypre_StructMatrix *A, + hypre_StructMatrix *PT, + hypre_StructMatrix *R, + hypre_StructMatrix *RAP, + hypre_Index cindex, + hypre_Index cstride ) + + { + + hypre_Index index; + + hypre_StructStencil *fine_stencil; + int fine_stencil_size; + + hypre_StructGrid *fgrid; + int *fgrid_ids; + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + int *cgrid_ids; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index fstart; + hypre_IndexRef stridef; + hypre_Index loop_size; + + int fi, ci; + int loopi, loopj, loopk; + + hypre_Box *A_dbox; + hypre_Box *PT_dbox; + hypre_Box *R_dbox; + hypre_Box *RAP_dbox; + + double *pa, *pb; + double *ra, *rb; + + double *a_cc, *a_cw, *a_ce, *a_cs, *a_cn; + double *a_ac, *a_aw, *a_ae, *a_as, *a_an; + double *a_be, *a_bn; + double *a_csw, *a_cse, *a_cnw, *a_cne; + double *a_asw, 
*a_ase, *a_anw, *a_ane; + double *a_bnw, *a_bne; + + double *rap_ce, *rap_cn; + double *rap_ac, *rap_aw, *rap_ae, *rap_as, *rap_an; + double *rap_cnw, *rap_cne; + double *rap_asw, *rap_ase, *rap_anw, *rap_ane; + + int iA, iAm1, iAp1; + int iAc; + int iP, iP1; + int iR; + + int zOffsetA; + int xOffsetP; + int yOffsetP; + int zOffsetP; + + int ierr = 0; + + fine_stencil = hypre_StructMatrixStencil(A); + fine_stencil_size = hypre_StructStencilSize(fine_stencil); + + stridef = cstride; + hypre_SetIndex(stridec, 1, 1, 1); + + fgrid = hypre_StructMatrixGrid(A); + fgrid_ids = hypre_StructGridIDs(fgrid); + + cgrid = hypre_StructMatrixGrid(RAP); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + cgrid_ids = hypre_StructGridIDs(cgrid); + + fi = 0; + hypre_ForBoxI(ci, cgrid_boxes) + { + while (fgrid_ids[fi] != cgrid_ids[ci]) + { + fi++; + } + + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + hypre_StructMapCoarseToFine(cstart, cindex, cstride, fstart); + + A_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), fi); + PT_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(PT), fi); + R_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(R), fi); + RAP_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(RAP), ci); + + /*----------------------------------------------------------------- + * Extract pointers for interpolation operator: + * pa is pointer for weight for f-point above c-point + * pb is pointer for weight for f-point below c-point + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,1); + pa = hypre_StructMatrixExtractPointerByIndex(PT, fi, index); + + hypre_SetIndex(index,0,0,-1); + pb = hypre_StructMatrixExtractPointerByIndex(PT, fi, index); + + + /*----------------------------------------------------------------- + * Extract pointers for restriction operator: + * ra is pointer for weight for f-point above c-point + * rb is pointer for weight for f-point below c-point + 
*-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,1); + ra = hypre_StructMatrixExtractPointerByIndex(R, fi, index); + + hypre_SetIndex(index,0,0,-1); + rb = hypre_StructMatrixExtractPointerByIndex(R, fi, index); + + + /*----------------------------------------------------------------- + * Extract pointers for 7-point fine grid operator: + * + * a_cc is pointer for center coefficient + * a_cw is pointer for west coefficient in same plane + * a_ce is pointer for east coefficient in same plane + * a_cs is pointer for south coefficient in same plane + * a_cn is pointer for north coefficient in same plane + * a_ac is pointer for center coefficient in plane above + * a_bc is pointer for center coefficient in plane below + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,0); + a_cc = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,0,0); + a_cw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,0,0); + a_ce = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,-1,0); + a_cs = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,1,0); + a_cn = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,0,1); + a_ac = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + /*----------------------------------------------------------------- + * Extract additional pointers for 15-point fine grid operator: + * + * a_aw is pointer for west coefficient in plane above + * a_ae is pointer for east coefficient in plane above + * a_as is pointer for south coefficient in plane above + * a_an is pointer for north coefficient in plane above + * a_bw is pointer for west coefficient in plane below + * a_be is pointer for east coefficient in plane below + * a_bs is pointer for south coefficient in plane below + * a_bn is pointer for north 
coefficient in plane below + *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 7) + { + hypre_SetIndex(index,-1,0,1); + a_aw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,0,1); + a_ae = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,-1,1); + a_as = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,1,1); + a_an = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,0,-1); + a_be = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,0,1,-1); + a_bn = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + } + + /*----------------------------------------------------------------- + * Extract additional pointers for 19-point fine grid operator: + * + * a_csw is pointer for southwest coefficient in same plane + * a_cse is pointer for southeast coefficient in same plane + * a_cnw is pointer for northwest coefficient in same plane + * a_cne is pointer for northeast coefficient in same plane + *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 15) + { + hypre_SetIndex(index,-1,-1,0); + a_csw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,-1,0); + a_cse = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,1,0); + a_cnw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,1,0); + a_cne = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + } + + /*----------------------------------------------------------------- + * Extract additional pointers for 27-point fine grid operator: + * + * a_asw is pointer for southwest coefficient in plane above + * a_ase is pointer for southeast coefficient in plane above + * a_anw is pointer for northwest coefficient in plane above + * a_ane is pointer for northeast coefficient in plane 
above + * a_bsw is pointer for southwest coefficient in plane below + * a_bse is pointer for southeast coefficient in plane below + * a_bnw is pointer for northwest coefficient in plane below + * a_bne is pointer for northeast coefficient in plane below + *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 19) + { + hypre_SetIndex(index,-1,-1,1); + a_asw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,-1,1); + a_ase = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,1,1); + a_anw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,1,1); + a_ane = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,-1,1,-1); + a_bnw = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + hypre_SetIndex(index,1,1,-1); + a_bne = hypre_StructMatrixExtractPointerByIndex(A, fi, index); + + } + + /*----------------------------------------------------------------- + * Extract pointers for 15-point coarse grid operator: + * + * We build only the upper triangular part (excluding diagonal). + * + * rap_ce is pointer for east coefficient in same plane (etc.) 
+ *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,1,0,0); + rap_ce = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,0); + rap_cn = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,0,1); + rap_ac = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,1); + rap_aw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,0,1); + rap_ae = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,1); + rap_as = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,1); + rap_an = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + /*----------------------------------------------------------------- + * Extract additional pointers for 27-point coarse grid operator: + * + * A 27-point coarse grid operator is produced when the fine grid + * stencil is 19 or 27 point. + * + * We build only the upper triangular part. + * + * rap_cnw is pointer for northwest coefficient in same plane (etc.) 
+ *-----------------------------------------------------------------*/ + + if(fine_stencil_size > 15) + { + hypre_SetIndex(index,-1,1,0); + rap_cnw = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,0); + rap_cne = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,-1,1); + rap_asw = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,1); + rap_ase = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,1,1); + rap_anw = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,1); + rap_ane = + hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + } + + /*----------------------------------------------------------------- + * Define offsets for fine grid stencil and interpolation + * + * In the BoxLoop below I assume iA and iP refer to data associated + * with the point which we are building the stencil for. The below + * Offsets are used in referring to data associated with other points. + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,1); + zOffsetA = hypre_BoxOffsetDistance(A_dbox,index); + zOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + hypre_SetIndex(index,0,1,0); + yOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + hypre_SetIndex(index,1,0,0); + xOffsetP = hypre_BoxOffsetDistance(PT_dbox,index); + + /*----------------------------------------------------------------- + * Switch statement to direct control to appropriate BoxLoop depending + * on stencil size. Default is full 27-point. + *-----------------------------------------------------------------*/ + + switch (fine_stencil_size) + { + + /*-------------------------------------------------------------- + * Loop for 7-point fine grid operator; produces upper triangular + * part of 15-point coarse grid operator. 
stencil entries: + * (above-north, above-east, above-center, above-west, + * above-south, center-north, and center-east). + *--------------------------------------------------------------*/ + + case 7: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - zOffsetA; + iAp1 = iA + zOffsetA; + + iP1 = iP + zOffsetP + yOffsetP; + rap_an[iAc] = ra[iR] * a_cn[iAp1] * pb[iP1]; + + iP1 = iP + zOffsetP + xOffsetP; + rap_ae[iAc] = ra[iR] * a_ce[iAp1] * pb[iP1]; + + iP1 = iP + zOffsetP; + rap_ac[iAc] = a_ac[iA] * pb[iP1] + + ra[iR] * a_cc[iAp1] * pb[iP1] + + ra[iR] * a_ac[iAp1]; + + iP1 = iP + zOffsetP - xOffsetP; + rap_aw[iAc] = ra[iR] * a_cw[iAp1] * pb[iP1]; + + iP1 = iP + zOffsetP - yOffsetP; + rap_as[iAc] = ra[iR] * a_cs[iAp1] * pb[iP1]; + + iP1 = iP + yOffsetP; + rap_cn[iAc] = a_cn[iA] + + rb[iR] * a_cn[iAm1] * pb[iP1] + + ra[iR] * a_cn[iAp1] * pa[iP1]; + + iP1 = iP + xOffsetP; + rap_ce[iAc] = a_ce[iA] + + rb[iR] * a_ce[iAm1] * pb[iP1] + + ra[iR] * a_ce[iAp1] * pa[iP1]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + /*-------------------------------------------------------------- + * Loop for 15-point fine grid operator; produces upper triangular + * part of 15-point coarse grid operator. stencil entries: + * (above-north, above-east, above-center, above-west, + * above-south, center-north, and center-east). 
+ *--------------------------------------------------------------*/ + + case 15: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - zOffsetA; + iAp1 = iA + zOffsetA; + + iP1 = iP + zOffsetP + yOffsetP; + rap_an[iAc] = ra[iR] * a_cn[iAp1] * pb[iP1] + + ra[iR] * a_an[iAp1] + + a_an[iA] * pb[iP1]; + + iP1 = iP + zOffsetP + xOffsetP; + rap_ae[iAc] = ra[iR] * a_ce[iAp1] * pb[iP1] + + ra[iR] * a_ae[iAp1] + + a_ae[iA] * pb[iP1]; + + iP1 = iP + zOffsetP; + rap_ac[iAc] = a_ac[iA] * pb[iP1] + + ra[iR] * a_cc[iAp1] * pb[iP1] + + ra[iR] * a_ac[iAp1]; + + iP1 = iP + zOffsetP - xOffsetP; + rap_aw[iAc] = ra[iR] * a_cw[iAp1] * pb[iP1] + + ra[iR] * a_aw[iAp1] + + a_aw[iA] * pb[iP1]; + + iP1 = iP + zOffsetP - yOffsetP; + rap_as[iAc] = ra[iR] * a_cs[iAp1] * pb[iP1] + + ra[iR] * a_as[iAp1] + + a_as[iA] * pb[iP1]; + + iP1 = iP + yOffsetP; + rap_cn[iAc] = a_cn[iA] + + rb[iR] * a_cn[iAm1] * pb[iP1] + + ra[iR] * a_cn[iAp1] * pa[iP1] + + a_bn[iA] * pb[iP1] + + a_an[iA] * pa[iP1] + + rb[iR] * a_an[iAm1] + + ra[iR] * a_bn[iAp1]; + + iP1 = iP + xOffsetP; + rap_ce[iAc] = a_ce[iA] + + rb[iR] * a_ce[iAm1] * pb[iP1] + + ra[iR] * a_ce[iAp1] * pa[iP1] + + a_be[iA] * pb[iP1] + + a_ae[iA] * pa[iP1] + + rb[iR] * a_ae[iAm1] + + ra[iR] * a_be[iAp1]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + + /*-------------------------------------------------------------- + * Loop for 19-point fine grid operator; produces upper triangular + * part of 27-point coarse grid operator. 
stencil entries: + * (above-northeast, above-north, above-northwest, above-east, + * above-center, above-west, above-southeast, above-south, + * above-southwest, center-northeast, center-north, + * center-northwest, and center-east). + *--------------------------------------------------------------*/ + + case 19: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - zOffsetA; + iAp1 = iA + zOffsetA; + + iP1 = iP + zOffsetP + yOffsetP + xOffsetP; + rap_ane[iAc] = ra[iR] * a_cne[iAp1] * pb[iP1]; + + iP1 = iP + zOffsetP + yOffsetP; + rap_an[iAc] = ra[iR] * a_cn[iAp1] * pb[iP1] + + ra[iR] * a_an[iAp1] + + a_an[iA] * pb[iP1]; + + iP1 = iP + zOffsetP + yOffsetP - xOffsetP; + rap_anw[iAc] = ra[iR] * a_cnw[iAp1] * pb[iP1]; + + iP1 = iP + zOffsetP + xOffsetP; + rap_ae[iAc] = ra[iR] * a_ce[iAp1] * pb[iP1] + + ra[iR] * a_ae[iAp1] + + a_ae[iA] * pb[iP1]; + + iP1 = iP + zOffsetP; + rap_ac[iAc] = a_ac[iA] * pb[iP1] + + ra[iR] * a_cc[iAp1] * pb[iP1] + + ra[iR] * a_ac[iAp1]; + + iP1 = iP + zOffsetP - xOffsetP; + rap_aw[iAc] = ra[iR] * a_cw[iAp1] * pb[iP1] + + ra[iR] * a_aw[iAp1] + + a_aw[iA] * pb[iP1]; + + iP1 = iP + zOffsetP - yOffsetP + xOffsetP; + rap_ase[iAc] = ra[iR] * a_cse[iAp1] * pb[iP1]; + + iP1 = iP + zOffsetP - yOffsetP; + rap_as[iAc] = ra[iR] * a_cs[iAp1] * pb[iP1] + + ra[iR] * a_as[iAp1] + + a_as[iA] * pb[iP1]; + + iP1 = iP + zOffsetP - yOffsetP - xOffsetP; + rap_asw[iAc] = ra[iR] * a_csw[iAp1] * pb[iP1]; + + iP1 = iP + yOffsetP + xOffsetP; + rap_cne[iAc] = a_cne[iA] + + rb[iR] * a_cne[iAm1] * pb[iP1] + + ra[iR] * a_cne[iAp1] * pa[iP1]; + + iP1 = iP + yOffsetP; + rap_cn[iAc] = a_cn[iA] + + rb[iR] * a_cn[iAm1] * pb[iP1] + + ra[iR] * 
a_cn[iAp1] * pa[iP1] + + a_bn[iA] * pb[iP1] + + a_an[iA] * pa[iP1] + + rb[iR] * a_an[iAm1] + + ra[iR] * a_bn[iAp1]; + + iP1 = iP + yOffsetP - xOffsetP; + rap_cnw[iAc] = a_cnw[iA] + + rb[iR] * a_cnw[iAm1] * pb[iP1] + + ra[iR] * a_cnw[iAp1] * pa[iP1]; + + iP1 = iP + xOffsetP; + rap_ce[iAc] = a_ce[iA] + + rb[iR] * a_ce[iAm1] * pb[iP1] + + ra[iR] * a_ce[iAp1] * pa[iP1] + + a_be[iA] * pb[iP1] + + a_ae[iA] * pa[iP1] + + rb[iR] * a_ae[iAm1] + + ra[iR] * a_be[iAp1]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + /*-------------------------------------------------------------- + * Loop for 27-point fine grid operator; produces upper triangular + * part of 27-point coarse grid operator. stencil entries: + * (above-northeast, above-north, above-northwest, above-east, + * above-center, above-west, above-southeast, above-south, + * above-southwest, center-northeast, center-north, + * center-northwest, and center-east). + *--------------------------------------------------------------*/ + + default: + + hypre_BoxGetSize(cgrid_box, loop_size); + hypre_BoxLoop4Begin(loop_size, + PT_dbox, cstart, stridec, iP, + R_dbox, cstart, stridec, iR, + A_dbox, fstart, stridef, iA, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iP,iR,iA,iAc,iAm1,iAp1,iP1 + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop4For(loopi, loopj, loopk, iP, iR, iA, iAc) + { + iAm1 = iA - zOffsetA; + iAp1 = iA + zOffsetA; + + iP1 = iP + zOffsetP + yOffsetP + xOffsetP; + rap_ane[iAc] = ra[iR] * a_cne[iAp1] * pb[iP1] + + ra[iR] * a_ane[iAp1] + + a_ane[iA] * pb[iP1]; + + iP1 = iP + zOffsetP + yOffsetP; + rap_an[iAc] = ra[iR] * a_cn[iAp1] * pb[iP1] + + ra[iR] * a_an[iAp1] + + a_an[iA] * pb[iP1]; + + iP1 = iP + zOffsetP + yOffsetP - xOffsetP; + rap_anw[iAc] = ra[iR] * a_cnw[iAp1] * pb[iP1] + + ra[iR] * a_anw[iAp1] + + a_anw[iA] * pb[iP1]; + + iP1 = iP + zOffsetP + xOffsetP; + rap_ae[iAc] = ra[iR] * a_ce[iAp1] * pb[iP1] + + ra[iR] * a_ae[iAp1] + + a_ae[iA] * pb[iP1]; + + 
iP1 = iP + zOffsetP; + rap_ac[iAc] = a_ac[iA] * pb[iP1] + + ra[iR] * a_cc[iAp1] * pb[iP1] + + ra[iR] * a_ac[iAp1]; + + iP1 = iP + zOffsetP - xOffsetP; + rap_aw[iAc] = ra[iR] * a_cw[iAp1] * pb[iP1] + + ra[iR] * a_aw[iAp1] + + a_aw[iA] * pb[iP1]; + + iP1 = iP + zOffsetP - yOffsetP + xOffsetP; + rap_ase[iAc] = ra[iR] * a_cse[iAp1] * pb[iP1] + + ra[iR] * a_ase[iAp1] + + a_ase[iA] * pb[iP1]; + + iP1 = iP + zOffsetP - yOffsetP; + rap_as[iAc] = ra[iR] * a_cs[iAp1] * pb[iP1] + + ra[iR] * a_as[iAp1] + + a_as[iA] * pb[iP1]; + + iP1 = iP + zOffsetP - yOffsetP - xOffsetP; + rap_asw[iAc] = ra[iR] * a_csw[iAp1] * pb[iP1] + + ra[iR] * a_asw[iAp1] + + a_asw[iA] * pb[iP1]; + + + iP1 = iP + yOffsetP + xOffsetP; + rap_cne[iAc] = a_cne[iA] + + rb[iR] * a_cne[iAm1] * pb[iP1] + + ra[iR] * a_cne[iAp1] * pa[iP1] + + a_bne[iA] * pb[iP1] + + a_ane[iA] * pa[iP1] + + rb[iR] * a_ane[iAm1] + + ra[iR] * a_bne[iAp1]; + + iP1 = iP + yOffsetP; + rap_cn[iAc] = a_cn[iA] + + rb[iR] * a_cn[iAm1] * pb[iP1] + + ra[iR] * a_cn[iAp1] * pa[iP1] + + a_bn[iA] * pb[iP1] + + a_an[iA] * pa[iP1] + + rb[iR] * a_an[iAm1] + + ra[iR] * a_bn[iAp1]; + + iP1 = iP + yOffsetP - xOffsetP; + rap_cnw[iAc] = a_cnw[iA] + + rb[iR] * a_cnw[iAm1] * pb[iP1] + + ra[iR] * a_cnw[iAp1] * pa[iP1] + + a_bnw[iA] * pb[iP1] + + a_anw[iA] * pa[iP1] + + rb[iR] * a_anw[iAm1] + + ra[iR] * a_bnw[iAp1]; + + iP1 = iP + xOffsetP; + rap_ce[iAc] = a_ce[iA] + + rb[iR] * a_ce[iAm1] * pb[iP1] + + ra[iR] * a_ce[iAp1] * pa[iP1] + + a_be[iA] * pb[iP1] + + a_ae[iA] * pa[iP1] + + rb[iR] * a_ae[iAm1] + + ra[iR] * a_be[iAp1]; + + } + hypre_BoxLoop4End(iP, iR, iA, iAc); + + break; + + } /* end switch statement */ + + } /* end ForBoxI */ + + return ierr; + } + + + /*-------------------------------------------------------------------------- + * hypre_SMG3RAPPeriodicSym + * Collapses stencil in periodic direction on coarsest grid. 
+ *--------------------------------------------------------------------------*/ + + int + hypre_SMG3RAPPeriodicSym( hypre_StructMatrix *RAP, + hypre_Index cindex, + hypre_Index cstride ) + + { + + hypre_Index index; + + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index loop_size; + + int ci; + int loopi, loopj, loopk; + + hypre_Box *RAP_dbox; + + double *rap_bc, *rap_bw, *rap_be, *rap_bs, *rap_bn; + double *rap_cc, *rap_cw, *rap_cs; + double *rap_bsw, *rap_bse, *rap_bnw, *rap_bne; + double *rap_csw, *rap_cse; + + int iAc; + int iAcmx; + int iAcmy; + int iAcmxmy; + int iAcpxmy; + + int xOffset; + int yOffset; + + double zero = 0.0; + + hypre_StructStencil *stencil; + int stencil_size; + + int ierr = 0; + + stencil = hypre_StructMatrixStencil(RAP); + stencil_size = hypre_StructStencilSize(stencil); + + hypre_SetIndex(stridec, 1, 1, 1); + + cgrid = hypre_StructMatrixGrid(RAP); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + + if (hypre_IndexZ(hypre_StructGridPeriodic(cgrid)) == 1) + { + hypre_StructMatrixAssemble(RAP); + + hypre_ForBoxI(ci, cgrid_boxes) + { + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + + RAP_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(RAP), ci); + + hypre_SetIndex(index,1,0,0); + xOffset = hypre_BoxOffsetDistance(RAP_dbox,index); + hypre_SetIndex(index,0,1,0); + yOffset = hypre_BoxOffsetDistance(RAP_dbox,index); + + + /*----------------------------------------------------------------- + * Extract pointers for 15-point coarse grid operator: + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,-1); + rap_bc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,-1); + rap_bw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,0,-1); + rap_be = hypre_StructMatrixExtractPointerByIndex(RAP, ci, 
index); + + hypre_SetIndex(index,0,-1,-1); + rap_bs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,-1); + rap_bn = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,0,0); + rap_cc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,0); + rap_cw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,0); + rap_cs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + /*----------------------------------------------------------------- + * Extract additional pointers for 27-point coarse grid operator: + *-----------------------------------------------------------------*/ + + if(stencil_size == 27) + { + hypre_SetIndex(index,-1,-1,-1); + rap_bsw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,-1); + rap_bse = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,1,-1); + rap_bnw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,-1); + rap_bne = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,-1,0); + rap_csw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,0); + rap_cse = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + } + + /*----------------------------------------------------------------- + * Collapse 15 point operator. 
+ *-----------------------------------------------------------------*/ + + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc,iAcmx,iAcmy + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + iAcmx = iAc - xOffset; + iAcmy = iAc - yOffset; + + rap_cc[iAc] += (2.0 * rap_bc[iAc]); + rap_cw[iAc] += (rap_bw[iAc] + rap_be[iAcmx]); + rap_cs[iAc] += (rap_bs[iAc] + rap_bn[iAcmy]); + } + hypre_BoxLoop1End(iAc); + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + rap_bc[iAc] = zero; + rap_bw[iAc] = zero; + rap_be[iAc] = zero; + rap_bs[iAc] = zero; + rap_bn[iAc] = zero; + } + hypre_BoxLoop1End(iAc); + + /*----------------------------------------------------------------- + * Collapse additional entries for 27 point operator. 
+ *-----------------------------------------------------------------*/ + + if (stencil_size == 27) + { + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc,iAcmxmy,iAcpxmy + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + iAcmxmy = iAc - xOffset - yOffset; + iAcpxmy = iAc + xOffset - yOffset; + + rap_csw[iAc] += (rap_bsw[iAc] + rap_bne[iAcmxmy]); + + rap_cse[iAc] += (rap_bse[iAc] + rap_bnw[iAcpxmy]); + + } + hypre_BoxLoop1End(iAc); + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + rap_bsw[iAc] = zero; + rap_bse[iAc] = zero; + rap_bnw[iAc] = zero; + rap_bne[iAc] = zero; + } + hypre_BoxLoop1End(iAc); + } + + } /* end ForBoxI */ + + } + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMG3RAPPeriodicNoSym + * Collapses stencil in periodic direction on coarsest grid. 
+ *--------------------------------------------------------------------------*/ + + int + hypre_SMG3RAPPeriodicNoSym( hypre_StructMatrix *RAP, + hypre_Index cindex, + hypre_Index cstride ) + + { + + hypre_Index index; + + hypre_StructGrid *cgrid; + hypre_BoxArray *cgrid_boxes; + hypre_Box *cgrid_box; + hypre_IndexRef cstart; + hypre_Index stridec; + hypre_Index loop_size; + + int ci; + int loopi, loopj, loopk; + + hypre_Box *RAP_dbox; + + double *rap_bc, *rap_bw, *rap_be, *rap_bs, *rap_bn; + double *rap_cc, *rap_cw, *rap_ce, *rap_cs, *rap_cn; + double *rap_ac, *rap_aw, *rap_ae, *rap_as, *rap_an; + double *rap_bsw, *rap_bse, *rap_bnw, *rap_bne; + double *rap_csw, *rap_cse, *rap_cnw, *rap_cne; + double *rap_asw, *rap_ase, *rap_anw, *rap_ane; + + int iAc; + + double zero = 0.0; + + hypre_StructStencil *stencil; + int stencil_size; + + int ierr = 0; + + stencil = hypre_StructMatrixStencil(RAP); + stencil_size = hypre_StructStencilSize(stencil); + + hypre_SetIndex(stridec, 1, 1, 1); + + cgrid = hypre_StructMatrixGrid(RAP); + cgrid_boxes = hypre_StructGridBoxes(cgrid); + + if (hypre_IndexZ(hypre_StructGridPeriodic(cgrid)) == 1) + { + hypre_ForBoxI(ci, cgrid_boxes) + { + cgrid_box = hypre_BoxArrayBox(cgrid_boxes, ci); + + cstart = hypre_BoxIMin(cgrid_box); + + RAP_dbox = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(RAP), ci); + + /*----------------------------------------------------------------- + * Extract pointers for 15-point coarse grid operator: + *-----------------------------------------------------------------*/ + + hypre_SetIndex(index,0,0,-1); + rap_bc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,-1); + rap_bw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,0,-1); + rap_be = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,-1); + rap_bs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,-1); + rap_bn = 
hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,0,0); + rap_cc = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,0); + rap_cw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,0,0); + rap_ce = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,0); + rap_cs = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,0); + rap_cn = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,0,1); + rap_ac = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,0,1); + rap_aw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,0,1); + rap_ae = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,-1,1); + rap_as = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,0,1,1); + rap_an = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + /*----------------------------------------------------------------- + * Extract additional pointers for 27-point coarse grid operator: + *-----------------------------------------------------------------*/ + + if(stencil_size == 27) + { + + hypre_SetIndex(index,-1,-1,-1); + rap_bsw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,-1); + rap_bse = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,1,-1); + rap_bnw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,-1); + rap_bne = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,-1,0); + rap_csw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,0); + rap_cse = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,1,0); + rap_cnw = 
hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,0); + rap_cne = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,-1,1); + rap_asw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,-1,1); + rap_ase = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,-1,1,1); + rap_anw = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + hypre_SetIndex(index,1,1,1); + rap_ane = hypre_StructMatrixExtractPointerByIndex(RAP, ci, index); + + } + + /*----------------------------------------------------------------- + * Collapse 15 point operator. + *-----------------------------------------------------------------*/ + + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc + #include "hypre_box_smp_forloop.h" + + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + rap_cc[iAc] += (rap_bc[iAc] + rap_ac[iAc]); + rap_bc[iAc] = zero; + rap_ac[iAc] = zero; + + rap_cw[iAc] += (rap_bw[iAc] + rap_aw[iAc]); + rap_bw[iAc] = zero; + rap_aw[iAc] = zero; + + rap_ce[iAc] += (rap_be[iAc] + rap_ae[iAc]); + rap_be[iAc] = zero; + rap_ae[iAc] = zero; + + rap_cs[iAc] += (rap_bs[iAc] + rap_as[iAc]); + rap_bs[iAc] = zero; + rap_as[iAc] = zero; + + rap_cn[iAc] += (rap_bn[iAc] + rap_an[iAc]); + rap_bn[iAc] = zero; + rap_an[iAc] = zero; + } + hypre_BoxLoop1End(iAc); + + /*----------------------------------------------------------------- + * Collapse additional entries for 27 point operator. 
+ *-----------------------------------------------------------------*/ + + if (stencil_size == 27) + { + hypre_BoxGetSize(cgrid_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + RAP_dbox, cstart, stridec, iAc); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,iAc + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, iAc) + { + rap_csw[iAc] += (rap_bsw[iAc] + rap_asw[iAc]); + rap_bsw[iAc] = zero; + rap_asw[iAc] = zero; + + rap_cse[iAc] += (rap_bse[iAc] + rap_ase[iAc]); + rap_bse[iAc] = zero; + rap_ase[iAc] = zero; + + rap_cnw[iAc] += (rap_bnw[iAc] + rap_anw[iAc]); + rap_bnw[iAc] = zero; + rap_anw[iAc] = zero; + + rap_cne[iAc] += (rap_bne[iAc] + rap_ane[iAc]); + rap_bne[iAc] = zero; + rap_ane[iAc] = zero; + } + hypre_BoxLoop1End(iAc); + } + + } /* end ForBoxI */ + + } + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_axpy.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_axpy.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_axpy.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,76 ---- + /*BHEADER********************************************************************** + * (c) 2000 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * SMG axpy routine + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_SMGAxpy + *--------------------------------------------------------------------------*/ + + int + hypre_SMGAxpy( double alpha, + hypre_StructVector *x, + hypre_StructVector *y, + hypre_Index base_index, + hypre_Index base_stride ) + { + int ierr = 0; + + hypre_Box *x_data_box; + hypre_Box *y_data_box; + + int xi; + int yi; + + double *xp; + double *yp; + + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Index loop_size; + hypre_IndexRef start; + + int i; + int loopi, loopj, loopk; + + box = hypre_BoxCreate(); + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(y)); + hypre_ForBoxI(i, boxes) + { + hypre_CopyBox(hypre_BoxArrayBox(boxes, i), box); + hypre_ProjectBox(box, base_index, base_stride); + start = hypre_BoxIMin(box); + + x_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + y_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + + xp = hypre_StructVectorBoxData(x, i); + yp = hypre_StructVectorBoxData(y, i); + + hypre_BoxGetStrideSize(box, base_stride, loop_size); + hypre_BoxLoop2Begin(loop_size, + x_data_box, start, base_stride, xi, + y_data_box, start, base_stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,xi,yi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, xi, yi) + { + yp[yi] += alpha * xp[xi]; + } + hypre_BoxLoop2End(xi, yi); + } + hypre_BoxDestroy(box); + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_relax.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_relax.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- 
llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_relax.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,989 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxData data structure + *--------------------------------------------------------------------------*/ + + typedef struct + { + int setup_temp_vec; + int setup_a_rem; + int setup_a_sol; + + MPI_Comm comm; + + int memory_use; + double tol; + int max_iter; + int zero_guess; + + int num_spaces; + int *space_indices; + int *space_strides; + + int num_pre_spaces; + int num_reg_spaces; + int *pre_space_ranks; + int *reg_space_ranks; + + hypre_Index base_index; + hypre_Index base_stride; + hypre_BoxArray *base_box_array; + + int stencil_dim; + + hypre_StructMatrix *A; + hypre_StructVector *b; + hypre_StructVector *x; + + hypre_StructVector *temp_vec; + hypre_StructMatrix *A_sol; /* Coefficients of A that make up + the (sol)ve part of the relaxation */ + hypre_StructMatrix *A_rem; /* Coefficients of A (rem)aining: + A_rem = A - A_sol */ + void **residual_data; /* Array of size `num_spaces' */ + void **solve_data; /* Array of size `num_spaces' */ + + /* log info (always logged) */ + int num_iterations; + int time_index; + + int num_pre_relax; + int num_post_relax; + + } hypre_SMGRelaxData; + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxCreate + 
*--------------------------------------------------------------------------*/ + + void * + hypre_SMGRelaxCreate( MPI_Comm comm ) + { + hypre_SMGRelaxData *relax_data; + + relax_data = hypre_CTAlloc(hypre_SMGRelaxData, 1); + (relax_data -> setup_temp_vec) = 1; + (relax_data -> setup_a_rem) = 1; + (relax_data -> setup_a_sol) = 1; + (relax_data -> comm) = comm; + (relax_data -> base_box_array) = NULL; + (relax_data -> time_index) = hypre_InitializeTiming("SMGRelax"); + /* set defaults */ + (relax_data -> memory_use) = 0; + (relax_data -> tol) = 1.0e-06; + (relax_data -> max_iter) = 1000; + (relax_data -> zero_guess) = 0; + (relax_data -> num_spaces) = 1; + (relax_data -> space_indices) = hypre_TAlloc(int, 1); + (relax_data -> space_strides) = hypre_TAlloc(int, 1); + (relax_data -> space_indices[0]) = 0; + (relax_data -> space_strides[0]) = 1; + (relax_data -> num_pre_spaces) = 0; + (relax_data -> num_reg_spaces) = 1; + (relax_data -> pre_space_ranks) = NULL; + (relax_data -> reg_space_ranks) = hypre_TAlloc(int, 1); + (relax_data -> reg_space_ranks[0]) = 0; + hypre_SetIndex((relax_data -> base_index), 0, 0, 0); + hypre_SetIndex((relax_data -> base_stride), 1, 1, 1); + (relax_data -> A) = NULL; + (relax_data -> b) = NULL; + (relax_data -> x) = NULL; + (relax_data -> temp_vec) = NULL; + + (relax_data -> num_pre_relax) = 1; + (relax_data -> num_post_relax) = 1; + + return (void *) relax_data; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxDestroyTempVec + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxDestroyTempVec( void *relax_vdata ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + hypre_StructVectorDestroy(relax_data -> temp_vec); + (relax_data -> setup_temp_vec) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxDestroyARem + 
*--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxDestroyARem( void *relax_vdata ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int i; + int ierr = 0; + + if (relax_data -> A_rem) + { + for (i = 0; i < (relax_data -> num_spaces); i++) + { + hypre_SMGResidualDestroy(relax_data -> residual_data[i]); + } + hypre_TFree(relax_data -> residual_data); + hypre_StructMatrixDestroy(relax_data -> A_rem); + (relax_data -> A_rem) = NULL; + } + (relax_data -> setup_a_rem) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxDestroyASol + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxDestroyASol( void *relax_vdata ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int stencil_dim; + int i; + int ierr = 0; + + if (relax_data -> A_sol) + { + stencil_dim = (relax_data -> stencil_dim); + for (i = 0; i < (relax_data -> num_spaces); i++) + { + if (stencil_dim > 2) + hypre_SMGDestroy(relax_data -> solve_data[i]); + else + hypre_CyclicReductionDestroy(relax_data -> solve_data[i]); + } + hypre_TFree(relax_data -> solve_data); + hypre_StructMatrixDestroy(relax_data -> A_sol); + (relax_data -> A_sol) = NULL; + } + (relax_data -> setup_a_sol) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxDestroy( void *relax_vdata ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + if (relax_data) + { + hypre_TFree(relax_data -> space_indices); + hypre_TFree(relax_data -> space_strides); + hypre_TFree(relax_data -> pre_space_ranks); + hypre_TFree(relax_data -> reg_space_ranks); + hypre_BoxArrayDestroy(relax_data -> base_box_array); + + hypre_StructMatrixDestroy(relax_data -> A); + 
hypre_StructVectorDestroy(relax_data -> b); + hypre_StructVectorDestroy(relax_data -> x); + + hypre_SMGRelaxDestroyTempVec(relax_vdata); + hypre_SMGRelaxDestroyARem(relax_vdata); + hypre_SMGRelaxDestroyASol(relax_vdata); + + hypre_FinalizeTiming(relax_data -> time_index); + hypre_TFree(relax_data); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelax + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelax( void *relax_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + + int zero_guess; + int stencil_dim; + hypre_StructVector *temp_vec; + hypre_StructMatrix *A_sol; + hypre_StructMatrix *A_rem; + void **residual_data; + void **solve_data; + + hypre_IndexRef base_stride; + hypre_BoxArray *base_box_a; + double zero = 0.0; + + int max_iter; + int num_spaces; + int *space_ranks; + + int i, j, k, is; + + int ierr = 0; + + /*---------------------------------------------------------- + * Note: The zero_guess stuff is not handled correctly + * for general relaxation parameters. It is correct when + * the spaces are independent sets in the direction of + * relaxation. 
+ *----------------------------------------------------------*/ + + hypre_BeginTiming(relax_data -> time_index); + + /*---------------------------------------------------------- + * Set up the solver + *----------------------------------------------------------*/ + + /* ensure that the solver memory gets fully set up */ + if ((relax_data -> setup_a_sol) > 0) + { + (relax_data -> setup_a_sol) = 2; + } + + hypre_SMGRelaxSetup(relax_vdata, A, b, x); + + zero_guess = (relax_data -> zero_guess); + stencil_dim = (relax_data -> stencil_dim); + temp_vec = (relax_data -> temp_vec); + A_sol = (relax_data -> A_sol); + A_rem = (relax_data -> A_rem); + residual_data = (relax_data -> residual_data); + solve_data = (relax_data -> solve_data); + + + /*---------------------------------------------------------- + * Set zero values + *----------------------------------------------------------*/ + + if (zero_guess) + { + base_stride = (relax_data -> base_stride); + base_box_a = (relax_data -> base_box_array); + ierr = hypre_SMGSetStructVectorConstantValues(x, zero, base_box_a, + base_stride); + } + + /*---------------------------------------------------------- + * Iterate + *----------------------------------------------------------*/ + + for (k = 0; k < 2; k++) + { + switch(k) + { + /* Do pre-relaxation iterations */ + case 0: + max_iter = 1; + num_spaces = (relax_data -> num_pre_spaces); + space_ranks = (relax_data -> pre_space_ranks); + break; + + /* Do regular relaxation iterations */ + case 1: + max_iter = (relax_data -> max_iter); + num_spaces = (relax_data -> num_reg_spaces); + space_ranks = (relax_data -> reg_space_ranks); + break; + } + + for (i = 0; i < max_iter; i++) + { + for (j = 0; j < num_spaces; j++) + { + is = space_ranks[j]; + + hypre_SMGResidual(residual_data[is], A_rem, x, b, temp_vec); + + if (stencil_dim > 2) + hypre_SMGSolve(solve_data[is], A_sol, temp_vec, x); + else + hypre_CyclicReduction(solve_data[is], A_sol, temp_vec, x); + } + + (relax_data -> 
num_iterations) = (i + 1); + } + } + + /*---------------------------------------------------------- + * Free up memory according to memory_use parameter + *----------------------------------------------------------*/ + + if ((stencil_dim - 1) <= (relax_data -> memory_use)) + { + hypre_SMGRelaxDestroyASol(relax_vdata); + } + + hypre_EndTiming(relax_data -> time_index); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetup + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetup( void *relax_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int stencil_dim; + int a_sol_test; + int ierr = 0; + + stencil_dim = hypre_StructStencilDim(hypre_StructMatrixStencil(A)); + (relax_data -> stencil_dim) = stencil_dim; + hypre_StructMatrixDestroy(relax_data -> A); + hypre_StructVectorDestroy(relax_data -> b); + hypre_StructVectorDestroy(relax_data -> x); + (relax_data -> A) = hypre_StructMatrixRef(A); + (relax_data -> b) = hypre_StructVectorRef(b); + (relax_data -> x) = hypre_StructVectorRef(x); + + /*---------------------------------------------------------- + * Set up memory according to memory_use parameter. + * + * If a subset of the solver memory is not to be set up + * until the solve is actually done, its "setup" tag + * should have a value greater than 1. 
+ *----------------------------------------------------------*/ + + if ((stencil_dim - 1) <= (relax_data -> memory_use)) + { + a_sol_test = 1; + } + else + { + a_sol_test = 0; + } + + /*---------------------------------------------------------- + * Set up the solver + *----------------------------------------------------------*/ + + if ((relax_data -> setup_temp_vec) > 0) + { + ierr = hypre_SMGRelaxSetupTempVec(relax_vdata, A, b, x); + } + + if ((relax_data -> setup_a_rem) > 0) + { + ierr = hypre_SMGRelaxSetupARem(relax_vdata, A, b, x); + } + + if ((relax_data -> setup_a_sol) > a_sol_test) + { + ierr = hypre_SMGRelaxSetupASol(relax_vdata, A, b, x); + } + + if ((relax_data -> base_box_array) == NULL) + { + ierr = hypre_SMGRelaxSetupBaseBoxArray(relax_vdata, A, b, x); + } + + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetupTempVec + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetupTempVec( void *relax_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + hypre_StructVector *temp_vec = (relax_data -> temp_vec); + int ierr = 0; + + /*---------------------------------------------------------- + * Set up data + *----------------------------------------------------------*/ + + if ((relax_data -> temp_vec) == NULL) + { + temp_vec = hypre_StructVectorCreate(hypre_StructVectorComm(b), + hypre_StructVectorGrid(b)); + hypre_StructVectorSetNumGhost(temp_vec, hypre_StructVectorNumGhost(b)); + hypre_StructVectorInitialize(temp_vec); + hypre_StructVectorAssemble(temp_vec); + (relax_data -> temp_vec) = temp_vec; + } + (relax_data -> setup_temp_vec) = 0; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetupARem + *--------------------------------------------------------------------------*/ + + int + 
hypre_SMGRelaxSetupARem( void *relax_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + + int num_spaces = (relax_data -> num_spaces); + int *space_indices = (relax_data -> space_indices); + int *space_strides = (relax_data -> space_strides); + hypre_StructVector *temp_vec = (relax_data -> temp_vec); + + hypre_StructStencil *stencil = hypre_StructMatrixStencil(A); + hypre_Index *stencil_shape = hypre_StructStencilShape(stencil); + int stencil_size = hypre_StructStencilSize(stencil); + int stencil_dim = hypre_StructStencilDim(stencil); + + hypre_StructMatrix *A_rem; + void **residual_data; + + hypre_Index base_index; + hypre_Index base_stride; + + int num_stencil_indices; + int *stencil_indices; + + int i; + + int ierr = 0; + + /*---------------------------------------------------------- + * Free up old data before putting new data into structure + *----------------------------------------------------------*/ + + hypre_SMGRelaxDestroyARem(relax_vdata); + + /*---------------------------------------------------------- + * Set up data + *----------------------------------------------------------*/ + + hypre_CopyIndex((relax_data -> base_index), base_index); + hypre_CopyIndex((relax_data -> base_stride), base_stride); + + stencil_indices = hypre_TAlloc(int, stencil_size); + num_stencil_indices = 0; + for (i = 0; i < stencil_size; i++) + { + if (hypre_IndexD(stencil_shape[i], (stencil_dim - 1)) != 0) + { + stencil_indices[num_stencil_indices] = i; + num_stencil_indices++; + } + } + A_rem = hypre_StructMatrixCreateMask(A, num_stencil_indices, stencil_indices); + hypre_TFree(stencil_indices); + + /* Set up residual_data */ + residual_data = hypre_TAlloc(void *, num_spaces); + + for (i = 0; i < num_spaces; i++) + { + hypre_IndexD(base_index, (stencil_dim - 1)) = space_indices[i]; + hypre_IndexD(base_stride, (stencil_dim - 1)) = space_strides[i]; + + residual_data[i] = 
hypre_SMGResidualCreate(); + hypre_SMGResidualSetBase(residual_data[i], base_index, base_stride); + hypre_SMGResidualSetup(residual_data[i], A_rem, x, b, temp_vec); + } + + (relax_data -> A_rem) = A_rem; + (relax_data -> residual_data) = residual_data; + + (relax_data -> setup_a_rem) = 0; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetupASol + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetupASol( void *relax_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + + int num_spaces = (relax_data -> num_spaces); + int *space_indices = (relax_data -> space_indices); + int *space_strides = (relax_data -> space_strides); + hypre_StructVector *temp_vec = (relax_data -> temp_vec); + + int num_pre_relax = (relax_data -> num_pre_relax); + int num_post_relax = (relax_data -> num_post_relax); + + hypre_StructStencil *stencil = hypre_StructMatrixStencil(A); + hypre_Index *stencil_shape = hypre_StructStencilShape(stencil); + int stencil_size = hypre_StructStencilSize(stencil); + int stencil_dim = hypre_StructStencilDim(stencil); + + hypre_StructMatrix *A_sol; + void **solve_data; + + hypre_Index base_index; + hypre_Index base_stride; + + int num_stencil_indices; + int *stencil_indices; + + int i; + + int ierr = 0; + + /*---------------------------------------------------------- + * Free up old data before putting new data into structure + *----------------------------------------------------------*/ + + hypre_SMGRelaxDestroyASol(relax_vdata); + + /*---------------------------------------------------------- + * Set up data + *----------------------------------------------------------*/ + + hypre_CopyIndex((relax_data -> base_index), base_index); + hypre_CopyIndex((relax_data -> base_stride), base_stride); + + stencil_indices = hypre_TAlloc(int, stencil_size); + 
num_stencil_indices = 0; + for (i = 0; i < stencil_size; i++) + { + if (hypre_IndexD(stencil_shape[i], (stencil_dim - 1)) == 0) + { + stencil_indices[num_stencil_indices] = i; + num_stencil_indices++; + } + } + A_sol = hypre_StructMatrixCreateMask(A, num_stencil_indices, stencil_indices); + hypre_StructStencilDim(hypre_StructMatrixStencil(A_sol)) = stencil_dim - 1; + hypre_TFree(stencil_indices); + + /* Set up solve_data */ + solve_data = hypre_TAlloc(void *, num_spaces); + + for (i = 0; i < num_spaces; i++) + { + hypre_IndexD(base_index, (stencil_dim - 1)) = space_indices[i]; + hypre_IndexD(base_stride, (stencil_dim - 1)) = space_strides[i]; + + if (stencil_dim > 2) + { + solve_data[i] = hypre_SMGCreate(relax_data -> comm); + hypre_SMGSetNumPreRelax( solve_data[i], num_pre_relax); + hypre_SMGSetNumPostRelax( solve_data[i], num_post_relax); + hypre_SMGSetBase(solve_data[i], base_index, base_stride); + hypre_SMGSetMemoryUse(solve_data[i], (relax_data -> memory_use)); + hypre_SMGSetTol(solve_data[i], 0.0); + hypre_SMGSetMaxIter(solve_data[i], 1); + hypre_SMGSetup(solve_data[i], A_sol, temp_vec, x); + } + else + { + solve_data[i] = hypre_CyclicReductionCreate(relax_data -> comm); + hypre_CyclicReductionSetBase(solve_data[i], base_index, base_stride); + hypre_CyclicReductionSetup(solve_data[i], A_sol, temp_vec, x); + } + } + + (relax_data -> A_sol) = A_sol; + (relax_data -> solve_data) = solve_data; + + (relax_data -> setup_a_sol) = 0; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetTempVec + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetTempVec( void *relax_vdata, + hypre_StructVector *temp_vec ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + hypre_SMGRelaxDestroyTempVec(relax_vdata); + (relax_data -> temp_vec) = hypre_StructVectorRef(temp_vec); + + (relax_data -> setup_temp_vec) = 1; + (relax_data -> 
setup_a_rem) = 1; + (relax_data -> setup_a_sol) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetMemoryUse + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetMemoryUse( void *relax_vdata, + int memory_use ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> memory_use) = memory_use; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetTol + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetTol( void *relax_vdata, + double tol ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> tol) = tol; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetMaxIter + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetMaxIter( void *relax_vdata, + int max_iter ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> max_iter) = max_iter; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetZeroGuess + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetZeroGuess( void *relax_vdata, + int zero_guess ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> zero_guess) = zero_guess; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetNumSpaces + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetNumSpaces( void *relax_vdata, + int num_spaces ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int i; + int ierr = 0; + + 
(relax_data -> num_spaces) = num_spaces; + + hypre_TFree(relax_data -> space_indices); + hypre_TFree(relax_data -> space_strides); + hypre_TFree(relax_data -> pre_space_ranks); + hypre_TFree(relax_data -> reg_space_ranks); + (relax_data -> space_indices) = hypre_TAlloc(int, num_spaces); + (relax_data -> space_strides) = hypre_TAlloc(int, num_spaces); + (relax_data -> num_pre_spaces) = 0; + (relax_data -> num_reg_spaces) = num_spaces; + (relax_data -> pre_space_ranks) = NULL; + (relax_data -> reg_space_ranks) = hypre_TAlloc(int, num_spaces); + + for (i = 0; i < num_spaces; i++) + { + (relax_data -> space_indices[i]) = 0; + (relax_data -> space_strides[i]) = 1; + (relax_data -> reg_space_ranks[i]) = i; + } + + (relax_data -> setup_temp_vec) = 1; + (relax_data -> setup_a_rem) = 1; + (relax_data -> setup_a_sol) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetNumPreSpaces + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetNumPreSpaces( void *relax_vdata, + int num_pre_spaces ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int i; + int ierr = 0; + + (relax_data -> num_pre_spaces) = num_pre_spaces; + + hypre_TFree(relax_data -> pre_space_ranks); + (relax_data -> pre_space_ranks) = hypre_TAlloc(int, num_pre_spaces); + + for (i = 0; i < num_pre_spaces; i++) + (relax_data -> pre_space_ranks[i]) = 0; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetNumRegSpaces + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetNumRegSpaces( void *relax_vdata, + int num_reg_spaces ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int i; + int ierr = 0; + + (relax_data -> num_reg_spaces) = num_reg_spaces; + + hypre_TFree(relax_data -> reg_space_ranks); + (relax_data -> reg_space_ranks) = hypre_TAlloc(int, 
num_reg_spaces); + + for (i = 0; i < num_reg_spaces; i++) + (relax_data -> reg_space_ranks[i]) = 0; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetSpace + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetSpace( void *relax_vdata, + int i, + int space_index, + int space_stride ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> space_indices[i]) = space_index; + (relax_data -> space_strides[i]) = space_stride; + + (relax_data -> setup_temp_vec) = 1; + (relax_data -> setup_a_rem) = 1; + (relax_data -> setup_a_sol) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetRegSpaceRank + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetRegSpaceRank( void *relax_vdata, + int i, + int reg_space_rank ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> reg_space_ranks[i]) = reg_space_rank; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetPreSpaceRank + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetPreSpaceRank( void *relax_vdata, + int i, + int pre_space_rank ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> pre_space_ranks[i]) = pre_space_rank; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetBase + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetBase( void *relax_vdata, + hypre_Index base_index, + hypre_Index base_stride ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int d; + int ierr = 0; + + for (d = 0; d < 3; d++) + { + 
hypre_IndexD((relax_data -> base_index), d) = + hypre_IndexD(base_index, d); + hypre_IndexD((relax_data -> base_stride), d) = + hypre_IndexD(base_stride, d); + } + + if ((relax_data -> base_box_array) != NULL) + { + hypre_BoxArrayDestroy((relax_data -> base_box_array)); + (relax_data -> base_box_array) = NULL; + } + + (relax_data -> setup_temp_vec) = 1; + (relax_data -> setup_a_rem) = 1; + (relax_data -> setup_a_sol) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetNumPreRelax + * Note that we require at least 1 pre-relax sweep. + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetNumPreRelax( void *relax_vdata, + int num_pre_relax ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> num_pre_relax) = hypre_max(num_pre_relax,1); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetNumPostRelax + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetNumPostRelax( void *relax_vdata, + int num_post_relax ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + int ierr = 0; + + (relax_data -> num_post_relax) = num_post_relax; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetNewMatrixStencil + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetNewMatrixStencil( void *relax_vdata, + hypre_StructStencil *diff_stencil ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + + hypre_Index *stencil_shape = hypre_StructStencilShape(diff_stencil); + int stencil_size = hypre_StructStencilSize(diff_stencil); + int stencil_dim = hypre_StructStencilDim(diff_stencil); + + int i; + + int ierr = 0; + + for (i = 0; i < stencil_size; i++) + { + if 
(hypre_IndexD(stencil_shape[i], (stencil_dim - 1)) != 0) + { + (relax_data -> setup_a_rem) = 1; + } + else + { + (relax_data -> setup_a_sol) = 1; + } + } + + return ierr; + } + + + /*-------------------------------------------------------------------------- + * hypre_SMGRelaxSetupBaseBoxArray + *--------------------------------------------------------------------------*/ + + int + hypre_SMGRelaxSetupBaseBoxArray( void *relax_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_SMGRelaxData *relax_data = relax_vdata; + + hypre_StructGrid *grid; + hypre_BoxArray *boxes; + hypre_BoxArray *base_box_array; + + int ierr = 0; + + grid = hypre_StructVectorGrid(x); + boxes = hypre_StructGridBoxes(grid); + + base_box_array = hypre_BoxArrayDuplicate(boxes); + hypre_ProjectBoxArray(base_box_array, + (relax_data -> base_index), + (relax_data -> base_stride)); + + (relax_data -> base_box_array) = base_box_array; + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_residual.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_residual.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_residual.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,356 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Routine for computing residuals in the SMG code + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_SMGResidualData data structure + *--------------------------------------------------------------------------*/ + + typedef struct + { + hypre_Index base_index; + hypre_Index base_stride; + + hypre_StructMatrix *A; + hypre_StructVector *x; + hypre_StructVector *b; + hypre_StructVector *r; + hypre_BoxArray *base_points; + hypre_ComputePkg *compute_pkg; + + int time_index; + int flops; + + } hypre_SMGResidualData; + + /*-------------------------------------------------------------------------- + * hypre_SMGResidualCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_SMGResidualCreate( ) + { + hypre_SMGResidualData *residual_data; + + residual_data = hypre_CTAlloc(hypre_SMGResidualData, 1); + + (residual_data -> time_index) = hypre_InitializeTiming("SMGResidual"); + + /* set defaults */ + hypre_SetIndex((residual_data -> base_index), 0, 0, 0); + hypre_SetIndex((residual_data -> base_stride), 1, 1, 1); + + return (void *) residual_data; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGResidualSetup + *--------------------------------------------------------------------------*/ + + int + hypre_SMGResidualSetup( void *residual_vdata, + hypre_StructMatrix *A, + hypre_StructVector *x, + hypre_StructVector *b, + hypre_StructVector *r ) + { + int ierr = 0; + + hypre_SMGResidualData *residual_data = residual_vdata; + + hypre_IndexRef base_index = (residual_data -> base_index); + hypre_IndexRef base_stride = (residual_data -> base_stride); + 
hypre_Index unit_stride; + + hypre_StructGrid *grid; + hypre_StructStencil *stencil; + + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + + hypre_BoxArray *base_points; + hypre_ComputePkg *compute_pkg; + + /*---------------------------------------------------------- + * Set up base points and the compute package + *----------------------------------------------------------*/ + + grid = hypre_StructMatrixGrid(A); + stencil = hypre_StructMatrixStencil(A); + + hypre_SetIndex(unit_stride, 1, 1, 1); + + base_points = hypre_BoxArrayDuplicate(hypre_StructGridBoxes(grid)); + hypre_ProjectBoxArray(base_points, base_index, base_stride); + + hypre_CreateComputeInfo(grid, stencil, + &send_boxes, &recv_boxes, + &send_processes, &recv_processes, + &indt_boxes, &dept_boxes); + + hypre_ProjectBoxArrayArray(indt_boxes, base_index, base_stride); + hypre_ProjectBoxArrayArray(dept_boxes, base_index, base_stride); + + hypre_ComputePkgCreate(send_boxes, recv_boxes, + unit_stride, unit_stride, + send_processes, recv_processes, + indt_boxes, dept_boxes, + base_stride, grid, + hypre_StructVectorDataSpace(x), 1, + &compute_pkg); + + /*---------------------------------------------------------- + * Set up the residual data structure + *----------------------------------------------------------*/ + + (residual_data -> A) = hypre_StructMatrixRef(A); + (residual_data -> x) = hypre_StructVectorRef(x); + (residual_data -> b) = hypre_StructVectorRef(b); + (residual_data -> r) = hypre_StructVectorRef(r); + (residual_data -> base_points) = base_points; + (residual_data -> compute_pkg) = compute_pkg; + + /*----------------------------------------------------- + * Compute flops + *-----------------------------------------------------*/ + + (residual_data -> flops) = + (hypre_StructMatrixGlobalSize(A) + hypre_StructVectorGlobalSize(x)) / + (hypre_IndexX(base_stride) 
* + hypre_IndexY(base_stride) * + hypre_IndexZ(base_stride) ); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGResidual + *--------------------------------------------------------------------------*/ + + int + hypre_SMGResidual( void *residual_vdata, + hypre_StructMatrix *A, + hypre_StructVector *x, + hypre_StructVector *b, + hypre_StructVector *r ) + { + int ierr = 0; + + hypre_SMGResidualData *residual_data = residual_vdata; + + hypre_IndexRef base_stride = (residual_data -> base_stride); + hypre_BoxArray *base_points = (residual_data -> base_points); + hypre_ComputePkg *compute_pkg = (residual_data -> compute_pkg); + + hypre_CommHandle *comm_handle; + + hypre_BoxArrayArray *compute_box_aa; + hypre_BoxArray *compute_box_a; + hypre_Box *compute_box; + + hypre_Box *A_data_box; + hypre_Box *x_data_box; + hypre_Box *b_data_box; + hypre_Box *r_data_box; + + int Ai; + int xi; + int bi; + int ri; + + double *Ap; + double *xp; + double *bp; + double *rp; + + hypre_Index loop_size; + hypre_IndexRef start; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + int stencil_size; + + int compute_i, i, j, si; + int loopi, loopj, loopk; + + hypre_BeginTiming(residual_data -> time_index); + + /*----------------------------------------------------------------------- + * Compute residual r = b - Ax + *-----------------------------------------------------------------------*/ + + stencil = hypre_StructMatrixStencil(A); + stencil_shape = hypre_StructStencilShape(stencil); + stencil_size = hypre_StructStencilSize(stencil); + + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + xp = hypre_StructVectorData(x); + hypre_InitializeIndtComputations(compute_pkg, xp, &comm_handle); + compute_box_aa = hypre_ComputePkgIndtBoxes(compute_pkg); + + /*---------------------------------------- + * Copy b into r + *----------------------------------------*/ + + compute_box_a = 
base_points; + hypre_ForBoxI(i, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, i); + start = hypre_BoxIMin(compute_box); + + b_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(b), i); + r_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(r), i); + + bp = hypre_StructVectorBoxData(b, i); + rp = hypre_StructVectorBoxData(r, i); + + hypre_BoxGetStrideSize(compute_box, base_stride, loop_size); + hypre_BoxLoop2Begin(loop_size, + b_data_box, start, base_stride, bi, + r_data_box, start, base_stride, ri); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,bi,ri + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, bi, ri) + { + rp[ri] = bp[bi]; + } + hypre_BoxLoop2End(bi, ri); + } + } + break; + + case 1: + { + hypre_FinalizeIndtComputations(comm_handle); + compute_box_aa = hypre_ComputePkgDeptBoxes(compute_pkg); + } + break; + } + + /*-------------------------------------------------------------------- + * Compute r -= A*x + *--------------------------------------------------------------------*/ + + hypre_ForBoxArrayI(i, compute_box_aa) + { + compute_box_a = hypre_BoxArrayArrayBoxArray(compute_box_aa, i); + + A_data_box = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), i); + x_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + r_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(r), i); + + rp = hypre_StructVectorBoxData(r, i); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + start = hypre_BoxIMin(compute_box); + + for (si = 0; si < stencil_size; si++) + { + Ap = hypre_StructMatrixBoxData(A, i, si); + xp = hypre_StructVectorBoxData(x, i) + + hypre_BoxOffsetDistance(x_data_box, stencil_shape[si]); + + hypre_BoxGetStrideSize(compute_box, base_stride, + loop_size); + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, base_stride, Ai, + x_data_box, start, base_stride, xi, + r_data_box, start, base_stride, ri); + #define HYPRE_BOX_SMP_PRIVATE 
loopk,loopi,loopj,Ai,xi,ri + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, ri) + { + rp[ri] -= Ap[Ai] * xp[xi]; + } + hypre_BoxLoop3End(Ai, xi, ri); + } + } + } + } + + /*----------------------------------------------------------------------- + * Return + *-----------------------------------------------------------------------*/ + + hypre_IncFLOPCount(residual_data -> flops); + hypre_EndTiming(residual_data -> time_index); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGResidualSetBase + *--------------------------------------------------------------------------*/ + + int + hypre_SMGResidualSetBase( void *residual_vdata, + hypre_Index base_index, + hypre_Index base_stride ) + { + hypre_SMGResidualData *residual_data = residual_vdata; + int d; + int ierr = 0; + + for (d = 0; d < 3; d++) + { + hypre_IndexD((residual_data -> base_index), d) + = hypre_IndexD(base_index, d); + hypre_IndexD((residual_data -> base_stride), d) + = hypre_IndexD(base_stride, d); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGResidualDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_SMGResidualDestroy( void *residual_vdata ) + { + int ierr = 0; + + hypre_SMGResidualData *residual_data = residual_vdata; + + if (residual_data) + { + hypre_StructMatrixDestroy(residual_data -> A); + hypre_StructVectorDestroy(residual_data -> x); + hypre_StructVectorDestroy(residual_data -> b); + hypre_StructVectorDestroy(residual_data -> r); + hypre_BoxArrayDestroy(residual_data -> base_points); + hypre_ComputePkgDestroy(residual_data -> compute_pkg ); + hypre_FinalizeTiming(residual_data -> time_index); + hypre_TFree(residual_data); + } + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup.c diff -c /dev/null 
llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,435 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + #include "smg.h" + + #define DEBUG 0 + + /*-------------------------------------------------------------------------- + * hypre_SMGSetup + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetup( void *smg_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + hypre_SMGData *smg_data = smg_vdata; + + MPI_Comm comm = (smg_data -> comm); + hypre_IndexRef base_index = (smg_data -> base_index); + hypre_IndexRef base_stride = (smg_data -> base_stride); + + int n_pre = (smg_data -> num_pre_relax); + int n_post = (smg_data -> num_post_relax); + + int max_iter; + int max_levels; + + int num_levels; + + int cdir; + + hypre_Index bindex; + hypre_Index bstride; + hypre_Index cindex; + hypre_Index findex; + hypre_Index stride; + + hypre_StructGrid **grid_l; + hypre_StructGrid **PT_grid_l; + + double *data; + int data_size = 0; + hypre_StructMatrix **A_l; + hypre_StructMatrix **PT_l; + hypre_StructMatrix **R_l; + hypre_StructVector **b_l; + hypre_StructVector **x_l; + + /* temp vectors */ + hypre_StructVector **tb_l; + hypre_StructVector **tx_l; + hypre_StructVector **r_l; + hypre_StructVector **e_l; + double 
*b_data; + double *x_data; + int b_data_alloced; + int x_data_alloced; + + void **relax_data_l; + void **residual_data_l; + void **restrict_data_l; + void **interp_data_l; + + hypre_StructGrid *grid; + + hypre_Box *cbox; + + int i, l; + + int b_num_ghost[] = {0, 0, 0, 0, 0, 0}; + int x_num_ghost[] = {0, 0, 0, 0, 0, 0}; + + int ierr = 0; + #if DEBUG + char filename[255]; + #endif + + /*----------------------------------------------------- + * Set up coarsening direction + *-----------------------------------------------------*/ + + cdir = hypre_StructStencilDim(hypre_StructMatrixStencil(A)) - 1; + (smg_data -> cdir) = cdir; + + /*----------------------------------------------------- + * Set up coarse grids + *-----------------------------------------------------*/ + + grid = hypre_StructMatrixGrid(A); + + /* Compute a new max_levels value based on the grid */ + cbox = hypre_BoxDuplicate(hypre_StructGridBoundingBox(grid)); + max_levels = hypre_Log2(hypre_BoxSizeD(cbox, cdir)) + 2; + if ((smg_data -> max_levels) > 0) + { + max_levels = hypre_min(max_levels, (smg_data -> max_levels)); + } + (smg_data -> max_levels) = max_levels; + + grid_l = hypre_TAlloc(hypre_StructGrid *, max_levels); + PT_grid_l = hypre_TAlloc(hypre_StructGrid *, max_levels); + PT_grid_l[0] = NULL; + hypre_StructGridRef(grid, &grid_l[0]); + for (l = 0; ; l++) + { + /* set cindex and stride */ + hypre_SMGSetCIndex(base_index, base_stride, l, cdir, cindex); + hypre_SMGSetStride(base_index, base_stride, l, cdir, stride); + + /* check to see if we should coarsen */ + if ( ( hypre_BoxIMinD(cbox, cdir) == hypre_BoxIMaxD(cbox, cdir) ) || + (l == (max_levels - 1)) ) + { + /* stop coarsening */ + break; + } + + /* coarsen cbox */ + hypre_ProjectBox(cbox, cindex, stride); + hypre_StructMapFineToCoarse(hypre_BoxIMin(cbox), cindex, stride, + hypre_BoxIMin(cbox)); + hypre_StructMapFineToCoarse(hypre_BoxIMax(cbox), cindex, stride, + hypre_BoxIMax(cbox)); + + /* build the interpolation grid */ + 
hypre_StructCoarsen(grid_l[l], cindex, stride, 0, &PT_grid_l[l+1]); + + /* build the coarse grid */ + hypre_StructCoarsen(grid_l[l], cindex, stride, 1, &grid_l[l+1]); + } + num_levels = l + 1; + + /* free up some things */ + hypre_BoxDestroy(cbox); + + (smg_data -> num_levels) = num_levels; + (smg_data -> grid_l) = grid_l; + (smg_data -> PT_grid_l) = PT_grid_l; + + /*----------------------------------------------------- + * Set up matrix and vector structures + *-----------------------------------------------------*/ + + A_l = hypre_TAlloc(hypre_StructMatrix *, num_levels); + PT_l = hypre_TAlloc(hypre_StructMatrix *, num_levels - 1); + R_l = hypre_TAlloc(hypre_StructMatrix *, num_levels - 1); + b_l = hypre_TAlloc(hypre_StructVector *, num_levels); + x_l = hypre_TAlloc(hypre_StructVector *, num_levels); + tb_l = hypre_TAlloc(hypre_StructVector *, num_levels); + tx_l = hypre_TAlloc(hypre_StructVector *, num_levels); + r_l = tx_l; + e_l = tx_l; + + A_l[0] = hypre_StructMatrixRef(A); + b_l[0] = hypre_StructVectorRef(b); + x_l[0] = hypre_StructVectorRef(x); + + for (i = 0; i <= cdir; i++) + { + x_num_ghost[2*i] = 1; + x_num_ghost[2*i + 1] = 1; + } + + tb_l[0] = hypre_StructVectorCreate(comm, grid_l[0]); + hypre_StructVectorSetNumGhost(tb_l[0], hypre_StructVectorNumGhost(b)); + hypre_StructVectorInitializeShell(tb_l[0]); + data_size += hypre_StructVectorDataSize(tb_l[0]); + + tx_l[0] = hypre_StructVectorCreate(comm, grid_l[0]); + hypre_StructVectorSetNumGhost(tx_l[0], hypre_StructVectorNumGhost(x)); + hypre_StructVectorInitializeShell(tx_l[0]); + data_size += hypre_StructVectorDataSize(tx_l[0]); + + for (l = 0; l < (num_levels - 1); l++) + { + PT_l[l] = hypre_SMGCreateInterpOp(A_l[l], PT_grid_l[l+1], cdir); + hypre_StructMatrixInitializeShell(PT_l[l]); + data_size += hypre_StructMatrixDataSize(PT_l[l]); + + if (hypre_StructMatrixSymmetric(A)) + { + R_l[l] = PT_l[l]; + } + else + { + R_l[l] = PT_l[l]; + #if 0 + /* Allow R != PT for non symmetric case */ + /* NOTE: Need to 
create a non-pruned grid for this to work */ + R_l[l] = hypre_SMGCreateRestrictOp(A_l[l], grid_l[l+1], cdir); + hypre_StructMatrixInitializeShell(R_l[l]); + data_size += hypre_StructMatrixDataSize(R_l[l]); + #endif + } + + A_l[l+1] = hypre_SMGCreateRAPOp(R_l[l], A_l[l], PT_l[l], grid_l[l+1]); + hypre_StructMatrixInitializeShell(A_l[l+1]); + data_size += hypre_StructMatrixDataSize(A_l[l+1]); + + b_l[l+1] = hypre_StructVectorCreate(comm, grid_l[l+1]); + hypre_StructVectorSetNumGhost(b_l[l+1], b_num_ghost); + hypre_StructVectorInitializeShell(b_l[l+1]); + data_size += hypre_StructVectorDataSize(b_l[l+1]); + + x_l[l+1] = hypre_StructVectorCreate(comm, grid_l[l+1]); + hypre_StructVectorSetNumGhost(x_l[l+1], x_num_ghost); + hypre_StructVectorInitializeShell(x_l[l+1]); + data_size += hypre_StructVectorDataSize(x_l[l+1]); + + tb_l[l+1] = hypre_StructVectorCreate(comm, grid_l[l+1]); + hypre_StructVectorSetNumGhost(tb_l[l+1], hypre_StructVectorNumGhost(b)); + hypre_StructVectorInitializeShell(tb_l[l+1]); + + tx_l[l+1] = hypre_StructVectorCreate(comm, grid_l[l+1]); + hypre_StructVectorSetNumGhost(tx_l[l+1], hypre_StructVectorNumGhost(x)); + hypre_StructVectorInitializeShell(tx_l[l+1]); + } + + data = hypre_SharedCTAlloc(double, data_size); + (smg_data -> data) = data; + + hypre_StructVectorInitializeData(tb_l[0], data); + hypre_StructVectorAssemble(tb_l[0]); + data += hypre_StructVectorDataSize(tb_l[0]); + + hypre_StructVectorInitializeData(tx_l[0], data); + hypre_StructVectorAssemble(tx_l[0]); + data += hypre_StructVectorDataSize(tx_l[0]); + + for (l = 0; l < (num_levels - 1); l++) + { + hypre_StructMatrixInitializeData(PT_l[l], data); + data += hypre_StructMatrixDataSize(PT_l[l]); + + #if 0 + /* Allow R != PT for non symmetric case */ + if (!hypre_StructMatrixSymmetric(A)) + { + hypre_StructMatrixInitializeData(R_l[l], data); + data += hypre_StructMatrixDataSize(R_l[l]); + } + #endif + + hypre_StructMatrixInitializeData(A_l[l+1], data); + data += 
hypre_StructMatrixDataSize(A_l[l+1]); + + hypre_StructVectorInitializeData(b_l[l+1], data); + hypre_StructVectorAssemble(b_l[l+1]); + data += hypre_StructVectorDataSize(b_l[l+1]); + + hypre_StructVectorInitializeData(x_l[l+1], data); + hypre_StructVectorAssemble(x_l[l+1]); + data += hypre_StructVectorDataSize(x_l[l+1]); + + hypre_StructVectorInitializeData(tb_l[l+1], + hypre_StructVectorData(tb_l[0])); + hypre_StructVectorAssemble(tb_l[l+1]); + + hypre_StructVectorInitializeData(tx_l[l+1], + hypre_StructVectorData(tx_l[0])); + hypre_StructVectorAssemble(tx_l[l+1]); + } + + (smg_data -> A_l) = A_l; + (smg_data -> PT_l) = PT_l; + (smg_data -> R_l) = R_l; + (smg_data -> b_l) = b_l; + (smg_data -> x_l) = x_l; + (smg_data -> tb_l) = tb_l; + (smg_data -> tx_l) = tx_l; + (smg_data -> r_l) = r_l; + (smg_data -> e_l) = e_l; + + /*----------------------------------------------------- + * Set up multigrid operators and call setup routines + * + * Note: The routine that sets up interpolation uses + * the same relaxation routines used in the solve + * phase of the algorithm. To do this, the data for + * the fine-grid unknown and right-hand-side vectors + * is temporarily changed to temporary data. 
+ *-----------------------------------------------------*/ + + relax_data_l = hypre_TAlloc(void *, num_levels); + residual_data_l = hypre_TAlloc(void *, num_levels); + restrict_data_l = hypre_TAlloc(void *, num_levels); + interp_data_l = hypre_TAlloc(void *, num_levels); + + /* temporarily set the data for x_l[0] and b_l[0] to temp data */ + b_data = hypre_StructVectorData(b_l[0]); + b_data_alloced = hypre_StructVectorDataAlloced(b_l[0]); + x_data = hypre_StructVectorData(x_l[0]); + x_data_alloced = hypre_StructVectorDataAlloced(x_l[0]); + hypre_StructVectorInitializeData(b_l[0], hypre_StructVectorData(tb_l[0])); + hypre_StructVectorInitializeData(x_l[0], hypre_StructVectorData(tx_l[0])); + hypre_StructVectorAssemble(b_l[0]); + hypre_StructVectorAssemble(x_l[0]); + + for (l = 0; l < (num_levels - 1); l++) + { + hypre_SMGSetBIndex(base_index, base_stride, l, bindex); + hypre_SMGSetBStride(base_index, base_stride, l, bstride); + hypre_SMGSetCIndex(base_index, base_stride, l, cdir, cindex); + hypre_SMGSetFIndex(base_index, base_stride, l, cdir, findex); + hypre_SMGSetStride(base_index, base_stride, l, cdir, stride); + + /* set up relaxation */ + relax_data_l[l] = hypre_SMGRelaxCreate(comm); + hypre_SMGRelaxSetBase(relax_data_l[l], bindex, bstride); + hypre_SMGRelaxSetMemoryUse(relax_data_l[l], (smg_data -> memory_use)); + hypre_SMGRelaxSetTol(relax_data_l[l], 0.0); + hypre_SMGRelaxSetNumSpaces(relax_data_l[l], 2); + hypre_SMGRelaxSetSpace(relax_data_l[l], 0, + hypre_IndexD(cindex, cdir), + hypre_IndexD(stride, cdir)); + hypre_SMGRelaxSetSpace(relax_data_l[l], 1, + hypre_IndexD(findex, cdir), + hypre_IndexD(stride, cdir)); + hypre_SMGRelaxSetTempVec(relax_data_l[l], tb_l[l]); + hypre_SMGRelaxSetNumPreRelax( relax_data_l[l], n_pre); + hypre_SMGRelaxSetNumPostRelax( relax_data_l[l], n_post); + hypre_SMGRelaxSetup(relax_data_l[l], A_l[l], b_l[l], x_l[l]); + + hypre_SMGSetupInterpOp(relax_data_l[l], A_l[l], b_l[l], x_l[l], + PT_l[l], cdir, cindex, findex, stride); + + /* 
(re)set relaxation parameters */ + hypre_SMGRelaxSetNumPreSpaces(relax_data_l[l], 0); + hypre_SMGRelaxSetNumRegSpaces(relax_data_l[l], 2); + hypre_SMGRelaxSetup(relax_data_l[l], A_l[l], b_l[l], x_l[l]); + + /* set up the residual routine */ + residual_data_l[l] = hypre_SMGResidualCreate(); + hypre_SMGResidualSetBase(residual_data_l[l], bindex, bstride); + hypre_SMGResidualSetup(residual_data_l[l], + A_l[l], x_l[l], b_l[l], r_l[l]); + + /* set up the interpolation routine */ + interp_data_l[l] = hypre_SemiInterpCreate(); + hypre_SemiInterpSetup(interp_data_l[l], PT_l[l], 1, x_l[l+1], e_l[l], + cindex, findex, stride); + + /* set up the restriction operator */ + #if 0 + /* Allow R != PT for non symmetric case */ + if (!hypre_StructMatrixSymmetric(A)) + hypre_SMGSetupRestrictOp(A_l[l], R_l[l], tx_l[l], cdir, + cindex, stride); + #endif + + /* set up the restriction routine */ + restrict_data_l[l] = hypre_SemiRestrictCreate(); + hypre_SemiRestrictSetup(restrict_data_l[l], R_l[l], 0, r_l[l], b_l[l+1], + cindex, findex, stride); + + /* set up the coarse grid operator */ + hypre_SMGSetupRAPOp(R_l[l], A_l[l], PT_l[l], A_l[l+1], + cindex, stride); + } + + hypre_SMGSetBIndex(base_index, base_stride, l, bindex); + hypre_SMGSetBStride(base_index, base_stride, l, bstride); + relax_data_l[l] = hypre_SMGRelaxCreate(comm); + hypre_SMGRelaxSetBase(relax_data_l[l], bindex, bstride); + hypre_SMGRelaxSetTol(relax_data_l[l], 0.0); + hypre_SMGRelaxSetMaxIter(relax_data_l[l], 1); + hypre_SMGRelaxSetTempVec(relax_data_l[l], tb_l[l]); + hypre_SMGRelaxSetNumPreRelax( relax_data_l[l], n_pre); + hypre_SMGRelaxSetNumPostRelax( relax_data_l[l], n_post); + hypre_SMGRelaxSetup(relax_data_l[l], A_l[l], b_l[l], x_l[l]); + + /* set up the residual routine in case of a single grid level */ + if( l == 0 ) + { + residual_data_l[l] = hypre_SMGResidualCreate(); + hypre_SMGResidualSetBase(residual_data_l[l], bindex, bstride); + hypre_SMGResidualSetup(residual_data_l[l], + A_l[l], x_l[l], b_l[l], r_l[l]); 
+ } + + /* set the data for x_l[0] and b_l[0] the way they were */ + hypre_StructVectorInitializeData(b_l[0], b_data); + hypre_StructVectorDataAlloced(b_l[0]) = b_data_alloced; + hypre_StructVectorInitializeData(x_l[0], x_data); + hypre_StructVectorDataAlloced(x_l[0]) = x_data_alloced; + hypre_StructVectorAssemble(b_l[0]); + hypre_StructVectorAssemble(x_l[0]); + + (smg_data -> relax_data_l) = relax_data_l; + (smg_data -> residual_data_l) = residual_data_l; + (smg_data -> restrict_data_l) = restrict_data_l; + (smg_data -> interp_data_l) = interp_data_l; + + /*----------------------------------------------------- + * Allocate space for log info + *-----------------------------------------------------*/ + + if ((smg_data -> logging) > 0) + { + max_iter = (smg_data -> max_iter); + (smg_data -> norms) = hypre_TAlloc(double, max_iter); + (smg_data -> rel_norms) = hypre_TAlloc(double, max_iter); + } + + #if DEBUG + if(hypre_StructGridDim(grid_l[0]) == 3) + { + for (l = 0; l < (num_levels - 1); l++) + { + sprintf(filename, "zout_A.%02d", l); + hypre_StructMatrixPrint(filename, A_l[l], 0); + sprintf(filename, "zout_PT.%02d", l); + hypre_StructMatrixPrint(filename, PT_l[l], 0); + } + sprintf(filename, "zout_A.%02d", l); + hypre_StructMatrixPrint(filename, A_l[l], 0); + } + #endif + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_interp.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_interp.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_interp.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,315 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + #include "smg.h" + + /*-------------------------------------------------------------------------- + * hypre_SMGCreateInterpOp + *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_SMGCreateInterpOp( hypre_StructMatrix *A, + hypre_StructGrid *cgrid, + int cdir ) + { + hypre_StructMatrix *PT; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + int stencil_size; + int stencil_dim; + + int num_ghost[] = {1, 1, 1, 1, 1, 1}; + + int i; + + /* set up stencil */ + stencil_size = 2; + stencil_dim = hypre_StructStencilDim(hypre_StructMatrixStencil(A)); + stencil_shape = hypre_CTAlloc(hypre_Index, stencil_size); + for (i = 0; i < stencil_size; i++) + { + hypre_SetIndex(stencil_shape[i], 0, 0, 0); + } + hypre_IndexD(stencil_shape[0], cdir) = -1; + hypre_IndexD(stencil_shape[1], cdir) = 1; + stencil = + hypre_StructStencilCreate(stencil_dim, stencil_size, stencil_shape); + + /* set up matrix */ + PT = hypre_StructMatrixCreate(hypre_StructMatrixComm(A), cgrid, stencil); + hypre_StructMatrixSetNumGhost(PT, num_ghost); + + hypre_StructStencilDestroy(stencil); + + return PT; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetupInterpOp + * + * This routine uses SMGRelax to set up the interpolation operator. + * + * To illustrate how it proceeds, consider setting up the {0, 0, -1} + * stencil coefficient of P^T. This coefficient corresponds to the + * {0, 0, 1} coefficient of P. Do one sweep of plane relaxation on the + * fine grid points for the system, A_mask x = b, with initial guess + * x_0 = all ones and right-hand-side b = all zeros. 
The A_mask matrix + * contains all coefficients of A except for those in the same direction + * as {0, 0, -1}. + * + * The relaxation data for the multigrid algorithm is passed in and used. + * When this routine returns, the only modified relaxation parameters + * are MaxIter, RegSpace and PreSpace info, the right-hand-side and + * solution info. + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetupInterpOp( void *relax_data, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x, + hypre_StructMatrix *PT, + int cdir, + hypre_Index cindex, + hypre_Index findex, + hypre_Index stride ) + { + hypre_StructMatrix *A_mask; + + hypre_StructStencil *A_stencil; + hypre_Index *A_stencil_shape; + int A_stencil_size; + hypre_StructStencil *PT_stencil; + hypre_Index *PT_stencil_shape; + int PT_stencil_size; + + int *stencil_indices; + int num_stencil_indices; + + hypre_StructGrid *fgrid; + + hypre_StructStencil *compute_pkg_stencil; + hypre_Index *compute_pkg_stencil_shape; + int compute_pkg_stencil_size = 1; + int compute_pkg_stencil_dim = 1; + hypre_ComputePkg *compute_pkg; + + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + + hypre_CommHandle *comm_handle; + + hypre_BoxArrayArray *compute_box_aa; + hypre_BoxArray *compute_box_a; + hypre_Box *compute_box; + + hypre_Box *PT_data_box; + hypre_Box *x_data_box; + double *PTp; + double *xp; + int PTi; + int xi; + + hypre_Index loop_size; + hypre_Index start; + hypre_Index startc; + hypre_Index stridec; + + int si, sj, d; + int compute_i, i, j; + int loopi, loopj, loopk; + + int ierr = 0; + + /*-------------------------------------------------------- + * Initialize some things + *--------------------------------------------------------*/ + + hypre_SetIndex(stridec, 1, 1, 1); + + fgrid = hypre_StructMatrixGrid(A); + + 
A_stencil = hypre_StructMatrixStencil(A); + A_stencil_shape = hypre_StructStencilShape(A_stencil); + A_stencil_size = hypre_StructStencilSize(A_stencil); + PT_stencil = hypre_StructMatrixStencil(PT); + PT_stencil_shape = hypre_StructStencilShape(PT_stencil); + PT_stencil_size = hypre_StructStencilSize(PT_stencil); + + /* Set up relaxation parameters */ + hypre_SMGRelaxSetMaxIter(relax_data, 1); + hypre_SMGRelaxSetNumPreSpaces(relax_data, 0); + hypre_SMGRelaxSetNumRegSpaces(relax_data, 1); + hypre_SMGRelaxSetRegSpaceRank(relax_data, 0, 1); + + compute_pkg_stencil_shape = + hypre_CTAlloc(hypre_Index, compute_pkg_stencil_size); + compute_pkg_stencil = hypre_StructStencilCreate(compute_pkg_stencil_dim, + compute_pkg_stencil_size, + compute_pkg_stencil_shape); + + for (si = 0; si < PT_stencil_size; si++) + { + /*----------------------------------------------------- + * Compute A_mask matrix: This matrix contains all + * stencil coefficients of A except for the coefficients + * in the opposite direction of the current P stencil + * coefficient being computed (same direction for P^T). 
+ *-----------------------------------------------------*/ + + stencil_indices = hypre_TAlloc(int, A_stencil_size); + num_stencil_indices = 0; + for (sj = 0; sj < A_stencil_size; sj++) + { + if (hypre_IndexD(A_stencil_shape[sj], cdir) != + hypre_IndexD(PT_stencil_shape[si], cdir) ) + { + stencil_indices[num_stencil_indices] = sj; + num_stencil_indices++; + } + } + A_mask = + hypre_StructMatrixCreateMask(A, num_stencil_indices, stencil_indices); + hypre_TFree(stencil_indices); + + /*----------------------------------------------------- + * Do relaxation sweep to compute coefficients + *-----------------------------------------------------*/ + + hypre_StructVectorClearGhostValues(x); + hypre_StructVectorSetConstantValues(x, 1.0); + hypre_StructVectorSetConstantValues(b, 0.0); + hypre_SMGRelaxSetNewMatrixStencil(relax_data, PT_stencil); + hypre_SMGRelaxSetup(relax_data, A_mask, b, x); + hypre_SMGRelax(relax_data, A_mask, b, x); + + /*----------------------------------------------------- + * Free up A_mask matrix + *-----------------------------------------------------*/ + + hypre_StructMatrixDestroy(A_mask); + + /*----------------------------------------------------- + * Set up compute package for communication of + * coefficients from fine to coarse across processor + * boundaries. 
+ *-----------------------------------------------------*/ + + hypre_CopyIndex(PT_stencil_shape[si], compute_pkg_stencil_shape[0]); + hypre_CreateComputeInfo(fgrid, compute_pkg_stencil, + &send_boxes, &recv_boxes, + &send_processes, &recv_processes, + &indt_boxes, &dept_boxes); + + hypre_ProjectBoxArrayArray(send_boxes, findex, stride); + hypre_ProjectBoxArrayArray(recv_boxes, findex, stride); + hypre_ProjectBoxArrayArray(indt_boxes, cindex, stride); + hypre_ProjectBoxArrayArray(dept_boxes, cindex, stride); + hypre_ComputePkgCreate(send_boxes, recv_boxes, + stride, stride, + send_processes, recv_processes, + indt_boxes, dept_boxes, + stride, fgrid, + hypre_StructVectorDataSpace(x), 1, + &compute_pkg); + + /*----------------------------------------------------- + * Copy coefficients from x into P^T + *-----------------------------------------------------*/ + + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + xp = hypre_StructVectorData(x); + hypre_InitializeIndtComputations(compute_pkg, xp, &comm_handle); + compute_box_aa = hypre_ComputePkgIndtBoxes(compute_pkg); + } + break; + + case 1: + { + hypre_FinalizeIndtComputations(comm_handle); + compute_box_aa = hypre_ComputePkgDeptBoxes(compute_pkg); + } + break; + } + + hypre_ForBoxArrayI(i, compute_box_aa) + { + compute_box_a = + hypre_BoxArrayArrayBoxArray(compute_box_aa, i); + + x_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + PT_data_box = + hypre_BoxArrayBox(hypre_StructMatrixDataSpace(PT), i); + + xp = hypre_StructVectorBoxData(x, i); + PTp = hypre_StructMatrixBoxData(PT, i, si); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + hypre_CopyIndex(hypre_BoxIMin(compute_box), start); + hypre_StructMapFineToCoarse(start, cindex, stride, + startc); + + /* shift start index to appropriate F-point */ + for (d = 0; d < 3; d++) + { + hypre_IndexD(start, d) += + hypre_IndexD(PT_stencil_shape[si], d); + } + + 
hypre_BoxGetStrideSize(compute_box, stride, loop_size); + hypre_BoxLoop2Begin(loop_size, + x_data_box, start, stride, xi, + PT_data_box, startc, stridec, PTi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,xi,PTi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, xi, PTi) + { + PTp[PTi] = xp[xi]; + } + hypre_BoxLoop2End(xi, PTi); + } + } + } + + /*----------------------------------------------------- + * Free up compute package info + *-----------------------------------------------------*/ + + hypre_ComputePkgDestroy(compute_pkg); + } + + /* Tell SMGRelax that the stencil has changed */ + hypre_SMGRelaxSetNewMatrixStencil(relax_data, PT_stencil); + + hypre_StructStencilDestroy(compute_pkg_stencil); + + hypre_StructMatrixAssemble(PT); + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_rap.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_rap.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_rap.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,135 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + #include "smg.h" + + /*-------------------------------------------------------------------------- + * hypre_SMGCreateRAPOp + * + * Wrapper for 2 and 3d CreateRAPOp routines which set up new coarse + * grid structures. 
+ *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_SMGCreateRAPOp( hypre_StructMatrix *R, + hypre_StructMatrix *A, + hypre_StructMatrix *PT, + hypre_StructGrid *coarse_grid ) + { + hypre_StructMatrix *RAP; + hypre_StructStencil *stencil; + + stencil = hypre_StructMatrixStencil(A); + + switch (hypre_StructStencilDim(stencil)) + { + case 2: + RAP = hypre_SMG2CreateRAPOp(R ,A, PT, coarse_grid); + break; + + case 3: + RAP = hypre_SMG3CreateRAPOp(R ,A, PT, coarse_grid); + break; + } + + return RAP; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetupRAPOp + * + * Wrapper for 2 and 3d, symmetric and non-symmetric routines to calculate + * entries in RAP. Incomplete error handling at the moment. + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetupRAPOp( hypre_StructMatrix *R, + hypre_StructMatrix *A, + hypre_StructMatrix *PT, + hypre_StructMatrix *Ac, + hypre_Index cindex, + hypre_Index cstride ) + { + int ierr = 0; + + hypre_StructStencil *stencil; + + stencil = hypre_StructMatrixStencil(A); + + switch (hypre_StructStencilDim(stencil)) + { + + case 2: + + /*-------------------------------------------------------------------- + * Set lower triangular (+ diagonal) coefficients + *--------------------------------------------------------------------*/ + ierr = hypre_SMG2BuildRAPSym(A, PT, R, Ac, cindex, cstride); + + /*-------------------------------------------------------------------- + * For non-symmetric A, set upper triangular coefficients as well + *--------------------------------------------------------------------*/ + if(!hypre_StructMatrixSymmetric(A)) + { + ierr += hypre_SMG2BuildRAPNoSym(A, PT, R, Ac, cindex, cstride); + /*----------------------------------------------------------------- + * Collapse stencil for periodic problems on coarsest grid. 
+ *-----------------------------------------------------------------*/ + ierr = hypre_SMG2RAPPeriodicNoSym(Ac, cindex, cstride); + } + else + { + /*----------------------------------------------------------------- + * Collapse stencil for periodic problems on coarsest grid. + *-----------------------------------------------------------------*/ + ierr = hypre_SMG2RAPPeriodicSym(Ac, cindex, cstride); + } + + break; + + case 3: + + /*-------------------------------------------------------------------- + * Set lower triangular (+ diagonal) coefficients + *--------------------------------------------------------------------*/ + ierr = hypre_SMG3BuildRAPSym(A, PT, R, Ac, cindex, cstride); + + /*-------------------------------------------------------------------- + * For non-symmetric A, set upper triangular coefficients as well + *--------------------------------------------------------------------*/ + if(!hypre_StructMatrixSymmetric(A)) + { + ierr += hypre_SMG3BuildRAPNoSym(A, PT, R, Ac, cindex, cstride); + /*----------------------------------------------------------------- + * Collapse stencil for periodic problems on coarsest grid. + *-----------------------------------------------------------------*/ + ierr = hypre_SMG3RAPPeriodicNoSym(Ac, cindex, cstride); + } + else + { + /*----------------------------------------------------------------- + * Collapse stencil for periodic problems on coarsest grid. 
+ *-----------------------------------------------------------------*/ + ierr = hypre_SMG3RAPPeriodicSym(Ac, cindex, cstride); + } + + break; + + } + + hypre_StructMatrixAssemble(Ac); + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_restrict.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_restrict.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_setup_restrict.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,46 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + #include "smg.h" + + /*-------------------------------------------------------------------------- + * hypre_SMGCreateRestrictOp + *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_SMGCreateRestrictOp( hypre_StructMatrix *A, + hypre_StructGrid *cgrid, + int cdir ) + { + hypre_StructMatrix *R = NULL; + + return R; + } + + /*-------------------------------------------------------------------------- + * hypre_SMGSetupRestrictOp + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSetupRestrictOp( hypre_StructMatrix *A, + hypre_StructMatrix *R, + hypre_StructVector *temp_vec, + int cdir, + hypre_Index cindex, + hypre_Index cstride ) + { + int ierr = 0; + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_solve.c diff -c /dev/null 
llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_solve.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg_solve.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,327 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * + *****************************************************************************/ + + #include "headers.h" + #include "smg.h" + + #define DEBUG 0 + + /*-------------------------------------------------------------------------- + * hypre_SMGSolve + * This is the main solve routine for the Schaffer multigrid method. + * This solver works for 1D, 2D, or 3D linear systems. The dimension + * is determined by the hypre_StructStencilDim argument of the matrix + * stencil. The hypre_StructGridDim argument of the matrix grid is + * allowed to be larger than the dimension of the solver, and in fact, + * this feature is used in the smaller-dimensional solves required + * in the relaxation method for both the 2D and 3D algorithms. This + * allows one to do multiple 2D or 1D solves in parallel (e.g., multiple + * 2D solves, where the 2D problems are "stacked" planes in 3D). + * The only additional requirement is that the linear system(s) data + * be contiguous in memory. + * + * Notes: + * - Iterations are counted as follows: 1 iteration consists of a + * V-cycle plus an extra pre-relaxation. If the number of MG levels + * is equal to 1, then only the extra pre-relaxation step is done at + * each iteration. 
When the solver exits because the maximum number + * of iterations is reached, the last extra pre-relaxation is not done. + * This allows one to use the solver as a preconditioner for conjugate + * gradient and ensure symmetry. + * - hypre_SMGRelax is the relaxation routine. There are different "data" + * structures for each call to reflect different arguments and parameters. + * One important parameter sets whether or not an initial guess of zero + * is to be used in the relaxation. + * - hypre_SMGResidual computes the residual, b - Ax. + * - hypre_SemiRestrict restricts the residual to the coarse grid. + * - hypre_SemiInterp interpolates the coarse error and adds it to the + * fine grid solution. + * + *--------------------------------------------------------------------------*/ + + int + hypre_SMGSolve( void *smg_vdata, + hypre_StructMatrix *A, + hypre_StructVector *b, + hypre_StructVector *x ) + { + + hypre_SMGData *smg_data = smg_vdata; + + double tol = (smg_data -> tol); + int max_iter = (smg_data -> max_iter); + int rel_change = (smg_data -> rel_change); + int zero_guess = (smg_data -> zero_guess); + int num_levels = (smg_data -> num_levels); + int num_pre_relax = (smg_data -> num_pre_relax); + int num_post_relax = (smg_data -> num_post_relax); + hypre_IndexRef base_index = (smg_data -> base_index); + hypre_IndexRef base_stride = (smg_data -> base_stride); + hypre_StructMatrix **A_l = (smg_data -> A_l); + hypre_StructMatrix **PT_l = (smg_data -> PT_l); + hypre_StructMatrix **R_l = (smg_data -> R_l); + hypre_StructVector **b_l = (smg_data -> b_l); + hypre_StructVector **x_l = (smg_data -> x_l); + hypre_StructVector **r_l = (smg_data -> r_l); + hypre_StructVector **e_l = (smg_data -> e_l); + void **relax_data_l = (smg_data -> relax_data_l); + void **residual_data_l = (smg_data -> residual_data_l); + void **restrict_data_l = (smg_data -> restrict_data_l); + void **interp_data_l = (smg_data -> interp_data_l); + int logging = (smg_data -> logging); + double
*norms = (smg_data -> norms); + double *rel_norms = (smg_data -> rel_norms); + + double b_dot_b, r_dot_r, eps; + double e_dot_e, x_dot_x; + + int i, l; + + int ierr = 0; + #if DEBUG + char filename[255]; + #endif + + /*----------------------------------------------------- + * Initialize some things and deal with special cases + *-----------------------------------------------------*/ + + hypre_BeginTiming(smg_data -> time_index); + + hypre_StructMatrixDestroy(A_l[0]); + hypre_StructVectorDestroy(b_l[0]); + hypre_StructVectorDestroy(x_l[0]); + A_l[0] = hypre_StructMatrixRef(A); + b_l[0] = hypre_StructVectorRef(b); + x_l[0] = hypre_StructVectorRef(x); + + (smg_data -> num_iterations) = 0; + + /* if max_iter is zero, return */ + if (max_iter == 0) + { + /* if using a zero initial guess, return zero */ + if (zero_guess) + { + hypre_StructVectorSetConstantValues(x, 0.0); + } + + hypre_EndTiming(smg_data -> time_index); + return ierr; + } + + /* part of convergence check */ + if (tol > 0.0) + { + /* eps = (tol^2) */ + b_dot_b = hypre_StructInnerProd(b_l[0], b_l[0]); + eps = tol*tol; + + /* if rhs is zero, return a zero solution */ + if (b_dot_b == 0.0) + { + hypre_StructVectorSetConstantValues(x, 0.0); + if (logging > 0) + { + norms[0] = 0.0; + rel_norms[0] = 0.0; + } + + hypre_EndTiming(smg_data -> time_index); + return ierr; + } + } + + /*----------------------------------------------------- + * Do V-cycles: + * For each index l, "fine" = l, "coarse" = (l+1) + *-----------------------------------------------------*/ + + for (i = 0; i < max_iter; i++) + { + /*-------------------------------------------------- + * Down cycle + *--------------------------------------------------*/ + + /* fine grid pre-relaxation */ + if (num_levels > 1) + { + hypre_SMGRelaxSetRegSpaceRank(relax_data_l[0], 0, 0); + hypre_SMGRelaxSetRegSpaceRank(relax_data_l[0], 1, 1); + } + hypre_SMGRelaxSetMaxIter(relax_data_l[0], num_pre_relax); + hypre_SMGRelaxSetZeroGuess(relax_data_l[0], zero_guess); 
+ hypre_SMGRelax(relax_data_l[0], A_l[0], b_l[0], x_l[0]); + zero_guess = 0; + + /* compute fine grid residual (b - Ax) */ + hypre_SMGResidual(residual_data_l[0], A_l[0], x_l[0], b_l[0], r_l[0]); + + /* convergence check */ + if (tol > 0.0) + { + r_dot_r = hypre_StructInnerProd(r_l[0], r_l[0]); + + if (logging > 0) + { + norms[i] = sqrt(r_dot_r); + if (b_dot_b > 0) + rel_norms[i] = sqrt(r_dot_r/b_dot_b); + else + rel_norms[i] = 0.0; + } + + /* always do at least 1 V-cycle */ + if ((r_dot_r/b_dot_b < eps) && (i > 0)) + { + if (rel_change) + { + if ((e_dot_e/x_dot_x) < eps) + break; + } + else + { + break; + } + } + } + + if (num_levels > 1) + { + /* restrict fine grid residual */ + hypre_SemiRestrict(restrict_data_l[0], R_l[0], r_l[0], b_l[1]); + #if DEBUG + if(hypre_StructStencilDim(hypre_StructMatrixStencil(A)) == 3) + { + sprintf(filename, "zout_xdown.%02d", 0); + hypre_StructVectorPrint(filename, x_l[0], 0); + sprintf(filename, "zout_rdown.%02d", 0); + hypre_StructVectorPrint(filename, r_l[0], 0); + sprintf(filename, "zout_b.%02d", 1); + hypre_StructVectorPrint(filename, b_l[1], 0); + } + #endif + for (l = 1; l <= (num_levels - 2); l++) + { + /* pre-relaxation */ + hypre_SMGRelaxSetRegSpaceRank(relax_data_l[l], 0, 0); + hypre_SMGRelaxSetRegSpaceRank(relax_data_l[l], 1, 1); + hypre_SMGRelaxSetMaxIter(relax_data_l[l], num_pre_relax); + hypre_SMGRelaxSetZeroGuess(relax_data_l[l], 1); + hypre_SMGRelax(relax_data_l[l], A_l[l], b_l[l], x_l[l]); + + /* compute residual (b - Ax) */ + hypre_SMGResidual(residual_data_l[l], + A_l[l], x_l[l], b_l[l], r_l[l]); + + /* restrict residual */ + hypre_SemiRestrict(restrict_data_l[l], R_l[l], r_l[l], b_l[l+1]); + #if DEBUG + if(hypre_StructStencilDim(hypre_StructMatrixStencil(A)) == 3) + { + sprintf(filename, "zout_xdown.%02d", l); + hypre_StructVectorPrint(filename, x_l[l], 0); + sprintf(filename, "zout_rdown.%02d", l); + hypre_StructVectorPrint(filename, r_l[l], 0); + sprintf(filename, "zout_b.%02d", l+1); + 
hypre_StructVectorPrint(filename, b_l[l+1], 0); + } + #endif + } + + /*-------------------------------------------------- + * Bottom + *--------------------------------------------------*/ + + hypre_SMGRelaxSetZeroGuess(relax_data_l[l], 1); + hypre_SMGRelax(relax_data_l[l], A_l[l], b_l[l], x_l[l]); + #if DEBUG + if(hypre_StructStencilDim(hypre_StructMatrixStencil(A)) == 3) + { + sprintf(filename, "zout_xbottom.%02d", l); + hypre_StructVectorPrint(filename, x_l[l], 0); + } + #endif + + /*-------------------------------------------------- + * Up cycle + *--------------------------------------------------*/ + + for (l = (num_levels - 2); l >= 1; l--) + { + /* interpolate error and correct (x = x + Pe_c) */ + hypre_SemiInterp(interp_data_l[l], PT_l[l], x_l[l+1], e_l[l]); + hypre_StructAxpy(1.0, e_l[l], x_l[l]); + #if DEBUG + if(hypre_StructStencilDim(hypre_StructMatrixStencil(A)) == 3) + { + sprintf(filename, "zout_eup.%02d", l); + hypre_StructVectorPrint(filename, e_l[l], 0); + sprintf(filename, "zout_xup.%02d", l); + hypre_StructVectorPrint(filename, x_l[l], 0); + } + #endif + /* post-relaxation */ + hypre_SMGRelaxSetRegSpaceRank(relax_data_l[l], 0, 1); + hypre_SMGRelaxSetRegSpaceRank(relax_data_l[l], 1, 0); + hypre_SMGRelaxSetMaxIter(relax_data_l[l], num_post_relax); + hypre_SMGRelaxSetZeroGuess(relax_data_l[l], 0); + hypre_SMGRelax(relax_data_l[l], A_l[l], b_l[l], x_l[l]); + } + + /* interpolate error and correct on fine grid (x = x + Pe_c) */ + hypre_SemiInterp(interp_data_l[0], PT_l[0], x_l[1], e_l[0]); + hypre_SMGAxpy(1.0, e_l[0], x_l[0], base_index, base_stride); + #if DEBUG + if(hypre_StructStencilDim(hypre_StructMatrixStencil(A)) == 3) + { + sprintf(filename, "zout_eup.%02d", 0); + hypre_StructVectorPrint(filename, e_l[0], 0); + sprintf(filename, "zout_xup.%02d", 0); + hypre_StructVectorPrint(filename, x_l[0], 0); + } + #endif + } + + /* part of convergence check */ + if ((tol > 0.0) && (rel_change)) + { + if (num_levels > 1) + { + e_dot_e = 
hypre_StructInnerProd(e_l[0], e_l[0]); + x_dot_x = hypre_StructInnerProd(x_l[0], x_l[0]); + } + else + { + e_dot_e = 0.0; + x_dot_x = 1.0; + } + } + + /* fine grid post-relaxation */ + if (num_levels > 1) + { + hypre_SMGRelaxSetRegSpaceRank(relax_data_l[0], 0, 1); + hypre_SMGRelaxSetRegSpaceRank(relax_data_l[0], 1, 0); + } + hypre_SMGRelaxSetMaxIter(relax_data_l[0], num_post_relax); + hypre_SMGRelaxSetZeroGuess(relax_data_l[0], 0); + hypre_SMGRelax(relax_data_l[0], A_l[0], b_l[0], x_l[0]); + + (smg_data -> num_iterations) = (i + 1); + } + + hypre_EndTiming(smg_data -> time_index); + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_axpy.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_axpy.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_axpy.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,75 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Structured axpy routine + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_StructAxpy + *--------------------------------------------------------------------------*/ + + int + hypre_StructAxpy( double alpha, + hypre_StructVector *x, + hypre_StructVector *y ) + { + int ierr = 0; + + hypre_Box *x_data_box; + hypre_Box *y_data_box; + + int xi; + int yi; + + double *xp; + double *yp; + + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index unit_stride; + + int i; + int loopi, loopj, loopk; + + hypre_SetIndex(unit_stride, 1, 1, 1); + + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(y)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + start = hypre_BoxIMin(box); + + x_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + y_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + + xp = hypre_StructVectorBoxData(x, i); + yp = hypre_StructVectorBoxData(y, i); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop2Begin(loop_size, + x_data_box, start, unit_stride, xi, + y_data_box, start, unit_stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,xi,yi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, xi, yi) + { + yp[yi] += alpha * xp[xi]; + } + hypre_BoxLoop2End(xi, yi); + } + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_copy.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_copy.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_copy.c Mon Apr 11 00:22:07 2005 
*************** *** 0 **** --- 1,75 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Structured copy routine + * + *****************************************************************************/ + + #include "headers.h" + + + /*-------------------------------------------------------------------------- + * hypre_StructCopy + *--------------------------------------------------------------------------*/ + + int + hypre_StructCopy( hypre_StructVector *x, + hypre_StructVector *y ) + { + int ierr = 0; + + hypre_Box *x_data_box; + hypre_Box *y_data_box; + + int xi; + int yi; + + double *xp; + double *yp; + + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index unit_stride; + + int i; + int loopi, loopj, loopk; + + hypre_SetIndex(unit_stride, 1, 1, 1); + + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(y)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + start = hypre_BoxIMin(box); + + x_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + y_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + + xp = hypre_StructVectorBoxData(x, i); + yp = hypre_StructVectorBoxData(y, i); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop2Begin(loop_size, + x_data_box, start, unit_stride, xi, + y_data_box, start, unit_stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,xi,yi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, xi, yi) + { + yp[yi] = xp[xi]; + } + hypre_BoxLoop2End(xi, yi); + } + + return ierr; + } Index: 
llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_grid.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_grid.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_grid.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,667 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Member functions for hypre_StructGrid class. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_StructGridCreate + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridCreate( MPI_Comm comm, + int dim, + hypre_StructGrid **grid_ptr) + { + hypre_StructGrid *grid; + + grid = hypre_TAlloc(hypre_StructGrid, 1); + + hypre_StructGridComm(grid) = comm; + hypre_StructGridDim(grid) = dim; + hypre_StructGridBoxes(grid) = hypre_BoxArrayCreate(0); + hypre_StructGridIDs(grid) = NULL; + hypre_StructGridNeighbors(grid) = NULL; + hypre_StructGridMaxDistance(grid) = 2; + hypre_StructGridBoundingBox(grid) = NULL; + hypre_StructGridLocalSize(grid) = 0; + hypre_StructGridGlobalSize(grid) = 0; + hypre_SetIndex(hypre_StructGridPeriodic(grid), 0, 0, 0); + hypre_StructGridRefCount(grid) = 1; + + *grid_ptr = grid; + + return 0; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridRef + *--------------------------------------------------------------------------*/ + + int + 
hypre_StructGridRef( hypre_StructGrid *grid, + hypre_StructGrid **grid_ref) + { + hypre_StructGridRefCount(grid) ++; + *grid_ref = grid; + + return 0; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridDestroy( hypre_StructGrid *grid ) + { + int ierr = 0; + + if (grid) + { + hypre_StructGridRefCount(grid) --; + if (hypre_StructGridRefCount(grid) == 0) + { + hypre_BoxDestroy(hypre_StructGridBoundingBox(grid)); + hypre_BoxNeighborsDestroy(hypre_StructGridNeighbors(grid)); + hypre_TFree(hypre_StructGridIDs(grid)); + hypre_BoxArrayDestroy(hypre_StructGridBoxes(grid)); + hypre_TFree(grid); + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridSetHoodInfo + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridSetHoodInfo( hypre_StructGrid *grid, + int max_distance ) + { + int ierr = 0; + + hypre_StructGridMaxDistance(grid) = max_distance; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridSetPeriodic + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridSetPeriodic( hypre_StructGrid *grid, + hypre_Index periodic) + { + int ierr = 0; + + hypre_CopyIndex(periodic, hypre_StructGridPeriodic(grid)); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridSetExtents + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridSetExtents( hypre_StructGrid *grid, + hypre_Index ilower, + hypre_Index iupper ) + { + int ierr = 0; + hypre_Box *box; + + box = hypre_BoxCreate(); + hypre_BoxSetExtents(box, ilower, iupper); + hypre_AppendBox(box, 
hypre_StructGridBoxes(grid)); + hypre_BoxDestroy(box); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridSetBoxes + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridSetBoxes( hypre_StructGrid *grid, + hypre_BoxArray *boxes ) + { + int ierr = 0; + + hypre_TFree(hypre_StructGridBoxes(grid)); + hypre_StructGridBoxes(grid) = boxes; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridSetHood + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridSetHood( hypre_StructGrid *grid, + hypre_BoxArray *hood_boxes, + int *hood_procs, + int *hood_ids, + int first_local, + int num_local, + int num_periodic, + hypre_Box *bounding_box ) + { + int ierr = 0; + + hypre_BoxArray *boxes; + int *ids; + hypre_BoxNeighbors *neighbors; + + int i, ilocal; + + boxes = hypre_BoxArrayCreate(num_local); + ids = hypre_TAlloc(int, num_local); + for (i = 0; i < num_local; i++) + { + ilocal = first_local + i; + hypre_CopyBox(hypre_BoxArrayBox(hood_boxes, ilocal), + hypre_BoxArrayBox(boxes, i)); + ids[i] = hood_ids[ilocal]; + } + hypre_TFree(hypre_StructGridBoxes(grid)); + hypre_TFree(hypre_StructGridIDs(grid)); + hypre_StructGridBoxes(grid) = boxes; + hypre_StructGridIDs(grid) = ids; + + hypre_BoxNeighborsCreate(hood_boxes, hood_procs, hood_ids, + first_local, num_local, num_periodic, &neighbors); + hypre_StructGridNeighbors(grid) = neighbors; + + hypre_BoxDestroy(hypre_StructGridBoundingBox(grid)); + hypre_StructGridBoundingBox(grid) = bounding_box; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridAssemble + * + * NOTE: Box ids are set here. They are globally unique, and appear + * in increasing order. + * + * NOTE: Box procs are set here. They appear in non-decreasing order. 
+ * + * NOTE: The boxes in `all_boxes' appear as follows, for example: + * + * proc: 0 0 0 0 1 1 2 2 2 2 ... + * ID: 0 1 2 3 4 5 6 7 8 9 ... + * periodic: * * * * * + * + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridAssemble( hypre_StructGrid *grid ) + { + int ierr = 0; + + hypre_BoxArray *boxes; + hypre_Box *box; + int size; + int prune; + int i; + + boxes = hypre_StructGridBoxes(grid); + prune = 1; + + if (hypre_StructGridNeighbors(grid) == NULL) + { + MPI_Comm comm = hypre_StructGridComm(grid); + int dim = hypre_StructGridDim(grid); + int *ids; + hypre_BoxNeighbors *neighbors; + hypre_Box *bounding_box; + + hypre_BoxArray *all_boxes; + int *all_procs; + int *all_ids; + int first_local; + int num_local; + int num_periodic; + + int d, idmin, idmax; + + /* gather grid box info */ + hypre_GatherAllBoxes(comm, boxes, &all_boxes, &all_procs, &first_local); + num_local = hypre_BoxArraySize(boxes); + + /* set bounding box */ + bounding_box = hypre_BoxCreate(); + for (d = 0; d < dim; d++) + { + idmin = hypre_BoxIMinD(hypre_BoxArrayBox(all_boxes, 0), d); + idmax = hypre_BoxIMaxD(hypre_BoxArrayBox(all_boxes, 0), d); + hypre_ForBoxI(i, all_boxes) + { + box = hypre_BoxArrayBox(all_boxes, i); + idmin = hypre_min(idmin, hypre_BoxIMinD(box, d)); + idmax = hypre_max(idmax, hypre_BoxIMaxD(box, d)); + } + hypre_BoxIMinD(bounding_box, d) = idmin; + hypre_BoxIMaxD(bounding_box, d) = idmax; + } + for (d = dim; d < 3; d++) + { + hypre_BoxIMinD(bounding_box, d) = 0; + hypre_BoxIMaxD(bounding_box, d) = 0; + } + hypre_StructGridBoundingBox(grid) = bounding_box; + + /* set global size */ + size = 0; + hypre_ForBoxI(i, all_boxes) + { + box = hypre_BoxArrayBox(all_boxes, i); + size += hypre_BoxVolume(box); + } + hypre_StructGridGlobalSize(grid) = size; + + /* modify all_boxes as required for periodicity */ + hypre_StructGridPeriodicAllBoxes(grid, &all_boxes, &all_procs, + &first_local, &num_periodic); + + /* set all_ids */ + all_ids 
= hypre_TAlloc(int, hypre_BoxArraySize(all_boxes)); + hypre_ForBoxI(i, all_boxes) + { + all_ids[i] = i; + } + + /* set neighbors */ + hypre_BoxNeighborsCreate(all_boxes, all_procs, all_ids, + first_local, num_local, num_periodic, + &neighbors); + hypre_StructGridNeighbors(grid) = neighbors; + + /* set ids */ + ids = hypre_TAlloc(int, hypre_BoxArraySize(boxes)); + hypre_ForBoxI(i, boxes) + { + ids[i] = all_ids[first_local + i]; + } + hypre_StructGridIDs(grid) = ids; + + prune = 1; + } + + hypre_BoxNeighborsAssemble(hypre_StructGridNeighbors(grid), + hypre_StructGridMaxDistance(grid), prune); + + /* compute local size */ + size = 0; + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + size += hypre_BoxVolume(box); + } + hypre_StructGridLocalSize(grid) = size; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_GatherAllBoxes + *--------------------------------------------------------------------------*/ + + int + hypre_GatherAllBoxes(MPI_Comm comm, + hypre_BoxArray *boxes, + hypre_BoxArray **all_boxes_ptr, + int **all_procs_ptr, + int *first_local_ptr) + { + hypre_BoxArray *all_boxes; + int *all_procs; + int first_local; + int all_boxes_size; + + hypre_Box *box; + hypre_Index imin; + hypre_Index imax; + + int num_all_procs, my_rank; + + int *sendbuf; + int sendcount; + int *recvbuf; + int *recvcounts; + int *displs; + int recvbuf_size; + + int i, p, b, ab, d; + int ierr = 0; + + /*----------------------------------------------------- + * Accumulate the box info + *-----------------------------------------------------*/ + + MPI_Comm_size(comm, &num_all_procs); + MPI_Comm_rank(comm, &my_rank); + + /* compute recvcounts and displs */ + sendcount = 7*hypre_BoxArraySize(boxes); + recvcounts = hypre_SharedTAlloc(int, num_all_procs); + displs = hypre_TAlloc(int, num_all_procs); + MPI_Allgather(&sendcount, 1, MPI_INT, + recvcounts, 1, MPI_INT, comm); + displs[0] = 0; + recvbuf_size = recvcounts[0]; + 
for (p = 1; p < num_all_procs; p++) + { + displs[p] = displs[p-1] + recvcounts[p-1]; + recvbuf_size += recvcounts[p]; + } + + /* allocate sendbuf and recvbuf */ + sendbuf = hypre_TAlloc(int, sendcount); + recvbuf = hypre_SharedTAlloc(int, recvbuf_size); + + /* put local box extents and process number into sendbuf */ + i = 0; + for (b = 0; b < hypre_BoxArraySize(boxes); b++) + { + sendbuf[i++] = my_rank; + + box = hypre_BoxArrayBox(boxes, b); + for (d = 0; d < 3; d++) + { + sendbuf[i++] = hypre_BoxIMinD(box, d); + sendbuf[i++] = hypre_BoxIMaxD(box, d); + } + } + + /* get global grid info */ + MPI_Allgatherv(sendbuf, sendcount, MPI_INT, + recvbuf, recvcounts, displs, MPI_INT, comm); + + /* sort recvbuf by process rank? */ + + /*----------------------------------------------------- + * Create all_boxes, all_procs, and first_local + *-----------------------------------------------------*/ + + /* unpack recvbuf box info */ + all_boxes_size = recvbuf_size / 7; + all_boxes = hypre_BoxArrayCreate(all_boxes_size); + all_procs = hypre_TAlloc(int, all_boxes_size); + first_local = -1; + i = 0; + p = 0; + ab = 0; + box = hypre_BoxCreate(); + while (i < recvbuf_size) + { + all_procs[p] = recvbuf[i++]; + for (d = 0; d < 3; d++) + { + hypre_IndexD(imin, d) = recvbuf[i++]; + hypre_IndexD(imax, d) = recvbuf[i++]; + } + hypre_BoxSetExtents(box, imin, imax); + hypre_CopyBox(box, hypre_BoxArrayBox(all_boxes, ab)); + ab++; + + if ((first_local < 0) && (all_procs[p] == my_rank)) + { + first_local = p; + } + + p++; + } + hypre_BoxDestroy(box); + + /*----------------------------------------------------- + * Return + *-----------------------------------------------------*/ + + hypre_TFree(sendbuf); + hypre_SharedTFree(recvbuf); + hypre_SharedTFree(recvcounts); + hypre_TFree(displs); + + *all_boxes_ptr = all_boxes; + *all_procs_ptr = all_procs; + *first_local_ptr = first_local; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * 
hypre_StructGridPrint + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridPrint( FILE *file, + hypre_StructGrid *grid ) + { + int ierr = 0; + + hypre_BoxArray *boxes; + hypre_Box *box; + + int i; + + fprintf(file, "%d\n", hypre_StructGridDim(grid)); + + boxes = hypre_StructGridBoxes(grid); + fprintf(file, "%d\n", hypre_BoxArraySize(boxes)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + fprintf(file, "%d: (%d, %d, %d) x (%d, %d, %d)\n", + i, + hypre_BoxIMinX(box), + hypre_BoxIMinY(box), + hypre_BoxIMinZ(box), + hypre_BoxIMaxX(box), + hypre_BoxIMaxY(box), + hypre_BoxIMaxZ(box)); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridRead + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridRead( MPI_Comm comm, + FILE *file, + hypre_StructGrid **grid_ptr ) + { + int ierr = 0; + + hypre_StructGrid *grid; + + hypre_Index ilower; + hypre_Index iupper; + + int dim; + int num_boxes; + + int i, idummy; + + fscanf(file, "%d\n", &dim); + hypre_StructGridCreate(comm, dim, &grid); + + fscanf(file, "%d\n", &num_boxes); + for (i = 0; i < num_boxes; i++) + { + fscanf(file, "%d: (%d, %d, %d) x (%d, %d, %d)\n", + &idummy, + &hypre_IndexX(ilower), + &hypre_IndexY(ilower), + &hypre_IndexZ(ilower), + &hypre_IndexX(iupper), + &hypre_IndexY(iupper), + &hypre_IndexZ(iupper)); + + hypre_StructGridSetExtents(grid, ilower, iupper); + } + + hypre_StructGridAssemble(grid); + + *grid_ptr = grid; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructGridPeriodicAllBoxes + *--------------------------------------------------------------------------*/ + + int + hypre_StructGridPeriodicAllBoxes( hypre_StructGrid *grid, + hypre_BoxArray **all_boxes_ptr, + int **all_procs_ptr, + int *first_local_ptr, + int *num_periodic_ptr ) + { + int ierr 
= 0; + + int new_num_periodic = 0; + + int px = hypre_IndexX(hypre_StructGridPeriodic(grid)); + int py = hypre_IndexY(hypre_StructGridPeriodic(grid)); + int pz = hypre_IndexZ(hypre_StructGridPeriodic(grid)); + + int i_periodic = 0; + int j_periodic = 0; + int k_periodic = 0; + + if (px != 0) + i_periodic = 1; + if (py != 0) + j_periodic = 1; + if (pz != 0) + k_periodic = 1; + + if( !(i_periodic == 0 && j_periodic == 0 && k_periodic == 0) ) + { + hypre_BoxArray *new_all_boxes; + int *new_all_procs; + int new_first_local; + + hypre_BoxArray *all_boxes = *all_boxes_ptr; + int *all_procs = *all_procs_ptr; + int first_local = *first_local_ptr; + int num_local; + int num_periodic; + + hypre_Box *box; + + int num_all, new_num_all; + int i, inew, ip, jp, kp; + int first_i, first_inew; + + num_all = hypre_BoxArraySize(all_boxes); + new_num_all = num_all * ((1+2*i_periodic) * + (1+2*j_periodic) * + (1+2*k_periodic)); + + new_all_boxes = hypre_BoxArrayCreate(new_num_all); + new_all_procs = hypre_TAlloc(int, new_num_all); + + /* add boxes required for periodicity */ + i = 0; + inew = 0; + while (i < num_all) + { + first_i = i; + first_inew = inew; + + for (i = first_i; i < num_all; i++) + { + if (all_procs[i] != all_procs[first_i]) + { + break; + } + + hypre_CopyBox(hypre_BoxArrayBox(all_boxes, i), + hypre_BoxArrayBox(new_all_boxes, inew)); + new_all_procs[inew] = all_procs[i]; + + inew++; + } + num_local = i - first_i; + + for (ip = -i_periodic; ip <= i_periodic; ip++) + { + for (jp = -j_periodic; jp <= j_periodic; jp++) + { + for (kp = -k_periodic; kp <= k_periodic; kp++) + { + if( !(ip == 0 && jp == 0 && kp == 0) ) + { + for (i = first_i; i < (first_i + num_local); i++) + { + box = hypre_BoxArrayBox(new_all_boxes, inew); + hypre_CopyBox(hypre_BoxArrayBox(all_boxes, i), box); + + /* shift box */ + hypre_BoxIMinD(box, 0) = + hypre_BoxIMinD(box, 0) + (ip * px); + hypre_BoxIMinD(box, 1) = + hypre_BoxIMinD(box, 1) + (jp * py); + hypre_BoxIMinD(box, 2) = + hypre_BoxIMinD(box, 2) 
+ (kp * pz); + hypre_BoxIMaxD(box, 0) = + hypre_BoxIMaxD(box, 0) + (ip * px); + hypre_BoxIMaxD(box, 1) = + hypre_BoxIMaxD(box, 1) + (jp * py); + hypre_BoxIMaxD(box, 2) = + hypre_BoxIMaxD(box, 2) + (kp * pz); + + new_all_procs[inew] = all_procs[i]; + + inew++; + } + } + } + } + } + num_periodic = inew - first_inew - num_local; + + if (first_i == first_local) + { + new_first_local = first_inew; + new_num_periodic = num_periodic; + } + } + + hypre_BoxArraySetSize(new_all_boxes, inew); + + hypre_BoxArrayDestroy(all_boxes); + hypre_TFree(all_procs); + + *all_boxes_ptr = new_all_boxes; + *all_procs_ptr = new_all_procs; + *first_local_ptr = new_first_local; + } + + *num_periodic_ptr = new_num_periodic; + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_innerprod.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_innerprod.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_innerprod.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,117 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Structured inner product routine + * + *****************************************************************************/ + + #include "headers.h" + + + /*-------------------------------------------------------------------------- + * hypre_StructInnerProd + *--------------------------------------------------------------------------*/ + + #ifdef HYPRE_USE_PTHREADS + double *local_result_ref[hypre_MAX_THREADS]; + #endif + + double final_innerprod_result; + + + double + hypre_StructInnerProd( hypre_StructVector *x, + hypre_StructVector *y ) + { + double local_result; + double process_result; + + hypre_Box *x_data_box; + hypre_Box *y_data_box; + + int xi; + int yi; + + double *xp; + double *yp; + + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index unit_stride; + + int i; + int loopi, loopj, loopk; + #ifdef HYPRE_USE_PTHREADS + int threadid = hypre_GetThreadID(); + #endif + + local_result = 0.0; + process_result = 0.0; + + hypre_SetIndex(unit_stride, 1, 1, 1); + + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(y)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + start = hypre_BoxIMin(box); + + x_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + y_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + + xp = hypre_StructVectorBoxData(x, i); + yp = hypre_StructVectorBoxData(y, i); + + hypre_BoxGetSize(box, loop_size); + + #ifdef HYPRE_USE_PTHREADS + local_result_ref[threadid] = &local_result; + #endif + + hypre_BoxLoop2Begin(loop_size, + x_data_box, start, unit_stride, xi, + y_data_box, start, unit_stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,xi,yi + #define HYPRE_SMP_REDUCTION_OP + + #define HYPRE_SMP_REDUCTION_VARS local_result + #include 
"hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, xi, yi) + { + local_result += xp[xi] * yp[yi]; + } + hypre_BoxLoop2End(xi, yi); + } + + #ifdef HYPRE_USE_PTHREADS + if (threadid != hypre_NumThreads) + { + for (i = 0; i < hypre_NumThreads; i++) + process_result += *local_result_ref[i]; + } + else + process_result = *local_result_ref[threadid]; + #else + process_result = local_result; + #endif + + + MPI_Allreduce(&process_result, &final_innerprod_result, 1, + MPI_DOUBLE, MPI_SUM, hypre_StructVectorComm(x)); + + + #ifdef HYPRE_USE_PTHREADS + if (threadid == 0 || threadid == hypre_NumThreads) + #endif + hypre_IncFLOPCount(2*hypre_StructVectorGlobalSize(x)); + + return final_innerprod_result; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_io.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_io.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_io.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,154 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Functions for scanning and printing "box-dimensioned" data. 
+ * + *****************************************************************************/ + + #ifdef HYPRE_USE_PTHREADS + #undef HYPRE_USE_PTHREADS + #endif + + #include "headers.h" + + + /*-------------------------------------------------------------------------- + * hypre_PrintBoxArrayData + *--------------------------------------------------------------------------*/ + + int + hypre_PrintBoxArrayData( FILE *file, + hypre_BoxArray *box_array, + hypre_BoxArray *data_space, + int num_values, + double *data ) + { + int ierr = 0; + + hypre_Box *box; + hypre_Box *data_box; + + int data_box_volume; + int datai; + + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index stride; + + int i, j; + int loopi, loopj, loopk; + + /*---------------------------------------- + * Print data + *----------------------------------------*/ + + hypre_SetIndex(stride, 1, 1, 1); + + hypre_ForBoxI(i, box_array) + { + box = hypre_BoxArrayBox(box_array, i); + data_box = hypre_BoxArrayBox(data_space, i); + + start = hypre_BoxIMin(box); + data_box_volume = hypre_BoxVolume(data_box); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + data_box, start, stride, datai); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, datai) + { + for (j = 0; j < num_values; j++) + { + fprintf(file, "%d: (%d, %d, %d; %d) %e\n", + i, + hypre_IndexX(start) + loopi, + hypre_IndexY(start) + loopj, + hypre_IndexZ(start) + loopk, + j, + data[datai + j*data_box_volume]); + } + } + hypre_BoxLoop1End(datai); + + data += num_values*data_box_volume; + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_ReadBoxArrayData + *--------------------------------------------------------------------------*/ + + int + hypre_ReadBoxArrayData( FILE *file, + hypre_BoxArray *box_array, + hypre_BoxArray *data_space, + int num_values, + double *data ) + { + int ierr = 0; + 
+ hypre_Box *box; + hypre_Box *data_box; + + int data_box_volume; + int datai; + + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index stride; + + int i, j, idummy; + int loopi, loopj, loopk; + + /*---------------------------------------- + * Read data + *----------------------------------------*/ + + hypre_SetIndex(stride, 1, 1, 1); + + hypre_ForBoxI(i, box_array) + { + box = hypre_BoxArrayBox(box_array, i); + data_box = hypre_BoxArrayBox(data_space, i); + + start = hypre_BoxIMin(box); + data_box_volume = hypre_BoxVolume(data_box); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + data_box, start, stride, datai); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, datai) + { + for (j = 0; j < num_values; j++) + { + fscanf(file, "%d: (%d, %d, %d; %d) %le\n", + &idummy, + &idummy, + &idummy, + &idummy, + &idummy, + &data[datai + j*data_box_volume]); + } + } + hypre_BoxLoop1End(datai); + + data += num_values*data_box_volume; + } + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_ls.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_ls.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_ls.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,431 ---- + + #include + + #include "HYPRE_struct_ls.h" + + #ifndef hypre_STRUCT_LS_HEADER + #define hypre_STRUCT_LS_HEADER + + #include "utilities.h" + #include "struct_mv.h" + #include "krylov.h" + + #ifdef __cplusplus + extern "C" { + #endif + + + /* HYPRE_struct_gmres.c */ + int HYPRE_StructGMRESCreate( MPI_Comm comm , HYPRE_StructSolver *solver ); + int HYPRE_StructGMRESDestroy( HYPRE_StructSolver solver ); + int HYPRE_StructGMRESSetup( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructGMRESSolve( HYPRE_StructSolver 
solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructGMRESSetTol( HYPRE_StructSolver solver , double tol ); + int HYPRE_StructGMRESSetMaxIter( HYPRE_StructSolver solver , int max_iter ); + int HYPRE_StructGMRESSetPrecond( HYPRE_StructSolver solver , HYPRE_PtrToStructSolverFcn precond , HYPRE_PtrToStructSolverFcn precond_setup , HYPRE_StructSolver precond_solver ); + int HYPRE_StructGMRESSetLogging( HYPRE_StructSolver solver , int logging ); + int HYPRE_StructGMRESGetNumIterations( HYPRE_StructSolver solver , int *num_iterations ); + int HYPRE_StructGMRESGetFinalRelativeResidualNorm( HYPRE_StructSolver solver , double *norm ); + + /* HYPRE_struct_hybrid.c */ + int HYPRE_StructHybridCreate( MPI_Comm comm , HYPRE_StructSolver *solver ); + int HYPRE_StructHybridDestroy( HYPRE_StructSolver solver ); + int HYPRE_StructHybridSetup( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructHybridSolve( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructHybridSetTol( HYPRE_StructSolver solver , double tol ); + int HYPRE_StructHybridSetConvergenceTol( HYPRE_StructSolver solver , double cf_tol ); + int HYPRE_StructHybridSetDSCGMaxIter( HYPRE_StructSolver solver , int dscg_max_its ); + int HYPRE_StructHybridSetPCGMaxIter( HYPRE_StructSolver solver , int pcg_max_its ); + int HYPRE_StructHybridSetTwoNorm( HYPRE_StructSolver solver , int two_norm ); + int HYPRE_StructHybridSetRelChange( HYPRE_StructSolver solver , int rel_change ); + int HYPRE_StructHybridSetPrecond( HYPRE_StructSolver solver , HYPRE_PtrToStructSolverFcn precond , HYPRE_PtrToStructSolverFcn precond_setup , HYPRE_StructSolver precond_solver ); + int HYPRE_StructHybridSetLogging( HYPRE_StructSolver solver , int logging ); + int HYPRE_StructHybridGetNumIterations( HYPRE_StructSolver solver , int *num_its ); + int HYPRE_StructHybridGetDSCGNumIterations( 
HYPRE_StructSolver solver , int *dscg_num_its ); + int HYPRE_StructHybridGetPCGNumIterations( HYPRE_StructSolver solver , int *pcg_num_its ); + int HYPRE_StructHybridGetFinalRelativeResidualNorm( HYPRE_StructSolver solver , double *norm ); + + /* HYPRE_struct_jacobi.c */ + int HYPRE_StructJacobiCreate( MPI_Comm comm , HYPRE_StructSolver *solver ); + int HYPRE_StructJacobiDestroy( HYPRE_StructSolver solver ); + int HYPRE_StructJacobiSetup( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructJacobiSolve( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructJacobiSetTol( HYPRE_StructSolver solver , double tol ); + int HYPRE_StructJacobiSetMaxIter( HYPRE_StructSolver solver , int max_iter ); + int HYPRE_StructJacobiSetZeroGuess( HYPRE_StructSolver solver ); + int HYPRE_StructJacobiSetNonZeroGuess( HYPRE_StructSolver solver ); + int HYPRE_StructJacobiGetNumIterations( HYPRE_StructSolver solver , int *num_iterations ); + int HYPRE_StructJacobiGetFinalRelativeResidualNorm( HYPRE_StructSolver solver , double *norm ); + + /* HYPRE_struct_pcg.c */ + int HYPRE_StructPCGCreate( MPI_Comm comm , HYPRE_StructSolver *solver ); + int HYPRE_StructPCGDestroy( HYPRE_StructSolver solver ); + int HYPRE_StructPCGSetup( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructPCGSolve( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructPCGSetTol( HYPRE_StructSolver solver , double tol ); + int HYPRE_StructPCGSetMaxIter( HYPRE_StructSolver solver , int max_iter ); + int HYPRE_StructPCGSetTwoNorm( HYPRE_StructSolver solver , int two_norm ); + int HYPRE_StructPCGSetRelChange( HYPRE_StructSolver solver , int rel_change ); + int HYPRE_StructPCGSetPrecond( HYPRE_StructSolver solver , HYPRE_PtrToStructSolverFcn precond , HYPRE_PtrToStructSolverFcn 
precond_setup , HYPRE_StructSolver precond_solver ); + int HYPRE_StructPCGSetLogging( HYPRE_StructSolver solver , int logging ); + int HYPRE_StructPCGGetNumIterations( HYPRE_StructSolver solver , int *num_iterations ); + int HYPRE_StructPCGGetFinalRelativeResidualNorm( HYPRE_StructSolver solver , double *norm ); + int HYPRE_StructDiagScaleSetup( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector y , HYPRE_StructVector x ); + int HYPRE_StructDiagScale( HYPRE_StructSolver solver , HYPRE_StructMatrix HA , HYPRE_StructVector Hy , HYPRE_StructVector Hx ); + + /* HYPRE_struct_pfmg.c */ + int HYPRE_StructPFMGCreate( MPI_Comm comm , HYPRE_StructSolver *solver ); + int HYPRE_StructPFMGDestroy( HYPRE_StructSolver solver ); + int HYPRE_StructPFMGSetup( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructPFMGSolve( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructPFMGSetTol( HYPRE_StructSolver solver , double tol ); + int HYPRE_StructPFMGSetMaxIter( HYPRE_StructSolver solver , int max_iter ); + int HYPRE_StructPFMGSetRelChange( HYPRE_StructSolver solver , int rel_change ); + int HYPRE_StructPFMGSetZeroGuess( HYPRE_StructSolver solver ); + int HYPRE_StructPFMGSetNonZeroGuess( HYPRE_StructSolver solver ); + int HYPRE_StructPFMGSetRelaxType( HYPRE_StructSolver solver , int relax_type ); + int HYPRE_StructPFMGSetNumPreRelax( HYPRE_StructSolver solver , int num_pre_relax ); + int HYPRE_StructPFMGSetNumPostRelax( HYPRE_StructSolver solver , int num_post_relax ); + int HYPRE_StructPFMGSetSkipRelax( HYPRE_StructSolver solver , int skip_relax ); + int HYPRE_StructPFMGSetDxyz( HYPRE_StructSolver solver , double *dxyz ); + int HYPRE_StructPFMGSetLogging( HYPRE_StructSolver solver , int logging ); + int HYPRE_StructPFMGGetNumIterations( HYPRE_StructSolver solver , int *num_iterations ); + int HYPRE_StructPFMGGetFinalRelativeResidualNorm( 
HYPRE_StructSolver solver , double *norm ); + + /* HYPRE_struct_smg.c */ + int HYPRE_StructSMGCreate( MPI_Comm comm , HYPRE_StructSolver *solver ); + int HYPRE_StructSMGDestroy( HYPRE_StructSolver solver ); + int HYPRE_StructSMGSetup( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructSMGSolve( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructSMGSetMemoryUse( HYPRE_StructSolver solver , int memory_use ); + int HYPRE_StructSMGSetTol( HYPRE_StructSolver solver , double tol ); + int HYPRE_StructSMGSetMaxIter( HYPRE_StructSolver solver , int max_iter ); + int HYPRE_StructSMGSetRelChange( HYPRE_StructSolver solver , int rel_change ); + int HYPRE_StructSMGSetZeroGuess( HYPRE_StructSolver solver ); + int HYPRE_StructSMGSetNonZeroGuess( HYPRE_StructSolver solver ); + int HYPRE_StructSMGSetNumPreRelax( HYPRE_StructSolver solver , int num_pre_relax ); + int HYPRE_StructSMGSetNumPostRelax( HYPRE_StructSolver solver , int num_post_relax ); + int HYPRE_StructSMGSetLogging( HYPRE_StructSolver solver , int logging ); + int HYPRE_StructSMGGetNumIterations( HYPRE_StructSolver solver , int *num_iterations ); + int HYPRE_StructSMGGetFinalRelativeResidualNorm( HYPRE_StructSolver solver , double *norm ); + + /* HYPRE_struct_sparse_msg.c */ + int HYPRE_StructSparseMSGCreate( MPI_Comm comm , HYPRE_StructSolver *solver ); + int HYPRE_StructSparseMSGDestroy( HYPRE_StructSolver solver ); + int HYPRE_StructSparseMSGSetup( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructSparseMSGSolve( HYPRE_StructSolver solver , HYPRE_StructMatrix A , HYPRE_StructVector b , HYPRE_StructVector x ); + int HYPRE_StructSparseMSGSetTol( HYPRE_StructSolver solver , double tol ); + int HYPRE_StructSparseMSGSetMaxIter( HYPRE_StructSolver solver , int max_iter ); + int HYPRE_StructSparseMSGSetJump( HYPRE_StructSolver 
solver , int jump ); + int HYPRE_StructSparseMSGSetRelChange( HYPRE_StructSolver solver , int rel_change ); + int HYPRE_StructSparseMSGSetZeroGuess( HYPRE_StructSolver solver ); + int HYPRE_StructSparseMSGSetNonZeroGuess( HYPRE_StructSolver solver ); + int HYPRE_StructSparseMSGSetRelaxType( HYPRE_StructSolver solver , int relax_type ); + int HYPRE_StructSparseMSGSetNumPreRelax( HYPRE_StructSolver solver , int num_pre_relax ); + int HYPRE_StructSparseMSGSetNumPostRelax( HYPRE_StructSolver solver , int num_post_relax ); + int HYPRE_StructSparseMSGSetNumFineRelax( HYPRE_StructSolver solver , int num_fine_relax ); + int HYPRE_StructSparseMSGSetLogging( HYPRE_StructSolver solver , int logging ); + int HYPRE_StructSparseMSGGetNumIterations( HYPRE_StructSolver solver , int *num_iterations ); + int HYPRE_StructSparseMSGGetFinalRelativeResidualNorm( HYPRE_StructSolver solver , double *norm ); + + /* coarsen.c */ + int hypre_StructMapFineToCoarse( hypre_Index findex , hypre_Index index , hypre_Index stride , hypre_Index cindex ); + int hypre_StructMapCoarseToFine( hypre_Index cindex , hypre_Index index , hypre_Index stride , hypre_Index findex ); + int hypre_StructCoarsen( hypre_StructGrid *fgrid , hypre_Index index , hypre_Index stride , int prune , hypre_StructGrid **cgrid_ptr ); + int hypre_StructCoarsen( hypre_StructGrid *fgrid , hypre_Index index , hypre_Index stride , int prune , hypre_StructGrid **cgrid_ptr ); + + /* cyclic_reduction.c */ + void *hypre_CyclicReductionCreate( MPI_Comm comm ); + hypre_StructMatrix *hypre_CycRedCreateCoarseOp( hypre_StructMatrix *A , hypre_StructGrid *coarse_grid , int cdir ); + int hypre_CycRedSetupCoarseOp( hypre_StructMatrix *A , hypre_StructMatrix *Ac , hypre_Index cindex , hypre_Index cstride ); + int hypre_CyclicReductionSetup( void *cyc_red_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_CyclicReduction( void *cyc_red_vdata , hypre_StructMatrix *A , hypre_StructVector *b , 
hypre_StructVector *x ); + int hypre_CyclicReductionSetBase( void *cyc_red_vdata , hypre_Index base_index , hypre_Index base_stride ); + int hypre_CyclicReductionDestroy( void *cyc_red_vdata ); + + /* general.c */ + int hypre_Log2( int p ); + + /* hybrid.c */ + void *hypre_HybridCreate( MPI_Comm comm ); + int hypre_HybridDestroy( void *hybrid_vdata ); + int hypre_HybridSetTol( void *hybrid_vdata , double tol ); + int hypre_HybridSetConvergenceTol( void *hybrid_vdata , double cf_tol ); + int hypre_HybridSetDSCGMaxIter( void *hybrid_vdata , int dscg_max_its ); + int hypre_HybridSetPCGMaxIter( void *hybrid_vdata , int pcg_max_its ); + int hypre_HybridSetTwoNorm( void *hybrid_vdata , int two_norm ); + int hypre_HybridSetRelChange( void *hybrid_vdata , int rel_change ); + int hypre_HybridSetPrecond( void *pcg_vdata , int (*pcg_precond_solve )(), int (*pcg_precond_setup )(), void *pcg_precond ); + int hypre_HybridSetLogging( void *hybrid_vdata , int logging ); + int hypre_HybridGetNumIterations( void *hybrid_vdata , int *num_its ); + int hypre_HybridGetDSCGNumIterations( void *hybrid_vdata , int *dscg_num_its ); + int hypre_HybridGetPCGNumIterations( void *hybrid_vdata , int *pcg_num_its ); + int hypre_HybridGetFinalRelativeResidualNorm( void *hybrid_vdata , double *final_rel_res_norm ); + int hypre_HybridSetup( void *hybrid_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_HybridSolve( void *hybrid_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + + /* jacobi.c */ + void *hypre_JacobiCreate( MPI_Comm comm ); + int hypre_JacobiDestroy( void *jacobi_vdata ); + int hypre_JacobiSetup( void *jacobi_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_JacobiSolve( void *jacobi_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_JacobiSetTol( void *jacobi_vdata , double tol ); + int hypre_JacobiSetMaxIter( void 
*jacobi_vdata , int max_iter ); + int hypre_JacobiSetZeroGuess( void *jacobi_vdata , int zero_guess ); + int hypre_JacobiSetTempVec( void *jacobi_vdata , hypre_StructVector *t ); + + /* pcg_struct.c */ + char *hypre_StructKrylovCAlloc( int count , int elt_size ); + int hypre_StructKrylovFree( char *ptr ); + void *hypre_StructKrylovCreateVector( void *vvector ); + void *hypre_StructKrylovCreateVectorArray( int n , void *vvector ); + int hypre_StructKrylovDestroyVector( void *vvector ); + void *hypre_StructKrylovMatvecCreate( void *A , void *x ); + int hypre_StructKrylovMatvec( void *matvec_data , double alpha , void *A , void *x , double beta , void *y ); + int hypre_StructKrylovMatvecDestroy( void *matvec_data ); + double hypre_StructKrylovInnerProd( void *x , void *y ); + int hypre_StructKrylovCopyVector( void *x , void *y ); + int hypre_StructKrylovClearVector( void *x ); + int hypre_StructKrylovScaleVector( double alpha , void *x ); + int hypre_StructKrylovAxpy( double alpha , void *x , void *y ); + int hypre_StructKrylovIdentitySetup( void *vdata , void *A , void *b , void *x ); + int hypre_StructKrylovIdentity( void *vdata , void *A , void *b , void *x ); + int hypre_StructKrylovCommInfo( void *A , int *my_id , int *num_procs ); + + /* pfmg.c */ + void *hypre_PFMGCreate( MPI_Comm comm ); + int hypre_PFMGDestroy( void *pfmg_vdata ); + int hypre_PFMGSetTol( void *pfmg_vdata , double tol ); + int hypre_PFMGSetMaxIter( void *pfmg_vdata , int max_iter ); + int hypre_PFMGSetRelChange( void *pfmg_vdata , int rel_change ); + int hypre_PFMGSetZeroGuess( void *pfmg_vdata , int zero_guess ); + int hypre_PFMGSetRelaxType( void *pfmg_vdata , int relax_type ); + int hypre_PFMGSetNumPreRelax( void *pfmg_vdata , int num_pre_relax ); + int hypre_PFMGSetNumPostRelax( void *pfmg_vdata , int num_post_relax ); + int hypre_PFMGSetSkipRelax( void *pfmg_vdata , int skip_relax ); + int hypre_PFMGSetDxyz( void *pfmg_vdata , double *dxyz ); + int hypre_PFMGSetLogging( void *pfmg_vdata , 
int logging ); + int hypre_PFMGGetNumIterations( void *pfmg_vdata , int *num_iterations ); + int hypre_PFMGPrintLogging( void *pfmg_vdata , int myid ); + int hypre_PFMGGetFinalRelativeResidualNorm( void *pfmg_vdata , double *relative_residual_norm ); + + /* pfmg2_setup_rap.c */ + hypre_StructMatrix *hypre_PFMG2CreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructGrid *coarse_grid , int cdir ); + int hypre_PFMG2BuildRAPSym( hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructMatrix *R , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_StructMatrix *RAP ); + int hypre_PFMG2BuildRAPNoSym( hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructMatrix *R , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_StructMatrix *RAP ); + + /* pfmg3_setup_rap.c */ + hypre_StructMatrix *hypre_PFMG3CreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructGrid *coarse_grid , int cdir ); + int hypre_PFMG3BuildRAPSym( hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructMatrix *R , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_StructMatrix *RAP ); + int hypre_PFMG3BuildRAPNoSym( hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructMatrix *R , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_StructMatrix *RAP ); + + /* pfmg_relax.c */ + void *hypre_PFMGRelaxCreate( MPI_Comm comm ); + int hypre_PFMGRelaxDestroy( void *pfmg_relax_vdata ); + int hypre_PFMGRelax( void *pfmg_relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_PFMGRelaxSetup( void *pfmg_relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_PFMGRelaxSetType( void *pfmg_relax_vdata , int relax_type ); + int hypre_PFMGRelaxSetPreRelax( void *pfmg_relax_vdata ); + int hypre_PFMGRelaxSetPostRelax( void *pfmg_relax_vdata ); + int hypre_PFMGRelaxSetTol( void *pfmg_relax_vdata , double tol 
); + int hypre_PFMGRelaxSetMaxIter( void *pfmg_relax_vdata , int max_iter ); + int hypre_PFMGRelaxSetZeroGuess( void *pfmg_relax_vdata , int zero_guess ); + int hypre_PFMGRelaxSetTempVec( void *pfmg_relax_vdata , hypre_StructVector *t ); + + /* pfmg_setup.c */ + int hypre_PFMGSetup( void *pfmg_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_PFMGComputeDxyz( hypre_StructMatrix *A , double *dxyz ); + + /* pfmg_setup_interp.c */ + hypre_StructMatrix *hypre_PFMGCreateInterpOp( hypre_StructMatrix *A , hypre_StructGrid *cgrid , int cdir ); + int hypre_PFMGSetupInterpOp( hypre_StructMatrix *A , int cdir , hypre_Index findex , hypre_Index stride , hypre_StructMatrix *P ); + + /* pfmg_setup_rap.c */ + hypre_StructMatrix *hypre_PFMGCreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructGrid *coarse_grid , int cdir ); + int hypre_PFMGSetupRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *P , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_StructMatrix *Ac ); + + /* pfmg_solve.c */ + int hypre_PFMGSolve( void *pfmg_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + + /* point_relax.c */ + void *hypre_PointRelaxCreate( MPI_Comm comm ); + int hypre_PointRelaxDestroy( void *relax_vdata ); + int hypre_PointRelaxSetup( void *relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_PointRelax( void *relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_PointRelaxSetTol( void *relax_vdata , double tol ); + int hypre_PointRelaxSetMaxIter( void *relax_vdata , int max_iter ); + int hypre_PointRelaxSetZeroGuess( void *relax_vdata , int zero_guess ); + int hypre_PointRelaxSetWeight( void *relax_vdata , double weight ); + int hypre_PointRelaxSetNumPointsets( void *relax_vdata , int num_pointsets ); + int hypre_PointRelaxSetPointset( void *relax_vdata 
, int pointset , int pointset_size , hypre_Index pointset_stride , hypre_Index *pointset_indices ); + int hypre_PointRelaxSetPointsetRank( void *relax_vdata , int pointset , int pointset_rank ); + int hypre_PointRelaxSetTempVec( void *relax_vdata , hypre_StructVector *t ); + + /* semi_interp.c */ + void *hypre_SemiInterpCreate( void ); + int hypre_SemiInterpSetup( void *interp_vdata , hypre_StructMatrix *P , int P_stored_as_transpose , hypre_StructVector *xc , hypre_StructVector *e , hypre_Index cindex , hypre_Index findex , hypre_Index stride ); + int hypre_SemiInterp( void *interp_vdata , hypre_StructMatrix *P , hypre_StructVector *xc , hypre_StructVector *e ); + int hypre_SemiInterpDestroy( void *interp_vdata ); + + /* semi_restrict.c */ + void *hypre_SemiRestrictCreate( void ); + int hypre_SemiRestrictSetup( void *restrict_vdata , hypre_StructMatrix *R , int R_stored_as_transpose , hypre_StructVector *r , hypre_StructVector *rc , hypre_Index cindex , hypre_Index findex , hypre_Index stride ); + int hypre_SemiRestrict( void *restrict_vdata , hypre_StructMatrix *R , hypre_StructVector *r , hypre_StructVector *rc ); + int hypre_SemiRestrictDestroy( void *restrict_vdata ); + + /* smg.c */ + void *hypre_SMGCreate( MPI_Comm comm ); + int hypre_SMGDestroy( void *smg_vdata ); + int hypre_SMGSetMemoryUse( void *smg_vdata , int memory_use ); + int hypre_SMGSetTol( void *smg_vdata , double tol ); + int hypre_SMGSetMaxIter( void *smg_vdata , int max_iter ); + int hypre_SMGSetRelChange( void *smg_vdata , int rel_change ); + int hypre_SMGSetZeroGuess( void *smg_vdata , int zero_guess ); + int hypre_SMGSetNumPreRelax( void *smg_vdata , int num_pre_relax ); + int hypre_SMGSetNumPostRelax( void *smg_vdata , int num_post_relax ); + int hypre_SMGSetBase( void *smg_vdata , hypre_Index base_index , hypre_Index base_stride ); + int hypre_SMGSetLogging( void *smg_vdata , int logging ); + int hypre_SMGGetNumIterations( void *smg_vdata , int *num_iterations ); + int 
hypre_SMGPrintLogging( void *smg_vdata , int myid ); + int hypre_SMGGetFinalRelativeResidualNorm( void *smg_vdata , double *relative_residual_norm ); + int hypre_SMGSetStructVectorConstantValues( hypre_StructVector *vector , double values , hypre_BoxArray *box_array , hypre_Index stride ); + + /* smg2_setup_rap.c */ + hypre_StructMatrix *hypre_SMG2CreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *PT , hypre_StructGrid *coarse_grid ); + int hypre_SMG2BuildRAPSym( hypre_StructMatrix *A , hypre_StructMatrix *PT , hypre_StructMatrix *R , hypre_StructMatrix *RAP , hypre_Index cindex , hypre_Index cstride ); + int hypre_SMG2BuildRAPNoSym( hypre_StructMatrix *A , hypre_StructMatrix *PT , hypre_StructMatrix *R , hypre_StructMatrix *RAP , hypre_Index cindex , hypre_Index cstride ); + int hypre_SMG2RAPPeriodicSym( hypre_StructMatrix *RAP , hypre_Index cindex , hypre_Index cstride ); + int hypre_SMG2RAPPeriodicNoSym( hypre_StructMatrix *RAP , hypre_Index cindex , hypre_Index cstride ); + + /* smg3_setup_rap.c */ + hypre_StructMatrix *hypre_SMG3CreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *PT , hypre_StructGrid *coarse_grid ); + int hypre_SMG3BuildRAPSym( hypre_StructMatrix *A , hypre_StructMatrix *PT , hypre_StructMatrix *R , hypre_StructMatrix *RAP , hypre_Index cindex , hypre_Index cstride ); + int hypre_SMG3BuildRAPNoSym( hypre_StructMatrix *A , hypre_StructMatrix *PT , hypre_StructMatrix *R , hypre_StructMatrix *RAP , hypre_Index cindex , hypre_Index cstride ); + int hypre_SMG3RAPPeriodicSym( hypre_StructMatrix *RAP , hypre_Index cindex , hypre_Index cstride ); + int hypre_SMG3RAPPeriodicNoSym( hypre_StructMatrix *RAP , hypre_Index cindex , hypre_Index cstride ); + + /* smg_axpy.c */ + int hypre_SMGAxpy( double alpha , hypre_StructVector *x , hypre_StructVector *y , hypre_Index base_index , hypre_Index base_stride ); + + /* smg_relax.c */ + void *hypre_SMGRelaxCreate( MPI_Comm comm ); + int 
hypre_SMGRelaxDestroyTempVec( void *relax_vdata ); + int hypre_SMGRelaxDestroyARem( void *relax_vdata ); + int hypre_SMGRelaxDestroyASol( void *relax_vdata ); + int hypre_SMGRelaxDestroy( void *relax_vdata ); + int hypre_SMGRelax( void *relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_SMGRelaxSetup( void *relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_SMGRelaxSetupTempVec( void *relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_SMGRelaxSetupARem( void *relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_SMGRelaxSetupASol( void *relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + int hypre_SMGRelaxSetTempVec( void *relax_vdata , hypre_StructVector *temp_vec ); + int hypre_SMGRelaxSetMemoryUse( void *relax_vdata , int memory_use ); + int hypre_SMGRelaxSetTol( void *relax_vdata , double tol ); + int hypre_SMGRelaxSetMaxIter( void *relax_vdata , int max_iter ); + int hypre_SMGRelaxSetZeroGuess( void *relax_vdata , int zero_guess ); + int hypre_SMGRelaxSetNumSpaces( void *relax_vdata , int num_spaces ); + int hypre_SMGRelaxSetNumPreSpaces( void *relax_vdata , int num_pre_spaces ); + int hypre_SMGRelaxSetNumRegSpaces( void *relax_vdata , int num_reg_spaces ); + int hypre_SMGRelaxSetSpace( void *relax_vdata , int i , int space_index , int space_stride ); + int hypre_SMGRelaxSetRegSpaceRank( void *relax_vdata , int i , int reg_space_rank ); + int hypre_SMGRelaxSetPreSpaceRank( void *relax_vdata , int i , int pre_space_rank ); + int hypre_SMGRelaxSetBase( void *relax_vdata , hypre_Index base_index , hypre_Index base_stride ); + int hypre_SMGRelaxSetNumPreRelax( void *relax_vdata , int num_pre_relax ); + int hypre_SMGRelaxSetNumPostRelax( void *relax_vdata , int num_post_relax ); + int hypre_SMGRelaxSetNewMatrixStencil( void *relax_vdata , 
hypre_StructStencil *diff_stencil ); + int hypre_SMGRelaxSetupBaseBoxArray( void *relax_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + + /* smg_residual.c */ + void *hypre_SMGResidualCreate( void ); + int hypre_SMGResidualSetup( void *residual_vdata , hypre_StructMatrix *A , hypre_StructVector *x , hypre_StructVector *b , hypre_StructVector *r ); + int hypre_SMGResidual( void *residual_vdata , hypre_StructMatrix *A , hypre_StructVector *x , hypre_StructVector *b , hypre_StructVector *r ); + int hypre_SMGResidualSetBase( void *residual_vdata , hypre_Index base_index , hypre_Index base_stride ); + int hypre_SMGResidualDestroy( void *residual_vdata ); + + /* smg_residual_unrolled.c */ + void *hypre_SMGResidualCreate( void ); + int hypre_SMGResidualSetup( void *residual_vdata , hypre_StructMatrix *A , hypre_StructVector *x , hypre_StructVector *b , hypre_StructVector *r ); + int hypre_SMGResidual( void *residual_vdata , hypre_StructMatrix *A , hypre_StructVector *x , hypre_StructVector *b , hypre_StructVector *r ); + int hypre_SMGResidualSetBase( void *residual_vdata , hypre_Index base_index , hypre_Index base_stride ); + int hypre_SMGResidualDestroy( void *residual_vdata ); + + /* smg_setup.c */ + int hypre_SMGSetup( void *smg_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + + /* smg_setup_interp.c */ + hypre_StructMatrix *hypre_SMGCreateInterpOp( hypre_StructMatrix *A , hypre_StructGrid *cgrid , int cdir ); + int hypre_SMGSetupInterpOp( void *relax_data , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x , hypre_StructMatrix *PT , int cdir , hypre_Index cindex , hypre_Index findex , hypre_Index stride ); + + /* smg_setup_rap.c */ + hypre_StructMatrix *hypre_SMGCreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *PT , hypre_StructGrid *coarse_grid ); + int hypre_SMGSetupRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *PT , 
hypre_StructMatrix *Ac , hypre_Index cindex , hypre_Index cstride ); + + /* smg_setup_restrict.c */ + hypre_StructMatrix *hypre_SMGCreateRestrictOp( hypre_StructMatrix *A , hypre_StructGrid *cgrid , int cdir ); + int hypre_SMGSetupRestrictOp( hypre_StructMatrix *A , hypre_StructMatrix *R , hypre_StructVector *temp_vec , int cdir , hypre_Index cindex , hypre_Index cstride ); + + /* smg_solve.c */ + int hypre_SMGSolve( void *smg_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + + /* sparse_msg.c */ + void *hypre_SparseMSGCreate( MPI_Comm comm ); + int hypre_SparseMSGDestroy( void *smsg_vdata ); + int hypre_SparseMSGSetTol( void *smsg_vdata , double tol ); + int hypre_SparseMSGSetMaxIter( void *smsg_vdata , int max_iter ); + int hypre_SparseMSGSetJump( void *smsg_vdata , int jump ); + int hypre_SparseMSGSetRelChange( void *smsg_vdata , int rel_change ); + int hypre_SparseMSGSetZeroGuess( void *smsg_vdata , int zero_guess ); + int hypre_SparseMSGSetRelaxType( void *smsg_vdata , int relax_type ); + int hypre_SparseMSGSetNumPreRelax( void *smsg_vdata , int num_pre_relax ); + int hypre_SparseMSGSetNumPostRelax( void *smsg_vdata , int num_post_relax ); + int hypre_SparseMSGSetNumFineRelax( void *smsg_vdata , int num_fine_relax ); + int hypre_SparseMSGSetLogging( void *smsg_vdata , int logging ); + int hypre_SparseMSGGetNumIterations( void *smsg_vdata , int *num_iterations ); + int hypre_SparseMSGPrintLogging( void *smsg_vdata , int myid ); + int hypre_SparseMSGGetFinalRelativeResidualNorm( void *smsg_vdata , double *relative_residual_norm ); + + /* sparse_msg2_setup_rap.c */ + hypre_StructMatrix *hypre_SparseMSG2CreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructGrid *coarse_grid , int cdir ); + int hypre_SparseMSG2BuildRAPSym( hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructMatrix *R , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_Index stridePR , hypre_StructMatrix 
*RAP ); + int hypre_SparseMSG2BuildRAPNoSym( hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructMatrix *R , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_Index stridePR , hypre_StructMatrix *RAP ); + + /* sparse_msg3_setup_rap.c */ + hypre_StructMatrix *hypre_SparseMSG3CreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructGrid *coarse_grid , int cdir ); + int hypre_SparseMSG3BuildRAPSym( hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructMatrix *R , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_Index stridePR , hypre_StructMatrix *RAP ); + int hypre_SparseMSG3BuildRAPNoSym( hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructMatrix *R , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_Index stridePR , hypre_StructMatrix *RAP ); + + /* sparse_msg_filter.c */ + int hypre_SparseMSGFilterSetup( hypre_StructMatrix *A , int *num_grids , int lx , int ly , int lz , int jump , hypre_StructVector *visitx , hypre_StructVector *visity , hypre_StructVector *visitz ); + int hypre_SparseMSGFilter( hypre_StructVector *visit , hypre_StructVector *e , int lx , int ly , int lz , int jump ); + int hypre_SparseMSGFilterSetup( hypre_StructMatrix *A , int *num_grids , int lx , int ly , int lz , int jump , hypre_StructVector *visitx , hypre_StructVector *visity , hypre_StructVector *visitz ); + int hypre_SparseMSGFilter( hypre_StructVector *visit , hypre_StructVector *e , int lx , int ly , int lz , int jump ); + + /* sparse_msg_interp.c */ + void *hypre_SparseMSGInterpCreate( void ); + int hypre_SparseMSGInterpSetup( void *interp_vdata , hypre_StructMatrix *P , hypre_StructVector *xc , hypre_StructVector *e , hypre_Index cindex , hypre_Index findex , hypre_Index stride , hypre_Index strideP ); + int hypre_SparseMSGInterp( void *interp_vdata , hypre_StructMatrix *P , hypre_StructVector *xc , hypre_StructVector *e ); + int hypre_SparseMSGInterpDestroy( void *interp_vdata ); + + /* 
sparse_msg_restrict.c */ + void *hypre_SparseMSGRestrictCreate( void ); + int hypre_SparseMSGRestrictSetup( void *restrict_vdata , hypre_StructMatrix *R , hypre_StructVector *r , hypre_StructVector *rc , hypre_Index cindex , hypre_Index findex , hypre_Index stride , hypre_Index strideR ); + int hypre_SparseMSGRestrict( void *restrict_vdata , hypre_StructMatrix *R , hypre_StructVector *r , hypre_StructVector *rc ); + int hypre_SparseMSGRestrictDestroy( void *restrict_vdata ); + + /* sparse_msg_setup.c */ + int hypre_SparseMSGSetup( void *smsg_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + + /* sparse_msg_setup_rap.c */ + hypre_StructMatrix *hypre_SparseMSGCreateRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *P , hypre_StructGrid *coarse_grid , int cdir ); + int hypre_SparseMSGSetupRAPOp( hypre_StructMatrix *R , hypre_StructMatrix *A , hypre_StructMatrix *P , int cdir , hypre_Index cindex , hypre_Index cstride , hypre_Index stridePR , hypre_StructMatrix *Ac ); + + /* sparse_msg_solve.c */ + int hypre_SparseMSGSolve( void *smsg_vdata , hypre_StructMatrix *A , hypre_StructVector *b , hypre_StructVector *x ); + + + #ifdef __cplusplus + } + #endif + + #endif + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matrix.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matrix.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matrix.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,928 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Member functions for hypre_StructMatrix class. + * + *****************************************************************************/ + + #include "headers.h" + + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixExtractPointerByIndex + * Returns pointer to data for stencil entry corresponding to + * `index' in `matrix'. If the index does not exist in the matrix's + * stencil, the NULL pointer is returned. + *--------------------------------------------------------------------------*/ + + double * + hypre_StructMatrixExtractPointerByIndex( hypre_StructMatrix *matrix, + int b, + hypre_Index index ) + { + hypre_StructStencil *stencil; + int rank; + + stencil = hypre_StructMatrixStencil(matrix); + rank = hypre_StructStencilElementRank( stencil, index ); + + if ( rank >= 0 ) + return hypre_StructMatrixBoxData(matrix, b, rank); + else + return NULL; /* error - invalid index */ + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixCreate + *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_StructMatrixCreate( MPI_Comm comm, + hypre_StructGrid *grid, + hypre_StructStencil *user_stencil ) + { + hypre_StructMatrix *matrix; + + int i; + + matrix = hypre_CTAlloc(hypre_StructMatrix, 1); + + hypre_StructMatrixComm(matrix) = comm; + hypre_StructGridRef(grid, &hypre_StructMatrixGrid(matrix)); + hypre_StructMatrixUserStencil(matrix) = + hypre_StructStencilRef(user_stencil); + hypre_StructMatrixDataAlloced(matrix) = 1; + hypre_StructMatrixRefCount(matrix) = 1; + + /* set defaults */ + hypre_StructMatrixSymmetric(matrix) = 0; + for (i = 0; i < 6; i++) + hypre_StructMatrixNumGhost(matrix)[i] = 0; + + return matrix; + } + + 
/*-------------------------------------------------------------------------- + * hypre_StructMatrixRef + *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_StructMatrixRef( hypre_StructMatrix *matrix ) + { + hypre_StructMatrixRefCount(matrix) ++; + + return matrix; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatrixDestroy( hypre_StructMatrix *matrix ) + { + int i; + int ierr = 0; + + if (matrix) + { + hypre_StructMatrixRefCount(matrix) --; + if (hypre_StructMatrixRefCount(matrix) == 0) + { + if (hypre_StructMatrixDataAlloced(matrix)) + { + hypre_SharedTFree(hypre_StructMatrixData(matrix)); + } + hypre_CommPkgDestroy(hypre_StructMatrixCommPkg(matrix)); + + hypre_ForBoxI(i, hypre_StructMatrixDataSpace(matrix)) + hypre_TFree(hypre_StructMatrixDataIndices(matrix)[i]); + hypre_TFree(hypre_StructMatrixDataIndices(matrix)); + + hypre_BoxArrayDestroy(hypre_StructMatrixDataSpace(matrix)); + + hypre_TFree(hypre_StructMatrixSymmElements(matrix)); + hypre_StructStencilDestroy(hypre_StructMatrixUserStencil(matrix)); + hypre_StructStencilDestroy(hypre_StructMatrixStencil(matrix)); + hypre_StructGridDestroy(hypre_StructMatrixGrid(matrix)); + + hypre_TFree(matrix); + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixInitializeShell + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatrixInitializeShell( hypre_StructMatrix *matrix ) + { + int ierr = 0; + + hypre_StructGrid *grid; + + hypre_StructStencil *user_stencil; + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + int stencil_size; + int num_values; + int *symm_elements; + + int *num_ghost; + int extra_ghost[] = {0, 0, 0, 0, 0, 0}; + + hypre_BoxArray 
*data_space; + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Box *data_box; + + int **data_indices; + int data_size; + int data_box_volume; + + int i, j, d; + + grid = hypre_StructMatrixGrid(matrix); + + /*----------------------------------------------------------------------- + * Set up stencil and num_values: + * The stencil is a "symmetrized" version of the user's stencil + * as computed by hypre_StructStencilSymmetrize. + * + * The `symm_elements' array is used to determine what data is + * explicitly stored (symm_elements[i] < 0) and what data is + * not explicitly stored (symm_elements[i] >= 0), but is instead + * stored as the transpose coefficient at a neighboring grid point. + *-----------------------------------------------------------------------*/ + + if (hypre_StructMatrixStencil(matrix) == NULL) + { + user_stencil = hypre_StructMatrixUserStencil(matrix); + + hypre_StructStencilSymmetrize(user_stencil, &stencil, &symm_elements); + + stencil_shape = hypre_StructStencilShape(stencil); + stencil_size = hypre_StructStencilSize(stencil); + + if (!hypre_StructMatrixSymmetric(matrix)) + { + /* store all element data */ + for (i = 0; i < stencil_size; i++) + symm_elements[i] = -1; + num_values = stencil_size; + } + else + { + num_values = (stencil_size + 1) / 2; + } + + hypre_StructMatrixStencil(matrix) = stencil; + hypre_StructMatrixSymmElements(matrix) = symm_elements; + hypre_StructMatrixNumValues(matrix) = num_values; + } + + /*----------------------------------------------------------------------- + * Set ghost-layer size for symmetric storage + * - All stencil coeffs are to be available at each point in the + * grid, as well as in the user-specified ghost layer. 
+ *-----------------------------------------------------------------------*/ + + num_ghost = hypre_StructMatrixNumGhost(matrix); + stencil = hypre_StructMatrixStencil(matrix); + stencil_shape = hypre_StructStencilShape(stencil); + stencil_size = hypre_StructStencilSize(stencil); + symm_elements = hypre_StructMatrixSymmElements(matrix); + + for (i = 0; i < stencil_size; i++) + { + if (symm_elements[i] >= 0) + { + for (d = 0; d < 3; d++) + { + extra_ghost[2*d] = + hypre_max(extra_ghost[2*d], -hypre_IndexD(stencil_shape[i], d)); + extra_ghost[2*d + 1] = + hypre_max(extra_ghost[2*d + 1], hypre_IndexD(stencil_shape[i], d)); + } + } + } + + for (d = 0; d < 3; d++) + { + num_ghost[2*d] += extra_ghost[2*d]; + num_ghost[2*d + 1] += extra_ghost[2*d + 1]; + } + + /*----------------------------------------------------------------------- + * Set up data_space + *-----------------------------------------------------------------------*/ + + if (hypre_StructMatrixDataSpace(matrix) == NULL) + { + boxes = hypre_StructGridBoxes(grid); + data_space = hypre_BoxArrayCreate(hypre_BoxArraySize(boxes)); + + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + data_box = hypre_BoxArrayBox(data_space, i); + + hypre_CopyBox(box, data_box); + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(data_box, d) -= num_ghost[2*d]; + hypre_BoxIMaxD(data_box, d) += num_ghost[2*d + 1]; + } + } + + hypre_StructMatrixDataSpace(matrix) = data_space; + } + + /*----------------------------------------------------------------------- + * Set up data_indices array and data-size + *-----------------------------------------------------------------------*/ + + if (hypre_StructMatrixDataIndices(matrix) == NULL) + { + data_space = hypre_StructMatrixDataSpace(matrix); + data_indices = hypre_CTAlloc(int *, hypre_BoxArraySize(data_space)); + + data_size = 0; + hypre_ForBoxI(i, data_space) + { + data_box = hypre_BoxArrayBox(data_space, i); + data_box_volume = hypre_BoxVolume(data_box); + + data_indices[i] = 
hypre_CTAlloc(int, stencil_size); + + /* set pointers for "stored" coefficients */ + for (j = 0; j < stencil_size; j++) + { + if (symm_elements[j] < 0) + { + data_indices[i][j] = data_size; + data_size += data_box_volume; + } + } + + /* set pointers for "symmetric" coefficients */ + for (j = 0; j < stencil_size; j++) + { + if (symm_elements[j] >= 0) + { + data_indices[i][j] = data_indices[i][symm_elements[j]] + + hypre_BoxOffsetDistance(data_box, stencil_shape[j]); + } + } + } + + hypre_StructMatrixDataIndices(matrix) = data_indices; + hypre_StructMatrixDataSize(matrix) = data_size; + } + + /*----------------------------------------------------------------------- + * Set total number of nonzero coefficients + *-----------------------------------------------------------------------*/ + + hypre_StructMatrixGlobalSize(matrix) = + hypre_StructGridGlobalSize(grid) * stencil_size; + + /*----------------------------------------------------------------------- + * Return + *-----------------------------------------------------------------------*/ + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixInitializeData + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatrixInitializeData( hypre_StructMatrix *matrix, + double *data ) + { + int ierr = 0; + + hypre_BoxArray *data_boxes; + hypre_Box *data_box; + hypre_Index loop_size; + hypre_Index index; + hypre_IndexRef start; + hypre_Index stride; + double *datap; + int datai; + int i; + int loopi, loopj, loopk; + + hypre_StructMatrixData(matrix) = data; + hypre_StructMatrixDataAlloced(matrix) = 0; + + /*------------------------------------------------- + * If the matrix has a diagonal, set these entries + * to 1 everywhere. This reduces the complexity of + * many computations by eliminating divide-by-zero + * in the ghost region. 
+ *-------------------------------------------------*/ + + hypre_SetIndex(index, 0, 0, 0); + hypre_SetIndex(stride, 1, 1, 1); + + data_boxes = hypre_StructMatrixDataSpace(matrix); + hypre_ForBoxI(i, data_boxes) + { + datap = hypre_StructMatrixExtractPointerByIndex(matrix, i, index); + + if (datap) + { + data_box = hypre_BoxArrayBox(data_boxes, i); + start = hypre_BoxIMin(data_box); + + hypre_BoxGetSize(data_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + data_box, start, stride, datai); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, datai) + { + datap[datai] = 1.0; + } + hypre_BoxLoop1End(datai); + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixInitialize + *--------------------------------------------------------------------------*/ + int + hypre_StructMatrixInitialize( hypre_StructMatrix *matrix ) + { + int ierr = 0; + + double *data; + + ierr = hypre_StructMatrixInitializeShell(matrix); + + data = hypre_SharedCTAlloc(double, hypre_StructMatrixDataSize(matrix)); + hypre_StructMatrixInitializeData(matrix, data); + hypre_StructMatrixDataAlloced(matrix) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixSetValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatrixSetValues( hypre_StructMatrix *matrix, + hypre_Index grid_index, + int num_stencil_indices, + int *stencil_indices, + double *values, + int add_to ) + { + int ierr = 0; + + hypre_BoxArray *boxes; + hypre_Box *box; + + double *matp; + + int i, s; + + boxes = hypre_StructGridBoxes(hypre_StructMatrixGrid(matrix)); + + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + + if ((hypre_IndexX(grid_index) >= hypre_BoxIMinX(box)) && + (hypre_IndexX(grid_index) <= hypre_BoxIMaxX(box)) && + 
(hypre_IndexY(grid_index) >= hypre_BoxIMinY(box)) && + (hypre_IndexY(grid_index) <= hypre_BoxIMaxY(box)) && + (hypre_IndexZ(grid_index) >= hypre_BoxIMinZ(box)) && + (hypre_IndexZ(grid_index) <= hypre_BoxIMaxZ(box)) ) + { + if (add_to) + { + for (s = 0; s < num_stencil_indices; s++) + { + matp = hypre_StructMatrixBoxDataValue(matrix, i, + stencil_indices[s], + grid_index); + *matp += values[s]; + } + } + else + { + for (s = 0; s < num_stencil_indices; s++) + { + matp = hypre_StructMatrixBoxDataValue(matrix, i, + stencil_indices[s], + grid_index); + *matp = values[s]; + } + } + } + } + + return(ierr); + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixSetBoxValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatrixSetBoxValues( hypre_StructMatrix *matrix, + hypre_Box *value_box, + int num_stencil_indices, + int *stencil_indices, + double *values, + int add_to ) + { + int ierr = 0; + + hypre_BoxArray *grid_boxes; + hypre_Box *grid_box; + hypre_BoxArray *box_array; + hypre_Box *box; + + hypre_BoxArray *data_space; + hypre_Box *data_box; + hypre_IndexRef data_start; + hypre_Index data_stride; + int datai; + double *datap; + + hypre_Box *dval_box; + hypre_Index dval_start; + hypre_Index dval_stride; + int dvali; + + hypre_Index loop_size; + + int i, s; + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Set up `box_array' by intersecting `box' with the grid boxes + *-----------------------------------------------------------------------*/ + + grid_boxes = hypre_StructGridBoxes(hypre_StructMatrixGrid(matrix)); + box_array = hypre_BoxArrayCreate(hypre_BoxArraySize(grid_boxes)); + box = hypre_BoxCreate(); + hypre_ForBoxI(i, grid_boxes) + { + grid_box = hypre_BoxArrayBox(grid_boxes, i); + hypre_IntersectBoxes(value_box, grid_box, box); + hypre_CopyBox(box, hypre_BoxArrayBox(box_array, i)); + } + 
hypre_BoxDestroy(box); + + /*----------------------------------------------------------------------- + * Set the matrix coefficients + *-----------------------------------------------------------------------*/ + + if (box_array) + { + data_space = hypre_StructMatrixDataSpace(matrix); + hypre_SetIndex(data_stride, 1, 1, 1); + + dval_box = hypre_BoxDuplicate(value_box); + hypre_BoxIMinD(dval_box, 0) *= num_stencil_indices; + hypre_BoxIMaxD(dval_box, 0) *= num_stencil_indices; + hypre_BoxIMaxD(dval_box, 0) += num_stencil_indices - 1; + hypre_SetIndex(dval_stride, num_stencil_indices, 1, 1); + + hypre_ForBoxI(i, box_array) + { + box = hypre_BoxArrayBox(box_array, i); + data_box = hypre_BoxArrayBox(data_space, i); + + /* if there was an intersection */ + if (box) + { + data_start = hypre_BoxIMin(box); + hypre_CopyIndex(data_start, dval_start); + hypre_IndexD(dval_start, 0) *= num_stencil_indices; + + for (s = 0; s < num_stencil_indices; s++) + { + datap = hypre_StructMatrixBoxData(matrix, i, + stencil_indices[s]); + + hypre_BoxGetSize(box, loop_size); + + if (add_to) + { + hypre_BoxLoop2Begin(loop_size, + data_box,data_start,data_stride,datai, + dval_box,dval_start,dval_stride,dvali); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai,dvali + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, datai, dvali) + { + datap[datai] += values[dvali]; + } + hypre_BoxLoop2End(datai, dvali); + } + else + { + hypre_BoxLoop2Begin(loop_size, + data_box,data_start,data_stride,datai, + dval_box,dval_start,dval_stride,dvali); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai,dvali + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, datai, dvali) + { + datap[datai] = values[dvali]; + } + hypre_BoxLoop2End(datai, dvali); + } + + hypre_IndexD(dval_start, 0) ++; + } + } + } + + hypre_BoxDestroy(dval_box); + } + + hypre_BoxArrayDestroy(box_array); + + return(ierr); + } + + 
/*-------------------------------------------------------------------------- + * hypre_StructMatrixAssemble + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatrixAssemble( hypre_StructMatrix *matrix ) + { + int ierr = 0; + + int *num_ghost = hypre_StructMatrixNumGhost(matrix); + + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + + hypre_Index unit_stride; + + hypre_CommPkg *comm_pkg; + + hypre_CommHandle *comm_handle; + + /*----------------------------------------------------------------------- + * If the CommPkg has not been set up, set it up + *-----------------------------------------------------------------------*/ + + comm_pkg = hypre_StructMatrixCommPkg(matrix); + + if (!comm_pkg) + { + hypre_SetIndex(unit_stride, 1, 1, 1); + + hypre_CreateCommInfoFromNumGhost(hypre_StructMatrixGrid(matrix), + num_ghost, + &send_boxes, &recv_boxes, + &send_processes, &recv_processes); + + comm_pkg = hypre_CommPkgCreate(send_boxes, recv_boxes, + unit_stride, unit_stride, + hypre_StructMatrixDataSpace(matrix), + hypre_StructMatrixDataSpace(matrix), + send_processes, recv_processes, + hypre_StructMatrixNumValues(matrix), + hypre_StructMatrixComm(matrix), + hypre_StructGridPeriodic( + hypre_StructMatrixGrid(matrix))); + + hypre_StructMatrixCommPkg(matrix) = comm_pkg; + } + + /*----------------------------------------------------------------------- + * Update the ghost data + *-----------------------------------------------------------------------*/ + + hypre_InitializeCommunication(comm_pkg, + hypre_StructMatrixData(matrix), + hypre_StructMatrixData(matrix), + &comm_handle); + hypre_FinalizeCommunication(comm_handle); + + return(ierr); + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixSetNumGhost + *--------------------------------------------------------------------------*/ + + int + 
hypre_StructMatrixSetNumGhost( hypre_StructMatrix *matrix, + int *num_ghost ) + { + int ierr = 0; + int i; + + for (i = 0; i < 6; i++) + hypre_StructMatrixNumGhost(matrix)[i] = num_ghost[i]; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixPrint + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatrixPrint( char *filename, + hypre_StructMatrix *matrix, + int all ) + { + int ierr = 0; + + FILE *file; + char new_filename[255]; + + hypre_StructGrid *grid; + hypre_BoxArray *boxes; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + + int num_values; + + hypre_BoxArray *data_space; + + int *symm_elements; + + int i, j; + + int myid; + + /*---------------------------------------- + * Open file + *----------------------------------------*/ + + #ifdef HYPRE_USE_PTHREADS + #if MPI_Comm_rank == hypre_thread_MPI_Comm_rank + #undef MPI_Comm_rank + #endif + #endif + + MPI_Comm_rank(hypre_StructMatrixComm(matrix), &myid); + + sprintf(new_filename, "%s.%05d", filename, myid); + + if ((file = fopen(new_filename, "w")) == NULL) + { + printf("Error: can't open output file %s\n", new_filename); + exit(1); + } + + /*---------------------------------------- + * Print header info + *----------------------------------------*/ + + fprintf(file, "StructMatrix\n"); + + fprintf(file, "\nSymmetric: %d\n", hypre_StructMatrixSymmetric(matrix)); + + /* print grid info */ + fprintf(file, "\nGrid:\n"); + grid = hypre_StructMatrixGrid(matrix); + hypre_StructGridPrint(file, grid); + + /* print stencil info */ + fprintf(file, "\nStencil:\n"); + stencil = hypre_StructMatrixStencil(matrix); + stencil_shape = hypre_StructStencilShape(stencil); + + num_values = hypre_StructMatrixNumValues(matrix); + symm_elements = hypre_StructMatrixSymmElements(matrix); + fprintf(file, "%d\n", num_values); + j = 0; + for (i = 0; i < hypre_StructStencilSize(stencil); i++) + { + if 
(symm_elements[i] < 0) + { + fprintf(file, "%d: %d %d %d\n", j++, + hypre_IndexX(stencil_shape[i]), + hypre_IndexY(stencil_shape[i]), + hypre_IndexZ(stencil_shape[i])); + } + } + + /*---------------------------------------- + * Print data + *----------------------------------------*/ + + data_space = hypre_StructMatrixDataSpace(matrix); + + if (all) + boxes = data_space; + else + boxes = hypre_StructGridBoxes(grid); + + fprintf(file, "\nData:\n"); + hypre_PrintBoxArrayData(file, boxes, data_space, num_values, + hypre_StructMatrixData(matrix)); + + /*---------------------------------------- + * Close file + *----------------------------------------*/ + + fflush(file); + fclose(file); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixMigrate + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatrixMigrate( hypre_StructMatrix *from_matrix, + hypre_StructMatrix *to_matrix ) + { + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + + hypre_Index unit_stride; + + hypre_CommPkg *comm_pkg; + hypre_CommHandle *comm_handle; + + int ierr = 0; + + /*------------------------------------------------------ + * Set up hypre_CommPkg + *------------------------------------------------------*/ + + hypre_SetIndex(unit_stride, 1, 1, 1); + + hypre_CreateCommInfoFromGrids(hypre_StructMatrixGrid(from_matrix), + hypre_StructMatrixGrid(to_matrix), + &send_boxes, &recv_boxes, + &send_processes, &recv_processes); + + comm_pkg = hypre_CommPkgCreate(send_boxes, recv_boxes, + unit_stride, unit_stride, + hypre_StructMatrixDataSpace(from_matrix), + hypre_StructMatrixDataSpace(to_matrix), + send_processes, recv_processes, + hypre_StructMatrixNumValues(from_matrix), + hypre_StructMatrixComm(from_matrix), + hypre_StructGridPeriodic( + hypre_StructMatrixGrid(from_matrix))); + /* is this correct for periodic? 
*/ + + /*----------------------------------------------------------------------- + * Migrate the matrix data + *-----------------------------------------------------------------------*/ + + hypre_InitializeCommunication(comm_pkg, + hypre_StructMatrixData(from_matrix), + hypre_StructMatrixData(to_matrix), + &comm_handle); + hypre_FinalizeCommunication(comm_handle); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixRead + *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_StructMatrixRead( MPI_Comm comm, + char *filename, + int *num_ghost ) + { + FILE *file; + char new_filename[255]; + + hypre_StructMatrix *matrix; + + hypre_StructGrid *grid; + hypre_BoxArray *boxes; + int dim; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + int stencil_size; + + int num_values; + + hypre_BoxArray *data_space; + + int symmetric; + + int i, idummy; + + int myid; + + /*---------------------------------------- + * Open file + *----------------------------------------*/ + + #ifdef HYPRE_USE_PTHREADS + #if MPI_Comm_rank == hypre_thread_MPI_Comm_rank + #undef MPI_Comm_rank + #endif + #endif + + MPI_Comm_rank(comm, &myid ); + + sprintf(new_filename, "%s.%05d", filename, myid); + + if ((file = fopen(new_filename, "r")) == NULL) + { + printf("Error: can't open input file %s\n", new_filename); + exit(1); + } + + /*---------------------------------------- + * Read header info + *----------------------------------------*/ + + fscanf(file, "StructMatrix\n"); + + fscanf(file, "\nSymmetric: %d\n", &symmetric); + + /* read grid info */ + fscanf(file, "\nGrid:\n"); + hypre_StructGridRead(comm,file,&grid); + + /* read stencil info */ + fscanf(file, "\nStencil:\n"); + dim = hypre_StructGridDim(grid); + fscanf(file, "%d\n", &stencil_size); + stencil_shape = hypre_CTAlloc(hypre_Index, stencil_size); + for (i = 0; i < stencil_size; i++) + { + fscanf(file, 
"%d: %d %d %d\n", &idummy, + &hypre_IndexX(stencil_shape[i]), + &hypre_IndexY(stencil_shape[i]), + &hypre_IndexZ(stencil_shape[i])); + } + stencil = hypre_StructStencilCreate(dim, stencil_size, stencil_shape); + + /*---------------------------------------- + * Initialize the matrix + *----------------------------------------*/ + + matrix = hypre_StructMatrixCreate(comm, grid, stencil); + hypre_StructMatrixSymmetric(matrix) = symmetric; + hypre_StructMatrixSetNumGhost(matrix, num_ghost); + hypre_StructMatrixInitialize(matrix); + + /*---------------------------------------- + * Read data + *----------------------------------------*/ + + boxes = hypre_StructGridBoxes(grid); + data_space = hypre_StructMatrixDataSpace(matrix); + num_values = hypre_StructMatrixNumValues(matrix); + + fscanf(file, "\nData:\n"); + hypre_ReadBoxArrayData(file, boxes, data_space, num_values, + hypre_StructMatrixData(matrix)); + + /*---------------------------------------- + * Assemble the matrix + *----------------------------------------*/ + + hypre_StructMatrixAssemble(matrix); + + /*---------------------------------------- + * Close file + *----------------------------------------*/ + + fclose(file); + + return matrix; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matrix_mask.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matrix_mask.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matrix_mask.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,124 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Member functions for hypre_StructMatrix class. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_StructMatrixCreateMask + * This routine returns the matrix, `mask', containing pointers to + * some of the data in the input matrix `matrix'. This can be useful, + * for example, to construct "splittings" of a matrix for use in + * iterative methods. The key note here is that the matrix `mask' does + * NOT contain a copy of the data in `matrix', but it can be used as + * if it were a normal StructMatrix object. + * + * Notes: + * (1) Only the stencil, data_indices, and global_size components of the + * StructMatrix structure are modified. + * (2) PrintStructMatrix will not correctly print the stencil-to-data + * correspondence. 
+ *--------------------------------------------------------------------------*/ + + hypre_StructMatrix * + hypre_StructMatrixCreateMask( hypre_StructMatrix *matrix, + int num_stencil_indices, + int *stencil_indices ) + { + hypre_StructMatrix *mask; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + int stencil_size; + hypre_Index *mask_stencil_shape; + int mask_stencil_size; + + hypre_BoxArray *data_space; + int **data_indices; + int **mask_data_indices; + + int i, j; + + stencil = hypre_StructMatrixStencil(matrix); + stencil_shape = hypre_StructStencilShape(stencil); + stencil_size = hypre_StructStencilSize(stencil); + + mask = hypre_CTAlloc(hypre_StructMatrix, 1); + + hypre_StructMatrixComm(mask) = hypre_StructMatrixComm(matrix); + + hypre_StructGridRef(hypre_StructMatrixGrid(matrix), + &hypre_StructMatrixGrid(mask)); + + hypre_StructMatrixUserStencil(mask) = + hypre_StructStencilRef(hypre_StructMatrixUserStencil(matrix)); + + mask_stencil_size = num_stencil_indices; + mask_stencil_shape = hypre_CTAlloc(hypre_Index, num_stencil_indices); + for (i = 0; i < num_stencil_indices; i++) + { + hypre_CopyIndex(stencil_shape[stencil_indices[i]], + mask_stencil_shape[i]); + } + hypre_StructMatrixStencil(mask) = + hypre_StructStencilCreate(hypre_StructStencilDim(stencil), + mask_stencil_size, + mask_stencil_shape); + + hypre_StructMatrixNumValues(mask) = hypre_StructMatrixNumValues(matrix); + + hypre_StructMatrixDataSpace(mask) = + hypre_BoxArrayDuplicate(hypre_StructMatrixDataSpace(matrix)); + + hypre_StructMatrixData(mask) = hypre_StructMatrixData(matrix); + hypre_StructMatrixDataAlloced(mask) = 0; + hypre_StructMatrixDataSize(mask) = hypre_StructMatrixDataSize(matrix); + data_space = hypre_StructMatrixDataSpace(matrix); + data_indices = hypre_StructMatrixDataIndices(matrix); + mask_data_indices = hypre_CTAlloc(int *, hypre_BoxArraySize(data_space)); + hypre_ForBoxI(i, data_space) + { + mask_data_indices[i] = hypre_TAlloc(int, num_stencil_indices); + for (j 
= 0; j < num_stencil_indices; j++) + { + mask_data_indices[i][j] = data_indices[i][stencil_indices[j]]; + } + } + hypre_StructMatrixDataIndices(mask) = mask_data_indices; + + hypre_StructMatrixSymmetric(mask) = hypre_StructMatrixSymmetric(matrix); + + hypre_StructMatrixSymmElements(mask) = hypre_TAlloc(int, stencil_size); + for (i = 0; i < stencil_size; i++) + { + hypre_StructMatrixSymmElements(mask)[i] = + hypre_StructMatrixSymmElements(matrix)[i]; + } + + for (i = 0; i < 6; i++) + { + hypre_StructMatrixNumGhost(mask)[i] = + hypre_StructMatrixNumGhost(matrix)[i]; + } + + hypre_StructMatrixGlobalSize(mask) = + hypre_StructGridGlobalSize(hypre_StructMatrixGrid(mask)) * + mask_stencil_size; + + hypre_StructMatrixCommPkg(mask) = NULL; + + hypre_StructMatrixRefCount(mask) = 1; + + return mask; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matvec.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matvec.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_matvec.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,610 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Structured matrix-vector multiply routine + * + *****************************************************************************/ + + #include "headers.h" + + /* this currently cannot be greater than 7 */ + #ifdef MAX_DEPTH + #undef MAX_DEPTH + #endif + #define MAX_DEPTH 7 + + /*-------------------------------------------------------------------------- + * hypre_StructMatvecData data structure + *--------------------------------------------------------------------------*/ + + typedef struct + { + hypre_StructMatrix *A; + hypre_StructVector *x; + hypre_ComputePkg *compute_pkg; + + } hypre_StructMatvecData; + + /*-------------------------------------------------------------------------- + * hypre_StructMatvecCreate + *--------------------------------------------------------------------------*/ + + void * + hypre_StructMatvecCreate( ) + { + hypre_StructMatvecData *matvec_data; + + matvec_data = hypre_CTAlloc(hypre_StructMatvecData, 1); + + return (void *) matvec_data; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatvecSetup + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatvecSetup( void *matvec_vdata, + hypre_StructMatrix *A, + hypre_StructVector *x ) + { + int ierr = 0; + + hypre_StructMatvecData *matvec_data = matvec_vdata; + + hypre_StructGrid *grid; + hypre_StructStencil *stencil; + + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int **recv_processes; + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + + hypre_Index unit_stride; + + hypre_ComputePkg *compute_pkg; + + /*---------------------------------------------------------- + * Set up the compute package + 
*----------------------------------------------------------*/ + + grid = hypre_StructMatrixGrid(A); + stencil = hypre_StructMatrixStencil(A); + + hypre_CreateComputeInfo(grid, stencil, + &send_boxes, &recv_boxes, + &send_processes, &recv_processes, + &indt_boxes, &dept_boxes); + + hypre_SetIndex(unit_stride, 1, 1, 1); + hypre_ComputePkgCreate(send_boxes, recv_boxes, + unit_stride, unit_stride, + send_processes, recv_processes, + indt_boxes, dept_boxes, + unit_stride, + grid, hypre_StructVectorDataSpace(x), 1, + &compute_pkg); + + /*---------------------------------------------------------- + * Set up the matvec data structure + *----------------------------------------------------------*/ + + (matvec_data -> A) = hypre_StructMatrixRef(A); + (matvec_data -> x) = hypre_StructVectorRef(x); + (matvec_data -> compute_pkg) = compute_pkg; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatvecCompute + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatvecCompute( void *matvec_vdata, + double alpha, + hypre_StructMatrix *A, + hypre_StructVector *x, + double beta, + hypre_StructVector *y ) + { + int ierr = 0; + + hypre_StructMatvecData *matvec_data = matvec_vdata; + + hypre_ComputePkg *compute_pkg; + + hypre_CommHandle *comm_handle; + + hypre_BoxArrayArray *compute_box_aa; + hypre_BoxArray *compute_box_a; + hypre_Box *compute_box; + + hypre_Box *A_data_box; + hypre_Box *x_data_box; + hypre_Box *y_data_box; + + int Ai; + int xi; + int xoff0; + int xoff1; + int xoff2; + int xoff3; + int xoff4; + int xoff5; + int xoff6; + int yi; + + double *Ap0; + double *Ap1; + double *Ap2; + double *Ap3; + double *Ap4; + double *Ap5; + double *Ap6; + double *xp; + double *yp; + + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Index loop_size; + hypre_IndexRef start; + hypre_IndexRef stride; + + hypre_StructStencil *stencil; + hypre_Index *stencil_shape; + int 
stencil_size; + int depth; + + double temp; + int compute_i, i, j, si; + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Initialize some things + *-----------------------------------------------------------------------*/ + + compute_pkg = (matvec_data -> compute_pkg); + + stride = hypre_ComputePkgStride(compute_pkg); + + /*----------------------------------------------------------------------- + * Do (alpha == 0.0) computation + *-----------------------------------------------------------------------*/ + + if (alpha == 0.0) + { + boxes = hypre_StructGridBoxes(hypre_StructMatrixGrid(A)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + start = hypre_BoxIMin(box); + + y_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + yp = hypre_StructVectorBoxData(y, i); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, yi) + { + yp[yi] *= beta; + } + hypre_BoxLoop1End(yi); + } + + return ierr; + } + + /*----------------------------------------------------------------------- + * Do (alpha != 0.0) computation + *-----------------------------------------------------------------------*/ + + stencil = hypre_StructMatrixStencil(A); + stencil_shape = hypre_StructStencilShape(stencil); + stencil_size = hypre_StructStencilSize(stencil); + + for (compute_i = 0; compute_i < 2; compute_i++) + { + switch(compute_i) + { + case 0: + { + xp = hypre_StructVectorData(x); + hypre_InitializeIndtComputations(compute_pkg, xp, &comm_handle); + compute_box_aa = hypre_ComputePkgIndtBoxes(compute_pkg); + + /*-------------------------------------------------------------- + * initialize y= (beta/alpha)*y + *--------------------------------------------------------------*/ + + temp = beta / alpha; + if (temp != 1.0) + { + boxes = 
hypre_StructGridBoxes(hypre_StructMatrixGrid(A)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + start = hypre_BoxIMin(box); + + y_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + yp = hypre_StructVectorBoxData(y, i); + + if (temp == 0.0) + { + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, yi) + { + yp[yi] = 0.0; + } + hypre_BoxLoop1End(yi); + } + else + { + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, yi) + { + yp[yi] *= temp; + } + hypre_BoxLoop1End(yi); + } + } + } + } + break; + + case 1: + { + hypre_FinalizeIndtComputations(comm_handle); + compute_box_aa = hypre_ComputePkgDeptBoxes(compute_pkg); + } + break; + } + + /*-------------------------------------------------------------------- + * y += A*x + *--------------------------------------------------------------------*/ + + hypre_ForBoxArrayI(i, compute_box_aa) + { + compute_box_a = hypre_BoxArrayArrayBoxArray(compute_box_aa, i); + + A_data_box = hypre_BoxArrayBox(hypre_StructMatrixDataSpace(A), i); + x_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(x), i); + y_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + + xp = hypre_StructVectorBoxData(x, i); + yp = hypre_StructVectorBoxData(y, i); + + hypre_ForBoxI(j, compute_box_a) + { + compute_box = hypre_BoxArrayBox(compute_box_a, j); + + hypre_BoxGetSize(compute_box, loop_size); + start = hypre_BoxIMin(compute_box); + + /* unroll up to depth MAX_DEPTH */ + for (si = 0; si < stencil_size; si+= MAX_DEPTH) + { + depth = hypre_min(MAX_DEPTH, (stencil_size -si)); + switch(depth) + { + case 7: + Ap0 = hypre_StructMatrixBoxData(A, 
i, si+0); + Ap1 = hypre_StructMatrixBoxData(A, i, si+1); + Ap2 = hypre_StructMatrixBoxData(A, i, si+2); + Ap3 = hypre_StructMatrixBoxData(A, i, si+3); + Ap4 = hypre_StructMatrixBoxData(A, i, si+4); + Ap5 = hypre_StructMatrixBoxData(A, i, si+5); + Ap6 = hypre_StructMatrixBoxData(A, i, si+6); + + xoff0 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+0]); + xoff1 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+1]); + xoff2 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+2]); + xoff3 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+3]); + xoff4 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+4]); + xoff5 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+5]); + xoff6 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+6]); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi,xi,Ai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, yi) + { + yp[yi] += + Ap0[Ai] * xp[xi + xoff0] + + Ap1[Ai] * xp[xi + xoff1] + + Ap2[Ai] * xp[xi + xoff2] + + Ap3[Ai] * xp[xi + xoff3] + + Ap4[Ai] * xp[xi + xoff4] + + Ap5[Ai] * xp[xi + xoff5] + + Ap6[Ai] * xp[xi + xoff6]; + } + hypre_BoxLoop3End(Ai, xi, yi); + + break; + + case 6: + Ap0 = hypre_StructMatrixBoxData(A, i, si+0); + Ap1 = hypre_StructMatrixBoxData(A, i, si+1); + Ap2 = hypre_StructMatrixBoxData(A, i, si+2); + Ap3 = hypre_StructMatrixBoxData(A, i, si+3); + Ap4 = hypre_StructMatrixBoxData(A, i, si+4); + Ap5 = hypre_StructMatrixBoxData(A, i, si+5); + + xoff0 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+0]); + xoff1 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+1]); + xoff2 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+2]); + xoff3 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+3]); + xoff4 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+4]); + xoff5 = 
hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+5]); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi,xi,Ai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, yi) + { + yp[yi] += + Ap0[Ai] * xp[xi + xoff0] + + Ap1[Ai] * xp[xi + xoff1] + + Ap2[Ai] * xp[xi + xoff2] + + Ap3[Ai] * xp[xi + xoff3] + + Ap4[Ai] * xp[xi + xoff4] + + Ap5[Ai] * xp[xi + xoff5]; + } + hypre_BoxLoop3End(Ai, xi, yi); + + break; + + case 5: + Ap0 = hypre_StructMatrixBoxData(A, i, si+0); + Ap1 = hypre_StructMatrixBoxData(A, i, si+1); + Ap2 = hypre_StructMatrixBoxData(A, i, si+2); + Ap3 = hypre_StructMatrixBoxData(A, i, si+3); + Ap4 = hypre_StructMatrixBoxData(A, i, si+4); + + xoff0 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+0]); + xoff1 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+1]); + xoff2 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+2]); + xoff3 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+3]); + xoff4 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+4]); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi,xi,Ai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, yi) + { + yp[yi] += + Ap0[Ai] * xp[xi + xoff0] + + Ap1[Ai] * xp[xi + xoff1] + + Ap2[Ai] * xp[xi + xoff2] + + Ap3[Ai] * xp[xi + xoff3] + + Ap4[Ai] * xp[xi + xoff4]; + } + hypre_BoxLoop3End(Ai, xi, yi); + + break; + + case 4: + Ap0 = hypre_StructMatrixBoxData(A, i, si+0); + Ap1 = hypre_StructMatrixBoxData(A, i, si+1); + Ap2 = hypre_StructMatrixBoxData(A, i, si+2); + Ap3 = hypre_StructMatrixBoxData(A, i, si+3); + + xoff0 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+0]); + xoff1 = hypre_BoxOffsetDistance(x_data_box, + 
stencil_shape[si+1]); + xoff2 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+2]); + xoff3 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+3]); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi,xi,Ai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, yi) + { + yp[yi] += + Ap0[Ai] * xp[xi + xoff0] + + Ap1[Ai] * xp[xi + xoff1] + + Ap2[Ai] * xp[xi + xoff2] + + Ap3[Ai] * xp[xi + xoff3]; + } + hypre_BoxLoop3End(Ai, xi, yi); + + break; + + case 3: + Ap0 = hypre_StructMatrixBoxData(A, i, si+0); + Ap1 = hypre_StructMatrixBoxData(A, i, si+1); + Ap2 = hypre_StructMatrixBoxData(A, i, si+2); + + xoff0 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+0]); + xoff1 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+1]); + xoff2 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+2]); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi,xi,Ai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, yi) + { + yp[yi] += + Ap0[Ai] * xp[xi + xoff0] + + Ap1[Ai] * xp[xi + xoff1] + + Ap2[Ai] * xp[xi + xoff2]; + } + hypre_BoxLoop3End(Ai, xi, yi); + + break; + + case 2: + Ap0 = hypre_StructMatrixBoxData(A, i, si+0); + Ap1 = hypre_StructMatrixBoxData(A, i, si+1); + + xoff0 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+0]); + xoff1 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+1]); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi,xi,Ai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, yi) + { + yp[yi] += + Ap0[Ai] * xp[xi + xoff0] 
+ + Ap1[Ai] * xp[xi + xoff1]; + } + hypre_BoxLoop3End(Ai, xi, yi); + + break; + + case 1: + Ap0 = hypre_StructMatrixBoxData(A, i, si+0); + + xoff0 = hypre_BoxOffsetDistance(x_data_box, + stencil_shape[si+0]); + + hypre_BoxLoop3Begin(loop_size, + A_data_box, start, stride, Ai, + x_data_box, start, stride, xi, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi,xi,Ai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop3For(loopi, loopj, loopk, Ai, xi, yi) + { + yp[yi] += + Ap0[Ai] * xp[xi + xoff0]; + } + hypre_BoxLoop3End(Ai, xi, yi); + + break; + } + } + + if (alpha != 1.0) + { + hypre_BoxLoop1Begin(loop_size, + y_data_box, start, stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, yi) + { + yp[yi] *= alpha; + } + hypre_BoxLoop1End(yi); + } + } + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatvecDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatvecDestroy( void *matvec_vdata ) + { + int ierr = 0; + + hypre_StructMatvecData *matvec_data = matvec_vdata; + + if (matvec_data) + { + hypre_StructMatrixDestroy(matvec_data -> A); + hypre_StructVectorDestroy(matvec_data -> x); + hypre_ComputePkgDestroy(matvec_data -> compute_pkg ); + hypre_TFree(matvec_data); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructMatvec + *--------------------------------------------------------------------------*/ + + int + hypre_StructMatvec( double alpha, + hypre_StructMatrix *A, + hypre_StructVector *x, + double beta, + hypre_StructVector *y ) + { + int ierr = 0; + + void *matvec_data; + + matvec_data = hypre_StructMatvecCreate(); + ierr = hypre_StructMatvecSetup(matvec_data, A, x); + ierr = hypre_StructMatvecCompute(matvec_data, alpha, A, x, beta, y); + 
ierr = hypre_StructMatvecDestroy(matvec_data); + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_mv.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_mv.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_mv.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,1685 ---- + + #include + + #include "HYPRE_struct_mv.h" + + #ifndef hypre_STRUCT_MV_HEADER + #define hypre_STRUCT_MV_HEADER + + #include "utilities.h" + + #ifdef __cplusplus + extern "C" { + #endif + + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for the Box structures + * + *****************************************************************************/ + + #ifndef hypre_BOX_HEADER + #define hypre_BOX_HEADER + + /*-------------------------------------------------------------------------- + * hypre_Index: + * This is used to define indices in index space, or dimension + * sizes of boxes. + * + * The spatial dimensions x, y, and z may be specified by the + * integers 0, 1, and 2, respectively (see the hypre_IndexD macro below). + * This simplifies the code in the hypre_Box class by reducing code + * replication. + *--------------------------------------------------------------------------*/ + + typedef int hypre_Index[3]; + typedef int *hypre_IndexRef; + + /*-------------------------------------------------------------------------- + * hypre_Box: + * Structure describing a cartesian region of some index space. 
+ *--------------------------------------------------------------------------*/ + + typedef struct hypre_Box_struct + { + hypre_Index imin; /* min bounding indices */ + hypre_Index imax; /* max bounding indices */ + + } hypre_Box; + + /*-------------------------------------------------------------------------- + * hypre_BoxArray: + * An array of boxes. + *--------------------------------------------------------------------------*/ + + typedef struct hypre_BoxArray_struct + { + hypre_Box *boxes; /* Array of boxes */ + int size; /* Size of box array */ + int alloc_size; /* Size of currently alloced space */ + + } hypre_BoxArray; + + #define hypre_BoxArrayExcess 10 + + /*-------------------------------------------------------------------------- + * hypre_BoxArrayArray: + * An array of box arrays. + *--------------------------------------------------------------------------*/ + + typedef struct hypre_BoxArrayArray_struct + { + hypre_BoxArray **box_arrays; /* Array of pointers to box arrays */ + int size; /* Size of box array array */ + + } hypre_BoxArrayArray; + + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_Index + *--------------------------------------------------------------------------*/ + + #define hypre_IndexD(index, d) (index[d]) + + #define hypre_IndexX(index) hypre_IndexD(index, 0) + #define hypre_IndexY(index) hypre_IndexD(index, 1) + #define hypre_IndexZ(index) hypre_IndexD(index, 2) + + /*-------------------------------------------------------------------------- + * Member functions: hypre_Index + *--------------------------------------------------------------------------*/ + + #define hypre_SetIndex(index, ix, iy, iz) \ + ( hypre_IndexX(index) = ix,\ + hypre_IndexY(index) = iy,\ + hypre_IndexZ(index) = iz ) + + #define hypre_ClearIndex(index) hypre_SetIndex(index, 0, 0, 0) + + #define hypre_CopyIndex(index1, index2) \ + ( hypre_IndexX(index2) = hypre_IndexX(index1),\ + hypre_IndexY(index2) = 
hypre_IndexY(index1),\ + hypre_IndexZ(index2) = hypre_IndexZ(index1) ) + + #define hypre_CopyToCleanIndex(in_index, ndim, out_index) \ + {\ + int d;\ + for (d = 0; d < ndim; d++)\ + {\ + hypre_IndexD(out_index, d) = hypre_IndexD(in_index, d);\ + }\ + for (d = ndim; d < 3; d++)\ + {\ + hypre_IndexD(out_index, d) = 0;\ + }\ + } + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_Box + *--------------------------------------------------------------------------*/ + + #define hypre_BoxIMin(box) ((box) -> imin) + #define hypre_BoxIMax(box) ((box) -> imax) + + #define hypre_BoxIMinD(box, d) (hypre_IndexD(hypre_BoxIMin(box), d)) + #define hypre_BoxIMaxD(box, d) (hypre_IndexD(hypre_BoxIMax(box), d)) + #define hypre_BoxSizeD(box, d) \ + hypre_max(0, (hypre_BoxIMaxD(box, d) - hypre_BoxIMinD(box, d) + 1)) + + #define hypre_BoxIMinX(box) hypre_BoxIMinD(box, 0) + #define hypre_BoxIMinY(box) hypre_BoxIMinD(box, 1) + #define hypre_BoxIMinZ(box) hypre_BoxIMinD(box, 2) + + #define hypre_BoxIMaxX(box) hypre_BoxIMaxD(box, 0) + #define hypre_BoxIMaxY(box) hypre_BoxIMaxD(box, 1) + #define hypre_BoxIMaxZ(box) hypre_BoxIMaxD(box, 2) + + #define hypre_BoxSizeX(box) hypre_BoxSizeD(box, 0) + #define hypre_BoxSizeY(box) hypre_BoxSizeD(box, 1) + #define hypre_BoxSizeZ(box) hypre_BoxSizeD(box, 2) + + #define hypre_CopyBox(box1, box2) \ + ( hypre_CopyIndex(hypre_BoxIMin(box1), hypre_BoxIMin(box2)),\ + hypre_CopyIndex(hypre_BoxIMax(box1), hypre_BoxIMax(box2)) ) + + #define hypre_BoxVolume(box) \ + (hypre_BoxSizeX(box) * hypre_BoxSizeY(box) * hypre_BoxSizeZ(box)) + + #define hypre_BoxIndexRank(box, index) \ + ((hypre_IndexX(index) - hypre_BoxIMinX(box)) + \ + ((hypre_IndexY(index) - hypre_BoxIMinY(box)) + \ + ((hypre_IndexZ(index) - hypre_BoxIMinZ(box)) * \ + hypre_BoxSizeY(box))) * \ + hypre_BoxSizeX(box)) + + #define hypre_BoxOffsetDistance(box, index) \ + (hypre_IndexX(index) + \ + (hypre_IndexY(index) + \ + (hypre_IndexZ(index) * \ + 
hypre_BoxSizeY(box))) * \ + hypre_BoxSizeX(box)) + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_BoxArray + *--------------------------------------------------------------------------*/ + + #define hypre_BoxArrayBoxes(box_array) ((box_array) -> boxes) + #define hypre_BoxArrayBox(box_array, i) &((box_array) -> boxes[(i)]) + #define hypre_BoxArraySize(box_array) ((box_array) -> size) + #define hypre_BoxArrayAllocSize(box_array) ((box_array) -> alloc_size) + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_BoxArrayArray + *--------------------------------------------------------------------------*/ + + #define hypre_BoxArrayArrayBoxArrays(box_array_array) \ + ((box_array_array) -> box_arrays) + #define hypre_BoxArrayArrayBoxArray(box_array_array, i) \ + ((box_array_array) -> box_arrays[(i)]) + #define hypre_BoxArrayArraySize(box_array_array) \ + ((box_array_array) -> size) + + /*-------------------------------------------------------------------------- + * Looping macros: + *--------------------------------------------------------------------------*/ + + #define hypre_ForBoxI(i, box_array) \ + for (i = 0; i < hypre_BoxArraySize(box_array); i++) + + #define hypre_ForBoxArrayI(i, box_array_array) \ + for (i = 0; i < hypre_BoxArrayArraySize(box_array_array); i++) + + /*-------------------------------------------------------------------------- + * BoxLoop macros: + * + * NOTE: PThreads version of BoxLoop looping macros are in `box_pthreads.h'. 
+ * + *--------------------------------------------------------------------------*/ + + #ifndef HYPRE_USE_PTHREADS + + #define hypre_BoxLoopDeclareS(dbox, stride, sx, sy, sz) \ + int sx = (hypre_IndexX(stride));\ + int sy = (hypre_IndexY(stride)*hypre_BoxSizeX(dbox));\ + int sz = (hypre_IndexZ(stride)*\ + hypre_BoxSizeX(dbox)*hypre_BoxSizeY(dbox)) + + #define hypre_BoxLoopDeclareN(loop_size) \ + int hypre__nx = hypre_IndexX(loop_size);\ + int hypre__ny = hypre_IndexY(loop_size);\ + int hypre__nz = hypre_IndexZ(loop_size);\ + int hypre__mx = hypre__nx;\ + int hypre__my = hypre__ny;\ + int hypre__mz = hypre__nz;\ + int hypre__dir, hypre__max;\ + int hypre__div, hypre__mod;\ + int hypre__block, hypre__num_blocks;\ + hypre__dir = 0;\ + hypre__max = hypre__nx;\ + if (hypre__ny > hypre__max)\ + {\ + hypre__dir = 1;\ + hypre__max = hypre__ny;\ + }\ + if (hypre__nz > hypre__max)\ + {\ + hypre__dir = 2;\ + hypre__max = hypre__nz;\ + }\ + hypre__num_blocks = hypre_NumThreads();\ + if (hypre__max < hypre__num_blocks)\ + {\ + hypre__num_blocks = hypre__max;\ + }\ + if (hypre__num_blocks > 0)\ + {\ + hypre__div = hypre__max / hypre__num_blocks;\ + hypre__mod = hypre__max % hypre__num_blocks;\ + } + + #define hypre_BoxLoopSet(i, j, k) \ + i = 0;\ + j = 0;\ + k = 0;\ + hypre__nx = hypre__mx;\ + hypre__ny = hypre__my;\ + hypre__nz = hypre__mz;\ + if (hypre__num_blocks > 1)\ + {\ + if (hypre__dir == 0)\ + {\ + i = hypre__block * hypre__div + hypre_min(hypre__mod, hypre__block);\ + hypre__nx = hypre__div + ((hypre__mod > hypre__block) ? 1 : 0);\ + }\ + else if (hypre__dir == 1)\ + {\ + j = hypre__block * hypre__div + hypre_min(hypre__mod, hypre__block);\ + hypre__ny = hypre__div + ((hypre__mod > hypre__block) ? 1 : 0);\ + }\ + else if (hypre__dir == 2)\ + {\ + k = hypre__block * hypre__div + hypre_min(hypre__mod, hypre__block);\ + hypre__nz = hypre__div + ((hypre__mod > hypre__block) ? 
1 : 0);\ + }\ + } + + /*-----------------------------------*/ + + #define hypre_BoxLoop0Begin(loop_size)\ + {\ + hypre_BoxLoopDeclareN(loop_size); + + #define hypre_BoxLoop0For(i, j, k)\ + for (hypre__block = 0; hypre__block < hypre__num_blocks; hypre__block++)\ + {\ + hypre_BoxLoopSet(i, j, k);\ + for (k = 0; k < hypre__nz; k++)\ + {\ + for (j = 0; j < hypre__ny; j++)\ + {\ + for (i = 0; i < hypre__nx; i++)\ + { + + #define hypre_BoxLoop0End()\ + }\ + }\ + }\ + }\ + } + + /*-----------------------------------*/ + + #define hypre_BoxLoop1Begin(loop_size,\ + dbox1, start1, stride1, i1)\ + {\ + int hypre__i1start = hypre_BoxIndexRank(dbox1, start1);\ + hypre_BoxLoopDeclareS(dbox1, stride1, hypre__sx1, hypre__sy1, hypre__sz1);\ + hypre_BoxLoopDeclareN(loop_size); + + #define hypre_BoxLoop1For(i, j, k, i1)\ + for (hypre__block = 0; hypre__block < hypre__num_blocks; hypre__block++)\ + {\ + hypre_BoxLoopSet(i, j, k);\ + i1 = hypre__i1start + i*hypre__sx1 + j*hypre__sy1 + k*hypre__sz1;\ + for (k = 0; k < hypre__nz; k++)\ + {\ + for (j = 0; j < hypre__ny; j++)\ + {\ + for (i = 0; i < hypre__nx; i++)\ + { + + #define hypre_BoxLoop1End(i1)\ + i1 += hypre__sx1;\ + }\ + i1 += hypre__sy1 - hypre__nx*hypre__sx1;\ + }\ + i1 += hypre__sz1 - hypre__ny*hypre__sy1;\ + }\ + }\ + } + + /*-----------------------------------*/ + + #define hypre_BoxLoop2Begin(loop_size,\ + dbox1, start1, stride1, i1,\ + dbox2, start2, stride2, i2)\ + {\ + int hypre__i1start = hypre_BoxIndexRank(dbox1, start1);\ + int hypre__i2start = hypre_BoxIndexRank(dbox2, start2);\ + hypre_BoxLoopDeclareS(dbox1, stride1, hypre__sx1, hypre__sy1, hypre__sz1);\ + hypre_BoxLoopDeclareS(dbox2, stride2, hypre__sx2, hypre__sy2, hypre__sz2);\ + hypre_BoxLoopDeclareN(loop_size); + + #define hypre_BoxLoop2For(i, j, k, i1, i2)\ + for (hypre__block = 0; hypre__block < hypre__num_blocks; hypre__block++)\ + {\ + hypre_BoxLoopSet(i, j, k);\ + i1 = hypre__i1start + i*hypre__sx1 + j*hypre__sy1 + k*hypre__sz1;\ + i2 = hypre__i2start + 
i*hypre__sx2 + j*hypre__sy2 + k*hypre__sz2;\ + for (k = 0; k < hypre__nz; k++)\ + {\ + for (j = 0; j < hypre__ny; j++)\ + {\ + for (i = 0; i < hypre__nx; i++)\ + { + + #define hypre_BoxLoop2End(i1, i2)\ + i1 += hypre__sx1;\ + i2 += hypre__sx2;\ + }\ + i1 += hypre__sy1 - hypre__nx*hypre__sx1;\ + i2 += hypre__sy2 - hypre__nx*hypre__sx2;\ + }\ + i1 += hypre__sz1 - hypre__ny*hypre__sy1;\ + i2 += hypre__sz2 - hypre__ny*hypre__sy2;\ + }\ + }\ + } + + /*-----------------------------------*/ + + #define hypre_BoxLoop3Begin(loop_size,\ + dbox1, start1, stride1, i1,\ + dbox2, start2, stride2, i2,\ + dbox3, start3, stride3, i3)\ + {\ + int hypre__i1start = hypre_BoxIndexRank(dbox1, start1);\ + int hypre__i2start = hypre_BoxIndexRank(dbox2, start2);\ + int hypre__i3start = hypre_BoxIndexRank(dbox3, start3);\ + hypre_BoxLoopDeclareS(dbox1, stride1, hypre__sx1, hypre__sy1, hypre__sz1);\ + hypre_BoxLoopDeclareS(dbox2, stride2, hypre__sx2, hypre__sy2, hypre__sz2);\ + hypre_BoxLoopDeclareS(dbox3, stride3, hypre__sx3, hypre__sy3, hypre__sz3);\ + hypre_BoxLoopDeclareN(loop_size); + + #define hypre_BoxLoop3For(i, j, k, i1, i2, i3)\ + for (hypre__block = 0; hypre__block < hypre__num_blocks; hypre__block++)\ + {\ + hypre_BoxLoopSet(i, j, k);\ + i1 = hypre__i1start + i*hypre__sx1 + j*hypre__sy1 + k*hypre__sz1;\ + i2 = hypre__i2start + i*hypre__sx2 + j*hypre__sy2 + k*hypre__sz2;\ + i3 = hypre__i3start + i*hypre__sx3 + j*hypre__sy3 + k*hypre__sz3;\ + for (k = 0; k < hypre__nz; k++)\ + {\ + for (j = 0; j < hypre__ny; j++)\ + {\ + for (i = 0; i < hypre__nx; i++)\ + { + + #define hypre_BoxLoop3End(i1, i2, i3)\ + i1 += hypre__sx1;\ + i2 += hypre__sx2;\ + i3 += hypre__sx3;\ + }\ + i1 += hypre__sy1 - hypre__nx*hypre__sx1;\ + i2 += hypre__sy2 - hypre__nx*hypre__sx2;\ + i3 += hypre__sy3 - hypre__nx*hypre__sx3;\ + }\ + i1 += hypre__sz1 - hypre__ny*hypre__sy1;\ + i2 += hypre__sz2 - hypre__ny*hypre__sy2;\ + i3 += hypre__sz3 - hypre__ny*hypre__sy3;\ + }\ + }\ + } + + 
/*-----------------------------------*/ + + #define hypre_BoxLoop4Begin(loop_size,\ + dbox1, start1, stride1, i1,\ + dbox2, start2, stride2, i2,\ + dbox3, start3, stride3, i3,\ + dbox4, start4, stride4, i4)\ + {\ + int hypre__i1start = hypre_BoxIndexRank(dbox1, start1);\ + int hypre__i2start = hypre_BoxIndexRank(dbox2, start2);\ + int hypre__i3start = hypre_BoxIndexRank(dbox3, start3);\ + int hypre__i4start = hypre_BoxIndexRank(dbox4, start4);\ + hypre_BoxLoopDeclareS(dbox1, stride1, hypre__sx1, hypre__sy1, hypre__sz1);\ + hypre_BoxLoopDeclareS(dbox2, stride2, hypre__sx2, hypre__sy2, hypre__sz2);\ + hypre_BoxLoopDeclareS(dbox3, stride3, hypre__sx3, hypre__sy3, hypre__sz3);\ + hypre_BoxLoopDeclareS(dbox4, stride4, hypre__sx4, hypre__sy4, hypre__sz4);\ + hypre_BoxLoopDeclareN(loop_size); + + #define hypre_BoxLoop4For(i, j, k, i1, i2, i3, i4)\ + for (hypre__block = 0; hypre__block < hypre__num_blocks; hypre__block++)\ + {\ + hypre_BoxLoopSet(i, j, k);\ + i1 = hypre__i1start + i*hypre__sx1 + j*hypre__sy1 + k*hypre__sz1;\ + i2 = hypre__i2start + i*hypre__sx2 + j*hypre__sy2 + k*hypre__sz2;\ + i3 = hypre__i3start + i*hypre__sx3 + j*hypre__sy3 + k*hypre__sz3;\ + i4 = hypre__i4start + i*hypre__sx4 + j*hypre__sy4 + k*hypre__sz4;\ + for (k = 0; k < hypre__nz; k++)\ + {\ + for (j = 0; j < hypre__ny; j++)\ + {\ + for (i = 0; i < hypre__nx; i++)\ + { + + #define hypre_BoxLoop4End(i1, i2, i3, i4)\ + i1 += hypre__sx1;\ + i2 += hypre__sx2;\ + i3 += hypre__sx3;\ + i4 += hypre__sx4;\ + }\ + i1 += hypre__sy1 - hypre__nx*hypre__sx1;\ + i2 += hypre__sy2 - hypre__nx*hypre__sx2;\ + i3 += hypre__sy3 - hypre__nx*hypre__sx3;\ + i4 += hypre__sy4 - hypre__nx*hypre__sx4;\ + }\ + i1 += hypre__sz1 - hypre__ny*hypre__sy1;\ + i2 += hypre__sz2 - hypre__ny*hypre__sy2;\ + i3 += hypre__sz3 - hypre__ny*hypre__sy3;\ + i4 += hypre__sz4 - hypre__ny*hypre__sy4;\ + }\ + }\ + } + + /*-----------------------------------*/ + + #endif /* ifndef HYPRE_USE_PTHREADS */ + + #endif + 
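Note on the serial box-loop macros above: the End macros advance each data index incrementally (by sx per i step, then by sy - nx*sx per j step, then by sz - ny*sy per k step) instead of recomputing it. A minimal sketch, assuming a single block that starts at the origin and arbitrary integer strides, checks that these updates reproduce start + i*sx + j*sy + k*sz on every iteration. The function name count_index_mismatches and its parameters are hypothetical; hypre_BoxLoopDeclareS and hypre_BoxLoopSet are not reproduced here.

```c
/* Hypothetical stand-alone check of the index stepping used by
 * hypre_BoxLoop1For / hypre_BoxLoop1End (not part of the commit). */
static int count_index_mismatches(int start,
                                  int nx, int ny, int nz,
                                  int sx, int sy, int sz)
{
   int i, j, k;
   int i1 = start;        /* plays the role of hypre__i1start */
   int mismatches = 0;

   for (k = 0; k < nz; k++)
   {
      for (j = 0; j < ny; j++)
      {
         for (i = 0; i < nx; i++)
         {
            /* loop body: i1 should equal the closed-form index */
            if (i1 != start + i*sx + j*sy + k*sz)
            {
               mismatches++;
            }
            i1 += sx;            /* as in hypre_BoxLoop1End, x step */
         }
         i1 += sy - nx*sx;       /* rewind the x run, advance one y step */
      }
      i1 += sz - ny*sy;          /* rewind the y run, advance one z step */
   }
   return mismatches;
}
```

With any sizes and strides the function should return 0, which is why the macros can keep a running index rather than three multiplications per point.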
/*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for the Box structures + * + *****************************************************************************/ + + #ifdef HYPRE_USE_PTHREADS + + #ifndef hypre_BOX_PTHREADS_HEADER + #define hypre_BOX_PTHREADS_HEADER + + #include + #include "threading.h" + + + extern volatile int hypre_thread_counter; + extern int iteration_counter; + + /*-------------------------------------------------------------------------- + * Threaded Looping macros: + *--------------------------------------------------------------------------*/ + + #ifndef CHUNK_GOAL + #define CHUNK_GOAL (hypre_NumThreads*1) + #endif + #ifndef MIN_VOL + #define MIN_VOL 125 + #endif + #ifndef MAX_VOL + #define MAX_VOL 64000 + #endif + + #define hypre_BoxLoopDeclare(loop_size, data_box, stride, iinc, jinc, kinc) \ + int iinc = (hypre_IndexX(stride));\ + int jinc = (hypre_IndexY(stride)*hypre_BoxSizeX(data_box) -\ + hypre_IndexX(loop_size)*hypre_IndexX(stride));\ + int kinc = (hypre_IndexZ(stride)*\ + hypre_BoxSizeX(data_box)*hypre_BoxSizeY(data_box) -\ + hypre_IndexY(loop_size)*\ + hypre_IndexY(stride)*hypre_BoxSizeX(data_box)) + + #define vol_cbrt(vol) (int) pow((double)(vol), 1. / 3.) 
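Note on vol_cbrt above: the macro truncates a floating-point cube root. Depending on the C library, pow(v, 1. / 3.) for a perfect cube such as 125 may come out slightly below the exact root, in which case the cast gives one less than expected. A possible integer-only alternative is sketched below. It is not part of this commit, it assumes vol >= 0, and the name icbrt_floor is hypothetical.

```c
/* Hypothetical integer cube root: largest r with r*r*r <= vol.
 * Assumes vol >= 0. For the MAX_VOL of 64000 used above this
 * loop runs at most 40 times. */
static int icbrt_floor(int vol)
{
   int r = 0;

   /* long long keeps (r+1)^3 from overflowing for any int vol */
   while ((long long)(r + 1) * (r + 1) * (r + 1) <= vol)
   {
      r++;
   }
   return r;
}
```

The loop trades a library call for a few integer multiplications, which is likely acceptable here because the result is only used to pick chunk sizes once per loop nest.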
+ + #define hypre_ThreadLoopBegin(local_counter, init_val, stop_val, tl_index,\ + tl_mtx, tl_body)\ + for (local_counter = ifetchadd(&tl_index, &tl_mtx) + init_val;\ + local_counter < stop_val;\ + local_counter = ifetchadd(&tl_index, &tl_mtx) + init_val)\ + {\ + tl_body; + + #define hypre_ThreadLoop(tl_index,\ + tl_count, tl_release, tl_mtx)\ + if (pthread_equal(initial_thread, pthread_self()) == 0)\ + {\ + pthread_mutex_lock(&tl_mtx);\ + tl_count++;\ + if (tl_count < hypre_NumThreads)\ + {\ + pthread_mutex_unlock(&tl_mtx);\ + while (!tl_release);\ + pthread_mutex_lock(&tl_mtx);\ + tl_count--;\ + pthread_mutex_unlock(&tl_mtx);\ + while (tl_release);\ + }\ + else\ + {\ + tl_count--;\ + tl_index = 0;\ + pthread_mutex_unlock(&tl_mtx);\ + tl_release = 1;\ + while (tl_count);\ + tl_release = 0;\ + }\ + }\ + else\ + tl_index = 0 + + #define hypre_ThreadLoopOld(local_counter, init_val, stop_val, tl_index,\ + tl_count, tl_release, tl_mtx, tl_body)\ + {\ + for (local_counter = ifetchadd(&tl_index, &tl_mtx) + init_val;\ + local_counter < stop_val;\ + local_counter = ifetchadd(&tl_index, &tl_mtx) + init_val)\ + {\ + tl_body;\ + }\ + if (pthread_equal(initial_thread, pthread_self()) == 0)\ + {\ + pthread_mutex_lock(&tl_mtx);\ + tl_count++;\ + if (tl_count < hypre_NumThreads)\ + {\ + pthread_mutex_unlock(&tl_mtx);\ + while (!tl_release);\ + pthread_mutex_lock(&tl_mtx);\ + tl_count--;\ + pthread_mutex_unlock(&tl_mtx);\ + while (tl_release);\ + }\ + else\ + {\ + tl_count--;\ + tl_index = 0;\ + pthread_mutex_unlock(&tl_mtx);\ + tl_release = 1;\ + while (tl_count);\ + tl_release = 0;\ + }\ + }\ + else\ + tl_index = 0;\ + } + + #define hypre_ChunkLoopExternalSetup(hypre__nx, hypre__ny, hypre__nz)\ + int target_vol, target_area, target_len;\ + int cbrt_tar_vol, sqrt_tar_area;\ + int edge_divisor;\ + int znumchunk, ynumchunk, xnumchunk;\ + int hypre__cz, hypre__cy, hypre__cx;\ + int numchunks;\ + int clfreq[3], clreset[3];\ + int clstart[3];\ + int clfinish[3];\ + int chunkcount;\ + 
target_vol = hypre_min(hypre_max((hypre__nx * hypre__ny * hypre__nz) / CHUNK_GOAL,\ + MIN_VOL), MAX_VOL);\ + cbrt_tar_vol = (int) (pow ((double)target_vol, 1./3.));\ + edge_divisor = hypre__nz / cbrt_tar_vol + !!(hypre__nz % cbrt_tar_vol);\ + hypre__cz = hypre__nz / edge_divisor + !!(hypre__nz % edge_divisor);\ + znumchunk = hypre__nz / hypre__cz + !!(hypre__nz % hypre__cz);\ + target_area = target_vol / hypre__cz;\ + sqrt_tar_area = (int) (sqrt((double)target_area));\ + edge_divisor = hypre__ny / sqrt_tar_area + !!(hypre__ny % sqrt_tar_area);\ + hypre__cy = hypre__ny / edge_divisor + !!(hypre__ny % edge_divisor);\ + ynumchunk = hypre__ny / hypre__cy + !!(hypre__ny % hypre__cy);\ + target_len = target_area / hypre__cy;\ + edge_divisor = hypre__nx / target_len + !!(hypre__nx % target_len);\ + hypre__cx = hypre__nx / edge_divisor + !!(hypre__nx % edge_divisor);\ + xnumchunk = hypre__nx / hypre__cx + !!(hypre__nx % hypre__cx);\ + numchunks = znumchunk * ynumchunk * xnumchunk;\ + clfreq[0] = 1;\ + clreset[0] = xnumchunk;\ + clfreq[1] = clreset[0];\ + clreset[1] = ynumchunk * xnumchunk;\ + clfreq[2] = clreset[1];\ + clreset[2] = znumchunk * ynumchunk * xnumchunk + + #define hypre_ChunkLoopInternalSetup(clstart, clfinish, clreset, clfreq,\ + hypre__nx, hypre__ny, hypre__nz,\ + hypre__cx, hypre__cy, hypre__cz,\ + chunkcount)\ + clstart[0] = ((chunkcount % clreset[0]) / clfreq[0]) * hypre__cx;\ + if (clstart[0] < hypre__nx - hypre__cx)\ + clfinish[0] = clstart[0] + hypre__cx;\ + else\ + clfinish[0] = hypre__nx;\ + clstart[1] = ((chunkcount % clreset[1]) / clfreq[1]) * hypre__cy;\ + if (clstart[1] < hypre__ny - hypre__cy)\ + clfinish[1] = clstart[1] + hypre__cy;\ + else\ + clfinish[1] = hypre__ny;\ + clstart[2] = ((chunkcount % clreset[2]) / clfreq[2]) * hypre__cz;\ + if (clstart[2] < hypre__nz - hypre__cz)\ + clfinish[2] = clstart[2] + hypre__cz;\ + else\ + clfinish[2] = hypre__nz + + #define hypre_BoxLoop0Begin(loop_size)\ + {\ + int hypre__nx = hypre_IndexX(loop_size);\ 
+ int hypre__ny = hypre_IndexY(loop_size);\ + int hypre__nz = hypre_IndexZ(loop_size);\ + if (hypre__nx && hypre__ny && hypre__nz )\ + {\ + hypre_ChunkLoopExternalSetup(hypre__nx, hypre__ny, hypre__nz);\ + hypre_ThreadLoopBegin(chunkcount, 0, numchunks, iteration_counter,\ + hypre_mutex_boxloops,\ + hypre_ChunkLoopInternalSetup(clstart, clfinish, clreset, clfreq,\ + hypre__nx, hypre__ny, hypre__nz,\ + hypre__cx, hypre__cy, hypre__cz,\ + chunkcount)); + + #define hypre_BoxLoop0For(i, j, k)\ + for (k = clstart[2]; k < clfinish[2]; k++ )\ + {\ + for (j = clstart[1]; j < clfinish[1]; j++ )\ + {\ + for (i = clstart[0]; i < clfinish[0]; i++ )\ + { + + #define hypre_BoxLoop0End() }}}hypre_ThreadLoop(iteration_counter,\ + hypre_thread_counter, hypre_thread_release,\ + hypre_mutex_boxloops);}}} + + + #define hypre_BoxLoop1Begin(loop_size,\ + data_box1, start1, stride1, i1)\ + {\ + hypre_BoxLoopDeclare(loop_size, data_box1, stride1,\ + hypre__iinc1, hypre__jinc1, hypre__kinc1);\ + int hypre__nx = hypre_IndexX(loop_size);\ + int hypre__ny = hypre_IndexY(loop_size);\ + int hypre__nz = hypre_IndexZ(loop_size);\ + int orig_i1 = hypre_BoxIndexRank(data_box1, start1);\ + if (hypre__nx && hypre__ny && hypre__nz )\ + {\ + hypre_ChunkLoopExternalSetup(hypre__nx, hypre__ny, hypre__nz);\ + hypre_ThreadLoopBegin(chunkcount, 0, numchunks, iteration_counter,\ + hypre_mutex_boxloops,\ + hypre_ChunkLoopInternalSetup(clstart, clfinish, clreset, clfreq,\ + hypre__nx, hypre__ny, hypre__nz,\ + hypre__cx, hypre__cy, hypre__cz,\ + chunkcount)); + + #define hypre_BoxLoop1For(i, j, k, i1)\ + for (k = clstart[2]; k < clfinish[2]; k++)\ + {\ + for (j = clstart[1]; j < clfinish[1]; j++)\ + {\ + for (i = clstart[0]; i < clfinish[0]; i++)\ + {\ + i1 = orig_i1 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc1 +\ + (j + hypre__ny*k)*hypre__jinc1 + k*hypre__kinc1; + + #define hypre_BoxLoop1End(i1) }}}hypre_ThreadLoop(iteration_counter,\ + hypre_thread_counter, hypre_thread_release,\ + 
hypre_mutex_boxloops);}}} + + #define hypre_BoxLoop2Begin(loop_size,\ + data_box1, start1, stride1, i1,\ + data_box2, start2, stride2, i2)\ + {\ + hypre_BoxLoopDeclare(loop_size, data_box1, stride1,\ + hypre__iinc1, hypre__jinc1, hypre__kinc1);\ + hypre_BoxLoopDeclare(loop_size, data_box2, stride2,\ + hypre__iinc2, hypre__jinc2, hypre__kinc2);\ + int hypre__nx = hypre_IndexX(loop_size);\ + int hypre__ny = hypre_IndexY(loop_size);\ + int hypre__nz = hypre_IndexZ(loop_size);\ + int orig_i1 = hypre_BoxIndexRank(data_box1, start1);\ + int orig_i2 = hypre_BoxIndexRank(data_box2, start2);\ + if (hypre__nx && hypre__ny && hypre__nz )\ + {\ + hypre_ChunkLoopExternalSetup(hypre__nx, hypre__ny, hypre__nz);\ + hypre_ThreadLoopBegin(chunkcount, 0, numchunks, iteration_counter,\ + hypre_mutex_boxloops,\ + hypre_ChunkLoopInternalSetup(clstart, clfinish, clreset, clfreq,\ + hypre__nx, hypre__ny, hypre__nz,\ + hypre__cx, hypre__cy, hypre__cz,\ + chunkcount)) + + #define hypre_BoxLoop2For(i, j, k, i1, i2)\ + for (k = clstart[2]; k < clfinish[2]; k++)\ + {\ + for (j = clstart[1]; j < clfinish[1]; j++)\ + {\ + for (i = clstart[0]; i < clfinish[0]; i++)\ + {\ + i1 = orig_i1 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc1 +\ + (j + hypre__ny*k)*hypre__jinc1 + k*hypre__kinc1;\ + i2 = orig_i2 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc2 +\ + (j + hypre__ny*k)*hypre__jinc2 + k*hypre__kinc2; + + #define hypre_BoxLoop2End(i1, i2) }}}hypre_ThreadLoop(iteration_counter,\ + hypre_thread_counter, hypre_thread_release,\ + hypre_mutex_boxloops);}}} + + + + + #define hypre_BoxLoop3Begin(loop_size,\ + data_box1, start1, stride1, i1,\ + data_box2, start2, stride2, i2,\ + data_box3, start3, stride3, i3)\ + {\ + hypre_BoxLoopDeclare(loop_size, data_box1, stride1,\ + hypre__iinc1, hypre__jinc1, hypre__kinc1);\ + hypre_BoxLoopDeclare(loop_size, data_box2, stride2,\ + hypre__iinc2, hypre__jinc2, hypre__kinc2);\ + hypre_BoxLoopDeclare(loop_size, data_box3, stride3,\ + 
hypre__iinc3, hypre__jinc3, hypre__kinc3);\ + int hypre__nx = hypre_IndexX(loop_size);\ + int hypre__ny = hypre_IndexY(loop_size);\ + int hypre__nz = hypre_IndexZ(loop_size);\ + int orig_i1 = hypre_BoxIndexRank(data_box1, start1);\ + int orig_i2 = hypre_BoxIndexRank(data_box2, start2);\ + int orig_i3 = hypre_BoxIndexRank(data_box3, start3);\ + if (hypre__nx && hypre__ny && hypre__nz )\ + {\ + hypre_ChunkLoopExternalSetup(hypre__nx, hypre__ny, hypre__nz);\ + hypre_ThreadLoopBegin(chunkcount, 0, numchunks, iteration_counter,\ + hypre_mutex_boxloops,\ + hypre_ChunkLoopInternalSetup(clstart, clfinish, clreset, clfreq,\ + hypre__nx, hypre__ny, hypre__nz,\ + hypre__cx, hypre__cy, hypre__cz,\ + chunkcount)) + + #define hypre_BoxLoop3For(i, j, k, i1, i2, i3)\ + for (k = clstart[2]; k < clfinish[2]; k++)\ + {\ + for (j = clstart[1]; j < clfinish[1]; j++)\ + {\ + for (i = clstart[0]; i < clfinish[0]; i++)\ + {\ + i1 = orig_i1 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc1 +\ + (j + hypre__ny*k)*hypre__jinc1 + k*hypre__kinc1;\ + i2 = orig_i2 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc2 +\ + (j + hypre__ny*k)*hypre__jinc2 + k*hypre__kinc2;\ + i3 = orig_i3 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc3 +\ + (j + hypre__ny*k)*hypre__jinc3 + k*hypre__kinc3;\ + + #define hypre_BoxLoop3End(i1, i2, i3) }}}hypre_ThreadLoop(iteration_counter,\ + hypre_thread_counter, hypre_thread_release,\ + hypre_mutex_boxloops);}}} + + + #define hypre_BoxLoop4Begin(loop_size,\ + data_box1, start1, stride1, i1,\ + data_box2, start2, stride2, i2,\ + data_box3, start3, stride3, i3,\ + data_box4, start4, stride4, i4)\ + {\ + hypre_BoxLoopDeclare(loop_size, data_box1, stride1,\ + hypre__iinc1, hypre__jinc1, hypre__kinc1);\ + hypre_BoxLoopDeclare(loop_size, data_box2, stride2,\ + hypre__iinc2, hypre__jinc2, hypre__kinc2);\ + hypre_BoxLoopDeclare(loop_size, data_box3, stride3,\ + hypre__iinc3, hypre__jinc3, hypre__kinc3);\ + hypre_BoxLoopDeclare(loop_size, 
data_box4, stride4,\ + hypre__iinc4, hypre__jinc4, hypre__kinc4);\ + int hypre__nx = hypre_IndexX(loop_size);\ + int hypre__ny = hypre_IndexY(loop_size);\ + int hypre__nz = hypre_IndexZ(loop_size);\ + int orig_i1 = hypre_BoxIndexRank(data_box1, start1);\ + int orig_i2 = hypre_BoxIndexRank(data_box2, start2);\ + int orig_i3 = hypre_BoxIndexRank(data_box3, start3);\ + int orig_i4 = hypre_BoxIndexRank(data_box4, start4);\ + if (hypre__nx && hypre__ny && hypre__nz )\ + {\ + hypre_ChunkLoopExternalSetup(hypre__nx, hypre__ny, hypre__nz);\ + hypre_ThreadLoopBegin(chunkcount, 0, numchunks, iteration_counter,\ + hypre_mutex_boxloops,\ + hypre_ChunkLoopInternalSetup(clstart, clfinish, clreset, clfreq,\ + hypre__nx, hypre__ny, hypre__nz,\ + hypre__cx, hypre__cy, hypre__cz,\ + chunkcount)) + + #define hypre_BoxLoop4For(i, j, k, i1, i2, i3, i4)\ + for (k = clstart[2]; k < clfinish[2]; k++)\ + {\ + for (j = clstart[1]; j < clfinish[1]; j++)\ + {\ + for (i = clstart[0]; i < clfinish[0]; i++)\ + {\ + i1 = orig_i1 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc1 +\ + (j + hypre__ny*k)*hypre__jinc1 + k*hypre__kinc1;\ + i2 = orig_i2 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc2 +\ + (j + hypre__ny*k)*hypre__jinc2 + k*hypre__kinc2;\ + i3 = orig_i3 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc3 +\ + (j + hypre__ny*k)*hypre__jinc3 + k*hypre__kinc3;\ + i4 = orig_i4 +\ + (i + hypre__nx*j + hypre__nx*hypre__ny*k)*hypre__iinc4 +\ + (j + hypre__ny*k)*hypre__jinc4 + k*hypre__kinc4;\ + + + #define hypre_BoxLoop4End(i1, i2, i3, i4) }}}hypre_ThreadLoop(iteration_counter,\ + hypre_thread_counter, hypre_thread_release,\ + hypre_mutex_boxloops);}}} + + + #endif + + #endif + + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for the hypre_BoxNeighbors structures + * + *****************************************************************************/ + + #ifndef hypre_BOX_NEIGHBORS_HEADER + #define hypre_BOX_NEIGHBORS_HEADER + + /*-------------------------------------------------------------------------- + * hypre_RankLink: + *--------------------------------------------------------------------------*/ + + typedef struct hypre_RankLink_struct + { + int rank; + struct hypre_RankLink_struct *next; + + } hypre_RankLink; + + typedef hypre_RankLink *hypre_RankLinkArray[3][3][3]; + + /*-------------------------------------------------------------------------- + * hypre_BoxNeighbors: + *--------------------------------------------------------------------------*/ + + typedef struct hypre_BoxNeighbors_struct + { + hypre_BoxArray *boxes; /* boxes in the neighborhood */ + int *procs; /* procs for 'boxes' */ + int *ids; /* ids for 'boxes' */ + int first_local; /* first local box address */ + int num_local; /* number of local boxes */ + int num_periodic; /* number of periodic boxes */ + + hypre_RankLinkArray *rank_links; /* neighbors of local boxes */ + + } hypre_BoxNeighbors; + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_RankLink + *--------------------------------------------------------------------------*/ + + #define hypre_RankLinkRank(link) ((link) -> rank) + #define hypre_RankLinkDistance(link) ((link) -> distance) + #define hypre_RankLinkNext(link) ((link) -> next) + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_BoxNeighbors + *--------------------------------------------------------------------------*/ + + #define hypre_BoxNeighborsBoxes(neighbors) ((neighbors) -> boxes) 
+ #define hypre_BoxNeighborsProcs(neighbors) ((neighbors) -> procs) + #define hypre_BoxNeighborsIDs(neighbors) ((neighbors) -> ids) + #define hypre_BoxNeighborsFirstLocal(neighbors) ((neighbors) -> first_local) + #define hypre_BoxNeighborsNumLocal(neighbors) ((neighbors) -> num_local) + #define hypre_BoxNeighborsNumPeriodic(neighbors) ((neighbors) -> num_periodic) + #define hypre_BoxNeighborsRankLinks(neighbors) ((neighbors) -> rank_links) + + #define hypre_BoxNeighborsNumBoxes(neighbors) \ + (hypre_BoxArraySize(hypre_BoxNeighborsBoxes(neighbors))) + #define hypre_BoxNeighborsRankLink(neighbors, b, i, j, k) \ + (hypre_BoxNeighborsRankLinks(neighbors)[b][i+1][j+1][k+1]) + + /*-------------------------------------------------------------------------- + * Looping macros: + *--------------------------------------------------------------------------*/ + + #define hypre_BeginBoxNeighborsLoop(n, neighbors, b, distance_index)\ + {\ + int hypre__istart = 0;\ + int hypre__jstart = 0;\ + int hypre__kstart = 0;\ + int hypre__istop = 0;\ + int hypre__jstop = 0;\ + int hypre__kstop = 0;\ + hypre_RankLink *hypre__rank_link;\ + int hypre__i, hypre__j, hypre__k;\ + \ + hypre__i = hypre_IndexX(distance_index);\ + if (hypre__i < 0)\ + hypre__istart = -1;\ + else if (hypre__i > 0)\ + hypre__istop = 1;\ + \ + hypre__j = hypre_IndexY(distance_index);\ + if (hypre__j < 0)\ + hypre__jstart = -1;\ + else if (hypre__j > 0)\ + hypre__jstop = 1;\ + \ + hypre__k = hypre_IndexZ(distance_index);\ + if (hypre__k < 0)\ + hypre__kstart = -1;\ + else if (hypre__k > 0)\ + hypre__kstop = 1;\ + \ + for (hypre__k = hypre__kstart; hypre__k <= hypre__kstop; hypre__k++)\ + {\ + for (hypre__j = hypre__jstart; hypre__j <= hypre__jstop; hypre__j++)\ + {\ + for (hypre__i = hypre__istart; hypre__i <= hypre__istop; hypre__i++)\ + {\ + hypre__rank_link = \ + hypre_BoxNeighborsRankLink(neighbors, b,\ + hypre__i, hypre__j, hypre__k);\ + while (hypre__rank_link)\ + {\ + n = hypre_RankLinkRank(hypre__rank_link); + + 
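Note on hypre_BeginBoxNeighborsLoop above: the macro restricts the 3x3x3 rank-link offsets by the sign of each component of distance_index (offsets -1..0 for a negative component, 0..1 for a positive one, 0 only for zero), then walks the linked list stored at each offset. A minimal sketch of that traversal as a plain function follows. RankLink, RankLinkArray, axis_range and sum_neighbor_ranks are simplified, hypothetical stand-ins, and the box index b is dropped.

```c
#include <stddef.h>

/* Simplified stand-ins for hypre_RankLink / hypre_RankLinkArray. */
typedef struct RankLink
{
   int              rank;
   struct RankLink *next;
} RankLink;

typedef RankLink *RankLinkArray[3][3][3];

/* Offsets visited along one axis, chosen by the sign of d. */
static void axis_range(int d, int *start, int *stop)
{
   *start = (d < 0) ? -1 : 0;
   *stop  = (d > 0) ?  1 : 0;
}

/* Sum the ranks of every link reachable for the given distance index. */
static int sum_neighbor_ranks(RankLinkArray links, const int dist[3])
{
   int istart, istop, jstart, jstop, kstart, kstop;
   int i, j, k;
   int sum = 0;
   const RankLink *link;

   axis_range(dist[0], &istart, &istop);
   axis_range(dist[1], &jstart, &jstop);
   axis_range(dist[2], &kstart, &kstop);

   for (k = kstart; k <= kstop; k++)
   {
      for (j = jstart; j <= jstop; j++)
      {
         for (i = istart; i <= istop; i++)
         {
            /* offsets -1..1 are shifted to array indices 0..2 */
            for (link = links[i+1][j+1][k+1]; link != NULL; link = link->next)
            {
               sum += link->rank;
            }
         }
      }
   }
   return sum;
}
```

A function is used here only to make the control flow easy to read; the macro pair in the diff keeps the loop body inline at the call site.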
#define hypre_EndBoxNeighborsLoop\ + hypre__rank_link = hypre_RankLinkNext(hypre__rank_link);\ + }\ + }\ + }\ + }\ + } + + #endif + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for the hypre_StructGrid structures + * + *****************************************************************************/ + + #ifndef hypre_STRUCT_GRID_HEADER + #define hypre_STRUCT_GRID_HEADER + + /*-------------------------------------------------------------------------- + * hypre_StructGrid: + *--------------------------------------------------------------------------*/ + + typedef struct hypre_StructGrid_struct + { + MPI_Comm comm; + + int dim; /* Number of grid dimensions */ + + hypre_BoxArray *boxes; /* Array of boxes in this process */ + int *ids; /* Unique IDs for boxes */ + + hypre_BoxNeighbors *neighbors; /* Neighbors of boxes */ + int max_distance; /* Neighborhood size */ + + hypre_Box *bounding_box; /* Bounding box around grid */ + + int local_size; /* Number of grid points locally */ + int global_size; /* Total number of grid points */ + + hypre_Index periodic; /* Indicates if grid is periodic */ + + int ref_count; + + } hypre_StructGrid; + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_StructGrid + *--------------------------------------------------------------------------*/ + + #define hypre_StructGridComm(grid) ((grid) -> comm) + #define hypre_StructGridDim(grid) ((grid) -> dim) + #define hypre_StructGridBoxes(grid) ((grid) -> boxes) + #define hypre_StructGridIDs(grid) ((grid) -> ids) + #define 
hypre_StructGridNeighbors(grid) ((grid) -> neighbors) + #define hypre_StructGridMaxDistance(grid) ((grid) -> max_distance) + #define hypre_StructGridBoundingBox(grid) ((grid) -> bounding_box) + #define hypre_StructGridLocalSize(grid) ((grid) -> local_size) + #define hypre_StructGridGlobalSize(grid) ((grid) -> global_size) + #define hypre_StructGridPeriodic(grid) ((grid) -> periodic) + #define hypre_StructGridRefCount(grid) ((grid) -> ref_count) + + #define hypre_StructGridBox(grid, i) \ + (hypre_BoxArrayBox(hypre_StructGridBoxes(grid), i)) + #define hypre_StructGridNumBoxes(grid) \ + (hypre_BoxArraySize(hypre_StructGridBoxes(grid))) + + /*-------------------------------------------------------------------------- + * Looping macros: + *--------------------------------------------------------------------------*/ + + #define hypre_ForStructGridBoxI(i, grid) \ + hypre_ForBoxI(i, hypre_StructGridBoxes(grid)) + + #endif + + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for hypre_StructStencil data structures + * + *****************************************************************************/ + + #ifndef hypre_STRUCT_STENCIL_HEADER + #define hypre_STRUCT_STENCIL_HEADER + + /*-------------------------------------------------------------------------- + * hypre_StructStencil + *--------------------------------------------------------------------------*/ + + typedef struct hypre_StructStencil_struct + { + hypre_Index *shape; /* Description of a stencil's shape */ + int size; /* Number of stencil coefficients */ + int max_offset; + + int dim; /* Number of dimensions */ + + int ref_count; + + } hypre_StructStencil; + + /*-------------------------------------------------------------------------- + * Accessor functions for the hypre_StructStencil structure + *--------------------------------------------------------------------------*/ + + #define hypre_StructStencilShape(stencil) ((stencil) -> shape) + #define hypre_StructStencilSize(stencil) ((stencil) -> size) + #define hypre_StructStencilMaxOffset(stencil) ((stencil) -> max_offset) + #define hypre_StructStencilDim(stencil) ((stencil) -> dim) + #define hypre_StructStencilRefCount(stencil) ((stencil) -> ref_count) + + #define hypre_StructStencilElement(stencil, i) \ + hypre_StructStencilShape(stencil)[i] + + #endif + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #ifndef hypre_COMMUNICATION_HEADER + #define hypre_COMMUNICATION_HEADER + + /*-------------------------------------------------------------------------- + * hypre_CommTypeEntry: + *--------------------------------------------------------------------------*/ + + typedef struct hypre_CommTypeEntry_struct + { + hypre_Index imin; /* global imin for the data */ + hypre_Index imax; /* global imax for the data */ + int offset; /* offset for the data */ + + int dim; /* dimension of the communication */ + int length_array[4]; + int stride_array[4]; + + } hypre_CommTypeEntry; + + /*-------------------------------------------------------------------------- + * hypre_CommType: + *--------------------------------------------------------------------------*/ + + typedef struct hypre_CommType_struct + { + hypre_CommTypeEntry **comm_entries; + int num_entries; + + } hypre_CommType; + + /*-------------------------------------------------------------------------- + * hypre_CommPkg: + * Structure containing information for doing communications + *--------------------------------------------------------------------------*/ + + typedef struct hypre_CommPkg_struct + { + int num_values; + MPI_Comm comm; + + int num_sends; + int num_recvs; + int *send_procs; + int *recv_procs; + + /* remote communication information */ + hypre_CommType **send_types; + hypre_CommType **recv_types; + MPI_Datatype *send_mpi_types; + MPI_Datatype *recv_mpi_types; + + /* local copy information */ + hypre_CommType *copy_from_type; + hypre_CommType *copy_to_type; + + } hypre_CommPkg; + + /*-------------------------------------------------------------------------- + * CommHandle: + *--------------------------------------------------------------------------*/ + + typedef struct hypre_CommHandle_struct + { + hypre_CommPkg *comm_pkg; + double *send_data; + double *recv_data; + + int num_requests; + 
MPI_Request *requests; + MPI_Status *status; + + #if defined(HYPRE_COMM_SIMPLE) + double **send_buffers; + double **recv_buffers; + int *send_sizes; + int *recv_sizes; + #endif + + } hypre_CommHandle; + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_CommTypeEntry + *--------------------------------------------------------------------------*/ + + #define hypre_CommTypeEntryIMin(entry) (entry -> imin) + #define hypre_CommTypeEntryIMax(entry) (entry -> imax) + #define hypre_CommTypeEntryOffset(entry) (entry -> offset) + #define hypre_CommTypeEntryDim(entry) (entry -> dim) + #define hypre_CommTypeEntryLengthArray(entry) (entry -> length_array) + #define hypre_CommTypeEntryStrideArray(entry) (entry -> stride_array) + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_CommType + *--------------------------------------------------------------------------*/ + + #define hypre_CommTypeCommEntries(type) (type -> comm_entries) + #define hypre_CommTypeCommEntry(type, i) (type -> comm_entries[i]) + #define hypre_CommTypeNumEntries(type) (type -> num_entries) + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_CommPkg + *--------------------------------------------------------------------------*/ + + #define hypre_CommPkgNumValues(comm_pkg) (comm_pkg -> num_values) + #define hypre_CommPkgComm(comm_pkg) (comm_pkg -> comm) + + #define hypre_CommPkgNumSends(comm_pkg) (comm_pkg -> num_sends) + #define hypre_CommPkgNumRecvs(comm_pkg) (comm_pkg -> num_recvs) + #define hypre_CommPkgSendProcs(comm_pkg) (comm_pkg -> send_procs) + #define hypre_CommPkgSendProc(comm_pkg, i) (comm_pkg -> send_procs[i]) + #define hypre_CommPkgRecvProcs(comm_pkg) (comm_pkg -> recv_procs) + #define hypre_CommPkgRecvProc(comm_pkg, i) (comm_pkg -> recv_procs[i]) + + #define hypre_CommPkgSendTypes(comm_pkg) (comm_pkg -> send_types) + #define 
hypre_CommPkgSendType(comm_pkg, i) (comm_pkg -> send_types[i]) + #define hypre_CommPkgRecvTypes(comm_pkg) (comm_pkg -> recv_types) + #define hypre_CommPkgRecvType(comm_pkg, i) (comm_pkg -> recv_types[i]) + #define hypre_CommPkgSendMPITypes(comm_pkg) (comm_pkg -> send_mpi_types) + #define hypre_CommPkgSendMPIType(comm_pkg, i) (comm_pkg -> send_mpi_types[i]) + #define hypre_CommPkgRecvMPITypes(comm_pkg) (comm_pkg -> recv_mpi_types) + #define hypre_CommPkgRecvMPIType(comm_pkg, i) (comm_pkg -> recv_mpi_types[i]) + + #define hypre_CommPkgCopyFromType(comm_pkg) (comm_pkg -> copy_from_type) + #define hypre_CommPkgCopyToType(comm_pkg) (comm_pkg -> copy_to_type) + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_CommHandle + *--------------------------------------------------------------------------*/ + + #define hypre_CommHandleCommPkg(comm_handle) (comm_handle -> comm_pkg) + #define hypre_CommHandleSendData(comm_handle) (comm_handle -> send_data) + #define hypre_CommHandleRecvData(comm_handle) (comm_handle -> recv_data) + #define hypre_CommHandleNumRequests(comm_handle) (comm_handle -> num_requests) + #define hypre_CommHandleRequests(comm_handle) (comm_handle -> requests) + #define hypre_CommHandleStatus(comm_handle) (comm_handle -> status) + #if defined(HYPRE_COMM_SIMPLE) + #define hypre_CommHandleSendBuffers(comm_handle) (comm_handle -> send_buffers) + #define hypre_CommHandleRecvBuffers(comm_handle) (comm_handle -> recv_buffers) + #define hypre_CommHandleSendSizes(comm_handle) (comm_handle -> send_sizes) + #define hypre_CommHandleRecvSizes(comm_handle) (comm_handle -> recv_sizes) + #endif + + #endif + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for computation + * + *****************************************************************************/ + + #ifndef hypre_COMPUTATION_HEADER + #define hypre_COMPUTATION_HEADER + + /*-------------------------------------------------------------------------- + * hypre_ComputePkg: + * Structure containing information for doing computations. + *--------------------------------------------------------------------------*/ + + typedef struct hypre_ComputePkg_struct + { + hypre_CommPkg *comm_pkg; + + hypre_BoxArrayArray *indt_boxes; + hypre_BoxArrayArray *dept_boxes; + hypre_Index stride; + + hypre_StructGrid *grid; + hypre_BoxArray *data_space; + int num_values; + + } hypre_ComputePkg; + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_ComputePkg + *--------------------------------------------------------------------------*/ + + #define hypre_ComputePkgCommPkg(compute_pkg) (compute_pkg -> comm_pkg) + + #define hypre_ComputePkgIndtBoxes(compute_pkg) (compute_pkg -> indt_boxes) + #define hypre_ComputePkgDeptBoxes(compute_pkg) (compute_pkg -> dept_boxes) + #define hypre_ComputePkgStride(compute_pkg) (compute_pkg -> stride) + + #define hypre_ComputePkgGrid(compute_pkg) (compute_pkg -> grid) + #define hypre_ComputePkgDataSpace(compute_pkg) (compute_pkg -> data_space) + #define hypre_ComputePkgNumValues(compute_pkg) (compute_pkg -> num_values) + + #endif + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for the hypre_StructMatrix structures + * + *****************************************************************************/ + + #ifndef hypre_STRUCT_MATRIX_HEADER + #define hypre_STRUCT_MATRIX_HEADER + + /*-------------------------------------------------------------------------- + * hypre_StructMatrix: + *--------------------------------------------------------------------------*/ + + typedef struct hypre_StructMatrix_struct + { + MPI_Comm comm; + + hypre_StructGrid *grid; + hypre_StructStencil *user_stencil; + hypre_StructStencil *stencil; + int num_values; /* Number of "stored" coefficients */ + + hypre_BoxArray *data_space; + + double *data; /* Pointer to matrix data */ + int data_alloced; /* Boolean used for freeing data */ + int data_size; /* Size of matrix data */ + int **data_indices; /* num-boxes by stencil-size array + of indices into the data array. 
+ data_indices[b][s] is the starting + index of matrix data corresponding + to box b and stencil coefficient s */ + + int symmetric; /* Is the matrix symmetric */ + int *symm_elements;/* Which elements are "symmetric" */ + int num_ghost[6]; /* Num ghost layers in each direction */ + + int global_size; /* Total number of nonzero coeffs */ + + hypre_CommPkg *comm_pkg; /* Info on how to update ghost data */ + + int ref_count; + + } hypre_StructMatrix; + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_StructMatrix + *--------------------------------------------------------------------------*/ + + #define hypre_StructMatrixComm(matrix) ((matrix) -> comm) + #define hypre_StructMatrixGrid(matrix) ((matrix) -> grid) + #define hypre_StructMatrixUserStencil(matrix) ((matrix) -> user_stencil) + #define hypre_StructMatrixStencil(matrix) ((matrix) -> stencil) + #define hypre_StructMatrixNumValues(matrix) ((matrix) -> num_values) + #define hypre_StructMatrixDataSpace(matrix) ((matrix) -> data_space) + #define hypre_StructMatrixData(matrix) ((matrix) -> data) + #define hypre_StructMatrixDataAlloced(matrix) ((matrix) -> data_alloced) + #define hypre_StructMatrixDataSize(matrix) ((matrix) -> data_size) + #define hypre_StructMatrixDataIndices(matrix) ((matrix) -> data_indices) + #define hypre_StructMatrixSymmetric(matrix) ((matrix) -> symmetric) + #define hypre_StructMatrixSymmElements(matrix) ((matrix) -> symm_elements) + #define hypre_StructMatrixNumGhost(matrix) ((matrix) -> num_ghost) + #define hypre_StructMatrixGlobalSize(matrix) ((matrix) -> global_size) + #define hypre_StructMatrixCommPkg(matrix) ((matrix) -> comm_pkg) + #define hypre_StructMatrixRefCount(matrix) ((matrix) -> ref_count) + + #define hypre_StructMatrixBox(matrix, b) \ + hypre_BoxArrayBox(hypre_StructMatrixDataSpace(matrix), b) + + #define hypre_StructMatrixBoxData(matrix, b, s) \ + (hypre_StructMatrixData(matrix) + 
hypre_StructMatrixDataIndices(matrix)[b][s]) + + #define hypre_StructMatrixBoxDataValue(matrix, b, s, index) \ + (hypre_StructMatrixBoxData(matrix, b, s) + \ + hypre_BoxIndexRank(hypre_StructMatrixBox(matrix, b), index)) + + #endif + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header info for the hypre_StructVector structures + * + *****************************************************************************/ + + #ifndef hypre_STRUCT_VECTOR_HEADER + #define hypre_STRUCT_VECTOR_HEADER + + /*-------------------------------------------------------------------------- + * hypre_StructVector: + *--------------------------------------------------------------------------*/ + + typedef struct hypre_StructVector_struct + { + MPI_Comm comm; + + hypre_StructGrid *grid; + + hypre_BoxArray *data_space; + + double *data; /* Pointer to vector data */ + int data_alloced; /* Boolean used for freeing data */ + int data_size; /* Size of vector data */ + int *data_indices; /* num-boxes array of indices into + the data array. data_indices[b] + is the starting index of vector + data corresponding to box b. 
*/ + + int num_ghost[6]; /* Num ghost layers in each direction */ + + int global_size; /* Total number coefficients */ + + int ref_count; + + } hypre_StructVector; + + /*-------------------------------------------------------------------------- + * Accessor macros: hypre_StructVector + *--------------------------------------------------------------------------*/ + + #define hypre_StructVectorComm(vector) ((vector) -> comm) + #define hypre_StructVectorGrid(vector) ((vector) -> grid) + #define hypre_StructVectorDataSpace(vector) ((vector) -> data_space) + #define hypre_StructVectorData(vector) ((vector) -> data) + #define hypre_StructVectorDataAlloced(vector) ((vector) -> data_alloced) + #define hypre_StructVectorDataSize(vector) ((vector) -> data_size) + #define hypre_StructVectorDataIndices(vector) ((vector) -> data_indices) + #define hypre_StructVectorNumGhost(vector) ((vector) -> num_ghost) + #define hypre_StructVectorGlobalSize(vector) ((vector) -> global_size) + #define hypre_StructVectorRefCount(vector) ((vector) -> ref_count) + + #define hypre_StructVectorBox(vector, b) \ + hypre_BoxArrayBox(hypre_StructVectorDataSpace(vector), b) + + #define hypre_StructVectorBoxData(vector, b) \ + (hypre_StructVectorData(vector) + hypre_StructVectorDataIndices(vector)[b]) + + #define hypre_StructVectorBoxDataValue(vector, b, index) \ + (hypre_StructVectorBoxData(vector, b) + \ + hypre_BoxIndexRank(hypre_StructVectorBox(vector, b), index)) + + #endif + + /* HYPRE_struct_grid.c */ + int HYPRE_StructGridCreate( MPI_Comm comm , int dim , HYPRE_StructGrid *grid ); + int HYPRE_StructGridDestroy( HYPRE_StructGrid grid ); + int HYPRE_StructGridSetExtents( HYPRE_StructGrid grid , int *ilower , int *iupper ); + int HYPRE_StructGridSetPeriodic( HYPRE_StructGrid grid , int *periodic ); + int HYPRE_StructGridAssemble( HYPRE_StructGrid grid ); + + /* HYPRE_struct_matrix.c */ + int HYPRE_StructMatrixCreate( MPI_Comm comm , HYPRE_StructGrid grid , HYPRE_StructStencil stencil , 
HYPRE_StructMatrix *matrix ); + int HYPRE_StructMatrixDestroy( HYPRE_StructMatrix matrix ); + int HYPRE_StructMatrixInitialize( HYPRE_StructMatrix matrix ); + int HYPRE_StructMatrixSetValues( HYPRE_StructMatrix matrix , int *grid_index , int num_stencil_indices , int *stencil_indices , double *values ); + int HYPRE_StructMatrixSetBoxValues( HYPRE_StructMatrix matrix , int *ilower , int *iupper , int num_stencil_indices , int *stencil_indices , double *values ); + int HYPRE_StructMatrixAddToValues( HYPRE_StructMatrix matrix , int *grid_index , int num_stencil_indices , int *stencil_indices , double *values ); + int HYPRE_StructMatrixAddToBoxValues( HYPRE_StructMatrix matrix , int *ilower , int *iupper , int num_stencil_indices , int *stencil_indices , double *values ); + int HYPRE_StructMatrixAssemble( HYPRE_StructMatrix matrix ); + int HYPRE_StructMatrixSetNumGhost( HYPRE_StructMatrix matrix , int *num_ghost ); + int HYPRE_StructMatrixGetGrid( HYPRE_StructMatrix matrix , HYPRE_StructGrid *grid ); + int HYPRE_StructMatrixSetSymmetric( HYPRE_StructMatrix matrix , int symmetric ); + int HYPRE_StructMatrixPrint( char *filename , HYPRE_StructMatrix matrix , int all ); + + /* HYPRE_struct_stencil.c */ + int HYPRE_StructStencilCreate( int dim , int size , HYPRE_StructStencil *stencil ); + int HYPRE_StructStencilSetElement( HYPRE_StructStencil stencil , int element_index , int *offset ); + int HYPRE_StructStencilDestroy( HYPRE_StructStencil stencil ); + + /* HYPRE_struct_vector.c */ + int HYPRE_StructVectorCreate( MPI_Comm comm , HYPRE_StructGrid grid , HYPRE_StructVector *vector ); + int HYPRE_StructVectorDestroy( HYPRE_StructVector struct_vector ); + int HYPRE_StructVectorInitialize( HYPRE_StructVector vector ); + int HYPRE_StructVectorSetValues( HYPRE_StructVector vector , int *grid_index , double values ); + int HYPRE_StructVectorSetBoxValues( HYPRE_StructVector vector , int *ilower , int *iupper , double *values ); + int HYPRE_StructVectorAddToValues( 
HYPRE_StructVector vector , int *grid_index , double values ); + int HYPRE_StructVectorAddToBoxValues( HYPRE_StructVector vector , int *ilower , int *iupper , double *values ); + int HYPRE_StructVectorGetValues( HYPRE_StructVector vector , int *grid_index , double *values_ptr ); + int HYPRE_StructVectorGetBoxValues( HYPRE_StructVector vector , int *ilower , int *iupper , double *values ); + int HYPRE_StructVectorAssemble( HYPRE_StructVector vector ); + int HYPRE_StructVectorPrint( char *filename , HYPRE_StructVector vector , int all ); + int HYPRE_StructVectorSetNumGhost( HYPRE_StructVector vector , int *num_ghost ); + int HYPRE_StructVectorSetConstantValues( HYPRE_StructVector vector , double values ); + int HYPRE_StructVectorGetMigrateCommPkg( HYPRE_StructVector from_vector , HYPRE_StructVector to_vector , HYPRE_CommPkg *comm_pkg ); + int HYPRE_StructVectorMigrate( HYPRE_CommPkg comm_pkg , HYPRE_StructVector from_vector , HYPRE_StructVector to_vector ); + int HYPRE_CommPkgDestroy( HYPRE_CommPkg comm_pkg ); + + /* box.c */ + hypre_Box *hypre_BoxCreate( void ); + int hypre_BoxSetExtents( hypre_Box *box , hypre_Index imin , hypre_Index imax ); + hypre_BoxArray *hypre_BoxArrayCreate( int size ); + int hypre_BoxArraySetSize( hypre_BoxArray *box_array , int size ); + hypre_BoxArrayArray *hypre_BoxArrayArrayCreate( int size ); + int hypre_BoxDestroy( hypre_Box *box ); + int hypre_BoxArrayDestroy( hypre_BoxArray *box_array ); + int hypre_BoxArrayArrayDestroy( hypre_BoxArrayArray *box_array_array ); + hypre_Box *hypre_BoxDuplicate( hypre_Box *box ); + hypre_BoxArray *hypre_BoxArrayDuplicate( hypre_BoxArray *box_array ); + hypre_BoxArrayArray *hypre_BoxArrayArrayDuplicate( hypre_BoxArrayArray *box_array_array ); + int hypre_AppendBox( hypre_Box *box , hypre_BoxArray *box_array ); + int hypre_DeleteBox( hypre_BoxArray *box_array , int index ); + int hypre_AppendBoxArray( hypre_BoxArray *box_array_0 , hypre_BoxArray *box_array_1 ); + int hypre_BoxGetSize( hypre_Box *box , 
hypre_Index size ); + int hypre_BoxGetStrideSize( hypre_Box *box , hypre_Index stride , hypre_Index size ); + int hypre_IModPeriod( int i , int period ); + int hypre_IModPeriodX( hypre_Index index , hypre_Index periodic ); + int hypre_IModPeriodY( hypre_Index index , hypre_Index periodic ); + int hypre_IModPeriodZ( hypre_Index index , hypre_Index periodic ); + + /* box_algebra.c */ + int hypre_IntersectBoxes( hypre_Box *box1 , hypre_Box *box2 , hypre_Box *ibox ); + int hypre_SubtractBoxes( hypre_Box *box1 , hypre_Box *box2 , hypre_BoxArray *box_array ); + int hypre_UnionBoxes( hypre_BoxArray *boxes ); + + /* box_alloc.c */ + int hypre_BoxInitializeMemory( const int at_a_time ); + int hypre_BoxFinalizeMemory( void ); + hypre_Box *hypre_BoxAlloc( void ); + int hypre_BoxFree( hypre_Box *box ); + + /* box_neighbors.c */ + int hypre_RankLinkCreate( int rank , hypre_RankLink **rank_link_ptr ); + int hypre_RankLinkDestroy( hypre_RankLink *rank_link ); + int hypre_BoxNeighborsCreate( hypre_BoxArray *boxes , int *procs , int *ids , int first_local , int num_local , int num_periodic , hypre_BoxNeighbors **neighbors_ptr ); + int hypre_BoxNeighborsAssemble( hypre_BoxNeighbors *neighbors , int max_distance , int prune ); + int hypre_BoxNeighborsDestroy( hypre_BoxNeighbors *neighbors ); + + /* communication.c */ + hypre_CommPkg *hypre_CommPkgCreate( hypre_BoxArrayArray *send_boxes , hypre_BoxArrayArray *recv_boxes , hypre_Index send_stride , hypre_Index recv_stride , hypre_BoxArray *send_data_space , hypre_BoxArray *recv_data_space , int **send_processes , int **recv_processes , int num_values , MPI_Comm comm , hypre_Index periodic ); + int hypre_CommPkgDestroy( hypre_CommPkg *comm_pkg ); + int hypre_InitializeCommunication( hypre_CommPkg *comm_pkg , double *send_data , double *recv_data , hypre_CommHandle **comm_handle_ptr ); + int hypre_InitializeCommunication( hypre_CommPkg *comm_pkg , double *send_data , double *recv_data , hypre_CommHandle **comm_handle_ptr ); + int 
hypre_FinalizeCommunication( hypre_CommHandle *comm_handle ); + int hypre_FinalizeCommunication( hypre_CommHandle *comm_handle ); + int hypre_ExchangeLocalData( hypre_CommPkg *comm_pkg , double *send_data , double *recv_data ); + hypre_CommType *hypre_CommTypeCreate( hypre_CommTypeEntry **comm_entries , int num_entries ); + int hypre_CommTypeDestroy( hypre_CommType *comm_type ); + hypre_CommTypeEntry *hypre_CommTypeEntryCreate( hypre_Box *box , hypre_Index stride , hypre_Box *data_box , int num_values , int data_box_offset ); + int hypre_CommTypeEntryDestroy( hypre_CommTypeEntry *comm_entry ); + int hypre_CommPkgCreateInfo( hypre_BoxArrayArray *boxes , hypre_Index stride , hypre_BoxArray *data_space , int **processes , int num_values , MPI_Comm comm , hypre_Index periodic , int *num_comms_ptr , int **comm_processes_ptr , hypre_CommType ***comm_types_ptr , hypre_CommType **copy_type_ptr ); + int hypre_CommTypeSort( hypre_CommType *comm_type , hypre_Index periodic ); + int hypre_CommPkgCommit( hypre_CommPkg *comm_pkg ); + int hypre_CommPkgUnCommit( hypre_CommPkg *comm_pkg ); + int hypre_CommTypeBuildMPI( int num_comms , int *comm_procs , hypre_CommType **comm_types , MPI_Datatype *comm_mpi_types ); + int hypre_CommTypeEntryBuildMPI( hypre_CommTypeEntry *comm_entry , MPI_Datatype *comm_entry_mpi_type ); + + /* communication_info.c */ + int hypre_CreateCommInfoFromStencil( hypre_StructGrid *grid , hypre_StructStencil *stencil , hypre_BoxArrayArray **send_boxes_ptr , hypre_BoxArrayArray **recv_boxes_ptr , int ***send_procs_ptr , int ***recv_procs_ptr ); + int hypre_CreateCommInfoFromNumGhost( hypre_StructGrid *grid , int *num_ghost , hypre_BoxArrayArray **send_boxes_ptr , hypre_BoxArrayArray **recv_boxes_ptr , int ***send_procs_ptr , int ***recv_procs_ptr ); + int hypre_CreateCommInfoFromGrids( hypre_StructGrid *from_grid , hypre_StructGrid *to_grid , hypre_BoxArrayArray **send_boxes_ptr , hypre_BoxArrayArray **recv_boxes_ptr , int ***send_procs_ptr , int 
***recv_procs_ptr ); + + /* computation.c */ + int hypre_CreateComputeInfo( hypre_StructGrid *grid , hypre_StructStencil *stencil , hypre_BoxArrayArray **send_boxes_ptr , hypre_BoxArrayArray **recv_boxes_ptr , int ***send_processes_ptr , int ***recv_processes_ptr , hypre_BoxArrayArray **indt_boxes_ptr , hypre_BoxArrayArray **dept_boxes_ptr ); + int hypre_ComputePkgCreate( hypre_BoxArrayArray *send_boxes , hypre_BoxArrayArray *recv_boxes , hypre_Index send_stride , hypre_Index recv_stride , int **send_processes , int **recv_processes , hypre_BoxArrayArray *indt_boxes , hypre_BoxArrayArray *dept_boxes , hypre_Index stride , hypre_StructGrid *grid , hypre_BoxArray *data_space , int num_values , hypre_ComputePkg **compute_pkg_ptr ); + int hypre_ComputePkgDestroy( hypre_ComputePkg *compute_pkg ); + int hypre_InitializeIndtComputations( hypre_ComputePkg *compute_pkg , double *data , hypre_CommHandle **comm_handle_ptr ); + int hypre_FinalizeIndtComputations( hypre_CommHandle *comm_handle ); + + /* grow.c */ + hypre_BoxArray *hypre_GrowBoxByStencil( hypre_Box *box , hypre_StructStencil *stencil , int transpose ); + hypre_BoxArrayArray *hypre_GrowBoxArrayByStencil( hypre_BoxArray *box_array , hypre_StructStencil *stencil , int transpose ); + + /* project.c */ + int hypre_ProjectBox( hypre_Box *box , hypre_Index index , hypre_Index stride ); + int hypre_ProjectBoxArray( hypre_BoxArray *box_array , hypre_Index index , hypre_Index stride ); + int hypre_ProjectBoxArrayArray( hypre_BoxArrayArray *box_array_array , hypre_Index index , hypre_Index stride ); + + /* struct_axpy.c */ + int hypre_StructAxpy( double alpha , hypre_StructVector *x , hypre_StructVector *y ); + + /* struct_copy.c */ + int hypre_StructCopy( hypre_StructVector *x , hypre_StructVector *y ); + + /* struct_grid.c */ + int hypre_StructGridCreate( MPI_Comm comm , int dim , hypre_StructGrid **grid_ptr ); + int hypre_StructGridRef( hypre_StructGrid *grid , hypre_StructGrid **grid_ref ); + int 
hypre_StructGridDestroy( hypre_StructGrid *grid ); + int hypre_StructGridSetHoodInfo( hypre_StructGrid *grid , int max_distance ); + int hypre_StructGridSetPeriodic( hypre_StructGrid *grid , hypre_Index periodic ); + int hypre_StructGridSetExtents( hypre_StructGrid *grid , hypre_Index ilower , hypre_Index iupper ); + int hypre_StructGridSetBoxes( hypre_StructGrid *grid , hypre_BoxArray *boxes ); + int hypre_StructGridSetHood( hypre_StructGrid *grid , hypre_BoxArray *hood_boxes , int *hood_procs , int *hood_ids , int first_local , int num_local , int num_periodic , hypre_Box *bounding_box ); + int hypre_StructGridAssemble( hypre_StructGrid *grid ); + int hypre_GatherAllBoxes( MPI_Comm comm , hypre_BoxArray *boxes , hypre_BoxArray **all_boxes_ptr , int **all_procs_ptr , int *first_local_ptr ); + int hypre_StructGridPrint( FILE *file , hypre_StructGrid *grid ); + int hypre_StructGridRead( MPI_Comm comm , FILE *file , hypre_StructGrid **grid_ptr ); + int hypre_StructGridPeriodicAllBoxes( hypre_StructGrid *grid , hypre_BoxArray **all_boxes_ptr , int **all_procs_ptr , int *first_local_ptr , int *num_periodic_ptr ); + + /* struct_innerprod.c */ + double hypre_StructInnerProd( hypre_StructVector *x , hypre_StructVector *y ); + + /* struct_io.c */ + int hypre_PrintBoxArrayData( FILE *file , hypre_BoxArray *box_array , hypre_BoxArray *data_space , int num_values , double *data ); + int hypre_ReadBoxArrayData( FILE *file , hypre_BoxArray *box_array , hypre_BoxArray *data_space , int num_values , double *data ); + + /* struct_matrix.c */ + double *hypre_StructMatrixExtractPointerByIndex( hypre_StructMatrix *matrix , int b , hypre_Index index ); + hypre_StructMatrix *hypre_StructMatrixCreate( MPI_Comm comm , hypre_StructGrid *grid , hypre_StructStencil *user_stencil ); + hypre_StructMatrix *hypre_StructMatrixRef( hypre_StructMatrix *matrix ); + int hypre_StructMatrixDestroy( hypre_StructMatrix *matrix ); + int hypre_StructMatrixInitializeShell( hypre_StructMatrix *matrix ); + 
int hypre_StructMatrixInitializeData( hypre_StructMatrix *matrix , double *data ); + int hypre_StructMatrixInitialize( hypre_StructMatrix *matrix ); + int hypre_StructMatrixSetValues( hypre_StructMatrix *matrix , hypre_Index grid_index , int num_stencil_indices , int *stencil_indices , double *values , int add_to ); + int hypre_StructMatrixSetBoxValues( hypre_StructMatrix *matrix , hypre_Box *value_box , int num_stencil_indices , int *stencil_indices , double *values , int add_to ); + int hypre_StructMatrixAssemble( hypre_StructMatrix *matrix ); + int hypre_StructMatrixSetNumGhost( hypre_StructMatrix *matrix , int *num_ghost ); + int hypre_StructMatrixPrint( char *filename , hypre_StructMatrix *matrix , int all ); + int hypre_StructMatrixMigrate( hypre_StructMatrix *from_matrix , hypre_StructMatrix *to_matrix ); + hypre_StructMatrix *hypre_StructMatrixRead( MPI_Comm comm , char *filename , int *num_ghost ); + + /* struct_matrix_mask.c */ + hypre_StructMatrix *hypre_StructMatrixCreateMask( hypre_StructMatrix *matrix , int num_stencil_indices , int *stencil_indices ); + + /* struct_matvec.c */ + void *hypre_StructMatvecCreate( void ); + int hypre_StructMatvecSetup( void *matvec_vdata , hypre_StructMatrix *A , hypre_StructVector *x ); + int hypre_StructMatvecCompute( void *matvec_vdata , double alpha , hypre_StructMatrix *A , hypre_StructVector *x , double beta , hypre_StructVector *y ); + int hypre_StructMatvecDestroy( void *matvec_vdata ); + int hypre_StructMatvec( double alpha , hypre_StructMatrix *A , hypre_StructVector *x , double beta , hypre_StructVector *y ); + + /* struct_scale.c */ + int hypre_StructScale( double alpha , hypre_StructVector *y ); + + /* struct_stencil.c */ + hypre_StructStencil *hypre_StructStencilCreate( int dim , int size , hypre_Index *shape ); + hypre_StructStencil *hypre_StructStencilRef( hypre_StructStencil *stencil ); + int hypre_StructStencilDestroy( hypre_StructStencil *stencil ); + int hypre_StructStencilElementRank( 
hypre_StructStencil *stencil , hypre_Index stencil_element ); + int hypre_StructStencilSymmetrize( hypre_StructStencil *stencil , hypre_StructStencil **symm_stencil_ptr , int **symm_elements_ptr ); + + /* struct_vector.c */ + hypre_StructVector *hypre_StructVectorCreate( MPI_Comm comm , hypre_StructGrid *grid ); + hypre_StructVector *hypre_StructVectorRef( hypre_StructVector *vector ); + int hypre_StructVectorDestroy( hypre_StructVector *vector ); + int hypre_StructVectorInitializeShell( hypre_StructVector *vector ); + int hypre_StructVectorInitializeData( hypre_StructVector *vector , double *data ); + int hypre_StructVectorInitialize( hypre_StructVector *vector ); + int hypre_StructVectorSetValues( hypre_StructVector *vector , hypre_Index grid_index , double values , int add_to ); + int hypre_StructVectorSetBoxValues( hypre_StructVector *vector , hypre_Box *value_box , double *values , int add_to ); + int hypre_StructVectorGetValues( hypre_StructVector *vector , hypre_Index grid_index , double *values_ptr ); + int hypre_StructVectorGetBoxValues( hypre_StructVector *vector , hypre_Box *value_box , double *values ); + int hypre_StructVectorSetNumGhost( hypre_StructVector *vector , int *num_ghost ); + int hypre_StructVectorAssemble( hypre_StructVector *vector ); + int hypre_StructVectorSetConstantValues( hypre_StructVector *vector , double values ); + int hypre_StructVectorClearGhostValues( hypre_StructVector *vector ); + int hypre_StructVectorClearAllValues( hypre_StructVector *vector ); + hypre_CommPkg *hypre_StructVectorGetMigrateCommPkg( hypre_StructVector *from_vector , hypre_StructVector *to_vector ); + int hypre_StructVectorMigrate( hypre_CommPkg *comm_pkg , hypre_StructVector *from_vector , hypre_StructVector *to_vector ); + int hypre_StructVectorPrint( char *filename , hypre_StructVector *vector , int all ); + hypre_StructVector *hypre_StructVectorRead( MPI_Comm comm , char *filename , int *num_ghost ); + + + #ifdef __cplusplus + } + #endif + + #endif + 
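Note on the data layout declared in the header above: the comment on `data_indices` in `hypre_StructVector` says that `data_indices[b]` is the starting index of the data for box `b`, and `hypre_StructVectorBoxData` simply adds that index to the base `data` pointer. The following is a minimal sketch of how such offsets could be computed, assuming each box is grown by its ghost layers (lower layers at `num_ghost[2*d]`, upper layers at `num_ghost[2*d + 1]`) and the grown boxes are stored back to back. The `Box` type and the functions `box_volume`, `layout_data_indices` and `box_data` are hypothetical stand-ins written for illustration; they are not part of hypre.

```c
/* Hypothetical stand-in for hypre_Box: inclusive lower/upper corners. */
typedef struct
{
   int imin[3];
   int imax[3];
} Box;

/* Number of points in a box; an inverted extent counts as empty. */
static int box_volume(const Box *box)
{
   int d, n, volume = 1;

   for (d = 0; d < 3; d++)
   {
      n = box->imax[d] - box->imin[d] + 1;
      volume *= (n > 0) ? n : 0;
   }
   return volume;
}

/* Assumed layout: grow each box by its ghost layers, then store the
 * grown boxes back to back.  data_indices[b] receives the offset of
 * box b; the return value is the total number of doubles needed. */
static int layout_data_indices(const Box *boxes, int num_boxes,
                               const int num_ghost[6], int *data_indices)
{
   int b, d, data_size = 0;

   for (b = 0; b < num_boxes; b++)
   {
      Box grown = boxes[b];

      for (d = 0; d < 3; d++)
      {
         grown.imin[d] -= num_ghost[2*d];
         grown.imax[d] += num_ghost[2*d + 1];
      }
      data_indices[b] = data_size;
      data_size += box_volume(&grown);
   }
   return data_size;
}

/* Same pointer arithmetic as the hypre_StructVectorBoxData macro. */
static double *box_data(double *data, const int *data_indices, int b)
{
   return data + data_indices[b];
}
```

For two 4x4x4 boxes with one ghost layer on every side, each grown box holds 6*6*6 = 216 values, so the sketch yields offsets 0 and 216 and a total size of 432. Keeping one flat array with per-box offsets means a single allocation covers all boxes, and the accessor macros only need one addition to reach a box's data.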
Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_scale.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_scale.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_scale.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,66 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Structured scale routine + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_StructScale + *--------------------------------------------------------------------------*/ + + int + hypre_StructScale( double alpha, + hypre_StructVector *y ) + { + int ierr = 0; + + hypre_Box *y_data_box; + + int yi; + double *yp; + + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index unit_stride; + + int i; + int loopi, loopj, loopk; + + hypre_SetIndex(unit_stride, 1, 1, 1); + + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(y)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + start = hypre_BoxIMin(box); + + y_data_box = hypre_BoxArrayBox(hypre_StructVectorDataSpace(y), i); + yp = hypre_StructVectorBoxData(y, i); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + y_data_box, start, unit_stride, yi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,yi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, yi) + { + yp[yi] *= alpha; + } + 
hypre_BoxLoop1End(yi); + } + + return ierr; + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_stencil.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_stencil.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_stencil.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,212 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Constructors and destructors for stencil structure. + * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_StructStencilCreate + *--------------------------------------------------------------------------*/ + + hypre_StructStencil * + hypre_StructStencilCreate( int dim, + int size, + hypre_Index *shape ) + { + hypre_StructStencil *stencil; + + int abs_offset; + int max_offset; + int s, d; + + stencil = hypre_TAlloc(hypre_StructStencil, 1); + + hypre_StructStencilShape(stencil) = shape; + hypre_StructStencilSize(stencil) = size; + hypre_StructStencilDim(stencil) = dim; + hypre_StructStencilRefCount(stencil) = 1; + + /* compute max_offset */ + max_offset = 0; + for (s = 0; s < size; s++) + { + for (d = 0; d < 3; d++) + { + abs_offset = hypre_IndexD(shape[s], d); + abs_offset = (abs_offset < 0) ? 
-abs_offset : abs_offset; + max_offset = hypre_max(abs_offset, max_offset); + } + } + hypre_StructStencilMaxOffset(stencil) = max_offset; + + return stencil; + } + + /*-------------------------------------------------------------------------- + * hypre_StructStencilRef + *--------------------------------------------------------------------------*/ + + hypre_StructStencil * + hypre_StructStencilRef( hypre_StructStencil *stencil ) + { + hypre_StructStencilRefCount(stencil) ++; + + return stencil; + } + + /*-------------------------------------------------------------------------- + * hypre_StructStencilDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_StructStencilDestroy( hypre_StructStencil *stencil ) + { + int ierr = 0; + + if (stencil) + { + hypre_StructStencilRefCount(stencil) --; + if (hypre_StructStencilRefCount(stencil) == 0) + { + hypre_TFree(hypre_StructStencilShape(stencil)); + hypre_TFree(stencil); + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructStencilElementRank + * Returns the rank of the `stencil_element' in `stencil'. + * If the element is not found, a -1 is returned. 
+ *--------------------------------------------------------------------------*/ + + int + hypre_StructStencilElementRank( hypre_StructStencil *stencil, + hypre_Index stencil_element ) + { + hypre_Index *stencil_shape; + int rank; + int i; + + rank = -1; + stencil_shape = hypre_StructStencilShape(stencil); + for (i = 0; i < hypre_StructStencilSize(stencil); i++) + { + if ((hypre_IndexX(stencil_shape[i]) == hypre_IndexX(stencil_element)) && + (hypre_IndexY(stencil_shape[i]) == hypre_IndexY(stencil_element)) && + (hypre_IndexZ(stencil_shape[i]) == hypre_IndexZ(stencil_element)) ) + { + rank = i; + break; + } + } + + return rank; + } + + /*-------------------------------------------------------------------------- + * hypre_StructStencilSymmetrize: + * Computes a new "symmetrized" stencil. + * + * An integer array called `symm_elements' is also set up. A non-negative + * value of `symm_elements[i]' indicates that the `i'th stencil element + * is a "symmetric element". That is, this stencil element is the + * transpose element of an element that is not a "symmetric element". 
+ *--------------------------------------------------------------------------*/ + + int + hypre_StructStencilSymmetrize( hypre_StructStencil *stencil, + hypre_StructStencil **symm_stencil_ptr, + int **symm_elements_ptr ) + { + hypre_Index *stencil_shape = hypre_StructStencilShape(stencil); + int stencil_size = hypre_StructStencilSize(stencil); + + hypre_StructStencil *symm_stencil; + hypre_Index *symm_stencil_shape; + int symm_stencil_size; + int *symm_elements; + + int no_symmetric_stencil_element; + int i, j, d; + + int ierr = 0; + + /*------------------------------------------------------ + * Copy stencil elements into `symm_stencil_shape' + *------------------------------------------------------*/ + + symm_stencil_shape = hypre_CTAlloc(hypre_Index, 2*stencil_size); + for (i = 0; i < stencil_size; i++) + { + hypre_CopyIndex(stencil_shape[i], symm_stencil_shape[i]); + } + + /*------------------------------------------------------ + * Create symmetric stencil elements and `symm_elements' + *------------------------------------------------------*/ + + symm_elements = hypre_CTAlloc(int, 2*stencil_size); + for (i = 0; i < 2*stencil_size; i++) + symm_elements[i] = -1; + + symm_stencil_size = stencil_size; + for (i = 0; i < stencil_size; i++) + { + if (symm_elements[i] < 0) + { + /* note: start at i to handle "center" element correctly */ + no_symmetric_stencil_element = 1; + for (j = i; j < stencil_size; j++) + { + if ( (hypre_IndexX(symm_stencil_shape[j]) == + -hypre_IndexX(symm_stencil_shape[i]) ) && + (hypre_IndexY(symm_stencil_shape[j]) == + -hypre_IndexY(symm_stencil_shape[i]) ) && + (hypre_IndexZ(symm_stencil_shape[j]) == + -hypre_IndexZ(symm_stencil_shape[i]) ) ) + { + /* only "off-center" elements have symmetric entries */ + if (i != j) + symm_elements[j] = i; + no_symmetric_stencil_element = 0; + } + } + + if (no_symmetric_stencil_element) + { + /* add symmetric stencil element to `symm_stencil' */ + for (d = 0; d < 3; d++) + { + 
hypre_IndexD(symm_stencil_shape[symm_stencil_size], d) = + -hypre_IndexD(symm_stencil_shape[i], d); + } + + symm_elements[symm_stencil_size] = i; + symm_stencil_size++; + } + } + } + + symm_stencil = hypre_StructStencilCreate(hypre_StructStencilDim(stencil), + symm_stencil_size, + symm_stencil_shape); + + *symm_stencil_ptr = symm_stencil; + *symm_elements_ptr = symm_elements; + + return ierr; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_vector.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_vector.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/struct_vector.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,921 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Member functions for hypre_StructVector class. 
+ * + *****************************************************************************/ + + #include "headers.h" + + /*-------------------------------------------------------------------------- + * hypre_StructVectorCreate + *--------------------------------------------------------------------------*/ + + hypre_StructVector * + hypre_StructVectorCreate( MPI_Comm comm, + hypre_StructGrid *grid ) + { + hypre_StructVector *vector; + + int i; + + vector = hypre_CTAlloc(hypre_StructVector, 1); + + hypre_StructVectorComm(vector) = comm; + hypre_StructGridRef(grid, &hypre_StructVectorGrid(vector)); + hypre_StructVectorDataAlloced(vector) = 1; + hypre_StructVectorRefCount(vector) = 1; + + /* set defaults */ + for (i = 0; i < 6; i++) + hypre_StructVectorNumGhost(vector)[i] = 1; + + return vector; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorRef + *--------------------------------------------------------------------------*/ + + hypre_StructVector * + hypre_StructVectorRef( hypre_StructVector *vector ) + { + hypre_StructVectorRefCount(vector) ++; + + return vector; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorDestroy + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorDestroy( hypre_StructVector *vector ) + { + int ierr = 0; + + if (vector) + { + hypre_StructVectorRefCount(vector) --; + if (hypre_StructVectorRefCount(vector) == 0) + { + if (hypre_StructVectorDataAlloced(vector)) + { + hypre_SharedTFree(hypre_StructVectorData(vector)); + } + hypre_TFree(hypre_StructVectorDataIndices(vector)); + hypre_BoxArrayDestroy(hypre_StructVectorDataSpace(vector)); + hypre_StructGridDestroy(hypre_StructVectorGrid(vector)); + hypre_TFree(vector); + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorInitializeShell + 
*--------------------------------------------------------------------------*/ + + int + hypre_StructVectorInitializeShell( hypre_StructVector *vector ) + { + int ierr = 0; + + hypre_StructGrid *grid; + + int *num_ghost; + + hypre_BoxArray *data_space; + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Box *data_box; + + int *data_indices; + int data_size; + + int i, d; + + /*----------------------------------------------------------------------- + * Set up data_space + *-----------------------------------------------------------------------*/ + + grid = hypre_StructVectorGrid(vector); + + if (hypre_StructVectorDataSpace(vector) == NULL) + { + num_ghost = hypre_StructVectorNumGhost(vector); + + boxes = hypre_StructGridBoxes(grid); + data_space = hypre_BoxArrayCreate(hypre_BoxArraySize(boxes)); + + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + data_box = hypre_BoxArrayBox(data_space, i); + + hypre_CopyBox(box, data_box); + for (d = 0; d < 3; d++) + { + hypre_BoxIMinD(data_box, d) -= num_ghost[2*d]; + hypre_BoxIMaxD(data_box, d) += num_ghost[2*d + 1]; + } + } + + hypre_StructVectorDataSpace(vector) = data_space; + } + + /*----------------------------------------------------------------------- + * Set up data_indices array and data_size + *-----------------------------------------------------------------------*/ + + if (hypre_StructVectorDataIndices(vector) == NULL) + { + data_space = hypre_StructVectorDataSpace(vector); + data_indices = hypre_CTAlloc(int, hypre_BoxArraySize(data_space)); + + data_size = 0; + hypre_ForBoxI(i, data_space) + { + data_box = hypre_BoxArrayBox(data_space, i); + + data_indices[i] = data_size; + data_size += hypre_BoxVolume(data_box); + } + + hypre_StructVectorDataIndices(vector) = data_indices; + hypre_StructVectorDataSize(vector) = data_size; + } + + /*----------------------------------------------------------------------- + * Set total number of nonzero coefficients + 
*-----------------------------------------------------------------------*/ + + hypre_StructVectorGlobalSize(vector) = hypre_StructGridGlobalSize(grid); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorInitializeData + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorInitializeData( hypre_StructVector *vector, + double *data ) + { + int ierr = 0; + + hypre_StructVectorData(vector) = data; + hypre_StructVectorDataAlloced(vector) = 0; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorInitialize + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorInitialize( hypre_StructVector *vector ) + { + int ierr = 0; + + double *data; + + ierr = hypre_StructVectorInitializeShell(vector); + + data = hypre_SharedCTAlloc(double, hypre_StructVectorDataSize(vector)); + hypre_StructVectorInitializeData(vector, data); + hypre_StructVectorDataAlloced(vector) = 1; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorSetValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorSetValues( hypre_StructVector *vector, + hypre_Index grid_index, + double values, + int add_to ) + { + int ierr = 0; + + hypre_BoxArray *boxes; + hypre_Box *box; + + double *vecp; + + int i; + + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(vector)); + + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + + if ((hypre_IndexX(grid_index) >= hypre_BoxIMinX(box)) && + (hypre_IndexX(grid_index) <= hypre_BoxIMaxX(box)) && + (hypre_IndexY(grid_index) >= hypre_BoxIMinY(box)) && + (hypre_IndexY(grid_index) <= hypre_BoxIMaxY(box)) && + (hypre_IndexZ(grid_index) >= hypre_BoxIMinZ(box)) && + (hypre_IndexZ(grid_index) <= 
hypre_BoxIMaxZ(box)) ) + { + vecp = hypre_StructVectorBoxDataValue(vector, i, grid_index); + if (add_to) + { + *vecp += values; + } + else + { + *vecp = values; + } + } + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorSetBoxValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorSetBoxValues( hypre_StructVector *vector, + hypre_Box *value_box, + double *values, + int add_to ) + { + int ierr = 0; + + hypre_BoxArray *grid_boxes; + hypre_Box *grid_box; + hypre_BoxArray *box_array; + hypre_Box *box; + + hypre_BoxArray *data_space; + hypre_Box *data_box; + hypre_IndexRef data_start; + hypre_Index data_stride; + int datai; + double *datap; + + hypre_Box *dval_box; + hypre_Index dval_start; + hypre_Index dval_stride; + int dvali; + + hypre_Index loop_size; + + int i; + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Set up `box_array' by intersecting `box' with the grid boxes + *-----------------------------------------------------------------------*/ + + grid_boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(vector)); + box_array = hypre_BoxArrayCreate(hypre_BoxArraySize(grid_boxes)); + box = hypre_BoxCreate(); + hypre_ForBoxI(i, grid_boxes) + { + grid_box = hypre_BoxArrayBox(grid_boxes, i); + hypre_IntersectBoxes(value_box, grid_box, box); + hypre_CopyBox(box, hypre_BoxArrayBox(box_array, i)); + } + hypre_BoxDestroy(box); + + /*----------------------------------------------------------------------- + * Set the vector coefficients + *-----------------------------------------------------------------------*/ + + if (box_array) + { + data_space = hypre_StructVectorDataSpace(vector); + hypre_SetIndex(data_stride, 1, 1, 1); + + dval_box = hypre_BoxDuplicate(value_box); + hypre_SetIndex(dval_stride, 1, 1, 1); + + hypre_ForBoxI(i, box_array) + { + box = hypre_BoxArrayBox(box_array, 
i); + data_box = hypre_BoxArrayBox(data_space, i); + + /* if there was an intersection */ + if (box) + { + data_start = hypre_BoxIMin(box); + hypre_CopyIndex(data_start, dval_start); + + datap = hypre_StructVectorBoxData(vector, i); + + hypre_BoxGetSize(box, loop_size); + + if (add_to) + { + hypre_BoxLoop2Begin(loop_size, + data_box,data_start,data_stride,datai, + dval_box,dval_start,dval_stride,dvali); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai,dvali + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, datai, dvali) + { + datap[datai] += values[dvali]; + } + hypre_BoxLoop2End(datai, dvali); + } + else + { + hypre_BoxLoop2Begin(loop_size, + data_box,data_start,data_stride,datai, + dval_box,dval_start,dval_stride,dvali); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai,dvali + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, datai, dvali) + { + datap[datai] = values[dvali]; + } + hypre_BoxLoop2End(datai, dvali); + } + } + } + + hypre_BoxDestroy(dval_box); + } + + hypre_BoxArrayDestroy(box_array); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorGetValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorGetValues( hypre_StructVector *vector, + hypre_Index grid_index, + double *values_ptr ) + { + int ierr = 0; + + double values; + + hypre_BoxArray *boxes; + hypre_Box *box; + + double *vecp; + + int i; + + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(vector)); + + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + + if ((hypre_IndexX(grid_index) >= hypre_BoxIMinX(box)) && + (hypre_IndexX(grid_index) <= hypre_BoxIMaxX(box)) && + (hypre_IndexY(grid_index) >= hypre_BoxIMinY(box)) && + (hypre_IndexY(grid_index) <= hypre_BoxIMaxY(box)) && + (hypre_IndexZ(grid_index) >= hypre_BoxIMinZ(box)) && + (hypre_IndexZ(grid_index) <= hypre_BoxIMaxZ(box)) ) + 
{ + vecp = hypre_StructVectorBoxDataValue(vector, i, grid_index); + values = *vecp; + } + } + + *values_ptr = values; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorGetBoxValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorGetBoxValues( hypre_StructVector *vector, + hypre_Box *value_box, + double *values ) + { + int ierr = 0; + + hypre_BoxArray *grid_boxes; + hypre_Box *grid_box; + hypre_BoxArray *box_array; + hypre_Box *box; + + hypre_BoxArray *data_space; + hypre_Box *data_box; + hypre_IndexRef data_start; + hypre_Index data_stride; + int datai; + double *datap; + + hypre_Box *dval_box; + hypre_Index dval_start; + hypre_Index dval_stride; + int dvali; + + hypre_Index loop_size; + + int i; + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Set up `box_array' by intersecting `box' with the grid boxes + *-----------------------------------------------------------------------*/ + + grid_boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(vector)); + box_array = hypre_BoxArrayCreate(hypre_BoxArraySize(grid_boxes)); + box = hypre_BoxCreate(); + hypre_ForBoxI(i, grid_boxes) + { + grid_box = hypre_BoxArrayBox(grid_boxes, i); + hypre_IntersectBoxes(value_box, grid_box, box); + hypre_CopyBox(box, hypre_BoxArrayBox(box_array, i)); + } + hypre_BoxDestroy(box); + + /*----------------------------------------------------------------------- + * Set the vector coefficients + *-----------------------------------------------------------------------*/ + + if (box_array) + { + data_space = hypre_StructVectorDataSpace(vector); + hypre_SetIndex(data_stride, 1, 1, 1); + + dval_box = hypre_BoxDuplicate(value_box); + hypre_SetIndex(dval_stride, 1, 1, 1); + + hypre_ForBoxI(i, box_array) + { + box = hypre_BoxArrayBox(box_array, i); + data_box = hypre_BoxArrayBox(data_space, i); + + /* if there 
was an intersection */ + if (box) + { + data_start = hypre_BoxIMin(box); + hypre_CopyIndex(data_start, dval_start); + + datap = hypre_StructVectorBoxData(vector, i); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop2Begin(loop_size, + data_box, data_start, data_stride, datai, + dval_box, dval_start, dval_stride, dvali); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai,dvali + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop2For(loopi, loopj, loopk, datai, dvali) + { + values[dvali] = datap[datai]; + } + hypre_BoxLoop2End(datai, dvali); + } + } + + hypre_BoxDestroy(dval_box); + } + + hypre_BoxArrayDestroy(box_array); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorSetNumGhost + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorSetNumGhost( hypre_StructVector *vector, + int *num_ghost ) + { + int ierr = 0; + int i; + + for (i = 0; i < 6; i++) + hypre_StructVectorNumGhost(vector)[i] = num_ghost[i]; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorAssemble + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorAssemble( hypre_StructVector *vector ) + { + int ierr = 0; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorSetConstantValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorSetConstantValues( hypre_StructVector *vector, + double values ) + { + int ierr = 0; + + hypre_Box *v_data_box; + + int vi; + double *vp; + + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index unit_stride; + + int i; + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Set the vector 
coefficients + *-----------------------------------------------------------------------*/ + + hypre_SetIndex(unit_stride, 1, 1, 1); + + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(vector)); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + start = hypre_BoxIMin(box); + + v_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(vector), i); + vp = hypre_StructVectorBoxData(vector, i); + + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + v_data_box, start, unit_stride, vi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,vi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, vi) + { + vp[vi] = values; + } + hypre_BoxLoop1End(vi); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorClearGhostValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorClearGhostValues( hypre_StructVector *vector ) + { + int ierr = 0; + + hypre_Box *v_data_box; + + int vi; + double *vp; + + hypre_BoxArray *boxes; + hypre_Box *box; + hypre_BoxArray *diff_boxes; + hypre_Box *diff_box; + hypre_Index loop_size; + hypre_IndexRef start; + hypre_Index unit_stride; + + int i, j; + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Set the vector coefficients + *-----------------------------------------------------------------------*/ + + hypre_SetIndex(unit_stride, 1, 1, 1); + + boxes = hypre_StructGridBoxes(hypre_StructVectorGrid(vector)); + diff_boxes = hypre_BoxArrayCreate(0); + hypre_ForBoxI(i, boxes) + { + box = hypre_BoxArrayBox(boxes, i); + + v_data_box = + hypre_BoxArrayBox(hypre_StructVectorDataSpace(vector), i); + vp = hypre_StructVectorBoxData(vector, i); + + hypre_SubtractBoxes(v_data_box, box, diff_boxes); + hypre_ForBoxI(j, diff_boxes) + { + diff_box = hypre_BoxArrayBox(diff_boxes, j); + start = 
hypre_BoxIMin(diff_box); + + hypre_BoxGetSize(diff_box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + v_data_box, start, unit_stride, vi); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,vi + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, vi) + { + vp[vi] = 0.0; + } + hypre_BoxLoop1End(vi); + } + } + hypre_BoxArrayDestroy(diff_boxes); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorClearAllValues + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorClearAllValues( hypre_StructVector *vector ) + { + int ierr = 0; + + int datai; + double *data; + + hypre_Index imin; + hypre_Index imax; + hypre_Box *box; + hypre_Index loop_size; + + int loopi, loopj, loopk; + + /*----------------------------------------------------------------------- + * Set the vector coefficients + *-----------------------------------------------------------------------*/ + + box = hypre_BoxCreate(); + hypre_SetIndex(imin, 1, 1, 1); + hypre_SetIndex(imax, hypre_StructVectorDataSize(vector), 1, 1); + hypre_BoxSetExtents(box, imin, imax); + data = hypre_StructVectorData(vector); + hypre_BoxGetSize(box, loop_size); + + hypre_BoxLoop1Begin(loop_size, + box, imin, imin, datai); + #define HYPRE_BOX_SMP_PRIVATE loopk,loopi,loopj,datai + #include "hypre_box_smp_forloop.h" + hypre_BoxLoop1For(loopi, loopj, loopk, datai) + { + data[datai] = 0.0; + } + hypre_BoxLoop1End(datai); + + hypre_BoxDestroy(box); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorGetMigrateCommPkg + *--------------------------------------------------------------------------*/ + + hypre_CommPkg * + hypre_StructVectorGetMigrateCommPkg( hypre_StructVector *from_vector, + hypre_StructVector *to_vector ) + { + hypre_BoxArrayArray *send_boxes; + hypre_BoxArrayArray *recv_boxes; + int **send_processes; + int 
**recv_processes; + int num_values; + + hypre_Index unit_stride; + + hypre_CommPkg *comm_pkg; + + /*------------------------------------------------------ + * Set up hypre_CommPkg + *------------------------------------------------------*/ + + num_values = 1; + hypre_SetIndex(unit_stride, 1, 1, 1); + + hypre_CreateCommInfoFromGrids(hypre_StructVectorGrid(from_vector), + hypre_StructVectorGrid(to_vector), + &send_boxes, &recv_boxes, + &send_processes, &recv_processes); + + comm_pkg = hypre_CommPkgCreate(send_boxes, recv_boxes, + unit_stride, unit_stride, + hypre_StructVectorDataSpace(from_vector), + hypre_StructVectorDataSpace(to_vector), + send_processes, recv_processes, + num_values, + hypre_StructVectorComm(from_vector), + hypre_StructGridPeriodic( + hypre_StructVectorGrid(from_vector))); + /* is this correct for periodic? */ + + return comm_pkg; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorMigrate + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorMigrate( hypre_CommPkg *comm_pkg, + hypre_StructVector *from_vector, + hypre_StructVector *to_vector ) + { + hypre_CommHandle *comm_handle; + + int ierr = 0; + + /*----------------------------------------------------------------------- + * Migrate the vector data + *-----------------------------------------------------------------------*/ + + hypre_InitializeCommunication(comm_pkg, + hypre_StructVectorData(from_vector), + hypre_StructVectorData(to_vector), + &comm_handle); + hypre_FinalizeCommunication(comm_handle); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorPrint + *--------------------------------------------------------------------------*/ + + int + hypre_StructVectorPrint( char *filename, + hypre_StructVector *vector, + int all ) + { + int ierr = 0; + + FILE *file; + char new_filename[255]; + + hypre_StructGrid *grid; 
+ hypre_BoxArray *boxes; + + hypre_BoxArray *data_space; + + int myid; + + /*---------------------------------------- + * Open file + *----------------------------------------*/ + + MPI_Comm_rank(hypre_StructVectorComm(vector), &myid ); + + sprintf(new_filename, "%s.%05d", filename, myid); + + if ((file = fopen(new_filename, "w")) == NULL) + { + printf("Error: can't open output file %s\n", new_filename); + exit(1); + } + + /*---------------------------------------- + * Print header info + *----------------------------------------*/ + + fprintf(file, "StructVector\n"); + + /* print grid info */ + fprintf(file, "\nGrid:\n"); + grid = hypre_StructVectorGrid(vector); + hypre_StructGridPrint(file, grid); + + /*---------------------------------------- + * Print data + *----------------------------------------*/ + + data_space = hypre_StructVectorDataSpace(vector); + + if (all) + boxes = data_space; + else + boxes = hypre_StructGridBoxes(grid); + + fprintf(file, "\nData:\n"); + hypre_PrintBoxArrayData(file, boxes, data_space, 1, + hypre_StructVectorData(vector)); + + /*---------------------------------------- + * Close file + *----------------------------------------*/ + + fflush(file); + fclose(file); + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_StructVectorRead + *--------------------------------------------------------------------------*/ + + hypre_StructVector * + hypre_StructVectorRead( MPI_Comm comm, + char *filename, + int *num_ghost ) + { + FILE *file; + char new_filename[255]; + + hypre_StructVector *vector; + + hypre_StructGrid *grid; + hypre_BoxArray *boxes; + + hypre_BoxArray *data_space; + + int myid; + + /*---------------------------------------- + * Open file + *----------------------------------------*/ + + #ifdef HYPRE_USE_PTHREADS + #if MPI_Comm_rank == hypre_thread_MPI_Comm_rank + #undef MPI_Comm_rank + #endif + #endif + + MPI_Comm_rank(comm, &myid ); + + sprintf(new_filename, "%s.%05d", 
filename, myid); + + if ((file = fopen(new_filename, "r")) == NULL) + { + printf("Error: can't open output file %s\n", new_filename); + exit(1); + } + + /*---------------------------------------- + * Read header info + *----------------------------------------*/ + + fscanf(file, "StructVector\n"); + + /* read grid info */ + fscanf(file, "\nGrid:\n"); + hypre_StructGridRead(comm,file,&grid); + + /*---------------------------------------- + * Initialize the vector + *----------------------------------------*/ + + vector = hypre_StructVectorCreate(comm, grid); + hypre_StructVectorSetNumGhost(vector, num_ghost); + hypre_StructVectorInitialize(vector); + + /*---------------------------------------- + * Read data + *----------------------------------------*/ + + boxes = hypre_StructGridBoxes(grid); + data_space = hypre_StructVectorDataSpace(vector); + + fscanf(file, "\nData:\n"); + hypre_ReadBoxArrayData(file, boxes, data_space, 1, + hypre_StructVectorData(vector)); + + /*---------------------------------------- + * Assemble the vector + *----------------------------------------*/ + + hypre_StructVectorAssemble(vector); + + /*---------------------------------------- + * Close file + *----------------------------------------*/ + + fclose(file); + + return vector; + } + Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/threading.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/threading.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/threading.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,263 ---- + /*BHEADER********************************************************************** + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #include + #include + #include "utilities.h" + + #if defined(HYPRE_USING_OPENMP) || defined (HYPRE_USING_PGCC_SMP) + + int + hypre_NumThreads( ) + { + int num_threads; + + #ifdef HYPRE_USING_OPENMP + #pragma omp parallel + num_threads = omp_get_num_threads(); + #endif + #ifdef HYPRE_USING_PGCC_SMP + num_threads = 2; + #endif + + return num_threads; + } + + #endif + + /*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!*/ + /* The pthreads stuff needs to be reworked */ + + #define HYPRE_THREAD_GLOBALS + + #ifdef HYPRE_USE_PTHREADS + + #ifdef HYPRE_USE_UMALLOC + #include "umalloc_local.h" + #endif + + int iteration_counter = 0; + volatile int hypre_thread_counter; + volatile int work_continue = 1; + + + int HYPRE_InitPthreads( int num_threads ) + { + int err; + int i; + hypre_qptr = + (hypre_workqueue_t) malloc(sizeof(struct hypre_workqueue_struct)); + + hypre_NumThreads = num_threads; + initial_thread = pthread_self(); + + if (hypre_qptr != NULL) { + pthread_mutex_init(&hypre_qptr->lock, NULL); + pthread_cond_init(&hypre_qptr->work_wait, NULL); + pthread_cond_init(&hypre_qptr->finish_wait, NULL); + hypre_qptr->n_working = hypre_qptr->n_waiting = hypre_qptr->n_queue = 0; + hypre_qptr->inp = hypre_qptr->outp = 0; + for (i=0; i < hypre_NumThreads; i++) { + #ifdef HYPRE_USE_UMALLOC + /* Get initial area to start heap */ + assert ((_uinitial_block[i] = malloc(INITIAL_HEAP_SIZE))!=NULL); + + /* Create a user heap */ + assert ((_uparam[i].myheap = _ucreate(initial_block[i], + INITIAL_HEAP_SIZE, + _BLOCK_CLEAN, + _HEAP_REGULAR, + _uget_fn, + _urelease_fn)) != NULL); + #endif + err=pthread_create(&hypre_thread[i], NULL, + (void *(*)(void *))hypre_pthread_worker, + (void *)i); + assert(err == 0); + } + } + + pthread_mutex_init(&hypre_mutex_boxloops, NULL); + pthread_mutex_init(&mpi_mtx, NULL); + pthread_mutex_init(&talloc_mtx, NULL); + 
pthread_mutex_init(&time_mtx, NULL); + pthread_mutex_init(&worker_mtx, NULL); + hypre_thread_counter = 0; + hypre_thread_release = 0; + + return (err); + } + + void hypre_StopWorker(void *i) + { + work_continue = 0; + } + + void HYPRE_DestroyPthreads( void ) + { + int i; + void *status; + + for (i=0; i < hypre_NumThreads; i++) { + hypre_work_put(hypre_StopWorker, (void *) &i); + } + + #ifdef HYPRE_USE_UMALLOC + for (i=0; ilock); + pthread_mutex_destroy(&hypre_mutex_boxloops); + pthread_mutex_destroy(&mpi_mtx); + pthread_mutex_destroy(&talloc_mtx); + pthread_mutex_destroy(&time_mtx); + pthread_mutex_destroy(&worker_mtx); + pthread_cond_destroy(&hypre_qptr->work_wait); + pthread_cond_destroy(&hypre_qptr->finish_wait); + free (hypre_qptr); + } + + + void hypre_pthread_worker( int threadid ) + { + void *argptr; + hypre_work_proc_t funcptr; + + pthread_mutex_lock(&hypre_qptr->lock); + + hypre_qptr->n_working++; + + while(work_continue) { + while (hypre_qptr->n_queue == 0) { + if (--hypre_qptr->n_working == 0) + pthread_cond_signal(&hypre_qptr->finish_wait); + hypre_qptr->n_waiting++; + pthread_cond_wait(&hypre_qptr->work_wait, &hypre_qptr->lock); + hypre_qptr->n_waiting--; + hypre_qptr->n_working++; + } + hypre_qptr->n_queue--; + funcptr = hypre_qptr->worker_proc_queue[hypre_qptr->outp]; + argptr = hypre_qptr->argqueue[hypre_qptr->outp]; + + hypre_qptr->outp = (hypre_qptr->outp + 1) % MAX_QUEUE; + + pthread_mutex_unlock(&hypre_qptr->lock); + + (*funcptr)(argptr); + + hypre_barrier(&worker_mtx, 0); + + if (work_continue) + pthread_mutex_lock(&hypre_qptr->lock); + } + } + + void + hypre_work_put( hypre_work_proc_t funcptr, void *argptr ) + { + pthread_mutex_lock(&hypre_qptr->lock); + if (hypre_qptr->n_waiting) { + /* idle workers to be awakened */ + pthread_cond_signal(&hypre_qptr->work_wait); + } + assert(hypre_qptr->n_queue != MAX_QUEUE); + + hypre_qptr->n_queue++; + hypre_qptr->worker_proc_queue[hypre_qptr->inp] = funcptr; + hypre_qptr->argqueue[hypre_qptr->inp] = 
argptr; + hypre_qptr->inp = (hypre_qptr->inp + 1) % MAX_QUEUE; + pthread_mutex_unlock(&hypre_qptr->lock); + } + + + /* Wait until all work is done and workers quiesce. */ + void + hypre_work_wait( void ) + { + pthread_mutex_lock(&hypre_qptr->lock); + while(hypre_qptr->n_queue !=0 || hypre_qptr->n_working != 0) + pthread_cond_wait(&hypre_qptr->finish_wait, &hypre_qptr->lock); + pthread_mutex_unlock(&hypre_qptr->lock); + } + + + int + hypre_fetch_and_add( int *w ) + { + int temp; + + temp = *w; + *w += 1; + + return temp; + } + + int + ifetchadd( int *w, pthread_mutex_t *mutex_fetchadd ) + { + int n; + + pthread_mutex_lock(mutex_fetchadd); + n = *w; + *w += 1; + pthread_mutex_unlock(mutex_fetchadd); + + return n; + } + + static volatile int thb_count = 0; + static volatile int thb_release = 0; + + void hypre_barrier(pthread_mutex_t *mtx, int unthreaded) + { + if (!unthreaded) { + pthread_mutex_lock(mtx); + thb_count++; + + if (thb_count < hypre_NumThreads) { + pthread_mutex_unlock(mtx); + while (!thb_release); + pthread_mutex_lock(mtx); + thb_count--; + pthread_mutex_unlock(mtx); + while (thb_release); + } + else if (thb_count == hypre_NumThreads) { + thb_count--; + pthread_mutex_unlock(mtx); + thb_release++; + while (thb_count); + thb_release = 0; + } + } + } + + int + hypre_GetThreadID( void ) + { + int i; + + if (pthread_equal(pthread_self(), initial_thread)) + return hypre_NumThreads; + + for (i = 0; i < hypre_NumThreads; i++) + { + if (pthread_equal(pthread_self(), hypre_thread[i])) + return i; + } + + return -1; + } + + #endif + /*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!*/ Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/threading.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/threading.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/threading.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,81 ---- + 
/*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #ifndef hypre_THREADING_HEADER + #define hypre_THREADING_HEADER + + #if defined(HYPRE_USING_OPENMP) || defined (HYPRE_USING_PGCC_SMP) + + int hypre_NumThreads( void ); + + #else + + #define hypre_NumThreads() 1 + + #endif + + + /*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!*/ + /* The pthreads stuff needs to be reworked */ + + #ifdef HYPRE_USE_PTHREADS + + #ifndef MAX_QUEUE + #define MAX_QUEUE 256 + #endif + + #include + + /* hypre_work_proc_t typedef'd to be a pointer to a function with a void* + argument and a void return type */ + typedef void (*hypre_work_proc_t)(void *); + + typedef struct hypre_workqueue_struct { + pthread_mutex_t lock; + pthread_cond_t work_wait; + pthread_cond_t finish_wait; + hypre_work_proc_t worker_proc_queue[MAX_QUEUE]; + int n_working; + int n_waiting; + int n_queue; + int inp; + int outp; + void *argqueue[MAX_QUEUE]; + } *hypre_workqueue_t; + + void hypre_work_put( hypre_work_proc_t funcptr, void *argptr ); + void hypre_work_wait( void ); + int HYPRE_InitPthreads( int num_threads ); + void HYPRE_DestroyPthreads( void ); + void hypre_pthread_worker( int threadid ); + int ifetchadd( int *w, pthread_mutex_t *mutex_fetchadd ); + int hypre_fetch_and_add( int *w ); + void hypre_barrier(pthread_mutex_t *mpi_mtx, int unthreaded); + int hypre_GetThreadID( void ); + + pthread_t initial_thread; + pthread_t hypre_thread[hypre_MAX_THREADS]; + pthread_mutex_t hypre_mutex_boxloops; + pthread_mutex_t talloc_mtx; + pthread_mutex_t worker_mtx; + hypre_workqueue_t hypre_qptr; + pthread_mutex_t mpi_mtx; + pthread_mutex_t time_mtx; + volatile int hypre_thread_release; + 
+ #ifdef HYPRE_THREAD_GLOBALS + int hypre_NumThreads = 4; + #else + extern int hypre_NumThreads; + #endif + + #endif + /*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!*/ + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timer.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timer.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timer.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,45 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /* + * File: timer.c + * Copyright: (c) 1997 The Regents of the University of California + * Author: Scott Kohn (skohn at llnl.gov) + * Description: somewhat portable timing routines for C++, C, and Fortran + * + * If TIMER_USE_MPI is defined, then the MPI timers are used to get + * wallclock seconds, since we assume that the MPI timers have better + * resolution than the system timers. 
+ */ + + #include + #include + #ifdef TIMER_USE_MPI + #include "mpi.h" + #endif + + double time_getWallclockSeconds(void) + { + return(0.0); + } + + double time_getCPUSeconds(void) + { + return(0.0); + } + + double time_get_wallclock_seconds_(void) + { + return(time_getWallclockSeconds()); + } + + double time_get_cpu_seconds_(void) + { + return(time_getCPUSeconds()); + } Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timing.c diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timing.c:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timing.c Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,626 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Routines for doing timing. 
+ * + *****************************************************************************/ + + #define HYPRE_TIMING_GLOBALS + #include "utilities.h" + #include "timing.h" + + + /*------------------------------------------------------- + * Timing macros + *-------------------------------------------------------*/ + + #define hypre_StartTiming() \ + hypre_TimingWallCount -= time_getWallclockSeconds();\ + hypre_TimingCPUCount -= time_getCPUSeconds() + + #define hypre_StopTiming() \ + hypre_TimingWallCount += time_getWallclockSeconds();\ + hypre_TimingCPUCount += time_getCPUSeconds() + + #ifndef HYPRE_USE_PTHREADS + #define hypre_global_timing_ref(index,field) hypre_global_timing->field + #else + #define hypre_global_timing_ref(index,field) \ + hypre_global_timing[index].field + #endif + + /*-------------------------------------------------------------------------- + * hypre_InitializeTiming + *--------------------------------------------------------------------------*/ + + int + hypre_InitializeTiming( char *name ) + { + int time_index; + + double *old_wall_time; + double *old_cpu_time; + double *old_flops; + char **old_name; + int *old_state; + int *old_num_regs; + + int new_name; + int i; + #ifdef HYPRE_USE_PTHREADS + int threadid = hypre_GetThreadID(); + #endif + + /*------------------------------------------------------- + * Allocate global TimingType structure if needed + *-------------------------------------------------------*/ + + if (hypre_global_timing == NULL) + { + #ifndef HYPRE_USE_PTHREADS + hypre_global_timing = hypre_CTAlloc(hypre_TimingType, 1); + #else + hypre_global_timing = hypre_CTAlloc(hypre_TimingType, + hypre_NumThreads + 1); + #endif + } + + /*------------------------------------------------------- + * Check to see if name has already been registered + *-------------------------------------------------------*/ + + new_name = 1; + for (i = 0; i < (hypre_global_timing_ref(threadid, size)); i++) + { + if (hypre_TimingNumRegs(i) > 0) + { + if 
(strcmp(name, hypre_TimingName(i)) == 0) + { + new_name = 0; + time_index = i; + hypre_TimingNumRegs(time_index) ++; + break; + } + } + } + + if (new_name) + { + for (i = 0; i < hypre_global_timing_ref(threadid ,size); i++) + { + if (hypre_TimingNumRegs(i) == 0) + { + break; + } + } + time_index = i; + } + + /*------------------------------------------------------- + * Register the new timing name + *-------------------------------------------------------*/ + + if (new_name) + { + if (time_index == (hypre_global_timing_ref(threadid, size))) + { + old_wall_time = (hypre_global_timing_ref(threadid, wall_time)); + old_cpu_time = (hypre_global_timing_ref(threadid, cpu_time)); + old_flops = (hypre_global_timing_ref(threadid, flops)); + old_name = (hypre_global_timing_ref(threadid, name)); + old_state = (hypre_global_timing_ref(threadid, state)); + old_num_regs = (hypre_global_timing_ref(threadid, num_regs)); + + (hypre_global_timing_ref(threadid, wall_time)) = + hypre_CTAlloc(double, (time_index+1)); + (hypre_global_timing_ref(threadid, cpu_time)) = + hypre_CTAlloc(double, (time_index+1)); + (hypre_global_timing_ref(threadid, flops)) = + hypre_CTAlloc(double, (time_index+1)); + (hypre_global_timing_ref(threadid, name)) = + hypre_CTAlloc(char *, (time_index+1)); + (hypre_global_timing_ref(threadid, state)) = + hypre_CTAlloc(int, (time_index+1)); + (hypre_global_timing_ref(threadid, num_regs)) = + hypre_CTAlloc(int, (time_index+1)); + (hypre_global_timing_ref(threadid, size)) ++; + + for (i = 0; i < time_index; i++) + { + hypre_TimingWallTime(i) = old_wall_time[i]; + hypre_TimingCPUTime(i) = old_cpu_time[i]; + hypre_TimingFLOPS(i) = old_flops[i]; + hypre_TimingName(i) = old_name[i]; + hypre_TimingState(i) = old_state[i]; + hypre_TimingNumRegs(i) = old_num_regs[i]; + } + + hypre_TFree(old_wall_time); + hypre_TFree(old_cpu_time); + hypre_TFree(old_flops); + hypre_TFree(old_name); + hypre_TFree(old_state); + hypre_TFree(old_num_regs); + } + + hypre_TimingName(time_index) = 
hypre_CTAlloc(char, 80); + strncpy(hypre_TimingName(time_index), name, 79); + hypre_TimingState(time_index) = 0; + hypre_TimingNumRegs(time_index) = 1; + (hypre_global_timing_ref(threadid, num_names)) ++; + } + + return time_index; + } + + /*-------------------------------------------------------------------------- + * hypre_FinalizeTiming + *--------------------------------------------------------------------------*/ + + int + hypre_FinalizeTiming( int time_index ) + { + int ierr = 0; + int i; + #ifdef HYPRE_USE_PTHREADS + int threadid = hypre_GetThreadID(); + int free_global_timing; + #endif + + if (hypre_global_timing == NULL) + return ierr; + + if (time_index < (hypre_global_timing_ref(threadid, size))) + { + if (hypre_TimingNumRegs(time_index) > 0) + { + hypre_TimingNumRegs(time_index) --; + } + + if (hypre_TimingNumRegs(time_index) == 0) + { + hypre_TFree(hypre_TimingName(time_index)); + (hypre_global_timing_ref(threadid, num_names)) --; + } + } + + #ifdef HYPRE_USE_PTHREADS + + free_global_timing = 1; + for (i = 0; i <= hypre_NumThreads; i++) + { + if (hypre_global_timing_ref(i, num_names)) + { + free_global_timing = 0; + break; + } + } + + if (free_global_timing) + { + pthread_mutex_lock(&time_mtx); + if(hypre_global_timing) + { + for (i = 0; i <= hypre_NumThreads; i++) + + { + hypre_TFree(hypre_global_timing_ref(i, wall_time)); + hypre_TFree(hypre_global_timing_ref(i, cpu_time)); + hypre_TFree(hypre_global_timing_ref(i, flops)); + hypre_TFree(hypre_global_timing_ref(i, name)); + hypre_TFree(hypre_global_timing_ref(i, state)); + hypre_TFree(hypre_global_timing_ref(i, num_regs)); + } + + hypre_TFree(hypre_global_timing); + hypre_global_timing = NULL; + } + pthread_mutex_unlock(&time_mtx); + } + + #else + + if ((hypre_global_timing -> num_names) == 0) + { + for (i = 0; i < (hypre_global_timing -> size); i++) + { + hypre_TFree(hypre_global_timing_ref(i, wall_time)); + hypre_TFree(hypre_global_timing_ref(i, cpu_time)); + hypre_TFree(hypre_global_timing_ref(i, 
flops)); + hypre_TFree(hypre_global_timing_ref(i, name)); + hypre_TFree(hypre_global_timing_ref(i, state)); + hypre_TFree(hypre_global_timing_ref(i, num_regs)); + } + + hypre_TFree(hypre_global_timing); + hypre_global_timing = NULL; + } + + #endif + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_IncFLOPCount + *--------------------------------------------------------------------------*/ + + int + hypre_IncFLOPCount( int inc ) + { + int ierr = 0; + #ifdef HYPRE_USE_PTHREADS + int threadid = hypre_GetThreadID(); + #endif + + if (hypre_global_timing == NULL) + return ierr; + + hypre_TimingFLOPCount += (double) (inc); + + #ifdef HYPRE_USE_PTHREADS + if (threadid != hypre_NumThreads) + { + pthread_mutex_lock(&time_mtx); + hypre_TimingAllFLOPS += (double) (inc); + pthread_mutex_unlock(&time_mtx); + } + #endif + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_BeginTiming + *--------------------------------------------------------------------------*/ + + int + hypre_BeginTiming( int time_index ) + { + int ierr = 0; + #ifdef HYPRE_USE_PTHREADS + int threadid = hypre_GetThreadID(); + #endif + + if (hypre_global_timing == NULL) + return ierr; + + if (hypre_TimingState(time_index) == 0) + { + hypre_StopTiming(); + hypre_TimingWallTime(time_index) -= hypre_TimingWallCount; + hypre_TimingCPUTime(time_index) -= hypre_TimingCPUCount; + #ifdef HYPRE_USE_PTHREADS + if (threadid != hypre_NumThreads) + hypre_TimingFLOPS(time_index) -= hypre_TimingFLOPCount; + else + hypre_TimingFLOPS(time_index) -= hypre_TimingAllFLOPS; + #else + hypre_TimingFLOPS(time_index) -= hypre_TimingFLOPCount; + #endif + + hypre_StartTiming(); + } + hypre_TimingState(time_index) ++; + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_EndTiming + *--------------------------------------------------------------------------*/ + + 
int + hypre_EndTiming( int time_index ) + { + int ierr = 0; + #ifdef HYPRE_USE_PTHREADS + int threadid = hypre_GetThreadID(); + #endif + + if (hypre_global_timing == NULL) + return ierr; + + hypre_TimingState(time_index) --; + if (hypre_TimingState(time_index) == 0) + { + hypre_StopTiming(); + hypre_TimingWallTime(time_index) += hypre_TimingWallCount; + hypre_TimingCPUTime(time_index) += hypre_TimingCPUCount; + #ifdef HYPRE_USE_PTHREADS + if (threadid != hypre_NumThreads) + hypre_TimingFLOPS(time_index) += hypre_TimingFLOPCount; + else + hypre_TimingFLOPS(time_index) += hypre_TimingAllFLOPS; + #else + hypre_TimingFLOPS(time_index) += hypre_TimingFLOPCount; + #endif + hypre_StartTiming(); + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_ClearTiming + *--------------------------------------------------------------------------*/ + + int + hypre_ClearTiming( ) + { + int ierr = 0; + int i; + #ifdef HYPRE_USE_PTHREADS + int threadid = hypre_GetThreadID(); + #endif + + if (hypre_global_timing == NULL) + return ierr; + + for (i = 0; i < (hypre_global_timing_ref(threadid,size)); i++) + { + hypre_TimingWallTime(i) = 0.0; + hypre_TimingCPUTime(i) = 0.0; + hypre_TimingFLOPS(i) = 0.0; + } + + return ierr; + } + + /*-------------------------------------------------------------------------- + * hypre_PrintTiming + *--------------------------------------------------------------------------*/ + + #ifndef HYPRE_USE_PTHREADS /* non-threaded version of hypre_PrintTiming */ + + int + hypre_PrintTiming( char *heading, + MPI_Comm comm ) + { + int ierr = 0; + + double local_wall_time; + double local_cpu_time; + double wall_time; + double cpu_time; + double wall_mflops; + double cpu_mflops; + + int i; + int myrank; + + if (hypre_global_timing == NULL) + return ierr; + + MPI_Comm_rank(comm, &myrank ); + + /* print heading */ + if (myrank == 0) + { + printf("=============================================\n"); + printf("%s:\n", 
heading); + printf("=============================================\n"); + } + + for (i = 0; i < (hypre_global_timing -> size); i++) + { + if (hypre_TimingNumRegs(i) > 0) + { + local_wall_time = hypre_TimingWallTime(i); + local_cpu_time = hypre_TimingCPUTime(i); + MPI_Allreduce(&local_wall_time, &wall_time, 1, + MPI_DOUBLE, MPI_MAX, comm); + MPI_Allreduce(&local_cpu_time, &cpu_time, 1, + MPI_DOUBLE, MPI_MAX, comm); + + if (myrank == 0) + { + printf("%s:\n", hypre_TimingName(i)); + + /* print wall clock info */ + printf(" wall clock time = %f seconds\n", wall_time); + if (wall_time) + wall_mflops = hypre_TimingFLOPS(i) / wall_time / 1.0E6; + else + wall_mflops = 0.0; + /* printf(" wall MFLOPS = %f\n", wall_mflops); */ + + /* print CPU clock info */ + printf(" cpu clock time = %f seconds\n", cpu_time); + if (cpu_time) + cpu_mflops = hypre_TimingFLOPS(i) / cpu_time / 1.0E6; + else + cpu_mflops = 0.0; + /* printf(" cpu MFLOPS = %f\n\n", cpu_mflops); */ + } + } + } + + return ierr; + } + + #else /* threaded version of hypre_PrintTiming */ + + #ifdef MPI_Comm_rank + #undef MPI_Comm_rank + #endif + #ifdef MPI_Allreduce + #undef MPI_Allreduce + #endif + + int + hypre_PrintTiming( char *heading, + MPI_Comm comm ) + { + int ierr = 0; + + double local_wall_time; + double local_cpu_time; + double wall_time; + double cpu_time; + double wall_mflops; + double cpu_mflops; + + int i, j, index; + int myrank; + int my_thread = hypre_GetThreadID(); + int threadid; + int max_size; + int num_regs; + + char target_name[32]; + + if (my_thread == hypre_NumThreads) + { + if (hypre_global_timing == NULL) + return ierr; + + MPI_Comm_rank(comm, &myrank ); + + /* print heading */ + if (myrank == 0) + { + printf("=============================================\n"); + printf("%s:\n", heading); + printf("=============================================\n"); + } + + for (i = 0; i < 7; i++) + { + switch (i) + { + case 0: + threadid = my_thread; + strcpy(target_name, hypre_TimingName(i)); + break; + case 1: 
+ strcpy(target_name, "SMG"); + break; + case 2: + strcpy(target_name, "SMGRelax"); + break; + case 3: + strcpy(target_name, "SMGResidual"); + break; + case 4: + strcpy(target_name, "CyclicReduction"); + break; + case 5: + strcpy(target_name, "SMGIntAdd"); + break; + case 6: + strcpy(target_name, "SMGRestrict"); + break; + } + + threadid = 0; + for (j = 0; j < hypre_global_timing[threadid].size; j++) + { + if (strcmp(target_name, hypre_TimingName(j)) == 0) + { + index = j; + break; + } + else + index = -1; + } + + if (i < hypre_global_timing[my_thread].size) + { + threadid = my_thread; + num_regs = hypre_TimingNumRegs(i); + } + else + num_regs = hypre_TimingNumRegs(index); + + if (num_regs > 0) + { + local_wall_time = 0.0; + local_cpu_time = 0.0; + if (index >= 0) + { + for (threadid = 0; threadid < hypre_NumThreads; threadid++) + { + local_wall_time = + hypre_max(local_wall_time, hypre_TimingWallTime(index)); + local_cpu_time = + hypre_max(local_cpu_time, hypre_TimingCPUTime(index)); + } + } + + if (i < hypre_global_timing[my_thread].size) + { + threadid = my_thread; + local_wall_time += hypre_TimingWallTime(i); + local_cpu_time += hypre_TimingCPUTime(i); + } + + MPI_Allreduce(&local_wall_time, &wall_time, 1, + MPI_DOUBLE, MPI_MAX, comm); + MPI_Allreduce(&local_cpu_time, &cpu_time, 1, + MPI_DOUBLE, MPI_MAX, comm); + + if (myrank == 0) + { + printf("%s:\n", target_name); + + /* print wall clock info */ + printf(" wall clock time = %f seconds\n", wall_time); + wall_mflops = 0.0; + if (wall_time) + { + if (index >= 0) + { + for (threadid = 0; threadid < hypre_NumThreads; threadid++) + { + wall_mflops += + hypre_TimingFLOPS(index) / wall_time / 1.0E6; + } + } + if (i < hypre_global_timing[my_thread].size) + { + threadid = my_thread; + wall_mflops += hypre_TimingFLOPS(i) / wall_time / 1.0E6; + } + } + + /* printf(" wall MFLOPS = %f\n", wall_mflops); */ + + /* print CPU clock info */ + printf(" cpu clock time = %f seconds\n", cpu_time); + cpu_mflops = 0.0; + if 
(cpu_time) + { + if (index >= 0) + { + for (threadid = 0; threadid < hypre_NumThreads; threadid++) + { + cpu_mflops += + hypre_TimingFLOPS(index) / cpu_time / 1.0E6; + } + } + if (i < hypre_global_timing[my_thread].size) + { + threadid = my_thread; + cpu_mflops += hypre_TimingFLOPS(i) / cpu_time / 1.0E6; + } + } + + /* printf(" cpu MFLOPS = %f\n\n", cpu_mflops); */ + } + } + } + } + + return ierr; + } + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timing.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timing.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/timing.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,130 ---- + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Header file for doing timing + * + *****************************************************************************/ + + #ifndef HYPRE_TIMING_HEADER + #define HYPRE_TIMING_HEADER + + #include + #include + #include + + #ifdef __cplusplus + extern "C" { + #endif + + /*-------------------------------------------------------------------------- + * Prototypes for low-level timing routines + *--------------------------------------------------------------------------*/ + + /* timer.c */ + double time_getWallclockSeconds( void ); + double time_getCPUSeconds( void ); + double time_get_wallclock_seconds_( void ); + double time_get_cpu_seconds_( void ); + + /*-------------------------------------------------------------------------- + * With timing off + 
*--------------------------------------------------------------------------*/ + + #ifndef HYPRE_TIMING + + #define hypre_InitializeTiming(name) 0 + #define hypre_IncFLOPCount(inc) + #define hypre_BeginTiming(i) + #define hypre_EndTiming(i) + #define hypre_PrintTiming(heading, comm) + #define hypre_FinalizeTiming(index) + + /*-------------------------------------------------------------------------- + * With timing on + *--------------------------------------------------------------------------*/ + + #else + + /*------------------------------------------------------- + * Global timing structure + *-------------------------------------------------------*/ + + typedef struct + { + double *wall_time; + double *cpu_time; + double *flops; + char **name; + int *state; /* boolean flag to allow for recursive timing */ + int *num_regs; /* count of how many times a name is registered */ + + int num_names; + int size; + + double wall_count; + double CPU_count; + double FLOP_count; + + } hypre_TimingType; + + #ifdef HYPRE_TIMING_GLOBALS + hypre_TimingType *hypre_global_timing = NULL; + #else + extern hypre_TimingType *hypre_global_timing; + #endif + + /*------------------------------------------------------- + * Accessor functions + *-------------------------------------------------------*/ + + #ifndef HYPRE_USE_PTHREADS + #define hypre_TimingWallTime(i) (hypre_global_timing -> wall_time[(i)]) + #define hypre_TimingCPUTime(i) (hypre_global_timing -> cpu_time[(i)]) + #define hypre_TimingFLOPS(i) (hypre_global_timing -> flops[(i)]) + #define hypre_TimingName(i) (hypre_global_timing -> name[(i)]) + #define hypre_TimingState(i) (hypre_global_timing -> state[(i)]) + #define hypre_TimingNumRegs(i) (hypre_global_timing -> num_regs[(i)]) + #define hypre_TimingWallCount (hypre_global_timing -> wall_count) + #define hypre_TimingCPUCount (hypre_global_timing -> CPU_count) + #define hypre_TimingFLOPCount (hypre_global_timing -> FLOP_count) + #else + #define hypre_TimingWallTime(i) 
(hypre_global_timing[threadid].wall_time[(i)]) + #define hypre_TimingCPUTime(i) (hypre_global_timing[threadid].cpu_time[(i)]) + #define hypre_TimingFLOPS(i) (hypre_global_timing[threadid].flops[(i)]) + #define hypre_TimingName(i) (hypre_global_timing[threadid].name[(i)]) + #define hypre_TimingState(i) (hypre_global_timing[threadid].state[(i)]) + #define hypre_TimingNumRegs(i) (hypre_global_timing[threadid].num_regs[(i)]) + #define hypre_TimingWallCount (hypre_global_timing[threadid].wall_count) + #define hypre_TimingCPUCount (hypre_global_timing[threadid].CPU_count) + #define hypre_TimingFLOPCount (hypre_global_timing[threadid].FLOP_count) + #define hypre_TimingAllFLOPS (hypre_global_timing[hypre_NumThreads].FLOP_count) + #endif + + /*------------------------------------------------------- + * Prototypes + *-------------------------------------------------------*/ + + /* timing.c */ + int hypre_InitializeTiming( char *name ); + int hypre_FinalizeTiming( int time_index ); + int hypre_IncFLOPCount( int inc ); + int hypre_BeginTiming( int time_index ); + int hypre_EndTiming( int time_index ); + int hypre_ClearTiming( void ); + int hypre_PrintTiming( char *heading , MPI_Comm comm ); + + #endif + + #ifdef __cplusplus + } + #endif + + #endif Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/utilities.h diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/utilities.h:1.1 *** /dev/null Mon Apr 11 00:22:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000/utilities.h Mon Apr 11 00:22:07 2005 *************** *** 0 **** --- 1,755 ---- + + #include + + #include "HYPRE_utilities.h" + + #ifndef hypre_UTILITIES_HEADER + #define hypre_UTILITIES_HEADER + + #ifdef __cplusplus + extern "C" { + #endif + + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and 
disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * General structures and values + * + *****************************************************************************/ + + #ifndef hypre_GENERAL_HEADER + #define hypre_GENERAL_HEADER + + /*-------------------------------------------------------------------------- + * Define various functions + *--------------------------------------------------------------------------*/ + + #ifndef hypre_max + #define hypre_max(a,b) (((a)<(b)) ? (b) : (a)) + #endif + #ifndef hypre_min + #define hypre_min(a,b) (((a)<(b)) ? (a) : (b)) + #endif + + #ifndef hypre_round + #define hypre_round(x) ( ((x) < 0.0) ? ((int)(x - 0.5)) : ((int)(x + 0.5)) ) + #endif + + #endif + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Fake mpi stubs to generate serial codes without mpi + * + *****************************************************************************/ + + #ifndef hypre_MPISTUBS + #define hypre_MPISTUBS + + #ifdef HYPRE_SEQUENTIAL + + #ifdef __cplusplus + extern "C" { + #endif + + /*-------------------------------------------------------------------------- + * Change all MPI names to hypre_MPI names to avoid link conflicts + * + * NOTE: MPI_Comm is the only MPI symbol in the HYPRE user interface, + * and is defined in `HYPRE_utilities.h'. 
+ *--------------------------------------------------------------------------*/ + + #define MPI_Comm hypre_MPI_Comm + #define MPI_Group hypre_MPI_Group + #define MPI_Request hypre_MPI_Request + #define MPI_Datatype hypre_MPI_Datatype + #define MPI_Status hypre_MPI_Status + #define MPI_Op hypre_MPI_Op + #define MPI_Aint hypre_MPI_Aint + + #define MPI_COMM_WORLD hypre_MPI_COMM_WORLD + + #define MPI_BOTTOM hypre_MPI_BOTTOM + + #define MPI_DOUBLE hypre_MPI_DOUBLE + #define MPI_INT hypre_MPI_INT + #define MPI_CHAR hypre_MPI_CHAR + #define MPI_LONG hypre_MPI_LONG + + #define MPI_SUM hypre_MPI_SUM + #define MPI_MIN hypre_MPI_MIN + #define MPI_MAX hypre_MPI_MAX + #define MPI_LOR hypre_MPI_LOR + + #define MPI_UNDEFINED hypre_MPI_UNDEFINED + #define MPI_REQUEST_NULL hypre_MPI_REQUEST_NULL + #define MPI_ANY_SOURCE hypre_MPI_ANY_SOURCE + + #define MPI_Init hypre_MPI_Init + #define MPI_Finalize hypre_MPI_Finalize + #define MPI_Abort hypre_MPI_Abort + #define MPI_Wtime hypre_MPI_Wtime + #define MPI_Wtick hypre_MPI_Wtick + #define MPI_Barrier hypre_MPI_Barrier + #define MPI_Comm_create hypre_MPI_Comm_create + #define MPI_Comm_dup hypre_MPI_Comm_dup + #define MPI_Comm_group hypre_MPI_Comm_group + #define MPI_Comm_size hypre_MPI_Comm_size + #define MPI_Comm_rank hypre_MPI_Comm_rank + #define MPI_Comm_free hypre_MPI_Comm_free + #define MPI_Group_incl hypre_MPI_Group_incl + #define MPI_Group_free hypre_MPI_Group_free + #define MPI_Address hypre_MPI_Address + #define MPI_Get_count hypre_MPI_Get_count + #define MPI_Alltoall hypre_MPI_Alltoall + #define MPI_Allgather hypre_MPI_Allgather + #define MPI_Allgatherv hypre_MPI_Allgatherv + #define MPI_Gather hypre_MPI_Gather + #define MPI_Scatter hypre_MPI_Scatter + #define MPI_Bcast hypre_MPI_Bcast + #define MPI_Send hypre_MPI_Send + #define MPI_Recv hypre_MPI_Recv + #define MPI_Isend hypre_MPI_Isend + #define MPI_Irecv hypre_MPI_Irecv + #define MPI_Send_init hypre_MPI_Send_init + #define MPI_Recv_init hypre_MPI_Recv_init + #define 
MPI_Irsend hypre_MPI_Irsend + #define MPI_Startall hypre_MPI_Startall + #define MPI_Probe hypre_MPI_Probe + #define MPI_Iprobe hypre_MPI_Iprobe + #define MPI_Test hypre_MPI_Test + #define MPI_Testall hypre_MPI_Testall + #define MPI_Wait hypre_MPI_Wait + #define MPI_Waitall hypre_MPI_Waitall + #define MPI_Waitany hypre_MPI_Waitany + #define MPI_Allreduce hypre_MPI_Allreduce + #define MPI_Request_free hypre_MPI_Request_free + #define MPI_Type_contiguous hypre_MPI_Type_contiguous + #define MPI_Type_vector hypre_MPI_Type_vector + #define MPI_Type_hvector hypre_MPI_Type_hvector + #define MPI_Type_struct hypre_MPI_Type_struct + #define MPI_Type_commit hypre_MPI_Type_commit + #define MPI_Type_free hypre_MPI_Type_free + + /*-------------------------------------------------------------------------- + * Types, etc. + *--------------------------------------------------------------------------*/ + + /* These types have associated creation and destruction routines */ + typedef int hypre_MPI_Comm; + typedef int hypre_MPI_Group; + typedef int hypre_MPI_Request; + typedef int hypre_MPI_Datatype; + + typedef struct { int MPI_SOURCE; } hypre_MPI_Status; + typedef int hypre_MPI_Op; + typedef int hypre_MPI_Aint; + + #define hypre_MPI_COMM_WORLD 0 + + #define hypre_MPI_BOTTOM 0x0 + + #define hypre_MPI_DOUBLE 0 + #define hypre_MPI_INT 1 + #define hypre_MPI_CHAR 2 + #define hypre_MPI_LONG 3 + + #define hypre_MPI_SUM 0 + #define hypre_MPI_MIN 1 + #define hypre_MPI_MAX 2 + #define hypre_MPI_LOR 3 + + #define hypre_MPI_UNDEFINED -9999 + #define hypre_MPI_REQUEST_NULL 0 + #define hypre_MPI_ANY_SOURCE 1 + + /*-------------------------------------------------------------------------- + * Prototypes + *--------------------------------------------------------------------------*/ + + /* mpistubs.c */ + int hypre_MPI_Init( int *argc , char ***argv ); + int hypre_MPI_Finalize( void ); + int hypre_MPI_Abort( hypre_MPI_Comm comm , int errorcode ); + double hypre_MPI_Wtime( void ); + double 
hypre_MPI_Wtick( void ); + int hypre_MPI_Barrier( hypre_MPI_Comm comm ); + int hypre_MPI_Comm_create( hypre_MPI_Comm comm , hypre_MPI_Group group , hypre_MPI_Comm *newcomm ); + int hypre_MPI_Comm_dup( hypre_MPI_Comm comm , hypre_MPI_Comm *newcomm ); + int hypre_MPI_Comm_size( hypre_MPI_Comm comm , int *size ); + int hypre_MPI_Comm_rank( hypre_MPI_Comm comm , int *rank ); + int hypre_MPI_Comm_free( hypre_MPI_Comm *comm ); + int hypre_MPI_Comm_group( hypre_MPI_Comm comm , hypre_MPI_Group *group ); + int hypre_MPI_Group_incl( hypre_MPI_Group group , int n , int *ranks , hypre_MPI_Group *newgroup ); + int hypre_MPI_Group_free( hypre_MPI_Group *group ); + int hypre_MPI_Address( void *location , hypre_MPI_Aint *address ); + int hypre_MPI_Get_count( hypre_MPI_Status *status , hypre_MPI_Datatype datatype , int *count ); + int hypre_MPI_Alltoall( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int recvcount , hypre_MPI_Datatype recvtype , hypre_MPI_Comm comm ); + int hypre_MPI_Allgather( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int recvcount , hypre_MPI_Datatype recvtype , hypre_MPI_Comm comm ); + int hypre_MPI_Allgatherv( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int *recvcounts , int *displs , hypre_MPI_Datatype recvtype , hypre_MPI_Comm comm ); + int hypre_MPI_Gather( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int recvcount , hypre_MPI_Datatype recvtype , int root , hypre_MPI_Comm comm ); + int hypre_MPI_Scatter( void *sendbuf , int sendcount , hypre_MPI_Datatype sendtype , void *recvbuf , int recvcount , hypre_MPI_Datatype recvtype , int root , hypre_MPI_Comm comm ); + int hypre_MPI_Bcast( void *buffer , int count , hypre_MPI_Datatype datatype , int root , hypre_MPI_Comm comm ); + int hypre_MPI_Send( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm ); + int hypre_MPI_Recv( void *buf , int 
count , hypre_MPI_Datatype datatype , int source , int tag , hypre_MPI_Comm comm , hypre_MPI_Status *status ); + int hypre_MPI_Isend( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Irecv( void *buf , int count , hypre_MPI_Datatype datatype , int source , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Send_init( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Recv_init( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Irsend( void *buf , int count , hypre_MPI_Datatype datatype , int dest , int tag , hypre_MPI_Comm comm , hypre_MPI_Request *request ); + int hypre_MPI_Startall( int count , hypre_MPI_Request *array_of_requests ); + int hypre_MPI_Probe( int source , int tag , hypre_MPI_Comm comm , hypre_MPI_Status *status ); + int hypre_MPI_Iprobe( int source , int tag , hypre_MPI_Comm comm , int *flag , hypre_MPI_Status *status ); + int hypre_MPI_Test( hypre_MPI_Request *request , int *flag , hypre_MPI_Status *status ); + int hypre_MPI_Testall( int count , hypre_MPI_Request *array_of_requests , int *flag , hypre_MPI_Status *array_of_statuses ); + int hypre_MPI_Wait( hypre_MPI_Request *request , hypre_MPI_Status *status ); + int hypre_MPI_Waitall( int count , hypre_MPI_Request *array_of_requests , hypre_MPI_Status *array_of_statuses ); + int hypre_MPI_Waitany( int count , hypre_MPI_Request *array_of_requests , int *index , hypre_MPI_Status *status ); + int hypre_MPI_Allreduce( void *sendbuf , void *recvbuf , int count , hypre_MPI_Datatype datatype , hypre_MPI_Op op , hypre_MPI_Comm comm ); + int hypre_MPI_Request_free( hypre_MPI_Request *request ); + int hypre_MPI_Type_contiguous( int count , hypre_MPI_Datatype oldtype , hypre_MPI_Datatype *newtype ); + int 
hypre_MPI_Type_vector( int count , int blocklength , int stride , hypre_MPI_Datatype oldtype , hypre_MPI_Datatype *newtype ); + int hypre_MPI_Type_hvector( int count , int blocklength , hypre_MPI_Aint stride , hypre_MPI_Datatype oldtype , hypre_MPI_Datatype *newtype ); + int hypre_MPI_Type_struct( int count , int *array_of_blocklengths , hypre_MPI_Aint *array_of_displacements , hypre_MPI_Datatype *array_of_types , hypre_MPI_Datatype *newtype ); + int hypre_MPI_Type_commit( hypre_MPI_Datatype *datatype ); + int hypre_MPI_Type_free( hypre_MPI_Datatype *datatype ); + + #ifdef __cplusplus + } + #endif + + #endif + + #endif + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Header file for memory management utilities + * + *****************************************************************************/ + + #ifndef hypre_MEMORY_HEADER + #define hypre_MEMORY_HEADER + + #ifdef __cplusplus + extern "C" { + #endif + + /*-------------------------------------------------------------------------- + * Use "Debug Malloc Library", dmalloc + *--------------------------------------------------------------------------*/ + + #ifdef HYPRE_MEMORY_DMALLOC + + #define hypre_InitMemoryDebug(id) hypre_InitMemoryDebugDML(id) + #define hypre_FinalizeMemoryDebug() hypre_FinalizeMemoryDebugDML() + + #define hypre_TAlloc(type, count) \ + ( (type *)hypre_MAllocDML((unsigned int)(sizeof(type) * (count)),\ + __FILE__, __LINE__) ) + + #define hypre_CTAlloc(type, count) \ + ( (type *)hypre_CAllocDML((unsigned int)(count), (unsigned int)sizeof(type),\ + __FILE__, __LINE__) ) + + #define hypre_TReAlloc(ptr, type, 
count) \ + ( (type *)hypre_ReAllocDML((char *)ptr,\ + (unsigned int)(sizeof(type) * (count)),\ + __FILE__, __LINE__) ) + + #define hypre_TFree(ptr) \ + ( hypre_FreeDML((char *)ptr, __FILE__, __LINE__), ptr = NULL ) + + /*-------------------------------------------------------------------------- + * Use standard memory routines + *--------------------------------------------------------------------------*/ + + #else + + #define hypre_InitMemoryDebug(id) + #define hypre_FinalizeMemoryDebug() + + #define hypre_TAlloc(type, count) \ + ( (type *)hypre_MAlloc((unsigned int)(sizeof(type) * (count))) ) + + #define hypre_CTAlloc(type, count) \ + ( (type *)hypre_CAlloc((unsigned int)(count), (unsigned int)sizeof(type)) ) + + #define hypre_TReAlloc(ptr, type, count) \ + ( (type *)hypre_ReAlloc((char *)ptr, (unsigned int)(sizeof(type) * (count))) ) + + #define hypre_TFree(ptr) \ + ( hypre_Free((char *)ptr), ptr = NULL ) + + #endif + + + #ifdef HYPRE_USE_PTHREADS + + #define hypre_SharedTAlloc(type, count) \ + ( (type *)hypre_SharedMAlloc((unsigned int)(sizeof(type) * (count))) ) + + + #define hypre_SharedCTAlloc(type, count) \ + ( (type *)hypre_SharedCAlloc((unsigned int)(count),\ + (unsigned int)sizeof(type)) ) + + #define hypre_SharedTReAlloc(ptr, type, count) \ + ( (type *)hypre_SharedReAlloc((char *)ptr,\ + (unsigned int)(sizeof(type) * (count))) ) + + #define hypre_SharedTFree(ptr) \ + ( hypre_SharedFree((char *)ptr), ptr = NULL ) + + #else + + #define hypre_SharedTAlloc(type, count) hypre_TAlloc(type, (count)) + #define hypre_SharedCTAlloc(type, count) hypre_CTAlloc(type, (count)) + #define hypre_SharedTReAlloc(type, count) hypre_TReAlloc(type, (count)) + #define hypre_SharedTFree(ptr) hypre_TFree(ptr) + + #endif + + /*-------------------------------------------------------------------------- + * Prototypes + *--------------------------------------------------------------------------*/ + + /* memory.c */ + int hypre_OutOfMemory( int size ); + char *hypre_MAlloc( int size 
); + char *hypre_CAlloc( int count , int elt_size ); + char *hypre_ReAlloc( char *ptr , int size ); + void hypre_Free( char *ptr ); + char *hypre_SharedMAlloc( int size ); + char *hypre_SharedCAlloc( int count , int elt_size ); + char *hypre_SharedReAlloc( char *ptr , int size ); + void hypre_SharedFree( char *ptr ); + double *hypre_IncrementSharedDataPtr( double *ptr , int size ); + + /* memory_dmalloc.c */ + int hypre_InitMemoryDebugDML( int id ); + int hypre_FinalizeMemoryDebugDML( void ); + char *hypre_MAllocDML( int size , char *file , int line ); + char *hypre_CAllocDML( int count , int elt_size , char *file , int line ); + char *hypre_ReAllocDML( char *ptr , int size , char *file , int line ); + void hypre_FreeDML( char *ptr , char *file , int line ); + + #ifdef __cplusplus + } + #endif + + #endif + + /* random.c */ + void hypre_SeedRand( int seed ); + double hypre_Rand( void ); + + /*BHEADER********************************************************************** + * (c) 1998 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. 
+ * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + /****************************************************************************** + * + * Fake mpi stubs to generate serial codes without mpi + * + *****************************************************************************/ + /*just a test comment*/ + #ifndef hypre_thread_MPISTUBS + #define hypre_thread_MPISTUBS + + #ifdef HYPRE_USE_PTHREADS + + #ifdef __cplusplus + extern "C" { + #endif + + #ifndef HYPRE_USING_THREAD_MPISTUBS + + #define MPI_Init hypre_thread_MPI_Init + #define MPI_Wtime hypre_thread_MPI_Wtime + #define MPI_Wtick hypre_thread_MPI_Wtick + #define MPI_Barrier hypre_thread_MPI_Barrier + #define MPI_Finalize hypre_thread_MPI_Finalize + #define MPI_Comm_group hypre_thread_MPI_Comm_group + #define MPI_Comm_dup hypre_thread_MPI_Comm_dup + #define MPI_Group_incl hypre_thread_MPI_Group_incl + #define MPI_Comm_create hypre_thread_MPI_Comm_create + #define MPI_Allgather hypre_thread_MPI_Allgather + #define MPI_Allgatherv hypre_thread_MPI_Allgatherv + #define MPI_Bcast hypre_thread_MPI_Bcast + #define MPI_Send hypre_thread_MPI_Send + #define MPI_Recv hypre_thread_MPI_Recv + + #define MPI_Isend hypre_thread_MPI_Isend + #define MPI_Irecv hypre_thread_MPI_Irecv + #define MPI_Wait hypre_thread_MPI_Wait + #define MPI_Waitall hypre_thread_MPI_Waitall + #define MPI_Waitany hypre_thread_MPI_Waitany + #define MPI_Comm_size hypre_thread_MPI_Comm_size + #define MPI_Comm_rank hypre_thread_MPI_Comm_rank + #define MPI_Allreduce hypre_thread_MPI_Allreduce + #define MPI_Type_hvector hypre_thread_MPI_Type_hvector + #define MPI_Type_struct hypre_thread_MPI_Type_struct + #define MPI_Type_free hypre_thread_MPI_Type_free + #define MPI_Type_commit hypre_thread_MPI_Type_commit + + #endif + + /*-------------------------------------------------------------------------- + * Prototypes + *--------------------------------------------------------------------------*/ + + /* 
mpistubs.c */ + int MPI_Init( int *argc , char ***argv ); + double MPI_Wtime( void ); + double MPI_Wtick( void ); + int MPI_Barrier( MPI_Comm comm ); + int MPI_Finalize( void ); + int MPI_Abort( MPI_Comm comm , int errorcode ); + int MPI_Comm_group( MPI_Comm comm , MPI_Group *group ); + int MPI_Comm_dup( MPI_Comm comm , MPI_Comm *newcomm ); + int MPI_Group_incl( MPI_Group group , int n , int *ranks , MPI_Group *newgroup ); + int MPI_Comm_create( MPI_Comm comm , MPI_Group group , MPI_Comm *newcomm ); + int MPI_Get_count( MPI_Status *status , MPI_Datatype datatype , int *count ); + int MPI_Alltoall( void *sendbuf , int sendcount , MPI_Datatype sendtype , void *recvbuf , int recvcount , MPI_Datatype recvtype , MPI_Comm comm ); + int MPI_Allgather( void *sendbuf , int sendcount , MPI_Datatype sendtype , void *recvbuf , int recvcount , MPI_Datatype recvtype , MPI_Comm comm ); + int MPI_Allgatherv( void *sendbuf , int sendcount , MPI_Datatype sendtype , void *recvbuf , int *recvcounts , int *displs , MPI_Datatype recvtype , MPI_Comm comm ); + int MPI_Gather( void *sendbuf , int sendcount , MPI_Datatype sendtype , void *recvbuf , int recvcount , MPI_Datatype recvtype , int root , MPI_Comm comm ); + int MPI_Scatter( void *sendbuf , int sendcount , MPI_Datatype sendtype , void *recvbuf , int recvcount , MPI_Datatype recvtype , int root , MPI_Comm comm ); + int MPI_Bcast( void *buffer , int count , MPI_Datatype datatype , int root , MPI_Comm comm ); + int MPI_Send( void *buf , int count , MPI_Datatype datatype , int dest , int tag , MPI_Comm comm ); + int MPI_Recv( void *buf , int count , MPI_Datatype datatype , int source , int tag , MPI_Comm comm , MPI_Status *status ); + int MPI_Isend( void *buf , int count , MPI_Datatype datatype , int dest , int tag , MPI_Comm comm , MPI_Request *request ); + int MPI_Irecv( void *buf , int count , MPI_Datatype datatype , int source , int tag , MPI_Comm comm , MPI_Request *request ); + int MPI_Wait( MPI_Request *request , MPI_Status 
*status ); + int MPI_Waitall( int count , MPI_Request *array_of_requests , MPI_Status *array_of_statuses ); + int MPI_Waitany( int count , MPI_Request *array_of_requests , int *index , MPI_Status *status ); + int MPI_Comm_size( MPI_Comm comm , int *size ); + int MPI_Comm_rank( MPI_Comm comm , int *rank ); + int MPI_Allreduce( void *sendbuf , void *recvbuf , int count , MPI_Datatype datatype , MPI_Op op , MPI_Comm comm ); + int MPI_Address( void *location , MPI_Aint *address ); + int MPI_Type_contiguous( int count , MPI_Datatype oldtype , MPI_Datatype *newtype ); + int MPI_Type_vector( int count , int blocklength , int stride , MPI_Datatype oldtype , MPI_Datatype *newtype ); + int MPI_Type_hvector( int count , int blocklength , MPI_Aint stride , MPI_Datatype oldtype , MPI_Datatype *newtype ); + int MPI_Type_struct( int count , int *array_of_blocklengths , MPI_Aint *array_of_displacements , MPI_Datatype *array_of_types , MPI_Datatype *newtype ); + int MPI_Type_free( MPI_Datatype *datatype ); + int MPI_Type_commit( MPI_Datatype *datatype ); + int MPI_Request_free( MPI_Request *request ); + int MPI_Send_init( void *buf , int count , MPI_Datatype datatype , int dest , int tag , MPI_Comm comm , MPI_Request *request ); + int MPI_Recv_init( void *buf , int count , MPI_Datatype datatype , int dest , int tag , MPI_Comm comm , MPI_Request *request ); + int MPI_Startall( int count , MPI_Request *array_of_requests ); + int MPI_Iprobe( int source , int tag , MPI_Comm comm , int *flag , MPI_Status *status ); + int MPI_Probe( int source , int tag , MPI_Comm comm , MPI_Status *status ); + int MPI_Irsend( void *buf , int count , MPI_Datatype datatype , int dest , int tag , MPI_Comm comm , MPI_Request *request ); + + #ifdef __cplusplus + } + #endif + + #endif + + #endif + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * 
notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + #ifndef hypre_THREADING_HEADER + #define hypre_THREADING_HEADER + + #if defined(HYPRE_USING_OPENMP) || defined (HYPRE_USING_PGCC_SMP) + + int hypre_NumThreads( void ); + + #else + + #define hypre_NumThreads() 1 + + #endif + + + /*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!*/ + /* The pthreads stuff needs to be reworked */ + + #ifdef HYPRE_USE_PTHREADS + + #ifndef MAX_QUEUE + #define MAX_QUEUE 256 + #endif + + #include + + /* hypre_work_proc_t typedef'd to be a pointer to a function with a void* + argument and a void return type */ + typedef void (*hypre_work_proc_t)(void *); + + typedef struct hypre_workqueue_struct { + pthread_mutex_t lock; + pthread_cond_t work_wait; + pthread_cond_t finish_wait; + hypre_work_proc_t worker_proc_queue[MAX_QUEUE]; + int n_working; + int n_waiting; + int n_queue; + int inp; + int outp; + void *argqueue[MAX_QUEUE]; + } *hypre_workqueue_t; + + void hypre_work_put( hypre_work_proc_t funcptr, void *argptr ); + void hypre_work_wait( void ); + int HYPRE_InitPthreads( int num_threads ); + void HYPRE_DestroyPthreads( void ); + void hypre_pthread_worker( int threadid ); + int ifetchadd( int *w, pthread_mutex_t *mutex_fetchadd ); + int hypre_fetch_and_add( int *w ); + void hypre_barrier(pthread_mutex_t *mpi_mtx, int unthreaded); + int hypre_GetThreadID( void ); + + pthread_t initial_thread; + pthread_t hypre_thread[hypre_MAX_THREADS]; + pthread_mutex_t hypre_mutex_boxloops; + pthread_mutex_t talloc_mtx; + pthread_mutex_t worker_mtx; + hypre_workqueue_t hypre_qptr; + pthread_mutex_t mpi_mtx; + pthread_mutex_t time_mtx; + volatile int hypre_thread_release; + + #ifdef HYPRE_THREAD_GLOBALS + int hypre_NumThreads = 4; + #else + extern int hypre_NumThreads; + #endif + + #endif + /*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!*/ + + #endif + 
/*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Header file for doing timing + * + *****************************************************************************/ + + #ifndef HYPRE_TIMING_HEADER + #define HYPRE_TIMING_HEADER + + #include + #include + #include + + #ifdef __cplusplus + extern "C" { + #endif + + /*-------------------------------------------------------------------------- + * Prototypes for low-level timing routines + *--------------------------------------------------------------------------*/ + + /* timer.c */ + double time_getWallclockSeconds( void ); + double time_getCPUSeconds( void ); + double time_get_wallclock_seconds_( void ); + double time_get_cpu_seconds_( void ); + + /*-------------------------------------------------------------------------- + * With timing off + *--------------------------------------------------------------------------*/ + + #ifndef HYPRE_TIMING + + #define hypre_InitializeTiming(name) 0 + #define hypre_IncFLOPCount(inc) + #define hypre_BeginTiming(i) + #define hypre_EndTiming(i) + #define hypre_PrintTiming(heading, comm) + #define hypre_FinalizeTiming(index) + + /*-------------------------------------------------------------------------- + * With timing on + *--------------------------------------------------------------------------*/ + + #else + + /*------------------------------------------------------- + * Global timing structure + *-------------------------------------------------------*/ + + typedef struct + { + double *wall_time; + double *cpu_time; + double *flops; + char **name; + int *state; /* boolean flag to allow 
for recursive timing */ + int *num_regs; /* count of how many times a name is registered */ + + int num_names; + int size; + + double wall_count; + double CPU_count; + double FLOP_count; + + } hypre_TimingType; + + #ifdef HYPRE_TIMING_GLOBALS + hypre_TimingType *hypre_global_timing = NULL; + #else + extern hypre_TimingType *hypre_global_timing; + #endif + + /*------------------------------------------------------- + * Accessor functions + *-------------------------------------------------------*/ + + #ifndef HYPRE_USE_PTHREADS + #define hypre_TimingWallTime(i) (hypre_global_timing -> wall_time[(i)]) + #define hypre_TimingCPUTime(i) (hypre_global_timing -> cpu_time[(i)]) + #define hypre_TimingFLOPS(i) (hypre_global_timing -> flops[(i)]) + #define hypre_TimingName(i) (hypre_global_timing -> name[(i)]) + #define hypre_TimingState(i) (hypre_global_timing -> state[(i)]) + #define hypre_TimingNumRegs(i) (hypre_global_timing -> num_regs[(i)]) + #define hypre_TimingWallCount (hypre_global_timing -> wall_count) + #define hypre_TimingCPUCount (hypre_global_timing -> CPU_count) + #define hypre_TimingFLOPCount (hypre_global_timing -> FLOP_count) + #else + #define hypre_TimingWallTime(i) (hypre_global_timing[threadid].wall_time[(i)]) + #define hypre_TimingCPUTime(i) (hypre_global_timing[threadid].cpu_time[(i)]) + #define hypre_TimingFLOPS(i) (hypre_global_timing[threadid].flops[(i)]) + #define hypre_TimingName(i) (hypre_global_timing[threadid].name[(i)]) + #define hypre_TimingState(i) (hypre_global_timing[threadid].state[(i)]) + #define hypre_TimingNumRegs(i) (hypre_global_timing[threadid].num_regs[(i)]) + #define hypre_TimingWallCount (hypre_global_timing[threadid].wall_count) + #define hypre_TimingCPUCount (hypre_global_timing[threadid].CPU_count) + #define hypre_TimingFLOPCount (hypre_global_timing[threadid].FLOP_count) + #define hypre_TimingAllFLOPS (hypre_global_timing[hypre_NumThreads].FLOP_count) + #endif + + /*------------------------------------------------------- + * 
Prototypes + *-------------------------------------------------------*/ + + /* timing.c */ + int hypre_InitializeTiming( char *name ); + int hypre_FinalizeTiming( int time_index ); + int hypre_IncFLOPCount( int inc ); + int hypre_BeginTiming( int time_index ); + int hypre_EndTiming( int time_index ); + int hypre_ClearTiming( void ); + int hypre_PrintTiming( char *heading , MPI_Comm comm ); + + #endif + + #ifdef __cplusplus + } + #endif + + #endif + /*BHEADER********************************************************************** + * (c) 1997 The Regents of the University of California + * + * See the file COPYRIGHT_and_DISCLAIMER for a complete copyright + * notice, contact person, and disclaimer. + * + * $Revision: 1.1 $ + *********************************************************************EHEADER*/ + + /****************************************************************************** + * + * Header file link lists + * + *****************************************************************************/ + + #ifndef HYPRE_LINKLIST_HEADER + #define HYPRE_LINKLIST_HEADER + + #include + #include + #include + + #ifdef __cplusplus + extern "C" { + #endif + + #define LIST_HEAD -1 + #define LIST_TAIL -2 + + struct double_linked_list + { + int data; + struct double_linked_list *next_elt; + struct double_linked_list *prev_elt; + int head; + int tail; + }; + + typedef struct double_linked_list hypre_ListElement; + typedef hypre_ListElement *hypre_LinkList; + + #ifdef __cplusplus + } + #endif + + #endif + + /* amg_linklist.c */ + void dispose_elt( hypre_LinkList element_ptr ); + void remove_point( hypre_LinkList *LoL_head_ptr , hypre_LinkList *LoL_tail_ptr , int measure , int index , int *lists , int *where ); + hypre_LinkList create_elt( int Item ); + void enter_on_lists( hypre_LinkList *LoL_head_ptr , hypre_LinkList *LoL_tail_ptr , int measure , int index , int *lists , int *where ); + + + /* binsearch.c */ + int hypre_BinarySearch( int *list , int value , int list_length ); + + + 
/* qsplit.c */ + int hypre_DoubleQuickSplit( double *values , int *indices , int list_length , int NumberKept ); + + + #ifdef __cplusplus + } + #endif + + #endif + From duraid at octopus.com.au Mon Apr 11 00:25:19 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Mon, 11 Apr 2005 00:25:19 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/ASCI_Purple/Makefile Message-ID: <200504110525.AAA27492@zion.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/ASCI_Purple: Makefile added (r1.1) --- Log message: Makefile for the ASCI_Purple directory - will probably add some other benchmarks in here later --- Diffs of the changes: (+7 -0) Makefile | 7 +++++++ 1 files changed, 7 insertions(+) Index: llvm-test/MultiSource/Benchmarks/ASCI_Purple/Makefile diff -c /dev/null llvm-test/MultiSource/Benchmarks/ASCI_Purple/Makefile:1.1 *** /dev/null Mon Apr 11 00:25:18 2005 --- llvm-test/MultiSource/Benchmarks/ASCI_Purple/Makefile Mon Apr 11 00:25:08 2005 *************** *** 0 **** --- 1,7 ---- + # MultiSource/ASCI_Purple Makefile: Build all subdirectories automatically + + LEVEL = ../../.. 
+ PARALLEL_DIRS = SMG2000 + + + include $(LEVEL)/Makefile.programs From duraid at octopus.com.au Mon Apr 11 00:27:28 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Mon, 11 Apr 2005 00:27:28 -0500 Subject: [llvm-commits] CVS: llvm-test/LICENSE.TXT Message-ID: <200504110527.AAA27512@zion.cs.uiuc.edu> Changes in directory llvm-test: LICENSE.TXT updated: 1.6 -> 1.7 --- Log message: add SMG2000 license pointer --- Diffs of the changes: (+1 -0) LICENSE.TXT | 1 + 1 files changed, 1 insertion(+) Index: llvm-test/LICENSE.TXT diff -u llvm-test/LICENSE.TXT:1.6 llvm-test/LICENSE.TXT:1.7 --- llvm-test/LICENSE.TXT:1.6 Tue Jan 11 12:13:15 2005 +++ llvm-test/LICENSE.TXT Mon Apr 11 00:27:17 2005 @@ -74,6 +74,7 @@ kimwitu++: llvm-test/MultiSource/Applications/kimwitu++ obsequi: llvm-test/MultiSource/Applications/obsequi Hexxagon: llvm-test/MultiSource/Applications/hexxagon +smg2000: llvm-test/MultiSource/Benchmarks/ASCI_Purple/SMG2000 Fhourstones: llvm-test/MultiSource/Benchmarks/Fhourstones McCat: llvm-test/MultiSource/Benchmarks/McCat Olden: llvm-test/MultiSource/Benchmarks/Olden From reid at x10sys.com Mon Apr 11 00:48:15 2005 From: reid at x10sys.com (Reid Spencer) Date: Mon, 11 Apr 2005 00:48:15 -0500 Subject: [llvm-commits] CVS: llvm/tools/llvmc/CompilerDriver.cpp Message-ID: <200504110548.AAA27745@zion.cs.uiuc.edu> Changes in directory llvm/tools/llvmc: CompilerDriver.cpp updated: 1.26 -> 1.27 --- Log message: Ensure that the arguments passed to sys::Program::ExecuteAndWait include the program name as the first argument. Thanks go to Markus Oberhumer for noticing this problem. 
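For readers following along: the convention referred to in the log message above (program name in slot 0, then the arguments, then a terminating null) can be sketched as follows. This is only an illustration under stated assumptions, not the llvmc code; buildArgv is a hypothetical helper, and it uses std::vector instead of the alloca-based array in the actual change.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of llvmc): build a null-terminated,
// argv-style array whose first entry is the program name.
// The returned pointers alias `program` and `args`, so both must
// outlive the returned vector.
std::vector<const char*> buildArgv(const std::string &program,
                                   const std::vector<std::string> &args) {
  std::vector<const char*> argv;
  argv.reserve(args.size() + 2);      // program name + args + terminator
  argv.push_back(program.c_str());    // slot 0: the program itself
  for (const std::string &a : args)   // slots 1..N: the real arguments
    argv.push_back(a.c_str());
  argv.push_back(nullptr);            // slot N+1: null terminator
  return argv;
}
```

With N arguments the array holds N+2 entries, which is why the allocation has to grow by one slot once the program name is included.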
--- Diffs of the changes: (+3 -2) CompilerDriver.cpp | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) Index: llvm/tools/llvmc/CompilerDriver.cpp diff -u llvm/tools/llvmc/CompilerDriver.cpp:1.26 llvm/tools/llvmc/CompilerDriver.cpp:1.27 --- llvm/tools/llvmc/CompilerDriver.cpp:1.26 Sun Feb 13 17:10:34 2005 +++ llvm/tools/llvmc/CompilerDriver.cpp Mon Apr 11 00:48:04 2005 @@ -394,8 +394,9 @@ // Invoke the program const char** Args = (const char**) - alloca(sizeof(const char*)*(action->args.size()+1)); - for (unsigned i = 0; i != action->args.size(); ++i) + alloca(sizeof(const char*)*(action->args.size()+2)); + Args[0] = action->program.toString().c_str(); + for (unsigned i = 1; i != action->args.size(); ++i) Args[i] = action->args[i].c_str(); Args[action->args.size()] = 0; // null terminate list. if (isSet(TIME_ACTIONS_FLAG)) { From duraid at octopus.com.au Mon Apr 11 00:56:07 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Mon, 11 Apr 2005 00:56:07 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp IA64AsmPrinter.cpp IA64RegisterInfo.cpp IA64InstrInfo.td Message-ID: <200504110556.AAA27820@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.15 -> 1.16 IA64AsmPrinter.cpp updated: 1.8 -> 1.9 IA64RegisterInfo.cpp updated: 1.2 -> 1.3 IA64InstrInfo.td updated: 1.8 -> 1.9 --- Log message: assorted fixes: * clean up immediates (we use 14, 22 and 64 bit immediates now. sane.) * fold r0/f0/f1 registers into comparisons against 0/0.0/1.0 * fix nasty thinko - didn't use two-address form of conditional add for extending bools to integers, so occasionally there would be garbage in the result. it's amazing how often zeros are just sitting around in registers ;) - this should fix a bunch of tests. 
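As a rough illustration of the immediate cleanup described in the log message above, the selection between the 14-bit, 22-bit and 64-bit forms can be sketched as below. The ranges are the ones used in the diff that follows; the names ImmKind, classifyImmediate and signExtend22 are hypothetical and exist only for this sketch, which is not the selector or asm printer code itself.

```cpp
#include <cstdint>

// Hypothetical classification of a 64-bit constant by the smallest
// immediate form that can hold it.
enum class ImmKind { Zero, Simm14, Simm22, Imm64 };

ImmKind classifyImmediate(int64_t imm) {
  if (imm == 0)
    return ImmKind::Zero;             // can be a plain copy of r0
  if (imm >= -8192 && imm <= 8191)
    return ImmKind::Simm14;           // fits a signed 14-bit field
  if (imm >= -2097152 && imm <= 2097151)
    return ImmKind::Simm22;           // fits a signed 22-bit field
  return ImmKind::Imm64;              // needs the full 64-bit form
}

// Interpret the low 22 bits of `field` as a two's-complement value,
// using the same "subtract 2^22 if the top bit is set" step as the
// 22-bit operand printer in the diff.
int32_t signExtend22(uint32_t field) {
  int32_t val = static_cast<int32_t>(field & 0x3FFFFF);
  if (val >= 2097152)                 // bit 21 set: value is negative
    val -= 4194304;
  return val;
}
```

Checking the cases from smallest to largest means each constant gets the shortest form that can represent it, and only values outside the 22-bit range fall through to the 64-bit form.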
--- Diffs of the changes: (+78 -66) IA64AsmPrinter.cpp | 22 ++------------ IA64ISelPattern.cpp | 79 ++++++++++++++++++++++++++++++++++++--------------- IA64InstrInfo.td | 37 +++++++++-------------- IA64RegisterInfo.cpp | 6 +-- 4 files changed, 78 insertions(+), 66 deletions(-) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.15 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.16 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.15 Fri Apr 8 22:22:24 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Mon Apr 11 00:55:56 2005 @@ -639,28 +639,36 @@ else // false: BuildMI(BB, IA64::CMPNE, 2, Result) .addReg(IA64::r0).addReg(IA64::r0); - return Result; + return Result; // early exit } - case MVT::i64: Opc = IA64::MOVLI32; break; + case MVT::i64: break; } int64_t immediate = cast(N)->getValue(); - if(immediate>>32) { // if our immediate really is big: - int highPart = immediate>>32; - int lowPart = immediate&0xFFFFFFFF; - unsigned dummy = MakeReg(MVT::i64); - unsigned dummy2 = MakeReg(MVT::i64); - unsigned dummy3 = MakeReg(MVT::i64); - - BuildMI(BB, IA64::MOVLI32, 1, dummy).addImm(highPart); - BuildMI(BB, IA64::SHLI, 2, dummy2).addReg(dummy).addImm(32); - BuildMI(BB, IA64::MOVLI32, 1, dummy3).addImm(lowPart); - BuildMI(BB, IA64::ADD, 2, Result).addReg(dummy2).addReg(dummy3); - } else { - BuildMI(BB, IA64::MOVLI32, 1, Result).addImm(immediate); + + if(immediate==0) { // if the constant is just zero, + BuildMI(BB, IA64::MOV, 1, Result).addReg(IA64::r0); // just copy r0 + return Result; // early exit } - return Result; + if (immediate <= 8191 && immediate >= -8192) { + // if this constants fits in 14 bits, we use a mov the assembler will + // turn into: "adds rDest=imm,r0" (and _not_ "andl"...) 
+ BuildMI(BB, IA64::MOVSIMM14, 1, Result).addSImm(immediate); + return Result; // early exit + } + + if (immediate <= 2097151 && immediate >= -2097152) { + // if this constants fits in 22 bits, we use a mov the assembler will + // turn into: "addl rDest=imm,r0" + BuildMI(BB, IA64::MOVSIMM22, 1, Result).addSImm(immediate); + return Result; // early exit + } + + /* otherwise, our immediate is big, so we use movl */ + uint64_t Imm = immediate; + BuildMI(BB, IA64::MOVLIMM64, 1, Result).addU64Imm(Imm); + return Result; } case ISD::UNDEF: { @@ -706,7 +714,7 @@ // first load zero: BuildMI(BB, IA64::MOV, 1, dummy).addReg(IA64::r0); // ...then conditionally (PR:Tmp1) add 1: - BuildMI(BB, IA64::CADDIMM22, 3, Result).addReg(dummy) + BuildMI(BB, IA64::TPCADDIMM22, 2, Result).addReg(dummy) .addImm(1).addReg(Tmp1); return Result; // XXX early exit! } @@ -823,15 +831,16 @@ return Result; // early exit } Tmp1 = SelectExpr(N.getOperand(0)); - Tmp2 = SelectExpr(N.getOperand(1)); if(DestType != MVT::f64) { // integer addition: switch (ponderIntegerAdditionWith(N.getOperand(1), Tmp3)) { case 1: // adding a constant that's 14 bits BuildMI(BB, IA64::ADDIMM14, 2, Result).addReg(Tmp1).addSImm(Tmp3); return Result; // early exit } // fallthrough and emit a reg+reg ADD: + Tmp2 = SelectExpr(N.getOperand(1)); BuildMI(BB, IA64::ADD, 2, Result).addReg(Tmp1).addReg(Tmp2); } else { // this is a floating point addition + Tmp2 = SelectExpr(N.getOperand(1)); BuildMI(BB, IA64::FADD, 2, Result).addReg(Tmp1).addReg(Tmp2); } return Result; @@ -868,7 +877,6 @@ BuildMI(BB, IA64::FMS, 3, Result).addReg(Tmp1).addReg(Tmp2).addReg(Tmp3); return Result; // early exit } - Tmp1 = SelectExpr(N.getOperand(0)); Tmp2 = SelectExpr(N.getOperand(1)); if(DestType != MVT::f64) { // integer subtraction: switch (ponderIntegerSubtractionFrom(N.getOperand(0), Tmp3)) { @@ -876,8 +884,10 @@ BuildMI(BB, IA64::SUBIMM8, 2, Result).addSImm(Tmp3).addReg(Tmp2); return Result; // early exit } // fallthrough and emit a reg+reg SUB: + 
Tmp1 = SelectExpr(N.getOperand(0)); BuildMI(BB, IA64::SUB, 2, Result).addReg(Tmp1).addReg(Tmp2); } else { // this is a floating point subtraction + Tmp1 = SelectExpr(N.getOperand(0)); BuildMI(BB, IA64::FSUB, 2, Result).addReg(Tmp1).addReg(Tmp2); } return Result; @@ -1311,9 +1321,20 @@ case ISD::SETCC: { Tmp1 = SelectExpr(N.getOperand(0)); - Tmp2 = SelectExpr(N.getOperand(1)); + if (SetCCSDNode *SetCC = dyn_cast(Node)) { if (MVT::isInteger(SetCC->getOperand(0).getValueType())) { + + if(ConstantSDNode *CSDN = + dyn_cast(N.getOperand(1))) { + // if we are comparing against a constant zero + if(CSDN->getValue()==0) + Tmp2 = IA64::r0; // then we can just compare against r0 + else + Tmp2 = SelectExpr(N.getOperand(1)); + } else // not comparing against a constant + Tmp2 = SelectExpr(N.getOperand(1)); + switch (SetCC->getCondition()) { default: assert(0 && "Unknown integer comparison!"); case ISD::SETEQ: @@ -1351,6 +1372,20 @@ else { // if not integer, should be FP. FIXME: what about bools? ;) assert(SetCC->getOperand(0).getValueType() != MVT::f32 && "error: SETCC should have had incoming f32 promoted to f64!\n"); + + if(ConstantFPSDNode *CFPSDN = + dyn_cast(N.getOperand(1))) { + + // if we are comparing against a constant +0.0 or +1.0 + if(CFPSDN->isExactlyValue(+0.0)) + Tmp2 = IA64::F0; // then we can just compare against f0 + else if(CFPSDN->isExactlyValue(+1.0)) + Tmp2 = IA64::F1; // or f1 + else + Tmp2 = SelectExpr(N.getOperand(1)); + } else // not comparing against a constant + Tmp2 = SelectExpr(N.getOperand(1)); + switch (SetCC->getCondition()) { default: assert(0 && "Unknown FP comparison!"); case ISD::SETEQ: @@ -1836,7 +1871,7 @@ unsigned dummy3 = MakeReg(MVT::i64); unsigned dummy4 = MakeReg(MVT::i64); BuildMI(BB, IA64::MOV, 1, dummy3).addReg(IA64::r0); - BuildMI(BB, IA64::CADDIMM22, 3, dummy4) + BuildMI(BB, IA64::TPCADDIMM22, 2, dummy4) .addReg(dummy3).addImm(1).addReg(Tmp1); // if(Tmp1) dummy=0+1; BuildMI(BB, Opc, 2).addReg(dummy2).addReg(dummy4); } @@ -1858,7 
+1893,7 @@ unsigned dummy3 = MakeReg(MVT::i64); unsigned dummy4 = MakeReg(MVT::i64); BuildMI(BB, IA64::MOV, 1, dummy3).addReg(IA64::r0); - BuildMI(BB, IA64::CADDIMM22, 3, dummy4) + BuildMI(BB, IA64::TPCADDIMM22, 2, dummy4) .addReg(dummy3).addImm(1).addReg(Tmp1); // if(Tmp1) dummy=0+1; BuildMI(BB, Opc, 2).addReg(Tmp2).addReg(dummy4); } Index: llvm/lib/Target/IA64/IA64AsmPrinter.cpp diff -u llvm/lib/Target/IA64/IA64AsmPrinter.cpp:1.8 llvm/lib/Target/IA64/IA64AsmPrinter.cpp:1.9 --- llvm/lib/Target/IA64/IA64AsmPrinter.cpp:1.8 Thu Apr 7 07:34:36 2005 +++ llvm/lib/Target/IA64/IA64AsmPrinter.cpp Mon Apr 11 00:55:56 2005 @@ -225,14 +225,6 @@ } } - void printS16ImmOperand(const MachineInstr *MI, unsigned OpNo, - MVT::ValueType VT) { - O << (short)MI->getOperand(OpNo).getImmedValue(); - } - void printU16ImmOperand(const MachineInstr *MI, unsigned OpNo, - MVT::ValueType VT) { - O << (unsigned short)MI->getOperand(OpNo).getImmedValue(); - } void printS8ImmOperand(const MachineInstr *MI, unsigned OpNo, MVT::ValueType VT) { int val=(unsigned int)MI->getOperand(OpNo).getImmedValue(); @@ -245,17 +237,11 @@ if(val>=8192) val=val-16384; // if negative, flip sign O << val; } - void printS21ImmOperand(const MachineInstr *MI, unsigned OpNo, - MVT::ValueType VT) { - O << (int)MI->getOperand(OpNo).getImmedValue(); // FIXME (21, not 32!) 
- } - void printS32ImmOperand(const MachineInstr *MI, unsigned OpNo, - MVT::ValueType VT) { - O << (int)MI->getOperand(OpNo).getImmedValue(); - } - void printU32ImmOperand(const MachineInstr *MI, unsigned OpNo, + void printS22ImmOperand(const MachineInstr *MI, unsigned OpNo, MVT::ValueType VT) { - O << (unsigned int)MI->getOperand(OpNo).getImmedValue(); + int val=(unsigned int)MI->getOperand(OpNo).getImmedValue(); + if(val>=2097152) val=val-4194304; // if negative, flip sign + O << val; } void printU64ImmOperand(const MachineInstr *MI, unsigned OpNo, MVT::ValueType VT) { Index: llvm/lib/Target/IA64/IA64RegisterInfo.cpp diff -u llvm/lib/Target/IA64/IA64RegisterInfo.cpp:1.2 llvm/lib/Target/IA64/IA64RegisterInfo.cpp:1.3 --- llvm/lib/Target/IA64/IA64RegisterInfo.cpp:1.2 Thu Mar 31 01:36:43 2005 +++ llvm/lib/Target/IA64/IA64RegisterInfo.cpp Mon Apr 11 00:55:56 2005 @@ -189,7 +189,7 @@ //fix up the old: MI.SetMachineOperandReg(i, IA64::r22); MachineInstr* nMI; - nMI=BuildMI(IA64::MOVLSI32, 1, IA64::r22).addSImm(Offset); + nMI=BuildMI(IA64::MOVLIMM64, 1, IA64::r22).addSImm(Offset); MBB.insert(II, nMI); nMI=BuildMI(IA64::ADD, 2, IA64::r22).addReg(BaseRegister) .addReg(IA64::r22); @@ -280,7 +280,7 @@ MI=BuildMI(IA64::ADDIMM22, 2, IA64::r12).addReg(IA64::r12).addImm(-NumBytes); MBB.insert(MBBI, MI); } else { // we use r22 as a scratch register here - MI=BuildMI(IA64::MOVLSI32, 1, IA64::r22).addSImm(-NumBytes); + MI=BuildMI(IA64::MOVLIMM64, 1, IA64::r22).addSImm(-NumBytes); // FIXME: MOVLSI32 expects a _u_32imm MBB.insert(MBBI, MI); // first load the decrement into r22 MI=BuildMI(IA64::ADD, 2, IA64::r12).addReg(IA64::r12).addReg(IA64::r22); @@ -328,7 +328,7 @@ MI=BuildMI(IA64::ADDIMM22, 2, IA64::r12).addReg(IA64::r12).addImm(NumBytes); MBB.insert(MBBI, MI); } else { - MI=BuildMI(IA64::MOVLI32, 1, IA64::r22).addImm(NumBytes); + MI=BuildMI(IA64::MOVLIMM64, 1, IA64::r22).addImm(NumBytes); MBB.insert(MBBI, MI); MI=BuildMI(IA64::ADD, 2, 
IA64::r12).addReg(IA64::r12).addReg(IA64::r22); MBB.insert(MBBI, MI); Index: llvm/lib/Target/IA64/IA64InstrInfo.td diff -u llvm/lib/Target/IA64/IA64InstrInfo.td:1.8 llvm/lib/Target/IA64/IA64InstrInfo.td:1.9 --- llvm/lib/Target/IA64/IA64InstrInfo.td:1.8 Fri Apr 8 05:01:48 2005 +++ llvm/lib/Target/IA64/IA64InstrInfo.td Mon Apr 11 00:55:56 2005 @@ -22,15 +22,8 @@ def s14imm : Operand { let PrintMethod = "printS14ImmOperand"; } -def s16imm : Operand; -def s21imm : Operand { - let PrintMethod = "printS21ImmOperand"; -} -def u32imm : Operand { - let PrintMethod = "printU32ImmOperand"; -} -def s32imm : Operand { - let PrintMethod = "printS32ImmOperand"; +def s22imm : Operand { + let PrintMethod = "printS22ImmOperand"; } def u64imm : Operand { let PrintMethod = "printU64ImmOperand"; @@ -92,13 +85,11 @@ "($qp) cmp.eq $dst, p0 = $src3, $src4;;">; } -def MOVI32 : AForm<0x03, 0x0b, (ops GR:$dst, u32imm:$imm), +def MOVSIMM14 : AForm<0x03, 0x0b, (ops GR:$dst, s14imm:$imm), "mov $dst = $imm;;">; -def MOVLI32 : AForm<0x03, 0x0b, (ops GR:$dst, u32imm:$imm), - "movl $dst = $imm;;">; -def MOVLSI32 : AForm<0x03, 0x0b, (ops GR:$dst, s32imm:$imm), - "movl $dst = $imm;;">; -def MOVLI64 : AForm<0x03, 0x0b, (ops GR:$dst, u64imm:$imm), +def MOVSIMM22 : AForm<0x03, 0x0b, (ops GR:$dst, s22imm:$imm), + "mov $dst = $imm;;">; +def MOVLIMM64 : AForm<0x03, 0x0b, (ops GR:$dst, u64imm:$imm), "movl $dst = $imm;;">; def AND : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, GR:$src2), @@ -109,15 +100,15 @@ "xor $dst = $src1, $src2;;">; def SHL : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, GR:$src2), "shl $dst = $src1, $src2;;">; -def SHLI : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, s21imm:$imm), - "shl $dst = $src1, $imm;;">; // FIXME: 6 immediate bits, not 21 +def SHLI : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, u6imm:$imm), + "shl $dst = $src1, $imm;;">; def SHRU : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, GR:$src2), "shr.u $dst = $src1, $src2;;">; -def SHRUI : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, 
s21imm:$imm), +def SHRUI : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, u6imm:$imm), "shr.u $dst = $src1, $imm;;">; def SHRS : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, GR:$src2), "shr $dst = $src1, $src2;;">; -def SHRSI : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, s21imm:$imm), +def SHRSI : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, u6imm:$imm), "shr $dst = $src1, $imm;;">; def EXTRU : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, u6imm:$imm1, u6imm:$imm2), @@ -193,17 +184,17 @@ def ADDIMM14 : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, s14imm:$imm), "adds $dst = $imm, $src1;;">; -def ADDIMM22 : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, s21imm:$imm), +def ADDIMM22 : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, s22imm:$imm), "add $dst = $imm, $src1;;">; -def CADDIMM22 : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, s21imm:$imm, PR:$qp), +def CADDIMM22 : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, s22imm:$imm, PR:$qp), "($qp) add $dst = $imm, $src1;;">; let isTwoAddress = 1 in { def TPCADDIMM22 : AForm<0x03, 0x0b, - (ops GR:$dst, GR:$src1, s21imm:$imm, PR:$qp), + (ops GR:$dst, GR:$src1, s22imm:$imm, PR:$qp), "($qp) add $dst = $imm, $dst;;">; def TPCMPIMM8NE : AForm<0x03, 0x0b, - (ops PR:$dst, PR:$src1, s21imm:$imm, GR:$src2, PR:$qp), + (ops PR:$dst, PR:$src1, s22imm:$imm, GR:$src2, PR:$qp), "($qp) cmp.ne $dst , p0 = $imm, $src2;;">; } From natebegeman at mac.com Mon Apr 11 01:34:21 2005 From: natebegeman at mac.com (Nate Begeman) Date: Mon, 11 Apr 2005 01:34:21 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp PowerPCInstrInfo.td Message-ID: <200504110634.BAA28017@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.62 -> 1.63 PowerPCInstrInfo.td updated: 1.56 -> 1.57 --- Log message: Add recording variants of ISD::AND and ISD::OR. 
This kills almost 1000 (1.5%) instructions in 186.crafty --- Diffs of the changes: (+48 -10) PPC32ISelPattern.cpp | 53 +++++++++++++++++++++++++++++++++++++++++---------- PowerPCInstrInfo.td | 5 ++++ 2 files changed, 48 insertions(+), 10 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.62 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.63 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.62 Sun Apr 10 01:06:10 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Mon Apr 11 01:34:10 2005 @@ -473,7 +473,7 @@ } namespace { -Statistic<>Rotates("ppc-codegen", "Number of rotates emitted"); +Statistic<>Recorded("ppc-codegen", "Number of recording ops emitted"); Statistic<>FusedFP("ppc-codegen", "Number of fused fp operations"); //===--------------------------------------------------------------------===// /// ISel - PPC32 specific code to select PPC32 machine instructions for @@ -492,7 +492,7 @@ unsigned GlobalBaseReg; bool GlobalBaseInitialized; - + bool RecordSuccess; public: ISel(TargetMachine &TM) : SelectionDAGISel(PPC32Lowering), PPC32Lowering(TM), ISelDAG(0) {} @@ -526,7 +526,7 @@ unsigned getConstDouble(double floatVal, unsigned Result); bool SelectBitfieldInsert(SDOperand OR, unsigned Result); unsigned SelectSetCR0(SDOperand CC); - unsigned SelectExpr(SDOperand N); + unsigned SelectExpr(SDOperand N, bool Recording=false); unsigned SelectExprFP(SDOperand N, unsigned Result); void Select(SDOperand N); @@ -648,6 +648,17 @@ return 0; } +/// NodeHasRecordingVariant - If SelectExpr can always produce code for +/// NodeOpcode that also sets CR0 as a side effect, return true. Otherwise, +/// return false. +static bool NodeHasRecordingVariant(unsigned NodeOpcode) { + switch(NodeOpcode) { + default: return false; + case ISD::AND: + case ISD::OR: return true; + } +} + /// getBCCForSetCC - Returns the PowerPC condition branch mnemonic corresponding /// to Condition. 
If the Condition is unordered or unsigned, the bool argument /// U is set to true, otherwise it is set to false. @@ -684,6 +695,8 @@ return 0; } +/// + // Structure used to return the necessary information to codegen an SDIV as // a multiply. struct ms { @@ -927,7 +940,6 @@ // where both bitfield halves are sourced from the same value. if (IsRotate && OR.getOperand(0).getOperand(0) == OR.getOperand(1).getOperand(0)) { - ++Rotates; // Statistic Tmp1 = SelectExpr(OR.getOperand(0).getOperand(0)); BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp1).addImm(Amount) .addImm(0).addImm(31); @@ -955,13 +967,29 @@ SetCCSDNode* SetCC = dyn_cast<SetCCSDNode>(CC.Val); if (SetCC && CC.getOpcode() == ISD::SETCC) { bool U; + bool AlreadySelected = false; Opc = getBCCForSetCC(SetCC->getCondition(), U); - Tmp1 = SelectExpr(SetCC->getOperand(0)); // Pass the optional argument U to getImmediateForOpcode for SETCC, // so that it knows whether the SETCC immediate range is signed or not. if (1 == getImmediateForOpcode(SetCC->getOperand(1), ISD::SETCC, Tmp2, U)) { + // For comparisons against zero, we can implicitly set CR0 if a recording + // variant (e.g. 'or.' instead of 'or') of the instruction that defines + // operand zero of the SetCC node is available. + if (0 == Tmp2 && + NodeHasRecordingVariant(SetCC->getOperand(0).getOpcode())) { + RecordSuccess = false; + Tmp1 = SelectExpr(SetCC->getOperand(0), true); + if (RecordSuccess) { + ++Recorded; + return Opc; + } + AlreadySelected = true; + } + // If we could not implicitly set CR0, then emit a compare immediate + // instead.
+ if (!AlreadySelected) Tmp1 = SelectExpr(SetCC->getOperand(0)); if (U) BuildMI(BB, PPC::CMPLWI, 2, PPC::CR0).addReg(Tmp1).addImm(Tmp2); else @@ -969,6 +997,7 @@ } else { bool IsInteger = MVT::isInteger(SetCC->getOperand(0).getValueType()); unsigned CompareOpc = CompareOpcodes[2 * IsInteger + U]; + Tmp1 = SelectExpr(SetCC->getOperand(0)); Tmp2 = SelectExpr(SetCC->getOperand(1)); BuildMI(BB, CompareOpc, 2, PPC::CR0).addReg(Tmp1).addReg(Tmp2); } @@ -1337,7 +1366,7 @@ return 0; } -unsigned ISel::SelectExpr(SDOperand N) { +unsigned ISel::SelectExpr(SDOperand N, bool Recording) { unsigned Result; unsigned Tmp1, Tmp2, Tmp3; unsigned Opc = 0; @@ -1588,10 +1617,10 @@ Tmp1 = SelectExpr(N.getOperand(0)); switch(cast(Node)->getExtraValueType()) { default: Node->dump(); assert(0 && "Unhandled SIGN_EXTEND type"); break; - case MVT::i16: + case MVT::i16: BuildMI(BB, PPC::EXTSH, 1, Result).addReg(Tmp1); break; - case MVT::i8: + case MVT::i8: BuildMI(BB, PPC::EXTSB, 1, Result).addReg(Tmp1); break; case MVT::i1: @@ -1681,7 +1710,8 @@ default: assert(0 && "unhandled result code"); case 0: // No immediate Tmp2 = SelectExpr(N.getOperand(1)); - BuildMI(BB, PPC::AND, 2, Result).addReg(Tmp1).addReg(Tmp2); + Opc = Recording ? PPC::ANDo : PPC::AND; + BuildMI(BB, Opc, 2, Result).addReg(Tmp1).addReg(Tmp2); break; case 1: // Low immediate BuildMI(BB, PPC::ANDIo, 2, Result).addReg(Tmp1).addImm(Tmp2); @@ -1690,6 +1720,7 @@ BuildMI(BB, PPC::ANDISo, 2, Result).addReg(Tmp1).addImm(Tmp2); break; } + RecordSuccess = true; return Result; case ISD::OR: @@ -1700,7 +1731,9 @@ default: assert(0 && "unhandled result code"); case 0: // No immediate Tmp2 = SelectExpr(N.getOperand(1)); - BuildMI(BB, PPC::OR, 2, Result).addReg(Tmp1).addReg(Tmp2); + Opc = Recording ? 
PPC::ORo : PPC::OR; + RecordSuccess = true; + BuildMI(BB, Opc, 2, Result).addReg(Tmp1).addReg(Tmp2); break; case 1: // Low immediate BuildMI(BB, PPC::ORI, 2, Result).addReg(Tmp1).addImm(Tmp2); Index: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.56 llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.57 --- llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.56 Sat Apr 9 15:09:12 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Mon Apr 11 01:34:10 2005 @@ -140,12 +140,14 @@ def STWU : DForm_3<37, 0, 0, (ops GPRC:$rS, s16imm:$disp, GPRC:$rA), "stwu $rS, $disp($rA)">; } +let Defs = [CR0] in { def ANDIo : DForm_4<28, 0, 0, (ops GPRC:$dst, GPRC:$src1, u16imm:$src2), "andi. $dst, $src1, $src2">; def ANDISo : DForm_4<29, 0, 0, (ops GPRC:$dst, GPRC:$src1, u16imm:$src2), "andis. $dst, $src1, $src2">; +} def ORI : DForm_4<24, 0, 0, (ops GPRC:$dst, GPRC:$src1, u16imm:$src2), "ori $dst, $src1, $src2">; @@ -222,6 +224,9 @@ def MFCR : XForm_5<31, 19, 0, 0, (ops GPRC:$dst), "mfcr $dst">; def AND : XForm_6<31, 28, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "and $rA, $rS, $rB">; +let Defs = [CR0] in +def ANDo : XForm_6<31, 28, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), + "and. 
$rA, $rS, $rB">; def ANDC : XForm_6<31, 60, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "andc $rA, $rS, $rB">; def EQV : XForm_6<31, 284, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), From duraid at octopus.com.au Mon Apr 11 02:14:52 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Mon, 11 Apr 2005 02:14:52 -0500 Subject: [llvm-commits] CVS: llvm/include/llvm/CodeGen/MachineInstrBuilder.h Message-ID: <200504110714.CAA28189@zion.cs.uiuc.edu> Changes in directory llvm/include/llvm/CodeGen: MachineInstrBuilder.h updated: 1.26 -> 1.27 --- Log message: rename addU64Imm() to addImm64() --- Diffs of the changes: (+2 -2) MachineInstrBuilder.h | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: llvm/include/llvm/CodeGen/MachineInstrBuilder.h diff -u llvm/include/llvm/CodeGen/MachineInstrBuilder.h:1.26 llvm/include/llvm/CodeGen/MachineInstrBuilder.h:1.27 --- llvm/include/llvm/CodeGen/MachineInstrBuilder.h:1.26 Sun Apr 10 04:18:55 2005 +++ llvm/include/llvm/CodeGen/MachineInstrBuilder.h Mon Apr 11 02:14:41 2005 @@ -108,9 +108,9 @@ return *this; } - /// addU64Imm - Add a new 64-bit immediate operand... + /// addImm64 - Add a new 64-bit immediate operand... /// - const MachineInstrBuilder &addU64Imm(uint64_t Val) const { + const MachineInstrBuilder &addImm64(uint64_t Val) const { MI->addZeroExtImm64Operand(Val); return *this; } From duraid at octopus.com.au Mon Apr 11 02:16:50 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Mon, 11 Apr 2005 02:16:50 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp Message-ID: <200504110716.CAA28209@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.16 -> 1.17 --- Log message: hmm, should probably change addImm() to take 64-bit arguments one day anyway. 
--- Diffs of the changes: (+1 -1) IA64ISelPattern.cpp | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.16 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.17 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.16 Mon Apr 11 00:55:56 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Mon Apr 11 02:16:39 2005 @@ -667,7 +667,7 @@ /* otherwise, our immediate is big, so we use movl */ uint64_t Imm = immediate; - BuildMI(BB, IA64::MOVLIMM64, 1, Result).addU64Imm(Imm); + BuildMI(BB, IA64::MOVLIMM64, 1, Result).addImm64(Imm); return Result; } From lattner at cs.uiuc.edu Mon Apr 11 10:01:56 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 10:01:56 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PowerPCInstrFormats.td PowerPCInstrInfo.td Message-ID: <200504111501.j3BF1uQN030431@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PowerPCInstrFormats.td updated: 1.30 -> 1.31 PowerPCInstrInfo.td updated: 1.57 -> 1.58 --- Log message: Fix a minor bug (ORo didn't mark that it set CR0). Refactor how . instructions are handled. In particular, instead of passing the RC flag all the way up the inheritance hierarchy, just make a new tblgen class 'DOT' which can be added to an instruction definition. For example, instead of this: -def AND : XForm_6<31, 28, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), -let Defs = [CR0] in -def ANDo : XForm_6<31, 28, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), - "and. 
$rA, $rS, $rB">; We now have this: +def AND : XForm_6<31, 28, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "and $rA, $rS, $rB">; --- Diffs of the changes: (+37 -26) PowerPCInstrFormats.td | 24 ++++++++++++++++++------ PowerPCInstrInfo.td | 39 +++++++++++++++++++-------------------- 2 files changed, 37 insertions(+), 26 deletions(-) Index: llvm/lib/Target/PowerPC/PowerPCInstrFormats.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.30 llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.31 --- llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.30 Wed Nov 24 22:11:07 2004 +++ llvm/lib/Target/PowerPC/PowerPCInstrFormats.td Mon Apr 11 10:01:39 2005 @@ -10,6 +10,14 @@ // //===----------------------------------------------------------------------===// +// DOT - This is a marker that should be added to instructions that set the +// flags in CR0. +class DOT { + list Defs = [CR0]; + bit RC = 1; +} + + class Format val> { bits<5> Value = val; } @@ -217,18 +225,19 @@ // This is the same as XForm_base_r3xo, but the first two operands are swapped // when code is emitted. 
class XForm_base_r3xo_swapped - opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, + opcode, bits<10> xo, bit ppc64, bit vmx, dag OL, string asmstr> : I { bits<5> A; bits<5> RST; bits<5> B; + bit RC = 0; let Inst{6-10} = RST; let Inst{11-15} = A; let Inst{16-20} = B; let Inst{21-30} = xo; - let Inst{31} = rc; + let Inst{31} = RC; } @@ -243,9 +252,10 @@ let B = 0; } -class XForm_6 opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, +class XForm_6 opcode, bits<10> xo, bit ppc64, bit vmx, dag OL, string asmstr> - : XForm_base_r3xo_swapped; + : XForm_base_r3xo_swapped { +} class XForm_8 opcode, bits<10> xo, bit ppc64, bit vmx, dag OL, string asmstr> @@ -253,13 +263,15 @@ class XForm_10 opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, dag OL, string asmstr> - : XForm_base_r3xo_swapped { + : XForm_base_r3xo_swapped { + let RC = rc; } class XForm_11 opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, dag OL, string asmstr> - : XForm_base_r3xo_swapped { + : XForm_base_r3xo_swapped { let B = 0; + let RC = rc; } class XForm_16 opcode, bits<10> xo, bit ppc64, bit vmx, Index: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.57 llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.58 --- llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.57 Mon Apr 11 01:34:10 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Mon Apr 11 10:01:39 2005 @@ -1,4 +1,3 @@ - //===- PowerPCInstrInfo.td - The PowerPC Instruction Set -----*- tablegen -*-=// // // The LLVM Compiler Infrastructure @@ -222,38 +221,38 @@ "ldx $dst, $base, $index">; } def MFCR : XForm_5<31, 19, 0, 0, (ops GPRC:$dst), "mfcr $dst">; -def AND : XForm_6<31, 28, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def AND : XForm_6<31, 28, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "and $rA, $rS, $rB">; -let Defs = [CR0] in -def ANDo : XForm_6<31, 28, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), - "and. 
$rA, $rS, $rB">; -def ANDC : XForm_6<31, 60, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), + +def ANDo : XForm_6<31, 28, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), + "and. $rA, $rS, $rB">, DOT; +def ANDC : XForm_6<31, 60, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "andc $rA, $rS, $rB">; -def EQV : XForm_6<31, 284, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def EQV : XForm_6<31, 284, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "eqv $rA, $rS, $rB">; -def NAND : XForm_6<31, 476, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def NAND : XForm_6<31, 476, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "nand $rA, $rS, $rB">; -def NOR : XForm_6<31, 124, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def NOR : XForm_6<31, 124, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "nor $rA, $rS, $rB">; -def OR : XForm_6<31, 444, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def OR : XForm_6<31, 444, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "or $rA, $rS, $rB">; -def ORo : XForm_6<31, 444, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), - "or. $rA, $rS, $rB">; -def ORC : XForm_6<31, 412, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def ORo : XForm_6<31, 444, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), + "or. 
$rA, $rS, $rB">, DOT; +def ORC : XForm_6<31, 412, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "orc $rA, $rS, $rB">; -def SLD : XForm_6<31, 27, 0, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SLD : XForm_6<31, 27, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "sld $rA, $rS, $rB">; -def SLW : XForm_6<31, 24, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SLW : XForm_6<31, 24, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "slw $rA, $rS, $rB">; -def SRD : XForm_6<31, 539, 0, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SRD : XForm_6<31, 539, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "srd $rA, $rS, $rB">; -def SRW : XForm_6<31, 536, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SRW : XForm_6<31, 536, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "srw $rA, $rS, $rB">; -def SRAD : XForm_6<31, 794, 0, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SRAD : XForm_6<31, 794, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "srad $rA, $rS, $rB">; -def SRAW : XForm_6<31, 792, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SRAW : XForm_6<31, 792, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "sraw $rA, $rS, $rB">; -def XOR : XForm_6<31, 316, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def XOR : XForm_6<31, 316, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "xor $rA, $rS, $rB">; let isStore = 1 in { def STBX : XForm_8<31, 215, 0, 0, (ops GPRC:$rS, GPRC:$rA, GPRC:$rB), From lattner at cs.uiuc.edu Mon Apr 11 10:03:55 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 10:03:55 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PowerPCInstrFormats.td PowerPCInstrInfo.td Message-ID: <200504111503.j3BF3tJ4030464@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PowerPCInstrFormats.td updated: 1.31 -> 1.32 PowerPCInstrInfo.td updated: 1.58 -> 1.59 --- Log message: Revert the previous patch, which I didn't mean to check in. 
--- Diffs of the changes: (+26 -37) PowerPCInstrFormats.td | 24 ++++++------------------ PowerPCInstrInfo.td | 39 ++++++++++++++++++++------------------- 2 files changed, 26 insertions(+), 37 deletions(-) Index: llvm/lib/Target/PowerPC/PowerPCInstrFormats.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.31 llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.32 --- llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.31 Mon Apr 11 10:01:39 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrFormats.td Mon Apr 11 10:03:40 2005 @@ -10,14 +10,6 @@ // //===----------------------------------------------------------------------===// -// DOT - This is a marker that should be added to instructions that set the -// flags in CR0. -class DOT { - list Defs = [CR0]; - bit RC = 1; -} - - class Format val> { bits<5> Value = val; } @@ -225,19 +217,18 @@ // This is the same as XForm_base_r3xo, but the first two operands are swapped // when code is emitted. class XForm_base_r3xo_swapped - opcode, bits<10> xo, bit ppc64, bit vmx, + opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, dag OL, string asmstr> : I { bits<5> A; bits<5> RST; bits<5> B; - bit RC = 0; let Inst{6-10} = RST; let Inst{11-15} = A; let Inst{16-20} = B; let Inst{21-30} = xo; - let Inst{31} = RC; + let Inst{31} = rc; } @@ -252,10 +243,9 @@ let B = 0; } -class XForm_6 opcode, bits<10> xo, bit ppc64, bit vmx, +class XForm_6 opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, dag OL, string asmstr> - : XForm_base_r3xo_swapped { -} + : XForm_base_r3xo_swapped; class XForm_8 opcode, bits<10> xo, bit ppc64, bit vmx, dag OL, string asmstr> @@ -263,15 +253,13 @@ class XForm_10 opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, dag OL, string asmstr> - : XForm_base_r3xo_swapped { - let RC = rc; + : XForm_base_r3xo_swapped { } class XForm_11 opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, dag OL, string asmstr> - : XForm_base_r3xo_swapped { + : XForm_base_r3xo_swapped { let B = 0; - let RC = rc; } class XForm_16 opcode, bits<10> xo, bit 
ppc64, bit vmx, Index: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.58 llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.59 --- llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.58 Mon Apr 11 10:01:39 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Mon Apr 11 10:03:41 2005 @@ -1,3 +1,4 @@ + //===- PowerPCInstrInfo.td - The PowerPC Instruction Set -----*- tablegen -*-=// // // The LLVM Compiler Infrastructure @@ -221,38 +222,38 @@ "ldx $dst, $base, $index">; } def MFCR : XForm_5<31, 19, 0, 0, (ops GPRC:$dst), "mfcr $dst">; -def AND : XForm_6<31, 28, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def AND : XForm_6<31, 28, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "and $rA, $rS, $rB">; - -def ANDo : XForm_6<31, 28, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), - "and. $rA, $rS, $rB">, DOT; -def ANDC : XForm_6<31, 60, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +let Defs = [CR0] in +def ANDo : XForm_6<31, 28, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), + "and. $rA, $rS, $rB">; +def ANDC : XForm_6<31, 60, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "andc $rA, $rS, $rB">; -def EQV : XForm_6<31, 284, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def EQV : XForm_6<31, 284, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "eqv $rA, $rS, $rB">; -def NAND : XForm_6<31, 476, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def NAND : XForm_6<31, 476, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "nand $rA, $rS, $rB">; -def NOR : XForm_6<31, 124, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def NOR : XForm_6<31, 124, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "nor $rA, $rS, $rB">; -def OR : XForm_6<31, 444, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def OR : XForm_6<31, 444, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "or $rA, $rS, $rB">; -def ORo : XForm_6<31, 444, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), - "or. 
$rA, $rS, $rB">, DOT; -def ORC : XForm_6<31, 412, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def ORo : XForm_6<31, 444, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), + "or. $rA, $rS, $rB">; +def ORC : XForm_6<31, 412, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "orc $rA, $rS, $rB">; -def SLD : XForm_6<31, 27, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SLD : XForm_6<31, 27, 0, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "sld $rA, $rS, $rB">; -def SLW : XForm_6<31, 24, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SLW : XForm_6<31, 24, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "slw $rA, $rS, $rB">; -def SRD : XForm_6<31, 539, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SRD : XForm_6<31, 539, 0, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "srd $rA, $rS, $rB">; -def SRW : XForm_6<31, 536, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SRW : XForm_6<31, 536, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "srw $rA, $rS, $rB">; -def SRAD : XForm_6<31, 794, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SRAD : XForm_6<31, 794, 0, 1, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "srad $rA, $rS, $rB">; -def SRAW : XForm_6<31, 792, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def SRAW : XForm_6<31, 792, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "sraw $rA, $rS, $rB">; -def XOR : XForm_6<31, 316, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), +def XOR : XForm_6<31, 316, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "xor $rA, $rS, $rB">; let isStore = 1 in { def STBX : XForm_8<31, 215, 0, 0, (ops GPRC:$rS, GPRC:$rA, GPRC:$rB), From lattner at cs.uiuc.edu Mon Apr 11 10:04:01 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 10:04:01 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Message-ID: <200504111504.j3BF41GV030473@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PowerPCInstrInfo.td updated: 1.59 -> 1.60 --- Log message: ORo sets CR0 --- Diffs of the changes: (+1 -0) PowerPCInstrInfo.td | 1 + 1 files changed, 1 
insertion(+) Index: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.59 llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.60 --- llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.59 Mon Apr 11 10:03:41 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Mon Apr 11 10:03:48 2005 @@ -237,6 +237,7 @@ "nor $rA, $rS, $rB">; def OR : XForm_6<31, 444, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "or $rA, $rS, $rB">; +let Defs = [CR0] in def ORo : XForm_6<31, 444, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "or. $rA, $rS, $rB">; def ORC : XForm_6<31, 412, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), From lattner at cs.uiuc.edu Mon Apr 11 13:55:52 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 13:55:52 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp Message-ID: <200504111855.j3BItq6d002702@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.17 -> 1.18 --- Log message: IA64 supports this operation. 
--- Diffs of the changes: (+0 -1) IA64ISelPattern.cpp | 1 - 1 files changed, 1 deletion(-) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.17 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.18 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.17 Mon Apr 11 02:16:39 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Mon Apr 11 13:55:36 2005 @@ -62,7 +62,6 @@ setShiftAmountType(MVT::i64); setOperationAction(ISD::EXTLOAD , MVT::i1 , Promote); - setOperationAction(ISD::EXTLOAD , MVT::f32 , Promote); setOperationAction(ISD::ZEXTLOAD , MVT::i1 , Expand); setOperationAction(ISD::ZEXTLOAD , MVT::i32 , Expand); From lattner at cs.uiuc.edu Mon Apr 11 14:19:33 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 14:19:33 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/Makefile Message-ID: <200504111919.j3BJJXRQ008259@apoc.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks: Makefile updated: 1.9 -> 1.10 --- Log message: These benchmarks work for me, enable by default :) --- Diffs of the changes: (+1 -1) Makefile | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm-test/MultiSource/Benchmarks/Makefile diff -u llvm-test/MultiSource/Benchmarks/Makefile:1.9 llvm-test/MultiSource/Benchmarks/Makefile:1.10 --- llvm-test/MultiSource/Benchmarks/Makefile:1.9 Tue Jan 11 12:14:49 2005 +++ llvm-test/MultiSource/Benchmarks/Makefile Mon Apr 11 14:19:16 2005 @@ -3,6 +3,6 @@ LEVEL = ../.. 
PARALLEL_DIRS := Fhourstones McCat Olden OptimizerEval Ptrdist llubenchmark \ sim FreeBench MallocBench Prolangs-C Prolangs-C++ SciMark2-C\ - mediabench + mediabench ASCI_Purple include $(LEVEL)/Makefile.programs From lattner at cs.uiuc.edu Mon Apr 11 15:09:08 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 15:09:08 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Message-ID: <200504112009.j3BK98Nk009453@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: LegalizeDAG.cpp updated: 1.81 -> 1.82 --- Log message: Fix expansion of shifts by exactly NVT bits on arch's (like X86) that have masking shifts. This fixes the miscompilation of this: long long test1(unsigned A, unsigned B) { return ((unsigned long long)A << 32) | B; } into this: test1: movl 4(%esp), %edx movl %edx, %eax orl 8(%esp), %eax ret allowing us to generate this instead: test1: movl 4(%esp), %edx movl 8(%esp), %eax ret --- Diffs of the changes: (+10 -0) LegalizeDAG.cpp | 10 ++++++++++ 1 files changed, 10 insertions(+) Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.81 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.82 --- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.81 Sun Apr 10 22:01:51 2005 +++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Mon Apr 11 15:08:52 2005 @@ -1500,6 +1500,9 @@ } else if (Cst > NVTBits) { Lo = DAG.getConstant(0, NVT); Hi = DAG.getNode(ISD::SHL, NVT, InL, DAG.getConstant(Cst-NVTBits,ShTy)); + } else if (Cst == NVTBits) { + Lo = DAG.getConstant(0, NVT); + Hi = InL; } else { Lo = DAG.getNode(ISD::SHL, NVT, InL, DAG.getConstant(Cst, ShTy)); Hi = DAG.getNode(ISD::OR, NVT, @@ -1514,6 +1517,9 @@ } else if (Cst > NVTBits) { Lo = DAG.getNode(ISD::SRL, NVT, InH, DAG.getConstant(Cst-NVTBits,ShTy)); Hi = DAG.getConstant(0, NVT); + } else if (Cst == NVTBits) { + Lo = InH; + Hi = DAG.getConstant(0, NVT); } else { Lo = DAG.getNode(ISD::OR, NVT, 
DAG.getNode(ISD::SRL, NVT, InL, DAG.getConstant(Cst, ShTy)), @@ -1530,6 +1536,10 @@ DAG.getConstant(Cst-NVTBits, ShTy)); Hi = DAG.getNode(ISD::SRA, NVT, InH, DAG.getConstant(NVTBits-1, ShTy)); + } else if (Cst == NVTBits) { + Lo = InH; + Hi = DAG.getNode(ISD::SRA, NVT, InH, + DAG.getConstant(NVTBits-1, ShTy)); } else { Lo = DAG.getNode(ISD::OR, NVT, DAG.getNode(ISD::SRL, NVT, InL, DAG.getConstant(Cst, ShTy)), From lattner at cs.uiuc.edu Mon Apr 11 15:30:16 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 15:30:16 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Message-ID: <200504112030.j3BKUGjO010480@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: LegalizeDAG.cpp updated: 1.82 -> 1.83 --- Log message: Teach the dag mechanism that this: long long test2(unsigned A, unsigned B) { return ((unsigned long long)A << 32) + B; } is equivalent to this: long long test1(unsigned A, unsigned B) { return ((unsigned long long)A << 32) | B; } Now they are both codegen'd to this on ppc: _test2: blr or this on x86: test2: movl 4(%esp), %edx movl 8(%esp), %eax ret --- Diffs of the changes: (+21 -2) LegalizeDAG.cpp | 23 +++++++++++++++++++++-- 1 files changed, 21 insertions(+), 2 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.82 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.83 --- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.82 Mon Apr 11 15:08:52 2005 +++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Mon Apr 11 15:29:59 2005 @@ -1443,8 +1443,27 @@ ExpandOp(LHS, LHSL, LHSH); ExpandOp(RHS, RHSL, RHSH); - // Convert this add to the appropriate ADDC pair. The low part has no carry - // in. + // FIXME: this should be moved to the dag combiner someday. 
+ if (NodeOp == ISD::ADD_PARTS || NodeOp == ISD::SUB_PARTS) + if (LHSL.getValueType() == MVT::i32) { + SDOperand LowEl; + if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(LHSL)) + if (C->getValue() == 0) + LowEl = RHSL; + if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(RHSL)) + if (C->getValue() == 0) + LowEl = LHSL; + if (LowEl.Val) { + // Turn this into an add/sub of the high part only. + SDOperand HiEl = + DAG.getNode(NodeOp == ISD::ADD_PARTS ? ISD::ADD : ISD::SUB, + LowEl.getValueType(), LHSH, RHSH); + Lo = LowEl; + Hi = HiEl; + return; + } + } + std::vector<SDOperand> Ops; Ops.push_back(LHSL); Ops.push_back(LHSH); From natebegeman at mac.com Mon Apr 11 15:49:08 2005 From: natebegeman at mac.com (Nate Begeman) Date: Mon, 11 Apr 2005 15:49:08 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/README.txt Message-ID: <200504112049.PAA27493@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: README.txt updated: 1.21 -> 1.22 --- Log message: Update PPC readme. Remove things that are done or aren't ppc specific --- Diffs of the changes: (+2 -6) README.txt | 8 ++------ 1 files changed, 2 insertions(+), 6 deletions(-) Index: llvm/lib/Target/PowerPC/README.txt diff -u llvm/lib/Target/PowerPC/README.txt:1.21 llvm/lib/Target/PowerPC/README.txt:1.22 --- llvm/lib/Target/PowerPC/README.txt:1.21 Sun Nov 7 20:25:40 2004 +++ llvm/lib/Target/PowerPC/README.txt Mon Apr 11 15:48:57 2005 @@ -1,12 +1,8 @@ TODO: -* poor switch statement codegen -* load/store to alloca'd array or struct.
-* implement not-R0 register GPR class -* implement scheduling info -* implement do-loop pass +* condition register allocation +* gpr0 allocation * implement do-loop -> bdnz transform * implement powerpc-64 for darwin -* implement powerpc-64 for aix * use stfiwx in float->int * should hint to the branch select pass that it doesn't need to print the second unconditional branch, so we don't end up with things like: From lattner at cs.uiuc.edu Mon Apr 11 16:34:38 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 16:34:38 -0500 Subject: [llvm-commits] CVS: llvm-poolalloc/lib/PoolAllocate/Heuristic.cpp Message-ID: <200504112134.j3BLYcGX013302@apoc.cs.uiuc.edu> Changes in directory llvm-poolalloc/lib/PoolAllocate: Heuristic.cpp updated: 1.9 -> 1.10 --- Log message: Change this prototype to match the base class so it actually overrides the virtual method... :-/ --- Diffs of the changes: (+5 -4) Heuristic.cpp | 9 +++++---- 1 files changed, 5 insertions(+), 4 deletions(-) Index: llvm-poolalloc/lib/PoolAllocate/Heuristic.cpp diff -u llvm-poolalloc/lib/PoolAllocate/Heuristic.cpp:1.9 llvm-poolalloc/lib/PoolAllocate/Heuristic.cpp:1.10 --- llvm-poolalloc/lib/PoolAllocate/Heuristic.cpp:1.9 Wed Mar 16 16:46:45 2005 +++ llvm-poolalloc/lib/PoolAllocate/Heuristic.cpp Mon Apr 11 16:34:22 2005 @@ -421,7 +421,7 @@ ResultPools.push_back(OnePool(NodesToPA[i])); } - void HackFunctionBody(Function &F, std::map &PDs); + void HackFunctionBody(Function &F, std::map &PDs); }; /// getDynamicallyNullPool - Return a PoolDescriptor* that is always dynamically @@ -447,13 +447,14 @@ // Basically it replaces all uses of real pool descriptors with dynamically null // values. However, it leaves pool init/destroy alone.
void OnlyOverheadHeuristic::HackFunctionBody(Function &F, - std::map &PDs) { + std::map &PDs) { Function *PoolInit = PA->PoolInit; Function *PoolDestroy = PA->PoolDestroy; Value *NullPD = getDynamicallyNullPool(F.front().begin()); - for (std::map::iterator PDI = PDs.begin(), E = PDs.end(); - PDI != E; ++PDI) { + for (std::map::iterator PDI = PDs.begin(), + E = PDs.end(); PDI != E; ++PDI) { Value *OldPD = PDI->second; std::vector OldPDUsers(OldPD->use_begin(), OldPD->use_end()); for (unsigned i = 0, e = OldPDUsers.size(); i != e; ++i) { From lattner at cs.uiuc.edu Mon Apr 11 16:34:50 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 16:34:50 -0500 Subject: [llvm-commits] CVS: llvm-poolalloc/test/TEST.pacompiletime.Makefile TEST.poolalloc.Makefile Message-ID: <200504112134.j3BLYoJ2013313@apoc.cs.uiuc.edu> Changes in directory llvm-poolalloc/test: TEST.pacompiletime.Makefile updated: 1.1 -> 1.2 TEST.poolalloc.Makefile updated: 1.37 -> 1.38 --- Log message: Apparently the .so changed names --- Diffs of the changes: (+2 -2) TEST.pacompiletime.Makefile | 2 +- TEST.poolalloc.Makefile | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) Index: llvm-poolalloc/test/TEST.pacompiletime.Makefile diff -u llvm-poolalloc/test/TEST.pacompiletime.Makefile:1.1 llvm-poolalloc/test/TEST.pacompiletime.Makefile:1.2 --- llvm-poolalloc/test/TEST.pacompiletime.Makefile:1.1 Sat Apr 2 13:53:52 2005 +++ llvm-poolalloc/test/TEST.pacompiletime.Makefile Mon Apr 11 16:34:36 2005 @@ -14,7 +14,7 @@ RELDIR := $(subst $(PROGDIR),,$(CURDIR)) # Pool allocator pass shared object -PA_SO := $(PROJECT_DIR)/Debug/lib/libpoolalloc$(SHLIBEXT) +PA_SO := $(PROJECT_DIR)/Release/lib/libpoolalloc$(SHLIBEXT) # Command to run opt with the pool allocator pass loaded OPT_PA := $(LOPT) -load $(PA_SO) Index: llvm-poolalloc/test/TEST.poolalloc.Makefile diff -u llvm-poolalloc/test/TEST.poolalloc.Makefile:1.37 llvm-poolalloc/test/TEST.poolalloc.Makefile:1.38 --- 
llvm-poolalloc/test/TEST.poolalloc.Makefile:1.37 Tue Feb 8 14:24:35 2005 +++ llvm-poolalloc/test/TEST.poolalloc.Makefile Mon Apr 11 16:34:36 2005 @@ -21,7 +21,7 @@ RELDIR := $(subst $(PROGDIR),,$(CURDIR)) # Pool allocator pass shared object -PA_SO := $(PROJECT_DIR)/Debug/lib/libpoolalloc$(SHLIBEXT) +PA_SO := $(PROJECT_DIR)/Debug/lib/poolalloc$(SHLIBEXT) # Pool allocator runtime library #PA_RT := $(PROJECT_DIR)/lib/Bytecode/libpoolalloc_fl_rt.bc From lattner at cs.uiuc.edu Mon Apr 11 17:40:02 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 17:40:02 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/llubenchmark/Makefile Message-ID: <200504112240.j3BMe2VO014309@apoc.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/llubenchmark: Makefile updated: 1.4 -> 1.5 --- Log message: add large-problem-size input --- Diffs of the changes: (+4 -0) Makefile | 4 ++++ 1 files changed, 4 insertions(+) Index: llvm-test/MultiSource/Benchmarks/llubenchmark/Makefile diff -u llvm-test/MultiSource/Benchmarks/llubenchmark/Makefile:1.4 llvm-test/MultiSource/Benchmarks/llubenchmark/Makefile:1.5 --- llvm-test/MultiSource/Benchmarks/llubenchmark/Makefile:1.4 Wed Sep 1 09:33:26 2004 +++ llvm-test/MultiSource/Benchmarks/llubenchmark/Makefile Mon Apr 11 17:39:45 2005 @@ -4,6 +4,10 @@ CPPFLAGS = LDFLAGS = +ifdef LARGE_PROBLEM_SIZE +RUN_OPTIONS = -i 6000 +else RUN_OPTIONS = -i 3000 +endif include ../../Makefile.multisrc From lattner at cs.uiuc.edu Mon Apr 11 18:41:58 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 18:41:58 -0500 Subject: [llvm-commits] CVS: llvm-poolalloc/lib/PoolAllocate/PoolOptimize.cpp Message-ID: <200504112341.j3BNfwvr017185@apoc.cs.uiuc.edu> Changes in directory llvm-poolalloc/lib/PoolAllocate: PoolOptimize.cpp updated: 1.2 -> 1.3 --- Log message: fix pasteo --- Diffs of the changes: (+1 -1) PoolOptimize.cpp | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: 
llvm-poolalloc/lib/PoolAllocate/PoolOptimize.cpp diff -u llvm-poolalloc/lib/PoolAllocate/PoolOptimize.cpp:1.2 llvm-poolalloc/lib/PoolAllocate/PoolOptimize.cpp:1.3 --- llvm-poolalloc/lib/PoolAllocate/PoolOptimize.cpp:1.2 Mon Nov 15 15:05:09 2004 +++ llvm-poolalloc/lib/PoolAllocate/PoolOptimize.cpp Mon Apr 11 18:41:41 2005 @@ -138,7 +138,7 @@ } // Optimize poolmemaligns - getCallsOf(PoolFree, Calls); + getCallsOf(PoolMemAlign, Calls); for (unsigned i = 0, e = Calls.size(); i != e; ++i) { CallInst *CI = Calls[i]; // poolmemalign(null, X, Y) -> memalign(X, Y) From lattner at cs.uiuc.edu Mon Apr 11 18:56:56 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 18:56:56 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/MallocBench/cfrac/Makefile Message-ID: <200504112356.j3BNuuTv017516@apoc.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/MallocBench/cfrac: Makefile updated: 1.2 -> 1.3 --- Log message: add a larger input --- Diffs of the changes: (+4 -0) Makefile | 4 ++++ 1 files changed, 4 insertions(+) Index: llvm-test/MultiSource/Benchmarks/MallocBench/cfrac/Makefile diff -u llvm-test/MultiSource/Benchmarks/MallocBench/cfrac/Makefile:1.2 llvm-test/MultiSource/Benchmarks/MallocBench/cfrac/Makefile:1.3 --- llvm-test/MultiSource/Benchmarks/MallocBench/cfrac/Makefile:1.2 Wed Sep 1 09:33:23 2004 +++ llvm-test/MultiSource/Benchmarks/MallocBench/cfrac/Makefile Mon Apr 11 18:56:42 2005 @@ -11,5 +11,9 @@ pfloat.c pidiv.c pimod.c picmp.c \ primes.c pcfrac.c pgcd.c +ifdef LARGE_PROBLEM_SIZE +RUN_OPTIONS = 376529507064485971626234556089714053987 +else RUN_OPTIONS = 41757646344123832613190542166099121 +endif include ../../../Makefile.multisrc From lattner at cs.uiuc.edu Mon Apr 11 19:04:17 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 19:04:17 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/MallocBench/espresso/Makefile Message-ID: <200504120004.j3C04HRG018318@apoc.cs.uiuc.edu> 
Changes in directory llvm-test/MultiSource/Benchmarks/MallocBench/espresso: Makefile updated: 1.4 -> 1.5 --- Log message: switch from taking 0.05s to ~1.5s on apoc --- Diffs of the changes: (+1 -1) Makefile | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm-test/MultiSource/Benchmarks/MallocBench/espresso/Makefile diff -u llvm-test/MultiSource/Benchmarks/MallocBench/espresso/Makefile:1.4 llvm-test/MultiSource/Benchmarks/MallocBench/espresso/Makefile:1.5 --- llvm-test/MultiSource/Benchmarks/MallocBench/espresso/Makefile:1.4 Sat Jan 15 21:16:10 2005 +++ llvm-test/MultiSource/Benchmarks/MallocBench/espresso/Makefile Mon Apr 11 19:04:04 2005 @@ -1,5 +1,5 @@ LEVEL = ../../../.. PROG = espresso CPPFLAGS += -DNOMEMOPT -RUN_OPTIONS = -t $(PROJ_SRC_DIR)/INPUT/mlp4.espresso +RUN_OPTIONS = -t $(PROJ_SRC_DIR)/INPUT/largest.espresso include ../../../Makefile.multisrc From natebegeman at mac.com Mon Apr 11 19:10:13 2005 From: natebegeman at mac.com (Nate Begeman) Date: Mon, 11 Apr 2005 19:10:13 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp PowerPCInstrInfo.td Message-ID: <200504120010.TAA28778@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.63 -> 1.64 PowerPCInstrInfo.td updated: 1.60 -> 1.61 --- Log message: Implement bitfield clears Implement divide by negative power of two --- Diffs of the changes: (+35 -11) PPC32ISelPattern.cpp | 42 +++++++++++++++++++++++++++++++----------- PowerPCInstrInfo.td | 4 ++++ 2 files changed, 35 insertions(+), 11 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.63 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.64 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.63 Mon Apr 11 01:34:10 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Mon Apr 11 19:10:02 2005 @@ -604,12 +604,13 @@ /// getImmediateForOpcode - This method returns a value indicating whether /// the ConstantSDNode N 
can be used as an immediate to Opcode. The return /// values are either 0, 1 or 2. 0 indicates that either N is not a -/// ConstantSDNode, or is not suitable for use by that opcode. A return value -/// of 1 indicates that the constant may be used in normal immediate form. A -/// return value of 2 indicates that the constant may be used in shifted -/// immediate form. A return value of 3 indicates that log base 2 of the -/// constant may be used. A return value of 4 indicates that the constant is -/// suitable for conversion into a magic number for integer division. +/// ConstantSDNode, or is not suitable for use by that opcode. +/// Return value codes for turning into an enum someday: +/// 1: constant may be used in normal immediate form. +/// 2: constant may be used in shifted immediate form. +/// 3: log base 2 of the constant may be used. +/// 4: constant is suitable for integer division conversion +/// 5: constant is a bitfield mask /// static unsigned getImmediateForOpcode(SDOperand N, unsigned Opcode, unsigned& Imm, bool U = false) { @@ -623,7 +624,13 @@ if (v <= 32767 && v >= -32768) { Imm = v & 0xFFFF; return 1; } if ((v & 0x0000FFFF) == 0) { Imm = v >> 16; return 2; } break; - case ISD::AND: + case ISD::AND: { + unsigned MB, ME; + if (IsRunOfOnes(v, MB, ME)) { Imm = MB << 16 | ME & 0xFFFF; return 5; } + if (v >= 0 && v <= 65535) { Imm = v & 0xFFFF; return 1; } + if ((v & 0x0000FFFF) == 0) { Imm = v >> 16; return 2; } + break; + } case ISD::XOR: case ISD::OR: if (v >= 0 && v <= 65535) { Imm = v & 0xFFFF; return 1; } @@ -639,6 +646,7 @@ break; case ISD::SDIV: if ((Imm = ExactLog2(v))) { return 3; } + if ((Imm = ExactLog2(-v))) { Imm = -Imm; return 3; } if (v <= -2 || v >= 2) { return 4; } break; case ISD::UDIV: @@ -695,8 +703,6 @@ return 0; } -/// - // Structure used to return the necessary information to codegen an SDIV as // a multiply. 
struct ms { @@ -1719,6 +1725,13 @@ case 2: // Shifted immediate BuildMI(BB, PPC::ANDISo, 2, Result).addReg(Tmp1).addImm(Tmp2); break; + case 5: // Bitfield mask + Opc = Recording ? PPC::RLWINMo : PPC::RLWINM; + Tmp3 = Tmp2 >> 16; // MB + Tmp2 &= 0xFFFF; // ME + BuildMI(BB, Opc, 4, Result).addReg(Tmp1).addImm(0) + .addImm(Tmp3).addImm(Tmp2); + break; } RecordSuccess = true; return Result; @@ -1828,8 +1841,15 @@ case 3: Tmp1 = MakeReg(MVT::i32); Tmp2 = SelectExpr(N.getOperand(0)); - BuildMI(BB, PPC::SRAWI, 2, Tmp1).addReg(Tmp2).addImm(Tmp3); - BuildMI(BB, PPC::ADDZE, 1, Result).addReg(Tmp1); + if ((int)Tmp3 < 0) { + unsigned Tmp4 = MakeReg(MVT::i32); + BuildMI(BB, PPC::SRAWI, 2, Tmp1).addReg(Tmp2).addImm(-Tmp3); + BuildMI(BB, PPC::ADDZE, 1, Tmp4).addReg(Tmp1); + BuildMI(BB, PPC::NEG, 1, Result).addReg(Tmp4); + } else { + BuildMI(BB, PPC::SRAWI, 2, Tmp1).addReg(Tmp2).addImm(Tmp3); + BuildMI(BB, PPC::ADDZE, 1, Result).addReg(Tmp1); + } return Result; // If this is a divide by constant, we can emit code using some magic // constants to implement it as a multiply instead. Index: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.60 llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.61 --- llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.60 Mon Apr 11 10:03:48 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Mon Apr 11 19:10:02 2005 @@ -464,6 +464,10 @@ def RLWINM : MForm_2<21, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, u5imm:$SH, u5imm:$MB, u5imm:$ME), "rlwinm $rA, $rS, $SH, $MB, $ME">; +let Defs = [CR0] in +def RLWINMo : MForm_2<21, 1, 0, 0, + (ops GPRC:$rA, GPRC:$rS, u5imm:$SH, u5imm:$MB, u5imm:$ME), + "rlwinm. 
$rA, $rS, $SH, $MB, $ME">; def RLWNM : MForm_2<23, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB, u5imm:$MB, u5imm:$ME), "rlwnm $rA, $rS, $rB, $MB, $ME">; From lattner at cs.uiuc.edu Mon Apr 11 19:10:16 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 19:10:16 -0500 Subject: [llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/MallocBench/gawk/gawk.h Message-ID: <200504120010.j3C0AGG7018950@apoc.cs.uiuc.edu> Changes in directory llvm-test/MultiSource/Benchmarks/MallocBench/gawk: gawk.h updated: 1.1 -> 1.2 --- Log message: improve the bm, though it still doesn't work --- Diffs of the changes: (+0 -12) gawk.h | 12 ------------ 1 files changed, 12 deletions(-) Index: llvm-test/MultiSource/Benchmarks/MallocBench/gawk/gawk.h diff -u llvm-test/MultiSource/Benchmarks/MallocBench/gawk/gawk.h:1.1 llvm-test/MultiSource/Benchmarks/MallocBench/gawk/gawk.h:1.2 --- llvm-test/MultiSource/Benchmarks/MallocBench/gawk/gawk.h:1.1 Mon Feb 23 11:05:56 2004 +++ llvm-test/MultiSource/Benchmarks/MallocBench/gawk/gawk.h Mon Apr 11 19:10:00 2005 @@ -39,21 +39,9 @@ /* nasty nasty SunOS-ism */ #ifdef sparc #include -#ifdef lint -extern char *alloca(); -#endif #else extern char *alloca(); #endif -#ifdef SPRINTF_INT -extern int sprintf(); -#else /* not USG */ -/* nasty nasty berkelixm */ -#define setjmp _setjmp -#define longjmp _longjmp - -extern int sprintf(); -#endif /* * if you don't have vprintf, but you are BSD, the version defined in * vprintf.c should do the trick. Otherwise, use this and cross your fingers. From lattner at cs.uiuc.edu Mon Apr 11 19:29:05 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 19:29:05 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504120029.j3C0T5CQ022601@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.71 -> 1.72 --- Log message: canonicalize x <u 1 -> x == 0.
On this testcase: unsigned long long g; unsigned long foo (unsigned long a) { return (a >= g) ? 1 : 0; } It changes the ppc code from: _foo: .LBB_foo_0: ; entry mflr r11 stw r11, 8(r1) bl "L00000$pb" "L00000$pb": mflr r2 addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb") lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2) lwz r4, 0(r2) lwz r2, 4(r2) cmplw cr0, r3, r2 li r2, 1 li r3, 0 bge .LBB_foo_2 ; entry .LBB_foo_1: ; entry or r2, r3, r3 .LBB_foo_2: ; entry cmplwi cr0, r4, 1 li r3, 1 li r5, 0 blt .LBB_foo_4 ; entry .LBB_foo_3: ; entry or r3, r5, r5 .LBB_foo_4: ; entry cmpwi cr0, r4, 0 beq .LBB_foo_6 ; entry .LBB_foo_5: ; entry or r2, r3, r3 .LBB_foo_6: ; entry rlwinm r3, r2, 0, 31, 31 lwz r11, 8(r1) mtlr r11 blr to: _foo: .LBB_foo_0: ; entry mflr r11 stw r11, 8(r1) bl "L00000$pb" "L00000$pb": mflr r2 addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb") lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2) lwz r4, 0(r2) lwz r2, 4(r2) cmplw cr0, r3, r2 li r2, 1 li r3, 0 bge .LBB_foo_2 ; entry .LBB_foo_1: ; entry or r2, r3, r3 .LBB_foo_2: ; entry cntlzw r3, r4 srwi r3, r3, 5 cmpwi cr0, r4, 0 beq .LBB_foo_4 ; entry .LBB_foo_3: ; entry or r2, r3, r3 .LBB_foo_4: ; entry rlwinm r3, r2, 0, 31, 31 lwz r11, 8(r1) mtlr r11 blr --- Diffs of the changes: (+9 -0) SelectionDAG.cpp | 9 +++++++++ 1 files changed, 9 insertions(+) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.71 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.72 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.71 Sun Apr 10 18:37:16 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Mon Apr 11 19:28:49 2005 @@ -429,6 +429,15 @@ N2 = getConstant(C2, N2.getValueType()); N2C = cast(N2.Val); } + + // If we have setult X, 1, turn it into seteq X, 0 + if ((Cond == ISD::SETLT || Cond == ISD::SETULT) && C2 == MinVal+1) + return getSetCC(ISD::SETEQ, VT, N1, + getConstant(MinVal, N1.getValueType())); + // If we have setult X, 1, turn it into seteq X, 0 + else if 
((Cond == ISD::SETGT || Cond == ISD::SETUGT) && C2 == MaxVal-1) + return getSetCC(ISD::SETEQ, VT, N1, + getConstant(MaxVal, N1.getValueType())); // If we have "setcc X, C1", check to see if we can shrink the immediate // by changing cc. From lattner at cs.uiuc.edu Mon Apr 11 20:46:22 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 20:46:22 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Message-ID: <200504120146.j3C1kMks024130@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: LegalizeDAG.cpp updated: 1.83 -> 1.84 --- Log message: Emit long comparison against -1 better. Instead of this (x86): test2: movl 8(%esp), %eax notl %eax movl 4(%esp), %ecx notl %ecx orl %eax, %ecx cmpl $0, %ecx sete %al movzbl %al, %eax ret or this (PPC): _test2: nor r2, r4, r4 nor r3, r3, r3 or r2, r2, r3 cntlzw r2, r2 srwi r3, r2, 5 blr Emit this: test2: movl 8(%esp), %eax andl 4(%esp), %eax cmpl $-1, %eax sete %al movzbl %al, %eax ret or this: _test2: .LBB_test2_0: ; and r2, r4, r3 cmpwi cr0, r2, -1 li r3, 1 li r2, 0 beq .LBB_test2_2 ; .LBB_test2_1: ; or r3, r2, r2 .LBB_test2_2: ; blr it seems like the PPC isel could do better for R32 == -1 case. --- Diffs of the changes: (+10 -0) LegalizeDAG.cpp | 10 ++++++++++ 1 files changed, 10 insertions(+) Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.83 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.84 --- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.83 Mon Apr 11 15:29:59 2005 +++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Mon Apr 11 20:46:05 2005 @@ -750,6 +750,16 @@ switch (cast(Node)->getCondition()) { case ISD::SETEQ: case ISD::SETNE: + if (RHSLo == RHSHi) + if (ConstantSDNode *RHSCST = dyn_cast<ConstantSDNode>(RHSLo)) + if (RHSCST->isAllOnesValue()) { + // Comparison to -1.
+ Tmp1 = DAG.getNode(ISD::AND, LHSLo.getValueType(), LHSLo, LHSHi); + Result = DAG.getSetCC(cast(Node)->getCondition(), + Node->getValueType(0), Tmp1, RHSLo); + break; + } + Tmp1 = DAG.getNode(ISD::XOR, LHSLo.getValueType(), LHSLo, RHSLo); Tmp2 = DAG.getNode(ISD::XOR, LHSLo.getValueType(), LHSHi, RHSHi); Tmp1 = DAG.getNode(ISD::OR, Tmp1.getValueType(), Tmp1, Tmp2); From lattner at cs.uiuc.edu Mon Apr 11 21:19:25 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 21:19:25 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Message-ID: <200504120219.j3C2JP5U024783@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: LegalizeDAG.cpp updated: 1.84 -> 1.85 --- Log message: Emit comparisons against the sign bit better. Codegen this: bool %test1(long %X) { %A = setlt long %X, 0 ret bool %A } like this: test1: cmpl $0, 8(%esp) setl %al movzbl %al, %eax ret instead of: test1: movl 8(%esp), %ecx cmpl $0, %ecx setl %al movzbw %al, %ax cmpl $0, 4(%esp) setb %dl movzbw %dl, %dx cmpl $0, %ecx cmove %dx, %ax movzbl %al, %eax ret --- Diffs of the changes: (+10 -0) LegalizeDAG.cpp | 10 ++++++++++ 1 files changed, 10 insertions(+) Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.84 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.85 --- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.84 Mon Apr 11 20:46:05 2005 +++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Mon Apr 11 21:19:10 2005 @@ -768,6 +768,16 @@ DAG.getConstant(0, Tmp1.getValueType())); break; default: + // If this is a comparison of the sign bit, just look at the top part. 
+ // X > -1, x < 0 + if (ConstantSDNode *CST = dyn_cast<ConstantSDNode>(Node->getOperand(1))) + if ((cast(Node)->getCondition() == ISD::SETLT && + CST->getValue() == 0) || // X < 0 + (cast(Node)->getCondition() == ISD::SETGT && + (CST->isAllOnesValue()))) // X > -1 + return DAG.getSetCC(cast(Node)->getCondition(), + Node->getValueType(0), LHSHi, RHSHi); + // FIXME: This generated code sucks. ISD::CondCode LowCC; switch (cast(Node)->getCondition()) { From lattner at cs.uiuc.edu Mon Apr 11 21:54:55 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 21:54:55 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504120254.j3C2staU025638@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.72 -> 1.73 --- Log message: Remove some redundant checks, add a couple of new ones. This allows us to compile this: int foo (unsigned long a, unsigned long long g) { return a >= g; } To: foo: movl 8(%esp), %eax cmpl %eax, 4(%esp) setae %al cmpl $0, 12(%esp) sete %cl andb %al, %cl movzbl %cl, %eax ret instead of: foo: movl 8(%esp), %eax cmpl %eax, 4(%esp) setae %al movzbw %al, %cx movl 12(%esp), %edx cmpl $0, %edx sete %al movzbw %al, %ax cmpl $0, %edx cmove %cx, %ax movzbl %al, %eax ret --- Diffs of the changes: (+5 -7) SelectionDAG.cpp | 12 +++++------- 1 files changed, 5 insertions(+), 7 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.72 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.73 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.72 Mon Apr 11 19:28:49 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Mon Apr 11 21:54:39 2005 @@ -960,12 +960,6 @@ if (VT == MVT::i1) { // Boolean SELECT if (N2C) { - if (N3C) { - if (N2C->getValue()) // select C, 1, 0 -> C - return N1; - return getNode(ISD::XOR, VT, N1, N3); // select C, 0, 1 -> ~C - } - if (N2C->getValue()) // select C, 1, X -> C | X return
getNode(ISD::OR, VT, N1, N3); else // select C, 0, X -> ~C & X @@ -980,6 +974,11 @@ else // select C, X, 0 -> C & X return getNode(ISD::AND, VT, N1, N2); } + + if (N1 == N2) // X ? X : Y --> X ? 1 : Y --> X | Y + return getNode(ISD::OR, VT, N1, N3); + if (N1 == N3) // X ? Y : X --> X ? Y : 0 --> X & Y + return getNode(ISD::AND, VT, N1, N2); } // If this is a selectcc, check to see if we can simplify the result. @@ -1001,7 +1000,6 @@ N2.getOperand(0) == N3) return getNode(ISD::FABS, VT, N3); } - } break; case ISD::BRCOND: From lattner at cs.uiuc.edu Mon Apr 11 22:59:43 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Mon, 11 Apr 2005 22:59:43 -0500 Subject: [llvm-commits] CVS: llvm/lib/Analysis/DataStructure/DataStructure.cpp Message-ID: <200504120359.j3C3xhG5028084@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Analysis/DataStructure: DataStructure.cpp updated: 1.237 -> 1.238 --- Log message: Fix a crash analyzing MultiSource/Benchmarks/MallocBench/gs --- Diffs of the changes: (+1 -2) DataStructure.cpp | 3 +-- 1 files changed, 1 insertion(+), 2 deletions(-) Index: llvm/lib/Analysis/DataStructure/DataStructure.cpp diff -u llvm/lib/Analysis/DataStructure/DataStructure.cpp:1.237 llvm/lib/Analysis/DataStructure/DataStructure.cpp:1.238 --- llvm/lib/Analysis/DataStructure/DataStructure.cpp:1.237 Sat Apr 2 13:17:17 2005 +++ llvm/lib/Analysis/DataStructure/DataStructure.cpp Mon Apr 11 22:59:27 2005 @@ -119,10 +119,9 @@ // DSNode copy constructor... do not copy over the referrers list! 
DSNode::DSNode(const DSNode &N, DSGraph *G, bool NullLinks) : NumReferrers(0), Size(N.Size), ParentGraph(G), - Ty(N.Ty), NodeType(N.NodeType) { + Ty(N.Ty), Globals(N.Globals), NodeType(N.NodeType) { if (!NullLinks) { Links = N.Links; - Globals = N.Globals; } else Links.resize(N.Links.size()); // Create the appropriate number of null links G->addNode(this); From natebegeman at mac.com Tue Apr 12 02:04:27 2005 From: natebegeman at mac.com (Nate Begeman) Date: Tue, 12 Apr 2005 02:04:27 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32InstrInfo.cpp PPC32RegisterInfo.cpp PPC32RegisterInfo.td PowerPCInstrFormats.td PowerPCInstrInfo.td Message-ID: <200504120704.CAA30616@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32InstrInfo.cpp updated: 1.2 -> 1.3 PPC32RegisterInfo.cpp updated: 1.9 -> 1.10 PPC32RegisterInfo.td updated: 1.3 -> 1.4 PowerPCInstrFormats.td updated: 1.32 -> 1.33 PowerPCInstrInfo.td updated: 1.61 -> 1.62 --- Log message: Initial support for allocation condition registers --- Diffs of the changes: (+67 -13) PPC32InstrInfo.cpp | 8 ++++++++ PPC32RegisterInfo.cpp | 15 ++++++++++++++- PPC32RegisterInfo.td | 2 +- PowerPCInstrFormats.td | 48 +++++++++++++++++++++++++++++++++++++++--------- PowerPCInstrInfo.td | 7 +++++-- 5 files changed, 67 insertions(+), 13 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32InstrInfo.cpp diff -u llvm/lib/Target/PowerPC/PPC32InstrInfo.cpp:1.2 llvm/lib/Target/PowerPC/PPC32InstrInfo.cpp:1.3 --- llvm/lib/Target/PowerPC/PPC32InstrInfo.cpp:1.2 Thu Oct 7 17:26:12 2004 +++ llvm/lib/Target/PowerPC/PPC32InstrInfo.cpp Tue Apr 12 02:04:16 2005 @@ -65,6 +65,14 @@ sourceReg = MI.getOperand(1).getReg(); destReg = MI.getOperand(0).getReg(); return true; + } else if (oc == PPC::MCRF) { // mcrf cr1, cr2 + assert(MI.getNumOperands() == 2 && + MI.getOperand(0).isRegister() && + MI.getOperand(1).isRegister() && + "invalid PPC MCRF instruction"); + sourceReg = MI.getOperand(1).getReg(); + destReg = 
MI.getOperand(0).getReg(); + return true; } return false; } Index: llvm/lib/Target/PowerPC/PPC32RegisterInfo.cpp diff -u llvm/lib/Target/PowerPC/PPC32RegisterInfo.cpp:1.9 llvm/lib/Target/PowerPC/PPC32RegisterInfo.cpp:1.10 --- llvm/lib/Target/PowerPC/PPC32RegisterInfo.cpp:1.9 Sat Apr 9 22:59:42 2005 +++ llvm/lib/Target/PowerPC/PPC32RegisterInfo.cpp Tue Apr 12 02:04:16 2005 @@ -69,6 +69,11 @@ case 4: return 3; case 8: return 4; } + } else if (RC == PPC32::CRRCRegisterClass) { + switch (RC->getSize()) { + default: assert(0 && "Invalid data size!"); + case 4: return 2; + } } std::cerr << "Invalid register class to getIdx()!\n"; abort(); @@ -85,6 +90,9 @@ if (SrcReg == PPC::LR) { BuildMI(MBB, MI, PPC::MFLR, 1, PPC::R11).addReg(PPC::LR); addFrameReference(BuildMI(MBB, MI, OC, 3).addReg(PPC::R11),FrameIdx); + } else if (PPC32::CRRCRegisterClass == getClass(SrcReg)) { + BuildMI(MBB, MI, PPC::MFCR, 0, PPC::R11); + addFrameReference(BuildMI(MBB, MI, OC, 3).addReg(PPC::R11),FrameIdx); } else { addFrameReference(BuildMI(MBB, MI, OC, 3).addReg(SrcReg),FrameIdx); } @@ -101,6 +109,9 @@ if (DestReg == PPC::LR) { addFrameReference(BuildMI(MBB, MI, OC, 2, PPC::R11), FrameIdx); BuildMI(MBB, MI, PPC::MTLR, 1).addReg(PPC::R11); + } else if (PPC32::CRRCRegisterClass == getClass(DestReg)) { + addFrameReference(BuildMI(MBB, MI, OC, 2, PPC::R11), FrameIdx); + BuildMI(MBB, MI, PPC::MTCRF, 1, DestReg).addReg(PPC::R11); } else { addFrameReference(BuildMI(MBB, MI, OC, 2, DestReg), FrameIdx); } @@ -116,7 +127,9 @@ BuildMI(MBB, MI, PPC::OR, 2, DestReg).addReg(SrcReg).addReg(SrcReg); } else if (RC == PPC32::FPRCRegisterClass) { BuildMI(MBB, MI, PPC::FMR, 1, DestReg).addReg(SrcReg); - } else { + } else if (RC == PPC32::CRRCRegisterClass) { + BuildMI(MBB, MI, PPC::MCRF, 1, DestReg).addReg(SrcReg); + } else { std::cerr << "Attempt to copy register that is not GPR or FPR"; abort(); } Index: llvm/lib/Target/PowerPC/PPC32RegisterInfo.td diff -u llvm/lib/Target/PowerPC/PPC32RegisterInfo.td:1.3 
llvm/lib/Target/PowerPC/PPC32RegisterInfo.td:1.4 --- llvm/lib/Target/PowerPC/PPC32RegisterInfo.td:1.3 Sat Aug 21 15:14:40 2004 +++ llvm/lib/Target/PowerPC/PPC32RegisterInfo.td Tue Apr 12 02:04:16 2005 @@ -37,4 +37,4 @@ F8, F9, F10, F11, F12, F13, F14, F15, F16, F17, F18, F19, F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30, F31]>; -def CRRC : RegisterClass; +def CRRC : RegisterClass; Index: llvm/lib/Target/PowerPC/PowerPCInstrFormats.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.32 llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.33 --- llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.32 Mon Apr 11 10:03:40 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrFormats.td Tue Apr 12 02:04:16 2005 @@ -236,13 +236,6 @@ dag OL, string asmstr> : XForm_base_r3xo; -class XForm_5 opcode, bits<10> xo, bit ppc64, bit vmx, - dag OL, string asmstr> - : XForm_base_r3xo { - let A = 0; - let B = 0; -} - class XForm_6 opcode, bits<10> xo, bit rc, bit ppc64, bit vmx, dag OL, string asmstr> : XForm_base_r3xo_swapped; @@ -343,13 +336,27 @@ let BH = 0; } +class XLForm_3 opcode, bits<10> xo, bit ppc64, bit vmx, + dag OL, string asmstr> : I { + bits<3> BF; + bits<3> BFA; + + let Inst{6-8} = BF; + let Inst{9-10} = 0; + let Inst{11-13} = BFA; + let Inst{14-15} = 0; + let Inst{16-20} = 0; + let Inst{21-30} = xo; + let Inst{31} = 0; +} + // 1.7.8 XFX-Form class XFXForm_1 opcode, bits<10> xo, bit ppc64, bit vmx, dag OL, string asmstr> : I { - bits<5> ST; + bits<5> RT; bits<10> SPR; - let Inst{6-10} = ST; + let Inst{6-10} = RT; let Inst{11-20} = SPR; let Inst{21-30} = xo; let Inst{31} = 0; @@ -361,6 +368,29 @@ let SPR = spr; } +class XFXForm_3 opcode, bits<10> xo, bit ppc64, bit vmx, + dag OL, string asmstr> : I { + bits<5> RT; + + let Inst{6-10} = RT; + let Inst{11-20} = 0; + let Inst{21-30} = xo; + let Inst{31} = 0; +} + +class XFXForm_5 opcode, bits<10> xo, bit ppc64, bit vmx, + dag OL, string asmstr> : I { + bits<8> FXM; + bits<5> ST; + + let Inst{6-10} = ST; + let Inst{11} = 
0; + let Inst{12-19} = FXM; + let Inst{20} = 0; + let Inst{21-30} = xo; + let Inst{31} = 0; +} + class XFXForm_7 opcode, bits<10> xo, bit ppc64, bit vmx, dag OL, string asmstr> : XFXForm_1; Index: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.61 llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.62 --- llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.61 Mon Apr 11 19:10:02 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Tue Apr 12 02:04:16 2005 @@ -221,7 +221,6 @@ def LDX : XForm_1<31, 21, 1, 0, (ops GPRC:$dst, GPRC:$base, GPRC:$index), "ldx $dst, $base, $index">; } -def MFCR : XForm_5<31, 19, 0, 0, (ops GPRC:$dst), "mfcr $dst">; def AND : XForm_6<31, 28, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB), "and $rA, $rS, $rB">; let Defs = [CR0] in @@ -341,6 +340,8 @@ "crnor $D, $A, $B">; def CROR : XLForm_1<19, 449, 0, 0, (ops u5imm:$D, u5imm:$A, u5imm:$B), "cror $D, $A, $B">; +def MCRF : XLForm_3<19, 0, 0, 0, (ops CRRC:$BF, CRRC:$BFA), + "mfcr $BF, $BFA">; // XFX-Form instructions. Instructions that deal with SPRs // @@ -349,10 +350,12 @@ // which means the SPR value needs to be multiplied by a factor of 32. def MFCTR : XFXForm_1_ext<31, 339, 288, 0, 0, (ops GPRC:$rT), "mfctr $rT">; def MFLR : XFXForm_1_ext<31, 339, 256, 0, 0, (ops GPRC:$rT), "mflr $rT">; +def MFCR : XFXForm_3<31, 19, 0, 0, (ops GPRC:$rT), "mfcr $rT">; +def MTCRF : XFXForm_5<31, 144, 0, 0, (ops CRRC:$FXM, GPRC:$rS), + "mtcrf $FXM, $rS">; def MTCTR : XFXForm_7_ext<31, 467, 288, 0, 0, (ops GPRC:$rS), "mtctr $rS">; def MTLR : XFXForm_7_ext<31, 467, 256, 0, 0, (ops GPRC:$rS), "mtlr $rS">; - // XS-Form instructions. 
Just 'sradi' // def SRADI : XSForm_1<31, 413, 0, 1, 0, (ops GPRC:$rA, GPRC:$rS, u6imm:$SH), From duraid at octopus.com.au Tue Apr 12 09:54:55 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 12 Apr 2005 09:54:55 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp IA64RegisterInfo.td Message-ID: <200504121454.JAA04662@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.18 -> 1.19 IA64RegisterInfo.td updated: 1.3 -> 1.4 --- Log message: stop emitting IDEFs for args - change to using liveIn/liveOut --- Diffs of the changes: (+31 -4) IA64ISelPattern.cpp | 29 +++++++++++++++++++++++++++-- IA64RegisterInfo.td | 6 ++++-- 2 files changed, 31 insertions(+), 4 deletions(-) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.18 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.19 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.18 Mon Apr 11 13:55:36 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Tue Apr 12 09:54:44 2005 @@ -183,7 +183,8 @@ // fixme? (well, will need to for weird FP structy stuff, // see intel ABI docs) case MVT::f64: - BuildMI(&BB, IA64::IDEF, 0, args_FP[used_FPArgs]); +//XXX BuildMI(&BB, IA64::IDEF, 0, args_FP[used_FPArgs]); + MF.addLiveIn(args_FP[used_FPArgs]); // mark this reg as liveIn // floating point args go into f8..f15 as-needed, the increment argVreg[count] = // is below..: MF.getSSARegMap()->createVirtualRegister(getRegClassFor(MVT::f64)); @@ -199,7 +200,8 @@ case MVT::i16: case MVT::i32: case MVT::i64: - BuildMI(&BB, IA64::IDEF, 0, args_int[count]); +//XXX BuildMI(&BB, IA64::IDEF, 0, args_int[count]); + MF.addLiveIn(args_int[count]); // mark this register as liveIn argVreg[count] = MF.getSSARegMap()->createVirtualRegister(getRegClassFor(MVT::i64)); argPreg[count] = args_int[count]; @@ -271,6 +273,24 @@ } } + // Finally, inform the code generator which regs we return values in. 
+ // (see the ISD::RET: case down below) + switch (getValueType(F.getReturnType())) { + default: assert(0 && "i have no idea where to return this type!"); + case MVT::isVoid: break; + case MVT::i1: + case MVT::i8: + case MVT::i16: + case MVT::i32: + case MVT::i64: + MF.addLiveOut(IA64::r8); + break; + case MVT::f32: + case MVT::f64: + MF.addLiveOut(IA64::F8); + break; + } + return ArgValues; } @@ -1769,10 +1789,15 @@ default: assert(0 && "All other types should have been promoted!!"); // FIXME: do I need to add support for bools here? // (return '0' or '1' r8, basically...) + // + // FIXME: need to round floats - 80 bits is bad, the tester + // told me so case MVT::i64: + // we mark r8 as live on exit up above in LowerArguments() BuildMI(BB, IA64::MOV, 1, IA64::r8).addReg(Tmp1); break; case MVT::f64: + // we mark F8 as live on exit up above in LowerArguments() BuildMI(BB, IA64::FMOV, 1, IA64::F8).addReg(Tmp1); } break; Index: llvm/lib/Target/IA64/IA64RegisterInfo.td diff -u llvm/lib/Target/IA64/IA64RegisterInfo.td:1.3 llvm/lib/Target/IA64/IA64RegisterInfo.td:1.4 --- llvm/lib/Target/IA64/IA64RegisterInfo.td:1.3 Sat Apr 9 06:53:00 2005 +++ llvm/lib/Target/IA64/IA64RegisterInfo.td Tue Apr 12 09:54:44 2005 @@ -249,12 +249,14 @@ r104, r105, r106, r107, r108, r109, r110, r111, r112, r113, r114, r115, r116, r117, r118, r119, r120, r121, r122, r123, r124, r125, r126, r127, - r0, r1, r2, r12, r13, r15, r22]> // these last 7 are special (look down) + r0, r1, r2, r12, r13, r15, r22, + out0, out1, out2, out3, + out4, out5, out6, out7]> // these last 15 are special (look down) { let Methods = [{ iterator allocation_order_end(MachineFunction &MF) const { - int numReservedRegs=7; // the 7 special registers r0,r1,r2,r12,r13 etc + int numReservedRegs=15; // the 15 special registers r0,r1,r2,r12,r13 etc // we also can't allocate registers for use as locals if they're // already required as 'out' registers numReservedRegs+=MF.getInfo()->outRegsUsed; From lattner at cs.uiuc.edu Tue 
Apr 12 10:13:05 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 10:13:05 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64RegisterInfo.td Message-ID: <200504121513.j3CFD5nK006014@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64RegisterInfo.td updated: 1.4 -> 1.5 --- Log message: Put out* into the allocation order, allowing the register allocator to coallesce moves into outgoing args. --- Diffs of the changes: (+3 -4) IA64RegisterInfo.td | 7 +++---- 1 files changed, 3 insertions(+), 4 deletions(-) Index: llvm/lib/Target/IA64/IA64RegisterInfo.td diff -u llvm/lib/Target/IA64/IA64RegisterInfo.td:1.4 llvm/lib/Target/IA64/IA64RegisterInfo.td:1.5 --- llvm/lib/Target/IA64/IA64RegisterInfo.td:1.4 Tue Apr 12 09:54:44 2005 +++ llvm/lib/Target/IA64/IA64RegisterInfo.td Tue Apr 12 10:12:51 2005 @@ -249,14 +249,13 @@ r104, r105, r106, r107, r108, r109, r110, r111, r112, r113, r114, r115, r116, r117, r118, r119, r120, r121, r122, r123, r124, r125, r126, r127, - r0, r1, r2, r12, r13, r15, r22, out0, out1, out2, out3, - out4, out5, out6, out7]> // these last 15 are special (look down) - + out4, out5, out6, out7, + r0, r1, r2, r12, r13, r15, r22]> // these last 7 are special (look down) { let Methods = [{ iterator allocation_order_end(MachineFunction &MF) const { - int numReservedRegs=15; // the 15 special registers r0,r1,r2,r12,r13 etc + int numReservedRegs=7; // the 15 special registers r0,r1,r2,r12,r13 etc // we also can't allocate registers for use as locals if they're // already required as 'out' registers numReservedRegs+=MF.getInfo()->outRegsUsed; From lattner at cs.uiuc.edu Tue Apr 12 10:13:05 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 10:13:05 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64InstrInfo.td Message-ID: <200504121513.j3CFD5hI006015@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64InstrInfo.td updated: 1.9 -> 1.10 --- Log message: Make sure to realize 
that calls use their argument regs --- Diffs of the changes: (+1 -0) IA64InstrInfo.td | 1 + 1 files changed, 1 insertion(+) Index: llvm/lib/Target/IA64/IA64InstrInfo.td diff -u llvm/lib/Target/IA64/IA64InstrInfo.td:1.9 llvm/lib/Target/IA64/IA64InstrInfo.td:1.10 --- llvm/lib/Target/IA64/IA64InstrInfo.td:1.9 Mon Apr 11 00:55:56 2005 +++ llvm/lib/Target/IA64/IA64InstrInfo.td Tue Apr 12 10:12:19 2005 @@ -310,6 +310,7 @@ } let isCall = 1, isTerminator = 1, isBranch = 1, + Uses = [out0,out1,out2,out3,out4,out5,out6,out7], // all calls clobber non-callee-saved registers, and for now, they are these: Defs = [r2,r3,r8,r9,r10,r11,r14,r15,r16,r17,r18,r19,r20,r21,r22,r23,r24, r25,r26,r27,r28,r29,r30,r31, From alenhar2 at cs.uiuc.edu Tue Apr 12 12:35:32 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Tue, 12 Apr 2005 12:35:32 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaISelPattern.cpp Message-ID: <200504121735.MAA02866@niobe.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaISelPattern.cpp updated: 1.91 -> 1.92 --- Log message: Get rid of idefs for arguments --- Diffs of the changes: (+28 -5) AlphaISelPattern.cpp | 33 ++++++++++++++++++++++++++++----- 1 files changed, 28 insertions(+), 5 deletions(-) Index: llvm/lib/Target/Alpha/AlphaISelPattern.cpp diff -u llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.91 llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.92 --- llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.91 Fri Apr 8 22:22:22 2005 +++ llvm/lib/Target/Alpha/AlphaISelPattern.cpp Tue Apr 12 12:35:16 2005 @@ -170,20 +170,25 @@ for (int i = 0; i < 6; ++i) { if (F.isVarArg()) { - BuildMI(&BB, Alpha::IDEF, 0, args_int[i]); - BuildMI(&BB, Alpha::IDEF, 0, args_float[i]); + F->addLiveIn(args_int[i]); + F->addLiveIn(args_float[i]); +// BuildMI(&BB, Alpha::IDEF, 0, args_int[i]); +// BuildMI(&BB, Alpha::IDEF, 0, args_float[i]); } else if (I != E) { if(MVT::isInteger(getValueType(I->getType()))) - BuildMI(&BB, Alpha::IDEF, 0, args_int[i]); + 
F->addLiveIn(args_int[i]); +// BuildMI(&BB, Alpha::IDEF, 0, args_int[i]); else - BuildMI(&BB, Alpha::IDEF, 0, args_float[i]); + F->addLiveIn(args_float[i]); +// BuildMI(&BB, Alpha::IDEF, 0, args_float[i]); ++I; } } } - BuildMI(&BB, Alpha::IDEF, 0, Alpha::R29); + F->addLiveIn(Alpha::R29); +// BuildMI(&BB, Alpha::IDEF, 0, Alpha::R29); BuildMI(&BB, Alpha::BIS, 2, GP).addReg(Alpha::R29).addReg(Alpha::R29); for (Function::arg_iterator I = F.arg_begin(), E = F.arg_end(); I != E; ++I) @@ -257,6 +262,24 @@ //Set up a token factor with all the stack traffic DAG.setRoot(DAG.getNode(ISD::TokenFactor, MVT::Other, LS)); + + // Finally, inform the code generator which regs we return values in. + switch (getValueType(F.getReturnType())) { + default: assert(0 && "Unknown type!"); + case MVT::isVoid: break; + case MVT::i1: + case MVT::i8: + case MVT::i16: + case MVT::i32: + case MVT::i64: + MF.addLiveOut(Alpha::R0); + break; + case MVT::f32: + case MVT::f64: + MF.addLiveOut(Alpha::F0); + break; + } + //return the arguments return ArgValues; } From alenhar2 at cs.uiuc.edu Tue Apr 12 12:48:13 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Tue, 12 Apr 2005 12:48:13 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaISelPattern.cpp Message-ID: <200504121748.MAA02893@niobe.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaISelPattern.cpp updated: 1.92 -> 1.93 --- Log message: Get rid of idefs for arguments (oops) --- Diffs of the changes: (+5 -5) AlphaISelPattern.cpp | 10 +++++----- 1 files changed, 5 insertions(+), 5 deletions(-) Index: llvm/lib/Target/Alpha/AlphaISelPattern.cpp diff -u llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.92 llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.93 --- llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.92 Tue Apr 12 12:35:16 2005 +++ llvm/lib/Target/Alpha/AlphaISelPattern.cpp Tue Apr 12 12:47:57 2005 @@ -170,24 +170,24 @@ for (int i = 0; i < 6; ++i) { if (F.isVarArg()) { - F->addLiveIn(args_int[i]); - 
F->addLiveIn(args_float[i]); + MF.addLiveIn(args_int[i]); + MF.addLiveIn(args_float[i]); // BuildMI(&BB, Alpha::IDEF, 0, args_int[i]); // BuildMI(&BB, Alpha::IDEF, 0, args_float[i]); } else if (I != E) { if(MVT::isInteger(getValueType(I->getType()))) - F->addLiveIn(args_int[i]); + MF.addLiveIn(args_int[i]); // BuildMI(&BB, Alpha::IDEF, 0, args_int[i]); else - F->addLiveIn(args_float[i]); + MF.addLiveIn(args_float[i]); // BuildMI(&BB, Alpha::IDEF, 0, args_float[i]); ++I; } } } - F->addLiveIn(Alpha::R29); + MF.addLiveIn(Alpha::R29); // BuildMI(&BB, Alpha::IDEF, 0, Alpha::R29); BuildMI(&BB, Alpha::BIS, 2, GP).addReg(Alpha::R29).addReg(Alpha::R29); From duraid at octopus.com.au Tue Apr 12 13:43:10 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 12 Apr 2005 13:43:10 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64RegisterInfo.td README Message-ID: <200504121843.NAA06892@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64RegisterInfo.td updated: 1.5 -> 1.6 README updated: 1.3 -> 1.4 --- Log message: * OK, after changing to use liveIn/liveOut instead of IDEFs, to avoid redundant mov out3=r44 type instructions, we need to tell the register allocator the truth about out? registers. FIXME: unfortunately, since the list of allocatable registers is immutable, we can't simply 'delete r127' from the allocation order, say, if 'out0' is used. The only correct thing we can do is have a linear order of regs: out7, out6 ... out2, out1, out0, r32, r33, r34 ... r126, r127 and slide a 'window' of 96 registers along this line, depending on how many of the out? regs a function actually uses. The only downside of this is that the out? registers will be allocated _first_, which makes the resulting assembly ugly. :( Note this in the README. Hope this gets fixed soon. 
:) (note the 3rd person speech there) --- Diffs of the changes: (+18 -5) IA64RegisterInfo.td | 21 ++++++++++++++++----- README | 2 ++ 2 files changed, 18 insertions(+), 5 deletions(-) Index: llvm/lib/Target/IA64/IA64RegisterInfo.td diff -u llvm/lib/Target/IA64/IA64RegisterInfo.td:1.5 llvm/lib/Target/IA64/IA64RegisterInfo.td:1.6 --- llvm/lib/Target/IA64/IA64RegisterInfo.td:1.5 Tue Apr 12 10:12:51 2005 +++ llvm/lib/Target/IA64/IA64RegisterInfo.td Tue Apr 12 13:42:59 2005 @@ -234,7 +234,13 @@ // in IA64RegisterInfo.cpp def GR : RegisterClass // these last 7 are special (look down) + r0, r1, r2, r12, r13, r15, r22]> // the last 15 are special (look down) { let Methods = [{ + + iterator allocation_order_begin(MachineFunction &MF) const { + // hide registers appropriately: + return begin()+(8-(MF.getInfo()->outRegsUsed)); + } + iterator allocation_order_end(MachineFunction &MF) const { - int numReservedRegs=7; // the 15 special registers r0,r1,r2,r12,r13 etc + int numReservedRegs=7; // the 7 special registers r0,r1,r2,r12,r13 etc + // we also can't allocate registers for use as locals if they're // already required as 'out' registers numReservedRegs+=MF.getInfo()->outRegsUsed; Index: llvm/lib/Target/IA64/README diff -u llvm/lib/Target/IA64/README:1.3 llvm/lib/Target/IA64/README:1.4 --- llvm/lib/Target/IA64/README:1.3 Thu Mar 31 06:31:11 2005 +++ llvm/lib/Target/IA64/README Tue Apr 12 13:42:59 2005 @@ -55,6 +55,8 @@ TODO: - clean up and thoroughly test the isel patterns. + - fix stacked register allocation order: (for readability) we don't want + the out? registers being the first ones used - fix up floating point (nb http://gcc.gnu.org/wiki?pagename=ia64%20floating%20point ) - bundling! 
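
The "sliding window" described in the log message above can be modelled outside of TableGen. What follows is only a minimal sketch, not the backend's code: buildAllocationOrder, Window and allocatableWindow are hypothetical names made up for illustration, the register order is taken from the log message (out7 ... out0, r32 ... r127, then the 7 reserved registers), and the end of the window is assumed to be end() minus numReservedRegs, since the return statement is not visible in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative model only: these names are hypothetical and are not part of
// the IA64 backend. The order follows the log message: out7..out0, then
// r32..r127, then the 7 reserved registers that are never allocated.
std::vector<std::string> buildAllocationOrder() {
  std::vector<std::string> order;
  for (int i = 7; i >= 0; --i)
    order.push_back("out" + std::to_string(i));
  for (int r = 32; r <= 127; ++r)
    order.push_back("r" + std::to_string(r));
  const char *reserved[] = {"r0", "r1", "r2", "r12", "r13", "r15", "r22"};
  for (const char *name : reserved)
    order.push_back(name);
  return order;
}

// Half-open index range [begin, end) into the allocation order.
struct Window {
  std::size_t begin;
  std::size_t end;
};

// Mirrors the arithmetic in the diff: begin() + (8 - outRegsUsed) for the
// start, and (assumed) end() - (7 + outRegsUsed) for the end.
Window allocatableWindow(std::size_t orderSize, unsigned outRegsUsed) {
  assert(outRegsUsed <= 8 && "this model has only out0..out7");
  const unsigned numReservedRegs = 7;
  Window w;
  w.begin = 8 - outRegsUsed;
  w.end = orderSize - numReservedRegs - outRegsUsed;
  return w;
}
```

Under these assumptions the window always spans 96 entries. With no out registers in use it covers r32 ... r127; with two in use it starts at out1 and stops at r125, which is why the out? registers end up being handed out first.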
From lattner at cs.uiuc.edu Tue Apr 12 13:51:53 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 13:51:53 -0500 Subject: [llvm-commits] CVS: llvm/lib/Transforms/Utils/SimplifyCFG.cpp Message-ID: <200504121851.j3CIprD6013045@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Transforms/Utils: SimplifyCFG.cpp updated: 1.66 -> 1.67 --- Log message: Get rid of this for_each loop --- Diffs of the changes: (+2 -2) SimplifyCFG.cpp | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: llvm/lib/Transforms/Utils/SimplifyCFG.cpp diff -u llvm/lib/Transforms/Utils/SimplifyCFG.cpp:1.66 llvm/lib/Transforms/Utils/SimplifyCFG.cpp:1.67 --- llvm/lib/Transforms/Utils/SimplifyCFG.cpp:1.66 Sun Feb 27 00:17:04 2005 +++ llvm/lib/Transforms/Utils/SimplifyCFG.cpp Tue Apr 12 13:51:33 2005 @@ -837,8 +837,8 @@ // Loop through all of our successors and make sure they know that one // of their predecessors is going away. - for_each(succ_begin(BB), succ_end(BB), - std::bind2nd(std::mem_fun(&BasicBlock::removePredecessor), BB)); + for (succ_iterator SI = succ_begin(BB), E = succ_end(BB); SI != E; ++SI) + SI->removePredecessor(BB); while (!BB->empty()) { Instruction &I = BB->back(); From lattner at cs.uiuc.edu Tue Apr 12 13:52:08 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 13:52:08 -0500 Subject: [llvm-commits] CVS: llvm/include/llvm/BasicBlock.h Message-ID: <200504121852.j3CIq8G6013593@apoc.cs.uiuc.edu> Changes in directory llvm/include/llvm: BasicBlock.h updated: 1.51 -> 1.52 --- Log message: add an argument. --- Diffs of the changes: (+1 -1) BasicBlock.h | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/include/llvm/BasicBlock.h diff -u llvm/include/llvm/BasicBlock.h:1.51 llvm/include/llvm/BasicBlock.h:1.52 --- llvm/include/llvm/BasicBlock.h:1.51 Mon Mar 14 23:17:48 2005 +++ llvm/include/llvm/BasicBlock.h Tue Apr 12 13:51:53 2005 @@ -155,7 +155,7 @@ /// update the PHI nodes that reside in the block. 
Note that this should be /// called while the predecessor still refers to this block. /// - void removePredecessor(BasicBlock *Pred); + void removePredecessor(BasicBlock *Pred, bool DontDeleteUselessPHIs = false); /// splitBasicBlock - This splits a basic block into two at the specified /// instruction. Note that all instructions BEFORE the specified iterator From lattner at cs.uiuc.edu Tue Apr 12 13:52:28 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 13:52:28 -0500 Subject: [llvm-commits] CVS: llvm/lib/VMCore/BasicBlock.cpp Message-ID: <200504121852.j3CIqS4W013614@apoc.cs.uiuc.edu> Changes in directory llvm/lib/VMCore: BasicBlock.cpp updated: 1.59 -> 1.60 --- Log message: add an argument to allow avoiding deleting phi nodes. --- Diffs of the changes: (+9 -6) BasicBlock.cpp | 15 +++++++++------ 1 files changed, 9 insertions(+), 6 deletions(-) Index: llvm/lib/VMCore/BasicBlock.cpp diff -u llvm/lib/VMCore/BasicBlock.cpp:1.59 llvm/lib/VMCore/BasicBlock.cpp:1.60 --- llvm/lib/VMCore/BasicBlock.cpp:1.59 Sat Mar 5 13:51:49 2005 +++ llvm/lib/VMCore/BasicBlock.cpp Tue Apr 12 13:52:14 2005 @@ -12,7 +12,7 @@ //===----------------------------------------------------------------------===// #include "llvm/BasicBlock.h" -#include "llvm/Constant.h" +#include "llvm/Constants.h" #include "llvm/Instructions.h" #include "llvm/Type.h" #include "llvm/Support/CFG.h" @@ -134,7 +134,8 @@ // update the PHI nodes that reside in the block. Note that this should be // called while the predecessor still refers to this block. // -void BasicBlock::removePredecessor(BasicBlock *Pred) { +void BasicBlock::removePredecessor(BasicBlock *Pred, + bool DontDeleteUselessPHIs) { assert((hasNUsesOrMore(16)||// Reduce cost of this assertion for complex CFGs. 
         find(pred_begin(this), pred_end(this), Pred) != pred_end(this)) &&
         "removePredecessor: BB is not a predecessor!");
@@ -164,10 +165,12 @@
     if (this == Other) max_idx = 3;
   }
 
-  if (max_idx <= 2) { // <= Two predecessors BEFORE I remove one?
+  // <= Two predecessors BEFORE I remove one?
+  if (max_idx <= 2 && !DontDeleteUselessPHIs) {
     // Yup, loop through and nuke the PHI nodes
     while (PHINode *PN = dyn_cast<PHINode>(&front())) {
-      PN->removeIncomingValue(Pred); // Remove the predecessor first...
+      // Remove the predecessor first.
+      PN->removeIncomingValue(Pred, !DontDeleteUselessPHIs);
 
       // If the PHI _HAD_ two uses, replace PHI node with its now *single* value
       if (max_idx == 2) {
@@ -175,7 +178,7 @@
           PN->replaceAllUsesWith(PN->getOperand(0));
         else
           // We are left with an infinite loop with no entries: kill the PHI.
-          PN->replaceAllUsesWith(Constant::getNullValue(PN->getType()));
+          PN->replaceAllUsesWith(UndefValue::get(PN->getType()));
         getInstList().pop_front(); // Remove the PHI node
       }
 
@@ -187,7 +190,7 @@
     // PHI nodes.
Iterate over each PHI node fixing them up
     PHINode *PN;
     for (iterator II = begin(); (PN = dyn_cast<PHINode>(II)); ++II)
-      PN->removeIncomingValue(Pred);
+      PN->removeIncomingValue(Pred, false);
   }
 }
 

From lattner at cs.uiuc.edu Tue Apr 12 14:08:34 2005
From: lattner at cs.uiuc.edu (Chris Lattner)
Date: Tue, 12 Apr 2005 14:08:34 -0500
Subject: [llvm-commits] CVS: llvm-poolalloc/lib/PoolAllocate/TransformFunctionBody.cpp
Message-ID: <200504121908.j3CJ8YgE028887@apoc.cs.uiuc.edu>

Changes in directory llvm-poolalloc/lib/PoolAllocate:

TransformFunctionBody.cpp updated: 1.43 -> 1.44
---
Log message:

Fix problems handling invoke instructions that prevented the pool allocator
from working on 176.gcc

---
Diffs of the changes: (+18 -0)

 TransformFunctionBody.cpp | 18 ++++++++++++++++++
 1 files changed, 18 insertions(+)

Index: llvm-poolalloc/lib/PoolAllocate/TransformFunctionBody.cpp
diff -u llvm-poolalloc/lib/PoolAllocate/TransformFunctionBody.cpp:1.43 llvm-poolalloc/lib/PoolAllocate/TransformFunctionBody.cpp:1.44
--- llvm-poolalloc/lib/PoolAllocate/TransformFunctionBody.cpp:1.43 Sat Apr 2 14:10:57 2005
+++ llvm-poolalloc/lib/PoolAllocate/TransformFunctionBody.cpp Tue Apr 12 14:08:18 2005
@@ -172,6 +172,12 @@
     UpdateNewToOldValueMap(I, V, V != Casted ? Casted : 0);
   }
 
+  // If this was an invoke, fix up the CFG.
+  if (InvokeInst *II = dyn_cast<InvokeInst>(I)) {
+    new BranchInst(II->getNormalDest(), I);
+    II->getUnwindDest()->removePredecessor(II->getParent(), true);
+  }
+
   // Remove old allocation instruction.
   I->eraseFromParent();
   return Casted;
@@ -305,6 +311,12 @@
     UpdateNewToOldValueMap(I, V, V != Casted ? Casted : 0);
   }
 
+  // If this was an invoke, fix up the CFG.
+  if (InvokeInst *II = dyn_cast<InvokeInst>(I)) {
+    new BranchInst(II->getNormalDest(), I);
+    II->getUnwindDest()->removePredecessor(II->getParent(), true);
+  }
+
   // Remove old allocation instruction.
   I->eraseFromParent();
 }
@@ -370,6 +382,12 @@
     UpdateNewToOldValueMap(I, V, V != Casted ? Casted : 0);
   }
 
+  // If this was an invoke, fix up the CFG.
+  if (InvokeInst *II = dyn_cast<InvokeInst>(I)) {
+    new BranchInst(II->getNormalDest(), I);
+    II->getUnwindDest()->removePredecessor(II->getParent(), true);
+  }
+
   // Remove old allocation instruction.
   I->eraseFromParent();
 }

From lattner at cs.uiuc.edu Tue Apr 12 14:30:34 2005
From: lattner at cs.uiuc.edu (Chris Lattner)
Date: Tue, 12 Apr 2005 14:30:34 -0500
Subject: [llvm-commits] CVS: llvm-test/External/SPEC/CINT2000/176.gcc/Makefile
Message-ID: <200504121930.j3CJUY2M029535@apoc.cs.uiuc.edu>

Changes in directory llvm-test/External/SPEC/CINT2000/176.gcc:

Makefile updated: 1.12 -> 1.13
---
Log message:

fix large problem size

---
Diffs of the changes: (+8 -3)

 Makefile | 11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

Index: llvm-test/External/SPEC/CINT2000/176.gcc/Makefile
diff -u llvm-test/External/SPEC/CINT2000/176.gcc/Makefile:1.12 llvm-test/External/SPEC/CINT2000/176.gcc/Makefile:1.13
--- llvm-test/External/SPEC/CINT2000/176.gcc/Makefile:1.12 Mon Apr 4 14:59:52 2005
+++ llvm-test/External/SPEC/CINT2000/176.gcc/Makefile Tue Apr 12 14:30:17 2005
@@ -1,14 +1,19 @@
 LEVEL = ../../../..
+include ../../Makefile.spec2000 + ifeq ($(RUN_TYPE),test) RUN_OPTIONS = cccp.i -o - -quiet STDOUT_FILENAME = cccp.s -else +endif +ifeq ($(RUN_TYPE),train) RUN_OPTIONS = cp-decl.i -o - -quiet STDOUT_FILENAME = cp-decl.s endif - -include ../../Makefile.spec2000 +ifeq ($(RUN_TYPE),ref) +RUN_OPTIONS = 200.i -o - -quiet +STDOUT_FILENAME = 200.s +endif ifeq ($(ENDIAN),big) CPPFLAGS += -DHOST_WORDS_BIG_ENDIAN From lattner at cs.uiuc.edu Tue Apr 12 15:30:26 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 15:30:26 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Message-ID: <200504122030.j3CKUQqc030876@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: LegalizeDAG.cpp updated: 1.85 -> 1.86 --- Log message: promote extload i1 -> extload i8 --- Diffs of the changes: (+10 -2) LegalizeDAG.cpp | 12 ++++++++++-- 1 files changed, 10 insertions(+), 2 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.85 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.86 --- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.85 Mon Apr 11 21:19:10 2005 +++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Tue Apr 12 15:30:10 2005 @@ -448,8 +448,17 @@ MVT::ValueType SrcVT = cast(Node)->getExtraValueType(); switch (TLI.getOperationAction(Node->getOpcode(), SrcVT)) { - case TargetLowering::Promote: default: assert(0 && "This action is not supported yet!"); + case TargetLowering::Promote: + assert(SrcVT == MVT::i1 && "Can only promote EXTLOAD from i1 -> i8!"); + Result = DAG.getNode(Node->getOpcode(), Node->getValueType(0), + Tmp1, Tmp2, MVT::i8); + // Since loads produce two values, make sure to remember that we legalized + // both of them. 
+ AddLegalizedOperand(SDOperand(Node, 0), Result); + AddLegalizedOperand(SDOperand(Node, 1), Result.getValue(1)); + return Result.getValue(Op.ResNo); + case TargetLowering::Legal: if (Tmp1 != Node->getOperand(0) || Tmp2 != Node->getOperand(1)) @@ -463,7 +472,6 @@ AddLegalizedOperand(SDOperand(Node, 0), Result); AddLegalizedOperand(SDOperand(Node, 1), Result.getValue(1)); return Result.getValue(Op.ResNo); - break; case TargetLowering::Expand: assert(Node->getOpcode() != ISD::EXTLOAD && "EXTLOAD should always be supported!"); From lattner at cs.uiuc.edu Tue Apr 12 15:54:41 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 15:54:41 -0500 Subject: [llvm-commits] CVS: llvm-test/External/SPEC/CINT2000/197.parser.hacked/Makefile Message-ID: <200504122054.j3CKsf18031449@apoc.cs.uiuc.edu> Changes in directory llvm-test/External/SPEC/CINT2000/197.parser.hacked: Makefile updated: 1.5 -> 1.6 --- Log message: allow this to pass with l_p_s on apoc --- Diffs of the changes: (+6 -0) Makefile | 6 ++++++ 1 files changed, 6 insertions(+) Index: llvm-test/External/SPEC/CINT2000/197.parser.hacked/Makefile diff -u llvm-test/External/SPEC/CINT2000/197.parser.hacked/Makefile:1.5 llvm-test/External/SPEC/CINT2000/197.parser.hacked/Makefile:1.6 --- llvm-test/External/SPEC/CINT2000/197.parser.hacked/Makefile:1.5 Wed Nov 3 18:26:23 2004 +++ llvm-test/External/SPEC/CINT2000/197.parser.hacked/Makefile Tue Apr 12 15:54:25 2005 @@ -4,6 +4,11 @@ STDOUT_FILENAME = $(RUN_TYPE).out CPPFLAGS = +ifdef LARGE_PROBLEM_SIZE +RUNTIMELIMIT := 1000 +endif + + SPEC_BENCH_DIR := /home/vadve/shared/benchmarks/speccpu2000/benchspec/CINT2000/197.parser/ Source = $(addprefix $(SPEC_BENCH_DIR)/src/, \ @@ -13,3 +18,4 @@ xa.c include ../../Makefile.spec2000 + From natebegeman at mac.com Tue Apr 12 16:22:39 2005 From: natebegeman at mac.com (Nate Begeman) Date: Tue, 12 Apr 2005 16:22:39 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Message-ID: 
<200504122122.QAA08072@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.64 -> 1.65 --- Log message: Implement setcc op, -1 sequences Remove dead setcc op, 0 sequences Coming later: generalization of op, imm --- Diffs of the changes: (+41 -22) PPC32ISelPattern.cpp | 63 +++++++++++++++++++++++++++++++++------------------ 1 files changed, 41 insertions(+), 22 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.64 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.65 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.64 Mon Apr 11 19:10:02 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Tue Apr 12 16:22:28 2005 @@ -663,7 +663,8 @@ switch(NodeOpcode) { default: return false; case ISD::AND: - case ISD::OR: return true; + case ISD::OR: + case ISD::ZERO_EXTEND_INREG: return true; } } @@ -965,6 +966,7 @@ unsigned ISel::SelectSetCR0(SDOperand CC) { unsigned Opc, Tmp1, Tmp2; + bool AlreadySelected = false; static const unsigned CompareOpcodes[] = { PPC::FCMPU, PPC::FCMPU, PPC::CMPW, PPC::CMPLW }; @@ -973,7 +975,6 @@ SetCCSDNode* SetCC = dyn_cast(CC.Val); if (SetCC && CC.getOpcode() == ISD::SETCC) { bool U; - bool AlreadySelected = false; Opc = getBCCForSetCC(SetCC->getCondition(), U); // Pass the optional argument U to getImmediateForOpcode for SETCC, @@ -984,7 +985,8 @@ // variant (e.g. 'or.' instead of 'or') of the instruction that defines // operand zero of the SetCC node is available. 
if (0 == Tmp2 && - NodeHasRecordingVariant(SetCC->getOperand(0).getOpcode())) { + NodeHasRecordingVariant(SetCC->getOperand(0).getOpcode()) && + SetCC->getOperand(0).Val->hasOneUse()) { RecordSuccess = false; Tmp1 = SelectExpr(SetCC->getOperand(0), true); if (RecordSuccess) { @@ -1008,9 +1010,9 @@ BuildMI(BB, CompareOpc, 2, PPC::CR0).addReg(Tmp1).addReg(Tmp2); } } else { + Opc = PPC::BNE; Tmp1 = SelectExpr(CC); BuildMI(BB, PPC::CMPLWI, 2, PPC::CR0).addReg(Tmp1).addImm(0); - Opc = PPC::BNE; } return Opc; } @@ -1643,8 +1645,9 @@ case MVT::i8: Tmp2 = 24; break; case MVT::i1: Tmp2 = 31; break; } - BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp1).addImm(0).addImm(Tmp2) - .addImm(31); + Opc = Recording ? PPC::RLWINMo : PPC::RLWINM; + RecordSuccess = true; + BuildMI(BB, Opc, 4, Result).addReg(Tmp1).addImm(0).addImm(Tmp2).addImm(31); return Result; case ISD::CopyFromReg: @@ -2017,54 +2020,70 @@ case ISD::SETCC: if (SetCCSDNode *SetCC = dyn_cast(Node)) { - // We can codegen setcc op, 0 very efficiently compared to a conditional - // branch. Check for that here. if (ConstantSDNode *CN = dyn_cast(SetCC->getOperand(1).Val)) { + // We can codegen setcc op, imm very efficiently compared to a brcond. + // Check for those cases here. 
+ // setcc op, 0 if (CN->getValue() == 0) { Tmp1 = SelectExpr(SetCC->getOperand(0)); switch (SetCC->getCondition()) { default: assert(0 && "Unhandled SetCC condition"); abort(); case ISD::SETEQ: - case ISD::SETULE: Tmp2 = MakeReg(MVT::i32); BuildMI(BB, PPC::CNTLZW, 1, Tmp2).addReg(Tmp1); BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp2).addImm(27) .addImm(5).addImm(31); break; case ISD::SETNE: - case ISD::SETUGT: Tmp2 = MakeReg(MVT::i32); BuildMI(BB, PPC::ADDIC, 2, Tmp2).addReg(Tmp1).addSImm(-1); BuildMI(BB, PPC::SUBFE, 2, Result).addReg(Tmp2).addReg(Tmp1); break; - case ISD::SETULT: - BuildMI(BB, PPC::LI, 1, Result).addSImm(0); - break; case ISD::SETLT: BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp1).addImm(1) .addImm(31).addImm(31); break; - case ISD::SETLE: + case ISD::SETGT: Tmp2 = MakeReg(MVT::i32); Tmp3 = MakeReg(MVT::i32); BuildMI(BB, PPC::NEG, 2, Tmp2).addReg(Tmp1); - BuildMI(BB, PPC::ORC, 2, Tmp3).addReg(Tmp1).addReg(Tmp2); + BuildMI(BB, PPC::ANDC, 2, Tmp3).addReg(Tmp2).addReg(Tmp1); BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp3).addImm(1) .addImm(31).addImm(31); break; - case ISD::SETGT: + } + return Result; + } + // setcc op, -1 + if (CN->isAllOnesValue()) { + Tmp1 = SelectExpr(SetCC->getOperand(0)); + switch (SetCC->getCondition()) { + default: assert(0 && "Unhandled SetCC condition"); abort(); + case ISD::SETEQ: Tmp2 = MakeReg(MVT::i32); Tmp3 = MakeReg(MVT::i32); - BuildMI(BB, PPC::NEG, 2, Tmp2).addReg(Tmp1); - BuildMI(BB, PPC::ANDC, 2, Tmp3).addReg(Tmp2).addReg(Tmp1); + BuildMI(BB, PPC::ADDIC, 2, Tmp2).addReg(Tmp1).addSImm(1); + BuildMI(BB, PPC::LI, 1, Tmp3).addSImm(0); + BuildMI(BB, PPC::ADDZE, 1, Result).addReg(Tmp3); + break; + case ISD::SETNE: + Tmp2 = MakeReg(MVT::i32); + Tmp3 = MakeReg(MVT::i32); + BuildMI(BB, PPC::NOR, 2, Tmp2).addReg(Tmp1).addReg(Tmp1); + BuildMI(BB, PPC::ADDIC, 2, Tmp3).addReg(Tmp2).addSImm(-1); + BuildMI(BB, PPC::SUBFE, 2, Result).addReg(Tmp3).addReg(Tmp2); + break; + case ISD::SETLT: + Tmp2 = MakeReg(MVT::i32); + Tmp3 = 
MakeReg(MVT::i32); + BuildMI(BB, PPC::ADDI, 2, Tmp2).addReg(Tmp1).addSImm(1); + BuildMI(BB, PPC::AND, 2, Tmp3).addReg(Tmp2).addReg(Tmp1); BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp3).addImm(1) .addImm(31).addImm(31); break; - case ISD::SETUGE: - BuildMI(BB, PPC::LI, 1, Result).addSImm(1); - break; - case ISD::SETGE: + case ISD::SETGT: + Tmp2 = MakeReg(MVT::i32); BuildMI(BB, PPC::RLWINM, 4, Tmp2).addReg(Tmp1).addImm(1) .addImm(31).addImm(31); BuildMI(BB, PPC::XORI, 2, Result).addReg(Tmp2).addImm(1); From natebegeman at mac.com Tue Apr 12 18:12:28 2005 From: natebegeman at mac.com (Nate Begeman) Date: Tue, 12 Apr 2005 18:12:28 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Message-ID: <200504122312.SAA08794@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.65 -> 1.66 --- Log message: Fold shift by size larger than type size to undef Make llvm undef values generate ISD::UNDEF nodes --- Diffs of the changes: (+0 -1) PPC32ISelPattern.cpp | 1 - 1 files changed, 1 deletion(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.65 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.66 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.65 Tue Apr 12 16:22:28 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Tue Apr 12 18:12:17 2005 @@ -61,7 +61,6 @@ setOperationAction(ISD::SREM, MVT::i32, Expand); setOperationAction(ISD::UREM, MVT::i32, Expand); - setShiftAmountFlavor(Extend); // shl X, 32 == 0 setSetCCResultContents(ZeroOrOneSetCCResult); addLegalFPImmediate(+0.0); // Necessary for FSEL addLegalFPImmediate(-0.0); // From natebegeman at mac.com Tue Apr 12 18:12:28 2005 From: natebegeman at mac.com (Nate Begeman) Date: Tue, 12 Apr 2005 18:12:28 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp SelectionDAGISel.cpp Message-ID: <200504122312.SAA08790@zion.cs.uiuc.edu> Changes in directory 
llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.73 -> 1.74 SelectionDAGISel.cpp updated: 1.40 -> 1.41 --- Log message: Fold shift by size larger than type size to undef Make llvm undef values generate ISD::UNDEF nodes --- Diffs of the changes: (+5 -19) SelectionDAG.cpp | 15 ++++----------- SelectionDAGISel.cpp | 9 +-------- 2 files changed, 5 insertions(+), 19 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.73 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.74 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.73 Mon Apr 11 21:54:39 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Tue Apr 12 18:12:17 2005 @@ -755,19 +755,12 @@ case ISD::SHL: case ISD::SRL: - // If the shift amount is bigger than the size of the data, simplify. + case ISD::SRA: + // If the shift amount is bigger than the size of the data, then all the + // bits are shifted out. Simplify to loading constant zero. if (C2 >= MVT::getSizeInBits(N1.getValueType())) { - if (TLI.getShiftAmountFlavor() == TargetLowering::Mask) { - unsigned NewAmt = - C2 & ((1 << MVT::getSizeInBits(N1.getValueType()))-1); - return getNode(Opcode, VT, N1, getConstant(NewAmt,N2.getValueType())); - } else if (TLI.getShiftAmountFlavor() == TargetLowering::Extend) { - // Shifting all of the bits out? - return getConstant(0, N1.getValueType()); - } + return getNode(ISD::UNDEF, N1.getValueType()); } - // FALL THROUGH. 
- case ISD::SRA: if (C2 == 0) return N1; break; Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1.40 llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1.41 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1.40 Fri Apr 8 22:30:29 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Tue Apr 12 18:12:17 2005 @@ -266,14 +266,7 @@ } else if (isa(C)) { return N = DAG.getConstant(0, TLI.getPointerTy()); } else if (isa(C)) { - /// FIXME: Implement UNDEFVALUE better. - if (MVT::isInteger(VT)) - return N = DAG.getConstant(0, VT); - else if (MVT::isFloatingPoint(VT)) - return N = DAG.getConstantFP(0, VT); - else - assert(0 && "Unknown value type!"); - + return N = DAG.getNode(ISD::UNDEF, VT); } else if (ConstantFP *CFP = dyn_cast(C)) { return N = DAG.getConstantFP(CFP->getValue(), VT); } else { From natebegeman at mac.com Tue Apr 12 18:32:39 2005 From: natebegeman at mac.com (Nate Begeman) Date: Tue, 12 Apr 2005 18:32:39 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504122332.SAA08871@zion.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.74 -> 1.75 --- Log message: Fold shift x, [sz]ext(y) -> shift x, y --- Diffs of the changes: (+16 -0) SelectionDAG.cpp | 16 ++++++++++++++++ 1 files changed, 16 insertions(+) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.74 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.75 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.74 Tue Apr 12 18:12:17 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Tue Apr 12 18:32:28 2005 @@ -911,6 +911,14 @@ if (N2.getOpcode() == ISD::FNEG) // (A- (-B) -> A+B return getNode(ISD::ADD, VT, N1, N2.getOperand(0)); break; + case ISD::SHL: + case ISD::SRL: + case ISD::SRA: + if (N2.getOpcode() == ISD::ZERO_EXTEND_INREG || + N2.getOpcode() == ISD::SIGN_EXTEND_INREG) { 
+ return getNode(Opcode, VT, N1, N2.getOperand(0)); + } + break; } SDNode *&N = BinaryOps[std::make_pair(Opcode, std::make_pair(N1, N2))]; @@ -1002,6 +1010,14 @@ else return N1; // Never-taken branch break; + case ISD::SRA_PARTS: + case ISD::SRL_PARTS: + case ISD::SHL_PARTS: + if (N3.getOpcode() == ISD::ZERO_EXTEND_INREG || + N3.getOpcode() == ISD::SIGN_EXTEND_INREG) { + return getNode(Opcode, VT, N1, N2, N3.getOperand(0)); + } + break; } SDNode *N = new SDNode(Opcode, N1, N2, N3); From lattner at cs.uiuc.edu Tue Apr 12 21:36:58 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:36:58 -0500 Subject: [llvm-commits] CVS: llvm/include/llvm/CodeGen/SelectionDAGNodes.h Message-ID: <200504130236.j3D2aw4m015440@apoc.cs.uiuc.edu> Changes in directory llvm/include/llvm/CodeGen: SelectionDAGNodes.h updated: 1.31 -> 1.32 --- Log message: Remove the ZERO_EXTEND_INREG node which is redundant with AND --- Diffs of the changes: (+5 -7) SelectionDAGNodes.h | 12 +++++------- 1 files changed, 5 insertions(+), 7 deletions(-) Index: llvm/include/llvm/CodeGen/SelectionDAGNodes.h diff -u llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1.31 llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1.32 --- llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1.31 Fri Apr 8 22:21:50 2005 +++ llvm/include/llvm/CodeGen/SelectionDAGNodes.h Tue Apr 12 21:36:41 2005 @@ -141,13 +141,12 @@ SINT_TO_FP, UINT_TO_FP, - // SIGN_EXTEND_INREG/ZERO_EXTEND_INREG - These operators atomically performs - // a SHL/(SRA|SHL) pair to (sign|zero) extend a small value in a large - // integer register (e.g. sign extending the low 8 bits of a 32-bit register - // to fill the top 24 bits with the 7th bit). The size of the smaller type - // is indicated by the ExtraValueType in the MVTSDNode for the operator. + // SIGN_EXTEND_INREG - This operator atomically performs a SHL/SRA pair to + // sign extend a small value in a large integer register (e.g. 
sign + // extending the low 8 bits of a 32-bit register to fill the top 24 bits + // with the 7th bit). The size of the smaller type is indicated by the + // ExtraValueType in the MVTSDNode for the operator. SIGN_EXTEND_INREG, - ZERO_EXTEND_INREG, // FP_TO_[US]INT - Convert a floating point value to a signed or unsigned // integer. @@ -809,7 +808,6 @@ static bool classof(const SDNode *N) { return N->getOpcode() == ISD::SIGN_EXTEND_INREG || - N->getOpcode() == ISD::ZERO_EXTEND_INREG || N->getOpcode() == ISD::FP_ROUND_INREG || N->getOpcode() == ISD::EXTLOAD || N->getOpcode() == ISD::SEXTLOAD || From lattner at cs.uiuc.edu Tue Apr 12 21:37:32 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:37:32 -0500 Subject: [llvm-commits] CVS: llvm/include/llvm/CodeGen/SelectionDAG.h Message-ID: <200504130237.j3D2bWV3015456@apoc.cs.uiuc.edu> Changes in directory llvm/include/llvm/CodeGen: SelectionDAG.h updated: 1.19 -> 1.20 --- Log message: Add a new helper method which returns the and that is equivalent to what ZERO_EXTEND_INREG was. --- Diffs of the changes: (+4 -0) SelectionDAG.h | 4 ++++ 1 files changed, 4 insertions(+) Index: llvm/include/llvm/CodeGen/SelectionDAG.h diff -u llvm/include/llvm/CodeGen/SelectionDAG.h:1.19 llvm/include/llvm/CodeGen/SelectionDAG.h:1.20 --- llvm/include/llvm/CodeGen/SelectionDAG.h:1.19 Thu Feb 17 14:16:58 2005 +++ llvm/include/llvm/CodeGen/SelectionDAG.h Tue Apr 12 21:37:19 2005 @@ -149,6 +149,10 @@ SDOperand getSetCC(ISD::CondCode, MVT::ValueType VT, SDOperand LHS, SDOperand RHS); + /// getZeroExtendInReg - Return the expression required to zero extend the Op + /// value assuming it was the smaller SrcTy value. + SDOperand getZeroExtendInReg(SDOperand Op, MVT::ValueType SrcTy); + /// getNode - Gets or creates the specified node. 
/// SDOperand getNode(unsigned Opcode, MVT::ValueType VT); From lattner at cs.uiuc.edu Tue Apr 12 21:39:01 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:39:01 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Message-ID: <200504130239.j3D2d17o015483@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: LegalizeDAG.cpp updated: 1.86 -> 1.87 --- Log message: Instead of making ZERO_EXTEND_INREG nodes, use the helper method in SelectionDAG to do the job with AND. Don't legalize Z_E_I anymore as it is gone --- Diffs of the changes: (+22 -31) LegalizeDAG.cpp | 53 ++++++++++++++++++++++------------------------------- 1 files changed, 22 insertions(+), 31 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.86 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.87 --- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.86 Tue Apr 12 15:30:10 2005 +++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Tue Apr 12 21:38:47 2005 @@ -479,10 +479,12 @@ // zero/sign extend inreg. Result = DAG.getNode(ISD::EXTLOAD, Node->getValueType(0), Tmp1, Tmp2, SrcVT); - unsigned ExtOp = Node->getOpcode() == ISD::SEXTLOAD ? - ISD::SIGN_EXTEND_INREG : ISD::ZERO_EXTEND_INREG; - SDOperand ValRes = DAG.getNode(ExtOp, Result.getValueType(), - Result, SrcVT); + SDOperand ValRes; + if (Node->getOpcode() == ISD::SEXTLOAD) + ValRes = DAG.getNode(ISD::SIGN_EXTEND_INREG, Result.getValueType(), + Result, SrcVT); + else + ValRes = DAG.getZeroExtendInReg(Result, SrcVT); AddLegalizedOperand(SDOperand(Node, 0), ValRes); AddLegalizedOperand(SDOperand(Node, 1), Result.getValue(1)); if (Op.ResNo) @@ -735,8 +737,8 @@ // ALL of these operations will work if we either sign or zero extend // the operands (including the unsigned comparisons!). Zero extend is // usually a simpler/cheaper operation, so prefer it. 
- Tmp1 = DAG.getNode(ISD::ZERO_EXTEND_INREG, NVT, Tmp1, VT); - Tmp2 = DAG.getNode(ISD::ZERO_EXTEND_INREG, NVT, Tmp2, VT); + Tmp1 = DAG.getZeroExtendInReg(Tmp1, VT); + Tmp2 = DAG.getZeroExtendInReg(Tmp2, VT); break; case ISD::SETGE: case ISD::SETGT: @@ -1054,8 +1056,8 @@ Result = PromoteOp(Node->getOperand(0)); // NOTE: Any extend would work here... Result = DAG.getNode(ISD::ZERO_EXTEND, Op.getValueType(), Result); - Result = DAG.getNode(ISD::ZERO_EXTEND_INREG, Op.getValueType(), - Result, Node->getOperand(0).getValueType()); + Result = DAG.getZeroExtendInReg(Result, + Node->getOperand(0).getValueType()); break; case ISD::SIGN_EXTEND: Result = PromoteOp(Node->getOperand(0)); @@ -1088,16 +1090,15 @@ break; case ISD::UINT_TO_FP: Result = PromoteOp(Node->getOperand(0)); - Result = DAG.getNode(ISD::ZERO_EXTEND_INREG, Result.getValueType(), - Result, Node->getOperand(0).getValueType()); + Result = DAG.getZeroExtendInReg(Result, + Node->getOperand(0).getValueType()); Result = DAG.getNode(ISD::UINT_TO_FP, Op.getValueType(), Result); break; } } break; case ISD::FP_ROUND_INREG: - case ISD::SIGN_EXTEND_INREG: - case ISD::ZERO_EXTEND_INREG: { + case ISD::SIGN_EXTEND_INREG: { Tmp1 = LegalizeOp(Node->getOperand(0)); MVT::ValueType ExtraVT = cast(Node)->getExtraValueType(); @@ -1112,16 +1113,7 @@ break; case TargetLowering::Expand: // If this is an integer extend and shifts are supported, do that. - if (Node->getOpcode() == ISD::ZERO_EXTEND_INREG) { - // NOTE: we could fall back on load/store here too for targets without - // AND. However, it is doubtful that any exist. - // AND out the appropriate bits. - SDOperand Mask = - DAG.getConstant((1ULL << MVT::getSizeInBits(ExtraVT))-1, - Node->getValueType(0)); - Result = DAG.getNode(ISD::AND, Node->getValueType(0), - Node->getOperand(0), Mask); - } else if (Node->getOpcode() == ISD::SIGN_EXTEND_INREG) { + if (Node->getOpcode() == ISD::SIGN_EXTEND_INREG) { // NOTE: we could fall back on load/store here too for targets without // SAR. 
However, it is doubtful that any exist. unsigned BitsDiff = MVT::getSizeInBits(Node->getValueType(0)) - @@ -1259,8 +1251,8 @@ Result = DAG.getNode(ISD::SIGN_EXTEND_INREG, NVT, Result, Node->getOperand(0).getValueType()); else - Result = DAG.getNode(ISD::ZERO_EXTEND_INREG, NVT, Result, - Node->getOperand(0).getValueType()); + Result = DAG.getZeroExtendInReg(Result, + Node->getOperand(0).getValueType()); break; } break; @@ -1294,8 +1286,8 @@ Result = DAG.getNode(ISD::SIGN_EXTEND_INREG, Result.getValueType(), Result, Node->getOperand(0).getValueType()); else - Result = DAG.getNode(ISD::ZERO_EXTEND_INREG, Result.getValueType(), - Result, Node->getOperand(0).getValueType()); + Result = DAG.getZeroExtendInReg(Result, + Node->getOperand(0).getValueType()); // No extra round required here. Result = DAG.getNode(Node->getOpcode(), NVT, Result); break; @@ -1383,8 +1375,8 @@ Tmp1 = PromoteOp(Node->getOperand(0)); Tmp2 = PromoteOp(Node->getOperand(1)); assert(MVT::isInteger(NVT) && "Operators don't apply to FP!"); - Tmp1 = DAG.getNode(ISD::ZERO_EXTEND_INREG, NVT, Tmp1, VT); - Tmp2 = DAG.getNode(ISD::ZERO_EXTEND_INREG, NVT, Tmp2, VT); + Tmp1 = DAG.getZeroExtendInReg(Tmp1, VT); + Tmp2 = DAG.getZeroExtendInReg(Tmp2, VT); Result = DAG.getNode(Node->getOpcode(), NVT, Tmp1, Tmp2); break; @@ -1403,7 +1395,7 @@ case ISD::SRL: // The input value must be properly zero extended. Tmp1 = PromoteOp(Node->getOperand(0)); - Tmp1 = DAG.getNode(ISD::ZERO_EXTEND_INREG, NVT, Tmp1, VT); + Tmp1 = DAG.getZeroExtendInReg(Tmp1, VT); Tmp2 = LegalizeOp(Node->getOperand(1)); Result = DAG.getNode(ISD::SRL, NVT, Tmp1, Tmp2); break; @@ -2026,8 +2018,7 @@ case Promote: In = PromoteOp(Node->getOperand(0)); // Emit the appropriate zero_extend_inreg to get the value we want. 
- In = DAG.getNode(ISD::ZERO_EXTEND_INREG, In.getValueType(), In, - Node->getOperand(0).getValueType()); + In = DAG.getZeroExtendInReg(In, Node->getOperand(0).getValueType()); break; } From lattner at cs.uiuc.edu Tue Apr 12 21:39:02 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:39:02 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504130239.j3D2d2ED015489@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.75 -> 1.76 --- Log message: Remove all foldings of ZERO_EXTEND_INREG, moving them to work for AND nodes instead. OVerall, this increases the amount of folding we can do. --- Diffs of the changes: (+46 -41) SelectionDAG.cpp | 87 +++++++++++++++++++++++++++++-------------------------- 1 files changed, 46 insertions(+), 41 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.75 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.76 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.75 Tue Apr 12 18:32:28 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Tue Apr 12 21:38:18 2005 @@ -235,7 +235,6 @@ break; case ISD::TRUNCSTORE: case ISD::SIGN_EXTEND_INREG: - case ISD::ZERO_EXTEND_INREG: case ISD::FP_ROUND_INREG: case ISD::EXTLOAD: case ISD::SEXTLOAD: @@ -286,6 +285,12 @@ delete AllNodes[i]; } +SDOperand SelectionDAG::getZeroExtendInReg(SDOperand Op, MVT::ValueType VT) { + int64_t Imm = ~0ULL >> 64-MVT::getSizeInBits(VT); + return getNode(ISD::AND, Op.getValueType(), Op, + getConstant(Imm, Op.getValueType())); +} + SDOperand SelectionDAG::getConstant(uint64_t Val, MVT::ValueType VT) { assert(MVT::isInteger(VT) && "Cannot create FP integer constant!"); // Mask out any bits that are not valid for this constant. @@ -773,9 +778,8 @@ // ZERO_EXTEND/SIGN_EXTEND by converting them to an ANY_EXTEND node which // we don't have yet. 
- // and (zero_extend_inreg x:16:32), 1 -> and x, 1 - if (N1.getOpcode() == ISD::ZERO_EXTEND_INREG || - N1.getOpcode() == ISD::SIGN_EXTEND_INREG) { + // and (sign_extend_inreg x:16:32), 1 -> and x, 1 + if (N1.getOpcode() == ISD::SIGN_EXTEND_INREG) { // If we are masking out the part of our input that was extended, just // mask the input to the extension directly. unsigned ExtendBits = @@ -783,6 +787,31 @@ if ((C2 & (~0ULL << ExtendBits)) == 0) return getNode(ISD::AND, VT, N1.getOperand(0), N2); } + if (N1.getOpcode() == ISD::AND) + if (ConstantSDNode *OpRHS = dyn_cast(N1.getOperand(1))) + return getNode(ISD::AND, VT, N1.getOperand(0), + getNode(ISD::AND, VT, N1.getOperand(1), N2)); + + // If we are anding the result of a setcc, and we know setcc always + // returns 0 or 1, simplify the RHS to either be 0 or 1 + if (N1.getOpcode() == ISD::SETCC && + TLI.getSetCCResultContents() == TargetLowering::ZeroOrOneSetCCResult) + if (C2 & 1) + return getNode(ISD::AND, VT, N1.getOperand(1), getConstant(1, VT)); + else + return getConstant(0, VT); + + if (N1.getOpcode() == ISD::ZEXTLOAD) { + // If we are anding the result of a zext load, realize that the top bits + // of the loaded value are already zero to simplify C2. + unsigned SrcBits = + MVT::getSizeInBits(cast(N1)->getExtraValueType()); + uint64_t C3 = C2 & (~0ULL >> (64-SrcBits)); + if (C3 != C2) + return getNode(ISD::AND, VT, N1, getConstant(C3, VT)); + else if (C2 == (~0ULL >> (64-SrcBits))) + return N1; // Anding out just what is already masked. + } break; case ISD::OR: if (!C2)return N1; // X or 0 -> X @@ -1092,7 +1121,6 @@ if (isa(N1)) return getNode(ISD::FP_EXTEND, VT, getNode(ISD::FP_ROUND, EVT, N1)); break; - case ISD::ZERO_EXTEND_INREG: case ISD::SIGN_EXTEND_INREG: assert(VT == N1.getValueType() && "Not an inreg extend!"); assert(MVT::isInteger(VT) && MVT::isInteger(EVT) && @@ -1100,41 +1128,28 @@ if (EVT == VT) return N1; // Not actually extending assert(EVT < VT && "Not extending!"); - // Extending a constant? 
Just return the constant. + // Extending a constant? Just return the extended constant. if (ConstantSDNode *N1C = dyn_cast(N1.Val)) { SDOperand Tmp = getNode(ISD::TRUNCATE, EVT, N1); - if (Opcode == ISD::ZERO_EXTEND_INREG) - return getNode(ISD::ZERO_EXTEND, VT, Tmp); - else - return getNode(ISD::SIGN_EXTEND, VT, Tmp); + return getNode(ISD::SIGN_EXTEND, VT, Tmp); } // If we are sign extending an extension, use the original source. - if (N1.getOpcode() == ISD::ZERO_EXTEND_INREG || - N1.getOpcode() == ISD::SIGN_EXTEND_INREG) { - if (N1.getOpcode() == Opcode && - cast(N1)->getExtraValueType() <= EVT) + if (N1.getOpcode() == ISD::SIGN_EXTEND_INREG) + if (cast(N1)->getExtraValueType() <= EVT) return N1; - } - // If we are (zero|sign) extending a [zs]extload, return just the load. - if ((N1.getOpcode() == ISD::ZEXTLOAD && Opcode == ISD::ZERO_EXTEND_INREG) || - (N1.getOpcode() == ISD::SEXTLOAD && Opcode == ISD::SIGN_EXTEND_INREG)) + // If we are sign extending a sextload, return just the load. + if (N1.getOpcode() == ISD::SEXTLOAD && Opcode == ISD::SIGN_EXTEND_INREG) if (cast(N1)->getExtraValueType() <= EVT) return N1; // If we are extending the result of a setcc, and we already know the // contents of the top bits, eliminate the extension. - if (N1.getOpcode() == ISD::SETCC) - switch (TLI.getSetCCResultContents()) { - case TargetLowering::UndefinedSetCCResult: break; - case TargetLowering::ZeroOrOneSetCCResult: - if (Opcode == ISD::ZERO_EXTEND_INREG) return N1; - break; - case TargetLowering::ZeroOrNegativeOneSetCCResult: - if (Opcode == ISD::SIGN_EXTEND_INREG) return N1; - break; - } + if (N1.getOpcode() == ISD::SETCC && + TLI.getSetCCResultContents() == + TargetLowering::ZeroOrNegativeOneSetCCResult) + return N1; // If we are sign extending the result of an (and X, C) operation, and we // know the extended bits are zeros already, don't do the extend. 
@@ -1142,17 +1157,8 @@ if (ConstantSDNode *N1C = dyn_cast(N1.getOperand(1))) { uint64_t Mask = N1C->getValue(); unsigned NumBits = MVT::getSizeInBits(EVT); - if (Opcode == ISD::ZERO_EXTEND_INREG) { - if ((Mask & (~0ULL << NumBits)) == 0) - return N1; - else - return getNode(ISD::AND, VT, N1.getOperand(0), - getConstant(Mask & (~0ULL >> (64-NumBits)), VT)); - } else { - assert(Opcode == ISD::SIGN_EXTEND_INREG); - if ((Mask & (~0ULL << (NumBits-1))) == 0) - return N1; - } + if ((Mask & (~0ULL << (NumBits-1))) == 0) + return N1; } break; } @@ -1177,7 +1183,7 @@ case ISD::EXTLOAD: case ISD::SEXTLOAD: case ISD::ZEXTLOAD: - // If they are asking for an extending loat from/to the same thing, return a + // If they are asking for an extending load from/to the same thing, return a // normal load. if (VT == EVT) return getNode(ISD::LOAD, VT, N1, N2); @@ -1325,7 +1331,6 @@ case ISD::SIGN_EXTEND: return "sign_extend"; case ISD::ZERO_EXTEND: return "zero_extend"; case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg"; - case ISD::ZERO_EXTEND_INREG: return "zero_extend_inreg"; case ISD::TRUNCATE: return "truncate"; case ISD::FP_ROUND: return "fp_round"; case ISD::FP_ROUND_INREG: return "fp_round_inreg"; From lattner at cs.uiuc.edu Tue Apr 12 21:39:19 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:39:19 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/X86/X86ISelPattern.cpp Message-ID: <200504130239.j3D2dJ3c015502@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/X86: X86ISelPattern.cpp updated: 1.103 -> 1.104 --- Log message: Z_E_I is gone --- Diffs of the changes: (+0 -2) X86ISelPattern.cpp | 2 -- 1 files changed, 2 deletions(-) Index: llvm/lib/Target/X86/X86ISelPattern.cpp diff -u llvm/lib/Target/X86/X86ISelPattern.cpp:1.103 llvm/lib/Target/X86/X86ISelPattern.cpp:1.104 --- llvm/lib/Target/X86/X86ISelPattern.cpp:1.103 Sat Apr 9 10:23:56 2005 +++ llvm/lib/Target/X86/X86ISelPattern.cpp Tue Apr 12 21:39:05 2005 @@ -59,9 +59,7 @@ 
setOperationAction(ISD::BRCONDTWOWAY , MVT::Other, Expand); setOperationAction(ISD::MEMMOVE , MVT::Other, Expand); setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16 , Expand); - setOperationAction(ISD::ZERO_EXTEND_INREG, MVT::i16 , Expand); setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1 , Expand); - setOperationAction(ISD::ZERO_EXTEND_INREG, MVT::i1 , Expand); setOperationAction(ISD::FP_ROUND_INREG , MVT::f32 , Expand); setOperationAction(ISD::SEXTLOAD , MVT::i1 , Expand); setOperationAction(ISD::SREM , MVT::f64 , Expand); From lattner at cs.uiuc.edu Tue Apr 12 21:40:39 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:40:39 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp PPC64ISelPattern.cpp Message-ID: <200504130240.j3D2edWu015531@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.66 -> 1.67 PPC64ISelPattern.cpp updated: 1.7 -> 1.8 --- Log message: Elimate handling of ZERO_EXTEND_INREG. This causes the PPC backend to emit andi instructions instead of rlwinm instructions for zero extend, but they seem like they would take the same time. --- Diffs of the changes: (+0 -25) PPC32ISelPattern.cpp | 13 ------------- PPC64ISelPattern.cpp | 12 ------------ 2 files changed, 25 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.66 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.67 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.66 Tue Apr 12 18:12:17 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Tue Apr 12 21:40:26 2005 @@ -1636,19 +1636,6 @@ } return Result; - case ISD::ZERO_EXTEND_INREG: - Tmp1 = SelectExpr(N.getOperand(0)); - switch(cast(Node)->getExtraValueType()) { - default: Node->dump(); assert(0 && "Unhandled ZERO_EXTEND type"); break; - case MVT::i16: Tmp2 = 16; break; - case MVT::i8: Tmp2 = 24; break; - case MVT::i1: Tmp2 = 31; break; - } - Opc = Recording ? 
PPC::RLWINMo : PPC::RLWINM; - RecordSuccess = true; - BuildMI(BB, Opc, 4, Result).addReg(Tmp1).addImm(0).addImm(Tmp2).addImm(31); - return Result; - case ISD::CopyFromReg: if (Result == 1) Result = ExprMap[N.getValue(0)] = MakeReg(N.getValue(0).getValueType()); Index: llvm/lib/Target/PowerPC/PPC64ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC64ISelPattern.cpp:1.7 llvm/lib/Target/PowerPC/PPC64ISelPattern.cpp:1.8 --- llvm/lib/Target/PowerPC/PPC64ISelPattern.cpp:1.7 Fri Apr 8 22:22:30 2005 +++ llvm/lib/Target/PowerPC/PPC64ISelPattern.cpp Tue Apr 12 21:40:26 2005 @@ -1154,18 +1154,6 @@ } return Result; - case ISD::ZERO_EXTEND_INREG: - Tmp1 = SelectExpr(N.getOperand(0)); - switch(cast(Node)->getExtraValueType()) { - default: Node->dump(); assert(0 && "Unhandled ZERO_EXTEND type"); break; - case MVT::i16: Tmp2 = 16; break; - case MVT::i8: Tmp2 = 24; break; - case MVT::i1: Tmp2 = 31; break; - } - BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp1).addImm(0).addImm(Tmp2) - .addImm(31); - return Result; - case ISD::CopyFromReg: if (Result == 1) Result = ExprMap[N.getValue(0)] = MakeReg(N.getValue(0).getValueType()); From lattner at cs.uiuc.edu Tue Apr 12 21:42:05 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:42:05 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp Message-ID: <200504130242.j3D2g5OE015551@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.19 -> 1.20 --- Log message: Remove special handling of ZERO_EXTEND_INREG. This pessimizes code, causing things like this: mov r9 = 65535;; and r8 = r8, r9;; To be emitted instead of: zxt2 r8 = r8;; To get this back, the selector for ISD::AND should recognize this case. 
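The log message above relies on zero extension in a register being the same as an AND with a mask of low one bits, so a selector can win back zxt1/zxt2/zxt4 by spotting such masks on ISD::AND. A minimal standalone sketch of that check follows. The helper names lowMaskWidth and zeroExtendInReg are hypothetical and are not part of the LLVM tree; the sketch only illustrates the arithmetic, not the actual selector code.

```cpp
#include <cstdint>

// Hypothetical helper (an assumption, not LLVM code): if Mask has the
// form 2^n - 1 with 1 <= n <= 64, return n; otherwise return 0.
static unsigned lowMaskWidth(uint64_t Mask) {
  // A run of low one bits plus one is a power of two (or wraps to zero),
  // so it shares no bits with the mask itself.
  if (Mask == 0 || (Mask & (Mask + 1)) != 0)
    return 0;
  unsigned Width = 0;
  while (Mask) {
    ++Width;
    Mask >>= 1;
  }
  return Width;
}

// Zero extending the low Bits bits of Value in place is an AND with a
// mask of Bits low ones. Bits >= 64 is handled separately because
// shifting a 64-bit value by 64 is undefined.
static uint64_t zeroExtendInReg(uint64_t Value, unsigned Bits) {
  uint64_t Mask = Bits >= 64 ? ~0ULL : ((1ULL << Bits) - 1);
  return Value & Mask;
}
```

Under these assumptions, a selector for AND with a constant right-hand side could call lowMaskWidth on the constant and pick a 1, 2 or 4 byte zero-extend instruction when the result is 8, 16 or 32, falling back to a plain AND otherwise.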
--- Diffs of the changes: (+0 -17) IA64ISelPattern.cpp | 17 ----------------- 1 files changed, 17 deletions(-) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.19 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.20 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.19 Tue Apr 12 09:54:44 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Tue Apr 12 21:41:52 2005 @@ -1304,23 +1304,6 @@ return Result; } - case ISD::ZERO_EXTEND_INREG: { - Tmp1 = SelectExpr(N.getOperand(0)); - MVTSDNode* MVN = dyn_cast<MVTSDNode>(Node); - switch(MVN->getExtraValueType()) - { - default: - Node->dump(); - assert(0 && "don't know how to zero extend this type"); - break; - case MVT::i8: Opc = IA64::ZXT1; break; - case MVT::i16: Opc = IA64::ZXT2; break; - case MVT::i32: Opc = IA64::ZXT4; break; - } - BuildMI(BB, Opc, 1, Result).addReg(Tmp1); - return Result; - } - case ISD::SIGN_EXTEND_INREG: { Tmp1 = SelectExpr(N.getOperand(0)); MVTSDNode* MVN = dyn_cast<MVTSDNode>(Node); From lattner at cs.uiuc.edu Tue Apr 12 21:43:54 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:43:54 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaISelPattern.cpp Message-ID: <200504130243.j3D2hsXW015579@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaISelPattern.cpp updated: 1.93 -> 1.94 --- Log message: Remove support for ZERO_EXTEND_INREG. This pessimizes code, generating stuff like this: ldah $1,1($31) lda $1,-1($1) and $0,$1,$24 instead of this: zap $0,252,$24 To get this back, the selector should recognize the ISD::AND case where this happens and emit the appropriate ZAP instruction.
--- Diffs of the changes: (+0 -21) AlphaISelPattern.cpp | 21 --------------------- 1 files changed, 21 deletions(-) Index: llvm/lib/Target/Alpha/AlphaISelPattern.cpp diff -u llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.93 llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.94 --- llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.93 Tue Apr 12 12:47:57 2005 +++ llvm/lib/Target/Alpha/AlphaISelPattern.cpp Tue Apr 12 21:43:40 2005 @@ -1562,27 +1562,6 @@ } return Result; } - case ISD::ZERO_EXTEND_INREG: - { - Tmp1 = SelectExpr(N.getOperand(0)); - MVTSDNode* MVN = dyn_cast<MVTSDNode>(Node); - //std::cerr << "SrcT: " << MVN->getExtraValueType() << "\n"; - switch(MVN->getExtraValueType()) - { - default: - Node->dump(); - assert(0 && "Zero Extend InReg not there yet"); - break; - case MVT::i32: Tmp2 = 0xf0; break; - case MVT::i16: Tmp2 = 0xfc; break; - case MVT::i8: Tmp2 = 0xfe; break; - case MVT::i1: //handle this one special - BuildMI(BB, Alpha::ANDi, 2, Result).addReg(Tmp1).addImm(1); - return Result; - } - BuildMI(BB, Alpha::ZAPi, 2, Result).addReg(Tmp1).addImm(Tmp2); - return Result; - } case ISD::SETCC: { From lattner at cs.uiuc.edu Tue Apr 12 21:46:30 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:46:30 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Message-ID: <200504130246.j3D2kUkE016329@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.67 -> 1.68 --- Log message: remove one more occurrence of this that snuck in --- Diffs of the changes: (+1 -1) PPC32ISelPattern.cpp | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.67 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.68 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.67 Tue Apr 12 21:40:26 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Tue Apr 12 21:46:17 2005 @@ -663,7 +663,7 @@ default: return false; case ISD::AND:
case ISD::OR: - case ISD::ZERO_EXTEND_INREG: return true; + return true; } } From lattner at cs.uiuc.edu Tue Apr 12 21:48:10 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:48:10 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504130248.j3D2mAd9017054@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.76 -> 1.77 --- Log message: Oops, remove these too. --- Diffs of the changes: (+2 -6) SelectionDAG.cpp | 8 ++------ 1 files changed, 2 insertions(+), 6 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.76 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.77 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.76 Tue Apr 12 21:38:18 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Tue Apr 12 21:47:57 2005 @@ -943,10 +943,8 @@ case ISD::SHL: case ISD::SRL: case ISD::SRA: - if (N2.getOpcode() == ISD::ZERO_EXTEND_INREG || - N2.getOpcode() == ISD::SIGN_EXTEND_INREG) { + if (N2.getOpcode() == ISD::SIGN_EXTEND_INREG) return getNode(Opcode, VT, N1, N2.getOperand(0)); - } break; } @@ -1042,10 +1040,8 @@ case ISD::SRA_PARTS: case ISD::SRL_PARTS: case ISD::SHL_PARTS: - if (N3.getOpcode() == ISD::ZERO_EXTEND_INREG || - N3.getOpcode() == ISD::SIGN_EXTEND_INREG) { + if (N3.getOpcode() == ISD::SIGN_EXTEND_INREG) return getNode(Opcode, VT, N1, N2, N3.getOperand(0)); - } break; } From lattner at cs.uiuc.edu Tue Apr 12 21:58:29 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 21:58:29 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504130258.j3D2wTvN020376@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.77 -> 1.78 --- Log message: add back the optimization that Nate added for shl X, (zext_inreg y) --- Diffs of the changes: (+23 -2) SelectionDAG.cpp | 25 
+++++++++++++++++++++++-- 1 files changed, 23 insertions(+), 2 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.77 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.78 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.77 Tue Apr 12 21:47:57 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Tue Apr 12 21:58:13 2005 @@ -943,8 +943,18 @@ case ISD::SHL: case ISD::SRL: case ISD::SRA: - if (N2.getOpcode() == ISD::SIGN_EXTEND_INREG) + if (N2.getOpcode() == ISD::SIGN_EXTEND_INREG && + cast(N2)->getExtraValueType() != MVT::i1) return getNode(Opcode, VT, N1, N2.getOperand(0)); + else if (N2.getOpcode() == ISD::AND) + if (ConstantSDNode *AndRHS = dyn_cast(N2.getOperand(1))) { + // If the and is only masking out bits that cannot effect the shift, + // eliminate the and. + unsigned NumBits = MVT::getSizeInBits(VT); + if ((AndRHS->getValue() & (NumBits-1)) == NumBits-1) + return getNode(Opcode, VT, N1, N2.getOperand(0)); + } + break; } @@ -1040,8 +1050,19 @@ case ISD::SRA_PARTS: case ISD::SRL_PARTS: case ISD::SHL_PARTS: - if (N3.getOpcode() == ISD::SIGN_EXTEND_INREG) + if (N3.getOpcode() == ISD::SIGN_EXTEND_INREG && + cast(N3)->getExtraValueType() != MVT::i1) return getNode(Opcode, VT, N1, N2, N3.getOperand(0)); + else if (N3.getOpcode() == ISD::AND) + if (ConstantSDNode *AndRHS = dyn_cast(N3.getOperand(1))) { + // If the and is only masking out bits that cannot effect the shift, + // eliminate the and. 
+ unsigned NumBits = MVT::getSizeInBits(VT)*2; + if ((AndRHS->getValue() & (NumBits-1)) == NumBits-1) + return getNode(Opcode, VT, N1, N2, N3.getOperand(0)); + } + + break; } From lattner at cs.uiuc.edu Tue Apr 12 22:30:10 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 22:30:10 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/X86/X86ISelPattern.cpp Message-ID: <200504130330.j3D3UAqQ024694@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/X86: X86ISelPattern.cpp updated: 1.104 -> 1.105 --- Log message: Fix some mysteriously missing {}'s which cause the miscompilation of Olden/mst, Ptrdist/bc, Obsequi, etc. --- Diffs of the changes: (+2 -2) X86ISelPattern.cpp | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: llvm/lib/Target/X86/X86ISelPattern.cpp diff -u llvm/lib/Target/X86/X86ISelPattern.cpp:1.104 llvm/lib/Target/X86/X86ISelPattern.cpp:1.105 --- llvm/lib/Target/X86/X86ISelPattern.cpp:1.104 Tue Apr 12 21:39:05 2005 +++ llvm/lib/Target/X86/X86ISelPattern.cpp Tue Apr 12 22:29:53 2005 @@ -2191,8 +2191,7 @@ assert((N.getOpcode() != ISD::SREM || MVT::isInteger(N.getValueType())) && "We don't support this operator!"); - if (N.getOpcode() == ISD::SDIV) - + if (N.getOpcode() == ISD::SDIV) { // We can fold loads into FpDIVs, but not really into any others. if (N.getValueType() == MVT::f64) { // Check for reversed and unreversed DIV. @@ -2267,6 +2266,7 @@ return Result; } } + } if (getRegPressure(N.getOperand(0)) > getRegPressure(N.getOperand(1))) { Tmp1 = SelectExpr(N.getOperand(0)); From lattner at cs.uiuc.edu Tue Apr 12 22:42:30 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 22:42:30 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Message-ID: <200504130342.j3D3gUOG025430@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: LegalizeDAG.cpp updated: 1.87 -> 1.88 --- Log message: Make expansion of uint->fp cast assert out instead of infinitely recurse. 
--- Diffs of the changes: (+1 -1) LegalizeDAG.cpp | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.87 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.88 --- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.87 Tue Apr 12 21:38:47 2005 +++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Tue Apr 12 22:42:14 2005 @@ -1820,7 +1820,7 @@ } else { // If this is unsigned, and not supported, first perform the conversion to // signed, then adjust the result if the sign bit is set. - SDOperand SignedConv = ExpandIntToFP(false, DestTy, Source); + SDOperand SignedConv = ExpandIntToFP(true, DestTy, Source); assert(0 && "Unsigned casts not supported yet!"); } From alenhar2 at cs.uiuc.edu Tue Apr 12 22:47:19 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Tue, 12 Apr 2005 22:47:19 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaISelPattern.cpp Message-ID: <200504130347.j3D3lJ4U025538@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaISelPattern.cpp updated: 1.94 -> 1.95 --- Log message: added all flavors of zap for anding --- Diffs of the changes: (+20 -0) AlphaISelPattern.cpp | 20 ++++++++++++++++++++ 1 files changed, 20 insertions(+) Index: llvm/lib/Target/Alpha/AlphaISelPattern.cpp diff -u llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.94 llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.95 --- llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.94 Tue Apr 12 21:43:40 2005 +++ llvm/lib/Target/Alpha/AlphaISelPattern.cpp Tue Apr 12 22:47:03 2005 @@ -1692,6 +1692,26 @@ } //Fall through case ISD::AND: + //handle zap + if (opcode == ISD::AND && N.getOperand(1).getOpcode() == ISD::Constant) + { + uint64_t k = cast(N.getOperand(1))->getValue(); + unsigned int build = 0; + for(int i = 0; i < 8; ++i) + { + if (k & 0xFF == 0xFF) + build |= 1 << i; + else if (k & 0xFF != 0) + { build = 0; break; } + k >>= 8; + } + if (build) + { + Tmp1 = 
SelectExpr(N.getOperand(0)); + BuildMI(BB, Alpha::ZAPNOTi, 2, Result).addReg(Tmp1).addImm(build); + return Result; + } + } case ISD::OR: //Check operand(0) == Not if (N.getOperand(0).getOpcode() == ISD::XOR && From alenhar2 at cs.uiuc.edu Tue Apr 12 23:41:22 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Tue, 12 Apr 2005 23:41:22 -0500 Subject: [llvm-commits] CVS: llvm/test/Regression/CodeGen/Alpha/s4addl.ll Message-ID: <200504130441.j3D4fMjd030653@apoc.cs.uiuc.edu> Changes in directory llvm/test/Regression/CodeGen/Alpha: s4addl.ll added (r1.1) --- Log message: added s4addl matching test --- Diffs of the changes: (+14 -0) s4addl.ll | 14 ++++++++++++++ 1 files changed, 14 insertions(+) Index: llvm/test/Regression/CodeGen/Alpha/s4addl.ll diff -c /dev/null llvm/test/Regression/CodeGen/Alpha/s4addl.ll:1.1 *** /dev/null Tue Apr 12 23:41:16 2005 --- llvm/test/Regression/CodeGen/Alpha/s4addl.ll Tue Apr 12 23:41:06 2005 *************** *** 0 **** --- 1,14 ---- + ; Make sure this testcase codegens to the bic instruction + ; RUN: llvm-as < %s | llc -march=alpha | grep 's4addl' + + ; ModuleID = 'test.o' + deplibs = [ "c", "crtend" ] + + implementation ; Functions: + + int %foo(int %x, int %y) { + entry: + %tmp.1 = shl int %y, ubyte 2 ; [#uses=1] + %tmp.3 = add int %tmp.1, %x ; [#uses=1] + ret int %tmp.3 + } From lattner at cs.uiuc.edu Tue Apr 12 23:41:27 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Tue, 12 Apr 2005 23:41:27 -0500 Subject: [llvm-commits] CVS: llvm-test/SingleSource/UnitTests/2005-05-12-Int64ToFP.c Message-ID: <200504130441.j3D4fRMS030662@apoc.cs.uiuc.edu> Changes in directory llvm-test/SingleSource/UnitTests: 2005-05-12-Int64ToFP.c added (r1.1) --- Log message: new testcase for 64-bit int -> FP operations --- Diffs of the changes: (+15 -0) 2005-05-12-Int64ToFP.c | 15 +++++++++++++++ 1 files changed, 15 insertions(+) Index: llvm-test/SingleSource/UnitTests/2005-05-12-Int64ToFP.c diff -c /dev/null 
llvm-test/SingleSource/UnitTests/2005-05-12-Int64ToFP.c:1.1 *** /dev/null Tue Apr 12 23:41:24 2005 --- llvm-test/SingleSource/UnitTests/2005-05-12-Int64ToFP.c Tue Apr 12 23:41:14 2005 *************** *** 0 **** --- 1,15 ---- + + #include <stdio.h> + + + int main() { + unsigned long long NX = 124, X; + + do { + X = NX; + printf("%llu = %f %lld = %f\n", X, (double)X, X, (double)(signed long long)X); + NX += 1ULL << 60; + } while (X < NX); + + return 0; + } From duraid at octopus.com.au Tue Apr 12 23:51:05 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Tue, 12 Apr 2005 23:51:05 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp Message-ID: <200504130451.XAA10702@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.20 -> 1.21 --- Log message: * if ANDing with a constant of the form: 0x00000..00FFF..FF ^ ^ ^ ^ any number of 0's followed by some number of 1's then we use dep.z to just paste zeros over the input. For the special cases where this is zxt1/zxt2/zxt4, we use those instructions instead, because we're all about readability!!! that's what it's about!! readability! *twitch* ;D --- Diffs of the changes: (+45 -2) IA64ISelPattern.cpp | 47 +++++++++++++++++++++++++++++++++++++++++++++-- 1 files changed, 45 insertions(+), 2 deletions(-) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.20 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.21 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.20 Tue Apr 12 21:41:52 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Tue Apr 12 23:50:54 2005 @@ -445,7 +445,7 @@ /// ExactLog2 - This function solves for (Val == 1 << (N-1)) and returns N. It /// returns zero when the input is not exactly a power of two.
-static uint64_t ExactLog2(uint64_t Val) { +static unsigned ExactLog2(uint64_t Val) { if (Val == 0 || (Val & (Val-1))) return 0; unsigned Count = 0; while (Val != 1) { @@ -455,6 +455,17 @@ return Count; } +/// ExactLog2sub1 - This function solves for (Val == (1 << (N-1))-1) +/// and returns N. It returns 666 if Val is not 2^n -1 for some n. +static unsigned ExactLog2sub1(uint64_t Val) { + unsigned int n; + for(n=0; n<64; n++) { + if(Val==(uint64_t)((1<(N)->getSignExtended(); + + if ((Imm = ExactLog2sub1(v))!=666) { // if ANDing with ((2^n)-1) for some n + return 1; // say so + } + + return 0; // fallthrough +} + static unsigned ponderIntegerAdditionWith(SDOperand N, unsigned& Imm) { if (N.getOpcode() != ISD::Constant) return 0; // if not adding a // constant, give up. @@ -967,15 +991,34 @@ .addReg(bogusTemp1).addReg(IA64::r0).addReg(IA64::r0).addReg(pTemp); break; } + // if not a bool, we just AND away: case MVT::i8: case MVT::i16: case MVT::i32: case MVT::i64: { Tmp1 = SelectExpr(N.getOperand(0)); + switch (ponderIntegerAndWith(N.getOperand(1), Tmp3)) { + case 1: // ANDing a constant that is 2^n-1 for some n + switch (Tmp3) { + case 8: // if AND 0x00000000000000FF, be quaint and use zxt1 + BuildMI(BB, IA64::ZXT1, 1, Result).addReg(Tmp1); + break; + case 16: // if AND 0x000000000000FFFF, be quaint and use zxt2 + BuildMI(BB, IA64::ZXT2, 1, Result).addReg(Tmp1); + break; + case 32: // if AND 0x00000000FFFFFFFF, be quaint and use zxt4 + BuildMI(BB, IA64::ZXT4, 1, Result).addReg(Tmp1); + break; + default: // otherwise, use dep.z to paste zeros + BuildMI(BB, IA64::DEPZ, 3, Result).addReg(Tmp1) + .addImm(0).addImm(Tmp3); + break; + } + return Result; // early exit + } // fallthrough and emit a simple AND: Tmp2 = SelectExpr(N.getOperand(1)); BuildMI(BB, IA64::AND, 2, Result).addReg(Tmp1).addReg(Tmp2); - break; } } return Result; From lattner at cs.uiuc.edu Wed Apr 13 00:09:58 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Wed, 13 Apr 2005 00:09:58 -0500 Subject: 
[llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Message-ID: <200504130509.j3D59wlK001985@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: LegalizeDAG.cpp updated: 1.88 -> 1.89 --- Log message: Implement expansion of unsigned i64 -> FP. Note that this probably only works for little endian targets, but is enough to get siod working :) --- Diffs of the changes: (+31 -2) LegalizeDAG.cpp | 33 +++++++++++++++++++++++++++++++-- 1 files changed, 31 insertions(+), 2 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.88 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.89 --- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.88 Tue Apr 12 22:42:14 2005 +++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Wed Apr 13 00:09:42 2005 @@ -1822,7 +1822,37 @@ // signed, then adjust the result if the sign bit is set. SDOperand SignedConv = ExpandIntToFP(true, DestTy, Source); - assert(0 && "Unsigned casts not supported yet!"); + assert(Source.getValueType() == MVT::i64 && + "This only works for 64-bit -> FP"); + // The 64-bit value loaded will be incorrectly if the 'sign bit' of the + // incoming integer is set. To handle this, we dynamically test to see if + // it is set, and, if so, add a fudge factor. + SDOperand Lo, Hi; + ExpandOp(Source, Lo, Hi); + + SDOperand SignSet = DAG.getSetCC(ISD::SETLT, TLI.getSetCCResultTy(), Hi, + DAG.getConstant(0, Hi.getValueType())); + SDOperand Zero = getIntPtrConstant(0), Four = getIntPtrConstant(4); + SDOperand CstOffset = DAG.getNode(ISD::SELECT, Zero.getValueType(), + SignSet, Four, Zero); + // FIXME: This is almost certainly broken for big-endian systems. Should + // this just put the fudge factor in the low bits of the uint64 constant or? 
+ static Constant *FudgeFactor = + ConstantUInt::get(Type::ULongTy, 0x5f800000ULL << 32); + + MachineConstantPool *CP = DAG.getMachineFunction().getConstantPool(); + SDOperand CPIdx = DAG.getConstantPool(CP->getConstantPoolIndex(FudgeFactor), + TLI.getPointerTy()); + CPIdx = DAG.getNode(ISD::ADD, TLI.getPointerTy(), CPIdx, CstOffset); + SDOperand FudgeInReg; + if (DestTy == MVT::f32) + FudgeInReg = DAG.getLoad(MVT::f32, DAG.getEntryNode(), CPIdx); + else { + assert(DestTy == MVT::f64 && "Unexpected conversion"); + FudgeInReg = DAG.getNode(ISD::EXTLOAD, MVT::f64, DAG.getEntryNode(), + CPIdx, MVT::f32); + } + return DAG.getNode(ISD::ADD, DestTy, SignedConv, FudgeInReg); } SDOperand Callee = DAG.getExternalSymbol(FnName, TLI.getPointerTy()); @@ -1835,7 +1865,6 @@ // calls wherever we need them to satisfy data dependences. const Type *RetTy = MVT::getTypeForValueType(DestTy); return TLI.LowerCallTo(InChain, RetTy, false, Callee, Args, DAG).first; - } From alenhar2 at cs.uiuc.edu Wed Apr 13 00:20:11 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Wed, 13 Apr 2005 00:20:11 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaISelPattern.cpp Message-ID: <200504130520.j3D5KB7X002166@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaISelPattern.cpp updated: 1.95 -> 1.96 --- Log message: add matches for SxADDL and company, as well as simplify the SxADDQ code --- Diffs of the changes: (+48 -78) AlphaISelPattern.cpp | 126 +++++++++++++++++++-------------------------------- 1 files changed, 48 insertions(+), 78 deletions(-) Index: llvm/lib/Target/Alpha/AlphaISelPattern.cpp diff -u llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.95 llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.96 --- llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.95 Tue Apr 12 22:47:03 2005 +++ llvm/lib/Target/Alpha/AlphaISelPattern.cpp Wed Apr 13 00:19:55 2005 @@ -1490,47 +1490,43 @@ bool isAdd = N.getOperand(0).getOpcode() == ISD::ADD; bool isMul = N.getOperand(0).getOpcode() == 
ISD::MUL; //FIXME: first check for Scaled Adds and Subs! - if(N.getOperand(0).getOperand(1).getOpcode() == ISD::Constant && + ConstantSDNode* CSD = NULL; + if(!isMul && N.getOperand(0).getOperand(0).getOpcode() == ISD::SHL && + (CSD = dyn_cast(N.getOperand(0).getOperand(0).getOperand(1))) && + (CSD->getValue() == 2 || CSD->getValue() == 3)) + { + bool use4 = CSD->getValue() == 2; + Tmp1 = SelectExpr(N.getOperand(0).getOperand(0).getOperand(0)); + Tmp2 = SelectExpr(N.getOperand(0).getOperand(1)); + BuildMI(BB, isAdd?(use4?Alpha::S4ADDL:Alpha::S8ADDL):(use4?Alpha::S4SUBL:Alpha::S8SUBL), + 2,Result).addReg(Tmp1).addReg(Tmp2); + } + else if(isAdd && N.getOperand(0).getOperand(1).getOpcode() == ISD::SHL && + (CSD = dyn_cast(N.getOperand(0).getOperand(1).getOperand(1))) && + (CSD->getValue() == 2 || CSD->getValue() == 3)) + { + bool use4 = CSD->getValue() == 2; + Tmp1 = SelectExpr(N.getOperand(0).getOperand(1).getOperand(0)); + Tmp2 = SelectExpr(N.getOperand(0).getOperand(0)); + BuildMI(BB, use4?Alpha::S4ADDL:Alpha::S8ADDL, 2,Result).addReg(Tmp1).addReg(Tmp2); + } + else if(N.getOperand(0).getOperand(1).getOpcode() == ISD::Constant && cast(N.getOperand(0).getOperand(1))->getValue() <= 255) { //Normal imm add/sub Opc = isAdd ? Alpha::ADDLi : (isMul ? Alpha::MULLi : Alpha::SUBLi); - //if the value was really originally a i32, skip the up conversion - if (N.getOperand(0).getOperand(0).getOpcode() == ISD::SIGN_EXTEND_INREG && - dyn_cast(N.getOperand(0).getOperand(0).Val) - ->getExtraValueType() == MVT::i32) - Tmp1 = SelectExpr(N.getOperand(0).getOperand(0).getOperand(0)); - else - Tmp1 = SelectExpr(N.getOperand(0).getOperand(0)); + Tmp1 = SelectExpr(N.getOperand(0).getOperand(0)); Tmp2 = cast(N.getOperand(0).getOperand(1))->getValue(); BuildMI(BB, Opc, 2, Result).addReg(Tmp1).addImm(Tmp2); } else { //Normal add/sub - Opc = isAdd ? Alpha::ADDL : (isMul ? 
Alpha::MULLi : Alpha::SUBL); - //if the value was really originally a i32, skip the up conversion - if (N.getOperand(0).getOperand(0).getOpcode() == ISD::SIGN_EXTEND_INREG && - dyn_cast(N.getOperand(0).getOperand(0).Val) - ->getExtraValueType() == MVT::i32) - Tmp1 = SelectExpr(N.getOperand(0).getOperand(0).getOperand(0)); - else - Tmp1 = SelectExpr(N.getOperand(0).getOperand(0)); - //if the value was really originally a i32, skip the up conversion - if (N.getOperand(0).getOperand(1).getOpcode() == ISD::SIGN_EXTEND_INREG && - dyn_cast(N.getOperand(0).getOperand(1).Val) - ->getExtraValueType() == MVT::i32) - Tmp2 = SelectExpr(N.getOperand(0).getOperand(1).getOperand(0)); - else - Tmp2 = SelectExpr(N.getOperand(0).getOperand(1)); - + Opc = isAdd ? Alpha::ADDL : (isMul ? Alpha::MULL : Alpha::SUBL); Tmp1 = SelectExpr(N.getOperand(0).getOperand(0)); + Tmp2 = SelectExpr(N.getOperand(0).getOperand(1)); BuildMI(BB, Opc, 2, Result).addReg(Tmp1).addReg(Tmp2); } return Result; } - case ISD::SEXTLOAD: - //SelectionDag isn't deleting the signextend after sextloads - Reg = Result = SelectExpr(N.getOperand(0)); - return Result; default: break; //Fall Though; } } //Every thing else fall though too, including unhandled opcodes above @@ -1787,79 +1783,52 @@ //first check for Scaled Adds and Subs! 
//Valid for add and sub + ConstantSDNode* CSD = NULL; if(N.getOperand(0).getOpcode() == ISD::SHL && - N.getOperand(0).getOperand(1).getOpcode() == ISD::Constant && - cast(N.getOperand(0).getOperand(1))->getValue() == 2) - { - Tmp2 = SelectExpr(N.getOperand(0).getOperand(0)); - if (N.getOperand(1).getOpcode() == ISD::Constant && - cast(N.getOperand(1))->getValue() <= 255) - BuildMI(BB, isAdd?Alpha::S4ADDQi:Alpha::S4SUBQi, 2, Result).addReg(Tmp2) - .addImm(cast(N.getOperand(1))->getValue()); - else { - Tmp1 = SelectExpr(N.getOperand(1)); - BuildMI(BB, isAdd?Alpha::S4ADDQ:Alpha::S4SUBQ, 2, Result).addReg(Tmp2).addReg(Tmp1); - } - } - else if(N.getOperand(0).getOpcode() == ISD::SHL && - N.getOperand(0).getOperand(1).getOpcode() == ISD::Constant && - cast(N.getOperand(0).getOperand(1))->getValue() == 3) + (CSD = dyn_cast(N.getOperand(0).getOperand(1))) && + (CSD->getValue() == 2 || CSD->getValue() == 3)) { + bool use4 = CSD->getValue() == 2; Tmp2 = SelectExpr(N.getOperand(0).getOperand(0)); - if (N.getOperand(1).getOpcode() == ISD::Constant && - cast(N.getOperand(1))->getValue() <= 255) - BuildMI(BB, isAdd?Alpha::S8ADDQi:Alpha::S8SUBQi, 2, Result).addReg(Tmp2) - .addImm(cast(N.getOperand(1))->getValue()); + if ((CSD = dyn_cast(N.getOperand(1))) && CSD->getValue() <= 255) + BuildMI(BB, isAdd?(use4?Alpha::S4ADDQi:Alpha::S8ADDQi):(use4?Alpha::S4SUBQi:Alpha::S8SUBQi), + 2, Result).addReg(Tmp2).addImm(CSD->getValue()); else { Tmp1 = SelectExpr(N.getOperand(1)); - BuildMI(BB, isAdd?Alpha::S8ADDQ:Alpha::S8SUBQ, 2, Result).addReg(Tmp2).addReg(Tmp1); + BuildMI(BB, isAdd?(use4?Alpha::S4ADDQi:Alpha::S8ADDQi):(use4?Alpha::S4SUBQi:Alpha::S8SUBQi), + 2, Result).addReg(Tmp2).addReg(Tmp1); } } //Position prevents subs - else if(N.getOperand(1).getOpcode() == ISD::SHL && isAdd & - N.getOperand(1).getOperand(1).getOpcode() == ISD::Constant && - cast(N.getOperand(1).getOperand(1))->getValue() == 2) - { - Tmp2 = SelectExpr(N.getOperand(1).getOperand(0)); - if (N.getOperand(0).getOpcode() 
== ISD::Constant && - cast(N.getOperand(0))->getValue() <= 255) - BuildMI(BB, Alpha::S4ADDQi, 2, Result).addReg(Tmp2) - .addImm(cast(N.getOperand(0))->getValue()); - else { - Tmp1 = SelectExpr(N.getOperand(0)); - BuildMI(BB, Alpha::S4ADDQ, 2, Result).addReg(Tmp2).addReg(Tmp1); - } - } else if(N.getOperand(1).getOpcode() == ISD::SHL && isAdd && - N.getOperand(1).getOperand(1).getOpcode() == ISD::Constant && - cast(N.getOperand(1).getOperand(1))->getValue() == 3) + (CSD = dyn_cast(N.getOperand(1).getOperand(1))) && + (CSD->getValue() == 2 || CSD->getValue() == 3)) { + bool use4 = CSD->getValue() == 2; Tmp2 = SelectExpr(N.getOperand(1).getOperand(0)); - if (N.getOperand(0).getOpcode() == ISD::Constant && - cast(N.getOperand(0))->getValue() <= 255) - BuildMI(BB, Alpha::S8ADDQi, 2, Result).addReg(Tmp2) - .addImm(cast(N.getOperand(0))->getValue()); + if ((CSD = dyn_cast(N.getOperand(0))) && CSD->getValue() <= 255) + BuildMI(BB, use4?Alpha::S4ADDQi:Alpha::S8ADDQi, 2, Result).addReg(Tmp2) + .addImm(CSD->getValue()); else { Tmp1 = SelectExpr(N.getOperand(0)); - BuildMI(BB, Alpha::S8ADDQ, 2, Result).addReg(Tmp2).addReg(Tmp1); + BuildMI(BB, use4?Alpha::S4ADDQ:Alpha::S8ADDQ, 2, Result).addReg(Tmp2).addReg(Tmp1); } } //small addi - else if(N.getOperand(1).getOpcode() == ISD::Constant && - cast(N.getOperand(1))->getValue() <= 255) + else if((CSD = dyn_cast(N.getOperand(1))) && + CSD->getValue() <= 255) { //Normal imm add/sub Opc = isAdd ? 
Alpha::ADDQi : Alpha::SUBQi; Tmp1 = SelectExpr(N.getOperand(0)); - Tmp2 = cast(N.getOperand(1))->getValue(); - BuildMI(BB, Opc, 2, Result).addReg(Tmp1).addImm(Tmp2); + BuildMI(BB, Opc, 2, Result).addReg(Tmp1).addImm(CSD->getValue()); } //larger addi - else if(N.getOperand(1).getOpcode() == ISD::Constant && - (cast(N.getOperand(1))->getValue() <= 32767 || - (long)cast(N.getOperand(1))->getValue() >= -32767)) + else if((CSD = dyn_cast(N.getOperand(1))) && + CSD->getSignExtended() <= 32767 && + CSD->getSignExtended() >= -32767) { //LDA Tmp1 = SelectExpr(N.getOperand(0)); - Tmp2 = (long)cast(N.getOperand(1))->getValue(); + Tmp2 = (long)CSD->getSignExtended(); if (!isAdd) Tmp2 = -Tmp2; BuildMI(BB, Alpha::LDA, 2, Result).addImm(Tmp2).addReg(Tmp1); @@ -2270,3 +2239,4 @@ FunctionPass *llvm::createAlphaPatternInstructionSelector(TargetMachine &TM) { return new ISel(TM); } + From duraid at octopus.com.au Wed Apr 13 01:12:15 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Wed, 13 Apr 2005 01:12:15 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp IA64InstrInfo.td Message-ID: <200504130612.BAA11127@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.21 -> 1.22 IA64InstrInfo.td updated: 1.10 -> 1.11 --- Log message: * add the shladd instruction * fold left shifts of 1, 2, 3 or 4 bits into adds This doesn't save much now, but should get a serious workout once multiplies by constants get converted to shift/add/sub sequences. Hold on! 
:) --- Diffs of the changes: (+23 -0) IA64ISelPattern.cpp | 20 ++++++++++++++++++++ IA64InstrInfo.td | 3 +++ 2 files changed, 23 insertions(+) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.21 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.22 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.21 Tue Apr 12 23:50:54 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Wed Apr 13 01:12:04 2005 @@ -873,6 +873,26 @@ BuildMI(BB, IA64::FMA, 3, Result).addReg(Tmp1).addReg(Tmp2).addReg(Tmp3); return Result; // early exit } + + if(DestType != MVT::f64 && N.getOperand(0).getOpcode() == ISD::SHL && + N.getOperand(0).Val->hasOneUse()) { // if we might be able to fold + // this add into a shladd, try: + ConstantSDNode *CSD = NULL; + if((CSD = dyn_cast(N.getOperand(0).getOperand(1))) && + (CSD->getValue() >= 1) && (CSD->getValue() <= 4) ) { // we can: + + // ++FusedSHLADD; // Statistic + Tmp1 = SelectExpr(N.getOperand(0).getOperand(0)); + int shl_amt = CSD->getValue(); + Tmp3 = SelectExpr(N.getOperand(1)); + + BuildMI(BB, IA64::SHLADD, 3, Result) + .addReg(Tmp1).addImm(shl_amt).addReg(Tmp3); + return Result; // early exit + } + } + + //else, fallthrough: Tmp1 = SelectExpr(N.getOperand(0)); if(DestType != MVT::f64) { // integer addition: switch (ponderIntegerAdditionWith(N.getOperand(1), Tmp3)) { Index: llvm/lib/Target/IA64/IA64InstrInfo.td diff -u llvm/lib/Target/IA64/IA64InstrInfo.td:1.10 llvm/lib/Target/IA64/IA64InstrInfo.td:1.11 --- llvm/lib/Target/IA64/IA64InstrInfo.td:1.10 Tue Apr 12 10:12:19 2005 +++ llvm/lib/Target/IA64/IA64InstrInfo.td Wed Apr 13 01:12:04 2005 @@ -111,6 +111,9 @@ def SHRSI : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, u6imm:$imm), "shr $dst = $src1, $imm;;">; +def SHLADD : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, u6imm:$imm, GR:$src2), + "shladd $dst = $src1, $imm, $src2;;">; + def EXTRU : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, u6imm:$imm1, u6imm:$imm2), "extr.u $dst = $src1, $imm1, $imm2;;">; From alenhar2 at 
cs.uiuc.edu Wed Apr 13 08:00:33 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Wed, 13 Apr 2005 08:00:33 -0500 Subject: [llvm-commits] CVS: llvm/test/Regression/CodeGen/Alpha/zapnot.ll Message-ID: <200504131300.j3DD0Xm4004125@apoc.cs.uiuc.edu> Changes in directory llvm/test/Regression/CodeGen/Alpha: zapnot.ll added (r1.1) --- Log message: check that casts still use zap --- Diffs of the changes: (+11 -0) zapnot.ll | 11 +++++++++++ 1 files changed, 11 insertions(+) Index: llvm/test/Regression/CodeGen/Alpha/zapnot.ll diff -c /dev/null llvm/test/Regression/CodeGen/Alpha/zapnot.ll:1.1 *** /dev/null Wed Apr 13 08:00:27 2005 --- llvm/test/Regression/CodeGen/Alpha/zapnot.ll Wed Apr 13 08:00:16 2005 *************** *** 0 **** --- 1,11 ---- + ; Make sure this testcase codegens to the bic instruction + ; RUN: llvm-as < %s | llc -march=alpha | grep 'zapnot' + + implementation ; Functions: + + ushort %foo(long %y) { + entry: + %tmp.1 = cast long %y to ushort ; [#uses=1] + ret ushort %tmp.1 + } + From alenhar2 at cs.uiuc.edu Wed Apr 13 11:16:17 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Wed, 13 Apr 2005 11:16:17 -0500 Subject: [llvm-commits] CVS: llvm/test/Regression/CodeGen/Alpha/bsr.ll Message-ID: <200504131616.j3DGGHiC002295@apoc.cs.uiuc.edu> Changes in directory llvm/test/Regression/CodeGen/Alpha: bsr.ll added (r1.1) --- Log message: regression case for faster call sequence --- Diffs of the changes: (+15 -0) bsr.ll | 15 +++++++++++++++ 1 files changed, 15 insertions(+) Index: llvm/test/Regression/CodeGen/Alpha/bsr.ll diff -c /dev/null llvm/test/Regression/CodeGen/Alpha/bsr.ll:1.1 *** /dev/null Wed Apr 13 11:16:11 2005 --- llvm/test/Regression/CodeGen/Alpha/bsr.ll Wed Apr 13 11:16:01 2005 *************** *** 0 **** --- 1,15 ---- + ; Make sure this testcase codegens the bsr instruction + ; RUN: llvm-as < %s | llc -march=alpha | grep 'bsr' + + + implementation ; Functions: + + long %abc(int %x) { + entry: + %tmp.2 = add int %x, -1 ; [#uses=1] + 
%tmp.0 = call long %abc( int %tmp.2 ) ; [#uses=1] + %tmp.5 = add int %x, -2 ; [#uses=1] + %tmp.3 = call long %abc( int %tmp.5 ) ; [#uses=1] + %tmp.6 = add long %tmp.0, %tmp.3 ; [#uses=1] + ret long %tmp.6 + } From alenhar2 at cs.uiuc.edu Wed Apr 13 11:20:07 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Wed, 13 Apr 2005 11:20:07 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaInstrInfo.td Message-ID: <200504131620.j3DGK7bm002333@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaInstrInfo.td updated: 1.39 -> 1.40 --- Log message: prepare for func call optimization --- Diffs of the changes: (+1 -1) AlphaInstrInfo.td | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/lib/Target/Alpha/AlphaInstrInfo.td diff -u llvm/lib/Target/Alpha/AlphaInstrInfo.td:1.39 llvm/lib/Target/Alpha/AlphaInstrInfo.td:1.40 --- llvm/lib/Target/Alpha/AlphaInstrInfo.td:1.39 Thu Apr 7 12:17:48 2005 +++ llvm/lib/Target/Alpha/AlphaInstrInfo.td Wed Apr 13 11:19:50 2005 @@ -29,7 +29,7 @@ def WTF : PseudoInstAlpha<(ops ), "#wtf">; def ADJUSTSTACKUP : PseudoInstAlpha<(ops ), "ADJUP">; def ADJUSTSTACKDOWN : PseudoInstAlpha<(ops ), "ADJDOWN">; - +def ALTENT : PseudoInstAlpha<(ops s64imm:$TARGET), "$TARGET:\n">; def PCLABEL : PseudoInstAlpha<(ops s64imm:$num), "PCMARKER_$num:\n">; //***************** From alenhar2 at cs.uiuc.edu Wed Apr 13 12:17:44 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Wed, 13 Apr 2005 12:17:44 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp AlphaISelPattern.cpp AlphaRegisterInfo.cpp Message-ID: <200504131717.j3DHHigi002407@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaAsmPrinter.cpp updated: 1.10 -> 1.11 AlphaISelPattern.cpp updated: 1.96 -> 1.97 AlphaRegisterInfo.cpp updated: 1.18 -> 1.19 --- Log message: WOW, function calls still seem to work after this. 
--- Diffs of the changes: (+30 -19) AlphaAsmPrinter.cpp | 33 ++++++++++++++++++++------------- AlphaISelPattern.cpp | 10 +++++----- AlphaRegisterInfo.cpp | 6 +++++- 3 files changed, 30 insertions(+), 19 deletions(-) Index: llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp diff -u llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp:1.10 llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp:1.11 --- llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp:1.10 Thu Mar 17 09:37:15 2005 +++ llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp Wed Apr 13 12:17:28 2005 @@ -49,6 +49,7 @@ /// typedef std::map ValueMapTy; ValueMapTy NumberForBB; + std::string CurSection; virtual const char *getPassName() const { return "Alpha Assembly Printer"; @@ -62,6 +63,7 @@ bool runOnMachineFunction(MachineFunction &F); bool doInitialization(Module &M); bool doFinalization(Module &M); + void SwitchSection(std::ostream &OS, const char *NewSection); }; } // end of anonymous namespace @@ -134,8 +136,13 @@ O << MO.getSymbolName(); return; - case MachineOperand::MO_GlobalAddress: - O << Mang->getValueName(MO.getGlobal()); + case MachineOperand::MO_GlobalAddress: + //Abuse PCrel to specify pcrel calls + //calls are the only thing that use this flag + if (MO.isPCRelative()) + O << "$" << Mang->getValueName(MO.getGlobal()) << "..ng"; + else + O << Mang->getValueName(MO.getGlobal()); return; default: @@ -169,8 +176,8 @@ printConstantPool(MF.getConstantPool()); // Print out labels for the function. 
- O << "\t.text\n"; - emitAlignment(3); + SwitchSection(O, "text"); + emitAlignment(4); O << "\t.globl\t" << CurrentFnName << "\n"; O << "\t.ent\t" << CurrentFnName << "\n"; @@ -209,8 +216,9 @@ if (CP.empty()) return; + SwitchSection(O, "section .rodata"); for (unsigned i = 0, e = CP.size(); i != e; ++i) { - O << "\t.section\t.rodata\n"; + // SwitchSection(O, "section .rodata, \"dr\""); emitAlignment(TD.getTypeAlignmentShift(CP[i]->getType())); O << "CPI" << CurrentFnName << "_" << i << ":\t\t\t\t\t" << CommentString << *CP[i] << "\n"; @@ -229,18 +237,17 @@ // SwitchSection - Switch to the specified section of the executable if we are // not already in it! // -static void SwitchSection(std::ostream &OS, std::string &CurSection, - const char *NewSection) { +void AlphaAsmPrinter::SwitchSection(std::ostream &OS, const char *NewSection) +{ if (CurSection != NewSection) { CurSection = NewSection; if (!CurSection.empty()) - OS << "\t" << NewSection << "\n"; + OS << "\t." << NewSection << "\n"; } } bool AlphaAsmPrinter::doFinalization(Module &M) { const TargetData &TD = TM.getTargetData(); - std::string CurSection; for (Module::const_global_iterator I = M.global_begin(), E = M.global_end(); I != E; ++I) if (I->hasInitializer()) { // External global require no code @@ -253,7 +260,7 @@ if (C->isNullValue() && (I->hasLinkOnceLinkage() || I->hasInternalLinkage() || I->hasWeakLinkage() /* FIXME: Verify correct */)) { - SwitchSection(O, CurSection, ".data"); + SwitchSection(O, "data"); if (I->hasInternalLinkage()) O << "\t.local " << name << "\n"; @@ -268,7 +275,7 @@ case GlobalValue::WeakLinkage: // FIXME: Verify correct for weak. // Nonnull linkonce -> weak O << "\t.weak " << name << "\n"; - SwitchSection(O, CurSection, ""); + SwitchSection(O, ""); O << "\t.section\t.llvm.linkonce.d." 
<< name << ",\"aw\",@progbits\n"; break; case GlobalValue::AppendingLinkage: @@ -280,9 +287,9 @@ // FALL THROUGH case GlobalValue::InternalLinkage: if (C->isNullValue()) - SwitchSection(O, CurSection, ".data"); //was .bss + SwitchSection(O, "bss"); //was .bss else - SwitchSection(O, CurSection, ".data"); + SwitchSection(O, "data"); break; case GlobalValue::GhostLinkage: std::cerr << "GhostLinkage cannot appear in AlphaAsmPrinter!\n"; Index: llvm/lib/Target/Alpha/AlphaISelPattern.cpp diff -u llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.96 llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.97 --- llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.96 Wed Apr 13 00:19:55 2005 +++ llvm/lib/Target/Alpha/AlphaISelPattern.cpp Wed Apr 13 12:17:28 2005 @@ -1411,15 +1411,15 @@ if (GlobalAddressSDNode *GASD = dyn_cast<GlobalAddressSDNode>(N.getOperand(1))) { - //if (GASD->getGlobal()->isExternal()) { + if (GASD->getGlobal()->isExternal()) { //use safe calling convention AlphaLowering.restoreGP(BB); has_sym = true; - BuildMI(BB, Alpha::CALL, 1).addGlobalAddress(GASD->getGlobal(),true); - //} else { + BuildMI(BB, Alpha::CALL, 1).addGlobalAddress(GASD->getGlobal()); + } else { //use PC relative branch call - //BuildMI(BB, Alpha::BSR, 1, Alpha::R26).addGlobalAddress(GASD->getGlobal(),true); - //} + BuildMI(BB, Alpha::BSR, 1, Alpha::R26).addGlobalAddress(GASD->getGlobal(),true); + } } else if (ExternalSymbolSDNode *ESSDN = dyn_cast<ExternalSymbolSDNode>(N.getOperand(1))) Index: llvm/lib/Target/Alpha/AlphaRegisterInfo.cpp diff -u llvm/lib/Target/Alpha/AlphaRegisterInfo.cpp:1.18 llvm/lib/Target/Alpha/AlphaRegisterInfo.cpp:1.19 --- llvm/lib/Target/Alpha/AlphaRegisterInfo.cpp:1.18 Tue Mar 29 13:24:04 2005 +++ llvm/lib/Target/Alpha/AlphaRegisterInfo.cpp Wed Apr 13 12:17:28 2005 @@ -16,6 +16,7 @@ #include "AlphaRegisterInfo.h" #include "llvm/Constants.h" #include "llvm/Type.h" +#include "llvm/Function.h" #include "llvm/CodeGen/ValueTypes.h" #include "llvm/CodeGen/MachineInstrBuilder.h" #include "llvm/CodeGen/MachineFunction.h" @@ -213,7
+214,10 @@ //handle GOP offset MI = BuildMI(Alpha::LDGP, 0); MBB.insert(MBBI, MI); - + //evil const_cast until MO stuff setup to handle const + MI = BuildMI(Alpha::ALTENT, 1).addGlobalAddress(const_cast(MF.getFunction()), true); + MBB.insert(MBBI, MI); + // Get the number of bytes to allocate from the FrameInfo long NumBytes = MFI->getStackSize(); From lattner at cs.uiuc.edu Wed Apr 13 14:41:21 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Wed, 13 Apr 2005 14:41:21 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504131941.j3DJfLFD016824@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.78 -> 1.79 --- Log message: avoid work when possible, perhaps fix the problem nate and andrew are seeing with != 0 comparisons vanishing. --- Diffs of the changes: (+1 -0) SelectionDAG.cpp | 1 + 1 files changed, 1 insertion(+) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.78 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.79 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.78 Tue Apr 12 21:58:13 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Wed Apr 13 14:41:05 2005 @@ -286,6 +286,7 @@ } SDOperand SelectionDAG::getZeroExtendInReg(SDOperand Op, MVT::ValueType VT) { + if (Op.getValueType() == VT) return Op; int64_t Imm = ~0ULL >> 64-MVT::getSizeInBits(VT); return getNode(ISD::AND, Op.getValueType(), Op, getConstant(Imm, Op.getValueType())); From lattner at cs.uiuc.edu Wed Apr 13 14:53:53 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Wed, 13 Apr 2005 14:53:53 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504131953.j3DJrrAK016938@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.79 -> 1.80 --- Log message: fix some serious miscompiles on ia64, alpha, and ppc --- Diffs of the changes: (+1 -1) 
SelectionDAG.cpp | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.79 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.80 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.79 Wed Apr 13 14:41:05 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Wed Apr 13 14:53:40 2005 @@ -798,7 +798,7 @@ if (N1.getOpcode() == ISD::SETCC && TLI.getSetCCResultContents() == TargetLowering::ZeroOrOneSetCCResult) if (C2 & 1) - return getNode(ISD::AND, VT, N1.getOperand(1), getConstant(1, VT)); + return getNode(ISD::AND, VT, N1, getConstant(1, VT)); else return getConstant(0, VT); From lattner at cs.uiuc.edu Wed Apr 13 15:06:42 2005 From: lattner at cs.uiuc.edu (Chris Lattner) Date: Wed, 13 Apr 2005 15:06:42 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504132006.j3DK6gdp017155@apoc.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.80 -> 1.81 --- Log message: fix an infinite loop --- Diffs of the changes: (+1 -1) SelectionDAG.cpp | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.80 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.81 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.80 Wed Apr 13 14:53:40 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Wed Apr 13 15:06:29 2005 @@ -795,7 +795,7 @@ // If we are anding the result of a setcc, and we know setcc always // returns 0 or 1, simplify the RHS to either be 0 or 1 - if (N1.getOpcode() == ISD::SETCC && + if (N1.getOpcode() == ISD::SETCC && C2 != 1 && TLI.getSetCCResultContents() == TargetLowering::ZeroOrOneSetCCResult) if (C2 & 1) return getNode(ISD::AND, VT, N1, getConstant(1, VT)); From alenhar2 at cs.uiuc.edu Wed Apr 13 15:08:45 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Wed, 
13 Apr 2005 15:08:45 -0500 Subject: [llvm-commits] CVS: llvm-test/Makefile.programs Message-ID: <200504132008.j3DK8jOp017198@apoc.cs.uiuc.edu> Changes in directory llvm-test: Makefile.programs updated: 1.151 -> 1.152 --- Log message: change beta options for alpha (LSR has had its chance for a while) --- Diffs of the changes: (+2 -1) Makefile.programs | 3 ++- 1 files changed, 2 insertions(+), 1 deletion(-) Index: llvm-test/Makefile.programs diff -u llvm-test/Makefile.programs:1.151 llvm-test/Makefile.programs:1.152 --- llvm-test/Makefile.programs:1.151 Thu Mar 31 13:18:19 2005 +++ llvm-test/Makefile.programs Wed Apr 13 15:08:32 2005 @@ -190,7 +190,8 @@ LLCBETAOPTION := -enable-ppc-pattern-isel else ifeq ($(ARCH),Alpha) -LLCBETAOPTION := -enable-lsr-for-alpha +LLCBETAOPTION := -enable-alpha-intfpdiv -enable-alpha-ftoi +#-enable-lsr-for-alpha else LLCBETAOPTION := -disable-pattern-isel=0 endif From natebegeman at mac.com Wed Apr 13 16:23:42 2005 From: natebegeman at mac.com (Nate Begeman) Date: Wed, 13 Apr 2005 16:23:42 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504132123.QAA11619@zion.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.81 -> 1.82 --- Log message: Disable the broken fold of shift + sz[ext] for now. Move the transform for select (a < 0) ? b : 0 into the dag from ppc isel. Enable the dag to fold and (setcc, 1) -> setcc for targets where setcc always produces zero or one.
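The last two items of the log message above are bit-level identities, and a small standalone check may make them easier to follow. The sketch below is not taken from the commit and does not use the SelectionDAG API: selectRef, selectFolded, setccLT and checkFolds are hypothetical names, X is assumed to be 32 bits wide, and the arithmetic right shift (sra) is emulated with unsigned arithmetic so the result does not depend on how the host compiler shifts negative signed values.

```cpp
#include <cassert>
#include <cstdint>

// Reference form: select (setlt X, 0), A, 0
constexpr uint32_t selectRef(int32_t x, uint32_t a) {
  return x < 0 ? a : 0u;
}

// Folded form: and (sra X, 31), A
// sra X, 31 yields all ones when X is negative and all zeros otherwise;
// here that mask is built from the sign bit using unsigned arithmetic only.
constexpr uint32_t selectFolded(int32_t x, uint32_t a) {
  uint32_t signBit = static_cast<uint32_t>(x) >> 31;  // 1 if x < 0, else 0
  uint32_t mask = 0u - signBit;                        // 0xFFFFFFFF or 0
  return mask & a;
}

// A setcc that is assumed to produce exactly 0 or 1
// (the "zero or one" case named in the log message).
constexpr uint32_t setccLT(int32_t lhs, int32_t rhs) {
  return lhs < rhs ? 1u : 0u;
}

// Compare both folds against their reference forms on a few boundary values.
inline void checkFolds() {
  const int32_t xs[] = {INT32_MIN, -5, -1, 0, 1, 42, INT32_MAX};
  for (int32_t x : xs) {
    assert(selectFolded(x, 0xDEADBEEFu) == selectRef(x, 0xDEADBEEFu));
    // and (setcc, 1) -> setcc: masking a 0/1 value with 1 changes nothing.
    assert((setccLT(x, 0) & 1u) == setccLT(x, 0));
  }
}
```

Calling checkFolds() from a test driver exercises both identities. The second identity only holds when setcc is known to return 0 or 1; a target whose setcc returns all ones for true would need a different fold.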
--- Diffs of the changes: (+30 -7) SelectionDAG.cpp | 37 ++++++++++++++++++++++++++++++------- 1 files changed, 30 insertions(+), 7 deletions(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.81 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.82 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.81 Wed Apr 13 15:06:29 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Wed Apr 13 16:23:31 2005 @@ -773,7 +773,7 @@ case ISD::AND: if (!C2) return N2; // X and 0 -> 0 if (N2C->isAllOnesValue()) - return N1; // X and -1 -> X + return N1; // X and -1 -> X // FIXME: Should add a corresponding version of this for // ZERO_EXTEND/SIGN_EXTEND by converting them to an ANY_EXTEND node which @@ -795,13 +795,13 @@ // If we are anding the result of a setcc, and we know setcc always // returns 0 or 1, simplify the RHS to either be 0 or 1 - if (N1.getOpcode() == ISD::SETCC && C2 != 1 && + if (N1.getOpcode() == ISD::SETCC && TLI.getSetCCResultContents() == TargetLowering::ZeroOrOneSetCCResult) if (C2 & 1) - return getNode(ISD::AND, VT, N1, getConstant(1, VT)); + return N1; else return getConstant(0, VT); - + if (N1.getOpcode() == ISD::ZEXTLOAD) { // If we are anding the result of a zext load, realize that the top bits // of the loaded value are already zero to simplify C2. 
@@ -941,6 +941,10 @@ if (N2.getOpcode() == ISD::FNEG) // (A- (-B) -> A+B return getNode(ISD::ADD, VT, N1, N2.getOperand(0)); break; + // FIXME: figure out how to safely handle things like + // int foo(int x) { return 1 << (x & 255); } + // int bar() { return foo(256); } +#if 0 case ISD::SHL: case ISD::SRL: case ISD::SRA: @@ -955,8 +959,8 @@ if ((AndRHS->getValue() & (NumBits-1)) == NumBits-1) return getNode(Opcode, VT, N1, N2.getOperand(0)); } - break; +#endif } SDNode *&N = BinaryOps[std::make_pair(Opcode, std::make_pair(N1, N2))]; @@ -1039,6 +1043,22 @@ N2.getOperand(0) == N3) return getNode(ISD::FABS, VT, N3); } + // select (setlt X, 0), A, 0 -> and (sra X, size(X)-1, A) + if (ConstantSDNode *CN = + dyn_cast(SetCC->getOperand(1))) + if (CN->getValue() == 0 && N3C && N3C->getValue() == 0) + if (SetCC->getCondition() == ISD::SETLT) { + MVT::ValueType XType = SetCC->getOperand(0).getValueType(); + MVT::ValueType AType = N2.getValueType(); + if (XType >= AType) { + SDOperand Shift = getNode(ISD::SRA, XType, SetCC->getOperand(0), + getConstant(MVT::getSizeInBits(XType)-1, + TLI.getShiftAmountTy())); + if (XType > AType) + Shift = getNode(ISD::TRUNCATE, AType, Shift); + return getNode(ISD::AND, AType, Shift, N2); + } + } } break; case ISD::BRCOND: @@ -1048,6 +1068,10 @@ else return N1; // Never-taken branch break; + // FIXME: figure out how to safely handle things like + // int foo(int x) { return 1 << (x & 255); } + // int bar() { return foo(256); } +#if 0 case ISD::SRA_PARTS: case ISD::SRL_PARTS: case ISD::SHL_PARTS: @@ -1062,9 +1086,8 @@ if ((AndRHS->getValue() & (NumBits-1)) == NumBits-1) return getNode(Opcode, VT, N1, N2, N3.getOperand(0)); } - - break; +#endif } SDNode *N = new SDNode(Opcode, N1, N2, N3); From natebegeman at mac.com Wed Apr 13 16:23:42 2005 From: natebegeman at mac.com (Nate Begeman) Date: Wed, 13 Apr 2005 16:23:42 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Message-ID: <200504132123.QAA11623@zion.cs.uiuc.edu> 
Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.68 -> 1.69 --- Log message: Disable the broken fold of shift + sz[ext] for now. Move the transform for select (a < 0) ? b : 0 into the dag from ppc isel. Enable the dag to fold and (setcc, 1) -> setcc for targets where setcc always produces zero or one. --- Diffs of the changes: (+0 -19) PPC32ISelPattern.cpp | 19 ------------------- 1 files changed, 19 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.68 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.69 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.68 Tue Apr 12 21:46:17 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Wed Apr 13 16:23:31 2005 @@ -2126,25 +2126,6 @@ return 0; case ISD::SELECT: { - // We can codegen select (a < 0) ? b : 0 very efficiently compared to a - // conditional branch. Check for that here. - if (SetCCSDNode *SetCC = dyn_cast<SetCCSDNode>(N.getOperand(0).Val)) { - if (ConstantSDNode *CN = - dyn_cast<ConstantSDNode>(SetCC->getOperand(1).Val)) { - if (ConstantSDNode *CNF = - dyn_cast<ConstantSDNode>(N.getOperand(2).Val)) { - if (CN->getValue() == 0 && CNF->getValue() == 0 && - SetCC->getCondition() == ISD::SETLT) { - Tmp1 = SelectExpr(N.getOperand(1)); // TRUE value - Tmp2 = SelectExpr(SetCC->getOperand(0)); - Tmp3 = MakeReg(MVT::i32); - BuildMI(BB, PPC::SRAWI, 2, Tmp3).addReg(Tmp2).addImm(31); - BuildMI(BB, PPC::AND, 2, Result).addReg(Tmp1).addReg(Tmp3); - return Result; - } - } - } - } unsigned TrueValue = SelectExpr(N.getOperand(1)); //Use if TRUE unsigned FalseValue = SelectExpr(N.getOperand(2)); //Use if FALSE Opc = SelectSetCR0(N.getOperand(0)); From natebegeman at mac.com Wed Apr 13 16:45:24 2005 From: natebegeman at mac.com (Nate Begeman) Date: Wed, 13 Apr 2005 16:45:24 -0500 Subject: [llvm-commits] CVS: llvm/test/Regression/CodeGen/PowerPC/select_lt0.ll setcc_no_zext.ll Message-ID: <200504132145.QAA11844@zion.cs.uiuc.edu> Changes in directory
llvm/test/Regression/CodeGen/PowerPC: select_lt0.ll added (r1.1) setcc_no_zext.ll added (r1.1) --- Log message: Add CodeGen tests for the recent SelectionDAG transforms --- Diffs of the changes: (+53 -0) select_lt0.ll | 45 +++++++++++++++++++++++++++++++++++++++++++++ setcc_no_zext.ll | 8 ++++++++ 2 files changed, 53 insertions(+) Index: llvm/test/Regression/CodeGen/PowerPC/select_lt0.ll diff -c /dev/null llvm/test/Regression/CodeGen/PowerPC/select_lt0.ll:1.1 *** /dev/null Wed Apr 13 16:45:23 2005 --- llvm/test/Regression/CodeGen/PowerPC/select_lt0.ll Wed Apr 13 16:45:13 2005 *************** *** 0 **** --- 1,45 ---- + ; RUN: llvm-as < %s | llc -march=ppc32 -enable-ppc-pattern-isel | not grep cmp + + int %seli32_1(int %a) { + entry: + %tmp.1 = setlt int %a, 0 + %retval = select bool %tmp.1, int 5, int 0 + ret int %retval + } + + int %seli32_2(int %a, int %b) { + entry: + %tmp.1 = setlt int %a, 0 + %retval = select bool %tmp.1, int %b, int 0 + ret int %retval + } + + int %seli32_3(int %a, short %b) { + entry: + %tmp.2 = cast short %b to int + %tmp.1 = setlt int %a, 0 + %retval = select bool %tmp.1, int %tmp.2, int 0 + ret int %retval + } + + int %seli32_4(int %a, ushort %b) { + entry: + %tmp.2 = cast ushort %b to int + %tmp.1 = setlt int %a, 0 + %retval = select bool %tmp.1, int %tmp.2, int 0 + ret int %retval + } + + short %seli16_1(short %a) { + entry: + %tmp.1 = setlt short %a, 0 + %retval = select bool %tmp.1, short 7, short 0 + ret short %retval + } + + short %seli16_2(int %a, short %b) { + entry: + %tmp.1 = setlt int %a, 0 + %retval = select bool %tmp.1, short %b, short 0 + ret short %retval + } Index: llvm/test/Regression/CodeGen/PowerPC/setcc_no_zext.ll diff -c /dev/null llvm/test/Regression/CodeGen/PowerPC/setcc_no_zext.ll:1.1 *** /dev/null Wed Apr 13 16:45:24 2005 --- llvm/test/Regression/CodeGen/PowerPC/setcc_no_zext.ll Wed Apr 13 16:45:13 2005 *************** *** 0 **** --- 1,8 ---- + ; RUN: llvm-as < %s | llc -march=ppc32 -enable-ppc-pattern-isel | not 
grep rlwinm + + int %setcc_one_or_zero(int* %a) { + entry: + %tmp.1 = setne int* %a, null + %inc.1 = cast bool %tmp.1 to int + ret int %inc.1 + } From natebegeman at mac.com Wed Apr 13 17:14:25 2005 From: natebegeman at mac.com (Nate Begeman) Date: Wed, 13 Apr 2005 17:14:25 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Message-ID: <200504132214.RAA12113@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.69 -> 1.70 --- Log message: Implement the fold shift X, zext(Y) -> shift X, Y at the target level, where it is safe to do so. --- Diffs of the changes: (+22 -6) PPC32ISelPattern.cpp | 28 ++++++++++++++++++++++------ 1 files changed, 22 insertions(+), 6 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.69 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.70 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.69 Wed Apr 13 16:23:31 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Wed Apr 13 17:14:14 2005 @@ -524,6 +524,7 @@ unsigned getGlobalBaseReg(); unsigned getConstDouble(double floatVal, unsigned Result); bool SelectBitfieldInsert(SDOperand OR, unsigned Result); + unsigned FoldIfWideZeroExtend(SDOperand N); unsigned SelectSetCR0(SDOperand CC); unsigned SelectExpr(SDOperand N, bool Recording=false); unsigned SelectExprFP(SDOperand N, unsigned Result); @@ -963,6 +964,21 @@ return false; } +/// FoldIfWideZeroExtend - 32 bit PowerPC implicit masks shift amounts to the +/// low six bits. If the shift amount is an ISD::AND node with a mask that is +/// wider than the implicit mask, then we can get rid of the AND and let the +/// shift do the mask. 
+unsigned ISel::FoldIfWideZeroExtend(SDOperand N) { + unsigned C; + if (N.getOpcode() == ISD::AND && + 5 == getImmediateForOpcode(N.getOperand(1), ISD::AND, C) && // isMask + 31 == (C & 0xFFFF) && // ME + 26 >= (C >> 16)) // MB + return SelectExpr(N.getOperand(0)); + else + return SelectExpr(N); +} + unsigned ISel::SelectSetCR0(SDOperand CC) { unsigned Opc, Tmp1, Tmp2; bool AlreadySelected = false; @@ -1650,7 +1666,7 @@ BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp1).addImm(Tmp2).addImm(0) .addImm(31-Tmp2); } else { - Tmp2 = SelectExpr(N.getOperand(1)); + Tmp2 = FoldIfWideZeroExtend(N.getOperand(1)); BuildMI(BB, PPC::SLW, 2, Result).addReg(Tmp1).addReg(Tmp2); } return Result; @@ -1662,7 +1678,7 @@ BuildMI(BB, PPC::RLWINM, 4, Result).addReg(Tmp1).addImm(32-Tmp2) .addImm(Tmp2).addImm(31); } else { - Tmp2 = SelectExpr(N.getOperand(1)); + Tmp2 = FoldIfWideZeroExtend(N.getOperand(1)); BuildMI(BB, PPC::SRW, 2, Result).addReg(Tmp1).addReg(Tmp2); } return Result; @@ -1673,7 +1689,7 @@ Tmp2 = CN->getValue() & 0x1F; BuildMI(BB, PPC::SRAWI, 2, Result).addReg(Tmp1).addImm(Tmp2); } else { - Tmp2 = SelectExpr(N.getOperand(1)); + Tmp2 = FoldIfWideZeroExtend(N.getOperand(1)); BuildMI(BB, PPC::SRAW, 2, Result).addReg(Tmp1).addReg(Tmp2); } return Result; @@ -1880,9 +1896,9 @@ "Not an i64 shift!"); unsigned ShiftOpLo = SelectExpr(N.getOperand(0)); unsigned ShiftOpHi = SelectExpr(N.getOperand(1)); - unsigned SHReg = SelectExpr(N.getOperand(2)); - Tmp1 = MakeReg(MVT::i32); - Tmp2 = MakeReg(MVT::i32); + unsigned SHReg = FoldIfWideZeroExtend(N.getOperand(2)); + Tmp1 = MakeReg(MVT::i32); + Tmp2 = MakeReg(MVT::i32); Tmp3 = MakeReg(MVT::i32); unsigned Tmp4 = MakeReg(MVT::i32); unsigned Tmp5 = MakeReg(MVT::i32); From natebegeman at mac.com Wed Apr 13 18:15:55 2005 From: natebegeman at mac.com (Nate Begeman) Date: Wed, 13 Apr 2005 18:15:55 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Message-ID: <200504132315.SAA12576@zion.cs.uiuc.edu> Changes in directory 
llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.70 -> 1.71 --- Log message: Start allocating condition registers. Almost all explicit uses of CR0 are now gone. Next step is to get rid of the remaining ones and then start allocating bools to CRs where appropriate. --- Diffs of the changes: (+26 -23) PPC32ISelPattern.cpp | 49 ++++++++++++++++++++++++++----------------------- 1 files changed, 26 insertions(+), 23 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.70 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.71 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.70 Wed Apr 13 17:14:14 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Wed Apr 13 18:15:44 2005 @@ -525,7 +525,7 @@ unsigned getConstDouble(double floatVal, unsigned Result); bool SelectBitfieldInsert(SDOperand OR, unsigned Result); unsigned FoldIfWideZeroExtend(SDOperand N); - unsigned SelectSetCR0(SDOperand CC); + unsigned SelectCC(SDOperand CC, unsigned &Opc); unsigned SelectExpr(SDOperand N, bool Recording=false); unsigned SelectExprFP(SDOperand N, unsigned Result); void Select(SDOperand N); @@ -979,12 +979,15 @@ return SelectExpr(N); } -unsigned ISel::SelectSetCR0(SDOperand CC) { - unsigned Opc, Tmp1, Tmp2; +unsigned ISel::SelectCC(SDOperand CC, unsigned &Opc) { + unsigned Result, Tmp1, Tmp2; bool AlreadySelected = false; static const unsigned CompareOpcodes[] = { PPC::FCMPU, PPC::FCMPU, PPC::CMPW, PPC::CMPLW }; + // Allocate a condition register for this expression + Result = RegMap->createVirtualRegister(PPC32::CRRCRegisterClass); + // If the first operand to the select is a SETCC node, then we can fold it // into the branch that selects which value to return. SetCCSDNode* SetCC = dyn_cast(CC.Val); @@ -1006,7 +1009,7 @@ Tmp1 = SelectExpr(SetCC->getOperand(0), true); if (RecordSuccess) { ++Recorded; - return Opc; + return PPC::CR0; } AlreadySelected = true; } @@ -1014,22 +1017,22 @@ // instead. 
if (!AlreadySelected) Tmp1 = SelectExpr(SetCC->getOperand(0)); if (U) - BuildMI(BB, PPC::CMPLWI, 2, PPC::CR0).addReg(Tmp1).addImm(Tmp2); + BuildMI(BB, PPC::CMPLWI, 2, Result).addReg(Tmp1).addImm(Tmp2); else - BuildMI(BB, PPC::CMPWI, 2, PPC::CR0).addReg(Tmp1).addSImm(Tmp2); + BuildMI(BB, PPC::CMPWI, 2, Result).addReg(Tmp1).addSImm(Tmp2); } else { bool IsInteger = MVT::isInteger(SetCC->getOperand(0).getValueType()); unsigned CompareOpc = CompareOpcodes[2 * IsInteger + U]; Tmp1 = SelectExpr(SetCC->getOperand(0)); Tmp2 = SelectExpr(SetCC->getOperand(1)); - BuildMI(BB, CompareOpc, 2, PPC::CR0).addReg(Tmp1).addReg(Tmp2); + BuildMI(BB, CompareOpc, 2, Result).addReg(Tmp1).addReg(Tmp2); } } else { Opc = PPC::BNE; Tmp1 = SelectExpr(CC); - BuildMI(BB, PPC::CMPLWI, 2, PPC::CR0).addReg(Tmp1).addImm(0); + BuildMI(BB, PPC::CMPLWI, 2, Result).addReg(Tmp1).addImm(0); } - return Opc; + return Result; } /// Check to see if the load is a constant offset from a base register @@ -1055,8 +1058,9 @@ MachineBasicBlock *Dest = cast(N.getOperand(2))->getBasicBlock(); + unsigned Opc, CCReg; Select(N.getOperand(0)); //chain - unsigned Opc = SelectSetCR0(N.getOperand(1)); + CCReg = SelectCC(N.getOperand(1), Opc); // Iterate to the next basic block, unless we're already at the end of the ilist::iterator It = BB, E = BB->getParent()->end(); @@ -1070,21 +1074,20 @@ MachineBasicBlock *Fallthrough = cast(N.getOperand(3))->getBasicBlock(); if (Dest != It) { - BuildMI(BB, PPC::COND_BRANCH, 4).addReg(PPC::CR0).addImm(Opc) + BuildMI(BB, PPC::COND_BRANCH, 4).addReg(CCReg).addImm(Opc) .addMBB(Dest).addMBB(Fallthrough); if (Fallthrough != It) BuildMI(BB, PPC::B, 1).addMBB(Fallthrough); } else { if (Fallthrough != It) { Opc = PPC32InstrInfo::invertPPCBranchOpcode(Opc); - BuildMI(BB, PPC::COND_BRANCH, 4).addReg(PPC::CR0).addImm(Opc) + BuildMI(BB, PPC::COND_BRANCH, 4).addReg(CCReg).addImm(Opc) .addMBB(Fallthrough).addMBB(Dest); } } } else { - BuildMI(BB, PPC::COND_BRANCH, 4).addReg(PPC::CR0).addImm(Opc) + 
BuildMI(BB, PPC::COND_BRANCH, 4).addReg(CCReg).addImm(Opc) .addMBB(Dest).addMBB(It); - //BuildMI(BB, Opc, 2).addReg(PPC::CR0).addMBB(Dest); } return; } @@ -1177,7 +1180,7 @@ unsigned TrueValue = SelectExpr(N.getOperand(1)); //Use if TRUE unsigned FalseValue = SelectExpr(N.getOperand(2)); //Use if FALSE - Opc = SelectSetCR0(N.getOperand(0)); + unsigned CCReg = SelectCC(N.getOperand(0), Opc); // Create an iterator with which to insert the MBB for copying the false // value and the MBB to hold the PHI instruction for this SetCC. @@ -1189,12 +1192,12 @@ // thisMBB: // ... // TrueVal = ... - // cmpTY cr0, r1, r2 + // cmpTY ccX, r1, r2 // bCC copy1MBB // fallthrough --> copy0MBB MachineBasicBlock *copy0MBB = new MachineBasicBlock(LLVM_BB); MachineBasicBlock *sinkMBB = new MachineBasicBlock(LLVM_BB); - BuildMI(BB, Opc, 2).addReg(PPC::CR0).addMBB(sinkMBB); + BuildMI(BB, Opc, 2).addReg(CCReg).addMBB(sinkMBB); MachineFunction *F = BB->getParent(); F->getBasicBlockList().insert(It, copy0MBB); F->getBasicBlockList().insert(It, sinkMBB); @@ -2095,7 +2098,7 @@ } } - Opc = SelectSetCR0(N); + unsigned CCReg = SelectCC(N, Opc); unsigned TrueValue = MakeReg(MVT::i32); BuildMI(BB, PPC::LI, 1, TrueValue).addSImm(1); unsigned FalseValue = MakeReg(MVT::i32); @@ -2110,12 +2113,12 @@ // thisMBB: // ... 
- // cmpTY cr0, r1, r2 + // cmpTY ccX, r1, r2 // %TrueValue = li 1 // bCC sinkMBB MachineBasicBlock *copy0MBB = new MachineBasicBlock(LLVM_BB); MachineBasicBlock *sinkMBB = new MachineBasicBlock(LLVM_BB); - BuildMI(BB, Opc, 2).addReg(PPC::CR0).addMBB(sinkMBB); + BuildMI(BB, Opc, 2).addReg(CCReg).addMBB(sinkMBB); MachineFunction *F = BB->getParent(); F->getBasicBlockList().insert(It, copy0MBB); F->getBasicBlockList().insert(It, sinkMBB); @@ -2144,7 +2147,7 @@ case ISD::SELECT: { unsigned TrueValue = SelectExpr(N.getOperand(1)); //Use if TRUE unsigned FalseValue = SelectExpr(N.getOperand(2)); //Use if FALSE - Opc = SelectSetCR0(N.getOperand(0)); + unsigned CCReg = SelectCC(N.getOperand(0), Opc); // Create an iterator with which to insert the MBB for copying the false // value and the MBB to hold the PHI instruction for this SetCC. @@ -2156,12 +2159,12 @@ // thisMBB: // ... // TrueVal = ... - // cmpTY cr0, r1, r2 + // cmpTY ccX, r1, r2 // bCC copy1MBB // fallthrough --> copy0MBB MachineBasicBlock *copy0MBB = new MachineBasicBlock(LLVM_BB); MachineBasicBlock *sinkMBB = new MachineBasicBlock(LLVM_BB); - BuildMI(BB, Opc, 2).addReg(PPC::CR0).addMBB(sinkMBB); + BuildMI(BB, Opc, 2).addReg(CCReg).addMBB(sinkMBB); MachineFunction *F = BB->getParent(); F->getBasicBlockList().insert(It, copy0MBB); F->getBasicBlockList().insert(It, sinkMBB); From natebegeman at mac.com Wed Apr 13 22:20:49 2005 From: natebegeman at mac.com (Nate Begeman) Date: Wed, 13 Apr 2005 22:20:49 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelSimple.cpp PowerPCAsmPrinter.cpp PowerPCInstrFormats.td PowerPCInstrInfo.td Message-ID: <200504140320.WAA14250@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelSimple.cpp updated: 1.138 -> 1.139 PowerPCAsmPrinter.cpp updated: 1.75 -> 1.76 PowerPCInstrFormats.td updated: 1.33 -> 1.34 PowerPCInstrInfo.td updated: 1.62 -> 1.63 --- Log message: Add the necessary support to codegen condition register logical ops with register 
allocated condition registers. Make sure that the printed output is gas compatible. --- Diffs of the changes: (+72 -17) PPC32ISelSimple.cpp | 16 +++++++++------- PowerPCAsmPrinter.cpp | 20 ++++++++++++++++++++ PowerPCInstrFormats.td | 18 ++++++++++++++++-- PowerPCInstrInfo.td | 35 +++++++++++++++++++++++++++-------- 4 files changed, 72 insertions(+), 17 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelSimple.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelSimple.cpp:1.138 llvm/lib/Target/PowerPC/PPC32ISelSimple.cpp:1.139 --- llvm/lib/Target/PowerPC/PPC32ISelSimple.cpp:1.138 Sat Apr 9 20:03:31 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelSimple.cpp Wed Apr 13 22:20:38 2005 @@ -1115,7 +1115,7 @@ // Use crand for lt, gt and crandc for le, ge unsigned CROpcode = (OpNum == 2 || OpNum == 4) ? PPC::CRAND : PPC::CRANDC; // ? cr1[lt] : cr1[gt] - unsigned CR1field = (OpNum == 2 || OpNum == 3) ? 4 : 5; + unsigned CR1field = (OpNum == 2 || OpNum == 3) ? 0 : 1; // ? cr0[lt] : cr0[gt] unsigned CR0field = (OpNum == 2 || OpNum == 5) ? 0 : 1; unsigned Opcode = CompTy->isSigned() ? 
PPC::CMPW : PPC::CMPLW; @@ -1165,9 +1165,10 @@ .addReg(ConstReg); BuildMI(*MBB, IP, Opcode, 2, PPC::CR1).addReg(Op0r+1) .addReg(ConstReg+1); - BuildMI(*MBB, IP, PPC::CRAND, 3).addImm(2).addImm(2).addImm(CR1field); - BuildMI(*MBB, IP, PPC::CROR, 3).addImm(CR0field).addImm(CR0field) - .addImm(2); + BuildMI(*MBB, IP, PPC::CRAND, 5, PPC::CR0).addImm(2) + .addReg(PPC::CR0).addImm(2).addReg(PPC::CR1).addImm(CR1field); + BuildMI(*MBB, IP, PPC::CROR, 5, PPC::CR0).addImm(CR0field) + .addReg(PPC::CR0).addImm(CR0field).addReg(PPC::CR0).addImm(2); return; } } @@ -1204,9 +1205,10 @@ // cr0 = r3 ccOpcode r5 or (r3 == r5 AND r4 ccOpcode r6) BuildMI(*MBB, IP, Opcode, 2, PPC::CR0).addReg(Op0r).addReg(Op1r); BuildMI(*MBB, IP, Opcode, 2, PPC::CR1).addReg(Op0r+1).addReg(Op1r+1); - BuildMI(*MBB, IP, PPC::CRAND, 3).addImm(2).addImm(2).addImm(CR1field); - BuildMI(*MBB, IP, PPC::CROR, 3).addImm(CR0field).addImm(CR0field) - .addImm(2); + BuildMI(*MBB, IP, PPC::CRAND, 5, PPC::CR0).addImm(2) + .addReg(PPC::CR0).addImm(2).addReg(PPC::CR1).addImm(CR1field); + BuildMI(*MBB, IP, PPC::CROR, 5, PPC::CR0).addImm(CR0field) + .addReg(PPC::CR0).addImm(CR0field).addReg(PPC::CR0).addImm(2); return; } } Index: llvm/lib/Target/PowerPC/PowerPCAsmPrinter.cpp diff -u llvm/lib/Target/PowerPC/PowerPCAsmPrinter.cpp:1.75 llvm/lib/Target/PowerPC/PowerPCAsmPrinter.cpp:1.76 --- llvm/lib/Target/PowerPC/PowerPCAsmPrinter.cpp:1.75 Sat Apr 9 20:48:29 2005 +++ llvm/lib/Target/PowerPC/PowerPCAsmPrinter.cpp Wed Apr 13 22:20:38 2005 @@ -138,6 +138,26 @@ O << "-\"L0000" << LabelNumber << "$pb\")"; } } + void printcrbit(const MachineInstr *MI, unsigned OpNo, + MVT::ValueType VT) { + unsigned char value = MI->getOperand(OpNo).getImmedValue(); + assert(value <= 3 && "Invalid crbit argument!"); + unsigned RegNo, CCReg = MI->getOperand(OpNo-1).getReg(); + switch (CCReg) { + case PPC::CR0: RegNo = 0; break; + case PPC::CR1: RegNo = 1; break; + case PPC::CR2: RegNo = 2; break; + case PPC::CR3: RegNo = 3; break; + case PPC::CR4: 
RegNo = 4; break; + case PPC::CR5: RegNo = 5; break; + case PPC::CR6: RegNo = 6; break; + case PPC::CR7: RegNo = 7; break; + default: + std::cerr << "Unhandled reg in enumRegToRealReg!\n"; + abort(); + } + O << 4 * RegNo + value; + } virtual void printConstantPool(MachineConstantPool *MCP) = 0; virtual bool runOnMachineFunction(MachineFunction &F) = 0; Index: llvm/lib/Target/PowerPC/PowerPCInstrFormats.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.33 llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.34 --- llvm/lib/Target/PowerPC/PowerPCInstrFormats.td:1.33 Tue Apr 12 02:04:16 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrFormats.td Wed Apr 13 22:20:38 2005 @@ -309,8 +309,22 @@ // 1.7.7 XL-Form class XLForm_1 opcode, bits<10> xo, bit ppc64, bit vmx, - dag OL, string asmstr> - : XForm_base_r3xo { + dag OL, string asmstr> : I { + bits<3> CRD; + bits<2> CRDb; + bits<3> CRA; + bits<2> CRAb; + bits<3> CRB; + bits<2> CRBb; + + let Inst{6-8} = CRD; + let Inst{9-10} = CRDb; + let Inst{11-13} = CRA; + let Inst{14-15} = CRAb; + let Inst{16-18} = CRB; + let Inst{19-20} = CRBb; + let Inst{21-30} = xo; + let Inst{31} = 0; } class XLForm_2 opcode, bits<10> xo, bit lk, bit ppc64, bit vmx, Index: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.62 llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.63 --- llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.62 Tue Apr 12 02:04:16 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Wed Apr 13 22:20:38 2005 @@ -45,6 +45,9 @@ def symbolLo: Operand { let PrintMethod = "printSymbolLo"; } +def crbit: Operand { + let PrintMethod = "printcrbit"; +} // Pseudo-instructions: def PHI : Pseudo<(ops), "; PHI">; @@ -332,14 +335,30 @@ // XL-Form instructions. condition register logical ops. 
// -def CRAND : XLForm_1<19, 257, 0, 0, (ops u5imm:$D, u5imm:$A, u5imm:$B), - "crand $D, $A, $B">; -def CRANDC : XLForm_1<19, 129, 0, 0, (ops u5imm:$D, u5imm:$A, u5imm:$B), - "crandc $D, $A, $B">; -def CRNOR : XLForm_1<19, 33, 0, 0, (ops u5imm:$D, u5imm:$A, u5imm:$B), - "crnor $D, $A, $B">; -def CROR : XLForm_1<19, 449, 0, 0, (ops u5imm:$D, u5imm:$A, u5imm:$B), - "cror $D, $A, $B">; +def CRAND : XLForm_1<19, 257, 0, 0, (ops CRRC:$D, crbit:$Db, + CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), + "crand $Db, $Ab, $Bb">; +def CRANDC : XLForm_1<19, 129, 0, 0, (ops CRRC:$D, crbit:$Db, + CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), + "crandc $Db, $Ab, $Bb">; +def CREQV : XLForm_1<19, 289, 0, 0, (ops CRRC:$D, crbit:$Db, + CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), + "creqv $Db, $Ab, $Bb">; +def CRNAND : XLForm_1<19, 225, 0, 0, (ops CRRC:$D, crbit:$Db, + CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), + "crnand $Db, $Ab, $Bb">; +def CRNOR : XLForm_1<19, 33, 0, 0, (ops CRRC:$D, crbit:$Db, + CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), + "crnor $Db, $Ab, $Bb">; +def CROR : XLForm_1<19, 449, 0, 0, (ops CRRC:$D, crbit:$Db, + CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), + "cror $Db, $Ab, $Bb">; +def CRORC : XLForm_1<19, 417, 0, 0, (ops CRRC:$D, crbit:$Db, + CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), + "crorc $Db, $Ab, $Bb">; +def CRXOR : XLForm_1<19, 193, 0, 0, (ops CRRC:$D, crbit:$Db, + CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), + "crxor $Db, $Ab, $Bb">; def MCRF : XLForm_3<19, 0, 0, 0, (ops CRRC:$BF, CRRC:$BFA), "mfcr $BF, $BFA">; From duraid at octopus.com.au Thu Apr 14 03:37:43 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Thu, 14 Apr 2005 03:37:43 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp Message-ID: <200504140837.DAA02313@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.22 -> 1.23 --- Log message: we have zextloads, not sextloads! 
--- Diffs of the changes: (+1 -1) IA64ISelPattern.cpp | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.22 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.23 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.22 Wed Apr 13 01:12:04 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Thu Apr 14 03:37:32 2005 @@ -64,11 +64,11 @@ setOperationAction(ISD::EXTLOAD , MVT::i1 , Promote); setOperationAction(ISD::ZEXTLOAD , MVT::i1 , Expand); - setOperationAction(ISD::ZEXTLOAD , MVT::i32 , Expand); setOperationAction(ISD::SEXTLOAD , MVT::i1 , Expand); setOperationAction(ISD::SEXTLOAD , MVT::i8 , Expand); setOperationAction(ISD::SEXTLOAD , MVT::i16 , Expand); + setOperationAction(ISD::SEXTLOAD , MVT::i32 , Expand); setOperationAction(ISD::SREM , MVT::f32 , Expand); setOperationAction(ISD::SREM , MVT::f64 , Expand); From natebegeman at mac.com Thu Apr 14 03:57:03 2005 From: natebegeman at mac.com (Nate Begeman) Date: Thu, 14 Apr 2005 03:57:03 -0500 Subject: [llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Message-ID: <200504140857.DAA05753@zion.cs.uiuc.edu> Changes in directory llvm/lib/CodeGen/SelectionDAG: SelectionDAG.cpp updated: 1.82 -> 1.83 --- Log message: Add a couple missing transforms in getSetCC that were triggering assertions in the PPC Pattern ISel --- Diffs of the changes: (+8 -1) SelectionDAG.cpp | 9 ++++++++- 1 files changed, 8 insertions(+), 1 deletion(-) Index: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp diff -u llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.82 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.83 --- llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1.82 Wed Apr 13 16:23:31 2005 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Thu Apr 14 03:56:52 2005 @@ -436,11 +436,18 @@ N2C = cast(N2.Val); } + if ((Cond == ISD::SETLT || Cond == ISD::SETULT) && C2 == MinVal) + return getConstant(0, VT); // X < MIN --> false + + // Canonicalize 
setgt X, Min --> setne X, Min + if ((Cond == ISD::SETGT || Cond == ISD::SETUGT) && C2 == MinVal) + return getSetCC(ISD::SETNE, VT, N1, N2); + // If we have setult X, 1, turn it into seteq X, 0 if ((Cond == ISD::SETLT || Cond == ISD::SETULT) && C2 == MinVal+1) return getSetCC(ISD::SETEQ, VT, N1, getConstant(MinVal, N1.getValueType())); - // If we have setult X, 1, turn it into seteq X, 0 + // If we have setugt X, Max-1, turn it into seteq X, Max else if ((Cond == ISD::SETGT || Cond == ISD::SETUGT) && C2 == MaxVal-1) return getSetCC(ISD::SETEQ, VT, N1, getConstant(MaxVal, N1.getValueType())); From natebegeman at mac.com Thu Apr 14 04:45:19 2005 From: natebegeman at mac.com (Nate Begeman) Date: Thu, 14 Apr 2005 04:45:19 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp PowerPCInstrInfo.td Message-ID: <200504140945.EAA18345@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/PowerPC: PPC32ISelPattern.cpp updated: 1.71 -> 1.72 PowerPCInstrInfo.td updated: 1.63 -> 1.64 --- Log message: Implement multi-way branches through logical ops on condition registers. This can generate considerably shorter code, reducing the size of crafty by almost 1%. Also fix the printing of mcrf. The code is currently disabled until it gets a bit more testing, but should work as-is. 
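The log message above describes combining two comparisons with a single condition-register logical op. The boolean identities behind that choice can be sketched as follows. This is an illustration only, not the committed selector code: `CROp`, `evalCROp`, `pickCROp`, `combine` and `allCasesAgree` are hypothetical names, and the sketch assumes the usual PowerPC definitions in which `crandc` and `crorc` complement their second source bit.

```cpp
#include <utility>

// Hypothetical stand-ins for the PowerPC CR logical opcodes.
enum class CROp { CRAND, CRANDC, CRNOR, CROR, CRORC, CRNAND };

// Evaluate one CR logical op on two single bits.
// crandc is a & !b and crorc is a | !b (second operand complemented).
inline bool evalCROp(CROp op, bool a, bool b) {
  switch (op) {
    case CROp::CRAND:  return a && b;
    case CROp::CRANDC: return a && !b;
    case CROp::CRNOR:  return !(a || b);
    case CROp::CROR:   return a || b;
    case CROp::CRORC:  return a || !b;
    case CROp::CRNAND: return !(a && b);
  }
  return false;  // not reached
}

// Choose the op for (inv1 ? !a : a) AND/OR (inv2 ? !b : b).
// Both inverted: De Morgan turns AND into NOR and OR into NAND.
inline CROp pickCROp(bool isAnd, bool inv1, bool inv2) {
  if (inv1 && inv2) return isAnd ? CROp::CRNOR : CROp::CRNAND;
  if (!inv1 && !inv2) return isAnd ? CROp::CRAND : CROp::CROR;
  return isAnd ? CROp::CRANDC : CROp::CRORC;  // exactly one inverted
}

// Combine two CR bits; when only the first is inverted, swap the operands
// so that the inverted bit lands in the complemented (second) position.
inline bool combine(bool isAnd, bool inv1, bool a, bool inv2, bool b) {
  CROp op = pickCROp(isAnd, inv1, inv2);
  if (inv1 && !inv2) std::swap(a, b);
  return evalCROp(op, a, b);
}

// Exhaustively compare against the directly written boolean expression.
inline bool allCasesAgree() {
  for (int bits = 0; bits < 32; ++bits) {
    bool isAnd = (bits & 1) != 0, inv1 = (bits & 2) != 0;
    bool inv2 = (bits & 4) != 0, a = (bits & 8) != 0, b = (bits & 16) != 0;
    bool lhs = inv1 ? !a : a;
    bool rhs = inv2 ? !b : b;
    bool want = isAnd ? (lhs && rhs) : (lhs || rhs);
    if (combine(isAnd, inv1, a, inv2, b) != want) return false;
  }
  return true;
}
```

The operand swap in `combine` is the reason a single `crandc` or `crorc` suffices for the mixed case: only the second source bit is complemented, so the inverted comparison has to be placed there.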
--- Diffs of the changes: (+69 -3) PPC32ISelPattern.cpp | 70 +++++++++++++++++++++++++++++++++++++++++++++++++-- PowerPCInstrInfo.td | 2 - 2 files changed, 69 insertions(+), 3 deletions(-) Index: llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp diff -u llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.71 llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.72 --- llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp:1.71 Wed Apr 13 18:15:44 2005 +++ llvm/lib/Target/PowerPC/PPC32ISelPattern.cpp Thu Apr 14 04:45:08 2005 @@ -474,6 +474,7 @@ namespace { Statistic<>Recorded("ppc-codegen", "Number of recording ops emitted"); Statistic<>FusedFP("ppc-codegen", "Number of fused fp operations"); +Statistic<>MultiBranch("ppc-codegen", "Number of setcc logical ops collapsed"); //===--------------------------------------------------------------------===// /// ISel - PPC32 specific code to select PPC32 machine instructions for /// SelectionDAG operations. @@ -689,6 +690,43 @@ return 0; } +/// getCROpForOp - Return the condition register opcode (or inverted opcode) +/// associated with the SelectionDAG opcode. +static unsigned getCROpForSetCC(unsigned Opcode, bool Inv1, bool Inv2) { + switch (Opcode) { + default: assert(0 && "Unknown opcode!"); abort(); + case ISD::AND: + if (Inv1 && Inv2) return PPC::CRNOR; // De Morgan's Law + if (!Inv1 && !Inv2) return PPC::CRAND; + if (Inv1 ^ Inv2) return PPC::CRANDC; + case ISD::OR: + if (Inv1 && Inv2) return PPC::CRNAND; // De Morgan's Law + if (!Inv1 && !Inv2) return PPC::CROR; + if (Inv1 ^ Inv2) return PPC::CRORC; + } + return 0; +} + +/// getCRIdxForSetCC - Return the index of the condition register field +/// associated with the SetCC condition, and whether or not the field is +/// treated as inverted. That is, lt = 0; ge = 0 inverted. 
+static unsigned getCRIdxForSetCC(unsigned Condition, bool& Inv) { + switch (Condition) { + default: assert(0 && "Unknown condition!"); abort(); + case ISD::SETULT: + case ISD::SETLT: Inv = false; return 0; + case ISD::SETUGE: + case ISD::SETGE: Inv = true; return 0; + case ISD::SETUGT: + case ISD::SETGT: Inv = false; return 1; + case ISD::SETULE: + case ISD::SETLE: Inv = true; return 1; + case ISD::SETEQ: Inv = false; return 2; + case ISD::SETNE: Inv = true; return 2; + } + return 0; +} + /// IndexedOpForOp - Return the indexed variant for each of the PowerPC load /// and store immediate instructions. static unsigned IndexedOpForOp(unsigned Opcode) { @@ -1009,7 +1047,8 @@ Tmp1 = SelectExpr(SetCC->getOperand(0), true); if (RecordSuccess) { ++Recorded; - return PPC::CR0; + BuildMI(BB, PPC::MCRF, 1, Result).addReg(PPC::CR0); + return Result; } AlreadySelected = true; } @@ -1028,6 +1067,33 @@ BuildMI(BB, CompareOpc, 2, Result).addReg(Tmp1).addReg(Tmp2); } } else { +#if 0 + if (CC.getOpcode() == ISD::AND || CC.getOpcode() == ISD::OR) + if (CC.getOperand(0).Val->hasOneUse() && + CC.getOperand(1).Val->hasOneUse()) { + SetCCSDNode* Op0CC = dyn_cast(CC.getOperand(0).Val); + SetCCSDNode* Op1CC = dyn_cast(CC.getOperand(1).Val); + if (Op0CC && Op1CC) { + ++MultiBranch; + bool Inv0, Inv1; + unsigned Opc1; + unsigned Idx0 = getCRIdxForSetCC(Op0CC->getCondition(), Inv0); + unsigned Idx1 = getCRIdxForSetCC(Op1CC->getCondition(), Inv1); + unsigned CROpc = getCROpForSetCC(CC.getOpcode(), Inv0, Inv1); + Tmp1 = SelectCC(CC.getOperand(0), Opc); + Tmp2 = SelectCC(CC.getOperand(1), Opc1); + if (Inv0 && !Inv1) { + std::swap(Tmp1, Tmp2); + std::swap(Idx0, Idx1); + Opc = Opc1; + } + if (Inv0 && Inv1) Opc = PPC32InstrInfo::invertPPCBranchOpcode(Opc); + BuildMI(BB, CROpc, 5, Result).addImm(Idx0).addReg(Tmp1).addImm(Idx0) + .addReg(Tmp2).addImm(Idx1); + return Result; + } + } +#endif Opc = PPC::BNE; Tmp1 = SelectExpr(CC); BuildMI(BB, PPC::CMPLWI, 2, Result).addReg(Tmp1).addImm(0); @@ -2033,7 
+2099,7 @@ if (CN->getValue() == 0) { Tmp1 = SelectExpr(SetCC->getOperand(0)); switch (SetCC->getCondition()) { - default: assert(0 && "Unhandled SetCC condition"); abort(); + default: SetCC->dump(); assert(0 && "Unhandled SetCC condition"); abort(); case ISD::SETEQ: Tmp2 = MakeReg(MVT::i32); BuildMI(BB, PPC::CNTLZW, 1, Tmp2).addReg(Tmp1); Index: llvm/lib/Target/PowerPC/PowerPCInstrInfo.td diff -u llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.63 llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.64 --- llvm/lib/Target/PowerPC/PowerPCInstrInfo.td:1.63 Wed Apr 13 22:20:38 2005 +++ llvm/lib/Target/PowerPC/PowerPCInstrInfo.td Thu Apr 14 04:45:08 2005 @@ -360,7 +360,7 @@ CRRC:$A, crbit:$Ab, CRRC:$B, crbit:$Bb), "crxor $Db, $Ab, $Bb">; def MCRF : XLForm_3<19, 0, 0, 0, (ops CRRC:$BF, CRRC:$BFA), - "mfcr $BF, $BFA">; + "mcrf $BF, $BFA">; // XFX-Form instructions. Instructions that deal with SPRs // From duraid at octopus.com.au Thu Apr 14 05:06:47 2005 From: duraid at octopus.com.au (Duraid Madina) Date: Thu, 14 Apr 2005 05:06:47 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/IA64/IA64ISelPattern.cpp Message-ID: <200504141006.FAA27318@zion.cs.uiuc.edu> Changes in directory llvm/lib/Target/IA64: IA64ISelPattern.cpp updated: 1.23 -> 1.24 --- Log message: oops, this stopped us turning movl r4=0xFFFFFFFF;; and rX, r4 into zxt4 --- Diffs of the changes: (+1 -1) IA64ISelPattern.cpp | 2 +- 1 files changed, 1 insertion(+), 1 deletion(-) Index: llvm/lib/Target/IA64/IA64ISelPattern.cpp diff -u llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.23 llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.24 --- llvm/lib/Target/IA64/IA64ISelPattern.cpp:1.23 Thu Apr 14 03:37:32 2005 +++ llvm/lib/Target/IA64/IA64ISelPattern.cpp Thu Apr 14 05:06:35 2005 @@ -460,7 +460,7 @@ static unsigned ExactLog2sub1(uint64_t Val) { unsigned int n; for(n=0; n<64; n++) { - if(Val==(uint64_t)((1< Changes in directory llvm/lib/Target/IA64: IA64AsmPrinter.cpp updated: 1.9 -> 1.10 IA64InstrInfo.td updated: 1.11 -> 1.12 --- Log 
message: print negative 64 bit immediates as negative numbers, makes things a little easier on the eyes, not that numbers like 18446744073709541376 are bad or anything --- Diffs of the changes: (+9 -2) IA64AsmPrinter.cpp | 6 +++++- IA64InstrInfo.td | 5 ++++- 2 files changed, 9 insertions(+), 2 deletions(-) Index: llvm/lib/Target/IA64/IA64AsmPrinter.cpp diff -u llvm/lib/Target/IA64/IA64AsmPrinter.cpp:1.9 llvm/lib/Target/IA64/IA64AsmPrinter.cpp:1.10 --- llvm/lib/Target/IA64/IA64AsmPrinter.cpp:1.9 Mon Apr 11 00:55:56 2005 +++ llvm/lib/Target/IA64/IA64AsmPrinter.cpp Thu Apr 14 05:08:01 2005 @@ -247,7 +247,11 @@ MVT::ValueType VT) { O << (uint64_t)MI->getOperand(OpNo).getImmedValue(); } - + void printS64ImmOperand(const MachineInstr *MI, unsigned OpNo, + MVT::ValueType VT) { + O << (int64_t)MI->getOperand(OpNo).getImmedValue(); + } + void printCallOperand(const MachineInstr *MI, unsigned OpNo, MVT::ValueType VT) { printOp(MI->getOperand(OpNo), true); // this is a br.call instruction Index: llvm/lib/Target/IA64/IA64InstrInfo.td diff -u llvm/lib/Target/IA64/IA64InstrInfo.td:1.11 llvm/lib/Target/IA64/IA64InstrInfo.td:1.12 --- llvm/lib/Target/IA64/IA64InstrInfo.td:1.11 Wed Apr 13 01:12:04 2005 +++ llvm/lib/Target/IA64/IA64InstrInfo.td Thu Apr 14 05:08:01 2005 @@ -28,6 +28,9 @@ def u64imm : Operand { let PrintMethod = "printU64ImmOperand"; } +def s64imm : Operand { + let PrintMethod = "printS64ImmOperand"; +} // the asmprinter needs to know about calls let PrintMethod = "printCallOperand" in @@ -89,7 +92,7 @@ "mov $dst = $imm;;">; def MOVSIMM22 : AForm<0x03, 0x0b, (ops GR:$dst, s22imm:$imm), "mov $dst = $imm;;">; -def MOVLIMM64 : AForm<0x03, 0x0b, (ops GR:$dst, u64imm:$imm), +def MOVLIMM64 : AForm<0x03, 0x0b, (ops GR:$dst, s64imm:$imm), "movl $dst = $imm;;">; def AND : AForm<0x03, 0x0b, (ops GR:$dst, GR:$src1, GR:$src2), From alenhar2 at cs.uiuc.edu Thu Apr 14 11:18:06 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Thu, 14 Apr 2005 11:18:06 -0500 Subject: 
[llvm-commits] CVS: llvm/test/Regression/CodeGen/Alpha/zapnot2.ll Message-ID: <200504141618.j3EGI670024833@apoc.cs.uiuc.edu> Changes in directory llvm/test/Regression/CodeGen/Alpha: zapnot2.ll added (r1.1) --- Log message: added a random and mask test --- Diffs of the changes: (+10 -0) zapnot2.ll | 10 ++++++++++ 1 files changed, 10 insertions(+) Index: llvm/test/Regression/CodeGen/Alpha/zapnot2.ll diff -c /dev/null llvm/test/Regression/CodeGen/Alpha/zapnot2.ll:1.1 *** /dev/null Thu Apr 14 11:18:00 2005 --- llvm/test/Regression/CodeGen/Alpha/zapnot2.ll Thu Apr 14 11:17:49 2005 *************** *** 0 **** --- 1,10 ---- + ; Make sure this testcase codegens to the zapnot instruction + ; RUN: llvm-as < %s | llc -march=alpha | grep 'zapnot' + + implementation ; Functions: + + long %bar(long %x) { + entry: + %tmp.1 = and long %x, 255 ; [#uses=1] + ret long %tmp.1 + } From alenhar2 at cs.uiuc.edu Thu Apr 14 11:24:13 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Thu, 14 Apr 2005 11:24:13 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp AlphaISelPattern.cpp Message-ID: <200504141624.j3EGODIv024865@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaAsmPrinter.cpp updated: 1.11 -> 1.12 AlphaISelPattern.cpp updated: 1.97 -> 1.98 --- Log message: a 21264 fix, and fix the operator precedence on an and -> zap check (should fix hundreds of test cases) --- Diffs of the changes: (+11 -3) AlphaAsmPrinter.cpp | 10 +++++++++- AlphaISelPattern.cpp | 4 ++-- 2 files changed, 11 insertions(+), 3 deletions(-) Index: llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp diff -u llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp:1.11 llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp:1.12 --- llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp:1.11 Wed Apr 13 12:17:28 2005 +++ llvm/lib/Target/Alpha/AlphaAsmPrinter.cpp Thu Apr 14 11:24:00 2005 @@ -25,9 +25,14 @@ #include "llvm/Support/Mangler.h" #include "llvm/ADT/Statistic.h" +#include "llvm/Support/CommandLine.h" using
namespace llvm; +namespace llvm { + extern cl::opt<bool> EnableAlphaFTOI; +} + namespace { Statistic<> EmittedInsts("asm-printer", "Number of machine instrs printed"); @@ -229,7 +234,10 @@ bool AlphaAsmPrinter::doInitialization(Module &M) { AsmPrinter::doInitialization(M); - O << "\t.arch ev56\n"; + if(EnableAlphaFTOI) + O << "\t.arch ev6\n"; + else + O << "\t.arch ev56\n"; return false; } Index: llvm/lib/Target/Alpha/AlphaISelPattern.cpp diff -u llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.97 llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.98 --- llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.97 Wed Apr 13 12:17:28 2005 +++ llvm/lib/Target/Alpha/AlphaISelPattern.cpp Thu Apr 14 11:24:00 2005 @@ -1695,9 +1695,9 @@ unsigned int build = 0; for(int i = 0; i < 8; ++i) { - if (k & 0xFF == 0xFF) + if ((k & 0x00FF) == 0x00FF) build |= 1 << i; - else if (k & 0xFF != 0) + else if ((k & 0x00FF) != 0) { build = 0; break; } k >>= 8; } From alenhar2 at cs.uiuc.edu Thu Apr 14 12:34:36 2005 From: alenhar2 at cs.uiuc.edu (Andrew Lenharth) Date: Thu, 14 Apr 2005 12:34:36 -0500 Subject: [llvm-commits] CVS: llvm/lib/Target/Alpha/AlphaISelPattern.cpp AlphaInstrInfo.td Message-ID: <200504141734.j3EHYaHP024939@apoc.cs.uiuc.edu> Changes in directory llvm/lib/Target/Alpha: AlphaISelPattern.cpp updated: 1.98 -> 1.99 AlphaInstrInfo.td updated: 1.40 -> 1.41 --- Log message: fix calls --- Diffs of the changes: (+2 -1) AlphaISelPattern.cpp | 1 + AlphaInstrInfo.td | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) Index: llvm/lib/Target/Alpha/AlphaISelPattern.cpp diff -u llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.98 llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.99 --- llvm/lib/Target/Alpha/AlphaISelPattern.cpp:1.98 Thu Apr 14 11:24:00 2005 +++ llvm/lib/Target/Alpha/AlphaISelPattern.cpp Thu Apr 14 12:34:20 2005 @@ -1418,6 +1418,7 @@ BuildMI(BB, Alpha::CALL, 1).addGlobalAddress(GASD->getGlobal()); } else { //use PC relative branch call + AlphaLowering.restoreGP(BB); BuildMI(BB, Alpha::BSR, 1,
Alpha::R26).addGlobalAddress(GASD->getGlobal(),true); } } Index: llvm/lib/Target/Alpha/AlphaInstrInfo.td diff -u llvm/lib/Target/Alpha/AlphaInstrInfo.td:1.40 llvm/lib/Target/Alpha/AlphaInstrInfo.td:1.41 --- llvm/lib/Target/Alpha/AlphaInstrInfo.td:1.40 Wed Apr 13 11:19:50 2005 +++ llvm/lib/Target/Alpha/AlphaInstrInfo.td Thu Apr 14 12:34:20 2005 @@ -330,7 +330,7 @@ R20, R21, R22, R23, R24, R25, R27, R28, R29, F0, F1, F10, F11, F12, F13, F14, F15, F16, F17, F18, F19, - F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30] in { + F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30], Uses = [R29] in { def JSR : MForm< 0x1A, (ops GPRC:$RD, GPRC:$RS, s14imm:$DISP), "jsr $RD,($RS),$DISP">; //Jump to subroutine def BSR : BForm<0x34, (ops GPRC:$RD, s21imm:$DISP), "bsr $RD,$DISP">; //Branch to subroutine } From brukman at cs.uiuc.edu Thu Apr 14 14:34:15 2005 From: brukman at cs.uiuc.edu (Misha Brukman) Date: Thu, 14 Apr 2005 14:34:15 -0500 Subject: [llvm-commits] CVS: llvm-www/status/index.html Message-ID: <200504141934.OAA20215@zion.cs.uiuc.edu> Changes in directory llvm-www/status: index.html updated: 1.42 -> 1.43 --- Log message: Add Octave compilation status --- Diffs of the changes: (+11 -4) index.html | 15 +++++++++++---- 1 files changed, 11 insertions(+), 4 deletions(-) Index: llvm-www/status/index.html diff -u llvm-www/status/index.html:1.42 llvm-www/status/index.html:1.43 --- llvm-www/status/index.html:1.42 Mon Apr 5 12:29:37 2004 +++ llvm-www/status/index.html Thu Apr 14 14:34:04 2005 @@ -2,7 +2,7 @@ LLVM: status of compiling off-the-shelf programs @@ -65,6 +65,14 @@ Status Notes + + + octave + 2.1.57 + 11 Feb 2005 + FAILS + bug #504 (indvars). + llvm @@ -156,7 +164,7 @@ 1.11.6 15 Mar 2004 FAILS - Cannot compile: see bug 295. + Cannot compile: see bug 156. 
@@ -371,7 +379,6 @@ (X^Y) + N1 = getNode(ISD::XOR, MVT::i1, N1, N2); + break; + case ISD::SETGT: // X >s Y --> X == 0 & Y == 1 --> X^1 & Y + case ISD::SETULT: // X <u Y --> X == 0 & Y == 1 --> X^1 & Y + N1 = getNode(ISD::AND, MVT::i1, N2, + getNode(ISD::XOR, MVT::i1, N1, getConstant(1, MVT::i1))); + break; + case ISD::SETLT: // X <s Y --> X == 1 & Y == 0 --> Y^1 & X + case ISD::SETUGT: // X >u Y --> X == 1 & Y == 0 --> Y^1 & X + N1 = getNode(ISD::AND, MVT::i1, N1, + getNode(ISD::XOR, MVT::i1, N2, getConstant(1, MVT::i1))); + break; + case ISD::SETULE: // X <=u Y --> X == 0 | Y == 1 --> X^1 | Y + case ISD::SETGE: // X >=s Y --> X == 0 | Y == 1 --> X^1 | Y + N1 = getNode(ISD::OR, MVT::i1, N2, + getNode(ISD::XOR, MVT::i1, N1, getConstant(1, MVT::i1))); + break; + case ISD::SETUGE: // X >=u Y --> X == 1 | Y == 0 --> Y^1 | X + case ISD::SETLE: // X <=s Y --> X == 1 | Y == 0 --> Y^1 | X + N1 = getNode(ISD::OR, MVT::i1, N1, + getNode(ISD::XOR, MVT::i1, N2, getConstant(1, MVT::i1))); + break; + } + if (VT != MVT::i1) + N1 = getNode(ISD::ZERO_EXTEND, VT, N1); + return N1; + } + + SetCCSDNode *&N = SetCCs[std::make_pair(std::make_pair(N1, N2), std::make_pair(Cond, VT))]; if (N) return SDOperand(N, 0);