Papers, Reports, Posters and Demos

Energy Efficient Computing

Exploiting DMA for Performance and Energy Optimized STREAM on a DSP, with G Netzer, D Ahlin, E Stotzer and P Varis, The 10th Workshop on High-Performance, Power-Aware Computing, May 19, 2014, Phoenix, AZ. (held in conjunction with the 28th Annual International Parallel & Distributed Processing Symposium (IPDPS) May 19–23). talk

Final Report on Prototype Evaluation, PRACE Deliverable D9.3.4, 75 pages, December 2013 (with Gilbert Netzer).

Report on prototype evaluation, PRACE deliverable D9.3.3, 164 pages, March 2013 (with Gilbert Netzer).

Efficiency, Energy Efficiency, and Programming of Accelerated HPC Servers: Highlights of PRACE Studies in GPU Solutions to Multi-Scale Problems in Science and Engineering, Springer Verlag, Lecture Notes in Earth System Sciences, pp. 33–78, 2013. talk

Overview of Data Centers Energy Efficiency Evolution, in Handbook of Energy-Aware and Green Computing, pp. 983–1028, CRC Press, 2012. talk

The SNIC/KTH PRACE prototype: Achieving high energy efficiency with commodity technology without acceleration, Lennart Johnsson, Daniel Ahlin, John Wang, First International Green Computing Conference, August 15–18, 2010, Chicago, IL.

Grid Software

“Dynamic, Context-Aware, Least-Privilege Grid Delegation”, Mehran Ahsant, Jim Basney, Lennart Johnsson, The 8th IEEE/ACM International Conference on Grid Computing, Grid 2007, September 19 – 21, Austin, TX

“Grids, Accounting and High-Performance Computing”, 2007 Symposium on Digital Life Technologies: Building a Safe, Secured and Sound (3S) Living Environment, 17 pp, Tainan, Taiwan, June 7 – 8, 2007.

“Towards An On-Demand Restricted Delegation Mechanism for Grids”, M. Ahsant, J. Basney, O. Mulmo, A.J. Lee, L. Johnsson, Grid 2006, the 7th IEEE/ACM International Conference on Grid Computing, September 28 – 29, 2006, Barcelona, Spain.

“A Service-Oriented Approach to Enforce Grid Resource Allocations”, T. Sandholm, P. Gardfjall, E. Elmroth, O. Mulmo, and Lennart Johnsson, International Journal of Cooperative Information Systems, vol. 15, No. 3, pp. 439 – 459, September, 2006.

“Scheduling Strategies for Mapping Application Workflows onto the Grid”. A. Mandal, K. Kennedy, C. Koelbel, G. Marin, B. Liu, L. Johnsson, and J. Mellor-Crummey, in 14th IEEE Symposium on High Performance Distributed Computing (HPDC 2005). IEEE Computer Society Press.

“New Grid Scheduling and Rescheduling Methods in the GrADS Project”, F. Berman, H. Casanova, A. Chien, K. Cooper, H. Dail, A. Dasgupta, W. Deng, J. Dongarra, L. Johnsson, K. Kennedy, C. Koelbel, B. Liu, X. Liu, A. Mandal, G. Marin, M. Mazina, J. Mellor-Crummey, C. Mendes, A. Olugbile, M. Patel, D. Reed, Z. Shi, O. Sievert, H. Xia, A. YarKhan, International Journal of Parallel Programming, vol. 33, no 2-3, pp. 209-229, 2005.

“An OGSA-Based Accounting System for Allocation Enforcement across HPC Centers”, Thomas Sandholm, Peter Gardfjäll, Erik Elmroth, Lennart Johnsson, Olle Mulmo, Second International Conference on Service Oriented Computing, pp. 279 – 288, November 15 – 18, 2004, New York.

“New Grid Scheduling and Rescheduling Methods in the GrADS Project”, K. Cooper, A. Dasgupta, K. Kennedy, C. Koelbel, A. Mandal, G. Marin, M. Mazina, J. Mellor-Crummey, F. Berman, H. Casanova, A. Chien, H. Dail, X. Liu, A. Olugbile, O. Sievert, H. Xia, L. Johnsson, B. Liu, M. Patel, D. Reed, W. Deng, C. Mendes, Z. Shi, A. YarKhan, J. Dongarra, NSF Next Generation Software Workshop, International Parallel and Distributed Processing Symposium, April 26 – 30, 2004, Santa Fe.

“Toward a Framework for Preparing and Executing Adaptive Grid Programs”. Ken Kennedy, Mark Mazina, John Mellor-Crummey, Keith Cooper, Linda Torczon, Fran Berman, Andrew Chien, Holly Dail, Otto Sievert, Dave Angulo, Ian Foster, Dennis Gannon, Lennart Johnsson, Carl Kesselman, Ruth Aydt, Daniel Reed, Jack Dongarra, Sathish Vadhiyar, and Rich Wolski. April 2002, Proceedings of NSF Next Generation Systems Program Workshop (International Parallel and Distributed Processing Symposium 2002), Fort Lauderdale, FL.

“The GrADS Project: Software Support for High-Level Grid Application Development”, (with F. Berman, A. Chien, K. Cooper, J. Dongarra, I. Foster, D. Gannon, K. Kennedy, C. Kesselman, J. Mellor-Crummey, D. Reed, L. Torczon and R. Wolski), Journal on Supercomputer Applications and High-Performance Computing, vol. 15, no. 4, pp 327 – 344, 2001.

“SimDB: A Problem Solving Environment for Molecular Dynamics Simulation and Analysis”, (with Matin Abdullah, Michael Feig, and Montgomery Pettitt) First European Grid Forum Workshop, pp. 321 – 329, in “Proceedings of ISThmus 2000: Research and Development for the Information Society”, ISBN 83-913639-0-2, April 2000.

“Large Scale Data Repository: Design of a Molecular Dynamics Trajectory Database”, with Michael Feig, Matin Abdullah, and Montgomery Pettitt, in Future Generation Computer Systems, Elsevier, North-Holland, Vol. 16, No. 1, pp. 101 – 110, 1999.

Computer Communication

“Network–Related Performance Issues and Techniques for MPPs”, in Optoelectronic Interconnect and Packaging, Critical reviews of Optical Science and Technology, Vol. CR62, pp. 176 – 209, 1996, SPIE Press.

“ROMM Routing on Mesh and Torus Networks”, (with Ted Nesson) Proceedings of the 7th Annual ACM Symposium on Parallel Algorithms and Architectures, ACM Press, pages 275 – 287, 1995.

“All–to–All Communication on the Connection Machine system CM–200”, (with Kapil K. Mathur), the Journal of Scientific Programming, Vol. 4, No. 4, pp. 251 – 273, 1995.

“On the Conversion between Binary Code and Binary Reflected Gray Code”, (with Ching-Tien Ho), in IEEE Transactions on Computers, Vol. 44, No. 1, pp. 47 – 53, January 1995.

“ROMM Routing: A Class of Efficient Minimal Routing Algorithms”, (with Ted Nesson) Proceedings of the Parallel Computer Routing and Communication Workshop, Springer–Verlag, Lecture Notes in Computer Science 853, pages 185 – 199, 1994.

“Issues in High Performance Computer Networks”, in IEEE Technical Committee on Computer Architecture Newsletter, Summer – Fall 1994, pp. 14 – 19.

“Optimal Communication Channel Utilization for Matrix Transposition and Related Permutations on Boolean Cubes”, (with Ching–Tien Ho) in the Journal of Discrete Applied Mathematics, Vol. 53, pp. 251 – 274, September 1994.

“An Efficient Communication Strategy for Finite Element Methods on the Connection Machine CM–5 System”, (with Zdenek Johan, Kapil K Mathur, and Thomas J.R. Hughes), in Computer Methods in Applied Mechanics and Engineering, vol. 113, pages 363 – 387, 1994.

“POLYSHIFT Communications Software for the Connection Machine System CM–200”, (with Ralph Brickner and William George), Journal of Scientific Programming, vol. 3, No. 1, pp. 83 – 99, Spring 1994.

“Boolean Cube Emulation of Butterfly Networks Encoded by Gray Code” (with Ching–Tien Ho), Journal of Parallel and Distributed Computing, Vol. 20, No. 3, pp 261 – 279, 1994.

“Data Motion and High Performance Computing”, in Proceedings of the First International Workshop on Massively Parallel Processing Using Optical Interconnections, pages 1 – 18, IEEE Computer Society, Order No. 5832-02, ISBN 0-8186-5832-02, 1994.

“Minimizing the Communication Time for Matrix Multiplication on Multiprocessors”, Journal of Parallel Computing, Vol. 19, No. 11, pp. 1235 – 1257, 1993.

“All–to–all Communication Algorithms for Distributed BLAS”, (with Kapil K. Mathur) 6th SIAM Conference on Parallel Processing for Scientific Computing, Norfolk, Virginia, March 22 – 24, 1993.

“All–to–All Broadcast with Applications on the Connection Machine”, (with Jean-Philippe Brunet), International Journal of Supercomputer Applications, Vol. 6, No. 3, pp. 241 – 256, 1992.

“Communication Primitives for Unstructured Finite Element Simulations on Data Parallel Architectures”, (with Kapil K. Mathur) Computing Systems in Engineering, Vol. 3, Nos. 1 – 4, pp. 63 – 72, 1992.

“Maximizing Channel Utilization for All–to–All Personalized Communication on Boolean cubes”, (with Ching–Tien Ho) Proceedings of the Sixth Distributed Memory Computing Conference, pp. 299 – 304, IEEE Computer Society Press, April, 1991.

“The Complexity of Reshaping Arrays on Boolean Cubes”, (with Ching–Tien Ho), Proceedings of The Fifth Distributed Memory Computing Conference, pp. 370 – 377, IEEE Computer Society, April, 1990.

“Optimal Communication in Network Architectures”, in VLSI Frontiers: Massively Parallel Models of Computation by Morgan Kaufmann Publishers, pp. 223 – 389, 1990.

“Spanning Graphs for Optimum Broadcasting and Personalized Communication in Hypercubes”, (with Ching–Tien Ho), IEEE Trans. Computers, Vol. 38, No. 9, pp. 1249 – 1268, September, 1989.

“Spanning Balanced Trees in Boolean cubes”, (with Ching–Tien Ho). SIAM J. Sci. Stat. Comp., Vol. 10, No 4, pp. 607 – 630, July 1989.

“Stable Dimension Permutations on Boolean Cubes”, (with Ching–Tien Ho), Department of Computer Science, Yale University, Technical Report YALEU/DCS/RR–617, March 1988.

“Communication Efficient Basic Linear Algebra Computations on Hypercube Architectures”, Journal of Parallel and Distributed Computing, Vol. 4, No. 2, pp. 133 – 172, April 1987.

“The Communication Efficiency of Meshes, Boolean Cubes, and Cube Connected Cycles for Wafer Scale Integration”, (with Abhiram Ranade), The 1987 International Conference on Parallel Processing, pp. 479 – 482, IEEE Computer Society, 1987.

“Distributed Routing Algorithms for Broadcasting and Personalized Communication in Hypercubes”, (with Ching–Tien Ho), The 1986 International Conference on Parallel Processing, pp. 640 – 648, IEEE Computer Society, 1986.

“Data Permutations and Basic Linear Algebra Computations on Ensemble Architectures”, presented at the Second SIAM Meeting on Parallel Processing for Scientific Computing, Norfolk, Virginia, November 18 – 21, 1985, Report YALEU/DCS/RR–367, February 1985.

“Mathematical Approach to Computational Networks”, (with D. Cohen), IEEE International Conference on Computer Design: VLSI in Computers, October 31 – November 3, 1983, pp. 642 – 646, New York. IEEE Computer Society, 83CH1935–6.

“A Mathematical Approach to Modeling the Flow of Data and Control in Computational Networks”, (with Danny Cohen), VLSI Systems and Computations, Eds. Kung, Sproull, Steele, Computer Sciences Press, Rockville, 1981, pp. 213 – 225.

Data Distribution

“A Data–Parallel Implementation of the Geometric Partitioning Algorithm”, (with Yu Hu and Shanghua Teng), Eighth SIAM Conference on Parallel Processing for Scientific Computing, Minneapolis, Minnesota, March 14 – 17, 1997.

“Embedding Hyper–pyramids in Hypercubes”, (with Ching–Tien Ho), IBM Journal of Research and Development, Vol. 38, No. 1, pp. 31 – 45, 1994.

“Mesh Decomposition and Communication Procedures for Finite Element Applications on the Connection Machine CM–5 System”, (with Zdenek Johan, Kapil K. Mathur and Thomas J.R. Hughes), in High–Performance Computing and Networking, Vol. 2, pages 233 – 240, Springer–Verlag, Lecture Notes in Computer Science, 1994.

“Massively Parallel Computing: Data distribution and communication”, Parallel Architectures and their Efficient Use, pp 68 – 92, Springer Verlag, 1993.

“Embedding Meshes in Boolean Cubes by Graph Decomposition”, (with Ching–Tien Ho), the Journal of Parallel and Distributed Computing, Vol. 8, No. 4, pp. 325 – 339, April 1990.

“Embedding Three–Dimensional Meshes in Boolean Cubes by Graph Decomposition”, (with Ching–Tien Ho), Proceedings of The 1990 International Conference on Parallel Processing, pp. 319 – 326, IEEE Computer Society, August, 1990.

“Embedding Meshes into Small Boolean Cubes”, (with Ching–Tien Ho), Proceedings of The Fifth Distributed Memory Computing Conference, pp. 1366 – 1374, IEEE Computer Society, April, 1990.

“Dilation d Embeddings of a Hyper–Pyramid into a Hypercube”, (with Ching–Tien Ho), Supercomputing 89, ACM Press, pp. 294 – 303, November 1989.

“Embedding Hyper–pyramids in Hypercubes”, (with Ching–Tien Ho), Department of Computer Science, Yale University, Technical Report YALEU/DCS/RR–667, December 1988.

“On the Embedding of Arbitrary Meshes in Boolean Cubes with Expansion Two Dilation Two”, (with Ching–Tien Ho), The 1987 International Conference on Parallel Processing, pp. 188 – 191, IEEE Computer Society, 1987.

“Graph Embeddings for Maximum Bandwidth Utilization in Hypercubes”, (with Ching–Tien Ho), presented at the International Conference on Vector and Parallel Computing, Loen, Norway, June 1986.

Scalable Algorithms and Scientific Software Libraries

“Adaptive Computation of Self Sorting In-place FFTs on Hierarchical Memory Architectures”, Ayaz Ali, Lennart Johnsson, Jaspal Subhlok, High Performance Computation Conference 2007, September 26 – 28, 2007, Houston, TX

“Scheduling FFT Computation on SMP and Multi-core Systems”, Ayaz Ali, Lennart Johnsson, Dragan Mirkovic, 21st International Conference on Supercomputing, June 16 – 20, 2007, Seattle, WA.

“Data Parallel Performance Optimizations Using Array Aliasing”, in Algorithms for Parallel Processing, Vol. 105, IMA Series in Mathematics and its Applications, pp. 213 – 246, Springer Verlag, 1999.

“Load–Balance in Parallel FACR”, (with Nikos Pitsianis), in High Performance Algorithms for Structured Matrix Problems, pp. 163 – 180, Nova, 1999.

“High Performance Fortran for Highly Irregular Problems”, (with Yu Hu and Shang-Hua Teng) Sixth ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, pp. 13 – 24, Las Vegas, Nevada, June 18 – 21, 1997.

“A Vector Space Framework for Parallel Stable Permutations”, (with Nadia Shalaby) Second International Workshop on Formal Methods for Parallel Programming: Theory and Applications, Geneva, Switzerland, April 1, 1997.

“On the Accuracy of Anderson’s fast N–body algorithm”, (with Yu Hu) Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, Minneapolis, Minnesota, March 14 – 17, 1997.

“On the accuracy of Poisson’s formula based N–body algorithms”, (with Yu Hu), Harvard University Technical Report TR-06-96, May 1996.

“Hierarchical Load–Balancing for Parallel Fast Legendre Transforms”, (with Nadia Shalaby), Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, Minneapolis, Minnesota, March 14 – 17, 1997.

“A Data Parallel Implementation of O(N) Hierarchical N–body Methods”, (with Yu Hu) Supercomputing ’96, Pittsburgh, November 17 – 22, 1996.

“Local Basic Linear Algebra Subroutines (LBLAS) for the CM–5/5E”, (with David Kramer and Yu Hu), in the International Journal of Supercomputer Applications, Vol. 10, No. 4, pp. 300 – 335, 1996.

“A Data Parallel Implementation of Hierarchical N–body Methods”, (with Yu Hu), in the International Journal of Supercomputer Applications, Vol. 10, No. 1, pp. 3 – 40, 1996.

“Implementing O(N) N–body algorithms efficiently in data parallel languages”, (with Yu Hu), in the Journal of Scientific Programming, Vol. 5, No. 4, pp. 337 – 364, 1996.

“Structured Linear Algebra Software on Scalable Architectures”, ICIAM95, Hamburg, July 3 – 7, page 54, ICIAM Book of Abstracts, 1995

“Parallel Implementation of Recursive Spectral Bisection on the Connection Machine CM–5 System”, (with Zdenek Johan, Kapil K. Mathur and Thomas J.R. Hughes), Parallel Computational Fluid Dynamics: New Trends and Advances, pages 451 – 459, Elsevier Science, 1995.

“CMSSL: A Scalable Scientific Software Library”, in Proceedings of the Scalable Parallel Libraries Conference, pages 57 – 66, IEEE Computer Society, Order No. 4980-02, ISBN 0-8186-4980-1, 1994.

“Load–Balanced LU and QR Factor and Solve Routines for Scalable Processors with Scalable I/O” (with Jean-Philippe Brunet and Pelle Pedersen), in Proceedings of the 14th IMACS World Congress, July 11 – 15, 1994, Atlanta, Georgia. Harvard University Technical Report TR-20-94.

“High Performance, Scalable Scientific Software Libraries”, (with Kapil K. Mathur) Portability and Performance in Parallel Processing, pages 159 – 208, 1994, John Wiley & Sons.

“Scientific Software Libraries for Scalable Architectures”, (with Kapil K. Mathur), in Parallel Scientific Computing, Springer Verlag, 1994.

“Index Transformation Algorithms in a Linear Algebra Framework”, (with Alan Edelman and Steve Heller), in Transactions on Parallel and Distributed Systems, Vol. 5, No. 12, pp. 1302 – 1309, 1994.

“Multiplication of Matrices of Arbitrary Shape on a Data Parallel Computer”, (with Kapil K. Mathur), in Journal of Parallel Computing, Vol. 20, No. 7, pp. 919 – 951, July, 1994.

“An Efficient Algorithm for Gray–to–Binary Permutation on Hypercubes”, (with Ching–Tien Ho and M.T. Raghunath), Journal of Parallel and Distributed Computing, Vol. 20, No. 1, pp. 114 – 120, 1994.

“Massively Parallel Computing: Mathematics and Communications Libraries”, (with Kapil K. Mathur), Parallel Supercomputing in Atmospheric Science, pages 250 – 285, 1993, World Scientific.

“Block Cyclic Dense Linear Algebra”, (with Woody Lichtenstein), SIAM J. of Sci. Comp., Vol. 14, No. 6, pp. 1257 – 1286, 1993.

“Local Basic Linear Algebra Subroutines (BLAS) on the Connection Machine System CM–200”, (with Luis Ortiz), International Journal of Supercomputer Applications, pp. 322 – 350, Vol. 7, No. 1, 1993.

“Cooley–Tukey FFT on the Connection Machine”, (with Robert L. Krawitz), Journal of Parallel Computing, Vol. 18, No. 11, pp. 1201 – 1221, 1992.

“Communication Efficient Multi–Processor FFT”, (with Michel Jacquemin and Robert L. Krawitz), Journal of Computational Physics, Vol. 102, No. 2, pp. 381 – 397, October 1992.

“Generalized Shuffle Permutations on Boolean Cubes”, (with Ching–Tien Ho), Journal of Parallel and Distributed Computing, Vol. 16, No. 1, pp. 1 – 14, 1992.

“The Parallel Multipole Method on the Connection Machine”, (with Feng Zhao), SIAM J. Sci. Stat. Comp., Vol. 12, No. 6, pp. 1420 – 1437, November 1991.

“Matrix Multiplication on Hypercubes Using Full Bandwidth and Constant Storage”, (with Ching–Tien Ho), Proceedings of the Sixth Distributed Memory Computing Conference, pp. 447 – 451, IEEE Computer Society Press, April, 1991.

“Communication and I/O Libraries”, presented at DARPA Workshop on Scalable Scientific Libraries, September, 1990. Technical Report TR–02–91, Harvard University, January 1991.

“True Hypercube Algorithms on the Connection Machine” (with Alan Edelman, Mark Bromley, and Steve Heller), Parallel Computing: Achievements, Problems, and Prospects, Capri, Italy, June 3 – 7, 1990.

“Optimizing Tridiagonal Solvers for the Alternating Direction Method on Boolean Cube Multiprocessors”, (with Ching–Tien Ho), SIAM J. Sci. Stat. Comp., Vol. 11, No. 3, pp. 563 – 592, May 1990.

“The Connection Machine Scientific Software Library” (with Anne Trefethen and Kapil K. Mathur), Fourth SIAM Conference on Parallel Processing for Scientific Computing, December 12, 1989, Chicago, IL.

“Histogram Computation on Distributed Memory Architectures” (with Dimitris C. Gerogiannis and Stelios C. Orphanoudakis), Journal on Concurrency: Practice and Experience, Vol. 1, No. 2, pp. 219 – 237, December 1989.

“Optimizing Tridiagonal Solvers for Alternating Direction Methods on Boolean Cube Multiprocessors”, (with Ching–Tien Ho), Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, pp. 96 – 98, December 11, 1989. SIAM 1990.

“Matrix Multiplication on a Data Parallel Architecture”, (with Kapil K. Mathur and Tim Harris), Fourth SIAM Conference on Parallel Processing for Scientific Computing, December 11, 1989, Chicago, IL.

“A Radix–2 FFT on the Connection Machine”, (with Robert L. Krawitz, Roger Frye and Doug MacDonald), Fourth SIAM Conference on Parallel Processing for Scientific Computing, December 11, 1989, Chicago, IL.

“Matrix Multiplication on the Connection Machine”, (with Tim Harris and Kapil K. Mathur), Supercomputing 89, ACM Press, pp. 326 – 332, November 1989.

“A Radix–2 FFT on the Connection Machine”, (with Robert Krawitz, Roger Frye and Doug MacDonald), Supercomputing 89, ACM Press, pp. 809 – 819, November 1989.

“High radix FFT on Boolean cube networks”, (with Michel Jacquemin and Ching–Tien Ho), Department of Computer Science, Yale University, Technical Report YALEU/DCS/RR–751, November 1989.

“Multiplying of Arbitrarily Shaped Matrices on Boolean Cubes Using the full Communications Bandwidth”, (with Ching–Tien Ho) Department of Computer Science, Yale University, Technical Report YALEU/DCS/RR–721, July 1989.

“Node Orderings and Concurrency in Structurally–Symmetric Sparse Problems”, (with I.S. Duff), Parallel Supercomputing: Methods, Algorithms and Applications, pp. 177 – 189, Wiley, 1989.

“Matrix Multiplication on Boolean Cubes using Generic Communication Primitives”, Parallel Processing and Medium Scale Multiprocessors, SIAM, pp. 108 – 156, 1989.

“Systolic FFT Algorithms on Boolean Cubes”, (with Ching–Tien Ho, Michel Jacquemin, and Alan Ruttenberg), Proceedings International Conference on Systolic Arrays, pp. 151 – 162, IEEE Computer Society Press, March 1988.

“Data Parallel Programming and Basic Linear Algebra Subroutines”, Scientific Software, Vol. 14., pp. 183 – 196, IMA Series in Mathematics and its Applications, Springer Verlag, 1988.

“Matrix Transposition on Boolean n–cube Configured Ensemble Architectures”, (with Ching–Tien Ho). SIAM J. Matrix Analysis, Vol. 9, No. 3, pp. 419 – 454, July 1988.

“Divide–and–Conquer Algorithms for SIMD Architectures”, The Second International Conference on Vector and Parallel Computing, June, 1988, Tromso, Norway.

“Systolic Fast Fourier Transform Algorithms for Boolean Cube Networks”, The 1988 International Conference on Systolic Arrays, May, 1988, San Diego.

“Optimal Algorithms for Stable Dimension Permutations on Boolean Cubes”, (with Ching–Tien Ho) The Third Conference on Hypercube Concurrent Computers and Applications, January 1988, ACM Press, pp 725 – 736.

“Highly Parallel Banded Systems Solvers”, Parallel Computations and Their Impact on Mechanics, AMD–Vol. 86, pp. 187 – 208, ASME, December, 1987.

“Algorithms for Matrix Transposition on Boolean Cube Configured Ensemble Architectures”, (with Ching–Tien Ho), The 1987 International Conference on Parallel Processing, pp. 621 – 629, IEEE Computer Society, 1987.

“Alternating Direction Methods on Multiprocessors”, (with Y. Saad and M. H. Schultz). SIAM J. Sci. Stat. Comp., Vol. 8, No. 5, pp. 686 – 700, September 1987.

“Solving Tridiagonal Systems on Ensemble Architectures”, SIAM J. Sci. Stat. Comp, Vol. 8, No. 3, pp. 354 – 392, May 1987.

“Solving Banded Systems on Parallel Architectures”, (with Jack Dongarra, Argonne National Laboratory), Journal of Parallel Computing, Vol. 5, No. 2, pp. 219 – 246, 1987.

“Computing Fast Fourier Transforms on Boolean Cubes and Related Networks”, in Advanced Algorithms and Architectures for Signal Processing II, SPIE 1987, Vol. 826, pp. 223 – 231, 1987.

“Multiple Tridiagonal Systems, the Alternating Direction Method, and Boolean Cube Configured Multiprocessors”, (with Ching–Tien Ho), Report YALEU/DCS/RR–532, July 1987.

“Fast PDE Solvers on Fine and Medium Grain Architectures”, in Advances in Computer Methods for Partial Differential Equations, IMACS (International Association for Mathematics and Computers in Simulation), Vol. 6, pp. 405 – 410, 1987.

“Band Matrix Systems Solvers on Ensemble Architectures”, Algorithms, Architectures and the Future of Scientific Computation, the University of Texas Press, pp. 195 – 216, 1986.

“The Effect of Orderings on the Parallelization of Sparse Codes”, (with Iain Duff, AERE Harwell), presented at the International Conference on Vector and Parallel Computing, Loen, Norway, June 1986.

“Floating–point CORDIC: An analysis of number representations and a bit–serial chip”, (with Venkatesh Krishnaswamy), Report YALEU/DCS/RR–473, April 1986.

“Fast Banded Systems Solvers for Ensemble Architectures”, presented at the SIAM Fall Meeting, Tempe, Arizona, October 1985, Report YALEU/DCS/RR–379, March 1985.

“Dense Matrix Operations on a Torus and a Boolean Cube”, AFIPS Conference Proceedings, Vol. 54, 1985, The National Computer Conference.

“Solving Narrow Banded Systems on Ensemble Architectures”, ACM TOMS, Vol. 11, No. 3, pp. 271 – 288, September 1985.

“Cyclic Reduction on a Binary Tree”, Computer Physics Communications, Vol. 37, 1985.

“Combining Parallel and Sequential Sorting on a Boolean n–cube”, The 1984 International Conference on Parallel Processing, IEEE Computer Society, 1984.

“Highly Concurrent Algorithms for Solving Linear Systems of Equations”, Elliptic Problem Solvers II, Academic Press 1984, pp. 105 – 126.

“Odd–Even Cyclic Reduction on Ensemble Architectures and the Solution of Tridiagonal Systems of Equations”, Report YALEU/DCS/RR–339, October 1984.

“Cyclic Reduction on a Binary Tree”, (presented at Vector and Parallel Processing in Computational Science II, Oxford, August 28 – 31, 1984), Report YALEU/DCS/RR–437, November 1984.

“A Mathematical Approach to Computational Networks for the Discrete Fourier Transform”, Draft, June 1984.

“Some Ensemble Architecture Algorithms for the Conjugate Gradient Method and for Tridiagonal Systems of Equations”, SIAM Conference on Parallel Processing for Scientific Computing, Norfolk, Virginia, November 10 – 11, 1983.

“Residue Arithmetic and VLSI”, (with Chao–Lin Chiang), IEEE International Conference on Computer Design: VLSI in Computers, pp. 80 – 83, IEEE Computer Society 83CH1935–6, 1983.

“Highly Concurrent Algorithms for Solving Linear Systems of Equations”, the Conference on Elliptic Problem Solving, Monterey, January 10 – 12, 1983.

“A Formal Derivation of Array Implementations of FFT Algorithms”, (with Danny Cohen), USC Conference on VLSI and Modern Signal Processing, Los Angeles, November 1 – 3, 1982, pp. 53 – 63.

“An Algebraic Description of Array Implementations of FFT Algorithms”, (with Danny Cohen), The 20th Annual Allerton Conference on Communication, Control and Computing, Monticello, Illinois, October 6 – 8, 1982, pp. 126 – 134.

“VLSI Algorithms for Doolittle’s, Crout’s and Cholesky’s Methods”, the International Conference on Circuits and Computers, ICCC 82, New York, September 29 – October 1, 1982, pp. 372 – 377.

“Pipelined Linear Equation Solvers and VLSI”, Microelectronics 1982, Adelaide, Australia, May 12 – 14, 1982, pp. 42 – 47, The Institution of Engineers, Australia, National Conference Publication No. 82/4.

“A Computational Array for the QR–method”, Proceedings, Conference on Advanced Research in VLSI, Ed. P. Pennfield, Artech House, 1982, pp. 123 – 129.

“Concurrent Algorithms for the Conjugate Gradient Method”, 5040:TR:82, Computer Science, California Institute of Technology, September 1982.

“A VLSI Algorithm and Array for the QR–method”, The 19th Annual Allerton Conference on Communication, Control and Computing, Monticello, Illinois, September 30 – October 2, 1981, pp. 235 – 236.

“A Computational Array for the QR–method”, 4533:DF:81, Computer Science, California Institute of Technology, July 1981.

“Computational Arrays for Band Matrix Equations”, 4287:TM:81, Computer Science, California Institute of Technology, May 1981.

“Computational Arrays for Discrete Fourier Transform”, (with Danny Cohen), Twenty–second Computer Science International Conference, COMPCON 81, San Francisco, February 24 – 26, 1981, pp. 236 – 244, IEEE Catalog No. 81CH1626–1.

“A Note on Householder’s Method, Sparse Matrices and Concurrency”, 4089:DF:80, Computer Science, California Institute of Technology, December 1980.

“Gaussian Elimination on Sparse Matrices and Concurrency, A Complexity Analysis”, 4087:TR:80, Computer Science, California Institute of Technology, December 1980.

“An Algorithm for State Estimation in Power Systems”, Paper No. KG 940–107E, Eighth Power Industry Computer Application Conference, (PICA), Minneapolis, June 1973.

“Matrislagring, Faktorisering och Nodnumrering”, (Storage Schemes for Matrices, Factorization and Node Ordering), (with Johan Schubert), ASEA Technical Report No. KYYS 714–7021, May 1977.

“On the Choice of Integration Method for a New Simulation Package for Dynamic Stability Analysis”, (with Kjell Aneros), ASEA Technical Report No. KYYS 714–6053, November 1976.

“An Algorithm for Matrix Multiplication by Structural Programming”, ASEA Technical Report No. KYY 714–4024, April 1974.

“Tillståndsuppskattning: En Analytisk Jämförelse av Befintliga Algoritmer”, (State Estimation: An Analytic Comparison of Existing Algorithms), ASEA Technical Report No. KYY 575–2019, March 1972.

“Direkta Metoder för Lösning av Överbestämda Ekvationssystem”, (Direct Methods for the Solution of Overdetermined Systems of Equations), ASEA Technical Report No. KYY 013–2015, February 1972.

“Optimal Estimering och Separationsatser”, (Optimal Estimation and Separation Theorems), ASEA Technical Report KYYH 013–1044, April 1971.

“Distributed Parameter Systems, An Annotated Bibliography – to April 1971, Parts I and II”, UCLA, Report No. UCLA–ENG–7143, August 1971.

“Beskrivning av Datamaskinprogram för Riskanalys av Investeringsprojekt, (MoDo)”, (Description of a Computer Program for the Analysis of Risk of Investment Project), Goteborg, July 1968.

Scalable Scientific and Engineering Applications

“Mimicry of Statistical Properties of Host Genomes by RNA Viruses”, Quance, M., Feng, C., Rojas, M., Putonti, C., Johnsson, L., Fofanov, Y., Keystone Symposia: Molecular Evolution as a Driving Force in Infectious Diseases, Beaver Run Resort, Breckenridge, CO, April 8 – 13, 2008.

“Developing Assays for the Detection of Influenza in Human Samples”, Chen Feng, Catherine Putonti, Michael Quance, Stephen Huff, Andrey Belokrylov, Lennart Johnsson, Krishna Jayaraman, Michael Hogan, and Yuriy Fofanov, The 2007 International Conference on Bioinformatics and Computational Biology (BIOCOMP’07), June 25-28, 2007, Las Vegas, NV.

“Two Challenges in Genomics That Can Benefit From Peta-Scale Platforms”, M. Zhang, C. Putonti, Y. Fofanov and L. Johnsson, Europar, August 29 – September 1, 2006, Dresden, Germany. Vol. 4375, Lecture Notes in Computer Science, Springer Verlag.

“Scalability of Finite Element Applications on Distributed–Memory Parallel Computers”, (with Zdenek Johan, Kapil K. Mathur, S. Lennart Johnsson and Thomas J.R. Hughes), in Computer Methods in Applied Mechanics and Engineering, Vol. 119, Nos. 1 – 2, pp. 61 – 72, November 1994.

“Data Parallel Finite Element Techniques for Compressible Flow Problems”, (with Zdenek Johan, Kapil K. Mathur, and Thomas J.R. Hughes), Proceedings of the Parallel Computational Fluid Dynamics 1994 Workshop, March 1994. Harvard University Technical Report TR-04-94, January 1994.

“Massively Parallel Computing: Unstructured Finite Element Simulations”, NAFEMS Benchmark, pp. 24 – 29, June, 1993.

“Massively Parallel Computing: Unstructured Finite Element Simulations”, (with K. Mathur, Zdenek Johan and Thomas J.R. Hughes), NAFEMS: Proceedings of the Fourth International Conference on Quality Assurance and Standards in Finite Element and Associated Technologies, NAFEMS, pp. 158 – 170, 1993.

“Data Parallel Finite Element Techniques for Computational Fluid Dynamics on the Connection Machine Systems”, Parallel Computational Fluid Dynamics ’92, pp. 215 – 229, North–Holland, 1993.

“Finite Element Techniques for Computational Fluid Dynamics on the Connection Machine CM–5 System”, with Z. Johan, K.K. Mathur, S.L. Johnsson and T.J.R. Hughes, the Second US Congress on Computational Mechanics, Washington D.C., August 1993.

“A Data Parallel Finite Element Method for Computational Fluid Dynamics on the Connection Machine Systems”, (with Zdenek Johan, Tom Hughes and Kapil K. Mathur), Computer Methods in Applied Mechanics and Engineering, Vol. 99, No. 1, pp. 113 – 134, August 1992.

“QCD on the Connection Machine: Beyond *Lisp”, (with Ralph G. Brickner and Clive F. Baillie), Computer Physics Communications, Vol. 65, pages 39 – 51, 1991.

“A Data Parallel Implementation of an Explicit Method for the Compressible Navier–Stokes Equations for Three–Dimensional Channel Flow”, (with Pelle Olsson) Journal of Parallel Computing, Vol. 14, No. 1, pp. 1 – 30, 1990.

“Experience with the Conjugate Gradient Method for Stress Analysis on a Data Parallel Supercomputer”, (with Kapil K. Mathur) International Journal on Numerical Methods in Engineering, Vol. 27, No. 3, pp. 523 – 546, December 1989.

“Boundary Modifications of the Dissipation Operators for the Three–Dimensional Euler Equations”, (with Pelle Olsson), Journal of Scientific Computing, Vol. 4, No. 2, pp. 159 – 195, June, 1989.

“The Finite Element Method on a Data Parallel Computing System”, (with Kapil K. Mathur), International Journal of High–Speed Computing, Vol. 1, No. 1, pp. 29 – 44, May 1989.

“Data Structures and Algorithms for the Finite Element Method on a Data Parallel Supercomputer”, (with Kapil K. Mathur), International Journal of Numerical Methods in Engineering, Vol. 29, No. 4, pp. 881 – 908, April 1990.

“Data Parallel Algorithms for the Finite Element Method”, (with Kapil K. Mathur), Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, pp. 257 – 267, December 1989. SIAM 1990.

“The Finite Element Method on a Data Parallel Architecture” (with Kapil K. Mathur), Fourth SIAM Conference on Parallel Processing for Scientific Computing, December 12, 1989, Chicago, IL.

“A Data Parallel Implementation of an Explicit Method for the 3–Dimensional Compressible Navier–Stokes Problem”, (with Pelle Olsson), Fourth SIAM Conference on Parallel Processing for Scientific Computing, December 11, 1989, Chicago, IL.

“QCD with Dynamical Fermions on the Connection Machine”, (with Clive Baillie, Ralph Brickner, Rajan Gupta), Supercomputing 89, ACM Press, pp. 2 – 9, November 1989.

“Element Order and Convergence Rate of the Conjugate Gradient Method for Stress Analysis on the Connection Machine”, (with Kapil K. Mathur), Supercomputing 89, ACM Press, pp. 337 – 343, November 1989.

“A study of Dissipation Operators for the Euler Equations and a Three–dimensional Channel Flow”, (with Pelle Olsson), Supercomputing 89, ACM Press, pp. 141 – 151, November 1989.

“The Finite Element Method on a Data Parallel Architecture”, (with Kapil K. Mathur), Fifth International Symposium on Numerical Methods in Engineering, September 1989.

“QED on the Connection Machine”, (with Clive Baillie, Luis Ortiz, and Stuart Pawley), The Third Conference on Hypercube Concurrent Computers and Applications, January 1988, ACM Press, pp. 1288 – 1285.

“Solving Schroedinger’s Equation on the Intel iPSC by the Alternating Direction Method”, (with F. Saied, Ching–Tien Ho and M. H. Schultz), in Hypercube Multiprocessors 1987, pp. 680 – 691, SIAM 1987.

“The Three–Dimensional Wide Angle Wave Equation, Tridiagonal Systems and the Intel iPSC”, (with F. Saied, Ching–Tien Ho and M. H. Schultz), Second Conference on Hypercube Multiprocessors, Knoxville, Tennessee, September 29 – October 1, 1986.

“TRANSTA – A Computer Program for Simulation of Electromechanical Transients in Power Systems: Numerical Technique and Program Structure”, Technical Report KYLS 714–9016, May 1979.

“Static Network Equivalents for Load Flow Calculations”, (with Kjell Aneros), ASEA Technical Report No. KYYS 714–7011, March 1977.

“Modell for simulering av trefas växelströms Ljusbågsugn”, (with Lennart Friis), ASEA Technical Report No. TR KYYS 6631–6054, November 1976.

“An Algorithm for Thermal Unit Commitment”, (with Kjell Aneros), ASEA Technical Report No. KYYS 714–6052, December 1976.

“Detection and Correction of Bad Measurements in Power Systems – Algorithmic Description”, (with Gunnar Bengtsson), ASEA Technical Report No. KYYS 714–6051, November 1976.

“Production Scheduling in Power Systems”, (with Kjell Aneros), ASEA Technical Report No. KYYS 714–6046, October 1976.

“Functional Specification for the TIDAS State Estimation Function”, (with M. Manson and L. Pettersson), ASEA Technical Report No. KYYS 714–6028, June 1976.

“On–Line Determination of Power and Measurement System Configuration, Topological Properties and Observability”, (with M. Manson), Proceedings of the IEEE On–Line Operation and Optimization of Transmission and Distribution Systems Conference, London, Paper No. KC 940–109 E, June 1976.

“The Use of State Estimation in Power System Control Centres”, ASEA Technical Report No. KYYS 714–6017, May 1976.

“An Evaluation and Simulation Program Package for State Estimation in Power Systems”, ASEA Technical Report No. KYYS 714–6016, May 1976.

“Detektion av Stora Mätfel i Kraftsystemsammanhang”, (Detection of Gross Measurement Errors in Electric Power Systems), (with L. Bengtsson and M. Manson), ASEA Technical Report No. KYYS RM 714–6012, April 1976.

“Utvärdering av Mätningar på Magnetisk Modell av PDC–Maskinen” (Evaluation of Measurements on a Magnetic Model of the PDC–Machine), (with Gunnar Andersson), ASEA Technical Report TR KYYS 7277–6010, April 1976.

“Mätnoggrannheten for Stressometersystemet SM 200”, (Measurement Accuracy of the Stressometer System SM 200), (with Gunnar Andersson), ASEA Technical Report No. KYYS 5693–6007, February 1976.

“State Estimation in Power Systems. A Description of ASEA’s O.–line Program Package”, ASEA Technical Report No. KYYS 714–6006, February 1976.

“An Algorithm for Measurement Bias Estimation in Power System State Estimation”, ASEA Technical Report No. KYYS 714–6005, March 1976.

“Capabilities of the Economic Dispatch Calculation Function in SINDAC”, (with Piotr Chanachowicz), ASEA Technical Report No. KYYS 714–6003, February 1976.

“A Comparison Between Least Squares and Minimum Variance Filtering Applied to Power System State Estimation”, ASEA Technical Report No. KYYS 714–6002, February 1976.

“Observability of Power Systems”, (with Sven Lubeck), ASEA Technical Report No. KYYS 714–5040, October 1975.

“Description of a Program for the Determination of Observable Areas in a Power System”, Program Description, KYYS October 1975.

“Tillståndsuppskattning i Kraftnät: Förbättring av konvergens vid spänningsmätningar”, (State Estimations in Power Systems: Improved Convergence for Voltage Measurements), ASEA Technical Report TR KYYS 714–5039, October 1975.

“State Estimation Techniques in the Swedish Power System”, (with H. Elg, P. E. Molander, L. Pettersson), Proceedings of the International Conference on Large High Voltage Electric Systems (CIGRE), Paris, Paper No. 32–02, August 25 – September 2, 1976.

“Simulering av PDC 4/8 Motordrift: Interference mellan nät–och maskinkommutering”, (Simulation of PDC 4/8 Electric Drive: Interference between net and machine commutation), (with G. Andersson), ASEA Technical Report TR KYYS 7277–5063, October 1975.

“Datorprogram for Simulering av PDC–Motordrift”, (A Computer Program for the Simulation of PDC Electric Drives), (with G. Andersson), ASEA Technical Report TR KYYS 7277–5032, September 1975.

“Statorvibrationer i PDC–maskinen”, (Stator vibrations in the PDC–machine), (with G. Andersson), ASEA Technical Report TR KYYS 7277–5031, September 1975.

“Ekonomisk Optimering av Påbyggnadsreaktor for PDC X–6”, (Economic optimization of attached inductor for the PDC X–6), (with G. Andersson), ASEA Technical Report TR KYYS 4510–5030, August 1975.

“Utvärdering av Datorprogram for Säkerhetstest av Elektriska Kraftnät (Security Monitor)”, (Evaluation of a Computer Program for Security Monitoring of Electric Power Systems), (with Sten Kollberg), ASEA Technical Report No. KYYS 714–5020, May 1975.

“Driftcentraler i Kraftsystem. Bekrivning av Målsättning och System”, (Control Centres for Electric Power Systems. A Description of Goals and Objectives), ASEA Technical Report No. KYYS 714–5014, March 1975.

“The ASEA/TRW Prestudy on Power System Security. Final Report”, ASEA Technical Report No. KYYS 714–5013, March 1975.

“On–Line Power System Configuration Determination by Computer”, (with M. Manson), ASEA Technical Report No. KYYS 714–5011, March 1975.

“Description of Control Centre Functions”, ASEA Technical Report No. KYYS RM 714–4036, December 1974.

“Detaljerad Resursuppskattning och Tidplan for Tillämpningsprogram”, (Detailed Schedules and Resource Plan for the Development of Application Programs), ASEA Technical Report No. KYYS RM 714–4032, November 1974.

“A New Algorithm for Gross Measurement Error Detection for Power Network State Estimation Purposes”, (with J. Valis), Report No. KYYS 714–4020, October 1974.

“The Basis for Bad Data Detection in Power Systems”, ASEA Technical Report No. KYYS 714–4019, October 1974.

“The ASEA/TRW Prestudy on Power System Security: Discussions with CEGB”, ASEA Technical Report No. KYYS 714–4016, October 1974.

“The ASEA/TRW Prestudy on Power System Security: Discussions with ENEL”, ASEA Technical Report No. KYYS 714–4015, October 1974.

“Besök vid Central Electricity Research Laboratories (CERL), EdF’s Nationella Driftcentral, Paris Regionscentral och The Research Centre of Automation, ENEL”, (Visits to CERL, London, The National Control Centre of Electricitet de France (EdF), The Regional Control Centre of Paris, The Research Centre of Automation of EdF, Ente Nationale Electricite Lab., ENEL, Rome and Milan), ASEA Technical Report No. KYYS RR 714–4014, October 1974.

“The ASEA/TRW Prestudy on Power System Security – Discussions with EdF”, ASEA Technical Report No. KYYS 714–4011, October 1974.

“The Treatment of Exact Information in Weighted Least–Squares Estimation”, ASEA Technical Report No. KYYS 013–4001, September 1974.

“Simulation Package for Power System State Estimation”, ASEA Technical Report KYY TR 714–4059, September 1974.

“Detailed Report from Visits to CEGB, Southern Services Inc., Ontario Hydro, Philadelphia Electric Co., Autocon Industries, Stagg Systems Inc., Macro Corporation and Modcomp”, ASEA Technical Report No. KYY RR 714–4033, May 1974.

“The ASEA/TRW Prestudy on Power System Security – Security Practices at the Swedish State Power Board (SSPB)”, ASEA Technical Report No. KYY 714–4030, May 1974.

“Automatikutrustningar/Driftcentraler”, (Equipment for Automation in Power Systems and Control Centres), ASEA Technical Report No. KYY PM 714–4026, May 1974.

“Investigation of Toke–data”, ASEA Technical Report No. KYY 714–4021, April 1974.

“Power System Supervisory Control – Application Functions”, ASEA Technical Report No. KYY 714–4015, March 1974.

“Functional Classification”, (with Ulf Hermansson), ASEA Technical Report No. KYY 714–4015b, March 1974.

“On Bias and Parameter Estimation in Connection with Power System State Estimation”, ASEA Technical Report No. KYY 714–3056, September 1973.

“State Estimation in Power Systems: Algorithms and Some Simulation Experiences”, ASEA Technical Report No. KYY 714–3013.

“Evaluation of OXYPAC’s feasibility for the control of Basic Oxygen Furnaces”, ASEA Technical Report No. KYY 5712–2040, September 1972.

“System Security Monitoring”, ASEA Technical Report No. KYY 575–1105, December 1971.

“Recent Developments in Power Systems Operation. A Literature Survey”, ASEA Technical Report No. KYY 013–1085, October 1971.

“Tillståndsuppskattning i Kraftnät”, (State Estimation in Power Systems), ASEA Technical Report No. KYY 575–1047, June 1971.

“Mathematical Models of the Kraft Cooking Process”, Acta Polytechnica Scandinavica, Mathematics and Computing Machinery Series No. 22, The Royal Academy of Engineering Sciences, Stockholm 1971, Catalogue No. UDC 676.1.022.6.

“A Mathematical Model and Computer Program for the Continuous Kraft Cooking Process”, Proceedings of European Symposium on The Use of Computers in the Studies Preceding the Design of Chemical Plants, European Federation of Chemical Engineering, Florence, April 27 – 30, 1970.

“Ett Datorprogram for Simulering av Kontinuerlig Sulfatkokning”, (A Computer Program for the Simulation of the Continuous Kraft Cooking Process), Institutionen for Regleringsteknik, Chalmers Tekniska Hogskola, September 1969.

Special-Purpose Compilers

“Automatic Generation of FFT for Translations of Multipole Expansions in Spherical Harmonics”, Jakub Kurzak, Dragan Mirkovic, B. Montgomery Pettitt, and Lennart Johnsson, Int’l Journal of High Performance Computing Applications, vol. 22, no. 2, pp 219 – 230, 2008

“Empirical Auto-tuning Code Generator for FFT and Trigonometric Transforms”, Ayaz Ali, Lennart Johnsson and Dragan Mirkovic, 5th Workshop on Optimizations for DSP and Embedded Systems, 2007 International Symposium on Code Generation and Optimization, March 11 – 14, 2007, San Jose, CA.

“CODELAB: A developers’ Tool for Code Generation and Optimization”, D. Mirkovic and L. Johnsson, Lecture Notes in Computer Science, Vol 2660, pp. 729-738, 2003.

“CODELAB: A Developers’ Tool for Efficient Code Generation and Optimization”, (with Dragan Mirkovic), ICCS 2003, June 2 – 4, 2003, Melbourne, Australia.

“Telescoping Languages: A Strategy for Automatic Generation of Scientific Problem–Solving Systems from Annotated Libraries”, (with K. Kennedy, B. Broom, K. Cooper, J. Dongarra, R. Fowler, D. Gannon, J. Mellor-Crummey and L. Torczon), in Journal of Parallel and Distributed Computing, Vol. 61, No. 12, pp 1802 – 1826, 2001.

“Automatic Performance Tuning in the UHFFT Library”, (with Dragan Mirkovic), The 2001 International Conference on Computational Science, ICCS 2001, May 28 – 30, 2001 Hilton San Francisco and Towers, San Francisco, USA.

“An Adaptive Software Library for Fast Fourier Transforms”, (with Dragan Mirkovic) 2000 International Conference on Supercomputing, pp. 215 – 224, ACM Press, May 2000.

“A Stencil Compiler for the Connection Machine Models CM–2/200”, (with Ralph G. Brickner, William George and Alan Ruttenberg), Fourth International Workshop on Compilers for Parallel Computers, pages 68 – 78, Delft, 1993.

“Language and Compiler Issues in Scalable High Performance Libraries”, to appear in Compilation Techniques for Novel Architectures, Springer Verlag, 1993.

“Expressing Boolean Cube Matrix Algorithms in Shared Memory Primitives”, (with Ching–Tien Ho), The Third Conference on Hypercube Concurrent Computers and Applications, January 1988, ACM Press, pp. 1599 – 1609.

“Programmeringsspråk”, (Programming Languages), (with Borje Karlsson), ASEA Technical Report No. KYYS 0182–8019, April 1978.

“EKVGEN (Ekvations–och data–generator): Generellt, Blockorienterat Datorprogram for Simulering av Dynamiska System”, (EKVGEN: A Block Diagram Oriented Computer Program for the Simulation of Dynamical Systems), (with S. Kollberg), ASEA Technical Report, June 1975.

Computer System Benchmarking

“HPFBench: A High Performance Fortran Benchmark Suite”, with Y. Charlie Hu, Guohua Jin, Dimitris Kehagias and Nadia Shalaby, in ACM Transactions on Mathematical Software, Vol. 26, No. 1, pp. 99 – 149, March, 2000.

“A Data Parallel Fortran Benchmark Suite”, (with Yu Hu, Dimitris Kehagias and Nadia Shalaby) Proceedings of the 11th International Parallel Processing Symposium, pp. 219 – 226, Geneva, Switzerland, April 1 – 5, 1997.

“Performance Modeling of Distributed Memory Architectures”, Journal of Distributed and Parallel Computing, Vol. 12, No. 4, pp. 300 – 312, 1991.

Process Control

“Tidsstörningar i reglerloopar”, (Sampling rate variations in discrete time feedback systems), (with Per Kihlgren), Technical Report KYLS 015–9018, May 1979.

“Synpunkter på och förslag till Styrsystem för robot”, (Proposal for a control system for the ASEA Robot), Technical Report KYLS 6397–9015, April 1979.

“Självinställande Regulator: Laboratorieprov med varvtalsreglerad motordrift”, (Self–Tuning Regulator: Experiments with speed controlled electric drives), (with Gunnar Bengtsson) Technical Report KYLS 5706–9012, March 1979.

“Self–Tuning Regulator: Functional Description”, (with Gunnar Bengtsson) Technical Report KYLS 5706–9010, March 1979.

“On Production Increase in a Pulp Mill with the ASEA Pulp Mill Production Control (PMPC) System”, ASEA Technical Report TR KYYS 932–9009, February 1979.

“Självinställande Regulator: Fältprov vid Sandvik AB”, (Self–Tuning Regulator: Experiments at Sandvik AB), (with Gunnar Bengtsson) Technical Report KYLS 5706–8004, January 1979.

“Förslag till Reglerstrategi för lägessytem hos robot”, (A positional control system for the ASEA Robot), (with Gunnar Bengtsson), Technical Report KYYS 57–8026, August 1978.

“Bestämning av tändvinkel och tryck vid TDC for dieselmotor ur mätdata med MK–metod”, (Computation of ignition angle and pressure for a diesel engine by a least–squares method), Technical Report KYYS 351–8025, August 1978.

“On the Choice of Storage Levels in ASEA’s Production Control System PMPC”, (with Gunnar Bengtsson), Technical Report KYYS 932–8023, May 1978.

“Predicted Production Increase in a Pulp Mill with ASEA’s Production Control System PMPC”, (with Gunnar Bengtsson), Technical Report KYYS 932–8015, April 1978.

“Adaptiv AGC for Kallvalsning”, (Adaptive AGC for Cold Rolling Mills), (with Jaroslav Valis), ASEA Technical Report No. KYYS 5692–5021, May 1975.

“Mikrodatorbaserad Självinställande Regulator – Nulägesrapport”, (Microprocessor Based Self–Tuning Regulator – Status Report), ASEA Technical Report No. KYY RM 5706 – 3049, August 1973.

Computation — General

“Business Models”, PRACE deliverable 2.3, 66 pages, March 2014 (Carlos Merida lead author).

“Atmospheric Balloon Studies: A Collaboration Between Minority and Traditional Undergraduate and Graduate Institutions”, Austin S., L. Johnsson, B. Lefer, G. Morris, P.A. Morris and D.K. Walter, Geological Society of America, 2008 Annual Meeting, Houston, TX, Abstract# 311-14. October 5 – 9.

“Datortekniken. En ingenjörsmässig och humanistisk bedrift.”, Daedalus 2002, pp 11 – 30, Tekniska Museet, Stockholm.

“The Connection Machine System CM–5”, the Fourth Symposium on Parallel Algorithms and Architectures, SPAA–93, pp. 365 – 366, 1993, ACM Press.

“Supercomputers: Past and Future”, KOSMOS, pp. 31 – 44, Almquist & Wiksell, Uppsala, 1990.

“Data Parallel Supercomputing”, Use of Parallel Processors in Meteorology, Springer – Verlag, 1989.

“The Fluent Abstract Machine”, (with Abhiram G. Ranade and Sandeep N. Bhatt), Advanced Research in VLSI, pp. 71 – 93, MIT Press, 1987.

“Ensemble Architectures and Their Algorithms: An Overview”, Numerical Algorithms for Modern Parallel Computer Architectures, Vol. 13, pp. 109 – 144, IMA Series in Mathematics and its Applications, Springer Verlag, 1988.

“Directions in High Performance Computing”, Proceedings of the American Statistical Association 19th Symposium on Computer Science and Statistics, March 8 – 11, 1987.

“Generation of Layouts from Circuit Schematics; A Graph Theoretic Approach” [2], (with Tak Ng), The 1985 Design Automation Conference, Las Vegas, June 1985.

“Future High Performance Computation: The Megaflop per dollar alternative”, Report YALEU/DCS/RR–360, January 1985.

“Distributed RC Delay Model and MOS PLA Timing Estimation”, (with Chao–Lin Chiang), Computer Science, California Institute of Technology, September 1983.

“The Computer Science of Concurrent Processing”, (with A. Martin and C. Seitz), Computer Science, California Institute of Technology, February 1983.

“The Tree Machine: An evaluation of program loading strategies”, (with Peggy Li), The 1983 International Conference on Parallel Processing, August 23 – 26, 1983, pp. 202 – 205, Shanty Creek. IEEE Computer Society, 83CH1922–4.

“Submicron Systems Architecture: Semiannual Technical Report”, (with Charles L. Seitz), 5052:TR:82, Computer Science, California Institute of Technology, October 1982.

“Towards a Formal Treatment of VLSI Arrays”, (with Uri Weiser, Danny Cohen and Alan L. Davis), Proceedings, Second Caltech Conference on VLSI, Pasadena, January 19 – 21, 1981.

“The Submicron Systems Architecture Project”, 4076:DF:80, Computer Science, California Institute of Technology, November 1980.

“VLSI Architecture and Design”, Proceedings of the National Electronics Conference, 1980, Vol. 34, pp. 254 – 259.

“Elektronikutvecklingens inverkan på ASEA–produkter under 80–talet” (The impact of electronics on the ASEA products during the 80’s), (with Bengt Kredell), Technical Memo KYYS 573–9006, February 1979.

“System 1990 – Project Proposal”, Technical Memo KYYS 573–8032, October 1978.

“KYs verksamhet avseende Informationselektronik. Diskussionsunderlag och förslag till inriktning”, (Activities in Information Electronics of the Central R and D Division of ASEA AB. Future Directions), Technical Memo KYYS 9024–8031, October 1978.

“Större Elektronik–och Datorsystem”, (Large Electronic Systems and Computers), (with B. Karlsson) Technical Report KYYS 573–8027, August 1978.

“Hybridteknologi”, (Hybrid Technology), (with Gunnar Bengtsson), ASEA Technical Report No. KYYS 570–7037, November 1977.

Conference Exhibits and Live Demonstrations

“Digital Signal Processors for Energy Efficient HPC: The Linpack Benchmark”, SC12, November 12 – 15, 2012, Salt Lake City, Utah (with Gilbert Netzer and Daniel Ahlin, KTH)

“Energy Efficient 24-core blade server with 7U 10-blade chassis with built-in Infiniband switch”, SC09, November 14 – 20, 2009, Portland, Oregon (with Daniel Ahlin, KTH)

“GEMSviz: Computational Steering of Electromagnetic Field Calculation from a Virtual Environment”, (with Erik Engquist and Per Öster), iGRID 2000, Yokohama, July 17 – 21, 2000.

“Distributed Virtual Reality”, (with Fredrik Hedman, Johan Ihren, Volodymyr Kindratenko, Rishad Mahasoom, Lars Malinowsky, and Per Öster), Alliance ’98, April 28 – 29, Urbana/Champaign, IL, 1998.

“Electromagnetic Field calculation on a Metacomputer using Globus”, (with Olle Larsson, Kim Andrews, Charles Chambers, Tesfaye Kumbi), SC 97, San Jose, CA, November 15 – 21, 1997.

Poster presentations

“Optimization Methods for Energy Efficient HPC Architectures”, with O. Datskova and G. Netzer, SC13, November 17 – 22, 2013, Denver, CO.

“Digital Signal Processors for Energy Efficient HPC: The Linpack Benchmark”, Energy Efficient HPC Workshop, MIT, February 1, 2013.

“Digital Signal Processors for Energy Efficient HPC”, with G. Netzer and D. Ahlin, SC12, November 10 – 16, 2012, Salt Lake City, UT.

“Mimicry of Statistical Properties of Host Genomes by RNA Viruses”, Quance, M., Feng, C., Rojas, M., Putonti, C., Johnsson, L., Fofanov, Y. Keystone Symposia: Molecular Evolution as a Driving Force in Infectious Diseases. Beaver Run Resort. Breckenridge, CO, 2008 Apr. 8-13

“SIMDB: A Problem Solving Environment for Molecular Dynamics Simulations”, NPACI Annual Review, San Diego, CA, January 28 – 29, 1999.

“University of Houston Seismic Data Repository”, NPACI Annual Review, San Diego, CA, January 28 – 29, 1999.

“Matrix Multiplication on Hypercubes Using Full Bandwidth and Constant Storage”, Ching–Tien Ho and S. Lennart Johnsson, The Sixth Distributed Memory Computing Conference, Portland, OR., April, 1991.

“Maximizing Channel Utilization for All–to–All Personalized Communication on Boolean cubes”, S. Lennart Johnsson and Ching–Tien Ho, The Sixth Distributed Memory Computing Conference, Portland, OR, April, 1991.

“CMIS Arithmetic and Multiwire NEWS for QCD on the Connection Machine”, Clive F. Baillie, Ralph G. Brickner and S. Lennart Johnsson, Supercomputing 90, New York, November, 1990.

“The Complexity of Reshaping Arrays on Boolean Cubes”, The Fifth Distributed Memory Computing Conference, April 8 – 12, 1990, Charleston, SC.

“High Performance Matrix Operations for QCD on the Connection Machine”, Ralph Brickner and S. Lennart Johnsson, Supercomputing 89, Reno, NV.


[1] Chosen for the 1986 ICPP Outstanding Paper Award

[2] Nominated for Outstanding Paper Award
