In Brief

Research Interests

Professor Johnsson’s research interests are in high-performance and energy-efficient computer system design, programming and operation; parallel algorithms; adaptive software and tools for its creation; auto-tuning; performance modelling; and middleware. Multi-core and SoC designs are of particular interest. Dr Johnsson’s interests also include high-performance networking and NoC designs.

The research is carried out mainly through national and international collaborative projects with academic institutions, national labs, and university and national data centers; through international projects such as PRACE, the Partnership for Advanced Computing in Europe; and with industry partners such as AMD, Intel, Supermicro and Texas Instruments.

Energy efficiency of computation and communication, traditionally of great concern for mobile devices and applications, has also become one of the foremost concerns in high-performance computing, for cost and environmental reasons. With the rise in energy cost, continued exponential improvement in integrated circuit technology, and improved packaging technology, the lifetime expense for power and cooling has for the last few years in most cases exceeded the cost of HPC systems and associated software. Unless new approaches are found, the dominance of operational costs is likely to keep growing, as are the environmental concerns, and both will limit the exponential growth in capabilities the world has experienced for decades.

Our first accomplishment in energy efficient HPC server design and implementation was a highly energy efficient four-socket, 24-core blade server designed in collaboration with AMD and Supermicro that resulted in a Supermicro product. This blade server, built entirely with commodity products, achieved an energy efficiency for High-Performance Linpack (HPL) comparable to that of the most energy efficient HPC platform at the time, which was designed and built using proprietary technology, and did so at a fraction of the cost. For a molecular dynamics application, GROMACS, the commodity blade server achieved seven times higher performance and four times higher energy efficiency than the leading proprietary design at the time.

In our current project on energy efficient HPC server design we are exploring the energy efficiency of embedded processors, specifically DSPs from Texas Instruments (TI), for HPC workloads, and have designed an HPC prototype server node using the TI Hyperlink for high-speed, low-latency inter-node interconnection in a cluster setting. Embedded processors, like the TI DSP, have about an order of magnitude higher nominal double-precision energy efficiency than x86 processors of the same generation. The use of DSP-style architectures for HPC server design could therefore lead to a significant improvement in energy efficiency for HPC workloads, provided a comparable efficiency (fraction of peak) can be achieved. Benchmark results indicate that efficiencies comparable to those for x86 server designs are achievable for both compute-intensive (HPL) and memory-intensive (STREAM) HPC kernel benchmarks when some of the unique architectural features of the TI DSP are used. HPL is known as a power-intensive benchmark, and our DSP code, which has an efficiency comparable to that of HPL benchmark codes for x86 architectures, is used for stress testing of the HP Moonshot system.
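The argument above rests on simple arithmetic: delivered energy efficiency is the nominal (peak) efficiency multiplied by the fraction of peak actually sustained. The sketch below illustrates that relationship only. The function names and all numbers are hypothetical placeholders chosen to show an order-of-magnitude gap in nominal efficiency; they are not measurements from the project.

```python
def delivered_gflops_per_watt(peak_gflops, power_watts, fraction_of_peak):
    """Sustained efficiency = nominal (peak) GFLOPS/W times fraction of peak."""
    nominal = peak_gflops / power_watts
    return nominal * fraction_of_peak


def break_even_fraction(reference_delivered, peak_gflops, power_watts):
    """Fraction of peak a processor must sustain to match a reference
    delivered efficiency (in GFLOPS/W)."""
    return reference_delivered / (peak_gflops / power_watts)


# Hypothetical illustration values, NOT measured data:
# an x86-style server part and a DSP-style part with 10x the nominal GFLOPS/W.
x86 = delivered_gflops_per_watt(peak_gflops=200.0, power_watts=100.0,
                                fraction_of_peak=0.8)
dsp = delivered_gflops_per_watt(peak_gflops=100.0, power_watts=5.0,
                                fraction_of_peak=0.8)

# With equal fraction of peak, the nominal advantage carries over in full.
advantage = dsp / x86

# Below this fraction of peak the DSP-style part would lose its advantage.
threshold = break_even_fraction(x86, peak_gflops=100.0, power_watts=5.0)

print(f"x86 delivered: {x86:.2f} GFLOPS/W")
print(f"DSP delivered: {dsp:.2f} GFLOPS/W")
print(f"advantage: {advantage:.1f}x, break-even fraction of peak: {threshold:.2f}")
```

Under these assumed numbers the nominal advantage survives only as long as the fraction of peak stays comparable, which is why the benchmark efficiency results matter.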

Prof. Johnsson has served on the faculties of the California Institute of Technology (Caltech), Yale University, Harvard University, and the Royal Institute of Technology, and has worked in industry, at ABB Research as an engineer and manager, and at Thinking Machines Corp. as a Director of Computer and Computational Sciences. Prof. Johnsson has authored or co-authored over 100 journal and conference papers on various aspects of parallel computation, has served on several conference program and organizing committees and panels, and serves on the editorial boards of several journals. He has served on the Board of the Computing Research Association, the Board of Directors of the Partnership for Advanced Computing in Europe (PRACE), the Science Councils of the Universities Space Research Association for ICASE and CESDIS, technical and advisory committees of several professional organizations, including the Global Grid Forum and Open Grid Forum, and committees and panels of funding bodies such as the US National Science Foundation, the US Department of Energy, the Canada Foundation for Innovation, the Swedish Research Council and the Swedish Foundation for Strategic Research.

In his thesis Prof. Johnsson presented a partial differential equations model for the control of the kraft process of producing paper pulp from wood chips, including a numerical method for stiff differential equations and a simulation program. At ABB Research (then ASEA), he developed one of the first sparse matrix packages for real-time use in supervisory control and data acquisition systems for electrical utilities, software for resilience assessment of electric power grids, and microprocessor-based process control systems. The work on real-time resilience assessment, supervisory control, and planning led to ABB becoming the world-leading supplier of such systems within five years.

While at Caltech, Prof. Johnsson, in collaboration with Bengt Fornberg (now Univ. of Colorado, Boulder) in the Applied Mathematics department, developed one of the first parallel algorithms courses in the US focused on highly concurrent systems. His research mostly focused on the development of highly concurrent algorithms, motivated by several ongoing parallel computer design projects at Caltech, such as the COSMIC Cube and Tree Machine projects, followed by joint industry projects, such as the Intel iPSC system. The work on VLSI (Very Large Scale Integration) design with Tak Ng resulted in a nomination for the Outstanding Paper Award at the 1985 Design Automation Conference for “Generation of Layouts from Circuit Schematics; A Graph Theoretic Approach”.

The focus on developing parallel algorithms courses and research on algorithms, including communication network routing algorithms, continued while at Yale University. Abhiram Ranade (now at Indian Institute of Technology, Powai) received the Machtey Best Student Paper award at the 1987 IEEE Symposium on the Foundations of Computer Science (FOCS) for “How to Emulate Shared Memory”, and the joint research with Ching-Tien Ho (now Manager, IBM Almaden Research) received an Outstanding Paper award at the 1986 International Conference for Parallel Processing for “Distributed Routing Algorithms for Broadcasting and Personalized Communication in Hypercubes”. The research on all-to-all communication came to heavily influence the MPI standard development initiated a few years later. The research on parallel algorithms and software with Dimitris Gerogiannis (now Managing Director, Aegean Airlines) resulted in the thesis “Efficient Implementation of Intermediate Level Image Analysis Tasks on Parallel Machines”, jointly supervised by Stelios C. Orphanoudakis, and the research with Pelle Olsson (now Microsoft) resulted in the thesis “High-Order Difference Methods and Data Parallel Implementation”, supervised jointly with Bertil Gustafsson, Uppsala University.

The course development and research on parallel algorithms and network routing continued while at Harvard University, with Y. Charlie Hu (now Purdue University) on fast n-body algorithms, resulting in an “Impressive Entry” recognition in the 1994 Gordon Bell Prize competition, with Ted Nesson (now VP Pegasystems) on randomized oblivious routing, and with Nadia Shalaby (now CEO at Arctic Sand) on fast parallel orthogonal transforms.

At Thinking Machines Corp., Prof. Johnsson served as Director of Computational Sciences and initiated the design of a register-oriented instruction set for the Connection Machine systems CM-2 and CM-200. He led the development of the CMSSL (Connection Machine Scientific Software Library), the first comprehensive, commercial-strength, scalable scientific library for parallel architectures, for the Connection Machine (CM) CM-2, CM-200, and CM-5 systems. The library included 3-D matrix multiplication algorithms, in-place bit-reversal algorithms and communication algorithms fully exploiting the capabilities of the interconnection network (“all-the-wires-all-the-time”). He significantly influenced the CM Run-Time System, in particular the parts related to the communication software subsystem, and led the development of a suite of software development tools for code generation for library functions for the CM architectures. The CMSSL pioneered overloading of library calls in that algorithmic choices were made at run-time, when machine and problem sizes and data distribution were known. Library calls also managed concurrency in single-instance function execution as well as concurrent execution of multiple instances. Prof. Johnsson led the work that resulted in the No. 1 system on the first Top500 list (1993) and contributed to two Gordon Bell Awards (1989 for absolute performance, and 1990 for parallelization) related to convolution computations and a convolution compiler. His group also contributed to the MPI 1.0 and High-Performance Fortran standards.

At the University of Houston Prof. Johnsson has continued research on parallel algorithms and software, with a focus on adaptive software and integrated environments for data analysis and simulation. Ayaz Ali’s (now at Microsoft) thesis “Adaptive Dynamic Scheduling of FFT on Hierarchical Memory and Multi-core Architectures” is a good example of the research on adaptive software, whereas Matin Abullah’s thesis “SimDB – a Grid Software Environment for Molecular Dynamics Simulation and Analysis” is a good example of the work on integrated, distributed, application-specific environments.

At the Royal Institute of Technology (KTH), Stockholm, Sweden, Prof Johnsson’s early research with graduate students focused mostly on grid technologies resulting in the thesis “Managing Service Levels in Grid Computing Systems: Quota Policy and Computational Market Approaches” by Thomas Sandholm (now HP Palo Alto Research Labs) and the thesis “On-Demand Restricted Delegation: A Framework for Dynamic, Context-Aware, Least-Privilege Delegation in Grids” by Mehran Ahsant (now HP Enterprise Security Services). Prof. Johnsson also supervised a very innovative thesis in Augmented Reality, “Unobtrusive Augmentations in Physical Environments: Interaction Techniques, Spatial Displays & Ubiquitous Sensing” by Alex Olwal (now MIT Media Lab). Current research with collaborators at KTH is focused on the design of energy efficient HPC systems using DSPs carried out in collaboration with Texas Instruments. This effort was preceded by the joint project with AMD and Supermicro that resulted in a new Supermicro blade server.

Prof. Johnsson’s research group was an early participant in the Globus project and demonstrated the first MPI application codes using the Globus toolkit at SC97, using facilities at the San Jose conference site, at the University of Houston, and at Parallel Dator Centrum (PDC) in Stockholm, Sweden, together with the Texas GigaPoP, jointly established by the University of Houston, Rice University and Baylor College of Medicine through an NSF Connections Grant, and the Internet2, Nordunet and Sunet networks.

At Alliance’98 Prof. Johnsson’s group participated in the demonstration of interactive, distributed, shared virtual environments with five institutions: NCSA at the University of Illinois Urbana/Champaign, the Electronic Visualization Laboratory at the University of Illinois Chicago, the Scientific Computing and Imaging Institute of the University of Utah, the University of Houston and PDC. This demonstration used virtual reality equipment at all five institutions, the Internet2, Nordunet and Sunet networks, and the Startap gateway. This work was recognized as the number one “in the spirit of the Alliance”. In 2000, a demonstration of distributed computational steering was carried out at iGrid2000, co-located with iNET2000 in Yokohama, using virtual reality equipment at the conference site, high-performance computers at PDC and the University of Houston, and the APAN connection in addition to Internet2, Nordunet, Sunet, Startap, and the Texas GigaPoP.

In 2007 joint work with the Bioinformatics group at the University of Houston, led by Prof. Yuriy Fofanov, resulted in the 2nd prize in the Itanium Solutions Alliance international competition for innovation in the Humanitarian Impact category. In 2008 continued work resulted in the 1st prize in the same category.

Prof. Johnsson led the University of Houston participation in the National Science Foundation funded National Computational Science Alliance, the National Partnership for Advanced Computational Infrastructure, and the Grid Application Development Software (GrADS) project, and was a founding member of the Department of Energy’s Los Alamos Computer Science Institute (LACSI).

Prof. Johnsson’s interest in Grids led to participation in the formation of the US Grid Forum and active involvement in establishing the European Grid Forum; the two merged in October 2000 to form the Global Grid Forum.

At the University of Houston Prof. Johnsson led the UH part of the creation of one of the first GigaPoPs in the country and established the Texas GigaPoP jointly with Dr Ken Kennedy of Rice University and Dr Wah Chiu of Baylor College of Medicine as part of one of the first NSF-awarded Connections Grants. He also led the first WiFi network implementation on the University of Houston campus, and led the creation of the Research and Education Network of Houston (RENoH), a 500-fiber-mile network joining the University of Houston main campus, the Texas Medical Center, Rice University, the University of Houston Downtown and Victoria campuses, and Texas Southern University with Internet2, the National Lambda Rail, and the Lonestar Education And Research Network (LEARN). Educational infrastructure projects include:
