Scott Baden Moves to Head of CLaSS

Scott Baden is now leading the Computer Languages and Systems Software (CLaSS) Group in Berkeley Lab’s Computational Research Division (CRD). A graduate of UC Berkeley and a former lab post-doc, Baden returns to Berkeley Lab from a faculty post in UC San Diego’s Computer Science and Engineering Department. In addition to leading the group, Baden will research program restructuring to improve performance on supercomputers and investigate novel implementation techniques that may someday be incorporated into an automated program-restructuring tool. He will also remain on the faculty at UCSD.

Berkeley Lab’s OpenMSI Licensed to ImaBiotech

Two years ago, Berkeley Lab researchers developed OpenMSI—the most advanced computational tool for analyzing and visualizing mass spectrometry imaging (MSI) data. Last year, this web-available tool was selected as one of the 100 most technologically significant new products of the year by R&D Magazine. Now, OpenMSI has been licensed to support ImaBiotech’s Multimaging technology in the field of pharmaceutical and cosmetic research and development. The Multimaging platform essentially allows researchers to combine and overlay image files acquired from different imaging techniques—such as qualitative MALDI imaging, staining, and immunostaining—to increase confidence in data sets.

Supercomputers & Lasers Dig Into Warm Dense Matter

Researchers from the University of Washington are using supercomputers at the Department of Energy’s National Energy Research Scientific Computing Center (NERSC) and data from X-ray free-electron laser (XFEL) experiments to gain new insights into warm dense matter (WDM), one of the most challenging aspects of contemporary plasma physics.

“WDM is in that funky regime that is too hot to be condensed solid matter and too cold to be plasma,” explained Ryan Valenza, a third-year graduate student at the University of Washington and lead author on a new paper in Physical Review B that demonstrates the role XFELs can play in broadening our understanding of WDM. “It is an intermediate regime that is not well understood because the electronic behavior is intermediate between being fully quantum-mechanical and fully classical-mechanical.”

Shalf Speaks at European Conferences

John Shalf recently delivered a keynote at the Exascale Applications and Software Conference (EASC) 2016, held in late April in Stockholm, Sweden. A video of the talk, entitled “Exascale Computing Architecture Trends and Implications for Programming Systems,” is now available online. On June 22, he will chair a panel discussion entitled “Beyond von Neumann” at ISC 2016 in Frankfurt, Germany.

NERSC Users: Register for Cori Phase 1 Training

NERSC users are invited to attend a four-day training for Cori Phase 1: “Programming Environment, Debugging, and Optimization” led by Cray instructor Rick Slick and held June 13-16.

Besides the usual training topics for Cray’s XC series, this event will include Slurm and the Burst Buffer. Below are the topics to be covered:

  • Hardware concepts and terminology that are relevant to the programmer
  • Launch of a parallel application
  • Operating system components that a parallel application uses
  • Use of Slurm for job launch and monitoring
  • Modules environment
  • Use of the compilers (C, C++, and Fortran), loader, and libraries
  • Use of the application debugger
  • MPI environment variables
  • I/O on the Cray system
    • Lustre
    • DataWarp (sometimes referred to as Burst Buffer)
  • Use of available performance analysis tools
  • Single-processor optimization

NERSC users may attend in person at Berkeley Lab or via Web conferencing. Online registration is required. The deadline is June 3.

Zombie Apocalypse Prompts Surprise Visit

Hugo Villanueva of John F. Kennedy Middle College High School in Southern California was trying to calculate the probability of surviving a zombie apocalypse. His handheld calculator couldn’t handle the problem, so he asked Cray Inc. for help. Impressed by the student’s initiative, the supercomputer maker planned a surprise visit and invited long-time customer NERSC to send a representative. Brian Friesen, a NESAP post-doc working closely with CRD’s Center for Computational Sciences and Engineering, tweaked his travel plans and joined the visit en route to Arizona. Villanueva didn’t get his answer about a zombie attack, but that didn’t dampen his enthusiasm for the visit.

This Week’s CS Seminars

»CS Seminars Calendar

Tuesday, May 24

Scalable Computing with Data-Sparse Tensor Representations: Towards Robust Reduced-Scaling Many-Body Electronic Structure
2 – 3 p.m., Bldg. 50B, Room 4205
Eduard Valeyev, Virginia Tech

Thanks to recent advances in many-body electronic structure methods, such as coupled-cluster, it is possible to significantly improve on Kohn-Sham Density Functional Theory for prediction of chemical energy differences and other properties without the need for excessively large basis sets AND with nearly linear scaling achieved in practice. These key improvements stem from the regularization of the two-electron cusps arising from the Coulomb singularities (the so-called explicit correlation technique) combined with the use of localized one-electron states and sparse tensor representations. In this talk I will highlight our recent efforts to exploit block and block-rank sparsities in the context of reduced-scaling one- and many-body electronic structure methods. I will also discuss the computational challenges posed by general data-sparse tensor computing and how the TiledArray tensor framework addresses them through dataflow-style computation.
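The abstract's core idea—exploiting block sparsity so that work is only done on nonzero blocks—can be illustrated with a toy sketch. The code below is a hypothetical illustration in plain Python (not the speaker's code and not the TiledArray API): it stores a matrix as a dictionary of nonzero dense blocks and skips absent blocks entirely during multiplication.

```python
# Toy block-sparse matrix multiply: a matrix is a dict mapping
# (block_row, block_col) -> dense block (list of lists). Zero
# blocks are simply absent, so they cost no storage and no flops.

def matmul_dense(a, b):
    """Multiply two small dense blocks stored as lists of lists."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

def add_blocks(a, b):
    """Elementwise sum of two dense blocks of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def block_sparse_matmul(A, B):
    """C = A @ B, touching only pairs of blocks that actually meet."""
    C = {}
    for (i, j), a_blk in A.items():
        for (j2, k), b_blk in B.items():
            if j != j2:
                continue  # the blocks never interact; zero work done
            prod = matmul_dense(a_blk, b_blk)
            C[(i, k)] = add_blocks(C[(i, k)], prod) if (i, k) in C else prod
    return C

# Two 4x4 matrices stored as 2x2 blocks, each with one nonzero block:
I2 = [[1, 0], [0, 1]]
A = {(0, 0): I2}                 # identity in the top-left block
B = {(0, 1): [[2, 0], [0, 2]]}   # 2*I in the top-right block
print(block_sparse_matmul(A, B))  # {(0, 1): [[2, 0], [0, 2]]}
```

Real frameworks such as TiledArray additionally exploit low-rank structure within blocks and distribute the blocks across nodes, but the skip-the-zeros principle is the same.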

Wednesday, May 25

NERSC Data Seminar
Dask: A Python Library for Flexible Parallelism
3:30 – 4:30 p.m., Wang Hall – Bldg. 59, Room 3101
Matthew Rocklin, Continuum Analytics

Dask is a Python library for parallel computing designed for complex workloads. It complements the scientific Python ecosystem (numpy, scipy, pandas, scikit-*) with parallelism on multi-core workstations and moderate-sized clusters. Dask’s original mission was to parallelize NumPy and Pandas; in doing so, however, it grew a flexible and efficient dynamic task scheduler that has since shown considerable value in a broad variety of ad hoc workflows in general scientific computing.

This talk describes Dask in two parts: the low-level performance of Dask’s dynamic distributed task scheduler and how it is used to solve a wide variety of unstructured problems, and the high-level modules built on top of Dask that provide convenient access for scientific users. At a low level, dynamic task scheduling is an effective model for general computing, lying somewhere between MPI (full developer control, full algorithmic freedom) and MapReduce/Spark (managed control, low algorithmic freedom).
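The dynamic task scheduling the talk describes can be sketched in a few lines. Dask represents a computation as a plain dictionary mapping keys to either literal values or tasks (tuples of a callable followed by its arguments, which may name other keys). The toy scheduler below is a stdlib-only illustration of that graph idea, not Dask’s actual implementation:

```python
# Minimal sketch of a Dask-style task graph and a naive recursive
# scheduler. A graph is a dict whose values are either literals or
# tasks: tuples of (callable, *args), where string args that name
# graph keys are resolved by evaluating those keys first.

from operator import add, mul

def get(graph, key, cache=None):
    """Evaluate `key` in the task graph, memoizing shared subresults."""
    if cache is None:
        cache = {}
    if key in cache:
        return cache[key]
    node = graph[key]
    if isinstance(node, tuple) and callable(node[0]):
        func, *args = node
        # Arguments naming other graph keys are computed recursively;
        # anything else is passed through as a literal value.
        resolved = [get(graph, a, cache) if isinstance(a, str) and a in graph else a
                    for a in args]
        result = func(*resolved)
    else:
        result = node  # a literal value, not a task
    cache[key] = result
    return result

# A tiny graph computing (1 + 2) * 10:
graph = {
    "x": 1,
    "y": (add, "x", 2),   # y = x + 2
    "z": (mul, "y", 10),  # z = y * 10
}
print(get(graph, "z"))  # 30
```

Dask’s real schedulers traverse such graphs in dependency order and execute independent tasks in parallel across threads, processes, or cluster workers, which is where the performance discussed in the talk comes from.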

The Dask schedulers provide managed parallelism without sacrificing too much algorithmic freedom, hitting a sweet spot for a number of applications of current interest and current researcher pain points. At a high level, we discuss modules built on top of the scheduler, including dask.array and dask.dataframe, which have been in common use in scientific communities for some time now. These modules demonstrate the depth of complexity that scientists generate, and they serve as examples for potential projects that extend other applications to parallel computing.