ESnet, CENIC Announce Joint Cybersecurity Initiative; CRD’s Peisert to Direct
ESnet and the Corporation for Education Network Initiatives in California (CENIC) recently announced a partnership to develop cybersecurity strategy and research. CENIC is a nonprofit organization that operates the California Research & Education Network (CalREN), a high-capacity network serving more than 20 million users.
Sean Peisert of the Computational Research Division will be director of the new CENIC/ESnet Joint Cybersecurity Initiative. Peisert, who was also recently named chief cybersecurity strategist for CENIC, has worked extensively in computer security research and development. He will continue his work at Berkeley Lab and as an adjunct faculty member at the University of California, Davis.
“CENIC is a critical partner of ESnet’s, and we already collaborate actively to improve research outcomes in California and beyond,” said ESnet Director Greg Bell. “This new initiative is timely and exciting. Because data exchange is the lifeblood of open science, both organizations require innovative and flexible cybersecurity solutions in order to succeed. Aligning our strategies and teams is an important step forward.”
Supernova Twins: New Modeling Method Makes Standard Candles More Standard
By employing a new modeling method run on NERSC’s Edison supercomputer, members of the international Nearby Supernova Factory (SNfactory) based at Berkeley Lab have dramatically reduced the scatter in supernova brightnesses, doubling the accuracy of distance measurements for some supernovae. Using a sample of almost 50 nearby supernovae, they identified supernova twins—pairs whose spectra are closely matched—which reduced their brightness dispersion to a mere 8 percent. The Gaussian Process modeling they developed and ran at NERSC was instrumental in this new method of measuring accurate supernova distances, the researchers noted. »Read more about supernova twins.
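For readers curious what Gaussian Process smoothing of spectra looks like in practice, here is a minimal, purely illustrative sketch using synthetic data. The wavelength grid, toy spectra, kernel choice, and RMS “twinness” score below are invented for illustration and are not the SNfactory pipeline.

```python
# Illustrative sketch: smooth two noisy "spectra" with Gaussian Process
# regression, then compare them on a common wavelength grid.
# Synthetic data only; not the actual SNfactory analysis.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
wave = np.linspace(3500.0, 8500.0, 200)   # hypothetical wavelength grid (Angstroms)

def noisy_spectrum(shift):
    """Toy spectrum: smooth continuum plus a broad absorption feature and noise."""
    flux = 1.0 + 0.3 * np.sin(wave / 900.0) \
           - 0.4 * np.exp(-((wave - 6150.0 - shift) / 250.0) ** 2)
    return flux + rng.normal(0.0, 0.05, wave.size)

def gp_smooth(flux):
    """Fit a GP to one noisy spectrum and return its smoothed prediction."""
    kernel = 1.0 * RBF(length_scale=300.0) + WhiteKernel(noise_level=0.05 ** 2)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(wave[:, None], flux)
    return gp.predict(wave[:, None])

spec_a = gp_smooth(noisy_spectrum(shift=0.0))
spec_b = gp_smooth(noisy_spectrum(shift=20.0))

# A simple "twinness" score: RMS difference of the smoothed spectra.
print("RMS spectral difference:", np.sqrt(np.mean((spec_a - spec_b) ** 2)))
```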
New Scientist Quotes CRD’s Wehner on Recent Floods
Michael Wehner of Berkeley Lab’s Computational Research Division was recently quoted in a New Scientist article exploring the link between such extreme weather events as flooding and man-made climate change:
But coming up with an exact link between climate change and weather events is tricky. We need lots of data to show whether rare, once-in-a-millennium events like the South Carolina floods are becoming more common. And in many places, such as southern India, which experienced flooding in November and December, there are no robust climate models for the area that can analyze the flooding.
Rainstorms are particularly tough to study, because they are affected by many factors, says Michael Wehner of Lawrence Berkeley National Laboratory in California. In October, he presented initial results from a study of the role climate change played in Colorado’s 2013 floods. By comparing the river basin responsible with the same basin in an imaginary world without anthropogenic climate change, his team found that the real basin received an extra 8 centimetres of rain.
Wehner’s results won’t be published until years after that storm ended, but he hopes that, in future, scientists will be able to combine several climate models with expert judgement, so that they can make calls on the role of climate change in extreme weather events as they are happening.
»Read the New Scientist article.
Exploring the Dark Universe with Supercomputers
A recent Symmetry Magazine article turned to Peter Nugent of CRD’s Computational Cosmology Center for insight into the part computing plays in the search for dark energy:
Scientists use more than telescopes to search for clues about the nature of dark energy. Increasingly, dark energy research is taking place not only at mountaintop observatories with panoramic views but also in the chilly, humming rooms that house state-of-the-art supercomputers… DESI will collect about 10 times more data than its predecessor, the Baryon Oscillation Spectroscopic Survey, and LSST will generate 30 laptops’ worth of data each night. But even these enormous data sets do not fully eliminate statistical error. Simulation can support observational evidence by modeling similar conditions to see if the same results appear consistently.
“We’re basically creating the same size data set as the entire observational set, then we’re creating it again and again—producing up to 10 to 100 times more data than the observational sets,” Nugent says.
Processing such large amounts of data requires sophisticated analyses. Simulations make this possible.
»Read the rest of the Symmetry article.
Science Node: Networks Like ESnet Critical to Science
A recent Science Node article explains why research networks like ESnet are increasingly critical to scientific progress: “Connecting supercomputers to each other — as well as to the people who use them — requires research and education networks (RENs) capable of moving massive quantities of data between locations quickly, efficiently, and with minimal latency and packet loss…. The Energy Sciences Network (ESnet) is a prime example. It carries approximately 20 petabytes of data each month with traffic increasing an average of 10 times every four years, propelled by the rising tide of data produced by supercomputers and global collaborations involving thousands of field researchers—the so-called ‘long tail of science’.”
»Read the Science Node article.
Internet Pioneer Vint Cerf Delivers CS Distinguished Lecture Next Week
Vinton G. Cerf, who is vice president and chief Internet evangelist for Google and widely known as one of the “fathers of the Internet,” will give a Computing Sciences Distinguished Lecture on “Safety, Security and Privacy in the Internet of Things,” at 1:30 p.m. Wednesday, Jan. 27, in the Bldg. 50 auditorium. Cerf, who is a member of the ESnet Policy Board, contributes to global policy development and the continued spread of the Internet. »Learn more about Cerf and his talk.
This Week’s CS Seminars
SampleClean: Scalable and Reliable Analytics on Dirty Data
Wednesday, January 20, 10 – 11am, Bldg. 59, Room 3101
Sanjay Krishnan, AMP Lab, University of California, Berkeley
An important challenge in data analytics is the presence of dirty data in the form of missing, out-of-date, incorrect, or inconsistent values. The problem is exacerbated in large and growing datasets, where practical constraints limit how much data can be manually “cleaned” and computational costs limit the “freshness” of the data. Ensuring clean data and answering analytical queries are often treated as separate problems by different research communities, and this divorced perspective makes it difficult to analyze the tradeoff between the amount of clean data available and the accuracy of the subsequent analytics. This talk presents SampleClean, a project that aims to bridge the gap between statistical estimation and data cleaning by proposing new sampling-based estimation methodologies for queries on dirty data. SampleClean is a framework in which analysts clean samples of a relation and can then approximate both traditional SQL analytics and advanced analytics such as convex loss models as if the entire relation had been cleaned. The talk covers a number of technical challenges, including data transformations that affect sampling statistics, efficient materialization of samples from structured data, techniques for handling data skew and sparsity, and online result updates. SampleClean has practical applications in budgeting expensive data transformations in crowdsourcing, real-time systems, and interactive data analysis. The talk also describes theoretical results characterizing the key tradeoffs and empirical results demonstrating the value of budgeted data cleaning in a number of case studies. Code and recent publications are available at http://sampleclean.org.
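To give a flavor of the sample-and-clean estimation idea in the abstract above, here is a minimal sketch: estimate the mean of a dirty column by cleaning only a random sample and attaching a confidence interval. The synthetic data, the toy corruption model, and the stand-in clean() function are invented for illustration and are not taken from the SampleClean codebase.

```python
# Minimal sketch of sample-and-clean aggregate estimation: clean only a random
# sample of a dirty column, then estimate the clean mean with a confidence
# interval.  Synthetic data and a toy cleaning rule; not the SampleClean code.
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "dirty" relation: true values plus occasional corrupted entries.
true_vals = rng.normal(100.0, 10.0, 100_000)
dirty = true_vals.copy()
corrupt = rng.random(dirty.size) < 0.05
dirty[corrupt] *= 10.0                       # 5% of rows are off by a factor of 10

def clean(value, was_corrupt):
    """Stand-in for an expensive (possibly manual) cleaning step."""
    return value / 10.0 if was_corrupt else value

# Clean only a small random sample and estimate the mean of the cleaned relation.
k = 2_000
idx = rng.choice(dirty.size, size=k, replace=False)
cleaned_sample = np.array([clean(dirty[i], corrupt[i]) for i in idx])

est = cleaned_sample.mean()
ci = 1.96 * cleaned_sample.std(ddof=1) / np.sqrt(k)   # ~95% confidence interval

print(f"dirty mean      : {dirty.mean():.2f}")
print(f"estimated clean : {est:.2f} +/- {ci:.2f}")
print(f"true clean mean : {true_vals.mean():.2f}")
```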
EECS Colloquium: Getting More Women into Tech Careers and Why It Matters
Wednesday, January 20, 4-5pm, 306 Soda Hall (HP Auditorium), UC Berkeley
Maria Klawe, President, Harvey Mudd College
Over the past decade the participation of women in the tech industry has declined rather than advanced. This is unfortunate for young women because of the incredible career opportunities, for the tech industry because of the loss of incoming talent, and for society because of the loss of diversity of perspective on tech teams. I will talk about the reasons why women tend not to major in computing fields and how Harvey Mudd College dramatically increased the percentage of women majoring in computer science from 10 percent of majors to more than 40 percent.
Energy-Specific Equation-of-Motion Coupled-Cluster Method and Its Low-Scaling Approximations for High-Energy Excited-State Calculations
Thursday, January 21, 10-11am, Bldg. 50B, Room 2222
Bo Peng, University of Washington
Single-reference techniques based on coupled-cluster (CC) theory, in the linear-response (LR) or equation-of-motion (EOM) formulations, are highly accurate and widely used approaches for modeling valence absorption spectra. Unfortunately, these equations with singles and doubles (LR-CCSD and EOM-CCSD) scale as O(N⁶), which may be prohibitively expensive for the study of high-energy excited states using a conventional eigensolver. In this talk, I will describe an energy-specific non-Hermitian eigensolver that is able to obtain high-energy excited states, e.g., the XAS K-edge spectrum, at low computational cost.
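As a rough illustration of what “energy-specific” means in practice (targeting interior eigenvalues near a chosen energy rather than the lowest ones), the sketch below applies shift-and-invert Arnoldi to a random non-Hermitian matrix. The matrix, target energy, and dimensions are placeholders, not an EOM-CCSD similarity-transformed Hamiltonian or the speaker’s algorithm.

```python
# Sketch: obtain the eigenvalues of a non-Hermitian matrix closest to a target
# energy via shift-and-invert Arnoldi, instead of solving for the full spectrum.
# The matrix here is a random stand-in, not an EOM-CCSD Hamiltonian.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(2)
n = 2000

# Toy non-Hermitian matrix: diagonal "excitation energies" plus a small
# non-symmetric coupling.
diag = np.sort(rng.uniform(0.5, 30.0, n))      # hypothetical energies
A = sp.diags(diag) + 0.05 * sp.random(n, n, density=1e-3, random_state=2, format="csr")

target = 20.0   # look for high-energy excited states near this value

# Shift-and-invert Arnoldi: eigenvalues of A nearest to `target`.
vals, vecs = spla.eigs(A.tocsc(), k=5, sigma=target, which="LM")
print("eigenvalues near target:", np.sort_complex(vals))
```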
Numerical Solutions of Singularly Perturbed Differential Equations
Friday, January 22, 2–3 pm, Bldg. 50F, Room 1647
Thai Anh Nhan, University of Ireland
Singularly perturbed convection-reaction-diffusion differential equations are characterized by a small positive perturbation parameter multiplying the highest derivative, leading to the presence of boundary and/or interior layers. These problems arise in various practical applications and mathematical models. For example, convection-diffusion problems are found in many formulations of fluid flow problems (such as the linearization of the Navier-Stokes equations and transport problems) and in semiconductor device simulation. Mathematical models involving systems of reaction-diffusion problems appear in simulations of chemical reactions, wave-current interaction, and biological applications. Finding robust numerical solutions to singularly perturbed problems is a great challenge. The talk aims to explain the difficulties in solving these problems numerically, with an emphasis on the limitations of linear solvers for systems arising from finite difference discretizations of singularly perturbed problems.
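For a concrete sense of the difficulty, the sketch below (not taken from the talk) discretizes the standard one-dimensional model problem -εu″ + u′ = 1 with u(0) = u(1) = 0, whose solution has a boundary layer of width O(ε) at x = 1, using first-order upwind differences on a uniform mesh. The error in the troublesome regime ε ≈ h does not shrink as the mesh is refined, which is the non-uniform convergence that layer-adapted meshes and specialized solvers are designed to overcome.

```python
# Sketch: upwind finite differences for -eps*u'' + u' = 1, u(0) = u(1) = 0.
# On a uniform mesh the boundary layer at x = 1 is only resolved when the mesh
# spacing is comparable to eps; the nodal error is not uniformly small in eps.
import numpy as np

def solve_upwind(eps, n):
    """Solve on a uniform mesh with n interior points using upwind differences."""
    h = 1.0 / (n + 1)
    A = np.zeros((n, n))
    lower = -eps / h**2 - 1.0 / h          # coefficient of u_{i-1}
    diag = 2.0 * eps / h**2 + 1.0 / h      # coefficient of u_i
    upper = -eps / h**2                    # coefficient of u_{i+1}
    for i in range(n):
        A[i, i] = diag
        if i > 0:
            A[i, i - 1] = lower
        if i < n - 1:
            A[i, i + 1] = upper
    u = np.linalg.solve(A, np.ones(n))
    x = np.linspace(h, 1.0 - h, n)
    return x, u

def exact(x, eps):
    """Exact solution of -eps*u'' + u' = 1 with u(0) = u(1) = 0."""
    return x - (np.exp((x - 1.0) / eps) - np.exp(-1.0 / eps)) / (1.0 - np.exp(-1.0 / eps))

for n in (63, 255, 1023):
    h = 1.0 / (n + 1)
    for eps in (1e-1, h):                  # a mild eps and the troublesome regime eps ~ h
        x, u = solve_upwind(eps, n)
        err = np.max(np.abs(u - exact(x, eps)))
        print(f"n = {n:5d}, eps = {eps:9.3e}, max nodal error = {err:.3e}")
```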
Geometric Graph-based Methods for High Dimensional Data
Friday, January 22, 11am – 12pm, Wang Hall (CRT), Bldg. 59, Room 3101
Andrea Bertozzi, Director of Applied Mathematics, UCLA
We present new methods for segmentation of large datasets with graph-based structure. The methods combine ideas from classical nonlinear PDE-based image segmentation with fast and accessible linear algebra techniques for computing information about the spectrum of the graph Laplacian. The goal of the algorithms is to solve semi-supervised and unsupervised graph cut optimization problems. I will present results for image processing applications such as image labeling and hyperspectral video segmentation, as well as results from machine learning and community detection in social networks, including modularity optimization posed as a graph total variation minimization problem.

Bio: As Director of Applied Mathematics at UCLA, Prof. Andrea Bertozzi oversees the graduate and undergraduate research training programs. In 2012 she was appointed the Betsy Wood Knapp Chair for Innovation and Creativity. Bertozzi’s honors include the Sloan Research Fellowship, the Presidential Early Career Award for Scientists and Engineers, and SIAM’s Kovalevsky Prize. She was elected to the American Academy of Arts and Sciences and named a Fellow of the Society for Industrial and Applied Mathematics in 2010, and she became a Fellow of the American Mathematical Society in 2013. To date she has graduated 28 PhD students and has mentored 39 postdoctoral scholars.
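Relating back to the abstract above, here is a loose, self-contained illustration of how the graph Laplacian’s eigenvectors expose segment structure. It is plain spectral clustering on synthetic 2-D points; the similarity kernel, bandwidth, and k-means step are illustrative choices, not Prof. Bertozzi’s PDE-based algorithms.

```python
# Sketch: build a similarity graph over synthetic 2-D points, compute a few
# eigenvectors of the normalized graph Laplacian, and cluster in that spectral
# embedding.  Illustrates how Laplacian eigenvectors expose segment structure.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)

# Two noisy blobs as a stand-in "dataset".
pts = np.vstack([rng.normal(0.0, 0.5, (150, 2)),
                 rng.normal(3.0, 0.5, (150, 2))])

# Gaussian similarity graph and symmetric normalized Laplacian
# L = I - D^{-1/2} W D^{-1/2}.
W = np.exp(-cdist(pts, pts) ** 2 / (2 * 0.5 ** 2))
np.fill_diagonal(W, 0.0)
d = W.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L = np.eye(len(pts)) - D_inv_sqrt @ W @ D_inv_sqrt

# Eigenvectors for the smallest eigenvalues carry the segmentation information.
vals, vecs = eigh(L)
embedding = vecs[:, :2]

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)
print("cluster sizes:", np.bincount(labels))
```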