Computing Sciences Inspires at National Science Bowl

Last week, more than 9,000 of the nation’s brightest high school students and 4,500 middle school students convened in Washington, D.C., to compete in the Department of Energy’s (DOE’s) National Science Bowl, and Berkeley Lab’s Computing Sciences was there to help inspire the budding scientists.

The four-day, nationwide academic competition, which ran April 28 to May 2 this year, tests students’ knowledge in all areas of science and mathematics. During the event, three Berkeley Lab Computing Sciences staffers talked to students about research being conducted at two of DOE’s most productive user facilities.

Joaquin Correa, a computer systems engineer at NERSC, gave a talk entitled “From Detectors to High Performance Computing–Stories at NERSC.” Lauren Rotman, who heads ESnet’s Science Engagement Group, and Jason Zurawski, a science engagement engineer at ESnet, presented another talk: “Of Mice & Elephants: Science Networks vs. the Internet.”

Next Webinar in ‘Best Practices for HPC Software Developers’ Series on May 18

At 1 p.m. (PDT) on Wednesday, May 18, Barry Smith of Argonne National Laboratory will present the second webinar in a series on Best Practices for HPC Software Developers. The series is intended for HPC software developers who want to increase their team’s productivity and for facility staff who interact extensively with users.

Smith’s presentation, “Developing, Configuring, Building, and Deploying HPC Software,” is described as follows: “The process of developing HPC software requires consideration of issues in software design as well as practices that support the collaborative writing of well-structured code that is easy to maintain, extend, and support.  This presentation will provide an overview of development environments and how to configure, build, and deploy HPC software using some of the tools that are frequently used in the community.  We will also discuss ways in which these and other tools are best utilized by various categories of scientific software developers, ranging from small teams (for example, a faculty member and graduate students who are writing research code intended primarily for their own use) through moderate/large teams (for example, collaborating developers spread among multiple institutions who are writing publicly distributable code intended for use by others in the community).”

The webinars are a cooperative effort between NERSC, ALCF, OLCF and the IDEAS software productivity project. Participation in prior sessions is not required. Visit the series web page to view agendas and register for the series and to access previous presentations and recordings.

This Week’s CS Seminars

Monday, May 9

SDS-Sort: Scalable Dynamic Skew-aware Parallel Sorting
11 a.m. to 12 p.m., Bldg. 50B, Room 4205
Bin Dong, Scientific Data Management Group, Lawrence Berkeley National Laboratory

Parallel sorting is an essential algorithm in large-scale data analytics using distributed memory systems. As the number of processes increases, existing parallel sorting algorithms can become inefficient because of unbalanced workload. A common cause of load imbalance is the skewness of data, which is common in application data sets from physics, biology, and earth and planetary sciences. In this work, we introduce a new scalable dynamic skew-aware parallel sorting algorithm, named SDS-Sort. It uses a skew-aware partition method to guarantee a tighter upper bound on the workload of each process. To improve load balance among parallel processes, existing algorithms usually add extra variables to the sorting key, which increases the time needed to complete the sorting operation. SDS-Sort allows a user to select any sorting key without sacrificing performance. SDS-Sort also provides optimizations, including adaptive local merging, overlapping of data exchange and data processing, and dynamic selection of data processing algorithms for different hardware configurations and for partially ordered data. SDS-Sort uses local-sampling based partitioning to further reduce its overhead. We tested SDS-Sort extensively on Edison, a Cray XC30 supercomputer. Timing measurements show that SDS-Sort can scale to 130K CPU cores and deliver a sorting throughput of 117 TB/min. In tests with real application data from large science projects, SDS-Sort outperforms HykSort, a state-of-the-art parallel sorting algorithm, by 3.4X.
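
For readers unfamiliar with sample-based partitioning, the idea behind skew-aware bucketing can be sketched in a few lines. The abstract does not spell out SDS-Sort's actual partition method, so the following is only a single-process illustration of generic sample sort with one possible rule for spreading a heavily repeated key; it is not the authors' algorithm. All function names (choose_splitters, partition, merge_sorted_buckets) are hypothetical.

```python
import bisect
import random


def choose_splitters(data, num_parts, sample_size, rng):
    """Pick num_parts - 1 splitters from a sorted random sample of the data."""
    sample = sorted(rng.sample(data, min(sample_size, len(data))))
    return [sample[(i * len(sample)) // num_parts] for i in range(1, num_parts)]


def partition(data, splitters, skew_aware):
    """Assign each key to one of len(splitters) + 1 buckets.

    Plain rule: a key goes to the bucket indexed by the number of
    splitters that are <= the key.
    Skew-aware rule (an assumption made for this sketch): when a key equals
    a splitter value that appears more than once, it may legally go to any
    bucket between the first and last such splitter, so copies of that key
    are spread round-robin over that range.
    """
    buckets = [[] for _ in range(len(splitters) + 1)]
    next_slot = {}  # per-key round-robin counter
    for x in data:
        lo = bisect.bisect_left(splitters, x)
        hi = bisect.bisect_right(splitters, x)
        if skew_aware and hi - lo >= 2:
            span = hi - lo + 1  # buckets lo..hi can all hold x
            k = next_slot.get(x, 0)
            buckets[lo + k % span].append(x)
            next_slot[x] = k + 1
        else:
            buckets[hi].append(x)
    return buckets


def merge_sorted_buckets(buckets):
    """Sort each bucket locally and concatenate; the result is globally sorted."""
    out = []
    for bucket in buckets:
        out.extend(sorted(bucket))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    # Skewed input: one key accounts for 60 percent of all records.
    data = [500] * 6000 + [rng.randrange(1000) for _ in range(4000)]
    rng.shuffle(data)
    splitters = choose_splitters(data, num_parts=8, sample_size=256, rng=rng)
    for flag in (False, True):
        sizes = [len(b) for b in partition(data, splitters, skew_aware=flag)]
        print("skew_aware =", flag, "bucket sizes:", sizes)
```

In a distributed setting each bucket would correspond to one process, so the largest bucket size stands in for the worst-case per-process workload. With the plain rule, every copy of the repeated key lands in a single bucket, while the skew-aware rule divides those copies among several buckets without changing the sorting key.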

Tuesday, May 10

A Hybrid CFD-DSMC Model Designed to Simulate Physical Vapor Deposition
1 to 2 p.m., Wang Hall – Bldg. 59, Room 3034
Kevin Gott, The Pennsylvania State University

This research endeavors to better understand the physical vapor deposition (PVD) coating manufacturing process by determining the most appropriate fluidic model to accurately capture PVD processes. An initial analysis was completed based on the properties of titanium vapor and first-law principles. The results show that a dense Navier-Stokes solver best describes flow near the evaporative source, and that the vacuum will likely cause an extremely strong density gradient which transitions the flow into a highly rarefied state. Based on this result, a hybrid CFD-DSMC solver has been constructed in OpenFOAM for rapidly rarefying flow fields such as PVD vapor transport. The models are patched together using a unique patching methodology designed to take advantage of the one-way motion of vapor from the CFD region to the DSMC region.

Parameter studies are performed on a Navier-Stokes continuum based compressible solver, a Direct Simulation Monte Carlo (DSMC) rarefied particle solver, a collisionless free molecular solver and the hybrid CFD-DSMC solver to determine which features affect the deposition profile. The models are also compared to each other and appropriate experimental data to determine which model is most likely to accurately describe PVD coating deposition processes with the available information, requirements for computational design and reasonable computational resources. Overall, a hybrid CFD-DSMC solver with an improved equation of state for evaporated PVD materials is recommended to model PVD flow.

Remote Access Details

PC, Mac, Linux, iOS or Android: https://zoom.us/j/494758175
iPhone one-tap: 14086380968,494758175# or 16465588656,494758175#
Telephone: +1 408 638 0968 (US Toll) or +1 646 558 8656 (US Toll)
Meeting ID: 494 758 175
International numbers available: https://zoom.us/zoomconference?m=XFM7Evo6SYPoE5kDM-jVvkBfcyjp6Kbf
H.323/SIP room system: H.323: 162.255.37.11 (US West) or 162.255.36.11 (US East)
Meeting ID: 494 758 175
SIP: 494758175@zoomcrc.com

Link of the Week: No One Guessed AI Teaching Assistant Not Human

Jill Watson is a virtual teaching assistant. She was one of nine teaching assistants in an artificial intelligence online course. And none of the students guessed she wasn’t a human.

College of Computing Professor Ashok Goel teaches Knowledge Based Artificial Intelligence (KBAI) every semester. It’s a core requirement of Georgia Tech’s online master of science in computer science program. And every time he offers it, Goel estimates, his 300 or so students post roughly 10,000 messages in the online forums, far too many inquiries for him and his eight teaching assistants (TAs) to handle.

That’s why Goel added a ninth TA this semester. Her name is Jill Watson, and she’s unlike any other TA in the world. In fact, she’s not even a “she.” Jill is a computer, a virtual TA implemented on IBM’s Watson platform.

“The world is full of online classes, and they’re plagued with low retention rates,” Goel said. “One of the main reasons many students drop out is because they don’t receive enough teaching support. We created Jill as a way to provide faster answers and feedback.”