Data Science Club walks newbies through the hottest tools in research


The data duo: In their live videos, William Monroe (left) and Ravi Tripathi (right) take Data Science Club viewers through the basics of starting out in one of today's hottest fields.

Lots of people would like to hire a data scientist, or become one. For the past three years, data scientist has held the top spot on Glassdoor’s list of the best jobs in America based on earning potential and job satisfaction. And in 2019, it was the top job overall on LinkedIn’s rankings of the most promising jobs in the country.

“Data science is one of the most in-demand skill sets today, with the least supply of people who can do the job,” said William Monroe, a scientist in UAB IT Research Computing. Programs such as UAB’s new Master’s in Data Science are working to fill the gap. Meanwhile, researchers, especially those working in neuroscience and biostatistics at UAB, grasp the potential of techniques such as machine learning but “don’t know where to start,” Monroe said. “Machine learning is now where HTML was in 1997. Unless you had had an HTML course back then, you didn’t know how to build a website and it was all a mystery. These are the hot tools right now that everyone is figuring out how to use.”

During the past year, researchers with questions have ended up at Monroe’s door. That’s because Research Computing has lots of graphics processing units (GPUs), the hardware that has fueled the current artificial intelligence and machine-learning revolutions. “We have 72 Nvidia GPUs” in the Cheaha supercomputer cluster, Monroe explained. Data-science queries make up a growing share of the weekly office hours sessions that Research Computing offers, he added.


You don't come to the Data Science Club for cinematography. Each episode consists of a collaborative screencast as Monroe and Tripathi walk through the lesson, complete with common user errors.


Jump in today

To join the Data Science Club, visit rc.uab.edu and request an account on the Cheaha supercomputer, then visit the club’s YouTube page.

Join the club

So Monroe and colleague Ravi Tripathi, a software developer, found a way to scale up their training efforts. In April, they launched the Data Science Club. This isn’t a traditional lab journal club — it’s more like the Dollar Shave Club for machine-learning and biostats. For free.

The Data Science Club is open to “anybody who might be interested,” Monroe said. Each week, members get an online notebook of code that covers a fundamental data-science concept, along with a YouTube Live screencast of Monroe and Tripathi working through the lesson’s steps in real time. (The screencast “airs” live on Fridays from 10:30 to 11:30 a.m.) This is decidedly hands-on instruction of the type you would get if you came in to Research Computing’s office hours, except the schedule is entirely up to the user, Monroe said. “This work is best done when you’re sitting at your computer trying to do it.”

Monroe and Tripathi don’t have to worry about whether their students have the right hardware. Club members get their own accounts on UAB’s Cheaha supercomputer — the state’s fastest — and interact with it using any web browser through the new UAB Research Computing on Demand system. The first Data Science Club video walks participants through setting up the necessary Python and R programming languages in their user accounts.

Open office hours to discuss club content:
  • 1-3 p.m. Wednesdays in Lister Hill Library Room 425
  • 10 a.m.-noon Thursdays in Lister Hill Library Room 425

Getting over the hump

The videos don’t skip steps or delete operator errors. “Part of the experience is getting to see Ravi and I struggle through,” Monroe explained. “We forget to put in semicolons and make other mistakes, just like every user. That will hopefully make it more approachable.”

The ultimate goal is to build a community, Monroe said. As the participant list grows, he hopes that users can build collaborations. “We want to create a space where conversations can happen, where an undergraduate in electrical and computer engineering can come alongside a neurobiologist,” he said. “We want to generate that activation energy to push people over the hump of learning a new skill.”




More on Research Computing

More storage, better data, new partners: Zottola’s people-first approach to research computing core facilities


Your browser is the supercomputer: On Demand is a no-tears shortcut to research-computing
