Training future Data Scientists - Part 4: Growth - it's a process

Sun don't shine in the shade - Kanye West, Waves

Growth is not only for the students, but for the program itself and everyone involved in it. In 2017, growth was definitely on the cards: it was a time to stretch our methods, our relationships and our processes. Let’s take a look at some of the stretches we made in 2017.

Deep Dives

The CSIR is a research organisation, and there are many researchers around the CSIR Modelling and Digital Science (MDS) building. To tap into this talent during the DSIDE program, we work to create constant opportunities for collisions between student teams and researchers. One of these is the Friday project progress presentations. In the past these were called Shoe* and Tell (the "Show" was misspelled and we just kept it that way).

We tried to fit 6 project presentations into 60 minutes. This gave the teams an opportunity to present their work and ideas to other researchers and developers at MDS, while at the same time exposing the rest of MDS to what the DSIDE and Data Science teams were working on. The challenge we quickly identified was that there was too little time (7 minutes for each team with Q&A). In 2017, we altered this model and introduced both shorter and longer presentation formats.


Training future Data Scientists - Part 3: Breaking down the process

Nothing's ever promised tomorrow today, But we'll find a way - Kanye West, Heard 'Em Say

In this post I want to break down how we tackle the challenges that are chosen for the DSIDE program. So let's walk through the process.

How we look at problems

When we decide on the final challenge with a partner, we have a question that needs to be answered, data to be delved into and background knowledge that the project team will have to acquire. We now work with partners to define their project. First, the potential partners fill in a web form with the information needed to start scoping a project. We then follow up with promising project partners by sending a scoping sheet, which is currently inspired by the DSSG program. Once all of this is done, we can get to tackling the problem.


Training future Data Scientists - Part 2: Preplanning

Reach for the stars so if you fall you land on a cloud - Kanye West, Homecoming

Before the students step on to the CSIR campus, a lot of preparation happens. How much preparation? A lot! Let’s talk about preplanning. We are going to break it down into program design, finding partners & problems, and recruiting students.

Program Design and Preplanning

So how many people does it take to run a program with 50 students showing up every season? A lot. We have 4 core program leads at the moment: Nyalleng Moorosi, Dr. Quentin Williams, Dhiren Seetharam and myself. We work together on the design of the program and on the coordination and organisation of all the other parts. The program design is always a work in progress. Our aim is to deliver on the goal of the program: providing rigorous training that delivers value for our partners. As such we have set expectations on:

  • what happens in both parts of the season,
  • what a day should typically be like,
  • what happens during a typical week,
  • when deliverables are due,
  • what a deliverable is,
  • when we start,
  • what workshops will be available,
  • evaluation,
  • ambitions to get better,
  • which other non-curricular enhancements we add to the schedule.

Thus in this preplanning phase, we discuss the philosophies we all might have and what changes we might introduce in the new season. This is a collaboration that stretches all of us and pushes us to think about the impact our own decisions have on the program. Our ambitions for each season have to be high, and we are cognisant that this also means more pressure on the rest of the participants. To reach our goals, we work with other CSIR staff for recruitment, CSIR researchers as project leads, mentors who each oversee a single project, and others.


Training future Data Scientists - Part 1: What's in a DSIDE season?

Our work is never over - Kanye West, Stronger

We just finished another season of the Data Science for Insight and Decision Enablement (DSIDE) program. DSIDE is a Data Science training program that recruits 50 undergraduates (3rd & 4th years) and MSc/PhD students to come to the CSIR and tackle some of South Africa's challenges using a Data Science approach. The students spend 3 months at the CSIR, broken up into 1 month in the winter and 2 months in the summer. The program has been running since 2014 and the Department of Science and Technology is the main sponsor. You can find out more about the program on the program website. Now that the formalities are done, I want to look back at the just-finished season and highlight some of the changes, successes and failures. Running such a program is very interesting and stretches the limits of the program team every year. Just a caveat: the DSIDE program is run between the Modelling and Digital Science and Meraka units at the CSIR. Some experiences are shared between the groups of students based at both units, but some are unique to each unit. I will point out which is which in the post.

What's in a Season?

First, let's start by describing what actually happens during a season. The 50 students recruited work in groups of 2 or 3 on a number of projects. We have had about 16 projects a year in the last few years. I believe a team of about 3 per project is a good number. It makes it easy to break ties and make decisions :D. Our Data Science team at MDS takes 18 students, so 6 projects a year. The students are split into these teams and then assigned a project topic and a mentor. The project topic is not simply a description of the project, but also access to a partner (who contributed the project topic) and data. The teams work to tackle the project challenge during the 3-month period they are given. The 1st month is focused on exploratory data analysis (EDA). During this month the teams also refine their project challenge: after spending some time with the data, they get a sense of how feasible it is to tackle the challenge given the data, the partner interactions and the tools available.


Data Science Townhall and Q&A #1 (#DSIDE2017)

We have started the second session of the 2017/2018 DSIDE program. Our team at CSIR Modelling and Digital Science takes about 18 out of the 50 students on the program yearly. On the 8th of December we hosted Tefo Mohapi (Tech Journalist and Owner of iAfrikan).


Tefo talked a bit about his work and its relation to data. The Q&A session was free-flowing, with questions on the ethics of data, challenges with privacy, and POPI. What became apparent was the general lack of public knowledge of the Protection of Personal Information (POPI) Act.

I want to express a lot of gratitude to Tefo for availing himself for the session.
