Report on activities of ACEMS Early Career Researcher Retreat

Sam Clifford
18 December 2017

Introduction

This document lists the activities of the November 2017 ECR Retreat, particularly the project and discussion sessions. The retreat was held Sunday 29 October 2017 - Wednesday 1 November 2017.

The retreat was organised by Boris Beranger (UNSW), Sam Clifford and Miles McBain (QUT), Ross McVinish and Azam Asanjarani (UQ), Dinesha Ranathunga (Uni Adelaide), Eric Zhou (Monash), and Wilson Chen (UTS) with additional support from Jessie Roberts (QUT).

Schedule

Sunday 29 October 2017

Start	End	Activity
from 15:00		Welcome BBQ (at Eileen Peters Park, Cnr of Northcliffe Tce & Laycock St, Surfers Paradise)

Monday 30 October 2017

Level 1, Apollo room 2 and 3

Start	End	Activity
9:00	10:00	Welcome + Introductions + Identifying Research Projects
10:00	10:30	Morning Tea
10:30	11:00	Facilitated Discussion of Project Ideas
11:00	12:00	Work on Collaborative Projects
12:00	13:00	Lunch*
13:00	14:00	Discussion Sessions: Best Practice in Undergraduate Lecturing/Teaching, Challenges facing ECRs
14:00	15:30	Work on Collaborative Projects
15:30	16:00	Travel to Kurrawa Park (Broadbeach): tram from Surfers Paradise to Broadbeach (~10 min)
16:00	16:30	Speed Dating with PhD Students* (at Kurrawa Park)
16:30	19:00	BBQ & Games on the Beach* (at Kurrawa Park)
19:00	Late	Free Time

* Cross over with student retreat

Tuesday 31 October 2017

Start	End	Activity
9:00	9:45	Welcome + Identifying Research Projects
9:45	10:00	Work on Collaborative Projects
10:00	10:30	Morning Tea
10:30	12:00	Work on Collaborative Projects
12:00	13:00	Lunch*
13:00	14:00	Introduction to Julia*
14:00	15:30	Workshop Session: GitHub & R packages (Run by Nicholas Tierney)
15:30	16:00	Afternoon Tea
16:00	18:00	Career Trajectories: Invited Speaker Presentations by Prof. Aurore Delaigle (Uni Melb/ACEMS), Dr. John Vial (SOUNDelve) and Dr. Mark Lawrence (AMSI/ACEMS)
18:00	Late	Social evening with Invited Speakers (The Island, Cnr. Surfers Paradise Blvd & Beach Rd - self catered)

* Cross over with student retreat

Wednesday 1 November 2017

Start	End	Activity
9:00	10:00	Work on Collaborative Projects (Final preparation before presenting to the group)
10:00	10:30	Morning Tea
10:30	11:30	Group Presentations on Your Week's Work
11:30	12:30	Closing & Planning for Next Postdoc Retreat
14:00		Main Retreat Starts

* Cross over with student retreat

Project sessions

The following instructions were given to participants about proposing sessions for the retreat:

These sessions are about collaboration and discussing/working on projects you are interested in. They work like this:

You add projects you are interested in working on, or a topic you're interested in discussing, to the Issues page before the retreat. Precede posts with either PROJECT: or SESSION:
On day 1, we discuss the project ideas and you decide what you would like to work on.
During the collaborative sessions, you work on your project of interest.

The following instructions were given to participants about participating in sessions:

You can change projects as much as you like.
You can work on a project you didn't suggest.
There are no expected outcomes.
You are not expected to finish the project by the end of the retreat.
The aim is to learn new skills, meet new collaborators, work on a project you find interesting.

Project proposals are available at the Issues page and the programme at the Wiki. The session number corresponds to the item on the Issues page.

Sessions 1/14 - Teaching and Writing a book

These two sessions were run by Sam Clifford with participants Brodie Lawson, Tarunendu Mapder, Azam Asanjarani, Nick Tierney, Arthur Hung, Ross McVinish. In the first session, we discussed the issues around engaging students in first year service units, identifying and overcoming lack of background knowledge, designing teaching materials around problem based learning, live coding to show that teaching staff are capable of mistakes, whether we have the freedom to develop our own lecture material and whether this includes the case when there is no expectation as to content and approach from the school.

The second session, on writing a book, was focussed on the development of a book for first year STEM students that is not so much a comprehensive textbook as an overview of important topics from algebra, calculus, probability, statistics, visualisation, and communication. The book would be online, simply written, application-driven, and make use of storytelling techniques to provide a wide variety of applications in the sciences and engineering that require different types of mathematics and data analysis. Providing example code to copy, paste, run and modify would help show how these techniques are used to solve the problems, and additional contexts would then be provided for a "now you try" approach. The end of each chapter would show a little more detail for the theory and provide references to other resources (preferably online and open) for those who wish to read more about the topic.

A GitHub repo and Slack channel have been created for Sam, Azam, Arthur, Tarun, Jessie Roberts and Nick to continue work on this project.

Session 2 - Speed dating

This session was run by Sam Clifford at the Monday evening BBQ jointly with the PhD students. 40 minutes was spent at the park using a lottery system (pairs with raffle tickets) to give the PhD students and ECRs some time to ask questions of each other about what kind of research they did as well as what research life is like. A bag of question prompts was made available for when participants couldn't think of what to say.

Feedback from students was that they appreciated getting the opportunity to talk to one or two postdocs and even to discuss their issues one on one with each other. Postdocs enjoyed the experience of talking to the students and getting to know the wider ACEMS community. It was heartening that after the session itself had wrapped up students and postdocs continued talking to each other throughout the BBQ. This format could be repeated in the future, and it would be good to have some AIs and CIs available so that they have contact with the other levels of ACEMS.

Session 9 - Challenges faced by ECRs

This session was run by Azam Asanjarani. In this session, we considered some challenges faced by ECRs, such as supervising students, the best way to start a fruitful collaboration, doubts about an academic career, and how to prepare for an academic job interview. There were more than 10 ECRs in this session and we enjoyed learning from Prof. Matt Wand's and Prof. Robert Kohn's comments and experiences.

Session 10 - HMMs, DBNs

This session was run by Paul Wu and was attended by Rhys Bowden, Jing Fu and Robert Kohn.

Explained how Dynamic Bayesian Networks work in terms of inferences and their application to complex systems
Discussed the issue of inference under uncertainty where the current framework treats evidence as weightings rather than with likelihoods
Side discussion on paths in Markov random fields (headed by Jing).
Separate discussion with Robert Kohn, who was interested in future collaboration along the lines of variational Bayes and incorporating a loss function to enable optimisation (e.g. absolute difference between baseline trajectory and dredging trajectory).
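
As an illustration of the kind of inference covered in this session, the simplest Dynamic Bayesian Network is a hidden Markov model, in which the belief about a hidden state is updated by alternating a prediction step and a likelihood-weighted update. The sketch below is a generic textbook forward filter, not the model presented in the session; the two-state system and every probability in it are made-up values chosen only for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Filtered beliefs P(state_t | obs_1..t) for a discrete HMM.

    prior[i]         : P(state_1 = i)
    transition[i][j] : P(state_{t+1} = j | state_t = i)
    emission[j][o]   : P(obs = o | state = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: push the previous belief through the transition model.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight each state by the likelihood of the observation.
        belief = [belief[j] * emission[j][obs] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Hypothetical two-state system (state 0 = "healthy", state 1 = "degraded").
prior = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
emission = [[0.8, 0.2], [0.3, 0.7]]  # columns: observation 0 or 1
beliefs = forward_filter(prior, transition, emission, [0, 0, 1])
for step, b in enumerate(beliefs, start=1):
    print(step, [round(p, 3) for p in b])
```

Each row of `beliefs` is a probability distribution over the hidden states after seeing the observations up to that time step.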

Sessions 12/19 - GitHub and R packages

Researchers often write code to perform an analysis. When writing code, it is quite natural to have to run similar batches of code: for example, getting the data into the right format for analysis, fitting a new model to different datasets, or creating some diagnostic statistics and visualisations. As a researcher, you will likely need to share the code you have written - this can be with collaborators, for a journal publication, or you might want to share the work that you have created with the greater public. In the popular programming language R (ranked in the top 10 programming languages), the fundamental unit of shareable code is an R package - it bundles together your code, data and documentation, and is very easy to share with others.

Another important component in writing your code is managing what changes as you progress through writing it, and also developing a system or protocol for managing who accesses your code and how changes are added back in. This system or protocol is typically called "version control". If you have ever had documents named like "Final1", "Final2" or "2017-01-01-analysis-of-health-data", then you have done version control. Git is regarded as the best (and also perhaps the most complicated) system for version control. GitHub is a company that has an online platform to manage Git repositories and share your code with the rest of the world.

This workshop discussed R packages - why you should care, how to build one, and how to write documentation. We then discussed how to share this package online with Git and GitHub, and some basic workflows for Git.

The workshop was successful in demonstrating these principles of package development and version control to participants, and in showing them that it is a very achievable thing to do. In total about 8-12 participants attended the course. A few of the participants commented that they would use R packages in their work. We were hindered by some unfortunately typical problems when running a computer workshop - getting everyone's machine up and running with the appropriate software. This could be improved in the future by having a much better internet connection, and also with more planning ahead of time, where we could give participants instructions to have certain software installed before the course started. It would also have been helpful to have a few helpers / teaching assistants to help get everyone set up.

As an aside, it would be of great benefit to many researchers to be upskilled in research computing through a program such as Software Carpentry, which would cover skills in using Git, GitHub, the Unix shell, R programming, and literate programming (using R Markdown). Please contact Nick Tierney (nicholas.tierney@gmail.com) to discuss this further.

Session 13 - Data, data everywhere

This session was run by Nick Tierney.

At this session we sat down and discussed what data we use in our research, the idea being that some possible outcomes could be:

establishing collaborations using the same data
publishing the data online for others to access
identifying public datasets that can be used for research problems

We put together a proposal for how the ACEMS website could list projects, papers, and also datasets for each person at ACEMS. One approach was:

Link three databases/tables
Person ID: ID for each researcher (ORCID?)
Project ID: ID for each project that they work on (ARC code, or equivalent)
Data ID: DOI to each data set (or placeholder)
Have a "keywords" search, so we can search the datasets
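
The linked-table idea above can be sketched with an in-memory SQLite database. This is only one possible layout, not an agreed ACEMS design: the table and column names, the two link tables, and the placeholder records (`person-001`, `project-001`, `data-001`) are all assumptions made for illustration, and a real system would store ORCIDs, ARC codes and DOIs in the ID columns.

```python
import sqlite3


def build_registry():
    """Create an in-memory database with person, project and dataset tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person  (person_id  TEXT PRIMARY KEY, name  TEXT NOT NULL);
        CREATE TABLE project (project_id TEXT PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE dataset (data_id    TEXT PRIMARY KEY, title TEXT NOT NULL,
                              keywords   TEXT NOT NULL);
        -- Link tables: who works on which project, which project uses which data.
        CREATE TABLE person_project (
            person_id  TEXT REFERENCES person(person_id),
            project_id TEXT REFERENCES project(project_id));
        CREATE TABLE project_dataset (
            project_id TEXT REFERENCES project(project_id),
            data_id    TEXT REFERENCES dataset(data_id));
    """)
    return conn


def search_datasets(conn, keyword):
    """Return (dataset title, researcher name) pairs matching a keyword."""
    return conn.execute(
        """
        SELECT d.title, p.name
        FROM dataset d
        JOIN project_dataset pd ON pd.data_id = d.data_id
        JOIN person_project pp ON pp.project_id = pd.project_id
        JOIN person p ON p.person_id = pp.person_id
        WHERE d.keywords LIKE ?
        ORDER BY d.title, p.name
        """,
        (f"%{keyword}%",),
    ).fetchall()


# Placeholder records only; none of these refer to real people or datasets.
conn = build_registry()
conn.execute("INSERT INTO person VALUES (?, ?)", ("person-001", "Researcher A"))
conn.execute("INSERT INTO project VALUES (?, ?)", ("project-001", "Example project"))
conn.execute("INSERT INTO dataset VALUES (?, ?, ?)",
             ("data-001", "Air quality readings", "air quality, sensors"))
conn.execute("INSERT INTO person_project VALUES (?, ?)", ("person-001", "project-001"))
conn.execute("INSERT INTO project_dataset VALUES (?, ?)", ("project-001", "data-001"))
print(search_datasets(conn, "sensors"))
```

Keeping the links in separate tables lets one person work on many projects and one dataset serve many projects, which matches the aim of finding collaborators who use the same data.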

We then discussed how it would be useful to have more discussions about data within ACEMS, with one idea being that ACEMS could discuss, and possibly even make a public statement about data sharing and data formatting. This could, for example, discuss data from organisations such as BOM, or police data, which in Australia are severely lacking in terms of data access, compared to the USA.

We then discussed how ACEMS could connect up with other centres to share data, with Robert Kohn suggesting that we could work together with Sydney University (Centre for Translational Data Science), who have access to crime and hospital data. Boris also suggested some collaboration with the Centre of Excellence for Climate Extremes. Another institute that came up as being of interest was the Australian Urban Research Infrastructure Network (AURIN).

We also discussed that ACEMS should promote data sharing, and could perhaps use services such as:

DOI issuing service (in line with "Good Enough Practices in Scientific Computing"), like ANDS: http://www.ands.org.au/
ANDS provides a directory to various data portals https://researchdata.ands.org.au/theme/open-data https://researchdata.ands.org.au/search/#!/class=collection/access_rights=open/

Session 18 - Egg Packing Competition

The egg packing competition was a competition (with prizes!) made available to ACEMS students. After some interest from some of the Postdocs, a separate competition was also made available to that group. The basic page for the problem is here and the page created for the Postdocs is here. In both cases, the goal is the same - given a set of points on a 2D square, each with an associated circle of given radius, we must find the largest number of these circles we can "turn on" in order to maximise the area covered. This is referred to as "egg packing" because the provided application was a set of locations for a frog to lay her eggs such that the maximum number of eggs would be able to grow to full size. To give an idea of the scope of the problem, the postdoc competition involved 100,000 possible sites, and good solutions were turning on 2100+ sites - there are a very large number of possible combinations and only a small fraction of the sites will end up turned on.

We used the opportunity of the retreat to compare and contrast the submitted solutions to the postdoc side of the competition, then coded up the submitted algorithms in a consistent framework so we could compare their solution-finding performance and timings. These varied from a simple greedy algorithm with a clever piece of intuition that allowed for the generation of surprisingly good results, to an algorithm that sought to gradually explore very large numbers of possible states, whilst guaranteeing that the solutions it generated never grew worse. The decision was reached to write up the problem and the techniques applied to solve it in a more formal fashion - at least as a full blog post, or possibly a conference or arXiv article. To this end, we investigated what possible links there are between this problem and others of mathematical interest, also noting that the problem can be formulated exactly as an integer programming problem. Our discussions at the retreat culminated with a presentation.
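
To give a flavour of the greedy end of that spectrum, here is a minimal baseline sketch. It is not one of the submitted algorithms and does not include the "clever piece of intuition" mentioned above. It assumes that circles which are turned on may not overlap (our reading of eggs growing to full size) and that the aim is to turn on as many sites as possible; the function names and the three example sites are hypothetical.

```python
import math
from itertools import combinations


def greedy_pack(sites):
    """Greedily select non-overlapping circles, smallest radius first.

    sites: list of (x, y, r) tuples. Returns the indices of selected sites
    in the order they were accepted.
    """
    order = sorted(range(len(sites)), key=lambda i: sites[i][2])
    chosen = []
    for i in order:
        x, y, r = sites[i]
        # Accept the site only if it clears every circle accepted so far.
        if all(math.hypot(x - sites[j][0], y - sites[j][1]) >= r + sites[j][2]
               for j in chosen):
            chosen.append(i)
    return chosen


def is_feasible(sites, chosen):
    """Check that no two selected circles overlap."""
    return all(
        math.hypot(sites[a][0] - sites[b][0], sites[a][1] - sites[b][1])
        >= sites[a][2] + sites[b][2]
        for a, b in combinations(chosen, 2)
    )


# Hypothetical example: the large middle circle blocks a small one,
# so the greedy pass keeps the two small circles instead.
example = [(0.2, 0.2, 0.1), (0.25, 0.2, 0.3), (0.8, 0.8, 0.1)]
selected = greedy_pack(example)
print(selected, is_feasible(example, selected))
```

Each candidate is compared against every accepted circle, so this version is only practical for small inputs; at the scale of the postdoc competition (100,000 sites) a spatial index such as a uniform grid would be needed to keep the overlap checks cheap.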

Primary members of this Project group were Brodie Lawson, Wilson Chen, Ross McVinish and Slava Vaisman, although many others did also contribute!

Session 27 - Athlete data

Two sessions were run by Paul Wu. The first was attended by Steve Psaltis and Ali Tirdad, who were joined in the second session by Rhys Bowden and Jing Fu.

We explored the Athlete Monitoring System (AMS) dataset and the challenges involved in maximising performance whilst minimising risk of injury or illness. We identified two challenges: (i) the need for some way to make injuries/illness and performance comparable in an objective function for optimisation, and (ii) how to correlate instances of training and other activities over time with performance and injury outcomes. We investigated the current dose-response model for training, fatigue and adaptation and how it might be adapted in a probabilistic framework. There is potential for future collaboration if grants with QAS/AIS are successful.

Conclusion

Informal discussion with workshop participants indicated an appreciation of the unstructured nature of the workshop and a desire to repeat the same approach to planning the next workshop. The organisers suggest that for the next retreat it would be good to have a member of each ACEMS node on the organising committee in order to ensure that all ACEMS ECRs are familiar with the planned structure and engaging with it prior to arrival at the retreat.
