Organizations collect data from a wide range of sources to produce usable insights that drive competitive advantage. Participants will learn about the challenges of collaboratively modelling and manipulating such large volumes of data. Through practical exercises, they will also gain hands-on experience with advanced data science tools for modelling and manipulating data sets.
This workshop will be delivered by experienced analysts from the McGill High Performance Computing Centre.
Participants will be encouraged to work in collaborative teams and to apply the following tools and techniques to their own business and organizational data:
- Statistical modelling tools: R and Python (SciPy)
- Image analysis tools: ParaView
- Distributed data management over clusters: Hadoop and MapReduce
At the end of this workshop, participants will:
- Be introduced, through hands-on exercises, to applying statistical models and mathematical analyses to both small and large data sets;
- Develop skills to start applying these analysis methods to their own organizational data in order to obtain useful practical insights;
- Deepen their understanding of the advanced information technologies and tools available to model and manipulate large data sets.