Job ID : 43674

ML Runtime Engineer

Cerebras Systems - Computer Science

JOB POSTING INFORMATION

Position Type:

Professional Experience Year Co-op (PEY Co-op: 12-16 months)

Job Title:

ML Runtime Engineer

Job Location:

Toronto

Job Location Type:

Flexible

Salary:

$42.00 hourly for 40.0 hours per week

Start Date:

05/06/2024

End Date:

04/25/2025

Job Function:

Information Technology (IT)

Job Description:

Cerebras Systems has pioneered a groundbreaking chip and system that revolutionizes deep learning applications. Our system empowers ML researchers to achieve unprecedented speeds in training and inference workloads, propelling AI innovation to new horizons.

The recently announced Condor Galaxy 1 (CG-1) reflects Cerebras' commitment to pushing the boundaries of AI computing. With 4 exaFLOPs of processing power, 54 million cores, and a 64-node architecture, CG-1 is the first of nine supercomputers to be built and operated through an exclusive partnership between Cerebras and G42. The collaboration aims to create a network of interconnected supercomputers that will collectively deliver 36 exaFLOPs of AI compute upon completion in 2024.

Cerebras is building a team of exceptional people to work together on big problems. Join us!

About The Role
As a Runtime Engineer, you will directly impact how fast deep learning models train on our distributed-systems hardware and be responsible for enabling next-generation AI applications that require substantial computational capabilities. In this position, you will develop algorithms for execution, acceleration, partitioning, and routing of communication for dataflow graphs on a massively parallel, multi-core architecture.

Specific responsibilities may include:

Understand the flow of data in a distributed system and characterize performance pain points
Develop algorithms for allocation of compute, communication, and memory resources
Measure, analyze, and improve execution of the Runtime software responsible for training large models on massive datasets
Integrate successful optimizations into the production software stack
Implement mathematical models in C++ or Python using discrete optimization techniques and standard libraries and packages

Job Requirements:

Requirements

Enrolled in the University of Toronto's PEY Co-op program, pursuing a degree in Computer Science, Computer Engineering, or a related discipline
Strong proficiency in C/C++
Familiarity with Python or another scripting language
Ability to operate at multiple levels of abstraction in the software stack

Preferred

Knowledge of distributed systems, the memory subsystems of modern computers, and networking solutions

Preferred Disciplines:

Computer Engineering

Computer Science

Engineering Science (Electrical and Computer)

Engineering Science (Infrastructure)

Engineering Science (Machine Intelligence)

Engineering Science (Robotics)

Targeted Co-op Programs:

Professional Experience Year Co-op (12 - 16 months)

APPLICATION INFORMATION
Application Deadline:	Nov 1, 2023 11:59 PM
Application Receipt Procedure:	Online via system
Additional Application Information:	Please apply with both a resume and a transcript. Applications without a transcript will not be considered. Applications are reviewed on a rolling basis, so apply as early as possible.
U of T Job Coordinator:	Yasmine Abdelhady

ORGANIZATION INFORMATION
Organization:	Cerebras Systems
Division:	Computer Science
Website:	https://cerebras.net/

ADDITIONAL INFORMATION
Length of Workterm:	FLEXIBLE PEY Co-op: 12-16 months (range)
