Aimed at "dramatically" improving the productivity of deep learning developers, the company's Determined AI Platform tightly integrates all of the features that a deep learning (DL) engineer needs to train models at scale. The platform, says the company, manages users' heterogeneous hardware and optimizes their GPU resource utilization. It now powers teams of DL engineers and large GPU clusters in industries such as pharmaceutical drug discovery, adtech, industrial IoT, and autonomous vehicles, and is ready for widespread adoption.
Until now, says the company, a lack of software infrastructure has been a fundamental bottleneck in achieving AI's immense potential for everyone except tech giants like Google, Facebook, and Microsoft, which have invested massive resources and expertise to build proprietary, AI-native internal infrastructure. For everyone without access to such infrastructure, building practical applications powered by AI remains prohibitively expensive, time-consuming, and difficult.
"We started Determined AI three years ago to bring AI-native software infrastructure to the broader market," says the company in a blog post announcing the move to open source. "Working closely with cutting-edge deep learning teams across a variety of industries, a clear narrative emerged: without better infrastructure, training deep learning models at scale remains extremely difficult, as organizations move from research to production."
That feedback, says the company, led it to build the Determined Training Platform, which the company has now open sourced under the Apache 2.0 license. The platform offers the following features:
- High-performance distributed training: Determined’s distributed training support builds upon Horovod, a popular distributed training framework, but includes a suite of optimizations that, according to the company, delivers twice the performance of stock Horovod. Moreover, Determined’s distributed training support is easy to set up (no code changes are needed to move from single-GPU to distributed training) and allows multiple users to seamlessly share the same GPU cluster.
- State-of-the-art hyperparameter search: Determined’s hyperparameter search functionality integrates tightly with the company's job scheduler and is parallel by