ML platforms (MLPs) are hard to get right. It isn’t uncommon for custom MLPs to commit design sins, such as false prescription, that leave them hard to use or limited. In some cases, the ML life cycle has done more harm than good, focusing engineering teams on common activities instead of common computing abstractions. Leveraging existing systems principles, we propose a possible layered approach to ML systems.
As a tangible example, we focus on data versioning, implementations of which exist across commercial and private MLPs. We describe our experiences developing and using Disdat, an open-source data versioning system, to make the case for interoperable ML systems that can accommodate complexity and innovation.