Cleantech innovator EnPowered equips businesses with insights they can use to reduce their energy costs. ML and AI are key to this. EnPowered uses ML models and AI-driven solutions to analyze and predict energy consumption patterns and market rates for clients. Behind the scenes, its data science team works to continually improve the quality and performance of this software.
EnPowered relies on Shakudo, a modular, adaptable operating system for data infrastructure, to achieve business goals. Shakudo streamlines DataOps and MLOps in the cloud, enabling EnPowered’s data scientists to focus on developing and operationalizing AI solutions — instead of spending time setting up and configuring their data stack. Shakudo also gives EnPowered reusable patterns and best practices it can use to quickly implement workflows for common data management and ML processes.
The Challenge
Before Shakudo, most of EnPowered’s ML development took place on each individual data scientist’s local laptop. The lack of a consistent ML development platform made it difficult for them to share models, code, data, and other artifacts. In practice, EnPowered’s data scientists were spending a significant part of each workday debugging and resolving software dependency and versioning issues within their data stack. Only after fixing these issues could they focus on ML-related tasks.
But ML development also took longer than it should. Data-prep or model-training workloads that would have taken minutes on a cloud compute cluster took hours on their local systems. To make matters worse, they didn’t have a central repository for data, says EnPowered CTO Mike Kirkup.
“All the data that we would collect to train models, for example, would be individually collected on a specific data scientist's laptop — there was no way for other teams to access that, let alone visualize it,” Kirkup explains. “There was very little ability for data scientists to collaborate, and very little ability to multitask or work on multiple projects at the same time.”
EnPowered wanted to shift ML development to the cloud, but neither the data science team nor IT wanted responsibility for owning, operating, and maintaining the cloud ML stack. The data science team, especially, was eager to move away from the time-consuming work of maintaining its data stack.
Enter Shakudo, whose modular, pre-configured data stack components would allow EnPowered’s data scientists to self-serve in the cloud, without having to learn how to provision cloud compute and storage resources, set up cloud VPC environments, or configure secure connectivity to cloud services.
“With Shakudo, if we want to train models in the cloud, we can easily allocate extra capacity if it’s a large job. Shakudo equips the data scientists themselves to make that call,” Kirkup says. “And we don’t have the risk of the person who designed our stack, or the person who maintains it, deciding to leave the company. Shakudo makes us a lot more robust.”
The Shakudo Solution
Before Kirkup and EnPowered landed on Shakudo, they considered a DIY approach, hoping to leverage the infrastructure-as-code (IaC) services offered by their cloud provider.
EnPowered’s data science team had already tried, and rejected, IaC in its prior efforts, where it had relied on a combination of scripts, code, and workflow orchestration software.
The team had come to recognize that, while IaC can enable a scalable, version-controlled approach to managing infrastructure and software, it is a massive undertaking. Organizations that practice IaC must assume responsibility not only for the IaC codebase itself, but also for the underlying software and tools, such as Terraform, used to define and provision infrastructure, along with platforms like Kubernetes.
For a vendor that markets a fully managed service, IaC is more than just a core competency; it’s the basis of its business. For most companies, however, IaC isn’t a core competency. Besides, the thing IaC helps with the most is the infrastructure aspect of cloud operations: setting up and tearing down resources. Ironically, this is the easy part, as cloud infrastructure APIs tend to change far less frequently than the software workloads running on top of them. With software, the challenge isn’t just to set up and tear down the components of an ML stack (such as MLflow, Apache Superset, Ray, and Prefect), but to get them to interoperate with one another, scale them appropriately, and resolve software dependency and versioning issues.
Shakudo’s modular data stack platform is designed to obviate problems like these.
With Shakudo, EnPowered data scientists wouldn’t have to worry about software dependency or versioning issues. Instead, they could select from among pre-configured data stack components that pre-integrate all required binaries, libraries, and other dependencies. Even more important, Shakudo, not EnPowered, would be responsible for maintaining these components. With Shakudo, the data science team would be able to focus on the core work of AI development — without wasting time operating and maintaining the software that supports it.
Outcomes
Shakudo allowed EnPowered to eliminate the local silos that had stifled ML development, giving the company’s data scientists the equivalent of self-service cloud resources they could spin up and shut down on demand.
Thanks to Shakudo, EnPowered was able to:
- Equip its data scientists to serve themselves. Before Shakudo, EnPowered's data scientists trained models on their laptops, unable to parallelize ML processing across large clusters or take advantage of advanced GPU compute capabilities. In the cloud, Shakudo makes this easy. Its pre-configured data stack components allow EnPowered’s data scientists to self-serve, giving them a way to easily run ML and other workloads in the cloud, without having to master the intricacies of IaC or involve SREs.
- Sidestep software setup and configuration challenges. Shakudo completely automates the tasks involved in provisioning cloud infrastructure and software. It offers a large selection of pre-configured data stack components for running data engineering, analytic, and ML workloads, meaning data scientists don’t have to manually install, configure, or troubleshoot software. Instead, they can select their preferred data stack components, specify certain minimum requirements, and let Shakudo take care of the rest, provisioning the appropriate resources they need to support their workloads.
- Accelerate ML development and improve collaboration. With Shakudo, EnPowered’s data scientists were able to shift ML training and other workloads to cloud infrastructure, taking advantage of cloud elasticity to run them on high-performance clusters. Access to cloud object storage not only helped simplify data discovery and access, but also encouraged sharing, eliminating data silos and reducing the risk of potential data loss.
- Eliminate uncertainty, especially in the cloud. Shakudo offers the equivalent of reusable data stack “patterns” that EnPowered’s data science team can leverage to simplify repetitive or complex tasks, like feature engineering and model training. Data scientists can even create their own custom patterns based on Shakudo’s data stack components, designing reusable workflows to create and/or update a feature store; retrain, test, and redeploy a production model; etc. Many users are unsure how to implement these and other processes in the cloud; Shakudo’s data stack patterns provide them with workflows and best practices they can trust.

“In terms of best practices, in terms of structure, in terms of overcoming some of that initial uncertainty around how we architect for certain tasks and what pieces we use, Shakudo’s expertise has been essential,” Kirkup says.
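To make the idea of a reusable workflow "pattern" concrete, here is a minimal sketch in plain Python of a retrain, test, and redeploy sequence packaged as a single reusable object. This is an illustration only and does not use or represent Shakudo's API: the `Pattern` class, the step functions, the context keys, and the toy "model" (a simple mean) are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical illustration only: none of these names come from Shakudo.
Step = Callable[[Dict], Dict]


@dataclass
class Pattern:
    """A named, ordered list of steps that can be reused across projects."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self, context: Dict) -> Dict:
        # Each step receives the context from the previous step and returns a new one.
        for step in self.steps:
            context = step(context)
        return context


def retrain(ctx: Dict) -> Dict:
    # Stand-in for model training: the "model" is just the mean of the targets.
    targets = ctx["targets"]
    return {**ctx, "model": sum(targets) / len(targets)}


def evaluate(ctx: Dict) -> Dict:
    # Mean absolute error of the toy model on held-out values.
    holdout = ctx["holdout"]
    mae = sum(abs(y - ctx["model"]) for y in holdout) / len(holdout)
    return {**ctx, "mae": mae, "passed": mae <= ctx["max_mae"]}


def redeploy(ctx: Dict) -> Dict:
    # Only mark the model as deployed if it passed evaluation.
    return {**ctx, "deployed": ctx["passed"]}


retrain_pattern = Pattern("retrain-test-redeploy", [retrain, evaluate, redeploy])

result = retrain_pattern.run(
    {"targets": [10.0, 12.0, 14.0], "holdout": [11.0, 13.0], "max_mae": 2.0}
)
print(result["model"], result["mae"], result["deployed"])
```

In a sketch like this, the same `retrain_pattern` object could be run against different datasets or thresholds, which is the sense in which a pattern is reusable; a real platform would replace each toy step with actual training, validation, and deployment jobs.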
“One of the biggest improvements, from an engineering perspective, is the speed with which the data science team is able to deploy their work. Before, development was a huge bottleneck, and it could take a week or two to deploy new models. Shakudo made a really big difference in terms of our ability to iterate and move quickly.” - Mike Kirkup, CTO, EnPowered
Looking Ahead
Right now, EnPowered’s data science team is laser-focused on refining the accuracy of its models, especially in challenging markets with narrow margins for error. This requires running large-scale compute workloads in the cloud, with iterative rounds of model training, tuning, and validation yielding marginal gains in accuracy. Data scientists can’t avoid this work, but they can lean heavily on Shakudo to accelerate it, quickly provisioning and deprovisioning the cloud infrastructure resources they need to test and evaluate their work.
Going forward, one of EnPowered’s priorities is to make use of new types of data from novel sources, including real-time environmental telemetry and weather data. EnPowered’s data scientists can take advantage of Shakudo’s modular data stack components to accelerate this work, too, with Shakudo’s pre-built patterns simplifying the process of creating reusable workflows for data ingestion, profiling, cleansing, validation, and other core data preparation tasks. This will make it much easier for them to acquire and prepare data they need to use in exploratory data analysis, feature engineering, model training, model optimization, and other workflows.