HPE Machine Learning Data Management Software
Do you need to scale your machine learning (ML) and data pipelines when processing large amounts of structured and unstructured data?
HPE Machine Learning Data Management Software delivers a flexible data layer platform that automates complex ML and data pipelines while providing data versioning and lineage for reproducibility. Increase the performance of your pipelines with autoscaling and parallel processing built on Kubernetes for resource orchestration. The platform works with standard object stores, deduplicates data, and automatically triggers pipelines when data is modified, saving your engineers time and resources. It also features immutable data lineage with versioning of any data type, so results are traceable and any outcome can be reproduced.
What's New
- Data pipelines with versioning and lineage for complete reproducibility.
- Scale workloads to petabytes with autoscaling and parallel processing.
- Data-driven pipelines intelligently triggered by detecting changes to data.
- Drive collaboration and scale teams with a modular, shareable, central platform that serves as a single source of truth.
- Generate simpler, clearer, and easier-to-debug code with the improved Python-based SDK.
Key Features
Scalability and Performance
Leveraging Kubernetes for resource orchestration, HPE Machine Learning Data Management Software can process petabytes of data and billions of records using autoscaling and parallel processing across multiple nodes.
Automatically deduplicate data and process only the new or changed data (incremental processing) so models always have the most current data.
A Git-like, modular structure of repos and pipelines enables teams to scale, share resources, and collaborate effectively.
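The deduplication and incremental processing described above can be illustrated with a small, self-contained Python sketch. It is not the product's API: `DedupStore`, `content_hash`, and `changed_paths` are hypothetical names, and the sketch assumes that data is identified by a SHA-256 digest of its content, which is one common way to implement this idea.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Identify a blob by the SHA-256 digest of its content."""
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Toy content-addressed store (hypothetical): identical bytes are kept once."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes

    def put(self, data: bytes) -> str:
        digest = content_hash(data)
        # setdefault stores the bytes only if this digest has not been seen.
        self.blobs.setdefault(digest, data)
        return digest


def changed_paths(previous: dict, current: dict) -> list:
    """Return paths that are new or whose digest differs from the previous run."""
    return sorted(p for p, d in current.items() if previous.get(p) != d)


store = DedupStore()
run1 = {"a.csv": store.put(b"1,2,3"), "b.csv": store.put(b"4,5,6")}
run2 = {
    "a.csv": store.put(b"1,2,3"),  # unchanged since run1
    "b.csv": store.put(b"4,5,7"),  # modified
    "c.csv": store.put(b"1,2,3"),  # new path, duplicate content
}

# Only the new or modified paths need to be reprocessed.
to_process = changed_paths(run1, run2)
```

In this sketch, the second run reprocesses only `b.csv` and `c.csv`, and the store holds three blobs rather than five because repeated content is stored once.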
Reproducibility and Automation
With immutable data lineage and data versioning, HPE Machine Learning Data Management Software provides complete reproducibility of any outcome.
Full versioning for data and metadata including all analysis, parameters, artifacts, models, and intermediate results.
Data-driven pipelines are automatically triggered by detecting changes to data (additions and modifications), pipelines, or code.
Combine automation, data versioning, and parallel processing to transform expensive and unpredictable projects into streamlined, enterprise-grade AI/ML production workflows through end-to-end pipelines.
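To make the ideas in this section concrete, the following Python sketch shows one way immutable commits, lineage, and data-driven triggering can fit together. It is an illustration under stated assumptions, not the product's implementation: `Commit`, `Repo`, and `pipeline` are hypothetical names, and commit IDs are assumed to be derived from a hash of the commit's content.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Immutable snapshot of a repo (hypothetical model)."""
    repo: str
    files: tuple            # sorted (path, content) pairs
    parent: str = ""        # id of the previous commit in the same repo
    provenance: tuple = ()  # ids of the input commits this one was derived from

    @property
    def id(self) -> str:
        payload = json.dumps([self.repo, self.files, self.parent, self.provenance])
        return hashlib.sha256(payload.encode()).hexdigest()[:12]


class Repo:
    """Append-only commit history that notifies subscribers of new commits."""

    def __init__(self, name: str):
        self.name = name
        self.commits = []
        self.subscribers = []

    def commit(self, files: dict, provenance=()) -> Commit:
        parent = self.commits[-1].id if self.commits else ""
        c = Commit(self.name, tuple(sorted(files.items())), parent, tuple(provenance))
        self.commits.append(c)
        for callback in self.subscribers:
            callback(c)  # data-driven trigger: runs on every new commit
        return c


def pipeline(input_repo: Repo, output_repo: Repo, transform) -> None:
    """Run `transform` on each new input commit and record where the output came from."""

    def on_commit(c: Commit) -> None:
        result = {path: transform(text) for path, text in c.files}
        output_repo.commit(result, provenance=[c.id])

    input_repo.subscribers.append(on_commit)


raw = Repo("raw")
clean = Repo("clean")
pipeline(raw, clean, str.upper)

first = raw.commit({"notes.txt": "hello"})
second = raw.commit({"notes.txt": "hello world"})
```

Each commit to `raw` produces a matching commit in `clean` without any manual step, and every output commit carries the ID of the input commit it was derived from, so a result can be traced back to the exact data that produced it.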
Flexible Platform
Language and tooling agnostic, HPE Machine Learning Data Management Software allows engineers complete autonomy to use whatever languages, frameworks, or libraries are best for their use case.
Data agnostic, this solution supports both structured and unstructured data, in batch or streaming form.
Runs inside Docker containers on Kubernetes for complete portability to the cloud or on-premises.
Cost Effective for ML Workloads
HPE Machine Learning Data Management Software can reduce compute and storage costs by processing only new or changed data and deduplicating data.
Customize resource utilization by allocating workloads to CPU or GPU resources as the use case demands.
Docker is a trademark or registered trademark of Docker, Inc. in the United States and/or other countries. All third-party marks are property of their respective owners.
