Apache Spark

Unified analytics engine for large-scale data processing.

Deployment • On-Premises; SaaS; Hybrid
Pricing • Open-source
HQ • International
Founded • 2014
Team • Open-source

Product overview

Key capabilities

  • Automation; Data Integration; Analytics

Designed for

  • Pharma; Biotech; CRO; IT

Pharma stages

Discovery; Preclinical; Clinical

Integrations

Custom API; Microsoft; Other

Sources

https://spark.apache.org/

Similar platforms

Based on category, deployment, and capabilities overlap.

IT • On-Premises; SaaS; Hybrid • Automation; Data Integration; Analytics
Apache Airflow

Open-source platform to programmatically author, schedule, and monitor workflows.

Secondary focus

Key capabilities

  • Automation; Data Integration; Analytics

Built for

Pharma; Biotech; CRO; IT

Pricing

Open-source
HQ • International
Founded • 2014
Team • Open-source

IT • On-Premises; SaaS; Hybrid • Automation; Data Integration; Analytics
Apache Beam

Unified model for defining batch and streaming data-parallel processing pipelines.

Secondary focus

Key capabilities

  • Automation; Data Integration; Analytics

Built for

Pharma; Biotech; CRO; IT

Pricing

Open-source
HQ • International
Founded • 2016
Team • Open-source

IT • On-Premises; SaaS; Hybrid • Automation; Data Integration; Analytics
Pachyderm (HPE)

Versioned data pipelines with lineage tracking for ML and analytics.

Secondary focus

Key capabilities

  • Automation; Data Integration; Analytics

Built for

Pharma; Biotech; CRO; IT

Pricing

Subscription
HQ • United States
Founded • 2014
Team • 201-500

IT • On-Premises; SaaS; Hybrid • Automation; Data Integration; Analytics
Ray (Anyscale)

Unified compute framework for scaling Python and AI workloads, with workflow APIs.

Secondary focus

Key capabilities

  • Automation; Data Integration; Analytics

Built for

Pharma; Biotech; CRO; IT

Pricing

Open-source; Subscription
HQ • United States
Founded • 2019
Team • 51-200

IT • On-Premises; SaaS; Hybrid • Automation; Data Integration; Analytics
Stonebranch

Universal automation center for hybrid IT orchestration.

Secondary focus

Key capabilities

  • Automation; Data Integration; Analytics

Built for

Pharma; Biotech; CRO; IT

Pricing

Subscription
HQ • United States
Founded • 1994
Team • 201-500

IT • On-Premises; Hybrid • Automation; Data Integration; Analytics
Altair Grid Engine

Distributed resource management (formerly Univa Grid Engine and Oracle Grid Engine).

Secondary focus

Key capabilities

  • Automation; Data Integration; Analytics

Built for

Pharma; Biotech; CRO; IT

Pricing

License; Subscription
HQ • United States
Founded • 1985
Team • 1001-5000