Apache Spark components¶
Apache Spark is a unified analytics engine for large-scale data processing, centered around a core computing engine with various specialized components.^[600-developer__big-data__big-data.md]
Core Components¶
- Spark Core: The foundation of the platform, providing the basic computing engine: distributed task scheduling, memory management, and fault recovery^[600-developer__big-data__big-data.md].
- Spark SQL: A module for structured data processing, allowing users to query data using SQL or the DataFrame API^[600-developer__big-data__big-data.md].
- Spark Streaming: A component that enables scalable, high-throughput stream processing for real-time analytics^[600-developer__big-data__big-data.md].
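The Spark SQL module listed above can be sketched briefly. The snippet below is a minimal illustration, not from the source note: it assumes a Spark environment (the `spark-sql` dependency) is available, and the table and column names (`people`, `name`, `age`) are invented for the example. It shows the two query styles the note mentions: the DataFrame API and plain SQL.

```scala
// Minimal Spark SQL sketch; assumes spark-sql is on the classpath.
// All data and identifiers here are illustrative.
import org.apache.spark.sql.SparkSession

object SparkSqlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-sql-sketch")
      .master("local[*]")   // local mode, for illustration only
      .getOrCreate()
    import spark.implicits._

    // Build a small DataFrame and query it two ways.
    val people = Seq(("Ann", 34), ("Bob", 19)).toDF("name", "age")

    // 1. DataFrame API
    people.filter($"age" > 21).show()

    // 2. SQL over a temporary view
    people.createOrReplaceTempView("people")
    spark.sql("SELECT name FROM people WHERE age > 21").show()

    spark.stop()
  }
}
```

Both forms compile down to the same logical plan, so the choice between SQL and the DataFrame API is a matter of ergonomics rather than performance.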
Implementation and Context¶
The Spark framework is primarily implemented in the Scala programming language^[600-developer__big-data__big-data.md]. It is one of the two major frameworks in the Big Data ecosystem, often contrasted with the Hadoop framework (which includes HDFS and MapReduce)^[600-developer__big-data__big-data.md].
Related Concepts¶
- [[Big Data]]
- [[Hadoop]]
Sources¶
^[600-developer__big-data__big-data.md]