Overview | What you’ll learn:
Every few years, the open-source big data community experiences a major innovation that advances the capabilities of data processing frameworks. For many years, MapReduce and the open-source Hadoop platform served as an effective foundation for the distributed processing of large data sets. Then, last year, the introduction of YARN provided the resource manager needed to enable interactive workloads, bringing data processing performance to another level. However, as organizations entrust big data platforms with more of their critical business information, the volume and variety of data will continue to grow rapidly, as will the need for speed to insight and action on that data. Like most of the community, we believe that Apache Spark is the next big innovation and the platform that will help take on the data challenges of tomorrow.
Application Spotlight: Tableau Software
Source: article written by Jeff Feng at Databricks and posted on the Databricks blog.