External Table
Luca's blog on data engineering, data platforms, and performance.
Links
Luca's Home Page
Luca's Twitter
Luca's GitHub
Blog of the database services at CERN
Wednesday, March 29, 2017
On Measuring Apache Spark Workload Metrics for Performance Troubleshooting
Topic: This post is about measuring Apache Spark workload metrics for performance investigations. In particular you can find the descriptio...
Monday, November 21, 2016
IPython/Jupyter SQL Magic Functions for PySpark
Topic: This post is about a simple implementation with examples of IPython custom magic functions for running SQL in Apache Spark using PyS...
Thursday, September 15, 2016
Apache Spark 2.0 Performance Improvements Investigated With Flame Graphs
Topic: This post is about performance optimizations introduced in Apache Spark 2.0, in particular whole-stage code generation. A test case ...