Performance Metrics

Application metrics

  • How quickly can I reduce how many events? This depends on:
    • reduction factor
    • size per event
    • how much of each event is accessed during reduction, both to make the selection decision (skimming) and to pass content on to the output (slimming)
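
A minimal sketch of how these application metrics could be combined into one record per reduction job. The class name, the field names, and the definition of the reduction factor as input events divided by output events are assumptions made for illustration; they are not taken from an existing tool.

```python
from dataclasses import dataclass


@dataclass
class ReductionJob:
    """Hypothetical record of one reduction run (all names are illustrative)."""
    events_in: int          # events read from the input dataset
    events_out: int         # events kept after skimming
    bytes_per_event: float  # average size of one input event
    fraction_read: float    # share of each event accessed during reduction (0..1)
    wall_seconds: float     # wall-clock duration of the job

    @property
    def reduction_factor(self) -> float:
        # Assumed definition: input events per surviving output event.
        if self.events_out == 0:
            return float("inf")
        return self.events_in / self.events_out

    @property
    def events_per_second(self) -> float:
        # Answers "how quickly can I reduce how many events?"
        return self.events_in / self.wall_seconds

    @property
    def bytes_read(self) -> float:
        # Only the accessed part of each event has to be read.
        return self.events_in * self.bytes_per_event * self.fraction_read

    @property
    def read_rate_bytes_per_second(self) -> float:
        return self.bytes_read / self.wall_seconds


# Example with made-up numbers: 1M events of 500 kB each, 25% of each event read.
job = ReductionJob(
    events_in=1_000_000,
    events_out=1_000,
    bytes_per_event=500_000.0,
    fraction_read=0.25,
    wall_seconds=200.0,
)
print(job.reduction_factor, job.events_per_second, job.read_rate_bytes_per_second)
```

Keeping the fraction of the event that is read as a separate field makes it possible to compare a job that only skims with one that also slims, at the same event rate.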

System metrics

  • memory usage and caching strategy
    • I/O metrics
    • Spark built-in metrics
    • CPU time of all executors
    • time spent in garbage collection, time spent in serialization
    • number of rows and amount of data read from HDFS
  • measure network traffic, important for reading from EOS
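
A rough sketch of how several of these system metrics could be totalled from a Spark JSON event log. It assumes event logging is enabled and that task-end events carry the fields "Executor CPU Time" (nanoseconds), "JVM GC Time" and "Result Serialization Time" (milliseconds), and "Input Metrics" with "Bytes Read" and "Records Read", as in the event logs of recent Spark versions; the field names and units should be checked against the Spark version in use. The function name `summarize_task_metrics` is hypothetical.

```python
import json
from typing import Dict, Iterable


def summarize_task_metrics(lines: Iterable[str]) -> Dict[str, float]:
    """Sum per-task metrics over all SparkListenerTaskEnd events in an event log."""
    totals = {
        "cpu_seconds": 0.0,
        "gc_seconds": 0.0,
        "serialization_seconds": 0.0,
        "bytes_read": 0,
        "records_read": 0,
    }
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)  # the event log holds one JSON object per line
        if event.get("Event") != "SparkListenerTaskEnd":
            continue
        metrics = event.get("Task Metrics") or {}
        inputs = metrics.get("Input Metrics") or {}
        # Assumed units: CPU time in nanoseconds, GC and serialization in milliseconds.
        totals["cpu_seconds"] += metrics.get("Executor CPU Time", 0) / 1e9
        totals["gc_seconds"] += metrics.get("JVM GC Time", 0) / 1e3
        totals["serialization_seconds"] += metrics.get("Result Serialization Time", 0) / 1e3
        totals["bytes_read"] += inputs.get("Bytes Read", 0)
        totals["records_read"] += inputs.get("Records Read", 0)
    return totals


# Usage with a hypothetical log path:
# with open("/path/to/eventlog") as f:
#     print(summarize_task_metrics(f))
```

Network traffic for reads from EOS is not covered by this sketch and would have to be measured separately, for example at the host or network level.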