July 07, 2021, author: Lubomír Bulej
We are pleased to announce a new release of the Renaissance benchmark suite. Apart from a number of internal cleanups, most changes in this release focused on improving compatibility with modern platforms. Most notably, we have moved away from Scala 2.11, Scala 2.12, and Spark 2 towards Scala 2.12, Scala 2.13 and Spark 3. Some benchmarks were unaffected, some needed just dependency updates, and some needed a bit more work.
The workloads, as implemented by the benchmarks, are largely unchanged, but the code being executed as a result of the benchmark workload may differ substantially from previous version of Renaissance, mainly due to substantial dependency updates in some of the benchmarks. See the sections below for more details.
On the platform compatibility front, the suite supports JDK8 and JDK11 on Linux, MacOS X, and Windows. While most benchmarks also run on more recent JDK versions, some of them have dependencies that limit their compatibility to a particular range of JDK versions.
We welcome any comments and contributions.
The harness now automatically excludes benchmarks that are known to be
incompatible with the JVM on which it executes. This can be disabled
--no-jvm-check option on the command line.
The harness now allows using a specific base directory for scratch
files. The current directory is still the default, but it can be changed
--scratch-base option on the command line. For debugging
--keep-scratch option disables deleting the scratch
directories when the harness terminates.
Finally, to make experimentation with benchmarks a bit easier, the
harness allows overriding benchmark parameters using the
command line option.
Spark benchmark changes
The Spark benchmarks moved from Spark 2.0.0 on Scala 2.11.8 to Spark 3.1.2 on Scala 2.12.14, which is a major change in the underlying libraries. In addition to various cleanups and compatibility fixes for Windows, there have been several changes that may influence the overall performance.
- The number of threads used by the Spark local executor is now
to reflect the parallelism of a particular workload, which prevents
scalability issues on huge machines with tens of cores. The limit is
configured using the
spark_thread_limitparameter. - All workloads use the default partitioning of input data and limit the storage level of input data to memory only.
movie-lensalso limit the storage level of intermediate data sets to memory only to avoid benchmarking storage IO.
log-regressionnow avoids repeated conversion from an RDD to DataFrame.
naive-bayesnow uses ML classes from the
org.apache.spark.mlpackage because classes from the
org.apache.spark.mllibpackage only served as wrappers, introducing unnecessary conversion from RDD to DataFrame.
Benchmarks using Scala 2.13.6
The following lists Java bencharks with a simple Scala wrapper or Scala benchmarks that have been updated to use Scala 2.13.6, along with notable changes in some of them:
akka-uct(previously Scala 2.11.8)
akka-actordependency (2.3.11 -> 2.6.12)
- Very minor benchmark update for the 2.6.x API.
- Maximum supported JVM version set to 11.
dotty(previously Scala 2.12.8)
- Updated Dotty compiler (0.12.0 -> 3.0.0)
scalapsources instead of “type-system puzzles”.
- Adds hash-based validation of generated Tasty files.
scala-kmeans(previously Scala 2.12.8)
scala-doku(previously Scala 2.11.7)
Benchmarks using Scala 2.12.14
The following lists benchmarks that use Scala 2.12.14, along with notable changes in some them:
finagle-http(previously Scala 2.11.8)
neo4j-analytics(previously Scala 2.12.8)
- Benchmark ported to Neo4J 4.x API
org.neo4j.neo4jdependency (3.5.12 -> 4.2.4)
net.liftweb.lift-jsondependency (3.2.0 -> 3.4.3)
- Requires JVM versions between 11 and 15, inclusive (Neo4J requirements).
scala-stm-bench7(previously Scala 2.12.3)
reactors(previously Scala 2.11.8)
io.reactors.reactors-coreupdated (0.7 -> 0.9-e605e145-60-g9838414) and cleaned up for Scala 2.12
- Spark workloads (previously Scala 2.11.8)
- See Spark benchmark changes above.