You can set up the Avro Input step to run on the Spark engine. Spark processes null values differently than the Pentaho engine, so you may need to adjust your transformation to successfully process null values according to Spark's processing rules.
Additionally, when using the Avro Input step on an Amazon EMR cluster, you must copy the spark-avro_2.11-2.4.2.jar file from your SPARK_HOME folder into the extra folder in your AEL data-integration setup location. The following is an example command to copy the file:
cp /usr/lib/spark/external/lib/spark-avro_2.11-2.4.2.jar <User>/data-integration/adaptive-execution/extra/
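A slightly more defensive version of the same copy is sketched below. The SPARK_HOME default of /usr/lib/spark matches the path in the example above, but the AEL_EXTRA location is an assumption for illustration; substitute the actual data-integration setup location on your system.

```shell
# Sketch only: SPARK_HOME and AEL_EXTRA are assumed locations; adjust
# both to match your EMR cluster and AEL installation.
SPARK_HOME="${SPARK_HOME:-/usr/lib/spark}"
AEL_EXTRA="${AEL_EXTRA:-$HOME/data-integration/adaptive-execution/extra}"
JAR="$SPARK_HOME/external/lib/spark-avro_2.11-2.4.2.jar"

# Make sure the destination "extra" folder exists before copying.
mkdir -p "$AEL_EXTRA"

# Copy the jar only if it is actually present at the expected path.
if [ -f "$JAR" ]; then
  cp "$JAR" "$AEL_EXTRA/"
else
  echo "spark-avro jar not found at $JAR" >&2
fi
```

Checking for the jar first avoids a confusing cp error when the Spark distribution on the cluster places the file elsewhere.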