case class X(identifier: Long, line: Long) // Too hard to explain, just gets around issues with df -> rdd -> df.

import .expressions.Window

// Add a partition so that parallelism can be applied, except for the upper boundary record.
// Process per
val w = .("part").orderBy("line")
val df3 = df2.withColumn("next", lead("line", 1, null).over(w))
val df6 = df5.withColumn("next", when(col("next").isNull, col("nxt")).otherwise(col("next"))).select("identifier", "line", "next")

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by .Platform (file:/C:/big_data/spark-3.2.0-bin-hadoop3.2-scala2.13/jars/spark-unsafe_2.13-3.2.0.jar) to constructor (long,int)
WARNING: Please consider reporting this to the maintainers of .Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
To adjust logging level use sc.setLogLevel(newLevel).
Using Scala version 2.13.5 (OpenJDK 64-Bit Server VM, Java 11.0.9.1)
21/12/11 19:28:36 ERROR SparkContext: Error initializing SparkContext.
Type in expressions to have them evaluated.
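The window fragments above leave several names elided, so the following is only a sketch of one way the pieces might fit together, not the original answer's code. It assumes the truncated import is `org.apache.spark.sql.expressions.Window`, that the elided call in `w` is `Window.partitionBy`, and that `nxt` holds the first `line` of the following partition. The sample rows, the `firstLines` helper and the integer `part` column are hypothetical, and the boundary repair only makes sense if every `line` in partition p + 1 is greater than every `line` in partition p. It uses `coalesce`, which behaves the same as the `when(...isNull...).otherwise(...)` form.

```scala
// Paste into spark-shell, or run as a script with Spark on the classpath.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{coalesce, col, lead, min}

// getOrCreate() reuses the shell's session if one already exists.
val spark = SparkSession.builder().master("local[*]").appName("lead-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: (identifier, line, part).
val df2 = Seq((1L, 10L, 0), (2L, 20L, 0), (3L, 30L, 1), (4L, 40L, 1))
  .toDF("identifier", "line", "part")

// Assumed reading of the elided window definition.
val w = Window.partitionBy("part").orderBy("line")

// Within each partition the last row has no successor, so "next" is null there.
val df3 = df2.withColumn("next", lead("line", 1).over(w))

// Assumed boundary repair: the first line of partition p + 1 becomes "nxt" for partition p.
val firstLines = df2.groupBy("part").agg(min("line").as("nxt"))
  .withColumn("part", col("part") - 1)

// Fill the per-partition nulls from "nxt"; the very last row overall stays null.
val df6 = df3.join(firstLines, Seq("part"), "left")
  .withColumn("next", coalesce(col("next"), col("nxt")))
  .select("identifier", "line", "next")

df6.orderBy("line").show()
```

With the sample rows, each `line` is paired with the following `line` across the partition boundary, and only the final row keeps a null `next`.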