Tag: Exception

  • What to do in case of org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes

    I’m currently gathering my first experiences with Apache Spark and in particular Spark SQL.

    While I was playing around with Spark SQL joins, I suddenly faced an exception like
    Exception in thread "main" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: foo
    followed by the parsed SQL statement etc.

    Well, in MySQL the error message would have been
    "Unknown column 'foo' in field list"
    In other words: you are accessing a column/field foo that does not exist.
    I was already a bit too close to the problem to see it at once, and I only found descriptions dealing with nested structures etc. (which wasn’t the case in my situation). So it took me a couple of minutes to realize what Spark wanted to tell me.
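
    One way to catch this kind of mistake early is to compare the column names a query refers to with the columns the table actually has. The following is a minimal plain-Scala sketch of that idea, not Spark's own resolution logic: ColumnCheck, unresolved and the sample column names are hypothetical, and the case-insensitive comparison is an assumption.

    ```scala
    object ColumnCheck {
      // Returns the referenced column names that do not occur among the
      // available ones. Names are compared case-insensitively (an assumption).
      def unresolved(available: Seq[String], referenced: Seq[String]): Seq[String] = {
        val known = available.map(_.toLowerCase).toSet
        referenced.filterNot(name => known.contains(name.toLowerCase))
      }

      def main(args: Array[String]): Unit = {
        val tableColumns = Seq("id", "name", "bar") // hypothetical schema
        val queryColumns = Seq("id", "foo")         // columns used in the query
        println(unresolved(tableColumns, queryColumns)) // prints List(foo)
      }
    }
    ```

    Anything this check returns is a candidate for the "Unresolved attributes" message.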

    Maybe this helps someone else, too.

  • Scalding Exception: diverging implicit expansion for type com.twitter.algebird.Semigroup[T]

    I was just working on some Scalding jobs again and once more got an ... interesting exception:

    In a groupBy operation, I wanted to sum something up using:

    .groupBy('a) {
      _.sum('a -> 'c)
    }

    And was rewarded with this one:

    [error] example.scala:20: diverging implicit expansion for type com.twitter.algebird.Semigroup[T]
    [error] starting with method eitherSemigroup in object Semigroup
    [error]       _.sum('a -> 'c)
    [error]            ^
    [error] one error found
    [error] (compile:compile) Compilation failed

    WTF??

    Solution:

    Spot the mistake? It’s the missing type hint at sum:

    .groupBy('a) {
      _.sum[Int]('a -> 'c)  //  <-- [Int]
    }
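
    Why the type hint matters can be shown with a small plain-Scala sketch. It assumes that sum receives untyped values (as fields in a tuple are) and needs an implicit instance for a type T that it cannot infer from its arguments. SimpleSemigroup and UntypedSum are hypothetical stand-ins and not Algebird's or Scalding's actual code, so the compiler message for the failing call will differ from the one above.

    ```scala
    // Simplified stand-in for a Semigroup type class; NOT Algebird's actual code.
    trait SimpleSemigroup[T] { def plus(a: T, b: T): T }

    object SimpleSemigroup {
      implicit val intSemigroup: SimpleSemigroup[Int] = new SimpleSemigroup[Int] {
        def plus(a: Int, b: Int): Int = a + b
      }
      implicit val stringSemigroup: SimpleSemigroup[String] = new SimpleSemigroup[String] {
        def plus(a: String, b: String): String = a + b
      }
    }

    object UntypedSum {
      // The values arrive untyped, so T cannot be inferred from the argument
      // and has to be supplied by the caller.
      def sum[T](values: Seq[Any])(implicit sg: SimpleSemigroup[T]): T =
        values.map(_.asInstanceOf[T]).reduce((a, b) => sg.plus(a, b))

      def main(args: Array[String]): Unit = {
        val column: Seq[Any] = Seq(1, 2, 3)
        println(sum[Int](column)) // prints 6
        // sum(column) does not compile: without [Int] the compiler cannot
        // decide which SimpleSemigroup instance to use.
      }
    }
    ```

    With the explicit [Int], the implicit lookup has exactly one candidate and the call compiles.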
  • Scalding: unable to compare stream elements in position: 0

    I’m currently working quite a bit with Twitter’s Scalding.
    Recently I split up a job into sub-jobs and suddenly got an Exception in my join:

    Caused by: cascading.CascadingException: unable to compare stream elements in position: 0

    If I had remembered the Fields API documentation in detail, I would have thought of this paragraph (it’s about sorting, but the consequence is the same):

    Note: When reading from a CSV, the data types are set to String, hence the sorting will be alphabetically, therefore to sort by age, an int, you need to convert it to an integer. For example …

    Solution:

    Ensure that the fields you join on have matching data types, and convert them beforehand if necessary. For example:

    .map('myField -> 'myField) { x: Int => x }
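
    The effect described in the quoted note can be reproduced in plain Scala without Scalding. This is only an illustrative sketch; SortDemo and the sample values are hypothetical.

    ```scala
    object SortDemo {
      // Sorts the values as text, the way they come out of a CSV file.
      def sortedAsText(values: Seq[String]): Seq[String] = values.sorted

      // Converts the values to Int first, then sorts them numerically.
      def sortedAsInt(values: Seq[String]): Seq[Int] = values.map(_.toInt).sorted

      def main(args: Array[String]): Unit = {
        val ages = Seq("9", "10", "100") // hypothetical values read as String
        println(sortedAsText(ages)) // prints List(10, 100, 9)
        println(sortedAsInt(ages))  // prints List(9, 10, 100)
      }
    }
    ```

    The same values are ordered differently depending on their type, which is why both sides of a join should be converted to the same type first.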
  • Compiling Cascading: FAILURE: Build failed with an exception.

    Today I ran into a really stupid error message when I tried to recompile cascading-jdbc:

    Evaluating root project 'cascading-jdbc' using build file '/home/…/cascading-jdbc/build.gradle'.

    FAILURE: Build failed with an exception.

    * Where:
    Build file '/home/…/cascading-jdbc/build.gradle' line: 68

    * What went wrong:
    A problem occurred evaluating root project 'cascading-jdbc'.
    > Could not find method create() for arguments [fatJarPrepareFiles, class eu.appsatori.gradle.fatjar.tasks.PrepareFiles] on task set.

    * Try:
    Run with --stacktrace option to get the stack trace. Run with --debug option to get more log output.

    BUILD FAILED

    Total time: 5.355 secs

    Solution:

    Check your Gradle version. I was running a brand-new Ubuntu with the shipped Gradle version 1.4, while the Cascading readme states that Gradle 1.8 is required ... and it really is.