Scalding: unable to compare stream elements in position: 0

I’m currently working quite a bit with Twitter’s Scalding.
Recently I split a job into sub-jobs and suddenly got an exception in my join:

Caused by: cascading.CascadingException: unable to compare stream elements in position: 0

If I had remembered the Fields API documentation in detail, I would have thought of this note (it is about sorting, but the consequence for joins is the same):

Note: When reading from a CSV, the data types are set to String, hence the sorting will be alphabetically, therefore to sort by age, an int, you need to convert it to an integer. For example …
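To make the effect concrete, here is a small plain-Scala sketch (no Scalding involved; the object name CompareDemo and the sample values are made up for illustration). It shows that string values sort alphabetically rather than numerically, and that comparing a String with an Int through the untyped Comparable interface fails at runtime. I assume something similar happens inside Cascading when the two sides of a join carry different types, but I have not traced its internals.

```scala
import scala.util.Try

// Illustration only: CompareDemo and its sample values are hypothetical.
object CompareDemo {
  // Values as they might look after being read back from a text file.
  val asStrings: List[String] = List("9", "10", "100", "25")

  // Strings sort lexicographically, character by character.
  val sortedAsStrings: List[String] = asStrings.sorted

  // After conversion the same values sort numerically.
  val sortedAsInts: List[Int] = asStrings.map(_.toInt).sorted

  // Compares two values without static type information,
  // the way a generic comparator has to.
  def compareUntyped(a: Any, b: Any): Int =
    a.asInstanceOf[Comparable[Any]].compareTo(b)

  def main(args: Array[String]): Unit = {
    println(sortedAsStrings)                           // alphabetical order
    println(sortedAsInts)                              // numeric order
    println(Try(compareUntyped(42, 42)))               // same type: works
    println(Try(compareUntyped("42", 42)).isFailure)   // mixed types: fails
  }
}
```

The mixed comparison fails with a ClassCastException, because String.compareTo only accepts another String.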

Solution:

Make sure the join keys on both sides have the same data type, and convert them before the join if necessary. For example:

.map('myField -> 'myField) { x: Int => x }
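In the context of a whole job this could look roughly like the sketch below. The job name, the argument names and the field names ('userId, 'orderUserId and so on) are placeholders and not taken from my real job. The assumption is that both inputs are intermediate results written by earlier sub-jobs, so their key fields come back as String when read from the Tsv files.

```scala
import com.twitter.scalding._

// Hypothetical job: class name, arguments and field names are placeholders.
class JoinAfterSplitJob(args: Args) extends Job(args) {

  // Output of an earlier sub-job; fields read from text arrive as String.
  val users = Tsv(args("users"), ('userId, 'name)).read
    // Convert the join key to Int before joining.
    .map('userId -> 'userId) { id: Int => id }

  val orders = Tsv(args("orders"), ('orderUserId, 'amount)).read
    // Same conversion on the other side, so both keys have the same type.
    .map('orderUserId -> 'orderUserId) { id: Int => id }

  users
    .joinWithSmaller('userId -> 'orderUserId, orders)
    .write(Tsv(args("output")))
}
```

Converting both sides is slightly redundant if one of them is already an Int, but it makes sure the keys really have the same type at the point of the join.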
