Pages

Java's Comparator interface

The lazy aren't the only ones to write about Comparators and comparisons in Java. I'm not lazy, so please love and gripe about yet another explanation. I hope it will not be superfluous. And yes, this article is the answer to the question: "Can you write a comparator from memory?" I hope everyone will be able to write a comparator from memory after reading this article. Java's Comparator interface - 1

Introduction

As you know, Java is an object-oriented language. As a result, it's customary to manipulate objects in Java. But sooner or later, you face the task of comparing objects based on some characteristic. For example: Suppose we have some message described by the Message class:
public static class Message {
    private String message;
    private int id;

    public Message(String message) {
        this.message = message;
        this.id = new Random().nextInt(1000);
    }
    public String getMessage() {
        return message;
    }
    public Integer getId() {
        return id;
    }
    public String toString() {
        return "[" + id + "] " + message;
    }
}
Put this class in the Tutorialspoint Java compiler. Do not forget to add the import statements as well:
import java.util.Random;
import java.util.ArrayList;
import java.util.List;
In the main method, create several messages:
public static void main(String[] args){
    List<Message> messages = new ArrayList();
    messages.add(new Message("Hello, World!"));
    messages.add(new Message("Hello, Sun!"));
    System.out.println(messages);
}
Let's think about what we would do if we wanted to compare them? For example, we want to sort by id. And to create an order, we need to somehow compare the objects in order to understand which object should come first (i.e. the smaller one) and which one should follow (i.e. the larger one). Let's start with a class like java.lang.Object. We know that all classes implicitly inherit the Object class. And this makes sense because it reflects the concept that "everything is an object" and provides common behavior for all classes. This class dictates that every class has two methods: → hashCode The hashCode method returns some numeric (int) representation of the object. What does that mean? It means that if you create two different instances of a class, then they should have different hashCodes. The method's description says as much: "As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects". In other words, for two different instances, there should be different hashCodes. That is, this method is not suitable for our comparison. → equals. The equals method answers the question "are these objects equal?" and returns a boolean." By default, this method has the following code:
public boolean equals(Object obj) {
    return (this == obj);
}
That is, if this method is not overridden, it essentially says whether the object references match or not. This isn't what we want for our messages, because we're interested in message ids, not object references. And even if we override the equals method, the most we can hope for is to learn whether they are equal. And this isn't enough for us to determine the order. So what do we need then? We need something that compares. The one who compares is a Comparator. Open the Java API and find Comparator. Indeed, there is a java.util.Comparator interface java.util.Comparator and java.util.Comparable As you can see, such an interface exists. A class that implements it says, "I implement a method that compares objects." The only thing you really need to remember is the comparator contract, which is expressed as follows:
Comparator returns an int according to the following rules:
  • It returns a negative int if the first object is smaller
  • It returns a positive int if the first object is larger
  • It returns zero if the objects are equal
Now let's write a comparator. We will need to import java.util.Comparator. After the import statement, add the following to the main method: Comparator<Message> comparator = new Comparator<Message>(); Of course, this won't work, because Comparator is an interface. So we add curly braces {} after the parentheses. Write the following method inside the braces:
public int compare(Message o1, Message o2) {
    return o1.getId().compareTo(o2.getId());
}
You don't even need to remember the spelling. A comparator is one who performs a comparison, that is, it compares. To indicate the relative order of the objects, we return an int. That's basically it. Nice and easy. As you can see from the example, in addition to Comparator, there is another interface — java.lang.Comparable, which requires us to implement the compareTo method. This interface says, "a class that implements me makes it possible to compare instances of the class." For example, Integer's implementation of compareTo is as follows:
(x < y) ? -1 : ((x == y) ? 0 : 1)
Java 8 introduced some nice changes. If you take a closer look at the Comparator interface, you'll see the @FunctionalInterface annotation above it. This annotation is for information purposes and tells us that this interface is functional. This means that this interface has only 1 abstract method, which is a method without an implementation. What does this give us? Now we can write the comparator's code like this:
Comparator<Message> comparator = (o1, o2) -> o1.getId().compareTo(o2.getId());
We name the variables in parentheses. Java will see that because there is only one method, then the required number and types of input parameters are clear. Then we use the arrow operator to pass them to this part of the code. What's more, thanks to Java 8, we now have default methods in interfaces. These methods appear by default when we implement an interface. The Comparator interface has several. For example:
Comparator moreImportant = Comparator.reverseOrder();
Comparator lessImportant = Comparator.naturalOrder();
There is another method that will make your code cleaner. Take a look at the example above, where we defined our comparator. What does it do? It's quite primitive. It simply takes an object and extracts some value that is "comparable". For example, Integer implements comparable, so we're able to perform a compareTo operation on the values of message id fields. This simple comparator function can be written like this:
Comparator<Message> comparator = Comparator.comparing(obj -> obj.getId());
In other words, we have a Comparator that compares like this: it takes objects, uses the getId() method to get a Comparable from them, and then uses compareTo to compare. And there are no more horrible constructs. And finally, I want to note one more feature. Comparators can be chained. For example:
Comparator<Message> comparator = Comparator.comparing(obj -> obj.getId());
comparator = comparator.thenComparing(obj -> obj.getMessage().length());

Application

Declaring a comparator turns out to be quite logical, don't you think? Now we need to see how and where to use it. → Collections.sort(java.util.Collections) We can, of course, sort collections this way. But not every collection, only lists. There's nothing unusual here, because lists are the kind of collections where you access elements by their index. This allows the second element to be swapped with the third element. That's why the following method of sorting is only for lists:
Comparator<Message> comparator = Comparator.comparing(obj -> obj.getId());
Collections.sort(messages, comparator);
Arrays.sort(java.util.Arrays) Arrays are also easy to sort. Again, for the same reason — their elements are accessed by index. → Descendants of java.util.SortedSet and java.util.SortedMap You will recall that Set and Map don't guarantee the order in which elements are stored. BUT, we have special implementations that do guarantee the order. And if a collection's elements don't implement java.util.Comparable, then we can pass a Comparator to its constructor:
Set<Message> msgSet = new TreeSet(comparator);
Stream API In the Stream API, which appeared in Java 8, comparators let you simplify work with stream elements. For example, suppose we need a sequence of random numbers from 0 to 999, inclusive:
Supplier<Integer> randomizer = () -> new Random().nextInt(1000);
Stream.generate(randomizer)
    .limit(10)
    .sorted(Comparator.naturalOrder())
    .forEach(e -> System.out.println(e));
We could stop here, but there are even more interesting problems. For example, suppose you need to prepare a Map, where the key is a message id. Additionally, we want to sort these keys, so we'll start with the following code:
Map<Integer, Message> collected = Arrays.stream(messages)
                .sorted(Comparator.comparing(msg -> msg.getId()))
                .collect(Collectors.toMap(msg -> msg.getId(), msg -> msg));
We actually get a HashMap here. And as we know, it doesn't guarantee any order. As a result, our elements, which were sorted by id, simply lose their order. Not good. We'll have to change our collector a bit:
Map<Integer, Message> collected = Arrays.stream(messages)
                .sorted(Comparator.comparing(msg -> msg.getId()))
                .collect(Collectors.toMap(msg -> msg.getId(), msg -> msg, (oldValue, newValue) -> oldValue, TreeMap::new));
The code has started to look a little scarier, but now the problem is solved correctly. Read more about the various groupings here:
You can create your own collector. Read more here: "Creating a custom collector in Java 8". And you'll benefit from reading the discussion here: "Java 8 list to map with stream".

Fall-trap

Comparator and Comparable are good. But there is one nuance you should remember. When a class performs sorting, it expects that your class can be converted to a Comparable. If this isn't the case, then you'll receive an error at run time. Let's look at an example:
SortedSet<Message> msg = new TreeSet<>();
msg.add(new Message(2, "Developer".getBytes()));
It seems that nothing is wrong here. But in fact, in our example, it will fail with an error: java.lang.ClassCastException: Message cannot be cast to java.lang.Comparable And all because it tried to sort the elements (it is a SortedSet, after all)...but couldn't. Don't forget this when working with SortedMap and SortedSet.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.