Can you create a simple 'EqualityComparer<T>' using a lambda expression

C#LinqDistinct

C# Problem Overview


Short question:

Is there a simple way in LINQ to objects to get a distinct list of objects from a list based on a key property on the objects.

Long question:

I am trying to do a Distinct() operation on a list of objects that have a key as one of their properties.

class GalleryImage {
   public int Key { get;set; }
   public string Caption { get;set; }
   public string Filename { get; set; }
   public string[] Tags {g et; set; }
}

I have a list of Gallery objects that contain GalleryImage[].

Because of the way the webservice works [sic] I have duplicates of the GalleryImage object. i thought it would be a simple matter to use Distinct() to get a distinct list.

This is the LINQ query I want to use :

var allImages = Galleries.SelectMany(x => x.Images);
var distinctImages = allImages.Distinct<GalleryImage>(new 
                     EqualityComparer<GalleryImage>((a, b) => a.id == b.id));

The problem is that EqualityComparer is an abstract class.

I dont want to :

  • implement IEquatable on GalleryImage because it is generated
  • have to write a separate class to implement IEqualityComparer as shown here

Is there a concrete implementation of EqualityComparer somewhere that I'm missing?

I would have thought there would be an easy way to get 'distinct' objects from a set based on a key.

C# Solutions


Solution 1 - C#

(There are two solutions here - see the end for the second one):

My MiscUtil library has a ProjectionEqualityComparer class (and two supporting classes to make use of type inference).

Here's an example of using it:

EqualityComparer<GalleryImage> comparer = 
    ProjectionEqualityComparer<GalleryImage>.Create(x => x.id);

Here's the code (comments removed)

// Helper class for construction
public static class ProjectionEqualityComparer
{
    public static ProjectionEqualityComparer<TSource, TKey>
        Create<TSource, TKey>(Func<TSource, TKey> projection)
    {
        return new ProjectionEqualityComparer<TSource, TKey>(projection);
    }

    public static ProjectionEqualityComparer<TSource, TKey>
        Create<TSource, TKey> (TSource ignored,
                               Func<TSource, TKey> projection)
    {
        return new ProjectionEqualityComparer<TSource, TKey>(projection);
    }
}

public static class ProjectionEqualityComparer<TSource>
{
    public static ProjectionEqualityComparer<TSource, TKey>
        Create<TKey>(Func<TSource, TKey> projection)
    {
        return new ProjectionEqualityComparer<TSource, TKey>(projection);
    }
}

public class ProjectionEqualityComparer<TSource, TKey>
    : IEqualityComparer<TSource>
{
    readonly Func<TSource, TKey> projection;
    readonly IEqualityComparer<TKey> comparer;

    public ProjectionEqualityComparer(Func<TSource, TKey> projection)
        : this(projection, null)
    {
    }

    public ProjectionEqualityComparer(
        Func<TSource, TKey> projection,
        IEqualityComparer<TKey> comparer)
    {
        projection.ThrowIfNull("projection");
        this.comparer = comparer ?? EqualityComparer<TKey>.Default;
        this.projection = projection;
    }

    public bool Equals(TSource x, TSource y)
    {
        if (x == null && y == null)
        {
            return true;
        }
        if (x == null || y == null)
        {
            return false;
        }
        return comparer.Equals(projection(x), projection(y));
    }

    public int GetHashCode(TSource obj)
    {
        if (obj == null)
        {
            throw new ArgumentNullException("obj");
        }
        return comparer.GetHashCode(projection(obj));
    }
}

Second solution

To do this just for Distinct, you can use the DistinctBy extension in MoreLINQ:

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>
        (this IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector)
    {
        return source.DistinctBy(keySelector, null);
    }

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>
        (this IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector,
         IEqualityComparer<TKey> comparer)
    {
        source.ThrowIfNull("source");
        keySelector.ThrowIfNull("keySelector");
        return DistinctByImpl(source, keySelector, comparer);
    }

    private static IEnumerable<TSource> DistinctByImpl<TSource, TKey>
        (IEnumerable<TSource> source,
         Func<TSource, TKey> keySelector,
         IEqualityComparer<TKey> comparer)
    {
        HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            if (knownKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

In both cases, ThrowIfNull looks like this:

public static void ThrowIfNull<T>(this T data, string name) where T : class
{
    if (data == null)
    {
        throw new ArgumentNullException(name);
    }
}

Solution 2 - C#

Building on Charlie Flowers' answer, you can create your own extension method to do what you want which internally uses grouping:

    public static IEnumerable<T> Distinct<T, U>(
        this IEnumerable<T> seq, Func<T, U> getKey)
    {
        return
            from item in seq
            group item by getKey(item) into gp
            select gp.First();
    }

You could also create a generic class deriving from EqualityComparer, but it sounds like you'd like to avoid this:

    public class KeyEqualityComparer<T,U> : IEqualityComparer<T>
    {
        private Func<T,U> GetKey { get; set; }

        public KeyEqualityComparer(Func<T,U> getKey) {
            GetKey = getKey;
        }

        public bool Equals(T x, T y)
        {
            return GetKey(x).Equals(GetKey(y));
        }

        public int GetHashCode(T obj)
        {
            return GetKey(obj).GetHashCode();
        }
    }

Solution 3 - C#

This is the best i can come up with for the problem in hand. Still curious whether theres a nice way to create a EqualityComparer on the fly though.

Galleries.SelectMany(x => x.Images).ToLookup(x => x.id).Select(x => x.First());

Create lookup table and take 'top' from each one

Note: this is the same as @charlie suggested but using ILookup - which i think is what a group must be anyway.

Solution 4 - C#

What about a throw away IEqualityComparer generic class?

public class ThrowAwayEqualityComparer<T> : IEqualityComparer<T>
{
  Func<T, T, bool> comparer;

  public ThrowAwayEqualityComparer(Func<T, T, bool> comparer)   
  {
    this.comparer = comparer;
  }

  public bool Equals(T a, T b)
  {
    return comparer(a, b);
  }
  
  public int GetHashCode(T a)
  {
    return a.GetHashCode();
  }
}

So now you can use Distinct with a custom comparer.

var distinctImages = allImages.Distinct(
   new ThrowAwayEqualityComparer<GalleryImage>((a, b) => a.Key == b.Key));

You might be able to get away with the <GalleryImage>, but I'm not sure if the compiler could infer the type (don't have access to it right now.)

And in an additional extension method:

public static class IEnumerableExtensions
{
  public static IEnumerable<TValue> Distinct<TValue>(this IEnumerable<TValue> @this, Func<TValue, TValue, bool> comparer)
  {
    return @this.Distinct(new ThrowAwayEqualityComparer<TValue>(comparer);
  }
  
  private class ThrowAwayEqualityComparer...
}

Solution 5 - C#

You could group by the key value and then select the top item from each group. Would that work for you?

Solution 6 - C#

Here's an interesting article that extends LINQ for this purpose... http://www.singingeels.com/Articles/Extending_LINQ__Specifying_a_Property_in_the_Distinct_Function.aspx

The default Distinct compares objects based on their hashcode - to easily make your objects work with Distinct, you could override the GetHashcode method.. but you mentioned that you are retrieving your objects from a web service, so you may not be able to do that in this case.

Solution 7 - C#

This idea is being debated here, and while I'm hoping the .NET Core team adopt a method to generate IEqualityComparer<T>s from lambda, I'd suggest you to please vote and comment on that idea, and use the following:

Usage:

IEqualityComparer<Contact> comp1 = EqualityComparerImpl<Contact>.Create(c => c.Name);
var comp2 = EqualityComparerImpl<Contact>.Create(c => c.Name, c => c.Age);

class Contact { public Name { get; set; } public Age { get; set; } }

Code:

public class EqualityComparerImpl<T> : IEqualityComparer<T>
{
  public static EqualityComparerImpl<T> Create(
    params Expression<Func<T, object>>[] properties) =>
    new EqualityComparerImpl<T>(properties);

  PropertyInfo[] _properties;
  EqualityComparerImpl(Expression<Func<T, object>>[] properties)
  {
    if (properties == null)
      throw new ArgumentNullException(nameof(properties));

    if (properties.Length == 0)
      throw new ArgumentOutOfRangeException(nameof(properties));

    var length = properties.Length;
    var extractions = new PropertyInfo[length];
    for (int i = 0; i < length; i++)
    {
      var property = properties[i];
      extractions[i] = ExtractProperty(property);
    }
    _properties = extractions;
  }

  public bool Equals(T x, T y)
  {
    if (ReferenceEquals(x, y))
      //covers both are null
      return true;
    if (x == null || y == null)
      return false;
    var len = _properties.Length;
    for (int i = 0; i < _properties.Length; i++)
    {
      var property = _properties[i];
      if (!Equals(property.GetValue(x), property.GetValue(y)))
        return false;
    }
    return true;
  }

  public int GetHashCode(T obj)
  {
    if (obj == null)
      return 0;

    var hashes = _properties
        .Select(pi => pi.GetValue(obj)?.GetHashCode() ?? 0).ToArray();
    return Combine(hashes);
  }

  static int Combine(int[] hashes)
  {
    int result = 0;
    foreach (var hash in hashes)
    {
      uint rol5 = ((uint)result << 5) | ((uint)result >> 27);
      result = ((int)rol5 + result) ^ hash;
    }
    return result;
  }

  static PropertyInfo ExtractProperty(Expression<Func<T, object>> property)
  {
    if (property.NodeType != ExpressionType.Lambda)
      throwEx();

    var body = property.Body;
    if (body.NodeType == ExpressionType.Convert)
      if (body is UnaryExpression unary)
        body = unary.Operand;
      else
        throwEx();

    if (!(body is MemberExpression member))
      throwEx();

    if (!(member.Member is PropertyInfo pi))
      throwEx();

    return pi;

    void throwEx() =>
      throw new NotSupportedException($"The expression '{property}' isn't supported.");
  }
}

Solution 8 - C#

> implement IEquatable on GalleryImage because it is generated

A different approach would be to generate GalleryImage as a partial class, and then have another file with the inheritance and IEquatable, Equals, GetHash implementation.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionSimon_WeaverView Question on Stackoverflow
Solution 1 - C#Jon SkeetView Answer on Stackoverflow
Solution 2 - C#kvbView Answer on Stackoverflow
Solution 3 - C#Simon_WeaverView Answer on Stackoverflow
Solution 4 - C#SamuelView Answer on Stackoverflow
Solution 5 - C#Charlie FlowersView Answer on Stackoverflow
Solution 6 - C#marktView Answer on Stackoverflow
Solution 7 - C#Shimmy WeitzhandlerView Answer on Stackoverflow
Solution 8 - C#RichardView Answer on Stackoverflow