Removing Duplicates from a JavaScript Array ('Deduping')

Hero image for 'Removing Duplicates from a JavaScript Array ('Deduping').' Image by Willy the Wizard.

When working with arrays in JavaScript, we often encounter duplicate values, whether handling user input, processing API responses, or manipulating datasets. Removing these duplicates (also known as 'deduping') helps ensure data integrity and improves efficiency.

In this article, I explore different methods for removing duplicates from a JavaScript array and compare their advantages, disadvantages, and possible use cases.


Using the Set Object to Remove Duplicates

One of the simplest and most efficient ways to remove duplicates is by using the Set object. A Set automatically enforces uniqueness, making it ideal for deduping arrays of primitive values. For example:

const numbers = [1, 2, 2, 3, 4, 4, 5];
const uniqueNumbers = [...new Set(numbers)];
console.log(uniqueNumbers); // [1, 2, 3, 4, 5]

This approach is concise, performs well, and runs in O(n) time complexity because each element is inserted into the Set once. However, it only works for arrays of primitive values such as numbers or strings. When working with arrays of objects, we need a different strategy.
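To see why objects are a problem, note that a Set compares objects by reference, not by value, so two structurally identical objects both survive. A minimal illustration:

```javascript
// A Set treats each object literal as a distinct reference,
// so structurally identical objects are NOT deduplicated.
const items = [{ id: 1 }, { id: 1 }];
const deduped = [...new Set(items)];
console.log(deduped.length); // 2 — both objects remain
```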


Removing Duplicates from an Array of Objects

When dealing with objects, we cannot use Set directly because it works by comparing object references rather than their actual values. Instead, we can use Array.prototype.filter with findIndex to ensure uniqueness based on a specific property.

In practice, that looks like this:

const people = [
  { id: 1, name: "Maddie" },
  { id: 2, name: "Bob" },
  { id: 1, name: "Maddie" }
];

const uniquePeople = people.filter(
  (person, index, self) =>
    index === self.findIndex(p => p.id === person.id)
);

console.log(uniquePeople);
// [{ id: 1, name: "Maddie" }, { id: 2, name: "Bob" }]

In this way, we can ensure that only the first occurrence of an object with a given id is retained, effectively deduping the array. However, findIndex iterates through the array for each element, making this method O(n²) in the worst case, which means it can become very slow for large datasets.
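This pattern generalizes beyond a hard-coded id check. As a sketch, a hypothetical dedupeBy helper (not part of any standard library) could accept a key selector so callers decide which property defines uniqueness:

```javascript
// Hypothetical helper: keeps the first occurrence per key.
// Still O(n²), because findIndex rescans the array for each element.
function dedupeBy(array, keyFn) {
  return array.filter(
    (item, index, self) =>
      index === self.findIndex(other => keyFn(other) === keyFn(item))
  );
}

const records = [
  { id: 1, name: "Maddie" },
  { id: 2, name: "Bob" },
  { id: 1, name: "Maddie" }
];
console.log(dedupeBy(records, r => r.id));
// [{ id: 1, name: "Maddie" }, { id: 2, name: "Bob" }]
```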


Deduping Arrays Using reduce

Another approach to deduping involves using Array.prototype.reduce to build a unique array by checking for existing values as we iterate, like this:

const numbers = [1, 2, 2, 3, 4, 4, 5];

const uniqueNumbers = numbers.reduce((acc, num) => {
  if (!acc.includes(num)) {
    acc.push(num);
  }
  return acc;
}, []);

console.log(uniqueNumbers); // [1, 2, 3, 4, 5]

This method offers more control over how duplicates are handled but, like the filter approach, is O(n²) in complexity, because .includes() scans acc linearly on every iteration. As a result, this approach is also inefficient for large arrays.
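If the reduce pattern is preferred for its flexibility, the linear scan can be avoided by tracking seen values in a Set alongside the result. This variation is a sketch of my own, not taken from the snippets above:

```javascript
// Variation on the reduce approach: a Set tracks values already seen,
// making each membership check O(1) and the whole pass O(n).
const values = [1, 2, 2, 3, 4, 4, 5];

const { result } = values.reduce(
  (acc, num) => {
    if (!acc.seen.has(num)) {
      acc.seen.add(num);
      acc.result.push(num);
    }
    return acc;
  },
  { seen: new Set(), result: [] }
);

console.log(result); // [1, 2, 3, 4, 5]
```

This keeps reduce's ability to carry custom logic while matching Set's linear running time.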


Performance Comparison: Set vs. filter vs. reduce

Here's an overview of how the efficiency of each deduping method depends on the dataset size and the type of values stored in the array:

Method                 | Time Complexity | Best For                                | Drawbacks
Set                    | O(n)            | Arrays of primitives (numbers, strings) | Does not work for objects
filter with findIndex  | O(n²)           | Small arrays of objects                 | Slow for large datasets
reduce with includes   | O(n²)           | Custom logic for small datasets         | Inefficient for large datasets

For small arrays, the performance difference is negligible. However, as the dataset grows, Set becomes significantly faster than filter or reduce. If we need to dedupe objects efficiently, a Map-based approach is often a better alternative:

const uniquePeople = Array.from(
  new Map(people.map(person => [person.id, person])).values()
);

Like Set, this runs in O(n) time, making it much more efficient for large datasets than filter.
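One behavioral difference is worth noting: because later Map entries overwrite earlier ones with the same key, this approach keeps the last occurrence of each id, whereas the filter version keeps the first. A small illustration (the differing names here are invented for the example):

```javascript
const roster = [
  { id: 1, name: "Maddie" },
  { id: 2, name: "Bob" },
  { id: 1, name: "Madeline" } // same id, different name
];

// Map.set overwrites on duplicate keys, so the last entry per id wins,
// while key insertion order is preserved.
const byLast = Array.from(
  new Map(roster.map(person => [person.id, person])).values()
);

console.log(byLast);
// [{ id: 1, name: "Madeline" }, { id: 2, name: "Bob" }]
```

If keeping the first occurrence matters, the filter approach (or reversing the array before building the Map) is the safer choice.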


Wrapping up

Removing duplicates, or deduping, is an essential operation when working with arrays in JavaScript. Depending on the type of data and our performance needs, we can choose between Set, filter, reduce, or a Map-based approach to achieve the desired result efficiently.

Key Takeaways

  • The Set object is the fastest and simplest way to remove duplicates from arrays of primitive values.
  • When working with objects, filter with findIndex is an option but can be slow for large datasets.
  • reduce allows for more custom deduplication logic but is inefficient for large arrays.
  • Using a Map provides a more efficient alternative for deduping large arrays of objects.

By understanding these techniques, we can choose the best approach for removing duplicates efficiently, keeping our JavaScript arrays clean and performant.

