SQLite DISTINCT Query

SQLite is a popular open-source database that is embedded in many applications. As a relational database, SQLite allows you to query data using SQL syntax. One common SQL clause is DISTINCT, which removes duplicate rows from query results. In this beginner’s guide, we’ll cover the basics of using DISTINCT in SQLite.

To demonstrate DISTINCT, we’ll create a table called users:

CREATE TABLE users (
  id INTEGER PRIMARY KEY, 
  name TEXT,
  age INTEGER
);

INSERT INTO users VALUES 
  (1, 'John', 30),
  (2, 'Jane', 25),
  (3, 'Bob', 20),
  (4, 'Mary', 30);

This gives us a small table with some sample user data:

id|name|age
1|John|30
2|Jane|25 
3|Bob|20
4|Mary|30

Now let’s see how DISTINCT works in practice.

Removing Duplicate Rows with DISTINCT

The DISTINCT clause eliminates duplicate rows from the result set. This is useful when you only want unique values returned. For example, to get a list of unique ages in the users table:

SELECT DISTINCT age FROM users;

This would return:

age
30
25
20

Without the DISTINCT clause, the query would return every row:

SELECT age FROM users;
age  
30
25
20
30

As you can see, DISTINCT removed the duplicate value 30. This condenses the result set down to only the unique values.

DISTINCT applies to the entire selected row. So if two rows have the same values (but in a different order), only one will be returned.

Combining SQLite DISTINCT with Other Clauses

DISTINCT can be combined with other common SQLite clauses like WHERE, ORDER BY, and LIMIT.

For example, getting unique names for users over the age of 25:

SELECT DISTINCT name 
FROM users
WHERE age > 25
ORDER BY name;

This returns:

name
Jane  
John
Mary

The WHERE clause filtered the rows, DISTINCT removed duplicates, and ORDER BY sorted the final results.

Another example using DISTINCT with LIMIT:

SELECT DISTINCT age 
FROM users
ORDER BY age DESC
LIMIT 2;

Returns:

age
30  
25

DISTINCT removed duplicates, then ORDER BY sorted the ages descending, and finally LIMIT restricted it to 2 rows.

Counting Unique Values with SQLite DISTINCT

A common use case for DISTINCT is counting the number of unique values in a column. For example, to count unique ages:

SELECT COUNT(DISTINCT age) FROM users;

Returns:

3

This counts the number of unique ages, excluding any duplicates. Without DISTINCT, it would count all rows, including duplicates:

SELECT COUNT(age) FROM users;

Returns:

4

Being able to count distinct values efficiently is useful for analytics queries.

Performance Considerations of SQLite DISTINCT

It’s important to note that DISTINCT can negatively impact performance if used improperly. This is because removing duplicates requires SQLite to sort and scan the entire result set.

For large tables, it’s often faster to use an indexed column in the WHERE clause instead. For example:

SELECT age FROM users
WHERE age > 25;

This returns unique ages over 25 by filtering on an indexed column. The results are not scanned and sorted like they would be with DISTINCT.

In general, only use SQLite DISTINCT when you specifically need to remove duplicates. Avoid using it for arbitrary filtering or limiting results – that’s better handled in the WHERE clause.

Summary

DISTINCT is a simple but powerful tool for condensing result sets down to unique values. We learned:

  • DISTINCT removes duplicate rows from a query
  • It can be combined with WHERE, ORDER BY, LIMIT, etc.
  • COUNT(DISTINCT col) calculates the number of unique values in a column
  • DISTINCT can impact performance if used injudiciously

With this foundation, you can now apply DISTINCT to simplify your queries and analytics in SQLite. Try it out on your own data to see the results. And remember, only use DISTINCT when you specifically need unique values – don’t make it a habit without cause.