Before starting the concept behind the index, firstly,
we will address it in the real world, how it is helpful!!
Since childhood books are part of our day to day life,
once I thought of reading a book and picked one from a library nearby. While I
was searching the topics of my interest, I flipped a few pages but did not find
the same. During that time, I thought there is any mechanism that can help me
in quickly browsing the topic of my interest.
There I realized the use and power of an index in a
book - this helped me in quickly browsing the contents and pages of my
interest.
The same principle applies to database indexes - this
helps you in searching the data of your interest in a faster and efficient
manner.
Now
let's take a deep dive into database indexes: The index is a mechanism
to improve the database performance, and it is used to arrange the data a
better way, which makes it easy to fetch the data by achieving direct access to
data pages. Regarding performance, an Index is a vital tool for a DBA.
PostgreSQL is one of those databases known for its
great extensibility and vibrant collection of core and 3D party additions, and
Indexing is one of them. PostgreSQL databases cover a wide variety of cases,
whether it is simple B-Tree for all types or specific geospatial GiST indexes
for the non-scalar data types (e.g., geometrical shapes, fts, arrays).
Table
of Contents
PostgreSQL Basic Index Terminology:
Before understanding the different Index type and their
uses, we must take a look at terminology because a DBA will come on this one sooner
or later.
Index
Access Method: The index type (B-tree,
GiST, GIN, etc)
Type: Indexed the column’s data type.
Operator: A function between two data types
Operator
Family: An operator family defines a collection of
related operator classes, and perhaps some additional operators and support
functions that are compatible with these operator classes but not essential for
the functioning of any individual index
Operator
Class (also stated as index strategy): It
defines the operator that the index uses for the column.
In PostgreSQL’s system catalog, access methods are
stored in pg_am, operator classes in pg_opclass, operator families in
pg_opfamily. The dependencies of the above are shown in the diagram below:
Types of Indexes in PostgreSQL
PostgreSQL provides the following Index types:
B-tree
is
default index, it can operate against all data types and also can use to
retrieve the NULL value.
Hash index handles equality
only.
GiST index is suitable for
non-scalar data types (e.g. geometrical shapes, fts, arrays)
SP-GiST
index is space partitioned GIST, an evolution of GiST for
handling non-balanced structures (quadtrees, k-d trees, radix trees)
GIN
index is suitable for
complex types (e.g. jsonb, fts, arrays)
BRIN
index is a relatively new type of index which supports data
that can be sorted by storing min/max values in each block
Let’s try to understand the index uses with some real
world examples.
B-Tree Index
B-Tree is the default index and it can apply on all data
types, it is most of the commonly used index in PostgreSQL. B-Tree index entries are
sorted by default in ascending order. It supports in many cases to supply sort
order for an index. Suppose we you are working on pagination and you want to
recent data entries first so you can use the index sorting as desc, it may be
possible indexed column may contain the null value also, B-Tree index provides
the solution for this also you can consider/ sort the null value’s rows at the
last.
Create a B-Tree Index using the following query
CREATE INDEX mytable_mycolumn_index ON
mytable (mycolumn DESC NULLS LAST);
|
Hash Index
Pre-Postgres 10 is only useful for comparisons of
equality, but you almost never want to use them because they are not secure for
transactions, need to be manually rebuilt after crashes and not repeated to
followers, so the benefit over using a B-Tree is very less. There are efforts
in PgSQL 11 to make hash indexes a first-class index method along with its
bigger brothers (B-tree, GiST, GIN).
Create a Hash Index using the following query
CREATE INDEX mytable_mycolumn_idx ON
mytable USING
hash(mycolumn);
|
GiST Index
PostgreSQL provides the Generalized Search Tree (GiST)
index that allows you to build general balanced tree structures, and can be
used for operations beyond equality and range comparisons. GiST Index support to
geometric data types, tsquery and tsvector (as well as full-text search). It
can be used in exclusion constraints to create GiST index.
Create a GiST Index using the following query
CREATE INDEX mytable_mycolumn_idx ON
mytable USING
hash(mycolumn);
|
Note:
If you are getting the following error that means some features are not on in
PgSQL.
“ERROR: data type character varying has no default
operator class for access method "gist"
HINT: You must
specify an operator class for the index or define a default operator class for
the data type.”
To fix the above issue run the following query.
CREATE extension btree_gist ;
|
SP-GiST Index
Space-partitioned - Generalized Search Tree (SP-GiST)
Index is same like GiST enables the development of many different strategies
in the domain of non-balanced disk-based data structures. The default PgSQL
distribution offers support for two-dimensional points, (any type) ranges, text
and inet types. It can be used in exclusion constraints the same as GiST.
Create an SP-GiST Index using the following query
CREATE INDEX mytable_mycolumn_idx ON
mytable USING
hash(mycolumn);
|
GIN Index
Generalized Inverted Index provides many indexing strategies.
GIN is compatible when we want to index composite types of columns. As we know,
the default PostgreSQL distribution provides support for any array type, jsonb,
and full-text search (tsvector). Jsonb type is the growing feature of PostgreSQL
that depends on the GIN for index support. GIN also used commonly to the index for
full-text search.
Create a GIN Index using the following query
CREATE INDEX mytable_mycolumn_idx ON
mytable USING
hash(mycolumn);
|
BRIN Index
Block Range Index (BRIN Index) was introduced in PostgreSQL
9.5. BRIN works on very large tables by storing summary information for a set
of pages called “Block Range”. If you are using BRIN Index then be
ready for extra logic for PostgreSQL query statements and also need for extra
maintenance.
Create a BRIN Index using the following query
CREATE INDEX part_machine_id_idx_brin ON
part USING BRIN(machine_id);
|
No comments:
Post a Comment
Please do not enter any spam link in the comment box.