Skip List

Algorithms/Data Structures
Updated on:
September 17, 2024

What is a skip list?

A skip list is a probabilistic data structure that arranges elements in a hierarchical linked list and uses randomized "skips" to enable fast search over ordered data. It provides logarithmic search and insertion complexity like balanced trees.

Skip lists use layers of pointers to "skip" over elements, giving fast access to elements based on their sort order. Extra pointer levels are added probabilistically to create shortcuts for faster traversal.

This allows efficient searches, inserts, and deletes in log N time on average. Skip lists have more locality and simpler implementation than tree structures.

Skip lists employ probabilistic techniques reminiscent of Bloom filters and interval arithmetic used in tree structures like B-trees. Skip lists are commonly used to implement fast, scalable ordered key-value storage including distributed variants.

How does a skip list work?

A skip list consists of sorted linked list layers where higher layers act as an "express lane" to skip segments of the lower layer. An element is promoted to a higher layer randomly based on a configurable probability.

Search and insert traverse the highest possible layer and drop down only as elements are encountered. This skipping allows fast access based on sort order.

Why are skip lists important? Where are they used?

Skip lists provide fast search over dynamic ordered data without complex rebalancing operations. They are used in databases, caches, networking, probabilistic data structures, and search trees.

Skip lists scale better than binary search trees while providing similar access complexity. Their simplicity compared to tree balancing enables efficient implementations.

FAQ

How is a skip list different from a binary search tree?

Skip lists use probabilistic "skipping" via multilayer linked lists rather than complex rebalancing rules. Insertions are faster but queries are probabilistic.

What are the advantages of skip lists over trees?

  • Simpler to implement than balanced binary search trees.
  • Faster to insert and delete since no rebalancing needed.
  • Concurrent operations have less impact on performance.
  • Cache-friendly linear structure.

What are the downsides of using skip lists?

  • Worse (but still logarithmic) search time compared to balanced trees.
  • Not deterministically balanced, small probability of degradation.
  • Require random number generation.

Where are skip lists commonly used?

  • Database indexing, in-memory caching layers
  • Network routing table lookups
  • Fast search over sorted data streams
  • Probabilistic data structures like Bloof filters

References:


Related Entries

Interval Arithmetic

Interval arithmetic is a method of computing with sets of numbers rather than single values, representing uncertainty in calculations and accounting for rounding errors.

Read more ->
Probabilistic Data Structures

Probabilistic data structures are space and time efficient data structures that use randomized algorithms to provide approximate results to queries with strong guarantees.

Read more ->
Bloom Filter

A Bloom filter is a probabilistic data structure used to test set membership that is space-efficient compared to storing the full set.

Read more ->

Get early access to AI-native data infrastructure