Full Table Scan
When a DBMS sees a query of the form like
The obvious thing to do is read through the tuples of
R and report these tuples that satisfy the condition.
This is called a Full Table Scan.
If we have to report 80% of the tuples in
R, it makes sense to do a full table scan.
Otherwise, if the query is very selective, and returns just a small percentage of the tuples, we might hope to do better.
Consider a selection query with a single equality in the condition
This is a Point Query: We look for a single value for “year”. Point queries are easy if data is sorted by the right attribute.
Consider a selection query of the form
This is a Range Query: We look for a range of values of “birthdate”.
Range queries are also easy if data is sorted by the right attribute, but often not be as selective as point queries.
Index & Index Scan
- To speed up queries the DBMS may build an Index on the year attribute.
- A database index is similar to an index in the back of a book:
- for every piece of data you might be interested in, the index says where to find it.
- The index itself is organised such that one can quickly do the lookup.
- Looking for information in a relation with the help of an index is called Index Scan.
Index Scan vs Full Table Scan
Point and range queries on the attribute(s) of the Primary Index are almost always best performed using an index scan.
Secondary Index should be used with high selectivity queries: As a rule of thumb, a secondary index scan is faster than a full table scan for queries returning less than 10% of a relation.
Choosing to Use an Index
Drawbacks of Indexing
- Space Usage
- Small for primary index
- Similar to data size for secondary index
- Time usage for keeping indexes updated under data insertion & data change
- Small to medium for primary index
- High for secondary index
- Some joins can be executed using a modest number of index lookups, maybe faster than looking at all data
- Some queries may be executed by only looking at the information in the index
- Index only queries execution plan (coving index)
- May need to read much less data