Problem-solving with a computer involves processing data. As our programs become more sophisticated, we need assistance
- to organize a large amount of data
- to manage relationships among individual data items.
To process data efficiently, we need to define the data types to organize the data and operations to be performed on the data. The definition of data type and the definition of operations applied for that data are a part of an idea behind the “abstract data type” (ADT). This section looks at this powerful idea of “abstract data type.”
Abstract Data Type background
One way to think about Abstract data type is to consider the different types of data we use in the real world.
A type is a collection of values. Examples of types in the real world are
- The boolean type consists of values true and false.
- The integer type consists of values ranging from … -2, -1, 0, 1, 2…
An Abstract data type is a type with a set of operations that can act on that type. Operations specify the things you can do to the value of the type.
The user of the data type only needs to know that set of operations are available for the data type but does not need to understand how they are applied. Abstract data type does not reveal how the data is stored or what internal mechanisms are used to accomplish the operations on that data.
integer data type (int) can store … -2, -1, 0, 1, 2…. and we can perform addition (+), substractions (-), multiplications (*), operation on those stored values.
The below figure depicts the internal representation of float by the computer.
A programmer writing the C program to add two float values need not know the above representation to use the float variable in the program.
All programmers need to know is
- How to declare float value
- How to assign value to float value.
- Operations on the float value (addition, subtraction, multiplication, division, mod)
The implementation details of the data type are hidden from the user of the ADT and protected from outside access.
There are many types of ADTs
- The standard inbuilt ADTs in programming languages include int, double, string, char, etc.
- ADTs that facilitate organizing and manipulating collections of data or objects include stacks, queues, Arrays, etc. (also called data structures)
You will sometimes confuse by the term “data type” and “abstract data type (ADT).” People use the term abstract data type to emphasize that the “internal structure and internal working of the operations” are somehow unknown to the outside world—only the effects of the operation matter. Anyway, saying data type is enough; all data types, built-in or derived, are “abstract.”
Simple Data Types
Many programming languages already define simple data types, such as int, float, bool, string, etc., as integral parts of the language. These are essential data types and are built into the programming languages.
For example, the C language provides the data type int; C also defines several operations that can be applied to this data type (addition, subtraction, multiplication, division, and so on). C explicitly defines these operations on integers and what we expect as the results.
Complex Data Types
Although several simple data types, such as int, float, char, pointers, etc., have been implemented by any programming language, many programming applications require the programmer to create many useful complex data types using built-in data types.
For example, we need list data type, stack data type, queue data type, tree data type, and so on. To be efficient, these data types should be created and stored in the library of the computer to be used.
Data structure and ADT
Consider a program that manages the student records. Here are some properties to consider when thinking about students.
- Student name
- mailing address
Three simple operations performed by this program include
- Adding a new student to the class: ADD (studentRecord)
- searching the class for a student: SEARCH(studentId)
- Deleting the student: DELETE(student record)
We will discuss how these student records will be stored in memory and how these operations will be implemented.
We can implement the solution to the above problem using two alternative data structures
- an unordered array of records
- ordered array of records, ordered by Ids.
Implementation using an unordered array
- ADD: Take the student record and store it in the following available location of the array. Increment n.
- SEARCH: The records in this array are not sorted, so we have to look through the whole array from beginning to end to find the needed record. In this case, we use the linear search algorithm. The result could be either record is found, or the record doesn’t exist
- DELETE: This operation requires us to search for the given record first. Once found, the algorithm can replace it with the last record and decrement n.
Implementation using an ordered array
Assume that student records are sorted in an array, with an increment order of student Ids.
- ADD: Since the array is sorted, we first need to find out where we should insert the record. This can be done by scanning through the array, comparing the current record in the array with the record we want to insert, and finding the smallest index I of the record whose Id is larger than the new Id. Then the new record should be put to the ith slot in the array. All the records in the place, i, i+1, i+2 …n, need to be moved down one slot to create an empty slot for the new record. We can then put the new record into the ith slot and increment n.
- SEARCH: Since the array of records is sorted by Ids, we can use a binary search algorithm to find the given record.
- DELETE: We first search to find the record in this operation. Since the array is sorted, we need to fill in the slot I post deletion with a record, which means that records in slots i+1, i+2 and so on will need to be moved up one slot each to cover for an empty one.
The above picture depicts an abstract data type concept.
- Client calls the interface for adding, deleting, and searching student records. In many programming languages, ADTs can be expressed as an interface, which is simply a list of method declarations, where each method has an empty body (without any implementation)
- This means a client (application using the ADT) doesn’t need to know about the implementation. The programmer can focus on problem-solving and not worry about the implementation.
- the concrete class implements the data structure and algorithms.
- Suppose the interface implementation which provides (ADD, DELETE, SEARCH) functionality uses an unordered list with linear search in the beginning. Later, if we realize linear search is not a scalable and efficient algorithm to use for a large number of students, then we can change the implementation with an ordered list with binary search.
- This means implementation of ADTs can be changed (e.g., for efficiency) without requiring changes to the program that uses the ADTs, so the client need not have to change its implementation as it still uses the APIs provided by interfaces, and it is abstracted from the implementation.
- Maintenance of the application is easier
The principle of hiding the used data structure and only providing a well-defined interface is known as encapsulation.
Here the concept of abstraction means
- We know what a data type can do
- How it is done is hidden (whether we use an unordered array with liner search or an ordered array with binary search is hidden from the client)
A real-world example of ADT
Java’s standard libraries supply TreeMap and HashMap implementations of its Map data type.
- TreeMap: More efficient when a total ordering on the keys can be computed quickly
- HashMap: More efficient when hash values can be computed quickly
The part of the program that creates a Map can decide which implementation to use. The part of a program that deals with a created Map doesn’t have to know how it was implemented; once created, it is just a Map.
Every part of the program that uses a Map would have to be written twice, without the concept of abstract data type, with one version to deal with the TreeMap implementation and another version to deal with Hashmap implementation.