1. Overview
You may have heard the terms *data type* and *data structure* and wondered what they mean and how they differ. In this article, we briefly describe what each term means and discuss the difference between them in Java.
2. Prerequisites
1. You should know what a byte is. In short, a byte is the basic unit of (binary) information that a computer stores or processes. A byte is typically 8 bits, where each bit represents a binary 0 or 1.
2. You should have a basic understanding of classes in Java. Take a look at understanding a basic Hello World program in Java.
3. Data Types
Each programming language defines the types of data it supports, i.e., the language natively understands that type of data and may support specific operations on it. For example, the Java programming language supports an “integer” data type (referred to as int when writing code) and certain operations on this data type (addition, subtraction, etc.).
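When developers write code, some languages require them to state explicitly the type of data that a variable refers to. Strictly speaking, this is called static typing, although it is often loosely described as strong typing: when we declare a variable, we *must* specify its type, and the compiler checks that type before the program runs. Java is a statically typed language. (Since Java 10, the var keyword lets the compiler infer the type of a local variable, but the type is still fixed at compile time.)
Other languages, such as Python and JavaScript, determine the type of data during execution and do not require developers to specify it up front. This is called dynamic typing, and such languages are called dynamically typed languages. They are often loosely called weakly typed, although strictly that term describes how freely a language converts between types: Python, for example, is dynamically typed but still rejects an expression such as "1" + 1.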
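To make the idea of explicit type declarations concrete, here is a minimal sketch. The class name TypingDemo and the method add are hypothetical names chosen only for illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class TypingDemo {
    // The parameter types and the return type are part of the declaration.
    static int add(int x, int y) {
        return x + y;
    }

    public static void main(String[] args) {
        int count = 5;           // declared as int
        String label = "items";  // declared as String
        // count = "five";       // would not compile: incompatible types
        System.out.println(add(count, 2) + " " + label); // prints "7 items"
    }
}
```

The commented-out assignment is rejected by the compiler, so the mistake is caught before the program ever runs.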
3.1 Primitive and Reference Data types in Java
In Java, there are two broad kinds of data types: primitive data types and reference data types. They are part of the Java Language Specification and thus integral to the Java language itself.
3.1.1 Primitive Types
Each numeric primitive type, as well as char, has a fixed size in bytes. The size of a boolean is not precisely defined by the language. Primitive data types are passed by value.
Java defines and supports the following primitive types:
Java Primitive Type | Size (in bytes) | Description |
---|---|---|
byte | 1 | Represents an 8-bit signed two’s complement integer |
short | 2 | Represents a 16-bit signed two’s complement integer |
int | 4 | Represents a 32-bit signed two’s complement integer |
long | 8 | Represents a 64-bit signed two’s complement integer |
float | 4 | Represents a single-precision 32-bit IEEE 754 floating point |
double | 8 | Represents a double-precision 64-bit IEEE 754 floating point |
char | 2 | Represents a single 16-bit Unicode character (a UTF-16 code unit) |
boolean | Not precisely defined | Represents one of two values: true or false |
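The sizes in the table can be checked from code. The sketch below uses the BYTES constants on the standard wrapper classes (available since Java 8). The class name PrimitiveSizes and the method sizesInBytes are hypothetical names used only for this example.

```java
// Hypothetical demo class; the names are illustrative only.
class PrimitiveSizes {
    // Sizes in bytes, in the order: byte, short, int, long, float, double, char.
    static int[] sizesInBytes() {
        return new int[] {
            Byte.BYTES, Short.BYTES, Integer.BYTES, Long.BYTES,
            Float.BYTES, Double.BYTES, Character.BYTES
        };
    }

    public static void main(String[] args) {
        String[] names = {"byte", "short", "int", "long", "float", "double", "char"};
        int[] sizes = sizesInBytes();
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + sizes[i] + " byte(s)");
        }
        // A fixed size also fixes the range of values a type can hold, e.g. for int:
        System.out.println(Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
    }
}
```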
3.1.2 Reference Types
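Reference types in general do not have a fixed size in bytes. A variable of a reference type holds a reference to an object rather than the object itself. Java always passes arguments by value, so for a reference type it is the reference that is copied: a method can modify the object that the reference points to, but it cannot make the caller's variable point to a different object. Arrays, classes, interfaces and type variables are all reference types in Java.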
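A short sketch of what happens when a primitive and an array are handed to a method may help here. The method receives a copy of the int, and a copy of the reference to the array (not a copy of the array itself). The class and method names are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
class PassingDemo {
    // Changes only the method's local copy of the int.
    static void increment(int n) {
        n = n + 1;
    }

    // Modifies the array object that both caller and method refer to.
    static void setFirst(int[] arr) {
        arr[0] = 99;
    }

    // Rebinds only the method's local copy of the reference.
    static void reassign(int[] arr) {
        arr = new int[] {7};
    }

    public static void main(String[] args) {
        int x = 1;
        increment(x);
        int[] data = {1, 2, 3};
        setFirst(data);
        reassign(data);
        System.out.println(x);       // prints 1
        System.out.println(data[0]); // prints 99
    }
}
```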
3.2 Arrays
An array is an ordered collection of elements that are either all of the same primitive type or all references to objects. An array’s size is fixed when the array is created and is given as an int value. The statement below declares and initializes a new array called myIntArray that holds elements of the primitive type int and has size N:
int[] myIntArray = new int[N];
We can determine the length of the above array as follows:
int length = myIntArray.length;
In this example, length would be equal to N.
An array’s elements are accessed via an index. Index 0 refers to the first element of the array and index N – 1 refers to the last element, where N is the length of the array.
We can access the element at position i of an array named myArray as myArray[i]. Thus,
int a = myIntArray[0];
will set the variable a to the value of the first element of myIntArray and
myIntArray[0] = 3;
will set the first element of myIntArray to 3.
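Putting the pieces above together, the following sketch creates an array, sets two elements and walks over every index from 0 to length - 1. The class name ArrayDemo and the method sum are hypothetical, and the size 4 is an arbitrary choice standing in for N.

```java
// Hypothetical demo class; the names are illustrative only.
class ArrayDemo {
    // Adds up all elements by visiting indices 0 .. values.length - 1.
    static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] myIntArray = new int[4]; // int elements start out as 0
        myIntArray[0] = 3;             // first element
        myIntArray[3] = 5;             // last element (index length - 1)
        System.out.println(myIntArray.length); // prints 4
        System.out.println(sum(myIntArray));   // prints 8
    }
}
```

Accessing an index outside 0 to length - 1 throws an ArrayIndexOutOfBoundsException at runtime.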
4. Data Structures
Compared to data types, data structures are a higher-level construct. Data structures contain data (of one or more data types) and methods to access and possibly modify that data. Data structures that allow their data to be modified are called mutable data structures. Data structures that do not allow their data to be modified are called immutable data structures.
Data structures are general programming concepts and have some sort of implementation in most, if not all, programming languages.
Let’s look at some common data structures and how they are implemented in Java:
4.1 Array
An array is perhaps the simplest data structure. It contains an ordered list of data of a given length. As seen above it has a mechanism to both access its data and modify its data. Thus an array is a mutable data structure. Note: an array is both a data type and a data structure in Java.
4.2 String
We can consider a String as a data structure. It contains an ordered sequence of char values that represents text of a given length.
The String class has a method to determine its length:
public int length()
and a method to access the character at position i:
public char charAt(int i)
Note that the String class does not have a method to set the character at position i.
Thus a String is an immutable data structure.
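Because a String cannot be changed in place, "changing" a character really means building a new String. The sketch below shows one way to do that, assuming a hypothetical helper named replaceCharAt (it is not part of the String class).

```java
// Hypothetical demo class; replaceCharAt is an illustrative helper, not a String method.
class StringDemo {
    // Returns a new String with the character at index i replaced by c.
    static String replaceCharAt(String s, int i, char c) {
        char[] chars = s.toCharArray(); // a copy of the characters
        chars[i] = c;                   // arrays are mutable, so this is allowed
        return new String(chars);       // a brand-new String object
    }

    public static void main(String[] args) {
        String original = "cat";
        String changed = replaceCharAt(original, 0, 'b');
        System.out.println(original.length());  // prints 3
        System.out.println(original.charAt(0)); // prints c
        System.out.println(original);           // prints cat (unchanged)
        System.out.println(changed);            // prints bat
    }
}
```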
4.3 The Java Collections Framework
The Java Collections Framework contains a number of common data structures and is provided as part of the core Java libraries. A data structure in the framework typically consists of an interface, an abstract base class that implements that interface, and one or more concrete subclasses of the abstract base class.
Each interface defines the methods common to its corresponding data structure. The abstract base class provides default implementations for some of those methods.
Finally, each concrete implementation provides a data structure suitable for different purposes. For example, some implementations are more efficient for small data sets or are more efficient when access operations are used more often than inserting operations. Let’s look at some of these data structures below.
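As one illustration of this layering, the sketch below uses the List interface, the AbstractList base class and the ArrayList concrete class from java.util. The class name HierarchyDemo and the method extendsAbstractList are hypothetical names for this example.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class HierarchyDemo {
    // True if the given list's class derives from the AbstractList base class.
    static boolean extendsAbstractList(List<?> list) {
        return list instanceof AbstractList;
    }

    public static void main(String[] args) {
        // Declared by interface, created from a concrete class.
        List<String> names = new ArrayList<>();
        names.add("Ada");
        System.out.println(extendsAbstractList(names)); // prints true
    }
}
```

Declaring the variable with the interface type keeps the rest of the code independent of which concrete implementation was chosen.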
4.3.1 Set
A Set in Java defines an interface for a data structure that contains a collection of unique elements and, in general, makes no guarantee about their order (some implementations do maintain an order). Elements can be added to and removed from the Set, but there is no method to access an element by position. A specific concrete implementation of the Set interface is the HashSet class.
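A minimal sketch of the uniqueness rule, using HashSet. The class name SetDemo and the method countDistinct are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; the names are illustrative only.
class SetDemo {
    // Counts distinct words by relying on the Set's uniqueness rule.
    static int countDistinct(String[] words) {
        Set<String> seen = new HashSet<>();
        for (String word : words) {
            seen.add(word); // adding a duplicate leaves the set unchanged
        }
        return seen.size();
    }

    public static void main(String[] args) {
        String[] words = {"apple", "pear", "apple"};
        System.out.println(countDistinct(words)); // prints 2
    }
}
```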
4.3.2 List
A List in Java defines an interface for a data structure that contains an ordered collection of data. The elements of a List do not need to be unique. Elements can be added to and removed from the List, and there is a method to access the element at a specific position. Lists are similar to arrays and, in fact, one implementation of List, ArrayList, is backed by an array, i.e., the ArrayList class internally uses an array to implement the List interface.
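A minimal sketch of these List operations, using ArrayList. The class name ListDemo and the method sample are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ListDemo {
    // Builds a small list to show ordering, duplicates and removal.
    static List<String> sample() {
        List<String> fruits = new ArrayList<>();
        fruits.add("apple");
        fruits.add("pear");
        fruits.add("apple");   // duplicates are allowed
        fruits.remove("pear"); // removes the first matching element
        return fruits;
    }

    public static void main(String[] args) {
        List<String> fruits = sample();
        System.out.println(fruits.size()); // prints 2
        System.out.println(fruits.get(0)); // prints apple
        System.out.println(fruits.get(1)); // prints apple
    }
}
```

Unlike an array, the list grows and shrinks as elements are added and removed.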
4.3.3 Map
A Map defines an interface for a data structure that contains a collection of key-value pairs and, in general, makes no guarantee about their order. The values of a Map do not need to be unique, but their corresponding keys do. Key-value pairs can be added and removed, and there is a method to access a specific value by specifying its corresponding key. A specific concrete implementation of Map is the HashMap class.
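A minimal sketch of storing and looking up values by key, using HashMap. The class name MapDemo and the method wordLengths are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class MapDemo {
    // Maps each word (the key) to its length (the value).
    static Map<String, Integer> wordLengths(String[] words) {
        Map<String, Integer> lengths = new HashMap<>();
        for (String word : words) {
            lengths.put(word, word.length()); // a repeated key replaces the old value
        }
        return lengths;
    }

    public static void main(String[] args) {
        Map<String, Integer> lengths = wordLengths(new String[] {"apple", "pear", "apple"});
        System.out.println(lengths.size());      // prints 2 (keys are unique)
        System.out.println(lengths.get("pear")); // prints 4
        System.out.println(lengths.get("kiwi")); // prints null (no such key)
    }
}
```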
4.3.4 Stack
Another common data structure is a Stack. We’ll cover this important data structure in a subsequent article.
4.3.5 Custom Data Structures
Of course, you are not restricted to using only these data structures. Custom data structures can be designed and implemented to suit the requirements of a given programming task.
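As a deliberately tiny sketch, here is a custom data structure that combines a reference type and a primitive type with methods to read and modify the data. The class NamedCounter and everything in it are hypothetical, invented only for this illustration.

```java
// Hypothetical custom data structure; invented for illustration only.
class NamedCounter {
    private final String name; // reference type field, never changes
    private int count;         // primitive field, starts at 0

    NamedCounter(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    int count() {
        return count;
    }

    // Modifies the stored data, which makes this structure mutable.
    void increment() {
        count++;
    }

    public static void main(String[] args) {
        NamedCounter clicks = new NamedCounter("clicks");
        clicks.increment();
        clicks.increment();
        System.out.println(clicks.name() + " = " + clicks.count()); // prints "clicks = 2"
    }
}
```

Removing the increment method would leave no way to change the data after construction, turning this into an immutable data structure.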
5. Data Types vs Data Structures
Comparing data types and data structures is like comparing apples and oranges. That said, here are a few key points of comparison to keep in mind:
Data Types | Data Structures |
---|---|
Language specific | General concepts that a language may choose to implement |
Comparatively lower-level constructs | Comparatively higher-level constructs |
Primitive data types are atomic (a single, indivisible unit) | No such concept |
Are used to implement data storage for data structures | Use data types to hold their data |
6. Conclusion
In this article, we learned the concept of data types and data structures. We looked at some common data types and data structures in Java. Finally, we looked at data types and data structures in comparison with each other.