Data Types in C++

C++ Programming

# Available Data Types

Numeric Data Types

C++ contains intrinsic data types to store numeric values in your application code. It's important to remember that these values are stored in a fixed number of binary digits, so they are not as flexible as their mathematical counterparts. For example, in mathematical terms an integer is any whole number from negative infinity to positive infinity. No computer type can represent that entire range. Take as an example the int type in the Numeric Data Types table: its range extends only a little past 2.1 billion in either direction.

The byte count given in that table gives you a hint as to how much space each value occupies in memory and on disk. The sizes shown are those used by the Microsoft Visual C++ compiler; the C++ standard only sets minimums, so some types, long in particular, can be larger on other platforms.

NOTE: The type names that start with a __ character are considered non-standard types.

Type Name	Bytes	Alias	Range
int	4	signed	–2,147,483,648 to 2,147,483,647
unsigned int	4	unsigned	0 to 4,294,967,295
__int8	1	char	-128 to 127
unsigned __int8	1	unsigned char	0 to 255
__int16	2	short, short int, signed short int	–32,768 to 32,767
unsigned __int16	2	unsigned short, unsigned short int	0 to 65,535
__int32	4	signed, signed int, int	–2,147,483,648 to 2,147,483,647
unsigned __int32	4	unsigned, unsigned int	0 to 4,294,967,295
__int64	8	long long, signed long long	–9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
unsigned __int64	8	unsigned long long	0 to 18,446,744,073,709,551,615
short	2	short int, signed short int	-32,768 to 32,767
unsigned short	2	unsigned short int	0 to 65,535
long	4	long int, signed long int	–2,147,483,648 to 2,147,483,647
unsigned long	4	unsigned long int	0 to 4,294,967,295
long long	8	none	–9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
unsigned long long	8	none	0 to 18,446,744,073,709,551,615
float	4	none	3.4E +/- 38 (7 digits)
double	8	none	1.7E +/- 308 (15 digits)
long double	8	none	1.7E +/- 308 (15 digits)
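
The relationship between the byte count and the range in the table can be checked directly. The following is a minimal sketch, not part of the original lesson: maxUnsignedValue and printRanges are hypothetical helper names, and the arithmetic assumes 8-bit bytes.

```cpp
#include <iostream>
#include <limits>

// Largest value an unsigned integer of `bytes` bytes can hold, assuming 8-bit bytes.
constexpr unsigned long long maxUnsignedValue(unsigned bytes)
{
    return bytes >= sizeof(unsigned long long)
               ? std::numeric_limits<unsigned long long>::max()
               : (1ULL << (bytes * 8)) - 1;
}

// Prints the size and range of a few types as the current compiler defines them.
void printRanges()
{
    std::cout << "int: " << sizeof(int) << " bytes, "
              << std::numeric_limits<int>::min() << " to "
              << std::numeric_limits<int>::max() << '\n';
    std::cout << "long: " << sizeof(long) << " bytes\n";
    std::cout << "unsigned short max: "
              << std::numeric_limits<unsigned short>::max() << '\n';
}
```

Calling printRanges() from main shows the values for your own compiler, which may differ from the table for types such as long.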

Character Data Type

Character data is used to represent non-numeric data such as letters and symbols. Character data is actually represented as numeric information under the covers. The standard char type is used to represent the numeric values for character data as represented by the basic character set present on a particular computer. This is determined by the locale settings.

For internationalization purposes, the wchar_t type is used, which expands the range of numeric values available so that character sets from languages around the world can be represented.

NOTE: The type names that start with a __ character are considered non-standard types.

Type Name	Bytes	Alias	Range
char	1	none	–128 to 127 by default; 0 to 255 when compiled by using /J
signed char	1	none	-128 to 127
unsigned char	1	none	0 to 255
wchar_t, char16_t, and char32_t	2 or 4	__wchar_t	0 to 65,535 (wchar_t & char16_t), 0 to 4,294,967,295 (char32_t)
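
Because a char is stored as a number, it can be converted to and from an integer. The sketch below illustrates this; codeOf and nextChar are hypothetical helper names, and the specific values assume an ASCII-compatible character set.

```cpp
// Numeric code stored for a character; for 'A' in an ASCII-compatible set this is 65.
constexpr int codeOf(char c)
{
    return static_cast<unsigned char>(c);
}

// Character whose code is one higher than that of `c`.
constexpr char nextChar(char c)
{
    return static_cast<char>(c + 1);
}
```

With these helpers, codeOf('A') is 65 and nextChar('A') is 'B' on an ASCII-based system.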

Other Data Types

C++ supports other data types outside of the numeric or character data types. The first one in the table below is the Boolean data type, called bool. This is used to represent true or false values in an application. In earlier languages such as C, false was represented as a 0 value and true was represented as any non-zero value.

Type Name	Bytes	Alias	Range
bool	1	none	true or false
enum	varies	none	dependent on the enclosed data types

C++ also has the concept of an enumeration, called an enum. An enumeration is a set of constants stored as literal values. They limit the choices for the type. For example, when dealing with an int data type, you can assign any value to that data type that fits within the range of the integer type for that computer. With an enum, you specify a limited set of literal constants that can be assigned to the type.

Consider the need to use a data type to represent days of the week. How do you store this information in a data type? You could use an array, but what type of data would you use? Perhaps a string data type. But how do you prevent someone from adding a non-valid day of the week, like moncleday, to the array? If you create an enumeration that stores only valid values for Sunday through Saturday, you constrain the data type to those literal constants only.

Enumerations are covered later in this module under the lesson on Complex Data Types.

Choosing Data Types

Choosing the correct data type is important in your applications to ensure that you can represent your data efficiently and correctly. Some examples of this would be:

making use of short rather than int if your data range permits
using a double rather than a float to get greater precision for values representing money
using a wchar_t for character data that doesn't fit in the standard ASCII character set, such as Japanese kanji
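
The precision difference between float and double can be seen with a whole number that is too long for a float. This is a small sketch with hypothetical helper names (fitsInFloat, fitsInDouble); it assumes a 32-bit int and the common IEEE 754 formats, where a float holds about 7 significant digits and a double about 15.

```cpp
// True when an int survives a round trip through float unchanged.
// Intended for values comfortably inside the int range.
constexpr bool fitsInFloat(int value)
{
    return static_cast<int>(static_cast<float>(value)) == value;
}

// The same check using double, which has many more significant digits.
constexpr bool fitsInDouble(int value)
{
    return static_cast<int>(static_cast<double>(value)) == value;
}
```

Under those assumptions, 16777217 (eight digits) is changed by the trip through float but survives the trip through double.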

# Variables and Constants

Introducing Variables

Variables are identifiers that you create to hold values or references to objects in your code. A variable is essentially a named memory location.

When you create a variable in C++, you must give it a data type. You can give the variable a value at the time you create it, which is known as initializing the variable, or assign a value later in your program code. Even though you can assign the value separately from the definition, you must specify the data type when you define the variable. Be aware that C++ does not stop you from reading a local variable that was never given a value; doing so is undefined behavior and yields unpredictable data, although most compilers can warn about it. For that reason, it is good practice to initialize every variable when you define it. The following code sample shows two of the ways C++ lets you declare a variable and initialize it.

int myVar = 0;
int yourVar{1};

C++ has some restrictions around identifiers that you need to be aware of.

First off, identifiers are case-sensitive because C++ is a case-sensitive language. That means that identifiers such as myVar, MyVar, and myvar are considered different identifiers.

Identifiers can only contain letters (in any case), digits, and the underscore character. You can only start an identifier with a letter or an underscore character. You cannot start the identifier with a digit. myVar and _myVar are legal but 2Vars is not. Note that many names that begin with an underscore are reserved for the compiler and the standard library, so it is safest to start your own identifiers with a letter.

C++ has a set of reserved keywords that the language uses. You cannot use these keywords as an identifier in your code. You may choose to take advantage of the case-sensitivity of C++ and use Long as an identifier to distinguish it from the reserved keyword long, but that is not a recommended approach.

To keep up to date on the reserved keywords for C++, refer to the current edition of the C++ standard, which lists them.

Introducing Constants

Similar to a variable, a constant is a named memory location for a value used in your application. The difference is that, as you might expect, a constant cannot have its value changed while the program runs. C++ uses the keyword const to indicate that a value is a constant.

When you declare a constant in C++, you must initialize it at the same time. You cannot assign a value to it later in the program, nor can you change the value in code later. The initializer is often a literal, but it can be any expression, including the current value of a variable; once initialized, the constant keeps that value even if the variable is modified afterwards.
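
As a brief illustration of the const keyword, the sketch below declares a few constants and uses one in a calculation. The names (daysInWeek, hoursInDay, hoursInWeek, hoursInWeeks) are made up for this example.

```cpp
const int daysInWeek = 7;                          // initialized from a literal
const int hoursInDay = 24;
const int hoursInWeek = daysInWeek * hoursInDay;   // initialized from other constants

// Uses the constant in a calculation; the constant itself is never modified.
int hoursInWeeks(int weeks)
{
    // hoursInWeek = 100;   // would not compile: hoursInWeek is const
    return weeks * hoursInWeek;
}
```

Removing the comment markers from the assignment line produces a compiler error, which is exactly the protection a constant is meant to give.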

Type Conversion

Casting refers to converting one data type to another. Some data conversions are not possible, while some are possible but result in data loss. C++ can perform many conversions automatically, which is known as implicit casting or implicit conversion. One example is converting a smaller data type to a larger data type, as shown here:

int myInt = 12;
long myLong;
myLong = myInt;

In the first line, we declare an integer data type and assign it a value of 12. The next line declares a long data type and in the third line, we assign the integer data type value to the long data type. C++ automatically converts the data type for you. This is known as a widening conversion. Some programmers also call this an expanding assignment. We are expanding or widening the data type to a larger one. In this case, there is no loss in data. The following table highlights some potential data conversion problems.

Conversion	Potential Issues
Large floating point type to small floating point type	Loss of precision and/or the starting value might be out of range for the new type
Floating point type to integer type	Loss of the fractional component of the floating point type and/or out of range
Bigger integer type to smaller integer type	Starting value may be out of range for the new type

This table only deals with numeric data type conversions. There are other conversion types such as from character to numeric or numeric to character, or among character types. C++ also uses the boolean type that represents true or false. If you assign a zero value to a bool variable, it will be converted to false. Any non-zero value is converted to true.
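
The conversions described above can be tried out with a few one-line helpers. This is an illustrative sketch; toInt, toBool, and toByte are hypothetical names, and toByte assumes 8-bit bytes.

```cpp
// Floating point to integer: the fractional part is discarded.
constexpr int toInt(double value)
{
    return static_cast<int>(value);
}

// Integer to bool: zero becomes false, any other value becomes true.
constexpr bool toBool(int value)
{
    return static_cast<bool>(value);
}

// Bigger integer to smaller unsigned integer: values that do not fit wrap around.
constexpr unsigned char toByte(int value)
{
    return static_cast<unsigned char>(value);
}
```

For example, toInt(3.99) gives 3 rather than 4, and toByte(300) gives 44 because 300 does not fit in the 0 to 255 range.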

When you want to explicitly perform a conversion or cast, you can use the type cast features of C++. For example, the previous widening conversion in the int to long cast was implicit, but you can also tell the compiler that you know what you are doing by using a type cast, as in:

long myLong = (long)myInt;

// or you can use this version as well

long myLong = long(myInt);

C++ also provides a cast operator that is more restrictive in its usage. It takes the form static_cast<type>(expression). The static_cast operator can be used for converting between numeric types, including character types. As an example:

char ch;
int i = 65;
float f = 2.5;
double dbl;
ch = static_cast<char>(i);   // int to char
dbl = static_cast<double>(f);   // float to double

# Arrays

Complex Data Types (Arrays)

So far you have been introduced to the intrinsic data types that C++ supports. These are types that contain the data literals directly. The int type directly stores the literal integer value, for example.

C++ also provides support for complex data types. These are also referred to as compound data types, mostly because they store more than one piece of data or potentially more than one data type.

An array is a set of objects that are grouped together and managed as a unit. You can think of an array as a sequence of elements, all of which are the same type. You can build simple arrays that have one dimension (a list), two dimensions (a table), three dimensions (a cube), and so on. Arrays in C++ have the following features:

Every element in the array contains a value.

Arrays are zero-indexed, that is, the first item in the array is element 0.

The size of an array is the total number of elements that it can contain.

Arrays can be single-dimensional or multidimensional. (C++ has no built-in jagged array, although one can be emulated, for example with an array of pointers.)

The rank of an array is the number of dimensions in the array.

Arrays of a particular type can only hold elements of that type. This means that you cannot store integers, longs, and character data types in the same array.
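
The terms rank and size can be made concrete with a small two-dimensional array. The sketch below uses hypothetical names (table, rowCount, columnCount); constexpr is used only so that the values can be checked by the compiler.

```cpp
// A two-dimensional array (rank 2): 2 rows of 3 columns, every element an int.
constexpr int table[2][3] = {
    { 1, 2, 3 },
    { 4, 5, 6 }
};

// Size of each dimension, computed from the array itself.
constexpr int rowCount = sizeof(table) / sizeof(table[0]);
constexpr int columnCount = sizeof(table[0]) / sizeof(table[0][0]);
```

Here table[1][2] is the element in the second row and third column, and the total size of the array is rowCount multiplied by columnCount, or 6 elements.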

Creating and Using Arrays

When you declare an array, you specify the type of data that it contains and a name for the array. To declare a single-dimensional array, you specify the type of the elements and use brackets, [], after the name to indicate that the variable is an array. The following code example shows how to create a single-dimensional array of ten integers, with elements numbered zero through nine.

int arrayName[10];

You can also choose to create an array and initialize it with values at the same time, as in the following example that declares an integer array and assigns values to it. The compiler knows how large to make the array by the number of values in the curly braces:

int arrayName[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

You can also declare an array and only initialize some of the elements:

int arrayName[10] = {1, 2, 3};

In this case, we have declared an array of size 10 but have only assigned values to the first three elements. The compiler will initialize the remaining elements to the default value for the data type the array holds. In this case, int data type, the remaining values are initialized to 0.

Accessing Data in an Array

You can access data in an array in several ways, such as by specifying the index of a specific element that you require or by iterating through the entire array and returning each element in sequence.

The following code example uses an index to access the element at index two.

//Accessing Data by Index
int oldNumbers[] = { 1, 2, 3, 4, 5 };

//number will contain the value 3
int number = oldNumbers[2];

Note: Arrays are zero-indexed, so the first element in any dimension in an array is at index zero. The last element in a dimension is at index N-1, where N is the size of the dimension. If you are using some other languages, such as C#, and you attempt to access an element outside this range, the C# runtime will throw an exception (error). C++ doesn't offer such protection. If you attempt to access an element that is outside the bounds of your array, the behavior is undefined: the program will often still return data, but you have no idea what that data is.

The reason for this is that an array is simply a contiguous block of memory, and the array name converts to a pointer to its first element. The first element of the array is at the starting memory address for the entire array. If you have an array of integer data types, then the number of elements multiplied by the size of the int data type on your system determines how much memory is used by the array. The same arithmetic permits access to the elements in the array by performing math on the memory address to get at the required element. If you attempt to access oldNumbers[5], the program will typically just return whatever data is found at the memory address immediately beyond the last array element. This is a dangerous situation and is, in fact, the cause of some security issues found in software.
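
Since C++ does not check array bounds for you, one option is to check the index yourself before using it. The following is only a sketch of that idea: valueOr and sampleNumbers are hypothetical names, and the template parameter N simply lets the compiler work out the array size.

```cpp
#include <cstddef>

// Returns numbers[index] when the index is inside the array, otherwise `fallback`.
// N is the array size, deduced by the compiler from the argument.
template <std::size_t N>
constexpr int valueOr(const int (&numbers)[N], std::size_t index, int fallback)
{
    return index < N ? numbers[index] : fallback;
}

// Sample data with the same contents as oldNumbers
// (constexpr so the results can be checked at compile time).
constexpr int sampleNumbers[] = { 1, 2, 3, 4, 5 };
```

With this helper, valueOr(sampleNumbers, 2, -1) returns 3, while valueOr(sampleNumbers, 5, -1) returns -1 instead of reading past the end of the array. The standard library containers std::array and std::vector offer a similar safeguard through their at() member function, which throws std::out_of_range for an invalid index.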

You can also iterate through an array by using a for loop. You will cover loops in module 3 so don't worry if you don't completely understand this example at this time. Essentially, the for loop starts at 0 and repeats the portion in the curly braces {} for each of the five steps in the loop.

The following code example shows how to use a for loop to iterate through an array.

//Iterating Over an Array
int oldNumbers[] = { 1, 2, 3, 4, 5 };
for (int i = 0; i < 5; i++)
{
     int number = oldNumbers[i];
     ...
}

# String

Strings are a series of characters. C++ represents strings in one of two ways. The first maintains backward compatibility with the C language and represents the string as a character array. There is one aspect of a C-style string that is important to note. The last character of every string you store is the null character, which has the numeric value 0 and is written as \0. This is necessary so that code working with the string knows where the string ends. An example demonstrates a C-style string stored in a character array:

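char isAString[6] = { 'H', 'e', 'l', 'l', 'o', '\0'};   // null-terminated: a valid C-style string
char isNotAString[5] = { 'H', 'e', 'l', 'l', 'o'};      // no terminator: just an array of characters
cout << isAString << endl;
cout << isNotAString << endl;   // undefined behavior: output runs past the end of the array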

The most common mistakes made by users of the C-style string are forgetting to make the char array large enough to accommodate the \0 character and forgetting to include the \0 at all. In the previous example, a programmer might think that an array of size 5 would be large enough to contain Hello because that's how many characters are in the word. However, the null character would not be included in the second array, which could result in errors in code that uses this array. The reason is that C++ does not consider the isNotAString array to be a string.

Consider the output displayed in Figure 2.1. Note that the first output correctly terminates because C++ encountered the null (\0) character. The second did not terminate and output the contents of adjacent memory.

An alternative method of declaring a character array for use as a string is to simply initialize it with a string literal. A string literal is a sequence of characters enclosed in the double quotes ("). For example:

char isAString[6] = "Hello";
char isAnotherString[] = "Array size is inferred";

In the previous example, the first line creates an array of size 6 and assigns the string literal Hello to the array. The second example lets the compiler infer the size from the string literal itself. Note that neither of these two string literals specifies a \0 character. The compiler will implicitly add that for you. However, caution is advised in the first line to ensure that you allow enough room in the array size specified for the null character. If you create an array that is larger than required for the string literal along with the null character, then C++ simply fills the remaining elements of the array with null characters.
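
The effect of the implicit null character on array size can be verified with sizeof. The sketch below uses hypothetical names (lengthOf, inferred, padded); lengthOf counts characters up to the \0 in the same way the standard std::strlen function does, and is written recursively only so that the compiler can evaluate it.

```cpp
#include <cstddef>

// Counts the characters before the terminating '\0'.
constexpr std::size_t lengthOf(const char* text)
{
    return *text == '\0' ? 0 : 1 + lengthOf(text + 1);
}

constexpr char inferred[] = "Hello";   // size inferred: 5 letters + '\0' = 6 elements
constexpr char padded[8] = "Hello";    // larger than needed: the rest is filled with '\0'
```

Here sizeof(inferred) is 6 while lengthOf(inferred) is 5, because the length of a string does not count the terminator. The padded array has 8 elements, but its string length is still 5.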

The string Class

If the use of character arrays, single quoted characters, and null termination characters is making you think that strings aren't worth the hassle, consider the string class instead. The ISO/ANSI standard helped to expand the string handling capabilities of C++ by adding the string class.

In order to use the string class, you have to include the string header file. We have not covered namespaces yet but to make typing much easier, you would add a using statement as in the following example.

#include <string>
using namespace std;
string myString = "Hello!";
std::string myNewString = "Less typing";

Without the using directive, you would have to type std::string every time you wanted to use the string class in your code, as in the last line above.

As you can see from the code example, you use string in the same manner in which you would use any other data type in C++. You also do not need to add a null character to terminate your string.
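
To give a feel for working with the string class, here is a short sketch that builds one string from two others. fullName is a hypothetical function written for this illustration; the assert line is just a safety check that stops the program if an empty name is passed in.

```cpp
#include <cassert>
#include <string>

// Joins a first and last name with a space; both are expected to be non-empty.
std::string fullName(const std::string& first, const std::string& last)
{
    assert(!first.empty() && !last.empty());   // guard against accidental empty input
    std::string result = first;
    result += ' ';     // append a single character
    result += last;    // append another string
    return result;
}
```

Calling fullName("Ada", "Lovelace") produces the string Ada Lovelace, and its size() member reports 12 characters. No array sizes or null characters need to be managed by hand.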

# Structures

Arrays can store multiple pieces of data in one compound data type but recall, the data types must all be of the same type. If that is the case, how might you store multiple pieces of data in one type, where the individual pieces are of different data types? For example, let's say that we want to store information about a coffee bean. We might want to store information about the bean type, its strength, and perhaps which country it originated from. In this case, we could use all strings to store that information but what if the strength was intended to be a number from 1 to 10. In this case, we would want to store two strings and one integer in our coffee bean data type.

We haven't covered classes yet, which is another data type we could use, but instead, we will use a structure (struct) to store this information. Structures are known as user-defined types. You define the struct by giving it a name and then adding the member data types as in the following example:

struct coffeeBean
{
     string name;
     string country;
     int strength;
};

Recall that in order to use the string data type in our struct, the C++ file that contains the struct must include the string header file. This code snippet also assumes that using namespace std; has also been included.

Once we have defined the structure, we can then use it in our code the same as any other data type. To use the coffeeBean struct in your code, you simply declare a new variable of that type as shown in this example.

struct coffeeBean
{
     string name;
     string country;
     int strength;
};

coffeeBean myBean = { "Strata", "Colombia", 10 };
coffeeBean newBean;
newBean.name = "Flora";
newBean.country = "Mexico";
newBean.strength = 9;
cout << "Coffee bean " + newBean.name + " is from " + newBean.country << endl;

You can assign values to a struct using one of the methods seen here. For myBean, we assign values in the curly braces, while for newBean, we use the dot notation. You can also access the values of the struct members using the dot notation, as shown in the cout statement at the end.

# Unions

A union, in C++, is similar to a structure in that it can store multiple, disparate data types. The difference is that a union can only store one piece of data at a time. What does that mean? It's best shown with an example.

union numericUnion
{
     int intValue;
     long longValue;
     double doubleValue;
};
numericUnion myUnion;
myUnion.intValue = 3;
cout << myUnion.intValue << endl;
myUnion.doubleValue = 4.5;
cout << myUnion.doubleValue << endl;
cout << myUnion.intValue << endl;   // no longer 3: doubleValue overwrote the shared memory

In this example, we define a union called numericUnion and then create a variable of that type, called myUnion. We first assign the value 3 to the intValue field and then output it. Next we assign the value 4.5 to the doubleValue field and output that. The example shows how the union works when, on the last line, we try to output the value for intValue again. The output is no longer 3; on a typical system it is 0. The reason is that all the fields share the same memory, so once we assign a value to doubleValue, what was contained in intValue is overwritten. Strictly speaking, reading a field other than the one most recently assigned is undefined behavior in C++, so you should not rely on the value you get. The union can only store a value in one of its fields at a time.

Why use a union over a struct if it can only hold one piece of data at a time? Consider a situation where you are programming an application that will run on a device with limited memory. You would like to use a data type that can support multiple types internally like a struct, but not necessarily all at the same time. For example, part numbers for components in manufacturing where the part number may be a number or perhaps a string, depending on the manufacturer of the part. In this case, you could use the union to represent a numeric and a string data type internally but only assign the proper data type based on the part number.
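
The part number scenario might be sketched as follows. This is only one possible approach, not code from the lesson: PartNumber, makeNumericPart, and makeTextPart are hypothetical names, the text form is limited to 15 characters to keep the example small, and a bool field records which member of the union currently holds the data.

```cpp
#include <cassert>
#include <cstring>

struct PartNumber
{
    bool isNumeric;     // remembers which member of the union is in use
    union               // both members share the same memory
    {
        long number;
        char text[16];
    };
};

// Creates a part number that stores a numeric value.
PartNumber makeNumericPart(long number)
{
    PartNumber part = {};
    part.isNumeric = true;
    part.number = number;
    return part;
}

// Creates a part number that stores text (truncated to 15 characters).
PartNumber makeTextPart(const char* text)
{
    assert(text != nullptr);
    PartNumber part = {};
    part.isNumeric = false;
    part.text[0] = '\0';                                    // switch the union to its text member
    std::strncpy(part.text, text, sizeof(part.text) - 1);
    part.text[sizeof(part.text) - 1] = '\0';                // guarantee termination
    return part;
}
```

Code that receives a PartNumber checks isNumeric first and then reads only the matching member, which avoids reading a field that was not the last one assigned.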

# Enumerations

In the topics on variables and constants, it was noted that anytime you want to create a value for use in a program, where the value should never change, you used a constant. An enumeration can be considered a way to create what are known as symbolic constants. The most common example is to use an enum to define the day of the week. There are only seven possible values for days of the week, and you can be reasonably certain that these values will never change.

To create an enum, you declare it in your code file with the following syntax, which demonstrates creating an enum called Day, that contains the days of the week:

enum Day { Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday };

By default enum values start at 0 and each successive member is increased by a value of 1. As a result, the previous enum 'Day' would contain the values:

Sunday = 0

Monday = 1

Tuesday = 2

etc.

You can change the default by specifying a starting value for your enum as in the following example.

enum Day { Sunday = 1, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday };

In this example, Sunday is given the value 1 instead of the default 0. Now Monday is 2, Tuesday is 3, etc.

The keyword enum introduces a new type, in this case an enumeration type named Day. Variables can then be declared with that type. Consider the following code sample:

enum Day { Sunday = 1, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday };
Day payDay;
payDay = Thursday;
cout << payDay << endl;

The first line defines the enumeration Day and assigns seven values to the enum. Sunday is listed as the first day of the week and is initialized with the value one.

The second line declares a new variable called payDay which is of the Day enum type. In the third line, payDay is assigned a value from the list of values, in this case Thursday. Finally, the last line outputs the value of payDay to the console window. If you run this code, you will notice that the last line outputs 5 and not Thursday. Internally, the constants in the enum are used as numbers and not as the textual representation you assign to them.
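
If you want the text rather than the number, you have to provide the mapping yourself. One simple way, shown here as a sketch with hypothetical names (dayNames, dayName), is to keep an array of names in the same order as the enum values.

```cpp
enum Day { Sunday = 1, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday };

// Names in the same order as the enumerators above.
constexpr const char* dayNames[] = {
    "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"
};

// Looks up the name for a Day; subtracting Sunday maps the values 1..7 onto indexes 0..6.
constexpr const char* dayName(Day day)
{
    return dayNames[day - Sunday];
}
```

With this in place, cout << dayName(payDay) << endl; outputs Thursday instead of 5. The lookup assumes it is only given one of the seven values defined in Day.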
