
Section 1.2
Variables. Data types. Constants.
cplusplus.com

The usefulness of the "Hello World" program shown in the previous section is more than questionable: we had to write several lines of code, compile them, and then execute the resulting program just to obtain a phrase on the screen. It would certainly have been much faster to type the phrase ourselves. But programming is not limited to printing text on the screen. In order to go a little further and write programs that perform useful tasks and really save us work, we need to introduce the concept of the variable.

Suppose I ask you to retain the number 5 in your memory, and then I ask you to also memorize the number 2. You have just stored two values in your memory. Now, if I ask you to add 1 to the first number I said, you should be retaining the numbers 6 (that is, 5+1) and 2 in your memory. We could now subtract these values and obtain 4 as the result.

This whole process is similar to what a computer can do with two variables. The same process can be expressed in C++ with the following set of instructions:

a = 5;
b = 2;
a = a + 1;
result = a - b;

Obviously this is a very simple example, since we have only used two small integer values, but consider that your computer can store millions of numbers like these at the same time and perform sophisticated mathematical operations with them.

Therefore, we can define a variable as a portion of memory that stores a given value.

Each variable needs an identifier that distinguishes it from the others. For example, in the previous code the variable identifiers were a, b and result, but we could have given the variables any names we wanted, as long as they were valid identifiers.

Identifiers

A valid identifier is a sequence of one or more letters, digits or underscore characters ( _ ). The length of an identifier is not limited, although for some compilers only the first 32 characters of an identifier are significant (the rest are not considered).

Neither spaces nor accented letters can be part of an identifier. Only letters, digits and underscore characters are valid. In addition, variable identifiers should always begin with a letter. They can also begin with an underscore character ( _ ), but this is usually reserved for external links. In no case can they begin with a digit.

Another rule to consider when inventing your own identifiers is that they cannot match any keyword of the language or of your compiler, since they could be confused with these. For example, the following words are always considered keywords according to the ANSI-C++ standard and therefore must not be used as identifiers:

asm, auto, bool, break, case, catch, char, class, const, const_cast, continue, default, delete, do, double, dynamic_cast, else, enum, explicit, extern, false, float, for, friend, goto, if, inline, int, long, mutable, namespace, new, operator, private, protected, public, register, reinterpret_cast, return, short, signed, sizeof, static, static_cast, struct, switch, template, this, throw, true, try, typedef, typeid, typename, union, unsigned, using, virtual, void, volatile, wchar_t
Additionally, the alternative representations of some operators must not be used as identifiers, since they are reserved words under some circumstances:
and, and_eq, bitand, bitor, compl, not, not_eq, or, or_eq, xor, xor_eq
Your compiler may also include some more specific reserved keywords. For example, many compilers that generate 16-bit code (like some compilers for DOS) also include far, huge and near as keywords.

Important: The C++ language is "case sensitive". This means that an identifier written in capital letters is not equivalent to another one with the same name written in small letters. Thus, for example, the variable RESULT is not the same as the variable result, nor the variable Result.

Data types

When programming we store variables in the computer's memory, but the computer must know what we want to store in them, since a simple number, a letter and a large number do not occupy the same amount of memory.

Our computer's memory is organized in bytes. A byte is the minimum amount of memory that we can manage. A byte can store a relatively small amount of data, like an integer between 0 and 255 or a single character. In addition, the computer can manipulate more complex data types that come from grouping several bytes, like long numbers or numbers with decimals. Below is a list of the fundamental data types in C++, together with the range of values that each one can represent:

DATA TYPES
Name         Bytes  Description                                   Range
char         1      character or integer, 8 bits long             signed: -128 to 127
                                                                  unsigned: 0 to 255
short        2      integer, 16 bits long                         signed: -32768 to 32767
                                                                  unsigned: 0 to 65535
long         4      integer, 32 bits long                         signed: -2147483648 to 2147483647
                                                                  unsigned: 0 to 4294967295
int          *      integer, system word size (see note below)    see short, long
float        4      floating point number                         3.4e+/-38 (7 digits)
double       8      double precision floating point number        1.7e+/-308 (15 digits)
long double  10     long double precision floating point number   1.2e+/-4932 (19 digits)
bool         1      Boolean value (see note below)                true or false

* int: its length depends on the length of the system's word type. In MS-DOS it is 16 bits long, whereas in 32-bit systems (like Windows 9x/2000/NT and systems that work under protected mode on x86) it is 32 bits long. On most 64-bit systems it is also 32 bits long.
* bool: it is the data type most recently added by the ANSI-C++ standard, so not every compiler supports it. Consult the section bool type for information about compatibility.

The sizes shown are those typical of 32-bit systems; they can differ on other systems.

In addition to these fundamental data types there are also pointers and the void type specifier, which we will see later.

Declaration of variables

In order to be able to use a variable in C++, first we must declare it specifying which of the data types above we want it to be. For that we only need to write the data type specifier that we need (like int, short, float...) followed by a valid variable identifier. For example:
int a;
float mynumber;
are possible declarations of valid variables. The first one declares a variable of type int with the identifier a. The second one declares a variable of type float with the identifier mynumber. Once declared, variables a and mynumber can be used within the rest of their scope in the program.

If you need to declare several variables of the same type and you want to save some writing work you can declare all of them in the same line separating the identifiers with commas. For example:

int a, b, c;
declares three variables of type int (a, b and c), and has exactly the same meaning as if we had written:
int a;
int b;
int c;

Integer data types (char, short, long and int) can be signed or unsigned according to the range of numbers that we need to represent. To specify which one we want, we put the keyword signed or unsigned before the data type itself. For example:

unsigned short NumberOfSons;
signed int MyAccountBalance;
By default, if we do not specify signed or unsigned, the type is assumed to be signed. Thus the second declaration could have been written:
int MyAccountBalance;
with exactly the same meaning. This last form is the most usual; in fact, it is rare to see source code that includes the keyword signed.

The only exception to this rule is the char type, which exists by itself and is considered by ANSI-C++ a different type from signed char and unsigned char.

To see what a declaration looks like in a program, here is the C++ code of the example about your memory proposed at the beginning of this section:

// operating with variables

#include <iostream>
using namespace std;

int main ()
{
  // declaring variables:
  int a, b;
  int result;

  // process:
  a = 5;
  b = 2;
  a = a + 1;
  result = a - b;

  // print out the result:
  cout << result;

  // terminate the program:
  return 0;
}
4

Do not worry if something other than the variable declarations looks a bit strange to you. You will see the rest in detail in the next sections.

Initialization of variables

When declaring a variable, its value is undetermined by default. But you may want a variable to store a concrete value from the moment it is declared. To do that, append an equal sign and the value you want to the variable declaration:
type identifier = initial_value ;
For example, if we want to declare an int variable called a that contains the value 0 from the moment it is declared, we could write:
int a = 0;

In addition to this C-like way of initializing variables, C++ adds a new way to initialize them by enclosing the initial value in parentheses ():

type identifier (initial_value) ;
For example:
int a (0);
Both ways are valid in C++.

Scope of variables

All the variables that we are going to use must have been previously declared. An important difference between C++ and traditional C is that in C++ we can declare variables at any point of the program, even between executable statements, and not only at the beginning of a block of instructions, as traditional C required.

Even so, it can be advisable to follow the C convention when declaring variables, since having all the declarations grouped together can be very useful when debugging a program. The traditional C-like way is therefore to declare variables at the beginning of each function (local variables) or directly in the body of the program, outside any function (global variables).

Global variables can be referred to anywhere in the code, even within functions, as long as it is after their declaration.

The scope of local variables is limited to the code level in which they are declared. If they are declared at the beginning of a function (like in main), their scope is the whole function. This means that if, in the example above, another function existed in addition to main(), the local variables declared in main could not be used in the other function, and vice versa.

In addition to local and global scopes there is external scope, which makes a variable visible not only in the same source file but in all the other files that will be linked with it.

In C++ the scope of a variable is given by the block in which it is declared (a block is a group of instructions enclosed within curly braces {}). If it is declared within a function it will have function scope, if it is declared in a loop its scope will be only the loop, and so on.

Constants: Literals.

A constant is any expression that has a fixed value, like:

Integer Numbers

1776
707
-273
These are numerical constants that identify decimal integer numbers. Notice that to express a numerical constant we do not need to write quotes (") or any special symbol. There is no doubt that it is a constant: whenever we write 1776 in a program we are referring to the value 1776.

In addition to decimal numbers (those that all of us already know), C++ allows the use of octal numbers (radix 8) and hexadecimal numbers (radix 16) as literal constants. To express an octal number we must precede it with a 0 character (zero). To express a hexadecimal number we must precede it with the characters 0x. For example, the following literal constants are all equivalent to each other:

75         // decimal
0113       // octal
0x4b       // hexadecimal
All of them represent the same number, 75 (seventy-five), expressed as a decimal, octal and hexadecimal number, respectively.

[ Note: You can find more information on hexadecimal and octal representations in the document Numerical radixes]

Floating Point Numbers
They express numbers with decimals and/or an exponent. They can include a decimal point, an e character (which means "times ten to the power of...") or both.

3.14159    // 3.14159
6.02e23    // 6.02 x 10^23
1.6e-19    // 1.6 x 10^-19
3.0        // 3.0
These are four valid numbers with decimals expressed in C++. The first number is pi, the second is Avogadro's number, the third is the electric charge of an electron (an extremely small number), all of them approximated, and the last is the number 3 expressed as a floating point literal.

Characters and strings
There also exist non-numerical constants, like:

'z'
'p'
"Hello world"
"How do you do?"
The first two expressions represent single characters, which are enclosed in single quotes ('). The following two represent strings of several characters, which are enclosed in double quotes (").

When writing both single characters and strings of characters as constants, the quotation marks are necessary to distinguish them from possible variable identifiers or reserved words. Notice this:

x
'x'
x refers to the variable x, whereas 'x' refers to the character constant 'x'.

Character and string constants have certain peculiarities, like escape codes. These are special characters that are difficult or impossible to express otherwise in the source code of a program, like newline (\n) or tab (\t). All of them are preceded by a backslash (\). Here is a list of such escape codes:

\n    newline
\r    carriage return
\t    tab
\v    vertical tab
\b    backspace
\f    form feed (page feed)
\a    alert (beep)
\'    single quote (')
\"    double quote (")
\?    question mark (?)
\\    backslash (\)
For example:
'\n'
'\t'
"Left \t Right"
"one\ntwo\nthree"
Additionally, you can express any character by its numerical ASCII code by writing a backslash (\) followed by the code in octal (radix 8) or hexadecimal (radix 16). In the first case the digits must immediately follow the backslash (for example \23 or \40); in the second, you must put an x character before the digits (for example \x20 or \x4A).
[Consult the document ASCII Code for more information about this type of escape code].

String constants can extend over more than one line of code if each line except the last ends with a backslash (\):

"string expressed in \
two lines"
You can also concatenate several string constants by separating them with one or more blank spaces, tabs, newlines or any other valid blank character:
"we form" "a unique" "string" "of characters"

Defined constants (#define)

You can define your own names for constants that you use quite often, without having to resort to variables, simply by using the #define preprocessor directive. This is its format:
#define identifier value
For example:
#define PI 3.14159265
#define NEWLINE '\n'
#define WIDTH 100
These define three new constants. Once they are defined, you can use them in the rest of the code like any other constant, for example:
circle = 2 * PI * r;
cout << NEWLINE;
In fact, the only thing the compiler does when it finds #define directives is to literally replace any occurrence of the defined names (in the previous example, PI, NEWLINE or WIDTH) with the code to which they have been defined (3.14159265, '\n' and 100, respectively). For this reason, #define constants are considered macro constants.

The #define directive is not a code instruction; it is a directive for the preprocessor. That is why it takes the whole line as the directive and does not require a semicolon (;) at the end. If you include a semicolon (;) at the end, it will also be inserted whenever the preprocessor substitutes an occurrence of the defined constant in the body of the program.

Declared constants (const)

With the const prefix you can declare constants with a specific type, just as you would do with a variable:
const int width = 100;
const char tab = '\t';
const int zip = 12440;
The type must always be specified. Some older compilers accepted a declaration without it (such as const zip = 12440;) and assumed type int, but standard C++ does not allow this.

© The C++ Resources Network, 2000 - All rights reserved
