CISC105: Reading Notes
Tan and D'Orazio, Chapter 1

by Phill Conrad, Asst. Professor,
CIS Dept, University of Delaware

The purpose of these reading notes is to help you get the most out of your reading in the textbook. You may want to have these available, either online, or in printed form, as you read the corresponding chapter in the textbook. These notes will

What to emphasize in Chapter 1

Crucial Material: The following sections are crucial to your success in this course. Material from these sections will be useful not only in this course, but also in later courses as well. It definitely be on assignments and exams.

Section 1.4: Using bits to represent information (the entire section).

Section 1.5: Programming Languages

Section 1.7.2: Functions

Section 1.8.2: C Compilers

Section 1.9: About this textbook (you wont be tested over this, but it is crucial to your success).

Important Material: The following sections contain concepts and terminology that are important enough that they might appear on a quiz or exam, but are not as important as the information in the "Crucial Material" list.

Section 1.1: History of Electronic Computers

Section 1.2: Architecture

Section 1.6: Software

Section 1.8: The C Language, Ansi C and C Compilers (entire section, except 1.8.2, which is crucial)

Other Material: The following sections are good background, but will likely not appear on the exam.

1.3: Networks

1.7: Software Engineering (except 1.7.2, which is crucial)

Chapter 1, Tan and D'Orazio (1999)

1.1 History of Electronic Computers

This section covers the basic history of computers, and the idea of a stored program computer.

The main idea I want you to get from this section is that unlike the earliest computers, modern computers don't have to be physically rewired to solve different problems. Instead, computers store in their memory both data and instructions. Instructions tell the computer what problem to solve, and how to solve it.

The computer scientist credited with this development is John von Neumann. That is a name worth remembering, because in discussions of computers, you will sometimes encounter the term "von Neumann architecture".

Figure 1.1a and 1.1b are meant to illustrate the idea that you can just "change the instructions and get a different result". I'm not sure how well the diagram really illustrates that point, so if it doesn't make that much sense, don't worry about it; here's an alternative explanation.

As an example of the von Neuman principle, consider two different problems:

Here is a sequence of instructions to solve problem 1: multiply by 9, divide by 5, add 32. For example, consider the boiling point of water: 100 degrees Celsius. When we multiply 100 by 9, we get 900. We then divide by 5, and get 180. Finally, add 32, and we get 212, which is the correct boiling point of water in Farenheit. We can abbreviate that program as: {MUL 9; DIV 5; ADD 32;}.

A sequence of instructions to solve the second problem would be subtract 32, multiply by 5, divide by 9, or {SUB 32; MUL 5; DIV 9;}.

The word program simply refers to a sequence of such instructions. So, {MUL 9; DIV 5; ADD 32} could be said to be a program that solves problem 1, while {SUB 32; MUL 5; DIV 9} is a program to solve problem 2. Programming is the task of developing sequences of instructions to solve particular problems.

The von Neumann concept indicates that we can store both the data (e.g. a temperature to be converted, such as 72) and the instructions (MUL 9; DIV 5; ADD 32) in the memory of a computer. To change which problem the computer solves, we just change which instructions are stored in the memory of the computer.

Most of the rest of this chapter is just general background. You don't need to memorize the details of "supercomputer" vs. "mainframe" vs. "mini", etc.. That is stuff you probably should know, but I'm not going to test you on it.

1.2 Architecture

Section 1.2 covers some basic terminology that we need in order to speak about computers, and especially the hardware of a computer, i.e., its physical parts.

After reading through Section 1.2 in its entirety, you should understand the role of the CPU, main memory, Arithmetic-logic unit (ALU), Control Unit, Bus, Peripheral Devices, and Controllers (that is, everything illustrated in Figure 1.2).

1.2.1 Main Memory

This section discusses the idea of a "bit": the most basic unit of information. A bit can have one of two states, which we represent as either a zero or a one.

The figure 1.3a and 1.3b are intended to show that you can both "store" information into a bit, and "read" information from a bit. If this figure helps you understand that, then great; if not, don't worry about it.

What's more important to get out of this section is that bits can be grouped into cells or words. If the cell has 8 bits, it is typically called a byte.

Your book doesn't tell you this, but a word is typically the amount of data that a particular computer processes "all at once". On most computers around the year 2004 (when I am writing this), the word size is 32 bits. But in fact, the size of a word depends on the particular machine architecture. In 2004, 64 bit machines are just starting to be generally available, and will probably become more available over the next few years. Some special purpose machines (including some machines for graphics and computer gaming) have a word size of 128 bits.

The word cell is a more generic word for a group of bits in memory, and has no particular size associated with it.

Another thing you should know, that your book doesn't tell you: half of a byte is a nibble. So a nibble is 4 bits.

The explanation of figure 1.4 is important, so be sure you understand it.

Another important fact: the prefixes Kilo, Mega and Giga have different meanings when applied to memory than their normal meanings. Normally:

These prefixes have their normal meaning when applied to the speed of computer networks; e.g. a 100 Mbps Ethernet card operates at 100 million bits per second.

However, when applied to computer memory (whether that memory is on RAM or on disk drives), the meaning of these terms changes:

Also: know the terms: RAM, ROM, Bus

1.2.2 Central Processing Unit

Know the terminology of three parts of the CPU: registers, control unit, arithmetic-logic unit (ALU). A key concept: the control unit is the part of the computer that steps through sequential instructions and causes those instructions to be carried out and also moves data and instructions between the CPU and main memory. The ALU is the part that actually has the circuits for carrying out instructions like adding, subtracting, comparing, etc.

1.2.3 Peripheral Devices

Know the two major categories of peripherals: mass storage and input/output (i/o).

Mass storage devices

Mass storage devices are a type of memory: they store information in the form of bits. However they differ from main memory in four important ways outlined in this section; be sure you can identify all four!

Input-output (I/O) devices

Your text lists some common input/output devices. Note that special purpose computers (e.g. the computer that controls a microwave oven, or a cell phone) may have special kinds of I/O devices; be sure you can also identify this kind of I/O device.

1.2.4 Controllers and Communication to Peripheral Devices

Key idea: controllers for peripheral devices (e.g. the controller on a disk drive) are "minature computers within themselves". Your text gives an example of a printer, which may not be able to print as fast as it can receive information. The computer inside the printer (the printer controller) has a program in it with instructions to handle buffering and flow control so that the printer can do its job correctly, and communicate with the computer correctly.

1.3 Networks

This section is included in your book because the textbook authors (Tan and D'Orazio) thought it might be important for you to understand a bit about networks, given that you'll "likely be using a network in doing your classwork". The information here is interesting, and good for you to know, but I wont be testing you over it.

A side note: If you are a Computer Science major at UD, you can learn much more about Networks in CISC450, a course you can take in your junior or senior year. The Electrical Engineering department and MIS Departments also have courses in Computer Networks. The University of Delaware well-known internationally for expertise in Computer Networks, so this is a a great opportunity.

1.4 Using Bits to Represent Characters and Symbols, Integers, Real Numbers, Addresses, and Instructions

This is perhaps the most crucial section in this chapter, and the one where I recommend you spend the most study time.

1.4.1 Characters and Symbols

Key concept: ASCII and EBCDIC are two different codes for representing characters as bits. Each character has its own eight bit representation, including each capital letter, each lowercase letter, each digit 0 through 9, and each punctuation mark, and even the blank space.

You should know that both ASCII and EBCDIC are used, but we will concentrate on ASCII in this course.

There is one other important code that your book does not mention: Unicode is a newer code that uses 16 bits per character; the advantage of Unicode is that it can represent a much larger set of characters: (65,536 to be exact), including not only the letters of the English alphabet, but the accented characters in languages such as Spanish, French and German, the alphabets of Greek, Russian, Arabic, Hebrew and Korean, plus many characters used in languages such as Japanese and Chinese. Many other languages can also be respresented with Unicode. Unicode is often used in conjunction with Java and certain Web technologies. We will not use Unicode in this course, but it is good for you to know that it exists, since you may encounter it in later courses, or later in your career.

Note that the capital letter A is represented by the bit pattern 01000001 in ASCII. As you will see after you review section 1.4.2, the bit pattern 01000001 corresponds to the number 65 in decimal. You should memorize that the capital letter A corresponds to the decimal number 65. With this piece of knowledge, you can quickly calculate the ASCII value of other capital letters by counting: 'A'=65, 'B'=66, 'C'=67, etc. all the way through to 'Z'=90.

Similarly you should also memorize three other decimal ASCII values:

You can also determine an ASCII value by consulting the table that starts on p. 736 of your textbook (Appendix B).

1.4.2 Integers

Read the material in this section carefully. After reading it, you should be able to convert any integer between decimal and binary. As a practial matter, you should be able to do the following without a calculator (i.e. with only pencil and paper):

Your book explains that the representation illustrated on p 13 of your book is not exactly what is used in most computers to represent integers. The book goes on to explain that the scheme more often used is called "two's complement". What your book doesn't tell you (but I will now) is that the scheme illustrated on p. 13 is called "one's complement".

Here's the difference between one's complement and two's complement.

Consider four numbers:

Positive seven is represented in eight bits as 00000111, that is 4+2+1. To make the eight bits easier to read, we often put a space between between the first group of four bits and the second group of four bits (i.e. between the upper nibble and the lower nibble), like this: 0000 0111. So positive eight is represented as 0000 1000. The fact that the first bit is zero indicates that the number is positive.

Your textbook explains that to represent a negative seven, we could change the first bit (the sign bit) to a 1, representing that the number is negative. This representation is called sign-magnitude. In sign magnitude, negative seven is 1000 0111, and negative eight would be 1000 1000.

However, the representation normally used in most modern computers is two's-complement. Two's complement forms a negative by "flipping all the bits" and then adding one. That is, if a positive seven is 0000 0111, we flip all the bits and get 1111 1000, then add one and get 1111 1001 to represent negative seven. Similarly, positive eight is 0000 1000, so negative eight is obtained by first flipping all the bits to get 1111 0111, and then adding one yields 1111 1000.

Two's complement has several advantages over sign magnitude, some of which are best explained in terms of hardware and circuit design. We wont go into the hardware aspects in this course. But one advantage is easy to explain.

In sign magnitude, there are two representations for zero: 0000 0000 and 1000 0000. The first corresponds to "positive zero", while the second corresponds to "negative zero". Of course, positive zero and negative zero are the same number, so this is a waste.

By contrast, in two's complement, in eight bits, we can represent positive numbers up to 0111 1111, which is 0 to +127, and negative numbers from -1 to -128. Similarly, with 16 bits or 32 bits, we can represent one more negative number than we can positive numbers, if a two's complement representation is used. This is shown in the table below; we'll explore this further with a C program later in the semester. These ranges are facts you should memorize. (Actually, you don't need to memorize all the digits of the ranges for 32 bits, but you do need to know "about 2.1 billion" or "about 4.2 billion".) Better yet, you should memorize WHY these are the ranges, because then you'll be able to recreate this table anytime you need it.

Two complement representation ranges
Number of bits
Range of positive integers that can be represented
Range of negative integers that can be represented
Range of unsigned integers that can be represented
8
0 to
+127
-1 to
-128
0 to
255
16
0 to
+32767
-1 to
-32768
0 to
65535
32
0 to

-2,147,483,647
(a bit more than two billion)

-1 to
-2,147,483,648
(a bit less than negative two billion)
0 to 4,294,967,295
(about 4.2 billion)

1.4.3 Real Numbers

After reading and understand this section, and know how to convert from binary with a radix point (e.g. 100.10110) to decimal (5.6875). You should also understand the concepts of base, exponent and mantissa.

One thing not covered in detail in this section is that there are some real numbers that do not have an exact representation in binary. This is also true in decimal. Consider, for example the number 1/3 (one third). In decimal, this numbers is represented as 0.3333... the sequence of threes goes on forever. There is no exact representation with a fixed number of decimal digits; any representation to, for example, 10 decimal places, (or for that matter, to a thousand decimal places) is only an approximation of the true value of one-third. So, anytime we use a representation for real numbers that uses a fixed number of bits in a computer, we may introduce some error. This fact will be important to us later in the semester.

1.4.4 Hexadecimal and Octal Notation

After reading and understanding this chapter, you should be able to convert easily between binary, octal, and hexadecimal representations. (Hexadecimal is often just called hex.) These are both crucial skills for later in the semester. Octal will be important in setting file permission in the Unix operating system. Hexadecimal is important for working with addresses in computer memory. (It also comes in handy if you do anything with computer networks later in your career, since Ethernet addresses are written in hex.)

1.4.5 Addresses

Section 1.4.5 simply makes the point that a memory cell can contain the address of another memory cell. This may seem like a trivial point, but it actually ends up being one of the most powerful ideas in computer science. It is the basis of an idea called "pointers". Pointers which are the glue that hold together almost all complex software systems. So make sure you understand the point.

One observation: figure 1.6 has one misleading aspect. Note that the cells are labelled with their addresses: A1, A2, etc. up through A9, then B1 through B9, and C1 through C9. Actually, the diagram would be more accurate if it showed addresses ranging as in the following table. Note the the addresses range from A0 through AF, then B0 through BF, then C0 through CF. This reflects the fact that the range of a hexadecimal digit is 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F with A through F representing the values 10 through 15.

An improved version of Figure 1.6
cell contents Address cell contents Address cell contents Address
00001000
A0
01000001
B0
11000011
C0
10010110
A1
10110101
B1
10011101
C1
10010101
A2
10011101
B2
10001101
C2
00110101
A3
10100101
B3
00010101
C3
11010101
A4
10010101
B4
10010000
C4
10100101
A5
01000111
B5
10000001
C5
10000101
A6
10111111
B6
10001010
C6
10110101
A7
10000101
B7
10000000
C7
10011111
A8
00000101
B8
10000101
C8
00101000
A9
01111100
B9
00010000
C9
00000000
AA
00000010
BA
00001100
CA
11110000
AB
00001100
BB
00001001
CB
00101100
AC
01010011
BC
01001001
CC
01010100
AD
01010100
BD
00000101
CD
00011010
AE
11111000
BE
00010001
CE
00011111
AF
11111111
BF
00100000
CF

1.4.6 Instructions

This section notes that instructions can be stored in binary as well. For example, the program {MUL 9; DIV 5; ADD 32;} that we developed in the reading notes for section 1.1 might be represented in binary somehow, and stored in consective memory cells. For example, look back for a moment at the the table in reading notes for the previous section, that is my improved version of Figure 1.6, and look at memory cells CA through CF. These values are shown again in the table below, along with an explanation of what the values might represent. The values in CB, CD and CF are 00001001, 00000101, and 00100000, which correspond to 9, 5 and 32 in binary (study section 1.4.2 of your text for an explanation). The values in cells CA, CC and CE are (), (), and (); these values might correspond to the instructions MUL (multiply), DIV (divide) and ADD. Thus, a sequence of binary values can represent a program. Some of the cells contain binary codes for the instructions, while others contain binary codes for the data. The control unit of the CPU has to be told where the first instruction in a program starts. From there, it is designed read the necessary amount of data (depending on what the instruction code is), signal the ALU to carry out the instruction, then look for the next instruction after the data. In this way, the control unit moves from instruction to instruction..

1.4.7 Comment

The comment in Section 1.4.7 is important. I have nothing to add except: read it!

1.5 Programming Languages

The introduction to section 1.5 mentions machine language, and notes that this is the only language a computer can understand. Here are a few points to add to what the textbook has to say about machine language.

In the example I gave in the reading notes for Section 1.4.6, the binary codes for the MUL, DIV and ADD instructions would be an example of machine language. Machine language is a series of ones and zeros that cause the electricity on the control unit and ALU in the CPU to flow in such as way as to add, or multiply, or divide, or whatever needs to be done, and store the result in a particular place.

Because machine languge instructions are carried out at a physical level on the computer chip, each model computer chip has its own machine language. The machine language has to be designed right along with the hardware design of the chip.

As a specific example, consider the table of computers below. Each of these computers has a different model CPU chip, from a different manufacturer. Therefore, each of these machines has a different machine language. A computer program in binary that is designed to run on one machine (say, the Dell), will not run on one of the other machines (e.g., the Apple Mac G5).

The first two machines in this table are probably familiar to you: a Dell and an Apple that are fairly up-to-date. The third one is a Sun Microsystems computer that happens to the same model as the computers known as Copland and Strass on the University of Delaware campus. The fourth one is a very old model of computer that I included in the table just to have a fourth chip manufacturer represented, and to illustrate that even two computers with the same brand (Apple, in this case) can have CPU chips that come from different manufacturers and have different machine languages.

Computer From year Chip Manufacturer Speed Chip
Dell Desktop 2004 Intel 3.6Ghz Pentium 4
Apple Mac G5 2004 IBM 2.6Ghz PowerPC G5
Sun Fire 6800 Domain
2004 Sun 750Mhz Sparc III
Apple Mac Quadra 900 1991 Motorola 25Mhz 68040

The speed calculations in the table for each chip are given in Megahertz, or Gigahertz. A Hertz is a "cycle per second. If a computer operates at 800Mhz, it means that it executes 800 million cycles per second.

What is a cycle? Well roughly speaking, a cycle means that the CPU carries out the following steps: (we are leaving out some details here that you can learn in later courses):

  1. the control unit grabs the current machine language instruction from main memory.
  2. the control unit decodes the instruction, and gets any data needed from main memory.
  3. the control unit causes the ALU to carry out the instruction
  4. the control unit moves to the next machine language instruction to be executed.

Because these four steps are repeated over and over again, this is referred to as the instruction cycle. Roughly speaking, the speed measurement of the CPU indicates how many of these cycles can be completed per second.

However, speed measurements of different CPUs cannot always be compared fairly. This is because on one machine, a given instruction may carry out a lot of work (e.g. multiplying several number by a single number), while on another machine, the instructions might carry out the work in smaller units (multiplying only one number at a time), requiring more instructions to carry out the same amount of work. Therefore, a computer that operates at 2.4Ghz may be faster than one that operates at 3.6Ghz, depending on the task to be performed. There are many other factors that affect computer performance as well, as you may learn in later courses, such as Computer Architecture, and Operating Systems.

If you have use a computer before, you may have encountered special files on your computers hard disk called "executables". Executables are simply computer files that contain binary machine language code for the machine you are running the program on. The word execute means to carry out the instructions in the program, or run the program.

As your book points out, machine language is "cumbersome and tedious to write". Typically, programs seldom ever right code directly in machine language; instead they use either Assembly language, or a High-Level Language. Those are described in the next section.

1.5.1 Assembly Language

A couple of things to add to what your textbook says here:

The {MUL 9; DIV 5; ADD 32;} program for Celsius to Farenheit conversion that we demonstrated earlier gives you a feeling for what assembly language is like. Typically, in assembly language, we can only do one simple operation at a time. Also, it is possible to take each separate instruction in assembly language, and turn it into a specific instruction in binary (machine language) in a rather straightforward way. Finally, it is typically the case that a single assembly language instruction corresponds directly to a single machine language instruction. Thus, each computer chip has its own specific assembly language. For each chip that you want to work with, you will have to learn a different assembly language.

Because assembly language is cumbersome and machine specific, programmers very seldom use it these days. It is typically used only to do very small parts of an overall system that need to do be very fast, or are very specific to a particular piece of hardware. Two examples of where you might still encounter assembly language programming are video game programming, and programming of device drivers (the software that interfaces with external hardware, e.g. network cards, graphics boards, hard disk controllers, etc.).

1.5.2 High Level Languages

Nearly all programming is now done in high-level languages. Here are a few notes to add to what your textbook says about high-level languages.

First, your book indicates four types of high-level languages: procedural (imperative), functional, declarative, and object-oriented. This list is not one that you need to memorize for the exam; I will not test you over this material. Nor do you need to memorize the contents of Table 1.3; this list of languages is here mainly to give you some examples.

A couple of languages that didn't make the list that you may have heard of are Java, which is an object-oriented language, and Perl, which is a scripting language (a language type that didn't make your textbook's list!)

The main thing you should get from section 1.5.2 is the concept of what a high-level language is: namely, a way to write programs that is more abstracted from the low-level details of the hardware. Programs in high-level languages are more likely to be portable, which means that the program can be run on more than one type of computer. The reason for this is that high-level language programs can be translated (via computer software) into the machine language of many different types of computer. The piece of software that does this translation is called a compiler. You will use the C compiler in almost every homework assignment and project that you do in this class—in fact, you will use the C compiler hundreds if not thousands of times over the course of the semester. So I hope it has been worthwhile learning a bit about what is actually taking place!

1.6 Software

This section starts with a discussion of the difference between software and hardware. That is an important distinction to remember.

The section then does on to describe two categories of software: system software and application software. That is a useful distinction to make, so it is worth reading through this section. However, if after reading the section, the distinction is still not entirely clear, don't worry; I think you will pick up the difference as the semester proceeds.

1.6.1 System Software

The book notes that system software includes operating systems, utility programs, and language translators. I don't want you to get the idea that these are three very clear distinct categories—in fact, the distinctions among these three categories can be a bit blurry. Some folks would claim that certain utility programs are in fact part of the operating system. Likewise, one could argue that a language translation program is a kind of utility program.

The remainder of this section is important for you to read and understand. However, don't worry too much about Figures 1.7 and 1.8, or the distinctions between single-user and multiple user operating systems. What I'm more concerned with is that you understand the sections on Utility programs an Language translators. In particular, you need to understand the following terms: editor, assembler, compiler.

Note that your text indicate that a compiler can do two things:

  1. Translate a program in a high-level language into machine instructions
  2. Link together different translated modules.

If you missed that in your reading, go back and read the section titled "Language Translators" again! It is an important point that you will see reflected in your homework assignments this semester.

1.6.2 Application Software

This section notes a few different types of application software. Don't bother memorizing the list of different types. The more important contribution of this section is to motivate the idea of "Software Engineering", which is discussed in Section 1.7.

1.7 Software Engineering

The introduction to section 1.7 is a good introduction to the idea of Software Engineering, as is Figure 1.9. You don't need to memorize this chart or the information in the intro to Section 1.7, but reading and understanding what is here will help you to understand a bit about my perspective in teaching this course. The way in which software is developed in the so-called "real world" is very different from what typically happens in most college programming courses. In particular, real world software developing includes lots of teamwork, integration of software developed by different programmers and organizations, testing. It involves continuous updating, changing, and documenting. Many of things I emphasize, such as good style and commenting, have to do with this perspective.

1.7.1 Top-down Modular Design, Structure Charts, and Data Flow Diagrams

This section, and Figures 1.10 and 1.11 are well-intentioned, but I'm not sure how effectively they communicate the ideas in the section title. Read through this material, but if it doesn't make much sense, don't sweat it; just move on.

1.7.2 Functions

This section, on the other hand, is important, because it lays the foundation for a very important idea: the idea of a function. An entire chapter of our textbook (Chapter 5) is devoted to functions, so this section is worth reading two or three times.

1.8 The C Language, ANSI C, and C Compilers

1.8.1 C and ANSI C

This section gives you an idea of where the C language came from, and what the "standard" is for the current version of C (namely, ANSI C). Read through this section; it has stuff that is good for you to know, but don't bother memorizing detail here.

1.8.2 C Compilers

This section, on the other hand, along with figure 1.12, is crucial to your success in this course. Section 1.8.2 contains a synopsis of some of the most important material for you to know before you tackle your first lab. Read it several times, and be sure you know the following terms:

Near the end of the section, your book suggests that your course instructor should be able to give you the information needed to utilize the C compiler made available to you. This will be the focus of our first lab, which you should be able to find on the course web site.

1.9 About This Textbook, and How to Get The Most Out Of It.

Do read through this section (the entire section) to familiarize yourself with how the textbook is structured.

Exercises

Exercises 1 through 6, and 10 are definitely fair game for Exam 1. I might assign them as homework. If I don't, I still recommend doing them as a way of studying for the Exam.

On the other hand, Exercises 7 through 10 are unlikely to appear on one of my Exams.

Application Exercises

The application exercises 1 and 2 are interesting. However, the task you are given, namely to create a "structure chart" and "data flow diagram" is a bit vague for this point in the semester.

Therefore, I would not suggest that you spend your time on the application exercises in this chapter.

That will NOT be the case in susbsequent chapters however. For example in Chapter 2, every single one of the application exercises is a worthwhile and useful way for you to spend your time (whether I assign them for credit or not!)

(end of reading notes for Chapter 1).

These reading notes are based on C Programming for Engineering and Computer Science, by H. H. Tan and T. B. D'Orazio, Copyright 1999 by WCB McGraw-Hill. The notes themselves are Copyright 2004, by Phillip T. Conrad, All Rights Reserved.

Prof. Conrad takes sole responsibility for any views or opinions expressed in these notes. Such view and opinions do necessarily reflect those of the University of Delaware, its Department of Computer and Information Sciences, or any other organization.