Bits & Bytes: Lesson 3

February 2007

The Data Game

What’s Your Type?

The amount of information the average person deals with on a daily basis is mind-boggling. You are inundated with news and events from television, radio, and the Web. You receive enough emails and snail-mail to overflow your inbox and mailbox on a weekly basis. Add to that weblogs, magazines, announcements, and telephone messages, and it is no wonder we sometimes feel like we are losing the battle of the data bulge. As overwhelming as this is, the amount of personal data generated and collected is minuscule compared with the production of data from companies and research institutions. Knowledge discovery and generation through research and business activities generate a doubling of stored data every nine months! Luckily, computers have come to the rescue for many of the tasks of handling all of this data. They are the perfect device for storing, organizing, and in general, managing the deluge.

To better appreciate how computers can process the data that we, as humans, just can’t get a handle on, it’s useful to understand data from the ground up and in the most elementary form—this is the form that computers can crunch on!

In lesson 2, we explored object-oriented design and programming. That’s a challenging concept to wrap your brain around. But actually, you have already learned quite a bit about computer data from that experience. You learned that an object is a complex data type used to store information about complex entities such as vehicle registrations or inventory items. In lesson 3, we are going to step back a bit, break the big ideas of representing and storing data into some very basic concepts, and explore the incredible power and flexibility of electronic data storage.

So let’s start at the beginning and discuss how data is electronically represented.

The Two Faces of Data

Electronic data can be represented, stored, and managed in 2 ways: it can be handled as analog data or as digital data. Analog data is a continuous flow of data. A traditional telephone conversation is typically transmitted as analog data. The sound waves fluctuate up and down in smooth waves, are transmitted as changing waves of voltage, and are then received by your telephone and converted back into sound waves. Old-fashioned tape recordings and vinyl records also store sound as analog data.

Analog - Data that is represented, transmitted, or stored in a continuous stream or wave format.

Contrast this to digital information such as sound stored on a CD. Sound stored digitally is broken into tiny time-segmented pieces, or samples, and each sample is stored as a number that represents the intensity of the sound wave at that instant. The data is digitized: turned into digits, or numbers.

Digital - Data that is represented, transmitted, or stored in discrete pieces with numerical values.
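The idea of digitizing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_wave`, the choice of a sine wave as the "sound," and the 256 storage levels are all hypothetical and are not how any particular CD or telephone system works.

```python
import math

def sample_wave(samples_per_cycle, levels=256):
    """Sample one cycle of a smooth wave and store each sample as a whole number."""
    digits = []
    for i in range(samples_per_cycle):
        # The analog value: a smooth wave that swings between -1.0 and 1.0.
        analog = math.sin(2 * math.pi * i / samples_per_cycle)
        # The digital value: shift to the 0..1 range, then scale to 0..levels-1.
        digits.append(round((analog + 1) / 2 * (levels - 1)))
    return digits

# Eight discrete numbers now stand in for one continuous wave.
print(sample_wave(8))
```

Taking more samples per cycle gives a closer copy of the original wave, at the cost of storing more numbers.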

Your computer is a digital machine. It represents data in small segments that can be represented by numbers. You have likely had a hint of this if you have ever closely observed the values representing colors as you customized the color of text in a Word document. Each hue of red, green, and blue can have 256 values (0 to 255). Many of the colors of the rainbow are possible by mixing these RGB (Red-Green-Blue) values.

Counting on Two Fingers

While you and I count using a decimal system with the numerals 0 to 9, computers use a binary system with the numerals of 0 and 1. While this is hard to imagine because we are so accustomed to the decimal system, it’s really a very efficient and simple method. It’s simple because computers are electrical devices composed of switches. These switches are either “on” or “off” creating obvious and totally unambiguous conditions. Think of a computer as a huge collection of millions of switches that are either on or off. Each of these switches is called a bit. It’s the combination of “on” or “off” switches called bits that creates the magic.

Bit - The basic electronic data storage element of an electronic switch with 2 states—“on” and “off.”

If we focus on just one of these bits we realize it can be “on” or “off”; “on” represents a “1” and “off” represents a “0” (zero). 1 and 0 are the 2 digits used to represent all numerical values in this base-2 number system. Generally speaking, as we need to represent greater and greater values we use more and more bits. Let’s look at the numerical value possibilities of having 2 bits.

[Figure: the 4 values that 2 bits can represent: 00, 01, 10, and 11]

Don’t fall into your comfort zone and call these representations zero, one, ten, and eleven. 10 is called “one-zero” and 11 is “one-one.”

With these 4 values or 2 bits we could represent the directions of north, south, east, and west or the stages of a 4-stroke engine.

As we add more bits we increase the number of values that can be represented. 3 bits can represent 8 values: 000, 001, 010, 011, 100, 101, 110, and 111. 4 bits can represent 16 values (2^4) and 5 bits can represent 32 values (2^5). You’re starting to see a pattern here. The more switches or bits, the greater the capacity to represent data.
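The doubling pattern can be checked with a short Python sketch. The helper name `bit_patterns` is made up for this illustration; it simply lists every on/off combination for a given number of bits.

```python
from itertools import product

def bit_patterns(bits):
    """Return every on/off pattern that the given number of bits can hold."""
    return ["".join(pattern) for pattern in product("01", repeat=bits)]

# Each extra bit doubles the number of values that can be represented.
for n in (2, 3, 4, 5, 8):
    print(n, "bits can represent", 2 ** n, "values")

print(bit_patterns(2))  # the four 2-bit patterns
print(bit_patterns(3))  # the eight 3-bit patterns
```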

Putting Bits to Use in Your Computer

The quantity of data your computer can store is represented in measurements of these units. Data is stored in computers in connected memory locations made up of 8 bits. These connected 8 bits are referred to as a byte. If you do the math you will see that 8 bits, or 1 byte, can represent 256 values (2^8). Now you understand the significance of the RGB color scheme of 256 values for each of red, green, and blue. The size of a document you are working on is measured in kilobytes (KB); a KB is approximately 1000 bytes. The storage capacity of your computer’s memory is usually expressed in megabytes of approximately 1,000,000 bytes. The storage of your hard drive is expressed in gigabytes of approximately 1,000,000,000 bytes. The actual number of bytes in each of these measurements can be seen in the chart below.

Byte - An electronic storage unit comprised of 8 connected memory locations called bits which can represent 256 values.

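<table>
<thead>
<tr class="header">
<th><p>Unit</p></th>
<th><p>Symbol</p></th>
<th><p>Number of Bytes</p></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><p>byte</p></td>
<td><p>B</p></td>
<td><p>2^0 = 1</p></td>
</tr>
<tr class="even">
<td><p>kilobyte</p></td>
<td><p>KB</p></td>
<td><p>2^10 = 1,024</p></td>
</tr>
<tr class="odd">
<td><p>megabyte</p></td>
<td><p>MB</p></td>
<td><p>2^20 = 1,048,576</p></td>
</tr>
<tr class="even">
<td><p>gigabyte</p></td>
<td><p>GB</p></td>
<td><p>2^30 = 1,073,741,824</p></td>
</tr>
</tbody>
</table>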

Everything in the digital world is measured in bits and bytes which are the building blocks for data & code – hence the name of this learning series: “Bits & Bytes”!
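As a rough sketch, the unit sizes from the chart can be turned into a small Python lookup. The names `UNITS` and `to_bytes` are hypothetical, chosen only for this example.

```python
# Number of bytes in each unit, using the powers of 2 from the chart.
UNITS = {
    "byte": 2 ** 0,
    "kilobyte": 2 ** 10,
    "megabyte": 2 ** 20,
    "gigabyte": 2 ** 30,
}

def to_bytes(amount, unit):
    """Convert an amount expressed in the given unit into a count of bytes."""
    return amount * UNITS[unit]

print(to_bytes(1, "kilobyte"))   # bytes in 1 KB
print(to_bytes(1, "gigabyte"))   # bytes in 1 GB
```

Notice that each unit is 1,024 times the one before it, which is why "approximately 1000" is only a convenient rounding.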

Get Real

  1. Open Microsoft Word. Type a few lines of text. Highlight a section and choose to change the color by using the pull down menu in the Formatting menu—A. Select “More Colors.” Select the “Custom” tab. In this dialog box, be sure RGB is selected as the format. Experiment with the up and down arrows to the right of Red, Green, and Blue and the slider on the right of the color display to lighten or darken the hue. Observe as the RGB values change for each hue. You can also enter any value from 0 to 255 for each hue and observe the changes in the color display. How does the color change as you increase the value of any given hue? What happens when you assign 255 to all three hues? What happens when you assign 0 to all three hues? How do you get the purest blue you can imagine?

  2. Think of another place where you may have encountered this magic number 256.

  3. Refer to the advertisements for computers in any newspaper. Record the memory capacity of various hardware devices expressed in KB, MB, or GB.

Here are some possible answers:

  1. As the value for any hue is increased, that hue becomes brighter and more intense. If each RGB value is set to 255, the resulting color is white. If each RGB value is set to 0, the resulting color is black. To get the purest color of any hue, set its value to 255 and the values of the other 2 colors to 0; the purest blue is red 0, green 0, and blue 255.

  2. You might have seen or heard the number 256 in reference to 256-bit wireless encryption standards, the 256 characters of the extended ASCII character set, or memory upgrade units. Perhaps you recall seeing the number 512 in reference to computer processors or hardware. The significance of 512 is that it is 2 × 256!

  3. In today’s newspaper, I found an advertisement for a 256 MB memory card, a 512 MB thumb drive, a 30 GB media player, and a 160 GB hard drive. KB generally represents document measurements, not hardware. On my hard drive I found a 14 KB graphic and a 176 KB Word document.

Storing Data

“So how does this apply to computer programming?” you ask. For as simple as a series of on-and-off switches sounds, there is obviously a great deal more to it. Programmers rely upon the capacity of computers to remember or store data as engineered through computer design. The creation of machines capable of storing and manipulating data is the work of computer and hardware engineers. Programmers depend upon the expertise of these engineers to create machines capable of responding to the commands written in programs. A program that can’t “remember” a series of values to be added, or the properties of a newly created vehicle registration object, is useless! It is this ability to store data that gives computers real power. In order to store data in a computer, the programmer must follow the rules, or syntax, of a specific computer programming language that dictates how memory space is allocated and how the data is stored.

Certainly in your own experiences with computers, you have discovered that “exactness” is the name of the game. For all of their power, computers must be told exactly what to do; they are not particularly good at assuming much of anything. In dealing with data, a computer must be told more than just what data to store, but also how and where to store it. Thanks to modern programming languages, much of the behind-the-scenes details of bits and bytes are taken care of. However, programmers still need to give some very specific directions to the computer about the type of data to store and how to arrange it. In the rest of this lesson we will look at some of the details programmers must communicate to the computer in terms of the type of data to be stored and how to arrange this data for use by computer programs.

Numerical Values as a type of data

Numerical values constitute one form of data. They can come in various “sizes” and with a variety of “exactness.” One size doesn’t fit every computing need and one degree of exactness doesn’t fit every statistical situation. Let’s analyze the concept of “size” first.

Think about the numbers needed to represent these situations:

  • Degrees on your home thermostat or number of stars in the heavens

  • Dollars allocated in the national budget or given for your child’s allowance

These two examples illustrate the property of numbers referred to as a “range.” A range can be thought of as the “size” of a value: how small or big it might be. The numbers on a thermostat expressed as the degree markers have a range from 50 to 100, and the numerical values required to count the stars in the heavens range from 0 to 1,000,000,000,000,000,000,000. The range of the numerical values needed in a computer program to make calculations for solving a problem determines the amount of memory, or bytes, required to store the values.

Contrast these two values:

  • The exact value in your piggy bank or the exact value of e, the base of the natural logarithm (2.718281828459)

The coins in your piggy bank can likely be expressed with about 5 digits of precision. If you count $327.76, 5 digits are required to express the exact amount. The value of e, the base of the natural logarithm (a value used in calculus), requires 13 or more digits of precision. Again, the precision required of the numerical values needed in a computer program also determines the amount of memory, or bytes, required to store the values.

Both range and precision are important considerations in planning for the accuracy of the data you wish to represent in a computer. The range and precision are also important from an efficiency perspective; the greater the range and the greater the precision, the more memory in bytes is required to store and process the values. It makes no more sense to allocate more memory than is needed to solve the problem than it does to allocate the entire garage to store your tricycle!

To make it easier for programmers (and the rest of us), specific data types have been defined that take these details into consideration. As programmers design and write programs, they select from these types based upon the range and accuracy needed for the problem at hand and they indicate the chosen type to the computer through program statements. The data type chosen determines the amount of memory used by the computer to solve the problem.

Type - A designation used to describe the kinds of values to be stored in a computer. The data type determines the amount of storage required.

Putting Data to work

Let’s imagine that a computer programmer is designing an application for children to practice their arithmetic skills. The programmer wants these tasks to be performed by the “Math Bam” game:

  • Ask for the child’s name

  • Present easy addition and subtraction problems

  • Reward the child for correct answers with praise

  • Keep track of the correct answers

  • Report the percentage of correct answers

  • Give the child a letter grade for their success

There are many computing steps that would eventually need to be programmed to create the game in this example, but we are going to forget about that for now and concentrate on the information, or data that needs to be represented and stored in this program.

Let’s start with the “numbers” that must be stored.

Here is a typical problem for the “Math Bam” game: 3 + 2 = ?

Integer

The values in this problem, 2, 3 and 5, are described as “whole” numbers in common language, but technically they are called “integers.” Integers can be positive (greater than 0) or negative (less than 0) or 0 itself. The range depends on how much storage is used: a 2-byte integer ranges from -32,768 to 32,767, while a 4-byte integer, the typical size, ranges from -2,147,483,648 to 2,147,483,647.
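The link between bytes of storage and range can be worked out directly. The sketch below assumes the common arrangement in which half of the bit patterns are used for negative numbers; the function name `signed_range` is hypothetical.

```python
def signed_range(num_bytes):
    """Return the smallest and largest integers that fit in num_bytes of storage."""
    bits = 8 * num_bytes
    # Half of the 2**bits patterns go to negative numbers, half to 0 and up.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

print(signed_range(2))  # range of a 2-byte integer
print(signed_range(4))  # range of a 4-byte integer
```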

Double

Calculating the percentage of correct answers might result in decimal, or floating-point, values such as 2.5, 33.33333 or 16.04. These decimal numbers, which are called doubles in tech-speak, can store values up to about 1.8e308 (1.8 × 10^308) and occupy 8 bytes of memory.
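Python's built-in `float` is a double on the usual desktop platforms, so it can be used to peek at these limits. This is a sketch for illustration only; the helper `percent_correct` and the sample scores are made up for the “Math Bam” example.

```python
import struct
import sys

# Size in bytes of a double on this machine (8 on common platforms).
DOUBLE_SIZE = struct.calcsize("d")

# The largest value a double can hold, roughly 1.8e308.
LARGEST_DOUBLE = sys.float_info.max

def percent_correct(correct, total):
    """Return the percentage of correct answers as a floating-point value."""
    return 100 * correct / total

print(DOUBLE_SIZE, LARGEST_DOUBLE)
print(percent_correct(1, 3))  # a value with many decimal places
```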

Boolean

Storing the result of each problem in terms of “right” or “wrong” would require another powerful and efficient data type called Boolean. A Boolean variable holds one of only 2 values, true or false. Boolean values are typically used in programs that store conditions such as on/off, yes/no, or true/false. In principle they require only 1 bit of memory, although many programming languages set aside at least 1 full byte for each Boolean.

Character

Storing the letter-grade score in “Math Bam” uses another data type. You recall that all data is stored in a numerical form in the computer; this is true for all kinds of non-number data also. Letters, or characters as they are called in technology, are stored as numbers. There is a numerical value for every letter of the alphabet, both upper and lower case, as well as every keyboard symbol such as ‘!’ or ‘&’. The numerical values used to represent the characters from ‘A’ to ‘Z’ range from 65 to 90. The values used to represent the characters from ‘a’ to ‘z’ range from 97 to 122. There are even numerical representations for invisible characters such as tab (value 9) and enter (value 13). Perhaps you have heard of ASCII code (American Standard Code for Information Interchange). This code contains the universally accepted values for all characters created by keyboards. A newer, more comprehensive coding system called Unicode has been created to provide a unique number for every character, on any platform, in any program, and in any language.
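The character codes described above can be looked up with Python's built-in `ord` and `chr` functions. The helper names `to_codes` and `from_codes` are hypothetical, introduced only for this sketch.

```python
def to_codes(text):
    """Return the numerical code stored for each character in the text."""
    return [ord(ch) for ch in text]

def from_codes(codes):
    """Rebuild the text from a list of numerical character codes."""
    return "".join(chr(code) for code in codes)

print(to_codes("AZ"))          # codes for the first and last capital letters
print(to_codes("az"))          # codes for the first and last lowercase letters
print(ord("\t"), ord("\r"))    # codes for the invisible tab and enter characters
print(from_codes([66]))        # the letter grade stored as the number 66
```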

String

There is one more data type needed by the “Math Bam” game. The child’s name must be stored. The data type that stores words is called a String – a string of characters. This data type is more complex because it uses a series of bytes held sequentially in the computer memory. A String is actually an object in modern, object-oriented languages. Strings have properties such as their length and methods for manipulating the string such as ToUpper or ToLower, which are used to change the case of the characters in the string.

Sometimes what looks like a number is really a String. A telephone number such as 123-4567 is a String. The classification of a series of digits as a value is reserved for such combinations of digits that would be treated as a value for addition, division, and other mathematical operations. Since there would be no valid reason to add telephone numbers together and expect a meaningful result, they, and other identifier-type combinations of digits such as social security numbers, should be classified as Strings. These “number-look-alikes” are really just labels for telephone connections or citizens.
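Here is a brief Python sketch of these String ideas. Python spells the case-changing methods `upper` and `lower` rather than ToUpper and ToLower, and the sample name is only an example value.

```python
student_name = "Lee Singh"

# A string knows its own length and offers methods that change its case.
name_length = len(student_name)
shouted = student_name.upper()
whispered = student_name.lower()

# A telephone number is a label, not a quantity, so it is stored as a string.
phone = "123-4567"
digits_only = phone.replace("-", "")

print(name_length, shouted, whispered)
print(phone, digits_only)
```

Because `phone` is a string, the hyphen is kept exactly as typed, something a numerical type could not do.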

There are other specialized data types including the Date type which is obviously used to store a calendar date. Different programming languages support unique sets of complex data types in addition to the basic types described above.

In fact, if you remember from Lesson 2, creating a class in object-oriented programming creates a new user-defined data type which can be customized to represent complex forms of data. When a class is defined, the specific properties used to describe the objects to be created are generally specified as one of the data types discussed here. In Lesson 2 of this series, we designed a Pet class for a program to be used in managing a pet show. Each of the properties of a Pet would reference data of one of the types discussed above. The chart below shows possible properties of a Pet class and the type of data to be stored for each property.

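<table>
<thead>
<tr class="header">
<th><p>Property</p></th>
<th><p>Data type</p></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><p>Pet name</p></td>
<td><p>String</p></td>
</tr>
<tr class="even">
<td><p>Birth date</p></td>
<td><p>Date</p></td>
</tr>
<tr class="odd">
<td><p>Weight</p></td>
<td><p>integer</p></td>
</tr>
<tr class="even">
<td><p>Owner</p></td>
<td><p>String</p></td>
</tr>
<tr class="odd">
<td><p>Prize money</p></td>
<td><p>double</p></td>
</tr>
</tbody>
</table>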
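The Pet chart could be sketched as a Python class along these lines. The attribute names and every sample value below are made up for illustration; Python's `str`, `int`, and `float` stand in for String, integer, and double.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Pet:
    """A user-defined type whose properties each use one of the basic data types."""
    name: str           # String
    birth_date: date    # Date
    weight: int         # integer
    owner: str          # String
    prize_money: float  # double

# Hypothetical sample pet, used only to show the types in action.
pet = Pet(name="Rex", birth_date=date(2004, 5, 17), weight=42,
          owner="Kim Lee", prize_money=25.0)
print(pet)
```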

Types Dictate Operations

Selecting a data type that fits the values used in a program reflects more than just range and accuracy. Each data type supports various operations. Integers and doubles are number values which support typical number operations such as addition and subtraction. Each type however, has some unique operations which offer additional power to knowledgeable programmers.

For instance, you would likely expect that a computer programmed to solve the math expression 5 / 2 would yield 2.5. This result is only accurate if the values 5 and 2 are of type double – meaning decimal. If 5 and 2 are described in the program as being of type integer – whole numbers, the result is 2 (since the integer type does not allow decimals). While many operations for both integers and doubles result in expected values, there are a few exceptions that can take the programming novice by surprise.

Other data types also make unique use of common operators. For instance, if the + operator is used between strings the result is a combining of the strings like this:

bat + man = batman

That is very different than the use of the + operator with integers or doubles:

7.3 + 1.2 = 8.5
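These operations can be tried out in Python, with one caveat: Python chooses the kind of division by the operator (`/` or `//`) rather than by the declared type of the values, so this is a sketch of the idea and not a copy of how every language behaves.

```python
# Division: the kind of division decides whether the fraction survives.
decimal_result = 5 / 2    # floating-point division keeps the fraction
whole_result = 5 // 2     # integer division drops the fraction

# The + operator joins strings but adds numbers.
joined = "bat" + "man"
total = 7.3 + 1.2

print(decimal_result, whole_result)
print(joined, round(total, 1))
```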

Get Real

  1. Select the best data type to store the values likely to be used in these situations:

    1. Grocery list

    2. Report card

    3. Restaurant bill

  2. Translate your name into ASCII code. Visit this Website for help.

    https://msdn.microsoft.com/library/default.asp?url=/library/en-us/vsintro7/html/_pluslang_ascii_character_codes_chart_1.asp

  3. Complete this chart identifying the appropriate data type to use for the properties in a vehicle registration object.

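<table>
<thead>
<tr class="header">
<th><p>Property</p></th>
<th><p>Value</p></th>
<th><p>Type</p></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><p>Vehicle Identification number</p></td>
<td><p>1F7KB54Y1WU7734</p></td>
<td></td>
</tr>
<tr class="even">
<td><p>Year of manufacture</p></td>
<td><p>2005</p></td>
<td></td>
</tr>
<tr class="odd">
<td><p>Owner’s first name</p></td>
<td><p>Kim</p></td>
<td></td>
</tr>
<tr class="even">
<td><p>Owner’s middle initial</p></td>
<td><p>J</p></td>
<td></td>
</tr>
<tr class="odd">
<td><p>Registration fee</p></td>
<td><p>378.50</p></td>
<td></td>
</tr>
</tbody>
</table>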

Here are some possible answers:

  1. A grocery list would likely include words such as milk, eggs, and bananas. They would be stored in variables of type String. A report card would likely include words such as the student name. It would be stored in a variable of type String. The letter grades earned would be stored in variables of type char. A restaurant bill would include the cost of chosen menu items in dollars and cents and it would likely be expressed as decimal numbers. This data would be stored in variables of type double.

  2. The name Marc Brown translated into ASCII code:

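    <table>
    <thead>
    <tr class="header">
    <th><p>Character</p></th>
    <th><p>ASCII</p></th>
    </tr>
    </thead>
    <tbody>
    <tr class="odd">
    <td><p>M</p></td>
    <td><p>77</p></td>
    </tr>
    <tr class="even">
    <td><p>a</p></td>
    <td><p>97</p></td>
    </tr>
    <tr class="odd">
    <td><p>r</p></td>
    <td><p>114</p></td>
    </tr>
    <tr class="even">
    <td><p>c</p></td>
    <td><p>99</p></td>
    </tr>
    <tr class="odd">
    <td><p>&lt;space&gt;</p></td>
    <td><p>32</p></td>
    </tr>
    <tr class="even">
    <td><p>B</p></td>
    <td><p>66</p></td>
    </tr>
    <tr class="odd">
    <td><p>r</p></td>
    <td><p>114</p></td>
    </tr>
    <tr class="even">
    <td><p>o</p></td>
    <td><p>111</p></td>
    </tr>
    <tr class="odd">
    <td><p>w</p></td>
    <td><p>119</p></td>
    </tr>
    <tr class="even">
    <td><p>n</p></td>
    <td><p>110</p></td>
    </tr>
    </tbody>
    </table>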

  3. Vehicle registration properties, values, and data types:

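    <table>
    <thead>
    <tr class="header">
    <th><p>Property</p></th>
    <th><p>Value</p></th>
    <th><p>Data Type</p></th>
    </tr>
    </thead>
    <tbody>
    <tr class="odd">
    <td><p>Vehicle Identification number</p></td>
    <td><p>1F7KB54Y1WU7734</p></td>
    <td><p>String (this is an identifier value, not a numerical value)</p></td>
    </tr>
    <tr class="even">
    <td><p>Year of manufacture</p></td>
    <td><p>2005</p></td>
    <td><p>integer</p></td>
    </tr>
    <tr class="odd">
    <td><p>Owner’s first name</p></td>
    <td><p>Kim</p></td>
    <td><p>String</p></td>
    </tr>
    <tr class="even">
    <td><p>Owner’s middle initial</p></td>
    <td><p>J</p></td>
    <td><p>character</p></td>
    </tr>
    <tr class="odd">
    <td><p>Registration fee</p></td>
    <td><p>378.50</p></td>
    <td><p>double</p></td>
    </tr>
    </tbody>
    </table>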

No Square Pegs in Round Holes Allowed

Storing and using values from within a computer program requires the use of special “storage containers” called variables. A variable is a named memory location which stores the values or data used in a program. Each variable holds a specific data type such as an integer, double, String, or others. The type of variable is indicated by the programmer when the code is written.

A variable identified to hold data of one type cannot hold data of a different type without generating an error or producing inaccurate results. For example, a variable designated to hold a character cannot be used to hold data of type double (decimal).

Think of this as a child’s peg game where each uniquely shaped peg will only fit in a hole of the same shape. Circle pegs fit into circle holes and square pegs fit into square holes. Character type data fits into character variables and double type data fits into double variables.

Naming your data

The program data stored in variables are accessed by names referred to as identifiers. An identifier is a word chosen by the programmer for use in accessing the stored data. It is simply the act of naming the information so that you can get to it when you want to use it. I guess this is the same reason we name children and pets – we want to get the right child or dog when we shout their name!

Identifier - The one-word name given to a stored data element so that it can be referred to and used within a program.

If I were creating a program for the telephone company I might choose “customer_Name” as the variable identifier to label a telephone customer’s name. I would likely use “street”, “city”, “state” and “zip” as variable identifiers to label the separate elements of the customer’s address. All of these would be of type String because they are words. I would also choose “balance_Due” as the variable identifier to label the amount of money owed. This would be of type double because it is a numerical value using 2 decimal places for storing dollars and cents. This step of naming the variable tells the computer to associate a particular word, an identifier, with data of a specific type. The actual storage of the data in the computer memory comes a bit later. This first step of selecting variable identifiers and giving them a data type is equivalent to writing the address on an envelope, but not yet putting the letter inside.

There are rules for selecting identifiers. Generally speaking, identifiers must begin with a letter and contain only letters, digits, and the underscore (_) character. Spaces, periods, and other characters cannot be used.

Here is an example of creating variables to hold data of various types in pseudo-code (human-readable words):

      customer_Name as String

      street as String

      balance_Due as double

Storing your data

The next step is to actually store some item of data in a variable or memory location. This occurs with an assignment statement. An assignment statement assigns some value to a variable; this is the step where the values are actually put into memory locations.

Variable - A named computer memory location used to store data of various types.

Here is an example of assigning values to variables in pseudo-code:

      customer_Name = “Kim Lee”

      street = “401 S. Main”

      balance_Due = 78.55

You are probably wondering why some pieces of data appear between quotes and some do not. Strings are enclosed in quotes to indicate that the words found between the quotes make up the actual string of characters which are assigned to the variable. These words in quotations are not to be confused with the identifier (name) given to another variable.
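For comparison, the two pseudo-code steps might look like this in Python. This is a sketch under an assumption worth stating: Python does not require types to be declared, so the first three lines are optional type hints that play the role of “writing the address on the envelope,” and Python itself does not enforce them.

```python
# Step 1: name each variable and note the type of data it is meant to hold.
customer_Name: str
street: str
balance_Due: float

# Step 2: assignment statements put actual values into the variables.
customer_Name = "Kim Lee"   # quotes mark the characters that make up the string
street = "401 S. Main"
balance_Due = 78.55         # no quotes: this is a numerical value

print(customer_Name, street, balance_Due)
```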

Get Real

  1. Let’s create variables and identifiers for the “Math Bam” game. Complete the chart below by providing the likely data type and an appropriate identifier for each data element listed.

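    <table>
    <thead>
    <tr class="header">
    <th><p>Data Element</p></th>
    <th><p>Value</p></th>
    <th><p>Type</p></th>
    <th><p>Identifier</p></th>
    </tr>
    </thead>
    <tbody>
    <tr class="odd">
    <td><p>The child’s name</p></td>
    <td></td>
    <td></td>
    <td></td>
    </tr>
    <tr class="even">
    <td><p>Number of correct answers</p></td>
    <td></td>
    <td></td>
    <td></td>
    </tr>
    <tr class="odd">
    <td><p>Percentage correct answers</p></td>
    <td></td>
    <td></td>
    <td></td>
    </tr>
    <tr class="even">
    <td><p>A letter grade</p></td>
    <td></td>
    <td></td>
    <td></td>
    </tr>
    </tbody>
    </table>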

  2. Write a pseudo-code statement to create a variable for each data element in the chart above. Use this format: <variable name> as <type>

  3. Write another pseudo-code statement to store a value in each of the variables you just created. Use this format: <variable name> = <value>.

Here are some possible answers:

1.

<table>
<colgroup>
<col style="width: 25%" />
<col style="width: 25%" />
<col style="width: 25%" />
<col style="width: 25%" />
</colgroup>
<thead>
<tr class="header">
<th><p>Data Element</p></th>
<th><p>Value</p></th>
<th><p>Type</p></th>
<th><p>Identifier</p></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><p>The child’s name</p></td>
<td><p>Lee Singh</p></td>
<td><p>String</p></td>
<td><p>student_Name</p></td>
</tr>
<tr class="even">
<td><p>Number of correct answers</p></td>
<td><p>15</p></td>
<td><p>integer</p></td>
<td><p>number_Correct</p></td>
</tr>
<tr class="odd">
<td><p>Percentage correct answers</p></td>
<td><p>83.7</p></td>
<td><p>double</p></td>
<td><p>percentage</p></td>
</tr>
<tr class="even">
<td><p>A letter grade</p></td>
<td><p>B</p></td>
<td><p>character</p></td>
<td><p>grade</p></td>
</tr>
</tbody>
</table>
  2. student_Name as String

    number_Correct as integer

    percentage as double

    grade as character

  3. student_Name = “Lee Singh”

    number_Correct = 15

    percentage = 83.7

    grade = ‘B’

More Complex Data Containers

While the data types described in this lesson and the objects you learned about in Lesson 2 are the basis of storing and manipulating the information in computers, they can only go so far in meeting today’s needs. Many other schemes for storing data have been designed by computer scientists, including arrays and databases.

Arrays

Up to this point we have discussed variables as memory containers that hold only one piece of information. It would take many variables to hold all of the names of the players on your baseball team or all of your bowling scores. A more efficient method for storing all of that data and being able to access each unique name or score in the collection is to store the data in an array.

Array - A list of sequentially numbered variables with the same name.

An array uses a single variable name that is numbered, or indexed, to refer to each separately stored data element. Think of it like the mailboxes in an apartment building. The building address, such as 7284 Venice Boulevard, is analogous to the variable name. Each apartment number, such as apartment 1, 2, or 3, is analogous to the indexing value. Unlike apartments, however, the numbering of arrays starts with the number 0 instead of the number 1.

An array created to store the names of players might be named “player.” The String data of “Mary” would be stored in the first player variable, player(0). “Bob” would be stored in the variable player(1) and “Connie” in player(2). Your bowling scores could be stored as integers in variables named score(0), score(1), score(2), and so on.

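<table>
<thead>
<tr class="header">
<th><p>0</p></th>
<th><p>1</p></th>
<th><p>2</p></th>
<th><p>3</p></th>
<th><p>4</p></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><p>Mary</p></td>
<td><p>Bob</p></td>
<td><p>Connie</p></td>
<td><p>Mike</p></td>
<td><p>Linda</p></td>
</tr>
</tbody>
</table>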

This data storage technique is ideal for some common computer processes such as alphabetically sorting names or adding many scores to find your bowling average. Being able to move conveniently from one stored value to another by using the sequentially numbered index value is easily programmed with techniques you will learn about in future lessons.
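In Python, a list plays the role of the array described here, and square brackets take the place of the parentheses used in player(0). The bowling scores below are made-up sample values.

```python
# One name, many numbered slots; numbering starts at 0.
player = ["Mary", "Bob", "Connie", "Mike", "Linda"]
first_player = player[0]
third_player = player[2]

# Sorting the names alphabetically takes a single step.
alphabetical = sorted(player)

# Hypothetical bowling scores, added up to find the average.
score = [150, 180, 165]
average = sum(score) / len(score)

print(first_player, third_player)
print(alphabetical)
print(average)
```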

Databases

In a visit to your local library you likely encounter an electronic database when you search for books by your favorite author, books about horses, or books on the Civil War published in 1965. A library database packages the specific details of every book with additional data called metadata (described below). This combination allows you to search for books in any number of ways and to analyze the information in ways that would be impossible with other data storage methods. A database is a collection of related information and can store the details about such things as items in a warehouse, the individuals in a personnel record, individuals in a telephone book, and transactions on your credit card. The Web offers more databases than you can imagine. There are databases of movies, plants, animals, the human genome, pesticides, chemicals, languages, countries, sports, and on and on!

Database - A collection of information on related items which is stored in a form that can be organized and searched.

Structure of databases

While we often see the contents of databases in the form of a table (grid of rows and columns), the table is NOT the database; there may be many tables that make up a single database. A database is a collection of related information made up of entities (rows), which are single examples of the data stored and are sometimes called records. Each entity, or record, is described by labeled details called fields or attributes (columns). If a telephone book is a database, the information about any given telephone number is a record or entity, and the labels given to the data, such as name, address, and telephone number, are the fields or attributes that make up the metadata. The specific data details such as Jose Rodriquez, 401 Main Street, and 213-4567 are the values.

Metadata - A description of the data in a database that enables powerful searches and the ability to answer complex questions about the data.
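The telephone book example can be sketched with a Python dictionary standing in for one record. This is only an illustration of the vocabulary (record, fields, values) and not how a real database stores its data; the variable names are hypothetical.

```python
# One record (entity) from the telephone book.
record = {
    "name": "Jose Rodriquez",
    "address": "401 Main Street",
    "telephone": "213-4567",
}

# The fields (attributes) are the labels; together they form the metadata.
fields = list(record.keys())

# The values are the specific details stored under each label.
values = list(record.values())

print(fields)
print(values)
```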

In order for us to see the information stored within a database, it is usually presented in the form of a table. Let’s use the example of the U.S. Department of Agriculture Plants database (https://plants.usda.gov/index.html) to explore the details of databases.

Here are 3 records or entities of the plant database:

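[Figure: 3 sample records from the USDA Plants database, including Meadow Horsetail, shown as a table with fields such as common name, family, growth habit, and a picture]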

  • The details of these 3 plants plus thousands more comprise the plant database.

  • The complete set of information on each individual plant is a record or entity.

  • The category of each specific detail such as family, growth habit, and a picture is a field or attribute.

  • The specific data such as Meadow Horsetail, Equisetaceae, and a specific picture are values.

Obviously, databases are popular for storing huge amounts of data. But what makes them so powerful? The power of databases results from the actual structure or design of the database for easily and quickly accessing, searching and reporting data. Metadata is data, or information, about the data stored in a database. The prefix “meta” comes from Greek and is used here to mean “about,” so metadata is literally “data about data.”

For a moment, consider the data 86403. Without some explanation this value is totally useless. But the moment I tell you that 86403 is a zip code, the value has significance and meaning. “Zip code”, in this situation, is metadata. It is information about the data value 86403 that gives meaning to it as the zip code of Lake Havasu City, Arizona. This metadata allows you to search a database for answers to specific questions.

In the example of the plant database, we might create a query that seeks to answer the question, “What perennial plant is native to the U.S. and contains the word ‘horse’ in its common name?” The result of a query referencing the metadata would likely be a table listing all of the plants in the database that have “native” in the field of U.S. Nativity, “perennial” in the field of Duration, and “horse” in the field Common Name. Querying a database to find the answer to a question is a very skilled task valued by organizations and businesses.

Databases are usually created using software applications such as Microsoft Access or Microsoft SQL Server. These database applications streamline the creation of databases, the definition of metadata with fields or attributes, and the population of records or entities with values. Solving problems with queries is also made easier through database software.

Get Real

  1. Complete the array below by filling in the names of your childhood friends. What identifier would you give to this array?

    Index:  0        1        2        3        4
    Name:   ______   ______   ______   ______   ______

  2. Imagine you are creating a database to record the members of your family tree. What fields of data (or metadata) would you include?

  3. A paper telephone book is an old-fashioned database. Think about questions you could not answer (or at least not easily or quickly) with a paper telephone book that you could with an electronic telephone book.

  4. Using the table below, what records would be returned by the following query? “Select all rows where Name begins with ‘B’, BirthYear is greater than 1990, and FavoriteColor = ‘Yellow’”

    Name       BirthYear   FavoriteColor
    Betty      1901        red
    George     1983        yellow
    Bruce      1991        yellow
    Clark      1998        blue
    Barbara    1989        yellow

Here are some possible answers:

  1. childhood_buddies

    Index:  0      1        2      3       4
    Name:   Linn   Carlos   Fran   Maria   Connie

  2. These are some likely fields or metadata for a family tree database:

    • First name

    • Last name

    • Mother

    • Father

    • Siblings

    • Spouse

    • Birth date

    • Birth place

    • Occupation

    • Date of death

  3. With an electronic telephone book you could easily locate all individuals with the same first name as you, all of the Smith families living on Oak Street, all of the businesses on Main Street that have the word “hardware” in their name, and all of the Juarez families that are not in the 925 area code.

  4. The only individual who fits the query where Name begins with ‘B’, BirthYear is greater than 1990, and FavoriteColor = ‘Yellow’ is Bruce:

    Name       BirthYear   FavoriteColor
    Bruce      1991        yellow
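For readers who want to check answer 4 on a computer, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table name `people` is an assumption, and because the table stores the color in lowercase, the sketch compares colors without regard to capitalization.

```python
import sqlite3

# Build a throwaway in-memory database holding the exercise table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (Name TEXT, BirthYear INTEGER, FavoriteColor TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Betty", 1901, "red"),
        ("George", 1983, "yellow"),
        ("Bruce", 1991, "yellow"),
        ("Clark", 1998, "blue"),
        ("Barbara", 1989, "yellow"),
    ],
)

# Name begins with 'B', BirthYear is greater than 1990, color is yellow.
# LOWER() makes the color comparison ignore capitalization.
rows = conn.execute(
    """
    SELECT Name, BirthYear, FavoriteColor
    FROM people
    WHERE Name LIKE 'B%'
      AND BirthYear > 1990
      AND LOWER(FavoriteColor) = 'yellow'
    """
).fetchall()
conn.close()

print(rows)
```

Barbara and Betty pass the name test but fail the birth-year test, and Clark fails both the name and color tests, which is why only one row comes back.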

Summary

There is a huge quantity and diversity of data that is handled by technology. Data can be stored in either analog or digital formats. Analog data is represented, transmitted, and stored as a wave or continuous stream of information. Digital information is represented, transmitted, and stored in discrete pieces as numerical values.

Computers use a binary number system to store data. The basic electronic storage unit is a bit, which has two states: off and on. These states are represented by the values 0 and 1. Eight bits grouped together are called a byte. A byte can represent 256 different values. Kilobyte, megabyte, and gigabyte are common measures of storage used in technology.
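The count of 256 follows from simple arithmetic: each of the eight bits can be 0 or 1, giving 2 multiplied by itself eight times. A short Python sketch of that calculation (the variable names are chosen here only for illustration):

```python
BITS_PER_BYTE = 8

# Each bit has 2 possible states, so 8 bits give 2 ** 8 combinations.
distinct_values = 2 ** BITS_PER_BYTE

# The smallest and largest byte values written out as 8 binary digits.
smallest = format(0, "08b")
largest = format(distinct_values - 1, "08b")

print(distinct_values)  # 256
print(smallest)         # 00000000
print(largest)          # 11111111
```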

Using data in computer programs involves identifying the data type, creating variables, naming identifiers to reference the variables, and storing the data in variables.

Data types are based upon the nature of the information to be stored. Common types include:

  • integer—whole numbers

  • double—decimal numbers

  • Boolean—true or false

  • character—individual keyboard symbols

  • String—words or labels

Each data type has its own set of operators, such as + or ÷, and the same operator can produce different results depending on the type it is applied to.
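To make the link between types and operators concrete, here is a small Python sketch. Python is used only as an illustration: its type names differ slightly from the list above (float plays the role of double, and a single character is simply a string of length 1).

```python
whole = 7         # int: a whole number
decimal = 2.5     # float: a decimal number
flag = whole > 2  # bool: True or False
letter = "A"      # str of length 1 (Python has no separate character type)
label = "7"       # str: text, even though it looks like a number

# The + operator adds numbers but joins strings end to end.
int_sum = whole + 2      # 9
text_sum = label + "2"   # "72"

# Division also comes in flavors with different results.
true_div = whole / 2     # 3.5
floor_div = whole // 2   # 3

print(int_sum, text_sum, true_div, floor_div)
```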

Variables are named memory locations used to store data within computer programs. Variables are created to hold a specific type of data; they cannot hold data of another type. Variables are given names, called identifiers, so that the data they hold can be referred to and used within the program.

There are many complex data containers that can be used to store data. An array is one such container: a list of sequentially numbered variables that share the same name. The numbering begins with 0. This data structure allows for easy access to, and manipulation of, collections of data of the same type.
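The childhood_buddies array from the Get Real answers can be sketched in Python as follows. A Python list is used here as a stand-in for an array; it is not identical to the fixed-type arrays found in some other languages.

```python
# One identifier, five numbered slots, numbering starts at 0.
childhood_buddies = ["Linn", "Carlos", "Fran", "Maria", "Connie"]

first = childhood_buddies[0]  # the element in slot 0
last = childhood_buddies[4]   # the element in slot 4
count = len(childhood_buddies)

# Walk through the array, printing each index next to its value.
for index, name in enumerate(childhood_buddies):
    print(index, name)
```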

A database is another complex data structure. A database is a collection of related information that can be easily organized and searched using queries. The data in a database is described by metadata, information about the data that enables powerful searches and problem solving. The basic elements of a database are the records of individual items, the fields or attributes that describe the information about each record, and the specific data values for each record.

Join me in Lesson 4 to learn how computers make decisions using data.