
K12 LibreTexts

2.1: Types of Data Representation


Two common types of graphic displays are bar charts and histograms. Both use vertical or horizontal bars to represent the number of data points in each category or interval. The main graphical difference is that in a bar chart there are spaces between the bars, while in a histogram there are no spaces between the bars. Why does this subtle difference exist, and what does it imply about graphic displays in general?

Displaying Data

It is often easier for people to interpret relative sizes of data when that data is displayed graphically. Note that a categorical variable is a variable that can take on one of a limited number of values, and a quantitative variable is a variable that takes on numerical values that represent a measurable quantity. Examples of categorical variables are TV stations, the state someone lives in, and eye color, while examples of quantitative variables are the height of students or the population of a city. There are a few common ways of displaying data graphically that you should be familiar with.

A pie chart shows the relative proportions of data in different categories. Pie charts are excellent ways of displaying categorical data with easily separable groups. The following pie chart shows six categories labeled A−F. The size of each pie slice is determined by its central angle. Since there are 360° in a circle, the central angle θ_A of category A can be found by:

θ_A = (number of data points in category A ÷ total number of data points) × 360°

[Figure: pie chart with six categories labeled A−F]

CK-12 Foundation -  https://www.flickr.com/photos/slgc/16173880801  - CCSA

A  bar chart  displays frequencies of categories of data. The bar chart below has 5 categories, and shows the TV channel preferences for 53 adults. The horizontal axis could have also been labeled News, Sports, Local News, Comedy, Action Movies. The reason why the bars are separated by spaces is to emphasize the fact that they are categories and not continuous numbers. For example, just because you split your time between channel 8 and channel 44 does not mean on average you watch channel 26. Categories can be numbers so you need to be very careful.

[Figure: bar chart of TV channel preferences for 53 adults]

A  histogram  displays frequencies of quantitative data that has been sorted into intervals. The following is a histogram that shows the heights of a class of 53 students. Notice the largest category is 56-60 inches with 18 people.

[Figure: histogram of the heights of a class of 53 students]

A boxplot (also known as a box and whiskers plot) is another way to display quantitative data. It displays the five-number summary (minimum, Q1, median, Q3, maximum). The box can be displayed either vertically or horizontally, depending on the labeling of the axis. The box does not need to be perfectly symmetrical because it represents data that might not be perfectly symmetrical.

[Figure: boxplot]

Earlier, you were asked about the difference between histograms and bar charts. The reason for the space in bar charts but no space in histograms is that bar charts graph categorical variables while histograms graph quantitative variables. It would be improper to omit the space in a bar chart because you would run the risk of implying a spectrum from one side of the chart to the other. Note that in the bar chart where TV stations were shown, the station numbers were not listed horizontally in order by size. This was to emphasize the fact that the stations were categories.

Create a boxplot of the following numbers in your calculator.

8.5, 10.9, 9.1, 7.5, 7.2, 6, 2.3, 5.5

Enter the data into L1 by going into the Stat menu.

[Figure: calculator screen with the data entered into L1]

Then turn the statplot on and choose boxplot.

[Figure: calculator stat plot settings with boxplot selected]

Use Zoomstat to automatically center the window on the boxplot.

[Figure: calculator screen showing the boxplot]
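For readers without a graphing calculator, the five-number summary behind this boxplot can be sketched in Python. This is a minimal sketch that assumes the median-of-halves rule for the quartiles (Q1 is the median of the lower half of the sorted data, Q3 the median of the upper half); other quartile conventions exist and can give slightly different values. The function name `five_number_summary` is an illustrative choice, not part of the lesson.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the middle of the sorted list
    upper = data[half + n % 2:]   # values above it (middle value skipped when n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

summary = five_number_summary([8.5, 10.9, 9.1, 7.5, 7.2, 6, 2.3, 5.5])
print(summary)
```

The box of the boxplot spans Q1 to Q3, the line inside the box marks the median, and the whiskers extend to the minimum and maximum.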

Create a pie chart to represent the preferences of 43 hungry students.

  • Other – 5
  • Burritos – 7
  • Burgers – 9
  • Pizza – 22

[Figure: pie chart of the food preferences of 43 students]
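The central angle of each slice follows from the share of the 43 students in that category. The short Python sketch below works through the arithmetic; the function name `central_angles` is an illustrative choice.

```python
def central_angles(counts):
    """Map each category to its central angle in degrees."""
    total = sum(counts.values())
    return {label: 360 * count / total for label, count in counts.items()}

preferences = {"Other": 5, "Burritos": 7, "Burgers": 9, "Pizza": 22}
angles = central_angles(preferences)

for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

Pizza accounts for just over half of the students, so its slice covers just over half of the circle.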

Create a bar chart representing the preference for sports of a group of 33 people.

  • Football – 12
  • Baseball – 10
  • Basketball – 8
  • Hockey – 3

[Figure: bar chart of sports preferences]

Create a histogram for the income distribution of 200 million people.

  • Below $50,000 is 100 million people
  • Between $50,000 and $100,000 is 50 million people
  • Between $100,000 and $150,000 is 40 million people
  • Above $150,000 is 10 million people

[Figure: histogram of the income distribution of 200 million people]

1. What types of graphs show categorical data?

2. What types of graphs show quantitative data?

A math class of 30 students had the following grades:

3. Create a bar chart for this data.

4. Create a pie chart for this data.

5. Which graph do you think makes a better visual representation of the data?

A set of 20 exam scores is 67, 94, 88, 76, 85, 93, 55, 87, 80, 81, 80, 61, 90, 84, 75, 93, 75, 68, 100, 98

6. Create a histogram for this data. Use your best judgment to decide what the intervals should be.

7. Find the  five number summary  for this data.

8. Use the  five number summary  to create a boxplot for this data.

9. Describe the data shown in the boxplot below.

[Figure: boxplot for question 9]

10. Describe the data shown in the histogram below.

[Figure: histogram for question 10]

A math class of 30 students has the following eye colors:

11. Create a bar chart for this data.

12. Create a pie chart for this data.

13. Which graph do you think makes a better visual representation of the data?

14. Suppose you have data that shows the breakdown of registered Republicans by state. What types of graphs could you use to display this data?

15. From which types of graphs could you obtain information about the spread of the data? Note that spread is a measure of how spread out all of the data is.

Review (Answers)

To see the Review answers, open this  PDF file  and look for section 15.4. 

Additional Resources

PLIX: Play, Learn, Interact, eXplore - Baby Due Date Histogram

Practice: Types of Data Representation

Real World: Prepare for Impact


Data Representation


Course Computer Architecture

Digital computers store and process information in binary form because digital logic has only two values, "1" and "0" (equivalently "True" and "False", or "ON" and "OFF"). This system is called radix 2. Humans generally work in radix 10, i.e. decimal. For convenience, other representations are also used, such as Octal (Radix 8), Hexadecimal (Radix 16) and Binary coded decimal (BCD).

Every CPU has a width measured in bits, such as an 8-bit, 16-bit or 32-bit CPU. Similarly, each memory location can store a fixed number of bits, called the memory width. Given the size of the CPU and memory, it is up to the programmer to handle data representation. As most readers will know, 4 bits form a nibble and 8 bits form a byte. The word length is defined by the Instruction Set Architecture of the CPU and may be equal to the width of the CPU.

Memory simply stores information as a binary pattern of 1's and 0's. What the content of a memory location means is a matter of interpretation. In the Fetch cycle, the CPU interprets the fetched memory content as an instruction and decodes it based on the instruction format. In the Execute cycle, the information from memory is treated as data. An ordinary computer user thinks of computers as handling English or other alphabets, special characters or numbers. A programmer thinks of memory content in terms of the data types of the programming language in use. Recall figures 1.2 and 1.3 of chapter 1, which show that conversion happens between the user interface and the internal representation and storage.

  • Data Representation in Computers

Information handled by a computer is classified as instruction and data. A broad overview of the internal representation of the information is illustrated in figure 3.1. No matter whether it is data in a numeric or non-numeric form or integer, everything is internally represented in Binary. It is up to the programmer to handle the interpretation of the binary pattern and this interpretation is called Data Representation . These data representation schemes are all standardized by international organizations.

Choice of Data representation to be used in a computer is decided by

  • The number types to be represented (integer, real, signed, unsigned, etc.)
  • Range of values likely to be represented (maximum and minimum to be represented)
  • The Precision of the numbers i.e. maximum accuracy of representation (floating point single precision, double precision etc)
  • If non-numeric i.e. character, character representation standard to be chosen. ASCII, EBCDIC, UTF are examples of character representation standards.
  • The hardware support in terms of word width, instruction.

Before we go into the details, let us take an example of interpretation. Say a byte in Memory has value "0011 0001". Although there exists a possibility of so many interpretations as in figure 3.2, the program has only one interpretation as decided by the programmer and declared in the program.
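To make this concrete, the Python sketch below reads the same byte "0011 0001" in three ways. Figure 3.2 is not reproduced here, so the three interpretations chosen (unsigned binary integer, ASCII character, packed BCD) are assumptions about what it lists; the bit pattern itself is the one from the text.

```python
pattern = 0b0011_0001  # the byte "0011 0001" from the text

as_unsigned = pattern                      # read as a plain binary integer
as_ascii = chr(pattern)                    # read as an ASCII character code
high, low = pattern >> 4, pattern & 0x0F   # split into two nibbles
as_packed_bcd = high * 10 + low            # read as packed BCD, one decimal digit per nibble

print(as_unsigned, repr(as_ascii), as_packed_bcd)
```

The bits never change; only the declared interpretation does, which is why the program must fix one interpretation for each memory location.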

  • Fixed point Number Representation

Fixed point numbers are treated here as whole numbers, i.e. integers. The number of bits used to represent an integer determines the maximum number that can be represented. For efficiency of storage and operations, one may choose to represent an integer with one byte, two bytes, four bytes or more. This space allocation follows from the programmer's declaration of the variable (for example as a short or long integer) and from the Instruction Set Architecture.

In addition to the bit length definition for integers, we also have a choice to represent them as below:

  • Unsigned Integer : Zero and positive numbers can be represented in this format. All the allotted bits are used in defining the number. With 8 bits, 2^8 = 256 values can be represented, i.e. "0" to "255". With 16 bits, 2^16 = 65536 values can be represented, i.e. "0" to "65535".
  • Signed Integer : In this format negative numbers, zero, and positive numbers can be represented. A sign bit indicates whether the number is positive or negative. There are three possible representations for signed integers: Sign Magnitude format, 1's Complement format and 2's Complement format .
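The unsigned ranges quoted above follow directly from the bit count. A small Python sketch (the helper name `unsigned_range` is illustrative):

```python
def unsigned_range(bits):
    """Smallest and largest unsigned integers that fit in the given number of bits."""
    return 0, 2**bits - 1

for bits in (8, 16, 32):
    low, high = unsigned_range(bits)
    print(f"{bits} bits: {2**bits} values, {low} to {high}")
```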

Signed Integer – Sign Magnitude format: The Most Significant Bit (MSB) is reserved for the sign of the value. A "0" in the MSB means a positive number and a "1" in the MSB means a negative number. If n bits are used for representation, the remaining n-1 bits hold the absolute value of the number.

Examples for n=8:

0010 1111 = + 47 Decimal (Positive number)

1010 1111 = - 47 Decimal (Negative Number)

0111 1110 = +126 (Positive number)

1111 1110 = -126 (Negative Number)

0000 0000 = + 0 (Positive Number)

1000 0000 = - 0 (Negative Number)

Although this method is easy to understand, Sign Magnitude representation has several shortcomings like

  • Zero can be represented in two ways causing redundancy and confusion.
  • The magnitude is limited to 2^(n-1) values (0 to 2^(n-1) - 1), although n bits are used.
  • The separate sign bit makes the addition and subtraction more complicated. Also, comparing two numbers is not straightforward.

Signed Integer – 1’s Complement format: In this format too, the MSB is reserved as the sign bit. The difference is that a negative number is obtained by inverting every bit of the corresponding positive number, hence the name 1’s Complement form. Positive numbers are represented as they are in binary. Some examples for n=8:

1101 0000 = - 47 Decimal (Negative Number)

1000 0001 = -126 (Negative Number)

1111 1111 = - 0 (Negative Number)
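The two encodings described so far can be checked with a short Python sketch. The helper names are illustrative. Python integers have no negative zero, so the "- 0" patterns (1000 0000 and 1111 1111) cannot be produced by passing a value to these functions; they exist only as bit patterns.

```python
def sign_magnitude(value, bits=8):
    """Sign bit in the MSB, absolute value in the remaining bits."""
    sign = 1 if value < 0 else 0
    return (sign << (bits - 1)) | abs(value)

def ones_complement(value, bits=8):
    """Positive values unchanged; negative values are the bitwise inverse of the magnitude."""
    mask = (1 << bits) - 1
    return (~abs(value) & mask) if value < 0 else value

def show(pattern, bits=8):
    """Format a bit pattern in groups of four, as in the examples above."""
    text = format(pattern, f"0{bits}b")
    return " ".join(text[i:i + 4] for i in range(0, bits, 4))

for value in (47, -47, 126, -126):
    print(value, show(sign_magnitude(value)), show(ones_complement(value)))
```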

  • Converting a given binary number to its 2's complement form

Step 1 . -x = x' + 1 where x' is the one's complement of x.

Step 2 Extend the data width of the number, fill up with sign extension i.e. MSB bit is used to fill the bits.

Example: -47 decimal over 8bit representation
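The two steps can be sketched in Python for the -47 example. The function names are illustrative, and the 8-bit and 16-bit widths are chosen only for demonstration.

```python
def twos_complement(value, bits=8):
    """Encode a signed integer as a 2's complement bit pattern."""
    mask = (1 << bits) - 1
    if value >= 0:
        return value & mask
    ones = ~(-value) & mask   # Step 1a: one's complement of the magnitude
    return (ones + 1) & mask  # Step 1b: add 1

def sign_extend(pattern, from_bits, to_bits):
    """Step 2: widen a pattern by copying its MSB into the new upper bits."""
    if (pattern >> (from_bits - 1)) & 1:
        pattern |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
    return pattern

minus_47 = twos_complement(-47)
print(format(47, "08b"))                             # 00101111  (+47)
print(format(minus_47, "08b"))                       # 11010001  (-47)
print(format(sign_extend(minus_47, 8, 16), "016b"))  # 1111111111010001

# Subtraction done as addition: 60 - 47 computed as 60 + (-47), keeping 8 bits.
print((60 + minus_47) & 0xFF)                        # 13
```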

As you can see, zero is represented without redundancy: there is only one way of representing zero. The complexity of arithmetic operations is also reduced in 2’s complement representation, because subtraction is done as addition.

More exercises on number conversion are left to the self-interest of readers.

  • Floating Point Number system

The largest whole number that can be represented in n bits is 2^n - 1. In science we come across numbers such as the mass of an electron, 9.10939 x 10^-31 kg, and the velocity of light, 2.99792458 x 10^8 m/s. Imagine writing such numbers on a piece of paper without an exponent and then converting them into binary for computer representation. It makes no sense to write a number in a non-readable or non-processable form. Hence we write such large or small numbers using an exponent and a mantissa. This is called Floating Point representation or real number representation. The real number system has infinitely many values between 0 and 1.

Representation in computer

Unlike the two's complement representation for integers, a Floating Point number uses Sign and Magnitude representation for the mantissa, and the exponent is a signed quantity as well. In the number 9.10939 x 10^31, in decimal form, +31 is the Exponent and 9.10939 is known as the Fraction . Mantissa, Significand and Fraction are used synonymously. In the computer, the representation is binary and the binary point is not fixed. For example, the number 23.345 can be written as 2.3345 x 10^1 or 0.23345 x 10^2 or 2334.5 x 10^-2. The representation 2.3345 x 10^1 is said to be in normalised form.

Floating-point numbers usually occupy multiple words in memory, as we need to allot a sign bit, a few bits for the exponent and many bits for the mantissa. There are standards for this allocation, which we will see shortly.

  • IEEE 754 Floating Point Representation

IEEE 754 defines two commonly used formats, known as Single Precision and Double Precision. These standards enable portability among different computers. Figure 3.3 shows Single Precision while figure 3.4 shows Double Precision. Single Precision uses a 32-bit format while Double Precision uses a 64-bit word length. As the name suggests, Double Precision can represent fractions with greater accuracy. In both cases, the MSB is the sign bit of the mantissa, followed by the exponent field and then the mantissa. The exponent has no separate sign bit; it is stored in biased form, with a bias of 127 in Single Precision and 1023 in Double Precision.

In Single Precision, the 8-bit exponent field covers exponents from -126 to +127 for normalised numbers. It is possible that, as a result of arithmetic operations, the resulting exponent does not fit in this range. This situation is called overflow in the case of a positive exponent and underflow in the case of a negative exponent. The Double Precision format has 11 bits for the exponent, covering exponents from -1022 to +1023 for normalised numbers. The programmer has to choose between a Single Precision and a Double Precision declaration based on knowledge of the data being handled.
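The field layout can be inspected from Python with the standard `struct` module. This is a minimal sketch for Single Precision only; the function name `float32_fields` and the sample value -6.5 are illustrative choices. It assumes the standard IEEE 754 layout of 1 sign bit, 8 exponent bits and 23 fraction bits.

```python
import struct

def float32_fields(x):
    """Split a number, stored as IEEE 754 single precision, into its three fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                     # 1 bit
    exponent_field = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF            # 23 bits after the implied leading 1
    return sign, exponent_field, fraction

sign, exponent_field, fraction = float32_fields(-6.5)
print(sign, exponent_field - 127, format(fraction, "023b"))
```

Since -6.5 is -1.101 (binary) x 2^2, the sign bit is 1, the stored exponent field is 2 + 127 = 129, and the fraction field begins with 101.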

Floating Point operations are very slow on a CPU that has no dedicated floating-point hardware. Generally, a special-purpose processor known as a Co-processor is used, which works in tandem with the main CPU. The programmer should use the float declaration only if the data is in real number form. Float declarations are not to be used indiscriminately.

  • Decimal Numbers Representation

Decimal numbers (radix 10) are represented and processed in the system with the support of additional hardware. We deal with numbers in decimal format in everyday life. Some machines implement decimal arithmetic in hardware, much like floating-point arithmetic hardware. In such a case, the CPU holds decimal numbers in BCD (binary coded decimal) form and performs BCD arithmetic operations. BCD operates on radix 10, so this hardware operates without conversion to pure binary. In packed BCD form, each decimal digit is represented by one nibble. BCD operations require not only special hardware but also a decimal instruction set.
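Packed BCD can be illustrated in software, although the text is describing hardware support. The Python sketch below, with illustrative function names, stores two decimal digits per byte and pads odd-length numbers with a leading zero.

```python
def to_packed_bcd(number):
    """Encode a non-negative integer as packed BCD, two decimal digits per byte."""
    digits = str(number)
    if len(digits) % 2:
        digits = "0" + digits  # pad to a whole number of bytes
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )

def from_packed_bcd(data):
    """Decode packed BCD bytes back into an integer."""
    value = 0
    for byte in data:
        value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return value

encoded = to_packed_bcd(1234)
print(encoded.hex(), from_packed_bcd(encoded))
```

In hexadecimal, the packed BCD bytes read the same as the decimal digits, whereas the pure binary form of 1234 is 0x04D2.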

  • Exceptions and Error Detection

All of us know that arithmetic operations can produce answers with more digits than the operands (e.g. 8 x 2 = 16). This happens in computer arithmetic too. When the size of the result exceeds the allotted size of the variable or the register, it becomes an error and raises an exception. The exception conditions associated with numbers and number operations are Overflow, Underflow, Truncation, Rounding and Multiple Precision . These are detected by the associated hardware in the arithmetic unit. These exceptions apply to both Fixed Point and Floating Point operations. Each of these exceptional conditions has a flag bit assigned in the Processor Status Word (PSW). We discuss them in more detail in later chapters.
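Overflow can be simulated in Python for an 8-bit signed (2's complement) register. This is only a sketch of the idea: the function name is illustrative, and real hardware reports the condition through a flag bit rather than a returned value.

```python
def add_signed_8bit(a, b):
    """Add two 8-bit signed values; return (stored result, overflow flag)."""
    raw = (a + b) & 0xFF                       # keep only the low 8 bits
    result = raw - 256 if raw & 0x80 else raw  # reinterpret as 2's complement
    overflow = not -128 <= a + b <= 127        # true result does not fit in 8 bits
    return result, overflow

print(add_signed_8bit(8, 2))     # fits: (10, False)
print(add_signed_8bit(100, 50))  # does not fit: (-106, True)
```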

  • Character Representation

Another data type is non-numeric and consists largely of character sets. We use a human-understandable character set to communicate with the computer, for both input and output. Standard character sets like EBCDIC and ASCII are used to represent alphabets, numbers and special characters. Nowadays the Unicode standard is also in use to cover languages such as Chinese, Hindi and Spanish. These codes are available on the internet for interested readers who wish to learn more.
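Character codes are easy to inspect in Python, which uses Unicode for its strings. The sketch below prints the code point of a few sample characters and their bytes in UTF-8, one of the Unicode encodings; the characters chosen are illustrative.

```python
def describe(ch):
    """Return the Unicode code point and the UTF-8 bytes (as hex) of one character."""
    return ord(ch), ch.encode("utf-8").hex()

for ch in ["A", "1", "é", "你"]:
    code_point, utf8_hex = describe(ch)
    print(ch, code_point, utf8_hex)
```

For characters in the ASCII range, the code point and the single UTF-8 byte coincide, so "1" is stored as the byte 0011 0001 used in the earlier interpretation example. Characters outside that range take more than one byte.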


CHAPTER 4: DATA MEASUREMENT

4-3: Types of Data and Appropriate Representations

Introduction.

Graphs and charts can be effective visual tools because they present information quickly and easily. Graphs and charts condense large amounts of information into easy-to-understand formats that clearly and effectively communicate important points. Graphs are commonly used by print and electronic media as they quickly convey information in a small space. Statistics are often presented visually as they can effectively facilitate understanding of the data. Different types of graphs and charts are used to represent different types of data.

  Types of Data

There are four types of data used in statistics: nominal data, ordinal data, discrete data, and continuous data. Nominal and ordinal data fall under the umbrella of categorical data, while discrete data and continuous data fall under the umbrella of quantitative data.

[Figure: types of data]

Qualitative Data

Categorical or qualitative data sorts data into categories. Categorical data is defined in terms of natural language specifications. For example, name, sex, and country of origin are categories that represent qualitative data. There are two subcategories of qualitative data: nominal data and ordinal data.

Nominal Data

Note that there are nominal data represented by numbers.

Ordinal Data

When the categories have a natural order, the categories are said to be ordinal . It can be ordered and measured. For example education level (H.S. diploma; 1 year certificate; 2 year degree; 4 year degree; masters degree; doctorate degree), satisfaction rating (extremely dislike; dislike; neutral; like; extremely like), etc. are categories that have a natural order to them. Ordinal data are commonly used for collecting demographic information (age, sex, race, etc.). This is particularly prevalent in marketing and insurance sectors, but it is also used by governments (e.g. the census), and is commonly used when conducting customer satisfaction surveys. Ordinal data is commonly represented using a bar graph .

Quantitative Data

The distances between adjacent values (e.g., marks on a ruler) should be equal.

Quantitative data has two subcategories, discrete data and continuous data.

Discrete Data

The data is discrete when the numbers do not touch each other on a real number line (e.g., 0, 1, 2, 3, 4…). Discrete data is whole numerical values typically shown as counts and contains only a finite number of possible values. For example, the number of visits to the doctor, the number of students in a class, etc. Discrete data is typically represented by a histogram .

Continuous Data

The data is continuous when it has an infinite number of possible values that can be selected within certain limits (i.e., the numbers run into each other on a real number line). Continuous data is data that can be measured or calculated. Examples of continuous data are temperature, time, height, etc. Continuous data is typically represented by a line graph .

Explore 1 – Types of data

Classify the data into qualitative or quantitative, then into a subcategory of nominal, ordinal, discrete or continuous.

Weight is a number that is measured and has order. It can also take on any number. So, weight is quantitative: continuous.

  • egg size (small, medium, large, extra large)

Egg size is typically small, medium, large, or extra large that has a natural order. So, egg size is qualitative: ordinal.

  • number of miles driven to work

Number of miles is a number that is measured and has order. It can also take on any number. So, number of miles is quantitative: continuous.

  • body temperature

Body temperature is a number that is measured and has order. It can also take on any number. So, temperature is quantitative: continuous.

  • basketball team jersey number

Jersey numbers have no order and are numbers that are not measured. So, jersey number is qualitative: nominal.

  • U.S. shoe size

Shoe size is a number. It is calculated based on a formula that includes the measure of your foot length. However, it has only whole or half numbers (e.g., 8 or 9.5). Shoe size has a natural order but has a finite number of options (e.g., half or whole numbers). So, shoe size is quantitative: discrete.

  • military rank

Military rank is not numerical but is categorical with a natural order. So, military rank is qualitative: ordinal.

  • university GPA

University GPA is a weighted average that is calculated, so it is quantitative: continuous.

Practice Exercises

  • year of birth
  • levels of fluency (language)
  • height of players on a team
  • dose of medicine
  • political party
  • course letter grades
  • Quantitative: discrete
  • Qualitative: ordinal
  • Quantitative: continuous
  • Qualitative: nominal

Types of Graphs and Charts

The type of graph or chart used to visualize data is determined by the type of data being represented. A pie chart or bar chart  is typically used for nominal data and a bar chart for ordinal data . For quantitative data , we typically use a histogram for discrete data and a line graph for continuous data .

A pie chart is a circular graphic which is divided into slices to illustrate numerical proportion. Pie charts are widely used in the business world and the mass media. The size of each slice is determined by the percentage represented by a category compared to the whole (i.e., the entire dataset). The percentage in each category adds to 100% or the whole.  

Explore 2 – Pie Charts

The pie chart shows how the Food and Drug Administration’s budget was distributed across different programs for the fiscal year 2021. The total budget was $6.1 billion. [1]

[Figure: pie chart of the FDA budget by program, fiscal year 2021]

  • How many categories are shown in the pie chart?

If we count the number of slices, there are 10 categories shown.

  • What do the percentages represent?

The percentages show the percent of the $6.1 billion FDA budget that was spent on each category.

  • Why is it vital to show the total budget on the chart?

Without the total budget we would be unable to calculate the amount spent on each category.

  • Is there a limit to the number of categories that can be shown on a pie chart?

Yes. If the slices are too small to see, another method of representing the data should be used. Ideally, a pie chart should show no more than 5 or 6 categories.

  • What does the largest slice represent?

The percentage of the total budget spent on human drugs.

  • What does the smallest slice represent?

The percentage of the total budget spent on toxicological research.

  • How could this pie chart be improved?

The slices could be ordered around the circle by size, and the 3-D look could be eliminated to avoid the distorted perspective and to make the graph clearer.

  • Is this an appropriate use of a pie chart?

The chart is showing a comparison of all categories the budget went towards so it is appropriate.

Bar graphs are used to represent categorical data . Each category is represented as a bar either vertically or horizontally. A bar is the measured value or percentage of a category and there is equal space between each pair of consecutive bars. Bar graphs have the advantage of being easy to read and offer direct comparison of categories. 

Explore 3 – Bar Graphs

Graduation rates within 6 years from the first institution attended for first-time, full-time bachelor’s degree-seeking students at 4-year postsecondary institutions, by race/ethnicity: cohort entry year 2010.

[Figure: bar graph of 6-year graduation rates by race/ethnicity, cohort entry year 2010]

  • How many categories are represented in the bar graph and what do they represent?

There are 7 categories representing the race/ethnicity of the students.

  • What do the numbers above each bar represent and why may they be necessary?

The rounded percent of the category. They are necessary because it is very difficult to tell from the vertical scale the height of each bar.

  • What does the tallest bar represent?

The percent of Asian students who graduated from their first institution within 6 years.

  • What does the shortest bar represent?

The percent of American Indian or Alaska Native students who graduated from their first institution within 6 years.

  • Is this an appropriate use of a bar graph?

Yes. The data is qualitative: nominal; there is no order within the categories.

Histograms are used to represent quantitative data that is discrete . A histogram divides up the range of possible values in a data set into classes or intervals. For each class, a rectangle is constructed with a base length equal to the range of values in that specific class and a height equal to the number of observations falling into that class. A histogram has an appearance similar to a vertical bar chart, but there are no gaps between the bars. The bars are ordered along the axis from the smallest to the largest possible value. Consequently, the bars cannot be reordered. Histograms are often used to illustrate the major features of the distribution of the data in a convenient form. They are also useful when dealing with large data sets (greater than 100 observations). They can help detect any unusual observations (outliers) or any gaps in the data.

Histograms may look similar to bar charts but they are really completely different. Histograms plot quantitative data with ranges of the data grouped into classes or intervals while bar charts plot categorical data. Histograms are used to show distributions while bar charts are used to compare categories. Bars can be reordered in bar charts but not in histograms. The bars of bar charts have the same width. The widths of the bars in a histogram need not be the same as long as the total area of all bars is one hundred percent if percentages are used or the total count, if counts are used. Therefore, values in bar graphs are given by the length of the bar while values in histograms are given by areas.
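The statement that values in a histogram are given by areas matters when class widths differ. The Python sketch below computes bar heights so that each bar's area (width x height) equals the count in its class. The class boundaries and counts are hypothetical values made up for illustration, and the function name `bar_heights` is an illustrative choice.

```python
def bar_heights(edges, counts):
    """Height of each bar so that width x height equals the count in that class."""
    return [
        count / (right - left)
        for left, right, count in zip(edges, edges[1:], counts)
    ]

edges = [0, 10, 20, 50]  # hypothetical class boundaries; the last class is three times as wide
counts = [5, 15, 30]     # hypothetical number of observations in each class

heights = bar_heights(edges, counts)
areas = [h * (right - left) for h, left, right in zip(heights, edges, edges[1:])]
print(heights, sum(areas))
```

The widest class holds the most observations, yet its bar is not the tallest, because its count is spread over a wider base. When all classes have the same width, heights and counts are proportional and the distinction disappears.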

Explore 4 – Histograms

Reading data from a table can be less than enlightening and certainly doesn’t inspire much interest. Graphing the same data in a histogram gives a graphical representation where certain features are automatically highlighted.

[Figure: histogram of employee salaries]

  • What do you notice about the bars of this histogram compared to the bars of a bar graph?

The bars touch in a histogram but not in a bar chart. This is because the data is ordered along the axis.

  • What do the numbers above the bars represent?

The number of employees whose salary lands in each class.

  • State a feature of the graph that is very obvious to you.

Answers may vary. Very few employees make less than $10,000 or more than $91,000. $41,000 – $50,000 is the most common salary.

Line graphs are used when the data is quantitative and continuous . The axis acts as a real number line where every possible value is located. Line graphs are typically used to show how data values change over time.

Explore 5 – Line Graphs

Here is an example of a line graph.

[Figure: line graph of annual births in China, 1949 to 2021]

  • What does this line graph represent?

Solution:  The number of annual births in China from 1949 to 2021.

  • What do the numbers on the vertical axis represent?

Solution:  The number of births in millions.

  • What do the numbers on the horizontal axis represent?

Solution:  The year.

  • Is this an appropriate use of a line graph?

Solution:  Yes. The time scale in years is continuous and a line graph is appropriate for continuous data.

  • Does a line graph highlight anything that a histogram may not?

Solution:  Yes. The trend in data over time. In this graph the trend of annual births is decreasing.


Infographics are often used by media outlets who are trying to tell a specific (often biased) story. They often combine charts or graphs with narrative and statistics.

Explore 6 – Infographics

[Figure: infographic on a survey of U.S. adults about UFO sightings]

Solution:  Since it is circular and based on percentages in each category, it is based on a pie chart.

  • How many categories are represented?

Solution:  There are three categories.

  • What story is the infographic trying to tell?

Solution:  About one third of Americans believe in aliens.

  • How was the data gathered?

Solution:  A survey of 1522 U.S. adults.

  • What does the largest blue area on the chart represent?

Solution:  The percentage of those surveyed that believe that all sightings can be explained by human activity or natural phenomena.

  • What does the smallest grey area on the chart represent?

Solution:  The percentage of those surveyed that have no opinion on UFO sightings.

  • Robert is involved in a group project for a class. The group has collected data to show the amount of time spent performing different tasks on a cell phone. The categories include making calls, Internet, text, music, videos, social media, email, games, and photos. What type of graph or chart should be used to display the average time spent per day on any of these tasks? Explain your reasoning.  
  • A marketing firm wants to show what fraction of the overall market uses a particular Internet browser. What type of graph or chart should be used to display this information? Explain your reasoning.  
  • The data is categorical so a bar graph should be used.
  • The data is categorical. If there are not too many categories (browser used) then a pie chart would work since fraction of the market is used. Alternatively, a bar chart could be used showing the fraction or percent as the height of each bar.


  • Name three (3) differences between a bar graph and a histogram.
  • A bar graph is used for qualitative data while a histogram is used for quantitative data.
  • In a bar graph the categories can be reordered. In a histogram the categories cannot be reordered.
  • In a bar graph the bars do not touch. In a histogram the bars touch.
  • A teacher wants to show their class the results of a midterm exam, without exposing any student names. What type of graph or chart should be used to display the scores earned on the midterm? Explain your reasoning.  
  • A pizza company wants to display a graphic of the five favorite pizzas of their customers on the company website. What type of graph or chart should be used to display this information? Explain your reasoning.  
  • Maria is keeping track of her daughter’s height by measuring her height on her birthday each year and recording it in a spreadsheet. What type of graph or chart should be used to display this information? Explain your reasoning.
  • Midterm scores may be quantitative, as either raw scores or percentages, in which case a histogram should be used to show the number of students scoring in each score (or percentage) interval. If the midterm results are letter grades, the data is qualitative but ordered. In this case, a pie chart could be used to show the percent of students with each letter grade, but it would be very busy. A better option would be a bar graph showing the number of students at each letter grade.
  • An infographic. This is categorical data so a (pizza) pie chart would be a good option or a bar chart.
  • A line graph since the data is collected over time and time is continuous.


Perspectives

  • Mike has collected data for a school project from a survey that asked, “What is your favorite pizza?” He surveyed 200 people and discovered that there were only 9 pizzas that were on the favorites list. In his report, he plans to show his data in a (pizza) pie chart. Is this the correct chart to use for his purpose? Explain your reasoning.  
  • Sarah is keeping track of the value of her car every year. She started when she first bought the car new and looks up its value every year. She figures that when the car’s value drops to $5000, it is time for an upgrade. What type of graph or chart should be used to display this information? Explain your reasoning.  
  • The Earth’s atmosphere is made up of 77% Nitrogen, 21% Oxygen, and 2% other gases. What type of graph or chart should be used to display this data? Explain your reasoning.  
  • A pie chart could be used but with 9 categories there may be too many slices for the chart to be clear. A bar graph may be better due to the number of categories.
  • A line graph since time is continuous and she will be able to see the trend in car value over time.
  • The data is qualitative: nominal and has percentages that add to 100% so a pie chart would work well with only 3 categories. Alternatively, a bar chart would work.


able to be put into categories

data that can be given labels and put into categories

qualitative data that can be put into labelled categories that have no order and no overlap

having nothing in common; no overlap

the number of times a data value has been recorded

a number or ratio expressed as a fraction of 100

a circular graphic which is divided into slices representing the number or percentage in each category

qualitative data that has a natural order

a graph where each category is represented by a vertical or horizontal bar that measures a frequency or percentage of the whole

expressed using a number or numbers

data that involves numerical values with order

data that is measured using whole numbers with only a finite number of possibilities

a graph similar in appearance to a vertical bar graph but with no gaps between the bars, with ordered bars, each with a base length equal to the range of values in a specific class

data that has an infinite number of possible values that can be selected within certain limits

use arithmetic and the order of operations

a graph used for continuous data that uses an axis as a real number line where every possible value is located

a graphic showing a combination of graphs, charts, and statistics

Numeracy Copyright © 2023 by Utah Valley University is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


What are the different ways of Data Representation?


Statistics is the process of collecting and analyzing data in large quantities. It is a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of numerical facts and figures.

It gives a numerical basis for collecting and analyzing data in large quantities, and it is based on two concepts:

  • Statistical Data 
  • Statistical Science

Statistics must be expressed numerically and should be collected systematically.

Data Representation

The word data refers to facts about people, things, events, or ideas. A data item can be a name, a number, or any other value. After collecting data, the investigator has to condense them in tabular form to study their salient features. Such an arrangement is known as the presentation of data.

Data representation is the process of condensing the collected data in tabular or graphical form.

The raw data can be arranged in different orders: ascending, descending, or alphabetical.

Example: Let the marks obtained by 10 students of class V in a class test, out of 50, listed according to their roll numbers, be:

39, 44, 49, 40, 22, 10, 45, 38, 15, 50

Data in this form is known as raw data. It can be placed in serial order as shown below:

Roll No.   Marks
1          39
2          44
3          49
4          40
5          22
6          10
7          45
8          38
9          15
10         50

Suppose you want to analyse the standard of achievement of the students. Arranging the marks in ascending or descending order gives a better picture.

Ascending order: 10, 15, 22, 38, 39, 40, 44, 45, 49, 50

Descending order: 50, 49, 45, 44, 40, 39, 38, 22, 15, 10

Raw data placed in ascending or descending order is known as arrayed data.

Types of Graphical Data Representation

A bar chart represents the collected data visually. Values such as amounts and frequencies can be shown as horizontal or vertical bars, and the bars can be single or grouped. Because the bar lengths are directly comparable, it is easy to see which items in a group are larger or smaller than the others.

Example: Let the marks obtained by 5 students of class V in a class test, out of 10, listed by name, be 7, 8, 4, 9, 6. Data in this form is known as raw data. It can be tabulated as shown below and then drawn as a bar chart with one bar per student:

Name     Marks
Akshay   7
Maya     8
Dhanvi   4
Jaslen   9
Muskan   6

A histogram is a graphical representation of the distribution of data. It looks similar to a bar graph, but the two differ in an important way: a bar graph measures the frequency of categorical data, that is, data grouped into two or more categories such as gender or month, whereas a histogram is used for quantitative data.

A graph that uses points connected by lines to show change over time is known as a line graph. A line graph might show the number of animals of a species left on Earth, the growth of the world population, or the day-to-day rise and fall in the number of bitcoins. One line graph can also show two or more series, so that several changes can be compared over the same period.

A pie chart is a circular graphic divided into slices that represent numerical proportions. In most cases it can be replaced by other plots such as a bar chart, box plot, or dot plot. Research has shown that it is difficult to compare the different sections of a pie chart, or to compare data across different pie charts.

Frequency Distribution Table

A frequency distribution table summarises the values in a data set and how often each occurs. The table has two columns: the first lists the various outcomes in the data, and the second lists the frequency of each outcome. Putting data into a table of this kind makes it easier to understand and analyze.

For example: To create a frequency distribution table, we first list all the outcomes in the data. In this example, the outcomes are 0 runs, 1 run, 2 runs, and 3 runs. We list these values in numerical order in the first column. Next, we count how many times each outcome happened. The team scored 0 runs in the 1st, 4th, 7th, and 8th innings, 1 run in the 2nd, 5th, and 9th innings, 2 runs in the 6th inning, and 3 runs in the 3rd inning. We enter the frequency of each outcome in the second column. The table is a much more useful way to show this data.

Baseball Team Runs Per Inning

Number of Runs   Frequency
0                4
1                3
2                1
3                1
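The counting procedure described above can be sketched in Java. This is only an illustration: the class name FrequencyTableSketch and the method frequencies are hypothetical, and the array of runs is simply the nine innings from the example written out in order.

```java
import java.util.Map;
import java.util.TreeMap;

class FrequencyTableSketch {
    // Counts how often each outcome occurs.
    // A TreeMap keeps the outcomes in numerical order, like the first column of the table.
    static Map<Integer, Integer> frequencies(int[] outcomes) {
        Map<Integer, Integer> table = new TreeMap<>();
        for (int outcome : outcomes) {
            table.merge(outcome, 1, Integer::sum); // add 1 to the count for this outcome
        }
        return table;
    }

    public static void main(String[] args) {
        // Runs scored in innings 1 through 9, taken from the example above.
        int[] runsPerInning = {0, 1, 3, 0, 1, 2, 0, 0, 1};
        System.out.println("Number of Runs   Frequency");
        frequencies(runsPerInning).forEach((runs, count) ->
                System.out.println(runs + "                " + count));
    }
}
```

The frequencies always add up to the number of observations (here 9 innings), which is a quick check that nothing was missed.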

Sample Questions

Question 1: The school fee submission status of 10 students of class 10th is given below:

To draw the bar graph for the data above, we prepare the frequency table given below.

Fee submission   No. of Students
Paid             6
Not paid         4

Now we represent the data using a bar graph. It can be drawn by following the steps given below:

Step 1: Draw the two axes of the graph, the X-axis and the Y-axis. Put the categories of the data on the X-axis (the horizontal line) and the frequencies on the Y-axis (the vertical line).

Step 2: Give the Y-axis a numeric scale. It should start from zero and end at or above the highest value in the data.

Step 3: Choose a suitable interval for the scale, such as 0, 1, 2, 3, … or 0, 10, 20, 30, … or 0, 20, 40, 60, …

Step 4: Label the X-axis appropriately.

Step 5: Draw the bars according to the data, keeping in mind that all the bars should be of the same width and that there should be the same distance between adjacent bars.

Question 2: Consider the following pie chart, which shows the money spent by Megha at the funfair. Each colour indicates the amount paid for one item. The total value of the data is 15, and the amount paid for each item is as follows:

Chocolates – 3

Wafers – 3

Toys – 2

Rides – 7

To convert these amounts into pie chart percentages, we apply the formula:

(Frequency / Total Frequency) × 100

Converting the above data into percentages:

Amount paid on rides: (7/15) × 100 ≈ 47%

Amount paid on toys: (2/15) × 100 ≈ 13%

Amount paid on wafers: (3/15) × 100 = 20%

Amount paid on chocolates: (3/15) × 100 = 20%
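As a rough sketch, the same conversion can be written in Java. The class and method names are hypothetical. The centralAngle method is an addition to this example: it applies the same fraction to the 360° of a full circle to give the angle of each slice.

```java
class PieChartSketch {
    // Share of the whole, as a percentage: (frequency / total) × 100.
    static double percentage(int frequency, int total) {
        return 100.0 * frequency / total;
    }

    // Central angle of the slice in degrees: (frequency / total) × 360.
    static double centralAngle(int frequency, int total) {
        return 360.0 * frequency / total;
    }

    public static void main(String[] args) {
        String[] items = {"Chocolates", "Wafers", "Toys", "Rides"};
        int[] amounts = {3, 3, 2, 7};
        int total = 0;
        for (int amount : amounts) {
            total += amount; // 15 for the funfair data
        }
        for (int i = 0; i < items.length; i++) {
            System.out.println(items[i] + ": "
                    + Math.round(percentage(amounts[i], total)) + "%, "
                    + Math.round(centralAngle(amounts[i], total)) + " degrees");
        }
    }
}
```

Rounding is applied only when printing, so the unrounded percentages still sum to 100.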

Question 3: The line graph given below shows how Devdas’s height changes as he grows. Observe the graph and answer the questions below.

[Figure: line graph of Devdas’s height in inches against his age in years]

(i) What was the height of Devdas at 8 years?
Answer: 65 inches

(ii) What was the height of Devdas at 6 years?
Answer: 50 inches

(iii) What was the height of Devdas at 2 years?
Answer: 35 inches

(iv) How much has Devdas grown from 2 to 8 years?
Answer: 30 inches

(v) When was Devdas 35 inches tall?
Answer: 2 years


Reading 10: Abstract Data Types

Software in 6.031.

Today’s class introduces two ideas:

  • Abstract data types
  • Representation independence

In this reading, we look at a powerful idea, abstract data types, which enable us to separate how we use a data structure in a program from the particular form of the data structure itself.

Abstract data types address a particularly dangerous problem: clients making assumptions about the type’s internal representation. We’ll see why this is dangerous and how it can be avoided. We’ll also discuss the classification of operations, and some principles of good design for abstract data types.

Access control in Java

You should already have read: Controlling Access to Members of a Class in the Java Tutorials.

reading exercises

The following questions use the code below. Study it first, then answer the questions.

Suppose the program has paused after running the line marked /*A*/ but before reaching /*B*/. A partial snapshot diagram of its internal state is shown at the right, with numbered gray boxes as placeholders for you to fill in. What should each of those boxes be?


What abstraction means

Abstract data types are an instance of a general principle in software engineering, which goes by many names with slightly different shades of meaning. Here are some of the names that are used for this idea:

  • Abstraction. Omitting or hiding low-level details with a simpler, higher-level idea.
  • Modularity. Dividing a system into components or modules, each of which can be designed, implemented, tested, reasoned about, and reused separately from the rest of the system.
  • Encapsulation. Building a wall around a module (a hard shell or capsule) so that the module is responsible for its own internal behavior, and bugs in other parts of the system can’t damage its integrity.
  • Information hiding. Hiding details of a module’s implementation from the rest of the system, so that those details can be changed later without changing the rest of the system.
  • Separation of concerns. Making a feature (or “concern”) the responsibility of a single module, rather than spreading it across multiple modules.

As a software engineer, you should know these terms, because you will run into them frequently. The fundamental purpose of all of these ideas is to help achieve the three important properties that we care about in 6.031: safety from bugs, ease of understanding, and readiness for change.

We have in fact already encountered some of these ideas in previous classes, in the context of writing methods that take inputs and produce outputs:

  • Abstraction: A spec is an abstraction in that the client only has to understand its preconditions and postconditions to use it, not the full internal behavior of the implementation.
  • Modularity: Unit testing and specs help make methods into modules.
  • Encapsulation: The local variables of a method are encapsulated, since only the method itself can use or modify them. Contrast with global variables, which are quite the opposite, or local variables pointing to mutable objects that have aliases, which also threaten encapsulation.
  • Information hiding: A spec uses information-hiding to leave the implementer some freedom in how the method is implemented.
  • Separation of concerns: A spec is coherent if it is responsible for just one concern.

Starting with today’s class, we’re going to move beyond abstractions for methods, and look at abstractions for data as well. But we’ll see that methods will still play a crucial role in how we describe data abstraction.

User-defined types

In the early days of computing, a programming language came with built-in types (such as integers, booleans, strings, etc.) and built-in procedures, e.g., for input and output. Users could define their own procedures: that’s how large programs were built.

A major advance in software development was the idea of abstract types: that one could design a programming language to allow user-defined types, too. This idea came out of the work of many researchers, notably Dahl (the inventor of the Simula language), Hoare (who developed many of the techniques we now use to reason about abstract types), Parnas (who coined the term information hiding and first articulated the idea of organizing program modules around the secrets they encapsulated), and here at MIT, Barbara Liskov and John Guttag, who did seminal work in the specification of abstract types, and in programming language support for them – and developed the original 6.170, the predecessor to 6.005, predecessor to 6.031. Barbara Liskov earned the Turing Award, computer science’s equivalent of the Nobel Prize, for her work on abstract types.

The key idea of data abstraction is that a type is characterized by the operations you can perform on it. A number is something you can add and multiply; a string is something you can concatenate and take substrings of; a boolean is something you can negate, and so on. In a sense, users could already define their own types in early programming languages: you could create a record type date, for example, with integer fields for day, month, and year. But what made abstract types new and different was the focus on operations: the user of the type would not need to worry about how its values were actually stored, in the same way that a programmer can ignore how the compiler actually stores integers. All that matters is the operations.

In Java, as in many modern programming languages, the separation between built-in types and user-defined types is a bit blurry. The classes in java.lang, such as Integer and Boolean, are built-in in the sense that the Java language specification requires them to exist and behave in a certain way, but they are defined using the same class/object abstraction as user-defined types. But Java complicates the issue by having primitive types that are not objects. The set of these types, such as int and boolean, cannot be extended by the user.

Consider an abstract data type Bool. The type has the following operations:

  • true : Bool
  • false : Bool
  • and : Bool × Bool → Bool
  • or : Bool × Bool → Bool
  • not : Bool → Bool

… where the first two operations construct the two values of the type, and the last three operations have the usual meanings of logical and, logical or, and logical not on those values.

Which of the following are possible ways that Bool might be implemented, and still be able to satisfy the specs of the operations? Choose all that apply.

Classifying types and operations

Types, whether built-in or user-defined, can be classified as mutable or immutable. The objects of a mutable type can be changed: that is, they provide operations which when executed cause the results of other operations on the same object to give different results. So Date is mutable, because you can call setMonth and observe the change with the getMonth operation. But String is immutable, because its operations create new String objects rather than changing existing ones. Sometimes a type will be provided in two forms, a mutable and an immutable form. StringBuilder, for example, is a mutable version of String (although the two are certainly not the same Java type, and are not interchangeable).
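A small sketch may make the distinction concrete. The class and method names here are hypothetical; String.concat, StringBuilder.append, and toString are the standard library calls.

```java
class MutabilitySketch {
    // String is immutable: concat returns a new String and leaves s unchanged.
    static String immutableDemo() {
        String s = "abc";
        s.concat("d"); // result deliberately ignored to show that s itself does not change
        return s;      // still "abc"
    }

    // StringBuilder is mutable: append changes the object itself,
    // so a later toString() on the same object gives a different result.
    static String mutableDemo() {
        StringBuilder sb = new StringBuilder("abc");
        sb.append("d");
        return sb.toString(); // "abcd"
    }

    public static void main(String[] args) {
        System.out.println(immutableDemo());
        System.out.println(mutableDemo());
    }
}
```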

The operations of an abstract type are classified as follows:

  • Creators create new objects of the type. A creator may take an object as an argument, but not an object of the type being constructed.
  • Producers create new objects from old objects of the type. The concat method of String, for example, is a producer: it takes two strings and produces a new one representing their concatenation.
  • Observers take objects of the abstract type and return objects of a different type. The size method of List, for example, returns an int.
  • Mutators change objects. The add method of List, for example, mutates a list by adding an element to the end.

We can summarize these distinctions schematically like this (explanation to follow):

  • creator : t* → T
  • producer : T+, t* → T
  • observer : T+, t* → t
  • mutator : T+, t* → void | t | T

These show informally the shape of the signatures of operations in the various classes. Each T is the abstract type itself; each t is some other type. The + marker indicates that the type may occur one or more times in that part of the signature, and the * marker indicates that it occurs zero or more times. | indicates or. For example, a producer may take two values of the abstract type T, like String.concat() does:

  • concat : String × String → String

Some observers take zero arguments of other types t, such as:

  • size : List → int

… and others take several:

  • regionMatches : String × boolean × int × String × int × int → boolean

A creator operation is often implemented as a constructor, like new ArrayList(). But a creator can simply be a static method instead, like Arrays.asList(). A creator implemented as a static method is often called a factory method. The various String.valueOf methods in Java are other examples of creators implemented as factory methods.

Mutators are often signaled by a void return type. A method that returns void must be called for some kind of side-effect, since it doesn’t otherwise return anything. But not all mutators return void. For example, Set.add() returns a boolean that indicates whether the set was actually changed. In Java’s graphical user interface toolkit, Component.add() returns the object itself, so that multiple add() calls can be chained together.
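As an illustration, the short sketch below exercises one operation of each kind using standard library types. The class name OperationKindsSketch and the method describe are hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OperationKindsSketch {
    static String describe() {
        List<String> list = new ArrayList<>(); // creator: a constructor
        list.add("a");                         // mutator: changes the list
        list.add("b");
        int n = list.size();                   // observer: List -> int

        String s = String.valueOf(n);          // creator: a static factory method
        String t = s.concat("!");              // producer: String x String -> String

        Set<String> set = new HashSet<>();
        boolean first = set.add("x");          // mutator returning boolean: set changed
        boolean second = set.add("x");         // "x" already present: set not changed

        return t + " " + first + " " + second;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```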

Abstract data type examples

Here are some examples of abstract data types, along with some of their operations, grouped by kind.

int is Java’s primitive integer type. int is immutable, so it has no mutators.

  • creators: the numeric literals 0 , 1 , 2 , …
  • producers: arithmetic operators + , - , * , /
  • observers: comparison operators == , != , < , >
  • mutators: none (it’s immutable)

List is Java’s list type. List is mutable. List is also an interface, which means that other classes provide the actual implementation of the data type. These classes include ArrayList and LinkedList.

  • creators: ArrayList and LinkedList constructors, Collections.singletonList
  • producers: Collections.unmodifiableList
  • observers: size , get
  • mutators: add , remove , addAll , Collections.sort

String is Java’s string type. String is immutable.

  • creators: String constructors, valueOf static methods
  • producers: concat , substring , toUpperCase
  • observers: length , charAt

This classification gives some useful terminology, but it’s not perfect. In complicated data types, there may be an operation that is both a producer and a mutator, for example. Some people reserve the term producer only for operations that do no mutation.

Each of the methods below is an operation on an abstract data type from the Java library. Click on the link to look at its documentation. Think about the operation’s type signature. Then classify the operation.

Hints: pay attention to whether the type itself appears as a parameter or return value. And remember that instance methods (lacking the static keyword) have an implicit parameter.

Integer.valueOf()

BigInteger.mod()

List.addAll()

String.toUpperCase()

Set.contains()

Map.keySet()

BufferedReader.readLine()

An abstract type is defined by its operations

The essential idea here is that an abstract data type is defined by its operations.

The set of operations for a type T, along with their specifications, fully characterize what we mean by T. So, for example, when we talk about the List type, what we mean is not a linked list or an array or any other specific data structure for representing a list. Instead we mean a set of opaque values – the possible objects that can have List type – that satisfy the specifications of all the operations of List: get(), size(), etc.

The values of an abstract type are opaque in the sense that a client can’t examine the data stored inside them, except as permitted by operations. Expanding our metaphor of a specification firewall, you might picture values of an abstract type as hard shells, hiding not just the implementation of an individual function, but of a set of related functions (the operations of the type) and the data they share (the private fields stored inside values of the type).

Designing an abstract type

Designing an abstract type involves choosing good operations and determining how they should behave. Here are a few rules of thumb.

It’s better to have a few, simple operations that can be combined in powerful ways, rather than lots of complex operations.

Each operation should have a well-defined purpose, and should have a coherent behavior rather than a panoply of special cases. We probably shouldn’t add a sum operation to List, for example. It might help clients who work with lists of integers, but what about lists of strings? Or nested lists? All these special cases would make sum a hard operation to understand and use.

The set of operations should be adequate in the sense that there must be enough to do the kinds of computations clients are likely to want to do. A good test is to check that every property of an object of the type can be extracted. For example, if there were no get operation, we would not be able to find out what the elements of a list are. Basic information should not be inordinately difficult to obtain. For example, the size method is not strictly necessary for List, because we could apply get on increasing indices until we get a failure, but this is inefficient and inconvenient.

The type may be generic: a list or a set, or a graph, for example. Or it may be domain-specific: a street map, an employee database, a phone book, etc. But it should not mix generic and domain-specific features. A Deck type intended to represent a sequence of playing cards shouldn’t have a generic add method that accepts arbitrary objects like integers or strings. Conversely, it wouldn’t make sense to put a domain-specific method like dealCards into the generic type List.

Critically, a good abstract data type should be representation independent. This means that the use of an abstract type is independent of its representation (the actual data structure or data fields used to implement it), so that changes in representation have no effect on code outside the abstract type itself. For example, the operations offered by List are independent of whether the list is represented as a linked list or as an array.
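The List case can be sketched in a few lines. The client method totalLength below is a hypothetical example that uses only the size and get operations, so it gives the same answer whichever implementation is passed in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class RepIndependenceSketch {
    // Client code written against List's operations only.
    // It never asks how the list is stored.
    static int totalLength(List<String> words) {
        int total = 0;
        for (int i = 0; i < words.size(); i++) {
            total += words.get(i).length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> asArray = new ArrayList<>(Arrays.asList("ab", "cde"));
        List<String> asLinked = new LinkedList<>(Arrays.asList("ab", "cde"));
        // Same client, two representations, same observable result.
        System.out.println(totalLength(asArray) + " " + totalLength(asLinked));
    }
}
```

The two representations may differ in performance (get by index is slower on a linked list), but not in the results the client observes.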

You won’t be able to change the representation of an ADT at all unless its operations are fully specified with preconditions and postconditions, so that clients know what to depend on, and you know what you can safely change.

Example: different representations for strings

Let’s look at a simple abstract data type to see what representation independence means and why it’s useful. The MyString type below has far fewer operations than the real Java String, and their specs are a little different, but it’s still illustrative. Here are the specs for the ADT:

These public operations and their specifications are the only information that a client of this data type is allowed to know. But implementing the data type requires a representation. For now, let’s look at a simple representation for MyString: just an array of characters, exactly the length of the string, with no extra room at the end. Here’s how that internal representation would be declared, as an instance variable within the class:

With that choice of representation, the operations would be implemented in a straightforward way:
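A minimal sketch of how the array representation and the four operations (valueOf, length, charAt, substring) might be written is given here. The class name SimpleMyString, the preconditions stated in the comments, and the method bodies are assumptions made for illustration; they are not necessarily the course's own listing.

```java
/** Sketch of an immutable string type represented by an exactly-sized char array. */
class SimpleMyString {
    // Representation (assumed): the characters of the string, with no extra room at the end.
    private char[] a;

    private SimpleMyString() {}

    /** Creator: returns the string "true" if b is true, otherwise "false". */
    static SimpleMyString valueOf(boolean b) {
        SimpleMyString s = new SimpleMyString();
        s.a = b ? new char[] {'t', 'r', 'u', 'e'}
                : new char[] {'f', 'a', 'l', 's', 'e'};
        return s;
    }

    /** Observer: the number of characters in this string. */
    int length() {
        return a.length;
    }

    /** Observer: the character at index i. Assumed precondition: 0 <= i < length(). */
    char charAt(int i) {
        return a[i];
    }

    /**
     * Producer: the characters from start up to but not including end.
     * Assumed precondition: 0 <= start <= end <= length().
     */
    SimpleMyString substring(int start, int end) {
        SimpleMyString that = new SimpleMyString();
        that.a = new char[end - start];
        System.arraycopy(this.a, start, that.a, 0, end - start); // copy into a fresh array
        return that;
    }
}
```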

(The ?: syntax in valueOf is called the ternary conditional operator and it’s a shorthand if-else statement. See The Conditional Operators in the Java Tutorials.)

Question to ponder: Why don’t charAt and substring have to check whether their parameters are within the valid range? What do you think will happen if the client calls these implementations with illegal inputs?

Here’s a snapshot diagram showing what this representation looks like for a couple of typical client operations:

One problem with this implementation is that it’s passing up an opportunity for performance improvement. Because this data type is immutable, the substring operation doesn’t really have to copy characters out into a fresh array. It could just point to the original MyString object’s character array and keep track of the start and end that the new substring object represents. The String implementation in some versions of Java does this.

To implement this optimization, we could change the internal representation of this class to:

With this new representation, the operations are now implemented like this:
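One way this shared-array representation might look is sketched here. The class name FastMyString, the field names a, start, and end, and the preconditions in the comments are assumptions for illustration rather than the course's own listing.

```java
/** Sketch of an immutable string type whose substrings share one character array. */
class FastMyString {
    // Representation (assumed): this string is the characters a[start] .. a[end-1].
    private char[] a;
    private int start;
    private int end;

    private FastMyString() {}

    /** Creator: returns the string "true" if b is true, otherwise "false". */
    static FastMyString valueOf(boolean b) {
        FastMyString s = new FastMyString();
        s.a = b ? new char[] {'t', 'r', 'u', 'e'}
                : new char[] {'f', 'a', 'l', 's', 'e'};
        s.start = 0;
        s.end = s.a.length;
        return s;
    }

    /** Observer: the number of characters in this string. */
    int length() {
        return end - start;
    }

    /** Observer: the character at index i. Assumed precondition: 0 <= i < length(). */
    char charAt(int i) {
        return a[start + i];
    }

    /**
     * Producer: the characters from start up to but not including end.
     * Assumed precondition: 0 <= start <= end <= length().
     * No characters are copied; the new object points at the same array.
     */
    FastMyString substring(int start, int end) {
        FastMyString that = new FastMyString();
        that.a = this.a;
        that.start = this.start + start;
        that.end = this.start + end;
        return that;
    }
}
```

Sharing the array is safe only because the type is immutable: no operation ever writes to a after it has been created.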

Now the same client code produces a very different internal structure:

Because MyString’s existing clients depend only on the specs of its public methods, not on its private fields, we can make this change without having to inspect and change all that client code. That’s the power of representation independence.

Consider the following abstract data type.

Here is a client of this abstract data type:

Assume all this code works correctly (both Family and client1 ) and passes all its tests.

Now Family ’s representation is changed from a List to Set , as shown:

Assume that Family compiles correctly after the change.

Now consider client2 :

Now consider client3 :

For each section of the Family data type’s code shown below, is it part of the ADT’s specification, its representation, or its implementation?

Realizing ADT concepts in Java

Let’s summarize some of the general ideas we’ve discussed in this reading, which are applicable in general to programming in any language, and their specific realization using Java language features. The point is that there are several ways to do it, and it’s important to both understand the big idea, like a creator operation, and different ways to achieve that idea in practice.

There are three items in this table that haven’t yet been discussed in this reading:

  • Defining an abstract data type using an interface. We’ve seen List and ArrayList as an example, and we’ll discuss interfaces in a future reading.
  • Defining an abstract data type using an enumeration ( enum ). Enums are ideal for ADTs that have a small fixed set of values, like the days of the week Monday, Tuesday, etc. We’ll discuss enumerations in a future reading.
  • Using a constant object as a creator operation. This pattern is commonly seen in immutable types, where the simplest or emptiest value of the type is simply a public constant, and producers are used to build up more complex values from it.

Testing an abstract data type

We build a test suite for an abstract data type by creating tests for each of its operations. These tests inevitably interact with each other. The only way to test creators, producers, and mutators is by calling observers on the objects that result, and likewise, the only way to test observers is by creating objects for them to observe.

Here’s how we might partition the input spaces of the four operations in our MyString type:

Now we want test cases that cover these partitions. Note that writing test cases that use assertEquals directly on MyString objects wouldn’t work, because we don’t have an equality operation defined on MyString . We’ll talk about how to implement equality carefully in a later reading. For now, the only operations we can perform with MyStrings are the ones we’ve defined above: valueOf , length , charAt , and substring .

Given that constraint, a compact test suite that covers all these partitions might look like:

Try to match each test case to the partitions it covers.

Which test cases cover the part “ charAt() with string len = 1”?

Which test cases cover the part “substring() of string produced by substring()”?

Which test cases cover the part “valueOf(true)”?

What “unit” is being tested by testValueOfTrue ?

  • Abstract data types are characterized by their operations.
  • Operations can be classified into creators, producers, observers, and mutators.
  • An ADT’s specification is its set of operations and their specs.
  • A good ADT is simple, coherent, adequate, and representation-independent.
  • An ADT is tested by generating tests for each of its operations, but using the creators, producers, mutators, and observers together in the same tests.

These ideas connect to our three key properties of good software as follows:

Safe from bugs. A good ADT offers a well-defined contract for a data type, so that clients know what to expect from the data type, and implementers have well-defined freedom to vary.

Easy to understand. A good ADT hides its implementation behind a set of simple operations, so that programmers using the ADT only need to understand the operations, not the details of the implementation.

Ready for change. Representation independence allows the implementation of an abstract data type to change without requiring changes from its clients.

Data representation

Integer representation, fixed-point representation, floating-point representation, IEEE 754 standard sizes, example floating-point conversions.

Data representation 1: Introduction

This course investigates how systems software works: what makes programs work fast or slow, and how properties of the machines we program impact the programs we write. We discuss both general ideas and specific tools, and take an experimental approach.

Textbook readings

  • How do computers represent different kinds of information?
  • How do data representation choices impact performance and correctness?
  • What kind of language is understood by computer processors?
  • How is code you write translated to code a processor runs?
  • How do hardware and software defend against bugs and attacks?
  • How are operating systems interfaces implemented?
  • What kinds of computer data storage are available, and how do they perform?
  • How can we improve the performance of a system that stores data?
  • How can programs on the same computer cooperate and interact?
  • What kinds of operating systems interfaces are useful?
  • How can a single program safely use multiple processors?
  • How can multiple computers safely interact over a network?
  • Six problem sets
  • Midterm and final
  • Starting mid-next week
  • Attendance checked for simultaneously-enrolled students
  • Rough breakdown: >50% assignments, <35% tests, 15% participation
  • Course grading: A means mastery

Collaboration

Discussion, collaboration, and the exchange of ideas are essential to doing academic work, and to engineering. You are encouraged to consult with your classmates as you work on problem sets. You are welcome to discuss general strategies for solutions as well as specific bugs and code structure questions, and to use Internet resources for general information.

However, the work you turn in must be your own—the result of your own efforts. You should understand your code well enough that you could replicate your solution from scratch, without collaboration.

In addition, you must cite any books, articles, online resources, and so forth that helped you with your work, using appropriate citation practices; and you must list the names of students with whom you have collaborated on problem sets and briefly describe how you collaborated. (You do not need to list course staff.)

On our programming language

We use the C++ programming language in this class.

C++ is a boring, old, and unsafe programming language, but boring languages are underrated. C++ offers several important advantages for this class, including ubiquitous availability, good tooling, the ability to demonstrate impactful kinds of errors that you should understand, and a good standard library of data structures.

Pset 0 links to several C++ tutorials and references, and to a textbook.

Each program runs in a private data storage space. This is called its memory . The memory “remembers” the data it stores.

Programs work by manipulating values . Different programming languages have different conceptions of value; in C++, the primitive values are integers, like 12 or -100; floating-point numbers, like 1.02; and pointers , which are references to other objects.

An object is a region of memory that contains a value. (The C++ standard specifically says “a region of data storage in the execution environment, the contents of which can represent values”.)

Objects, values, and variables

Which are the objects? Which are the values?

Variables generally correspond to objects, and here there are three objects, one for each variable i1 , i2 , and i3 . The compiler and operating system associate the names with their corresponding objects. There are three values, too, one used to initialize each object: 61 , 62 , and 63 . However, there are other values—for instance, each argument to the printf calls is a value.

What does the program print?

i1: 61 i2: 62 i3: 63

C and C++ pointer types allow programs to access objects indirectly. A pointer value is the address of another object. For instance, in this program, the variable i4 holds a pointer to the object named by i3 :

There are four objects, corresponding to variables i1 through i4 . Note that the i4 object holds a pointer value, not an integer. There are also four values: 61 , 62 , 63 , and the expression &i3 (the address of i3 ). Note that there are three integer values, but four values overall.

What does this program print?

i1: 61 i2: 62 i3: 63 value pointed to by i4: 63

Here, the expressions i3 and *i4 refer to exactly the same object. Any modification to i3 can be observed through *i4 and vice versa. We say that i3 and *i4 are aliases : different names for the same object.

We now use hexdump_object , a helper function declared in our hexdump.hh helper file , to examine both the contents and the addresses of these objects.

Exactly what is printed will vary between operating systems and compilers. In Docker in class, on my Apple-silicon MacBook, we saw:

But on an Intel-based Amazon EC2 native Linux machine:

The data bytes look similar—identical for i1 through i3 —but the addresses vary.

But on Intel Mac OS X:

    103c63020     3d 00 00 00              |=...|
    103c5ef60     3e 00 00 00              |>...|
    7ffeebfa4abc  3f 00 00 00              |?...|
    7ffeebfa4ab0  bc 4a fa eb fe 7f 00 00  |.J......|

And on Docker on an Intel Mac:

    56499f239010  3d 00 00 00              |=...|
    56499f23701c  3e 00 00 00              |>...|
    7fffebf8b19c  3f 00 00 00              |?...|
    7fffebf8b1a0  9c b1 f8 eb ff 7f 00 00  |........|

A hexdump printout shows the following information on each line.

  • An address , like 4000004010 . This is a hexadecimal (base-16) number indicating the value of the address of the object. A line contains one to sixteen bytes of memory starting at this address.
  • The contents of memory starting at the given address, such as 3d 00 00 00 . Memory is printed as a sequence of bytes , which are 8-bit numbers between 0 and 255. All modern computers organize their memory in units of 8-bit bytes.
  • A textual representation of the memory contents, such as |=...| . This is useful when examining memory that contains textual data, and looks like random garbage otherwise.

Dynamic allocation

Must every data object be given a name? No! In C++, the new operator allocates a brand-new object with no variable name. (In C, the malloc function does the same thing.) The C++ expression new T returns a pointer to a brand-new, never-before-seen object of type T . For instance:

This prints something like

The new int{64} expression allocates a fresh object with no name of its own, though it can be located by following the i4 pointer.

What do you notice about the addresses of these different objects?

  • i3 and i4 , which are objects corresponding to variables declared local to main , are located very close to one another. In fact they are just 4 bytes apart: i3 directly abuts i4 . Their addresses are quite high. In native Linux, in fact, their addresses are close to 2^47!
  • i1 and i2 are at much lower addresses, and they do not abut. i2 ’s location is below i1 , and about 0x2000 bytes away.
  • The anonymous storage allocated by new int is located between i1 / i2 and i3 / i4 .

Although the values may differ on other operating systems, you’ll see qualitatively similar results wherever you run ./objects .

What’s happening is that the operating system and compiler have located different kinds of object in different broad regions of memory. These regions are called segments , and they are important because objects’ different storage characteristics benefit from different treatment.

i2 , the const int global object, has the smallest address. It is in the code or text segment, which is also used for read-only global data. The operating system and hardware ensure that data in this segment is not changed during the lifetime of the program. Any attempt to modify data in the code segment will cause a crash.

i1 , the int global object, has the next highest address. It is in the data segment, which holds modifiable global data. This segment keeps the same size as the program runs.

After a jump, the anonymous new int object pointed to by i4 has the next highest address. This is the heap segment, which holds dynamically allocated data. This segment can grow as the program runs; it typically grows towards higher addresses.

After a larger jump, the i3 and i4 objects have the highest addresses. They are in the stack segment, which holds local variables. This segment can also grow as the program runs, especially as functions call other functions; in most processors it grows down , from higher addresses to lower addresses.

Experimenting with the stack

How can we tell that the stack grows down? Do all functions share a single stack? This program uses a recursive function to test. Try running it; what do you see?

Reading 12: Abstract Data Types

Software in 6.005.

Today’s class introduces two ideas:

  • Abstract data types
  • Representation independence

In this reading, we look at a powerful idea, abstract data types, which enable us to separate how we use a data structure in a program from the particular form of the data structure itself.

Abstract data types address a particularly dangerous problem: clients making assumptions about the type’s internal representation. We’ll see why this is dangerous and how it can be avoided. We’ll also discuss the classification of operations, and some principles of good design for abstract data types.

Access Control in Java

You should already have read: Controlling Access to Members of a Class in the Java Tutorials.

What Abstraction Means

Abstract data types are an instance of a general principle in software engineering, which goes by many names with slightly different shades of meaning. Here are some of the names that are used for this idea:

  • Abstraction. Omitting or hiding low-level details with a simpler, higher-level idea.
  • Modularity. Dividing a system into components or modules, each of which can be designed, implemented, tested, reasoned about, and reused separately from the rest of the system.
  • Encapsulation. Building walls around a module (a hard shell or capsule) so that the module is responsible for its own internal behavior, and bugs in other parts of the system can’t damage its integrity.
  • Information hiding. Hiding details of a module’s implementation from the rest of the system, so that those details can be changed later without changing the rest of the system.
  • Separation of concerns. Making a feature (or “concern”) the responsibility of a single module, rather than spreading it across multiple modules.

As a software engineer, you should know these terms, because you will run into them frequently. The fundamental purpose of all of these ideas is to help achieve the three important properties that we care about in 6.005: safety from bugs, ease of understanding, and readiness for change.

User-Defined Types

In the early days of computing, a programming language came with built-in types (such as integers, booleans, strings, etc.) and built-in procedures, e.g., for input and output. Users could define their own procedures: that’s how large programs were built.

A major advance in software development was the idea of abstract types: that one could design a programming language to allow user-defined types, too. This idea came out of the work of many researchers, notably Dahl (the inventor of the Simula language), Hoare (who developed many of the techniques we now use to reason about abstract types), Parnas (who coined the term information hiding and first articulated the idea of organizing program modules around the secrets they encapsulated), and here at MIT, Barbara Liskov and John Guttag, who did seminal work in the specification of abstract types, and in programming language support for them – and developed the original 6.170, the predecessor to 6.005. Barbara Liskov earned the Turing Award, computer science’s equivalent of the Nobel Prize, for her work on abstract types.

The key idea of data abstraction is that a type is characterized by the operations you can perform on it. A number is something you can add and multiply; a string is something you can concatenate and take substrings of; a boolean is something you can negate, and so on. In a sense, users could already define their own types in early programming languages: you could create a record type date, for example, with integer fields for day, month, and year. But what made abstract types new and different was the focus on operations: the user of the type would not need to worry about how its values were actually stored, in the same way that a programmer can ignore how the compiler actually stores integers. All that matters is the operations.

In Java, as in many modern programming languages, the separation between built-in types and user-defined types is a bit blurry. The classes in java.lang, such as Integer and Boolean, are built-in; whether you regard all the collections of java.util as built-in is less clear (and not very important anyway). Java complicates the issue by having primitive types that are not objects. The set of these types, such as int and boolean, cannot be extended by the user.

Classifying Types and Operations

Types, whether built-in or user-defined, can be classified as mutable or immutable . The objects of a mutable type can be changed: that is, they provide operations which when executed cause the results of other operations on the same object to give different results. So Date is mutable, because you can call setMonth and observe the change with the getMonth operation. But String is immutable, because its operations create new String objects rather than changing existing ones. Sometimes a type will be provided in two forms, a mutable and an immutable form. StringBuilder , for example, is a mutable version of String (although the two are certainly not the same Java type, and are not interchangeable).
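The difference between the two forms can be seen in a few lines. The following is a minimal sketch using only String and StringBuilder from the standard library; the class name MutabilityDemo and its two helper methods are hypothetical names chosen for this illustration.

```java
class MutabilityDemo {
    // Returns an upper-cased copy. The argument itself is never changed,
    // because String offers no mutators.
    static String upperCopy(String s) {
        return s.toUpperCase();
    }

    // Appends to the builder in place. The caller observes the change
    // through its own reference to the same object.
    static void appendInPlace(StringBuilder sb, String suffix) {
        sb.append(suffix);
    }

    public static void main(String[] args) {
        String s = "abc";
        String t = upperCopy(s);
        System.out.println(s + " / " + t);   // s still holds "abc"

        StringBuilder sb = new StringBuilder("abc");
        appendInPlace(sb, "def");
        System.out.println(sb);              // the same object now holds "abcdef"
    }
}
```

Calling an observer such as length() on sb before and after the append gives different answers, which is exactly the property that makes a type mutable in the sense described above.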

The operations of an abstract type are classified as follows:

  • Creators create new objects of the type. A creator may take an object as an argument, but not an object of the type being constructed.
  • Producers create new objects from old objects of the type. The concat method of String , for example, is a producer: it takes two strings and produces a new one representing their concatenation.
  • Observers take objects of the abstract type and return objects of a different type. The size method of List , for example, returns an int .
  • Mutators change objects. The add method of List , for example, mutates a list by adding an element to the end.

We can summarize these distinctions schematically like this (explanation to follow):

  • creator : t* → T
  • producer : T+, t* → T
  • observer : T+, t* → t
  • mutator : T+, t* → void | t | T

These show informally the shape of the signatures of operations in the various classes. Each T is the abstract type itself; each t is some other type. The + marker indicates that the type may occur one or more times in that part of the signature, and the * marker indicates that it occurs zero or more times. | indicates or. For example, a producer may take two values of the abstract type T , like String.concat() does:

  • concat : String × String → String

Some observers take zero arguments of other types t , such as:

  • size : List → int

… and others take several:

  • regionMatches : String × boolean × int × String × int × int → boolean

A creator operation is often implemented as a constructor , like new ArrayList() . But a creator can simply be a static method instead, like Arrays.asList() . A creator implemented as a static method is often called a factory method . The various String.valueOf methods in Java are other examples of creators implemented as factory methods.

Mutators are often signaled by a void return type. A method that returns void must be called for some kind of side-effect, since otherwise it doesn’t return anything. But not all mutators return void. For example, Set.add() returns a boolean that indicates whether the set was actually changed. In Java’s graphical user interface toolkit, Component.add() returns the object itself, so that multiple add() calls can be chained together .
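To make the four kinds of operation concrete, here is a small sketch that exercises one or two of each using only standard-library methods mentioned above. The class name OperationKinds and the describe method are hypothetical, introduced only so the example can be run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OperationKinds {
    static String describe() {
        // Creators: a constructor and a static factory method.
        List<String> names = new ArrayList<>();
        List<Integer> fixed = Arrays.asList(1, 2, 3);

        // Producer: builds a new String from two existing Strings.
        String joined = "data".concat("type");

        // Observers: take the abstract type, return some other type.
        int count = fixed.size();
        int len = joined.length();

        // Mutators: List.add and Set.add both change the collection;
        // Set.add also reports whether the set actually changed.
        names.add("Ada");
        Set<String> seen = new HashSet<>();
        boolean first = seen.add("Ada");
        boolean second = seen.add("Ada");

        return joined + " " + count + " " + len + " "
                + first + " " + second + " " + names;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The second call to seen.add returns false because the element is already present, which illustrates a mutator whose return type is not void.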

Abstract Data Type Examples

Here are some examples of abstract data types, along with some of their operations, grouped by kind.

int is Java’s primitive integer type. int is immutable, so it has no mutators.

  • creators: the numeric literals 0 , 1 , 2 , …
  • producers: arithmetic operators + , - , * , /
  • observers: comparison operators == , != , < , >
  • mutators: none (it’s immutable)

List is Java’s list type. List is mutable. List is also an interface, which means that other classes provide the actual implementation of the data type. These classes include ArrayList and LinkedList .

  • creators: ArrayList and LinkedList constructors, Collections.singletonList
  • producers: Collections.unmodifiableList
  • observers: size , get
  • mutators: add , remove , addAll , Collections.sort

String is Java’s string type. String is immutable.

  • creators: String constructors
  • producers: concat , substring , toUpperCase
  • observers: length , charAt

This classification gives some useful terminology, but it’s not perfect. In complicated data types, there may be an operation that is both a producer and a mutator, for example. Some people reserve the term producer only for operations that do no mutation.

Designing an Abstract Type

Designing an abstract type involves choosing good operations and determining how they should behave. Here are a few rules of thumb.

It’s better to have a few, simple operations that can be combined in powerful ways, rather than lots of complex operations.

Each operation should have a well-defined purpose, and should have a coherent behavior rather than a panoply of special cases. We probably shouldn’t add a sum operation to List , for example. It might help clients who work with lists of integers, but what about lists of strings? Or nested lists? All these special cases would make sum a hard operation to understand and use.

The set of operations should be adequate in the sense that there must be enough to do the kinds of computations clients are likely to want to do. A good test is to check that every property of an object of the type can be extracted. For example, if there were no get operation, we would not be able to find out what the elements of a list are. Basic information should not be inordinately difficult to obtain. For example, the size method is not strictly necessary for List, because we could apply get on increasing indices until we get a failure, but this is inefficient and inconvenient.
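The size example can be made concrete. The sketch below recovers the length of a list using only get, relying on the documented behavior that List.get throws IndexOutOfBoundsException for an out-of-range index. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SizeWithoutSize {
    // Counts elements by probing get(0), get(1), ... until one fails.
    static int sizeViaGet(List<?> list) {
        int n = 0;
        while (true) {
            try {
                list.get(n);
            } catch (IndexOutOfBoundsException e) {
                return n;   // first index that does not exist
            }
            n++;
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeViaGet(Arrays.asList("a", "b", "c")));
        System.out.println(sizeViaGet(new ArrayList<String>()));
    }
}
```

It works, but every call makes one probe per element plus a failing probe, and it uses an exception for ordinary control flow. That is the sense in which leaving out size would make basic information inordinately difficult to obtain.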

The type may be generic: a list or a set, or a graph, for example. Or it may be domain-specific: a street map, an employee database, a phone book, etc. But it should not mix generic and domain-specific features. A Deck type intended to represent a sequence of playing cards shouldn’t have a generic add method that accepts arbitrary objects like integers or strings. Conversely, it wouldn’t make sense to put a domain-specific method like dealCards into the generic type List .

Representation Independence

Critically, a good abstract data type should be representation independent . This means that the use of an abstract type is independent of its representation (the actual data structure or data fields used to implement it), so that changes in representation have no effect on code outside the abstract type itself. For example, the operations offered by List are independent of whether the list is represented as a linked list or as an array.

You won’t be able to change the representation of an ADT at all unless its operations are fully specified with preconditions and postconditions, so that clients know what to depend on, and you know what you can safely change.

Example: Different Representations for Strings

Let’s look at a simple abstract data type to see what representation independence means and why it’s useful. The MyString type below has far fewer operations than the real Java String , and their specs are a little different, but it’s still illustrative. Here are the specs for the ADT:

These public operations and their specifications are the only information that a client of this data type is allowed to know. Following the test-first programming paradigm, in fact, the first client we should create is a test suite that exercises these operations according to their specs. At the moment, however, writing test cases that use assertEquals directly on MyString objects wouldn’t work, because we don’t have an equality operation defined on MyString . We’ll talk about how to implement equality carefully in a later reading. For now, the only operations we can perform with MyStrings are the ones we’ve defined above: valueOf , length , charAt , and substring . Our tests have to limit themselves to those operations. For example, here’s one test for the valueOf operation:

We’ll come back to the question of testing ADTs at the end of this reading.

For now, let’s look at a simple representation for MyString : just an array of characters, exactly the length of the string, with no extra room at the end. Here’s how that internal representation would be declared, as an instance variable within the class:

With that choice of representation, the operations would be implemented in a straightforward way:
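As a minimal sketch of such an implementation: the operation names and the array representation come from the text, while the spec comments are a reconstruction from the surrounding discussion and should be treated as assumptions rather than the reading's exact wording.

```java
/** MyString represents an immutable sequence of characters. */
class MyString {
    // Representation: exactly the characters of the string, no spare room.
    private char[] a;

    /** Creator: returns "true" or "false" as a MyString. */
    public static MyString valueOf(boolean b) {
        MyString s = new MyString();
        s.a = b ? new char[] { 't', 'r', 'u', 'e' }
                : new char[] { 'f', 'a', 'l', 's', 'e' };
        return s;
    }

    /** Observer: number of characters in this string. */
    public int length() {
        return a.length;
    }

    /** Observer: character at position i; assumes 0 <= i < length(). */
    public char charAt(int i) {
        return a[i];
    }

    /**
     * Producer: characters from start (inclusive) to end (exclusive);
     * assumes 0 <= start <= end <= length().
     */
    public MyString substring(int start, int end) {
        MyString that = new MyString();
        that.a = new char[end - start];
        // Copy the requested range into a fresh array owned by the new object.
        System.arraycopy(this.a, start, that.a, 0, end - start);
        return that;
    }

    public static void main(String[] args) {
        MyString s = MyString.valueOf(true).substring(1, 3);
        System.out.println(s.length() + " " + s.charAt(0) + s.charAt(1));
    }
}
```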

Question to ponder: Why don’t charAt and substring have to check whether their parameters are within the valid range? What do you think will happen if the client calls these implementations with illegal inputs?

One problem with this implementation is that it’s passing up an opportunity for performance improvement. Because this data type is immutable, the substring operation doesn’t really have to copy characters out into a fresh array. It could just point to the original MyString object’s character array and keep track of the start and end that the new substring object represents. The String implementation in some versions of Java does this.

To implement this optimization, we could change the internal representation of this class to:

With this new representation, the operations are now implemented like this:
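A sketch of the sharing representation might look like the following. The field names a, start, and end are assumptions chosen for this illustration; the public operations keep the same names and the same assumed specs as before, which is the point of the exercise.

```java
/** MyString represents an immutable sequence of characters. */
class MyString {
    // Representation: a possibly shared array plus the range this string covers.
    private char[] a;
    private int start;   // index of the first character, inclusive
    private int end;     // index just past the last character

    /** Creator: returns "true" or "false" as a MyString. */
    public static MyString valueOf(boolean b) {
        MyString s = new MyString();
        s.a = b ? new char[] { 't', 'r', 'u', 'e' }
                : new char[] { 'f', 'a', 'l', 's', 'e' };
        s.start = 0;
        s.end = s.a.length;
        return s;
    }

    /** Observer: number of characters in this string. */
    public int length() {
        return end - start;
    }

    /** Observer: character at position i; assumes 0 <= i < length(). */
    public char charAt(int i) {
        return a[start + i];
    }

    /** Producer: assumes 0 <= start <= end <= length(). */
    public MyString substring(int start, int end) {
        MyString that = new MyString();
        that.a = this.a;                     // share the array, no copy
        that.start = this.start + start;     // offsets are relative to this string
        that.end = this.start + end;
        return that;
    }

    public static void main(String[] args) {
        MyString s = MyString.valueOf(false).substring(1, 5).substring(1, 3);
        System.out.println(s.length() + " " + s.charAt(0) + s.charAt(1));
    }
}
```

Sharing the array is safe only because no operation ever writes to it after creation. Note that both new offsets are computed relative to this.start, so a substring of a substring still indexes correctly into the original array.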

Because MyString ’s existing clients depend only on the specs of its public methods, not on its private fields, we can make this change without having to inspect and change all that client code. That’s the power of representation independence.

Realizing ADT Concepts in Java

Let’s summarize some of the general ideas we’ve discussed in this reading, which are applicable in general to programming in any language, and their specific realization using Java language features. The point is that there are several ways to do it, and it’s important to both understand the big idea, like a creator operation, and different ways to achieve that idea in practice.

The only item in this table that hasn’t yet been discussed in this reading is the use of a constant object as a creator operation. This pattern is commonly seen in immutable types, where the simplest or emptiest value of the type is simply a public constant, and producers are used to build up more complex values from it.
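One standard-library type that follows this pattern is java.math.BigInteger, which is immutable and exposes public constants such as BigInteger.ZERO and BigInteger.ONE. It is not discussed in this reading and is used here only as an illustration; the class name ConstantCreatorDemo and the countUp method are hypothetical.

```java
import java.math.BigInteger;

class ConstantCreatorDemo {
    // Builds the value n by starting from the constant BigInteger.ZERO
    // (acting as the creator) and repeatedly applying the producer add.
    static BigInteger countUp(int n) {
        BigInteger value = BigInteger.ZERO;
        for (int i = 0; i < n; i++) {
            // add returns a new object; ZERO itself is never modified.
            value = value.add(BigInteger.ONE);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(countUp(3));
        System.out.println(BigInteger.ZERO);
    }
}
```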

Testing an Abstract Data Type

We build a test suite for an abstract data type by creating tests for each of its operations. These tests inevitably interact with each other. The only way to test creators, producers, and mutators is by calling observers on the objects that result, and likewise, the only way to test observers is by creating objects for them to observe.

Here’s how we might partition the input spaces of the four operations in our MyString type:

Then a compact test suite that covers all these partitions might look like:

Notice that each test case typically calls a few operations that make or modify objects of the type (creators, producers, mutators) and some operations that inspect objects of the type (observers). As a result, each test case covers parts of several operations.
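The following sketch shows that interaction in plain Java, without a test framework. Since MyString has no equality operation, the helper sameChars compares a MyString against an expected String using only the observers length and charAt. The nested MyString is a minimal array-backed version repeated so the example compiles on its own; the helper names and the particular cases are assumptions for illustration, not the reading's actual test suite.

```java
import java.util.Arrays;

class MyStringObserverTests {
    /** Minimal array-backed MyString, repeated here so the sketch stands alone. */
    static class MyString {
        private char[] a;

        static MyString valueOf(boolean b) {
            MyString s = new MyString();
            s.a = (b ? "true" : "false").toCharArray();
            return s;
        }
        int length() { return a.length; }
        char charAt(int i) { return a[i]; }
        MyString substring(int start, int end) {
            MyString that = new MyString();
            that.a = Arrays.copyOfRange(a, start, end);
            return that;
        }
    }

    /** Compares using observers only, since MyString defines no equality yet. */
    static boolean sameChars(String expected, MyString actual) {
        if (actual.length() != expected.length()) return false;
        for (int i = 0; i < expected.length(); i++) {
            if (actual.charAt(i) != expected.charAt(i)) return false;
        }
        return true;
    }

    static void check(boolean ok, String name) {
        if (!ok) throw new AssertionError("failed: " + name);
    }

    public static void main(String[] args) {
        // Creator valueOf, checked through the observers length and charAt.
        check(sameChars("true", MyString.valueOf(true)), "valueOf(true)");
        check(sameChars("false", MyString.valueOf(false)), "valueOf(false)");
        // Producer substring: middle, empty, and whole-string ranges.
        check(sameChars("ru", MyString.valueOf(true).substring(1, 3)), "middle");
        check(sameChars("", MyString.valueOf(false).substring(2, 2)), "empty");
        check(sameChars("false", MyString.valueOf(false).substring(0, 5)), "whole");
        // substring applied to a string that was itself produced by substring.
        check(sameChars("ls", MyString.valueOf(false).substring(1, 5).substring(1, 3)),
                "substring of substring");
        System.out.println("all checks passed");
    }
}
```

Every check calls a creator or producer and then observers, so a failure alone does not say which operation is at fault. That coupling is inherent in testing an ADT through its public operations.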

  • Abstract data types are characterized by their operations.
  • Operations can be classified into creators, producers, observers, and mutators.
  • An ADT’s specification is its set of operations and their specs.
  • A good ADT is simple, coherent, adequate, and representation-independent.
  • An ADT is tested by generating tests for each of its operations, but using the creators, producers, mutators, and observers together in the same tests.

These ideas connect to our three key properties of good software as follows:

Safe from bugs. A good ADT offers a well-defined contract for a data type, so that clients know what to expect from the data type, and implementors have well-defined freedom to vary.

Easy to understand. A good ADT hides its implementation behind a set of simple operations, so that programmers using the ADT only need to understand the operations, not the details of the implementation.

Ready for change. Representation independence allows the implementation of an abstract data type to change without requiring changes from its clients.


Snefru: Learning Programming with C

2.2. Data types and representation

Different data types are stored differently in the memory, i.e. different variable types use different amounts of memory.

2.2.1. Integers

An int is typically stored using 32 bits and, as we mentioned in How is the 0s and 1s of data memory organized?, each cell in the memory stores a byte. Hence, an int variable uses 4 bytes (cells) of memory.

One of the 32 bits acts as the sign bit, and the remaining 31 bits distinguish the values that share that sign. The sign bit is 0 for zero and positive numbers, and 1 for negative numbers.

Since the number of bits is determined, there is a maximum range of numbers that can be represented using int . As we discussed in How many bits do we need to represent x numbers? , we can represent \(2^n\) numbers using \(n\) bits. Since one bit is reserved for the sign of the number, we have 31 bits left. We can represent \(2^{31}\) negative numbers, i.e. from \(-2^{31}\) to \(-1\) . For 0 and positive numbers we have \(2^{31}\) representations, i.e. from 0 to \(2^{31} - 1\) .
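As a worked instance of the counting argument above, take \(n = 32\) bits in total, with one bit separating negative from non-negative values. This assumes a 32-bit int, which is typical but not guaranteed on every platform.

\[
2^{32} = 4{,}294{,}967{,}296 \text{ bit patterns} = \underbrace{2^{31}}_{\text{negative}} + \underbrace{2^{31}}_{\text{zero and positive}}
\]

\[
-2^{31} = -2{,}147{,}483{,}648 \;\le\; x \;\le\; 2^{31} - 1 = 2{,}147{,}483{,}647
\]

The positive side stops one short of \(2^{31}\) because one of its \(2^{31}\) patterns is used for 0.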

There are other data types to represent integers, such as:

short representation typically uses 16 bits, i.e. 2 bytes of memory.

unsigned int representation typically uses 32 bits, i.e. 4 bytes of memory. The sign bit is not used, so the range of numbers is from 0 to \(2^{32} - 1\) .

long representation typically uses 64 bits, i.e. 8 bytes of memory.

long long representation typically uses 64 bits, i.e. 8 bytes of memory.

For the purpose of this course, you are only expected to know the int data type out of all the integer data types.

The format specifier for int is %d.

2.2.2. Floating point or real numbers

Floating point binary representation is similar to scientific notation, e.g. \(2.89 \times 10^{14}\), also written \(2.89e14\) or \(2.89E14\). Formally, the number is represented as \(m \times 10^e\), where \(m\) is the mantissa and \(e\) is the exponent. The magnitude of the mantissa is at least 1 and less than 10, and the exponent is an integer. The sign of the number is the sign of the mantissa.

The floating point number in binary form represents the mantissa and the exponent separately. We do not need to know how.

There are two data types to represent floating point numbers:

float uses 32 bits, i.e. 4 bytes of memory.

double uses 64 bits, i.e. 8 bytes of memory. Since the double data type uses twice as many bits to represent a floating point number, it is more precise than the float data type. Hence, double is said to have “double precision” and float “single precision”.

For the purpose of this course, we will be using the double data type only.

The format specifier for float is %f and for double is %lf. (The %lf form is required when reading a double with scanf; printf accepts %f for both types.)

2.2.3. Characters

To represent a single letter, symbol or digit, we can use the char data type. Example characters include A , B , … Z , a , b , … z , 0 , 1 , … 9 , @ , # , $ and other symbols.

This code snippet would print S on the screen. The format specifier for char is %c .

char is stored using 8 bits, i.e. 1 byte of memory. Each character is encoded into a unique number, and the number is stored in one cell of the memory. What does this unique number look like? The number is called the American Standard Code for Information Interchange (ASCII) code. ASCII is a standard encoding scheme for characters. It uses only 7 bits, and the eighth bit is set to \(0\) . Since we are using 7 bits, the ASCII code table has numbers between \(0\) and \(2^7 - 1\) , which is 128 numbers. Part of the ASCII code table is shown below, but you are NOT expected to memorize it.

2.2.4. Boolean #

The Boolean bool data type is used to represent a logical value, i.e. either true or false . In C, true is represented by 1 and false is represented by 0 . Sounds like we only need one bit for that nice data type in the memory, right? Although bool requires only one bit, we cannot organize the memory as we like. The memory is organized into cells, and each cell stores a byte. The smallest possible memory space we can use is a cell in the memory. So the bool data type uses 1 byte of memory.

Write a C code that prints a bool variable. Code in isRaining.c .

There is no format specifier for bool specifically. We use %d to print the value (either 0 or 1) of a boolean variable. Hence, the above code prints Is it raining? 1 NOT Is it raining? true .

If you noticed, apart from #include <stdio.h> which gives us access to printf and scanf functions, we included another library for bool variables in #include <stdbool.h> . Without this library, the compiler won’t identify the bool variable type.

2.2.5. Declaring Vs. Initializing Variables #

In your code, if you need to declare a variable, you do it as follows

The compiler will understand that you declared a variable with int type and identifier var . When running your code, the computer will reserve a space for it in the memory. The question is, what is the value of this declared variable? The answer is not 0 .

The variable is uninitialized . It does not hold a meaningful value. It is just a variable that is declared, but not initialized. To be more specific, if you try printing the value of the declared but uninitialized variable, you print whatever is there in the memory location of the variable. Probably this value was there before the program started running. It is an unpredictable value. Some people call it a “garbage” value.

For example, when I ran the code below on my computer, the value in the var variable was 174739296 , because I never initialized var . However, when you run the code, you may get a different value. The value may be different for each run too. You can download declare-vs-initialize.c to play with the code.

If you compile the code above, you will get a warning stating: variable ‘var’ is uninitialized when used here [-Wuninitialized] . The compiler will also ask you to initialize var to silence the warning.

This becomes a problem if you are unaware of it. If you use an uninitialized variable later, your program’s behavior will be undefined. Therefore, it is best practice to declare a variable AND initialize it, e.g. int var = 0; . int var declares the var variable and = 0; initializes the var variable to 0 .

2.2.6. Taking in input from the user using scanf #

Given that we now know the format specifiers of int , double , char and bool data types, there are a few tricks you need to know as you use these to take input from the user using scanf .

Take multiple numbers in multiple variables.

You can take multiple numbers from the user in one single scanf . In the scanf format string, separate the format specifiers by a space. The user should separate the numbers by delimiters . Delimiters can be a space, return or tab and are used to separate two different inputs. An example code that takes multiple numbers as input from the user is shown below.

Take numbers and characters.

You can take numbers and characters in the same scanf line. The example code shown below takes in an ID from the user that begins with a character and is followed by a number. It does not require a delimiter between the character entered and the numbers. This is because the %c format specifier will take one character, and stop taking more input. The rest will be taken by %d . If the user enters a space between the character and the numbers it will be ignored.

You can also write the code above with no spaces between %c and %d in scanf as follows.

Take in characters and ignore leading spaces.

If you want to take in input character by character, but you are entering spaces or returns between the characters, what should you do? To ignore spaces between the characters entered, you need to add a space before each %c format specifier. For example, the following code takes in the four letters and three numbers of a license plate in Ontario. If you do not add a space before the %c , any delimiter entered will be considered a character, and taken into the character variable.

Common mistake: Spaces after format specifiers

Do not include a space after the last format specifier in scanf , like scanf("%d ", &num); . A space in the format string tells scanf to keep consuming whitespace, including the return key, until it sees a non-whitespace character. As a result, scanf waits until you type another input after your number, although it won’t put that second input into a variable. For example, the following code will not proceed with executing other statements unless you enter another input after your number.

9 Types of Data Distribution in Statistics

Statistical data analysis is indispensable for gaining deeper insights into your datasets. It empowers you to go beyond numbers and comprehend the underlying patterns, relationships, and probabilities. A crucial aspect of statistical analysis involves understanding the different types of data distribution.

By learning how data points spread out, you can infer meaningful interpretations and predictions based on the data's shape, central tendency, and variability. This knowledge empowers you to make informed decisions, test hypotheses, and develop models. But before we discuss any data distribution types, let’s understand more about data.

Types of Data

You can broadly classify data into qualitative and quantitative categories based on its nature. Qualitative data is non-numerical and provides a depth of understanding using descriptive characteristics like color, customer reviews, etc. Quantitative data, on the other hand, represents data that can be measured or counted, like customer visits per month, ratings between one and five, etc.   

Quantitative data is particularly relevant to data distribution analysis, and you can further classify it as discrete and continuous data. 

Discrete Data

This type of data consists of distinct, separate values. It often represents whole numbers or counts, such as the number of students in a class or the number of times you land on heads in 10 coin flips. You can represent discrete data using bar charts or histograms.

Continuous Data

Contrarily, continuous data can take on any value within a given range—for example, height, weight, or time. You can measure these values to any degree of precision within the relevant range and represent them using line graphs or density plots.

Understanding whether your data is discrete or continuous is crucial for choosing the appropriate data distribution model for analysis.

What Is Data Distribution?

Data distribution refers to how data spreads across a range of values. It describes the arrangement of your data, whether it clusters around a particular value, is scattered evenly, or skews in one direction. It also provides insights into the frequency or probability of specific outcomes. 

In statistics, based on the type of quantitative data, there are two types of data distribution—discrete and continuous.

Types of Data Distribution in Statistics

Data distributions provide mathematical models that describe the behavior of random variables. By identifying distributions that fit the data, you can estimate parameters that best define your data distribution and use them to simulate new data points. 

Let’s delve deeper and understand different types of distribution in statistics with examples.  

Discrete Distributions

We have explored the concept of discrete data, where variables can only take on a finite or countable number of values. Now, let’s delve into the different types of data distributions under this category.

Bernoulli Distribution

The Bernoulli distribution is the simplest distribution that describes the probability of a single event with binary results such as success (1) or failure (0). For example, tossing a coin once is a Bernoulli trial with only two possible outcomes—heads or tails. If ‘p’ is the probability of a successful outcome, then the probability of a failure will be ‘1-p.’

You can use Bernoulli distribution in various data analysis applications, such as binary classification problems, CTR prediction, churn rate analysis, etc. 

Binomial Distribution

Binomial distribution builds upon the Bernoulli trial. It describes the probability of getting a specific number of successes in a fixed number of independent trials. For example, you roll a die ten times and count the number of sixes. Binomial distribution has two parameters: ‘n’ is the total number of trials, and ‘p’ is the probability of success on each trial. You can calculate the probability mass using the formula below, where ‘x’ is the number of times a specific outcome happens within ‘n’ trials and ‘q’ is the probability of failure, i.e. q = (1 - p).

\(P(X = x) = \binom{n}{x} p^x q^{n-x}\)

You can apply binomial distribution in your email marketing campaigns and calculate the probability of certain emails landing in spam. This helps you optimize marketing strategies and target your audience more effectively. 

Poisson Distribution

The Poisson distribution models the number of events occurring in a fixed interval of time or space, given an average rate of occurrence, lambda (λ). It is particularly useful for situations where events are random and independent. 

The formula for calculating the probability of x events in a fixed interval is: 

\(P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}\)

You can use Poisson distribution to model restaurant customer arrivals or estimate the likelihood of receiving a specific number of insurance claims within a particular time interval. 

Geometric Distribution

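The geometric distribution describes the probability of observing a given number of failures before the first success in a series of independent trials with the same success probability. An example is the number of rolls of a die that do not show a ‘3’ before the first ‘3’ appears. 

You can calculate the probability by using the formula below, where ‘k’ is the number of failures before the first success and ‘p’ is the probability of success on each trial:

\(P(X = k) = (1 - p)^k p\)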

In sales and marketing, you can use geometric distribution to model the number of customer contacts needed before a sale.

Continuous Distributions

As we have already explored, continuous data take on any value within a range. This section will explore the different types of data distributions that fall under this category.

Normal Distribution

The Normal Distribution, also known as the Gaussian distribution, represents symmetrical data around a central point (mean) with a characteristic bell-shaped curve. Many natural phenomena, like human height, weight, or test scores, follow a normal distribution. This distribution has two input parameters—mean and standard deviation. 

You can calculate the probability density using the formula below, where ‘σ’ is the standard deviation, ‘μ’ is the mean, ‘x’ is the value of the variable, and ‘e’ is Euler’s number, the base of the natural logarithm. 

\(f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}\)

Many statistical models and tests rely on the assumption of normality, making normal distribution a crucial tool for hypothesis testing, confidence intervals, and regression analysis. Related results, like the central limit theorem and the empirical rule, facilitate quick insights into data behavior and help you make better predictions. 

F Distribution

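F distribution arises when you compare the variances of two normally distributed populations to assess if the variances are significantly different, and when you conduct an analysis of variance (ANOVA). You can also use it to evaluate the overall significance of a regression model by comparing the variance explained by the model to the residual variance.

You can calculate the probability density function using the formula given below, where \(d_1\) and \(d_2\) are the numerator and denominator degrees of freedom and \(B\) is the beta function:

\(f(x) = \frac{1}{B(d_1/2,\, d_2/2)} \left(\frac{d_1}{d_2}\right)^{d_1/2} x^{d_1/2 - 1} \left(1 + \frac{d_1}{d_2} x\right)^{-(d_1 + d_2)/2}\) for \(x > 0\)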

By utilizing F distribution, you can make informed decisions about the relationships between the variables and the validity of your statistical models. This improves the accuracy and reliability of your data analysis results.

Chi-Square Distribution

The chi-square distribution is a continuous probability distribution used in hypothesis testing and confidence interval construction. It helps you calculate a chi-squared test statistic by analyzing the discrepancy between observed data and expected values. This test statistic enables you to determine whether the differences are due to chance variation or if they represent a statistically significant deviation. 

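You can calculate the probability density using the formula below, where \(k\) is the number of degrees of freedom and \(\Gamma\) is the gamma function:

\(f(x) = \frac{1}{2^{k/2}\, \Gamma(k/2)} x^{k/2 - 1} e^{-x/2}\) for \(x > 0\)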

The chi-square distribution is a critical tool for evaluating the goodness of fit of statistical models, testing independence between categorical variables, and detecting patterns or relationships in data sets.

Exponential Distribution

The exponential distribution is a continuous distribution that models the time between events in a Poisson process, where events occur continuously and independently at a constant average rate. In data analysis, the exponential distribution helps model phenomena with a constant hazard rate, such as the time until a radioactive atom decays. 

You can calculate the probability density using the formula below, where ‘λ’ is the rate parameter and ‘x’ is the value of the random variable. 

\(f(x) = \lambda e^{-\lambda x}\) for \(x \ge 0\)

Its key characteristic, the memoryless property, means that the time already elapsed does not change the distribution of the remaining waiting time, which simplifies predicting events, assessing reliability, or planning resource allocation. Additionally, exponential distribution has only one parameter, the rate (λ). This makes data interpretation and parameter estimation easy, allowing you to make swift, data-driven decisions. 

Gamma Distribution

The gamma distribution is a continuous probability distribution characterized by two parameters: shape (α) and either scale (β) or rate (λ), where λ = 1/β. You can use its ability to model positively skewed data and accommodate different shapes to accurately describe and analyze datasets that don't conform to the normal distribution assumption. 

The formula below calculates the probability density function, where ‘λ’ denotes the rate at which the event occurs in time or space and \(\Gamma\) is the gamma function.  

\(f(x) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\lambda x}\) for \(x > 0\)

Streamline Data Distribution Analysis with Airbyte

The most common statistical solutions you can use to analyze these data distribution models are Stata, R, Python, or Matlab. However, before you perform any analysis, it is important to prepare and unify your data at a central location. Airbyte bridges the gap and streamlines the data consolidation. It enables you to extract and load data from various sources in a central repository with its 350+ pre-built connectors . You can integrate this specific destination with your statistical software and perform further analysis.

In addition, Airbyte keeps your data pipelines in sync with automated schema evolution and efficient Change Data Capture (CDC). This implies your data structure automatically adapts to changes in the source, and you only capture the most recent data modifications for analysis. 

You can also perform complex data transformation by seamlessly integrating Airbyte with dbt (Data Build Tool). It streamlines the entire data acquisition and integration process, ensuring your statistical analyses are accurate and insightful.

Closing Thoughts

This article introduces you to different data distribution types based on the nature of the data. It also explains how these statistical distributions can help in data analysis. By identifying the distribution that best represents your data, you can make informed decisions, build robust models, and extract invaluable insights.

The Higher Ed Workplace Blog

Data show women and people of color aren’t advancing to higher faculty ranks at the same rate as white men.

CUPA-HR’s research team analyzed data from the Faculty in Higher Education Survey , a comprehensive data source that collects salary and demographic data by tenure status, rank, and faculty discipline, to evaluate representation and pay equity for women and faculty of color from 2016-17 to 2022-23.

In addition to the finding that women and faculty of color are not being promoted to senior faculty ranks at the same rate as White men, the data also show that women, Black, and Hispanic or Latina/o faculty are better represented in non-tenure-track than in tenure-track positions, and that pay gaps in non-tenure-track positions persist for these groups. Combined with the fact that these groups are less likely to be promoted to higher ranks in tenure-track positions, the result is that a substantial segment of faculty, primarily women and people of color, are employed in positions that pay lower salaries throughout their careers.

Other Findings

Tenure-track faculty positions are on the decline. There has been a decline in tenure-track positions and a corresponding increase in non-tenure-track positions over the past seven years. In 2016-17, tenure-track roles accounted for 73% of faculty, but by 2022-23, this proportion fell to 66%, with a marked increase in non-tenure-track positions over the last two years. Additionally, the percentage of new tenure-track assistant professor hires dropped in recent years, indicating a trend toward more new non-tenure-track hires.

The representation of women and people of color in tenure-track faculty positions is increasing, yet challenges remain. There was a notable increase in the representation of tenure-track (TT) women and faculty of color from 2016-17 to 2022-23. In 2022-23, more than one-fourth (26%) of TT faculty were people of color. This marks a 28% increase over the span of seven years, compared to 2016-17, when faculty of color constituted closer to one-fifth (21%) of all TT faculty . However, the growth in racial/ethnic representation still lags when compared to the demographic composition of U.S. doctoral degree holders. Further, despite strides toward pay equity for tenure-track faculty of color, White women in tenure-track positions still face persistent pay gaps in 2022-23.

Explore the interactive graphics and read the full report, Representation and Pay Equity in Higher Education Faculty: A Review and Call to Action .

Title: No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO

Abstract: Reinforcement learning (RL) is inherently rife with non-stationarity since the states and rewards the agent observes during training depend on its changing policy. Therefore, networks in deep RL must be capable of adapting to new observations and fitting new targets. However, previous works have observed that networks in off-policy deep value-based methods exhibit a decrease in representation rank, often correlated with an inability to continue learning or a collapse in performance. Although this phenomenon has generally been attributed to neural network learning under non-stationarity, it has been overlooked in on-policy policy optimization methods which are often thought capable of training indefinitely. In this work, we empirically study representation dynamics in Proximal Policy Optimization (PPO) on the Atari and MuJoCo environments, revealing that PPO agents are also affected by feature rank deterioration and loss of plasticity. We show that this is aggravated with stronger non-stationarity, ultimately driving the actor's performance to collapse, regardless of the performance of the critic. We draw connections between representation collapse, performance collapse, and trust region issues in PPO, and present Proximal Feature Optimization (PFO), a novel auxiliary loss, that along with other interventions shows that regularizing the representation dynamics improves the performance of PPO agents.

IMAGES

  1. Data Representation

    data types in representation

  2. How to Use Data Visualization in Your Infographics

    data types in representation

  3. 6 Types of Data: Every Statistician & Data Scientist Must Know

    data types in representation

  4. In-Depth Understanding of Data Structures in Java

    data types in representation

  5. Graphical Representation

    data types in representation

  6. Graphical Representation

    data types in representation

VIDEO

  1. Data representation in tables 01

  2. ICT

  3. MCS-12 (Computer Organization and Assembly Language Programming)Block01 Unit-2 DATA REPRESENTATION#4

  4. Introduction of Set....... representation, types, function , relation, #set

  5. TYPES OF DATA REPRESENTATION

  6. Data Representation & Computer Arithmetic

COMMENTS

  1. Data representations

    Data representations are useful for interpreting data and identifying trends and relationships. When working with data representations, pay close attention to both the data values and the key words in the question. When matching data to a representation, check that the values are graphed accurately for all categories.

  2. Data Representation: Definition, Types, Examples

    Data Representation: Data representation is a technique for analysing numerical data. The relationship between facts, ideas, information, and concepts is depicted in a diagram via data representation. It is a fundamental learning strategy that is simple and easy to understand. It is always determined by the data type in a specific domain.

  3. Data Representation

    Contents pages for the section covering Data Representation from binary representation to various data compression methods at GCSE, IB and A Level - for Computer Science students. FOUNDATION YEARS GCSE ... hexadecimal, and ASCII formats, as well as the different types of data, such as integers, floating-point numbers, characters, and images. We ...

  4. 2.1: Types of Data Representation

    2.1: Types of Data Representation. Page ID. Two common types of graphic displays are bar charts and histograms. Both bar charts and histograms use vertical or horizontal bars to represent the number of data points in each category or interval. The main difference graphically is that in a bar chart there are spaces between the bars and in a ...

  5. Data Types in Programming

    In Programming, data type is an attribute associated with a piece of data that tells a computer system how to interpret its value. Understanding data types ensures that data is collected in the preferred format and that the value of each property is as expected. ... represent numeric data type for numbers without fractions. 300, 0 , -300 ...

  6. PDF Data Types And Representation

    Two major approaches: structural equivalence and name equivalence. Name equivalence is based on declarations. Two types are the same only if they have the same name. (Each type definition introduces a new type) strict: aliases (i.e. declaring a type to be equal to another type) are distinct. loose: aliases are equivalent.

  7. Data Representation

    A programmer considers memory content to be data types of the programming language he uses. Now recall figure 1.2 and 1.3 of chapter 1 to reinforce your thought that conversion happens from computer user interface to internal representation and storage. Data Representation in Computers

  8. 4-3: Types of Data and Appropriate Representations

    Figure 1. Types of data. Qualitative Data. Categorical or qualitative data labels data into categories. Categorical data is defined in terms of natural language specifications. For example, name, sex, country of origin, are categories that represent qualitative data. There are two subcategories of qualitative data, nominal data and ordinal data ...

  9. PDF Lecture Notes on Data Representation

    Data Representation L9.5 argument of type bin we must first unfold its represenation to a sum and then case over the possible summands. There are three possibilities, so our code so far has the form inc : bin!bin = x:case (unfoldx) fb0 y)::: jb1 y)::: j y):::g In each branch, the missing code should have type bin. In the case of b0 y

  10. Data representations

    Data representations problems ask us to interpret data representations or create data representations based on given information. Aside from tables, the two most common data representation types on the SAT are bar graphs and line graphs. In this lesson, we'll learn to: You can learn anything. Let's do this!

  11. Data representation

    The first unit, data representation, is all about how different forms of data can be represented in terms the computer can understand. ... A pointer combines an address and a type. The memory representation of a pointer is the same as the representation of its address value. The size of that integer is the machine's word size; for example, on ...

  12. Decoding Computation Through Data Representation

    Primitive data types: Computers deal with binary data at the most basic level. In most programming languages, integers, floating-point numbers, characters, and Booleans are foundational data types. Their representation involves bit patterns in memory, with specifics such as endian-ness, precision, and overflow/underflow considerations.

  13. What are the different ways of Data Representation?

    When the row is placed in ascending or descending order is known as arrayed data. Types of Graphical Data Representation. Bar Chart. Bar chart helps us to represent the collected data visually. The collected data can be visualized horizontally or vertically in a bar chart like amounts and frequency. It can be grouped or single.

  14. Reading 10: Abstract Data Types

    In this reading, we look at a powerful idea, abstract data types, which enable us to separate how we use a data structure in a program from the particular form of the data structure itself. Abstract data types address a particularly dangerous problem: clients making assumptions about the type's internal representation.

  15. PDF Bits and Bytes Data Representation 1

    Data Representation Computer Organization I 4 CS@VT ©2005-2020 WD McQuain Integer Data Types We need to provide support for a variety of data types. For integer values, we need to provide a variety of types that allow the user to choose based upon memory considerations and range of representation. For contemporary programming languages, we ...

  16. CSC 270: Data Representation

    Data representation Contents: integer representation; fixed-point representation; floating-point representation (IEEE 754) IEEE 754 standard sizes; conversion examples. Integer representation We want to represent data in a computer. We want to store numbers and text and all these data types in a computer, but what we have is electric currents ...

  17. Data representation 1: Introduction

    It is in the data segment, which holds modifiable global data. This segment keeps the same size as the program runs. After a jump, the anonymous new int object pointed to by i4 has the next highest address. This is the heap segment, which holds dynamically allocated data. This segment can grow as the program runs; it typically grows towards ...

  18. Reading 12: Abstract Data Types

    Critically, a good abstract data type should be representation independent . This means that the use of an abstract type is independent of its representation (the actual data structure or data fields used to implement it), so that changes in representation have no effect on code outside the abstract type itself. For example, the operations ...

  19. PDF Data Representation

    Primitive Types vs. Objects • At first glance, Java's rules for passing objects as arguments seem to differ from the rules Java uses with arguments that are primitive types. • When you pass an argument of a primitive type to a method, Java copies the value of the argument into the parameter variable.

  20. A Tutorial on Data Representation

    The interpretation of binary pattern is called data representation or encoding. Furthermore, it is important that the data representation schemes are agreed-upon by all the parties, i.e., industrial standards need to be formulated and straightly followed. ... The char data type are based on the original 16-bit Unicode standard called UCS-2. The ...

  21. PDF How to Use Basic Numeric Data Types Tutorial Numeric Representat

    an appropriate data type representation for output. If different data type representations are used for inputs, some of the data types maybe coerced ó or converted to the same data type representation as the other inputs. For example, in Figure 1 when we try and add 10 which is an I32 with 1.5 which is a double,

  22. 2.2. Data types and representation

    There are other data types to represent integers, such as: short representation typically uses 16 bits, i.e. 2 bytes of memory. unsigned int representation typically uses 32 bits, i.e. 4 bytes of memory. The sign bit is not used, so the range of numbers is from 0 to 2^32 − 1. long representation typically uses 64 bits, i.e. 8 bytes of memory.

  23. 9 Types of Data Distribution in Statistics

    This type of data consists of distinct, separate values. It often represents whole numbers or counts, such as the number of students in a class or the number of times you land on heads in 10 coin flips. ... You can represent discrete data using bar charts or histograms. Continuous Data. Contrarily, continuous data can take on any value within a ...

  24. Quantum Adjoint Convolutional Layers for Effective Data Representation

    The Quantum Convolutional Layer (QCL) is considered one of the core components of Quantum Convolutional Neural Networks (QCNNs) due to its efficient data feature extraction capability. However, the current principle of QCL is not as mathematically understandable as the Classical Convolutional Layer (CCL) due to its black-box structure. Moreover, classical data mapping in many QCLs is inefficient. To this end ...

  25. Data Show Women and People of Color Aren't Advancing to Higher Faculty

    The representation of women and people of color in tenure-track faculty positions is increasing, yet challenges remain. There was a notable increase in the representation of tenure-track (TT) women and faculty of color from 2016-17 to 2022-23. In 2022-23, more than one-fourth (26%) of TT faculty were people of color.

  26. [2404.19756v1] KAN: Kolmogorov-Arnold Networks

    Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function ...

  27. No Representation, No Trust: Connecting Representation, Collapse, and

    Reinforcement learning (RL) is inherently rife with non-stationarity since the states and rewards the agent observes during training depend on its changing policy. Therefore, networks in deep RL must be capable of adapting to new observations and fitting new targets. However, previous works have observed that networks in off-policy deep value-based methods exhibit a decrease in representation ...

  28. A Detailed Guide to C# TimeSpan

    It's a type from the .NET BCL (Base Class Library) representing a time interval. Do you ever need to represent the concept of "five minutes" or "two hours"? If that's the case, TimeSpan is what you're looking for. C# TimeSpan: importance and benefits. Here's what I consider the top benefits of using TimeSpan:

  29. EEOC Releases Workplace Guidance to Prevent Harassment

    WASHINGTON - Today the U.S. Equal Employment Opportunity Commission (EEOC) published final guidance on harassment in the workplace, "Enforcement Guidance on Harassment in the Workplace." By providing this resource on the legal standards and employer liability applicable to harassment claims under the federal employment discrimination laws enforced by the EEOC, the guidance will help ...