PrepBytes Blog

ONE-STOP RESOURCE FOR EVERYTHING RELATED TO CODING


Last Updated on January 8, 2024 by Ankit Kochar

float data type representation in c

Floating-point numbers in C play a crucial role in handling real numbers with decimal points. Understanding how float variables work is fundamental for developers who want to create programs dealing with precise numerical values. This introduction will provide insights into the basics of floating-point numbers in the C programming language.

What is Float in C?

In the C programming language, float is a data type used to represent single-precision floating-point numbers. Floating-point numbers are used to handle real numbers with decimal points, providing a way to represent a wide range of values, from very small to very large, and with fractional precision. The float data type is typically used when you need to store numbers with a fractional component, and it occupies 4 bytes in memory on most platforms. It is important to note that while floating-point numbers provide a versatile way to work with real numbers, they have limited precision because real numbers are stored in a finite number of binary digits.

Syntax of Float in C

The general syntax for declaring a float in C is:

    float variable_name = val;

You can also declare several variables in one statement by listing them, separated by commas, after a single float keyword.

Parameters of Float in C

A float declaration has two parts:

  • variable_name: The name you give to the float variable. The variable is referred to by this name throughout the program.
  • val: The value that you assign to the variable.

Double vs Float in C

Both double and float are data types used to store numbers with a fractional part. Here we will discuss the differences between them so that you know which one to use in a given scenario.

  • Precision is the primary distinction between float and double. Precision determines how many significant digits of a value are stored correctly. A double offers roughly twice the precision of a float, which is why it is called double precision, while float is single precision.
  • A double typically needs twice as much storage space as a float because of its greater precision. It is therefore the better choice when the accuracy of the value matters more than the program’s memory use.
  • On most platforms, a double occupies 64 bits. The 64 bits are divided into three fields, each with a distinct function: the sign is stored in the first bit, the exponent in the following 11 bits, and the significand (mantissa) in the remaining 52 bits. A double holds about 15 significant decimal digits.
  • In comparison, a float typically occupies 32 bits. The sign is stored in the first bit, the exponent in the next 8 bits, and the significand (mantissa) in the remaining 23 bits. A float holds about 6 to 7 significant decimal digits.

Examples of Float in C

In this section, we will discuss various examples of float in c.

Example 1 of Float in C: Declaring the Variable

Here we will see the code implementation of the above-mentioned example.

Code Implementation

Explanation of the above example

In the above example, we declare a float variable and then print it. With the %f specifier, the value is shown with six digits after the decimal point by default.

Example 2 of Float in C: Declaring multiple variables in the same line

Here we will declare multiple float variables in the same line. We will see its code implementation.

Explanation of the above example

In the above example, we declare two float variables in the same line and then print their values. Each value is shown with six digits after the decimal point.

Uses of Float in C

There are many uses of float in c. Some of them are mentioned below:

  • It is very helpful in calculations involving decimal numbers like in physics calculations, financial calculations, and some other scientific calculations.
  • It is used in computer graphics and animations as they are used to store the coordinates of the objects.
  • It is used in input and output operations that have fractional components like temperature and percentage.
  • They are widely used in gaming and simulations to represent physical components like velocity, position, acceleration, and rotation.
  • They are used in machine learning and data analysis, as they are used to represent numerical values like biases, probabilities, weights, and confidence scores.

Conclusion

In conclusion, mastering the use of float in C is essential for any programmer working with numerical data. The flexibility offered by floating-point variables allows real-world values to be represented over a wide range, but it also introduces challenges related to precision and rounding errors. Being aware of these nuances and implementing best practices can significantly enhance the reliability of your C programs dealing with floating-point arithmetic.

Frequently Asked Questions Related to Float in C

Below are some of the frequently asked questions about float in c.

Q1: What is the purpose of using float in C?
A1: Float in C is used to represent real numbers with decimal points. It allows for the handling of values that require fractional precision, making it useful for applications dealing with measurements and scientific calculations.

Q2: How is a float variable declared in C?
A2: To declare a float variable in C, you use the ‘float’ keyword followed by the variable name. For example:

    float temperature;

Q3: What are the potential challenges with using float in C?
A3: One common challenge is the limited precision of float variables, which can lead to rounding errors. Developers need to be cautious when comparing floating-point numbers and be aware of the potential for loss of precision during arithmetic operations.

Q4: Are there alternatives to float in C for handling decimal numbers?
A4: Yes, the ‘double’ data type provides higher precision compared to ‘float’. If precision is critical, using ‘double’ may be a better choice, although it comes with increased memory consumption.

Q5: How can precision issues with float be mitigated?
A5: Developers can mitigate precision issues by being mindful of rounding errors, using appropriate rounding functions, and considering alternative data types like ‘double’ for higher precision when necessary. Additionally, avoiding direct equality comparisons and using epsilon values for tolerance can improve the reliability of floating-point comparisons.


C float data type - single precision

In C float is a data type that represents floating point numbers, using 32 bits. We use this type more often than double , because we rarely need the double’s precision.


C float - usage example

A float literal normally ends with the suffix "f". If we leave it out, the literal (5.50) will be treated as a double by default. You can still assign such a value to a float variable, and most compilers will not issue a warning or an error by default. However, precision will be lost if the value has more significant digits than a float can hold (more on this in a second). It is good practice to write the suffix explicitly.

Further, you will see that the specifier for printing floats is %f.

Scanning floats and specifying printing precision

The specifier for printing and scanning float values is the same: %f . With scanf, %f expects a pointer to a float. With printf, we can also specify the precision of the output. To do this, we write the following between the % and the f:

    [width][.][precision]

  • The width is optional and sets the minimum number of characters to print. Usually it is left out.
  • The decimal delimiter ‘.’ must be used if we intend to set the precision.
  • The precision specifies how many digits of the fraction should be printed. When the actual value has more digits, it is rounded for printing.

In the next example ( %.2f ) we will print the value with a precision of two digits after the decimal mark.

Representation

The C standard does not fix the exact precision of float; it only sets minimum requirements (at least 6 significant decimal digits). However, most C compilers use the IEEE 754 standard for encoding float values. According to it, single precision (float) is represented by 32 bits as follows:

  • 1 bit of sign
  • 8 bits of exponent
  • 23 bits of significand

This allows you to store about 7 significant decimal digits, counting the digits on both sides of the decimal point.

As you can see in the example above ( source on GitHub ), the following values can be saved without a problem

The following values exceed the precision of the C float type, and you will lose data as a result:

See also: double data type


© Copyright 2008-2022 c-programming-simple-steps.com

Data Types in C - Integer, Floating Point, and Void Explained


Data Types in C

There are several different ways to store data in C, and they are all unique from each other. The types of data that information can be stored as are called data types. C is much less forgiving about data types than other languages. As a result, it’s important to make sure that you understand the existing data types, their abilities, and their limitations.

One quirk of C’s data types is that their sizes depend on the platform (hardware and compiler) that you’re building your code for. An int on a small microcontroller may be smaller than an int on your laptop, so knowing the limitations of the platform you’re working on is important. This is also why the data types are defined in terms of minimums: an int, as you will learn, can hold at least -32767 to 32767, and on many machines it can store a wider range than this.

There are two categories that we can break this into: integers, and floating point numbers. Integers are whole numbers. They can be positive, negative, or zero. Numbers like -321, 497, 19345, and -976812 are all perfectly valid integers, but 4.5 is not because 4.5 is not a whole number.

Floating point numbers are numbers with a decimal. Like integers, -321, 497, 19345, and -976812 are all valid, but now 4.5, 0.0004, -324.984, and other non-whole numbers are valid too.

C allows us to choose between several different options with our data types because they are all stored in different ways on the computer. As a result, it is important to be aware of the abilities and limitations of each data type to choose the most appropriate one.

Integer data types

Characters: char.

char holds characters- things like letters, punctuation, and spaces. In a computer, characters are stored as numbers, so char holds integer values that represent characters. The actual translation is described by the ASCII standard. Here’s a handy table for looking up that.

The actual size, like all other data types in C, depends on the hardware you’re working on. A char is at least 8 bits, so it can always hold values from 0 to 127. Whether plain char can also hold negative values depends on the compiler; use signed char to get at least -127 to 127, or unsigned char to get at least 0 to 255.

Standard Integers: int

The amount of memory that a single int takes depends on the hardware. However, you can expect an int to be at least 16 bits in size. This means that it can store values from -32,767 to 32,767, or more depending on hardware.

Like all of these other data types, there is an unsigned variant that can be used. The unsigned int can be positive and zero but not negative, so it can store values from 0 to 65,535, or more depending on hardware.

Short integers: short

This doesn’t get used often, but it’s good to know that it exists. Like int, it can store at least -32,767 to 32,767. Unlike int, it is never larger than int, so anywhere you can use short , you can use int .

Longer integers: long

The long data type stores integers like int , but gives a wider range of values at the cost of taking more memory. Long stores at least 32 bits, giving it a range of -2,147,483,648 to 2,147,483,647. Alternatively, use unsigned long for a range of 0 to 4,294,967,295.

Even longer integers: long long

The long long data type is overkill for just about every application, but C will let you use it anyway. It’s capable of storing at least −9,223,372,036,854,775,807 to 9,223,372,036,854,775,807. Alternatively, get even more overkill with unsigned long long , which will give you at least 0 to 18,446,744,073,709,551,615.

Floating point number data types

Basic floating point numbers: float.

float typically takes 32 bits to store, and gives us about 6 significant decimal digits with a range from 1.2E-38 to 3.4E+38.

Doubles: double

double typically takes double the memory of float (64 bits). In return, double provides about 15 significant decimal digits with a range from 2.2E-308 to 1.7E+308.

Getting a wider range of doubles: long double

long double is at least as large as double; its actual size depends on the platform. On x86 systems where it is an 80-bit extended type, we get about 18 significant decimal digits with a range from 3.4E-4932 to 1.1E+4932.

Picking the right data type

C makes us pick the data type, and makes us be very specific and intentional about the way that we do this. This gives you a lot of power over your code, but it’s important to pick the right one.

In general, you should pick the minimum for your task. If you know you’ll be counting from integer 1 to 10, you don’t need a long and you don’t need a double. If you know that you will never have negative values, look into using the unsigned variants of the data types. By providing this functionality rather than doing it automatically, C is able to produce very light and efficient code. However, it’s up to you as the programmer to understand the abilities and limitations, and choose accordingly.

We can use the sizeof() operator to check the size of a variable. See the following C program for the usage of the various data types:

The Void type

The void type specifies that no value is available. It is used in three kinds of situations:

1. Function returns as void

There are various functions in C which do not return any value or you can say they return void. A function with no return value has the return type as void. For example, void exit (int status);

2. Function arguments as void

There are various functions in C which do not accept any parameter. A function with no parameter can accept a void. For example, int rand(void);

3. Pointers to void

A pointer of type void * represents the address of an object, but not its type. For example, a memory allocation function void *malloc( size_t size); returns a pointer to void which can be converted to a pointer to any object type.


C Data Types


In C programming, data types are declarations for variables. This determines the type and size of data associated with variables. For example,

    int myVar;

Here, myVar is a variable of int (integer) type. The size of int is usually 4 bytes.

Basic types

Here's a table containing commonly used types in C programming for quick access.

Integers are whole numbers that can have zero, positive, and negative values but no decimal values. For example, 0 , -5 , 10

We can use int for declaring an integer variable.

    int id;

Here, id is a variable of type integer.

You can declare multiple variables at once in C programming. For example,

    int id, age;

The size of int is usually 4 bytes (32 bits). And, it can take 2^32 distinct states from -2147483648 to 2147483647 .

float and double

float and double are used to hold real numbers.

In C, floating-point numbers can also be represented in exponential. For example,

What's the difference between float and double ?

The size of float (single precision float data type) is 4 bytes. And the size of double (double precision float data type) is 8 bytes.

Keyword char is used for declaring character type variables. For example,

    char test = 'h';

The size of the character variable is 1 byte.

void is an incomplete type. It means "nothing" or "no type". You can think of void as absent .

For example, if a function is not returning anything, its return type should be void .

Note that, you cannot create variables of void type.

short and long

If you need to use a large number, you can use a type specifier long . Here's how:

    long a;
    long long b;
    long double c;

Here variables a and b can store integer values. And, c can store a floating-point number.

If you are sure, only a small integer ( [−32,767, +32,767] range) will be used, you can use short .

You can always check the size of a variable using the sizeof() operator.

signed and unsigned

In C, signed and unsigned are type modifiers. You can alter the data storage of a data type by using them:

  • signed - allows for storage of both positive and negative numbers
  • unsigned - allows for storage of only zero and positive numbers

For example,

    unsigned int x;
    unsigned int num;
    int y;

Here, the variables x and num can hold only zero and positive values because we have used the unsigned modifier.

Considering the size of int is 4 bytes, variable y can hold values from -2^31 to 2^31 - 1 , whereas variable x can hold values from 0 to 2^32 - 1 .

Derived Data Types

Data types that are derived from fundamental data types are derived types. For example: arrays, pointers, function types, structures, etc.

We will learn about these derived data types in later tutorials.

  • Enumerated type
  • Complex types


Floating-Point Data Type

Kenneth Leroy Busbee and Dave Braunschweig

A floating-point data type uses a formulaic representation of real numbers as an approximation so as to support a trade-off between range and precision. For this reason, floating-point computation is often found in systems which include very small and very large real numbers, which require fast processing times. A number is, in general, represented approximately to a fixed number of significant digits and scaled using an exponent in some fixed base. [1]

The floating-point data type is a family of data types that act alike and differ only in the size of their domains (the allowable values). The floating-point family of data types represents number values with fractional parts. They are technically stored as two integer values: a mantissa and an exponent. The floating-point family has the same attributes and acts or behaves similarly in all programming languages. They can always store negative or positive values, so they are always signed, unlike the integer data type, which can be unsigned. The domain for floating-point data types varies because they could represent very large numbers or very small numbers. Rather than talk about the actual values, we mention the precision. The more bytes of storage, the larger the mantissa and exponent, and thus the greater the precision and range.

  • cnx.org: Programming Fundamentals – A Modular Structured Approach using C++
  • Wikipedia: Floating-point arithmetic

Programming Fundamentals Copyright © 2018 by Kenneth Leroy Busbee and Dave Braunschweig is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.


4.1 An Example with Non-Integer Numbers

Here’s a function that operates on and returns floating point numbers that don’t have to be integers. Floating point represents a number as a fraction together with a power of 2. (For more detail, see Floating-Point Data Types .) This example calculates the average of three floating point numbers that are passed to it as arguments:

The values of the parameters a , b and c do not have to be integers, and even when they happen to be integers, most likely their average is not an integer.

double is the usual data type in C for calculations on floating-point numbers.

To print a double with printf , we must use ‘ %f ’ instead of ‘ %d ’:

The code that calls printf must pass a double for printing with ‘ %f ’ and an int for printing with ‘ %d ’. If the argument has the wrong type, printf will produce meaningless output.

Here’s a complete program that computes the average of three specific numbers and prints the result:

From now on we will not present examples of calls to main . Instead we encourage you to write them for yourself when you want to test executing some code.


Floating-point numbers use the IEEE (Institute of Electrical and Electronics Engineers) format. Single-precision values with float type have 4 bytes, consisting of a sign bit, an 8-bit excess-127 binary exponent, and a 23-bit mantissa. The mantissa represents a number greater than or equal to 1.0 and less than 2.0. Since the high-order bit of the mantissa is always 1, it is not stored in the number. This representation gives a range of approximately 1.2E-38 to 3.4E+38 for normalized values of type float.

You can declare variables as float or double, depending on the needs of your application. The principal differences between the two types are the significance they can represent, the storage they require, and their range. The following table shows the relationship between significance and storage requirements.

Floating-Point Types

Floating-point variables are represented by a mantissa, which contains the value of the number, and an exponent, which contains the order of magnitude of the number.

The following table shows the number of bits allocated to the mantissa and the exponent for each floating-point type. The most significant bit of any float or double is always the sign bit. If it is 1, the number is considered negative; otherwise, it is considered a positive number.

Lengths of Exponents and Mantissas

Because exponents are stored in an unsigned form, the exponent is biased by half its possible value. For type float, the bias is 127; for type double, it is 1023. You can compute the actual exponent value by subtracting the bias value from the exponent value.

The mantissa is stored as a binary fraction greater than or equal to 1 and less than 2. For types float and double, there is an implied leading 1 in the mantissa in the most-significant bit position, so the mantissas are actually 24 and 53 bits long, respectively, even though the most-significant bit is never stored in memory.

Instead of the storage method just described, the floating-point package can store binary floating-point numbers as denormalized numbers. "Denormalized numbers" are nonzero floating-point numbers with reserved exponent values in which the most-significant bit of the mantissa is 0. By using the denormalized format, the range of a floating-point number can be extended at the cost of precision. You cannot control whether a floating-point number is represented in normalized or denormalized form; the floating-point package determines the representation. The floating-point package never uses a denormalized form unless the exponent becomes less than the minimum that can be represented in a normalized form.

The following table shows the minimum and maximum values you can store in variables of each floating-point type. The values listed in this table apply only to normalized floating-point numbers; denormalized floating-point numbers have a smaller minimum value. Note that numbers retained in 80x87 registers are always represented in 80-bit normalized form; numbers can only be represented in denormalized form when stored in 32-bit or 64-bit floating-point variables (variables of type float and type double).

Range of Floating-Point Types

If precision is less of a concern than storage, consider using type float for floating-point variables. Conversely, if precision is the most important criterion, use type double.

Floating-point variables can be promoted to a type of greater significance (from type float to type double). Promotion often occurs when you perform arithmetic on floating-point variables. This arithmetic is always done in as high a degree of precision as the variable with the highest degree of precision. For example, consider the following type declarations:

In the preceding example, the variable f_short is promoted to type double and multiplied by f_long ; then the result is rounded to type float before being assigned to f_short .

In the following example (which uses the declarations from the preceding example), the arithmetic is done in float (32-bit) precision on the variables; the result is then promoted to type double:

Storage of Basic Types


Difference between float and double in C/C++


To represent floating point numbers, we use float , double , and long double . What’s the difference? double has roughly twice the precision of float . float is a 32-bit IEEE 754 single precision floating point number: 1 bit for the sign, 8 bits for the exponent, and 23 bits for the value. float has about 7 decimal digits of precision. double is a 64-bit IEEE 754 double precision floating point number: 1 bit for the sign, 11 bits for the exponent, and 52 bits for the value. double has about 15 decimal digits of precision.

Let’s take an example: For a quadratic equation x^2 – 4.0000000 x + 3.9999999 = 0 , the exact roots to 10 significant digits are, r1 = 2.000316228 and r2 = 1.999683772. Notice the difference in using float and double.

Complexity Analysis

  • The time complexity of the given code is O(1) 
  • The auxiliary space complexity of the code is also O(1)

Difference between float and double

Let us see the differences in a tabular form that is as follows:


Learn C++

4.8 — Floating point numbers

Integers are great for counting whole numbers, but sometimes we need to store very large (positive or negative) numbers, or numbers with a fractional component. A floating point type variable is a variable that can hold a number with a fractional component, such as 4320.0, -3.33, or 0.01226. The floating part of the name floating point refers to the fact that the decimal point can “float” -- that is, it can support a variable number of digits before and after the decimal point.

When writing floating point numbers in your code, the decimal separator must be a decimal point. If you’re from a country that uses a decimal comma, you’ll need to get used to using a decimal point instead.

There are three different floating point data types: float , double , and long double . As with integers, C++ does not define the actual size of these types (but it does guarantee minimum sizes).

On modern architectures, floating point representation almost always follows IEEE 754 binary format (created by William Kahan ). In this format, a float is 4 bytes, a double is 8 bytes, and a long double can be equivalent to a double (8 bytes), 80-bits (often padded to 12 bytes), or 16 bytes.

Floating point data types are always signed (can hold positive and negative values).

Here are some definitions of floating point variables:

When using floating point literals, always include at least one decimal place (even if the decimal is 0). This helps the compiler understand that the number is a floating point number and not an integer.

Note that by default, floating point literals default to type double. An f suffix is used to denote a literal of type float.

Best practice

Always make sure the type of your literals matches the type of the variables they’re being assigned to or used to initialize. Otherwise an unnecessary conversion will result, possibly with a loss of precision.

Printing floating point numbers

Now consider this simple program:

The results of this seemingly simple program may surprise you:

In the first case, std::cout printed 5, even though we typed in 5.0. By default, std::cout will not print the fractional part of a number if the fractional part is 0.

In the second case, the number prints as we expect.

In the third case, it printed the number in scientific notation (if you need a refresher on scientific notation, see lesson 4.7 -- Introduction to scientific notation ).

Floating point range

Assuming IEEE 754 representation:

The 80-bit floating point type is a bit of a historical anomaly. On modern processors, it is typically implemented using 12 or 16 bytes (which is a more natural size for processors to handle).

It may seem a little odd that the 80-bit floating point type has the same range as the 16-byte floating point type. This is because they have the same number of bits dedicated to the exponent -- however, the 16-byte number can store more significant digits.

Floating point precision

Consider the fraction 1/3. The decimal representation of this number is 0.33333333333333… with 3’s going out to infinity. If you were writing this number on a piece of paper, your arm would get tired at some point, and you’d eventually stop writing. And the number you were left with would be close to 0.3333333333…. (with 3’s going out to infinity) but not exactly.

On a computer, an infinite precision number would require infinite memory to store, and we typically only have 4 or 8 bytes per value. This limited memory means floating point numbers can only store a certain number of significant digits -- any additional significant digits are either lost or represented imprecisely. The number that is actually stored will be close to the desired number, but not exact. We’ll show an example of this in the next section.

The precision of a floating point type defines how many significant digits it can represent without information loss.

The number of digits of precision a floating point type has depends on both the size (floats have less precision than doubles) and the particular value being stored (some values can be represented more precisely than others).

For example, a float has 6 to 9 digits of precision. This means that any decimal value with up to 6 significant digits can be stored in a float and converted back to the same 6 digits. A value with 7 to 9 significant digits may or may not survive that round trip, depending on the specific value. A value with more than 9 significant digits will generally not be preserved, apart from special cases such as powers of two.

Double values have between 15 and 17 digits of precision, with most double values having at least 16 significant digits. Long double has a minimum precision of 15, 18, or 33 significant digits depending on how many bytes it occupies.

Key insight

A floating point type can only precisely represent a certain number of significant digits. Using a value with more significant digits than the minimum may result in the value being stored inexactly.

Outputting floating point values

When outputting floating point numbers, std::cout has a default precision of 6 -- that is, it assumes all floating point variables are only significant to 6 digits (the minimum precision of a float), and hence it will round off anything after that.

The following program shows std::cout limiting its output to 6 significant digits:

This program outputs:

Note that each of these has only 6 significant digits.

Also note that std::cout will switch to outputting numbers in scientific notation in some cases. Depending on the compiler, the exponent will typically be padded to a minimum number of digits. Fear not, 9.87654e+006 is the same as 9.87654e6, just with some padding 0’s. The minimum number of exponent digits displayed is compiler-specific (Visual Studio uses 3, some others use 2 as per the C99 standard).

We can override the default precision that std::cout shows by using an output manipulator function named std::setprecision() . Output manipulators alter how data is output, and are defined in the iomanip header.

Because we set the precision to 17 digits using std::setprecision() , each of the above numbers is printed with 17 digits. But, as you can see, the numbers certainly aren’t precise to 17 digits! And because floats are less precise than doubles, the float has more error.

Output manipulators (and input manipulators) are sticky -- meaning if you set them, they will remain set.

Precision issues don’t just impact fractional numbers, they impact any number with too many significant digits. Let’s consider a big number:

123456792 is greater than 123456789. The value 123456789.0 has 10 significant digits, but float values typically have 7 digits of precision (and the result of 123456792 is precise only to 7 significant digits). We lost some precision! When precision is lost because a number can’t be stored precisely, this is called a rounding error .

Consequently, one has to be careful when using floating point numbers that require more precision than the variables can hold.

Favor double over float unless space is at a premium, as the lack of precision in a float will often lead to inaccuracies.

Rounding errors make floating point comparisons tricky

Floating point numbers are tricky to work with due to non-obvious differences between binary (how data is stored) and decimal (how we think) numbers. Consider the fraction 1/10. In decimal, this is easily represented as 0.1, and we are used to thinking of 0.1 as an easily representable number with 1 significant digit. However, in binary, decimal value 0.1 is represented by the infinite sequence: 0.00011001100110011… Because of this, when we assign 0.1 to a floating point number, we’ll run into precision problems.

You can see the effects of this in the following program:

This outputs:

On the top line, std::cout prints 0.1, as we expect.

On the bottom line, where we have std::cout show us 17 digits of precision, we see that d is actually not quite 0.1! This is because the double had to round the infinite binary expansion to fit its limited memory. The result is a number that is precise to 16 significant digits (type double guarantees at least 15), but the number is not exactly 0.1. Rounding errors may make a number either slightly smaller or slightly larger, depending on where the rounding happens.

Rounding errors can have unexpected consequences:

Although we might expect that d1 and d2 should be equal, we see that they are not. If we were to compare d1 and d2 in a program, the program would probably not perform as expected. Because floating point numbers tend to be inexact, comparing floating point numbers is generally problematic -- we discuss the subject more (and solutions) in lesson 6.6 -- Relational operators and floating point comparisons .

One last note on rounding errors: mathematical operations (such as addition and multiplication) tend to make rounding errors grow. So even though 0.1 has a rounding error in the 17th significant digit, when we add 0.1 ten times, the rounding error has crept into the 16th significant digit. Continued operations would cause this error to become increasingly significant.

Rounding errors occur when a number can’t be stored precisely. This can happen even with simple numbers, like 0.1. Therefore, rounding errors can, and do, happen all the time. Rounding errors aren’t the exception -- they’re the norm. Never assume your floating point numbers are exact.

A corollary of this rule is: be wary of using floating point numbers for financial or currency data.

Related content

For more insight into how floating point numbers are stored in binary, check out the float.exposed tool. To learn more about floating point numbers and rounding errors, floating-point-gui.de and fabiensanglard.net have approachable guides on the topic.

NaN and Inf

There are two special categories of floating point numbers. The first is Inf , which represents infinity. Inf can be positive or negative. The second is NaN , which stands for “Not a Number”. There are several different kinds of NaN (which we won’t discuss here). NaN and Inf are only available if the compiler uses a specific format (IEEE 754) for floating point numbers. If another format is used, the following code produces undefined behavior.

Here’s a program showing all three:

And the results using Visual Studio 2008 on Windows:

INF stands for infinity, and IND stands for indeterminate. Note that the results of printing Inf and NaN are platform specific, so your results may vary.

Avoid division by 0.0 altogether, even if your compiler supports it.

To summarize, the two things you should remember about floating point numbers:

  • Floating point numbers are useful for storing very large or very small numbers, including those with fractional components.
  • Floating point numbers often have small rounding errors, even when the number has fewer significant digits than the precision. Many times these go unnoticed because they are so small, and because the numbers are rounded for output. However, comparisons of floating point numbers may not give the expected results. Performing mathematical operations on these values will cause the rounding errors to grow larger.



COMMENTS

  1. C Float and Double

    Float and double are two primitive data types in C programming that are used to store decimal values. They both store floating point numbers but they differ in the level of precision to which they can store the values. In this article, we will study each of them in detail, their memory representation, and the difference between them.

  2. Float in C

    The float data type is typically used when you need to store numbers with a fractional component, and it occupies 4 bytes in memory. It is important to note that while floating-point numbers provide a versatile way to work with real numbers, they have limitations in terms of precision due to the finite representation of real numbers in binary.

  3. Floating Representations (GNU C Language Manual)

    28.1 Floating-Point Representations. Storing numbers as floating point allows representation of numbers with fractional values, in a range larger than that of hardware integers. A floating-point number consists of a sign bit, a significand (also called the mantissa), and a power of a fixed base. GNU C uses the floating-point representations specified by the IEEE 754-2008 Standard for Floating ...

  4. Floating-Point Data Types (GNU C Language Manual)

    C has three floating-point data types: double. "Double-precision" floating point, which uses 64 bits. This is the normal floating-point type, and modern computers normally do their floating-point computations in this type, or some wider type. Except when there is a special reason to do otherwise, this is the type to use for floating-point ...

  5. C float data type

    In C float is a data type that represents floating point numbers, using 32 bits. We use this type more often than double, because we rarely need the double's precision. ... Representation. The C standard does not explicitly specify the precision that needs to be supported.

  6. floating point

    8. I was trying to understand the floating point representation in C using this code (both float and int are 4 bytes on my machine): We know that the binary representation of x will be the following. Therefore I would have expected y to be represented as follows. Leading to y = (-1)^0 * 2^(0-127) * (1+2^(-22) + 2^(-23)) = 5.87747E-39.

  7. Data Types in C

    Data Types in C There are several different ways to store data in C, and they are all unique from each other. The types of data that information can be stored as are called data types. ... Floating point number data types Basic Floating point numbers: float. float takes at least 32 bits to store, but gives us 6 decimal places from 1.2E-38 to 3. ...

  8. C Data Types

    In C, signed and unsigned are type modifiers. You can alter the data storage of a data type by using them: signed - allows for storage of both positive and negative numbers. unsigned - allows for storage of only positive numbers. For example, // valid codes unsigned int x = 35; int y = -35; // signed int int z = 36; // signed int // invalid ...

  9. Float Data Type in C

    The float data type in C is used to represent floating-point numbers, which include both whole numbers and fractions. It's particularly handy when you're dealing with values that require decimal precision, like scientific calculations, financial operations, and graphics processing.

  10. Floating-Point Data Type

    Floating-Point Data Type Kenneth Leroy Busbee and Dave Braunschweig. Overview. A floating-point data type uses a formulaic representation of real numbers as an approximation so as to support a trade-off between range and precision. For this reason, floating-point computation is often found in systems which include very small and very large real numbers, which require fast processing times.

  11. Float Example (GNU C Language Manual)

    Here's a function that operates on and returns floating point numbers that don't have to be integers. Floating point represents a number as a fraction together with a power of 2. (For more detail, see Floating-Point Data Types.) This example calculates the average of three floating point numbers that are passed to it as arguments:

  12. Type float

    Type float. Floating-point numbers use the IEEE (Institute of Electrical and Electronics Engineers) format. Single-precision values with float type have 4 bytes, consisting of a sign bit, an 8-bit excess-127 binary exponent, and a 23-bit mantissa. The mantissa represents a number between 1.0 and 2.0. Since the high-order bit of the mantissa is ...

  13. Float Data Types in C with examples

    Floating point data types store numerical values with a fractional portion. There are two floating data types, named float and double. These may also have the qualifier long. Long float may be equivalent to double and long double may be equivalent to double, or it may refer to a separate, "extralarge" double-precision data type requiring more than 8 bytes of memory.

  14. Single-precision floating-point format

    Before the widespread adoption of IEEE 754-1985, the representation and properties of floating-point data types depended on the computer manufacturer and computer model, and upon decisions made by programming-language designers. E.g., GW-BASIC's single-precision data type was the 32-bit MBF floating-point format.

  15. C Programming

    This video provides a hands-on explanation of how the float data type is stored in C memory and how to use it in programs using hands-on examples

  16. Difference between float and double in C/C++

    This data type supports up to 15 digits of storage. For float data type, the format specifier is %f. For double data type, the format specifier is %lf. For example -: 3.1415: For example -: 5.3645803: It is less costly in terms of memory usage. It is costly in terms of memory usage. It requires less memory space as compared to double data type.

  17. 4.8

    There are three different floating point data types: float, double, and long double. As with integers, C++ does not define the actual size of these types (but it does guarantee minimum sizes). On modern architectures, floating point representation almost always follows IEEE 754 binary format (created by William Kahan). In this format, a float ...

  18. What is the size of float and double in C and C++? [duplicate]

    The value representation of floating-point types is implementation-defined. Integral and floating types are collectively called arithmetic types. Specializations of the standard template std::numeric_limits (18.3) shall specify the maximum and minimum values of each arithmetic type for an implementation. ... You can try to use a library ...

  19. c++

    Here is what the standard C99 (ISO-IEC 9899 6.2.5 §10) or C++2003 (ISO-IEC 14882-2003 3.1.9 §8) standards say: There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type ...