Double-precision floating-point format

Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. Floating point is used to represent fractional values, or when a wider range is needed than fixed point of the same bit width provides, even at the cost of precision. Double precision may be chosen when the range or precision of single precision would be insufficient. In the IEEE 754-2008 standard, the 64-bit base-2 format is officially referred to as binary64; it was called double in IEEE 754-1985. IEEE 754 specifies additional floating-point formats, including 32-bit base-2 single precision and, more recently, base-10 representations. One of the first programming languages to provide single- and double-precision floating-point data types was Fortran. Before the widespread adoption of IEEE 754-1985, the representation and properties of floating-point data types depended on the computer manufacturer and computer model, and on decisions made by programming-language implementers. For example, GW-BASIC's double-precision data type was the 64-bit MBF floating-point format. (Wikipedia).
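
As a rough illustration of the 64-bit layout described above, the following Python sketch splits a value into the three binary64 fields: 1 sign bit, 11 biased-exponent bits (bias 1023), and 52 fraction bits. The helper name decompose_binary64 is hypothetical and not part of any library, and the sketch assumes the interpreter's float is IEEE 754 binary64, which holds for CPython on mainstream platforms.

```python
import struct

def decompose_binary64(x: float):
    """Return (sign, biased_exponent, fraction) for a binary64 value.

    Hypothetical helper for illustration; assumes Python's float is binary64.
    """
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)   # 52 explicitly stored significand bits
    return sign, exponent, fraction

# -0.15625 is -1.25 * 2**-3, so the stored exponent is 1023 - 3 = 1020
# and the fraction field encodes 0.25, i.e. 2**50 out of 2**52.
sign, exponent, fraction = decompose_binary64(-0.15625)
print(sign, exponent, fraction == 1 << 50)

# Two consequences of finite precision:
print(0.1 + 0.2 == 0.3)            # False: neither side is exactly representable
print(2.0**53 + 1.0 == 2.0**53)    # True: integers above 2**53 are no longer all representable
```

The second pair of checks shows the trade-off mentioned in the text: the format covers a very wide range, but only about 53 bits of significand are available at any magnitude.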

Binary 4 – Floating Point Binary Fractions 1

This is the fourth in a series of videos about the binary number system which is fundamental to the operation of a digital electronic computer. In particular, this video covers the representation of real numbers using floating point binary notation. It begins with a description of standard

From playlist Binary

Binary 5 – Floating Point Range versus Precision

This is the fifth in a series of videos about the binary number system which is fundamental to the operation of a digital electronic computer. In particular, this video elaborates on the representation of real numbers using floating point binary notation. It explains how the relative allo

From playlist Binary

IEEE 754 Standard for Floating Point Binary Arithmetic

This computer science video describes the IEEE 754 standard for floating point binary. The layouts of single precision, double precision and quadruple precision floating point binary numbers are described, including the sign bit, the biased exponent and the mantissa. Examples of how to con

From playlist Binary

Binary 8 – Floating Point Binary Subtraction

This is the eighth in a series of videos about the binary number system which is fundamental to the operation of a digital electronic computer. In particular, this video covers subtraction of floating point binary numbers for a given sized mantissa and exponent, both in two’s complement.

From playlist Binary

Binary 7 – Floating Point Binary Addition

This is the seventh in a series of videos about the binary number system which is fundamental to the operation of a digital electronic computer. In particular, this video covers adding together floating point binary numbers for a given sized mantissa and exponent, both in two’s complement.

From playlist Binary

How to add two decimals together

👉 You will learn how to add and subtract numbers in decimal form. When adding and subtracting decimals it is very important to align the decimal points and use zero as space holders. Then you will apply the operations just like we do in multi-digit operations but keep track of the decima

From playlist Decimals

How to subtract a larger decimal from a smaller decimal

👉 You will learn how to add and subtract numbers in decimal form. When adding and subtracting decimals it is very important to align the decimal points and use zero as space holders. Then you will apply the operations just like we do in multi-digit operations but keep track of the decima

From playlist Decimals

Learn how to subtract a larger decimal from a smaller decimal

👉 You will learn how to add and subtract numbers in decimal form. When adding and subtracting decimals it is very important to align the decimal points and use zero as space holders. Then you will apply the operations just like we do in multi-digit operations but keep track of the decima

From playlist Decimals

Floating Point Representation

From playlist Scientific Computing

Adding three digit decimals

👉 You will learn how to add and subtract numbers in decimal form. When adding and subtracting decimals it is very important to align the decimal points and use zero as space holders. Then you will apply the operations just like we do in multi-digit operations but keep track of the decima

From playlist Decimals

Introducing MATLAB Fundamental Classes (Data Types)

Work with numerical, textual, and logical data types. For more videos, visit http://www.mathworks.com/products/matlab/examples.html

From playlist MATLAB Tutorials: Getting Started with MATLAB

[1] - Introduction to C/C++ - Basic starting points

This is my very first video introducing basic concepts of programming in C/C++. See the notebook page here: https://tinyurl.com/y88xv3kl Please comment and give me feedback. Was it too basic, too slow, too fast? What should I cover in the next video? Did I skip over something or do s

From playlist One-off Tutorials

Some quirks of computing - MegaFavNumbers

This video is about IEEE's binary64 format and some quirks of finite precision arithmetic on computers. The video is a response to the #MegaFavNumbers challenge by the YouTube maths community. Check out the full playlist here: https://www.youtube.com/playlist?list=PLar4u0v66vIodqt3KSZPsY

From playlist MegaFavNumbers

0047 - Custom C++ Web Server: refactoring

This is #47 in my series of live (Twitch) coding streams, working on writing my own web server and service framework in C++. This stream I mostly worked on refactoring the existing web server code. I worked on the Json, WebServer, Base64, and Http components. Rather than starting with a

From playlist Excalibur

12/05/2019, Nicolas Brisebarre

Nicolas Brisebarre, École Normale Supérieure de Lyon Title: Correct rounding of transcendental functions: an approach via Euclidean lattices and approximation theory Abstract: On a computer, real numbers are usually represented by a finite set of numbers called floating-point numbers. Wh

From playlist Fall 2019 Symbolic-Numeric Computing Seminar

C# Tutorial

From playlist C# Tutorial

Double Precision | Lecture 2 | Numerical Methods for Engineers

A description of the IEEE standard for a double precision number in MATLAB. Join me on Coursera: https://www.coursera.org/learn/numerical-methods-engineers Lecture notes at http://www.math.ust.hk/~machas/numerical-methods-for-engineers.pdf Subscribe to my channel: http://www.youtube.co

From playlist Numerical Methods for Engineers

Related pages

ECMAScript | IEEE 754 | Single-precision floating-point format | CUDA | Significand | X87 | Radix point | Dynamic range | 64-bit MBF | Normal number (computing) | Signed zero | Rounding | IEEE 754-2008 | Floating-point arithmetic | Subnormal number | Exponent bias | Sign bit | Machine epsilon | C data types | Infinity | Bit | Fixed-point arithmetic | NaN | Computer number format | Significant figures | IEEE 754-1985