6.1. Numbers

It might seem that we have only seen one type of data so far: numbers. However, Python actually has two types of numbers – integers and floating point numbers (or floats, for short). We have seen both already! Understanding the differences between integers and floats is crucial for any data scientist, as we’ll see in this section.

6.1.1. Two Types of Numbers

Python recognizes two types of numbers: integers and floating point numbers (or floats for short). An integer is a whole number, written without a decimal point. For instance:

42
42
-7
-7

A float is a number that does have decimals (even if that decimal component is zero), like:

3.14159265
3.14159265
42.0
42.0

When floats get really big or really small, they might be printed in scientific notation, where 1e+18 means \(1 \times 10^{18}\). You can write floats in scientific notation, too.

1000000000000000000.0
1e+18
1e18
1e+18
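As a quick check, the short sketch below confirms that a float written in scientific notation is the same value as its written-out form. The variable names are made up for illustration.

```python
# Scientific notation is just another way to write a float literal.
written_out = 1000000000000000000.0
scientific = 1e18

same_value = written_out == scientific  # both literals denote the same float
kind = type(scientific)                 # scientific notation always gives a float

# A negative exponent writes a very small number.
small = 1.5e-3                          # the same float as 0.0015

print(same_value, kind, small)
```

Note that 1e18 is a float even though it has no fractional part.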

Tip

You can place underscores in large numbers to make them easier to read – Python will ignore them. For instance, it is hard to read 10000000000, but somewhat easier to read 10_000_000_000.
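Here is a minimal sketch showing that the underscores do not change the value. The variable names are illustrative only.

```python
# Underscores are purely visual; both literals are the same integer.
plain = 10000000000
readable = 10_000_000_000
same = plain == readable

# Underscores can be used in float literals as well.
also_float = 1_000.5

print(same, readable, also_float)
```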

When we write a number, Python automatically determines whether it is a float or an integer. We can see the type that Python has determined using the type function:

type(42)
int
type(-7)
int
type(3.14159265)
float
type(42.0)
float

We can also use the type function to ask the type of a variable:

x = 42
type(x)
int
y = 7
z = 2
type(y + z)
int

6.1.2. Types and Arithmetic

Every value in Python has a type. Since an expression results in a value, we can ask about its type:

type(1 + 4)
int

As we see above, adding two integers results in an integer (the same is true of subtracting them). However, if we add an integer and a float, the result will be a float:

type(1 + 3.1415)
float

This makes sense: the result of 1 + 3.1415 is 4.1415, and so Python treats it as a float because it is a decimal number. But consider this:

type(1.2 + 3.8)
float

Mathematically speaking, the result of \(1.2 + 3.8\) is just \(5\), which has no fractional component. But Python treats the result as a float instead of an integer! This might surprise you at first, but Python is following a simple rule here: if the result of arithmetic could be a decimal number, the result is a float.

Let’s put that to the test. Suppose we add two integers. The result cannot have a decimal component, so it will be an integer. But if we add an integer and a float, the result could be a decimal number, depending on the exact numbers used. Therefore the result will be a float.

Now you try:

Suppose we perform the division 6/3. What is the type of the result?

Since dividing two integers could result in a decimal number, the result is always a float, even when the answer is, mathematically speaking, an integer.

Tip

If you find this rule confusing, you can replace it with these two equivalent rules instead:

  1. When two numbers are combined, with one of them being a float, the result is a float.

  2. Dividing two numbers with / results in a float, even if both numbers are integers.
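The two rules can be checked directly with the type function. The following is a small sketch; the variable names are made up for illustration.

```python
# Rule 1: if either operand is a float, the result is a float.
int_plus_int = 2 + 3         # both integers, so the result is an int
int_plus_float = 2 + 3.0     # one float, so the result is a float
int_times_float = 2 * 3.0    # the rule covers other operations too

# Rule 2: dividing with / gives a float, even for two integers.
int_div_int = 6 / 3

for label, value in [
    ("2 + 3", int_plus_int),
    ("2 + 3.0", int_plus_float),
    ("2 * 3.0", int_times_float),
    ("6 / 3", int_div_int),
]:
    print(label, "=", value, "has type", type(value).__name__)
```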

6.1.3. Conversions

Sometimes we know that something Python thinks is a float should be an integer, or vice versa. For instance, we have seen that 6/3 will be a float, even though we know that (mathematically speaking) the result has no fractional part. We can convert a float to an integer using the int function, like so:

int(4/2)
2

Likewise, if we want to convert an integer to a float, we use the float function:

float(42)
42.0

Suppose you try to convert a number like 3.14 to an integer. What do you think will happen?

int(3.14)
3

It looks like Python is rounding the number – but be careful. To be precise, Python is rounding the number towards zero. In other words, it simply throws away everything after the decimal point:

int(3.9999)
3
int(-2.9999)
-2

Now you try:

Which of the two is bigger? int(-3.9999) or int(-4.0001)?

Since Python rounds the numbers towards zero, int(-3.9999) evaluates to -3, while int(-4.0001) evaluates to -4. Because -3 > -4, int(-3.9999) is the bigger of the two.
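The sketch below checks this answer. For contrast, it also uses math.floor from Python’s standard library, which always rounds down instead of towards zero. That comparison is an addition for illustration, and the variable names are made up.

```python
import math

a = int(-3.9999)   # fractional part dropped, moving towards zero
b = int(-4.0001)
a_is_bigger = a > b

# For negative numbers, rounding towards zero and rounding down differ.
towards_zero = int(-3.9999)
rounded_down = math.floor(-3.9999)

print(a, b, a_is_bigger)
print(towards_zero, rounded_down)
```

For positive numbers the two approaches agree: int(3.9999) and math.floor(3.9999) are both 3.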

6.1.4. Integers and Floats Redux

There are some important differences between integers and floats. First, integers can be arbitrarily large, while floats can overflow. For instance, let’s compute \(2^{10{,}000}\), first using integers, and then using floats.

With integers, we write 2**10_000. The result will be an integer (and a big one, too):

2**10_000
19950631168807583848837421626835850838234968318861924548520089498529438830221946631919961684036194597899331129423209124271556491349413781117593785932096323957855730046793794526765246551266059895520550086918193311542508608460618104685509074866089624888090489894838009253941633257850621568309473902556912388065225096643874441046759871626985453222868538161694315775629640762836880760732228535091641476183956381458969463899410840960536267821064621427333394036525565649530603142680234969400335934316651459297773279665775606172582031407994198179607378245683762280037302885487251900834464581454650557929601414833921615734588139257095379769119277800826957735674444123062018757836325502728323789270710373802866393031428133241401624195671690574061419654342324638801248856147305207431992259611796250130992860241708340807605932320161268492288496255841312844061536738951487114256315111089745514203313820202931640957596464756010405845841566072044962867016515061920631004186422275908670900574606417856951911456055068251250406007519842261898059237118054444788072906395242548339221982707404473162376760846613033778706039803413197133493654622700563169937455508241780972810983291314403571877524768509857276937926433221599399876886660808368837838027643282775172273657572744784112294389733810861607423253291974813120197604178281965697475898164531258434135959862784130128185406283476649088690521047580882615823961985770122407044330583075869039319604603404973156583208672105913300903752823415539745394397715257455290510212310947321610753474825740775273986348298498340756937955646638621874569499279016572103701364433135817214311791398222983845847334440270964182851005072927748364550578634501100852987812389473928699540834346158807043959118985815145779177143619698728131459483783202081474982171858011389071228250905826817436220577475921417653715687725614904582904992461028630081535583308130101987675856234343538955409175623400844887526162643568648833519463720377293240094456246923254350400678027273837755376406726898636241037
491410966718557050759098100246789880178271925953381282421954028302759408448955014676668389697996886241636313376393903373455801407636741877711055384225739499110186468219696581651485130494222369947714763069155468217682876200362777257723781365331611196811280792669481887201298643660768551639860534602297871557517947385246369446923087894265948217008051120322365496288169035739121368338393591756418733850510970271613915439590991598154654417336311656936031122249937969999226781732358023111862644575299135758175008199839236284615249881088960232244362173771618086357015468484058622329792853875623486556440536962622018963571028812361567512543338303270029097668650568557157505516727518899194129711337690149916181315171544007728650573189557450920330185304847113818315407324053319038462084036421763703911550639789000742853672196280903477974533320468368795868580237952218629120080742819551317948157624448298518461509704888027274721574688131594750409732115080498190455803416826949787141316063210686391511681774304792596709376

Now let’s try it with floats. We can do this by writing 2.0**10_000:

2.0**10_000
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-22-be6decc9e3d7> in <module>
----> 1 2.0**10_000

OverflowError: (34, 'Numerical result out of range')

OverflowError! This is Python’s way of telling us that the result of the expression is too big for Python to compute using floats.
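If we want a program to keep running when this happens, one option is to catch the error. The sketch below is one way to do that. It also looks up the largest finite float using sys.float_info.max from the standard library. The variable names are illustrative only.

```python
import sys

# The largest finite float Python can represent on this machine.
largest_float = sys.float_info.max

try:
    result = 2.0 ** 10_000       # too big for a float
except OverflowError as err:
    result = None                # record that the float computation failed
    print("Float overflow:", err)

# The same power computed with integers works fine.
big_int = 2 ** 10_000
digits = len(str(big_int))       # count the decimal digits of the result

print("Largest float:", largest_float)
print("Digits in 2**10_000:", digits)
```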

Floating point numbers are also of a fixed precision, meaning that only so many digits can be stored. If we try to compute or store a number with too many significant digits (roughly more than 16 or 17), Python will “forget” the digits beyond a certain point:

0.12345678901234567890123
0.12345678901234568
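To see that the extra digits really are lost, we can compare the long literal with the shortened value that Python printed. This sketch also reads sys.float_info.dig, which reports how many decimal digits a float is guaranteed to preserve. The variable names are made up for illustration.

```python
import sys

long_literal = 0.12345678901234567890123
stored = 0.12345678901234568

# Both literals end up as the same float, so the extra digits made no difference.
same = long_literal == stored

# Number of decimal digits a float can always represent faithfully.
guaranteed_digits = sys.float_info.dig

print(same, guaranteed_digits)
```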

Here’s another example. \(1/(2^{10{,}000})\) is a very small decimal number, but it isn’t zero. It is too small for Python to calculate using floats, however:

1/(2**10000)
0.0

Because floats lack some precision, small arithmetic errors called floating point errors can result from float operations:

# should be 251, exactly!
2.51 * 100
250.99999999999997

This also means that a result that should be exactly zero sometimes comes out as a very small nonzero number instead:

(3.0 * 1.2) - 3.6
-4.440892098500626e-16

This might look like a bug in Python, but it isn’t! This is an inherent limitation of all programming languages which use floating point numbers (which is most of them). It usually isn’t that big of an issue, as long as you’re aware of the problem and are careful. For instance, you should get an uneasy feeling when you write code like int(2.51 * 100), because it may not behave the way you’d first expect:

# "should" be 251!
int(2.51 * 100)
250
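One common way to be careful is to avoid testing floats for exact equality and to ask whether they are close enough instead. The sketch below uses math.isclose from the standard library with its default tolerance. The variable names are illustrative only.

```python
import math

product = 3.0 * 1.2

# Exact comparison fails because of the tiny floating point error.
exactly_equal = product == 3.6

# math.isclose allows for a small relative difference between the two values.
close_enough = math.isclose(product, 3.6)

print(product, exactly_equal, close_enough)
```

Whether the default tolerance is appropriate depends on the calculation, so treat this as a starting point, not a universal fix.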

Now you try:

Python supports a round() function that (in simple terms) rounds a decimal to the nearest integer, like we do by hand, so 3.7 gets rounded to 4 and 3.1 gets rounded to 3. With this in mind, answer the following question: Are int(2.51*100) and round(2.51*100) equivalent?

As we saw earlier, Python evaluates 2.51*100 to 250.99999999999997, so the two are not equivalent:

  • int(250.99999999999997) evaluates to 250

  • round(250.99999999999997) evaluates to 251
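The comparison can be run directly. This is a minimal sketch with made-up variable names.

```python
value = 2.51 * 100            # slightly less than 251 due to floating point error

truncated = int(value)        # drops the fractional part
rounded = round(value)        # goes to the nearest integer

equivalent = truncated == rounded

print(value, truncated, rounded, equivalent)
```

When the goal is to recover a whole number from a float calculation, round is usually the safer choice of the two.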

6.1.5. Summary

  • Everything in Python has a type – these are called data types.

  • We can find the type of an object by calling the type function on an object or expression.

  • Python has two basic number types: float and int.

  • Division with /, or any expression that involves a float, results in a float.