3. Line graphs

Like scatter plots, line graphs are useful for visualizing the relationship between two numerical variables. Unlike scatter plots, however, line graphs assume that for any given “x” value, there is a unique “y” value. We often use line graphs for visualizing how a numerical variable changes over time.

Line graphs are constructed similarly to scatter plots. We start by choosing “x” and “y” variables and plot each (x, y) pair as a point. We then order the points by their “x” values and draw a straight line segment between each consecutive pair. The line segments have the effect of emphasizing the trend in “y” as “x” increases.
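The construction can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how pandas draws its plots: the function name line_segments and the sample points are made up for this example.

```python
def line_segments(points):
    """Return the segments of a line graph as pairs of consecutive points.

    `points` is a list of (x, y) pairs in any order.
    """
    # Order the points by their x values.
    ordered = sorted(points, key=lambda point: point[0])

    # A line graph assumes a unique y value for each x value.
    xs = [x for x, _ in ordered]
    if len(set(xs)) != len(xs):
        raise ValueError("each x value must appear only once")

    # Pair each point with the next one: these are the segments to draw.
    return list(zip(ordered, ordered[1:]))


# Made-up (year, temperature) pairs, deliberately out of order.
points = [(1950, 51.2), (1900, 50.8), (2000, 52.9)]
for start, end in line_segments(points):
    print(start, "->", end)
```

Sorting first matters: connecting the points in the order they happen to be stored would make the line double back on itself whenever the data is not already ordered by “x”.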

3.2. Misleading scales

When interpreting a visualization, you should always pay special attention to the scale of the x- and y-axes. By manipulating these scales, it is possible to make trends appear and disappear.

Consider the plot above. By default, pandas (through Matplotlib, the library it uses for plotting) chooses the scale of the y-axis so as to minimize empty space. In this case, the default axis ranges from 49 degrees to about 55 degrees. As a consequence, the line is more than twice as high in 2010 as it was in 1900. Of course, this doesn’t mean that the temperature is twice as high as it was in 1900! Still, if our goal were to make the change in temperature appear as large as possible, this would be the scale we would use.

On the other hand, if we wished to downplay the change in temperature over the last century, we might change the y-axis so that it starts at zero and ends at 100. This “fixes” the issue with the line doubling in height, but it has the effect of obscuring the warming trend. We can see this by re-creating the plot above, this time with ylim=(0, 100), which makes the lower and upper limits of the y-axis 0 degrees and 100 degrees, respectively.

average_temperatures_by_year.plot(kind='line', y='temperature', ylim=(0, 100));
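To see why the choice of y-axis limits changes the impression so much, it can help to work out how high a value is drawn relative to the bottom of the axis. The sketch below uses hypothetical temperatures for 1900 and 2010 (51.0 and 53.5 degrees, chosen only for illustration and not taken from the data set), and the helper names apparent_height and height_ratio are made up for this example.

```python
def apparent_height(value, y_min, y_max):
    """Fraction of the y-axis's height at which `value` is drawn."""
    return (value - y_min) / (y_max - y_min)


def height_ratio(early, late, y_min, y_max):
    """How many times higher `late` is drawn than `early` on this axis."""
    return apparent_height(late, y_min, y_max) / apparent_height(early, y_min, y_max)


# Hypothetical temperatures, for illustration only.
temp_1900, temp_2010 = 51.0, 53.5

# Compare a tight scale with a scale that starts at zero.
for y_min, y_max in [(49, 55), (0, 100)]:
    ratio = height_ratio(temp_1900, temp_2010, y_min, y_max)
    print(f"ylim=({y_min}, {y_max}): 2010 is drawn {ratio:.2f} times as high as 1900")
```

With these assumed values, the tight scale draws the 2010 point 2.25 times as high as the 1900 point, while the scale from 0 to 100 draws it only about 1.05 times as high. The temperatures are the same in both cases; only the baseline of the axis has changed.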

There isn’t necessarily a single “correct” scale for a particular visualization, and the same data can look very different under different scales. Therefore, when reading any type of graph, make sure you take a look at the scale of each axis before drawing conclusions!