
Plot Differences

Created 05 Nov 2023, Last edited 05 Nov 2023

With a list of values taken at fixed intervals it is often valuable to visualise the change in value between periods.

Examples include the distance traveled each year, derived from accumulated distance readings, or the change in pollution levels between measurements taken at fixed intervals.
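As a minimal sketch of the idea, the change for a period is simply each reading minus the one before it. The accumulated distance readings below are made-up values for illustration only, and `period_changes` is a hypothetical helper name.

```python
def period_changes(readings):
    """Return the change between each consecutive pair of readings."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


# Hypothetical accumulated distance readings, one per year (illustrative values).
accumulated = [12000, 19500, 31000, 38250]
per_year = period_changes(accumulated)
print(per_year)  # [7500, 11500, 7250]
```

Note that n readings give n - 1 changes, because the first reading has nothing before it to compare with.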

ppmCO2perYear.jpg

Origin of data. We wanted to see whether the year-on-year increments in PPM were showing signs of increasing.

Data (diff.csv):

"Year","PPM"
2000,371.81
2001,373.37
2002,375.02
2003,377.73
2004,380.35
2005,382.29
2006,384.61
2007,386.50
2008,387.21
2009,389.55
2010,392.46
2011,393.25
2012,396.18
2013,398.41
2014,401.38
2015,403.28
2016,402.85
2017,405.00
2018,407.38
2019,411.44
2020,411.52
2021,413.30
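One rough way to approach the question of whether the increments are growing is to compare the average yearly increment in the earlier years with that in the later years. The sketch below uses only the standard library. The split at 2010 and the names `increments`, `early` and `late` are arbitrary choices for illustration, not part of the original analysis.

```python
import csv
import io
from statistics import mean

# The data above, inlined so the sketch runs without diff.csv.
CSV_TEXT = '''"Year","PPM"
2000,371.81
2001,373.37
2002,375.02
2003,377.73
2004,380.35
2005,382.29
2006,384.61
2007,386.50
2008,387.21
2009,389.55
2010,392.46
2011,393.25
2012,396.18
2013,398.41
2014,401.38
2015,403.28
2016,402.85
2017,405.00
2018,407.38
2019,411.44
2020,411.52
2021,413.30
'''

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
years = [int(r["Year"]) for r in rows]
ppm = [float(r["PPM"]) for r in rows]

# Increment for a year = that year's reading minus the previous year's reading.
increments = {y: round(b - a, 2) for y, a, b in zip(years[1:], ppm, ppm[1:])}

# Arbitrary split point (an assumption): up to 2010 versus after 2010.
early = mean(v for y, v in increments.items() if y <= 2010)
late = mean(v for y, v in increments.items() if y > 2010)
print(f"mean increment 2001-2010: {early:.3f} ppm/year")
print(f"mean increment 2011-2021: {late:.3f} ppm/year")
```

A comparison of two averages is only a crude check. The bar chart of individual increments shows how much the values jump around from year to year.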

Setup
Operating system
$ lsb_release -a
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:    20.04
Codename:    focal

$ python3 -V
Python 3.8.10

$ python3 -c "import matplotlib; print('matplotlib: {}'.format(matplotlib.__version__))"
matplotlib: 3.1.2

$ pip show matplotlib
Version: 3.1.2

$ pip show pandas
Version: 2.0.3

$ pip show wheel
Version: 0.34.2

Python Code:

Notes: The code below is intended for use with sequential yearly intervals (on the x-axis) but could be changed slightly to cater for any dates with any separation.
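For readings that are not evenly spaced, one possible adaptation is to divide each change by the time elapsed, which gives a rate that is comparable across gaps of different length. The sketch below is an assumption about how that might look, not the original script. The dates and values are invented, and `rate_per_year` is a hypothetical helper.

```python
from datetime import date


def rate_per_year(samples):
    """Given (date, value) pairs sorted by date, return (date, change per year) pairs."""
    rates = []
    for (d0, v0), (d1, v1) in zip(samples, samples[1:]):
        years_elapsed = (d1 - d0).days / 365.25  # average year length in days
        rates.append((d1, (v1 - v0) / years_elapsed))
    return rates


# Invented readings at uneven intervals (illustrative values only).
samples = [
    (date(2020, 1, 1), 100.0),
    (date(2020, 7, 1), 103.0),
    (date(2022, 7, 1), 115.0),
]
for when, rate in rate_per_year(samples):
    print(when.isoformat(), round(rate, 2))
```

Each rate is attributed to the later date of the pair, matching the way the yearly differences are labelled with the later year.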

Run the script below with: $ python3 diffa.py

from pandas import read_csv #, DataFrame
from datetime import datetime
from matplotlib import pyplot
import pandas as pd
# https://pandas.pydata.org/pandas-docs/version/0.12.0/cookbook.html#cookbook-plotting
# https://pandas.pydata.org/pandas-docs/version/0.12.0/10min.html
# https://machinelearningmastery.com/difference-time-series-dataset-python/
# https://www.w3schools.com/python/pandas/ref_df_diff.asp
#pip install wheel
#pip install pandas
#Installing collected packages: tzdata, numpy, python-dateutil, pandas

def parser(x1):
  return datetime.strptime(x1, '%Y')
 
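# Note: date_parser is deprecated from pandas 2.0 (it still works on 2.0.3, with a FutureWarning);
# date_format='%Y' is the documented replacement.
df1 = read_csv('diff.csv', header=0, parse_dates=[0], index_col=0, date_parser=parser)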
sfz = df1.squeeze() # Converts the single-column DataFrame to a Series
dfz = sfz.diff() # Difference from the previous row; the output is a Series whose first value is NaN
dfz = dfz.dropna() # Drop the first row, which has no previous year to compare with (NaN)
dfz = pd.DataFrame(dfz).reset_index() # Back to a DataFrame; the date index becomes the first column
dfz.columns = ['Year', 'PPM'] # Set the column names explicitly after the diff
# Convert from like 2020-01-01 to just the year 2020
dfz['Year'] = dfz['Year'].astype('str').str[:4] # First 4 characters hold the year
#print("------------")
#print(type(dfz))
#print(dfz)
#print("------------")
bargraph = dfz.plot.bar(x = 'Year', y = 'PPM', fontsize=9)
pyplot.show()
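If the years are only needed as bar labels, one possible simplification is to leave the Year column as plain integers, which avoids the date parsing and the later string slicing. This is a sketch under that assumption rather than a replacement for the script above. It inlines the first four rows of the data so it runs without diff.csv.

```python
import io

import pandas as pd

# First four rows of the data, inlined so the sketch needs no diff.csv.
csv_text = '"Year","PPM"\n2000,371.81\n2001,373.37\n2002,375.02\n2003,377.73\n'

# Year stays an integer index; selecting the PPM column yields a Series.
ppm = pd.read_csv(io.StringIO(csv_text), index_col="Year")["PPM"]

# diff() subtracts the previous row; the first row becomes NaN and is dropped.
increments = ppm.diff().dropna().round(2)
print(increments)
```

Calling `increments.plot.bar(fontsize=9)` followed by `pyplot.show()` should then draw the same kind of bar chart, with the integer years as labels.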

 

ppmCO2perYear.png

©2021 by Digital Thoughts.