
Plotting COVID-19 Statistics


Step 1

The COVID-19 data files can be downloaded from the New York Times GitHub site:

https://github.com/nytimes/covid-19-data

Here we see US, state, and county data files. These are updated daily. Look them over.

Step 2

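Consider the following graph (graph.png), made with this code:

from pandas import read_csv
from matplotlib import pyplot

# Read the state-level file; the first column (date) becomes the index.
series = read_csv('us-states.csv', header=0, index_col=0, parse_dates=True)

# Keep only the rows for Ohio.
oseries = series[series['state'] == "Ohio"]
print(oseries)
# Keep only the two columns we want to plot.
oseries = oseries[['cases','deaths']]
print(oseries)
oseries.plot()
# Save before show(): after the plot window is closed the figure is empty.
pyplot.savefig('graph.png')
pyplot.show()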

Link: https://repl.it/@JimSkon/CovidSamplePlotCSV1

  • Note the dataframe is a time series with the date as the index, and two columns.
  • Note that plot works by making a line FOR EACH column in the data. (Here 'cases' and 'deaths')

Step 3

Let's make the plot look nicer.

  1. Can we turn the code into a function that we pass a state to for plotting?
  2. Can we make the graph look nicer?
  pyplot.title("Cases and deaths for "+state)
  pyplot.ylabel('People')
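
One possible way to answer both questions is sketched below. The function names state_data and plot_state are made up for this example, and the sketch assumes the dataframe was loaded as in Step 2 (date index, with 'state', 'cases' and 'deaths' columns).

```python
import pandas as pd


def state_data(data, state):
    """Return the 'cases' and 'deaths' columns for one state."""
    subset = data[data['state'] == state]
    return subset[['cases', 'deaths']]


def plot_state(data, state):
    """Plot cases and deaths for one state and return the matplotlib Axes."""
    # DataFrame.plot draws one line per column and accepts a title directly.
    ax = state_data(data, state).plot(title="Cases and deaths for " + state)
    ax.set_ylabel('People')
    return ax
```

With series loaded as in Step 2, calling plot_state(series, "Ohio") and then pyplot.show() should display the labelled graph.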

Step 4

read_csv() can actually read the data directly from a URL, in this case from GitHub:

url = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv'
series = read_csv(url, header=0, index_col=0, parse_dates=True)

Using this we can get the up-to-date data every day!

(repl.it/@JimSkon/CovidSamplePlotCSVGitHub)

Step 5

Can we compare the number of cases between two states?

  1. Prompt for two states.
  2. Make two dataframes, one for each state, by selecting 'cases' for each state.
  3. Now we need a NEW dataframe with the dates as the index. How can we do that?
  4. The pandas concat() function allows us to combine two tables. It lines up rows that share the same index (the date in this case).
  5. pd.concat([d1,d2],axis=1), where d1 and d2 are the dataframes to combine and axis is the axis to combine along (0 is rows and 1 is columns; here we want columns).
  6. Can we fix up the headings?
  7. We can use the rename() function to rename columns: data.rename(columns={'old_name':'new_name'})
  8. We must rename before concat. Why?
repl.it/@JimSkon/CovidSamplePlotCSVGitHubCompareStates
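
The steps above can be sketched as a small function. This is only one way to do it, not necessarily what the linked repl does; the name compare_states is made up for this example, and data is assumed to be the dataframe loaded in Step 4.

```python
import pandas as pd


def compare_states(data, state1, state2):
    """Return a dataframe with one 'cases' column per state, aligned by date."""
    # Select 'cases' for each state and rename the column to the state name.
    # Renaming first keeps the two columns distinguishable after concat;
    # otherwise both would be called 'cases'.
    d1 = data[data['state'] == state1][['cases']].rename(columns={'cases': state1})
    d2 = data[data['state'] == state2][['cases']].rename(columns={'cases': state2})
    # axis=1 puts the frames side by side, matching rows on the date index.
    return pd.concat([d1, d2], axis=1)
```

Calling compare_states(series, "Ohio", "Michigan").plot() should then draw one line per state. Dates present for only one of the states show up as missing values (NaN) in the other column.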

Step 6

Can we compare either cases or deaths?

Add an option to choose either cases or deaths:

"Enter 1 for cases, 2 for deaths: "

What does this take?

repl.it/@JimSkon/CovidPlotCSVGitHubCompareStatesWithDeath
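
One thing this takes is turning the menu answer into a column name. A minimal sketch follows; the helper name choose_column is made up for this example.

```python
def choose_column(choice):
    """Map the menu answer ('1' or '2') to the matching column name."""
    options = {'1': 'cases', '2': 'deaths'}
    key = choice.strip()
    if key not in options:
        raise ValueError("Please enter 1 or 2, got: " + repr(choice))
    return options[key]
```

It could be used as column = choose_column(input("Enter 1 for cases, 2 for deaths: ")), and the result then takes the place of the hard-coded 'cases' when selecting each state's data.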

Step 7

Can we compare the rates of cases (or deaths) per 100,000 people? This is a more realistic comparison.

We need state population to do this!

statepopulation.csv

We need to look up the population of the state, and do some math! What?

Let's try it!

repl.it/@JimSkon/CovidSamplePlotCSVGitHubRateCompare
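
The math is a simple scaling: multiply the count by 100,000 and divide by the state's population. The sketch below takes the population as a plain number, because the column layout of statepopulation.csv is not shown here; the function name rate_per_100k is made up for this example.

```python
import pandas as pd


def rate_per_100k(counts, population):
    """Convert raw counts (a number, Series or dataframe) to a rate per 100,000 people."""
    return counts * 100_000 / population
```

For example, 500 cases in a state of 10,000,000 people works out to 5 cases per 100,000. Applied to a whole 'cases' column, the function scales every day's value the same way.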

Step 8

Finally, can we graph the increase in cases or deaths over the previous day?

repl.it/@JimSkon/CovidSamplePlotCSVGitHubChangeCompare
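
Since the counts in the New York Times files are cumulative, the daily increase is the difference between each row and the row before it. A minimal sketch using the pandas diff() method follows; the function name daily_change is made up for this example, and the linked repl may do it differently.

```python
import pandas as pd


def daily_change(cumulative):
    """Return the increase over the previous row for a Series or dataframe."""
    # diff() subtracts the previous row from each row; the first row has
    # no previous day, so its result is NaN.
    return cumulative.diff()
```

Applied to the combined two-state dataframe from Step 5, this gives one daily-increase column per state, which can be plotted the same way as before.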

Now it's your turn.

What can you do? Use the other files if you want.

Topic attachments

  • graph.png (PNG, r1, 21.5 K, 2020-04-20 16:25, JimSkon)
  • statepopulation.csv (CSV, r1, 0.9 K, 2020-04-20 18:24, JimSkon)
Topic revision: r3 - 2020-04-21 - JimSkon
 