Thursday, June 26, 2014

Updates to R package raincpc: Global Daily Rainfall for over 35 years

The Climate Prediction Center's (CPC) global rainfall data, 1979 - present, 50 km resolution, is one of the few high-quality, long-term, observation-based, daily rainfall products available for free. Although the raw data is available at CPC's FTP site, obtaining and processing it is not easy: there are over 12,000 files, and the formats and names of these files have changed over time.

The latest version of the raincpc package provides functionality to download, process and visualize over 35 years of global daily rainfall data from CPC. The vignette demonstrates the use of this package, including the extraction and display of regional rainfall data.

Following are some graphics from the raincpc vignette.

Thursday, June 19, 2014

New R package hazus: Damage functions from FEMA's HAZUS software for use in modeling financial losses from natural disasters

Damage Functions (DFs) translate physical damage to property, resulting from natural disasters, into financial damage. FEMA in the USA developed several thousand DFs, and these serve as a benchmark in natural catastrophe modeling, both in academia and industry. However, these DFs and their documentation are buried within FEMA's HAZUS software and are not easily accessible for analysis and visualization.
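
To make the idea concrete, here is a minimal sketch of how a flood damage function can be applied in R. The curve values, the function names `damage_at` and `estimate_loss`, and the replacement value are all made up for illustration; they are not taken from HAZUS or from the hazus package.

```r
# Hypothetical depth-damage curve (illustrative values only, not from HAZUS)
depth_ft   <- c(0, 1, 2, 4, 8)     # flood depth above the first floor, in feet
damage_pct <- c(0, 10, 20, 35, 60) # damage as a percent of replacement value

# Linear interpolation between curve points;
# rule = 2 clamps depths outside the curve to the nearest end value
damage_at <- function(depth) {
  approx(x = depth_ft, y = damage_pct, xout = depth, rule = 2)$y
}

# Financial loss = damage fraction x replacement value
estimate_loss <- function(depth, replacement_value) {
  damage_at(depth) / 100 * replacement_value
}

loss <- estimate_loss(depth = 3, replacement_value = 200000)
print(loss)
```

A depth of 3 ft falls halfway between the 2 ft and 4 ft points of this made-up curve, so the damage is interpolated between 20% and 35%.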

The hazus package provides more than 1300 raw DFs used by FEMA's HAZUS software and also functionality to extract and visualize DFs specific to the flood hazard.

Here is the link to the package home on CRAN. Below is a graphic from the package vignette in R markdown.

Sunday, May 25, 2014

New R package rainfreq: Rainfall Frequency (or Design Storm) Estimates from the US National Weather Service

Rainfall estimates at a desired frequency (e.g., 1% annual chance or 100-year return period) and duration (e.g., 24-hour) are often required in the design of dams and other hydraulic structures, catastrophe risk modeling, and environmental planning and management. One major source of such estimates for the USA is the NOAA National Weather Service. The raw data is available at 1-km resolution and comes as a huge number of GIS files.
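
The link between annual chance and return period is simple arithmetic, sketched below in R. The function names are hypothetical, and the multi-year calculation assumes that years are independent of each other, which is a common simplification rather than something stated by NWS.

```r
# Annual exceedance probability is the reciprocal of the return period
annual_chance <- function(return_period_yr) {
  1 / return_period_yr
}

# Chance of at least one exceedance over a planning horizon,
# assuming each year is independent (a simplifying assumption)
chance_over_horizon <- function(return_period_yr, n_years) {
  1 - (1 - annual_chance(return_period_yr))^n_years
}

p_annual  <- annual_chance(100)           # 100-year event: 1% annual chance
p_30_year <- chance_over_horizon(100, 30) # same event over a 30-year horizon
print(c(p_annual, p_30_year))
```

Under these assumptions, a 100-year rainfall has roughly a 26% chance of being equaled or exceeded at least once in 30 years, which is why "100-year" should not be read as "once every 100 years".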

The new R package rainfreq provides functionality to easily access and analyze the 1-km GIS files provided by NWS' PF Data Server for the entire USA. This package also comes with datasets on record point rainfall measurements provided by NWS.

Here is the rainfreq package home page on CRAN. Here are some graphics from the package vignette.

Tuesday, May 13, 2014

Updates to R package emdatr: Global Disaster Losses from the EMDAT Database

The EMDAT database provides valuable information on human and financial losses from natural disasters around the world. Some of the issues with the EMDAT data are the lack of access to the entire dataset, static and inconsistent summary reports, and the lack of auxiliary financial and demographic data. The emdatr package addresses some of these issues.

Major updates in emdatr v0.2:

  • Data has been updated to include the whole of 2013.
  • Data is now hosted on and only a sample is provided with the package. Package has the functionality to extract the entire data.
  • A new vignette explains the raw data clean-up and enhancement procedure and also demonstrates use of the package.

Here is the emdatr package home page on CRAN. Below is a summary graphic of the number of natural disasters by decade, obtained using the package.

Wednesday, May 7, 2014

New R package dams: Dams in the United States

The dams package provides functionality to access data on over 74,000 dams in the National Inventory of Dams (NID) from the US Army Corps of Engineers, the single largest source of data on dams in the United States. Each dam has 64 attributes covering geographical, structural, hydraulic and operational characteristics.

Obtaining data directly from NID has to be done manually, and the website's GUI is not user-friendly: only a couple of thousand records can be displayed at a time, and there is no option to save these records to a file. Data was obtained manually from NID's website and then cleaned up. The dams package comes with a sample of the cleaned data, and the `extract_nid` function from the package can be used to obtain all of the cleaned data.

Here is the dams package home page on CRAN. Here are some graphics from the package vignette.

Monday, April 7, 2014

Dams in the United States from the National Inventory of Dams (NID) Database

There is no database containing information on all the dams in the United States. The single largest source is the National Inventory of Dams (NID) from the US Army Corps of Engineers which claims to have more than 80,000 dams. I downloaded the entire data from NID and also cleaned it up.

I am in the process of creating an R package for this dataset and will shortly have a post on it.

Here are some graphics.

Tuesday, January 21, 2014

The Tornado Project: Annual Tornado Frequency by Location

The goal of this open source R-based analysis, as mentioned earlier (first post, second post), is to bring consistency and transparency to the analyses of publicly available tornado data.

The latest addition to the project is the analysis of local tornado occurrence probability. The graphics below show the average number of tornadoes per year within the United States since 1980. The average number appears to increase with the addition of the recent data.

The project home page is here -

Any help or comments or contributions appreciated.

Wednesday, January 8, 2014

USA Drought of 2013: Analysis of High-resolution Rainfall Data Using R

The ongoing drought in California and other parts of the Southwestern United States has been reported extensively by newspapers and government sites.

Although rainfall deficit is technically meteorological drought, and drought could be of several other types (such as hydrological, agricultural, etc.), the attempt here is to demonstrate the use of R in the analysis of high resolution rainfall data. Using 4-km rainfall data from the PRISM Climate Group for 1895-2013, the total for 2013 is compared with the long-term and near-term historical averages.
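
The comparison itself boils down to a "percent of normal" calculation. The sketch below shows it for a single location with made-up annual totals; the actual analysis works on gridded PRISM data for 1895-2013, and the choice of the five preceding years as the "near-term" window is my assumption for this example.

```r
# Hypothetical annual rainfall totals (mm) at one location, illustrative only
years    <- 2004:2013
total_mm <- c(500, 620, 480, 550, 600, 450, 700, 520, 480, 300)

target_year  <- 2013
target_total <- total_mm[years == target_year]

# Long-term average: all years before the target year
long_term_mean <- mean(total_mm[years < target_year])

# Near-term average: the five years before the target year (assumed window)
near_term_mean <- mean(total_mm[years >= target_year - 5 & years < target_year])

# Percent of normal: values below 100 indicate a rainfall deficit
pct_of_long_term <- 100 * target_total / long_term_mean
pct_of_near_term <- 100 * target_total / near_term_mean

print(round(c(long_term = pct_of_long_term, near_term = pct_of_near_term), 1))
```

With gridded data the same arithmetic is applied to every cell, which gives the spatial deficit pattern.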

Spatial patterns compare well with those from the Drought Monitor from the University of Nebraska.

The entire code and all the graphics are available on GitHub -

This effort is part of The Rain Project.

Any comments or help appreciated.

Monday, January 6, 2014

The Rain Project: An R-based Open Source Analysis of Publicly Available Rainfall Data

Rainfall data used by researchers in academia and industry does not always come in the same format. Data is often in atypical formats and spread across an extremely large number of files, and there is not always guidance on how to obtain, process and visualize the data. This project attempts to resolve this issue by serving as a hub for the processing of such publicly available rainfall data using R.

The goal of this project is to reformat rainfall data from its native format to a consistent format, suitable for use in data analysis. Within this project site, each dataset is intended to have its own wiki. Eventually, an R package will be developed for each data source.

Currently, R code is available to process data from three sources - Climate Prediction Center (global coverage), US Historical Climatology Network (USA coverage) and APHRODITE (Asia/Eurasia and Middle East).

The project home page is here -
If you are aware of other sources and would like to add them to this list (and/or would like to add the R code) please let me know. Any other comments or help appreciated.