by Dirk Baker | Updated: 10/03/2017 | Comments: 8
You may be familiar with R for data processing and analysis. But do you know how to easily import data that is in Campbell Scientific’s TOA5 format into R? In this article, I’ll briefly discuss R and TOA5, and then share a function to create an R data frame from your TOA5 file.
R is a powerful, open-source software environment for statistical computing and graphics. It is also popular for its ability to produce publication-quality plots with the necessary mathematical symbols and formulas, and it runs on UNIX, Windows, and macOS platforms.
Recommended for You: To learn more about R or download R, visit The R Project for Statistical Computing website.
The TOA5 file format is the default format for data collected from contemporary Campbell Scientific data loggers. It is a simple comma-delimited text file with a .dat extension.
The TOA5 file format includes a four-line ASCII header with information about the data logger you’re using to collect your data. In addition, the header describes the data values with variable names and units of measurement—if they are available.
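The four lines of the header are the following:
Line 1: Environment information (file format, station name, data logger model, serial number, operating system version, program name, program signature, and table name)
Line 2: Field names
Line 3: Units of measurement
Line 4: Processing type for each field (for example, Avg, Tot, or Smp)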
Recommended for You: For more information about TOA5, refer to Appendix B of the LoggerNet instruction manual.
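To see what the header holds before importing anything, a short sketch like the one below reads only the first four lines and pairs each field name with its unit and processing type. It assumes the usual TOA5 layout (field names on line 2, units on line 3, processing types on line 4). The sample file contents and the names sample_lines, toa5_file, split_fields, and field_summary are made up for illustration; substitute the path to your own .dat file.

```r
# Write a tiny, made-up TOA5 file to a temporary location for illustration.
sample_lines <- c(
  '"TOA5","Station1","CR1000","12345","CR1000.Std.32","CPU:weather.CR1","1234","Hourly"',
  '"TIMESTAMP","RECORD","AirTC_Avg","RH"',
  '"TS","RN","Deg C","%"',
  '"","","Avg","Smp"',
  '"2017-10-03 01:00:00",0,12.5,48.2',
  '"2017-10-03 02:00:00",1,"NAN",50.1'
)
toa5_file <- tempfile(fileext = ".dat")
writeLines(sample_lines, toa5_file)

# Read only the four header lines.
header_lines <- readLines(toa5_file, n = 4)

# Hypothetical helper: split one comma-delimited header line into its fields,
# dropping the surrounding double quotes.
split_fields <- function(line) {
  scan(text = line, what = character(), sep = ",", quiet = TRUE)
}

# Pair each field name (line 2) with its unit (line 3) and processing type (line 4).
field_summary <- data.frame(
  name       = split_fields(header_lines[2]),
  unit       = split_fields(header_lines[3]),
  processing = split_fields(header_lines[4]),
  stringsAsFactors = FALSE
)
print(field_summary)
```

A quick look at this table is an easy way to confirm that a file contains the variables and units you expect before you import the data.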
You can find the complete code for the import function at the end of this section. Simply copy the code into a file or paste it at the R command line. A call to this R function takes one required parameter, the file name, and one optional parameter, the return option (RetOpt):
importCSdata("filename.dat", "RetOpt")
If you just want the data with a simple header, all you need is one line of code like this:
myData <- importCSdata("CR1000_HourlyWeather.dat")
The optional RetOpt parameter defaults to “data,” as in the line below:
myData <- importCSdata("CR1000_HourlyWeather.dat", "data")
When the RetOpt (Return Option) parameter is omitted or is “data,” an R data frame is created with the second line of the raw data file used for the column names. Values recorded as NAN are imported as NA. The TIMESTAMP column is converted to what R recognizes as a date and time stamp so that this information can be used in graphing, time-series analysis, or the aggregation of data based on date or time.
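As one example of aggregating by date, the sketch below averages hourly readings into daily means. It builds its own small data frame rather than calling importCSdata(), and it stores the timestamps as POSIXct with an explicit UTC time zone; both choices, along with the column name AirTC_Avg and the values themselves, are assumptions made to keep the example self-contained.

```r
# Made-up hourly readings standing in for an imported TOA5 data frame.
weather <- data.frame(
  TIMESTAMP = as.POSIXct(
    c("2017-10-03 22:00:00", "2017-10-03 23:00:00",
      "2017-10-04 00:00:00", "2017-10-04 01:00:00"),
    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
  ),
  AirTC_Avg = c(10, 12, 8, 6)
)

# Derive a calendar-day column from the timestamp.
weather$day <- as.Date(weather$TIMESTAMP, tz = "UTC")

# Average the temperature for each day.
daily_mean <- aggregate(AirTC_Avg ~ day, data = weather, FUN = mean)
print(daily_mean)
```

With a data frame returned by the import function, you may need to wrap the TIMESTAMP column in as.POSIXct() first, and you should pick the time zone that matches your data logger's clock.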
Because there is a lot of good information in the header of the TOA5 file, you may also want to simply import the header as documentation for your process by using the syntax below:
myData.Info <- importCSdata("CR1000_HourlyWeather.dat", "info")
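If you want individual metadata values rather than whole header lines, one possible approach is to split the first header line into named fields. The sketch below works on a made-up header line as it would appear in a raw file, not on the output of importCSdata(). The labels in field_labels are descriptive names chosen for this example; they are not identifiers defined by Campbell Scientific or by the import function.

```r
# A made-up first header line, written the way it appears in a raw TOA5 file.
first_line <- '"TOA5","Station1","CR1000","12345","CR1000.Std.32","CPU:weather.CR1","1234","Hourly"'

# Split on commas and drop the surrounding double quotes.
fields <- scan(text = first_line, what = character(), sep = ",", quiet = TRUE)

# Descriptive labels chosen for this example (assumed, not official names).
field_labels <- c("format", "station", "logger_model", "serial_number",
                  "os_version", "program", "signature", "table")
station_info <- setNames(fields, field_labels)

print(station_info[c("station", "logger_model", "table")])
```

Keeping these values alongside your results makes it easier to trace an analysis back to the station, program, and table that produced the data.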
The complete code for the import function:
importCSdata <- function(filename, RetOpt = "data") {
  if (RetOpt == "info") {
    # bring in entire header of CSI TOA5 data file for metadata
    stn.info <- scan(file = filename, nlines = 4, what = character(), sep = "\r")
    return(stn.info)
  } else {
    # second line of header contains variable names
    header <- scan(file = filename, skip = 1, nlines = 1, what = character(), sep = ",")
    # bring in data
    stn.data <- read.table(file = filename, skip = 4, header = FALSE,
                           na.strings = c("NAN"), sep = ",")
    names(stn.data) <- header
    # add column of R-formatted date/timestamps
    stn.data$TIMESTAMP <- as.POSIXlt(strptime(stn.data$TIMESTAMP, "%Y-%m-%d %H:%M:%S"))
    return(stn.data)
  }
}
You can download this code.
There are far too many resources available for learning R to list them all, but here are a few for getting started or for quick reference:
I hope you find this function useful in your data processing and analyses. Feel free to post your comments about your experience, or your suggestions, below.
Happy analyzing!
Credits: The R logo is © 2016 The R Foundation. The logo was used without modification and with permission in accordance with the terms of the Creative Commons Attribution-ShareAlike 4.0 International license (CC-BY-SA 4.0).
Comments
J. Magnin | 02/15/2017 at 03:20 AM
Hello,
This article is quite useful, although your code has some mistakes: you use two variable names for the same object (stn.data / station.data and stn.info / station.info), so the function stops and returns an error.
By the way, the way you use scan and read.table gave me an idea for enhanced functions to read TOA5 data (and, why not, other data formats later?). I would like to publish that work on CRAN as soon as it is done (maybe under GPL v3) to make it available to the R user community. For this, I would like your authorization to use your code as a basis, if you agree with that idea.
Thanks,
J.M.
Dirk | 02/15/2017 at 10:33 AM
Thank you for pointing that out, JM! Looks like I introduced that typo in condensing the code to fit the page. It's fixed now and a downloadable version added.
I like the idea of expanding and enhancing the function and, as far as I'm concerned, anyone is free to use and modify my code as needed. I'm interested to learn more about your ideas and possibly collaborating. You're welcome to contact me at dbaker@campbellsci.com.
Thanks!
Dirk
ariklee | 02/23/2018 at 12:57 PM
Awesome! Great to see R represented in a Campbell blog. I've also written a short R script to read TOA5 data; it's pretty handy for separating the 4-line header from the data.
A quick search on GitHub shows a few repositories related to reading and plotting TOA5 Campbell data: https://github.com/search?l=R&q=campbell&type=Repositories&utf8=%E2%9C%93
Thanks for your contribution!
Eric
Bissett | 11/01/2022 at 08:26 PM
Hi Dirk,
I am trying to use this to combine station files with backup files, etc. The problem I am having is that the header prints out with each line in quotes and with the quotes removed from around each character string. I am worried that this will cause a problem when LoggerNet tries to write new data to the file.
Before import, the header looks like:
"TOA5","CR300_1","CR300","5514","CR300-RF407.Std.10.06","CPU:CR300_1.CR300","41351","Hourly_met"
"TIMESTAMP","RECORD","Rain_mm_Tot","AirTC_Avg","RH","WS_ms_S_WVT","WindDir_D1_WVT","SlrW_Avg","ETos","Rso"
"TS","RN","mm","Deg C","%","meters/second","Deg","W/m^2","mm","MJ/m²"
"","","Tot","Avg","Smp","WVc","WVc","Avg","ETXs","Rso"
Exported version:
TOA5,CR300_1,CR300,5514,CR300-RF407.Std.10.06,CPU:CR300_1.CR300,41351,Hourly_met
TIMESTAMP,RECORD,Rain_mm_Tot,AirTC_Avg,RH,WS_ms_S_WVT,WindDir_D1_WVT,SlrW_Avg,ETos,Rso
TS,RN,mm,Deg C,%,meters/second,Deg,W/m^2,mm,MJ/m²
,,Tot,Avg,Smp,WVc,WVc,Avg,ETXs,Rso
If you could let me know if there is a solution, I would greatly appreciate it.
Thanks! Bissett
Dirk | 11/01/2022 at 10:02 PM
Hi Bissett,
The intent of this code is to import the data for the next steps of analysis and visualization, not to combine files into a format that LoggerNet would recognize. This is possible, but would take some additional work, as you've noted.
I'll send you an email shortly so we can talk more about possible solutions for what you want to do.
Best,
Dirk
Joseph Knudsen | 06/27/2023 at 06:46 PM
Hi Dirk,
What package includes the importCSdata() function? -Thanks!
Dirk | 06/28/2023 at 06:04 PM
Hi Joseph,
The function is included as a downloadable text file, so it can simply be run as part of a script or at the command prompt. It is not part of a package.
Best,
Dirk
data science | 05/25/2024 at 03:31 PM
How can I download data straight from online into R?