Data file format
This section provides general file formatting rules and examples for the SMARTe risk assessment tools:
- site characterization data analysis;
- monitoring data analysis; and
- human health risk calculator (under development).

Any spreadsheet program can be used to produce a file that conforms to these rules.
General Rules
- The file should be a tab-delimited text file. A spreadsheet file can be saved as a tab-delimited text file using the 'File -> Save As' function.
- Data should be stored as one observation per row (multiple fields in a row). This means that all of the identifiers for a sample should be contained in a row of the data file.
- The observations must be uninterrupted, i.e., no carriage returns within an observation.
- Avoid special characters such as ', ", #, and &. In most cases, it is best to replace each special character with a period ('.').
- Nondetects can be identified by either coding nondetects as negative values in the result column or having a detect column in the data where 'T' corresponds to a detect and 'F' corresponds to a nondetect (see examples below).
- Replace empty cells with 'NA' (see examples below).
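The rules above can be sketched in a few lines of Python for readers who prepare files with a script rather than a spreadsheet. This is a minimal sketch, not part of SMARTe: the helper names `clean_cell` and `write_tab_delimited` are hypothetical, and the set of special characters is assumed to be only the four given as examples in the rules.

```python
import io
import re

# Assumption: only the four characters named in the rules are treated as special.
SPECIAL_CHARS = re.compile(r"['\"#&]")


def clean_cell(value):
    """Return one cell as text that follows the general rules."""
    text = "" if value is None else str(value).strip()
    if text == "":
        return "NA"  # empty cells are written as 'NA'
    # Tabs and line breaks inside a cell would interrupt the observation.
    text = re.sub(r"[\r\n\t]+", " ", text)
    return SPECIAL_CHARS.sub(".", text)  # special characters become '.'


def write_tab_delimited(header, rows, stream):
    """Write the header and one observation per line, tab-delimited."""
    for record in [header, *rows]:
        stream.write("\t".join(clean_cell(cell) for cell in record) + "\n")


# Small demonstration with made-up rows in the layout used by the examples.
header = ["siteid", "analyte", "concentration", "x.coord"]
rows = [
    ["BKG", "Arsenic", 34.2, None],        # missing coordinate
    ["Site #2", "Lead", 7, 418550.9],      # identifier with a special character
]
buffer = io.StringIO()
write_tab_delimited(header, rows, buffer)
print(buffer.getvalue(), end="")
```

To write a file instead of an in-memory buffer, pass a file object opened in text mode, for example `open("data.txt", "w", newline="")`.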
Examples
The example data sets presented below provide data formatting examples for each assessment tool.
1. Layout of a typical environmental site characterization data set. For site characterization data analysis, it is usually desirable to subset the data using the factors 'analyte' and 'siteid'; the SMARTe site characterization data analysis tool provides an interface to subset the data by analyte or siteid. If spatial coordinates are included in the data set, some spatial plots can be produced in SMARTe. Summary statistics, exploratory data analysis (plots), hypothesis tests, and confidence intervals are also available. If nondetects are identified in the data set, they can be accounted for in the data analysis. Note that this would also be the typical layout of a data set for the human health risk calculator.
| siteid | analyte | concentration | detectflag | x.coord | y.coord |
|---|---|---|---|---|---|
| BKG | Arsenic | 21.6 | T | 418550.9 | 3891761 |
| BKG | Arsenic | 34.2 | T | NA | NA |
| BKG | Copper | 30.7 | T | 418882.1 | 3894504 |
| BKG | Copper | 48.5 | T | NA | NA |
| BKG | Lead | 7 | T | 418550.9 | 3891761 |
| BKG | Lead | 0.15 | F | 418550.9 | 3891761 |
| ND02 | Arsenic | 14.9 | T | 421301.17 | 3892240.68 |
| ND02 | Arsenic | 6.6 | T | 421301.17 | 3892240.68 |
| ND02 | Copper | 19.3 | T | 421301.17 | 3892240.68 |
| ND02 | Copper | 12.9 | T | 421301.17 | 3892240.68 |
| ND02 | Lead | 15 | T | 421301.17 | 3892240.68 |
| ND02 | Lead | 1.1 | T | 421301.17 | 3892240.68 |
| ND11B | Arsenic | 0.3 | F | 423498.61 | 3897576.27 |
| ND11B | Arsenic | 1.7 | T | 423488.98 | 3897581.71 |
| ND11B | Copper | 2.5 | F | 423381.78 | 3897488.5 |
| ND11B | Copper | 9.41 | T | 423461 | 3897557 |
| ND11B | Lead | 0.1 | F | 423443.83 | 3897578.28 |
| ND11B | Lead | 26 | T | 423443.83 | 3897578.28 |
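To check a file before uploading it, the subsetting and nondetect conventions described above can be reproduced outside SMARTe. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the column names are those of the example table, and every concentration is assumed to be numeric when no detectflag column is present. It is not the SMARTe implementation.

```python
import csv
import io


def read_observations(stream):
    """Read tab-delimited text into a list of dicts, one per observation."""
    return list(csv.DictReader(stream, delimiter="\t"))


def is_detect(row):
    """Apply either nondetect convention from the general rules."""
    flag = row.get("detectflag")
    if flag is not None:
        return flag == "T"  # 'T' is a detect, 'F' is a nondetect
    # No detect column: a negative result codes a nondetect.
    return float(row["concentration"]) >= 0


def subset(rows, **factors):
    """Keep rows whose columns equal the given values, e.g. analyte='Lead'."""
    return [r for r in rows if all(r.get(k) == v for k, v in factors.items())]


# Three rows copied from the example table, written as tab-delimited text.
SAMPLE = (
    "siteid\tanalyte\tconcentration\tdetectflag\tx.coord\ty.coord\n"
    "BKG\tLead\t7\tT\t418550.9\t3891761\n"
    "BKG\tLead\t0.15\tF\t418550.9\t3891761\n"
    "ND02\tLead\t15\tT\t421301.17\t3892240.68\n"
)

rows = read_observations(io.StringIO(SAMPLE))
bkg_lead = subset(rows, siteid="BKG", analyte="Lead")
nondetects = [r for r in bkg_lead if not is_detect(r)]
print(len(bkg_lead), "BKG lead observations,", len(nondetects), "nondetect")
```

Values are kept as text after reading, so 'NA' entries pass through unchanged; convert a column to numbers only after filtering out 'NA'.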
2. Layout of a typical environmental monitoring data set. For environmental monitoring data analysis, it is usually desirable to subset the data using the factors 'siteid' and 'analyte'; the monitoring data analysis tool provides an interface for identifying subsetting factors and values. Analyzing temporal trends in monitoring data requires a temporal value for each observation; this is the date column in the table below. Sampling times can also be provided in a time column. At least one of date or time must be provided for the analysis of temporal data. Analyzing spatial trends in monitoring data requires spatial coordinates for each observation; these are the x.coord and y.coord columns in the table below.
The format of the date column must be one of 'm/d/y', 'm-d-y', 'y/m/d', 'y-m-d', 'month day year', or 'day month year'. An example of each is '4/30/70', '4-30-70', '70/4/30', '70-4-30', 'April 30 1970', and '30 April 1970'. The format of the time column must be 'h:m:s'.
| siteid | analyte | date | concentration | detectflag | x.coord | y.coord |
|---|---|---|---|---|---|---|
| 1.1 | Arsenic | 1/5/04 | 12.3 | T | 0 | 0 |
| 1.1 | Arsenic | 4/5/04 | 18.2 | T | 0 | 0 |
| 1.1 | Arsenic | 7/5/04 | 20.9 | T | 0 | 0 |
| 1.1 | Copper | 1/5/04 | 1125 | T | 0 | 0 |
| 1.1 | Copper | 4/5/04 | 1208 | T | 0 | 0 |
| 1.1 | Copper | 7/5/04 | 1515 | T | 0 | 0 |
| 1.2 | Arsenic | 1/5/04 | 4.5 | T | 52 | 83 |
| 1.2 | Arsenic | 4/5/04 | 8.7 | T | 52 | 83 |
| 1.2 | Arsenic | 7/5/04 | 8.2 | T | 52 | 83 |
| 1.2 | Copper | 1/5/04 | 352 | T | 52 | 83 |
| 1.2 | Copper | 4/5/04 | 315 | T | 52 | 83 |
| 1.2 | Copper | 7/5/04 | 426 | T | 52 | 83 |
| 1.3 | Arsenic | 1/5/04 | 0.3 | F | 160 | 134 |
| 1.3 | Arsenic | 4/5/04 | 2.2 | T | 160 | 134 |
| 1.3 | Arsenic | 7/5/04 | 2 | T | 160 | 134 |
| 1.3 | Copper | 1/5/04 | 83 | T | 160 | 134 |
| 1.3 | Copper | 4/5/04 | 92 | T | 160 | 134 |
| 1.3 | Copper | 7/5/04 | 75 | T | 160 | 134 |
| 1.4 | Arsenic | 1/5/04 | 0.3 | F | 287 | 171 |
| 1.4 | Arsenic | 4/5/04 | 0.3 | F | 287 | 171 |
| 1.4 | Arsenic | 7/5/04 | 1.2 | T | 287 | 171 |
| 1.4 | Copper | 1/5/04 | 15 | T | 287 | 171 |
| 1.4 | Copper | 4/5/04 | 12.9 | T | 287 | 171 |
| 1.4 | Copper | 7/5/04 | 14.4 | T | 287 | 171 |
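Date and time columns can be checked against the accepted formats before a file is uploaded. The sketch below is an illustration only, with hypothetical function names. It assumes the month comes before the day in the slash and hyphen forms, as in the written examples '4/30/70' and '4-30-70'. It also relies on Python's own two-digit-year rule (00 to 68 map to 2000 to 2068, 69 to 99 map to 1969 to 1999) and on English month names, which may not match how SMARTe reads the same values.

```python
from datetime import date, datetime, time

# The six accepted layouts, tried in order. Month-before-day is an assumption
# taken from the written examples such as '4/30/70'.
DATE_FORMATS = (
    "%m/%d/%y",   # 4/30/70
    "%m-%d-%y",   # 4-30-70
    "%y/%m/%d",   # 70/4/30
    "%y-%m-%d",   # 70-4-30
    "%B %d %Y",   # April 30 1970
    "%d %B %Y",   # 30 April 1970
)


def parse_date(text):
    """Return a date for the first layout that fits, else raise ValueError."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def parse_time(text):
    """Parse a time written as 'h:m:s' on a 24-hour clock."""
    return datetime.strptime(text.strip(), "%H:%M:%S").time()


# Every written example of the six layouts refers to the same day.
examples = ["4/30/70", "4-30-70", "70/4/30", "70-4-30",
            "April 30 1970", "30 April 1970"]
parsed = [parse_date(e) for e in examples]
```

Because the layouts are tried in order, a value such as '04/05/06' is read with the first layout that fits, so using a single date layout throughout a file avoids ambiguity.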



