Fractional effort for point count data #188

Open
erex opened this issue Nov 25, 2024 · 4 comments
Labels: not a bug, To discuss

erex commented Nov 25, 2024

Email from the list; I have not yet created a reproducible example.

I had thought that could be the issue a few days ago and spent several hours checking without finding anything out of place. The old "Effort" was the number of visits (which runs fine), and the new corrected "Effort" is just the old "Effort" times the percent of the point that was land. So I tried a few things to see if I could isolate and identify an error in the data entry. If I round the new "Effort" values to whole numbers, it works. It doesn't work if any effort values have decimals. So I believe that my point descriptive values are lining up with the correct point and are consistent for all observations of each point. This led me to think that I had the wrong idea about how to handle it. This is an example of the aforementioned data columns from the point where erosion had the most extreme effect on the survey area. One point was completely gone and simply had fewer visits the previous time I analysed the data.

| Region.Label | Sample.Label | visits | land per | Effort |
| -- | -- | -- | -- | -- |
| B | 1 | 26 | 0.5196679 | 13.51 |
| B | 1 | 26 | 0.5196679 | 13.51 |
| B | 1 | 26 | 0.5196679 | 13.51 |
 
Region.Label is sub sites with letter codes B G J Y
Sample.Label is points with numbers 1 to 35 for each point
Some subsites have a different number of visits and some points within subsites have different numbers of visits.

I have 3984 observations of bird groups within 10 min surveys. I was checking that everything ran with all lines of data before separating out observations of different species guilds. I previously ran the data with the same code, then separated out some common species, received comments from discussions with my advisors, and am rerunning the same data with only that change.

I am in the process of writing results in a dissertation and had some overlap with another student who did a remote sensing project in the same area (after my project and data collection) and was able to supply me with water, bare ground, vegetation heights (5 ranges) information for the area we surveyed. While not sure of the exact methods my advisors believe that I can make use of this after the fact information. So I may not have read enough examples of incorporating and interpreting habitat variables if there are additional readings you can recommend. Adjusting effort seemed to be the first most straightforward step.  



On Tue, Nov 12, 2024 at 3:30 PM Eric Rexstad <[email protected]> wrote:
Stephanie

I've looked at the place in the code that generates that error.

      # possible that Effort is not the same for a given
      # Sample.Label+Region.Label -- this is BAD.
      if(nrow(sample.table)!=nrow(unique(sample.table[,c("Sample.Label",
                                                  "Region.Label")]))){
        stop("A sample has a non-unique \"Effort\", check data!")
      }


I had thought that could be the issue a few days ago and spent several hours checking without finding anything out of place. The old "Effort" was the number of visits (which runs fine), and the new corrected "Effort" is just the old "Effort" times the percent of the point that was land. So I tried a few things to see if I could isolate and identify an error in the data entry. If I round the new "Effort" values to whole numbers, it works. It doesn't work if any effort values have decimals. So I believe that my point descriptive values are lining up with the correct point and are consistent for all observations of each point. This led me to think that I had the wrong idea about how to handle it. This is an example of the aforementioned data columns from the point where erosion had the most extreme effect on the survey area. One point was completely gone and simply had fewer visits the previous time I analysed the data.

| Region.Label | Sample.Label | visits | land per | Effort |
| -- | -- | -- | -- | -- |
| B | 1 | 26 | 0.5196679 | 13.51 |
| B | 1 | 26 | 0.5196679 | 13.51 |
| B | 1 | 26 | 0.5196679 | 13.51 |

Region.Label is sub sites with letter codes B G J Y
Sample.Label is points with numbers 1 to 35 for each point
Some subsites have a different number of visits and some points within subsites have different numbers of visits.

I have 3984 observations of bird groups within 10 min surveys. I was checking that everything ran with all lines of data before separating out observations of different species guilds. I previously ran the data with the same code, then separated out some common species, received comments from discussions with my advisors, and am rerunning the same data with only that change.

I am in the process of writing results in a dissertation and had some overlap with another student who did a remote sensing project in the same area (after my project and data collection) and was able to supply me with water, bare ground, vegetation heights (5 ranges) information for the area we surveyed. While not sure of the exact methods my advisors believe that I can make use of this after the fact information. So I may not have read enough examples of incorporating and interpreting habitat variables if there are additional readings you can recommend. Adjusting effort seemed to be the first most straightforward step.

On Tue, Nov 12, 2024 at 3:30 PM Eric Rexstad <[email protected]> wrote:

Stephanie

I've looked at the place in the code that generates that error.
          # possible that Effort is not the same for a given
          # Sample.Label+Region.Label -- this is BAD.
          if(nrow(sample.table)!=nrow(unique(sample.table[,c("Sample.Label",
                                                      "Region.Label")]))){
            stop("A sample has a non-unique \"Effort\", check data!")
          }
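
One possible explanation, offered only as an assumption because the original data were never shared, is that the fractional effort values for a single point differ by a tiny amount across rows (for example, if the land proportion was stored with more digits on some rows). The sketch below uses a hypothetical toy data frame and assumes that `sample.table` is built from the unique combinations of `Sample.Label`, `Region.Label` and `Effort`; that construction step is not quoted in this thread. It illustrates how values that all display as 13.51 could still trip the check, while rounding to whole numbers hides the difference.

```r
# Hypothetical toy detections: three rows from the same point (Region B, point 1).
# The third land proportion differs in a digit beyond what the table displays.
dat <- data.frame(
  Region.Label = c("B", "B", "B"),
  Sample.Label = c(1, 1, 1),
  visits       = c(26, 26, 26),
  land_per     = c(0.5196679, 0.5196679, 0.51966791)
)
dat$Effort <- dat$visits * dat$land_per
effort_shown <- round(dat$Effort, 2)  # what a two-decimal display would show

# ASSUMPTION: sample.table holds the unique Sample/Region/Effort combinations.
sample.table <- unique(dat[, c("Sample.Label", "Region.Label", "Effort")])
n_samples <- nrow(unique(sample.table[, c("Sample.Label", "Region.Label")]))

# Same comparison as the quoted check: more rows than samples means a point
# carries more than one distinct Effort value.
check_fails <- nrow(sample.table) != n_samples

# Rounding Effort to whole numbers collapses the near-identical values.
dat$Effort.rounded <- round(dat$Effort)
sample.table.r <- unique(dat[, c("Sample.Label", "Region.Label", "Effort.rounded")])
check_fails_rounded <- nrow(sample.table.r) != n_samples

print(c(unrounded = check_fails, rounded = check_fails_rounded))
```

If this is what happened, the effort values would look identical in a spreadsheet view yet still be treated as distinct by an exact comparison.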
lenthomas added the "triage" label on Nov 25, 2024
lenthomas commented

@erex Looking over the issue above, I found it a bit difficult to parse as the same text appears twice and it doesn't seem to start with what the original issue was. I saw the name "Stephanie" and traced it to this email on the distance-sampling list. From this and the above text it seems there's some sort of problem with including non-integer effort when analyzing point transect data. Is the problem that it throws an error? I can create an example if so - just wanted to double check that I have the problem correct.

erex commented Dec 17, 2024

@lenthomas Exactly right. Stephanie wrote to the list asking how to account for point count effort where not all of the radius around the point is land. I suggested she use fractional effort values. She then replied off-list that my suggestion of fractional effort caused an error.

Please create an example demonstrating this.
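
For context, the fractional-effort adjustment described above (number of visits multiplied by the fraction of the point that is land) could be sketched as follows. The table and column names `points`, `land_per` and `detections` and all values are hypothetical. Computing effort once per point and then merging it onto the detections is one way to guarantee that every detection from a point carries exactly the same `Effort` value.

```r
# Hypothetical point-level table: one row per point.
points <- data.frame(
  Region.Label = c("B", "B"),
  Sample.Label = c(1, 2),
  visits       = c(26, 26),
  land_per     = c(0.5196679, 1)   # proportion of the point that is land
)
# Fractional effort: visits scaled by the land proportion.
points$Effort <- points$visits * points$land_per

# Hypothetical detections: several rows per point, no Effort column yet.
detections <- data.frame(
  Region.Label = c("B", "B", "B"),
  Sample.Label = c(1, 1, 2),
  distance     = c(12.0, 30.5, 8.2)
)

# Attach the single per-point Effort value to every detection of that point.
detections <- merge(
  detections,
  points[, c("Region.Label", "Sample.Label", "Effort")],
  by = c("Region.Label", "Sample.Label")
)
print(detections)
```

Because `Effort` is computed in one place, it cannot drift between rows of the same point through separate per-row calculations or data entry.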

LHMarshall commented

@len @eric I could not replicate this error. Eric, do you have this person's exact data?

# Simulate point transect data
library(dsims)
library(Distance)  # provides ds(); loaded explicitly in case dsims does not attach it
design <- make.design(transect.type = "point")

sim <- make.simulation(design = design)
survey <- run.survey(sim)
plot(survey)

# Try the analysis with no modification to the data
eg.data <- survey@dist.data
ds.model.1 <- ds(data = eg.data,
               truncation = 50,
               transect = "point",
               formula = ~1,
               key = "hn",
               nadj = 0)
ds.model.1$dht
#Abundance and density estimates from distance sampling
#Variance       : P2, N/L 
#
#Summary statistics
# 
# Region  Area CoveredArea Effort  n  k       ER    se.ER     cv.ER
# 1 region 1e+06    141371.7     18 64 18 3.555556 0.381108 0.1071866
# 
# Abundance:
#   Region Estimate       se        cv      lcl      ucl       df
# 1  Total 990.0958 186.3423 0.1882063 682.7217 1435.856 74.43538
# 
# Density:
#   Region     Estimate           se        cv          lcl         ucl       df
# 1  Total 0.0009900958 0.0001863423 0.1882063 0.0006827217 0.001435856 74.43538


# Change the effort to be < 1 for some points
eg.data$Effort[eg.data$Sample.Label == 3] <- 0.75
eg.data$Effort[eg.data$Sample.Label == 12] <- 0.5555
eg.data$Effort[eg.data$Sample.Label == 18] <- 0.1133

ds.model.2 <- ds(data = eg.data,
               truncation = 50,
               transect = "point",
               formula = ~1,
               key = "hn",
               nadj = 0)
ds.model.2$dht
# Abundance and density estimates from distance sampling
# Variance       : P2, N/L 
# 
# Summary statistics
# 
#   Region  Area CoveredArea  Effort  n  k       ER     se.ER    cv.ER
# 1 region 1e+06      128953 16.4188 64 18 3.897971 0.9147251 0.234667
# 
# Abundance:
#   Region Estimate       se        cv      lcl      ucl       df
# 1  Total 1085.446 305.0881 0.2810716 619.5064 1901.826 33.29047
# 
# Density:
#   Region    Estimate           se        cv          lcl         ucl       df
# 1  Total 0.001085446 0.0003050881 0.2810716 0.0006195064 0.001901826 33.29047

# Change the effort to be non-integer and > 1 for some points
eg.data <- survey@dist.data
eg.data$Effort[eg.data$Sample.Label == 3] <- 1.75
eg.data$Effort[eg.data$Sample.Label == 12] <- 3.5555
eg.data$Effort[eg.data$Sample.Label == 18] <- 2.1133

ds.model.3 <- ds(data = eg.data,
                 truncation = 50,
                 transect = "point",
                 formula = ~1,
                 key = "hn",
                 nadj = 0)
ds.model.3$dht
# Abundance and density estimates from distance sampling
# Variance       : P2, N/L 
# 
# Summary statistics
# 
# Region  Area CoveredArea  Effort  n  k       ER     se.ER    cv.ER
# 1 region 1e+06    176076.8 22.4188 64 18 2.854747 0.3727272 0.130564
# 
# Abundance:
#   Region Estimate       se        cv      lcl      ucl       df
# 1  Total 794.9455 160.9242 0.2024342 532.6871 1186.322 64.13157
# 
# Density:
#   Region     Estimate           se        cv          lcl         ucl       df
# 1  Total 0.0007949455 0.0001609242 0.2024342 0.0005326871 0.001186322 64.13157
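
As a quick arithmetic check on the three summaries above, the reported `CoveredArea` and `ER` values appear consistent with covered area being total effort times the area of a circle whose radius is the truncation distance, and encounter rate being `n` divided by total effort. This is a sketch, not a description of the package internals. It assumes each of the 18 simulated points started with `Effort` equal to 1 (consistent with the reported total of 18), and the helper names are hypothetical.

```r
w <- 50   # truncation distance used in all three ds() calls
k <- 18   # number of points
n <- 64   # number of detections

# Per-point effort under the three scenarios (assumed to start at 1 per point).
effort_1 <- rep(1, k)
effort_2 <- effort_1
effort_2[c(3, 12, 18)] <- c(0.75, 0.5555, 0.1133)
effort_3 <- effort_1
effort_3[c(3, 12, 18)] <- c(1.75, 3.5555, 2.1133)
scenarios <- list(model1 = effort_1, model2 = effort_2, model3 = effort_3)

# Hypothetical helper: covered area for point transects with fractional effort.
covered_area <- function(effort, w) sum(effort) * pi * w^2

total_effort <- sapply(scenarios, sum)
covered      <- sapply(scenarios, covered_area, w = w)
enc_rate     <- n / total_effort

print(data.frame(Effort = total_effort, CoveredArea = covered, ER = enc_rate))
```

The totals agree with the `Effort`, `CoveredArea` and `ER` columns printed above to the displayed precision, which suggests the fractional values were carried through the summaries as entered.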

LHMarshall added the "not a bug" and "To discuss" labels and removed the "triage" label on Jan 17, 2025
erex commented Jan 19, 2025

@LHMarshall Thanks for digging into this. I do not have the user's original data. I thought the problem was that her adjustment of effort was not consistent within points, because that circumstance would trigger the error message.

I asked her to check for effort consistency within points:

Have you associated the same value of effort for all detections associated with that transect?

and her reply was

I had thought that could be the issue a few days ago and spent several hours checking without finding anything out of place. The old "Effort" was the number of visits (which runs fine), and the new corrected "Effort" is just the old "Effort" times the percent of the point that was land. So I tried a few things to see if I could isolate and identify an error in the data entry. If I round the new "Effort" values to whole numbers, it works. It doesn't work if any effort values have decimals. So I believe that my point descriptive values are lining up with the correct point and are consistent for all observations of each point.
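
A consistency check of the kind asked about above could be sketched as follows. The data frame and its values are hypothetical. The idea is to count the distinct `Effort` values per `Region.Label` and `Sample.Label` combination using exact comparison, then print the offending values at full precision, since a difference may not be visible at the default display width.

```r
# Hypothetical detections: point B/1 has one Effort value entered as 13.51
# while the others carry the unrounded product; point G/2 is consistent.
dat <- data.frame(
  Region.Label = c("B", "B", "B", "G", "G"),
  Sample.Label = c(1, 1, 1, 2, 2),
  Effort       = c(13.5113654, 13.5113654, 13.51, 20, 20)
)

# Count distinct Effort values within each point.
n_effort <- aggregate(Effort ~ Region.Label + Sample.Label, data = dat,
                      FUN = function(x) length(unique(x)))
names(n_effort)[3] <- "n.distinct.effort"

# Points with more than one distinct value would trigger the error.
inconsistent <- n_effort[n_effort$n.distinct.effort > 1, ]
print(inconsistent)

# Show the offending values with enough digits to see the difference.
bad <- dat$Region.Label == "B" & dat$Sample.Label == 1
print(dat$Effort[bad], digits = 15)
```

An empty `inconsistent` table would suggest the error has another cause; a non-empty one points to the rows to inspect.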

Her email also indicated she was writing up her dissertation, perhaps under time pressure. I have not had further correspondence from her since mid-November.

Do you want me to request that she send the data frame that caused her problems?
