how to use GeoNet seismic waveform data from a local storage #121

Open
elidana opened this issue Feb 19, 2024 · 2 comments

elidana commented Feb 19, 2024

We have received this user request from @segburg (GeoNet/fdsn#244), who is using an on-premises data store to analyse seismic waveform data from GeoNet stations.

The local store contains some corrupted files, but the same files are complete when downloaded directly from the GeoNet FDSN service or the open data archive.

I am using this ticket to provide some possible options (potentially interesting for other data users) to have an on-premise copy or sync of the GeoNet seismic waveform data for a given time period.

option 1 (small volumes of data) - GeoNet FDSN webservice

The GeoNet FDSN service can be used to request small data volumes, or when data from the past 7 days are required.
Instructions on how to access this service are provided on the FDSN page of the GeoNet website,
and some tutorials are available in the dataselect Jupyter data tutorial.
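
As a rough sketch of such a request, the shell function below builds a standard FDSN dataselect query URL that can then be fetched with curl. The two host names, the function name `dataselect_url`, and the example station, channel and time window are assumptions to check against the GeoNet FDSN page; the query parameters are the ones defined by the FDSN dataselect specification.

```shell
# Assumed service hosts (verify on the GeoNet FDSN page):
# archive data, and near-real-time data for the most recent days.
FDSN_ARCHIVE="https://service.geonet.org.nz"
FDSN_NRT="https://service-nrt.geonet.org.nz"

# Build a dataselect query URL for one NZ station and channel.
# Usage: dataselect_url BASE STATION CHANNEL START END (ISO 8601 times, UTC)
dataselect_url() {
    printf '%s/fdsnws/dataselect/1/query?network=NZ&station=%s&channel=%s&starttime=%s&endtime=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Example (needs network access), saving one hour of data as miniSEED:
# curl -fsS -o WTAZ.mseed \
#     "$(dataselect_url "$FDSN_ARCHIVE" WTAZ HHE 2023-01-31T00:00:00 2023-01-31T01:00:00)"
```

For data from the last few days, the same function could be called with `$FDSN_NRT` as the first argument instead.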

option 2 (moderate to large volumes of data) - GeoNet open data bucket

For moderate to large volumes of data, the recommended approach is to copy data from the GeoNet AWS Open data bucket (https://www.geonet.org.nz/data/access/aws).

Details on how waveform miniSEED files are organized in the GeoNet Open AWS archive are provided in the GeoNet data tutorials,
and an introduction with some initial instructions on how to interact with the archive is provided in this GeoNet data blog.

aws-cli

To interact with the GeoNet open data bucket, the AWS command line interface (aws-cli) can be used.
The aws-cli utility must first be installed on a Unix/Linux machine (https://aws.amazon.com/cli/). Users should refer to the AWS CLI documentation for the full set of instructions and options.

Once the aws-cli is installed, the content for a specific year, or for a specific day and station, can be listed by running one of the following commands from a terminal:

aws s3 ls --no-sign-request s3://geonet-open-data/waveforms/miniseed/2023

or

aws s3 ls --no-sign-request s3://geonet-open-data/waveforms/miniseed/2023/2023.031/WTAZ.NZ/

To copy one file to your local machine (once the /home/username/tmp folder has been created), the command is

aws s3 cp --no-sign-request s3://geonet-open-data/waveforms/miniseed/2023/2023.031/WTAZ.NZ/2023.031.WTAZ.12-HHE.NZ.D /home/username/tmp/.

To sync an entire day's worth of data for one station to your local machine (into the same /home/username/tmp folder)

aws s3 sync --no-sign-request s3://geonet-open-data/waveforms/miniseed/2023/2023.031/WTAZ.NZ/ /home/username/tmp/.

and the same can be applied to all stations available for that day with the following command

aws s3 sync --no-sign-request s3://geonet-open-data/waveforms/miniseed/2023/2023.031/ /home/username/tmp/.

The sync command reproduces locally the folder structure found under the synced prefix in the GeoNet Open data bucket.
If a different file structure is required, symbolic links can be created locally to match the preferred local seismic waveform naming convention.
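
To cover a time period rather than a single day, the sync command can be wrapped in a loop over dates. The sketch below is one way to do it, not a tested GeoNet tool: the function names `day_prefix` and `sync_period` and the `DRY_RUN` variable are hypothetical, and it assumes GNU `date` (for the `-d` option) and the bucket layout shown in the commands above.

```shell
BUCKET_PREFIX="s3://geonet-open-data/waveforms/miniseed"

# Print the bucket prefix for one station-day.
# Usage: day_prefix STATION NETWORK YYYY-MM-DD
day_prefix() {
    # %j is the day of year, zero padded to three digits (e.g. 031)
    printf '%s/%s/%s.%s/\n' "$BUCKET_PREFIX" \
        "$(date -u -d "$3" +%Y/%Y.%j)" "$1" "$2"
}

# Sync every day from START to END (inclusive) for one station into DEST,
# keeping the bucket's YYYY/YYYY.DDD/STA.NET/ layout under DEST.
# Set DRY_RUN=1 to print the aws commands instead of running them.
# Usage: sync_period STATION NETWORK START END DEST
sync_period() {
    station=$1 network=$2 day=$3 end=$4 dest=$5
    while [ "$(date -u -d "$day" +%s)" -le "$(date -u -d "$end" +%s)" ]; do
        src=$(day_prefix "$station" "$network" "$day")
        ${DRY_RUN:+echo} aws s3 sync --no-sign-request \
            "$src" "$dest/${src#"$BUCKET_PREFIX"/}"
        day=$(date -u -d "$day + 1 day" +%Y-%m-%d)
    done
}
```

For example, `sync_period WTAZ NZ 2023-01-31 2023-02-02 /home/username/tmp` would fetch three days for WTAZ. Since `aws s3 sync` compares file size and modification time, re-running it over an existing local copy should also re-download files that were truncated locally; adding `--dryrun` to the aws command first lists what would be fetched without copying anything.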

s3fs

s3fs is a utility that can run on Linux and MacOS operating systems and can be used to "mount" an S3 bucket locally and mimic some of the functionalities of a local mount.

Detailed instructions are available here: https://github.com/s3fs-fuse/s3fs-fuse

Below are some very quick instructions on how to use it on a Linux-based system (Fedora-based); they will need to be adapted to the local operating system.

install s3fs fuse (might require sudo access)

dnf install s3fs-fuse

create your local destination directory and "mount" the geonet open data bucket

mkdir /home/username/tmp/
s3fs geonet-open-data:/waveforms/miniseed/ /home/username/tmp/ -o public_bucket=1

We can provide more detailed instructions on how to do these steps, or filter specific waveform data, or different options on how to interact with the AWS open data bucket.

@calum-chamberlain

For any obspy users, I have hacked together a drop-in replacement for obspy.clients.fdsn.Client for use with the GeoNet open-data bucket that anyone is welcome to use (and adapt and fix as needed). This is here - make sure you test it yourself before trusting it!


salichon commented Feb 21, 2024

A short note for @elidana and @segburg:

Open data bucket to SeisComP Data Structure (SDS) emulation
(cf. https://www.seiscomp.de/doc/base/glossary.html#term-SDS)

  • Unix environment: install s3fs
  • Mount the open data bucket (ODB) locally, so that files appear locally as YYYY/YYYY.Day/sta.Net/YYYY.Day.Sta.loc-Cha.Net.D
  • Emulate the ODB locally as a parallel SeisComP Data Structure (SDS)
    • Create the structure per year/network/station/channel, e.g. YYYY/NET/Sta/Cha.D/
    • Symlink each ODB file into the proper SDS location
      ln -s YYYY/YYYY.Day/sta.Net/YYYY.Day.Sta.loc-Cha.Net.D net.sta.loc.cha.D.YYYY.Day
    • Test with miniSEED tools or a SeisComP process (scrttv); it should work (2024)
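
The renaming and symlinking steps above could be scripted roughly as follows. This is a sketch only: the function names `odb_to_sds` and `link_odb_tree` are hypothetical, and the target layout (YEAR/NET/STA/CHA.D/NET.STA.LOC.CHA.D.YEAR.DAY) is taken from the SDS definition in the SeisComP glossary linked above. The ODB root should be given as an absolute path so that the links resolve from anywhere.

```shell
# Map an ODB file name (YYYY.DDD.STA.LOC-CHA.NET.D) to its relative SDS path.
# Usage: odb_to_sds 2023.031.WTAZ.12-HHE.NZ.D
odb_to_sds() {
    # Split the name on dots into: YEAR DAY STA LOC-CHA NET TYPE
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    [ "$#" -eq 6 ] || return 1   # skip names that do not match the pattern
    year=$1 day=$2 sta=$3 loc=${4%%-*} cha=${4#*-} net=$5 type=$6
    printf '%s/%s/%s/%s.%s/%s.%s.%s.%s.%s.%s.%s\n' \
        "$year" "$net" "$sta" "$cha" "$type" \
        "$net" "$sta" "$loc" "$cha" "$type" "$year" "$day"
}

# Create SDS-style symlinks under SDS_ROOT for every *.D file below ODB_ROOT.
# Usage: link_odb_tree ODB_ROOT SDS_ROOT
link_odb_tree() {
    find "$1" -type f -name '*.D' | while IFS= read -r src; do
        rel=$(odb_to_sds "$(basename "$src")") || continue
        mkdir -p "$2/$(dirname "$rel")"
        ln -sf "$src" "$2/$rel"
    done
}
```

Walking a whole s3fs mount with `find` may be slow, so it is probably safer to point `link_odb_tree` at a single year or day folder at a time.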
