Commit
#249 top10nl etl working with multiple stetl args and standard file n…
…aming
Showing 6 changed files with 127 additions and 65 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
# BRT - Loading Top10NL | ||
|
||
Loading Top10NL with the Stetl (www.stetl.org) ETL framework. | ||
by: Just van den Broecke and Frank Steggink | ||
|
||
This directory contains the ETL configuration and the command to write Top10NL | ||
from the source GML files to various outputs via Stetl. | ||
The default output is PostGIS, but because output goes through `ogr2ogr` it can be | ||
any output that ogr2ogr supports, e.g. SHP, GeoJSON or GeoPackage, | ||
and in theory also e.g. Oracle. | ||
|
||
To use Stetl, the external GitHub submodule externals/stetl | ||
must be present. | ||
|
||
When cloning the GitHub repository, Stetl is included as follows: | ||
|
||
git clone --recursive https://github.com/nlextract/NLExtract.git | ||
|
||
Stetl is then included and does not need to be installed separately; only the Stetl dependencies do. | ||
|
||
Installing the Stetl dependencies: | ||
http://www.stetl.org/en/latest/install.html | ||
|
||
More about Stetl: http://stetl.org | ||
|
||
## Command | ||
|
||
./etl.sh | ||
Windows: etl.cmd | ||
|
||
Uses the default options (database params etc.) from the `options/default.args` file. | ||
|
||
Stetl configuration; it does not need to be changed unless e.g. a different output is desired: | ||
`conf/etl-top10nl-v1.2.cfg` | ||
|
||
## Options/arguments | ||
|
||
A number of options can be overridden in 2 ways: | ||
|
||
1) Implicit: override the default options (database params etc.) with your own local file based on | ||
the local hostname: `options/<your host name>.args` | ||
|
||
2) Explicit on the command line via `./etl.sh <my options file>.args` Windows: `etl.cmd <my options file>.args` | ||
|
||
If method 2 is used, it takes precedence over method 1 and the default options! | ||
|
||
An options file does not need to contain all arguments. `options/default.args` is always used | ||
as the default. Custom/host-based options files contain arguments that replace the defaults | ||
("overriding"). E.g. a typical use is to put only the source GML and DB settings in your own options file: | ||
|
||
# INPUT: source files directory | ||
input_dir=/home/me/download/top10nl | ||
|
||
# OUTPUT: PostGIS settings | ||
host=mydbhost | ||
user=myuser | ||
password=mypassword | ||
database=mydb | ||
schema=mytop10 | ||
|
||
## Database mapping | ||
|
||
`gfs/top10-v1.2.gfs` is the GDAL/OGR "GFS Template" and determines the mapping of GML elements/attributes | ||
to PostGIS column names. Optionally create your own GFS file and specify it in your | ||
`options/<your host name>.args`, e.g. `gfs_template=gfs/mytop10.gfs` | ||
|
||
## TODO | ||
|
||
* GUI |
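
The override rules described under "Options/arguments" can be sketched in Python. This is only an illustration, not the logic of `etl.sh` itself: the function names `parse_args_text` and `resolve_options` are hypothetical, and the sketch assumes the layers are simply stacked (defaults first, then the host-based file, then the explicit file), which is one reading of "method 2 takes precedence".

```python
def parse_args_text(text):
    """Parse 'key=value' lines; skip blank lines and '#' comment lines."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' or spaces.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        options[key.strip()] = value.strip()
    return options


def resolve_options(default_text, host_text=None, explicit_text=None):
    """Layer options: defaults, then host-based file, then explicit file.

    Assumption: later layers override earlier ones key by key; keys that a
    layer does not mention keep the value from the layer below.
    """
    options = parse_args_text(default_text)
    for layer in (host_text, explicit_text):
        if layer is not None:
            options.update(parse_args_text(layer))
    return options


# Hypothetical file contents, modelled on the examples in this README.
DEFAULT_ARGS = """
# OUTPUT: PostGIS settings
host=localhost
port=5432
schema=test
spatial_extent=
"""

MY_ARGS = """
host=mydbhost
schema=mytop10
"""

merged = resolve_options(DEFAULT_ARGS, explicit_text=MY_ARGS)
print(merged)
```

Only `host` and `schema` come from the custom file here; `port` and the empty `spatial_extent` fall through from the defaults, which is why an options file does not need to list every argument.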
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,3 @@ | ||
*.sh | ||
*.args | ||
!default.args |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
# Default options for the Stetl TOP10NL extract command | ||
# These values are substituted into the ETL conf in conf/etl-top10nl-v1.2.cfg, see the {arg} strings there | ||
|
||
# INPUT: gml files, point to directory or file(s) pattern | ||
input_dir=test/v1_2/nlextract | ||
|
||
# Files pattern: file filter using Python glob.glob patterns: https://docs.python.org/2/library/glob.html | ||
# NB: these must currently be .zip files! | ||
zip_files_pattern=*.[zZ][iI][pP] | ||
|
||
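# Match files within zip archives; the default is all .gml files | ||
# filename_match=[!bgt_plaatsbepalingspunt]* to exclude e.g. the plaatsbepalingspunten (note: [!...] is a character class, so this also excludes any other file whose name starts with one of these characters) | ||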
filename_match=*.gml | ||
|
||
# OPTIONS | ||
# Temp dir for GFS | ||
temp_dir=temp | ||
|
||
# GFS template: determines the mapping of GML fields to PostGIS table columns | ||
gfs_template=gfs/top10-v1.2.gfs | ||
|
||
# OUTPUT: PostGIS settings | ||
host=localhost | ||
port=5432 | ||
user=postgres | ||
password=postgres | ||
database=top10nl | ||
schema=test | ||
|
||
# OPTION: attribute values, e.g. typeWeg, that occur multiple times in the XML: what to do with them | ||
# See the ogr2ogr options | ||
# May use one of these options: | ||
# multi_opts=-splitlistfields -maxsubfields 1 | ||
# multi_opts=-splitlistfields | ||
multi_opts=-fieldTypeToString StringList | ||
|
||
# Which area (clip); leave empty for everything | ||
# spatial_extent=120000 450000 160000 500000 | ||
spatial_extent= | ||
|
||
# Maximum number of features | ||
max_features=20000 |
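
To get a feel for how the `zip_files_pattern` and `filename_match` values behave, the sketch below evaluates them with Python's `fnmatch` module. It assumes the patterns follow the glob-style semantics that the comments in this file refer to; how Stetl applies them internally is not shown in this commit. The file names and the helper `select` are made up for illustration.

```python
from fnmatch import fnmatchcase

# Pattern values taken from the options file above.
ZIP_FILES_PATTERN = "*.[zZ][iI][pP]"
FILENAME_MATCH_ALL = "*.gml"
FILENAME_MATCH_NEGATED = "[!bgt_plaatsbepalingspunt]*"


def select(names, pattern):
    """Return the names that match a glob-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical archive and member names.
archives = ["top10nl_a.zip", "TOP10NL_B.ZIP", "readme.txt"]
members = [
    "bgt_plaatsbepalingspunt.gml",
    "bgt_wegdeel.gml",
    "wegdeel.gml",
    "top10.gfs",
]

zip_hits = select(archives, ZIP_FILES_PATTERN)
gml_hits = select(members, FILENAME_MATCH_ALL)
negated_hits = select(members, FILENAME_MATCH_NEGATED)

print(zip_hits)      # both .zip spellings, regardless of case
print(gml_hits)      # the three .gml members
print(negated_hits)  # only names whose FIRST character is not in the set
```

Under these semantics `[!bgt_plaatsbepalingspunt]` negates a set of single characters rather than a whole name, so `bgt_wegdeel.gml` and `top10.gfs` are filtered out along with `bgt_plaatsbepalingspunt.gml`. If only one file should be skipped, a more specific pattern or an explicit exclusion step is probably safer.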