helps
kkmattil committed Apr 1, 2022
1 parent 12aeefc commit 287a3de
Showing 26 changed files with 812 additions and 241 deletions.
2 changes: 1 addition & 1 deletion a-access
@@ -147,7 +147,7 @@ accessible to the internet.
a-access is a tool to control access permissions of a bucket in Allas.
In cases where access permissions are given or revored to a spcific projet thet syntax is:
In cases where access permissions are granted to or removed from a specific project, the syntax is:
a-access +/-type project_id bucket
89 changes: 18 additions & 71 deletions a-check
@@ -300,80 +300,27 @@ done

if [[ $print_help -eq 1 ]]; then
cat <<EOF
This tool is used to upload data from the disk environment
of CSC's supercomputers to Allas storage environmnet.
a-put can be used in other environments too.
The basic syntax of the command is:
a-put directory_or_file
By default this tool performs following operations:
1. Ensures that you have working connection to Allas storage
service.
2. In case of directory, the content of the directory is
collected into a single file (using tar command).
3. By default option --compress (-c), is used. This means that
the data is compressed using zstdmt command. This is the
recommended way if you will be using the data only in
CSC super computers. If you plan to use the uploaded dataset
in other servers, where zstdmt compression may not be available,
you can disable compression with option --nc (-n).
4. By default the data is uploaded to Allas using rclone command
and swift protocol. S3 protocol is available too.
NOTE! Data was compression with zstdmt command is no longer done by
default before the upload.
The location were data is stored in Allas can be defined with
options --bucket (-b) and --object (-o).
The default option is that data that locates in:
- scratch in Puhti is uploaded to bucket: project_number-puhti-SCRATCH
- scrarch in Mahti is uploaded to bucket: project_number-mahti-SCRATCH
- projappl in Puhti is uploaded to bucket: project_number-puhti-PROJAPPL
- projappl in Mahti is uploaded to bucket: project_number-Mahti-PROJAPPL
- LOCAL_SCRATCH in Puhti is uploaded to bucket: project_number-puhti-LOCAL_SCRATCH
In other cases the data uploaded to by default : username-poject_number-MISC
For example for user kkaytaj belonging in project_201234, data
locatioing in home directory will be uploaded to bucket: kkayttaj-201234-MISC.
The compressed dataset will be stored as one object. The object
name depends on the file name and location. The logic used is that
the possible subdirectory path in Mahti or Puhti is included
in the object name.
E.g. a file called test_1.txt in scratch directroy of Puhti can be
stored with commands:
cd /scratch/project_201234
a-put test_1.txt
In this case the file is stored to bucket: 201234-puhti-SCRATCH
as object: test_1.txt.zst
If you have another file called test_1.txt that locates in directory
/scratch/project_201234/project2/sample3 you can store it with commands:
cd /scratch/project_201234/project2/sample3
a-put test_1.txt
Or commmands
cd /scratch/project_201234
a-put project2/sample3/test_1.txt
In these cases the file is stored to bucket: 201234-puhti-SCRATCH
as object: project2/sample3/test_1.txt.zst
a-put command line options:
This tool is used to check whether Allas already contains objects that match the objects
that a-put would create. This command can be used to check the success of a data upload process
done with a-put. Alternatively, the results can be used to list objects that need to be removed
or renamed before uploading a new version of a dataset to Allas.
For example, if you have uploaded a directory to Allas using command:
a-put datadir/*
You can use command:
a-check datadir/*
To check if all the directories and files have corresponding objects in Allas.
If you have defined a bucket with option -b, you must include this option
in the a-check command too:
a-put -b 123_bucket datadir/*
Checking:
a-check -b 123_bucket datadir/*
Note that the checking is done only based on the names of files, directories and objects.
The contents of the files and objects are not checked!
a-check command line options:
-b, --bucket <bucket_name> Define a name of the bucket into
which the data is uploaded.
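The name-based comparison described in this help text can be sketched as follows. This is only an illustration under stated assumptions, not the a-check implementation: the function `missing_objects` and the two list files are hypothetical. In practice the expected names would follow the a-put naming rules and the existing names would come from a bucket listing.

```shell
# Hypothetical helper, not part of a-check.
# $1: file listing the object names an upload is expected to create, one per line
# $2: file listing the object names that already exist in the bucket, one per line
# Prints every expected name that has no exact match among the existing names.
missing_objects() {
    # -F: treat patterns as fixed strings, -x: match whole lines only,
    # -v: print the lines of $1 that match none of the names in $2
    grep -F -x -v -f "$2" "$1"
}
```

As in a-check itself, only names are compared here. The contents of files and objects are never inspected.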
4 changes: 2 additions & 2 deletions a-delete
@@ -129,7 +129,7 @@ fi
if [ $print_help -eq 1 ]; then
cat <<EOF
This tool is used to remove data that has been uploaded to Allas service using the a-put command.
The basic syntax of the comand is:
The basic syntax of the command is:
a-delete object_name
@@ -140,7 +140,7 @@ Options:
-b --bucket Object name includes bucket name and the command does not try to use the default bucket names.
-u, --user <username> Option allows you to assign a user account that is used to confoirm the object ownership.
-u, --user <username> Option allows you to assign a user account that is used to confirm the object ownership.
-f, --force Don't ask confirmation when deleting a file
4 changes: 2 additions & 2 deletions a-encrypt
@@ -118,7 +118,7 @@ By default the object is encrypted with CSC public key only. The encrypted objec
same bucket as the original object. Suffix: .c4gh is added to the object name.
The main purpose of this tool is to make a file, uploaded to the Allas service, compatible with the
Sensitive data services of csc.
Sensitive data services of CSC.
Options:
@@ -146,7 +146,7 @@ Examples:
a-encrypt project_12345_data/my_data.csv
2. Make encrypted copies of all objects in buket project_12345_data to bucket project_12345_sd
2. Make encrypted copies of all objects in bucket project_12345_data to bucket project_12345_sd
a-encrypt project_12345_data --all --bucket project_12345_sd
17 changes: 9 additions & 8 deletions a-find
@@ -103,14 +103,15 @@ This tool is used to find:
1. Objects, whose name matches to the query
2. a-put generated objects that contain matching file names.
The basic syntax of the comand is:
The basic syntax of the command is:
a-find query_term
The query term is compared to the object names as well as the names and original paths of the files that have been uploaded to Allas with a-put. The matching obects are reported (but not downloaded).
The query term is compared to the object names as well as the names and original paths of the files that
have been uploaded to Allas with a-put. The matching objects are reported (but not downloaded).
The query term is processed as a reqular repression where some characters, for example dot (.), have a special meaning.
The query term is processed as a regular expression where some characters, for example dot (.), have a special meaning.
The same regular expression syntax is used with e.g. grep, awk and sed commands.
The most commonly occurring special characters are listed below:
@@ -121,7 +122,7 @@ The most commonly occurring special characters are listed below:
[^ ] matches any character, except the characters inside the brackets.
For example [^abc] would select all rows that contain also other characters
than just a,b and c.
* matchs zero or more of the preceding character or expression
* matches zero or more of the preceding character or expression
\{n,m\} matches n to m occurrences of the preceding character or expression
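Since the help text notes that the same regular expression syntax is used by grep, the effect of these special characters can be tried out with grep on a small list of names. The object names below are hypothetical and serve only as an illustration. How a-find applies the query internally is not shown in this diff.

```shell
# Hypothetical object names, used only for this illustration.
names='sample.txt
sampleXtxt
sample1.txt
sample22.txt'

# An unescaped dot matches any single character,
# so this also reports sampleXtxt.
printf '%s\n' "$names" | grep 'sample.txt'

# A backslash makes the dot literal: only sample.txt is reported.
printf '%s\n' "$names" | grep 'sample\.txt'

# [0-9]* matches zero or more digits before the literal dot.
printf '%s\n' "$names" | grep 'sample[0-9]*\.txt'

# \{2,3\} requires two to three digits before the literal dot.
printf '%s\n' "$names" | grep 'sample[0-9]\{2,3\}\.txt'
```

When a query term is meant to match a file extension literally, escaping the dot avoids unintended hits.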
Options:
@@ -133,11 +134,11 @@ Options:
-b, --bucket <bucket_name> By default all the standard buckets, used by a-put, are searched. Option --bucket allows you to specify a
single bucket that will be used for the search.
-a, --all By default all the standard buckets, used by a-put, are searched. Option --all defines that all the bukets of
of the project will be included in the search.
-a, --all By default all the standard buckets, used by a-put, are searched. Option --all defines
that all the buckets of the project will be included in the search.
-s, --silent Ouput just the object names and number of hits. If -file in cluded print object name and
matching file name on one row.
-s, --silent Output just the object names and number of hits. If -file option is included,
print object name and matching file name on one row.
Related commands: a-put, a-get, a-delete, a-info
9 changes: 4 additions & 5 deletions a-flip
@@ -49,19 +49,18 @@ done

if [ $print_help -eq 1 ]; then
cat <<EOF
a-flip is a tool to make individual files temporaraily available in the internet.
a-flip is a tool to make individual files temporarily available on the internet.
a-flip copies a file to Allas into a bucket that can be publicly accessed. Thus, anyone with the address (URL) of the
uploaded data object can read and download the data with a web browser or tools like *wget* and *curl*.
a-flip works mostly like a-publish but there are some differences:
1) only the predfined bucket name ( _username-projectNumber_-flip ) can be used
1) only the pre-defined bucket name ( _username-projectNumber_-flip ) can be used
2) When the command is executed it automatically deletes objects that are older than two days
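The two-day age rule in point 2 can be sketched as a simple timestamp comparison. This is a minimal sketch under stated assumptions, not code taken from a-flip: the function name `is_older_than_two_days` is hypothetical, and it is assumed that the last-modified time of an object is available as seconds since the Unix epoch.

```shell
# Hypothetical helper, not taken from a-flip.
# $1: last-modified time of an object, in seconds since the Unix epoch
# $2: current time in seconds since the Unix epoch (defaults to now)
# Returns success (0) when the object is more than two days old.
is_older_than_two_days() {
    modified=$1
    now=${2:-$(date +%s)}
    max_age=$((2 * 24 * 60 * 60))   # two days expressed in seconds
    [ $((now - modified)) -gt "$max_age" ]
}
```

A cleanup loop could call this for each object in the flip bucket and delete those for which it succeeds.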
The basic syntax of the command is:
```
a-flip file_name
```
a-flip file_name
The file is uploaded to a bucket _username-projectNumber_-flip. Other bucket names cannot be used.
The URL to the uploaded object is:
4 changes: 2 additions & 2 deletions a-info
@@ -124,14 +124,14 @@ done
if [ $print_help -eq 1 ]; then
cat <<EOF
This tool is used to show information about a data object that has been uploaded to Allas service using the a-put command.
The basic syntax of the comand is:
The basic syntax of the command is:
a-info object_name
Options:
-p, --project <project_ID> Get infromation about objects form the buckets of the defined project in stead of the currently configured project.
-p, --project <project_ID> Get information about objects from the buckets of the defined project instead of the currently configured project.
-b, --bucket Object name includes bucket name and the command does not try to use the default bucket names.
4 changes: 2 additions & 2 deletions a-list
@@ -134,9 +134,9 @@ Options:
-d --dir List content so that / -characters on object names are used to define a directory structure.
-l, --lh <project_ID> Print out detaled listing of the bukects or objects in a bucket.
-l, --lh <project_ID> Print out detailed listing of the buckets or objects in a bucket.
-p, --prefix List only objects startting with the given prefix
-p, --prefix List only objects starting with the given prefix.
Related commands: a-put, a-get, a-delete, a-find
12 changes: 7 additions & 5 deletions a-publish
@@ -117,11 +117,13 @@ a-publish options:
-b, --bucket Use the defined bucket instead of the default bucket name
-o, --os_file Define alternative name for the object that will be created
-i, --index (static/dynamic). By defaul a-publis creates a static index file that
includes the objects that are in the target bucket when the command is executed.
With setting --index dynamic the command adds a javascript based index file to the
bucket. With this option the index.html page lists the objects that are
available in the bucket in the time when this page is accessed. This dynamic indexing tool can list
-i, --index (static/dynamic). By default a-publish creates a static index
file that includes the objects that are in the target bucket when
the command is executed.
With setting --index dynamic the command adds a JavaScript-based
index file to the bucket. With this option the index.html page
lists the objects that are available in the bucket at the time when
this page is accessed. This dynamic indexing tool can list
only up to 1000 files.
--input-list List of files to be uploaded.
32 changes: 12 additions & 20 deletions a-put
@@ -301,7 +301,7 @@ done
if [[ $print_help -eq 1 ]]; then
cat <<EOF
This tool is used to upload data from the disk environment
of CSC's supercomputers to Allas storage environmnet.
of CSC's supercomputers to Allas storage environment.
a-put can be used in other environments too.
The basic syntax of the command is:
@@ -316,14 +316,7 @@ By default this tool performs following operations:
2. In case of directory, the content of the directory is
collected into a single file (using tar command).
3. By default option --compress (-c), is used. This means that
the data is compressed using zstdmt command. This is the
recommended way if you will be using the data only in
CSC super computers. If you plan to use the uploaded dataset
in other servers, where zstdmt compression may not be available,
you can disable compression with option --nc (-n).
4. By default the data is uploaded to Allas using rclone command
3. By default the data is uploaded to Allas using rclone command
and swift protocol. S3 protocol is available too.
NOTE! Data compression with zstdmt command is no longer done by
@@ -343,14 +336,14 @@ The default option is that data that locates in:
In other cases the data is uploaded by default to: username-project_number-MISC
For example, for user kkayttaj belonging to project_201234, data
locatioing in home directory will be uploaded to bucket: kkayttaj-201234-MISC.
located in the home directory will be uploaded to bucket: kkayttaj-201234-MISC.
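The default bucket rules listed in this help text can be summarised with a small case statement. This is a rough sketch, not the logic a-put actually uses: the function `default_bucket` and its arguments are hypothetical, and the lowercase host name is used in every bucket name as an assumption, although the help text writes one Mahti bucket with a capital M.

```shell
# Hypothetical helper, not taken from a-put.
# $1: host name (puhti or mahti)
# $2: disk area (scratch, projappl, LOCAL_SCRATCH or anything else)
# $3: project number
# $4: user name
default_bucket() {
    host=$1; area=$2; project=$3; user=$4
    case "$host:$area" in
        puhti:scratch|mahti:scratch)   echo "${project}-${host}-SCRATCH" ;;
        puhti:projappl|mahti:projappl) echo "${project}-${host}-PROJAPPL" ;;
        puhti:LOCAL_SCRATCH)           echo "${project}-${host}-LOCAL_SCRATCH" ;;
        # Every other location falls back to the user-specific MISC bucket.
        *)                             echo "${user}-${project}-MISC" ;;
    esac
}
```

For example, `default_bucket puhti scratch 201234 kkayttaj` prints the SCRATCH bucket name of project 201234 on Puhti, while a home directory location would fall through to the MISC bucket.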
The compressed dataset will be stored as one object. The object
name depends on the file name and location. The logic used is that
the possible subdirectory path in Mahti or Puhti is included
the possible sub-directory path in Mahti or Puhti is included
in the object name.
E.g. a file called test_1.txt in scratch directroy of Puhti can be
E.g. a file called test_1.txt in scratch directory of Puhti can be
stored with commands:
cd /scratch/project_201234
@@ -386,7 +379,7 @@ a-put command line options:
created.
-S, --s3cmd Use S3 protocol instead of swift protocol
for upoload.
for upload.
-n, --nc Do not compress the data that will be uploaded.
(This is now the default mode thus this option is
@@ -397,7 +390,7 @@ a-put command line options:
-h, --help Print this help.
-t, --tmpdir Define a direcrory that will be used to store
-t, --tmpdir Define a directory that will be used to store
temporary files of the upload process.
-s, --silent Less output
@@ -419,20 +412,19 @@ a-put command line options:
--override Allow overwriting existing objects.
--input-list <list_file> Give a file that lists the files or directtories
to be uploaded to Allas.
Each item will be stored as one object.
--input-list <list_file> Give a file that lists the files or directories
to be uploaded to Allas. Each item will be stored as one object.
-a, --asis Copy the given file or content of a directory to Allas
without compression and packing so that each file in the
directory will be copied to Allas as an individual object.
The object name contrains the relative path of the file to
The object name contains the relative path of the file to
be copied.
--follow-links When uploading a directory, include linked files as real files
in sead of links.
instead of links.
-e, --encrypt <methiod> Options: gpg and c4gh. Encrypt data with pgp or crypt4gh.
-e, --encrypt <method> Options: gpg and c4gh. Encrypt data with pgp or crypt4gh.
--pk, --public-key Public key used for crypt4gh encryption.