Commit 780c52e7 authored by Marcel Hellkamp

Readme overhaul

parent df61c5a0
# pycdstar3: Library and command-line client for GWDG CDSTAR 3.x
This is a client library and command-line toolbelt for accessing a
[GWDG CDSTAR 3.x](https://cdstar.gwdg.de/) repository, written in Python (3.4+)
and designed to be extensible and used for building your own tools and workflows.
There is already a library called [pycdstar](https://pypi.org/project/pycdstar/)
for accessing older versions of CDSTAR. The two libraries may be merged at some
point, but for now, use [pycdstar](https://pypi.org/project/pycdstar/) for
CDSTAR 2 and [pycdstar3](https://gitlab.gwdg.de/cdstar/pycdstar3) for CDSTAR 3
and newer.
[GWDG CDSTAR 3](https://cdstar.gwdg.de/) repository, written in Python (3.4+) and designed to be extensible and used for building your own tools and workflows.
There is already a library called [pycdstar](https://pypi.org/project/pycdstar/) for accessing older versions of CDSTAR. The two libraries may be merged at some point, but for now, use [pycdstar](https://pypi.org/project/pycdstar/) for CDSTAR 2 and [pycdstar3](https://gitlab.gwdg.de/cdstar/pycdstar3) for CDSTAR 3.
## Command-Line interface
`cdstar3` is a command-line toolbox to upload, download or manage data in a CDSTAR repository.
`pycdstar3` is also a command-line toolbox to upload, download or manage data in a CDSTAR repository.
Please note that the command-line interface is made for humans, not scripts. The commands and output may change between releases without notice. If you want to automate CDSTAR, consider implementing your tools directly against the `pycdstar3` client library. Libraries for other languages may also be available. As a last resort, you can always develop directly against the stable [CDSTAR REST API](https://cdstar.gwdg.de/docs/dev/#endpoints) using the HTTP client of your choice.
### cdstar.conf
The `pycdstar3` client will look for a `cdstar.conf` in the current directory or its parent directories and try to load default values from it. The most important settings are the CDSTAR server URI and a default vault name. If these are defined, you can reference archives by ID only, instead of the full service URI. This saves a *lot* of typing. The `pycdstar3 init` command will help you create this file.
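The lookup described above (current directory first, then each parent directory) can be sketched in a few lines of Python. This is only an illustration of the search order, not the client's actual implementation; the function name `find_config` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_config(start: Path, name: str = "cdstar.conf") -> Optional[Path]:
    """Return the nearest `name` in `start` or one of its parents, else None.

    Hypothetical helper: it only mirrors the search order described in the
    README and is not part of the pycdstar3 API.
    """
    start = start.resolve()
    # Check the starting directory first, then walk up towards the root.
    for folder in [start] + list(start.parents):
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_config(Path.cwd())` returns the closest matching file, so a `cdstar.conf` in a project folder would take precedence over one further up the tree.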
### Referencing Vaults, Archives and Files
Most `pycdstar3` client commands operate on a specific vault, archive or file. These can always be referenced by their full URI. If a `cdstar.conf` is present and default values for server and vault are defined, a couple of short forms can also be used. The following syntax forms are supported:
Please note that `cdstar3` was designed to be used by humans, not scripts. The output is mostly human-friendly and may change between releases. If you want to automate CDSTAR, consider implementing your tools using the `pycdstar3` client library, or directly against the stable CDSTAR REST API.
* **Full URI**: If the reference starts with `http(s)://` then everything up to the first `/v3/` is used as the server URI, followed by a vault and optionally an archive ID and file path. The default server and vault settings are not used.
### Configuration
Example: `pycdstar3 get https://example.com/v3/myvault/e497a76f/some/file.txt`
* **Vault Name**: If the reference starts with a forward slash and a vault name, then the default server from `cdstar.conf` is used, but the default vault is ignored. To reference the default vault, leave the vault name empty.
The `cdstar3` client needs to know which server to connect to and which vault to use by default. This information (and more) is defined in a file named `cdstar.conf`. If no such file is specified via the `-c` parameter, `cdstar3` will look for a `cdstar.conf` in the current working directory, and then in any of its parent directories. As a last resort, certain system-dependent user folders are searched (e.g. `~/.config/cdstar3/` on Linux). You can create a configuration file with `cdstar3 init`.
Example: `pycdstar3 get /myvault/e497a76f/some/file.txt`
Example: `pycdstar3 get //e497a76f/some/file.txt` (default vault)
* **Archive ID**: In the shortest form, the reference must start with an archive ID and both default server and vault settings must be present in your `cdstar.conf`.
### Target Strings
Example: `pycdstar3 get e497a76f/some/file.txt`
A single CDSTAR server might host multiple vaults, each containing any number of archives, each containing multiple files. These can be referenced using their full resource URI (e.g. `http(s)://$server/v3/$vault/$archive/$file`) but that would be a lot of typing if you are mostly working with a single vault at a time. For this reason, a default server and vault can be configured in your `cdstar.conf` and the target strings will get a lot shorter: Archives can be referenced by their ID alone, and files by `$archive/$file`. If you want to specify a different vault, you can use the slightly longer `/$vault/$archive` or `/$vault/$archive/$file` forms. Notice the leading `/` if you specify a vault. Even with a `cdstar.conf` present, you can still use the full URI if needed.
Some commands will accept a special string `new` as an archive ID and create a new archive on the fly.
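To make the three reference forms concrete, here is a minimal Python sketch of how such a reference could be split into server, vault, archive and file path. It is an assumption-laden illustration, not the parser used by the client: the names `Target` and `parse_target` are hypothetical, and the sketch assumes the server URI excludes the `/v3/` segment.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    server: str
    vault: str
    archive: Optional[str]
    path: Optional[str]


def parse_target(ref, default_server=None, default_vault=None):
    """Split a reference string into its parts (hypothetical helper)."""
    if ref.startswith(("http://", "https://")):
        # Full URI: everything before the first "/v3/" is the server.
        server, sep, rest = ref.partition("/v3/")
        if not sep:
            raise ValueError("full URI must contain /v3/")
        vault, _, rest = rest.partition("/")
    elif ref.startswith("/"):
        # Vault form: default server, explicit vault (empty means default).
        server = default_server
        vault, _, rest = ref[1:].partition("/")
        vault = vault or default_vault
    else:
        # Shortest form: both defaults are required.
        server, vault, rest = default_server, default_vault, ref
    if not server or not vault:
        raise ValueError("server or vault missing and no default configured")
    archive, _, path = rest.partition("/")
    return Target(server, vault, archive or None, path or None)
```

With defaults taken from a `cdstar.conf`, `parse_target("//e497a76f/some/file.txt", "https://example.com", "myvault")` resolves to the same target as the full URI `https://example.com/v3/myvault/e497a76f/some/file.txt`.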
### Usage
### Command Usage
This is an (incomplete) list of commands and their most important parameters. For a complete list, run `cdstar3 -h` and for details, see `cdstar3 COMMAND -h`.
This is an (incomplete) list of (planned) commands. For a complete list, run `pycdstar3 -h` and for details, see `pycdstar3 COMMAND -h`.
* **`init`**: Ask for server address, vault, credentials and other config options and create a `cdstar.conf` file in the current directory.
* **`new`** Create a new (empty) archive.
* **`info`** Query information about vaults, archives or files.
* **`meta get/set/list`** Manage metadata attributes for archives or files.
* **`acl get/set/list`** Manage access control entries and permissions.
* **`put`** Upload one or more files or folders to an archive.
* **`get`** Download or stream a single file from an archive.
* **`ls`** List files in an archive.
* **`rm`** Remove archives or files.
* **`zip`** Download an entire archive as zip or tar.
* **`sync`** Sync a local folder with a remote archive.
* **`scroll`** List all IDs in a vault.
* **`search`** Search in a vault.
* **`put ID [path]`** Upload one or more files or folders to an archive.
* **`get ID/FILE (path)`** Download a single file. If no path is specified, it is streamed to stdout.
* **`info ID[/FILE]`** Get information about an archive or file.
* **`meta get ID[/FILE] (FIELD)`** Get metadata about an archive or file.
* **`meta set ID[/FILE] [FIELD=VALUE]`** Set metadata about an archive or file.
###########################################
High-level commands work with local archive directories (one per archive). When creating a new archive or recovering an existing archive for the first time, `cdstar3` will create a hidden `.cdstar` folder within the target directory and remember the exact location (server, vault and id) of the remote archive. Do NOT delete this folder, or the correlation between your local copy and the remote archive is lost.
* **`archive DIR`**: Create a new remote archive from a local directory.
* **`recover ARCHIVE DIR`** Download an existing remote archive to a local directory.
* **`sync --up/--down`**: Synchronize a local archive directory with the corresponding remote archive by uploading or downloading missing files. This command is *safe* by default: Existing files are not overwritten and no files are removed.
* `--update` overwrite outdated files at the target location (comparing last modified time).
* `--force` overwrite all files that do not match, even if the target file is newer.
* `--delete` remove files at the target location that are not present at the source.
* `--progress` show a fancy progress bar.
* `--dry-run` only print what would have changed, but do not actually apply any changes.
* `-- [files]` only sync these files or folders.
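The way these flags interact can be illustrated with a small Python sketch that computes a sync plan without touching any files, similar in spirit to `--dry-run`. This is not the client's implementation: the function `plan_sync` is hypothetical, and it assumes that "files that do not match" means files whose content digest differs, which the text above does not specify.

```python
def plan_sync(source, target, update=False, force=False, delete=False):
    """Return a list of (action, name) tuples.

    `source` and `target` map file names to (mtime, digest) tuples.
    Hypothetical helper; comparing digests is an assumption.
    """
    plan = []
    for name, (src_mtime, src_digest) in sorted(source.items()):
        if name not in target:
            plan.append(("copy", name))  # missing files are always copied
            continue
        dst_mtime, dst_digest = target[name]
        if src_digest == dst_digest:
            continue  # identical content, nothing to do
        # --force overwrites any mismatch, --update only outdated targets.
        if force or (update and dst_mtime < src_mtime):
            plan.append(("overwrite", name))
    if delete:
        # --delete removes target files that are absent from the source.
        for name in sorted(set(target) - set(source)):
            plan.append(("remove", name))
    return plan
```

Without any flags the plan only ever contains `copy` actions, which matches the safe default described above.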
Low-level commands do not require a local archive directory and operate directly on remote archives or vaults. If called from within a local archive directory however, the remote archive can be referred to as `origin`. For example, `cdstar3 ls origin` will list all files in the remote archive.
* **`search QUERY`** Search a vault.
* **`scroll [START]`** List all IDs in a vault, starting with the given id.
* **`info [ARCHIVE]`**: Print information about a remote archive.
* **`ls ARCHIVE`** List all files in a remote archive.
* **`get ARCHIVE NAME [FILE]`** Download a single file from a remote archive.
* **`put ARCHIVE FILE [NAME]`** Upload a single file to a remote archive.
* **`meta get/set ARCHIVE ATTR [VALUES]`** Read or set the value of a meta attribute.
* **`acl ARCHIVE SUBJECT [PERMISSIONS]`** Set permissions for a specific subject.
## License
Copyright 2019 Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Copyright 2019 Marcel Hellkamp
Licensed under the Apache License, Version 2.0 (the "License");
......