View Documents in the NFDI-MatWerk Data Repository
Entries in the NFDI-MatWerk Data Repository are called data resources. A data resource contains a set of administrative metadata like title, publisher, date updated, etc., which describe the data. The data (one or more files) are linked to a single data resource which describes them. The data associated with a data resource can be of different file types like .png, .jpg, .tif, .zip, .hdf5, .pdf, etc.
Only publicly available documents (data resources and associated data) are visible without logging in. To view data resources that are shared exclusively with you, please log in with your credentials. The same applies to the REST-API method; in this case, an access token for Keycloak (the authentication and authorisation interface) must first be generated using the command line.
Browser-based Graphical User Interface Method
- Open the homepage of the NFDI-MatWerk Data Repository by clicking on the Open button on the box titled NFDI-MatWerk Data Repository. Hover the mouse pointer over the box to make the Open button appear.
- Find the required data resource by scrolling down and/or filtering and view the contents by clicking on the eye button. The edit button next to the eye button can be used by authorised users to make changes to the data resource (administrative metadata).
- By clicking on the eye button, the contents listed under the selected data resource become visible in a new window. Open individual documents (visible under the tab Content Information) in the browser by clicking on the eye button next to them. Similarly, the files can be downloaded by clicking on the download button next to the eye button.
REST-API Method
REST-API calls can be made using the browser or the Command Line Interface (CLI).
Browser Method
Each data resource has a unique identifier (UID). If this UID is known, the data resource can be opened directly in the browser via the REST-API using its link. The link has the format https://matwerk.datamanager.kit.edu/api/v1/dataresources/<UID>
For example: https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0
For better formatting of the JSON output, a plugin like JSON Peep for Safari is recommended.
Each data resource has one or more documents as content. Individual content documents can be accessed in the browser via the REST-API using their full address, which has the format https://matwerk.datamanager.kit.edu/api/v1/dataresources/<UID>/data/<filename>
For example: https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/measurement_0000.tif
These full addresses can be used to share the contents with others, provided that the documents are publicly accessible or that the recipients have previously been granted access.
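The two address formats above can be sketched as small shell helpers. This is only an illustration: the function names resource_url and content_url and the variable BASE_URL are made up for this example, and the sketch assumes that the UID and filename contain no characters that would need URL-encoding.

```shell
#!/bin/sh
# Base address of the REST-API, taken from the links shown above.
BASE_URL='https://matwerk.datamanager.kit.edu/api/v1/dataresources'

# resource_url <UID>: print the address of a data resource.
# (Hypothetical helper name, not part of the repository tooling.)
resource_url() {
    printf '%s/%s\n' "$BASE_URL" "$1"
}

# content_url <UID> <filename>: print the address of one content document.
# Assumes the filename needs no URL-encoding (no spaces, umlauts, etc.).
content_url() {
    printf '%s/%s/data/%s\n' "$BASE_URL" "$1" "$2"
}
```

For instance, content_url 3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0 measurement_0000.tif prints the second example address given above, which can then be pasted into the browser or shared.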
Command Line Usage
Similarly, using the REST-API on the command line, either all the resources can be listed or a single resource can be accessed using its UID.
To begin, open the command prompt.
Install jq Command Line JSON Processor
The jq command-line processor is a prerequisite for many of the operations in this manual that use the REST-API. It is used to work with the JSON output of the repositories directly on the command line. If jq is not already installed, please install it for your system following the instructions on the jq website. Normally this involves executing just a single statement on the command line.
List all Data Resources
- Type
curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/' | jq
and press enter.
- Now all the data resources which are publicly available will be visible. To view resources to which you have special access, follow the instructions under Authenticate using the command line interface.
List Selected Data from all Data Resources
The command curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/' | jq -r '.[].id' lists all the UIDs.
Below is an example with results:
curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/' | jq -r '.[].id'
0d1c7f88-7f8e-4f3a-8391-037b5ef111dc
02109dbe-9b2e-4b20-b649-4d667e153b4c
3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0
e51403c6-3e66-4f33-823b-93a9ccdca03c
35d04eea-1288-45f1-8351-e914ae33da9e
de472b6e-131e-424f-99de-919bbe11410f
9c23f27e-31cd-4192-8e71-ec8419dacbbe
10.5281/zenodo.7764161
980f5b67-8d07-4340-a255-e5738997c18a
f4db4926-5dc7-4d23-afec-12168f148cca
e9c5bda4-a4c0-408d-a0e4-0cd77b4b83c4
cd218ccf-4fa2-4629-9860-bb2b5a7b842a
20e6249b-7988-4092-9aa1-5361bfafefbc
423ece8d-6171-4f9a-9055-77ebb77c05e8
f29a69b2-be81-494a-b240-9caba4c2633c
Other data can be retrieved similarly by changing the last part of the previous command. For example, use '.[].creators' instead of '.[].id' to list the creators of all the data resources. Other options include '.[].titles', '.[].publisher', '.[].publicationYear' and '.[].language'.
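The pattern of swapping the last part of the command can be wrapped in a small helper, sketched below. The function names field_filter, list_field and list_id_and_publisher are hypothetical; the keys are the ones listed above. The last function additionally uses jq string interpolation to print two fields per line, which goes slightly beyond the commands shown in this manual.

```shell
#!/bin/sh
# field_filter <key>: print the jq filter that selects <key> from every
# data resource in the list, e.g. "creators" gives ".[].creators".
field_filter() {
    printf '.[].%s\n' "$1"
}

# list_field <key>: query the repository and print <key> of every
# publicly visible data resource (requires curl, jq and network access).
list_field() {
    curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/' \
        | jq -r "$(field_filter "$1")"
}

# list_id_and_publisher: print UID and publisher of every data resource,
# separated by a tab, one resource per line.
list_id_and_publisher() {
    curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/' \
        | jq -r '.[] | "\(.id)\t\(.publisher)"'
}
```

Calling list_field publicationYear is then equivalent to running the command above with '.[].publicationYear' as the jq filter.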
Access a Specific Resource
- A single resource can be accessed if its UID is known. For example, to access the data resource with UID 3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0, type
curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0' | jq
on the command line and press enter. The general format is
curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/<UID>' | jq
where <UID> needs to be replaced with the known UID of the data resource.
- Now the data resource with the queried UID will be displayed. If this resource is not publicly available, authorised users need to authenticate with an access token to access it.
curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0' | jq
{
  "id": "3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0",
  "identifier": {
    "id": 64,
    "value": "(:tba)",
    "identifierType": "DOI"
  },
  "creators": [
    {
      "id": 64,
      "familyName": "Soysal",
      "givenName": "Mehmet"
    }
  ],
  "titles": [
    {
      "id": 64,
      "value": "X-ray computed tomography dataset of a walnut (Acquisition)",
      "titleType": "ALTERNATIVE_TITLE",
      "lang": "en"
    }
  ],
  "publisher": "Jonas Fell",
  "publicationYear": "2022",
  "resourceType": {
    "id": 64,
    "value": "Acquisition ",
    "typeGeneral": "OTHER"
  },
  "dates": [
    {
      "id": 64,
      "value": "2023-01-18T13:48:10Z",
      "type": "CREATED"
    }
  ],
  "language": "en",
  "alternateIdentifiers": [
    {
      "id": 64,
      "value": "3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0",
      "identifierType": "INTERNAL"
    }
  ],
  "lastUpdate": "2023-02-28T20:04:18.074Z",
  "state": "VOLATILE"
}
This is only the administrative metadata of the data resource. Information on the number and type of content files is not available using this command. The next section describes how to get information about the content associated with a data resource.
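Before piping a response to jq, it can help to check whether the resource is reachable at all, since a non-public resource requires authentication as noted above. The following is a minimal sketch using curl's --write-out variable http_code. The function names are hypothetical, and which status codes the repository actually returns for non-public resources is an assumption here; the mapping below only reflects the general meaning of these HTTP codes.

```shell
#!/bin/sh
# describe_status <http-code>: translate an HTTP status code into a hint.
# The mapping follows the generic HTTP semantics, not repository docs.
describe_status() {
    case "$1" in
        200) echo 'accessible' ;;
        401|403) echo 'authentication or authorisation required' ;;
        404) echo 'not found' ;;
        *) echo "unexpected status $1" ;;
    esac
}

# check_resource <UID>: request the data resource, discard the body and
# report only the status code (requires curl and network access).
check_resource() {
    code=$(curl -s -o /dev/null -w '%{http_code}' \
        "https://matwerk.datamanager.kit.edu/api/v1/dataresources/$1")
    describe_status "$code"
}
```

Usage: check_resource 3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0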
View Overview of all Content
To view information about all the content associated with the data resource of a specific UID, type
curl 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/<UID>/data/' -s -X GET -H 'Accept: application/vnd.datamanager.content-information+json' | jq
on the command prompt and press enter.
For example, curl 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/' -s -X GET -H 'Accept: application/vnd.datamanager.content-information+json' | jq lists all the content information of the data resource with UID 3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0 as JSON.
curl 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/' -s -X GET -H 'Accept: application/vnd.datamanager.content-information+json' |jq
[
  {
    "id": 40,
    "parentResource": {
      "id": "3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0",
      "identifier": {
        "value": "(:tba)",
        "identifierType": "DOI"
      },
      "alternateIdentifiers": [
        {
          "value": "3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0",
          "identifierType": "INTERNAL"
        }
      ]
    },
    "relativePath": "measurement_0017.tif",
    "version": 1,
    "fileVersion": "1",
    "versioningService": "simple",
    "depth": 1,
    "contentUri": "https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/measurement_0017.tif",
    "uploader": "mehmet.soysal@kit.edu",
    "mediaType": "image/tiff",
    "hash": "sha1:f933c21e95008ee0dab092f2f5a1db17326afd94",
    "size": 13547642,
    "metadata": {},
    "tags": [],
    "filename": "measurement_0017.tif"
  },
  {
    "id": 41,
    "parentResource": {
      "id": "3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0",
      "identifier": {
        "value": "(:tba)",
        "identifierType": "DOI"
      },
      "alternateIdentifiers": [
        {
          "value": "3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0",
          "identifierType": "INTERNAL"
        }
      ]
    },
    "relativePath": "measurement_0018.tif",
    "version": 1,
    "fileVersion": "1",
    "versioningService": "simple",
    "depth": 1,
    "contentUri": "https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/measurement_0018.tif",
    "uploader": "mehmet.soysal@kit.edu",
    "mediaType": "image/tiff",
    "hash": "sha1:e4cb97209054833f0d3bfbc9d64650866a72a776",
    "size": 13547642,
    "metadata": {},
    "tags": [],
    "filename": "measurement_0018.tif"
  }
]
In this example, the data resource with UID 3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0 is associated with two images, “measurement_0017.tif” and “measurement_0018.tif” (also visible under the key “filename”). The contentUri can then be used directly to open the image in the browser (e.g., https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/measurement_0018.tif) or to download it with the curl command, as shown below.
curl https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/measurement_0018.tif -o /tmp/measurement_0018.tif
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 12.9M 100 12.9M 0 0 20.4M 0 --:--:-- --:--:-- --:--:-- 20.6M
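After a download, the file can be compared with the value under the "hash" key of the content information. The sketch below assumes that this value is the SHA-1 digest of the file contents prefixed with "sha1:", which the listing above suggests but this manual does not state explicitly. The function names are hypothetical; sha1sum (GNU coreutils) and shasum (e.g. on macOS) are existing tools.

```shell
#!/bin/sh
# file_sha1 <file>: print the SHA-1 digest of a file as lowercase hex.
# Uses sha1sum where available and falls back to shasum otherwise.
file_sha1() {
    if command -v sha1sum >/dev/null 2>&1; then
        sha1sum "$1" | cut -d ' ' -f 1
    else
        shasum -a 1 "$1" | cut -d ' ' -f 1
    fi
}

# verify_sha1 <file> <hash>: succeed if the file matches a hash of the
# form "sha1:<hex>" (assumed format of the "hash" key, see above).
verify_sha1() {
    expected=${2#sha1:}   # strip the "sha1:" prefix
    [ "$(file_sha1 "$1")" = "$expected" ]
}
```

For the download shown above, the check would be verify_sha1 /tmp/measurement_0018.tif 'sha1:e4cb97209054833f0d3bfbc9d64650866a72a776' followed by echo $? to see the result (0 means the digests match).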
Access Specific Content
Individual content can be accessed with the command curl 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/<UID>/data/<filename>' -i -X GET, where <UID> and <filename> are to be replaced by the required values. Opening individual image files on the command line is not recommended. For example, curl 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/measurement_0000.tif' -i -X GET gives a warning because the image file is binary, whereas curl -s 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/info.json' | jq displays the file “info.json” in a formatted way.
curl 'https://matwerk.datamanager.kit.edu/api/v1/dataresources/3035cf01-cb70-4fb7-9a3a-ef5b4fb99da0/data/measurement_0000.tif' -i -X GET
HTTP/1.1 200
Date: Fri, 25 Aug 2023 11:59:16 GMT
Server: Apache/2.4.56 (Debian)
Vary: Origin,Access-Control-Request-Method,Access-Control-Request-Headers
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Strict-Transport-Security: max-age=31536000 ; includeSubDomains
X-Frame-Options: DENY
Content-Type: image/tiff
Content-Length: 13547642
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
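To avoid the binary-output warning shown above, the media type can be inspected first and used to decide between printing and saving. This is a rough sketch with hypothetical function names; it relies on curl's --write-out variable content_type and, for simplicity, requests the document twice (once only to read the Content-Type header). Which media types count as printable is a choice made for this example.

```shell
#!/bin/sh
# is_text_type <media-type>: succeed for media types that are safe to
# print in a terminal (plain text and JSON in this sketch).
is_text_type() {
    case "$1" in
        text/*|application/json*) return 0 ;;
        *) return 1 ;;
    esac
}

# show_or_save <url> <file>: print text content to the terminal and save
# anything else to <file> (requires curl and network access).
show_or_save() {
    # First request: discard the body, keep only the Content-Type value.
    media_type=$(curl -s -o /dev/null -w '%{content_type}' "$1")
    if is_text_type "$media_type"; then
        curl -s "$1"
    else
        curl -s -o "$2" "$1"
    fi
}
```

With the example resource, show_or_save called on the address of measurement_0000.tif would write the image to the given file, because its Content-Type is image/tiff.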
Tip
It is also possible to search for a particular resource using an example, or to look up a resource using an alternative identifier instead of the UID. For more information, please refer to the documentation of the base-repo.