The ICAT API has a method called ingestMetadata which allows you, in one call, to define and set up a new Investigation/Dataset with all the related parameters. This method takes a sessionId and an XML string as parameters.
The XML string must conform to a specific schema that defines which parameters are required and which formats are acceptable. The schema follows the same structure as the ICAT database, though sometimes in a simplified manner: not all possible relationships are described. The values of the lookup tables can also be found in the schema.
The schema can be found in the ICAT API source code, in the icat3-jaxb module, under /src/uk/icat3/jaxb/icatXSD.xsd.
To create the required XML document, we have developed several tools for the different formats used at STFC facilities.
- The first one, writeRaw, is specific to ISIS, as it only reads the ISIS RAW format.
- The second one, nxingest, is more generic, as it reads NeXus format files and depends on a mapping file to find the necessary information.
nxingest creates the XML document but does not send it to ICAT. You need another tool to invoke the ICAT API.
Here is an extract from the help file nxingest.txt.
USAGE
nxingest mapping_file nexus_file [output_file]
DESCRIPTION
nxingest extracts the metadata from a NeXus file to create an XML file according to a mapping file. nxingest also has some reformatting capabilities, such as modifying dates, using the current time, merging several fields together, splitting sentences into keywords, ...
The mapping file defines the structure (names and hierarchy) and content (from the NeXus file, from the mapping file or from the current time) of the output file. See below for a description of the mapping file.
This tool uses the NeXus API, so any of the supported formats (HDF4, HDF5 and XML) can be read.
To be accepted by ICAT, the output XML should match the ICAT3 XML schema.
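If you want to drive nxingest from a script, the usage line above can be wrapped in a few lines of Python. This is only a minimal sketch: it assumes the nxingest executable is on your PATH, and the helper names and file names are made up for the illustration.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_nxingest_command(mapping_file: str, nexus_file: str,
                           output_file: Optional[str] = None) -> List[str]:
    """Build the argument list: nxingest mapping_file nexus_file [output_file]."""
    command = ["nxingest", mapping_file, nexus_file]
    if output_file is not None:
        command.append(output_file)
    return command


def run_nxingest(mapping_file: str, nexus_file: str, output_file: str) -> Path:
    """Run nxingest and return the path of the XML file it was asked to write.

    Assumes the nxingest executable is on the PATH. CalledProcessError is
    raised if the process exits with a non-zero status.
    """
    command = build_nxingest_command(mapping_file, nexus_file, output_file)
    subprocess.run(command, check=True)
    return Path(output_file)


# Hypothetical file names, shown only to illustrate the argument order.
print(build_nxingest_command("mapping.xml", "run.nxs", "icat.xml"))
```

The resulting XML file still has to be passed to ingestMetadata by a separate tool.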
MAPPING FILE SYNTAX
XML Nodes
The structure of the output file is determined by the nodes of the mapping file.
There are several types of node :
- 'tbl' or Table node, which defines the hierarchy of the output document.
e.g. the mapping : {icat type="tbl"}{study type="tbl"}{investigation type="tbl" trusted="false"}
is mapped into : {icat }{study}{investigation trusted="false"}
NB : XML snippets are not accepted in this blog, so the < and > characters are replaced by { and }.
- 'user_tbl' or User Table node is a special case where the node is scanned several times, according to the number of {NXuser} type classes present in the NeXus file. At each iteration, nxingest replaces the string {NXuser} with the correct name found in the file.
- 'tag' or record node, which defines a simple metadata record. It has 2 child nodes that contain the name of the output element and the source of the element.
e.g. the mapping : {record} {icat_name}name{/icat_name} {value type="nexus"} path_to_metadata {/value} {/record}
is mapped into : {name}metadata_from_nexus_file{/name}
- 'param_str' and 'param_num' or Parameters nodes, which define an element of the Parameters table (Dataset, DataFile or Sample).
e.g. the mapping : {parameter type="param_str"} {icat_name}name{/icat_name} {value type="nexus"} path_to_metadata {/value} {description type="fix"} fixed metadata description {/description} {/parameter}
is mapped into : {parameter} {name}name{/name} {string_value}metadata_from_nexus_file{/string_value} {units}N/A{/units} {description}fixed metadata description{/description} {/parameter}
- 'keyword_tag' splits the source into its individual words and fills the keyword table in the ICAT DB. A mapping with two NeXus datasets looks like : {keyword type="keyword_tag"} {icat_name} name {/icat_name} {value type="mix"} nexus:/{NXentry}/title | fix: , | nexus:/{NXentry}/notes {/value} {/keyword}
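To make the table and record nodes more concrete, here is a rough Python sketch of the transformation. It is not the nxingest implementation: it only handles 'tbl' nodes and record nodes, it uses real angle brackets, and a plain dictionary (FAKE_NEXUS, with invented content) stands in for the NeXus file.

```python
import xml.etree.ElementTree as ET

MAPPING = """
<icat type="tbl">
  <investigation type="tbl" trusted="false">
    <record>
      <icat_name>title</icat_name>
      <value type="nexus">/run/title</value>
    </record>
  </investigation>
</icat>
"""

# Stand-in for the NeXus file: path -> value (hypothetical content).
FAKE_NEXUS = {"/run/title": "Muon test run"}


def resolve(value_node):
    """Return the text for a 'fix' or 'nexus' value node."""
    text = (value_node.text or "").strip()
    if value_node.get("type") == "nexus":
        return FAKE_NEXUS[text]
    return text  # 'fix': the string comes from the mapping file itself


def convert(node):
    """Convert one mapping node into its output element."""
    if node.get("type") == "tbl":
        # Table node: keep the name and every attribute except 'type'.
        attributes = {k: v for k, v in node.attrib.items() if k != "type"}
        out = ET.Element(node.tag, attributes)
        for child in node:
            out.append(convert(child))
        return out
    # Anything else is treated as a record node in this sketch:
    # <icat_name> gives the element name, <value> gives its content.
    out = ET.Element(node.findtext("icat_name").strip())
    out.text = resolve(node.find("value"))
    return out


result = convert(ET.fromstring(MAPPING))
print(ET.tostring(result, encoding="unicode"))
```

The 'type' attribute is dropped from the table nodes while other attributes such as trusted="false" are carried over, as in the 'tbl' example above.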
The source of the metadata is defined by nodes of type 'fix', 'nexus',
'special' and 'mix'. If the type is 'special', the beginning of the text
contains a modifier (fix:, nexus:, time: or sys:). The value is then the text
without the modifier.
- 'fix' string from the mapping file itself.
- 'nexus' metadata is read from the neXus file according to the path
- 'special'
- 'fix:' same as above
- 'nexus:' same as above
- 'time:' the time can be expressed in multiple formats, so the value after the
modifier is composed of 3 parts :
time:source ; input_format ; output_format
- The source can be 'now' for the current time or 'nexus()' with the path to the time string between the parentheses.
- The input and output formats are optional. The software expects an integer. Currently the possible values are :
- '2007-05-23T12:48:05' (default)
- '2007-05-23 12:48:05'
- '2007-05-23'
- '12:48:05'
- '20070523'
- '200705'
- '2007'
- '23/05/2007'
- 'sys:' system information :
- sys:filename gives the filename of the NeXus file.
- sys:location gives the path of the NeXus file.
- sys:size gives the size in bytes of the NeXus file.
- 'mix' combines several sources: several of the modifiers used with node type 'special' are given, separated with '|'.
e.g. nexus:/{NXentry}/{NXinstrument}.short_name | fix:_ | time:now ; 0 ; 5
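The way a 'mix' value is put together can be sketched in Python as below. This is an illustration under assumptions, not the nxingest code: the format integer is assumed to be a zero-based index into the list of time formats above, whitespace around each piece is stripped, the 'sys:' modifier is left out, and a dictionary with invented content stands in for the NeXus file.

```python
from datetime import datetime

# strftime patterns in the order the formats are listed above.
# Assumption: the integer is a zero-based index into this list.
TIME_FORMATS = [
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
    "%H:%M:%S",
    "%Y%m%d",
    "%Y%m",
    "%Y",
    "%d/%m/%Y",
]


def pick_format(parts, position):
    """Return the pattern selected at `position`, or the default one."""
    if len(parts) > position and parts[position]:
        return TIME_FORMATS[int(parts[position])]
    return TIME_FORMATS[0]


def resolve_time(spec, nexus, now):
    """Resolve 'source ; input_format ; output_format'."""
    parts = [part.strip() for part in spec.split(";")]
    source = parts[0]
    if source == "now":
        moment = now
    elif source.startswith("nexus(") and source.endswith(")"):
        path = source[len("nexus("):-1]
        moment = datetime.strptime(nexus[path], pick_format(parts, 1))
    else:
        raise ValueError(f"unknown time source: {source}")
    return moment.strftime(pick_format(parts, 2))


def resolve_special(text, nexus, now):
    """Resolve one 'modifier:value' piece (fix, nexus or time only)."""
    modifier, _, value = text.strip().partition(":")
    if modifier == "fix":
        return value
    if modifier == "nexus":
        return nexus[value]
    if modifier == "time":
        return resolve_time(value, nexus, now)
    raise ValueError(f"unsupported modifier: {modifier}")


def resolve_mix(text, nexus, now):
    """Concatenate the pieces of a 'mix' value separated by '|'."""
    return "".join(resolve_special(piece, nexus, now) for piece in text.split("|"))


# Hypothetical file content and a fixed "current" time for the example.
nexus = {"/run/MUSR.short_name": "MUSR", "/run/start": "2007-05-23 12:48:05"}
now = datetime(2007, 5, 23, 12, 48, 5)
print(resolve_mix("nexus:/run/MUSR.short_name | fix:_ | time:now ; 0 ; 5", nexus, now))
```

With the zero-based assumption, format 5 is the year and month, so the example builds an instrument name followed by an underscore and '200705'.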
NeXus syntax.
NeXus data is divided into different classes that hold datasets. A dataset may hold any type of data, from a single byte to arrays of unlimited dimension. The datasets and the classes may also have attributes.
To collect data from a NeXus file, you have to build the path to the data you want.
- Dataset (singular string or number)
The path is made of the names of the different classes separated by '/'; the last name is the name of the dataset.
e.g. /run/title
- Attribute (singular string or number)
The attribute name is separated from the dataset by a '.'
e.g. /run/data.units
- Arrays
Most of the data will be stored as multi-dimensional arrays. We may want to extract particular information from the data.
- Specific value from an array
A zero or positive number between square brackets after the dataset name. nxingest treats every dataset as one-dimensional.
e.g. /run/data_array[3]
- Derived value
nxingest can derive a few values from an array. To express that, put the name of the derived parameter between square brackets. Available values are :
- [AVG] Average
- [STD] Standard Deviation
- [MIN] Minimum Value
- [MAX] Maximum Value
- [SUM] Sum of all values
- Generic classes
NeXus defines generic class types that users can name freely. nxingest can use some of these to generalise the mapping files for similar instruments.
When the class type is written between curly brackets, like {NXentry}, the program substitutes it with the actual class name from the current file.
This is currently only available for {NXentry}, {NXinstrument} and {NXuser}.
e.g. /{NXentry}/{NXinstrument}/source/name is equivalent to /run/MUSR/source/name and /entry_0/I18/source/name
There may also be more than one user defined in a NeXus file. nxingest will loop over each of them if the mapping includes a special node 'user_tbl'.
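The path syntax can be illustrated with a small Python sketch. It does not read a real NeXus file: two dictionaries with invented content stand in for the datasets and attributes, array indices are assumed to be zero-based, and [STD] is assumed to be the population standard deviation, which the help file does not specify.

```python
import re
import statistics

# Derived values that can be requested between square brackets.
DERIVED = {
    "AVG": statistics.mean,
    "STD": statistics.pstdev,  # assumption: population standard deviation
    "MIN": min,
    "MAX": max,
    "SUM": sum,
}

# dataset path, optional '.attribute', optional '[index]' or '[DERIVED]'
PATH_RE = re.compile(
    r"^(?P<dataset>[^.\[\]]+)(?:\.(?P<attr>[^.\[\]]+))?(?:\[(?P<sel>\w+)\])?$"
)


def substitute_classes(path, class_names):
    """Replace {NXentry}-style placeholders with the names used in this file."""
    for class_type, name in class_names.items():
        path = path.replace("{" + class_type + "}", name)
    return path


def read(path, datasets, attributes, class_names):
    """Return the value a path points to in the stand-in dictionaries."""
    match = PATH_RE.match(substitute_classes(path, class_names))
    if match is None:
        raise ValueError(f"unsupported path: {path}")
    name, attr, selector = match.group("dataset", "attr", "sel")
    if attr is not None:
        return attributes[(name, attr)]
    data = datasets[name]
    if selector is None:
        return data
    if selector.isdigit():
        return data[int(selector)]  # assumption: zero-based index
    return DERIVED[selector](data)


# Hypothetical file content.
datasets = {"/run/title": "Test run", "/run/data_array": [1.0, 2.0, 3.0, 6.0]}
attributes = {("/run/data", "units"): "counts"}
class_names = {"NXentry": "run", "NXinstrument": "MUSR"}

print(read("/{NXentry}/title", datasets, attributes, class_names))
print(read("/run/data_array[AVG]", datasets, attributes, class_names))
```

In this sketch the generic class names are given by hand in class_names; nxingest itself finds them in the file being read.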