User Guide¶
Your First Module¶
import argschema
class MySchema(argschema.ArgSchema):
a = argschema.fields.Int(default = 42, description= 'my first parameter')
if __name__ == '__main__':
mod = argschema.ArgSchemaParser(schema_type=MySchema)
print(mod.args)
running this code produces
$ python mymodule.py
{'a': 42, 'log_level': u'ERROR'}
$ python mymodule.py --a 2
{'a': 2, 'log_level': u'ERROR'}
$ python mymodule.py --a 2 --log_level WARNING
{'a': 2, 'log_level': u'WARNING'}
WARNING:argschema.argschema_parser:this program does nothing useful
$ python mymodule.py -h
usage: mymodule.py [-h] [--a A] [--output_json OUTPUT_JSON]
[--log_level LOG_LEVEL] [--input_json INPUT_JSON]
optional arguments:
-h, --help show this help message and exit
--a A my first parameter
--output_json OUTPUT_JSON
file path to output json file
--log_level LOG_LEVEL
set the logging level of the module
--input_json INPUT_JSON
file path of input json file
Great you are thinking, that is basically argparse, congratulations!
But there is more.. you can also give your module a dictionary in an interactive session
>>> from argschema import ArgSchemaParser
>>> from mymodule import MySchema
>>> d = {'a':5}
>>> mod = ArgSchemaParser(input_data=d,schema_type=MySchema)
>>> print(mod.args)
{'a': 5, 'log_level': u'ERROR'}
or you write out a json file and pass it the path on the command line
{
"a":99
}
$ python mymodule.py --input_json myinput.json
{'a': 99, 'log_level': u'ERROR', 'input_json': u'myinput.json'}
or override a parameter if you want
$ python mymodule.py --input_json myinput.json --a 100
{'a': 100, 'log_level': u'ERROR', 'input_json': u'myinput.json'}
plus, no matter how you give it parameters, they will always be validated, before any of your code runs.
Whether from the command line
$ python mymodule.py --input_json myinput.json --a 5!
usage: mymodule.py [-h] [--a A] [--output_json OUTPUT_JSON]
[--log_level LOG_LEVEL] [--input_json INPUT_JSON]
mymodule.py: error: argument --a: invalid int value: '5!'
or from a dictionary
>>> from argschema import ArgSchemaParser
>>> from mymodule import MySchema
>>> d={'a':'hello'}
>>> mod = ArgSchemaParser(input_data=d,schema_type=MySchema)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/forrestcollman/argschema/argschema/argschema_parser.py", line 159, in __init__
raise mm.ValidationError(json.dumps(result.errors, indent=2))
marshmallow.exceptions.ValidationError: {
"a": [
"Not a valid integer."
]
}
Fields¶
argschema uses marshmallow (http://marshmallow.readthedocs.io/)
under the hood to define the parameters schemas. It comes with a basic set of fields
that you can use to define your schemas. One powerful feature of Marshmallow is that you
can define custom fields that do arbitrary validation.
fields
contains all the built-in marshmallow fields,
but also some useful custom ones,
such as InputFile
,
OutputFile
,
InputDir
that validate that the paths exist and have the proper
permissions to allow files to be read or written.
Other fields, such as NumpyArray
will deserialize ordered lists of lists
directly into a numpy array of your choosing.
Finally, an important Field to know is Nested
, which allows you to define
heirarchical nested structures. Note, that if you use Nested schemas, your Nested schemas should
subclass DefaultSchema
in order that they properly fill in default values,
as marshmallow.Schema
does not do that by itself.
The template_module example shows how you might combine these features to define a more complex parameter structure.
from argschema import ArgSchemaParser, ArgSchema
from argschema.fields import OutputFile, NumpyArray, Boolean, Int, Str, Nested
from argschema.schemas import DefaultSchema
import numpy as np
import json
# these are the core parameters for my module
class MyNestedParameters(DefaultSchema):
name = Str(required=True, description='name of vector')
increment = Int(required=True, description='value to increment')
array = NumpyArray(dtype=np.float, required=True, description='array to increment')
write_output = Boolean(required=False, default=True)
# but i'm going to nest them inside a subsection called inc
class MyParameters(ArgSchema):
inc = Nested(MyNestedParameters)
#this is another schema we will use to validate and deserialize our output
class MyOutputParams(DefaultSchema):
name = Str(required=True, description='name of vector')
inc_array = NumpyArray(dtype=np.float, required=True, description='incremented array')
if __name__ == '__main__':
# this defines a default dictionary that will be used if input_json is not specified
example_input = {
"inc": {
"name": "from_dictionary",
"increment": 5,
"array": [0, 2, 5],
"write_output": True
},
"output_json": "output_dictionary.json"
}
# here is my ArgSchemaParser that processes my inputs
mod = ArgSchemaParser(input_data=example_input,
schema_type=MyParameters,
output_schema_type=MyOutputParams)
# pull out the inc section of the parameters
inc_params = mod.args['inc']
# do my simple addition of the parameters
inc_array = inc_params['array'] + inc_params['increment']
# define the output dictionary
output = {
'name': inc_params['name'],
'inc_array': inc_array
}
# if the parameters are set as such write the output
if inc_params['write_output']:
mod.output(output)
so now if run the example commands found in run_template.sh
{
"inc": {
"name": "from_json",
"increment": 1,
"array": [3, 2, 1],
"write_output": true
}
}
$ python template_module.py \
--output_json output_command.json \
--inc.name from_command \
--inc.increment 2
{u'name': u'from_command', u'inc_array': [2.0, 4.0, 7.0]}
$ python template_module.py \
--input_json input.json \
--output_json output_fromjson.json
{u'name': u'from_json', u'inc_array': [4.0, 3.0, 2.0]}
$ python template_module.py
{u'name': u'from_dictionary', u'inc_array': [5.0, 7.0, 10.0]}
Sphinx Documentation¶
argschema comes with a autodocumentation feature for Sphnix which will help you automatically add documentation of your Schemas and ArgSchemaParser classes in your project. This is how the documentation of the test suite included here was generated.
To configure sphnix to use this function, you must be using the sphnix autodoc module and add the following to your conf.py file
from argschema.autodoc import process_schemas
def setup(app):
app.connect('autodoc-process-docstring',process_schemas)