Expressions
Learn about types of expressions in JMESPath
An expression is a combination of values, variables, operators and calls to functions that is evaluated to produce a value.
Most fixed values passed as function arguments can be replaced with an expression that returns a value of the corresponding data type.
JSON
JSON (JavaScript Object Notation) is a simple and widely used file format for storing and exchanging data in a human-readable and machine-readable way. In other words, JSON files are made up of plain text. You can open them with a simple text editor or view them in your web browser.
JSON files are organized into objects consisting of key-value pairs, also called properties. Each piece of data (a value) is associated with a label (a key). Values can take different formats, such as text (strings), numbers, true/false values (Booleans), null (for empty or missing data), arrays and objects themselves, while keys are always strings. We occasionally use a double slash to denote a comment, but it is not valid JSON. Here's an example of a JSON file:
{
"name": "Jane",
"age": 30,
"isStudent": false,
"hobbies": ["Reading", "Hiking", "Basketball"],
"address": {
"street": "1 Main St",
"city": "Anytown",
"zipCode": "1234"
}
}
JMESPath
JMESPath is a query language for JSON. It extracts and transforms elements of a JSON document, which makes it a useful tool for automation. The JMESPath syntax consists of defining an output JSON by listing keys and attaching a value to each of them. Keys are strings, which can be written without double quotes if they contain no special characters, and values can be expressions of any type. Expressions are the operations that transform data in JMESPath. Every valid JMESPath expression evaluated on valid JSON yields a valid JSON value.
Here is an overview of JMESPath expression types.
Identifier expressions
To fetch a value from an object that the current scope is set to, use the property identifier, that is, the key name. For instance, name will fetch the value associated with the key name from the object. Calling non-existent properties will return null.
{"name": "Jane", "age": 30}
Expression | Result |
---|---|
name | "Jane" |
age | 30 |
address | null |
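As a rough Python analogy (not the Documotor implementation), an identifier expression behaves like a dictionary lookup that yields None instead of raising an error when the key is missing. The helper name identifier is hypothetical, and JSON null is represented by Python's None.

```python
# Rough analogy only: JSON objects map to Python dicts, null maps to None.
def identifier(scope, key):
    """Return the value stored under `key`, or None when the scope is not
    an object or the key does not exist."""
    if not isinstance(scope, dict):
        return None
    return scope.get(key)


data = {"name": "Jane", "age": 30}

print(identifier(data, "name"))     # Jane
print(identifier(data, "age"))      # 30
print(identifier(data, "address"))  # None
```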
Subexpressions
&leftExpression.&rightExpression
Subexpressions offer one way to chain multiple expressions. &leftExpression is evaluated on the current scope, and the scope for &rightExpression is set to the result of &leftExpression. &rightExpression is then evaluated and the result is returned. A subexpression is still an expression, so subexpressions can be chained together. Note that &rightExpression can only be an identifier, a multiselect list or hash as documented below, or a function expression.
{
"person": {
"name": "Jane",
"surname": "Doe",
"age": 30
}
}
Expression | Result |
---|---|
person.name | "Jane" |
person.{name: name, surname: surname} | {"name": "Jane", "surname": "Doe"} |
person.[name, surname] | ["Jane", "Doe"] |
multiply(person.age, 2).subtract(@, 45) | 15 |
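The scoping rule of subexpressions can be sketched in Python as function composition: the left side is evaluated first, and its result becomes the scope of the right side. The helpers identifier and subexpression are hypothetical names used only for illustration.

```python
def identifier(key):
    """Build a function that looks `key` up in whatever scope it receives."""
    def evaluate(scope):
        return scope.get(key) if isinstance(scope, dict) else None
    return evaluate


def subexpression(left, right):
    """Evaluate `left` on the scope, then `right` on the result of `left`."""
    def evaluate(scope):
        return right(left(scope))
    return evaluate


data = {"person": {"name": "Jane", "surname": "Doe", "age": 30}}

# person.name
person_name = subexpression(identifier("person"), identifier("name"))
print(person_name(data))  # Jane

# A subexpression is itself an expression, so it can be chained further.
# Here the new scope is the string "Jane", which has no properties.
chained = subexpression(person_name, identifier("surname"))
print(chained(data))  # None
```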
Index expressions
array[n]
Used for accessing elements of an array by adding an index in brackets to the array name, e.g., array[0]. Indexing is zero-based, so the index 0 refers to the first element of the array. Negative integers are also valid indices; the index -n accesses the nth element from the end of the array. An index greater than or equal to the length of the array, or less than the negative length of the array, results in null.
{"sampleArray":[0, 1, 2, 3, 4, 5]}
Expression | Result |
---|---|
sampleArray[0] | 0 |
sampleArray[2] | 2 |
sampleArray[-1] | 5 |
sampleArray[100] | null |
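The bounds rule can be sketched in Python, where list indexing is also zero-based and accepts negative indices. The difference is that Python raises an IndexError for an out-of-range index, so the hypothetical helper index_or_none below checks the bounds and returns None instead.

```python
def index_or_none(array, n):
    """Return array[n], or None when n is out of range or the value
    being indexed is not a list."""
    if not isinstance(array, list):
        return None
    if -len(array) <= n < len(array):
        return array[n]
    return None


sample_array = [0, 1, 2, 3, 4, 5]

print(index_or_none(sample_array, 0))    # 0
print(index_or_none(sample_array, 2))    # 2
print(index_or_none(sample_array, -1))   # 5
print(index_or_none(sample_array, 100))  # None
```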
Slices
array[start:stop:step]
Slices return a subset of an array. The general format is array[start:stop:step], but each of the three parameters is optional and can be omitted.
Note: Slices in JMESPath have the same semantics as Python slices. If you're familiar with Python slices, you're familiar with JMESPath slices.
The start parameter indicates the index at which to start fetching elements, stop indicates the first index that will not be returned, and step indicates the step between consecutive elements fetched. By default, step is equal to one, so an array of all elements between array[start] (included) and array[stop] (not included) is returned.
Slice expressions adhere to the following rules:
- Negative start and stop parameters behave like negative indices.
- If no start is given, it is assumed to be zero if step is positive, or the length of the array if step is negative.
- If no stop is given, it is assumed to be the length of the array if step is positive, or zero if step is negative.
- If step is omitted, it is assumed to be one.
- If step is zero, an error is raised.
- If the element being sliced is not an array, the result is null.
If the element being sliced is an array, but there are no elements within the slice (e.g., start is greater than the length of the array), the result is an empty array.
{"sampleArray":[0, 1, 2, 3, 4, 5]}
Expression | Result |
---|---|
sampleArray[1:4] | [1, 2, 3] |
sampleArray[3:] | [3, 4, 5] |
sampleArray[:-2:2] | [0, 2] |
sampleArray[-1:2:-1] | [5, 4, 3] |
sampleArray[3:7] | [3, 4, 5] |
sampleArray[-100:-90] | [] |
sampleArray[::0] | null |
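Since slices follow Python slice semantics, most rows of the table can be reproduced with plain Python list slicing. The sketch below wraps that in a hypothetical helper, jmes_slice, which adds the rule that slicing a non-array yields null (None). A step of zero raises a ValueError in Python.

```python
def jmes_slice(value, start=None, stop=None, step=None):
    """Slice a list with Python semantics; return None for non-lists.
    Omitted parameters are passed as None, as in value[start:stop:step]."""
    if not isinstance(value, list):
        return None
    return value[slice(start, stop, step)]


sample_array = [0, 1, 2, 3, 4, 5]

print(jmes_slice(sample_array, 1, 4))         # [1, 2, 3]
print(jmes_slice(sample_array, 3))            # [3, 4, 5]
print(jmes_slice(sample_array, None, -2, 2))  # [0, 2]
print(jmes_slice(sample_array, -1, 2, -1))    # [5, 4, 3]
print(jmes_slice(sample_array, 3, 7))         # [3, 4, 5]
print(jmes_slice(sample_array, -100, -90))    # []
print(jmes_slice("not an array", 0, 2))       # None
```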
Wildcard Expressions
object.*.expression, or array[*].expression
A wildcard expression, or projection, is an expression of either *, applied to an object, or [*], applied to an array. They are called wildcards since the asterisk can be interpreted to stand for all the property names or indices at once.
The wildcard expressions are used to apply other expressions, especially identifier expressions and subexpressions, to all the properties of an object, or all the elements of an array at the same time. To that end, the return type is an array of the results of subsequent expressions applied to each property of an object or each element of an array.
We say that the subsequent expressions are projected onto the elements of the resulting list. This means that wildcard expressions change the underlying data type into a projection, since JMESPath has to keep applying expressions to all the elements, rather than to the entire array or object. Pipe expressions enable converting back to objects when this behavior is undesirable.
The elements or properties that evaluate to null do not yield an element in the resulting array. If the expressions * or [*] are applied to any data type other than an object or an array, respectively, the result is null. The elements of an array are returned in order, but there is no guarantee on the order of properties when using a wildcard on an object.
{
"clients": [
{"name": "Jane", "age": 30},
{"name": "John", "age": 25},
{"name": "Jill"}
],
"employees":{
"CEO":{"name": "Julia"},
"Assistant":{"name": "Jack"},
"Tech Lead":{"name": "Jude"}
}
}
Expression | Result |
---|---|
clients[*].name | ["Jane", "John", "Jill"] |
clients[*].age | [30, 25] |
clients[*].dateOfBirth | [] |
employees.*.name | ["Julia", "Jack", "Jude"] |
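A projection can be sketched in Python as applying a function to every element and dropping the None results. The helper names project, list_wildcard and object_wildcard are hypothetical. Python dicts happen to preserve insertion order, which is not something the wildcard on an object guarantees.

```python
def project(elements, expression):
    """Apply `expression` to each element and drop results that are None."""
    results = (expression(element) for element in elements)
    return [result for result in results if result is not None]


def list_wildcard(value, expression):
    """Sketch of array[*].expression; None when the value is not a list."""
    if not isinstance(value, list):
        return None
    return project(value, expression)


def object_wildcard(value, expression):
    """Sketch of object.*.expression; None when the value is not a dict."""
    if not isinstance(value, dict):
        return None
    return project(value.values(), expression)


data = {
    "clients": [
        {"name": "Jane", "age": 30},
        {"name": "John", "age": 25},
        {"name": "Jill"},
    ],
    "employees": {
        "CEO": {"name": "Julia"},
        "Assistant": {"name": "Jack"},
        "Tech Lead": {"name": "Jude"},
    },
}

print(list_wildcard(data["clients"], lambda c: c.get("name")))         # ['Jane', 'John', 'Jill']
print(list_wildcard(data["clients"], lambda c: c.get("age")))          # [30, 25]
print(list_wildcard(data["clients"], lambda c: c.get("dateOfBirth")))  # []
print(object_wildcard(data["employees"], lambda e: e.get("name")))
```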
Pipe expressions
wildcardExpression | expression
The pipe (|) expression combines two expressions, one on each side of the pipe character. Its behavior is identical to subexpressions, except for two differences:
- A pipe-expression stops projections on the left hand side from propagating to the right hand side after applying a wildcard. If the left expression creates a projection, it does not apply to the right hand side. This is what makes the pipe valuable; sometimes it's still necessary to apply expressions to the entire result after using a wildcard, instead of transforming the elements individually.
- Any type of expression can be used after the pipe, whereas the subexpressions are more restrictive.
{
"synonymsOfWords": [
{"dark": ["dim", "murky", "black"]},
{"light": ["glow", "glare", "gleam"]},
{"light": ["lightweight", "small", "weightless", "feathery"]},
{"light": ["smooth", "easy", "simple"]}
]
}
Expression | Result | Interpretation |
---|---|---|
synonymsOfWords[*].light \| [0] | ["glow", "glare", "gleam"] | All the synonyms for the most common meaning. |
synonymsOfWords[*].light[0] | ["glow", "lightweight", "smooth"] | The best synonym for each meaning. |
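The difference between the two rows can be mimicked in Python. This is only an analogy: the list comprehension plays the role of the projection, and the variable names are made up for the illustration.

```python
synonyms_of_words = [
    {"dark": ["dim", "murky", "black"]},
    {"light": ["glow", "glare", "gleam"]},
    {"light": ["lightweight", "small", "weightless", "feathery"]},
    {"light": ["smooth", "easy", "simple"]},
]

# synonymsOfWords[*].light: project "light" onto each element, dropping
# the elements that have no such property.
light_lists = [entry["light"] for entry in synonyms_of_words if "light" in entry]

# With a pipe, the projection stops and [0] indexes the whole result.
piped = light_lists[0]

# Without a pipe, [0] is applied to each projected element in turn.
projected = [synonyms[0] for synonyms in light_lists]

print(piped)      # ['glow', 'glare', 'gleam']
print(projected)  # ['glow', 'lightweight', 'smooth']
```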
Flatten Operator
arrayOfArrays[]
The flatten operator simplifies the structure of arrays of arrays. It merges subarrays of the parent array into a single array. The result of the flattening operator on an array is obtained as follows:
- Create an empty result array.
- Iterate over the elements of the current array.
- If the current element is not an array, add it to the end of the result array.
- If the current element is an array, add each element of the current element to the end of the result array.
- The resulting array is returned.
Once the flattening operation has been performed, subsequent operations are projected onto the flattened array with the same semantics as a wildcard expression. Therefore, if there are no nested arrays, [*] and [] are equivalent.
{
"books": [
{"title": "Book 1", "authors": ["Author 1", "Author 2"]},
{"title": "Book 2", "authors": ["Author 3"]}
]
}
Expression | Result |
---|---|
books[].authors | [["Author 1", "Author 2"], ["Author 3"]] |
books[].authors[] | ["Author 1", "Author 2", "Author 3"] |
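The flattening steps listed above translate almost directly into Python. The function name flatten is a hypothetical helper, and the sketch assumes its input is a list. Only one level of nesting is removed per application.

```python
def flatten(array):
    """Merge the sub-lists of `array` into a single list, one level deep."""
    result = []                     # create an empty result array
    for element in array:           # iterate over the current array
        if isinstance(element, list):
            result.extend(element)  # add each element of the sub-array
        else:
            result.append(element)  # add the non-array element itself
    return result


books = [
    {"title": "Book 1", "authors": ["Author 1", "Author 2"]},
    {"title": "Book 2", "authors": ["Author 3"]},
]

# books[].authors: flatten books (nothing to merge), then project authors.
authors = [book["authors"] for book in flatten(books)]
print(authors)           # [['Author 1', 'Author 2'], ['Author 3']]

# books[].authors[]: flatten the projected result once more.
print(flatten(authors))  # ['Author 1', 'Author 2', 'Author 3']

# Deeper nesting survives a single flatten.
print(flatten([1, [2, [3, 4]]]))  # [1, 2, [3, 4]]
```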
Multiselect List
object.[properties]
An expression that creates an array from an input object. Each element of the array is fetched by evaluating an expression against the input object, so the syntax consists of selecting the needed expressions separated by a comma and enclosing them in square brackets, the standard array notation.
{
"name": "Jane",
"age": 30,
"isStudent": false,
"hobbies": ["Reading", "Hiking", "Basketball"],
"address": {
"street": "1 Main St",
"city": "Anytown",
"zipCode": "1234"
}
}
Expression | Result |
---|---|
[name, age] | ["Jane", 30] |
[address.street, address.city, address.zipCode] | ["1 Main St", "Anytown", "1234"] |
[hobbies[0], hobbies[-1]] | ["Reading", "Basketball"] |
Multiselect Hash
object.{key: valueExpression}
An expression that creates a new object from an input object. The name hash comes from hash map, a standard implementation of JSON-type objects. The keys need to be specified as static strings, while the values are the results of evaluating expressions against the input object. The syntax follows the standard object notation: inside braces {}, define key-value pairs where the keys are strings and the values are expressions, connected by a colon :. Different key-value pairs are separated by a comma.
When evaluating on the same data as in Multiselect List:
Expression | Result |
---|---|
{name: name, age: age} | {"name": "Jane", "age": 30} |
{name: name, city: address.city} | {"name": "Jane", "city": "Anytown"} |
{name: name, lastHobbies: hobbies[1:]} | {"name": "Jane", "lastHobbies": ["Hiking", "Basketball"]} |
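In Python terms, a multiselect list resembles building a list from several lookups on the same object, and a multiselect hash resembles building a dict with fixed keys. The sketch below is an analogy under that assumption and reuses the sample data from this section.

```python
person = {
    "name": "Jane",
    "age": 30,
    "isStudent": False,
    "hobbies": ["Reading", "Hiking", "Basketball"],
    "address": {"street": "1 Main St", "city": "Anytown", "zipCode": "1234"},
}

# [name, age]
name_and_age = [person["name"], person["age"]]

# [hobbies[0], hobbies[-1]]
first_and_last_hobby = [person["hobbies"][0], person["hobbies"][-1]]

# {name: name, city: address.city}
name_and_city = {"name": person["name"], "city": person["address"]["city"]}

# {name: name, lastHobbies: hobbies[1:]}
name_and_hobbies = {"name": person["name"], "lastHobbies": person["hobbies"][1:]}

print(name_and_age)          # ['Jane', 30]
print(first_and_last_hobby)  # ['Reading', 'Basketball']
print(name_and_city)         # {'name': 'Jane', 'city': 'Anytown'}
print(name_and_hobbies)      # {'name': 'Jane', 'lastHobbies': ['Hiking', 'Basketball']}
```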
Literal Expressions
Literal expressions allow arbitrary JSON constants to be specified. The constants need to be enclosed in backticks, e.g., `true`, `null`, `[0, 1]` or `"text"`.
To define string literals, one can use single quotes '' instead of a combination of backticks and double quotes. When inserted into the document, single-quoted strings include the exact string enclosed in them, which is why they are called raw string literals. If you need characters like newline \n or Unicode escape sequences, you should use the combination of backticks and double quotes.
The strings created by using raw string literals are going to have double backslashes in the result window of the UI - this is how the escape sequences are interpreted as literals.
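The difference between the two kinds of string literal can be illustrated with Python's json module. This is an analogy, not the Documotor parser: the content of a backtick literal is treated as JSON text, while a raw string literal is modeled with a Python raw string. The sample text is made up.

```python
import json

# Content of a backtick literal with double quotes, parsed as JSON:
# the \n escape sequence becomes a real newline character.
json_literal = json.loads(r'"first\nsecond"')

# Content of a single-quoted raw string literal: kept exactly as typed,
# so it holds a backslash followed by the letter n.
raw_literal = r"first\nsecond"

print(len(json_literal))  # 12
print(len(raw_literal))   # 13

# Serializing the raw string back to JSON escapes the backslash, which is
# why a double backslash shows up in the rendered result.
print(json.dumps(raw_literal))  # "first\\nsecond"
```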
Filter Expressions
array[?condition]
Filter expressions are used for selecting elements from an array that meet some criteria, i.e., that yield true when evaluated by a given expression. A filter expression is evaluated as follows:
- For each element in an array, evaluate the expression against the element.
- If the expression yields a truthy value, the element (in its entirety) is added to the result list. Otherwise, it is excluded from the result list and we move on to evaluating the next element.
{
"numbers": [1, 2, 3, 4, 5],
"names": ["Jane", "John", "Julia"]
}
Expression | Result |
---|---|
numbers[?@ > 2] | [3, 4, 5] |
numbers[?@ != 3] | [1, 2, 4, 5] |
names[?length(@) == 4] | ["Jane", "John"] |
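The evaluation steps above correspond closely to a Python list comprehension. In this sketch the hypothetical helper filter_array takes the condition as a function of the current element (the role played by @). The conditions used here return plain booleans, so the sketch does not depend on how truthiness is defined for other value types.

```python
def filter_array(array, condition):
    """Keep the elements for which `condition` holds, in original order.
    Return None when the value being filtered is not a list."""
    if not isinstance(array, list):
        return None
    return [element for element in array if condition(element)]


data = {
    "numbers": [1, 2, 3, 4, 5],
    "names": ["Jane", "John", "Julia"],
}

print(filter_array(data["numbers"], lambda x: x > 2))      # [3, 4, 5]
print(filter_array(data["numbers"], lambda x: x != 3))     # [1, 2, 4, 5]
print(filter_array(data["names"], lambda x: len(x) == 4))  # ['Jane', 'John']
```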
Comparison and logical operators
Operators like < and similar also perform operations on fetched data and transform it. You can read more about them on the Operators page.
Parentheses Expressions
The default token precedence from weakest to tightest binding is:
- pipe: |
- or: ||
- and: &&
- unary not: !
- right bracket: ]
Just like in mathematical notation, parentheses expressions allow you to override the order of execution of tokens.
{"a": false, "b": false}
Expression | Result |
---|---|
!a && b | false |
!(a && b) | true |
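Python's boolean operators follow the same relative order for not, and and or, so the two rows of the table can be checked with an equivalent Python snippet. The variable names mirror the sample data.

```python
a, b = False, False

# "not" binds tighter than "and", so this reads as (not a) and b.
without_parentheses = not a and b

# Parentheses force the "and" to be evaluated first.
with_parentheses = not (a and b)

print(without_parentheses)  # False
print(with_parentheses)     # True
```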
Context operators
The Documotor implementation of JMESPath offers several operators whose output depends on the data used, and the current scope of data. Here's a short overview.
Current scope (@)
The current scope token @ represents the current node, i.e., object, being evaluated. By default, the current scope is set to the object encompassing the entire JSON file supplied to the data transformation as sample data. This implicit scoping is how identifier expressions and subexpressions access the properties without needing to specify which object they are in. In a way, the . operator changes the scope to the object identified by the expression preceding it.
The scope changes inside certain functions. For example, the map function acts on an array by setting the scope to each of its elements in turn and applying a given expression to them. The @ operator is especially valuable in situations like this. Additionally, the parent function is often used in conjunction with @.
Root (@@)
Contrary to the current scope operator, the root operator @@ always accesses the entire data payload that's provided to the transformation. Most often, this is identical to @, but @@ is useful when the scope is changed inside certain functions, like the map function mentioned above. Occasionally, the general properties of the ancestor object are needed rather than accessing only the element in the current scope.
Current output ($)
The @ and @@ operators access subsets of input data, but the current output operator $ accesses the data that's already been transformed up until the point when the operator is used. This is useful for splitting larger operations into multiple lines and avoiding repeating the operations that yielded the already transformed data.
Here are some use cases for these operators:
{
"exchangeRateEURToDKK": 7.5,
"fruit": [
{"name": "Apple", "priceEUR": 0.5},
{"name": "Banana", "priceEUR": 1}
]
}
{
entireObject: @,
"@@IsTheSameNow": @==@@,
fruitNames: map(@.name, fruit),
fruitPricesDKK: map(multiply(@.priceEUR, @@.exchangeRateEURToDKK), fruit),
fruitNamesAgain: $.fruitNames
}
{
"entireObject":{
"exchangeRateEURToDKK":7.5,
"fruit":[
{"name":"Apple","priceEUR":0.5,"__index":0},
{"name":"Banana","priceEUR":1,"__index":1}
]
},
"@@IsTheSameNow": true,
"fruitNames":["Apple","Banana"],
"fruitPricesDKK":[3.75,7.5],
"fruitNamesAgain":["Apple","Banana"]
}
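The roles of the three operators in this example can be mirrored in Python. This is a sketch of the idea rather than of the Documotor engine: root stands in for @@, the loop variable item stands in for @ inside map, and the partially built output dict stands in for $. The __index properties that appear in the output above are not reproduced.

```python
root = {
    "exchangeRateEURToDKK": 7.5,
    "fruit": [
        {"name": "Apple", "priceEUR": 0.5},
        {"name": "Banana", "priceEUR": 1},
    ],
}

output = {}

# map(@.name, fruit): inside the loop, the scope is each fruit in turn.
output["fruitNames"] = [item["name"] for item in root["fruit"]]

# map(multiply(@.priceEUR, @@.exchangeRateEURToDKK), fruit): the price comes
# from the current element, the exchange rate from the root payload.
output["fruitPricesDKK"] = [
    item["priceEUR"] * root["exchangeRateEURToDKK"] for item in root["fruit"]
]

# $.fruitNames: reuse a value that has already been produced.
output["fruitNamesAgain"] = output["fruitNames"]

print(output["fruitNames"])       # ['Apple', 'Banana']
print(output["fruitPricesDKK"])   # [3.75, 7.5]
print(output["fruitNamesAgain"])  # ['Apple', 'Banana']
```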
Function Expressions
Most functions modify data and return it, which is why they are considered expressions as well. They can also be chained and combined with other expressions. Find out more about functions from the navigation pane on the left, where they are grouped by purpose.