Skip to main content

[edit on GitHub]

Search indexes allow queries to be made for any type of data that is indexed by the Chef Infra Server, including data bags (and data bag items), environments, nodes, and roles. A defined query syntax is used to support search patterns like exact, wildcard, range, and fuzzy. A search is a full-text query that can be done from several locations, including from within a recipe, by using the search subcommand in knife, the search method in the Chef Infra Language, the search box in the Chef management console, and by using the /search or /search/INDEX endpoints in the Chef Infra Server API. The search engine is based on Elasticsearch and is run from the Chef Infra Server.

Use the search method to perform a search query against the Chef Infra Server from within a recipe.

The syntax for the search method is as follows:

search(:index, 'query')

where:

  • :index is of name of the index on the Chef Infra Server against which the search query will run: :client, :data_bag_name, :environment, :node, and :role
  • 'query' is a valid search query against an object on the Chef Infra Server (see below for more information about how to build the query)

For example, using the results of a search query within a variable:

webservers = search(:node, 'role:webserver')

and then using the results of that query to populate a template:

template '/tmp/list_of_webservers' do
  source 'list_of_webservers.erb'
  variables(:webservers => webservers)
end

:filter_result

Use :filter_result as part of a search query to filter the search output based on the pattern specified by a Hash. Only attributes in the Hash will be returned.

The syntax for the search method that uses :filter_result is as follows:

search(:index, 'query',
  filter_result: { 'foo' => [ 'abc' ],
                      'bar' => [ '123' ],
                      'baz' => %w(sea power),
                    }
).each do |result|
  puts result['foo']
  puts result['bar']
  puts result['baz']
end

where:

  • :index is of name of the index on the Chef Infra Server against which the search query will run: :client, :data_bag_name, :environment, :node, and :role
  • 'query' is a valid search query against an object on the Chef server
  • :filter_result defines a Hash of values to be returned

For example:

search(:node, 'role:web',
  filter_result: { 'name' => [ 'name' ],
                      'ip' => [ 'ipaddress' ],
                      'kernel_version' => %w(kernel version),
                    }
).each do |result|
  puts result['name']
  puts result['ip']
  puts result['kernel_version']
end

Query Syntax

A search query is comprised of two parts: the key and the search pattern. A search query has the following syntax:

key:search_pattern

where key is a field name that is found in the JSON description of an indexable object on the Chef Infra Server (a role, node, client, environment, or data bag) and search_pattern defines what will be searched for, using one of the following search patterns: exact, wildcard, range, or fuzzy matching. Both key and search_pattern are case-sensitive; key has limited support for multiple character wildcard matching using an asterisk ("*") (and as long as it is not the first character).

Keys

A field name/description pair is available in the JSON object. Use the field name when searching for this information in the JSON object. Any field that exists in any JSON description for any role, node, Chef Infra Client, environment, or data bag can be searched.

Nested Fields

A nested field appears deeper in the JSON data structure. For example, information about a network interface might be several layers deep: node['network']['interfaces']['en1']. When nested fields are present in a JSON structure, Chef Infra Client will extract those nested fields to the top-level, flattening them into compound fields that support wildcard search patterns.

By combining wildcards with range-matching patterns and wildcard queries, it is possible to perform very powerful searches, such as using the vendor part of the MAC address to find every node that has a network card made by the specified vendor.

Consider the following snippet of JSON data:

{"network":
  [
  //snipped...
    "interfaces",
      {"en1": {
        "number": "1",
        "flags": [
          "UP",
          "BROADCAST",
          "SMART",
          "RUNNING",
          "SIMPLEX",
          "MULTICAST"
        ],
        "addresses": {
          "fe80::fa1e:dfff:fed8:63a2": {
            "scope": "Link",
            "prefixlen": "64",
            "family": "inet6"
          },
          "f8:1e:df:d8:63:a2": {
            "family": "lladdr"
          },
          "192.0.2.0": {
            "netmask": "255.255.255.0",
            "broadcast": "192.168.0.255",
            "family": "inet"
          }
        },
        "mtu": "1500",
        "media": {
          "supported": {
            "autoselect": {
              "options": [

              ]
            }
          },
          "selected": {
            "autoselect": {
              "options": [

              ]
            }
          }
        },
        "type": "en",
        "status": "active",
        "encapsulation": "Ethernet"
      },
  //snipped...

Before this data is indexed on the Chef Infra Server, the nested fields are extracted into the top level, similar to:

"broadcast" => "192.168.0.255",
"flags"     => ["UP", "BROADCAST", "SMART", "RUNNING", "SIMPLEX", "MULTICAST"]
"mtu"       => "1500"

which allows searches like the following to find data that is present in this node:

node "broadcast:192.168.0.*"

or:

node "mtu:1500"

or:

node "flags:UP"

This data is also flattened into various compound fields, which follow the same pattern as the JSON hierarchy and use underscores (_) to separate the levels of data, similar to:

# ...snip...
"network_interfaces_en1_addresses_192.0.2.0_broadcast" => "192.168.0.255",
"network_interfaces_en1_addresses_fe80::fa1e:tldr_family"  => "inet6",
"network_interfaces_en1_addresses"                         => ["fe80::fa1e:tldr","f8:1e:df:tldr","192.0.2.0"]
# ...snip...

which allows searches like the following to find data that is present in this node:

node "network_interfaces_en1_addresses:192.0.2.0"

This flattened data structure also supports using wildcard compound fields, which allow searches to omit levels within the JSON data structure that are not important to the search query. In the following example, an asterisk (*) is used to show where the wildcard can exist when searching for a nested field:

"network_interfaces_*_flags"     => ["UP", "BROADCAST", "SMART", "RUNNING", "SIMPLEX", "MULTICAST"]
"network_interfaces_*_addresses" => ["fe80::fa1e:dfff:fed8:63a2", "192.0.2.0", "f8:1e:df:d8:63:a2"]
"network_interfaces_en0_media_*" => ["autoselect", "none", "1000baseT", "10baseT/UTP", "100baseTX"]
"network_interfaces_en1_*"       => ["1", "UP", "BROADCAST", "SMART", "RUNNING", "SIMPLEX", "MULTICAST",
                                     "fe80::fa1e:dfff:fed8:63a2", "f8:1e:df:d8:63:a2", "192.0.2.0",
                                     "1500", "supported", "selected", "en", "active", "Ethernet"]

For each of the wildcard examples above, the possible values are shown contained within the brackets. When running a search query, the query syntax for wildcards is to simply omit the name of the node (while preserving the underscores), similar to:

network_interfaces__flags

This query will search within the flags node, within the JSON structure, for each of UP, BROADCAST, SMART, RUNNING, SIMPLEX, and MULTICAST.

Patterns

A search pattern is a way to fine-tune search results by returning anything that matches some type of incomplete search query. There are four types of search patterns that can be used when searching the search indexes on the Chef Infra Server: exact, wildcard, range, and fuzzy.

Exact Match

An exact matching search pattern is used to search for a key with a name that exactly matches a search query. If the name of the key contains spaces, quotes must be used in the search pattern to ensure the search query finds the key. The entire query must also be contained within quotes, so as to prevent it from being interpreted by Ruby or a command shell. The best way to ensure that quotes are used consistently is to quote the entire query using single quotes (' ‘) and a search pattern with double quotes (" “).

Wildcard Match

A wildcard matching search pattern is used to query for substring matches that replace zero (or more) characters in the search pattern with anything that could match the replaced character. There are two types of wildcard searches:

  • A question mark (?) can be used to replace exactly one character (as long as that character is not the first character in the search pattern)
  • An asterisk (*) can be used to replace any number of characters (including zero)

Range Match

A range matching search pattern is used to query for values that are within a range defined by upper and lower boundaries. A range matching search pattern can be inclusive or exclusive of the boundaries. Use square brackets ("[ ]") to denote inclusive boundaries and curly braces ("{ }") to denote exclusive boundaries and with the following syntax:

boundary TO boundary

where TO is required (and must be capitalized).

Fuzzy Match

A fuzzy matching search pattern is used to search based on the proximity of two strings of characters. An (optional) integer may be used as part of the search query to more closely define the proximity. A fuzzy matching search pattern has the following syntax:

"search_query"~edit_distance

where search_query is the string that will be used during the search and edit_distance is the proximity. A tilde ("~") is used to separate the edit distance from the search query.

Operators

An operator can be used to ensure that certain terms are included in the results, are excluded from the results, or are not included even when other aspects of the query match. Searches can use the following operators:

OperatorDescription
ANDUse to find a match when both terms exist.
ORUse to find a match if either term exists.
NOTUse to exclude the term after NOT from the search results.

Special Characters

A special character can be used to fine-tune a search query and to increase the accuracy of the search results. The following characters can be included within the search query syntax, but each occurrence of a special character must be escaped with a backslash (\), also (/) must be escaped against the Elasticsearch:

+  -  &&  | |  !  ( )  { }  [ ]  ^  "  ~  *  ?  :  \  /

For example:

\(1\+1\)\:2

Examples

The following examples show how the search method can be used in a recipe.

Use the search recipe DSL method to find users

The following example shows how to use the search method in the Recipe DSL to search for users:

#  the following code sample comes from the openvpn cookbook: https://github.com/chef-cookbooks/openvpn

search("users", "*:*") do |u|
  execute "generate-openvpn-#{u['id']}" do
    command "./pkitool #{u['id']}"
    cwd '/etc/openvpn/easy-rsa'
    environment(
      'EASY_RSA' => '/etc/openvpn/easy-rsa',
      'KEY_CONFIG' => '/etc/openvpn/easy-rsa/openssl.cnf',
      'KEY_DIR' => node['openvpn']['key_dir'],
      'CA_EXPIRE' => node['openvpn']['key']['ca_expire'].to_s,
      'KEY_EXPIRE' => node['openvpn']['key']['expire'].to_s,
      'KEY_SIZE' => node['openvpn']['key']['size'].to_s,
      'KEY_COUNTRY' => node['openvpn']['key']['country'],
      'KEY_PROVINCE' => node['openvpn']['key']['province'],
      'KEY_CITY' => node['openvpn']['key']['city'],
      'KEY_ORG' => node['openvpn']['key']['org'],
      'KEY_EMAIL' => node['openvpn']['key']['email']
    )
    not_if { File.exist?("#{node['openvpn']['key_dir']}/#{u['id']}.crt") }
  end

  %w{ conf ovpn }.each do |ext|
    template "#{node['openvpn']['key_dir']}/#{u['id']}.#{ext}" do
      source 'client.conf.erb'
      variables :username => u['id']
    end
  end

  execute "create-openvpn-tar-#{u['id']}" do
    cwd node['openvpn']['key_dir']
    command <<-EOH
      tar zcf #{u['id']}.tar.gz \
      ca.crt #{u['id']}.crt #{u['id']}.key \
      #{u['id']}.conf #{u['id']}.ovpn \
    EOH
    not_if { File.exist?("#{node['openvpn']['key_dir']}/#{u['id']}.tar.gz") }
  end
end

where

  • the search will use both of the execute resources, unless the condition specified by the not_if commands are met
  • the environments property in the first execute resource is being used to define values that appear as variables in the OpenVPN configuration
  • the template resource tells Chef Infra Client which template to use

Was this page helpful?