Mapping
最后更新于:2022-04-01 00:39:32
# 映射
As explained in <>, each document in an index has a _type_.Every type has its own _mapping_ or _schema definition_. A mappingdefines the fields within a type, the datatype for each field,and how the field should be handled by Elasticsearch. A mapping is also usedto configure metadata associated with the type.
We discuss mappings in detail in <>. In this section we're goingto look at just enough to get you started.
[[core-fields]]==== Core simple field types
Elasticsearch supports the following simple field types:
[horizontal]String: :: `string`Whole number: :: `byte`, `short`, `integer`, `long`Floating point: :: `float`, `double`Boolean: :: `boolean`Date: :: `date`
When you index a document which contains a new field -- one previously notseen -- Elasticsearch will use <> to tryto guess the field type from the basic datatypes available in JSON,using the following rules:
[horizontal]_JSON type:_ :: _Field type:_
Boolean: `true` or `false` :: `"boolean"`
Whole number: `123` :: `"long"`
Floating point: `123.45` :: `"double"`
String, valid date: `"2014-09-15"` :: `"date"`
String: `"foo bar"` :: `"string"`
NOTE: This means that, if you index a number in quotes -- `"123"` it will bemapped as type `"string"`, not type `"long"`. However, if the field isalready mapped as type `"long"`, then Elasticsearch will try to convertthe string into a long, and throw an exception if it can't.
==== Viewing the mapping
We can view the mapping that Elasticsearch has for one or more types in one ormore indices using the `/_mapping` endpoint. At the <> we already retrieved the mapping for type `tweet` in index`gb`:
### [source,js]
### GET /gb/_mapping/tweet
This shows us the mapping for the fields (called _properties_) thatElasticsearch generated dynamically from the documents that we indexed:
### [source,js]
{ "gb": { "mappings": { "tweet": { "properties": { "date": { "type": "date", "format": "dateOptionalTime" }, "name": { "type": "string" }, "tweet": { "type": "string" }, "user_id": { "type": "long" } } } } }
### }
# [TIP]
Incorrect mappings, such as having an `age` field mapped as type `string`instead of `integer`, can produce confusing results to your queries.
# Instead of assuming that your mapping is correct, check it!
[[custom-field-mappings]]==== Customizing field mappings
The most important attribute of a field is the `type`. For fieldsother than `string` fields, you will seldom need to map anything otherthan `type`:
### [source,js]
{ "number_of_clicks": { "type": "integer" }
### }
Fields of type `"string"` are, by default, considered to contain full text.That is, their value will be passed through an analyzer before being indexedand a full text query on the field will pass the query string through ananalyzer before searching.
The two most important mapping attributes for `string` fields are`index` and `analyzer`.
===== `index`
The `index` attribute controls how the string will be indexed. Itcan contain one of three values:
[horizontal]`analyzed`:: First analyze the string, then index it. In other words, index this field as full text.
`not_analyzed`:: Index this field, so it is searchable, but index the value exactly as specified. Do not analyze it.
`no`:: Don't index this field at all. This field will not be searchable.
The default value of `index` for a `string` field is `analyzed`. If wewant to map the field as an exact value, then we need to set it to`not_analyzed`:
### [source,js]
{ "tag": { "type": "string", "index": "not_analyzed" }
### }
The other simple types -- `long`, `double`, `date` etc -- also accept the`index` parameter, but the only relevant values are `no` and `not_analyzed`,as their values are never analyzed.
===== `analyzer`
For `analyzed` string fields, use the `analyzer` attribute tospecify which analyzer to apply both at search time and at index time. Bydefault, Elasticsearch uses the `standard` analyzer, but you can change thisby specifying one of the built-in analyzers, such as`whitespace`, `simple`, or `english`:
### [source,js]
{ "tweet": { "type": "string", "analyzer": "english" }
### }
In <> we will show you how to define and use custom analyzersas well.
==== Updating a mapping
You can specify the mapping for a type when you first create an index.Alternatively, you can add the mapping for a new type (or update the mappingfor an existing type) later, using the `/_mapping` endpoint.
# [IMPORTANT]
While you can _add_ to an existing mapping, you can't _change_ it. If a fieldalready exists in the mapping, then it probably means that data from thatfield has already been indexed. If you were to change the field mapping, then
# the already indexed data would be wrong and would not be properly searchable.
We can update a mapping to add a new field, but we can't change an existingfield from `analyzed` to `not_analyzed`.
To demonstrate both ways of specifying mappings, let's first delete the `gb`index:
### [source,sh]
### DELETE /gb
// SENSE: 052_Mapping_Analysis/45_Mapping.json
Then create a new index, specifying that the `tweet` field should usethe `english` analyzer:
### [source,js]
PUT /gb <1>{ "mappings": { "tweet" : { "properties" : { "tweet" : { "type" : "string", "analyzer": "english" }, "date" : { "type" : "date" }, "name" : { "type" : "string" }, "user_id" : { "type" : "long" } } } }
### }
// SENSE: 052_Mapping_Analysis/45_Mapping.json
<1> This creates the index with the `mappings` specified in the body.
Later on, we decide to add a new `not_analyzed` text field called `tag` to the`tweet` mapping, using the `_mapping` endpoint:
### [source,js]
PUT /gb/_mapping/tweet{ "properties" : { "tag" : { "type" : "string", "index": "not_analyzed" } }
### }
// SENSE: 052_Mapping_Analysis/45_Mapping.json
Note that we didn't need to list all of the existing fields again, as we can'tchange them anyway. Our new field has been merged into the existing mapping.
==== Testing the mapping
You can use the `analyze` API to test the mapping for string fields byname. Compare the output of these two requests:
### [source,js]
GET /gb/_analyze?field=tweetBlack-cats <1>
GET /gb/_analyze?field=tag
### Black-cats <1>
// SENSE: 052_Mapping_Analysis/45_Mapping.json
<1> The text we want to analyze is passed in the body.
The `tweet` field produces the two terms `"black"` and `"cat"`, while the`tag` field produces the single term `"Black-cats"`. In other words, ourmapping is working correctly.