Preface
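Deleting documents is an essential part of working with Elasticsearch (ES). According to the official API documentation, there are two ways to delete: the first deletes a single document directly by its index, type, and id; the second deletes by query, through the Delete By Query API.
With the first approach, the id uniquely identifies the document, so as long as the document exists, exactly that document is deleted.
The second approach behaves as if it first searched for every document matching the condition and then deleted them one by one by document ID. You therefore have to check the query condition and confirm the scope of its results; otherwise many documents can be deleted by mistake.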
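When using RestHighLevelClient, the first API works without problems. For the second, the client does provide DeleteByQueryRequest, but at the time of writing there is no method that executes this request. (If one exists, please let me know!) The only option is to do it in two steps: search first, then delete. Two requests sent from the client are certainly slower than a single Delete By Query, but for now this workaround is the only way with this client.
Another option is to use RestClient: assemble the JSON body yourself and send it as an HTTP request.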
Source: https://discuss.elastic.co/t/delete-by-query-with-new-java-rest-api/107578
Main content
Preparing the data
/PUT http://{{host}}:{{port}}/delete_demo
{
  "mappings": {
    "demo": {
      "properties": {
        "content": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword" }
          }
        }
      }
    }
  }
}
/POST http://{{host}}:{{port}}/_bulk
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test1"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test1"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test1 add"}
{"index":{"_index":"delete_demo","_type":"demo"}}
{"content":"test2"}
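Note: in a bulk request, every line of data must end with a newline, and the last line must also be followed by a newline (that is, an empty line at the end)!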
{
  "took": 7,
  "errors": false,
  "items": [
    {"index": {"_index": "delete_demo", "_type": "demo", "_id": "AWExGSdW00f4t28WAPen", "_version": 1, "result": "created", "_shards": {"total": 2, "successful": 1, "failed": 0}, "created": true, "status": 201}},
    {"index": {"_index": "delete_demo", "_type": "demo", "_id": "AWExGSdW00f4t28WAPeo", "_version": 1, "result": "created", "_shards": {"total": 2, "successful": 1, "failed": 0}, "created": true, "status": 201}},
    {"index": {"_index": "delete_demo", "_type": "demo", "_id": "AWExGSdW00f4t28WAPep", "_version": 1, "result": "created", "_shards": {"total": 2, "successful": 1, "failed": 0}, "created": true, "status": 201}},
    {"index": {"_index": "delete_demo", "_type": "demo", "_id": "AWExGSdW00f4t28WAPeq", "_version": 1, "result": "created", "_shards": {"total": 2, "successful": 1, "failed": 0}, "created": true, "status": 201}}
  ]
}
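The newline rule for bulk requests is easy to get wrong when the body is assembled in code. The following is a minimal sketch, using only the JDK, of how such a body could be built so that every line, including the last, ends with a newline. The class name BulkBodyBuilder and its method are hypothetical helpers for illustration and are not part of any Elasticsearch client; the sketch also assumes the content values contain no characters that need JSON escaping.

```java
import java.util.List;

/** Hypothetical helper (not part of any Elasticsearch client): builds a _bulk request body. */
final class BulkBodyBuilder {

    private BulkBodyBuilder() {}

    /**
     * Builds newline-delimited JSON that indexes one document per "content" value.
     * Every line, including the last one, is terminated with '\n'.
     * Assumption: the values contain no characters that need JSON escaping.
     */
    static String build(String index, String type, List<String> contents) {
        StringBuilder body = new StringBuilder();
        for (String content : contents) {
            // Action line: names the target index and type for the next line.
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type).append("\"}}\n");
            // Source line: the document itself.
            body.append("{\"content\":\"").append(content).append("\"}\n");
        }
        return body.toString();
    }
}
```

Calling BulkBodyBuilder.build("delete_demo", "demo", Arrays.asList("test1", "test1", "test1 add", "test2")) would produce the same eight lines as the request above, with the trailing newline already in place.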
Deleting by ID
API format
/DELETE http://{{host}}:{{port}}/delete_demo/demo/AWExGSdW00f4t28WAPen
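Java client
import java.io.IOException;

import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.Before;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;

// BaseTest is the author's own test base class (not shown here).
// The JUnit 4 and Spring imports are assumed from the annotations used.
public class ElkDaoTest extends BaseTest {

    @Autowired
    private RestHighLevelClient rhlClient;

    private String index;
    private String type;
    private String id;

    @Before
    public void prepare() {
        index = "delete_demo";
        type = "demo";
        id = "AWExGSdW00f4t28WAPeo";
    }

    @Test
    public void delete() {
        // Delete exactly one document, identified by index, type and id.
        DeleteRequest deleteRequest = new DeleteRequest(index, type, id);
        DeleteResponse response = null;
        try {
            response = rhlClient.delete(deleteRequest);
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println(response);
    }
}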
The document is deleted successfully here as well.
For how to use rhlClient, see the earlier post "ElasticSearch Rest High Level Client Tutorial (1): General Operations".
Delete By Query
Via the API
First, restore the data to the original four documents.
/POST http://{{host}}:{{port}}/delete_demo/demo/_delete_by_query
{
  "query": {
    "match": { "content": "test1" }
  }
}
{
  "took": 14,
  "timed_out": false,
  "total": 3,
  "deleted": 3,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": { "bulk": 0, "search": 0 },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}
/GET http://{{host}}:{{port}}/delete_demo/demo/_search
{
  "took": 0,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "delete_demo",
        "_type": "demo",
        "_id": "AWExKDse00f4t28WAafF",
        "_score": 1,
        "_source": { "content": "test2" }
      }
    ]
  }
}
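The result shows that three documents were deleted, namely test1, test1, and test1 add, leaving only test2. Every document returned by the query was deleted.
A term query behaves the same way: whatever the query matches is deleted.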
/POST http://{{host}}:{{port}}/delete_demo/demo/_delete_by_query
{
  "query": {
    "term": { "content.keyword": "test1" }
  }
}
{
  "took": 6,
  "timed_out": false,
  "total": 2,
  "deleted": 2,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": { "bulk": 0, "search": 0 },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}
This confirms that Delete By Query is a search followed by a delete.
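To make the difference between the two results concrete (three documents deleted with match, two with term), here is a toy in-memory model written with the JDK only. It is not how Elasticsearch is implemented: the class DeleteByQueryModel is hypothetical, match is approximated as "some whitespace-separated, lower-cased token equals the word", and term on the keyword sub-field is approximated as "the whole value is equal". Under those assumptions it reproduces the search-then-delete behaviour on the four sample documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model (hypothetical, not Elasticsearch code) of "search first, then delete by ID". */
final class DeleteByQueryModel {

    private final Map<String, String> docs = new LinkedHashMap<>();

    void index(String id, String content) {
        docs.put(id, content);
    }

    int size() {
        return docs.size();
    }

    /** Step 1: collect the IDs of matching documents. Step 2: delete each one by ID. */
    int deleteByQuery(Predicate<String> query) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : docs.entrySet()) {
            if (query.test(entry.getValue())) {
                ids.add(entry.getKey());
            }
        }
        for (String id : ids) {
            docs.remove(id);
        }
        return ids.size();
    }

    /** Rough stand-in for a match query on an analyzed text field. */
    static Predicate<String> match(String word) {
        String needle = word.toLowerCase(Locale.ROOT);
        return content ->
            Arrays.asList(content.toLowerCase(Locale.ROOT).split("\\s+")).contains(needle);
    }

    /** Rough stand-in for a term query on a keyword field: exact equality. */
    static Predicate<String> term(String value) {
        return value::equals;
    }

    /** The four sample documents used in this article (IDs are made up). */
    static DeleteByQueryModel sample() {
        DeleteByQueryModel model = new DeleteByQueryModel();
        model.index("1", "test1");
        model.index("2", "test1");
        model.index("3", "test1 add");
        model.index("4", "test2");
        return model;
    }
}
```

In this model, sample().deleteByQuery(match("test1")) removes three documents and sample().deleteByQuery(term("test1")) removes two, which mirrors the "deleted" counts in the responses above. The broader the condition, the more documents disappear, so it is worth running the query as a plain search first.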
Java client
Using RestHighLevelClient
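import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.Before;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;

// BaseTest is the author's own test base class (not shown here).
public class ElkDaoTest extends BaseTest {

    @Autowired
    private RestHighLevelClient rhlClient;

    private String index;
    private String type;
    private String deleteText;

    @Before
    public void prepare() {
        index = "delete_demo";
        type = "demo";
        deleteText = "test1";
    }

    @Test
    public void delete() {
        try {
            // Step 1: search for the documents to delete.
            // Note: a search returns only the first 10 hits unless a size is set.
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            sourceBuilder.timeout(new TimeValue(2, TimeUnit.SECONDS));
            TermQueryBuilder termQueryBuilder1 = QueryBuilders.termQuery("content.keyword", deleteText);
            sourceBuilder.query(termQueryBuilder1);
            SearchRequest searchRequest = new SearchRequest(index);
            searchRequest.types(type);
            searchRequest.source(sourceBuilder);
            SearchResponse response = rhlClient.search(searchRequest);
            SearchHits hits = response.getHits();
            List<String> docIds = new ArrayList<>(hits.getHits().length);
            for (SearchHit hit : hits) {
                docIds.add(hit.getId());
            }
            // An empty bulk request fails validation, so stop if nothing matched.
            if (docIds.isEmpty()) {
                return;
            }
            // Step 2: delete the hits by ID in a single bulk request.
            BulkRequest bulkRequest = new BulkRequest();
            for (String id : docIds) {
                DeleteRequest deleteRequest = new DeleteRequest(index, type, id);
                bulkRequest.add(deleteRequest);
            }
            rhlClient.bulk(bulkRequest);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}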
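Restore the data and run the code above: a search afterwards returns only two documents, test1 add and test2, so the search-then-delete worked. The search output is not reproduced here.
Using RestClient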
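As mentioned earlier in this series, rhlClient is a wrapper around RestClient, and some of its features are still being completed and are not yet implemented in Java. In that case, simply use restClient to call the ES service directly over HTTP.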
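import java.io.IOException;
import java.util.Collections;

import org.apache.http.HttpEntity;
import org.apache.http.entity.ContentType;
import org.apache.http.nio.entity.NStringEntity;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.junit.Before;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;

// BaseTest is the author's own test base class (not shown here).
public class ElkDaoTest extends BaseTest {

    @Autowired
    private RestClient restClient;

    private String index;
    private String type;
    private String deleteText;

    @Before
    public void prepare() {
        index = "delete_demo";
        type = "demo";
        deleteText = "test1";
    }

    @Test
    public void delete() {
        // POST /{index}/{type}/_delete_by_query with the query as the JSON body.
        String endPoint = "/" + index + "/" + type + "/_delete_by_query";
        String source = generateQueryString();
        HttpEntity entity = new NStringEntity(source, ContentType.APPLICATION_JSON);
        try {
            restClient.performRequest("POST", endPoint, Collections.<String, String>emptyMap(), entity);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Builds {"query":{"term":{"content.keyword":"<deleteText>"}}} as a string.
    // The IndexRequest is used only as a convenient way to turn the builder into JSON text.
    public String generateQueryString() {
        IndexRequest indexRequest = new IndexRequest();
        XContentBuilder builder;
        try {
            builder = JsonXContent.contentBuilder()
                    .startObject()
                        .startObject("query")
                            .startObject("term")
                                .field("content.keyword", deleteText)
                            .endObject()
                        .endObject()
                    .endObject();
            indexRequest.source(builder);
        } catch (IOException e) {
            e.printStackTrace();
        }
        String source = indexRequest.source().utf8ToString();
        return source;
    }
}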
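After running it, the two documents whose content is test1 are deleted here as well, so the feature works. The advantage is that two separate HTTP requests are no longer needed, which saves time.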
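Because the request body is just a JSON string, it could also be assembled by hand, as long as the values are escaped properly. The sketch below uses only the JDK; the class DeleteByQueryBody and its methods are hypothetical names chosen for illustration, and the escaping covers only quotes, backslashes, and control characters.

```java
/** Hypothetical helper (not part of any Elasticsearch client): builds a term-query body. */
final class DeleteByQueryBody {

    private DeleteByQueryBody() {}

    /** Returns {"query":{"term":{"<field>":"<value>"}}} with both strings JSON-escaped. */
    static String termQuery(String field, String value) {
        return "{\"query\":{\"term\":{" + quote(field) + ":" + quote(value) + "}}}";
    }

    /** Wraps a string in double quotes and escapes characters that JSON does not allow raw. */
    static String quote(String raw) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become \u00XX escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }
}
```

The string returned by DeleteByQueryBody.termQuery("content.keyword", "test1") could then be passed to new NStringEntity(source, ContentType.APPLICATION_JSON) in the same way as in the test above. Using a builder such as XContentBuilder remains the safer choice for anything beyond a fixed, simple query.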
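Summary
As far as deletion is concerned, RestHighLevelClient is not yet complete, so it has to be combined with the flexibility of RestClient to get the functionality we want.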
Articles in this series:
ElasticSearch Rest High Level Client Tutorial (1): General Operations
ElasticSearch RestHighLevelClient Tutorial (2): Index Operations