Training Sample Selection Method Based on Grading of Soil Types by Area for Updating Conventional Soil Maps


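Cite this article:	LIU Xueqi, ZHU Axing, YANG Lin, MIAO Yamin, ZENG Canying. Training Sample Selection Method Based on Grading of Soil Types by Area for Updating Conventional Soil Maps[J]. Acta Pedologica Sinica, 2017, 54(1): 36-47.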

Author names:	LIU Xueqi, ZHU Axing, YANG Lin, MIAO Yamin, ZENG Canying

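Author affiliations:	1. School of Geographical Science, Nanjing Normal University, Nanjing 210023; 2. School of Geographical Science, Nanjing Normal University, Nanjing 210023; State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101; Key Laboratory of Virtual Geographic Environment (Ministry of Education), Nanjing Normal University; Jiangsu Provincial Cultivation Base of the State Key Laboratory of Geographical Environment Evolution; Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023; Department of Geography, University of Wisconsin-Madison, Madison, WI 53706, USA; 3. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101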

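Funding:	National Natural Science Foundation of China (41431177; 41471178), Major Natural Science Research Project of Jiangsu Higher Education Institutions (14KJA170001), Jiangsu Graduate Research and Innovation Program (KYLX15_0715), National Basic Research Program of China (973 Program) (2015CB954102), and the Thousand Talents Program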

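Summary:	Updating soil maps with data mining models is an important research topic. The quality of the training samples used to build a data mining model determines how fully the soil-environment relationships of the study area are expressed, and it also has a critical effect on the results of inferential mapping. This paper proposes a method for selecting typical training samples based on grading of soil types by area: soil types are graded according to their area, and training samples are selected from typical points according to the proportions between grades. The method was applied to update the conventional soil map of the Raffelson watershed in Wisconsin, USA, and compared with two other training sample selection methods to verify its effectiveness. The results show that, over 500 repeated experiments, the proportions able to update the conventional soil map were 79.5%, 71.8% and 63.6% for the proposed method and the two other methods, respectively, and that the inferential mapping results of the proposed method were more consistent with the characteristics of soil distribution in the study area. The proposed method is an effective training sample selection method.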
Key terms:	Training samples; Data mining model; Conventional soil map updating; Soil-environment relationships
Received:	2016/3/21
Revised:	2016/7/13
Training Sample Selection Method Based on Grading of Soil Types by Area for Updating Conventional Soil Maps

LIU Xueqi, ZHU Axing, YANG Lin, MIAO Yamin and ZENG Canying. Training Sample Selection Method Based on Grading of Soil Types by Area for Updating Conventional Soil Maps[J]. Acta Pedologica Sinica, 2017, 54(1): 36-47.

Authors:	LIU Xueqi, ZHU Axing, YANG Lin, MIAO Yamin and ZENG Canying

Institution:	School of Geographical Science, Nanjing Normal University; School of Geographical Science, Nanjing Normal University; State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences; School of Geographical Science, Nanjing Normal University; School of Geographical Science, Nanjing Normal University

Abstract:	[Objective] Traditional soil surveys have produced a large number of conventional soil maps that vary in scale and nature. Although these maps are not very high in spatial detail or accuracy, they contain a large amount of valuable expert knowledge about soil-environment relationships in the regions they cover. Data mining models can extract this knowledge from the maps and use it to update them. When data mining models are used to extract information on the spatial distribution of soils, selecting training samples is an essential step: the quality of the training samples strongly affects how fully the soil-environment relationships are expressed and how accurate the updated soil maps are. The area-weighted proportion method is a common way to select training samples. However, it usually assigns too much weight to soil types that cover large areas, so that too many training samples are selected for those types. Meanwhile, randomly selecting training samples from polygons of the same soil type may introduce "noise" samples located in transition areas between soil types, which lowers the accuracy of the updated soil maps.
[Method] In this paper, a new method was developed to select training samples from conventional soil maps based on grading of soil types by area. The method consists of two steps. The first step is to identify typical (representative) samples of each soil type from the conventional soil map, so as to avoid "noise pixels" caused by misplaced boundaries between soil polygons. It is assumed that most of the boundaries of the polygons of a given soil type are correctly delineated, so that the peak of the histogram of an environmental factor within those polygons represents the typical environmental condition under which the soil type develops or exists. Pixels close to that typical condition, or within the peak zone of the histogram, are considered representative samples. All the representative samples selected through the histograms of the various environmental factors for a given soil type are combined into the typical sample set of that soil type. The second step is to select training samples based on grading of soil types by area, with a view to keeping the numbers of samples across soil types in balance. Soil types in the same grade are assigned the same number of training samples, drawn from the typical sample set of each soil type. A random forest model was then used to update the conventional soil map based on the selected training samples. To evaluate the proposed method, it was compared with two other training sample selection methods. One randomly selects training samples from the polygons of each soil type, with the number of samples for each soil type determined by the proportion of the grade the soil type belongs to. The other is the common area-weighted proportion method, which randomly selects training samples from the polygons of each soil type, with the number of samples determined by the area-weighted proportion of the soil type. The study area was the Raffelson watershed, a small watershed in Wisconsin, USA. Each of the three selection methods was run 500 times, and the mean accuracy of the inferential mapping and the proportion of updated conventional soil maps were validated with 92 independent field verification samples.
[Result] Over the 500 trials, about 79.5%, 71.8% and 63.6% of the conventional soil maps could be updated with the proposed method and the two other methods, respectively. Meanwhile, the updated soil maps based on the proposed training sample selection method were more consistent with the actual soil distribution in the Raffelson watershed.
[Conclusion] The proposed method is an effective training sample selection method for data mining models used to update conventional soil maps.
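
The two selection steps described in the Method paragraph can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the bin count, the `peak_fraction` rule used to define the "peak zone", the choice to combine per-factor selections by union, the area-fraction thresholds and the per-grade sample counts are all hypothetical and do not come from the paper.

```python
import numpy as np


def typical_mask(values, n_bins=20, peak_fraction=0.5):
    """Flag pixels whose covariate value lies in the histogram peak zone.

    Assumption: the peak zone is every bin whose count is at least
    peak_fraction times the count of the tallest bin.
    """
    counts, edges = np.histogram(values, bins=n_bins)
    peak_bins = counts >= peak_fraction * counts.max()
    # Inner edges map each value to the same bin index np.histogram used.
    bin_index = np.digitize(values, edges[1:-1])
    return peak_bins[bin_index]


def grade_by_area(areas, thresholds):
    """Assign each soil type a grade from its share of the total area.

    thresholds is an ascending list of area fractions (hypothetical
    break points); grade 0 holds the smallest soil types.
    """
    total = sum(areas.values())
    return {
        soil_type: int(np.searchsorted(thresholds, area / total, side="right"))
        for soil_type, area in areas.items()
    }


def samples_per_type(grades, per_grade):
    """Give every soil type in the same grade the same sample count."""
    return {soil_type: per_grade[grade] for soil_type, grade in grades.items()}


def select_training_pixels(soil_map, covariates, n_per_type, rng):
    """Draw training pixels for each soil type from its typical sample set.

    soil_map:   1-D array with the mapped soil type of each pixel
    covariates: 2-D array, shape (n_pixels, n_environmental_factors)
    """
    chosen = {}
    for soil_type, n in n_per_type.items():
        in_type = np.flatnonzero(soil_map == soil_type)
        typical = np.zeros(in_type.size, dtype=bool)
        for j in range(covariates.shape[1]):
            # Assumption: per-factor selections are combined by union.
            typical |= typical_mask(covariates[in_type, j])
        pool = in_type[typical]
        chosen[soil_type] = rng.choice(pool, size=min(n, pool.size), replace=False)
    return chosen
```

A typical call would first compute `grades = grade_by_area(areas, thresholds)`, then `n_per_type = samples_per_type(grades, per_grade)`, and finally pass `n_per_type` to `select_training_pixels` together with a `numpy.random.default_rng()` generator. The selected pixels and their covariate values could then be used to train a classifier such as a random forest.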

Keywords:	Training sample; Data mining model; Update conventional soil map; Soil-environmental relationships
This article is indexed in CNKI, Wanfang Data and other databases.