• 本期要文(可全文下载) •

### 利用MapReduce的异常轨迹检测并行算法

1. 南京师范大学计算机科学与技术学院, 南京210023
• 收稿日期:2014-11-13 修回日期:2014-12-11 出版日期:2015-05-10 发布日期:2015-05-10
• 通讯作者: 吉根林(1964-),男,博士,教授,研究方向为数据挖掘及其应用。E-mail:glji@njnu.edu.cn E-mail:glji@njnu.edu.cn
• 作者简介:唐梦梦(1990-),女,硕士生,研究方向为数据挖掘及其应用。E-mail:dreamtang1016@163.com
• 基金资助:

国家自然科学基金项目"云计算环境下顾及用户关系的手机用户时空轨迹模式挖掘方法研究"(41471371)。

### A Parallel Algorithm for Detecting Trajectory Outliers Based on MapReduce

TANG Mengmeng, JI Genlin, ZHAO Bin

1. School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023, China
• Received:2014-11-13 Revised:2014-12-11 Online:2015-05-10 Published:2015-05-10

Abstract:

Trajectory outlier detection is significantly important in the field of data mining for moving object. TRAOD (TRAjectory Outlier Dectection Algorithm), a classic algorithm for detecting trajectory outliers, focuses on a new two-level trajectory partitioning strategy to enhance the efficiency of algorithm. The main advantage of TRAOD algorithm is the ability to detect outlying sub-trajectories. However, it has a low efficiency on abnormality detection for massive trajectory data. In order to improve the efficiency for mining trajectory outliers from massive datasets, a parallel algorithm for detecting trajectory outliers based on MapReduce framework, which is called PTRAOD (Parallel algorithm for TRAjectory Outlier Detection), is presented. It redesigns the TRAOD algorithm based on the MapReduce framework, and encapsulates the steps of TRAOD into its Map and Reduce functions. PTRAOD algorithm takes full advantages of the features from Hadoop platform. It firstly distributes the trajectory data into distributed computing nodes. While distributing the data, it also takes the load-balance into consideration. And after all, each node runs the same algorithms to detect abnormal trajectories. Based on PTRAOD algorithm, a grid-based parallel algorithm for detecting trajectory outliers, called GPTRAOD (Gridbased Parallel algorithm for TRAjectory Outlier Detection), is then proposed. GPTRAOD algorithm makes use of the grid index to realize regional query and reduce unnecessary calculations. At first, GPTRAOD algorithm divides the map into a series of equal- sized grids, whose size is determined with respect to each specific data. Then, the grid index is established to implement the regional query. Finally, the algorithm finds out the abnormal trajectory segments and judges whether the trajectories that contains the abnormal trajectory segments are abnormal. In general, GPTRAOD algorithm takes advantages of the gird index to realize regional query on the basis of PTRAOD algorithm, which furthermore can search abnormal trajectory on the cloud computing platform. To assess the performances of the proposed algorithms, extensive experiments were conducted. The experimental results demonstrate that the proposed two parallel detection algorithms can both successfully achieve the trajectory outlier detection. The efficiency of PTRAOD algorithm is higher than TRAOD algorithm, while GPTRAOD algorithm has the higher scalability and better speedup ratio than PTRAOD algorithm. In addition, with the rapidly expanding of datasets, GPTRAOD algorithm shows obvious advantages and increasing potentials.