The classic visual-inertial odometry (VIO) method estimates a moving camera’s 6-DOF pose relative to its starting point by fusing the camera’s ego-motion estimated by visual odometry (VO) with the motion measured by an inertial measurement unit (IMU). At each step, VIO attempts to update the estimates of the IMU’s biases by using the VO’s output, so as to improve the accuracy of the IMU measurements. This approach works only if accurate VO outputs can be identified and used. However, there is no reliable method for evaluating the accuracy of the VO. In this paper, a new VIO method is introduced for pose estimation of a robotic navigation aid (RNA) that uses a 3D time-of-flight camera for perception. The method, called plane-aided visual-inertial odometry (PAVIO), extracts planes from the 3D point cloud of the current camera view and tracks them into the next camera view by using the IMU’s measurements. The tracking result is used to judge whether each VO output is accurate, and only accurate outputs are accepted. The accepted VO outputs, the information from the extracted planes, and the IMU’s measurements over time are used to create a factor graph. By optimizing the graph, the method improves the estimation accuracy of the IMU biases and reduces the camera’s pose error. Experimental results with the RNA validate the effectiveness of the proposed method.
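
The plane-based acceptance test for VO outputs can be illustrated with a minimal sketch. It is not the PAVIO implementation, only one plausible reading of the idea under stated assumptions: each plane is written as a unit normal n and offset d with n·p = d, the planes in the two views are already associated (for example, by IMU-predicted tracking), the VO output is a rotation R and translation t mapping points as p' = R p + t, and the thresholds are hypothetical. The function names `transform_plane`, `plane_residual`, and `accept_vo` are illustrative.

```python
import math

def mat_vec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transform_plane(plane, R, t):
    """Map a plane (n, d), n.p = d, from view k to view k+1, given p' = R p + t."""
    n, d = plane
    n_new = mat_vec(R, n)
    return n_new, d + dot(n_new, t)

def plane_residual(predicted, observed):
    """Return (angle between unit normals in rad, |offset difference| in m)."""
    (n1, d1), (n2, d2) = predicted, observed
    cos_angle = max(-1.0, min(1.0, dot(n1, n2)))  # clamp for acos
    return math.acos(cos_angle), abs(d1 - d2)

def accept_vo(planes_k, planes_k1, R_vo, t_vo,
              max_angle=math.radians(3.0), max_dist=0.05):
    """Accept the VO motion only if it maps every plane of view k onto its
    associated plane in view k+1 within the (assumed) thresholds."""
    for p_k, p_k1 in zip(planes_k, planes_k1):
        angle, dist = plane_residual(transform_plane(p_k, R_vo, t_vo), p_k1)
        if angle > max_angle or dist > max_dist:
            return False
    return True

# Usage: a single plane observed with the same parameters in both views.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
floor = ([0.0, 0.0, 1.0], 1.0)
ok = accept_vo([floor], [floor], identity, [0.1, 0.0, 0.0])   # motion along the plane
bad = accept_vo([floor], [floor], identity, [0.1, 0.0, 0.2])  # spurious motion along the normal
```

A single plane constrains only the rotation of its normal and the translation along that normal, so a check of this kind cannot detect VO errors parallel to the plane; several non-parallel planes are needed to constrain the full motion.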