Assistive robots can augment the human experience, particularly for the elderly and infirm. We demonstrate a person-following robot that carries a user's luggage. Combining classical computer vision (CSRT tracking) with deep learning (YOLO object detection), we develop a long-term person-tracking pipeline controlled through intuitive gestures. Finally, we showcase the end-to-end pipeline on a TurtleBot2 in real-world experiments featuring occlusions and variations across users.
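The sketch below illustrates the hybrid tracking idea in broad strokes: OpenCV's CSRT tracker follows the user frame to frame, and a YOLO person detector re-acquires the target whenever tracking is lost. It is not the paper's exact implementation; the model file (`yolov8n.pt`), camera source, and re-detection policy are illustrative assumptions.

```python
# Minimal sketch of a CSRT + YOLO long-term person tracker (assumed details,
# not the paper's exact pipeline). Requires opencv-contrib-python and ultralytics.
import cv2
from ultralytics import YOLO

detector = YOLO("yolov8n.pt")   # assumed person detector; any YOLO model would do
tracker = None                  # CSRT tracker, (re)created on each detection
cap = cv2.VideoCapture(0)       # assumed camera source

def detect_person(frame):
    """Return the highest-confidence 'person' box as (x, y, w, h), or None."""
    results = detector(frame, verbose=False)[0]
    best = None
    for box in results.boxes:
        if int(box.cls) == 0:   # class 0 is 'person' in COCO
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            conf = float(box.conf)
            if best is None or conf > best[0]:
                best = (conf, (x1, y1, x2 - x1, y2 - y1))
    return best[1] if best else None

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    bbox = None
    if tracker is not None:
        found, bbox = tracker.update(frame)   # fast frame-to-frame CSRT update
        if not found:
            tracker, bbox = None, None        # declare the target lost

    if tracker is None:                       # (re)acquire the user via YOLO
        bbox = detect_person(frame)
        if bbox is not None:
            tracker = cv2.TrackerCSRT_create()
            tracker.init(frame, bbox)

    if bbox is not None:
        x, y, w, h = map(int, bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("person tracking", frame)
    if cv2.waitKey(1) & 0xFF == 27:           # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

The division of labor reflects the trade-off the abstract alludes to: the detector is robust but comparatively slow, while CSRT is lightweight enough to run every frame, so detection is invoked only when the correlation tracker loses the target.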