Zhou Ren (任洲) - Home Page


Zhou Ren
Senior Research Lead (CV, Google Scholar, LinkedIn)
Wormpex AI Research, Bellevue, WA

We are hiring Computer Vision researchers!

Email: renzhou200622 [at] gmail.com
Contact me with your CV if you are interested in a full-time position or an internship with us. :)

About me

  • I am a founding member of Wormpex AI Research, the AI branch of BianLiFeng (便利蜂), a fast-growing advanced convenience store chain in China backed by global capital, which has opened over 2000 convenience stores from scratch within the past 4 years. At Wormpex AI Research, we build state-of-the-art AI technologies to facilitate new retail logistics, from storefronts and warehouses to manufacturing. Before that, I spent 3 wonderful years at Snap Research as a senior research scientist, applying multimodal understanding to support Snap’s content monetization, content security, and creative content creation.

  • As a senior research lead at Wormpex AI Research, I manage the Multimodal Machine Perception Team, composed of elite researchers and engineers in both Bellevue, WA and Beijing, China. My team conducts cutting-edge research and builds intelligent production systems to benefit BianLiFeng’s retail business using multimodal input signals, with a focus on human-behavior-related modeling, such as human detection, pose, action, ReID, tracking, human–POS machine interaction, etc.


Research Highlights

  • My research interests lie in the fields of Computer Vision, Multimedia, Machine Learning, and Natural Language Processing. I have worked on Human-Centric Understanding (including hand gesture recognition, hand pose estimation, human pose estimation and tracking, human ReID, action detection, etc.), Multi-modal Joint Understanding (including image captioning, video captioning, visual-semantic embedding, etc.), shape understanding, adversarial machine learning, etc.

  • My current focuses are: 1) human-centric understanding (pose, hand, gesture, human ReID, and tracking); 2) object detection, action detection, and video representation learning; 3) multi-modal joint understanding, vision and language.