PhD Research Intern, Multi-Modal Foundation Encoder for Perception