Research Scientist, Virtual Humans - Speech Synthesis (PhD)

AR/VR | Pittsburgh, PA

Facebook Reality Labs (FRL) brings together a world-class R&D team of researchers, developers, and engineers to build the future of connection within virtual and augmented reality. We're developing all the technologies needed to enable breakthrough AR glasses and VR headsets. At the Pittsburgh lab, we aspire to a vision of social VR and AR in which people can interact with each other across distances in a way that is indistinguishable from in-person interaction.

As a Research Scientist at FRL Pittsburgh, you will be solving challenges at the forefront of computer vision and machine learning. You will work alongside top researchers, engineers, and artists, collaborating on the innovation necessary to make that vision a reality. You will be expected to push forward the frontiers of research and publish your work at leading venues in the field. We want people who work well across disciplines, can brainstorm big ideas, and are excited to work in new technology areas. The ideal candidate will have research experience in machine learning applied to waveform/speech synthesis, acoustic modeling, and audio-visual learning. Knowledge of spatial audio processing is a plus.

Responsibilities


  • Prototype novel algorithms for speech synthesis leveraging deep learning and multimodal data captured using our state-of-the-art capture systems
  • Develop robust algorithms and systems for integrating multiple sensors and modalities
  • Collaborate with team members across a variety of domains including signal processing, acoustic engineering, computer vision and computer graphics
  • Attend scientific conferences, and publish and present papers at international conferences and in journals

Minimum Qualifications

  • Currently has, or is in the process of obtaining, a PhD degree, or is completing a postdoctoral assignment, in the field of Computer Science, Machine Learning, Artificial Intelligence, or a related field
  • Research experience in one or more of the following areas: neural speech synthesis, multimodal learning, or similar
  • 3+ years of experience in machine learning, audio processing, or computer vision
  • 3+ years of experience with standard AI, CV, and ML libraries such as PyTorch, Torch, TensorFlow, or Keras
  • Interpersonal experience: cross-group and cross-culture collaboration
  • Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment

Preferred Qualifications

  • 3+ years of programming experience in C++ or Python
  • Understanding of spatial audio, room acoustics, and multichannel audio array processing
  • Proven track record of achieving significant results, as demonstrated by grants, fellowships, and patents, as well as first-authored publications at leading workshops or conferences such as CVPR, ECCV/ICCV, ICASSP, NeurIPS, SIGGRAPH, Interspeech, ICLR, or similar
  • Demonstrated software engineering experience via an internship, work experience, coding competitions, or widely used contributions in open source repositories (e.g. GitHub)

Oculus is proud to be an Equal Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, genetic information, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law.

Oculus is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.